Clinical target volume (CTV) and organ-at-risk (OAR) contouring are essential steps in designing highly conformal radiation dose distributions while sparing surrounding normal tissues. However, the delineation process is both time-consuming (taking two to three hours for some anatomical sites) and subject to significant inter-observer variability. Although consensus delineation guidelines have been published to reduce this variability, substantial differences remain.
Auto-segmentation has been studied and used in clinical practice for over two decades to support the heavy structure delineation workload of technologies such as intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT). Traditional approaches, such as atlas-based methods, have consistently suffered from low accuracy and reliability, particularly when applied to complex anatomies (e.g., abdominal bowel loops) or low-contrast regions (e.g., lymph nodes in the head and neck).
With the rapid development of machine learning, deep learning-based auto-segmentation algorithms have achieved remarkable success in both general and medical imaging. Convolutional neural networks (CNNs) remain the most widely used models, enabling automated segmentation on CT, MRI, and PET images. Given their potential to improve efficiency in clinical workflows, both commercial and institution-developed solutions have been rapidly adopted in radiation oncology departments worldwide. However, robust strategies and tools for quality assurance (QA) of auto-segmentation — particularly for CTVs — are still limited.
This study, presented by Phillip Chlap, MS, at ASTRO’s 67th Annual Meeting, evaluated an automated QA method for CTV contouring in the TROG 08.08 TOPGEAR gastric cancer trial, aiming to reduce the resource burden of manual QA. Using a small training dataset and incorporating anatomical label maps, 3D nnUNet and probabilistic UNet models were trained to detect contouring violations by comparing clinical CTVs against predicted uncertainty bands. Among 93 cases (10 training, 33 validation, 50 testing), the best-performing metric — distance-to-band for under-contouring — achieved an AUC of 0.88 on the test set, detecting 91% of violations with a 39% false-positive rate. This method could identify most contouring errors while potentially halving the number of manual reviews required, demonstrating strong potential for efficient QA in radiotherapy clinical trials.
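To make the reported figures concrete, the following is a minimal Python sketch of how a per-case QA score could be evaluated against reviewer labels. It is not the authors' implementation: the toy scores, the threshold, and the function names (`roc_auc`, `flag_rates`, `review_fraction`) are hypothetical, and the 20% violation prevalence in the final call is an assumption chosen for illustration, since the prevalence is not stated above. Only the 91% sensitivity and 39% false-positive rate are taken from the abstract.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a violation outscores a compliant case.

    Uses the Mann-Whitney formulation; tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one violation and one compliant case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def flag_rates(scores: Sequence[float], labels: Sequence[int],
               threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, false-positive rate) when flagging score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


def review_fraction(prevalence: float, sensitivity: float, fpr: float) -> float:
    """Fraction of all cases flagged for manual review at a given prevalence."""
    return prevalence * sensitivity + (1.0 - prevalence) * fpr


# Toy data (hypothetical): a distance-style score per case, 1 = violation.
scores = [0.5, 1.0, 4.0, 6.0, 2.0, 7.0]
labels = [0, 0, 0, 1, 1, 1]

print(roc_auc(scores, labels))
print(flag_rates(scores, labels, threshold=3.0))

# Reported operating point, with an ASSUMED 20% violation prevalence.
print(review_fraction(prevalence=0.20, sensitivity=0.91, fpr=0.39))
```

Under that assumed prevalence, roughly half of the cases would be flagged, which is in line with the "halving" estimate. The flagged fraction rises toward the sensitivity as violations become more common and falls toward the false-positive rate as they become rarer.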
Overall, this work represents a promising advance in automated QA for clinical trial radiotherapy planning. It offers a feasible pathway to scaling QA efforts while maintaining high sensitivity for detecting protocol deviations. "AI-assisted contour QA, as demonstrated in the gastric cancer TOPGEAR trial, can support peer review and enable broader cohort coverage, laying the groundwork for future integration of automated QA in radiotherapy clinical trials," Chlap said.
Abstract 179, "Automated Clinical Target Volume Contour Quality Assurance for the TROG 08.08 TOPGEAR Trial," was presented during session SS 13 - DHI 1: The Digital Revolution in Radiation Oncology: AI Models for Enhanced Patient Care at the 67th ASTRO Annual Meeting.