A Crowdsourcing Challenge for the Development of Artificial Intelligence Algorithms for Automated Identification and Delineation of Pancreatic Cancer Primary Tumors. International Journal of Radiation Oncology, Biology, Physics. Toesca, D. A., Bush, K., Baclay, J. R., Sergeev, R. A., Brougher, N., Godeny, B., Pollom, E., Chang, D. T. 2021; 111 (3S): e99

Abstract

PURPOSE/OBJECTIVE(S): To develop a crowdsourced artificial intelligence computational algorithm to accurately identify and delineate pancreatic cancer (PC) primary tumors and major peripancreatic vessels on high-resolution CT images.

MATERIALS/METHODS: After IRB approval, a well-curated data set of 243 diagnostic pancreatic-protocol high-resolution CT scans of patients diagnosed with PC was prepared. A data science marathon competition was then hosted online by a third-party company between December 2020 and January 2021. The full data set was randomly partitioned into training, validation, and hold-out (scoring) data sets. Contestants were provided a training data set containing [de-identified files from a high-resolution IV contrast-enhanced CT DICOM image set; de-identified DICOM metadata; de-identified RT structure set file with contours of the tumor and superior mesenteric artery (SMA), celiac axis/common hepatic artery continuum (CA/CHA), and superior mesenteric vein/portal vein continuum (SMV/PV)], along with a smaller validation data set of CT scans without contours on which to test their trained algorithms. Provisional scores were assigned using the validation data set throughout the competition. Final scoring of submitted algorithms was based on their performance on a hold-out data set (unavailable to the contestants), matched against the expected ground truth data using an F1 scoring metric, F1 = 2 * precision * recall / (precision + recall), in which precision = TP / (TP + FP) and recall = TP / (TP + FN). TP is the area (measured in pixels) of the overlap of the expected (ground truth delineation) and extracted (submitted algorithm) regions, FP is the extracted area that does not belong to the expected regions, and FN is the area of the expected regions that is not covered by the extracted regions.
The F1 score was calculated separately for the 4 structure types (tumor, SMA, CA/CHA, and SMV/PV) and combined as a weighted average of the four F1 values, in which the weight of the tumor was 7 and the weight of each of the vessels was 1. The final average for each contestant was scaled to a 0-100 range.

RESULTS: In total, 337 algorithm solutions from 23 competitors were submitted. The winning algorithm achieved an average F1 score of 70.2. F1 scores for the 2nd- through 5th-place solutions ranged from 69.8 to 66.7. The winning solution used a 3D-UNet model to develop a 3D semantic segmentation approach that was found to be accurate on tumor and vessel delineation. A comparison between the winning solution and an ensemble of the 2nd- to 5th-place solutions showed that in 82% of cases, both the winning and the ensembled solution achieved an F1 score > 60.

CONCLUSION: A crowdsourced innovation challenge was successfully employed to generate artificial intelligence algorithms capable of accurately delineating pancreatic cancer on a diagnostic CT scan.
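
The scoring rule described in the methods can be illustrated with a minimal Python sketch. It is not the challenge's actual scoring code. The mask representation (2D grids of 0/1 pixels), the function names `f1_score` and `weighted_score`, and the choice to return 0 when there is no overlap are all assumptions made for illustration. The abstract also does not state how scores were aggregated across slices or patients, so the sketch scores a single pair of masks per structure.

```python
from typing import Dict, Sequence

# Hypothetical representation: a 2D grid of 0/1 pixels for one structure.
Mask = Sequence[Sequence[int]]

# Weights as stated in the abstract: tumor 7, each vessel 1.
WEIGHTS = {"tumor": 7, "SMA": 1, "CA/CHA": 1, "SMV/PV": 1}


def f1_score(expected: Mask, extracted: Mask) -> float:
    """Pixel-area F1 between a ground truth mask and a submitted mask."""
    tp = fp = fn = 0
    for expected_row, extracted_row in zip(expected, extracted):
        for exp_px, ext_px in zip(expected_row, extracted_row):
            if exp_px and ext_px:
                tp += 1  # overlap of expected and extracted regions
            elif ext_px:
                fp += 1  # extracted, but not part of the expected region
            elif exp_px:
                fn += 1  # expected, but not covered by the extraction
    if tp == 0:
        # Assumption: no overlap scores 0 (also avoids division by zero).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_score(f1_by_structure: Dict[str, float]) -> float:
    """Weighted average of the four per-structure F1 values, scaled to 0-100."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * f1_by_structure[name] for name in WEIGHTS)
    return 100.0 * weighted_sum / total_weight


if __name__ == "__main__":
    expected = [[1, 1, 0], [1, 1, 0]]
    extracted = [[0, 1, 1], [0, 1, 1]]
    tumor_f1 = f1_score(expected, extracted)
    scores = {"tumor": tumor_f1, "SMA": 1.0, "CA/CHA": 1.0, "SMV/PV": 1.0}
    print(tumor_f1, weighted_score(scores))
```

Because the tumor carries 7 of the 10 total weight units, a tumor F1 of 0.5 caps the overall score at 65 even if all three vessels are delineated perfectly.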

DOI: 10.1016/j.ijrobp.2021.07.490

PubMed ID: 34702015