Development and validation of a model to predict survival in colorectal cancer using a gradient-boosted machine. Gut. Bibault, J., Chang, D. T., Xing, L. 2020

Abstract

OBJECTIVE: The success of treatment planning relies critically on our ability to predict the potential benefit of a therapy. In colorectal cancer (CRC), several nomograms are available to predict different outcomes based on tumour-specific features. Our objective is to provide an accurate and explainable prediction of the risk of dying within 10 years after CRC diagnosis by incorporating tumour features together with the patient's medical and demographic information. DESIGN: In the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial, participants (n=154 900) were randomised to screening with flexible sigmoidoscopy, with a repeat screening at 3 or 5 years, or to usual care. We selected patients who were diagnosed with CRC during follow-up to train a gradient-boosted model to predict the risk of dying within 10 years after CRC diagnosis. Using Shapley values, we determined the 20 most relevant features and provided explanations for the predictions. RESULTS: During follow-up, 2359 patients were diagnosed with CRC. Median follow-up was 16.8 years (14.4-18.9) for mortality. In total, 686 patients (29%) died from CRC during follow-up. The dataset was randomly split into a training (n=1887) and a testing (n=472) dataset. The area under the receiver operating characteristic curve was 0.84 (±0.04) and accuracy was 0.83 (±0.04) with a 0.5 classification threshold. The model is available online for research use. CONCLUSIONS: We trained and validated a model with prospective data from a large multicentre cohort of patients. The model has high predictive performance at the individual level. It could be used to discuss treatment strategies.
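The evaluation quantities named in the abstract (a random train/test split, the area under the ROC curve, and accuracy at a 0.5 threshold) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code. The function names `split_indices`, `roc_auc`, and `accuracy_at_threshold` are hypothetical, the labels and scores are toy values, and the 20% hold-out fraction is an assumption: the abstract does not state the fraction, although a 20% hold-out of 2359 patients would give the reported sizes of 1887 and 472.

```python
import math
import random


def split_indices(n, test_fraction=0.2, seed=0):
    """Randomly partition range(n) into (train, test) index lists.

    test_fraction=0.2 is an assumed value, not taken from the abstract.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = math.ceil(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def roc_auc(y_true, y_score):
    """Area under the ROC curve via its rank interpretation:
    the probability that a random positive case scores higher than
    a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at_threshold(y_true, y_score, threshold=0.5):
    """Fraction of cases classified correctly when a predicted risk
    at or above `threshold` is labelled as the positive class."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


# Toy labels (1 = died within 10 years) and toy predicted risks.
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

train_idx, test_idx = split_indices(2359)
print(len(train_idx), len(test_idx))
print(roc_auc(y_true, y_score))
print(accuracy_at_threshold(y_true, y_score))
```

The pairwise AUC computation is quadratic in the number of cases, which is fine for a test set of a few hundred patients; a sorted-rank implementation would be preferable for larger data.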

DOI: 10.1136/gutjnl-2020-321799

PubMed ID: 32887732