Acutely ill patients at high risk for venous thromboembolism (VTE) may be identified clinically or by use of integer-based scoring systems. These scores have demonstrated modest performance in external data sets. This study evaluated the performance of machine learning models compared with the IMPROVE score. The APEX trial randomized 7513 acutely medically ill patients to extended-duration betrixaban vs. enoxaparin. Using 68 variables, a super learner model (ML) was built to predict VTE by combining estimates from 5 families of candidate models. A "reduced" model (rML) was also developed using 16 variables that were thought, a priori, to be associated with VTE. The IMPROVE score was calculated for each patient. Model performance in predicting a composite VTE end point was assessed by discrimination and calibration. The frequency distribution of predicted VTE risk was plotted and divided into tertiles, and VTE risks were compared across tertiles. The ML and rML algorithms outperformed the IMPROVE score in predicting VTE (c-statistic: 0.69, 0.68, and 0.59, respectively). The Hosmer-Lemeshow goodness-of-fit P-value was 0.06 for ML, 0.44 for rML, and <0.001 for the IMPROVE score. The observed event rate was 2.5% in the lowest tertile, 4.8% in the middle tertile, and 11.4% in the highest tertile. Patients in the highest tertile of VTE risk had a 5-fold increase in odds of VTE compared with those in the lowest tertile. The super learner algorithms improved discrimination and calibration compared with the IMPROVE score for predicting VTE in acutely medically ill patients.
DOI: 10.1002/rth2.12292
PubMed ID: 32110753
PubMed Central ID: PMC7040551
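
The 5-fold increase in odds reported above can be checked against the observed tertile event rates with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes the odds ratio is taken directly from the raw event rates (2.5%, 4.8%, 11.4%), which may differ from how the authors estimated it.

```python
def odds(p: float) -> float:
    """Convert a probability (0 < p < 1) into odds, p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted odds ratio of one group against a reference group."""
    return odds(p_group) / odds(p_reference)


# Observed VTE event rates by tertile of predicted risk, as reported in the abstract.
observed_rates = {"lowest": 0.025, "middle": 0.048, "highest": 0.114}

# Compare the middle and highest tertiles against the lowest tertile.
or_mid_vs_low = odds_ratio(observed_rates["middle"], observed_rates["lowest"])
or_high_vs_low = odds_ratio(observed_rates["highest"], observed_rates["lowest"])

print(f"Middle vs. lowest tertile:  OR = {or_mid_vs_low:.2f}")
print(f"Highest vs. lowest tertile: OR = {or_high_vs_low:.2f}")
```

Under these assumptions, the highest-versus-lowest odds ratio comes out at roughly 5, consistent with the reported 5-fold increase. Because the event rates are low, the odds ratio is close to, but slightly larger than, the simple ratio of the two rates (11.4% / 2.5%).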