Stabl: sparse and reliable biomarker discovery in predictive modeling of high-dimensional omic data. Research Square Hédou, J., Maric, I., Bellan, G., Einhaus, J., Gaudillière, D. K., Ladant, F. X., Verdonk, F., Stelzer, I. A., Feyaerts, D., Tsai, A. S., Ganio, E. A., Sabayev, M., Gillard, J., Bonham, T. A., Sato, M., Diop, M., Angst, M. S., Stevenson, D., Aghaeepour, N., Montanari, A., Gaudillière, B. 2023

Abstract

High-content omic technologies coupled with sparsity-promoting regularization methods (SRMs) have transformed the biomarker discovery process. However, the translation of computational results into a clinical use-case scenario remains challenging. A rate-limiting step is the rigorous selection of reliable biomarker candidates among a host of biological features included in multivariate models. We propose Stabl, a machine learning framework that unifies the biomarker discovery process with multivariate predictive modeling of clinical outcomes by selecting a sparse and reliable set of biomarkers. Evaluation of Stabl on synthetic datasets and four independent clinical studies demonstrates improved biomarker sparsity and reliability compared to commonly used SRMs at similar predictive performance. Stabl readily extends to double- and triple-omics integration tasks and identifies a sparser and more reliable set of biomarkers than those selected by state-of-the-art early- and late-fusion SRMs, thereby facilitating the biological interpretation and clinical translation of complex multi-omic predictive models. The complete package for Stabl is available online at https://github.com/gregbellan/Stabl.

DOI: 10.21203/rs.3.rs-2609859/v1

PubMed ID: 36909508

PubMed Central ID: PMC10002850