RExPRT: a machine learning tool to predict pathogenicity of tandem repeat loci. Genome biology Fazal, S., Danzi, M. C., Xu, I., Kobren, S. N., Sunyaev, S., Reuter, C., Marwaha, S., Wheeler, M., Dolzhenko, E., Lucas, F., Wuchty, S., Tekin, M., Züchner, S., Aguiar-Pulido, V. 2024; 25 (1): 39

Abstract

Expansions of tandem repeats (TRs) cause approximately 60 monogenic diseases. We expect that the discovery of additional pathogenic repeat expansions will narrow the diagnostic gap in many diseases. A growing number of TR expansions are being identified, and interpreting them is a challenge. We present RExPRT (Repeat EXpansion Pathogenicity pRediction Tool), a machine learning tool for distinguishing pathogenic from benign TR expansions. Our results demonstrate that an ensemble approach classifies TRs with an average precision of 93% and recall of 83%. RExPRT's high precision will be valuable in large-scale discovery studies, which require prioritization of candidate loci for follow-up studies.
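For readers less familiar with the two metrics quoted above, the following is a minimal sketch of how precision and recall are computed from classification counts. It is not taken from RExPRT; the function name `precision_recall` and the counts are hypothetical and chosen only for illustration.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    tp: pathogenic loci correctly called pathogenic
    fp: benign loci incorrectly called pathogenic
    fn: pathogenic loci incorrectly called benign
    """
    predicted_positive = tp + fp
    actual_positive = tp + fn
    # Guard against division by zero when a class is empty.
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / actual_positive if actual_positive else 0.0
    return precision, recall


# Hypothetical counts for illustration only; these are not results from the paper.
tp, fp, fn = 8, 2, 4
precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints "precision=0.80 recall=0.67"
```

High precision means that few of the loci flagged as pathogenic are false alarms, which is the property the abstract highlights for prioritizing candidates in discovery studies.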

View details for DOI 10.1186/s13059-024-03171-4

View details for PubMedID 38297326

View details for PubMedCentralID 7425049