Automated extraction of ejection fraction for quality measurement using regular expressions in Unstructured Information Management Architecture (UIMA) for heart failure. Journal of the American Medical Informatics Association. Garvin, J. H., DuVall, S. L., South, B. R., Bray, B. E., Bolton, D., Heavirland, J., Pickard, S., Heidenreich, P., Shen, S., Weir, C., Samore, M., Goldstein, M. K. 2012; 19 (5): 859-866


Left ventricular ejection fraction (EF) is a key component of heart failure quality measures used within the Department of Veterans Affairs (VA). Our goals were to build a natural language processing system that extracts the EF from free-text echocardiogram reports in order to automate measurement reporting, and to validate the accuracy of the system against a reference standard developed through human review. This project was a Translational Use Case Project within the VA Consortium for Healthcare Informatics. We created a set of regular expressions and rules to capture the EF using a random sample of 765 echocardiograms from seven VA medical centers. The documents were randomly assigned to two sets: a set of 275 used for training and a second set of 490 used for testing and validation. To establish the reference standard, two independent reviewers annotated all documents in both sets; a third reviewer adjudicated disagreements. System test results for document-level classification of EF of <40% had a sensitivity (recall) of 98.41%, a specificity of 100%, a positive predictive value (precision) of 100%, and an F measure of 99.2%. System test results at the concept level had a sensitivity of 88.9% (95% CI 87.7% to 90.0%), a positive predictive value of 95% (95% CI 94.2% to 95.9%), and an F measure of 91.9% (95% CI 91.2% to 92.7%). An EF value of <40% can be accurately identified in VA echocardiogram reports. An automated information extraction system can be used to accurately extract EF for quality measurement.
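The abstract does not reproduce the regular expressions or rules themselves, so the following is only a minimal sketch of the general technique, written in plain Python rather than UIMA. The pattern, the function names `extract_ef_values` and `has_low_ef`, the use of a range midpoint, and the choice to classify a document by its lowest mentioned value are all illustrative assumptions, not the authors' published method.

```python
import re

# Hypothetical pattern (an assumption, not the expressions used in the paper):
# an EF concept mention, a short gap without digits, then a value or a range.
EF_PATTERN = re.compile(
    r"""
    \b(?:LV\s*)?(?:ejection\s+fraction|LVEF|EF)\b    # concept mention
    [^\d%]{0,40}?                                    # short gap, e.g. " is estimated at "
    (?P<low>\d{1,2}(?:\.\d+)?)                       # single value or low end of a range
    (?:\s*(?:-|to)\s*(?P<high>\d{1,2}(?:\.\d+)?))?   # optional high end of a range
    \s*%                                             # percent sign is required
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_ef_values(text):
    """Return every EF value found in the text, as percentages.

    A range such as "35-40%" is reduced to its midpoint; this is an
    assumption made for the sketch.
    """
    values = []
    for match in EF_PATTERN.finditer(text):
        low = float(match.group("low"))
        high = match.group("high")
        values.append((low + float(high)) / 2 if high else low)
    return values


def has_low_ef(text, threshold=40.0):
    """Document-level flag: True if any extracted EF is below the threshold.

    Using the lowest mention to classify the document is an assumption.
    """
    values = extract_ef_values(text)
    return bool(values) and min(values) < threshold


if __name__ == "__main__":
    report = "The left ventricular ejection fraction is estimated at 35-40%."
    print(extract_ef_values(report))
    print(has_low_ef(report))
```

A production system would need many more surface forms (qualitative statements such as "severely reduced", values without a percent sign, negated or historical mentions), which is presumably where the rule layer described in the abstract comes in.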

DOI: 10.1136/amiajnl-2011-000535

Web of Science ID: 000307934600025

PubMed ID: 22437073

PubMed Central ID: PMC3422820