Learn about the flu shot, COVID-19 vaccine, and our masking policy »
New to MyHealth?
Manage Your Care From Anywhere.
Access your health information from any device with MyHealth. You can message your clinic, view lab results, schedule an appointment, and pay your bill.
ALREADY HAVE AN ACCESS CODE?
DON'T HAVE AN ACCESS CODE?
NEED MORE DETAILS?
MyHealth for Mobile
Get the iPhone MyHealth app »
Get the Android MyHealth app »
Abstract
Because breast tissue composition partially predicts breast cancer risk, classification of mammography reports by breast tissue composition is important from both a scientific and clinical perspective. A method is presented for using the unstructured text of mammography reports to classify them into BI-RADS breast tissue composition categories. An algorithm that uses regular expressions to automatically determine BI-RADS breast tissue composition classes for unstructured mammography reports was developed. The algorithm assigns each report to a single BI-RADS composition class: 'fatty', 'fibroglandular', 'heterogeneously dense', 'dense', or 'unspecified'. We evaluated its performance on mammography reports from two different institutions. The method achieves >99% classification accuracy on a test set of reports from the Marshfield Clinic (Wisconsin) and Stanford University. Since large-scale studies of breast cancer rely heavily on breast tissue composition information, this method could facilitate this research by helping mine large datasets to correlate breast composition with other covariates.
View details for DOI 10.1136/amiajnl-2011-000607
View details for Web of Science ID 000307934600032
View details for PubMedID 22291166
View details for PubMedCentralID PMC3422822