Artificial Intelligence Algorithm Improves Radiologist Performance in Skeletal Age Assessment: A Prospective Multicenter Randomized Controlled Trial. Radiology. Eng, D. K., Khandwala, N. B., Long, J., Fefferman, N. R., Lala, S. V., Strubel, N. A., Milla, S. S., Filice, R. W., Sharp, S. E., Towbin, A. J., Francavilla, M. L., Kaplan, S. L., Ecklund, K., Prabhu, S. P., Dillon, B. J., Everist, B. M., Anton, C. G., Bittman, M. E., Dennis, R., Larson, D. B., Seekins, J. M., Silva, C. T., Zandieh, A. R., Langlotz, C. P., Lungren, M. P., Halabi, S. S. 2021: 204021

Abstract

Background: Previous studies suggest that use of artificial intelligence (AI) algorithms as diagnostic aids may improve the quality of skeletal age assessment, though these studies lack evidence from clinical practice.

Purpose: To compare the accuracy and interpretation time of skeletal age assessment on hand radiograph examinations with and without the use of an AI algorithm as a diagnostic aid.

Materials and Methods: In this prospective randomized controlled trial, skeletal age assessment on hand radiograph examinations was performed with (n = 792) and without (n = 739) the AI algorithm as a diagnostic aid. For examinations with the AI algorithm, the radiologist was shown the AI interpretation as part of their routine clinical work and was permitted to accept or modify it. Hand radiographs were interpreted by 93 radiologists from six centers. The primary efficacy outcome was the mean absolute difference between the skeletal age dictated into the radiologists' signed report and the average interpretation of a panel of four radiologists not using a diagnostic aid. The secondary outcome was the interpretation time. A linear mixed-effects regression model with random center- and radiologist-level effects was used to compare the two experimental groups.

Results: Overall mean absolute difference was lower when radiologists used the AI algorithm than when they did not (5.36 months vs 5.95 months; P = .04). The proportions of examinations in which the absolute difference exceeded 12 months (9.3% vs 13.0%, P = .02) and 24 months (0.5% vs 1.8%, P = .02) were lower with the AI algorithm than without it. Median radiologist interpretation time was lower with the AI algorithm than without it (102 seconds vs 142 seconds, P = .001).

Conclusion: Use of an artificial intelligence algorithm improved skeletal age assessment accuracy and reduced interpretation times for radiologists, although differences were observed between centers.

Clinical trial registration no. NCT03530098

© RSNA, 2021

Online supplemental material is available for this article. See also the editorial by Rubin in this issue.
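As an illustration of how the primary outcome is defined, the following minimal Python sketch computes the mean absolute difference between reported skeletal ages and the average of a four-reader panel, along with the proportion of examinations whose difference exceeds 12 and 24 months. The function names and all numeric values are hypothetical and chosen only for demonstration; they are not study data, and the sketch does not reproduce the mixed-effects regression used in the trial.

```python
from statistics import mean


def absolute_differences(reported_months, panel_reads_months):
    """Absolute difference (months) between each reported skeletal age
    and the mean of the panel reads for the same examination."""
    return [
        abs(reported - mean(panel))
        for reported, panel in zip(reported_months, panel_reads_months)
    ]


def summarize(abs_diffs, thresholds=(12, 24)):
    """Mean absolute difference and the proportion of examinations
    whose absolute difference exceeds each threshold (months)."""
    n = len(abs_diffs)
    return {
        "mean_abs_diff": mean(abs_diffs),
        "prop_exceeding": {t: sum(d > t for d in abs_diffs) / n for t in thresholds},
    }


# Hypothetical example: four examinations, each with four panel reads (months).
reported = [120, 96, 150, 60]
panel = [
    [118, 122, 120, 124],
    [110, 108, 112, 110],
    [150, 148, 152, 150],
    [90, 88, 86, 88],
]

diffs = absolute_differences(reported, panel)
summary = summarize(diffs)
print(summary)
```

In the trial itself, this per-examination difference was the quantity compared between the two arms; the sketch only shows the arithmetic of the outcome and makes no attempt at the statistical comparison.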

DOI: 10.1148/radiol.2021204021

PubMed ID: 34581608