Race and ethnicity, age, and breast density can influence AI scores in DBT exams

Patient characteristics seem to influence the case and risk scores of a commercially available AI algorithm analyzing negative screening digital breast tomosynthesis (DBT) examinations.

In a new study using ProFound AI by iCAD, DBT scans from 4855 patients (27% White, 26% Black, 28% Asian, and 19% Hispanic) were retrospectively interpreted to generate a case score and a risk score for each exam. Compared with White patients, Black patients were more likely to have false-positive case scores and risk scores, and Asian patients were less likely to have a false-positive case score. Older patients were more likely than patients aged 51–60 years to have a false-positive case score (patients aged 71–80 years) or risk score (patients aged 61–70 years), and patients with extremely dense breasts had more false-positive risk scores than patients with fatty density breasts.

The authors note that the FDA does not require algorithms to be developed on a demographically diverse population. They conclude: “The FDA should provide clear guidance on the demographic characteristics of samples used to develop algorithms, and vendors should be transparent about how their algorithms were developed.”


Patient Characteristics Impact Performance of AI Algorithm in Interpreting Negative Screening Digital Breast Tomosynthesis Studies

Radiology, 2024

Abstract

Objective: To understand the impact of patient characteristics (race and ethnicity, age, and breast density) on the performance of an AI algorithm interpreting negative screening digital breast tomosynthesis (DBT) examinations.

Materials and Methods: This retrospective cohort study identified negative screening DBT examinations from an academic institution from January 1, 2016, to December 31, 2019. All examinations had 2 years of follow-up without a diagnosis of atypia or breast malignancy and were therefore considered true negatives. A subset of unique patients was randomly selected to provide a broad distribution of race and ethnicity. DBT studies in this final cohort were interpreted by a U.S. Food and Drug Administration–approved AI algorithm, which generated case scores (malignancy certainty) and risk scores (1-year subsequent malignancy risk) for each mammogram. Positive examinations were classified based on vendor-provided thresholds for both scores. Multivariable logistic regression was used to understand relationships between the scores and patient characteristics.

Results: A total of 4855 patients (median age, 54 years [IQR, 46–63 years]) were included: 27% (1316 of 4855) White, 26% (1261 of 4855) Black, 28% (1351 of 4855) Asian, and 19% (927 of 4855) Hispanic patients. False-positive case scores were significantly more likely in Black patients (odds ratio [OR] = 1.5 [95% CI: 1.2, 1.8]) and less likely in Asian patients (OR = 0.7 [95% CI: 0.5, 0.9]) compared with White patients, and more likely in older patients (71–80 years; OR = 1.9 [95% CI: 1.5, 2.5]) and less likely in younger patients (41–50 years; OR = 0.6 [95% CI: 0.5, 0.7]) compared with patients aged 51–60 years. False-positive risk scores were more likely in Black patients (OR = 1.5 [95% CI: 1.0, 2.0]), patients aged 61–70 years (OR = 3.5 [95% CI: 2.4, 5.1]), and patients with extremely dense breasts (OR = 2.8 [95% CI: 1.3, 5.8]) compared with White patients, patients aged 51–60 years, and patients with fatty density breasts, respectively.

Conclusion: Patient characteristics influenced the case and risk scores of a Food and Drug Administration–approved AI algorithm analyzing negative screening DBT examinations.
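
For readers who want to relate the reported odds ratios to the underlying logistic regression, the following is a minimal sketch, not the authors' analysis code. It recomputes the cohort percentages from the reported counts and converts one reported odds ratio (false-positive case scores, Black vs. White patients) to a log-odds coefficient with an approximate standard error. The function name `log_odds_summary` is hypothetical, and the standard error assumes the published interval is a symmetric Wald interval on the log scale, which the abstract does not state.

```python
import math

# Cohort counts as reported in the Results.
counts = {"White": 1316, "Black": 1261, "Asian": 1351, "Hispanic": 927}
total = sum(counts.values())

# Share of each group, rounded to a whole percentage.
shares = {group: round(100 * n / total) for group, n in counts.items()}


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Return (coefficient, approximate standard error) on the log-odds scale.

    Assumption: the reported 95% CI is a symmetric Wald interval on the
    log scale, so its width equals 2 * z * standard error.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Reported OR for false-positive case scores, Black vs. White patients.
beta, se = log_odds_summary(1.5, 1.2, 1.8)

print(total, shares)
print(f"beta={beta:.3f}, se={se:.3f}")
```

Because the published odds ratios and interval bounds are rounded to one decimal place, values recovered this way are approximate.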