ChestEye Quality: enhancing accuracy, but vital role for expert review

In a retrospective study across two Dutch hospitals, Oxipit's AI tool ChestEye Quality analyzed 25,104 chest X-rays from 21,039 patients. Acting as a second reader, it compared its own image analysis against radiologists' reports and flagged discrepancies for review, with the aim of catching potential false negatives and improving reporting accuracy. The AI flagged discrepancies in 21.1% of cases (5,289), but only 35 cases (0.1% of all X-rays) were confirmed to contain clinically relevant missed findings: unreported lung nodules, pneumothoraces, and consolidations. The result shows that the tool can catch findings missed in routine reporting. However, an external radiologist judged the vast majority of AI-flagged discrepancies not clinically relevant, which underscores that expert review remains indispensable alongside AI support.


Abstract

Objectives

To evaluate an artificial intelligence (AI)–assisted double reading system for detecting clinically relevant missed findings on routinely reported chest radiographs.

Methods

A retrospective study was performed in two institutions, a secondary care hospital and tertiary referral oncology centre. Commercially available AI software performed a comparative analysis of chest radiographs and radiologists’ authorised reports using a deep learning and natural language processing algorithm, respectively. The AI-detected discrepant findings between images and reports were assessed for clinical relevance by an external radiologist, as part of the commercial service provided by the AI vendor. The selected missed findings were subsequently returned to the institution’s radiologist for final review.
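The three-stage review described above can be pictured as a simple funnel. The sketch below is a toy illustration only, not the vendor's implementation: the `Case` class, the `unreported` and `double_read` functions, and the two reviewer callbacks are hypothetical names, and the finding sets stand in for the outputs of the image model and the report-parsing step.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """One radiograph with its report (hypothetical structure)."""
    case_id: str
    image_findings: set = field(default_factory=set)   # assumed output of the image model
    report_findings: set = field(default_factory=set)  # assumed output of the report NLP step


def unreported(case):
    """Findings present on the image but absent from the authorised report."""
    return case.image_findings - case.report_findings


def double_read(cases, external_review, institution_review):
    """Run the funnel: AI flag, then external radiologist, then institution radiologist.

    The two review arguments are callables standing in for human judgement.
    """
    flagged = [c for c in cases if unreported(c)]
    shortlisted = [c for c in flagged if external_review(c)]
    confirmed = [c for c in shortlisted if institution_review(c)]
    return flagged, shortlisted, confirmed


# Toy data: three made-up cases, one of which has an unreported nodule.
cases = [
    Case("a", {"nodule"}, {"nodule"}),
    Case("b", {"nodule"}, set()),
    Case("c", {"cardiomegaly"}, set()),
]
flagged, shortlisted, confirmed = double_read(
    cases,
    external_review=lambda c: "nodule" in unreported(c),  # assumed relevance rule
    institution_review=lambda c: True,                    # assumed: always confirms
)
print([c.case_id for c in flagged], [c.case_id for c in confirmed])
```

Each stage only narrows the previous one, which is why the number of confirmed findings can be far smaller than the number of AI-flagged discrepancies.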

Results

In total, 25,104 chest radiographs of 21,039 patients (mean age 61.1 years ± 16.2 [SD]; 10,436 men) were included. The AI software detected discrepancies between imaging and reports in 21.1% (5289 of 25,104). After review by the external radiologist, 0.9% (47 of 5289) of cases were deemed to contain clinically relevant missed findings. The institution’s radiologists confirmed 35 of 47 missed findings (74.5%) as clinically relevant (0.1% of all cases). Missed findings consisted of lung nodules (71.4%, 25 of 35), pneumothoraces (17.1%, 6 of 35) and consolidations (11.4%, 4 of 35).
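The percentages in this paragraph use different denominators at each stage, which is easy to misread. As a quick check, the following sketch recomputes them from the reported counts; the variable names and the `pct` helper are illustrative and not part of the study.

```python
# Counts as reported in the Results paragraph.
total_radiographs = 25_104
ai_discrepancies = 5_289
flagged_by_external = 47
confirmed_by_institution = 35
missed_by_type = {"lung nodule": 25, "pneumothorax": 6, "consolidation": 4}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


rates = {
    "AI discrepancies / all radiographs": pct(ai_discrepancies, total_radiographs),
    "externally flagged / AI discrepancies": pct(flagged_by_external, ai_discrepancies),
    "confirmed / externally flagged": pct(confirmed_by_institution, flagged_by_external),
    "confirmed / all radiographs": pct(confirmed_by_institution, total_radiographs),
}
type_shares = {
    name: pct(count, confirmed_by_institution)
    for name, count in missed_by_type.items()
}

for label, value in {**rates, **type_shares}.items():
    print(f"{label}: {value}%")
```

The 0.9% figure is relative to the 5289 AI-detected discrepancies, while the 0.1% figure is relative to all 25,104 radiographs.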

Conclusion

The AI-assisted double reading system was able to identify missed findings on chest radiographs after report authorisation. The approach required an external radiologist to review the AI-detected discrepancies. The number of clinically relevant missed findings by radiologists was very low.

Clinical relevance statement

The AI-assisted double reader workflow was shown to detect diagnostic errors and could be applied as a quality assurance tool. Although clinically relevant missed findings were rare, there is potential impact given the common use of chest radiography.