Purpose: To evaluate the diagnostic efficacy of artificial intelligence (AI) software in detecting incidental pulmonary embolism (IPE) at
CT and shorten the time to diagnosis with use of radiologist reading worklist prioritization.
Materials and Methods: In this study with historical controls and prospective evaluation, regulatory-cleared AI software was evaluated to
prioritize IPE on routine chest CT scans with intravenous contrast agent in adult oncology patients. Diagnostic accuracy metrics were
calculated, and temporal end points, including detection and notification times (DNTs), were assessed during three time periods (April
2019 to September 2020): routine workflow without AI, human triage without AI, and worklist prioritization with AI.
Results: In total, 11 736 CT scans in 6447 oncology patients (mean age, 63 years ± 12 [SD]; 3367 men) were included. Prevalence
of IPE was 1.3% (51 of 3837 scans), 1.4% (54 of 3920 scans), and 1.0% (38 of 3979 scans) for the respective time periods. The AI
software detected 131 true-positive, 12 false-negative, 31 false-positive, and 11 559 true-negative results, achieving 91.6% sensitivity,
99.7% specificity, 99.9% negative predictive value, and 80.9% positive predictive value. During prospective evaluation, AI-based
worklist prioritization reduced the median DNT for IPE-positive examinations to 87 minutes (vs routine workflow of 7714 minutes
and human triage of 4973 minutes). Radiologists’ missed rate of IPE was significantly reduced from 44.8% (47 of 105 scans) without
AI to 2.6% (one of 38 scans) when assisted by the AI tool (P < .001).
Conclusion: AI-assisted workflow prioritization of IPE on routine CT scans in oncology patients showed high diagnostic accuracy and
significantly shortened the time to diagnosis in a setting with a backlog of examinations.
© RSNA, 2023
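The missed-rate comparison reported in the Results (47 of 105 scans without AI vs one of 38 scans with AI, P < .001) can be checked with a short calculation. The text available here does not state which test produced the P value, so the two-sided Fisher exact test below is an assumption chosen for illustration, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Assumption: the test used in the study is not stated in this text;
    Fisher's exact test is used here only as an illustration.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Counts taken from the abstract: missed vs detected IPE
missed_without_ai, total_without_ai = 47, 105
missed_with_ai, total_with_ai = 1, 38

rate_without_ai = missed_without_ai / total_without_ai
rate_with_ai = missed_with_ai / total_with_ai
p_value = fisher_exact_two_sided(
    missed_without_ai,
    total_without_ai - missed_without_ai,
    missed_with_ai,
    total_with_ai - missed_with_ai,
)

print(f"Missed rate without AI: {100 * rate_without_ai:.1f}%")
print(f"Missed rate with AI: {100 * rate_with_ai:.1f}%")
print(f"P < .001: {p_value < 0.001}")
```

The exact test avoids large-sample approximations, which is convenient when one cell of the table holds a single event.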
Figure 1: Flowchart shows study design and data selection per time period. AI = artificial intelligence, CTPA = CT pulmonary
angiogram, TP1 = time period 1, TP2 = time period 2, TP3 = time period 3.
collected data to increase the sample size. Sensitivity, specificity, negative predictive value, positive predictive value, and accuracy were calculated.
rejected for P < .05. Statistical analyses were performed by a statistician (R.M.) using the R environment for statistical computing (version 3.6.3; https://www.r-project.org) (12).
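As a worked check of the accuracy metrics named above, the following minimal sketch recomputes them from the confusion-matrix counts given in the abstract (131 true-positive, 12 false-negative, 31 false-positive, and 11 559 true-negative results). The study analysis was performed in R; this Python version and its names (`ConfusionCounts`, `diagnostic_metrics`) are illustrative assumptions, not the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary diagnostic test against the reference standard."""
    tp: int  # true-positive
    fn: int  # false-negative
    fp: int  # false-positive
    tn: int  # true-negative


def diagnostic_metrics(c):
    """Return the standard accuracy metrics as fractions between 0 and 1."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn),
    }


# Counts as reported in the abstract
counts = ConfusionCounts(tp=131, fn=12, fp=31, tn=11559)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Rounded to one decimal place, the sensitivity, specificity, negative predictive value, and positive predictive value agree with the percentages reported in the abstract.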
Figure 2: True-positive detection of incidental pulmonary embolism (PE) by the artificial intelligence (AI) software. (A, B) Images in a 68-year-old woman who underwent routine CT with intravenous contrast agent for outpatient follow-up of melanoma. (A) Axial CT image shows a large filling defect straddling the bifurcation of the pulmonary trunk (arrow) and extending into both pulmonary arteries, compatible with an incidental saddle PE. (B) Corresponding AI heatmap highlights the detected abnormality (red), thereby prioritizing the case in the radiologists’ worklist. (C, D) Images in a 58-year-old woman with a history of rectal cancer undergoing outpatient follow-up. (C) Axial restaging CT image with intravenous contrast agent shows a small incidental subsegmental PE in the right lower lung lobe (arrow). (D) Corresponding AI heatmap enables the radiologist to localize the finding (red).
Figure 3: False-negative findings of two chronic lobar incidental pulmonary embolisms (PEs) that were not detected by the artificial intelligence (AI) software. (A) Axial CT image with intravenous contrast agent in a 70-year-old man (an outpatient) with urothelial carcinoma shows a small incidental PE (IPE) located against the vessel wall in the right pulmonary artery bifurcation (arrow), compatible with a small chronic IPE. (B) Contrast-enhanced coronal CT image in a 62-year-old man (an inpatient) with lung cancer shows a small eccentric filling defect in the pulmonary artery of the left lower lobe (arrow). These findings were not detected by the AI software.
When only considering true-positive examinations flagged by the AI software in the third time period (n = 34), the median DNT was 62 minutes. The DNT of 29 of the 34 true-positive examinations (85.3%) was less than 6 hours. In comparison, the DNT of the four nonprioritized false-negative examinations in the third time period ranged from 1280 minutes to 12 684 minutes. Figure 5 shows the DNTs of all IPE-negative versus -positive studies for each time period. CIs were calculated for the time differences between negative and positive studies in each time period (Fig 6). Given that the CIs did not overlap between the time period with AI assistance and those without AI assistance, differences were statistically significant between the time periods. In contrast, we found no evidence of a difference between routine workflow without AI and human triage.
Report TAT showed similar results as DNT. The median TATs for IPE-positive examinations were 7772, 4983, and 148 minutes for the three respective time periods. When only considering true-positive examinations flagged by the AI software, the median TAT was 91 minutes.
The median TTR for all examinations was 16 minutes. The median TTR for positive examinations was 21 minutes. We found no evidence of a difference in TTR between time periods.
Discussion
We evaluated the clinical value of AI software for the analysis of IPE on a large sample of chest CT scans (n = 11 736) in oncology patients. The AI tool accurately detected IPE on chest CT scans with intravenous contrast agent, with a high sensitivity of 91.6% (131 of 143 scans), specificity of 99.7% (11 559 of 11 590 scans), and negative predictive value of 99.9% (11 559 of 11 571 scans). False-negative classification occurred in 12 of 143 examinations (8.4%) but was limited to segmental, subsegmental, and small chronic lobar clots. No IPE in the main pulmonary arteries was missed by the software. The number of false-positive detections by the software, 31 of 165 flagged examinations (18.8%), can be considered acceptable because radiologists could easily identify false-positive findings by using the heatmap. In total, only 0.3% of all analyzed examinations (31 of 11 736) were falsely positive. The impact of false-positive alerts on radiologist workflow was therefore limited.
During prospective evaluation, the AI software was deployed in a clinical environment with a backlog of unreported examinations to prioritize IPE on routinely acquired chest CT scans, mostly obtained in outpatients with known primary malignancy for posttreatment follow-up. The AI-based worklist prioritization resulted in a significantly reduced median DNT and TAT for flagged scans with IPE, from several days to 1.0 and 1.5 hours, respectively. In contrast, unassisted triage of CT scans by radiologists did not have a significant effect on the reduction of DNT or TAT when compared with the routine workflow. This is likely a result of the time-consuming nature of this task, contributing to low yield.
To the best of our knowledge, no other published study has investigated the diagnostic performance of AI software for the detection and prioritization of IPE. Previous studies have assessed the diagnostic accuracy of deep learning algorithms in the detection of PE on dedicated CTPAs (13–18). For this task, sensitivities and specificities ranged from 73% to 96% and 77% to 96%, respectively. Although our study focused on the detection
Table 4: Causes of False-Positive Detections by the Artificial Intelligence Software
Cause                                         No. of CT Scans (n = 31)
Technical artifact                            7 (22.6)
Flow artifact                                 13 (41.9)
Abnormality adjacent to a pulmonary artery    9 (29)
Abnormality within a pulmonary artery         2 (6.5)
Note: Data in parentheses are percentages.
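The percentages in Table 4 and the two false-positive proportions quoted in the Discussion follow directly from the reported counts. The short sketch below recomputes them as an arithmetic check; the variable names are illustrative assumptions, and all counts are taken from the text.

```python
# Causes of false-positive detections, counts as listed in Table 4
causes = {
    "Technical artifact": 7,
    "Flow artifact": 13,
    "Abnormality adjacent to a pulmonary artery": 9,
    "Abnormality within a pulmonary artery": 2,
}

total_false_positive = sum(causes.values())

# Share of each cause among all false-positive detections, in percent
percentages = {
    cause: round(100 * n / total_false_positive, 1)
    for cause, n in causes.items()
}

# Proportions quoted in the Discussion
flagged_examinations = 165   # examinations flagged by the software
analyzed_examinations = 11736  # all analyzed examinations
share_of_flagged = round(100 * total_false_positive / flagged_examinations, 1)
share_of_all = round(100 * total_false_positive / analyzed_examinations, 1)

for cause, pct in percentages.items():
    print(f"{cause}: {causes[cause]} ({pct}%)")
print(f"False-positive share of flagged examinations: {share_of_flagged}%")
print(f"False-positive share of all examinations: {share_of_all}%")
```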
Figure 4: False-positive detections by the artificial intelligence (AI) software. (A, B) Images in a 59-year-old woman (an outpatient) with melanoma who underwent CT with intravenous contrast agent. (A) Axial CT image shows slightly decreased contrast opacification in a segmental pulmonary artery in the right lower lobe (arrow). This finding was compatible with a flow artifact without any clinical significance. (B) Corresponding AI software heatmap misclassified the finding as a possible incidental pulmonary embolism (IPE), as highlighted in red. (C, D) Images in a 36-year-old woman (an outpatient) with cervical cancer who underwent CT with intravenous contrast agent. (C) Axial CT image shows hilar and mediastinal lymphadenopathy. An enlarged right hilar lymph node shows impression on the right pulmonary arteries (arrow). The finding was misclassified by the AI software as IPE, as shown on (D) the corresponding axial heatmap in red.
Figure 5: Box plot shows detection and notification times (DNTs) of incidental pulmonary embolism (IPE)–negative versus IPE-positive CT scans
per time period: routine workflow without artificial intelligence (AI), human triage without AI, and worklist prioritization with AI. DNT was markedly
reduced for positive CT scans during the third time period with AI assistance (median DNT, 87 minutes vs routine workflow DNT of 7714 minutes [5
days]). The horizontal line in each box plot indicates the median, and the box corresponds to the IQR. The whiskers indicate minimum and maximum
values in the data. Circles represent outliers. TP1 = time period 1, TP2 = time period 2, TP3 = time period 3.
Figure 6: Graph shows 95% CIs of detection and notification time (DNT) differences between incidental pulmonary
embolism–positive and –negative CT scans per time period: routine workflow without artificial intelligence (AI) (TP1), human
triage without AI (TP2), and worklist prioritization with AI (TP3). The DNT difference was largest for the third time period.
Given that the CIs of the third time period versus the first and second periods did not overlap, differences were significant
between these periods.
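The comparison in Figure 6 rests on two steps: a 95% CI for the DNT difference within each time period, then a check of whether the intervals overlap. The text available here does not say how the CIs were computed, so the percentile bootstrap of the difference in medians below is an assumption, as is the sign convention (negative minus positive). The DNT values are synthetic and invented purely for illustration; they are not study data.

```python
import random
import statistics


def bootstrap_median_diff_ci(negative, positive, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for median(negative) - median(positive).

    Assumption: this is one common way to obtain such a CI, not
    necessarily the method used in the study.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed
        neg = rng.choices(negative, k=len(negative))
        pos = rng.choices(positive, k=len(positive))
        diffs.append(statistics.median(neg) - statistics.median(pos))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def intervals_overlap(ci_a, ci_b):
    """Return True if two closed intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# Synthetic DNT values in minutes (hypothetical, for illustration only)
routine_negative = [4000, 5200, 6100, 7000, 8300, 9100]
routine_positive = [3900, 5000, 6500, 7700, 8000]
ai_negative = [4100, 5300, 6000, 7200, 8400, 9000]
ai_positive = [40, 62, 87, 120, 300]

ci_routine = bootstrap_median_diff_ci(routine_negative, routine_positive)
ci_ai = bootstrap_median_diff_ci(ai_negative, ai_positive)

print("Routine workflow CI:", ci_routine)
print("AI prioritization CI:", ci_ai)
print("Intervals overlap:", intervals_overlap(ci_routine, ci_ai))
```

Comparing the positive-versus-negative difference within each period, rather than raw DNTs across periods, keeps period-level factors from being mistaken for an effect of prioritization.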
of IPE on venous CT scans, which can be considered more challenging, our sensitivity and specificity were comparable with or higher than studies identifying PE on CTPAs. Furthermore, retrospective studies on diagnostic accuracy might not determine the real clinical impact on patient care (18). Schmuelling et al (19) evaluated the clinical implementation of AI software for prioritization of positive CTPAs in the emergency setting; the authors showed good diagnostic accuracy, with 79.6% sensitivity and 95.0% specificity. However, there was no significant reduction in report communication times. This is likely related to the overall short TATs of examinations in an emergency department. Tools to prioritize the reading worklist would provide the most benefit in clinical settings with a high workload and a backlog of unreported examinations, as in our situation. The shorter time to detection of IPE in our study has limited generalizability to practices with short report TATs.
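Why worklist prioritization helps most when there is a backlog can be seen in a small queue model: a flagged examination moves ahead of every unflagged one, however long the backlog is. The sketch below is a minimal illustration using a priority queue. The class and field names are hypothetical, and it does not describe how the evaluated software or any PACS worklist is implemented.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class WorklistItem:
    priority: int   # 0 = AI-flagged, 1 = routine (lower is read first)
    arrival: int    # tie-breaker: earlier arrivals are read first
    accession: str = field(compare=False)  # hypothetical exam identifier


class ReadingWorklist:
    """Toy reading worklist in which AI-flagged examinations are read first."""

    def __init__(self):
        self._heap = []
        self._arrivals = count()

    def add(self, accession, ai_flagged=False):
        priority = 0 if ai_flagged else 1
        item = WorklistItem(priority, next(self._arrivals), accession)
        heapq.heappush(self._heap, item)

    def next_exam(self):
        # Pop the highest-priority, earliest-arrived examination
        return heapq.heappop(self._heap).accession

    def __len__(self):
        return len(self._heap)


worklist = ReadingWorklist()
worklist.add("CT-001")
worklist.add("CT-002")
worklist.add("CT-003", ai_flagged=True)  # flagged exam arrives behind a backlog
worklist.add("CT-004")

reading_order = [worklist.next_exam() for _ in range(len(worklist))]
print(reading_order)
```

In a setting where every examination is read within minutes, the same reordering would change little, which matches the limited generalizability to practices with short report TATs noted above.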
PE is one of the diagnoses that is most commonly missed or delayed by physicians (20). Detection of IPE on routine contrast-enhanced chest CT scans can be especially challenging when the IPE is small and isolated. When analyzing historical CT data, we found that 44.8% of IPEs were missed by radiologists. Low detection rates of IPE by radiologists have also been reported in other studies (21,22). Wildman-Tobriner et al (23) applied a different AI algorithm to retrospectively analyze 11 913 CT examinations for undiagnosed IPE and found 49 missed IPEs (0.41%), leading to a missed rate of 38% (49 of 128 IPEs). In our study, we prospectively evaluated the effect of AI assistance on missed IPEs. The number of missed IPEs was reduced to one scan of 38 (2.6%), thereby demonstrating that AI software can assist radiologists to significantly improve the detection rate of IPE.
The clinical relevance and proper management of IPE remain a subject of debate. It is well known that venous thromboembolism in oncology patients is associated with high morbidity and mortality (24). Observational studies suggest that the prognosis of IPE is similar to that of symptomatic PE with regard to the risk of recurrence and mortality (25). Consequently, treatment guidelines for IPE are similar to those in symptomatic PE (26). Radiologic findings, such as thrombus load and central location, have been associated with adverse clinical outcomes in acute PE (27). These findings can also help determine IPE severity (4). In our study, 37.8% (54 of 143) of IPE-positive scans showed emboli in the main or lobar pulmonary arteries. Therefore, the benefit of AI-based worklist prioritization for timely assessment and treatment is most evident in these patients. The majority of physicians also treat smaller, more distal incidental emboli in patients with cancer (28). In our study, most IPEs that were missed by radiologists but detected by the AI software were segmental (28 of 48 [58.3%]) or subsegmental (12 of 48 [25%]). We must therefore consider the risk of overdiagnosis, specifically of isolated subsegmental IPE, which if left untreated would cause no more harm than treatment complications (29). To our knowledge, no randomized controlled trials have assessed the effectiveness of anticoagulation therapy in patients with subsegmental PE (30). However, recent studies support the use of anticoagulation therapy for subsegmental PE in oncology patients (1,31). Further studies are needed to assess the relevance of diagnosing and treating small incidental emboli.
consisted solely of oncology patients. The frequency of IPE is likely lower in the general patient population, which might impact the clinical relevance of the AI software. Third, the study design was not randomized. Although patient volumes in all time periods were similar and unaffected by the COVID-19 pandemic, TATs in the radiology department can be affected by many factors, such as staffing levels. To account for variations between periods, we calculated the time differences between positive and negative IPE examinations within each period separately and compared CIs of the difference among the periods. Fourth, statistical analysis of the time variables assumed independence of examinations; however, the study included multiple examinations per patient. Fifth, the study focused on diagnostic efficacy; we did not evaluate the value on patient outcomes and cost-effectiveness. Future studies should investigate the effect of early diagnosis of IPE on morbidity and mortality.
In conclusion, we demonstrated that commercially available AI software had high diagnostic accuracy in the detection of IPE on chest CT scans in patients with cancer and was effective in significantly reducing the time to diagnosis of positive examinations compared with the routine workflow in a setting with a backlog of unreported scans.
Author contributions: Guarantor of integrity of entire study, L.T.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; agrees to ensure any questions related to the work are appropriately resolved, all authors; literature research, L.T., E.R.R., J.J.V.; clinical studies, L.T., A.B.R., A.N., J.J.V.; statistical analysis, L.T., R.M., J.J.V.; and manuscript editing, L.T., E.R.R., A.B.R., R.M., R.G.H.B.T., J.J.V.
Data sharing: Data generated or analyzed during the study are available from the corresponding author by request.
Disclosures of conflicts of interest: L.T. No relevant relationships. E.R.R. No relevant relationships. A.B.R. No relevant relationships. A.N. No relevant relationships. R.M. No relevant relationships. R.G.H.B.T. No relevant relationships. J.J.V. Grant to institution from Qure.ai; consulting fees from Tegus; payment to institution for lectures from Roche; travel grant from Qure.ai; participation on a data safety monitoring board or advisory board from Contextflow, Noaber Foundation, and NLC Ventures; leadership or fiduciary role on the steering committee of the PINPOINT Project (payment to institution from AstraZeneca) and RSNA Common Data Elements Steering Committee (unpaid); phantom shares in Contextflow and Quibim.
7. Ranschaert E, Topff L, Pianykh O. Optimization of radiology workflow with artificial intelligence. Radiol Clin North Am 2021;59(6):955–966.
8. O’Neill TJ, Xi Y, Stehel E, et al. Active reprioritization of the reading worklist using artificial intelligence has a beneficial effect on the turnaround time for interpretation of head CT with intracranial hemorrhage. Radiol Artif Intell 2020;3(2):e200024.
9. Baltruschat I, Steinmeister L, Nickisch H, et al. Smart chest x-ray worklist prioritization using artificial intelligence: a clinical workflow simulation. Eur Radiol 2021;31(6):3837–3845.
10. Annarumma M, Withey SJ, Bakewell RJ, Pesce E, Goh V, Montana G. Automated triaging of adult chest radiographs with deep artificial neural networks. Radiology 2019;291(1):196–202.
11. Arbabshirani MR, Fornwalt BK, Mongelluzzo GJ, et al. Advanced machine learning in action: identification of intracranial hemorrhage on computed tomography scans of the head with clinical workflow integration. NPJ Digit Med 2018;1(1):9.
12. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2020. https://www.R-project.org.
13. Huang SC, Kothari T, Banerjee I, et al. PENet—a scalable deep-learning model for automated diagnosis of pulmonary embolism using volumetric CT imaging. NPJ Digit Med 2020;3(1):61. [Published correction appears in NPJ Digit Med 2020;3:102.]
14. Weikert T, Winkel DJ, Bremerich J, et al. Automated detection of pulmonary embolism in CT pulmonary angiograms using an AI-powered algorithm. Eur Radiol 2020;30(12):6545–6553.
15. Buls N, Watté N, Nieboer K, Ilsen B, de Mey J. Performance of an artificial intelligence tool with real-time clinical workflow integration—detection of intracranial hemorrhage and pulmonary embolism. Phys Med 2021;83:154–160.
16. Liu W, Liu M, Guo X, et al. Evaluation of acute pulmonary embolism and clot burden on CTPA with deep learning. Eur Radiol 2020;30(6):3567–3575.
17. Cheikh AB, Gorincour G, Nivet H, et al. How artificial intelligence improves radiological interpretation in suspected pulmonary embolism. Eur Radiol 2022;32(9):5831–5842.
18. Soffer S, Klang E, Shimon O, et al. Deep learning for pulmonary embolism detection on computed tomography pulmonary angiogram: a systematic review and meta-analysis. Sci Rep 2021;11(1):15814.
19. Schmuelling L, Franzeck FC, Nickel CH, et al. Deep learning-based automated detection of pulmonary embolism on CT pulmonary angiograms: no significant effects on report communication times and patient turnaround in the emergency department nine months after technical implementation. Eur J Radiol 2021;141:109816.
20. Schiff GD, Hasan O, Kim S, et al. Diagnostic error in medicine: analysis of 583 physician-reported errors. Arch Intern Med 2009;169(20):1881–1887.
21. Gladish GW, Choe DH, Marom EM, Sabloff BS, Broemeling LD, Munden RF. Incidental pulmonary emboli in oncology patients: prevalence, CT evaluation, and natural history. Radiology 2006;240(1):246–255.
22. Deniz MA, Deniz ZT, Adin ME, et al. Detection of incidental pulmonary embolism with multi-slice computed tomography in cancer patients. Clin Imaging 2017;41:106–111.
23. Wildman-Tobriner B, Ngo L, Mammarappallil JG, Konkel B, Johnson JM, Bashir MR. Missed incidental pulmonary embolism: harnessing artificial intelligence to assess prevalence and improve quality improvement opportunities. J Am Coll Radiol 2021;18(7):992–999.
24. Schmaier AA, Ambesh P, Campia U. Venous thromboembolism and cancer. Curr Cardiol Rep 2018;20(10):89.
25. Klok FA, Huisman MV. Management of incidental pulmonary embolism. Eur Respir J 2017;49(6):1700275.
26. Mulder FI, Di Nisio M, Ay C, et al. Clinical implications of incidental venous thromboembolism in cancer patients. Eur Respir J 2020;55(2):1901697.
27. Meinel FG, Nance JW Jr, Schoepf UJ, et al. Predictive value of computed tomography in acute pulmonary embolism: systematic review and meta-analysis. Am J Med 2015;128(7):747–59.e2.
28. den Exter PL, van Roosmalen MJG, van den Hoven P, et al. Physicians’ management approach to an incidental pulmonary embolism: an international survey. J Thromb Haemost 2013;11(1):208–213.
29. Dobler CC. Overdiagnosis of pulmonary embolism: definition, causes and implications. Breathe (Sheff) 2019;15(1):46–53.
30. Yoo HH, Nunes-Nogueira VS, Fortes Villas Boas PJ. Anticoagulant treatment for subsegmental pulmonary embolism. Cochrane Database Syst Rev 2020;2(2):CD010222.
31. Yan M, Kieser R, Wu CC, Qiao W, Rojas-Hernandez CM. Clinical factors and outcomes of subsegmental pulmonary embolism in cancer patients. Blood Adv 2021;5(4):1050–1058.
32. U.S. Food & Drug Administration. K201020. https://www.accessdata.fda.gov/cdrh_docs/pdf20/K201020.pdf. Published August 26, 2020. Accessed May 6, 2022.