Clinical Radiology
journal homepage: www.clinicalradiologyonline.net
Article information
Article history: Received 1 March 2019; Accepted 14 August 2019

AIM: To test the diagnostic performance of a deep learning-based system for the detection of clinically significant pulmonary nodules/masses on chest radiographs.
MATERIALS AND METHODS: In a retrospective study of 100 patients (47 with clinically significant pulmonary nodules/masses and 53 control subjects without pulmonary nodules), two radiologists verified clinically significant pulmonary nodules/masses according to chest computed tomography (CT) findings. A computer-aided diagnosis (CAD) software package using a deep-learning approach was used to detect pulmonary nodules/masses and to determine the diagnostic performance of four algorithms (heat map, abnormal probability, nodule probability, and mass probability).
RESULTS: A total of 100 cases were included in the analysis. Among the four algorithms, the mass probability algorithm achieved 76.6% sensitivity (36/47, 11 false negatives) and 88.68% specificity (47/53, six false positives) in the detection of pulmonary nodules/masses at the optimal probability score cut-off of 0.2884. Compared to the other three algorithms, the mass probability algorithm had the best predictive ability for pulmonary nodule/mass detection at the optimal probability score cut-off of 0.2884 (AUC_Mass: 0.916 versus AUC_Heat map: 0.682, p<0.001; AUC_Mass: 0.916 versus AUC_Abnormal: 0.810, p=0.002; AUC_Mass: 0.916 versus AUC_Nodule: 0.813, p=0.014).
CONCLUSION: The deep-learning based computer-aided diagnosis system will likely play a vital role in the early detection and diagnosis of pulmonary nodules/masses on chest radiographs. In future applications, these algorithms could support triage workflow via double reading to improve sensitivity and specificity during the diagnostic process.
© 2019 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
* Guarantor and correspondence: F.-Z. Wu, Department of Radiology, Kaohsiung Veterans General Hospital, Kaohsiung, Taiwan. Tel.: +886 985 330160.
E-mail address: cmvwu1029@gmail.com (F.-Z. Wu).
https://doi.org/10.1016/j.crad.2019.08.005
0009-9260/© 2019 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
C.-H. Liang et al. / Clinical Radiology 75 (2020) 38–45
external validation test datasets based on the reference standard from CT images, the predictive ability and cut-off values for each algorithm in the prediction of pulmonary nodules/masses were assessed using area under the receiver operating characteristic (AUROC) curves. An AUROC between 0.7 and 0.9 was regarded as moderate accuracy according to Greiner et al.23 The Youden index and the discriminant ability at each cut-off value for the four algorithms were used to determine the optimal cut-off value to diagnose pulmonary nodules/masses. Cross-tables, sensitivity, specificity, positive likelihood ratio (positive LR), negative likelihood ratio (negative LR), positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy were determined from the optimal cut-off value given by the Youden index for the different algorithm models in pulmonary nodule or mass detection.

To determine and compare the diagnostic performance of the four AI algorithms in pulmonary nodule or mass detection, the optimal diagnostic cut-off value of each algorithm was determined using the receiver operating characteristic (ROC) curve, with the Youden index maximising the overall diagnostic accuracy. Comparison of the ROC curves was performed using the method described by DeLong and colleagues.24 A p-value of <0.05 was considered significant.

Results

Demographics and clinical characteristics

A total of 100 patients with 100 chest radiographs were enrolled and summarised in Table 1. There were 47 patients with clinically significant pulmonary nodules/masses and 53 patients with negative findings. The mean age was 55.07 ± 13.80 years and 54 (54%) patients were men. Among the 100 chest radiographs, 72% were produced using DR and the rest using CR. The average processing time per case was 94.07 ± 16.54 seconds, with a maximum of 133 seconds. For imaging processing time using AI, the mean processing time of CR was significantly longer than that of DR (116.85 ± 12.27 versus 85.2 ± 6.33 seconds). Of the 47 pulmonary nodules or masses, 39 (82.97%) were solid nodules and eight (17.02%) were part-solid nodules. The mean nodule size was 4.37 ± 0.41 cm (range 0.7–13.5 cm).

Table 1
Baseline characteristics of 100 study subjects.

Characteristic                           Value                       p-Value
Age (years)                              55.07 ± 13.80
Gender (male)                            54 (54%)
Chest radiograph DICOM modality
  CR                                     28
  DR                                     72
Processing time (seconds)                                            <0.001a
  CR                                     116.85 ± 12.27
  DR                                     85.2 ± 6.33
Positive pulmonary nodule/mass           47 (47%)
Nodular size (cm)                        4.37 ± 0.41 (0.7–13.5)
Nodular type
  Solid nodule                           39
  Part-solid nodule                      8

DICOM, digital imaging and communications in medicine; CR, computed radiography; DR, digital radiography.
a For the imaging processing time per case, the mean processing time of the CR modality was significantly longer than that of the DR modality (116.85 ± 12.27 versus 85.2 ± 6.33 seconds).

The cross-tables for the best-performing models for pulmonary nodule/mass detection, including the heat map algorithm, abnormal probability algorithm, nodule probability algorithm, and mass probability algorithm, are provided in Fig 1.

Table 2 shows the sensitivity, specificity, diagnostic accuracy, negative predictive value (NPV), positive predictive value (PPV), likelihood ratio (LR)(+), and LR(−) values of the four algorithms of the QUIBIM Chest X-ray Classifier at the optimal threshold of probability score for pulmonary nodule/mass detection. The sensitivity of the heat map algorithm was 38.3% and the specificity was 98.11% for identifying the most abnormal region. The sensitivity of the abnormal probability algorithm was 74.47% and the specificity was 81.13% for pulmonary nodule/mass detection at the optimal probability score cut-off of 0.4116. The sensitivity of the nodule probability algorithm was 85.11% and the specificity was 64.15% at the optimal probability score cut-off of 0.2879. The sensitivity of the mass probability algorithm was 76.6% and the specificity was 88.68% at the optimal probability score cut-off of 0.2884. Among the four algorithms, the nodule probability algorithm was the most sensitive, whereas the heat map algorithm was the most specific.

The areas under the ROC curves for pulmonary nodule detection were 0.682 (95% confidence interval [CI] 0.581–0.772) for the heat map algorithm, 0.810 (95% CI 0.719–0.882) for the abnormal probability algorithm, 0.813 (95% CI 0.723–0.884) for the nodule probability algorithm, and 0.916 (95% CI 0.844–0.962) for the mass probability algorithm (Fig 2). Compared to the other three algorithms, the mass probability algorithm had the best predictive ability for pulmonary nodule detection at the optimal probability score cut-off of 0.2884 (AUC_Mass: 0.916 versus AUC_Heat map: 0.682, p<0.001; AUC_Mass: 0.916 versus AUC_Abnormal: 0.810, p=0.002; AUC_Mass: 0.916 versus AUC_Nodule: 0.813, p=0.014).

Subgroup analysis of detected findings using the four algorithms

The detailed distribution of detected nodular diameter and type according to the four algorithms is displayed in Table 3. The mass probability algorithm detected pulmonary nodules/masses with an average diameter of 4.872 ± 2.894 cm, and the heat map algorithm detected and localised pulmonary nodules correctly with an average diameter of 6.067 ± 3.029 cm; however, the ability to detect part-solid nodules was relatively weak compared to solid nodules for these algorithms. Nodules detected by these algorithms were usually larger than undetected nodules.
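The CR-versus-DR processing-time comparison in the Table 1 footnote can be reproduced from the summary statistics alone. A minimal sketch using Welch's unequal-variance t-test (the group sizes n_CR = 28 and n_DR = 72 come from Table 1; the paper does not state which test it used, so Welch's test is an assumption here):

```python
import math

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from per-group summary statistics (mean, SD, n)."""
    se1_sq = sd1 ** 2 / n1
    se2_sq = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df

# CR: 116.85 +/- 12.27 s (n=28); DR: 85.2 +/- 6.33 s (n=72), from Table 1
t, df = welch_t_from_summary(116.85, 12.27, 28, 85.2, 6.33, 72)
print(f"t = {t:.2f}, df = {df:.1f}")  # t = 12.99, df = 32.7
```

With t of roughly 13 on roughly 33 degrees of freedom, the two-sided p-value is far below 0.001, consistent with the footnote.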
C.-H. Liang et al. / Clinical Radiology 75 (2020) 38e45 41
Figure 1 Flowchart of the 100 consecutive patients and retrospective assessment using the deep-learning Chest X-ray Classifier. Cross-tables for
the best-performing models for pulmonary nodule/mass detection, including the heat map algorithm, abnormal probability algorithm, nodule
probability algorithm, and mass probability algorithm.
Table 2
ROC analysis results at the threshold to maximise sensitivity and specificity in pulmonary nodule detection across different algorithm models.
Algorithm model Cut-off ROC Sensitivity Specificity Positive LR Negative LR PPV % NPV % Accuracy %
Heat map Identify lesion 0.682 38.30 98.11 20.30 0.63 94.73% 64.19% 70%
Abnormal probability 0.4116 0.810 74.47 81.13 3.95 0.31 77.78% 78.20% 78%
Nodule probability 0.2879 0.813 85.11 64.15 2.37 0.23 67.80% 82.90% 74%
Mass probability 0.2884 0.916 76.60 88.68 6.77 0.26 85.70% 81.00% 83%
ROC, receiver operating characteristic; LR, likelihood ratio; PPV, positive predictive value; NPV, negative predictive value.
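The operating points in Table 2 are chosen by maximising the Youden index (J = sensitivity + specificity − 1), and the remaining columns follow from the resulting 2×2 cross-table. A minimal sketch; the cross-table counts for the mass probability algorithm (TP = 36, FN = 11, FP = 6, TN = 47) are taken from the Results, while the score arrays in `youden_cutoff` would be the per-patient classifier outputs, which are not published:

```python
def youden_cutoff(y_true, scores):
    """Return the score threshold that maximises Youden's J = sens + spec - 1,
    classifying a case as positive when its score >= threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_t

def diagnostics(tp, fn, fp, tn):
    """Standard diagnostic metrics from a 2x2 cross-table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Mass probability algorithm at cut-off 0.2884: 36/47 true positives and
# 47/53 true negatives, from the Results section
d = diagnostics(tp=36, fn=11, fp=6, tn=47)
print(f"sens {d['sensitivity']:.1%}, spec {d['specificity']:.2%}, "
      f"LR+ {d['lr_pos']:.2f}, LR- {d['lr_neg']:.2f}")
```

These counts reproduce the mass-probability row of Table 2: sensitivity 76.6%, specificity 88.68%, PPV 85.7%, NPV 81.0%, LR+ 6.77, LR− 0.26, accuracy 83%.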
Figure 2 Comparison of ROC curves for the four algorithms. Compared to the other three algorithms, the mass probability algorithm had the best predictive ability for pulmonary nodule/mass detection at the optimal probability score cut-off of 0.2884 (AUC_Mass: 0.916 versus AUC_Heat map: 0.682, p<0.001; AUC_Mass: 0.916 versus AUC_Abnormal: 0.810, p=0.002; AUC_Mass: 0.916 versus AUC_Nodule: 0.813, p=0.014).
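The pairwise AUC comparisons quoted above rely on DeLong's nonparametric test.24 A compact sketch of that test in its structural-components form; the patient-level scores needed to reproduce the paper's exact p-values are not published, so the demonstration data below are synthetic:

```python
import math
import numpy as np

def delong_test(y, scores_a, scores_b):
    """Paired DeLong test for the difference between two correlated AUCs
    measured on the same cases. Returns (auc_a, auc_b, z, two_sided_p)."""
    y = np.asarray(y)
    pos, neg = (y == 1), (y == 0)

    def structural_components(s):
        s = np.asarray(s, dtype=float)
        # psi = 1 if the positive case outranks the negative, 0.5 on ties
        diff = s[pos][:, None] - s[neg][None, :]
        psi = (diff > 0).astype(float) + 0.5 * (diff == 0)
        # V10: mean over negatives (one value per positive);
        # V01: mean over positives (one value per negative)
        return psi.mean(axis=1), psi.mean(axis=0)

    v10a, v01a = structural_components(scores_a)
    v10b, v01b = structural_components(scores_b)
    auc_a, auc_b = v10a.mean(), v10b.mean()
    m, n = pos.sum(), neg.sum()
    s10 = np.cov(np.vstack([v10a, v10b]))  # 2x2 covariance, ddof=1
    s01 = np.cov(np.vstack([v01a, v01b]))
    var = (s10[0, 0] + s10[1, 1] - 2 * s10[0, 1]) / m \
        + (s01[0, 0] + s01[1, 1] - 2 * s01[0, 1]) / n
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p

if __name__ == "__main__":
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    a = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]  # perfect ranking
    b = [0.2, 0.8, 0.3, 0.7, 0.4, 0.9, 0.1, 0.6]  # chance-level ranking
    print(delong_test(y, a, b))  # AUCs 1.0 and 0.5; z = 2.0, p ≈ 0.046
```

Because the test compares the same patients under two scoring algorithms, the covariance terms matter: ignoring them (as an unpaired comparison would) generally overstates the variance of the AUC difference.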
Diagnostic performance according to nodule size across the different algorithm models is summarised in Electronic Supplementary Material Table S1. For the four algorithm models, the Chest X-ray Classifier software appears to have superior diagnostic accuracy for pulmonary nodules ≥3 cm than for pulmonary nodules <3 cm. Comparison of the
Figure 3 A solid nodule of 2.3 cm, diagnosed as lung cancer, in the left middle lung field, properly detected by the heat map algorithm with an abnormality score of 0.67.
Figure 4 A faint nodule missed by the heat map algorithm (false negative), diagnosed as pulmonary adenocarcinoma manifesting as a 2.9 cm part-solid right upper lobe nodule as demonstrated at CT; however, it was properly detected with an abnormality score of 0.48.
Figure 5 A faint nodule missed by both classifiers (heat map and abnormal probability score), diagnosed as pulmonary adenocarcinoma manifesting as a 1.6 cm part-solid right lower lobe nodule as demonstrated at CT.
Figure 6 A false-positive nodule called by both AI algorithm classifiers (nodule and abnormality score). The chest radiograph of this 73-year-old woman showed increased lung markings in both lower lung fields, which were diagnosed as a normal chest finding at CT.

performance in a retrospective setting. Future work will aim to drive implementation of deep learning to aid radiologists in detecting lung nodules in real time. Third, previous studies have demonstrated blind spots in chest radiographs, which have been shown to contribute to detection and interpretation errors.2 Further work aiming to investigate the diagnostic performance of deep learning for blind spots in chest radiography is warranted. Fourth, the heat map algorithm, which focused on the automatic identification and localisation of pulmonary nodules/masses, has a good PPV (94.73%) and is highly specific (98.11%), but has poor sensitivity (38.3%). This algorithm can detect pulmonary nodules with an average diameter of 6.067 ± 3.029 cm. Therefore, reliable identification and localisation of smaller pulmonary nodules with deep learning is critical to clinical implementation in real-world practice. Finally, the present study included chest radiographs generated by both DR and CR technologies, and the processing time of CR was found to be much longer than that of DR. This may be attributed to differences in the principles of image processing between CR and DR. In the future, the diagnostic accuracy of convolutional neural networks between CR and DR should be investigated.

In conclusion, deep-learning based CAD systems will likely play a vital role in the early detection and diagnosis of pulmonary nodules/masses on chest radiographs. In future applications, these algorithms could support triage workflow with double reading to improve sensitivity and specificity during the diagnostic process.

Conflicts of interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Fabio Garcia-Castro and Angel Alberich-Bayarri are founders of the spin-off company QUIBIM SL. The other authors declare that they have no competing interests.

Acknowledgements

This study was supported by grants from Kaohsiung Veterans General Hospital, Taiwan, R.O.C. (nos. VGHKS103-015, VGHKS104-048, VGHKS105-064, VGHKS108-159, MOST108-2314-B-075B-008-).

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.crad.2019.08.005.

References

1. Brogdon BG, Kelsey CA, Moseley Jr RD. Factors affecting perception of pulmonary lesions. Radiol Clin N Am 1983;21(4):633–54.
2. de Groot PM, Carter BW, Abbott GF, et al. Pitfalls in chest radiographic interpretation: blind spots. Semin Roentgenol 2015;50(3):197–209.
3. Aberle DR, Adams AM, Berg CD, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 2011;365(5):395–409.
4. Wu FZ, Chen PA, Wu CC, et al. Semiquantative visual assessment of sub-solid pulmonary nodules ≤3 cm in differentiation of lung adenocarcinoma spectrum. Sci Rep 2017;7(1):15790.
5. Hsu HT, Tang EK, Wu MT, et al. Modified Lung-RADS improves performance of screening LDCT in a population with high prevalence of non-smoking-related lung cancer. Acad Radiol 2018;25(10):1240–51.
6. Wu FZ, Huang YL, Wu CC, et al. Assessment of selection criteria for low-dose lung screening CT among Asian ethnic groups in Taiwan: from mass screening to specific risk-based screening for non-smoker lung cancer. Clin Lung Cancer 2016;17(5):e45–56.
7. Bhargavan M, Kaye AH, Forman HP, et al. Workload of radiologists in United States in 2006–2007 and trends since 1991–1992. Radiology 2009;252(2):458–67.
8. Levin DC, Rao VM, Parker L, et al. Analysis of radiologists' imaging workload trends by place of service. J Am Coll Radiol 2013;10(10):760–3.
9. Brady AP. Error and discrepancy in radiology: inevitable or avoidable? Insights Imaging 2017;8(1):171–82.
10. Forrest JV, Friedman PJ. Radiologic errors in patients with lung cancer. West J Med 1981;134(6):485–90.
11. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 2016;35(5):1285–98.
12. Novikov AA, Lenis D, Major D, et al. Fully convolutional architectures for multiclass segmentation in chest radiographs. IEEE Trans Med Imaging 2018;37(8):1865–76.
13. Cicero M, Bilbily A, Colak E, et al. Training and validating a deep convolutional neural network for computer-aided detection and classification of abnormalities on frontal chest radiographs. Invest Radiol 2017;52(5):281–7.
14. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 2017;284(2):574–82.
15. Liu V, Clark MP, Mendoza M, et al. Automated identification of pneumonia in chest radiograph reports in critically ill patients. BMC Med Inform Decis Mak 2013;13(1):90.
16. Hua K-L, Hsu C-H, Hidayati SC, et al. Computer-aided classification of lung nodules on computed tomography images via deep learning technique. OncoTargets Ther 2015;8:2015–22.
17. Zhang W, Li R, Deng H, et al. Deep convolutional neural networks for multi-modality isointense infant brain image segmentation. NeuroImage 2015;108:214–24.
18. Cheng J-Z, Chen C-M, Shen D. Chapter 9: deep learning techniques on texture analysis of chest and breast images. In: Depeursinge A, Al-Kadi OS, Mitchell JR, editors. Biomedical texture analysis. London: Academic Press; 2017. p. 247–79.
19. Nam JG, Park S, Hwang EJ, et al. Development and validation of deep learning-based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology 2019;290(1):218–28.
20. Baldwin DR, Callister MEJ. The British Thoracic Society guidelines on the investigation and management of pulmonary nodules. Thorax 2015;70(8):794.
21. Rajpurkar P, Irvin J, Ball RL, et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med 2018;15(11):e1002686.
22. Wang X, Peng Y, Lu L, et al. ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI; 2017. p. 3462–71. https://doi.org/10.1109/CVPR.2017.369.
23. Greiner M, Pfeiffer D, Smith RD. Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests. Prev Vet Med 2000;45(1–2):23–41.
24. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44(3):837–45.
25. Lee J-G, Jun S, Cho Y-W, et al. Deep learning in medical imaging: general overview. Korean J Radiol 2017;18(4):570–84.
26. Choy G, Khalilzadeh O, Michalski M, et al. Current applications and future impact of machine learning in radiology. Radiology 2018;288(2):318–28.
27. Ribli D, Horváth A, Unger Z, et al. Detecting and classifying lesions in mammograms with deep learning. Sci Rep 2018;8(1):4165.
28. Geijer H, Geijer M. Added value of double reading in diagnostic radiology, a systematic review. Insights Imaging 2018;9(3):287–301.
29. Ciatto S, Del Turco MR, Burke P, et al. Comparison of standard and double reading and computer-aided detection (CAD) of interval cancers at prior negative screening mammograms: blind review. Br J Cancer 2003;89(9):1645–9.
30. del Ciello A, Franchi P, Contegiacomo A, et al. Missed lung cancer: when, where, and why? Diagn Interv Radiol 2017;23(2):118–26.