ARTICLE INFO

Keywords:
Deep learning
Prostate
Prostate neoplasms
Prostate imaging reporting and data system
Magnetic resonance imaging

ABSTRACT

Purpose: To compare the performance of lesion detection and Prostate Imaging-Reporting and Data System (PI-RADS) classification between a deep learning-based algorithm (DLA), clinical reports, and radiologists with different levels of experience in prostate MRI.

Methods: This retrospective study included 121 patients who underwent prebiopsy MRI and prostate biopsy. More than five radiologists (Reader groups 1, 2: residents; Readers 3, 4: less-experienced radiologists; Reader 5: expert) independently reviewed biparametric MRI (bpMRI). The DLA results were obtained using bpMRI. The reference standard was based on pathologic reports. The diagnostic performance of the PI-RADS classification by the DLA, clinical reports, and radiologists was analyzed using the area under the receiver operating characteristic curve (AUROC). Dichotomous analysis (PI-RADS cutoff value ≥ 3 or ≥ 4) was performed, and sensitivities and specificities were compared using McNemar's test.

Results: Clinically significant cancer (CSC, Gleason score ≥ 7) was confirmed in 43 patients (35.5%). The AUROC of the DLA (0.828) for diagnosing CSC was significantly higher than that of Reader 1 (AUROC, 0.706; p = 0.011), significantly lower than that of Reader 5 (AUROC, 0.914; p = 0.013), and similar to those of the clinical reports and the other readers (p = 0.060–0.661). The sensitivity of the DLA (76.7%) was comparable to those of all readers and the clinical reports at a PI-RADS cutoff value ≥ 4. The specificity of the DLA (85.9%) was significantly higher than those of the clinical reports and Readers 2–3 and comparable to all others at a PI-RADS cutoff value ≥ 4.
Abbreviations: AI, artificial intelligence; bpMRI, biparametric MRI; CSC, clinically significant prostate cancer; DLA, deep learning-based algorithm; mpMRI,
multiparametric MRI; PI-RADS, Prostate Imaging-Reporting and Data System; PSA, prostate-specific antigen; TRUS, transrectal ultrasonography.
* Corresponding author at: Department of Radiology, Eunpyeong St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, 1021, Tongil-ro,
Eunpyeong-gu, Seoul 03312, Republic of Korea.
E-mail addresses: cmh@catholic.ac.kr (M.H. Choi), rhdwngkwk333@naver.com (D.H. Kim), yjleerad@catholic.ac.kr (Y.J. Lee), henkjan.huisman@radboudumc.nl
(H. Huisman), tobias.penzkofer@charite.de (T. Penzkofer), shabunin@pateroclinic.ru (I. Shabunin), davidjean.winkel@usb.ch (D.J. Winkel), 746992685@qq.com
(P. Xing), dieter.szolar@diagnostikum-graz.at (D. Szolar), robertgrimm@siemens-healthineers.com (R. Grimm), heinrich.von_busch@siemens-healthineers.com
(H. von Busch), yohan.son@siemens-healthineers.com (Y. Son), bin.lou@siemens-healthineers.com (B. Lou), ali.kamen@siemens-healthineers.com (A. Kamen).
https://doi.org/10.1016/j.ejrad.2021.109894
Received 2 May 2021; Received in revised form 30 June 2021; Accepted 1 August 2021
Available online 5 August 2021
0720-048X/© 2021 Elsevier B.V. All rights reserved.
S.Y. Youn et al. European Journal of Radiology 142 (2021) 109894
Conclusions: The DLA showed moderate diagnostic performance, at a level between those of residents and an expert, in detecting lesions and classifying them according to PI-RADS. The performance of the DLA was similar to that of clinical reports from various radiologists in clinical practice.
information on age, prostate-specific antigen (PSA) level, use of a 5α-reductase inhibitor, the clinical report of prostate MRI, biopsy results, the number of previous TRUS-guided biopsies before MRI, the interval between MRI and biopsy, and, for patients who underwent prostatectomy, the prostatectomy result and the interval between MRI and prostatectomy were collected from electronic medical records.

2.2. MR imaging techniques

The MRI examinations in this study consisted of 58 biparametric MRIs (bpMRIs) and 63 multiparametric MRIs (mpMRIs) acquired using a pelvic phased-array coil. The following sequences were acquired with the following parameters: axial, sagittal, and coronal T2-weighted images (T2WIs): repetition time [TR] > 3,200 ms; echo time [TE], 80–100 ms; matrix, 320 × 320; slice thickness, 3 mm; field of view [FOV], 200–220 mm; axial diffusion-weighted images (DWIs) with b-values of 0, 50, 500, and 1,000 sec/mm2: slice thickness, 3 mm; matrix, 100 × 100; FOV, 200–220 mm. DWI with a b-value of 1,500 sec/mm2 was obtained in some patients, but apparent diffusion coefficient (ADC) maps were calculated from DWI with b-values of 50 and 1,000 sec/mm2 to maintain consistency in the imaging protocol.

2.3. DLA for prostate MRI

A non-commercially available, deep learning-based prototype software (Prostate AI version 1.2.1, build date 2019-11-27, Siemens Healthcare) was used. The algorithm was trained and validated with 2,170 bpMRIs from seven institutions, including ours [14]. Our institution provided 100 cases for the development of the DLA, and these cases were excluded from the current study. The DLA was designed to evaluate bpMRI using axial T2WI and DWI with high b-values to detect suspicious prostatic lesions and categorize the lesions according to PI-RADS v2. Only T2WI and DWI were loaded into the DLA, even in patients who underwent mpMRI. The software displays abnormal areas using a suspicion map on T2-weighted images and shows the segmented abnormal lesion. It also automatically produces a draft radiologic report containing text information about the PI-RADS classification, size, and location of the lesions, in order of higher PI-RADS score and larger size, for the user to review and edit. The DLA results provide PI-RADS scores and a snapshot of each of up to five lesions per patient. Detailed information on the DLA is provided in the Supplementary material.

2.4. Prostate MRI review by radiologists

Two reader groups of radiology residents (Reader group 1, composed of four second-year residents, and Reader group 2, composed of four third-year residents) and three board-certified radiologists (Reader 3, 4.5 years of experience; Reader 4, 5 years of experience; and Reader 5, 10 years of experience in prostate imaging), all blinded to the biopsy results, independently reviewed bpMRI (three planes of T2WIs, axial DWIs [b = 0, 1,000 sec/mm2], and an ADC map). Given that the radiology residents were not familiar with prostate MRI, four residents with similar levels of experience in prostate MRI split the cases, and the integrated reviews of the four residents were considered a single reader review in Reader groups 1 and 2. Reader 5 had reviewed the MRIs more than four months before matching the other readers' MRI reviews to the pathology results. All readers recorded the number, location, and size of suspicious prostate lesions and the PI-RADS version 2 score. For later comparison with the reference standard, all detected prostate lesions (PI-RADS scores from 2 to 5) were captured on axial images with indicators on the picture archiving and communication system by all readers. The index lesion was determined by the highest PI-RADS score, or by the largest diameter if multiple lesions with the same PI-RADS score existed.

2.5. Clinical report as a routine interpretation

In the routine clinical process at our institution, six board-certified abdominal/genitourinary radiologists with at least six years of experience in prostate MRI reviewed prostate MRI and made the radiological reports. More than 1,000 prostate MRIs are performed each year at our institution. The number, size, location, PI-RADS score, and snapshot of each of up to five lesions in each patient report were collected. In some reports made before the release of PI-RADS v2, the conclusion did not follow the guidelines; these reports were interpreted according to PI-RADS v2 by a radiologist who was not aware of the biopsy results. An indeterminate conclusion was interpreted as PI-RADS 3, and a definite conclusion of prostate cancer was interpreted as PI-RADS 4 or 5 according to tumor size and extracapsular extension.

2.6. Reference standard for prostate cancer

All prostate cancers regardless of Gleason grade group, and CSCs defined as Gleason grade group 2 or higher (Gleason score ≥ 7 [3 + 4]), were used as reference standards. Reader 5 reviewed the pathologic results and the MRI results from the DLA and all other readers at least 4 months after Reader 5's own image review and determined whether the lesions were cancers based on the pathology. For patients who underwent systematic and targeted biopsy, the reader assessed whether prostate cancer was confirmed in the targeted lesion. If prostate cancer was confirmed in systematic cores other than the targeted cores, the reader matched the biopsy results to the focal lesion on prostate MRI. In patients who underwent prostatectomy, the schematic diagram of the histopathology map that depicted the cancer area, rather than whole-mount pathology, was the primary reference for prostate cancer location. When the Gleason score differed between biopsy and prostatectomy pathology in a patient, the prostatectomy result was used for evaluation.

2.7. Statistical analysis

The PI-RADS assessments from the DLA, clinical reports, and each radiologist were compared with those of Reader 5 (the most experienced radiologist) using weighted kappa statistics to analyze inter-reader agreement.

The diagnostic performance of per-patient PI-RADS scores for all readings was analyzed using the area under the receiver operating characteristic (ROC) curve (AUROC) in diagnosing all prostate cancers or CSCs. The ROC curves were compared using DeLong's method. We also performed dichotomous analysis of the diagnostic performance using either PI-RADS ≥ 3 or ≥ 4 as the cutoff value. Given that physicians dichotomously determine whether to conduct prostate biopsy, we thought that dichotomous analysis would be more helpful for understanding the findings intuitively. To calculate the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy in detecting prostate cancer, we defined true or false positives/negatives for an index lesion at the per-patient level. For example, a true positive means that a reader (DLA, clinical report, or radiologist) correctly detected prostate cancer at the same location and categorized the lesion with at least the PI-RADS cutoff value. Sensitivities and specificities of the clinical reports and radiologists were compared with those of the DLA using McNemar's test.

Statistical analysis was performed using SPSS software, version 23.0 (IBM, Armonk, NY, USA) and MedCalc version 19.2.0 (MedCalc Software, Mariakerke, Belgium). A p-value of < 0.05 was considered statistically significant. To compare diagnostic performance between the clinical reports, the five radiologists, and the DLA, p-values were multiplied by 6 according to the Bonferroni correction.

3. Results

3.1. Patient characteristics

Among the 121 patients (mean age 68.2 ± 8.5 years, range 47–85 years), 52 (43.0%) were diagnosed with prostate cancer, and 43 (35.5%) were confirmed to have CSC. The median prostate-specific antigen (PSA) level was 6.5 ng/mL (interquartile range 4.5–10.4 ng/mL). Twenty-three patients underwent radical prostatectomy. The demographic, clinical, and pathologic information of the patients is summarized in Table 1.

Table 1
Clinical characteristics of the 121 patients.

Mean age (range) (years): 68.2 ± 8.5 (47–85)
Median PSA (interquartile range) (ng/mL): 6.5 (4.5–10.4)
Use of 5α-reductase inhibitor: 4 (3.3%)
Number of previous TRUS-guided biopsies before MRI:
  None: 87 (71.9%)
  One time: 29 (24.0%)
  Two times: 3 (2.5%)
  Three times: 1 (0.8%)
  Four times: 1 (0.8%)
Median time interval between MRI and biopsy (interquartile range) (days): 17 (9–26)
Median time interval between MRI and prostatectomy (interquartile range) (days): 44 (31–67)
Pathologically proven prostate cancers: 52 (43.0%)
Pathologically proven CSCs: 43 (35.5%)
Maximum Gleason score:
  6 (3 + 3): 9 (17.3%)
  7 (3 + 4): 24 (42.3%)
  7 (4 + 3): 13 (23.1%)
  8 (4 + 4): 6 (11.5%)
  9 (4 + 5): 3 (5.8%)

CSC, clinically significant prostate cancer; PSA, prostate-specific antigen; TRUS, transrectal ultrasonography.

3.2. PI-RADS assessment and inter-reader agreement for DLA, clinical reports, and radiologists

The proportions of PI-RADS categories determined by the DLA, clinical reports, and radiologists varied. The detailed distribution of PI-RADS scores by the DLA, clinical reports, and Readers 1–5 is shown in Fig. 2. The DLA assigned 60.3%, 0.8%, 9.9%, and 27.3% of cases as PI-RADS 1, 3, 4, and 5, respectively. Inter-reader agreement for the PI-RADS score was moderate (κ, 0.461) between the DLA and Reader 5 and varied from poor to good between the other readers and Reader 5 (κ, 0.340, 0.457, 0.467, and 0.609 for Readers 1–4, respectively). Inter-reader agreement for the PI-RADS score between the clinical reports and Reader 5 was moderate (κ, 0.422).

3.3. Comparison of diagnostic performance using AUROC

For diagnosing all prostate cancers, the AUROC of the DLA was 0.808, which was significantly higher than those of Reader 1 (AUROC, 0.698; p = 0.031) and the clinical reports (AUROC, 0.687; p = 0.015) and similar to those of Readers 2–5 (AUROC, 0.786, 0.729, 0.862, and 0.874; p = 0.623, 0.101, 0.174, and 0.098, respectively) (Fig. 3a). In the diagnosis of CSCs, the AUROC of the DLA was 0.828, with no significant difference from those of Readers 2–4 and the clinical reports (AUROC, 0.811, 0.754, 0.882, and 0.730; p = 0.661, 0.110, 0.122, and 0.060, respectively). The performance of the DLA was superior to that of Reader group 1 (AUROC, 0.706; p = 0.011) but inferior to that of Reader 5 (AUROC, 0.914; p = 0.013) (Fig. 3b).

3.4. Dichotomous analysis

The sensitivities and specificities of the DLA, clinical reports, and radiologists in the diagnosis of prostate cancer (all prostate cancers or CSCs) varied widely (Tables 2 and 3). For both all prostate cancers and CSCs, no significant difference in sensitivity was noted between the DLA results and any of the readers or clinical reports at a PI-RADS cutoff value of either ≥ 3 or ≥ 4, except for Reader 5 at a PI-RADS cutoff value ≥ 3. The DLA showed significantly higher specificity in diagnosing all prostate cancers and CSCs relative to any of the radiologists and clinical reports at a PI-RADS cutoff value ≥ 3. The sensitivities and specificities of the DLA and Reader 5 did not significantly differ when using a PI-RADS cutoff value ≥ 4 for diagnosing all prostate cancers and CSCs. The PPV also varied among the radiologists and clinical reports. The DLA showed a better PPV than all radiologists and clinical reports.

Fig. 2. Proportion (%) of PI-RADS scores assigned by the DLA, clinical reports, and Readers 1–5. The distribution of PI-RADS scores determined by the DLA, clinical reports, and Readers 1–5 was variable.
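The two per-patient analyses described in Section 2.7 can be sketched in plain Python. The PI-RADS scores and cancer labels below are hypothetical illustrations, not the study data; the study itself used SPSS and MedCalc, and since the paper does not state which kappa weighting scheme was applied, linear weights are assumed here.

```python
from collections import Counter

def linear_weighted_kappa(a, b, categories):
    """Linearly weighted Cohen's kappa between two raters on an ordinal scale."""
    k, n = len(categories), len(a)
    idx = {c: i for i, c in enumerate(categories)}
    # disagreement weight grows with ordinal distance (0 on the diagonal)
    w = [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    obs = [[0.0] * k for _ in range(k)]          # observed joint distribution
    for x, y in zip(a, b):
        obs[idx[x]][idx[y]] += 1 / n
    pa, pb = Counter(a), Counter(b)              # marginals -> expected joint
    exp = [[pa[categories[i]] * pb[categories[j]] / n ** 2 for j in range(k)]
           for i in range(k)]
    num = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    den = sum(w[i][j] * exp[i][j] for i in range(k) for j in range(k))
    return 1 - num / den

def auroc(pos_scores, neg_scores):
    """AUROC of an ordinal score via the Mann-Whitney statistic: the chance
    that a random cancer case outscores a random non-cancer case (ties 0.5)."""
    wins = sum(1 for p in pos_scores for q in neg_scores if p > q)
    ties = sum(1 for p in pos_scores for q in neg_scores if p == q)
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))

# hypothetical per-patient PI-RADS scores from two readings (NOT the study data)
dla     = [1, 1, 3, 4, 5, 5, 1, 4, 3, 5]
reader5 = [2, 1, 3, 4, 4, 5, 1, 3, 3, 5]
print(round(linear_weighted_kappa(dla, reader5, [1, 2, 3, 4, 5]), 3))

# AUROC of the hypothetical DLA scores against hypothetical cancer status
cancer = [0, 0, 1, 1, 1, 1, 0, 1, 0, 1]
pos = [s for s, c in zip(dla, cancer) if c == 1]
neg = [s for s, c in zip(dla, cancer) if c == 0]
print(round(auroc(pos, neg), 3))
```

`sklearn.metrics.cohen_kappa_score(..., weights="linear")` and `roc_auc_score` compute the same quantities; comparing two correlated AUROCs as in Section 3.3 additionally requires DeLong's variance estimate, which is omitted here.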
Table 2
Dichotomous analysis of the DLA, clinical reports, and Readers 1–5; reference standard based on the presence of any pathologically proven prostate cancer.
Sensitivity, % Corrected P value* Specificity, % Corrected P value* Accuracy, % PPV, % NPV, %
DLA
PI-RADS ≥ 3 73.1 (38/52) Reference 87.0 (60/69) Reference 81.0 (98/121) 80.9 (38/47) 81.1 (60/74)
PI-RADS ≥ 4 69.2 (36/52) Reference 88.4 (61/69) Reference 80.2 (97/121) 81.8 (36/44) 79.2 (61/77)
Reader group 1
PI-RADS ≥ 3 69.2 (36/52) >0.99 40.6 (28/69) <0.001 52.9 (64/121) 46.8 (36/77) 63.6 (28/44)
PI-RADS ≥ 4 57.7 (30/52) 0.654 68.1 (47/69) 0.042 63.6 (77/121) 57.7 (30/52) 68.1 (47/69)
Reader group 2
PI-RADS ≥ 3 76.9 (40/52) >0.99 49.3 (34/69) <0.001 61.2 (74/121) 53.3 (40/75) 73.9 (34/46)
PI-RADS ≥ 4 71.2 (37/52) >0.99 65.2 (45/69) 0.030 67.8 (82/121) 60.7 (37/61) 78.3 (45/60)
Reader 3
PI-RADS ≥ 3 82.7 (43/52) >0.99 29.0 (20/69) <0.001 52.1 (63/121) 46.7 (43/92) 69.0 (20/29)
PI-RADS ≥ 4 80.8 (42/52) 0.876 50.7 (35/69) <0.001 63.6 (77/121) 55.3 (42/76) 77.8 (35/45)
Reader 4
PI-RADS ≥ 3 90.4 (47/52) 0.072 60.9 (42/69) <0.001 73.6 (89/121) 63.5 (47/74) 89.4 (42/47)
PI-RADS ≥ 4 86.5 (45/52) 0.072 79.7 (55/69) >0.99 82.6 (100/121) 76.3 (45/59) 88.7 (55/62)
Reader 5
PI-RADS ≥ 3 92.3 (48/52) 0.036 58.0 (40/69) <0.001 72.7 (88/121) 62.3 (48/77) 90.9 (40/44)
PI-RADS ≥ 4 84.6 (44/52) 0.234 81.2 (56/69) >0.99 82.6 (100/121) 77.2 (44/59) 87.5 (56/64)
Clinical reports
PI-RADS ≥ 3 84.6 (44/52) >0.99 23.2 (16/69) <0.001 49.6 (60/121) 44.4 (44/99) 72.7 (16/22)
PI-RADS ≥ 4 78.8 (41/52) >0.99 36.2 (25/69) <0.001 54.5 (66/121) 47.1 (41/87) 73.5 (25/34)
DLA, deep learning-based algorithm; PI-RADS, prostate imaging-reporting and data system; PPV, positive predictive value; NPV, negative predictive value.
* Bonferroni corrected p-value; p-values were multiplied by 6.
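As a consistency check, the dichotomous metrics defined in Section 2.7 can be recomputed from the raw counts in Table 2; the sketch below reproduces the DLA row at a PI-RADS cutoff value ≥ 3 for all prostate cancers.

```python
def dichotomous_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, NPV, and accuracy from per-patient counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# DLA at PI-RADS >= 3 for all prostate cancers (counts from Table 2):
# 38 of 52 cancers detected; 60 of 69 non-cancer patients correctly negative
m = dichotomous_metrics(tp=38, fn=14, tn=60, fp=9)
print({k: round(100 * v, 1) for k, v in m.items()})
# -> sensitivity 73.1, specificity 87.0, ppv 80.9, npv 81.1, accuracy 81.0
```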
Table 3
Dichotomous analysis of the DLA, clinical reports, and Readers 1–5; reference standard based on the presence of pathologically proven clinically significant prostate cancer.
Sensitivity, % Corrected P value* Specificity, % Corrected P value* Accuracy, % PPV, % NPV, %
DLA
PI-RADS ≥ 3 81.4 (35/43) Reference 84.6 (66/78) Reference 83.5 (101/121) 74.5 (35/47) 89.2 (66/74)
PI-RADS ≥ 4 76.7 (33/43) Reference 85.9 (67/78) Reference 82.6 (100/121) 75.0 (33/44) 87.0 (67/77)
Reader group 1
PI-RADS ≥ 3 76.7 (33/43) >0.99 43.6 (34/78) <0.001 55.4 (67/121) 42.9 (33/77) 77.3 (34/44)
PI-RADS ≥ 4 62.8 (27/43) 0.420 67.9 (53/78) 0.066 66.1 (80/121) 51.9 (27/52) 76.8 (53/69)
Reader group 2
PI-RADS ≥ 3 83.7 (36/43) >0.99 50.0 (39/78) <0.001 62.0 (75/121) 48.0 (36/75) 84.8 (39/46)
PI-RADS ≥ 4 79.1 (34/43) >0.99 65.4 (50/77) 0.036 71.2 (84/118) 55.7 (34/61) 87.7 (50/57)
Reader 3
PI-RADS ≥ 3 88.4 (38/43) >0.99 30.8 (24/78) <0.001 51.2 (62/121) 41.3 (38/92) 82.8 (24/29)
PI-RADS ≥ 4 86.0 (37/43) >0.99 50.0 (39/78) <0.001 62.8 (76/121) 48.7 (37/76) 86.7 (39/45)
Reader 4
PI-RADS ≥ 3 95.3 (41/43) 0.420 57.7 (45/78) <0.001 71.1 (86/121) 55.4 (41/74) 95.7 (45/47)
PI-RADS ≥ 4 90.7 (39/43) 0.420 74.4 (58/78) 0.468 80.2 (97/121) 66.1 (39/59) 92.5 (58/62)
Reader 5
PI-RADS ≥ 3 100.0 (43/43) 0.048 56.4 (44/78) <0.001 71.9 (87/121) 55.8 (43/77) 100.0 (44/44)
PI-RADS ≥ 4 93.0 (40/43) 0.234 78.2 (61/78) >0.99 83.5 (101/121) 70.2 (40/57) 95.3 (61/64)
Clinical reports
PI-RADS ≥ 3 88.4 (38/43) >0.99 24.4 (19/78) <0.001 47.1 (57/121) 38.4 (38/99) 86.4 (19/22)
PI-RADS ≥ 4 83.7 (36/43) >0.99 37.2 (29/78) <0.001 53.7 (65/121) 41.4 (36/87) 85.3 (29/34)
DLA, deep learning-based algorithm; PI-RADS, prostate imaging-reporting and data system; PPV, positive predictive value; NPV, negative predictive value.
* Bonferroni corrected p-value; p-values were multiplied by 6.
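The McNemar comparisons reported in Tables 2 and 3 compare paired sensitivities (or specificities) through the discordant pairs only. Below is a minimal sketch of an exact McNemar test together with the study's six-fold Bonferroni correction; the discordant counts are hypothetical, not taken from the study.

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact (binomial) two-sided McNemar p-value from the discordant counts:
    b = positive on reading 1 only, c = positive on reading 2 only."""
    n, k = b + c, min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # binomial(n, 0.5) is symmetric

# hypothetical discordant counts (NOT the study data): patients called
# positive by the DLA but not by a reader (b), and vice versa (c)
p = mcnemar_exact(2, 12)
print(round(p, 4), round(min(1.0, 6 * p), 4))  # raw and Bonferroni-corrected (x6)
```

For large samples, `statsmodels.stats.contingency_tables.mcnemar` offers both the exact and the chi-square versions of this test.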
Fig. 4. True positive lesion detected by the readers and the DLA. MRI of the prostate gland was performed in a 71-year-old man with an elevated PSA level (19.9 ng/mL). An axial T2-weighted image shows an ill-defined low-signal-intensity mass (area within the yellow dotted line) in both peripheral zones at the prostate base (a). The mass shows high signal intensity on the diffusion-weighted image (b = 1,000 sec/mm2) (b) and a low value on the ADC map (c). The DLA detected the same lesion and assigned PI-RADS category 5; it shows the abnormal area using a suspicion map and presents the lesion as a pink area on the T2-weighted image (d). Readers 2 and 3 missed this lesion, while Readers 1, 4, and 5 detected it. In the clinical report, the PI-RADS score was 1. Clinically significant cancer (Gleason score 7 [4 + 3]) was confirmed by biopsy. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 5. False negative result of the DLA. MRI of the prostate gland was performed in an 80-year-old man with an elevated PSA level (15.0 ng/mL). An axial T2-weighted image shows a focal homogeneous low-signal-intensity lesion (area within the white dotted line) in the anterior aspect of the transition zone (a). The mass shows high signal intensity on the diffusion-weighted image (b = 1,000 sec/mm2) (b) and a low value on the ADC map (c). The DLA could not detect the lesion, but all readers detected it as PI-RADS ≥ 4. In the clinical report, the lesion was also classified as PI-RADS 4. Clinically significant cancer (Gleason score 7 [3 + 4]) was confirmed by biopsy.
cancer in the real world. However, the percentage of PI-RADS 1 or 2 scores assigned by Reader 5 in the study population was 36.0%, which was similar to the 33% reported in a previous prospective study [23]. This study included patients in the period during which the usefulness of prebiopsy MRI was being determined; therefore, many patients underwent prostate biopsy even with negative prebiopsy MRI results. Third, the reference standard was mainly based on biopsy results. We tried our best to minimize this limitation by using targeted biopsy and considering all systematic biopsy results. In addition, including only radical prostatectomy patients would induce bias by excluding many patients who underwent active surveillance, systemic therapy, or focal therapy. Fourth, Reader groups 1 and 2 each consisted of four residents with similar experience in prostate imaging. Although this analysis cannot assess each resident's diagnostic performance, we believe that the results reflect the overall diagnostic performance of residents with similar levels of experience. Fifth, our results did not strictly follow PI-RADS v2, because we used bpMRI rather than mpMRI. Many previous studies have shown comparable diagnostic performance between bpMRI and mpMRI [16,19,24–27]. Sixth, the evaluation of MRIs taken in one of the institutions that had provided cases (100/2,170 cases) for developing the DLA could overestimate the performance of the DLA in this study. Seventh and lastly, PI-RADS v2 recommends high b-values (≥ 1,400 sec/mm2); in our study, DWI with a highest b-value of 1,000 sec/mm2 was used for review by the radiologists and analysis by the DLA. This could potentially reduce the overall diagnostic performance of both the readers and the DLA. However, this was because the study included patients who underwent prostate MRI before the release of PI-RADS v2, when high b-values (≥ 1,400 sec/mm2) were not routinely used as part of the scan protocol.

6. Conclusion

This study provides the first comparison between a DLA and radiologists with various levels of experience in PI-RADS classification. The DLA showed moderate diagnostic performance, at a level between those of residents and an expert, in detecting lesions and classifying them according to PI-RADS. The performance of the DLA was similar to that of clinical reports from various radiologists in clinical practice.

Funding

This work was supported by the National Research Foundation of Korea (NRF) under Grant 2018R1D1A1B07050160.

CRediT authorship contribution statement

Seo Yeon Youn: Methodology, Formal analysis, Writing - review & editing. Moon Hyung Choi: Conceptualization, Methodology, Writing - review & editing, Supervision. Dong Hwan Kim: Investigation, Formal analysis. Young Joon Lee: Investigation, Methodology. Henkjan Huisman: Investigation. Evan Johnson: Investigation. Tobias Penzkofer: Investigation. Ivan Shabunin: Investigation. David Jean Winkel: Investigation. Pengyi Xing: Investigation. Dieter Szolar: Investigation. Robert Grimm: Data curation, Software. Heinrich von Busch: Software. Yohan Son: Software, Resources. Bin Lou: Software. Ali Kamen: Software.

Declaration of Competing Interest

Robert Grimm, Heinrich von Busch, Yohan Son, Bin Lou, and Ali Kamen are employees of Siemens Healthineers or Siemens Healthcare. The other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ejrad.2021.109894.

References

[1] V. Kasivisvanathan, M. Emberton, C.M. Moore, MRI-Targeted Biopsy for Prostate-Cancer Diagnosis, N. Engl. J. Med. 379 (6) (2018) 589–590.
[2] H.U. Ahmed, A. El-Shater Bosaily, L.C. Brown, R. Gabe, R. Kaplan, M.K. Parmar, Y. Collaco-Moraes, K. Ward, R.G. Hindley, A. Freeman, A.P. Kirkham, R. Oldroyd, C. Parker, M. Emberton, Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study, The Lancet 389 (10071) (2017) 815–822.
[3] F.J.H. Drost, D. Osses, D. Nieboer, C.H. Bangma, E.W. Steyerberg, M.J. Roobol, I.G. Schoots, Prostate Magnetic Resonance Imaging, with or Without Magnetic Resonance Imaging-targeted Biopsy, and Systematic Biopsy for Detecting Prostate Cancer: A Cochrane Systematic Review and Meta-analysis, Eur. Urol. 77 (1) (2020) 78–94.
[4] J.C. Weinreb, J.O. Barentsz, P.L. Choyke, F. Cornud, M.A. Haider, K.J. Macura, D. Margolis, M.D. Schnall, F. Shtern, C.M. Tempany, H.C. Thoeny, S. Verma, PI-RADS Prostate Imaging - Reporting and Data System: 2015, Version 2, Eur. Urol. 69 (1) (2016) 16–40.
[5] G.A. Sonn, R.E. Fan, P. Ghanouni, N.N. Wang, J.D. Brooks, A.M. Loening, B.L. Daniel, K.J. To’o, A.E. Thong, J.T. Leppert, Prostate Magnetic Resonance Imaging Interpretation Varies Substantially Across Radiologists, Eur. Urol. Focus 5 (4) (2019) 592–599.
[6] A.R. Padhani, B. Turkbey, Detecting Prostate Cancer with Deep Learning for MRI: A Small Step Forward, Radiology 293 (3) (2019) 618–619.
[7] P. Schelb, S. Kohl, J.P. Radtke, M. Wiesenfarth, P. Kickingereder, S. Bickelhaupt, T.A. Kuder, A. Stenzinger, M. Hohenfellner, H.P. Schlemmer, K.H. Maier-Hein, D. Bonekamp, Classification of Cancer at Prostate MRI: Deep Learning versus Clinical PI-RADS Assessment, Radiology 293 (3) (2019) 607–617.
[8] S. Yoo, I. Gujrathi, M.A. Haider, F. Khalvati, Prostate Cancer Detection using Deep Convolutional Neural Networks, Sci. Rep. 9 (1) (2019) 19518.
[9] Y. Song, Y.D. Zhang, X. Yan, H. Liu, M. Zhou, B. Hu, G. Yang, Computer-aided diagnosis of prostate cancer using a deep convolutional neural network from multiparametric MRI, J. Magn. Reson. Imaging 48 (6) (2018) 1570–1577.
[10] Y. Sumathipala, N. Lay, B. Turkbey, C. Smith, P.L. Choyke, R.M. Summers, Prostate cancer detection from multi-institution multiparametric MRIs using deep convolutional neural networks, J. Med. Imaging (Bellingham) 5 (4) (2018) 044507.
[11] J. Ishioka, Y. Matsuoka, S. Uehara, Y. Yasuda, T. Kijima, S. Yoshida, M. Yokoyama, K. Saito, K. Kihara, N. Numao, T. Kimura, K. Kudo, I. Kumazawa, Y. Fujii, Computer-aided diagnosis of prostate cancer on magnetic resonance imaging using a convolutional neural network algorithm, BJU Int. 122 (3) (2018) 411–417.
[12] T. Sanford, S.A. Harmon, E.B. Turkbey, D. Kesani, S. Tuncer, M. Madariaga, C. Yang, J. Sackett, S. Mehralivand, P. Yan, S. Xu, B.J. Wood, M.J. Merino, P.A. Pinto, P.L. Choyke, B. Turkbey, Deep-Learning-Based Artificial Intelligence for PI-RADS Classification to Assist Multiparametric Prostate MRI Interpretation: A Development Study, J. Magn. Reson. Imaging (2020).
[13] P. Schelb, X. Wang, J.P. Radtke, M. Wiesenfarth, P. Kickingereder, A. Stenzinger, M. Hohenfellner, H.P. Schlemmer, K.H. Maier-Hein, D. Bonekamp, Simulated clinical deployment of fully automatic deep learning for clinical prostate MRI assessment, Eur. Radiol. (2020).
[14] X. Yu, B. Lou, B. Shi, D. Winkel, N. Arrahmane, M. Diallo, T. Meng, H.v. Busch, R. Grimm, B. Kiefer, D. Comaniciu, A. Kamen, H. Huisman, A. Rosenkrantz, T. Penzkofer, I. Shabunin, M.H. Choi, Q. Yang, D. Szolar, False Positive Reduction Using Multiscale Contextual Features for Prostate Cancer Detection in Multi-Parametric MRI Scans, in: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), 2020, pp. 1355–1359.
[15] M. de Rooij, B. Israel, M. Tummers, H.U. Ahmed, T. Barrett, F. Giganti, B. Hamm, V. Logager, A. Padhani, V. Panebianco, P. Puech, J. Richenberg, O. Rouviere, G. Salomon, I. Schoots, J. Veltman, G. Villeirs, J. Walz, J.O. Barentsz, ESUR/ESUI consensus statements on multi-parametric MRI for the detection of clinically significant prostate cancer: quality requirements for image acquisition, interpretation and radiologists’ training, Eur. Radiol. (2020).
[16] Z. Kang, X. Min, J. Weinreb, Q. Li, Z. Feng, L. Wang, Abbreviated Biparametric Versus Standard Multiparametric MRI for Diagnosis of Prostate Cancer: A Systematic Review and Meta-Analysis, AJR Am. J. Roentgenol. 212 (2) (2019) 357–365.
[17] M.D. Greer, J.H. Shih, N. Lay, T. Barrett, L. Bittencourt, S. Borofsky, I. Kabakus, Y.M. Law, J. Marko, H. Shebel, M.J. Merino, B.J. Wood, P.A. Pinto, R.M. Summers, P.L. Choyke, B. Turkbey, Interreader Variability of Prostate Imaging Reporting and Data System Version 2 in Detecting and Assessing Prostate Cancer Lesions at Prostate MRI, AJR Am. J. Roentgenol. (2019) 1–8.
[18] B.G. Muller, J.H. Shih, S. Sankineni, J. Marko, S. Rais-Bahrami, A.K. George, J.J. de la Rosette, M.J. Merino, B.J. Wood, P. Pinto, P.L. Choyke, B. Turkbey, Prostate Cancer: Interobserver Agreement and Accuracy with the Revised Prostate Imaging Reporting and Data System at Multiparametric MR Imaging, Radiology 277 (3) (2015) 741–750.
[19] M.H. Choi, C.K. Kim, Y.J. Lee, S.E. Jung, Prebiopsy Biparametric MRI for Clinically Significant Prostate Cancer Detection With PI-RADS Version 2: A Multicenter Study, AJR Am. J. Roentgenol. 212 (4) (2019) 839–846.
[20] C.P. Smith, S.A. Harmon, T. Barrett, L.K. Bittencourt, Y.M. Law, H. Shebel, J.Y. An, M. Czarniecki, S. Mehralivand, M. Coskun, B.J. Wood, P.A. Pinto, J.H. Shih, P.L. Choyke, B. Turkbey, Intra- and interreader reproducibility of PI-RADSv2: A multireader study, J. Magn. Reson. Imaging 49 (6) (2019) 1694–1703.
[21] A.B. Rosenkrantz, L.A. Ginocchio, D. Cornfeld, A.T. Froemming, R.T. Gupta, B. Turkbey, A.C. Westphalen, J.S. Babb, D.J. Margolis, Interobserver Reproducibility of the PI-RADS Version 2 Lexicon: A Multicenter Study of Six Experienced Prostate Radiologists, Radiology 280 (3) (2016) 793–804.
[22] A.C. Westphalen, C.E. McCulloch, J.M. Anaokar, S. Arora, N.S. Barashi, J.O. Barentsz, T.K. Bathala, L.K. Bittencourt, M.T. Booker, V.G. Braxton, P.R. Carroll, D.D. Casalino, S.D. Chang, F.V. Coakley, R. Dhatt, S.C. Eberhardt, B.R. Foster, A.T. Froemming, J.J. Fütterer, D.M. Ganeshan, M.R. Gertner, L.M. Gettle, S. Ghai, R.T. Gupta, M.E. Hahn, R. Houshyar, C. Kim, C.K. Kim, C. Lall, D.J.A. Margolis, S.E. McRae, A. Oto, R.B. Parsons, N.U. Patel, P.A. Pinto, T.J. Polascik, B. Spilseth, J.B. Starcevich, V.S. Tammisetti, S.S. Taneja, B. Turkbey, S. Verma, J.F. Ward, C.A. Warlick, A.R. Weinberger, J. Yu, R.J. Zagoria, A.B. Rosenkrantz, Variability of the Positive Predictive Value of PI-RADS for Prostate MRI across 26 Centers: Experience of the Society of Abdominal Radiology Prostate Cancer Disease-focused Panel, 296 (1) (2020) 76–84.
[23] A.R. Padhani, J. Barentsz, G. Villeirs, A.B. Rosenkrantz, D.J. Margolis, B. Turkbey, H.C. Thoeny, F. Cornud, M.A. Haider, K.J. Macura, C.M. Tempany, S. Verma, J.C. Weinreb, PI-RADS Steering Committee: The PI-RADS Multiparametric MRI and MRI-directed Biopsy Pathway, Radiology 292 (2) (2019) 464–474.
[24] S. Woo, C.H. Suh, S.Y. Kim, J.Y. Cho, S.H. Kim, M.H. Moon, Head-to-Head Comparison Between Biparametric and Multiparametric MRI for the Diagnosis of Prostate Cancer: A Systematic Review and Meta-Analysis, AJR Am. J. Roentgenol. 211 (5) (2018) W226–W241.
[25] D. Junker, F. Steinkohl, V. Fritz, J. Bektic, T. Tokas, F. Aigner, T.R.W. Herrmann, M. Rieger, U. Nagele, Comparison of multiparametric and biparametric MRI of the prostate: are gadolinium-based contrast agents needed for routine examinations? World J. Urol. 37 (4) (2019) 691–699.
[26] X.K. Niu, X.H. Chen, Z.F. Chen, L. Chen, J. Li, T. Peng, Diagnostic Performance of Biparametric MRI for Detection of Prostate Cancer: A Systematic Review and Meta-Analysis, AJR Am. J. Roentgenol. 211 (2) (2018) 369–378.
[27] M. Alabousi, J.P. Salameh, K. Gusenbauer, L. Samoilov, A. Jafri, H. Yu, A. Alabousi, Biparametric vs multiparametric prostate magnetic resonance imaging for the detection of prostate cancer in treatment-naive patients: a diagnostic test accuracy systematic review and meta-analysis, BJU Int. 124 (2) (2019) 209–220.