Background and Aims: We developed a system for computer-assisted diagnosis (CAD) for real-time automated
diagnosis of precancerous lesions and early esophageal squamous cell carcinomas (ESCCs) to assist the diagnosis
of esophageal cancer.
Methods: A total of 6473 narrow-band imaging (NBI) images, including precancerous lesions, early ESCCs, and
noncancerous lesions, were used to train the CAD system. We validated the CAD system using both endoscopic
images and video datasets. The receiver operating characteristic curve of the CAD system was generated based on
image datasets. An artificial intelligence probability heat map was generated for each input of endoscopic images.
The yellow color indicated high possibility of cancerous lesion, and the blue color indicated noncancerous lesions
on the probability heat map. When the CAD system detected any precancerous lesion or early ESCCs, the lesion
of interest was masked with color.
Results: The image datasets contained 1480 malignant NBI images from 59 consecutive cancerous cases (sensitivity,
98.04%) and 5191 noncancerous NBI images from 2004 cases (specificity, 95.03%). The area under the curve
was 0.989. The video datasets of precancerous lesions or early ESCCs included 27 nonmagnifying videos
(per-frame sensitivity, 60.8%; per-lesion sensitivity, 100%) and 20 magnifying videos (per-frame sensitivity, 96.1%;
per-lesion sensitivity, 100%). Unaltered full-range normal esophagus videos included 33 videos (per-frame
specificity, 99.9%; per-case specificity, 90.9%).
Conclusions: A deep learning model demonstrated high sensitivity and specificity for both endoscopic images
and video datasets. The real-time CAD system has a promising potential in the near future to assist endoscopists
in diagnosing precancerous lesions and ESCCs. (Gastrointest Endosc 2019;-:1-10.)
Both nonmagnifying and magnifying narrow-band imaging (NBI) are important for the diagnosis of precancerous lesions and ESCCs [3]. The brownish area is the main feature of precancerous lesions and early ESCCs under nonmagnifying NBI, whereas intrapapillary capillary loops (IPCLs) are key features under magnifying NBI [4]. Unfortunately, it is not easy to identify these imaging features in ESCC at an early stage. When NBI was used by an inexperienced endoscopist, the sensitivity for detecting ESCC was only 53% [5]. A recent study on missed esophageal cancer found that 6.4% of patients had negative endoscopy results within 3 years before diagnosis [6]. Because of the shortage of trained endoscopists, especially in rural or undeveloped regions, the ability to detect precancerous lesions and ESCCs is a significant challenge.

Computer-assisted diagnosis (CAD) using an artificial intelligence (AI) system has made remarkable progress in recent years. Researchers have used CAD systems to improve the diagnosis of various GI lesions, such as colorectal polyps, gastric ulcers, Helicobacter pylori infections, and gastric cancer [7-10]. The application of CAD in the diagnosis of early ESCC has gained much attention recently. In 2018, Horie et al [11] first used AI to detect esophageal cancer, with a sensitivity of 98% and a positive predictive value of 40%. However, in that study, only static images were tested, and the difference between nonmagnifying and magnifying settings was not demonstrated. Zhao et al [12] developed a deep learning model based on magnifying NBI images to investigate automated classification of IPCLs. However, that study did not use real-time analysis, and the study findings focused mainly on the classification of NBI images instead of detection. A few other studies also used a CAD system to differentiate cancerous from noncancerous esophageal images using micro-endoscopy techniques [13]. The authors of those studies did not use white light or NBI images.

Our aim was to develop a CAD system to achieve real-time automated diagnosis of precancerous lesions and early ESCCs, in both nonmagnifying and magnifying settings. We hope that our system will improve the diagnosis of early esophageal malignancies.

METHODS

Training and test datasets
Four institutions were involved in the development of the CAD system for diagnosing precancerous lesions and early ESCCs: Endoscopy Center of West China Hospital (WCH) in Chengdu, China; Jaswant Rai Speciality Hospital, Meerut, India; San Bernardino Gastroenterology Associates, Inc and ACE Endoscopy and Surgery Center, Rialto, California, USA; and Shanghai Wision AI Co Ltd, Shanghai, China. A total of 2770 NBI images of precancerous lesions and early ESCCs in 191 cases and 3703 NBI images of noncancerous lesions in 358 cases, including esophageal varices in 9 cases, ectopic gastric mucosa in 105 cases, esophagitis in 64 cases, and normal esophagus in 180 cases, were used to train the CAD system. WCH contributed 191 cases of precancerous lesions and superficial ESCCs, 9 cases of esophageal varices, 105 cases of ectopic gastric mucosa, 64 cases of esophagitis, and 60 cases of normal esophagus images. San Bernardino Gastroenterology Associates and Jaswant Rai Speciality Hospital contributed 60 cases of normal esophagus images. All images were obtained and recorded in 2017. Endoscopic images were captured using Olympus endoscopes (GIF-H260Z, EVIS LUCERA CV260 (SL)/CV290 (SL), Olympus Medical Systems, Tokyo, Japan).

Four datasets were used for validation purposes. For image validation of dataset A, we collected 1480 malignant NBI images in 59 consecutive cases of precancerous lesions or ESCCs from January 2018 to February 2018. All cases were confirmed histologically. Among these, 32 cases (35 lesions) were confirmed by endoscopic submucosal dissection or surgery. The other 27 cases were diagnosed by biopsy. Images of suboptimal quality because of insufficient air inflation or blurring were excluded from the study. Image quality was defined as suboptimal if senior endoscopists were unable to see enough imaging features to determine the diagnosis. For image validation of dataset B, we collected 5191 noncancerous NBI images in 2004 consecutive cases of normal epithelium or benign lesions of the esophagus. Cases with suboptimal image quality were excluded. The dataset included normal squamous epithelium (821 cases from January 2018 to February 2018), normal gastroesophageal junction epithelium (799 cases from January 2018 to February 2018), typical reflux esophagitis (109 cases from January 2018 to March 2018), heterotopic gastric mucosa (154 cases from January 2018 to June 2018), esophageal varices (69 cases from January 2018 to June 2018), and submucosal tumor (52 cases from January 2018 to June 2018). All diagnoses of submucosal tumor were confirmed by endoscopic ultrasonography. Three of our center’s experienced endoscopists validated the diagnoses, and the relevant images were selected. For video validation of dataset C, we selected 27 precancerous lesions and cases of early ESCC that were recorded from March 2018 to January 2019. All patients had endoscopic submucosal dissection, and each diagnosis was confirmed histologically. Each video was clipped from the time the lesion first appeared in the visual field until the same lesion disappeared from the visual field under NBI examination. We clipped nonmagnifying NBI videos from all 27 cases, and 20 cases were clipped using magnifying NBI. All nonmagnifying and magnifying videos were then processed using video-editing software to eliminate frames in which our senior endoscopists were unable to see the detailed imaging features. Finally, for video validation of dataset D, we selected all 33 cases with normal esophagus from March 2018 to December 2018 that we recorded previously. Three of those cases used magnifying
Figure 1. The architecture and workflow of the deep learning model. An artificial intelligence hot zone image was generated for any input endoscopic
image. The yellow color indicates high possibility of a cancerous lesion, and the blue color indicates a noncancerous lesion. When CAD detects any
precancerous lesion or early ESCC, the lesion of interest is covered with color. CAD, computer-assisted diagnosis; ESCC, esophageal squamous cell
carcinoma; NBI, narrow-band imaging.
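The step from a probability heat map to a colored lesion mask, as described in the Figure 1 caption, can be illustrated with a minimal sketch. The function name `mask_from_heatmap` and the 0.5 threshold are assumptions made for illustration only; the paper does not state the threshold or post-processing actually used by the CAD system.

```python
from typing import List, Tuple


def mask_from_heatmap(
    prob_map: List[List[float]], threshold: float = 0.5
) -> Tuple[List[List[int]], bool]:
    """Turn a per-pixel lesion probability map into a binary mask.

    prob_map  : 2D grid of probabilities in [0, 1] (hypothetical model output).
    threshold : hypothetical cutoff; pixels at or above it are marked as lesion.
    Returns the mask (1 = lesion pixel to be colored, 0 = background) and a
    flag telling whether any lesion pixel was found in the image.
    """
    mask = [[1 if p >= threshold else 0 for p in row] for row in prob_map]
    detected = any(any(row) for row in mask)
    return mask, detected


# Toy 2x2 "heat map": only the top-right pixel exceeds the threshold.
toy_mask, toy_detected = mask_from_heatmap([[0.1, 0.9], [0.2, 0.4]])
```

In this sketch an image counts as positive as soon as one pixel crosses the threshold; a real system would likely add smoothing or a minimum region size, which is not modeled here.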
NBI. All videos contained unaltered images from the proximal esophagus to the gastroesophageal junction.

Model development
Experienced endoscopists who had at least 5 years of experience in diagnosing and treating early esophageal cancer from WCH Endoscopy Center annotated each malignant endoscopic image. The boundaries of precancerous lesions and ESCCs in the image were drawn using Adobe Photoshop software. Those boundaries were used to represent the actual lesion area within the image.

During the CAD training process, the parameters of the mathematical function were initially set to random values. For each annotated image, the location of a lesion computed by the deep learning function was compared with the location annotated by the endoscopist during endoscopy. The parameters of this mathematical function were then modified slightly to decrease the error on the same image. The same process was then repeated multiple times for every image in the training set. The mathematical function used in this study was based on SegNet architecture (Fig. 1).

SegNet is a deep encoder-decoder architecture for multi-class pixelwise segmentation. The architecture consists of a sequence of nonlinear processing layers (encoders) and a corresponding set of decoders followed by a pixelwise classifier. Typically, each encoder consists of one or more convolutional layers with batch normalization and a ReLU nonlinearity, followed by nonoverlapping maxpooling and subsampling. The sparse encoding due to the pooling process is upsampled in the decoder using the maxpooling indices in the encoding sequence (http://mi.eng.cam.ac.uk/projects/segnet/). For our models, we removed the last 3 convolution layers in the encoder and the first 3 deconvolution layers in the decoder for better generalization.

Definition of the analysis
For image evaluation, images of precancerous lesions and ESCCs were included in dataset A to test the sensitivity of the model, and noncancerous images were used in dataset B to test the specificity. For video evaluation of precancerous lesions and early ESCCs, 3 experienced endoscopists from the WCH Endoscopy Center carefully examined each algorithmically labeled frame in each video. For dataset C, the sensitivity of the model for each precancerous lesion and early ESCC, as well as each image frame, was calculated. For specificity evaluation of normal esophagus in dataset D, each labeled frame was counted as false positive (FP) to provide a per-frame specificity. The early stage of esophageal cancer is defined as mucosal (T1a) and submucosal (T1b) cancer regardless of lymph node metastasis.

Statistical analysis
If a detection label produced by the algorithm fell on a precancerous lesion or early ESCC, the result was considered a true positive (TP), and only one TP was counted for each image, regardless of the number of times the algorithm's detection labels fell on the same lesion. The absence of algorithmically detected labeling on precancerous lesions and early ESCCs was counted as a false negative (FN). Per-image sensitivity was therefore defined as TP divided by the total number of images with precancerous lesions and early ESCCs (sensitivity = TP/(TP + FN)).

If there was no algorithmically detected labeling on an image without precancerous lesions and early ESCCs, the image was counted as a true negative (TN). An FP was defined as any detection label on an area without precancerous lesions and early ESCCs. Therefore, per-image specificity was defined as TN divided by the total number of images without precancerous lesions and early ESCCs (specificity = TN/(TN + FP)).

For video validation of precancerous lesions and early ESCCs, per-frame sensitivity was defined as the number of TP frames divided by the total number of frames with precancerous lesions and ESCCs (sensitivity = TP/(TP + FN)). We also measured one additional metric, per-lesion sensitivity, which we defined as the number of lesions correctly detected by the algorithm in at least 1 frame of each lesion divided by the total number of lesions.

For video validation of a normal esophagus, per-frame specificity was defined as the number of TN frames divided by the total number of frames with normal esophagus (specificity = TN/(TN + FP)). Per-case specificity was defined as the number of cases classified correctly by the algorithm in all frames of the case divided by the total number of cases.

All continuous variables are expressed as the mean with the range. Statistical analyses were conducted using SPSS, version 16.0 (SPSS Inc, Chicago, Ill, USA).

Ethics
The study was approved by the Ethics Committee of WCH, Sichuan University (no. ChiECRCT-20180131).

RESULTS

Patient characteristics: imaging features of esophageal lesions in the validation datasets
We used 4 independent datasets for validation of our CAD model. Datasets A and B were used for image analysis, and datasets C and D were used for video analysis. A total of 2123 cases, including 6671 images and 80 video clips, were tested. The detailed features of the validation datasets are listed in Table 1.

In the sensitivity test for datasets A and C, both images and video clips of precancerous lesions and ESCCs were used. Most lesions were located in the middle part of the esophagus. The morphology of the lesions was summarized using the Paris classification [14]. Flat and superficial depressed types were the most common types; 46.8% of the lesions were classified as mucosal cancers (T1a) in the image setting and 88.9% were classified as mucosal cancers (T1a) in the video setting.

In the specificity test, benign diseases were included in dataset B, whereas only cases with normal esophagus were used in dataset D. All videos in dataset D were unaltered, including 30 nonmagnifying videos and 3 magnifying videos. The mean duration of nonmagnifying and magnifying videos in dataset D was 67.2 seconds and 205.8 seconds, respectively. The total number of frames was 50,372 in nonmagnifying videos and 15,433 in magnifying videos.

When the CAD model detected precancerous lesions or ESCCs, the area of interest was masked in blue color (Fig. 2).

Imaging diagnosis of the CAD system
The per-image sensitivity of our CAD system for all malignant images in 59 cases (dataset A) was 98.04%, and the per-image specificity for 5191 noncancerous NBI images from 2004 cases (dataset B) was 95.03% (Table 2). Among malignant images, 32 cases were confirmed as precancerous lesions or early ESCCs, and the per-image sensitivity for this group was 97.5%. The per-image sensitivity for another 27 cases that were diagnosed by biopsy and contained precancerous lesions, early ESCCs, or advanced ESCCs was 98.5%. The per-image specificity of our CAD system for normal epithelium and various benign diseases varied from 86.1% to 96.8%. Esophagitis was most frequently misinterpreted by CAD as ESCC (Table 2). The receiver operating characteristic (ROC) curve of the CAD system for image analysis was generated using datasets A and B (Fig. 3). Different
TABLE 1. Patient demographics and clinical characteristics for the validation datasets
Characteristic | Dataset A | Dataset B | Dataset C | Dataset D
Duration of datasets | January-February 2018 | January-June 2018 | March 2018-January 2019 | March-December 2018
Data content | 59 consecutive cases of precancerous lesions/ESCCs, among which 32 cases (total 35 lesions) were confirmed by ESD or surgery; the other 27 cases were diagnosed by biopsy | Randomly selected 2004 cases of normal epithelium or benign lesions of the esophagus* | 27 randomly selected precancerous lesions/early ESCCs with videos confirmed by post-ESD pathologic examination, including 27 nonmagnifying videos and 20 magnifying videos | 33 randomly selected cases of normal esophagus with videos, including 30 nonmagnifying videos and 3 magnifying videos
Data processing | Exclude unclear pictures | Exclude unclear pictures | Exclude unclear frames | Unaltered, full-range
Patient demographics
Female gender (%) | 16.9 | 54.3 | 16.9 | 46.9
Age (years), mean (range) | 63.6 (43-78) | 46.5 (16-83) | 62.5 (46-78) | 45.2 (24-71)
Size (mm), mean (range) | 34.1 (5-130) | NA | 23.3 (5-35) | NA
Location (upper/middle/low) | 10/41/11 | NA | 3/20/4 | NA
Macroscopic type, Paris classification (0-I/IIa/IIb/IIc/III) | 5/8/22/20/7 | NA | 0/5/10/12/0 | NA
Tumor depth (LGD/HGD/LPM/MM/SM/uncertain) | 3/12/6/8/6/27 | NA | 3/12/8/1/3/0 | NA
Duration of video (seconds), mean (range) | NA | NA | 19.7 (5.16-44.56) | 79.8 (10.44-348.6)
ESD, Endoscopic submucosal dissection; NA, not applicable; LGD, low-grade dysplasia; HGD, high-grade dysplasia; LPM, laminae propria mucosae; MM, muscularis mucosae; SM, submucosal layer.
*Normal squamous epithelium (821 cases), normal gastroesophageal junction epithelium (799 cases), typical reflux esophagitis (109 cases), heterotopic gastric mucosa (154 cases), esophageal varices (69 cases), and submucosal tumor (52 cases) were selected from the database of West China Hospital.
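The per-image sensitivity and specificity defined under Statistical analysis, and the ROC curve obtained by varying the detection probability threshold, can be sketched as follows. This is a minimal illustration under stated assumptions: each image is reduced to a single score (for example, its highest heat-map probability), the scores and labels below are invented toy values, and the area is computed with the trapezoidal rule. None of this is taken from the authors' implementation, which used SPSS for the statistical analysis.

```python
from typing import List, Tuple


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Specificity = TN / (TN + FP)."""
    return tn / (tn + fp)


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) pairs, one per distinct threshold.

    labels: 1 = image contains a lesion, 0 = image does not.
    An image is called positive when its score is >= the threshold.
    Both classes must be present, otherwise a division by zero occurs.
    """
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((1.0 - specificity(tn, fp), sensitivity(tp, fn)))
    return points


def area_under_curve(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under a curve given as points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical per-image scores and ground-truth labels (not study data).
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = area_under_curve(roc_points(toy_scores, toy_labels))
```

Each threshold yields one sensitivity/specificity pair, so the single operating point reported for datasets A and B corresponds to one point on such a curve.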
probability thresholds for detection were used. The area under the ROC curve was 0.989.

Video diagnosis of the CAD system
Our video model was capable of processing at least 25 frames per second with a latency period of less than 100 milliseconds in real-time video analysis. Video demonstration of CAD for precancerous lesions and early ESCCs was shown with nonmagnifying (Video 1, available online at www.giejournal.org) and magnifying (Videos 2 and 3, available online at www.giejournal.org) video clips.

The total per-frame sensitivity and specificity of CAD for datasets C and D were 91.5% and 99.9%, respectively (Table 2). In dataset C, the per-frame sensitivity of nonmagnifying video clips was lower than that of magnifying video clips (60.8% versus 96.1%, respectively). Per-lesion sensitivity was 100% in both nonmagnifying and magnifying video clips. In dataset D, per-case specificity was 90.9%.

For lesions <10 mm, there were 4 cases (49 images) in image dataset A and 2 cases in video dataset C (2 nonmagnifying videos and 1 magnifying video). The per-image or per-frame sensitivity of the CAD system for these lesions was 91.8% in the image dataset (Fig. 4), 46.6% in the nonmagnifying video dataset, and 98.5% in the magnifying video dataset. Video diagnosis of the CAD system for lesions <10 mm is demonstrated in Video 4 (available online at www.giejournal.org).

There were 18 cases with irregular cornification (313 images) in image dataset A and 4 cases in video dataset C (4 nonmagnifying videos and 2 magnifying videos). The per-image or per-frame sensitivity of the CAD system for these lesions was 93.6% in the image dataset, 46.9% in the nonmagnifying video dataset, and 85.8% in the magnifying video dataset. Video diagnosis of the CAD system for lesions with irregular cornification is demonstrated in Video 5 (available online at www.giejournal.org).

DISCUSSION

ESCC is a major global health challenge. Early detection is essential. However, quality control of endoscopic examination and the overall shortage of trained endoscopists are major problems worldwide [15]. Enabling a CAD system to serve as “a second observer” in an endoscopic examination, supporting nonexperts in the detection of ESCC and reducing missed diagnoses, will require a high-performing deep learning model.

Early work on deep learning for medical imaging mainly used previously collected images and videos to develop an algorithm retrospectively and then used a small portion of the remaining collected images as a validation set. Few studies
Figure 2. Examples of precancerous lesions and ESCC detection in dataset A. A-D, Nonmagnifying malignant images with different shape and size. E-H,
Magnifying malignant images with different histologic depths (E, low-grade dysplasia; F, high-grade dysplasia; G, laminae propria mucosae; H, muscularis
mucosae). ESCC, esophageal squamous cell carcinoma.
Figure 4. Examples of lesions <10 mm. A, Early esophageal cancer, PT1a-muscularis mucosae; B, high-grade dysplasia; C, high-grade dysplasia; D, low-
grade dysplasia.
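The video metrics defined under Statistical analysis (per-frame sensitivity, per-lesion sensitivity, per-frame specificity, and per-case specificity) can be sketched as below. The data layout, with one list of per-frame boolean flags for each lesion or normal case, and all function names are assumptions made for illustration; the toy values are invented and are not study data.

```python
from typing import Dict, List


def per_frame_sensitivity(lesion_frames: Dict[str, List[bool]]) -> float:
    """TP frames / all lesion frames. True = the label fell on the lesion."""
    total = sum(len(flags) for flags in lesion_frames.values())
    tp = sum(sum(flags) for flags in lesion_frames.values())
    return tp / total


def per_lesion_sensitivity(lesion_frames: Dict[str, List[bool]]) -> float:
    """Lesions detected in at least 1 frame / all lesions."""
    detected = sum(1 for flags in lesion_frames.values() if any(flags))
    return detected / len(lesion_frames)


def per_frame_specificity(normal_frames: Dict[str, List[bool]]) -> float:
    """TN frames / all normal frames. True = a false-positive label."""
    total = sum(len(flags) for flags in normal_frames.values())
    fp = sum(sum(flags) for flags in normal_frames.values())
    return (total - fp) / total


def per_case_specificity(normal_frames: Dict[str, List[bool]]) -> float:
    """Cases with no false-positive label in any frame / all cases."""
    clean = sum(1 for flags in normal_frames.values() if not any(flags))
    return clean / len(normal_frames)


# Hypothetical clips: two lesions and two normal cases.
toy_lesions = {
    "lesion_1": [True, False, True, True],
    "lesion_2": [False, False, True, False],
}
toy_normals = {
    "case_1": [False, False, False, False, False],
    "case_2": [False, True, False, False, False],
}
```

With these toy clips, half of the lesion frames are labeled while both lesions are still found in at least one frame, which illustrates how a per-lesion sensitivity of 100% can coexist with a much lower per-frame sensitivity. Likewise, a single false-positive frame lowers per-case specificity far more than per-frame specificity.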