
Using Generative Adversarial Networks and Transfer Learning for Breast Cancer Detection by Convolutional Neural Networks


Shuyue Guan, Murray Loew
Medical Imaging and Image Analysis Laboratory, Department of Biomedical Engineering,
George Washington University, Washington DC, 20052, USA

INTRODUCTION

In recent years, developments in machine learning have provided alternative methods for feature extraction; one is to learn features from whole images directly through a Convolutional Neural Network (CNN). However, obtaining a sufficient number of images to train a CNN classifier is difficult, because true positives are scarce in the datasets and expert labeling is expensive.

We proposed two deep-learning based solutions to the lack of training images:

1) To generate synthetic mammographic images for training by the Generative Adversarial Network (GAN) (Fig.1).

2) To apply transfer learning in CNN (Fig.2).

Then, we combined the two technologies: applying GAN for image augmentation and transfer learning in CNN for breast cancer detection.

METHODS

GAN-based image augmentation (Fig.1): the generator G maps a noise vector x, drawn from the standard Gaussian distribution x ~ N(0, 1), to a generated sample z = G(x) (the simulated mapping), while a real sample y = p_data(x) comes from the real data distribution (the real mapping). The discriminator D compares the generated distribution with the real data distribution, and its feedback is used to update G so that the generated distribution approaches the real one.

Fig.1. The principle of GAN.
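
As an illustration of this principle, the following is a minimal adversarial training step written in PyTorch. It is a generic sketch, not the authors' GAN: the fully connected networks, the noise dimension NOISE_DIM, the flattened patch size IMG_PIXELS, and the optimizer settings are illustrative assumptions.

```python
# Minimal GAN sketch: G maps noise x ~ N(0, 1) to a synthetic sample z = G(x);
# D scores real samples y ~ p_data against generated ones, and its feedback updates G.
import torch
import torch.nn as nn

NOISE_DIM, IMG_PIXELS = 100, 64 * 64            # assumed sizes, not taken from the poster

G = nn.Sequential(                              # generator: noise vector -> flattened patch
    nn.Linear(NOISE_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_PIXELS), nn.Tanh())
D = nn.Sequential(                              # discriminator: flattened patch -> real/fake score
    nn.Linear(IMG_PIXELS, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_batch):
    """One adversarial update; real_batch has shape (N, IMG_PIXELS), scaled to [-1, 1]."""
    n = real_batch.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

    # update D: push D(real) toward 1 and D(G(x)) toward 0
    x = torch.randn(n, NOISE_DIM)               # x ~ N(0, 1), standard Gaussian
    fake = G(x).detach()                        # z = G(x), detached for the D step
    loss_D = bce(D(real_batch), ones) + bce(D(fake), zeros)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # update G: push D(G(x)) toward 1, so the generated distribution approaches the real one
    x = torch.randn(n, NOISE_DIM)
    loss_G = bce(D(G(x)), ones)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()

# after training, synthetic samples are drawn by sampling noise: roi = G(torch.randn(1, NOISE_DIM))
```
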
Transfer learning in CNN (Fig.2): we reuse a pre-trained CNN model (VGG-16) whose weights are frozen. New images are passed through the pre-trained network for feature extraction, and only new classifiers are trained on the extracted features.

Fig.2. To reuse a pre-trained CNN model that has been trained by image datasets from other fields (WildML, 2015).
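
As a concrete sketch of this setup, the snippet below loads torchvision's VGG-16 pre-trained on ImageNet (an assumption consistent with "trained by image datasets from other fields"), freezes its weights, and trains only a new two-class head; the learning rate and head size are illustrative, not the authors' exact configuration.

```python
# Transfer-learning sketch: frozen pre-trained VGG-16 as feature extractor, new classifier on top.
import torch
import torch.nn as nn
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)   # pre-trained weights
for p in vgg.parameters():
    p.requires_grad = False                    # "pre-trained model: weights frozen"

# replace the final layer with a new classifier for abnormal vs. normal ROIs;
# only this layer's parameters remain trainable
vgg.classifier[6] = nn.Linear(4096, 2)

optimizer = torch.optim.Adam(vgg.classifier[6].parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# training loop (not shown): feed 224x224 RGB ROIs through vgg and optimize the new head
```
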
Experiment plan (Fig.4): ROIs were cropped from DDSM to obtain 1300 real abnormal ROIs and 1300 real normal ROIs; 90% of them were used for training and 10% for validation. GAN-generated synthetic abnormal ROIs and synthetic normal ROIs were produced for augmentation. A CNN classifier and a transfer-learning classifier (TL by VGG-16) were trained on the resulting datasets and evaluated by training accuracy and validation accuracy.

Fig.4. The flowchart of our experiment plan. CNN classifiers were trained by data including real and GAN ROIs. Validation data for the classifiers were real ROIs that were never used for training. The real and GAN ROIs were also used for transfer learning with the pre-trained VGG-16 model.

MATERIALS

In this study, we used mammograms from the Digital Database for Screening Mammography (DDSM). We used regions of interest (ROIs) of the images instead of entire images to train our neural-network models. These ROIs are cropped rectangle-shape images, obtained as follows:

➢ Abnormal ROIs, from images containing abnormalities, are the minimum rectangle-shape areas surrounding the whole given ground-truth boundaries.

➢ Normal ROIs are also rectangle-shape images whose sizes are approximately the average size of the abnormal ROIs. In DDSM, the average size of abnormal ROIs is 506.02×503.90 pixels, so the cropping size for normal ROIs was chosen to be 505×505 pixels. Their locations were selected randomly on normal breast areas, and we cropped only one ROI from each normal breast image.

We resized the ROIs by resampling and made them RGB (3-layer cubes) by duplication (Fig.3); a short preparation sketch follows Fig.3.

Fig.3. (A) A mammographic image from DDSM rendered in grayscale; (B) Cropped ROI given by the ground-truth abnormality boundary; (C) Conversion from grey to RGB image by duplication.
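
A minimal sketch of this ROI preparation is shown below (NumPy/Pillow). The function names, the network input size NET_INPUT, and the omission of the breast-area mask when choosing random crop locations are simplifying assumptions.

```python
# ROI preparation sketch: bounding-box crop for abnormal ROIs, random 505x505 crop for
# normal ROIs, then resampling and grey-to-RGB duplication for the network input.
import numpy as np
from PIL import Image

ROI_SIZE = 505        # near the DDSM average abnormal-ROI size of 506.02 x 503.90 pixels
NET_INPUT = 224       # assumed network input size for the resampling step

def crop_abnormal_roi(image: np.ndarray, boundary: np.ndarray) -> np.ndarray:
    """Minimum rectangle surrounding the ground-truth boundary (boundary: N x 2 of (row, col))."""
    r0, c0 = boundary.min(axis=0)
    r1, c1 = boundary.max(axis=0)
    return image[r0:r1 + 1, c0:c1 + 1]

def crop_normal_roi(image: np.ndarray, rng=np.random) -> np.ndarray:
    """One randomly located 505x505 crop from a normal image (restriction to breast area omitted)."""
    r = rng.randint(0, image.shape[0] - ROI_SIZE)
    c = rng.randint(0, image.shape[1] - ROI_SIZE)
    return image[r:r + ROI_SIZE, c:c + ROI_SIZE]

def to_rgb_input(roi: np.ndarray) -> np.ndarray:
    """Resize by resampling and duplicate the grey channel into a 3-layer RGB cube."""
    resized = np.asarray(Image.fromarray(roi).resize((NET_INPUT, NET_INPUT), Image.BILINEAR))
    return np.stack([resized, resized, resized], axis=-1)
```
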
Set#   Dataset for training
1      1170 Real abnormal ROIs + 1170 Real normal ROIs
2      1170 GAN abnormal ROIs + 1170 GAN normal ROIs
3      Set 1 + Set 2
4      Set 1 + double Set 2

Validation data for every set: 130 Real abnormal ROIs + 130 Real normal ROIs. Classifier models for every set: CNN and TL model by VGG-16.

Table 1. Training plans.
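
The training plans above can be assembled as in the sketch below. Plain Python lists stand in for ROI arrays; the loader functions are hypothetical, and "double Set 2" is read here as including the GAN ROIs twice.

```python
# Assembling the four Table 1 training sets and the real-only validation set (Fig.4).
def load_real_rois():
    """Hypothetical loader: (abnormal, normal) lists of 1300 real ROIs each."""
    return [f"real_abn_{i}" for i in range(1300)], [f"real_nor_{i}" for i in range(1300)]

def load_gan_rois():
    """Hypothetical loader: (abnormal, normal) lists of 1170 GAN-generated ROIs each."""
    return [f"gan_abn_{i}" for i in range(1170)], [f"gan_nor_{i}" for i in range(1170)]

real_abn, real_nor = load_real_rois()
gan_abn, gan_nor = load_gan_rois()

# 90% of the real ROIs (1170 per class) for training; the remaining 130 per class are the
# real-only validation data, never used for training
train_real_abn, val_abn = real_abn[:1170], real_abn[1170:]
train_real_nor, val_nor = real_nor[:1170], real_nor[1170:]

set1 = train_real_abn + train_real_nor     # Set 1: real ROIs only
set2 = gan_abn + gan_nor                   # Set 2: GAN ROIs only
set3 = set1 + set2                         # Set 3: Set 1 + Set 2
set4 = set1 + 2 * set2                     # Set 4: Set 1 + double Set 2
validation = val_abn + val_nor             # 130 real abnormal + 130 real normal ROIs
```
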
RESULTS

Fig.5. (Top row) Real abnormal ROIs; (Bottom row) synthetic abnormal ROIs generated from GAN.

Summary of the Fig.6 panels (validation accuracy for each classifier and training set):

Model  Training set  Best    Stable  SStd    Over-fitting  Time
CNN    Set 1         0.5000  0.5000  --      --            12.04 s/ep
CNN    Set 2         0.5000  0.5000  --      --            16.22 s/ep
CNN    Set 3         0.9808  0.9682  0.0090  No            27.80 s/ep
CNN    Set 4         0.9885  0.9796  0.0063  No            39.78 s/ep
TL     Set 1         0.9423  0.9148  0.0080  Yes            1.46 s/ep
TL     Set 2         0.5654  0.5084  --      --             1.45 s/ep
TL     Set 3         0.9346  0.9206  0.0109  No             2.82 s/ep
TL     Set 4         0.9385  0.9194  0.0081  No             4.23 s/ep

Fig.6. Training accuracy and validation accuracy for the four training datasets. SStd: the standard deviation of validation accuracy after 600 epochs; Best: maximum validation accuracy; Stable: average validation accuracy after 600 epochs; Over-fitting: whether the average validation accuracy after 400 epochs minus the average validation accuracy before 400 epochs is negative (i.e., whether validation accuracy declines in later epochs).
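
The summary statistics reported in Fig.6 can be computed from the per-epoch validation-accuracy curve as sketched below; the 400- and 600-epoch thresholds come from the caption, and the function assumes the run is longer than 600 epochs.

```python
# Fig.6 summary statistics from a validation-accuracy curve (one value per epoch).
import numpy as np

def summarize(val_acc):
    val_acc = np.asarray(val_acc, dtype=float)
    best = val_acc.max()                                  # Best: maximum validation accuracy
    stable = val_acc[600:].mean()                         # Stable: mean accuracy after 600 epochs
    sstd = val_acc[600:].std()                            # SStd: its standard deviation
    # Over-fitting: mean accuracy after epoch 400 minus mean accuracy before epoch 400 is negative
    over_fitting = (val_acc[400:].mean() - val_acc[:400].mean()) < 0
    return {"Best": best, "Stable": stable, "SStd": sstd, "Over-fitting": over_fitting}
```
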
CONCLUSIONS

➢ GAN-based image augmentation: to classify the normal ROIs and abnormal (tumor) ROIs from DDSM, adding GAN-generated ROIs to the training data can help the classifier avoid over-fitting, and the best validation accuracy using mixture ROIs reached 98.85%. Therefore, GAN could be a promising image-augmentation method.

➢ Transfer learning in CNN: the pre-trained CNN model (VGG-16) can automatically extract features from mammographic images, and a good NN classifier (achieving a stable average validation accuracy of about 91.48% for classifying abnormal vs. normal cases in the DDSM database) can be trained with only real ROIs.

➢ Combining the two deep-learning-based technologies: applying GAN for image augmentation and then using transfer learning in CNN for detection. Although training the transfer-learning model with added GAN ROIs did not perform better than training the CNN with added GAN ROIs, training the transfer-learning model was about 10 times faster than training the CNN.

In summary, adding GAN ROIs can help training avoid over-fitting, and image augmentation by GAN is necessary to train CNN classifiers from scratch. On the other hand, transfer learning is necessary when training on purely original (real) ROIs. Applying GAN to augment the training images for the CNN classifier obtained the best classification performance.
