
2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Using Deep Learning to Classify X-ray Images of Potential Tuberculosis Patients

Ojasvi Yadav and Kalpdrum Passi
Department of Mathematics and Computer Science
Laurentian University
Sudbury, Ontario, Canada
kpassi@laurentian.ca, yadavo@tcd.ie

Chakresh Kumar Jain
Department of Biotechnology
Jaypee Institute of Information Technology
Noida, India
ckj522@yahoo.com

978-1-5386-5488-0/18/$31.00 ©2018 IEEE

Abstract— Deep Learning is widely used for image classification. Its success relies heavily on data in which the region of interest occupies a sufficient share of the image (~10%). However, because the region of interest in medical images can be as low as 1% of the entire image, Deep Learning has not been readily applied to such cases. In this study, we employ recent Deep Learning techniques to classify X-ray images of potential Tuberculosis patients. Different types of learning rate enhancement techniques were used. Significant improvement was observed when coarse-to-fine knowledge transfer was employed to fine-tune the model further using multiple data augmentation techniques. We achieved an overall accuracy of 94.89% on the augmented images.

Keywords— Deep Learning; Convolutional Neural Network; Coarse-to-Fine knowledge transfer; Residual Networks; Learning rate decay; Discriminative Fine Tuning.

I. INTRODUCTION
Tuberculosis (TB) is a potentially serious infectious disease that mainly affects the lungs. Most of its symptoms can be detected by reviewing the chest X-ray images of potential patients. Tuberculosis has visual symptoms such as Consolidation, Fibrosis, Infiltration, Mass, Nodule, Pleural Thickening and Pneumonia [1], most of which can be recognized in a chest X-ray image.
Thus, it is of great importance to detect these symptoms when assessing whether a patient is infected with Tuberculosis. These symptoms, however, usually cover an extremely small region of the entire image, which risks the vanishing gradient problem when training on such images with a standard deep convolutional neural network (CNN). Residual networks [6], a variety of CNN, were used to overcome the vanishing gradient problem in cases where the region of interest is minute.
We can see from [2] that a previous attempt at classifying cases of Tuberculosis produced an appreciable accuracy of 89.6% using convolutional neural networks applied to a two-class dataset. The dataset used in [2] was organized into two categories, ‘normal’ and ‘abnormal’. The dataset employed for training the model was asymmetric, i.e. 4248 images in the ‘abnormal’ category and 453 images in the ‘normal’ category. Such a disproportion usually biases a convolutional neural network towards the larger class, which here was the class ‘abnormal’. Consequently, this could have led to multiple false positives when testing for tuberculosis, as the model was trained on far more cases of tuberculosis than normal cases. The study also used images in their raw format. No pre-processing or data augmentation was applied to the images before training or during testing of the model. The convolutional neural network model used for that study was a pre-trained GoogleNet. This model is based on several very small convolutions to drastically reduce the number of parameters. Each layer connects directly to the next layer via the inception module as described in “Fig. 1”. The study in [2] achieved an accuracy of 89.6% when testing for tuberculosis in chest X-rays.

Fig. 1. GoogleNet’s Inception module [16]

In the present research we use three different datasets of Tuberculosis [3, 9, 13] instead of the dataset used in [2], as that dataset is highly imbalanced. To increase the accuracy in our study, techniques like cosine annealing, restarts, discriminative fine tuning, noise and blur masks, flipping,


zooming and rotating were used to augment the dataset [3,9,13]. We applied coarse-to-fine knowledge transfer to reduce errors by more than 85%.

The main contribution of this research is to show that several techniques, when used in conjunction, can give high-accuracy results in classifying medical data.

The paper is organized as follows. We first describe the datasets and tools in Section II. The methodology is given in Section III. Section IV presents the results and discussion. Section V contains the conclusions of this study and states the prospects and limitations of this work.

II. DATASETS AND TOOLS

A. Datasets
The datasets used in our study are the "China-Shenzhen Chest X-ray Database" [3,9,13], the "Montgomery County Chest X-ray Database" [3,9,13] and the "NIH Chest X-ray Dataset of 14 Common Thorax Diseases" [15]. The NIH dataset is a large dataset of more than 100,000 medium-resolution images. The China and Montgomery datasets combined total just 800 images. However, they contain very high-resolution pictures of tuberculosis cases and can be accessed from various sources. Both the China dataset and the Montgomery dataset are Tuberculosis databases created by the National Library of Medicine, Maryland, USA. The NIH dataset is extracted from the clinical PACS database at the National Institutes of Health Clinical Center and consists of 112,120 frontal-view X-ray images of 30,805 unique patients with fourteen (14) text-mined disease image labels.

B. Tools
In our study, we used FastAi [4] as the main tool. It is a Deep Learning library built on top of PyTorch (a framework built on Python). All the results in this paper were obtained from models built using the FastAi open source library. We rented NVidia P4000 and NVidia P6000 cloud GPUs from Paperspace to speed up computation. For data augmentation, we used MATLAB [12] and the built-in data augmentation techniques of FastAi.

III. METHODOLOGY
There are three parts in our experiments: separation of the datasets into low and high resolution, pre-augmentation, and training.

A. Dataset Separation
Instead of mixing all three datasets together and not taking advantage of the dataset sizes and resolutions, we decided to use Coarse-to-Fine Knowledge Transfer [5]. In it, we trained our model on the large, low-resolution NIH dataset first and fine-tuned it on the small, high-resolution China and Montgomery datasets.

To our advantage, the NIH dataset also labels every image with a combination of 14 different lung diseases. Out of these 14 diseases, 7 are strong symptoms of Tuberculosis, namely Consolidation, Fibrosis, Infiltration, Mass, Nodule, Pleural Thickening and Pneumonia. We theorized that if our model can first learn to recognize these symptoms, it will be easier to build Tuberculosis knowledge on top of that prior knowledge. The dataset which contained the separated diseases was named ‘BIG_Lung’.

The creators of the Montgomery dataset have included lung masks, which are binary images that reveal only the lungs and block the rest of the image. These were used to great effect in data augmentation. As mentioned before, the main challenge in medical image classification is that the region of interest can be minuscule. Thus, it is helpful to isolate the probable regions of interest. We used these masks to achieve that and trained the model to look only at the regions of X-ray images where lungs are present and ignore the rest of the image.

Finally, the model was trained and validated on the China dataset as the last stage of Coarse-to-Fine Knowledge Transfer.

B. Pre-Augmentation
The second dataset, the Montgomery dataset, was heavily augmented. The results can be seen in “Fig. 2”.

For each image:
1. Histogram Equalization was applied to correct contrast and exposure anomalies.
2. The image was separated using the binary masks into the lungs and the rest of the image. This allowed different techniques to be applied to the region of interest and to the rest of the image.
3. Random Gaussian blurring was applied to the rest of the image. This reduced the focus on the non-lung regions.
4. Random Salt and Pepper noise was applied to the rest of the image. This degraded the non-lung image data further.
5. The lung region was sharpened to intensify our region of interest using the image sharpening tool in [12].

Fig. 2. An image before and after the pre-augmentation

The histogram equalization and image sharpening pre-augmentation techniques were also applied to the China dataset to remove contrast and exposure anomalies as well as to intensify the entire image. No pre-augmentation was applied to the NIH dataset, mainly because in Coarse-to-Fine Knowledge Transfer the first dataset can be a raw, low-quality dataset.

C. The Model Network
One way to get better accuracy is to design a deeper neural network architecture and train it on the images. However,

with increasing network depth, accuracy gets saturated and then degrades rapidly. Such degradation is not necessarily caused by overfitting, and adding more layers to such a deep model can lead to higher training error. This is related to the vanishing gradient problem. In gradient-based training methods, each of the neural network's weights receives an update proportional to the partial derivative of the error function with respect to the current weight in each iteration of training. The problem is that in some cases the gradient will be vanishingly small, effectively preventing the weight from changing its value. In the worst case, this may completely stop the neural network from further training.

A residual network solves this problem [6]. The distinguishing feature of a residual network is that the output of a layer is passed not only to the next layer but also, through shortcut connections, to layers appearing later in the network. Instead of assuming that each stack of layers directly fits a desired underlying mapping, we explicitly let these layers fit a residual mapping. The original mapping is recast into F(x)+x as described in “Fig. 3”. The authors of [6] hypothesize that it is easier to optimize the residual mapping than to optimize the original, unreferenced mapping, and they support this by showing that residual networks perform well on many image classification problems. In the extreme, if an identity mapping were optimal, it would be easier to push the residual to zero than to fit an identity mapping with a stack of nonlinear layers. ResNet-50 is a 50-layer network consisting of such residual modules and finishing with fully connected layers at the end.

Fig. 3. The building block of ResNet [6]

D. Coarse-to-Fine Transfer Learning
Transfer learning has been well established as a highly efficient technique to increase a model’s accuracy. If a model develops a certain intuition about one type of dataset, it can perform better on another dataset than it would without that prior intuition.
This was done in [17] by using a neural network already trained on one dataset as an a priori model to be trained and tested on a different dataset. The authors were able to get state-of-the-art results when using this technique.
We propose that in order to apply machine learning to one large dataset, it is better to first separate the dataset according to the resolution and quality of its images, then train on the low-resolution dataset, and finally transfer that learning to the higher-resolution dataset.

IV. RESULTS AND DISCUSSION

A. Coarse Model
As discussed in Section III-A, the NIH dataset was used for training the model with low-resolution images. The dataset comprised training and validation sets, each of which contained a ‘Normal’ and a ‘TB_Symp’ class. ‘Normal’ contained X-ray images of patients with normal lungs. ‘TB_Symp’ contained X-ray images of the lungs of patients having symptoms of Tuberculosis. We call this dataset ‘BIG_Lung’. The organization of the images in the ‘BIG_Lung’ dataset is shown in Table I. The idea behind this was to train the model to recognize possible hints of Tuberculosis and to get a basic idea of what the lungs of a potential Tuberculosis patient look like. The ratio between training and validation data was maintained at the commonly used 80:20.

TABLE I. BIG_LUNG DATASET SEPARATED BY DISEASES
Tuberculosis symptoms    No. of samples in training    No. of samples in validation
Consolidation            999                           201
Fibrosis                 976                           226
Infiltration             6052                          1535
Mass                     2034                          542
Nodule                   2587                          723
Pleural thickening       1706                          368
Pneumonia                389                           90

For training, we first employed transfer learning to retrain the last fully connected layers of a pre-trained ResNet-50. The hyperparameters of this training are given in Table II. The network was pre-trained on ImageNet, which is a relatively different dataset from the lung datasets used for this study.

TABLE II. HYPERPARAMETERS FOR TRANSFER LEARNING OF BIG_LUNG DATASET
Hyperparameters        Value
Max_zoom               1.2
Augmentations          Side-On
Dropout                0.25
Batch Size             16
Image Size             299 x 299
Learning Rate decay    Cosine Annealing
Learning Rate          0.01
Restarts               0
Epochs                 3

Max_zoom: 1.2 represents the maximum zoom (cropping) ratio that was used during data augmentation. This meant that an image would be zoomed in by at most 20% of its original size while generating augmented images.
Augmentations: Various augmentations of the original images were used to bring in variety. Augmentations like images flipped about the vertical axis were used because lung X-rays are largely symmetrical, barring a few minor characteristics. But the main idea was that Tuberculosis and its symptoms can occur in either of the lungs and our model should be able to recognize it in both cases. These augmented images were also rotated and brightened/dimmed slightly to incorporate more variety.
Dropout: A dropout of 0.25 meant that 25% of the activations were randomly dropped at each training step. This technique greatly reduced overfitting. The general intuition behind this technique is to prevent the model from relying on a few weights and instead develop a general knowledge of the data by employing as many weights as possible.
Batch Size: 16 was the image batch size used for the NIH dataset. This meant that in one iteration the model was trained on 16 images. The higher the batch size, the more general the model becomes. However, this reduces the model’s ability to classify certain odd classes. So there is a trade-off between generality and specificity when deciding this number.
Image Size: Images are typically resized to square inputs of either 224x224 pixels or 299x299 pixels before being fed to ResNet. We opted for the higher resolution, 299x299, for higher accuracy.
Learning rate decay: Cosine Annealing was employed as the learning rate decay [11]. This technique decays the learning rate along a cosine curve from 0 to π. The advantage of doing this is that the learning rate decays rapidly and then levels off at the end, so that the model can settle near a minimum instead of constantly jumping across minima in a spiky loss graph.
Learning Rate: A learning rate of 0.01 was used, which is a high learning rate for a batch size of 16. We used this because the model was pre-trained on ImageNet, which looks significantly different from our datasets. We acknowledge that the model requires extensive re-training to get better at the lung datasets.
Restarts: This technique [11] helps the model during stochastic gradient descent to escape local minima and move towards the global minimum of the loss by restarting the entire annealing repeatedly.
Epochs: 3 epochs were used to train the last fully connected layers (FCL). The training results are shown in Table III.

All these hyperparameters were determined by trial and error.

TABLE III. TRAINING FCL OF COARSE MODEL
Iteration    Training Loss    Validation Loss    Accuracy
0            0.601            1.170              0.436
1            0.561            1.019              0.452
2            0.527            0.959              0.433

Next, we used a learning rate finder [14], which estimates the loss at different learning rates and plots it. We used the resulting graph in “Fig. 4” to select the learning rate corresponding to the least validation loss.

Fig. 4. The learning rate finder on the FCL

The loss remained somewhat static from 10^-5 to 10^-2 before spiking up again. 10^-3 looked like an acceptable learning rate for the next training session. After training the last FCL, we then trained the entire model. For this we used the concept of Discriminative Fine Tuning [8]. The concept behind this is that the initial layers of a trained network are trained to recognize simple things like lines and corners. The middle layers are trained to recognize simple shapes like circles and polygons. The last layers are trained to recognize the complex structures which the model was designed to classify. Hence, we do not need to change the initial layers by a large margin, as their function, i.e. recognizing lines and corners, is elementary for most datasets. Most of the changes happen in the last layers and, to a moderate extent, in the middle layers. Hence, assigning an extremely low learning rate to the initial layers is desirable so they can retain their ability to recognize simple structures. The learning rate is progressively increased for later layers so that they re-learn how to recognize the complex structures. We retrained the entire model this time with the learning rates and other parameters shown in Table IV.

TABLE IV. HYPERPARAMETERS FOR TRAINING THE ENTIRE COARSE MODEL
Hyperparameters     Value
Learning rates      [10^-3/9, 10^-3/3, 10^-3]
Restarts            2
Epoch               1
Cycle Multiplier    2

Cycle Multiplier: This is another technique [11], used in conjunction with Restarts, which lengthens each successive annealing cycle, by a factor of 2 in this case. When the training comes out of one local minimum of the loss into another, lower minimum, it has improved. We ideally want it to stay there, hence we slow down the cosine annealing by delaying the next restart.

We trained the model using these hyperparameters. The results of the training are given in Table V.
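The interplay of cosine annealing, restarts, the cycle multiplier and per-layer-group learning rates can be illustrated with a minimal sketch in plain Python. It is not the FastAi implementation. The function names, the `steps_per_epoch` value and the choice of zero as the minimum learning rate are assumptions made for illustration. "Restarts" is read here as the number of annealing cycles, which is an interpretation: with one epoch in the first cycle and a multiplier of 2, three cycles last 1, 2 and 4 epochs.

```python
import math


def cycle_lengths(n_cycles, cycle_len=1, cycle_mult=2):
    """Return the length, in epochs, of each annealing cycle."""
    return [cycle_len * cycle_mult ** i for i in range(n_cycles)]


def cosine_lr(lr_max, progress):
    """Cosine annealing: lr_max at progress=0, falling towards 0 at progress=1."""
    return 0.5 * lr_max * (1.0 + math.cos(math.pi * progress))


def sgdr_schedule(lr_max, n_cycles, cycle_len=1, cycle_mult=2, steps_per_epoch=4):
    """Per-step learning rates; every cycle restarts from lr_max."""
    schedule = []
    for length in cycle_lengths(n_cycles, cycle_len, cycle_mult):
        total_steps = length * steps_per_epoch
        for step in range(total_steps):
            schedule.append(cosine_lr(lr_max, step / total_steps))
    return schedule


def layer_group_lrs(lr_max, divisors=(9, 3, 1)):
    """Discriminative rates for the early, middle and late layer groups."""
    return [lr_max / d for d in divisors]


if __name__ == "__main__":
    print(cycle_lengths(3))                      # [1, 2, 4], i.e. 7 epochs in total
    print(layer_group_lrs(1e-3))                 # smallest rate for the earliest layers
    print(sgdr_schedule(1e-3, n_cycles=2)[:5])   # decay, then a jump back to lr_max
```

The divisors (9, 3, 1) mirror the learning rates listed in Table IV; other stages of the study use different divisors.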

TABLE V. TRAINING ENTIRE COARSE MODEL
Iteration    Training Loss    Validation Loss    Accuracy
0            0.498            0.982              0.449
1            0.510            1.013              0.463
2            0.484            1.071              0.471

The results got progressively better. Hence, we trained the model once again with a better learning rate using the learning rate finder in “Fig. 5”.

Fig. 5. The learning rate finder on the entire coarse model

This time the loss remained static from 10^-5 to 10^-3 before spiking up again, and the accuracy had been progressing linearly in the previous run. Thus, we did not change the learning rate, kept it at 10^-3, and trained the model again as shown in Table VI with 3 Restarts, the remaining hyperparameters being the same as in Table IV. As the learning rate stabilized in the end, this was considered sufficient for the coarse stage, as there was a great amount of variety in this dataset and the image quality was not the best either.

TABLE VI. TRAINING ENTIRE COARSE MODEL AGAIN
Iteration    Training Loss    Validation Loss    Accuracy
0            0.469            1.082              0.473
1            0.477            1.019              0.471
2            0.472            1.076              0.479
3            0.478            1.186              0.478
4            0.490            1.002              0.476
5            0.451            1.046              0.507
6            0.441            1.064              0.509

B. Medium Model
For this stage we used heavily augmented images from the Montgomery dataset as described in Section III-B. The main objective was to train the model to see only the lungs and ignore the rest of the image. We maintained the 80:20 ratio of training to validation data. For this model we only trained the FCL using the hyperparameters in Table VII. We opted not to train the entire model because doing so degraded the accuracy with each epoch. This was because the model we started with was a general model trained on 100,000 chest X-ray images. If the convolutional layers were trained on the Montgomery dataset, the generality of the model would be exchanged for specificity towards the Montgomery dataset. Although validation against the Montgomery dataset would give better results, we were validating against the China dataset, so specificity towards the Montgomery dataset was not desired.

TABLE VII. HYPERPARAMETERS FOR TRAINING THE MEDIUM MODEL
Hyperparameters     Value
Learning rate       10^-2
Restarts            0
Epoch               4
Cycle Multiplier    1

The results of training using these values are given in Table VIII.

TABLE VIII. TRAINING ENTIRE MEDIUM MODEL
Iteration    Training Loss    Validation Loss    Accuracy
0            0.442            0.200              0.897
1            0.419            0.222              0.897
2            0.403            0.205              0.918
3            0.351            0.183              0.938

We trained it again with a cycle multiplier so that the loss stabilized, using a learning rate of 4x10^-3, 3 Restarts, 1 epoch and a cycle multiplier of 2. The resulting accuracy is shown in Table IX.

TABLE IX. TRAINING ENTIRE MEDIUM MODEL AGAIN
Iteration    Training Loss    Validation Loss    Accuracy
0            0.295            0.192              0.928
1            0.314            0.221              0.908
2            0.289            0.193              0.938

We observed results consistently above 90% accuracy. This was a good indication to move towards the Fine Model.

C. Fine Model
At this stage the model was well generalized and knew how to look at lungs and ignore the rest of the image. Next, we fine-tuned this model to specialize for the China dataset. We used a ratio of 85:15 for training to validation for the China dataset. This was due to the smaller-than-ideal size of the dataset (662 images in total). As in the coarse model, we first trained the FCL using the hyperparameters in Table X. The results are shown in Table XI. Then the entire network was trained.

TABLE X. HYPERPARAMETERS FOR TRAINING THE FINE MODEL
Hyperparameters     Value
Learning rate       1 x 10^-3
Restarts            0
Epoch               3
Cycle Multiplier    1
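The 85:15 training-to-validation split used for the China dataset could be produced along the following lines. This is only a sketch under stated assumptions: the paper does not say how the split was drawn, so the per-class (stratified) shuffling, the fixed seed, the function name `split_by_class` and the file names are all hypothetical. The class sizes in the usage example are made up and merely sum to the 662 images mentioned above.

```python
import random


def split_by_class(paths_by_class, val_fraction=0.15, seed=0):
    """Shuffle each class separately and hold out val_fraction of it."""
    rng = random.Random(seed)
    train, val = [], []
    for label, paths in sorted(paths_by_class.items()):
        shuffled = list(paths)
        rng.shuffle(shuffled)
        n_val = round(len(shuffled) * val_fraction)
        val += [(path, label) for path in shuffled[:n_val]]
        train += [(path, label) for path in shuffled[n_val:]]
    return train, val


if __name__ == "__main__":
    # Hypothetical file names and class sizes, for illustration only.
    data = {
        "normal": [f"normal_{i}.png" for i in range(331)],
        "tuberculosis": [f"tb_{i}.png" for i in range(331)],
    }
    train, val = split_by_class(data)
    print(len(train), len(val))
```

Splitting each class separately keeps the class proportions of the validation set close to those of the training set, which matters for a dataset this small.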

TABLE XI. TRAINING ENTIRE FINE MODEL
Iteration    Training Loss    Validation Loss    Accuracy
0            0.445            0.218              0.928
1            0.381            0.212              0.938
2            0.386            0.241              0.877

We then used the learning rate finder to find the optimal learning rate by analyzing “Fig. 6”.

Fig. 6. The learning rate finder on the fine model

We kept 10^-3 as the next learning rate because we needed a higher learning rate to retrain the general model into a specific model, although other hyperparameters were changed as shown in Table XII, with the corresponding results in Table XIII.

TABLE XII. HYPERPARAMETERS FOR TRAINING ENTIRE FINE MODEL
Hyperparameters     Value
Learning rates      [10^-3/4, 10^-3/2, 10^-3]
Restarts            2
Epoch               1
Cycle Multiplier    2

TABLE XIII. TRAINING ENTIRE FINE MODEL
Iteration    Training Loss    Validation Loss    Accuracy
0            0.398            0.212              0.908
1            0.457            0.219              0.897
2            0.309            0.143              0.928

We used the learning rate finder again and got the results shown in “Fig. 7”.

Fig. 7. The learning rate finder on the fine model

The loss was increasing throughout the graph, hence we opted for the learning rate corresponding to the least loss, 8x10^-5, for the next training session. Table XIV shows the hyperparameters and Table XV shows the corresponding results.

TABLE XIV. HYPERPARAMETERS FOR TRAINING THE FINE MODEL AGAIN
Hyperparameters     Value
Learning rate       [8x10^-5/16, 8x10^-5/4, 8x10^-5]
Restarts            2
Epoch               1
Cycle Multiplier    2

TABLE XV. TRAINING ENTIRE FINE MODEL AGAIN
Iteration    Training Loss    Validation Loss    Accuracy
0            0.190            0.145              0.938
1            0.221            0.140              0.928
2            0.199            0.142              0.928

The validation loss was climbing in the last iteration. This was an indication that further training would have resulted in overfitting. Hence, we stopped here with a respectable accuracy of 92.8%.

In all the above training sessions, the accuracy was measured on raw images. To enhance the accuracy further, we tested the model on augmented versions of the validation set and averaged the predictions over the augmented versions. This is also known as Test Time Augmentation [7]. A test time augmentation accuracy of 94.89% was obtained. The accuracy was higher because the model got to look at all the augmented versions of each image before making a prediction, instead of predicting from only one instance of the validation image.
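The averaging step of Test Time Augmentation can be shown with a small, self-contained sketch. It does not reproduce the FastAi routine used in the study. The `toy_predict` function is a hypothetical stand-in for a trained model, images are plain lists of pixel rows, and the left-right mirror is only one example of an augmentation.

```python
def hflip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def tta_predict(predict_fn, image, augmentations):
    """Average predict_fn over the original image and its augmented views."""
    views = [image] + [augment(image) for augment in augmentations]
    scores = [predict_fn(view) for view in views]
    return sum(scores) / len(scores)


def toy_predict(image):
    """Hypothetical classifier: share of total brightness in the left half."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    total = sum(sum(row) for row in image)
    return left / total


if __name__ == "__main__":
    image = [[1, 0], [1, 0]]
    print(toy_predict(image))                         # score from a single view
    print(tta_predict(toy_predict, image, [hflip]))   # score averaged over two views
```

The toy classifier is deliberately sensitive to which side of the image is bright, so the single-view score and the averaged score differ. Averaging dampens that kind of view-specific response.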
D. Further Results
The confusion matrix of the Fine model gave a picture of
the true positives, true negatives, false positives and false
negatives as shown in “Fig. 8”.
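For readers who want to derive summary figures from such a confusion matrix, the following sketch counts the four cells and computes accuracy, sensitivity and specificity. The labels in the usage example are invented for illustration and are not the study's predictions. The function names and the convention that 1 denotes Tuberculosis are assumptions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summary_metrics(counts):
    """Accuracy, sensitivity (recall on positives) and specificity."""
    total = sum(counts.values())
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / total,
        "sensitivity": counts["tp"] / (counts["tp"] + counts["fn"]),
        "specificity": counts["tn"] / (counts["tn"] + counts["fp"]),
    }


if __name__ == "__main__":
    # Invented labels: 1 = Tuberculosis, 0 = Normal.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    counts = confusion_counts(y_true, y_pred)
    print(counts)
    print(summary_metrics(counts))
```

Sensitivity and specificity are worth reporting alongside accuracy in a screening setting, because a missed Tuberculosis case and a false alarm carry different costs.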

Fig. 8. The confusion matrix of the Fine model

To visualize certain key results, we use “Fig. 9” through “Fig. 13”.

Fig. 9. Most confident Normal cases, with accuracy

Fig. 10. Most confident Tuberculosis cases, with accuracy

Fig. 11. Incorrectly classified Normal cases, with accuracy

Fig. 12. Incorrectly classified Tuberculosis cases, with accuracy

Fig. 13. Most uncertain predictions (closest to 0.5), with accuracy

V. CONCLUSIONS AND LIMITATIONS
With a shortage of radiologists in resource-poor areas, there is a need for technology-assisted tuberculosis detection to reduce the time and effort spent in detecting tuberculosis.
Using coarse-to-fine knowledge transfer, along with several other techniques, we went from 50% accuracy to 94.89% accuracy in detecting tuberculosis.
Medical analysis using deep learning is still not as impressive as experts would want it to be. This study concludes that we can get closer to that level of precision by using multiple recent advancements in deep learning in the relevant situations.
We can also pursue different models [10] which look at separate segments of the image individually. This will increase the relative area of the region of interest and should thus improve accuracy.
One limitation of this study is that the model, in its final state, is specifically trained for the China dataset. That means we cannot guarantee the same 94.89% accuracy on a chest X-ray image that was not taken in a setting similar to that of the China dataset. However, increasing amounts of data will become available in the future. A large-scale Tuberculosis chest X-ray database will help immensely in creating generalized models which are accurate across a variety of chest X-ray datasets.

REFERENCES
[1] J. Burrill, C.J. Williams, G. Bain, G. Conder, A.L. Hine, and R.R. Misra. “Tuberculosis: A Radiologic Review”. RadioGraphics, 2007, Vol. 27, No. 5, pp. 1255-1273.
[2] Y. Cao, C. Liu, B. Liu, M. J. Brunette, N. Zhang, T. Sun, P. Zhang, J. Peinado, E. S. Garavito, L. L. Garcia, W. H. Curioso. “Improving Tuberculosis Diagnostics Using Deep Learning and Mobile Health Technologies among Resource-Poor and Marginalized Communities”. 2016 IEEE First International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), Washington, DC, 2016, pp. 274-281. doi: 10.1109/CHASE.2016.18
[3] S. Candemir, S. Jaeger, K. Palaniappan, J.P. Musco, R.K. Singh, Z. Xue, A. Karargyris, S. Antani, G. Thoma, C.J. McDonald. “Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration”. IEEE Trans Med Imaging, 2014 Feb; 33(2):577-90. doi: 10.1109/TMI.2013.2290491. PMID: 24239990
[4] fastai, J. Howard, 2018, GitHub, https://github.com/fastai/fastai
[5] A. Franco and L. Oliveira. “A coarse-to-fine deep learning for person re-identification”. 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, 2016, pp. 1-7. doi: 10.1109/WACV.2016.7477677
[6] K. He, X. Zhang, S. Ren, J. Sun. “Deep Residual Learning for Image Recognition”. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27-30 June, 2016, Las Vegas, NV, USA, pp. 770-778. arXiv:1512.03385.
[7] K. He, X. Zhang, S. Ren, J. Sun. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification”. 2015 IEEE International Conference on Computer Vision (ICCV), Dec 7-13, 2015, Santiago, Chile, pp. 1026-1034.
[8] J. Howard, S. Ruder. “Universal Language Model Fine-tuning for Text Classification”. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, July 15-20, 2018, pp. 328-339. arXiv:1801.06146.
[9] S. Jaeger, A. Karargyris, S. Candemir, L. Folio, J. Siegelman, F. Callaghan, Z. Xue, K. Palaniappan, R.K. Singh, S. Antani, G. Thoma, Y.X. Wang, P.X. Lu, C.J. McDonald. “Automatic tuberculosis screening using chest radiographs”. IEEE Trans Med Imaging, 2014 Feb; 33(2):233-45. doi: 10.1109/TMI.2013.2284099. PMID: 24108713
[10] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu, A.C. Berg. “SSD: Single Shot MultiBox Detector”. 14th European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands, October 11-14, 2016, pp. 21-37, LNCS, Vol. 9905, Springer, Cham. arXiv:1512.02325.
[11] I. Loshchilov, F. Hutter. “SGDR: Stochastic Gradient Descent with Warm Restarts”. International Conference on Learning Representations (ICLR 2017), Toulon, France, April 24-26, 2017. arXiv:1608.03983.
[12] MATLAB and Statistics Toolbox Release 2017b, The MathWorks, Inc., Natick, Massachusetts, United States.
[13] National Library of Medicine, National Institutes of Health, Bethesda, MD, USA and Shenzhen No.3 People’s Hospital, Guangdong Medical College, Shenzhen, China
[14] L.N. Smith. “Cyclical Learning Rates for Training Neural Networks”. 2017 IEEE Conference on Applications of Computer Vision (WACV), 24-31 March, 2017, Santa Rosa, CA, USA. arXiv:1506.01186.
[15] X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, R. M. Summers. “ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases”. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July, 2017, Honolulu, HI, USA, pp. 3462-3471.
[16] https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/googlenet.html
[17] T. Tommasi, B. Caputo. “Frustratingly Easy NBNN Domain Adaptation”. 2013 IEEE International Conference on Computer Vision. INSPEC Accession Number: 14132259, doi: 10.1109/ICCV.2013.116. Electronic ISBN: 978-1-4799-2840-8.
