
Deep Learning in Liver Biopsies using Convolutional Neural Networks

Alexandros Arjmand*, Constantinos T. Angelis*, Alexandros T. Tzallas*, Markos G. Tsipouras†, Evripidis Glavas*,
Roberta Forlano‡, Pinelopi Manousou‡ and Nikolaos Giannakeas*
*Department of Informatics and Telecommunication, University of Ioannina, Arta, Greece
†Department of Informatics and Telecommunications Engineering, University of Western Macedonia, Kozani, Greece
‡Liver Unit / Division of Integrative Systems Medicine and Digestive Disease, Department of Surgery and Cancer, Imperial College, London, UK

Abstract—Nonalcoholic fatty liver disease (NAFLD) presents a wide range of pathological conditions, varying from nonalcoholic steatohepatitis (NASH) to cirrhosis and hepatocellular carcinoma (HCC). These conditions are characterized by increased fat accumulation and hepatocellular ballooning. They have become a cause of concern among physicians and engineers, as significant implications arise regarding their accurate diagnosis and treatment. Although magnetic resonance, ultrasonography and other noninvasive methods can reveal the presence of NAFLD, quantitative image interpretation through histology has become the gold standard in clinical examinations. The proposed work introduces a fully automated diagnostic tool that takes into account the high discrimination capability of histological findings in liver biopsy images. The developed methodology is based on deep supervised learning and image analysis techniques, and determines an efficient convolutional neural network (CNN) architecture that ultimately achieves a classification accuracy of 95%.

Keywords—Liver Biopsies; Fatty Liver; Hepatocyte Ballooning; Deep Learning; Convolutional Neural Networks

I. INTRODUCTION

Nonalcoholic fatty liver disease (NAFLD) is the most common cause of liver ailment worldwide, with prevalence estimates ranging from 25% to 45% in most clinical studies [1]. Its existence is characterized by significant evidence of hepatic steatosis, as well as other causes of fat accumulation, such as significant alcohol consumption, long-term use of a steatogenic medication and monogenic hereditary disorders [2]. These contribute to the activation of a pro-inflammatory environment that engenders hepatocellular injury, resulting in a portion of NAFLD patients developing cirrhosis and undergoing liver transplantation due to complications arising from portal hypertension, hepatic failure and hepatocellular cancer [3]. Physicians' clinical interest is now focused on nonalcoholic steatohepatitis (NASH), because it is a progressive form of NAFLD [2, 4]. The minimal histological requirements for the diagnosis of NASH include ≥5% liver steatosis, lobular inflammation and ballooning of hepatocytes without any evidence of hepatocellular damage. In particular, ballooning degeneration belongs to the class of chronic liver diseases, as it is closely linked to advanced liver fibrosis.

Even though liver biopsy is currently considered the gold standard for NAFLD and NASH activity evaluation, it remains an invasive method [5]. Furthermore, since the visual counting of these findings is a difficult and time-consuming process, recent studies have emphasized the development of new recognition methods based on digital image processing techniques. An automated and accurate examination would therefore exclude false positive results that cause the misdiagnosis of NAFLD and NASH [6]. It would also contribute to the improvement of the patient's pathological condition through treatment intervention, even if liver transplantation were required in cases of end-stage decompensated cirrhosis.

The current work presents a novel methodology for the analysis of multiple hepatic structures from biopsy images, by building and training two convolutional neural networks with a common architecture. With appropriate parameters, deep learning-based techniques are efficient, particularly in medical image analysis, owing to their flexible design and their ability to overcome the problems caused by the hand-crafted features used in traditional techniques [7-9]. The purpose of the new network topology is to solve a 4-class prediction problem over different histological findings, namely: a) ballooned hepatocytes, b) fat droplets, c) sinusoids and d) veins. The study's main objectives are the precise classification of ballooning degeneration and fat accumulation areas (Fig. 1), as well as the exclusion of the remaining two anatomical structures from the findings of liver disease. As a final stage, the developed 4-class detection system can be integrated into a complete methodology for calculating the fat and ballooning ratio, thereby providing a more representative picture of the patient's clinical condition.

Fig. 1. Fat droplet and balloon cell samples retrieved by a liver biopsy.
This work is partly funded by the project xBalloon, co-financed by the European Union and Greek national funds through the Operational Program for Research and Innovation Smart Specialisation Strategy (RIS3) of Ipeiros (Project Code: 5033187).

978-1-7281-1864-2/19/$31.00 ©2019 IEEE 496 TSP 2019


II. METHODOLOGY

A three-step classification method is developed, which ultimately leads to the automatic computation of the ballooning and fat ratio in the entire liver tissue:

• The collection of a sufficient number of isolated training samples from digitized biopsies, corresponding to the four classes of findings.
• The training of two convolutional neural networks that share the same architecture but employ different optimization algorithms, and the estimation of their performance on a number of validation images.
• The characterization of unknown anatomical structures, leading to the isolation and quantification of NAFLD and NASH prevalence.

A. Histological Features Isolation

All biopsy slides involved in this study come from NAFLD cases collected at St. Mary Hospital (Imperial College Healthcare NHS Trust of London, UK). Part of the population suffers from a high rate of hepatocyte ballooning, while large areas of fat accumulation are found in all cases. In this study, all needle specimens are stained with the gold standard Hematoxylin and Eosin (H&E) stain to highlight the regions of interest. Images were scanned and downsampled at ×20 magnification using a Hamamatsu microscope (Hamamatsu Photonics, Hamamatsu, Japan). This step makes the image data suitable for digital processing and analysis, since the original images exceeded 10,000 × 10,000 pixels in size.

Histological features are then extracted as image patches from the entire tissue area. These patches are stored separately in four categories, one per class of objects. Accordingly, an identification label is assigned to each liver class: a) ballooning, b) fat, c) sinusoid and d) vein. In total, 720 biopsy findings are provided, forming a balanced image dataset (180 samples per class), which in the next stage can be resized and processed by the constructed deep CNNs. The dataset is divided into three subsets so that the training performance and the classification capability of the two neural networks can be measured: a) 620 training samples, b) 60 validation samples and c) 40 testing images unknown to each classifier.

B. Convolutional Neural Network Architecture

In this paper, a CNN topology (Fig. 2) is employed to learn features from the extracted biopsy findings. In the first phase, critical consideration is given to the size of the initial input layer. Since all image patches refer to microscopic samples with a small magnification factor, the classification model is set to be trained on 64×64 pixel data, with 3 color channels of RGB information. Because this generates a massive number of 12,288 connection weights per hidden neuron (in the first hidden layer) when modeling image data, a series of convolution operations is used for dimensionality reduction and computational acceleration [10]. The first convolution layer is set to detect edge and shape features from the raw image data by using 64 filters with a 5-by-5 kernel size. For example, the amount of pure white pixels that characterize either circular fat cells or irregularly shaped sinusoids can be initially determined. Given that convolution performs a multiplication of multidimensional array inputs, or tensors [11], its output is calculated as:

$$h_j(x, y) = \sum_{i=1}^{C} (f_i * g_{ij})(x, y) \tag{1}$$

where $h_j(x, y)$ denotes the $j$th non-activated feature map at spatial location $(x, y)$, $g_{ij}$ the kernel between $h_j$ and the $i$th input channel $f_i$, and $C$ the number of input channels over which the weighted sum runs [12]. While the suggested number of filters scan the RGB color channels with a stride of 1, a 2D convolution is executed by applying the following formula:

$$(f_i * g_{ij})(x, y) = \sum_{m} \sum_{n} f_i(m, n)\, g_{ij}(x - m, y - n) \tag{2}$$

It should be noted that each convolution layer uses zero padding, which places 0 values around the inputs. This ensures that the kernel will not lead to any data loss and will keep the output size equal to the input size [13]. As mentioned above, all convolution functions are executed simultaneously. This can lead to a series of unexpected results, especially during backpropagation [11, 14]. For this reason, batch normalization is applied in each layer to aid the training process, making it more robust to coordinated weight updates. During this procedure, a minibatch H of 64 activations, arranged as the rows of a matrix, is used to normalize the filtered convolution values before they are passed to a nonlinear activation function. To address the nonlinear nature of the processed data, the Rectified Linear Unit (ReLU) function, f(x) = max(0, x), is used to mitigate the vanishing gradient problem [15]. Next, in order to pass the process on to the second convolution layer, a 2×2 max pooling

Fig. 2. The proposed CNN architecture used in the experiments.
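As a rough aid to reading the architecture in Section II-B, the following minimal Python sketch traces how the tensor shape changes from layer to layer. It assumes stride-1 convolutions with zero ("same") padding and a 2×2 max pooling with stride 2 after the first two convolution layers only, as the text describes. The helper names `conv_same`, `max_pool`, `flat_size` and `trace_shapes` are hypothetical and are not taken from the authors' implementation.

```python
# Shape bookkeeping for the CNN described in Section II-B (illustrative only).
# Assumptions: stride-1 convolutions with zero ("same") padding, and 2x2 max
# pooling with stride 2 after the first two convolution layers only.

def conv_same(shape, n_filters, kernel):
    """Output shape of a stride-1 convolution with zero padding (odd kernel)."""
    height, width, _ = shape
    pad = (kernel - 1) // 2                   # zeros added on each border
    out_h = height + 2 * pad - kernel + 1     # equals height for odd kernels
    out_w = width + 2 * pad - kernel + 1
    return (out_h, out_w, n_filters)


def max_pool(shape, size=2, stride=2):
    """Output shape of an unpadded max pooling layer."""
    height, width, channels = shape
    return ((height - size) // stride + 1, (width - size) // stride + 1, channels)


def flat_size(shape):
    """Number of values in a tensor of the given shape."""
    height, width, channels = shape
    return height * width * channels


def trace_shapes(input_shape=(64, 64, 3)):
    """Return an ordered mapping from layer name to output shape."""
    shapes = {"input": input_shape}
    shapes["conv1"] = conv_same(shapes["input"], n_filters=64, kernel=5)
    shapes["pool1"] = max_pool(shapes["conv1"])
    shapes["conv2"] = conv_same(shapes["pool1"], n_filters=32, kernel=3)
    shapes["pool2"] = max_pool(shapes["conv2"])
    shapes["conv3"] = conv_same(shapes["pool2"], n_filters=16, kernel=3)
    return shapes


for name, shape in trace_shapes().items():
    print(f"{name:6s} {shape} -> {flat_size(shape)} values")
```

Under these assumptions, the flattened input holds 64 × 64 × 3 = 12,288 values, which matches the number of connection weights per hidden neuron quoted in the text, and the third convolution layer produces 16 × 16 × 16 = 4,096 values before the dense layer.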

filtering is performed with a stride of 2. This is highly recommended, as it helps to break up each feature map into equally sized tiles [14] and to further decrease overfitting by reducing the spatial size (width and height) of the data representation [10].

In the second convolution layer, 32 filters with a 3-by-3 kernel size are set to look for smaller features inside each liver finding. For instance, the emphasis is now on ballooned hepatocytes within a white region and on red pixels that indicate the blood cells of a hepatic vein. Batch normalization, ReLU activation and max pooling are also included, while a dropout layer with a probability of 0.5 is introduced at this point. This process is repeated once more in the third convolution layer, with the difference that 16 filters with a 3-by-3 kernel size are now used. The purpose is to give even greater emphasis to the textural differences between the four examined histological findings. Max pooling is no longer necessary here, as the network now transitions to the fully connected layers.

At this point, a dense layer with 4,096 neurons is defined to gather the anatomical features detected by the three convolution layers, and it is connected to the final softmax layer. The dense and softmax layer connections act similarly to a classic artificial neural network, while the softmax function assigns a probability distribution to generate predictions for the four hepatic classes [16].

C. Optimization Algorithms

So far, several techniques have been incorporated into the constructed deep model to speed up its training. Two modern optimizers are now used to further accelerate learning and to determine the values that minimize the cost function. The first optimizer is the Stochastic Gradient Descent with Momentum (SGDM) algorithm. Specifically, the momentum accumulates an exponentially decaying average of past gradients and continues to move in their direction [11]. The second optimizer is Adaptive Moment Estimation (Adam). It belongs to the family of stochastic optimization algorithms and has gained popularity because it requires only first-order gradients and little memory [17]. Since the method is adaptive, it computes individual adaptive learning rates for different parameters from estimates of the first moment (mean) and second raw moment (uncentered variance) of the gradients.

III. RESULTS AND DISCUSSION

In this section, the behavior of the deep supervised models, which are ultimately intended for objective quantitative measurements of the examined liver diseases, is monitored through a series of evaluation steps. These include the validation of the training performance, the visualization of the learned histological features and, most importantly, the reliable identification of unknown testing samples.

A. Training and Validation Results

Having defined the CNN topology, the focus now turns to the training procedure, which is set to run for 30 epochs. An epoch involves a full cycle over the entire training set (620 samples), and this value proves to be sufficient for the current amount of data. In each case and at regular intervals during training, the accuracy on the validation data is calculated. Recall that the validation set is not used to update the network weights, but to assess how much a model suffers from either overfitting or underfitting and to provide an early assessment of its classification capability.

B. Visualization of Learned Features

The second evaluation step investigates ballooned cell and fat droplet features by observing which areas of interest are filtered by the first two convolution layers and how they are finally activated by their corresponding ReLU functions. A key feature of each convolution layer is that it converts the color input image into many 2D channels, each in the form of a grayscale image (Fig. 3). This is a classic and efficient image processing technique, since each pixel involved in the activation of a channel corresponds to the same (x, y) spatial location in the original input image [18]. A channel that is mostly gray does not activate as strongly as a white pixel. A similar logic applies to the rectified linear unit that follows each network layer. For the ReLUs, each activation is scaled to a minimum value of 0 and a maximum value of 1. Consequently, bright white pixels represent strong positive activations, while pure black pixels represent strong negative ones. As shown in Fig. 3, the channels in the first layer mostly learn simple features, including edges and cracks contained in NAFLD findings, while channels in the second layer focus on more complex features, such as the hepatocytes in a ballooning area and the texture of a lipid droplet. The second convolution layer also confirms that dropout leads to a slight blurring of the images, which forces the discriminating model to learn more important features that are less co-adapted and lead to better generalization.

C. Testing Results

To test the reliability of the developed methodology, the two trained models are asked to identify 40 unknown liver structures (10 per class) so that the classification accuracy can be measured. For this task, the learning algorithms use a function y = f(x) to assign an input image described by a vector x to a class identified by a class label y ∈ {ballooning, fat, sinusoid, vein}. The function thus outputs a probability value in the interval [0, 1] for each of the four classes. Based on the reported percentages (Table I), it is clear that the classifiers are more stable in detecting swollen hepatocytes, as these consist of multiple changes in the values of their neighboring pixels. They also successfully separate circular structures that correspond not to steatotic fat cells but to hepatic veins, as a number of red blood cells are contained therein.

TABLE I. ACCURACY RESULTS

Deep Model   Ballooning (%)   Fat (%)   Sinusoid (%)   Vein (%)   Testing Performance (%)
CNNadam      100              100       70             100        92.5
CNNsgdm      90               100       90             100        95
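The overall testing performance in Table I can be reproduced from the per-class results. The short Python sketch below assumes, as stated in Section III-C, that each class contributes 10 test images, so the overall accuracy is the number of correctly classified images divided by all 40 images. The dictionary layout and the function name `overall_accuracy` are illustrative choices and not part of the original work.

```python
# Recompute the overall testing accuracy in Table I from per-class accuracies.
# Assumption: 10 test images per class (40 in total), as stated in Section III-C.

SAMPLES_PER_CLASS = 10

per_class_accuracy = {
    "CNNadam": {"ballooning": 100, "fat": 100, "sinusoid": 70, "vein": 100},
    "CNNsgdm": {"ballooning": 90, "fat": 100, "sinusoid": 90, "vein": 100},
}


def overall_accuracy(class_accuracy, samples_per_class=SAMPLES_PER_CLASS):
    """Overall accuracy (%) = correctly classified images / all images."""
    correct = sum(acc * samples_per_class for acc in class_accuracy.values()) / 100
    total = samples_per_class * len(class_accuracy)
    return 100 * correct / total


for model, accuracies in per_class_accuracy.items():
    print(f"{model}: {overall_accuracy(accuracies):.1f}%")
```

With the values from Table I this gives 92.5% for the Adam-trained network (37 of 40 images) and 95% for the SGDM-trained network (38 of 40 images).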

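The two update rules described in Section II-C can be illustrated on a toy problem. The sketch below applies SGD with momentum and Adam to the one-dimensional cost J(w) = (w - 3)^2. The cost function, learning rates, momentum coefficient and step counts are arbitrary illustrative choices, since the paper does not report its hyperparameters; the Adam settings beta1 = 0.9, beta2 = 0.999 and eps = 1e-8 are the defaults suggested in [17].

```python
import math


def grad(w):
    """Gradient of the toy cost J(w) = (w - 3)^2 (an illustrative assumption)."""
    return 2.0 * (w - 3.0)


def sgdm(w, lr=0.1, momentum=0.9, steps=200):
    """SGD with momentum: the velocity is a decaying sum of past gradients."""
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(w)
        w += velocity
    return w


def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: step scaled by bias-corrected first and second moment estimates."""
    m = 0.0  # first moment estimate (mean of gradients)
    v = 0.0  # second raw moment estimate (uncentered variance)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


print("SGDM estimate:", sgdm(0.0))
print("Adam estimate:", adam(0.0))
```

Both routines start from w = 0 and approach the minimizer w = 3; the difference lies in how the step is formed, with Adam dividing by the square root of the second moment estimate to obtain a per-parameter step size.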
Fig. 3. Visualization of histological features learned by the first two convolution layers.
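To display a feature map as a grayscale image of the kind shown in Fig. 3, the activations are rescaled to the range [0, 1], as Section III-B describes. The following sketch shows one common way to do this (min-max rescaling) on a small nested list. The function names `relu` and `rescale_to_unit` and the sample values are hypothetical, and the authors' actual visualization tooling is not specified in the text.

```python
def relu(feature_map):
    """Apply f(x) = max(0, x) to every activation of a 2D feature map."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def rescale_to_unit(feature_map):
    """Min-max rescale a 2D list of activations to the range [0, 1]."""
    values = [value for row in feature_map for value in row]
    low, high = min(values), max(values)
    if high == low:
        # A constant map carries no contrast; render it as uniform black.
        return [[0.0 for _ in row] for row in feature_map]
    return [[(value - low) / (high - low) for value in row] for row in feature_map]


# Hypothetical 2x2 convolution output, before and after the ReLU.
example = [[-2.0, 0.0], [1.0, 2.0]]
conv_image = rescale_to_unit(example)
relu_image = rescale_to_unit(relu(example))
print(conv_image)
print(relu_image)
```

After rescaling, 0 is rendered as black and 1 as white, so the strongest activation in each channel always appears as the brightest pixel regardless of its absolute magnitude.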

IV. CONCLUSION

In the current study, a deep learning approach for identifying the presence of various liver biopsy findings is proposed. The developed methodology focuses on the training of two convolutional neural networks that employ different optimization algorithms. According to the overall performance, the supervised models can produce accuracy results of up to 95%, with SGDM being the most efficient optimizer. Compared to traditional semi-quantitative diagnostic methods, CNNs provide fully automated activations for detecting diverse histological features, including edges, shape, pixel intensity and texture. The trained networks are able to accurately detect the main differences between ballooned hepatocytes and fat droplets, two findings responsible for the increasing prevalence of NAFLD and NASH in recent years.

REFERENCES

[1] M. E. Rinella, “Nonalcoholic fatty liver disease: A systematic review,” JAMA (Systematic Review), vol. 313, no. 22, pp. 2263–2273, June 2015.
[2] N. Chalasani, Z. Younossi, J. E. Lavine, M. Charlton, K. Cusi, M. Rinella, S. A. Harrison, E. M. Brunt, and A. J. Sanyal, “The diagnosis and management of nonalcoholic fatty liver disease: Practice guidance from the American association for the study of liver diseases,” Hepatology, vol. 67, no. 1, pp. 328–357, January 2018.
[3] P. Angulo, D. E. Kleiner, S. Dam-Larsen, L. A. Adams, E. S. Bjornsson, P. Charatcharoenwitthaya, P. R. Mills, J. C. Keach, H. D. Lafferty, A. Stahler, S. Haflidadottir, and F. Bendtsen, “Liver fibrosis, but no other histologic features, is associated with long-term outcomes of patients with nonalcoholic fatty liver disease,” Gastroenterology, vol. 149, no. 2, pp. 389–397, August 2015.
[4] Z. M. Younossi, M. Stepanova, N. Rafiq, L. Henry, R. Loomba, H. Makhlouf, and Z. Goodman, “Nonalcoholic steatofibrosis independently predicts mortality in nonalcoholic fatty liver disease,” Hepatology Communications, vol. 1, no. 5, pp. 421–428, June 2017.
[5] N. Fujimori, T. Umemura, T. Kimura, N. Tanaka, A. Sugiura, T. Yamazaki, S. Joshita, M. Komatsu, Y. Usami, K. Sano, K. Igarashi, A. Matsumoto, and E. Tanaka, “Serum autotaxin levels are correlated with hepatic fibrosis and ballooning in patients with non-alcoholic fatty liver disease,” World Journal of Gastroenterology, vol. 24, no. 11, pp. 1239–1249, March 2018.
[6] E. Goceri, Z. K. Shah, R. Layman, X. Jiang, and M. N. Gurcan, “Quantification of liver fat: A comprehensive review,” Computers in Biology and Medicine, vol. 71, pp. 174–189, April 2016.
[7] E. Goceri and N. Goceri, “Deep learning in medical image analysis: Recent advances and future trends,” 11th International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing, Lisbon, Portugal, July 21–23, 2017.
[8] E. Goceri and A. Gooya, “On the importance of batch size for deep learning,” An Istanbul Meeting for World Mathematicians, Minisymposium on Approximation Theory & Minisymposium on Math Education, Istanbul, Turkey, July 3–6, 2018.
[9] E. Goceri, “Formulas behind deep learning success,” International Conference on Applied Analysis and Mathematical Modeling, Istanbul, Turkey, July 20–24, 2018.
[10] J. Patterson and A. Gibson, “Deep learning: A practitioner’s approach,” O’Reilly Media, July 2017.
[11] I. Goodfellow, Y. Bengio, and A. Courville, “Deep learning,” The MIT Press, November 2016.
[12] V. Andrearczyk and P. F. Whelan, “Deep learning in texture analysis and its application to tissue image classification,” Biomedical Image Analysis, pp. 95–129, January 2017.
[13] A. Geron, “Hands-on machine learning with scikit-learn and tensorflow: Concepts, tools, and techniques to build intelligent systems,” O’Reilly Media, April 2017.
[14] N. Buduma, “Fundamentals of deep learning: Designing next-generation machine intelligence algorithms,” O’Reilly Media, May 2017.
[15] S. Kevin Zhou, H. Greenspan, and D. Shen, “Deep learning for medical image analysis,” Academic Press, January 2017.
[16] M. V. Hernandez and V. Gonzalez-Castro, “Medical image understanding and analysis: 21st annual conference, MIUA 2017, Edinburgh, UK, July 11–13, 2017, proceedings,” Communications in Computer and Information Science, Springer, 2017.
[17] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” International Conference on Learning Representations, 2015.
[18] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson, “Understanding Neural Networks Through Deep Visualization,” Deep Learning Workshop, 31st International Conference on Machine Learning, June 2015.
