

Deep learning model integrating dilated convolution and deep supervision for brain tumor segmentation in multi-parametric MRI*

Tongxue Zhou¹,², Su Ruan¹, Haigen Hu¹,³, and Stéphane Canu²

¹ Université de Rouen Normandie, LITIS - QuantIF, Rouen 76183, France
² INSA de Rouen, LITIS - Apprentissage, Rouen 76800, France
  tongxue.zhou@insa-rouen.fr
³ Zhejiang University of Technology, College of Computer Science and Technology, Hangzhou 310023, China

Abstract. Automatic segmentation of brain tumors in magnetic resonance images (MRI) is necessary for diagnosis, monitoring and treatment. Manual segmentation is time-consuming, expensive and subjective. In this paper, we present a robust automatic segmentation algorithm based on the 3D U-Net. We propose a novel residual block with dilated convolution (res_dil block) and incorporate deep supervision to improve the segmentation results. We also compare the effect of different losses on the class imbalance problem. To prove the effectiveness of our method, we analyze each component of the proposed network architecture and demonstrate that the segmentation results can be improved by these components. Experimental results on the BraTS 2017 and BraTS 2018 datasets show that the proposed method achieves good performance on brain tumor segmentation.

Keywords: Deep learning · Brain tumor segmentation · Residual block · Deep supervision

1 Introduction
Brain tumors are among the most aggressive cancers in the world. Gliomas are the most common brain tumors, arising from glial cells. According to their degree of malignancy [1], gliomas can be categorized into two grades: low-grade gliomas (LGG) and high-grade gliomas (HGG). The former tend to be benign and grow more slowly, with lower degrees of cell infiltration and proliferation, while the latter are malignant, more aggressive and require immediate treatment. Magnetic resonance imaging (MRI) is a widely used imaging technique to assess these tumors, because it offers good soft-tissue contrast without radiation. The commonly used sequences are T1-weighted, contrast-enhanced T1-weighted (T1c), T2-weighted and Fluid Attenuation Inversion Recovery (FLAIR) images.
* Supported by the Normandie Regional Council via the MoNoMaD project (Grant number: 18P03397/18E01937).

Different sequences can provide complementary information to analyze different subregions of gliomas. For example, T2 and FLAIR highlight the tumor with peritumoral edema, designated the whole tumor. T1 and T1c highlight the tumor without peritumoral edema, designated the tumor core. An enhancing region of the tumor core with hyper-intensity can also be observed in T1c, designated the enhancing tumor core. Therefore, using multi-modal images can reduce the information uncertainty and improve clinical diagnosis and segmentation accuracy.
Over the years, there have been many studies on automatic brain tumor segmentation. Cui et al. [2] proposed a cascaded deep convolutional neural network consisting of two subnetworks: the first network localizes the tumor region in an MRI slice, and the second network labels the detected tumor region into multiple subregions. Zhao et al. [3] integrated fully convolutional neural networks (FCNNs) [4] and Conditional Random Fields (CRFs) to segment brain tumors. Havaei et al. [5] implemented a two-pathway architecture that learns the local details of the brain as well as the larger context features. Wang et al. [6] proposed to decompose the multi-class segmentation problem into a sequence of three binary segmentation problems according to the subregion hierarchy. Kamnitsas et al. [7] proposed an efficient fully connected multi-scale CNN architecture, named DeepMedic, that combines a high-resolution and a low-resolution pathway to obtain the segmentation results; furthermore, they used a 3D fully connected conditional random field to effectively remove false positives. The U-Net [8] is the most widely used structure in medical image analysis. This architecture has several advantages, including flexible input image sizes, consideration of spatial information and end-to-end prediction, leading to lower computational cost and higher representational power. Isensee et al. [9] adapted the U-Net to brain tumor segmentation and used data augmentation to prevent over-fitting. Kamnitsas et al. [10] introduced EMMA, an ensemble of multiple models and architectures including DeepMedic, FCNs and U-Net, and won first place in the BraTS 2017 competition.
However, there are still challenges in the brain tumor segmentation task. First, the brain anatomy varies from patient to patient, making the segmentation task more difficult. Second, the tumor contour is fuzzy due to low contrast. Furthermore, the sizes of tumor and background are highly imbalanced: the background is overwhelmingly dominant, resulting in extreme class imbalance for brain tumor segmentation.
Inspired by the U-Net, we propose a 3D brain tumor segmentation network for multi-parametric MRI to address these problems. The main contributions of our method are threefold: 1) A novel 3D MRI brain tumor segmentation network is proposed, based on a novel residual block with dilated convolution (res_dil block) that increases the receptive field to capture more semantic information. 2) Deep supervision is employed to integrate multi-level segmentation outputs to improve the final segmentation result. 3) The class imbalance problem is addressed by comparing different loss functions in order to find the best one for our architecture.

2 Method
2.1 Dataset and Pre-processing
The datasets used in the experiments come from the BraTS 2017 and 2018 training and validation sets. The training set includes 210 HGG patients and 75 LGG patients. The validation sets include 46 and 66 patients, respectively. Each patient has four image modalities: T1-weighted, contrast-enhanced T1-weighted (T1c), T2-weighted and Fluid Attenuation Inversion Recovery (FLAIR) images. All data used in the experiments have been pre-processed with a standard procedure. The N4ITK [11] method is first used to correct the bias field distortion of the MRI data, and intensity normalization is applied to each modality of each patient. To exploit the spatial contextual information of the image, we use the full 3D volume and crop and resize it from 155 × 240 × 240 to 128 × 128 × 128.
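As an illustration, a minimal sketch of this pipeline for one modality is given below, assuming SimpleITK for the N4 correction and scipy for the resampling; the Otsu head mask, the z-score normalization over non-zero voxels and the exact crop window are our assumptions, as the paper does not specify them.

```python
import numpy as np
import SimpleITK as sitk
from scipy.ndimage import zoom

def preprocess(path):
    # Read one modality and cast to float, as required by the N4 filter
    img = sitk.Cast(sitk.ReadImage(path), sitk.sitkFloat32)
    mask = sitk.OtsuThreshold(img, 0, 1, 200)        # rough foreground (head) mask
    img = sitk.N4BiasFieldCorrection(img, mask)      # N4ITK bias field correction [11]
    vol = sitk.GetArrayFromImage(img)                # numpy array, shape (155, 240, 240)
    brain = vol[vol > 0]                             # normalize over non-background voxels
    vol = (vol - brain.mean()) / (brain.std() + 1e-8)
    vol = vol[:, 40:200, 40:200]                     # hypothetical central crop window
    factors = [128.0 / s for s in vol.shape]
    return zoom(vol, factors, order=1)               # trilinear resize to 128^3
```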

2.2 Network architecture


Our network is inspired by the U-Net architecture [8]. The multi-modal images are directly integrated channel by channel in the original input space, which maximally preserves the original image information and lets the network learn an intrinsic image feature representation. In the encoder and decoder parts, res_dil blocks are proposed to increase the receptive field; each block consists of two dilated convolutions with dropout in between to alleviate over-fitting. In order to maintain the spatial information, we use a convolution with a stride of 2 in the encoder part to replace the pooling operation. Inspired by [9], deep supervision is employed in the decoder part to integrate segmentations from different levels to improve the final result. The network architecture is shown in Fig. 1.

Fig. 1. Network architecture.
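To make the deep-supervision scheme concrete, the following Keras sketch shows one plausible fusion, in the spirit of [9]: each decoder level predicts a segmentation map, and coarser maps are upsampled and summed with finer ones before the final softmax. The function name, the number of classes and the elementwise-sum fusion are our assumptions, not the authors' released code.

```python
from tensorflow.keras import layers

def deep_supervision(decoder_feats, n_classes=4):
    """Fuse segmentation maps predicted at several decoder levels.

    decoder_feats: feature maps ordered from coarsest to finest resolution,
    each level assumed to be twice the spatial size of the previous one.
    """
    seg = None
    for feat in decoder_feats:
        level_seg = layers.Conv3D(n_classes, 1, padding='same')(feat)  # per-level prediction
        if seg is None:
            seg = level_seg
        else:
            seg = layers.UpSampling3D(size=2)(seg)   # bring coarser map to current scale
            seg = layers.Add()([seg, level_seg])     # integrate multi-level outputs
    return layers.Activation('softmax')(seg)
```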

The proposed res_dil block uses dilated convolution, which introduces a spacing between the values in a kernel [12] and thereby increases the receptive field. For example, a 3 × 3 kernel with a dilation rate of 2 has the same field of view as a 5 × 5 kernel while still generating different features, as illustrated in Fig. 2 (in general, a k × k kernel with dilation rate r covers an effective field of k + (k − 1)(r − 1)). Different regions of an image are likely to require different receptive fields: large regions may need a large receptive field at the expense of fine details, while small regions may require high-resolution local information. Since the standard U-Net cannot capture enough semantic features due to its limited receptive field, inspired by dilated convolution, we use residual blocks with dilated convolutions (rate = 1, 2, 4) in both the encoder and decoder parts to obtain features at multiple scales, as shown in Fig. 3. The res_dil block can capture more extensive local information, helping to retain information and fill in details during the training process.

Fig. 2. Dilated convolution.

Fig. 3. Residual dilated block.
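Below is a minimal Keras sketch of one possible res_dil block, following the description above (two dilated convolutions with dropout in between, plus a residual connection); the dilation-rate pair, the dropout probability, the batch normalization and the 1 × 1 × 1 shortcut projection are our assumptions.

```python
from tensorflow.keras import layers

def res_dil_block(x, filters, rates=(2, 4), drop=0.3):
    """Residual block with two dilated 3x3x3 convolutions and dropout in between."""
    shortcut = layers.Conv3D(filters, 1, padding='same')(x)  # project to match channels
    y = layers.Conv3D(filters, 3, padding='same', dilation_rate=rates[0])(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.SpatialDropout3D(drop)(y)                     # dropout in the middle
    y = layers.Conv3D(filters, 3, padding='same', dilation_rate=rates[1])(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])                          # residual connection
    return layers.Activation('relu')(y)
```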

2.3 Solutions for Class imbalance

Due to the physiological characteristics of brain tumors, the segmentation task has an inherent class imbalance problem. Table 1 illustrates the distribution of the classes in the BraTS 2017 training data; the background is overwhelmingly dominant. The choice of the loss function is therefore crucial to deal with the imbalance problem. We consider three different loss functions, defined below. The Dice scores obtained with each loss function on the BraTS 2017 dataset are shown in Table 2.

Categorical cross-entropy loss evaluates the class prediction for each pixel individually and then averages over all pixels. Focal Tversky loss weights false positives and false negatives, which helps with highly imbalanced data. Dice loss measures the overlap between the prediction and the ground-truth annotation. The comparison in Table 2 suggests that the focal Tversky loss may be suited to binary classification problems with intra-class imbalance, but it is less helpful for inter-class imbalance.

Categorical cross-entropy requires special attention on a dataset with severe class imbalance, while the Dice loss deals well with the class imbalance problem on this dataset.

\[
L_{\text{categorical cross entropy}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} w_c\, g_{ic} \log p_{ic} \tag{1}
\]

\[
L_{\text{focal tversky}} = \sum_{c=1}^{C}\Biggl(1 - \frac{\sum_{i=1}^{N} p_{ic}\, g_{ic} + \epsilon}{\sum_{i=1}^{N} p_{ic}\, g_{ic} + \alpha \sum_{i=1}^{N} p_{i\bar{c}}\, g_{ic} + \beta \sum_{i=1}^{N} p_{ic}\, g_{i\bar{c}} + \epsilon}\Biggr)^{1/\gamma} \tag{2}
\]

\[
L_{\text{dice}} = 1 - 2\,\frac{\sum_{c=1}^{C}\sum_{i=1}^{N} p_{ic}\, g_{ic} + \epsilon}{\sum_{c=1}^{C}\sum_{i=1}^{N} \left(p_{ic} + g_{ic}\right) + \epsilon} \tag{3}
\]

where $N$ is the number of pixels, $C$ is the number of classes, $w_c$ is the weight assigned to class $c$, $p_{ic}$ is the predicted probability that pixel $i$ belongs to tumor class $c$, and $p_{i\bar{c}}$ is the probability that pixel $i$ belongs to the corresponding non-tumor class. The same holds for $g_{ic}$ and $g_{i\bar{c}}$, and $\epsilon$ is a small constant that avoids division by zero.
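For reference, Eq. (3) translates almost directly into a Keras loss; the sketch below assumes one-hot encoded ground truth and softmax predictions, and is our illustration rather than the authors' implementation.

```python
import tensorflow.keras.backend as K

def dice_loss(y_true, y_pred, eps=1e-5):
    """Soft multi-class Dice loss, Eq. (3): sums over all classes and voxels."""
    intersection = K.sum(y_true * y_pred)        # numerator overlap term
    total = K.sum(y_true) + K.sum(y_pred)        # denominator volume term
    return 1.0 - (2.0 * intersection + eps) / (total + eps)
```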

Table 1. The distribution of classes in the BraTS 2017 training set. NET: Non-Enhancing Tumor, NCR: Necrotic.

Region          Background   NET/NCR   Edema   Enhancing tumor
Percentage (%)  99.12        0.28      0.40    0.20

Table 2. Comparison of different loss functions on the BraTS 2017 training set (Dice scores, %).

Loss                        Whole tumor   Tumor core   Enhancing tumor
Dice                        88.5          84.5         73.4
Focal Tversky               87.9          81.1         70.0
Categorical Cross Entropy   42.1          51.4         44.3

3 Experiments and Results


3.1 Implementation details
Our network is implemented in Keras and trained on a single Nvidia Quadro P5000 GPU (16 GB). We trained the network for 200 epochs using both the HGG and LGG data simultaneously. We randomly sampled patches of size 128 × 128 × 128 voxels with a batch size of 1. The models are optimized using the Adam optimizer (initial learning rate 5e-4), with the learning rate decreased by a factor of 0.5 with a patience of 10 epochs, for 50 epochs.
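Read as a ReduceLROnPlateau-style schedule, this setup could be reproduced roughly as follows; `model`, `train_gen` and `val_gen` are hypothetical placeholders, `dice_loss` is the sketch from Sect. 2.3, and the schedule interpretation is our assumption. With 128 × 128 × 128 patches, a batch size of 1 is consistent with a 16 GB card.

```python
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ReduceLROnPlateau

def compile_and_train(model, train_gen, val_gen):
    """Train with Adam (initial lr 5e-4) and halve the lr on plateaus."""
    model.compile(optimizer=Adam(learning_rate=5e-4), loss=dice_loss)
    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                                  patience=10, verbose=1)
    return model.fit(train_gen, validation_data=val_gen,
                     epochs=200, callbacks=[reduce_lr])
```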

3.2 Experimental Results


Following the challenge, four intra-tumor structures are grouped into three mutually inclusive tumor regions: (a) the whole tumor (WT), consisting of all tumor tissues; (b) the tumor core (TC), consisting of the enhancing tumor, necrotic and non-enhancing tumor core; and (c) the enhancing tumor (ET). The results are evaluated with the online evaluation platforms, using Dice score, sensitivity, specificity and Hausdorff distance.
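This grouping can be expressed in a few lines of numpy, assuming the usual BraTS label convention (1 = necrotic/non-enhancing tumor core, 2 = edema, 4 = enhancing tumor); the function below is an illustrative sketch.

```python
import numpy as np

def to_regions(label_map):
    """Convert a BraTS label volume into the three nested evaluation regions."""
    wt = np.isin(label_map, (1, 2, 4))   # whole tumor: all tumor tissues
    tc = np.isin(label_map, (1, 4))      # tumor core: necrotic/NET + enhancing
    et = (label_map == 4)                # enhancing tumor
    return wt, tc, et
```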

Quantitative Analysis. We randomly split off 20% (57 patients) of the BraTS 2017 training set (285 patients) as a local validation set. Table 3 shows the contribution of each component of the network on this local validation set. We refer to our basic U-Net without res_dil blocks and deep supervision as base. We observe an improvement in Dice score, sensitivity and Hausdorff distance across all tumor regions as the proposed strategies are added gradually. More precisely, we achieve Dice scores of 88.5, 84.5 and 73.4 for the whole, core and enhancing tumor, respectively. Table 4 shows the results on the BraTS 2017 and BraTS 2018 validation sets. To further verify the effectiveness of the proposed method, we compare its performance with the original U-Net and a state-of-the-art U-Net-like network [9] on the BraTS 2017 dataset in Table 5. We obtain the best sensitivity and Hausdorff distance on the whole tumor. For the tumor core, we achieve the best performance on all evaluation metrics. For the enhancing tumor, we obtain the best Dice score and specificity. In general, the proposed method achieves better segmentation results in terms of Dice score than the others.

Table 3. Results of different strategies on the BraTS 2017 training set.

Strategies           |      Dice       |   Sensitivity   |   Specificity   |    Hausdorff
                     |  WT   TC   ET   |  WT   TC   ET   |  WT   TC   ET   |  WT   TC   ET
base                 | 86.6 76.8 64.1  | 85.4 75.1 67.4  | 99.5 99.7 99.8  | 8.39 8.84 8.14
base+super           | 87.9 81.2 66.67 | 87.4 80.4 72.45 | 99.4 99.7 99.8  | 7.54 7.59 7.67
base+res_dil         | 88.2 81.9 69.5  | 87.8 81.8 74.0  | 99.4 99.3 99.7  | 7.52 6.90 6.26
base+super+res_dil   | 88.5 84.5 73.4  | 91.7 83.1 74.3  | 99.1 99.7 99.7  | 5.81 6.47 6.81

Table 4. Results on BraTS 2017 and BraTS 2018. Val: validation set.

Dataset          |      Dice      |  Sensitivity   |  Specificity   |   Hausdorff
                 |  WT   TC   ET  |  WT   TC   ET  |  WT   TC   ET  |  WT   TC   ET
BraTS 2017 Val   | 86.5 75.4 65.9 | 86.2 75.9 68.0 | 99.5 99.7 99.8 | 9.84 6.35 9.70
BraTS 2018 Val   | 87.3 77.9 70.5 | 88.5 79.1 72.6 | 99.4 99.7 99.8 | 8.06 8.81 5.60

Table 5. Quantitative comparison of different methods on the BraTS 2017 dataset.

Methods             |      Dice      |  Sensitivity   |  Specificity   |   Hausdorff
                    |  WT   TC   ET  |  WT   TC   ET  |  WT   TC   ET  |  WT   TC   ET
U-Net               | 79.1 49.9  8.0 | 84.6 45.9  7.3 | 97.6 99.4 99.7 | 18.8 21.1 38.0
Isensee et al. [9]  | 89.5 82.8 70.7 | 89.0 83.1 80.0 | 99.5 99.7 99.8 | 6.04 6.95 6.24
Ours                | 88.5 84.6 73.4 | 91.7 83.2 74.3 | 99.1 99.7 99.8 | 5.81 6.47 6.81

Qualitative Analysis. We randomly select one sample from the BraTS 2017 dataset and visualize the segmentation results in Fig. 4. From the results, we can see that the original U-Net cannot segment the enhancing tumor at all and predicts many false regions for the tumor core and edema. The method in [9] predicts many false positives, especially in the necrotic and edema regions; many false edema regions can be seen in the coronal view. In contrast, our method is capable of segmenting the large tumor regions (necrotic and edema) as well as the difficult region (enhancing tumor). The results show that the proposed method achieves almost the same result as the real annotation. To quantify this, the corresponding evaluation results are shown in Table 6. In accordance with the qualitative result, the sample obtains a high Dice score on all three brain tumor regions.

Fig. 4. Qualitative comparison of different methods on patient Brats 2017 TCIA 201 1.
Edema is shown in green, enhancing tumor in red and necrotic in blue.

Table 6. Quantitative results for the segmentation sample.

Sample                 |      Dice      |  Sensitivity   |  Specificity   |   Hausdorff
                       |  WT   TC   ET  |  WT   TC   ET  |  WT   TC   ET  |  WT   TC   ET
Brats 2017 TCIA 201 1  | 97.1 92.9 85.2 | 97.5 90.1 84.6 | 99.6 99.8 99.6 | 1.41 3    2.24

4 Conclusion

In this paper, we propose a 3D U-Net-based brain tumor segmentation network for multi-parametric MRI, in which a novel residual block with dilated convolutions and deep supervision are introduced to improve the segmentation results. We compared the results achieved by our method with those of other methods. The experimental results on the BraTS 2017 and 2018 datasets clearly verify the effectiveness of our method for brain tumor segmentation. In this work, the multi-parametric MRI images are simply concatenated at the input of the architecture. In the future, we will focus on how to fuse them more effectively to obtain better segmentations.

References

1. Bjoern H. Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, et al. The multimodal brain tumor image segmentation benchmark (BraTS). IEEE Transactions on Medical Imaging, 34(10):1993–2024, 2014.
2. Shaoguo Cui, Lei Mao, Jingfeng Jiang, Chang Liu, and Shuyu Xiong. Automatic semantic segmentation of brain gliomas from MRI images using a deep cascaded neural network. Journal of Healthcare Engineering, 2018, 2018.
3. Xiaomei Zhao, Yihong Wu, Guidong Song, Zhenye Li, Yazhuo Zhang, and Yong Fan. A deep learning model integrating FCNNs and CRFs for brain tumor segmentation. Medical Image Analysis, 43:98–111, 2018.
4. Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431–3440, 2015.
5. Mohammad Havaei, Axel Davy, David Warde-Farley, Antoine Biard, Aaron Courville, Yoshua Bengio, Chris Pal, Pierre-Marc Jodoin, and Hugo Larochelle. Brain tumor segmentation with deep neural networks. Medical Image Analysis, 35:18–31, 2017.
6. Guotai Wang, Wenqi Li, Sébastien Ourselin, and Tom Vercauteren. Automatic brain tumor segmentation using cascaded anisotropic convolutional neural networks. In International MICCAI Brainlesion Workshop, pages 178–190. Springer, 2017.
7. Konstantinos Kamnitsas, Christian Ledig, Virginia F.J. Newcombe, Joanna P. Simpson, Andrew D. Kane, David K. Menon, Daniel Rueckert, and Ben Glocker. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis, 36:61–78, 2017.
8. Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
9. Fabian Isensee, Philipp Kickingereder, Wolfgang Wick, Martin Bendszus, and Klaus H. Maier-Hein. Brain tumor segmentation and radiomics survival prediction: Contribution to the BraTS 2017 challenge. In International MICCAI Brainlesion Workshop, pages 287–297. Springer, 2017.
10. Konstantinos Kamnitsas, Wenjia Bai, Enzo Ferrante, Steven McDonagh, Matthew Sinclair, Nick Pawlowski, Martin Rajchl, Matthew Lee, Bernhard Kainz, Daniel Rueckert, et al. Ensembles of multiple models and architectures for robust brain tumour segmentation. In International MICCAI Brainlesion Workshop, pages 450–462. Springer, 2017.
11. Brian B. Avants, Nick Tustison, and Gang Song. Advanced normalization tools (ANTs). Insight Journal, 2:1–35, 2009.
12. Fisher Yu and Vladlen Koltun. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122, 2015.
