
Journal Pre-proof

A Novel Enhanced Softmax Loss Function for Brain Tumour Detection using Deep learning

Sunil Maharjan, Abeer Alsadoon, P.W.C. Prasad, Mustafa Salam, Omar Hisham Alsadoon

PII: S0165-0270(19)30377-2
DOI: https://doi.org/10.1016/j.jneumeth.2019.108520
Reference: NSM 108520

To appear in: Journal of Neuroscience Methods

Received Date: 26 August 2019
Revised Date: 22 October 2019
Accepted Date: 11 November 2019

Please cite this article as: Maharjan S, Alsadoon A, Prasad PWC, Salam M, Alsadoon OH, A
Novel Enhanced Softmax Loss Function for Brain Tumour Detection using Deep learning,
Journal of Neuroscience Methods (2019),
doi: https://doi.org/10.1016/j.jneumeth.2019.108520

This is a PDF file of an article that has undergone enhancements after acceptance, such as
the addition of a cover page and metadata, and formatting for readability, but it is not yet the
definitive version of record. This version will undergo additional copyediting, typesetting and
review before it is published in its final form, but we are providing this version to give early
visibility of the article. Please note that, during the production process, errors may be
discovered which could affect the content, and all legal disclaimers that apply to the journal
pertain.

© 2019 Published by Elsevier.


A Novel Enhanced Softmax Loss Function for Brain Tumour
Detection using Deep Learning

Mr Sunil Maharjan¹, Dr Abeer Alsadoon¹*, Dr P.W.C. Prasad¹, Dr Mustafa Salam², Mr Omar Hisham Alsadoon³

¹ Charles Sturt University, Sydney, Australia
² Computer Engineering Techniques, Imam Ja'afar Al-Sadiq University, Iraq
³ Department of Islamic Sciences, Al Iraqia University, Baghdad, Iraq

Abeer Alsadoon¹*
¹ School of Computing and Mathematics, Charles Sturt University, Sydney, Australia
* Corresponding author. Dr. Abeer Alsadoon, Charles Sturt University, Sydney Campus, Sydney, Australia,
Email: aalsadoon@studygroup.com, Phone +61 2 9291 9387

Sunil Maharjan: MIT (Software Development and Design)
Abeer Alsadoon: Post-Doctorate (IT), PhD (Soft. Eng.), MSc (Comp. Sc.), BSc (Comp. Sc.)
P.W.C. Prasad: PhD (Comp. Eng.), MSc (Comp. Eng.), BSc (Comp. Sc.)
Mustafa Salam: BSc (Soft. Eng.), MSc (IT), PhD (Comp. Sc.)
Omar Hisham Alsadoon: BSc (Soft. Eng.), MSc (IT), PhD (Comp. Sc.)

Graphical abstract

[Graphical abstract: block diagram of the proposed system. Labels recovered from the diagram: Read MRI image;
Pre-processing with a median filter (reduces noise and preserves the edges simultaneously); Classification (it
extracts the features directly from the image and has a short training period, which decreases the performance
time; softmax supports both binary and multiclass classification; the modified loss reduces the risk of
overfitting); classification of different types of benign and malignant brain tumour. Abeer Alsadoon (2019)]

Highlights

The proposed solution:

• Proposed a modified loss function with regularization to enhance the accuracy by reducing the overfitting
  of data.
• Use multiclass classification of the type of brain tumour. The cross-entropy loss function is combined with
  the softmax function. With the help of the equations of softmax and cross-entropy, it classifies into multiple
  classes and reduces the risk of overfitting in the training samples.
• Regularization will be implemented in the loss function to enhance the accuracy of the neural network.

Abstract
Background and Aim: In deep learning, the sigmoid function cannot be used successfully for the multiclass
classification of brain tumours because it is limited to binary classification. This study aims to increase
classification accuracy by reducing the risk of overfitting and by supporting multiclass classification. The
proposed system consists of a convolutional neural network with a modified softmax loss function and
regularization.
Results: Classification accuracy for the different types of tumours and the processing time were calculated based
on the probability score of the labeled data and their execution time. Different accuracy values and processing
times were obtained when testing the proposed system on different samples of MRI images. The results show
that the proposed solution performs better than the other systems. It has higher accuracy by almost 2% and a
lower processing time of 40~50 milliseconds compared to other current solutions.
Conclusion: The proposed system focuses on the classification accuracy of the different types of tumours from
3D MRI images. This paper addresses the issues of binary classification, processing time, and overfitting of
the data.

Keywords: Multiclass classification; Brain tumour detection; Neural network; Deep learning; Loss
function; Softmax function.

1. Introduction
Computed tomography (CT) scans and X-ray scans are used as a guide for brain tumour detection [1, 2]. They
provide the anatomical detail of the brain tissues and the whole brain structure. During the extraction stage, CT
scan images must be flooded using watershed segmentation and morphological operations to separate
the brain tissues from the tumour. However, this approach fails to separate the tumour region accurately.
Therefore, deep learning and segmentation of magnetic resonance imaging (MRI) images need to be considered to
solve the detection issue and improve on the current solutions.

Deep learning provides significant advantages in the medical field, especially for diagnosing and treating
brain tumours. It plays an essential role in medical image processing, computer-aided diagnosis, and image
segmentation [3, 4]. For example, deep learning can be used for anatomical localization [5, 6] and for liver,
heart, and great vessel segmentation. In this study, a convolutional neural network (CNN) is used for detecting
and classifying the various types of brain tumours. A CNN consists of a number of hidden layers with
abstraction for extracting features from the input images. Many techniques have been used to address the brain
tumour classification problem. Some researchers have suggested increasing the size of the dataset or the number
of layers, or using regularization and activation functions, to classify the tumour images. However, these
solutions remain limited in terms of accuracy, processing time, and weight variables. Therefore, a CNN model
with high accuracy, low processing time, and low weight matrices is proposed.

Current studies on brain tumour detection and classification using deep learning apply different techniques
and algorithms before classifying the images. They follow different algorithms and procedures for image
processing, feature extraction, feature selection, and classification. Most of these studies tried to improve the
accuracy of the systems during the learning and testing process. The state-of-the-art experimental result gives a
classification accuracy of 97.18% [7, 8]. Besides that, the accuracy of deep learning is higher than that of the
support vector machine [9, 10]. On the other hand, the CNN is limited by dataset size, filter size, number of
hidden layers, and activation function.

The purpose of this study is to improve the classification accuracy for different types of brain tumours by
combining a modified loss function with the softmax function. A CNN can overfit when the dataset is small,
which lowers the accuracy and increases the processing time. Without regularization, the neural network can
overfit or underfit the dataset. The proposed solution uses a CNN in which the modified loss function with
regularization is combined with the softmax function for multiclass classification. Thus, it improves accuracy
and decreases the processing time.
The rest of the paper is organized as follows. Section 2 presents a literature review. Section 3 describes the
proposed system. Section 4 presents the results. Section 5 presents the discussion. Section 6 presents the
conclusion and provides recommendations for future work.

2. Literature Review
Abd-Ellah et al. [1] proposed an approach to improve brain tumour detection and localization. They
preprocessed the images to improve quality and modified the existing AlexNet neural network to reduce
complexity and train the model faster. The solution provides a detection accuracy of 99.55% using CNN, 92.95%
using visual geometry group (VGG-16), and 95.15% using VGG-19. Moreover, a localization dice score of 0.87 was
achieved. The accuracy of 99.55% is very high compared to 66.96% in the previous works. They used 349 MRI
images for the study, extracted from the standard reference image database to evaluate response (RIDER).
Bahadure et al. [9] proposed a brain tumour detection technique based on Berkeley wavelet transformation (BWT)
for segmentation and a support vector machine (SVM) for classification. To improve the clarity of the images,
adaptive contrast enhancement based on a sigmoid function was applied. They also used skull stripping based on
a threshold operation to enhance the skull stripping performance. The technique achieved 96.51% accuracy, 94.2%
specificity, and 97.72% sensitivity, with an average dice similarity index coefficient of 0.82. The achieved
accuracy can be enhanced by fusing other images and applying post-processing operations for better tumour
classification.

Ari and Hanbay [7] studied the classification of MRI images and the detection of brain tumours using the
extreme learning machine local receptive fields (ELM-LRF) method. The proposed method consists of three stages:
pre-processing, classification, and tumour region extraction. In the pre-processing stage, possible noise is
removed using nonlocal means and local smoothing methods. In the second stage, ELM-LRF is used to classify the
cranial MR images as benign or malignant. In the third stage, the tumour segmentation process is completed. The
proposed ELM-LRF method achieved a classification accuracy of 97.18%. Damodharan and Raghavan [11] introduced a
brain tumour detection technique based on the neural network (NN). The major stages of the technique are brain
image pre-processing, pathological tissue segmentation, and tissue segmentation. The tissue segmentation covers
white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). They used the NN for feature extraction and
classification of the tumour. The accuracy of the proposed technique was 82% using NN and 67% using K-nearest
neighbors (K-NN) and Bayesian classification techniques. However, the NN can be implemented for multimodal
imaging along with post-processing to achieve a better result.

Laukamp et al. [12] studied the automatic detection and segmentation of meningiomas based on a deep learning
model (DLM). 249 glioma cases were used for training the DLM. Diverse MRI images were given as input to
the solution. It achieved high accuracy in detection and segmentation using the brain tumour image
segmentation (BRATS) benchmark. This solution provides 98% accuracy. Furthermore, the solution correlated
strongly with manual segmentations; the dice coefficient was 0.81 for total tumour volume and 0.78 for
contrast-enhancing tumour volume. Although the solution achieved high accuracy, multimodal imaging was
missing and needs to be considered while training the model. Amin et al. [13] proposed a deep neural network
(DNN) based architecture for the detection of brain tumours. In the model, 7 layers were used for classification,
consisting of 3 convolutional layers, 3 rectified linear unit (ReLU) layers, and a softmax layer. Eight different
datasets were used for this study, including BRATS (image dataset and synthetic dataset), ISLES (ischemic stroke
lesion segmentation), and five different MRI modalities such as fluid attenuation inversion recovery (FLAIR) and
diffusion-weighted image (DWI). This model achieved a 99.8% dice similarity coefficient (DSC) on FLAIR and
100% DSC on DWI. The average processing time of the proposed model was 5.502 seconds. However, a feature
technique should be considered to reduce the processing time for different MRI modalities.

Charron et al. [14] adapted the 3D CNN (DeepMedic) to detect and segment brain metastases on MRI
images. The network parameters were adapted to brain metastases. Network performance was evaluated in
terms of detection and segmentation by exploring the single or combined use of different MRI modalities. This
solution improves the sensitivity and dice similarity coefficient and decreases the false positives. The
sensitivity increased from 82% to 98%, and the dice similarity coefficient increased from 0.66 to 0.79. Deepa and
Emmanuel [15] proposed a model to classify tumours with high accuracy. First, the intensity variation of the
images was reduced using an average filter. Second, Gabor wavelet feature extraction extracted the locality,
orientation, and frequency of the tumour image, which provide texture information for classification. Kernel
principal component analysis (KPCA) feature selection was used to select a small subset of features to reduce
the redundancy and increase the relevancy of the features. The Gaussian radial basis function (GRBF) was used
for feature fusion to distinguish information from the multiple sets of features. The fused features were passed
to an adaptive firefly backpropagation neural network to overcome the drawbacks of existing solutions. It
provides 99.84% accuracy, 97.24% sensitivity, and 99.85% specificity using the BRATS dataset. The solution
achieved high accuracy due to efficient feature extraction, selection, and fusion techniques but is still limited
by the number of datasets.

Jun et al. [16] proposed a model to detect brain metastases from contrast-enhanced 3D gradient-echo (CE
3D GRE) images by suppressing the blood vessels. Patients were randomly selected for training (29 sets) and
testing (36 sets). Two neuroradiologists independently evaluated deep-learned and original Black Blood (BB)
images, assessing the degree of blood vessel suppression and lesion conspicuity. In the per-patient analysis,
sensitivities were 100% for both deep-learned and original BB imaging. In the per-lesion analysis, the overall
sensitivity was 90.3% for the deep-learned 3D BB imaging and 100% for the original BB imaging. In the subgroup
analysis for lesions ≥2 mm, deep-learned BB imaging achieved 98% sensitivity. However, because all the
training images came from the same scanner and used the same acquisition protocols, the overall
sensitivities were reduced. Ural [17] applied different image processing techniques to enhance the image
quality and smooth the MRI images. He also mixed different segmentation processes to boost the
performance of the solution. Furthermore, the probabilistic neural network (PNN) method was used to detect
and localize the tumour areas in the brain. The proposed solution offers an acceptable performance rate and less
computational time. 25 neuroimages were used to optimize the system, and 25 out-of-sample neuroimages were
used to test the approach. This solution provides an 85-90% performance rate and a computational time of
80.3087%, which is less than that of the conventional solution. However, the accuracy of the system can be
improved by using different modalities of MRI images.

Arunachalam and Savarimuthu [18] developed a model to classify normal and abnormal brain tumours in
MRI images. They fused two brain MRI images using a shift-invariant shearlet transform (SIST). The
features were extracted from the approximate sub-band of the NSCT-transformed image using the gray-level
co-occurrence matrix (GLCM), Gabor, and discrete wavelet transform (DWT). The proposed system achieved
99.9% specificity, 89.7% sensitivity, and 99.8% accuracy on the BRATS dataset, slightly higher than the state of
the art. Preethi and Aishwarya [19] proposed a model to classify and segment the tumour based on multiple
stages. They combined the gray-level co-occurrence matrix and the wavelet-based gray-level co-occurrence matrix
to produce the feature matrix. The oppositional flower pollination algorithm (OFPA) was used to select relevant
features. After that, a DNN was used to categorize the brain image based on the selected features. Besides that,
the possibilistic fuzzy c-means clustering (PFCM) algorithm and the projected scheme were used to extract the
tumour region from the tumour images. The developed model provides 92% accuracy, 86% sensitivity, and 91%
specificity.

El-Dahshan et al. [20] developed a hybrid intelligent technique for automatic detection of brain tumours. They
applied the feedback pulse-coupled NN for image segmentation, and the region of interest was detected by
properly setting the various parameters of the network. 101 images were used in the study, consisting of 14
normal and 87 abnormal (malignant and benign tumours) images from a real human brain MRI dataset. The proposed
method achieved 99% accuracy, 100% sensitivity, and 92.8% specificity. Kavitha and Chellamuthu
[21] added an orientation constraint to the intensity constraint in the modified region growing
segmentation technique. This additional constraint gives a better result than normal region
growing. Comparative analyses of normal and modified region growing were made using both the feed-forward
neural network (FFNN) and the radial basis function (RBF) neural network. The proposed solution
provides 100% specificity, 80% sensitivity, and 90% accuracy on the training dataset using the neural network,
and 80% specificity, 80% sensitivity, and 80% accuracy on the testing dataset. The accuracy in detecting and
classifying the tumour can be improved by using a larger dataset and implementing a new algorithm for feature
extraction and selection. Kochar [22] studied three different approaches for brain tumour classification. The
solution combined the self-organizing map (SOM) and sequential minimal optimization (SMO) with k-means
clustering for higher accuracy. The SMO provides the highest accuracy, specificity, and sensitivity, at 89%, 90%,
and 92% respectively. While the accuracy was acceptable, the number of datasets used to train the
self-organizing map with k-means clustering needs to be increased for more accurate results.
2.1 State of the Art
Fig. 1 represents the features and techniques of the current solution for brain tumour detection [7]. The
strengths of the current solution are highlighted inside the blue border, and its limitations are highlighted
inside the red border. Ari and Hanbay [7] classify the cranial brain image using extreme learning machine local
receptive fields (ELM-LRF) based tumour classification and detect the desired tumour using the watershed
segmentation method. They proposed an extreme learning machine with sigmoidal, Gaussian, and hard-limited
activation functions in the ELM hidden layer for the classification and detection of the brain tumour.
Furthermore, a convolution layer and a pooling layer followed by an extreme learning machine layer are used in
the neural network architecture to enhance the tumour classification accuracy. Watershed segmentation, followed
by morphological operations, was used for detection. This solution provides 97.18% classification accuracy,
96.80% sensitivity, and 97.12% specificity. The solution consists of three main stages: pre-processing,
classification, and extraction.

[Figure 1 block diagram. Labels recovered from the diagram: Read MRI image; Increase the processing time;
Classification (convolution layer, sigmoidal function); it extracts the features directly from the image and has
a short training period, which decreases the performance time; sigmoidal supports only binary classification.]

Figure 1: Block Diagram of the State-of-the-Art System [7]
[The blue borders show the good features of the solution, and the red border refers to its limitations]

Pre-processing stage: MRI images are used as the input for the system to obtain information about the
brain. The MRI images are preprocessed using a non-local mean filter and a local smoothing filter.
These filters remove the unwanted noise that may occur during image compression or image data
transfer. This stage improves the quality and contrast of the image. However, some vital structures and
details in the images are removed during this pre-processing stage, which reduces the classification
accuracy of the tumour.
Classification stage: In this stage, ELM-LRF is used to extract the features from the image directly and
classify the brain tumour as benign or malignant. ELM-LRF has two parts in its structure. The first part
has a convolution layer and a pooling layer. ELM-LRF uses a sigmoid function for classification in the
convolution layer and square/square-root techniques for the pooling. In the second part, a matrix is
acquired by combining the extracted features from the image based on analytical calculation. The weight
vector between the ELM's hidden layer and the output layer is also calculated. This method updates the
weight and bias values of the network.
Extraction stage: In this stage, the classified image is segmented using watershed segmentation,
which is built on the concept of watershed lines. Watershed segmentation is connected to
topography, water source basins, and morphological operations, and it helps in determining the tumour.
In the watershed, the gradient information is gathered by calculating the first derivative of the change
between pixel values. It defines the gray value and the tumour border. Furthermore, it produces a fast
result in the form of closed curves. Different pointer pixels are gathered initially by the morphological
operations.
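
The gradient step in the extraction stage can be illustrated with a minimal sketch, assuming a 2D grayscale slice stored as a NumPy array. The function name `gradient_magnitude` and the toy slice are hypothetical and are not taken from [7]; the sketch only shows how first derivatives of pixel values give an edge map of the kind a watershed transform is typically run on.

```python
import numpy as np

def gradient_magnitude(image: np.ndarray) -> np.ndarray:
    """Return the per-pixel gradient magnitude of a 2D grayscale image."""
    # np.gradient returns the first derivatives along rows and columns
    # (central differences inside the image, one-sided at the borders).
    d_row, d_col = np.gradient(image.astype(float))
    return np.hypot(d_row, d_col)

# Toy slice (illustrative only): dark region on the left, bright on the right.
toy_slice = np.zeros((4, 6))
toy_slice[:, 3:] = 10.0
edge_map = gradient_magnitude(toy_slice)
```

In this toy case the edge map is non-zero only in the two columns next to the intensity step, which is where a watershed line would be expected to form.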
The state-of-the-art solution has achieved higher classification accuracy, sensitivity, and specificity
compared to other solutions. However, there are some limitations in the classification stage that need
to be addressed for better results. The state-of-the-art solution uses two different filters in the
pre-processing stage, i.e., the non-local mean filter and the local smoothing filter, rather than a single
filter to remove the noise. These filters help to increase the classification accuracy, but they remove some
important structures and information that are useful for classification accuracy and for reducing the
processing time. Therefore, using a single median filter for image pre-processing needs to be considered to
increase accuracy and reduce the processing time.
Furthermore, Ari and Hanbay [7] used a sigmoid function for the binary classification of the tumour. The
sigmoid function is used in the ELM-LRF algorithm to map the predicted value to probabilities
between 0 and 1, as shown in equation 1. It only separates benign from malignant brain tumours.
However, the sigmoid function limits the performance of the neural network. The image can
have a non-linear relationship with the dataset, and a non-linear activation function is necessary to
establish this relationship. There can be different types of benign and malignant tumours, which
can be classified using multiclass classification. Therefore, the accuracy can be improved by using
the softmax function.
$$E = \sum_{i=1}^{N} \beta_i \, g(w_i x_j + b_i), \qquad j = 1, \dots, N \tag{1}$$

Where,
$w_i = [w_{i1}, w_{i2}, \dots, w_{in}]^T$ is the weight vector between the i-th hidden layer cell and the input cells,
$\beta_i = [\beta_{i1}, \beta_{i2}, \dots, \beta_{im}]^T$ is the weight vector between the i-th hidden layer cell and the output cells,
$b_i$ is the threshold value of the i-th hidden cell,
$g(x)$ is the activation function,
$w_i x_j$ is the inner product of $w_i$ and $x_j$,
$N$ is the number of cells in the hidden layer.

The sigmoid function is expressed as: $g(x) = \frac{1}{1 + e^{-x}}$
Where,
g = output class
x = set of input values
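
Equation 1 can be made concrete with a small numerical sketch. The layer sizes, the random seed, and the names `sigmoid` and `elm_output` are illustrative assumptions and do not come from [7]; the code simply evaluates the weighted sum of sigmoid activations for one input vector.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def elm_output(x_j, W, b, beta):
    """Evaluate equation 1 for one input x_j.

    W has one row w_i per hidden cell, b holds the thresholds b_i, and
    beta has one row beta_i per hidden cell (one column per output cell).
    """
    hidden = sigmoid(W @ x_j + b)  # g(w_i x_j + b_i) for every hidden cell i
    return hidden @ beta           # sum over i of beta_i * g(w_i x_j + b_i)

# Hypothetical sizes and random values, for illustration only.
rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 4, 3, 2
W = rng.normal(size=(n_hidden, n_inputs))
b = rng.normal(size=n_hidden)
beta = rng.normal(size=(n_hidden, n_outputs))
x_j = rng.normal(size=n_inputs)
E = elm_output(x_j, W, b, beta)
```

The matrix form is equivalent to looping over the hidden cells and accumulating each term of the sum, which is how equation 1 is written.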


Table 1: Extreme Learning Machine Local Receptive Fields
Algorithm: Extreme Learning Machine Local Receptive Fields
1. Begin
2. Randomly assign the weights and bias to the neurons of the Extreme Learning Machine
3. Compute the hidden layer output matrix using the Moore-Penrose inverse of a matrix
4. Compute the output weight by multiplying the hidden layer output matrix and the transpose of the target sample
5. Update the weight and bias value of the hidden layer
6. Stop
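
The steps in Table 1 can be sketched as follows. This is a minimal illustration that assumes the commonly described ELM formulation, in which the output weights are obtained by multiplying the Moore-Penrose pseudo-inverse of the hidden layer output matrix by the target matrix; that reading of steps 3 and 4 is an assumption. The convolution and pooling (local receptive field) part is omitted, and the function names, the toy data, and the hidden layer size are hypothetical.

```python
import numpy as np

def hidden_output(X, W, b):
    """Sigmoid hidden layer output matrix H, one row per sample."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, T, n_hidden, seed=0):
    """Train a basic ELM without any iterative weight updates."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # step 2: random input weights
    b = rng.normal(size=n_hidden)                # step 2: random biases
    H = hidden_output(X, W, b)                   # hidden layer output matrix
    beta = np.linalg.pinv(H) @ T                 # steps 3-4 (assumed reading)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return hidden_output(X, W, b) @ beta

# Toy two-class problem with one-hot targets (illustrative data only).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
W, b, beta = train_elm(X, T, n_hidden=10)
predictions = predict_elm(X, W, b, beta)
```

Because the output weights come from a single least-squares solve rather than from repeated gradient steps, training is a one-shot computation, which is consistent with the short training period attributed to ELM-LRF in the text.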

Figure 2: Flowchart of Extreme Learning Machine

3. Proposed System
Many classification and detection techniques using deep learning were comprehensively reviewed and analyzed
for this study. The main issues considered were the activation function, loss function, weight, and bias of
the neuron. These are important factors of the neural network, and their values affect the performance of the
whole network.
Among the classification and detection techniques that were examined, the model of Ari and Hanbay [7] was
selected as the basis for the proposed solution. The main reason for choosing this model is its use of the
ELM-LRF algorithm for classification. This algorithm trains the neural network faster than the sigmoid
function. The algorithm does not require any iteration for updating the weight and bias factors. Thus, the
training period of the network is short, which reduces the processing time of classification.
However, as discussed above in the state-of-the-art section, the current solution uses a sigmoid function in the
neural network for binary classification instead of multiclass classification, which limits the model
performance. To overcome this limitation of the current solution, we have adopted a
solution proposed by Abd-Ellah et al. [1]. They used a softmax function for multiclass classification instead
of the sigmoid function for detecting and localizing the tumour type. The proposed system consists of three
stages (see Fig. 3): pre-processing, classification, and extraction.
Pre-processing stage: This is the first stage of the detection system, in which the noise is eliminated and
the edges of the tumour in the 3D MRI images are preserved. A single 3 × 3 median filter is used in this stage
instead of the non-local mean filter and local smoothing filter of the state-of-the-art solution. The median
filter improves the image quality and contrast of the image. Furthermore, it enhances the overall accuracy of
the network by sorting the image pixel values and calculating their median value, which reduces the processing
time.
Classification stage: In this stage, ELM-LRF is used to extract the desired features from the image directly.
The ELM-LRF structure has two parts: the first part contains the convolution and pooling layers, and the second
part combines the extracted features. The proposed solution uses a modified loss and softmax function in the
convolution layer [1] for classification instead of the sigmoid function of the state-of-the-art solution [7].
The ELM-LRF model has six layers. The first layer is the input layer, and the second layer is a convolution
layer with six convolution filters. The third layer is a pooling layer. The fourth layer is
another convolution layer with 12 convolution filters, followed by another pooling layer. The last layer is a
fully connected layer. As shown in Table 2 and Fig. 4, the modified loss function reduces the risk of overfitting
the data. Besides, the softmax function supports multiclass classification for classifying the brain tumour and
its type. In the pooling layer, square/square-root pooling is used. In the second part, the feature maps
obtained from the convolution layer are combined in a matrix to acquire the extracted features from the tumour
image. Also, the weight vector between the ELM's hidden layer and the output layer is calculated analytically
using the least squares method. The method finds the line of best fit for a set of data by providing a visual
demonstration of the relationship between the data points. Each point of data is representative of the
relationship between a known independent variable and an unknown dependent variable.
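
The square/square-root pooling mentioned above can be sketched as follows. The sketch assumes the formulation commonly given for ELM-LRF, in which each pooled value is the square root of the sum of squared convolution outputs inside a (2e+1) × (2e+1) window, with zeros outside the feature map. The pooling size `e` is not stated in the text, so the default used here is purely hypothetical, as is the function name.

```python
import numpy as np

def square_root_pooling(feature_map: np.ndarray, e: int = 1) -> np.ndarray:
    """Square/square-root pooling over a (2e+1) x (2e+1) window.

    Positions outside the feature map contribute zero, so the pooled map
    has the same shape as the input.
    """
    h, w = feature_map.shape
    squared = np.pad(feature_map.astype(float) ** 2, e, mode="constant")
    pooled = np.zeros((h, w))
    # Accumulate the squared values over every offset in the window.
    for dr in range(2 * e + 1):
        for dc in range(2 * e + 1):
            pooled += squared[dr:dr + h, dc:dc + w]
    return np.sqrt(pooled)

pooled_map = square_root_pooling(np.ones((4, 4)), e=1)
```

On a constant feature map the pooled value depends only on how many window positions fall inside the map, which makes the border handling easy to check by hand.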
Extraction stage: In this stage, watershed segmentation and morphological operations are carried out to
effectively segment the classified image [7]. Watershed segmentation is used to help in determining the tumour.
The gradient information is gathered by computing the difference between the changes of pixel values of the
image. It defines the gray value and tumour border through a flooding process. Furthermore, it produces a fast
result in the form of closed curves. Watershed segmentation has disadvantages, such as excessive
over-segmentation. To avoid this problem, different pointer pixels are gathered initially by the morphological
operations. After that, the watershed is applied to the images that include the pointers. In morphological
operations, dilation adds pixels to the object boundaries in an image, and erosion removes pixels from the
object boundaries. The number of pixels added or removed from the objects in an image depends on the size
and shape of the structuring element used to process the image.
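
The effect of dilation and erosion on object boundaries can be illustrated with a minimal sketch on a binary mask. It assumes a 3 × 3 square structuring element and treats everything outside the image as background; the paper does not specify the structuring element, so these choices and the function names are hypothetical.

```python
import numpy as np

def _neighbourhood_stack(mask: np.ndarray) -> np.ndarray:
    """Stack the nine 3x3-shifted views of a binary mask (background outside)."""
    h, w = mask.shape
    padded = np.pad(mask.astype(bool), 1, mode="constant", constant_values=False)
    return np.stack([padded[r:r + h, c:c + w] for r in range(3) for c in range(3)])

def dilate(mask: np.ndarray) -> np.ndarray:
    """Dilation: a pixel is set if any pixel in its 3x3 neighbourhood is set."""
    return _neighbourhood_stack(mask).any(axis=0)

def erode(mask: np.ndarray) -> np.ndarray:
    """Erosion: a pixel stays set only if its whole 3x3 neighbourhood is set."""
    return _neighbourhood_stack(mask).all(axis=0)

# A single marker pixel grows into a 3x3 block and shrinks back again.
marker = np.zeros((5, 5), dtype=bool)
marker[2, 2] = True
grown = dilate(marker)
shrunk = erode(grown)
```

A larger structuring element would add or remove more pixels per step, which is the dependence on size and shape that the text describes.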

[Figure 3 block diagram. Labels recovered from the diagram: Read MRI image; Pre-processing with a median filter
(reduces noise and preserves the edges simultaneously); Classification (it extracts the features directly from
the image and has a short training period, which decreases the performance time; softmax supports both binary
and multiclass classification; the modified loss reduces the risk of overfitting); classification of different
types of benign and malignant brain tumour.]

Figure 3: Block diagram of the proposed system for brain tumour detection using the extreme learning machine
local receptive fields algorithm
[The green borders refer to the new parts in our proposed system]

Table 2: Proposed modified softmax loss function
Algorithm: Extreme Learning Machine Local Receptive Fields
1. Begin
2. Randomly assign the weights and bias to the neurons of the Extreme Learning Machine
3. Compute the hidden layer output matrix using the Moore-Penrose inverse of a matrix
4. Compute the output weight by multiplying the hidden layer output matrix and the transpose of the target sample
5. Compute the probability of the neuron representing a different class
6. Compute the cross-entropy loss function
7. Update the weight and bias value of the hidden layer
8. Stop
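
Step 5 of Table 2, computing a probability for each class, is what the softmax function does. The sketch below is a standard, numerically stable softmax; the example scores are made up for illustration and the function name is an assumption, not part of the proposed system's code.

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Convert raw class scores into probabilities that sum to 1 per row."""
    # Subtracting the row maximum avoids overflow without changing the result.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=-1, keepdims=True)

# Two samples, three hypothetical tumour classes.
scores = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probabilities = softmax(scores)
```

Unlike a single sigmoid output, every row here is a full distribution over the classes, which is why softmax covers both the binary and the multiclass case.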

Figure 4: Flowchart of the proposed algorithm

3.1 Proposed Equation:


In the proposed solution, the cross-entropy loss function is considered together with the softmax function,
which calculates the probability of multiple classes on the scale of [0, 1]. The loss function is shown in
equation 2, while the modified loss function used for the proposed solution is shown in equation 3. In addition,
regularization is considered for the proposed solution to reduce the risk of overfitting the data and to
increase the accuracy of the network [1][13]. Equation 4 shows the original regularization formula. The modified
loss function is combined with the regularization to minimize the overfitting of the data, as shown in equation
5. To achieve the study goal, the combination of the modified loss function and regularization is multiplied by
the weight vector between the hidden layer cells and the output cells, as shown in equation 6. The average weight
vector between the hidden layer and output layer cells is combined with the probability output of each sample
and regularized.

$$E(\theta) = -\sum_{i=1}^{n} \sum_{j=1}^{k} t_{ij} \ln y_j(x_i, \theta) \tag{2}$$

Where,
$E(\theta)$ is the loss function,
$\theta$ is the parameter vector,
$t_{ij}$ is the indicator that the i-th sample belongs to the j-th class,
$y_j(x_i, \theta)$ is the output for sample i,
ln is the natural log

$$ME(\theta) = t_{ij} \ln y_j(x_i, \theta) \tag{3}$$

Where,
ME(θ) is the modified loss function,
t_ij is the indicator that the i-th sample belongs to the j-th class,
y_j(x_i, θ) is the output for sample i,
ln is the natural log

$$R = \frac{\lambda}{2n} \sum_{w} \|w\|^2 \tag{4}$$

Where,
R is the regularization term,
n is the number of iterations,
w is the weight of the neurons,
λ is the regularization parameter,
λ/(2n) is the scale factor

The modified loss function is combined with the regularization to minimize the overfitting of the data, as
shown in equation (5).

$$ME(\theta)' = t_{ij} \ln y_j(x_i, \theta) + \frac{\lambda}{2n} \sum_{w} \|w\|^2 \tag{5}$$

Where,
ME(θ)′ is the modified loss function with regularization,
t_ij is the indicator that the i-th sample belongs to the j-th class,
y_j(x_i, θ) is the output for sample i,
ln is the natural log,
n is the number of iterations,
w is the weight of the neurons,
λ is the regularization parameter,
λ/(2n) is the scale factor
lP

$$ME = \sum_{i=1}^{N} \beta_i * ME(\theta)' \tag{6}$$

Where,
ME is the final modified function of the proposed solution,
β_i is the weight vector between the i-th hidden layer cell and the output cells,
ME(θ)′ is the modified loss function with regularization
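
As an illustration of how the cross-entropy term and the regularization term combine, the following NumPy sketch evaluates equation 2 and equation 4 and adds them. It is a sketch under assumptions rather than the authors' code: it uses the negative sign and the double sum of equation 2 (equation 3 as printed shows neither), it treats n simply as the normaliser in λ/(2n), and it leaves out the β weighting of equation 6 because the text does not specify how β_i is paired with the loss. The example logits, targets, weights, and function names are hypothetical.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(T, Y, eps=1e-12):
    """Equation 2: E = -sum_i sum_j t_ij * ln(y_ij)."""
    return float(-(T * np.log(Y + eps)).sum())


def l2_penalty(weights, lam, n):
    """Equation 4: R = lam / (2 n) * sum of squared weights."""
    return lam / (2.0 * n) * sum(float((w ** 2).sum()) for w in weights)


def regularized_loss(T, Y, weights, lam, n):
    """Cross-entropy plus L2 penalty (the structure of equation 5)."""
    return cross_entropy(T, Y) + l2_penalty(weights, lam, n)


# Hypothetical example: 2 samples, 3 classes, one small weight matrix.
logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]])
T = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
W = [np.array([[1.0, -2.0], [0.5, 0.0]])]

Y = softmax(logits)
loss_plain = cross_entropy(T, Y)
penalty = l2_penalty(W, lam=0.1, n=2)
loss_reg = regularized_loss(T, Y, W, lam=0.1, n=2)
```

The penalty depends only on the weights, so it raises the loss of any solution with large weights regardless of how well that solution fits the training labels.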

3.2 Area of Improvement:


The major limitation of the state-of-the-art solution [7] is that it uses two different filters in the
pre-processing stage, i.e., a non-local mean filter and a local smoothing filter, to remove noise. These filters
remove some important structures and information that are useful for classification. The other limitation of the
state-of-the-art solution is its use of the sigmoid function in the convolution layer for binary classification
of tumours, which can limit the overall performance of the neural network. The proposed solution addresses both
limitations: it replaces the non-local mean and local smoothing filters with a single median filter, and it
replaces the sigmoid function with the softmax function combined with the modified loss function and
regularization. A CNN can overfit when the dataset is small, which lowers the accuracy and increases the
processing time. Without regularization, the neural network can overfit or underfit the dataset. The proposed
solution therefore uses a CNN with the modified loss and softmax function, combining the modified loss function
with regularization and the softmax function for multi-class classification. To achieve the study goal, we
combined the modified loss function with regularization and multiplied the result by the weight vector between
hidden layer cells and output cells, as shown in equation 6. This makes the neural network simpler and better
able to classify new, unseen data, and enhances its accuracy.
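
To illustrate the pre-processing change, the sketch below applies a median filter to a small synthetic image. It is an assumption-laden illustration, not the authors' pipeline: the 3 x 3 window, the edge padding, and the function name median_filter are choices made here, and the paper does not state the window size it used. It relies on sliding_window_view, available in NumPy 1.20 and later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    if size % 2 != 1:
        raise ValueError("size must be odd")
    pad = size // 2
    # Repeat border pixels so the output keeps the input shape.
    padded = np.pad(image, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))


# Synthetic example: a flat image with one impulse-noise pixel.
image = np.full((5, 5), 10.0)
image[2, 2] = 255.0
filtered = median_filter(image)
```

Because the median ignores isolated outliers, the single bright pixel is replaced by the surrounding value while uniform regions are left unchanged.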

3.3 Why Softmax Entropy Loss Function?


An Extreme Learning Machine (ELM) consists of a convolutional layer and a pooling layer along with numerous
parameters in a single hidden layer with a non-linear activation function. It delivers a significant performance
gain for the classification of brain tumours [7]. When training the ELM network, large training data can add
noise to the extracted features, which creates an overfitting problem.
Because of overfitting, the prediction accuracy and overall performance of the network can be reduced, and the
neural network cannot generalize to new, unseen data. Therefore, it is important to detect overfitting in order
to train the neural network effectively. The effect of overfitting can be reduced by different techniques, which
include increasing the size of the network, increasing the number of training samples, and regularization. Among
these, the best technique to minimize overfitting is regularization, since the size of the network and of the
training dataset can be fixed in some instances. In this technique, the prime neurons that generate the
functional outputs are kept and the unwanted ones are removed automatically. This reduces the weights in the
neural network on average, since large weights give a higher loss. Thus, the neural network becomes simpler and
better able to generalize. If regularization is not implemented, the weights increase over time and the learning
rate of the network decreases. This makes the neural network complex, and it cannot accurately classify unseen
data. Moreover, the sigmoid function used in the state-of-the-art solution [7] classifies the dataset into two
classes and limits the capability of the neural network to binary classification, whereas the softmax function
supports the classification of multiple classes.
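
The weight-shrinking effect of the regularization can be made concrete with a single gradient descent step on a loss E plus the penalty R of equation 4. This is the standard weight decay argument; the learning rate η is introduced here as an assumption, since the paper does not state the optimiser or its settings.

$$\frac{\partial R}{\partial w} = \frac{\lambda}{n} w, \qquad w \leftarrow w - \eta \left( \frac{\partial E}{\partial w} + \frac{\lambda}{n} w \right) = \left( 1 - \frac{\eta \lambda}{n} \right) w - \eta \frac{\partial E}{\partial w}$$

For 0 < ηλ/n < 1, every update multiplies the weight by a factor below one before the data gradient is applied, so weights that the data term does not actively support decay towards zero.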

4. Results and Discussion


Python 3.6 and additional libraries such as Keras, Matplotlib, and others were used to implement the proposed
model. A dataset of 3064 images from 233 patients was used for the study. The images were randomly selected. The
dataset contains three different types of tumours, namely meningioma, glioma, and pituitary: 708 meningioma, 1426
glioma, and 930 pituitary images. About 85% of the data were used for training (2,614 images), and the remaining
450 images were used for testing the model. The 450 testing samples comprised 100 meningioma, 210 glioma, and 140
pituitary images. All brain tumour images were originally 376 x 376 pixels and were resized to 180 x 180 pixels
in black and white. For validation, K-fold cross-validation with K = 10 was used: one of the subsamples was
chosen as the validation data, and the others were used for training. The dataset was downloaded from
figshare.com [23], where it is freely available. The implementation ran on a 2.8 GHz 7th-generation Intel Core i7
processor with 16 GB of RAM and an NVIDIA 1050 with 4 GB of memory.
Tables and graphs were used to compare the state-of-the-art solution and the proposed system. The classification
results are presented in Tables 3, 4, and 5 in terms of the accuracy and the processing time for the three
different types of tumours, namely meningioma, glioma, and pituitary. The results generated by the proposed
system were compared with the state-of-the-art solution at different stages of image processing. In the
classification stage, the results showed that the proposed system enhanced the classification accuracy by
employing the softmax loss function and regularization. Moreover, the average processing time was also
significantly reduced.
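
The 10-fold validation scheme described above can be sketched as follows. This is an illustrative sketch, not the authors' code: the paper does not say whether the folds were drawn from the 2,614 training images or from the full dataset, so applying it to the training set is an assumption, as are the fixed seed and the function name k_fold_indices.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)   # shuffle once
    folds = np.array_split(order, k)     # k nearly equal folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx


# Assumption: folds are drawn from the 2,614 training images.
splits = list(k_fold_indices(2614, 10))
fold_sizes = [len(val_idx) for _, val_idx in splits]
```

Each image serves as validation data exactly once, and the ten validation scores can then be averaged.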
The mean and standard deviation of the test data for three different types of brain tumour was calculated using
the AVERAGE() and STDEVA() functions of Microsoft Excel. The formula for the standard deviation is
presented in equation 7.

$$\sigma = \sqrt{\frac{\sum |x - \bar{X}|^2}{n}} \tag{7}$$

σ = standard deviation
x = sample
X̅ = mean of the sample
n = total number of samples
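
The same statistics can be reproduced with Python's standard library. The sketch below uses the proposed-solution accuracies for the meningioma samples from Table 3; the variable names are chosen here for illustration. Note that equation 7 divides by n (population standard deviation, pstdev below), whereas Excel's STDEVA() is a sample estimate that divides by n - 1 (stdev below), so the two can differ slightly on seven samples.

```python
import statistics

# Proposed-solution accuracies (%) for samples 1.1 to 1.7 in Table 3.
proposed_meningioma_accuracy = [99.54, 100.00, 100.00, 99.28, 99.99, 99.04, 98.95]

mean_accuracy = statistics.mean(proposed_meningioma_accuracy)
population_sd = statistics.pstdev(proposed_meningioma_accuracy)  # divides by n
sample_sd = statistics.stdev(proposed_meningioma_accuracy)       # divides by n - 1
```

The mean rounds to 99.54, which matches the meningioma bar for the proposed solution in Figure 9.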

In the extraction stage, the deep learning model was generated. It extracts the features automatically from the
images using their labeled data. The input of the max-pooling layer of the CNN is the feature map output of the
first convolutional layer, shown in Fig 5 and Fig 6. L2 and L3 represent layer 2 and layer 3, as shown in Fig 7
and Fig 8. The map depicts the feature map of the convolutional layer. ReLU is the activation function of the
convolutional layer. ReLUPool is the activation function for the max-pooling layer. The selected features are
fed to the extreme learning machine to classify the type of brain tumour.

Figure 5: Grayscale image after applying the input image to the neural network

Figure 6: Feature Extraction of the gray-scale images in the first convolutional layer

Figure 7: L2 layer Feature Extraction of the max-pooling layer of CNN

Figure 8: L3 layer Feature Extraction of the max-pooling layer of CNN

Table 3: Accuracy and Processing time for Meningioma tumour
Sample details: Images of meningioma tumour subject

Sample No. | State of Art: Accuracy (%) | State of Art: Processing time (sec) | Proposed solution: Accuracy (%) | Proposed solution: Processing time (sec)
1.1 | 96.25 | 0.355 | 99.54 | 0.304
1.2 | 97.18 | 0.389 | 100.00 | 0.309
1.3 | 98.04 | 0.347 | 100.00 | 0.351
1.4 | 96.54 | 0.356 | 99.28 | 0.317
1.5 | 97.68 | 0.309 | 99.99 | 0.311
1.6 | 98.74 | 0.352 | 99.04 | 0.314
1.7 | 98.52 | 0.389 | 98.95 | 0.347

Table 4: Accuracy and Processing time for Glioma tumour
Sample details: Images of glioma tumour subject

Sample No. | State of Art: Accuracy (%) | State of Art: Processing time (sec) | Proposed solution: Accuracy (%) | Proposed solution: Processing time (sec)
2.1 | 97.60 | 0.354 | 98.12 | 0.315
2.2 | 99.99 | 0.345 | 98.50 | 0.389
2.3 | 96.33 | 0.369 | 97.96 | 0.358
2.4 | 97.14 | 0.356 | 98.26 | 0.314
2.5 | 96.86 | 0.361 | 98.63 | 0.328
2.6 | 97.68 | 0.391 | 97.92 | 0.357
2.7 | 96.95 | 0.358 | 97.56 | 0.332

Table 5: Accuracy and Processing time for Pituitary tumour
Sample details: Images of pituitary tumour subject

Sample No. | State of Art: Accuracy (%) | State of Art: Processing time (sec) | Proposed solution: Accuracy (%) | Proposed solution: Processing time (sec)
3.1 | 97.54 | 0.389 | 98.16 | 0.364
3.2 | 98.52 | 0.412 | 99.24 | 0.336
3.3 | 96.97 | 0.358 | 98.56 | 0.341
3.4 | 97.63 | 0.405 | 99.52 | 0.329
3.5 | 97.59 | 0.381 | 98.62 | 0.351
3.6 | 97.25 | 0.345 | 97.89 | 0.320
3.7 | 98.27 | 0.429 | 98.68 | 0.369

[Bar chart: Accuracy of state of art and proposed solution for three different types of brain tumour. x-axis: Tumour Type (Meningioma,
Glioma, Pituitary); y-axis: Accuracy in Percentage (%). Proposed Solution: 99.54, 98.14, 98.67; State of Art: 97.56, 97.51, 97.68.]

Figure 9: Classification Accuracy of different types of Brain Tumour


Figure 9 shows the tumour classification accuracy in percent for the three different types of brain tumour. The red color represents the
accuracy of the proposed solution, and the purple color shows the state-of-the-art solution. In the bar graph, a) the first pair of bars
indicates the average accuracy for the meningioma tumour, b) the second pair indicates the average accuracy for the glioma tumour, and c)
the third pair indicates the average accuracy for the pituitary tumour.
[Bar chart: Processing time (seconds) of state of art and the proposed solution for three different types of brain tumour. x-axis: Tumour
Type (Meningioma, Glioma, Pituitary); y-axis: Processing Time (sec). Proposed Solution: 0.3218, 0.3418, 0.3442; State of Art: 0.3567,
0.362, 0.3884.]

Figure 10: Processing Time for different types of Brain Tumour

Fig. 10 shows the processing time in seconds for the three different types of brain tumour. The blue color represents the processing time of
the proposed solution, and the green color shows the state-of-the-art solution. In the bar graph, a) the first pair of bars indicates the
average processing time for the meningioma tumour, b) the second pair indicates the average processing time for the glioma tumour, and c)
the third pair indicates the average processing time for the pituitary tumour.

The accuracy was calculated using the evaluate() method of the Keras Python library. The computational time was
calculated using the now() method of the Python datetime package. As shown in Fig. 9, the classification accuracy
increased to almost 99% with a smaller batch size and fewer epochs. Similarly, the processing time is reduced by
40-50 milliseconds in the proposed solution compared to the existing solutions, as shown in Fig. 10.
The accuracy was measured by the probability score of the labeled image after classification. The processing
time was measured as the execution time taken to classify an image after inputting it to the model. The input
image was classified as one of the tumour types: meningioma, glioma, or pituitary. The overall averages of the
accuracy and the processing time were calculated by averaging all the sample data for the respective tumour.
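
The timing approach described above, based on datetime.now(), can be sketched as follows. This is an illustration under assumptions: timed_call and classify_stub are hypothetical names, and the stub stands in for the trained model's prediction call so that the example runs without Keras.

```python
from datetime import datetime


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed time in milliseconds)."""
    start = datetime.now()
    result = fn(*args, **kwargs)
    elapsed_ms = (datetime.now() - start).total_seconds() * 1000.0
    return result, elapsed_ms


def classify_stub(image):
    """Hypothetical stand-in for classifying one image with the model."""
    return "glioma"


label, elapsed_ms = timed_call(classify_stub, [[0.0]])
```

For intervals of a few hundred milliseconds, time.perf_counter() is generally a more robust clock than datetime.now(), because it is monotonic and unaffected by system clock adjustments.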
Using the softmax function with regularization in the loss function during the classification stage in the CNN
layer of the extreme learning machine has enhanced the performance of the proposed solution. The regularization
removes the unwanted neurons and keeps only the prime neurons that are considered important for classification.
The unwanted neurons are discarded and disabled to reduce the processing time. Similarly, the reduced number of
neurons helps minimize overfitting of the training dataset as well as increase the accuracy. Furthermore,
setting a proper value of the bias, which was obtained experimentally, increased the classification accuracy of
the proposed solution compared to the state-of-the-art solution.


Different image enhancement techniques, along with the feature extraction and selection algorithm, have been
used for the classification and detection of different types of brain tumour. In addition, the weight and bias
values of the neurons are continuously refined to improve the accuracy and the processing time. The limitation
of the state-of-the-art solution is addressed in this research, with an improved accuracy of 98.99% against
97.4%. The processing time of the classifier has also been reduced to 322.56 milliseconds from 371.86
milliseconds. The improvement is due to the modified loss function along with the regularization, which
minimizes the risk of overfitting the training image dataset. The accuracy is calculated as the proportion of
true results (both true positives and true negatives) among all classified cases. A true positive is an actual
positive that is correctly identified as positive. A true negative is an actual negative that is correctly
identified as negative.
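
The accuracy definition in terms of true positives and true negatives can be written out as a small function. The counts in the example are hypothetical and are not taken from the paper's experiments.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (true positives + true negatives) / all classified cases."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one case is required")
    return (tp + tn) / total


# Hypothetical counts: 185 of 200 cases are classified correctly.
example_accuracy = accuracy(tp=90, tn=95, fp=5, fn=10)
```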
The experimental results above show that the proposed solution improves the overall classification accuracy and
the processing time in comparison to the state-of-the-art solution [7].
The proposed solution improves the classification and detection accuracy of brain tumours across the different
tumour types. Combining the modified softmax loss function and the modified loss function with regularization
has contributed significantly to classifying different types of tumours with high accuracy and minimum
processing time. The regularization, which is adapted from the solution of Abd-Ellah et al. [1], reduces the
risk of overfitting the dataset. Hence, the proposed solution improves accuracy by about 1.6 percentage points
(from 97.4% to 98.99%) and reduces the processing time by 40-50 milliseconds. Therefore, the proposed solution
could be used as a new method to diagnose brain tumours at an early stage using brain images.

5. Conclusion and Future Work


Accurate classification and detection of brain tumours at an early stage is vital to improving the chances of
treating the patient. Brain tumour classification and detection have been implemented using deep learning
approaches, but these have not achieved sufficiently high accuracy or low processing time. Consequently, this
research proposed a model to improve the classification and detection accuracy of brain tumours across the
different tumour types. The model combines the modified loss function for categorical classification with
regularization. The regularization reduces the risk of overfitting a dataset and decreases the number of
neurons. Moreover, it decreases the computational time and increases the model's accuracy. Hence, it improves
the accuracy by about 1.6 percentage points and reduces the processing time by 40-50 milliseconds compared to
the state-of-the-art solution; see Table 6. Future research will focus on enhancing the performance of the
network and increasing the efficiency of the system. Furthermore, larger datasets for the different brain
tumours can be used for the training and testing of the deep learning model.
Table 6: Comparison between the proposed and state-of-the-art solutions

 | State-of-the-art technique (sigmoid function) (Ari et al., 2018) | Proposed technique (modified softmax loss function)
Applied Area | Binary classification of brain tumour | Multiclass classification of the type of brain tumour
Accuracy | The classification accuracy of the solution is 97.4012% | The classification accuracy of the proposed solution is 98.9933%
Processing Time | 371.86 milliseconds | 322.56 milliseconds
Equation | The model function is $E = \sum_{i=1}^{N} \beta_i g(w_i x_j + b_i)$ | The modified model function is $ME = \sum_{i=1}^{N} \beta_i * ME(\theta)'$
Contribution 1 | Binary classification of brain tumour into benign or malignant | Multiclass classification of the type of brain tumour
Contribution 2 | | Modified loss function with regularization enhances accuracy by reducing the overfitting of data

Ethical approval: Not Applicable

Informed consent: Informed consent was obtained from all individual participants included in the study

Compliance with Ethical Standards:

Funding: No funding was used in this work.

Conflict of Interest: No conflict of interest
References
[1] M. K. Abd-Ellah, A. I. Awad, A. A. M. Khalaf, and H. F. A. Hamed, "Two-phase multi-model automatic brain
tumour diagnosis system from magnetic resonance images using convolutional neural networks," EURASIP
Journal on Image and Video Processing, vol. 2018, no. 1, p. 97, September 30, 2018.
[2] Y. Xu et al., "Deep convolutional activation features for large scale brain Tumour histopathology image
classification and segmentation," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International
Conference on, 2015, pp. 947-951: IEEE.
[3] M. I. Razzak, S. Naz, and A. Zaib, "Deep Learning for Medical Image Processing: Overview, Challenges and the
Future," in Classification in BioApps: Springer, 2018, pp. 323-350.
[4] Y. Qiu et al., "Applying deep learning technology to automatically identify metaphase chromosomes using
scanning microscopic images: an initial investigation," in Biophotonics and Immune Responses XI, 2016, vol.
9709, p. 97090K: International Society for Optics and Photonics.
[5] B. D. de Vos, J. M. Wolterink, P. A. de Jong, M. A. Viergever, and I. Išgum, "2D image classification for 3D
anatomy localization: employing deep convolutional neural networks," in Medical Imaging 2016: Image
Processing, 2016, vol. 9784, p. 97841Y: International Society for Optics and Photonics.
[6] R. Cuingnet, R. Prevost, D. Lesage, L. D. Cohen, B. Mory, and R. Ardon, "Automatic detection and segmentation
of kidneys in 3D CT images using random forests," in International Conference on Medical Image Computing and
Computer-Assisted Intervention, 2012, pp. 66-74: Springer.
[7] A. Ari and D. Hanbay, "Deep learning based brain Tumour classification and detection system," Turkish Journal
of Electrical Engineering & Computer Sciences, vol. 26, no. 5, pp. 2275-2286, 2018.
[8] N. Nabizadeh and M. Kubat, "Brain Tumours detection and segmentation in MR images: Gabor wavelet vs.
statistical features," Computers & Electrical Engineering, vol. 45, pp. 286-301, 2015.
[9] N. B. Bahadure, A. K. Ray, and H. P. Thethi, "Image Analysis for MRI Based Brain Tumour Detection and
Feature Extraction Using Biologically Inspired BWT and SVM," International Journal of Biomedical Imaging,
vol. 2017, p. 12, 2017, Art. no. 9749108.
[10] M. Alfonse and A.-B. M. Salem, "An automatic classification of brain Tumours through MRI using support vector
machine," Egyptian Computer Science Journal, 2016.
[11] S. Damodharan and D. Raghavan, "Combining Tissue Segmentation and Neural Network for Brain Tumour
Detection," International Arab Journal of Information Technology (IAJIT), vol. 12, no. 1, pp. 42-52, 2015.
[12] K. R. Laukamp et al., "Fully automated detection and segmentation of meningiomas using deep learning on
routine multiparametric MRI," Eur Radiol, June 25, 2018.
[13] J. Amin, M. Sharif, M. Yasmin, and S. L. Fernandes, "Big data analysis for brain Tumour detection: Deep
convolutional neural networks," Future Generation Computer Systems, vol. 87, pp. 290-297, 2018.
[14] O. Charron, A. Lallement, D. Jarnet, V. Noblet, J.-B. Clavier, and P. Meyer, "Automatic detection and
segmentation of brain metastases on multimodal MR images with a deep convolutional neural network," Computers
in Biology and Medicine, vol. 95, pp. 43-54, 2018.
[15] A. R. Deepa and W. R. Sam Emmanuel, "An efficient detection of brain Tumour using fused feature adaptive
firefly backpropagation neural network," Multimedia Tools and Applications, October 6, 2018.
[16] Y. Jun et al., "Deep-learned 3D black-blood imaging using automatic labelling technique and 3D convolutional
neural networks for detecting metastatic brain Tumours," Sci Rep, vol. 8, no. 1, p. 9450, June 21, 2018.
[17] B. Ural, "A Computer-Based Brain Tumour Detection Approach with Advanced Image Processing and
Probabilistic Neural Network Methods," Journal of Medical and Biological Engineering, vol. 38,
no. 6, pp. 867-879, December 1, 2018.
[18] M. Arunachalam and S. Royappan Savarimuthu, "An efficient and automatic glioblastoma brain Tumour detection
using shift-invariant shearlet transform and neural networks," International Journal of Imaging Systems and
Technology, vol. 27, no. 3, pp. 216-226, 2017.
[19] S. Preethi and P. Aishwarya, "Combining Wavelet Texture Features and Deep Neural Network for Tumour
Detection and Segmentation over MRI," Journal of Intelligent Systems, vol. 0, no. 0, 2017.
[20] E.-S. A. El-Dahshan, H. M. Mohsen, K. Revett, and A.-B. M. Salem, "Computer-aided diagnosis of human brain
Tumour through MRI: A survey and a new algorithm," Expert Systems with Applications, vol. 41, no. 11, pp.
5526-5545, 2014.
[21] A. R. Kavitha and C. Chellamuthu, "Detection of brain tumour from MRI image using modified region growing
and neural network," Imaging Science Journal, vol. 61, no. 7, pp. 556-567, 2013.
[22] P. Kochar, "A Survey on Brain Tumour Detection and Classification System based on Artificial Neural Network,"
International Journal of Computer Applications, vol. 90, no. 18, 2014.
[23] C. Jun, brain Tumour dataset. 2017. Retrieved from https://figshare.com/.
