
Received: 11 July 2019 Revised: 25 December 2019 Accepted: 16 January 2020

DOI: 10.1002/ima.22400

RESEARCH ARTICLE

Deep learning and optimization algorithms for automatic breast cancer detection

Zijun Sha1 | Lin Hu2 | Babak Daneshvar Rouyendegh3

1 Honda R&D Co., Ltd., Wako-shi, Saitama, Japan
2 AZAPA Co., Ltd, Tokyo, Japan
3 Department of Industrial Engineering, Ankara Yıldırım Beyazıt University (AYBU), Ankara, Turkey

Correspondence
Lin Hu, AZAPA Co., Ltd., Tokyo 105-0013, Japan.
Email: 2528265908@qq.com

Abstract
Breast cancer is caused by the abnormal and rapid growth of breast cells. An early diagnosis can ensure an easier and effective treatment. A mass in the breast is a significant early sign of breast cancer, even though differentiating the cancerous mass's tissue from normal tissue for diagnosis is a difficult task for radiologists. The development of computer-aided detection systems in recent years has led to nondestructive and efficient cancer diagnostic techniques. This paper proposes a comprehensive method to locate the cancerous region in the mammogram image. This method employs image noise reduction, optimal image segmentation based on the convolutional neural network, a grasshopper optimization algorithm, and optimized feature extraction and feature selection based on the grasshopper optimization algorithm, thereby improving precision and decreasing the computational cost. This method was applied to the Mammographic Image Analysis Society Digital Mammogram Database and Digital Database for Screening Mammography breast cancer databases and the simulation results were compared with 10 different state-of-the-art methods to analyze the proposed system's efficiency. Final results showed that the proposed method had 96% Sensitivity, 93% Specificity, 85% PPV, 97% NPV, 92% accuracy, and better efficiency than other traditional methods in terms of Sensitivity, Specificity, PPV, NPV, and Accuracy.

KEYWORDS
breast cancer, convolutional neural networks, feature extraction, feature selection, grasshopper
optimization algorithm, image classification, image segmentation

1 | INTRODUCTION

The human body is made up of minute cells that are visible only through the microscope. Generally, cells continuously reproduce to replace old and dead cells. Cell proliferation is characterized by uniform behavior where cell growth is as required by the body. When the growth of the cells is out of control, a large number of new cells are produced which are cancerous and this gradually forms a tumor.

Breast cancer occurs among both women and men, although women account for over 99% of the cases. Breast cancer is the second most common among women and the second most widespread in the world. Every year, more than 11 000 women worldwide die and 6% of the world's total mortality can be attributed to breast cancer.1,2

Breast cancer under the age of 30 years is less prevalent. Predisposing factors of breast cancer include high age, familial history, infertility, first pregnancy over 30 years of age, consumption of foods containing high

Int J Imaging Syst Technol. 2020;1–12. wileyonlinelibrary.com/journal/ima © 2020 Wiley Periodicals, Inc. 1
2 SHA ET AL.

animal fat, and so on. Many patients do not have any clinical symptoms at an early stage, and the disease is identified by the surgeon's examination and mammography.3

When significant signs for cancer in the patient are not noticeable, it is necessary to follow up with the examiner and undergo diagnostic procedures like mammography to diagnose cancer. This helps with early detection of cancerous cells, and hence, the probability of a successful treatment is higher.

Mammography is the most commonly used diagnostic test by radiologists to diagnose and screen breast cancer. The use of a mammogram reduces mortality by up to 25%. It is difficult to interpret and describe the mammography images, and the official US National Cancer Institute states that 10% to 30% of the glands in the patient's breast are not recognized by the radiologist through a mammogram.

In recent decades, extensive research has been performed to reduce diagnostic errors of breast cancer and to increase the speed of diagnosis. The results of this research can help radiologists and specialists with a rapid diagnosis. Image processing techniques and pattern recognition for automatic diagnosis and detection of breast cancer from mammogram images can reduce human errors and increase the speed of diagnosis. As the reasons for breast cancer cannot be completely defined, it cannot be prevented. Early diagnosis can increase the chance of complete recovery, although breast cancer in most cases cannot be recognized until the advanced stage.4

The statistics of breast cancer survivors can be improved if rapid diagnostics are employed. Mammography overcomes the setbacks of other methods. The results of the American Cancer Society on 280 000 American women demonstrated that imaging, clinical examinations, breast cancer screening, and mammography enable early diagnosis of breast cancer and that mammography can identify more than 41% of cancer patients.5

Several studies have been carried out to show that the early diagnosis of breast cancer can increase the chance of cure.

Recently, the research on non-destructive experiments to diagnose breast cancer has increased. The suggested methods can be performed without involving the patient by using various techniques, especially image processing and computer programming.6-8 A significant part of breast cancer diagnosis in digital image processing is in precisely locating the cancerous area.

New methods for diagnosing breast cancer, especially non-invasive diagnostic tools, have been discovered in various laboratories. The methods are based on different digital photography techniques like optical coherence tomography,9 multispectral,10 and dermoscopy.11

These methods have their advantages and shortcomings which compromise accuracy and efficiency for cost efficiency and ease of use. Recently, computer-aided detection (CAD) methods have been employed to diagnose cancer nondestructively and with minimal error.

Image classification is the final stage of image processing and is a relatively difficult process; it is carried out after image pre-processing, image segmentation, and image feature extraction. The purpose of image classification is to separate the original input image into predefined classes.

The quality of the final categorization depends on the results of all the steps in image processing. The use of computer-aided detection helps radiologists analyze and highlight the suspicious cancerous regions, which improves the cancer detection rate.12 This methodology improves the cancer detection rate by considering a two-step guaranteed system to help the radiologist decrease human error.13,14

Mammography is an important part of cancer detection as selecting a precise method to detect the cancerous region influences the diagnosis. Human error is unavoidable, and in some cases, even the most experienced radiologists make mistakes when analyzing the image,15 though many precautions are taken to cover this shortcoming.16-18 In 2016, Spanhol et al presented a technique for automatic cancer diagnosis of histopathological images using convolutional neural networks (CNNs).19 Simulation results analyzed the system's precision and compared it with the other popular methods. Different configurations for the method were applied for the recognition rates.

In 2015, Liu et al presented a new methodology for automatic breast cancer tissue detection based on false-positive reduction.20 The multiple concentric layers technology was used to locate the suspicious areas, after which the narrowband-based active contour was used to improve the segmentation accuracy of the cancerous mass. The regions of interest (ROI) method for the segmented suspicious regions was used to extract the texture and geometric features and was evaluated by a gray-level co-occurrence matrix and completed local binary pattern. Finally, a support vector machine (SVM) was trained under the supervision of a radiologist to classify the results. Experimental results showed higher sensitivity compared with previous methods.

In 2016, Gu et al introduced a method for automatic image segmentation of the 3D ultrasound breast cancer images, where images were categorized according to the majority of their tissue components.21 Simulation results showed high efficiency of the proposed method compared with other methods and manual diagnosis.

In 2016, Wang et al proposed a different technique for cell nuclei detection in breast cancer images. The technique employed a Region of Interest (ROI)-based procedure for accurate diagnosis of breast cancer images. The

method is a hybrid of the Curvature Scale Space (CSS) method and the mathematical morphology. Wrapper features were utilized for improving system accuracy. The final results showed 91.64% precision for this technique.22

In 2018, an improved version of the k-nearest neighborhood was proposed by Li et al. This method, called the entropy weighted local-hyperplane knn, was used to diagnose the cancerous regions in the mammogram. Simulation results showed a 92.33% precision rate.23

In 2018, Al-antari et al proposed a deep learning-based method for the automatic diagnosis of breast cancer in digital mammograms.24 The method was based on a deep belief network (DBN) that automatically detects breast tissue areas and detects the benign and malignant regions. Two ROI methods were utilized: whole mass ROIs and multiple mass ROIs. They used linear discriminant analysis, neural network, and quadratic discriminant analysis for classification. The final results were compared with other methods and showed good achievements for the DBN.

In 2019, Nayak et al presented an automatic method for breast tissue diagnosis based on an efficient watershed algorithm.25 The watershed method is a popular method for contour-based feature extraction and region segmentation, which makes it an efficient breast tissue segmentation method. The proposed method was tested with the well-known breast cancer databases, Mammographic Image Analysis Society Digital Mammogram Database (MIAS) and Digital Database for Screening Mammography (DDSM), and the results were compared with other state-of-the-art techniques available in the literature.

In recent years, the applications of computational intelligence have extensively increased. Computational intelligence includes different techniques like neural networks,26-30 fuzzy methods,31-35 and optimization algorithms.36-40 Meta-heuristics are a part of optimization algorithms that can optimize the segmentation and classification methods.

In 2015, Bhardwaj and Tiwari presented a modified neural network to design an optimal breast cancer diagnosis system using a genetic algorithm. Simulation results showed a 100% precision rate for the studied dataset.41

The grasshopper optimization algorithm (GOA) is a new meta-heuristic algorithm that was first proposed by Mirjalili and Lewis.42 It is derived from the swarming activities of the grasshoppers in the environment. The GOA was applied to different datasets and showed good results. This has been a motivation to implement this algorithm in the system's architecture for breast cancer diagnosis. The proposed system is illustrated in Figure 1.

FIGURE 1 Proposed breast cancer diagnosis system. Blocks shown: Mammogram Acquisition, Median Filtering, Image Segmentation Based on CNN, GOA, Feature Extraction, Feature Selection, Classification (Healthy / Cancer) [Color figure can be viewed at wileyonlinelibrary.com]

As seen in the figure, the aim is to use the GOA in both image segmentation and feature selection to achieve a favorable result which sets it apart from the other similar works.

Section 2 of this paper describes the materials and methods, where the CNN as the classifier and GOA for optimization are described. Section 3 illustrates how to reduce the noise of the input image based on median filtering. Section 4 illustrates how to use the GOA for optimizing and CNN for the classification. Section 5 describes the application of GOA for the optimal selection of the image features. Section 6 describes the SVMs. Section 7 describes the adopted database, Section 8 presents the experimental results, and Section 9 gives the conclusions.

2 | MATERIALS AND METHODS

2.1 | Convolutional neural network

In the CNN learning method, several layers are trained efficiently.43,44 This method is efficient and is commonly used in different computer vision applications. The CNN network consists of three main layers: the convolutional layer, the pooling layer, and the fully connected layer. Each layer performs different tasks. Figure 2 shows the architecture of the CNN.

For image classification, multiple 2D matrices are considered as the input and the output of the convolutional layer. The number of input matrices and the number of output matrices are not required to be equal.

Local feature extraction is employed to obtain the regional properties of the input image.

The main objective of the learning technique here is to obtain a few kernel matrices that extract the principal characteristics of the cancerous region from the mammogram image.

This study employs the backpropagation (BP) technique to optimize network connection weights. A sliding window is moved across the input, and at each position the dot product between the window and the kernel weights is computed and summed.

The rectified linear unit is used as the activation function, f(x) = max(x, 0).45

FIGURE 2 Convolutional neural network architecture, showing the convolution and pooling stages and the benign/malignant outputs [Color figure can be viewed at wileyonlinelibrary.com]

Max pooling is used to reduce the scale of the output; in this study, the highest value within the sliding grid is passed to the subsequent layer.

After the CNN initialization, an optimization algorithm is used to fit the output by adjusting the internal weights. The BP algorithm is usually chosen for this process.

BP evaluates the error of the training pairs and then adjusts the weights of the neurons to fit the desired output.30,46 It uses a gradient descent algorithm to minimize the error.

Gradient descent is used here to minimize the cross-entropy loss.47 The related cost function, in this case, is as follows:

$$L = \sum_{j=1}^{N} \sum_{i=1}^{M} -d_j^{(i)} \log z_j^{(i)}, \tag{1}$$

where $d_j = (0, \dots, 0, \underbrace{1, \dots, 1}_{k}, 0, \dots, 0)$ is the desired output vector and $z_j$ is the softmax function of the mth class as follows:

$$z_j^{(i)} = \frac{e^{f_j}}{\sum_{i=1}^{M} e^{f_i}}, \tag{2}$$

where N describes the sample number.

The function L can be modified with a weight penalty, scaled by γ, to keep the values of the weights from growing too large:

$$L = \sum_{j=1}^{N} \sum_{i=1}^{M} -d_j^{(i)} \log z_j^{(i)} + \frac{1}{2} \gamma \sum_{K} \sum_{L} W_{k,l}^{2}, \tag{3}$$

where $W_{k,l}$ is the weight of connection k in layer l, and L and K describe the total number of layers and the number of connections in layer l, respectively.

Since CNN is a strong classification tool, proposing an optimal configuration for this structure is important since most of the mentioned layouts are experimental. In recent years, a few methods have been presented to improve CNN based on meta-heuristic algorithms.48,49

2.2 | Grasshopper optimization algorithm

The GOA is a new stochastic swarm-based optimization algorithm that is inspired by the behavior of grasshopper insects.50

Like any other swarm optimization algorithm, GOA starts with a random swarm of the population (candidate solutions) to search and find the global optimum (maximum or minimum) solution for the problem.51

The movement of the ith grasshopper toward the target grasshopper is determined by $P_i$, which is formulated as follows:

$$P_i = GF_i + SA_i + WA_i, \tag{4}$$

where $GF_i$ is the gravity force on the ith grasshopper, $SA_i$ is the social interaction, and $WA_i$ is the wind advection. Here, $P_i$ is the position of the ith grasshopper, which is obtained as follows:

$$P_i = R_1 SA_i + R_2 GF_i + R_3 WA_i, \tag{5}$$

where $R_1$, $R_2$, and $R_3$ are random constants in the interval [0,1].

The social interaction for the ith grasshopper ($SA_i$) depends on two social forces between two grasshoppers: a repulsion force to stop collisions and an attraction force over a small length. The social interaction is simulated as follows:

$$SA_k = \sum_{\substack{l=1 \\ l \neq k}}^{N} SF(D_{kl})\, \hat{D}_{kl} \tag{6}$$

$$D_{lk} = \left| X_k - X_l \right| \tag{7}$$
$$\hat{D}_{lk} = \frac{X_k - X_l}{D_{lk}}, \tag{8}$$

where $D_{kl}$ is the Euclidean distance between the positions of the kth and the lth grasshoppers, and $\hat{D}_{lk}$ describes the unit vector between the kth and the lth grasshopper. The strength of the social forces is determined by SF, which is evaluated by the following formula:

$$SF(R) = f_i\, e^{-R/L} - e^{-R}, \tag{9}$$

where L is the length scale of attraction, and $f_i$ describes the intensity of attraction.

The intensity of the attraction strength is formulated as follows:

$$X_k^d = c \left( \sum_{\substack{l=1 \\ l \neq k}}^{N} c\, \frac{UB_d - LB_d}{2}\, SF\!\left( \left| X_l^d - X_k^d \right| \right) \frac{X_l - X_k}{D_{lk}} \right) + \hat{T}_d, \tag{10}$$

where SF describes the force due to grasshopper on social interaction.

The GF component is evaluated by

$$GF_i = -G_F\, \hat{e}_g, \tag{11}$$

where $\hat{e}_g$ describes a unit vector directed toward the center of the earth and $G_F$ represents the constant for the gravity force.

Finally, $WA_i$ is the wind advection model obtained by

$$WA_i = U\, \hat{e}_w, \tag{12}$$

where U is a constant drift, and $\hat{e}_w$ describes a unit vector along the wind direction.

By substituting the aforementioned parameters into Eq. (4), the position of each grasshopper is obtained, where N describes the number of grasshoppers and SF is the social force function of Eq. (9).

Assuming the gravity and wind are always toward the desired target, a formula is developed for evaluating the connections among swarm grasshoppers:

$$X_k^d = c \left( \sum_{\substack{l=1 \\ l \neq k}}^{N} c\, \frac{UB_d - LB_d}{2}\, SF\!\left( \left| X_k^d - X_l^d \right| \right) \frac{X_l - X_k}{D_{lk}} \right) + \hat{T}_d, \tag{13}$$

$$c = c_{max} - l\, \frac{c_{max} - c_{min}}{L}, \tag{14}$$

where $LB_d$ and $UB_d$ are the lower and the upper bounds in the dth dimension, respectively, $\hat{T}_d$ is the value of the dth dimension of the target grasshopper, c is the decreasing coefficient that shrinks the comfort zone, the repulsion region, and the attraction region, $c_{max}$ and $c_{min}$ are the highest value and the lowest value of c, respectively, and l and L are the current iteration and the total number of iterations, respectively. The pseudo-code of the algorithm is shown in Figure 3.

    Initializing: generate the initial swarm (population), cmax, cmin, and the maximum number of iterations.
    Evaluate the cost function for all agents in the swarm
    T = the best solution (agent)
    for (l = 1 : max number of iterations)
        Normalize the distance between grasshoppers in the interval [1, 4]
        Update the agent's position based on Eq. (14).
        Apply the constraints
        Update T if there is a better solution
        l = l + 1
    end for
    Return T

FIGURE 3 Pseudo-code of the Grasshopper optimization algorithm (GOA)

3 | MEDIAN FILTERING

In general, most measured signals contain unintended additive fluctuations, which are called noise. A mammogram is no exception and sometimes presents with noise. The presence of noise causes problems in image processing, especially when extracting the image details for different applications. Differentiation enhances high-frequency pixels, which include noise. To avoid this problem, the images with noise are pre-processed before the main processing.52,53

Median filtering is a popular method in noise filtering. It is a low-pass filter that preserves the image details while removing the noise. The median filter processes an m × n neighborhood by sorting all of its pixel values in ascending order, choosing the middle element of the ordered values, and replacing the central pixel with it. The mathematical equation of the median filter is as follows:
$$y(m,n) = \operatorname{median}\{\, x(i,j),\ (i,j) \in N \,\}, \tag{15}$$

where N describes the neighborhood centered around the location (m, n) in an image.

The median filter is effective in removing salt-and-pepper noise. In this study, the median filter is employed to remove the digital noise in the mammogram image. The size of the median filter mask selected is m × n = 3 × 3. Figure 4 shows two examples of median filtering on the mammogram image. The first shows how the median filter can be used for image smoothing to enable easy image segmentation, and the second applies 30% salt-and-pepper noise to the image and filters it with a median filter.

4 | SEGMENTATION BY GOA-BASED CNN

This study utilizes GOA to design an optimized CNN to segment the cancerous areas from the background. The function of the GOA is to adjust the hyper-parameters of the CNN for better performance when compared with manual adjustment. The solution for this configuration is a sequence of integers.

To prevent system error, the minimum (min) and the maximum (max) limitations adopted are 2 and the size of the sliding window, respectively. The minimum value 2 here is the smallest value allowed for the max-pooling; thus, lower sizes are not allowed. This optimization has an inequality constraint such that the value of the sliding window should be less than the input data. The number of swarms is set to 100, where the characteristics of the CNN hyper-parameters are determined by the agents, for example, within 10 integer values. In this study, the half-value precision of CNN is selected as the cost function of breast cancer validation.

Since both the CNN and GOA are used, the configuration has a high computational cost, as the CNN of each swarm agent needs to be trained on the breast cancer dataset by BP for 1000 iterations (Figure 5).

After initializing and evaluating the cost of the agents, the position of the search agents is updated based on the GOA parameters and the process repeats until the stop criterion is achieved. In this study, the CNN weights and biases selected for optimization are as follows:

$$W = \{ w_1, w_2, \dots, w_p \} \tag{16}$$

$$A = \{ a_1, a_2, \dots, a_A \} \tag{17}$$

$$w_n = \{ w_n^{1}, w_n^{2}, \dots, w_n^{L} \} \tag{18}$$

$$b_n = \{ b_n^{1}, b_n^{2}, \dots, b_n^{L} \}$$

$$l = 1, 2, \dots, L, \qquad n = 1, 2, \dots, A,$$

where l is the layer index, A and L are the total number of agents and the total number of layers, respectively, n is the number of the swarm, and $w_n^{i}$ is the weight of layer i. Therefore, the total parameters for optimizing are $W_n = \{W, A\}$.

The formulation of the measured error between the reference output and the system output is:

$$E = \frac{1}{N_s} \sum_{i=1}^{N_s} \sum_{j=1}^{k} \left( d_j^{i} - o_j^{i} \right)^2, \tag{19}$$

FIGURE 4 Median filtering on the mammogram image for smoothing and noise reduction: before (A) and after (B) image processing [Color figure can be viewed at wileyonlinelibrary.com]

FIGURE 5 Search agent vector (w1, w2, …, wp, a1, a2, …, aA) for the Grasshopper optimization algorithm (GOA)-based convolutional neural networks (CNN) [Color figure can be viewed at wileyonlinelibrary.com]
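The median filtering of Section 3 (Eq. 15, illustrated in Figure 4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `median_filter` and the toy patch are hypothetical, and clipping the window at the image border is a choice made for this sketch because the paper does not describe its border handling.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D list of pixel intensities.

    Border handling is an assumption of this sketch: windows are clipped
    at the image edge rather than padded.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    filtered = []
    for m in range(rows):
        out_row = []
        for n in range(cols):
            # Collect the neighborhood N centered on (m, n), as in Eq. (15).
            window = [
                image[i][j]
                for i in range(max(0, m - half), min(rows, m + half + 1))
                for j in range(max(0, n - half), min(cols, n + half + 1))
            ]
            out_row.append(median(window))
        filtered.append(out_row)
    return filtered


# Flat toy patch with one "salt" pixel (255) and one "pepper" pixel (0).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
clean = median_filter(noisy)
for row in clean:
    print(row)
```

Because each window is dominated by the background value, both outlier pixels are replaced by 10, which is the behavior that makes the median filter suitable for salt-and-pepper noise.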

where $N_s$ is the number of training samples, k describes the number of output layers, and $d_j^{i}$ and $o_j^{i}$ are the desired output and the output value of the CNN, respectively.

Because gradient descent, as a part of the BP algorithm, can be easily trapped in a local minimum, GOA is employed as a stochastic algorithm to escape the local minima.54-56 Figure 6 shows the architecture of the GOA-based CNN. Another advantage of using GOA over the BP algorithm for error minimization is that the GOA-based method does not require the backward phase, which has a high computational cost. Figure 4 shows the flowchart of the proposed method. A few examples of cancerous area detection based on the GOA-based CNN are shown in Figure 7.

FIGURE 6 Architecture of Grasshopper optimization algorithm (GOA)-based convolutional neural networks (CNN). Blocks shown: Training set, Testing set, Setting the network and its training, Calculate the fitness, Initialize random grasshoppers, Update the agent's position based on Eq. (14), Termination criteria, Optimal hyper-parameters [Color figure can be viewed at wileyonlinelibrary.com]

FIGURE 7 Some examples of cancer area detection based on the Grasshopper optimization algorithm (GOA)-based convolutional neural networks (CNN): (A) original image, (B) image after process

5 | FEATURE EXTRACTION AND SELECTION BASED ON SVM GOA

After detecting the cancerous area by image segmentation, the images should be further processed to obtain precise results. This is performed by image feature extraction. Machine vision techniques are useful in image feature extraction, but these techniques cannot be used with some types of images, such as medical images. In recent years, much attention has been paid to various areas of image processing and computer vision in medical imaging. Research on cancer detection has created a potent text descriptor to extract images that obtain relevant information for any changes in specific features. In this research, feature extraction is employed in cancer detection after breast cancer image segmentation.

In general, the input image is a large and essential body of pixel data. Such data make image processing complicated. To simplify the operation, a technique is required to extract useful features and prune the extra data. Therefore, in this study, image feature extraction is employed to extract useful data from the mammogram images. Different features are used for image feature extraction. The utilized features are based on geometric features, textures, and statistical features, which are introduced as follows:

$$\text{Perimeter} = \sum_{i=1}^{M} \sum_{j=1}^{N} b_p(i,j) \tag{20}$$

$$\text{Solidity} = \frac{\text{Area}}{\text{Convex Area}} \tag{21}$$

$$\text{Area} = \sum_{i=1}^{M} \sum_{j=1}^{N} p(i,j) \tag{22}$$

$$\text{Elongation} = \frac{2\sqrt{\text{Area}}}{a\sqrt{\pi}} \tag{23}$$

$$\text{Rectangularity} = \frac{\text{Area}}{a \times b} \tag{24}$$

$$\text{Irregularity index} = \frac{4\pi \times \text{Area}}{\text{Perimeter}^2} \tag{25}$$

$$\text{Form factor} = \frac{\text{Area}}{a^2} \tag{26}$$

$$\text{Eccentricity} = 2a^{-1} \left( a^2 - b^2 \right)^{0.5} \tag{27}$$

$$\text{Contrast} = \sum_{i=1}^{M} \sum_{j=1}^{N} p^2(i,j) \tag{28}$$

$$\text{Energy} = \sum_{i=1}^{M} \sum_{j=1}^{N} p^2(i,j) \tag{29}$$

$$\text{Homogeneity} = \sum_{i=1}^{M} \sum_{j=1}^{N} \frac{p(i,j)}{1 + |i - j|} \tag{30}$$

$$\text{Correlation} = \sum_{i=1}^{M} \sum_{j=1}^{N} \frac{p(i,j) - \mu_r \mu_c}{\sigma_r \sigma_c} \tag{31}$$

$$\text{Mean} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} p(i,j) \tag{32}$$

$$\text{Entropy} = -\sum_{i=1}^{M} \sum_{j=1}^{N} p(i,j) \log p(i,j) \tag{33}$$

$$\text{Variance} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( p(i,j) - \mu \right)^2 \tag{34}$$

$$\text{SD} = \text{variance}^{1/2}. \tag{35}$$

Invariant moments:

$$\begin{aligned} \varphi_1 &= \eta_{20} + \eta_{02} \\ \varphi_2 &= (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \\ \varphi_3 &= (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2, \end{aligned} \tag{36}$$

where $b_p$ is the external side length of the boundary pixel, a and b represent the major and the minor axis, respectively, MN is the image size, p(i, j) is the pixel intensity value at (i, j), and μ and σ are the mean and the SD, respectively.

As some features carry more useful information than others in feature extraction of the images, a cost function is required for the optimal selection of features.

In this research, by introducing a valid criterion, the GOA minimizes the cost function to achieve the best features. The cost function for this purpose is considered as follows:

$$\text{Cost function} = \frac{(TP \times TN) - (FP \times FN)}{\left( (TN + FP) \times (TP + FP) \times (TP + FN) \times (TN + FN) \right)^{1/2}}, \tag{37}$$

where FN and FP are false negative and false positive, and TN and TP are true negative and true positive, respectively.

6 | CLASSIFICATION BASED ON SVMS

The SVM contains a set of points in the n-dimensional space of data that shows the boundaries of the classes and categorizes them. It can be changed by relocating one of these two cases. The SVM gives the best results for classifying the data with a criterion for locating the support vectors.

The SVM detects the best decision surface using the formula:

$$y = \operatorname{sgn}\left( \sum_{i=1}^{N} y_i \alpha_i K(x, x_i) + b \right), \tag{38}$$

where x determines a test set vector with d dimensions, $x_i$ describes the ith training set vector, y is a class label, either −1 or 1, N is the number of training vectors, $K(x, x_i)$ is a kernel function, and $\alpha = \{\alpha_1 \dots \alpha_N\}$ and b are the model parameters.

In this study, SVM is employed to classify the mammogram image into cancerous and healthy groups. The input data are first received as labeled vectors in a hyper-dimensional space, based on the assumption that the vectors have sets of features to specify a class. The amount of input data is important in designing a precise classifier; that is, using too few data reduces the system precision, and using too many data overfits the system. The input data for SVM are obtained from the optimal selected features of the mammogram images.
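The SVM decision rule of Eq. (38) can be evaluated directly once the model parameters are known. The Python sketch below is illustrative only: the paper does not state which kernel it uses, so the Gaussian (RBF) kernel and its `gamma` value are assumptions, and the support vectors, labels, multipliers, and bias are hand-picked hypothetical values rather than a trained model.

```python
import math


def rbf_kernel(x, xi, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice and gamma are assumptions."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * squared_distance)


def svm_predict(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Evaluate y = sgn(sum_i y_i * alpha_i * K(x, x_i) + b), as in Eq. (38).

    A score of exactly zero is mapped to +1 (a convention of this sketch).
    """
    score = b
    for xi, yi, alpha in zip(support_vectors, labels, alphas):
        score += yi * alpha * kernel(x, xi)
    return 1 if score >= 0 else -1


# Hypothetical, hand-picked model parameters (not trained on any dataset).
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]  # -1 = healthy, +1 = cancerous (labeling assumed)
alphas = [1.0, 1.0]
b = 0.0

near_positive = svm_predict([1.9, 2.1], support_vectors, labels, alphas, b)
near_negative = svm_predict([0.1, -0.2], support_vectors, labels, alphas, b)
print(near_positive, near_negative)
```

In practice the multipliers and the bias come from training on the selected feature vectors; the sketch only shows how a trained model turns a feature vector into a class label.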

7 | THE DATABASE

The MIAS57 and the DDSM58 are used to test and analyze the proposed system. The MIAS database was gathered in England to help the researchers who work with mammogram images. Images are acquired from the UK National Breast Screening Program. The database contains 322 digital mammogram images of size 1024 × 1024 that are accurately labeled by specialists. MIAS is available at the Pilot European Image Processing Archive at the University of Essex.

DDSM contains 2620 images of 3000 × 5000 pixels with a 16-bit gray level. The gray-level images have an intensity level between 0 and 255. The original format of the images is LJPEG, which is converted to jpg format to reduce complexity. Therefore, the databases had a total of 2942 (322 + 2620) images. A few samples of the MIAS database and DDSM mammogram images are shown in Figure 8.

FIGURE 8 Sample mammography images from (A) MIAS database and (B) DDSM

8 | EXPERIMENTAL RESULTS

Experimental simulations were implemented on a laptop with Matlab platform R2017b software on Intel Core i7-4790K processor with 32 GB of RAM and two NVIDIA GeForce GTX Titan X GPU cards with scalable link interface. The simulations were applied to the MIAS and DDSM databases to analyze the system efficiency. The proposed method includes four parts: The first part is the noise elimination of the input mammogram images by simple median filtering. Second, an optimized CNN based on GOA is used to segment the cancerous area from the background. Different features are extracted from the image to process the raw data into useful information with less complexity. Third, an optimized method based on GOA is used to select useful features. Finally, the necessary information is used to train an SVM classifier that classifies the images into two groups: cancerous and healthy.

For the proposed optimized CNN/GOA and GOA-based feature extraction, the percentages for training and testing the dataset are 70% and 30%, respectively. The proposed network considers 10 000 iterations for training. To ensure a consistent analysis, the training step is repeated 20 times and the final results are described based on the mean values. Five performance metrics were used for the analysis as follows:

$$\text{Accuracy} = \frac{\text{correctly detected cases}}{\text{total cases}} \tag{39}$$

$$\text{Specificity} = \frac{\text{correctly detected healthy cases}}{\text{total healthy cases}} \tag{40}$$

$$\text{PPV} = \frac{\text{correctly detected cancer cases}}{\text{detected cancer cases}} \tag{41}$$

$$\text{NPV} = \frac{\text{correctly detected healthy cases}}{\text{detected healthy cases}} \tag{42}$$

$$\text{Sensitivity} = \frac{\text{correctly detected cancer cases}}{\text{total cancer cases}}. \tag{43}$$

The proposed method is compared with 10 different state-of-the-art techniques to assess its efficiency. The method of reference 59 is a framework based on the semi-supervised system. The method of reference 60 is based on a commercial tool. For a fair comparison, automatically extracted descriptors of this method are employed. A few deep learning-based systems like Ordinary CNN, AlexNet,61 VGG-16,62 ResNet,63 LIN,64 and Inception-v365 are also utilized for this comparison. Table 1 illustrates a performance comparison between the proposed system and the aforementioned methods.

From the results, it is clear that the proposed method has the highest precision compared with the 10 aforementioned methods. The results show the effect of using the GOA optimization algorithm on the deep learning framework. The distribution classification rate is also shown in the following figure for more clarification in a bar chart.
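The five metrics of Eqs. (39) to (43) follow directly from the four confusion-matrix counts. The Python sketch below shows the arithmetic; the function name `performance_metrics` and the example counts are illustrative assumptions and are not results from this study.

```python
def performance_metrics(tp, tn, fp, fn):
    """Compute the five metrics of Eqs. (39)-(43) from confusion counts.

    tp / tn: correctly detected cancer / healthy cases
    fp / fn: healthy cases flagged as cancer / cancer cases missed
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
    }


# Illustrative counts only; these are not results from the paper.
metrics = performance_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are normalized by the true class totals, whereas PPV and NPV are normalized by the predicted class totals, which is why the two pairs can differ noticeably when the classes are imbalanced.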

TABLE 1 Comparison of the performance metrics for breast cancer detection

| Method | Sensitivity | Specificity | PPV | NPV | Accuracy |
|---|---|---|---|---|---|
| Proposed optimized method | 0.89 | 0.88 | 0.81 | 0.92 | 0.87 |
| MED-NODE texture descriptor (ref. 59) | 0.57 | 0.82 | 0.74 | 0.81 | 0.75 |
| MED-NODE color descriptor (ref. 59) | 0.72 | 0.71 | 0.60 | 0.85 | 0.65 |
| Spotmole (ref. 60) | 0.76 | 0.56 | 0.57 | 0.87 | 0.62 |
| AlexNet (ref. 61) | 0.77 | 0.59 | 0.62 | 0.86 | 0.73 |
| ResNet-50 (ref. 63) | 0.79 | 0.76 | 0.68 | 0.85 | 0.72 |
| ResNet-101 (ref. 63) | 0.69 | 0.74 | 0.73 | 0.87 | 0.81 |
| VGG-16 (ref. 62) | 0.85 | 0.83 | 0.74 | 0.89 | 0.82 |
| LIN (ref. 64) | 0.86 | 0.86 | 0.78 | 0.88 | 0.83 |
| Inception-v3 (ref. 65) | 0.76 | 0.63 | 0.59 | 0.68 | 0.78 |
| Ordinary CNN | 0.74 | 0.95 | 0.71 | 0.79 | 0.74 |
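The accuracy column of Table 1 can be ranked with a few lines of Python. The sketch below copies the accuracy values from Table 1 and sorts them in descending order; the variable names are illustrative assumptions, and the snippet is only a reading aid for the table, not part of the original experiments.

```python
# Accuracy values copied from the Accuracy column of Table 1.
accuracy = {
    "Proposed optimized method": 0.87,
    "MED-NODE texture descriptor": 0.75,
    "MED-NODE color descriptor": 0.65,
    "Spotmole": 0.62,
    "AlexNet": 0.73,
    "ResNet-50": 0.72,
    "ResNet-101": 0.81,
    "VGG-16": 0.82,
    "LIN": 0.83,
    "Inception-v3": 0.78,
    "Ordinary CNN": 0.74,
}

# Sort methods from highest to lowest accuracy.
ranked = sorted(accuracy.items(), key=lambda item: item[1], reverse=True)
top_three = [name for name, _ in ranked[:3]]

for name, value in ranked:
    print(f"{name:30s} {value:.2f}")
```

With these values, the first three entries of `ranked` are the proposed method, LIN, and VGG-16.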

TABLE 2 Time computation complexity results of the final breast cancer detection methods

| Algorithm | Time (s) |
|---|---|
| Proposed optimized method | 952.15 |
| MED-NODE texture descriptor (ref. 59) | 5.83 |
| MED-NODE color descriptor (ref. 59) | 6.47 |
| Spotmole (ref. 60) | 3.16 |
| AlexNet (ref. 61) | 782.61 |
| ResNet-50 (ref. 63) | 689.25 |
| ResNet-101 (ref. 63) | 743.16 |
| VGG-16 (ref. 62) | 12.15 |
| LIN (ref. 64) | 20.16 |
| Inception-v3 (ref. 65) | 12.58 |
| Ordinary CNN | 782.53 |

By analyzing the GOA and the other methods for the studied databases, it can be seen that the GOA achieved the best classification accuracy, 87%. LIN and VGG-16, with 83% and 82%, have the second- and third-best results, respectively. Although the GOA-based CNN requires a time-consuming learning stage, this stage is performed only once, and hence the time parameter can be overlooked. The high performance of the proposed method can be attributed to employing an appropriate network architecture with a better hyper-parameter configuration. This creates a well-performing CNN classifier with better generalization capability and improved convergence of the training stage (Table 2).

9 | CONCLUSIONS

This paper presented a comprehensive methodology for the optimal diagnosis of breast cancer by mammography. The original images were pre-processed by a median filter for noise elimination, and optimized image segmentation based on a CNN was used to segment the cancerous area from the background. Several features were extracted to improve the precision and decrease the computational cost. To retain only the relevant features, an optimal method was employed to select the useful features and prune the unwanted ones. After feature extraction, the features were used to train an SVM classifier to differentiate the cancerous images from the healthy ones. Image segmentation and feature selection were optimized by the relatively new GOA. Simulations were applied to the MIAS and DDSM breast cancer databases and were compared with 10 different state-of-the-art methods to analyze the efficiency of the proposed system. The results showed 96% Sensitivity, 93% Specificity, 85% PPV, 97% NPV, and 92% accuracy, which is better than the other methods.

REFERENCES

1. Mehdy M, Ng P, Shair E, Saleh N, Gomes C. Artificial neural networks in image processing for early detection of breast cancer. Comput Math Methods Med. 2017;2017:1-15.
2. Broeders M, Allgood P, Duffy S, et al. The impact of mammography screening programmes on incidence of advanced breast cancer in Europe: a literature review. BMC Cancer. 2018;18:860.
3. Tabrizi FM, Vahdati S, Khanahmadi S, Barjasteh S. Determinants of breast cancer screening by mammography in women referred to health Centers of Urmia, Iran. Asian Pac J Cancer Prev. 2018;19:997.

4. Mellouli D, Hamdani T, Sanchez-Medina J, Ayed M, Alimi A. Morphological convolutional neural network architecture for digit recognition. IEEE Trans Neural Netw Learn Syst. 2019;30:2876-2885.
5. American Cancer Society. 2019. https://cancerstatisticscenter.cancer.org/#!/cancer-site/Breast (accessed January 2019).
6. Rashid Sheykhahmad F, Razmjooy N, Ramezani M. A novel method for skin lesion segmentation. Int J Inform Secur Syst Manag. 2015;4:458-466.
7. Razmjooy N, Mousavi BS, Soleymani F, Khotbesara MH. A computer-aided diagnosis system for malignant melanomas. Neural Comput Applic. 2013;23:2059-2071.
8. Chiang T-C, Huang Y-S, Chen R-T, Huang C-S, Chang R-F. Tumor detection in automated breast ultrasound using 3-D CNN and prioritized candidate aggregation. IEEE Trans Med Imaging. 2019;38:240-249.
9. Boppart SA, Luo W, Marks DL, Singletary KW. Optical coherence tomography: feasibility for basic research and image-guided surgery of breast cancer. Breast Cancer Res Treat. 2004;84:85-97.
10. Goh Y, Balasundaram G, Moothanchery M, et al. Multispectral optoacoustic tomography in assessment of breast tumor margins during breast-conserving surgery: a first-in-human case study. Clin Breast Cancer. 2018;18:e1247-e1250.
11. Goyal M, Yap MH. Region of interest detection in dermoscopic images for natural data-augmentation. arXiv preprint arXiv:1807.10711, 2018.
12. Jagadeesh K, Jamunalaksmi K, Muthuvidhya P, Harris SM, Ganga V. Mammogram based automatic computer aided detection of masses in medical images. J Telecomm Study. 2018;3(1):4.
13. Ortiz-Rodriguez JM, Guerrero-Mendez C, del Rosario Martinez-Blanco M, et al. Breast cancer detection by means of artificial neural networks. Advanced Applications for Artificial Neural Networks. Rijeka, Croatia: InTech; 2018.
14. Qi X, Zhang L, Chen Y, et al. Automated diagnosis of breast ultrasonography images using deep neural networks. Med Image Anal. 2019;52:185-198.
15. Hassanien A. Fuzzy rough sets hybrid scheme for breast cancer detection. Image Vision Comput. 2007;25:172-183.
16. Rangayyan RM, Ayres FJ, Desautels JL. A review of computer-aided diagnosis of breast cancer: toward the detection of subtle signs. J Franklin Inst. 2007;344:312-348.
17. Chen J-M, Li Y, Xu J, et al. Computer-aided prognosis on breast cancer with hematoxylin and eosin histopathology images: a review. Tumor Biol. 2017;39:1010428317694550.
18. Huppe AI, Mehta AK, Brem RF. Molecular breast imaging: a comprehensive review. Paper presented at: Seminars in Ultrasound, CT and MRI, 2018, pp. 60–69.
19. Spanhol FA, Oliveira LS, Petitjean C, Heutte L. Breast cancer histopathological image classification using Convolutional Neural Networks. Paper presented at: 2016 International Joint Conference on Neural Networks (IJCNN), 2016; 2560–2567.
20. Liu X, Zeng Z. A new automatic mass detection method for breast cancer with false positive reduction. Neurocomputing. 2015;152:388-402.
21. Gu P, Lee W-M, Roubidoux MA, Yuan J, Wang X, Carson PL. Automated 3D ultrasound image segmentation to aid breast cancer image interpretation. Ultrasonics. 2016;65:51-58.
22. Liu Y, Wang W, Ghadimi N. Electricity load forecasting by an improved forecast engine for building level consumers. Energy. 2017;139:18-30.
23. Li Q, Li W, Zhang J, Xu Z. An improved k-nearest-neighbor method to diagnose breast cancer. Analyst. 2018;143:2807-2811.
24. Al-antari MA, Al-masni MA, Park S-U, et al. An automatic computer-aided diagnosis system for breast cancer in digital mammograms via deep belief network. J Med Biol Eng. 2018;38:443-456.
25. Mohammadi M, Talebpour F, Safaee E, Ghadimi N, Abedinia O. Small-scale building load forecast based on hybrid forecast engine. Neural Process Lett. 2018;48(1):329-351.
26. Moallem P, Razmjooy N. A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation. Trends Appl Sci Res. 2012;7:445-455.
27. Parsian A, Ramezani M, Ghadimi N. A hybrid neural network-gray wolf optimization algorithm for melanoma detection. Biomed Res. 2017;28(8):3408-3411.
28. Razmjooy N, Mousavi BS, Soleymani F. A hybrid neural network imperialist competitive algorithm for skin color segmentation. Math Comp Model. 2013;57:848-856.
29. Razmjooy N, Ramezani M. Training wavelet neural networks using hybrid particle swarm optimization and gravitational search algorithm for system identification. Int. J. Mechatron. Electr. Comput. Technol. 2016;6(21):2987-2997.
30. Razmjooy N, Sheykhahmad FR, Ghadimi N. A hybrid neural network–world cup optimization algorithm for melanoma detection. Open Med. 2018;13:9-16.
31. Peng Z, Wang J, Wang D. Distributed Maneuvering of autonomous surface vehicles based on Neurodynamic optimization and fuzzy approximation. IEEE Trans Control Syst Technol. 2017;26(3):1083-1090.
32. Yin S, Gao H, Qiu J, Kaynak O. Adaptive fault-tolerant control for nonlinear system with unknown control directions based on fuzzy approximation. IEEE Trans Syst Man Cybern Syst. 2017;47:1909-1918.
33. Ghadimi N. A new hybrid algorithm based on optimal fuzzy controller in multimachine power system. Complexity. 2015;21:78-93.
34. Ghadimi N. An adaptive neuro-fuzzy inference system for islanding detection in wind turbine as distributed generation. Complexity. 2015;21:10-20.
35. Hosseini Firouz M, Ghadimi N. Optimal preventive maintenance policy for electric power distribution systems based on the fuzzy AHP methods. Complexity. 2016;21:70-88.
36. Yu D, Ghadimi N. Reliability constraint stochastic UC by considering the correlation of random variables with Copula theory. IET Renew Power Gen. 2019;13:2587-2593.
37. Razmjooy N, Khalilpour M, Ramezani M. A new meta-heuristic optimization algorithm inspired by FIFA world cup competitions: theory and its application in PID designing for AVR system. J Control Automat Elec Syst. 2016;27:419-440.
38. Razmjooy N, Ramezani M. An improved quantum evolutionary algorithm based on invasive weed optimization. Indian J Sci Res. 2014;4:413-422.
39. Mohammadi M, Ghadimi N. Optimal location and optimized parameters for robust power system stabilizer using honeybee mating optimization. Complexity. 2015;21:242-258.

40. Razmjooy N, Ramezani M, Ghadimi N. Imperialist competitive algorithm-based optimization of neuro-fuzzy system parameters for automatic red-eye removal. Int J Fuzzy Syst. 2017;19:1144-1156.
41. Bhardwaj A, Tiwari A. Breast cancer diagnosis using genetically optimized neural network model. Expert Syst Appl. 2015;42:4611-4620.
42. Ewees AA, Elaziz MA, Houssein EH. Improved grasshopper optimization algorithm using opposition-based learning. Expert Syst Appl. 2018;112:156-172.
43. Cai W, Mohammaditab R, Fathi G, Wakil K, Ebadi AG, Ghadimi N. Optimal bidding and offering strategies of compressed air energy storage: a hybrid robust-stochastic approach. Renew Energy. 2019;143:1-8.
44. Acharya UR, Oh SL, Hagiwara Y, Tan JH, Adeli H. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput Biol Med. 2018;100:270-278.
45. Koehler F, Risteski A. Representational power of ReLU networks and polynomial kernels: beyond worst-case analysis. arXiv preprint arXiv:1805.11405, 2018.
46. Roy K, Mandal KK, Mandal AC. Ant-lion optimizer algorithm and recurrent neural network for energy management of micro grid connected system. Energy. 2019;167:402-416.
47. Van Merriënboer B, Bahdanau D, Dumoulin V, et al. Blocks and fuel: frameworks for deep learning. arXiv preprint arXiv:1506.00619, 2015.
48. Martens J, Sutskever I. Learning recurrent neural networks with hessian-free optimization. Paper presented at: Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 1033–1040.
49. Bengio Y, Lamblin P, Popovici D, Larochelle H. Greedy layer-wise training of deep networks. Advances in Neural Information Processing Systems, Neural Information Processing Systems Conference; 2007:153-160.
50. Mirjalili S, Lewis A. The whale optimization algorithm. Adv Eng Softw. 2016;95:51-67.
51. Abedinia O, Abedinia O, Zareinejad M, Doranehgard MH, Fathi G, Ghadimi N. Optimal offering and bidding strategies of renewable energy based large consumer using a novel hybrid robust-stochastic approach. J Clean Prod. 2019;215:878-889.
52. Anoraganingrum D. Cell segmentation with median filter and mathematical morphology operation. Paper presented at: Image Analysis and Processing, 1999. Proceedings. International Conference on, 1999, pp. 1043–1046.
53. Loupas T, McDicken W, Allan PL. An adaptive weighted median filter for speckle suppression in medical ultrasonic images. IEEE Trans Circuits Syst. 1989;36:129-135.
54. Zhang L, Suganthan PN. A survey of randomized algorithms for training neural networks. Inform Sci. 2016;364:146-155.
55. Jaddi NS, Abdullah S. Optimization of neural network using kidney-inspired algorithm with control of filtration rate and chaotic map for real-world rainfall forecasting. Eng Appl Artif Intel. 2018;67:246-259.
56. Emary E, Zawbaa HM, Grosan C. Experienced gray wolf optimization through reinforcement learning and neural networks. IEEE Trans Neural Netw Learn Syst. 2018;29:681-694.
57. Suckling J, Parker J, Dance D, et al. The mammographic image analysis society digital mammogram database. Excerpta Medica. International Congress Series, 1994, pp. 375–378.
58. Street N. Breast cancer Wisconsin (diagnostic) data set. https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic) (accessed January 2019).
59. Giotis I, Molders N, Land S, Biehl M, Jonkman MF, Petkov N. MED-NODE: a computer-assisted melanoma diagnosis system using non-dermoscopic images. Expert Syst Appl. 2015;42:6578-6585.
60. Munteanu C, Cooclea S. Spotmole: Melanoma Control System. California, San Francisco: Cloudflare Inc. 2009.
61. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, Neural Information Processing Systems Conference; 2012:1097-1105.
62. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
63. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. Paper presented at: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
64. Li Y, Shen L. Skin lesion analysis towards melanoma detection using deep learning network. Sensors. 2018;18:556.
65. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. Paper presented at: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.

How to cite this article: Sha Z, Hu L, Rouyendegh BD. Deep learning and optimization algorithms for automatic breast cancer detection. Int J Imaging Syst Technol. 2020;1–12. https://doi.org/10.1002/ima.22400
