This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2019.2892795, IEEE Access
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.DOI
ABSTRACT A computer-aided diagnosis (CAD) system based on mammograms enables early breast
cancer detection, diagnosis, and treatment. However, the accuracy of existing CAD systems remains
unsatisfactory. This paper explores a breast CAD method based on feature fusion with Convolutional Neural
Network (CNN) deep features. First, we propose a mass detection method based on CNN deep features
and Unsupervised Extreme Learning Machine (US-ELM) clustering. Second, we build a feature set fusing
deep features, morphological features, texture features, and density features. Third, an ELM classifier is
developed using the fused feature set to classify benign and malignant breast masses. Extensive experiments
demonstrate the accuracy and efficiency of our proposed mass detection and breast cancer classification
method.
INDEX TERMS Mass Detection; Computer-Aided Diagnosis; Deep Learning; Fusion Feature; Extreme
Learning Machine
VOLUME 4, 2016 1
2169-3536 (c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
selection, which is also called objective features, have been successfully applied with a great improvement on accuracies in many applications, such as image recognition, speech recognition, and natural language processing [9], [10]. Both subjective and objective features have shortcomings. Subjective features ignore the essential attributes of images, while objective features ignore human experience. Therefore, the subjective and objective features are fused so that the fused features reflect both the essential properties of the image and human experience. Meanwhile, based on our previous research, the Extreme Learning Machine (ELM) classifies multi-dimensional features better than other classifiers, including SVM and decision tree. Thus, we use ELM to classify the extracted breast mass features. In this paper, we propose a novel diagnosis method that merges several deep features. The main contributions are as follows.
• In the detection phase, we propose a method that utilizes CNN and US-ELM for feature extraction and clustering, respectively. First, a mammogram is segmented into several sub-regions. Then, CNN is used to extract features from each sub-region, and US-ELM is used to cluster the features of the sub-regions, which eventually locates the region of the breast tumor.
• In the phase of feature integration, we design an 8-layer CNN architecture and obtain 20 deep features. In addition, we integrate 5 shape features, 5 texture features and 7 density features of the tumor with those deep features to form a fusion deep feature set.
• In the diagnosis phase, we use the fusion deep feature set of each mammogram as the input of ELM for classification. The output directly indicates whether the patient has a benign or a malignant breast tumor.
• Finally, the experimental results demonstrate that our proposed methods, the sub-regional US-ELM clustering and the ELM classification with fusion deep feature sets, achieve the best performance in the diagnosis of breast cancer. The experimental dataset contains 400 female mammograms.

II. RELATED WORK
The research efforts related to breast cancer CAD mainly focus on the detection and diagnosis of breast tumors. This section briefly summarizes existing works related to these two aspects.
In the aspect of breast tumor detection, Sun et al. [11] proposed a mass detection method, where an adaptive fuzzy C-means algorithm for segmentation is employed on each mammogram of the same breast. A supervised artificial neural network is used as a classifier to judge whether the segmented area is a tumor. Saidin et al. [12] employed pixels as an alternative feature and used a region growing method to segment breast tumors in the mammogram. Xu et al. [13] proposed an improved watershed algorithm. They first make a coarse segmentation of the breast tumor, followed by image edge detection via combining regions that have similar gray-scale mean values. Hu et al. [14] proposed a novel algorithm to detect suspicious masses in the mammogram, in which they utilized an adaptive global and local thresholding segmentation method on the original mammogram. Yap et al. [15] used three different deep learning methods to detect lesions in breast ultrasound images: a patch-based LeNet, a U-Net, and a transfer learning approach with a pretrained FCN-AlexNet.
In the aspect of differentiating benign and malignant breast tumors, Kahn et al. [16] created a Bayesian network that utilizes 2 physical features and 15 manually marked probabilistic characteristics to conduct the computer-aided diagnosis of breast cancer. Wang et al. [17] utilized ELM to classify features of breast tumors and compared the results with an SVM classifier. Qiu et al. [18] applied CNN to the risk prediction of breast cancer by training CNN with a large amount of time series data. Sun et al. [19] also used deep neural networks to predict the near-term risk of breast cancer based on 420 time series records of mammography. Jiao et al. [20] proposed a deep feature based framework for breast mass classification, in which CNN and a decision tree process are utilized. Arevalo et al. [21] used CNN to abstract representations of the breast tumor and then classified the tumor as either benign or malignant. Carneiro et al. [22] proposed an automated mammogram analysis method based on deep learning to estimate the risk of patients developing breast cancer. Kumar et al. [23] presented an image retrieval system using Zernike moments (ZMs) for extracting features, since the features can affect the effectiveness and efficiency of a breast CAD system. Aličković and Subasi [24] proposed a breast CAD method, in which genetic algorithms are used for the extraction of informative and significant features, and the rotation forest is used to decide between two categories of subjects, with or without breast cancer.

III. METHODS
In this paper, we consider the following five steps in breast cancer detection: breast image preprocessing, mass detection, feature extraction, training data generation, and classifier training. In the breast image preprocessing, denoising and contrast enhancement are applied to the original mammogram to increase the contrast between the masses and the surrounding tissues. The mass detection is then performed to localize the ROI. After that, features including deep features, morphological features, texture features and density features are extracted from the ROI. During the training process, the classifiers are trained with every image from the breast image dataset using their extracted features and corresponding labels. Thus, the mammograms under diagnosis can be identified using the well-trained classifiers. Fig. 1 presents the flowchart of the entire diagnosis process.

FIGURE 1. Flowchart of the entire diagnosis process. Training: mammograms, pre-processing, classifier training. Diagnosis: mammograms under diagnosis, pre-processing, diagnosis results of mammograms.

A. BREAST IMAGE PREPROCESSING
There are several preprocessing methods in [25]–[29]. The adaptive mean filter algorithm [25] is selected to eliminate noise on the original mammograms in order to avoid the impact of noise on subsequent auxiliary diagnosis. The main idea is to slide a fixed-size window along the row direction of the image and to calculate the mean, variance, and spatial correlation values of each window to determine whether the window contains noise. If noise is detected, the pixel values of the selected window are replaced with the mean value.
In this paper, a contrast enhancement algorithm [30] has been used to increase the contrast between the suspected masses and surrounding tissues. The main idea is to transform the histogram of the original image into an approximately uniform distribution. After this process, the gray-scale range of the image is enlarged, so the contrast is enhanced and the image details become clearer.
Fig. 2 shows the mammogram before and after preprocessing. Fig. 2(a) is the original image, and Fig. 2(b) and 2(c) are the images after denoising and contrast enhancement, respectively. By comparing these three images, we can see that the boundaries between the mammary region and the background area in the original image (Fig. 2(a)) are often ambiguous and irregular. After preprocessing, the contrast between the mammary region and the background area has been significantly enhanced, thus considerably reducing the computational burden in image post-processing.

B. MASS DETECTION
The purpose of mass detection is to extract the mass region from the normal tissues. The more precise the mass segmentation, the more accurate the extracted features. In this paper, we propose a mass detection method based on sub-domain CNN deep features and US-ELM clustering. The processing flowchart is shown in Fig. 3. The first step is to extract the ROI from the images after preprocessing. The ROI is then divided into several non-overlapping sub-regions using a sliding window. After that, we determine whether all sub-regions have been successfully traversed. If yes, extract the deep features of the sub-regions; otherwise, cluster the deep features of the sub-regions, obtain the mass area boundary and complete the mass detection process.

1) Extract ROI
In mammography, there are a large number of areas with a gray value of 0, which have no impact on the breast CAD. In order to improve the mammary image processing efficiency and ensure the accuracy of follow-up diagnosis, it is necessary to separate the mammary area from the whole mammogram as the ROI.
In this paper, an adaptive mass region detection algorithm has been utilized to extract the breast mass region. Specifically, in a mammogram, all rows are scanned sequentially to find the first nonzero pixel (with abscissa denoted as $x_s$) and the last nonzero pixel (with abscissa denoted as $x_d$), and all columns are then scanned sequentially to find the first nonzero pixel (with ordinate denoted as $y_s$) and the last nonzero pixel (with ordinate denoted as $y_d$). Algorithm 1 presents the details of this algorithm, where the size of the mammogram $I$ is $m \times n$.

2) Partition Sub-region
In this section, a method to divide the ROI into several non-overlapping sub-regions is proposed. The searching area to
FIGURE 2. Mammogram before and after preprocessing: (a) initial mammogram; (b) denoising applied to (a); (c) contrast enhancement applied to (b).
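The contrast enhancement of Section III.A and the ROI extraction of Section III.B.1 can be illustrated with a minimal NumPy sketch. It rests on two assumptions that the text does not confirm: that the enhancement algorithm of [30] behaves like plain global histogram equalization on an 8-bit image, and that Algorithm 1 reduces to the bounding box of all nonzero pixels. The function names `equalize_histogram` and `extract_roi` are hypothetical.

```python
import numpy as np


def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Global histogram equalization of an 8-bit grayscale image.

    Assumption: stands in for the contrast enhancement algorithm of [30].
    """
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest gray level present
    denom = cdf[-1] - cdf_min
    if denom == 0:                     # constant image: nothing to stretch
        return img.copy()
    # Map each gray level so that the output histogram is roughly uniform.
    lut = np.round((cdf - cdf_min) / denom * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]


def extract_roi(img: np.ndarray):
    """Crop the image to the bounding box of its nonzero pixels.

    Returns the cropped ROI and (x_s, x_d, y_s, y_d): first/last nonzero
    column (abscissa) and first/last nonzero row (ordinate).
    """
    rows = np.flatnonzero(img.any(axis=1))
    cols = np.flatnonzero(img.any(axis=0))
    if rows.size == 0:
        raise ValueError("image contains no nonzero pixels")
    y_s, y_d = int(rows[0]), int(rows[-1])
    x_s, x_d = int(cols[0]), int(cols[-1])
    return img[y_s:y_d + 1, x_s:x_d + 1], (x_s, x_d, y_s, y_d)
```

In this sketch the ROI would be cropped first and equalized afterwards, so that the large zero-valued background does not dominate the histogram; the order actually used in the paper is not stated in this section.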
The output of the first convolution layer is connected with a max-pooling layer. Then the second and third convolution/max-pooling layers are connected to one another until we have an output of size 2 × 2 × 6. The fully-connected layer has 2 × 2 × 6 = 24 neurons, which are the features for the following clustering analysis.

4) Clustering Deep Features Using US-ELM
In this paper, we use the US-ELM algorithm to cluster the deep features extracted by the previous CNN architecture. The cluster number is set to 2 and sub-region features are clustered into two categories: A. suspicious mass areas; B. non-suspicious mass areas.
When the amount of training data is small, the model obtained by supervised learning cannot satisfy the demand. Therefore, semi-supervised learning is used to enhance the effect; it can also perform some clustering tasks [31], [32]. The US-ELM algorithm belongs to this family of learning algorithms and can uncover the internal structure of an unlabeled dataset [33]. The details of this algorithm are shown in Algorithm 2. The input of the algorithm is the deep feature matrix X, and the output is the feature clustering results.

C. CLASSIFY BENIGN AND MALIGNANT MASSES BASED ON FEATURES FUSED WITH CNN FEATURES
In this subsection, a diagnosis method using an ELM classifier with fusion deep features is proposed. The main idea is to extract the deep features using CNN, and also to extract the morphological, texture and density characteristics from the breast mass area. The ELM classifier is then used to classify the fusion features and obtain the benign or malignant diagnosis result.

1) Feature Modeling
In clinical practice, breast masses are common signs of early breast disease. According to their pathological characteristics, masses are divided into two categories: malignant and benign. On one hand, CNN extracts the deep features of the masses, which can represent the essential properties of the masses. On the other hand, based on doctors' experience, malignant masses in mammography often have the following characteristics: an irregular shape with a burr-like edge, an unsmooth surface with hard nodules, and densities that differ significantly from the surrounding tissues. Benign masses, in contrast, often have a regular shape with a clear edge and a smooth surface, are rarely accompanied with
small nodules, and have uniformly distributed densities. The types of the extracted features used in this paper are listed in TABLE 1.

TABLE 1. Types of Extracted Features
Types                     Features
Deep features             20 deep features extracted from CNN
Morphological features    Roundness, normalized radius entropy, normalized radius variance, acreage ratio, roughness
Texture features          Inverse difference moment, entropy, energy, correlation and contrast coefficient
Density features          Density mean, density variance, density skew, density peak, gray density variance, gray density skew, gray density peak

The fusion features can be modeled as

$$F = [F_1, F_2, F_3, F_4] \qquad (3)$$

where $F_1$ denotes deep features, $F_2$ denotes morphological features, $F_3$ represents texture features, and $F_4$ represents density features.
Deep features: CNN has great advantages in feature extraction due to its convolution-pooling operations. It can extract the essential image characteristics without human participation [34]. A 10-layer CNN architecture used in feature extraction is shown in Fig. 6. The CNN input is the suspicious mass area. After a series of convolution/max-pooling layers, the last fully-connected layer has 2 × 2 × 5 = 20 neurons, which are the deep features denoted as $F_1 = [c_1, c_2, \cdots, c_{20}]$.

FIGURE 6. CNN architecture for mass diagnosis (four convolution/max-pooling stages followed by a fully-connected layer with 2 × 2 × 5 = 20 outputs).

Morphological features: According to experienced doctors, malignant breast masses often have irregular shapes and blurred boundaries with surrounding tissues [35]. Morphological features are important indicators to distinguish between benign and malignant masses. In this paper, we have extracted mass roundness, normalized radius entropy, normalized radius variance, acreage ratio, and roughness as the morphological features, and the corresponding model can be expressed as $F_2 = [g_1, g_2, g_3, g_4, g_5]$. TABLE 2 illustrates the detailed equations to calculate each morphological feature.
Texture features: Texture features are effective parameters that reflect the benign and malignant characteristics of the breast masses, and contribute to the early diagnosis of breast cancer. The gray-level co-occurrence matrix, which was first introduced by Haralick, is a classic gray-scale texture feature. The gray-level co-occurrence matrix describes the gray distribution over all image pixels and is based on the second-order joint conditional probability [36]. In this paper, we have extracted inverse difference moment, entropy, energy, correlation and contrast coefficient as the texture features, and the corresponding model can be expressed as $F_3 = [t_1, t_2, t_3, t_4, t_5]$. TABLE 3 illustrates the equations to calculate each texture feature.
Density features: Recent studies have shown that the density features of breast masses have a significant correlation with the benign and malignant characteristics of the masses. Using density features to predict breast cancer is also a common method [37]. In this paper, we have extracted seven density features, and the corresponding model can be expressed as $F_4 = [d_1, d_2, d_3, d_4, d_5, d_6, d_7]$. TABLE 4 presents the equations to calculate each density feature.

2) Classifier
The ELM algorithm is a single hidden layer feed-forward neural network proposed by G. Huang [38], which has a good generalization performance and fast learning speed, and is insensitive to manual parameter setup. In this paper, we use ELM as the classifier to obtain the benign or malignant diagnosis result.
The entire ELM algorithm consists of training and testing processes, and the detailed training steps are shown in Algorithm 3. First, randomly generate the weights $w_i$ and bias $b_i$ of the hidden layer. Then calculate the single hidden layer output matrix $H$ based on the parameters $w_i$, $b_i$, and the fusion features $F$. After that, obtain the output weight vector $\beta$ based on the training data labels.

Algorithm 3 ELM Training Algorithm
Input: $\mathcal{N} = \{(x_j, t_j) \mid x_j \in \mathbb{R}^n, t_j \in \mathbb{R}^m, j = 1, 2, \cdots, N\}$; $L$: number of hidden neurons; $N$: labeled dataset size
Output: Three parameters of ELM: $w$, $b$, $\beta$
for $i = 1$ to $L$ do
    randomly generate weights $w_i$ and bias $b_i$ of the hidden layer
end for
Calculate the single hidden layer output matrix $H$
Calculate the output weight vector $\beta = H^{\dagger} T$
Return $w$, $b$, $\beta$

Algorithm 4 shows the testing steps of the ELM algorithm. First, calculate the single hidden layer output matrix $H$ based on the parameters $w_i$, $b_i$, $\beta$. Then extract the fusion features of the testing image as the algorithm input, obtaining the breast cancer diagnosis result $R$.

IV. EXPERIMENT
In this section, the effectiveness of the mass detection method based on the CNN and US-ELM algorithms, and breast cancer diagnostic method based on the fusion features are
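Returning to the ELM classifier of Section III.C.2, the training and testing steps of Algorithms 3 and 4 can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' implementation: the uniform range for the random weights, the one-hot target encoding, and the names `sigmoid`, `elm_train` and `elm_predict` are assumptions. The sigmoid activation follows the setting reported in the experiments, and the output weights use the Moore-Penrose pseudoinverse that is standard for ELM.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    """Element-wise logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))


def elm_train(X: np.ndarray, T: np.ndarray, n_hidden: int,
              rng: np.random.Generator):
    """Train a single-hidden-layer ELM.

    X: (N, n) matrix of fused feature vectors, one row per mass.
    T: (N, m) target matrix (assumed one-hot: benign / malignant).
    """
    n_features = X.shape[1]
    # Random hidden-layer weights and biases; the [-1, 1] range is an assumption.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Hidden-layer output matrix H, shape (N, L).
    H = sigmoid(X @ W + b)
    # Output weights from the pseudoinverse of H applied to the targets.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def elm_predict(X: np.ndarray, W: np.ndarray, b: np.ndarray,
                beta: np.ndarray) -> np.ndarray:
    """Return the raw ELM outputs; argmax over columns gives the class."""
    return sigmoid(X @ W + b) @ beta
```

With one-hot targets, a diagnosis would be read off as `elm_predict(F, W, b, beta).argmax(axis=1)`, where `F` holds one fused feature vector per row. Only the linear system for `beta` is solved during training, which is consistent with the fast learning speed attributed to ELM above.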
B. EXPERIMENT PROCEDURES AND PARAMETERS
In the following experiments, the effectiveness of the mass detection method based on the CNN and US-ELM algorithms, and of the breast cancer diagnostic method based on the fusion features, is investigated on the above experiment dataset. The experimental procedures and parameters used in the verification process are as follows.

1) Mass Detection Experiment
In the mass detection process, the feature extraction algorithms used in the experiment are CNN, DBN, and SAE. The clustering algorithms are US-ELM and k-means. The names of the experiments are simplified, and the combination of the feature extraction method and the clustering method used in each experiment is shown in TABLE 5.

TABLE 5. Mass Detection Experiment Procedures
Experiment Name    CNN    DBN    SAE    US-ELM    k-means
CNN-US             √      -      -      √         -
CNN-K              √      -      -      -         √
DBN-US             -      √      -      √         -
DBN-K              -      √      -      -         √
SAE-US             -      -      √      √         -
SAE-K              -      -      √      -         √

The marker-controlled watershed algorithm (MCWA) [13] and adaptive thresholding algorithm (ATA) [14] have been used for comparison and to further verify the accuracy of the proposed method.
In the diagnosis of breast masses, the CNN architecture parameters are shown in Fig. 5. Other parameters include the activation function f and the learning rate η. In our experiment, the learning rate is equal to 0.003 and the activation function is tanh. When using DBN and SAE for feature extraction, we use the same activation function and learning rate. When using ELM for clustering, the parameters involved are the activation function and the number of hidden layer neurons. In the experiment, the selected activation function f is “sigmoid”, and the number of hidden layer neurons is set to L = 1000. Since the density feature is binary clustered based on US-ELM, the k value is set to 2. When clustering is performed using k-means, the parameter k is the same as in the US-ELM algorithm, i.e., 2.

2) Breast Cancer Diagnosis Experiment
In the process of breast-assisted diagnosis, the deep, morphological, texture and density features of suspected masses were extracted and used for fusion feature modeling. In the experiment, the deep feature model is mainly divided into single deep feature, double features and multi-features. Single deep feature models (SF) contain the features extracted from the CNN, DBN, and SAE algorithms only. Double features (DF) models contain deep features and one of the morphological, texture and density features, e.g., “deep feature + morphological feature (CNN-G)”. Multi-features models (MF) contain deep features and two or more other features, e.g., “deep feature + morphological feature + texture feature (CNN-GT)”. To further evaluate our proposed method, we also select the state-of-the-art algorithm mentioned in [24] as the baseline, which is abbreviated as “GARF”. Detailed feature combinations and selections are shown in TABLE 6.
In the diagnosis of breast masses, the CNN architecture parameters are shown in Fig. 6. Other parameters include the activation function f and the learning rate η. In our experiment, the learning rate is equal to 0.003 and the activation function is tanh. When using DBN and SAE for feature extraction, we use the same activation function and learning rate. When using ELM for clustering, the parameters involved are the activation function and the number of hidden layer neurons. In the experiment, the selected activation function f is “sigmoid”, and the number of hidden layer neurons is set to L = 1000. When considering the SVM classification technique, the parameters involved are the kernel function R, the penalty coefficient c, and the kernel function parameter g. RBF is selected as the kernel function R, with c = 0.5 and g = 0.0206.

C. EVALUATION METRICS
In the above experimental scheme, the mass detection method based on subdomain CNN deep features with US-ELM clustering and the mass diagnosis method based on deep feature fusion are investigated separately. The quantitative evaluation metrics of the experimental results are described as follows.

1) Detection Metrics
In this paper, Misclassified Error (ME), Area Overlap Metric (AOM), Area Over-segmentation Metric (AVM), Area Under-segmentation Metric (AUM), and Comprehensive Measure (CM) are used to evaluate the accuracy of mass segmentation in the step of breast mass detection. The detailed calculation formula of each evaluation metric is shown in TABLE 7. Smaller values of ME, AVM and AUM, and larger values of AOM and CM, correspond to a better segmentation result. However, in practical applications, AVM and AUM often cannot be optimal at the same time. ME and CM, as comprehensive measurement parameters, are often more important in the measurement process.

2) Diagnosis Metrics
Accuracy, sensitivity, specificity, TP Ratio, TN Ratio, and area under the ROC curve (AUC) are used to compare and analyze the diagnosis results of masses. The formulas of the evaluation metrics are shown in TABLE 8, and the explanation of the evaluation metrics parameters in TABLE 9 is presented in Table XIII. Among the above six metrics, the higher the values of accuracy, sensitivity, specificity, TP Ratio, TN Ratio and AUC, the more accurate the diagnosis will be. The k-fold cross validation method [39] has been used to
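As a sketch of the diagnosis metrics and of the k-fold protocol mentioned above, the following NumPy snippet computes accuracy, sensitivity and specificity from binary labels and generates k-fold index splits. The formulas are the standard confusion-matrix definitions and are assumed to match TABLE 8, which is not reproduced in this section; treating malignant as the positive class, the value of k, and the names `diagnosis_metrics` and `k_fold_indices` are likewise assumptions. TP Ratio and TN Ratio are omitted because their definitions are not given here.

```python
import numpy as np


def diagnosis_metrics(y_true, y_pred) -> dict:
    """Accuracy, sensitivity and specificity for binary labels.

    Assumption: 1 (True) marks a malignant mass, 0 (False) a benign one.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))      # malignant, predicted malignant
    tn = int(np.sum(~y_true & ~y_pred))    # benign, predicted benign
    fp = int(np.sum(~y_true & y_pred))     # benign, predicted malignant
    fn = int(np.sum(y_true & ~y_pred))     # malignant, predicted benign
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def k_fold_indices(n_samples: int, k: int, seed: int = 0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]
```

Each sample appears in exactly one test fold, so averaging the per-fold metrics uses every mammogram once for testing.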
Feature Types       Feature Models    DBN    SAE    CNN    Morphological Feature (G)    Texture Feature (T)    Density Feature (D)
Single feature      CNN               -      -      √      -                            -                      -
                    DBN               √      -      -      -                            -                      -
                    SAE               -      √      -      -                            -                      -
Double feature      CNN-G             -      -      √      √                            -                      -
                    CNN-T             -      -      √      -                            √                      -
                    CNN-D             -      -      √      -                            -                      √
Multiple feature    CNN-GT            -      -      √      √                            √                      -
                    CNN-GD            -      -      √      √                            -                      √
                    CNN-TD            -      -      √      -                            √                      √
                    CNN-GTD           -      -      √      √                            √                      √
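The CNN-based feature models in the table above amount to concatenating the selected feature blocks into one vector, in the spirit of $F = [F_1, F_2, F_3, F_4]$. A minimal sketch follows; the block sizes (20 deep, 5 morphological, 5 texture, 7 density features) come from the text, while the helper name `fuse_features`, the dictionary layout and the restriction to CNN-based models are assumptions made for illustration.

```python
import numpy as np

# Block sizes stated in the paper: 20 deep (CNN), 5 morphological (G),
# 5 texture (T) and 7 density (D) features.
BLOCK_SIZES = {"CNN": 20, "G": 5, "T": 5, "D": 7}


def fuse_features(model: str, blocks: dict) -> np.ndarray:
    """Concatenate the feature blocks named by a model string.

    model: e.g. "CNN", "CNN-G", "CNN-TD" or "CNN-GTD" (hypothetical helper;
           DBN and SAE models are not covered because their feature
           dimensions are not given in this section).
    blocks: maps "CNN", "G", "T", "D" to 1-D feature arrays.
    """
    base, _, extras = model.partition("-")
    names = [base] + list(extras)          # "CNN-GT" -> ["CNN", "G", "T"]
    parts = []
    for name in names:
        vec = np.asarray(blocks[name], dtype=float).ravel()
        if vec.size != BLOCK_SIZES[name]:
            raise ValueError(f"block {name!r} has {vec.size} features, "
                             f"expected {BLOCK_SIZES[name]}")
        parts.append(vec)
    return np.concatenate(parts)
```

Under these assumptions the full CNN-GTD model yields a 20 + 5 + 5 + 7 = 37-dimensional vector per mass, which would then be passed to the classifier.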
FIGURE 9. Collation map of evaluation indices of experiments (methods: MCWA, ATA, CNN-US-ELM; metrics: ME, AOM, AUM, AVM, CM).

2) Mass Detection Based on Fusion Feature
According to the experimental scheme described in Section III.C.2, the accuracy, sensitivity, specificity, TP Ratio, TN Ratio and AUC of the benign and malignant breast mass classification are analyzed based on three types of deep feature models (10 feature sets) and two classifiers. We also show the results of the method mentioned in [24] using our datasets. The evaluation results are shown in TABLE 10, and the ROC curves are shown in Figs. 10-12.

FIGURE 10. ROC Curve of single feature with ELM and SVM classifier (legend: CNN-ELM, DBN-ELM, SAE-ELM, CNN-SVM, DBN-SVM, SAE-SVM).

When the classifier is ELM, the accuracy, sensitivity and specificity of the CNN feature model are the best among the single deep feature models, which suggests that the CNN model chosen in this paper is the most suitable of the three. In the double feature model, the CNN deep feature model combined with the texture feature has the best performance in diagnosis accuracy, sensitivity and specificity.
When the classifier is SVM, the accuracy, sensitivity and specificity of the CNN feature model are the best among the single deep feature models, which suggests that the CNN model chosen in this paper is the most suitable of the three. In the double feature model, the CNN deep feature model combined with the texture feature has the best performance in diagnosis accuracy, sensitivity and specificity.
For the deep fusion model, comparing the evaluation metrics obtained with the ELM and SVM classifiers for benign and malignant tumor classification shows that the ELM classifier gives better diagnostic accuracy, sensitivity, and specificity. Meanwhile, the mass diagnosis method based on deep fusion features outperforms GARF [24] in diagnosis accuracy, sensitivity and specificity. Thus it is desirable to combine the deep features with ELM in the diagnosis of breast cancer.

V. CONCLUSION
This paper proposes a breast CAD method based on fusion deep features. Its main idea is to apply deep features extracted from CNN to the two stages of mass detection and mass diagnosis. In the stage of mass detection, a method based on sub-domain CNN deep features and US-ELM clustering is developed. In the stage of mass diagnosis, an ELM classifier is utilized to classify the benign and malignant breast masses using a fused feature set, fusing deep features, morphological
classify the benign and malignant of the breast mass. In this

[ROC curve figure; vertical axis: True Positive Rate; legend: CNNGD, CNNTD, CNNGTD, CNNGTS, CNNGDS, CNNTDS, CNNGTDS, GTD]

ACKNOWLEDGMENT
This research was partially supported by the National Natural Science Foundation of China (Nos. 61472069, 61402089, and U1401256), the China Postdoctoral Science Foundation
search of Intelligent Healthcare Technology, Co. Ltd. (No. NRIHTOP1802).
ZHIQIONG WANG received her M.Sc. and Ph.D. degrees in Computer Science and Technology from Northeastern University, China, in 2008 and 2014, respectively. She visited the National University of Singapore in 2010 and the Chinese University of Hong Kong in 2013 as an academic visitor. She is currently an Associate Professor with the Sino-Dutch Biomedical and Information Engineering School of Northeastern University. Her main research interests are biomedical and biological data processing, cloud computing, and machine learning. She has published more than 50 papers.
MO LI received her B.E. degree from the College of Information and Computer Engineering, Northeast Forestry University, in 2014, and her M.E. degree from the Sino-Dutch Biomedical and Information Engineering School, Northeastern University, in 2017. She is currently pursuing her Ph.D. degree in the School of Computer Science and Engineering, Northeastern University. Her main research interests are bioinformatics, machine learning, and big data management.

YUDONG YAO (S'88-M'88-SM'94-F'11) has been with Stevens Institute of Technology, Hoboken, New Jersey, since 2000 and is currently a professor and department director of electrical and computer engineering. He is also a Professor with the Sino-Dutch Biomedical and Information Engineering School of Northeastern University, and a director of Stevens' Wireless Information Systems Engineering Laboratory (WISELAB). Previously, from 1989 to 1990, he was at Carleton University, Ottawa, Canada, as a Research Associate working on mobile radio communications. From 1990 to 1994, he was with Spar Aerospace Ltd., Montreal, Canada, where he was involved in research on satellite communications. From 1994 to 2000, he was with Qualcomm Inc., San Diego, CA, where he participated in research and development in wireless code-division multiple-access (CDMA) systems. He holds one Chinese patent and twelve U.S. patents. His research interests include wireless communications and networks, spread spectrum and CDMA, antenna arrays and beamforming, cognitive and software defined radio (CSDR), and digital signal processing for wireless systems. Dr. Yao was an Associate Editor of IEEE COMMUNICATIONS LETTERS and IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, and an Editor for IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS. He received the B.Eng. and M.Eng. degrees from Nanjing University of Posts and Telecommunications, Nanjing, China, in 1982 and 1985, respectively, and the Ph.D. degree from Southeast University, Nanjing, China, in 1988, all in electrical engineering.