Professional Documents
Culture Documents
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Original article:
* Corresponding author: Dr. Alireza Vard, Assistant Professor in the Department of Bio-
electrics and Biomedical Engineering, School of Advanced Technologies in Medicine and
Medical Image and Signal Processing Research Center, Isfahan University of Medical
Sciences, Isfahan, Iran. Post Code: 81746-73461. Tel: +98 31 37923859;
Fax: +98 31 37923851; E-mail: vard@amt.mui.ac.ir, alivard@gmail.com
http://dx.doi.org/10.17179/excli2019-1292
This is an Open Access article distributed under the terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0/).
ABSTRACT
This paper presents a simple and efficient computer-aided diagnosis method to classify Chronic Myeloid Leukemia
(CML) cells based on microscopic image processing. In the proposed method, a novel combination of both typical
and new features is introduced for classification of CML cells. Next, an effective decision tree classifier is pro-
posed to classify CML cells into eight groups. The proposed method was evaluated on 1730 CML cell images
containing 714 cells of non-cancerous bone marrow aspiration and 1016 cells of cancerous peripheral blood
smears. The performance of the proposed classification method was compared to manual labels made by two
experts. The average values of accuracy, specificity and sensitivity were 99.0 %, 99.4 % and 98.3 %, respectively
for all groups of CML. In addition, Cohen's kappa coefficient demonstrated high conformity, 0.99, between joint
diagnostic results of two experts and the obtained results of the proposed approach. According to the obtained
results, the suggested method has a high capability to classify effective cells of CML and can be applied as a
simple, affordable and reliable computer-aided diagnosis tool to help pathologists to diagnose CML.
Keywords: Chronic Myeloid Leukemia (CML), blood cancer, microscopic image processing, classification, de-
cision tree classifier
382
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
type of cancer that commonly initiates in the tial diagnosis of chronic leukemia should in-
bone marrow and causes a high numbers of clude microscopic analysis of the bone mar-
abnormal white blood cells (Bain, 2008). row, the final diagnosis of CML is done based
According to the French-American-Brit- on the investigation of a peripheral blood
ish (FAB) classification, leukemia; depending smear under a microscope. In CML, the bone
on type of effective WBCs, is divided in two marrow generates too many abnormal white
groups: 1- myeloid and 2-lymphocytic cells, called myeloid series (CML cells). The
(Coppola, 2010). These two groups are the developmental stages of the CML cells, from
further sub-classified into acute and chronic immature to mature include: Myeloblasts,
types. The acute leukemia progresses more Promyelocytes, Myelocytes, Metamyelo-
rapidly than chronic leukemia and requires cytes, Bands and Neutrophils (Zittoun et al.,
immediate treatment. In summation, the four 1994). The characteristics of all types of
main types of leukemia are: Acute Lympho- CML cells are summarized in Table 1. Mye-
cytic Leukemia (ALL), Acute Myelogenous loblasts are the most immature type of mye-
Leukemia (AML), Chronic Lymphocytic loid cells. The nuclei in these cells are large
Leukemia (CLL) and Chronic Myelogenous with a round or oval shape, fine chromatin
Leukemia (CML). CML is a type of leukemia and can have several nucleoli, whilst, the
that affects the blood and bone marrow. This cell’s cytoplasm is basophilic, and lacking in
disease usually occurs after adulthood and granules. Additionally, the area of the nucleus
rarely occurs in childhood. Although the ini- to the area of the cytoplasm ratio (N:C) is
more than 80 % in Myeloblasts.
Number of 1
Nucleus Nuclear N:C Cytoplasm
Cell Type nucleus Granules Nucleoli
Shape chromatin ratio % Color
lobes (Num)
Myeloblast
None or some
Round or ~80 %
1 Fine Blue primary 1 or more
oval
granules
Promyelocyte Less
Round or Coarser than a basophilic Primary
1 ~70 % 1 or more
oval Myeloblast than granules
Myeloblast
Myelocyte
More coarse
Round to Secondary
1 than ~60 % Acidophilic None
oval granules
Promyelocyte
Metamyelocyte
Secondary
Indented 1 Usually clumped ~40 % Pink None
granules
Band
Band Coarse and ~[30 % ‐ Secondary
1 Pink None
shaped clumpy 40 %] granules
Neutrophil
~[20 % ‐
Multi‐lobed 1 or more Clumped Pink Fine None
30 %]
1
Area of nucleus to area of cytoplasm ratio (N:C)
383
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Promyelocyte cells are a little bigger than (Sarrafzadeh et al., 2014; Habibzadeh et al.,
Myeloblasts with round or oval shaped nuclei, 2012; Putzu et al., 2014) six types of imma-
smooth chromatin and the possible presence ture WBCs (CML cells; Myeloblast, Promye-
of nucleoli. The cytoplasm’s of promyelo- locyte, Myelocyte, Metamyelocyte, Band and
cytes are blue with numerous azurophilic Neutrophil) (Theera-Umpon and Dhompong-
granules; and the N:C is more than 70 %. sa, 2007; Ramirez-Cortes et al., 2010; Theera-
Myelocytes are typically smaller in size Umpon, 2005) plasma cells in myeloma
compared to Promyelocytes. Their nuclei are (Saeedizadeh et al., 2016) and effective cells
round to oval and lack the presence of a nu- in diagnosis other leukemia (CLL (Alférez et
cleolus. The chromatin in Myelocytes is com- al., 2014), AML and ALL (Laosai and
parably coarser than the chromatin in Pro- Chamnongthai, 2014; MoradiAmin et al.,
myelocytes and the cytoplasm of Myelocytes 2016)).
are typically light blue in color with second- In all these image processing methods for
ary granules. Furthermore, the N:C in Myelo- cell classification, usually a four-stage proce-
cyte is more than 60 %. dure has to been implemented. The four
Metamyelocytes are considerably smaller stages are: 1-preprocessing, 2-segmentation,
than Myelocytes with a kidney shaped nu- 3-feature extraction and 4-classification. The
cleus and partly dense chromatin. They have main tasks of these stages and some image
a pink cytoplasm with numerous secondary processing method used in the previous work
granules and a N:C of more than 40 %. are summarized as follows:
Band cells have even less size than a Preprocessing: this stage is designed to
Metamyelocytes, a curved and elongated nu- enhance contrast of cells rather than back-
cleus with parallel sides and coarse chroma- ground; while reduce noise and clutter of
tin, cytoplasm’s similar to Metamyelocytes images. To contrast enhancement
and a N:C typically between 30 % and 40 %. (Gonzalez and Woods, 2008), techniques
Finally, Neutrophils are the most mature such as histogram stretching or equaliza-
type of cells in the myeloid series that have tion (Sabino et al., 2004) and Gram-
lobulated Nuclei and dark purple stains with Schmidt orthogonalization (Tabrizi et al.,
large dense chromatin. The cytoplasm’s of 2010) have been used. In others, to reduce
Neutrophils are pale pink/tan with fine pink noise, various image denoising methods
purple granules and a N:C between 20 % and such as median filter (Ghane et al., 2017;
30 %. Ross et al., 2006) and Wiener filter have
The screening of prepared blood samples been utilized.
for cell counting and classification conducted Segmentation: in this stage, image seg-
by a pathologist is currently the main action mentation algorithms and methods such
taken in the diagnosis leukemia. This manual as thresholding (Wu et al., 2006), water-
process is however; tedious, slow, time-con- shed (Ghane et al., 2017; Jiang et al.,
suming, and is largely dependent on experi- 2003), or active contour models (Zamani
enced experts in this field (Roussel et al., and Safabakhsh, 2006; Vard et al., 2008)
2010). Therefore, computer-aided methods are applied semi- or full-automatically to
that autonomously partially or fully perform extract cells from background. These
some steps of this process, can be very useful methods are also employed to separate
and helpful for experts and researchers cell components, nucleus and cytoplasm,
(Fatichah et al., 2012). from each other.
In the past years, various computer-aided Feature extraction: after segmentation, for
approaches based on image processing meth- each extracted cell, suitable image fea-
ods have been presented to classify five types tures that are able to discriminate each
of mature WBCs (Neutrophils, Eosinophils, group of cells, are extracted from the cell,
Basophils, Lymphocytes and Monocytes) its nucleus and/or cytoplasm. The main
384
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
image features usually utilized in previous diagnostic time (Putzu et al., 2014). Some of
work include: geometric features such as them (Ramirez-Cortes et al., 2010; Saeedi-
area, perimeter, compactness and eccen- zadeh et al., 2016) have only been evaluated
tricity (MoradiAmin et al., 2016); statisti- on a few numbers of images or cells and
cal information such as mean, standard sometimes the reported results of them did not
deviation and skewness; color, such as have high performance (Theera-Umpon and
RGB and L*a*b*(Sarrafzadeh et al., Dhompongsa, 2007). In addition, most of
2014) and texture features such as Gray- these approaches have just been evaluated and
Level Co-Occurrence Matrix (GLCM) compared only by opinion of one expert
(Sabino et al., 2004) and local binary pat- (Sarrafzadeh et al., 2014; Habibzadeh et al.,
tern (LBP)) (Agaian et al., 2014). 2012; Putzu et al., 2014; MoradiAmin et al.,
Classification: in the final stage, classifi- 2016).
cation methods such as SVM (Putzu et al. In this paper, we propose a simple and ef-
2014; Tabrizi et al., 2010), Random For- ficient method based on image processing
est (RF) (Mishra et al., 2017) and Feed- techniques for classification of effective cells
forward Neural Networks (FFNN) in CML (Myeloblast, Promyelocyte, Myelo-
(Theera-Umpon and Dhompongsa, 2007) cyte, Metamyelocyte, Band and Neutrophil)
are used on the set of calculated features from microscopic images of both bone mar-
in the previous stage to determine the type row and peripheral blood. In the proposed
of each cell. method, after automatic segmentation of
For example (Theera-Umpon and WBCs, the cells and their nuclei are extracted.
Dhompongsa, 2007), the classification of Then, a new combination of features based on
CML cells (Myeloblasts, Promyelocytes, three typical features and three new features
Myelocytes, Metamyelocytes, Bands and are extracted from them. Finally, a novel de-
Neutrophils) have been considered. In this cision tree is implemented for classification of
work, nuclei of WBCs have been manually CML cells. The main contribution of this pa-
segmented. Next, texture and geometric fea- per is as follows:
tures, based on the pattern spectrum and the Database: in this project, 540 images have
area, have been extracted from them. Finally, been gathered from 25 patients and 1730
ANN (Artificial neural network) was applied WBCs have been extracted and classified
to classify each cell into the six groups of from these images.
CML cells. This algorithm has been tested on Feature Extraction: three new geometric
431 cells and the average of accuracy, re- features have been calculated and ex-
ported in this paper, has been about 77 %. tracted from WBCs that include: 1- Mini-
Ramirez-Cortes et al. (2010) also con- mum thickness of Nucleus Mask to mini-
ducted the segmentation of WBCs manually mum convex thickness of Nucleus Mask
for classification of the six groups of CML ratio, 2- Minimum thickness of Nucleus
cells, followed by the extraction of some tex- Mask to Hausdorff distance between the
ture and geometric features including pattern nucleus and cell ratio and 3- Perimeters of
spectrum, normalized area, and nucleus to cy- smaller nucleus after splitting.
toplasm area ratio from 54 nuclei of WBCs. Classification: a simple and efficient deci-
In this paper, the average of accuracy, 87.6 %, sion tree classifier has been introduced to
has been obtained by a Feed-forward Neural classified CML cells to the eight groups
Networks (FFNN) classifier as the best result. (including two extra groups added to the
Most of the previous studies for classifi- six main groups as overlapping comple-
cation of WBCs based on image processing mentary groups).
methods have some issues and limitations. Validation: In this study, WBCs have
For example, some of these approaches need been labeled by two pathologists in differ-
to calculate many image features that increase
385
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
ent times. Then, in order to assess pro- it is converted from RGB color space (Figure
posed method, different evaluation pa- 2a) to CMYK color space and the Y compo-
rameters have been calculated to compare nent (Figure 2b) of it is enhanced as follows:
the results of our algorithm with diagnos-
tic results of two experts. EI = 2×L+H (1)
The remainder of this paper is organized
as follows: In “MATERIALS AND METH- where EI denotes the enhanced image, L
ODS” Section, the proposed method is intro- and H are linear contrast stretching and histo-
duced. Experimental data and validation gram equalization of Y component respec-
methods are explained in ‘‘EXPERI- tively. After enhancing the contrast (Figure
MENTAL DATA AND VALIDATION 2c), a 3×3 minimum filter is employed on this
METHODS’’ Section. The experimental re- image three times to decrease noise and clut-
sults are presented and analyzed in ‘‘RE- ter of the image. Next, the Otsu (1979) thresh-
SULTS AND DISCUSSION’’ Section. Fi- olding method is used to extract binary mask
nally, the paper is concluded in ‘‘CONCLU- of WBCs, and then some morphological op-
SIONS’’ Section. erations including opening followed by clos-
ing and fill holes (Gonzalez and Woods,
MATERIALS AND METHODS 2008) are performed on the binary mask of
cells to clean up it (Figure 2d). This binary
Classification of WBCs is one of the fun- mask is multiplied to the main RGB image
damental tasks in the microscopic image anal- and a cell image in this color space is obtained
ysis. To classify WBCs of CML cells, we pro- (Figure 2e). Since the components of RGB
pose a three-part algorithm including auto- image are highly correlated, the cell image is
matic segmentation, feature extraction and converted from RGB color space to L*a*b*
classification. A schematic block diagram of color space (Figure 2f). L*a*b* color system
the proposed method is presented in Figure 1 has three independent components that the in-
and each part of this algorithm is described in tensity is represented by lightness (L*) and
the following of this section. two color components are denoted by (a*) and
(b*). As a cell’s nucleus can be better discrim-
A: Segmentation inated from its cytoplasm in the a* component
In order to extract cells and segment nu- rather than two other components in L*a*b*
cleus and cytoplasm of them, we apply a sim- color space, we select this component, and
ple and efficient method introduced in our then employ a Gaussian filter with standard
previous work (Ghane et al. 2017). In this deviation, =33, on it to decrease noise and
method, to increase the contrast of the image, clutter of the image. Next, in order to extract
386
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 2: Results of automatic segmentation: (a) Original RGB image; (b) Y sub-band; (c) filtered
image of EI Image; (d) Binary image of cells after thresholding and applying morphological operations;
(e) RGB image of Cells; (f) L*a*b* image; (g) Binary image of nuclei (Nucleus Mask) after applying
morphological operations; (h) the RGB image of Nucleus Mask;(i) output of watershed on (d) (Mask
Cell).
nucleus, a suitable normalization method remove them, the regions which have the area
(Rezatofighi and Soltanian-Zadeh, 2011) is lower than nucleus are disregarded. The result
applied on the filtered image. After that, again of applying nucleus segmentation has been
the Otsu (1979) thresholding method is used presented in Figure 2g and the RGB image of
to extract nucleus. Applying this threshold, the Nucleus Mask has been shown in Figure 2h.
mask of nucleus (Nucleus Mask) is obtained. At the following, in order to split touching
Then, some morphological operations cells and obtaining the mask of each cell indi-
(Gonzalez and Woods, 2008) are performed vidually, the modified watershed transform
on the Nucleus Mask to clean up it. In this re- based on gradient method(Bala, 2012) is used
gard, at first, a morphological closing operator (Figure 2i). Finally, the Nucleus Mask is sub-
(with10-pixel-diameter disk) is used to join tracted from the Cell Mask to obtain the Cyto-
disconnected pixels. Next, a morphological plasm Mask.
opening operator (with 2-pixel-diameter disk)
is applied to remove small objects connected B: Feature extraction
to the mask. After that, a morphological hole- In the next stage, some appropriate fea-
filling operator is utilized to fill inside of seg- tures are determined from the segmentation
mented regions. Usually, in the nucleus mask, results of the nucleus and cytoplasm for each
there are some false objects; that in order to
387
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
WBC. In this regard, after considering charac- the portion of overlapping groups. Moreover,
teristics of all types of CML cells according to it can indicate symmetry and skew in the data
Table 1, we calculated various image features and identify outliers. The general structure of
similar to the previous work including: geo- a boxplot shows in Figure 9a.
metric, statistical, color and texture features. Our experimental results after performing
Then, in order to select the suitable features PCA, ANOVA and boxplot on the typical fea-
that able to properly discriminate each group tures used in the previous papers show that
of CML cell, we performed some feature se- they are not very appropriate features for clas-
lection methods by three statistical analyses sification of WBCs in CML cells. Therefore,
including: one-way analysis of variance in this work, we propose three new geomet-
(ANOVA (Roscoe, 1975)), Principal compo- rical features and combined them with three
nent analysis (PCA (Smith, 2002)) and Box- typical image features to select them as the
plot (Frigge et al., 1989). These statistical suitable features for classification. Three typ-
analyses are described as follows: ical features include: 1- number of nucleus
lobes (Num), 2- area of nucleus (AoN) and 3-
ANOVA
average color of cytoplasm of b* component;
ANOVA is a statistical test used to com-
also new geometrical features consist of: 1-
pare mean values of variations of more than
minimum thickness of Nucleus Mask to min-
two groups. In this test, the discrimination ca-
imum convex thickness of Nucleus Mask ratio
pability of each individual feature is deter- T1
mined with a p–value, where a less p–value ( ), 2- minimum thickness of Nucleus Mask
T2
indicates that the corresponding feature can to Hausdorff distance (Huttenlocher et al.,
properly separate at least one group from oth- 1993) between border of nucleus and Cell
ers. In practice, p–value less than 0.05 are T1
Mask ( ) and 3- perimeter of smaller nucleus
clinically considered significant (Acharya et HD
after splitting (P1).
al., 2014).
PCA Typical features
PCA is a statistical procedure that uses an
orthogonal transformation to convert a set of 1-Number of nucleus lobes (Num)
features of possibly correlated variables into a After separation of the nucleus from a
set of features of linearly uncorrelated varia- WBC, counting separated objects in the Nu-
bles called principal components (PCs). To cleus Mask is determined by the number of
choose a reasonable number of PCs, we ran nucleus lobes. Figure 3a shows one neutrophil
PCA with different number of PCs and com- cell with three lobes.
puted a PC score each time. Finally, compo-
nents including more than 99.9 % of the fea- 2-Area of Nucleus (AoN)
tures were selected. The area of an object in a binary (black
Boxplot and white) image can be defined by counting
In descriptive statistics, a boxplot is a all white pixels inside the object (Nucleus).
graphical representation of groups based on
the five statistical features of each group. 3-Average Color of Cytoplasm (ACoC)
These five statistical features consist of mini- As seen in Table 1, cytoplasm’s color var-
mum, maximum, median, lower quartile and ies during maturity stages. Thus, we compute
upper quartile of data in each group. This the average color of cytoplasm in b* compo-
compact graph is a proper way to summarize nent of L*a*b* color space as an appropriate
the distribution of each group of numerical statistical feature.
data. Also, this graph can be used to determine
388
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 3: (a) Neutrophil with nuclear lobes, (b) Display T1 and T2, (c) A pair of points incorrectly
determined for splitting by the original bottleneck algorithm, (d) Result of applying the proposed algo-
rithm to correct the Step 1 of bottleneck algorithm, (e) RGB image of nucleus with an extra object, (f)
Binary image of e, (g) Nucleus after removing extra object, (h) Display T1 and HD, (i) Binary image
of Band nucleus after splitting, (j) Binary image of Neutrophil nucleus after splitting
389
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 3c), the original bottleneck algorithm using an area filter, then apply the bottleneck
incorrectly determines a pair of points for algorithm again. Figure 3g displays the main
splitting. To overcome this problem, we pre- nucleus and two points after using the pro-
sent a novel algorithm (Figure 4a) to modify posed algorithm in order to split.
this method for finding correct points. Ac-
cording to the block diagram of this figure, we 2- Minimum thickness of Nucleus Mask to
first apply Step 1 of the original bottleneck al- Hausdorff distance between nucleus and cell
gorithm on the Nucleus Mask image. Next, we ratio (
T1
technique. According to this technique, two Our experimental results demonstrate that
obtained points are acceptable if there isn't a minimum thickness of Nucleus Mask (T1) to
black pixel between them. Otherwise two Hausdorff distance (HD) between border of
points are removed and the bottleneck algo- Nucleus Mask (U) and border of Cell Mask
rithm is employed again. The result of apply- (V) ratio in three groups (Myeloblast, Pro-
ing the proposed algorithm is shown in Figure myelocyte and Myelocyte) of CML cells is
3d. different and we can use this feature as a wor-
In the Step 2 of the bottleneck algorithm, thy geometrical feature to classify these three
sometimes the result of splitting process is not groups (Figure 3h). Suppose
acceptable because of an extra object (F in , ,…, and , ,…, are
Figure 3f), connected to the nucleus in the two sets of points that determine border of Nu-
RGB image (Figure 3e). To solve this prob- cleus Mask and border of Cell Mask respec-
lem, we employ a novel algorithm, after split- tively. The distance between the point ∈
ting, shown as a block diagram in Figure 4b. and border is defined as:
As seen in this figure, where the area of the , min‖ ‖ (2)
big nucleus is 3.8 times more than the small
nucleus, and the area of two nucleus (AE+AF)
is more than 74000 (pixels), splitting is ac-
ceptable. Otherwise, we remove small nucleus
Figure 4: Block diagram of the proposed algorithm for modifying of bottleneck algorithm
390
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
391
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 5: Block diagram of proposed decision tree classifier for classification of CML cells into 8
groups. Myeloblast (MB), Promyelocyte (PM), Myelocyte (M), Metamyelocyte (MM), Band (B), Neu-
trophil (N), Myeloblast or Promyelocyte (MB | PM) and Promyelocyte or Myelocyte (PM | M)
392
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
κ (7)
1
(8)
(9)
393
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 6: Sample cells of non-cancerous bone marrow aspirations from our dataset
Figure 7: Sample cells of cancerous peripheral blood smear from our dataset
RESULTS AND DISCUSSION bone marrow aspiration and 192 cells of can-
cerous peripheral blood smear. The remaining
In this project, our dataset was provided in
1443 cells including 619 WBCs of non-can-
several phases during more than one year. In
cerous bone marrow aspiration and 824 CML
this regard, the feature selection and parame-
cells of the cancerous peripheral blood smear
ter setting in the decision tree classifier were
were utilized for testing and generalizing.
performed on only 95 cells of non-cancerous
394
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
To achieve the desired results, we planned was used for SVM classifier. To evaluate the
and performed our experimental procedure in classification results, we divided our primary
five steps as follows: dataset into 80 % for training and 20 % for
testing. Also, for validation, we used K-fold
Step 1: Classification of non-cancerous bone cross-validation with K = 10. Then, the eval-
marrow aspirations using typical features and uation parameters based on the outcomes of
SVM Classifier the classifier and the joint opinion of expert 1
and expert 2 (joint (E1, E2)) were calculated
In the first step, 119 WBCs of non-cancer-
that the average values of specificity, sensitiv-
ous bone marrow aspirations were considered.
ity and accuracy were 56 %, 78 % and 76 %
After considering of the previous papers indi-
respectively.
vidually (Putzu et al., 2014; MoradiAmin et
At the following, we exploited feature se-
al., 2016; Saeedizadeh et al., 2016;
lection methods to remove correlation be-
Rezatofighi and Soltanian-Zadeh, 2011;
tween features, reduce the feature vector size
Sarrafzadeh et al., 2014) in order to improve
and increase the performance of the classifier.
the classification results, we combined fea-
In this regards, ANOVA and PCA methods
tures of these papers and extracted finally 33
were applied to select the suitable features for
typical features. These features included
classification. For this purpose, at first, feature
eleven geometric features of in CML cells (1-
selection using ANOVA was used. The results
area of cell, 2- area of nucleus, 3- nucleus area
to cell area ratio, 4- perimeter of cell, 5- pe- of this method are portrayed as a plot between
rimeter of nucleus, 6- solidity of nucleus, 7- feature index and p–value in Figure 8a. This
eccentricity of nucleus, 8- minor axis length figure can indicate significant features that
of nucleus, 9- elongation of nucleus, 10- form able to discriminate different groups. The fea-
factor of nucleus, 11- the average of Fourier tures with p–values less than 0.05 can be used
coefficients), six statistical features of cell ob- to discriminate the six classes with higher ac-
tained from G component of RGB color space curacy. Using of ANOVA, we found that 26
(12- mean , 13- standard deviation , 14- features from the 33 features are statistically
smoothness, 15- uniformity , 16- third central significant; and are capable to classify mye-
moment and 17- entropy) and six statistical loid series into Myeloblast, Promyelocyte,
features of nucleus from the G component of Myelocyte, Metamyelocyte, Band and Neu-
RGB color space (18- mean, 19- standard de- trophil classes.
viation, 20- smoothness, 21- uniformity, 22- Then, we employed the SVM classifier
third central moment and 23- entropy), six sta- that the average values of sensitivity, specific-
tistical features of cytoplasm calculated from ity and accuracy for the six groups were ob-
the b* component of L*a*b* color space (24- tained 60 %, 93 % and 88 % respectively.
mean, 25- standard deviation, 26- smooth- These evaluation parameters revealed that ap-
ness, 27- uniformity, 28- third central moment plying only ANOVA method does not lead to
and 29- entropy); also two texture features the desired classification results.
based on GLCM matrix of nucleus (30- con- Following this step, the PCA technique
trast and 31- corrolotion) and two texture fea- was used on 26 selected features after
tures based on GLCM matrix of cytoplasm ANOVA test. To choose a reasonable number
(32- contrast and 33- corrolotion) from b* of PCs, PCA was run with a different number
component of L*a*b* color space. of PCs ranging from 1 to 12; each time, a PC
After that, the SVM classifier was utilized score was computed. According to Figure 8b,
for categorizing of the cells. SVM is one of a PC score for 12 features of the 26 features
the powerful classification methods that typi- include more than 99.9 % of features and can
cally are used in the previous work. For this be an adequate choice for classification of the
study, according to (Saeedizadeh et al., 2016) six groups of CML cells. After using SVM
a radial basis functions kernel with sigma = 3 classifier on these 12 features, average values
395
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 8: Result of simulation in step 1: (a) Plot between feature index and p–value for showing feature
significance and (b) PCA performance based on number of principle components
of 87 % for sensitivity, 97 % for specificity Neutrophil). Also, the T1/HD box plots (Fig-
and 96 % for accuracy were achieved for the ure 9c) show obvious differences among
six groups. The evaluation parameters show Myeloblast, Promyelocyte and Myelocyte.
that applying PCA after ANOVA can improve Furthermore, as shown in the box plots of Fig-
the results of classification, however the sen- ure 9d, there is discrimination between Band
sitivity has not been satisfactory yet. and Neutrophil in the P1 feature. Moreover,
Neutrophile cells can be recognized where
number of nucleus lobes were more than one
Step 2: Classification of non-cancerous bone (Num>1).
marrow aspirations using proposed features The SVM classifier was then applied for
and SVM classifier classification of the six groups of CML cells
In this step, we attempted to introduce a and achieved the average values of specificity,
few reliable set of features that better distin- 92 %, sensitivity, 98 %, and accuracy, 97 %.
guish each type of CML cells. For this pro- The evaluation parameters show that applying
pose, we evaluated various number of fea- proposed features have improved the results
tures, used in the previous work, and some of classification using fewer numbers of fea-
proposed new features by boxplot. Next, by tures.
comparing and considering these boxplots, we
selected four proposed features (T1/T2 fea- Step 3: Classification of non-cancerous bone
ture, T1/HD feature, P1 feature and number of marrow aspirations using proposed features
nucleus lobes (Num)) as the suitable features and new decision tree classifier
for classification of CML cells. In this step, as in the previous step, we
The results of selected features using the used 119 CML cells of non-cancerous bone
box plot have been displayed in Figure 9b-d. marrow aspiration and four proposed features.
As observed from the box plots of Figure 9b, However, in order to obtain the desired re-
there are significant differences using the sults, we employed the new decision tree clas-
T1/T2 feature, among three groups (Myelo- sifier, presented in the previous section. In the
blast, Promyelocyte and Myelocyte), Meta-
myelocyte group and two groups (Band and
396
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 9: Display of the best features using boxplot for CML cell classification of non-cancerous bone
marrow aspirations. (a) Description of a box plot, (b) "T1"/"T2" (the minimum thickness of Nucleus
Mask to minimum convex thickness of Nucleus Mask ratio), (c) "T1"/"HD" (the minimum thickness of
Nucleus Mask to minimum convex thickness of Nucleus Mask ratio), (d) P1 (perimeter of smaller
nucleus after splitting)
training stage, for 95 cells, we set up all kappa coefficient between joint (E1, E2) and
thresholding values for the classifier in order proposed automatic algorithm (Auto_alg) was
to obtain the best results (100 % for the spec- achieved 0.97.
ificity, sensitivity and accuracy). After
achieving the desired results, we tested our al-
gorithm on 24 WBCs and compared the clas- Step 4: Classification of cancerous peripheral
sification results to the joint opinion of two blood smear using proposed features and new
experts that the average values of specificity, decision tree classifier
sensitivity and accuracy were obtained 100 %, Since, the acceptable results for the classi-
100 % and 100 % respectively. fication of non-cancerous bone marrow in the
At the following, for generalization, we step 3 were obtained; in this step, we em-
applied our algorithm on a new dataset includ- ployed the same four proposed features and
ing 595 CML cells of non-cancerous bone the same decision tree classifier, used in pre-
marrow aspiration and compared the classifi- vious step, on 240 CML cells of the cancerous
cation results to the joint opinion of two ex- peripheral blood smear. The average values of
perts. The average values of specificity, sensi- sensitivity, specificity and accuracy for the six
tivity and accuracy are attained 96 %, 98 % groups using our classifier were obtained
and 98 % respectively. Moreover, Cohen’s 89 %, 92 % and 91 % respectively. Then, in
397
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
order to achieve better results, at first, we at- and AON) to our feature set. The results of se-
tempted to modify the parameters of the clas- lected features using box plot have been dis-
sifier, but the results did not substantially played in the Figure 10a-e.
change. Next, we added two features (ACOC
Figure 10: Display of the best features using boxplot for classification of CML cells. (a) "T1"/"T2" (the
minimum thickness of Nucleus Mask to minimum convex thickness of Nucleus Mask ratio), (b) ACoC
(average color of cytoplasm, (c) "T1"/"HD" (the minimum thickness of Nucleus Mask to minimum convex
thickness of Nucleus Mask ratio), (d) AoN (area of nucleus) and (e) P1 (perimeter of smaller nucleus
after splitting)
398
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
As seen in the box plots of Figure 10a, Finally, in order to set up classifier param-
there are substantial differences using the T1/ eters related to two new features and two extra
T2 feature, between three groups (Myeloblast, groups, in the training stage, for 192 cancer-
Promyelocyte and Myelocyte) and other ous cells, we tuned the related thresholding
groups (Metamyelocyte, Band and Neutro- values of the classifier to get the best results
phil). The ACoC box plots (Figure 10b) indi- (100 % for the specificity, sensitivity and ac-
cates considerable variations between Myelo- curacy). After reaching the appropriate re-
blast, Promyelocyte and Metamyelocyte; so sults, we tested our algorithm on 48 remaining
this feature can be used for separating them. WBCs of cancerous cells and compared the
Also, T1/HD box plots (Figure 10c) show a classification results to the joint opinion of
notable difference between Myeloblast and two pathologists that the average values of
Promyelocyte and AoN box plots (Figure specificity, sensitivity and accuracy were
10d) can be employed for separating Promye- achieved 100 %, 100 % and 100 %, respec-
locyte and Myelocyte. Furthermore, from the tively, for the eight groups of the CML cells.
box plots of Figure 10e, we can find clear dis- After that, for generalization, we utilized
crimination between Band and Neutrophil in the proposed classification method on a new
the P1 feature. data including 776 CML cells of the cancer-
Additionally, we added two extra groups ous peripheral blood smear and compared the
to the six main groups, because our experts' classification results to the joint opinion of
investigation revealed that in this series of two experts. The results of the automatic al-
cells, there can be an overlap between Myelo- gorithm versus joint (E1, E2) have been
blast and Promyelocyte (MB|PM); also an shown in the Figure 11. According to this fig-
overlap between Promyelocyte and Myelo- ure, calculated performance parameters for all
cyte (PM | M). of 8 groups are above 99 %. Also, Cohen's
kappa coefficient between joint (E1, E2) -
Auto_alg pairing is attained 0.99. These wor-
thy results confirm validity of proposed algo-
rithm to classify CML cells of the cancerous
peripheral blood smear as well.
Figure 11: Results of automatic algorithm (Auto_alg) versus joint opinion between Expert1 and Ex-
pert2 (joint (E1, E2)) for classification of cancerous peripheral blood smear. MB (Myeloblast), PM
(Promyelocyte), M (Myelocyte), MM (Metamyelocyte), B (Band) and N (Neutrophil)
399
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Step 5: Classification of both non-cancerous 99.4 % and 98.3 % for all of the eight groups.
bone marrow aspirations and cancerous pe- Consequently, as can be seen from Figure 12,
ripheral blood smear using proposed features the results of our method verify reliability and
and new decision tree classifier validation of proposed algorithm to classify
CML cells.
In the final step, we utilized proposed de-
Furthermore, in order to determine the
cision tree classifier (Figure 5) on all of our
level of agreement between the classification
testing set (1443 cells) including 619 WBCs
results of proposed method and the manually
of non-cancerous bone marrow aspiration and
labeling by two pathologists (E1, E2 and joint
824 CML cells of the cancerous peripheral
(E1, E2)), we computed the Cohen's kappa co-
blood smear to classify them into the eight
efficient. Moreover, this coefficient was de-
groups.
termined between two experts (E1 and E2).
After classification, to validate the effec-
The obtained Cohen's kappa coefficients are
tiveness of the proposed algorithms, we calcu-
presented in Table 2. According to this table,
lated sensitivity, specificity and accuracy as
Cohen’s kappa coefficient has been 0.97 be-
the performance measures between the manu-
tween E1 and E2 pairing while between
ally labeling (E1, E2 and joint (E1, E2)) and
Auto_alg and joint (E1, E2) pairing is very
the results obtained by the proposed classifier
close to 1 (0.99). Also, this coefficient demon-
(Auto_alg) for our dataset. These evaluation
strates level of agreement between the auto-
parameters for Auto_alg versus joint (E1, E2),
matic algorithm and Expert 1 (0.98) is more
Auto_alg versus E1 and Auto_alg versus E2
than the automatic algorithm and Expert 2
have been shown in Figure 12a, Figure 12b
(0.97). As a result, calculated Cohen’s kappa
and Figure 12c respectively. The results be-
coefficients verify conformity between the re-
tween Auto_alg and joint (E1, E2), show that
sult of the automatic algorithm and two ex-
the average values of accuracy, specificity and
perts.
sensitivity respectively have been 99 %,
E1-E2 0.97
Auto_Alg: Result of the automatic algorithm, E1: Opinion of Expert 1, E2: Opinion of Expert 2 and Joint (E1, E2): Joint opinion
between E1 and E2
400
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Figure 12: Results of automatic algorithm (Auto_alg) for classification of both non-cancerous bone
marrow aspirations and cancerous peripheral, versus (a) Joint opinion between Expert1 and Expert2
(Joint (E1, E2)), (b) Opinion of Expert1 and (c) Opinion of Expert2. MB (Myeloblast), PM (Promyelo-
cyte), M (Myelocyte), MM (Metamyelocyte), B (Band) and N (Neutrophil)
401
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
402
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
403
EXCLI Journal 2019;18:382-404 – ISSN 1611-2156
Received: March 20, 2019, accepted:June 04, 2019, published: June 14, 2019
Wang H, Zhang H, Ray N. Clump splitting via Zamani F, Safabakhsh R. An unsupervised GVF snake
bottleneck detection and shape classification. Pattern approach for white blood cell segmentation based on
Recognit. 2012;45:2780-7. nucleus. In: 8th International Conference on Signal
Processing, 2006. New York: IEEE, 2006.
Wu J, Zeng P, Zhou Y, Olivier C. A novel color image
segmentation method and its application to white blood Zittoun R, Rea D, Ngoc LH, Ramond S. Chronic
cell image analysis. In: 8th International Conference on neutrophilic leukemia. Ann Hematol. 1994;68:55–60.
Signal Processing, 2006 (pp 4-7). New York: IEEE,
2006.
404