Abstract: Considering the importance of early diagnosis of breast cancer, a supervised patch-wise texton-based approach has
been developed for the classification of mass abnormalities in mammograms. The proposed method is based on texture-based
classification of masses in mammograms and does not require segmentation of the mass region. In this approach, patches from
filter bank responses are utilised for generating the texton dictionary. The methodology is evaluated on the publicly available
Digital Database for Screening Mammography database. Using a naive Bayes classifier, a classification accuracy of 83% with
an area under the receiver operating characteristic curve of 0.89 was obtained. Experimental results demonstrated that the
patch-wise texton-based approach in conjunction with the naive Bayes classifier provides an efficient alternative approach
for automatic mammographic mass classification.
IET Comput. Vis., 2018, Vol. 12 Iss. 8, pp. 1060-1066
© The Institution of Engineering and Technology 2018
Laplacian of Gaussian (LoG) filters. The edge and bar filters are obtained at six orientations and
three scales [(σx, σy) = (1, 3), (2, 6), (4, 12)]; the Gaussian filter uses σ = 10 and the three LoG
filters use σ = 3, 10, 12. The other 13 filters are obtained from the S filter bank, including
various isotropic Gabor-like filters [24] of the form
$$F(r, \sigma, \tau) = F_0(\sigma, \tau) + \cos\left(\frac{\pi \tau r}{\sigma}\right) e^{-(r^{2}/2\sigma)} \qquad (1)$$
with values for the (σ, τ) pair set as (2,1), (4,1), (4,2), (6,1), (6,2), (6,3), (8,1), (8,2), (8,3),
(10,1), (10,2), (10,3) and (10,4). These filters are rotationally invariant, isotropic and
anisotropic, multi-scale and multi-orientation. With such characteristics, they can generate
appropriate features for various types of textures. The RoIs were then convolved with each of the
filters (see Section 3.1). The resulting filter responses on one of the sample patches from the input
dataset are shown in Fig. 3. The filter responses were normalised to zero mean and unit variance.
Experiments showed that this pre-processing improved the overall classification results.
Fig. 3 Process of generating filter responses for a sample image. Fifty-three filters (from the MR and S filter banks) are shown on the left side of the image.
The middle image shows a sample image that has been convolved with the filters and the results are shown on the right
Fig. 4 Process of generating histograms from a test image. Each red box within one particular filter response corresponds to a 7 × 7 patch that has been
assigned to the most similar texton from the texton dictionary (Fig. 1). Finally, a frequency histogram is generated for one filter response image that represents
the frequency distribution of all the textons contained in the texton dictionary for a particular filter response. Subsequently, the histograms for all filter
responses were concatenated into a single feature vector
$$\operatorname*{argmin}_{T} \sum_{j=1}^{k} \sum_{X_k \in D_c} \lVert X_k - T_j \rVert^{2} \qquad (4)$$
After applying K-means clustering for each of the benign and malignant classes, the texton model
generation data of each class were partitioned into k clusters, and the texton Tj is the mean vector
of a particular cluster j (j ≤ k). For the current work, we used 10 clusters for each of the benign
and malignant classes (10 textons/class). The final texton dictionary was generated by merging the
textons from both classes. This
specification leads to 20 textons (T1, T2, T3, … T20), of which 10 textons belong to the benign class
(T1 … T10) and 10 to the malignant class (T11 … T20). Based on our experiments, 10 clusters is an
appropriate cluster representation for 200 samples per class considering memory and computational
expense (see Section 4).
In the work proposed by Varma and Zisserman [19, 20], the final model consisted of the texton
frequency histograms, where the histograms represented the probability distribution of each texton
for each texton model generation sample (therefore, the total number of models was equal to the total
number of texton model generation samples that were used for model generation), whereas in the
proposed approach the final model comprises the textons (cluster centroids), which are used to
generate the features and perform evaluation.
3.4 Model evaluation
For the evaluation, a feature set was generated for the data based on the model developed in
Section 3.3. For feature-set generation, the same initial steps were repeated with the evaluation
data: i.e. after getting filter responses for each RoI from the evaluation dataset, the filter
response images were divided into 7 × 7 patches and subsequently converted into 1D vectors. The next
step was to assign each data vector to the closest texton from the texton dictionary (generated in
Section 3.3). The Euclidean distance was used as the metric to estimate the similarity. After that,
frequency histograms were generated for each filter response, where each bin of the histogram
represented the frequency of a particular texton in a particular filter response image. Fig. 4 shows
the process of generating a frequency histogram from one of the filter response images from the
evaluation data. Such histograms were generated for each of the 53 filter response images for each
evaluation RoI, resulting in 53 histograms for a single evaluation image. The final feature vector
was formed by aggregating the texton distributions for all 53 filter responses for a test RoI,
resulting in a feature vector with dimension equal to 1060 (i.e. 20 × 53). The most salient features
were extracted (to reduce the data dimensionality) using Weka [25], where the CfsSubsetEval attribute
evaluator was used along with the BestFirst search method. CfsSubsetEval evaluates the worth of a
subset of attributes by considering the individual predictive ability of each feature along with the
degree of redundancy between them. The BestFirst search method searches the space of attribute
subsets by greedy hill climbing augmented with a backtracking facility. In this way, the total number
of attributes was reduced from 1060 to 17. Varma and Zisserman [19, 20] used the KNN approach for the
final classification, where they used Chi-square statistics to compare the histogram corresponding to
the evaluation data with the histograms that had been generated as a model, whereas here we used a
naive Bayes classifier with bootstrap aggregation to report the final evaluation results.
Fig. 5 ROC curve for the proposed method plotting the false positive rate
against the true positive rate. The operating points on the curve where the
CA was 83% have been indicated
Fig. 6 Randomly selected correctly classified instances
(a), (b) Instances from benign class, (c), (d) Malignant RoIs
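The dictionary construction of Section 3.3 and the feature generation of Section 3.4 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used for the reported experiments: the function names, the plain Lloyd-style K-means, the non-overlapping 7 × 7 tiling (step = 7) and the random toy data are all hypothetical choices made for the example.

```python
import numpy as np


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd-style K-means; returns the k cluster means (the textons)."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every vector to every centre.
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the previous centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return centres


def build_texton_dictionary(benign_vecs, malignant_vecs, k=10):
    """Cluster each class separately, then merge: rows 0..k-1 benign, k..2k-1 malignant."""
    return np.vstack([kmeans(benign_vecs, k, seed=0),
                      kmeans(malignant_vecs, k, seed=1)])


def texton_histogram(response, textons, size=7, step=7):
    """Assign each size x size patch to its nearest texton and count the assignments."""
    windows = np.lib.stride_tricks.sliding_window_view(response, (size, size))
    vecs = windows[::step, ::step].reshape(-1, size * size)
    d2 = ((vecs[:, None, :] - textons[None, :, :]) ** 2).sum(axis=2)
    return np.bincount(d2.argmin(axis=1), minlength=len(textons))


def roi_feature_vector(responses, textons):
    """Concatenate one texton histogram per filter response."""
    return np.concatenate([texton_histogram(r, textons) for r in responses])


# Toy data only (random numbers, not mammograms): 300 patch vectors per class
# and 53 fake filter responses of 35 x 35 pixels for a single RoI.
rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, size=(300, 49))
malignant = rng.normal(0.5, 1.0, size=(300, 49))
textons = build_texton_dictionary(benign, malignant)
responses = rng.normal(size=(53, 35, 35))
feature = roi_feature_vector(responses, textons)
print(textons.shape, feature.shape)  # (20, 49) (1060,)
```

With 53 responses per RoI and 20 textons, the concatenated vector has 20 × 53 = 1060 entries, matching the dimension quoted in Section 3.4. Attribute selection and the naive Bayes classifier would then operate on this vector.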
225) from the dataset were resized to 175 × 175 (for the filter-based method the input images had the
same size, i.e. 225 × 225, and were reduced to 175 × 175 after the filter responses were obtained).
Subsequently, patches of size 7 × 7 were extracted from each RoI representing the mass region.
Overlapping patches were extracted in order to get additional samples from the model building images
to capture data variations. For a single RoI, 28,561 patches of size 7 × 7 were extracted. In total,
2,856,100 patches were extracted for each of the benign and malignant classes. For K-means
clustering, k was set to 10, which resulted in 10 clusters for each class. The overall set-up is
quite similar to the approach presented in this paper except that the patches now represent the raw
image data instead of filter response images. After generating the texton dictionary, evaluation data
was used with a similar processing pipeline. A histogram was generated for each evaluation RoI by
assigning each of its patches to the closest texton. This histogram was then used as a feature for
classifying masses, but the length of the feature vector in this case is 20 instead of 1060 as was
the case when using filter response patches.
Fig. 8 Randomly selected misclassified instances
(a), (b) Benign RoIs, (c), (d) RoIs from the malignant class
Table 3 Confusion matrix using raw image patches
             Benign   Malignant
benign       34       16
malignant    14       36
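The patch counts quoted earlier in this subsection follow from sliding a 7 × 7 window over a 175 × 175 RoI one pixel at a time. The sketch below checks that arithmetic with NumPy. The stride of one pixel, the function name extract_patches and the all-zero placeholder image are assumptions for illustration, since the text only states that the patches overlap.

```python
import numpy as np


def extract_patches(roi, size=7, step=1):
    """Return every size x size patch of a 2D RoI as a flattened row vector."""
    windows = np.lib.stride_tricks.sliding_window_view(roi, (size, size))
    return windows[::step, ::step].reshape(-1, size * size)


# Placeholder RoI (all zeros); a real RoI would hold mammogram intensities.
roi = np.zeros((175, 175), dtype=np.float32)
patches = extract_patches(roi)

per_roi = patches.shape[0]       # (175 - 7 + 1) ** 2 = 169 ** 2
print(per_roi)                   # 28561
print(2_856_100 // per_roi)      # 100
```

The per-class total of 2,856,100 patches is therefore consistent with 100 model-building RoIs per class, although that figure is inferred here rather than stated in this passage.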
The classification setup was the same, i.e. using a naive Bayes classifier and a 10-FCV scheme. After
attribute selection the total number of attributes was reduced to 5. The final CA when using raw
image patches was 70% with an Az value of 0.73. The confusion matrix for this experiment can be seen
in Table 3, which shows a considerable difference in terms of true and false classified instances
compared to the results in Table 1.
The presented classification results for benign and malignant masses indicate that, unlike in
material texture classification [20], patches from the filter responses are more useful than raw
image patches. The likely reason is that both benign and malignant mass areas are very similar in
terms of intensity values, but the filter responses are useful for producing more distinct features
(e.g. boundaries and lines) to classify both classes.
5.2 Comparison with the alternative approaches
Table 4 summarises the existing work on alternative approaches for mass classification in
mammograms.
Table 4 Overview of existing techniques developed for mass classification in mammograms
Author                      Features                              Best reported results, %
Rangayyan et al. [13]       shape, texture                        CA = 95
Mudigonda et al. [7]        shape, texture                        CA = 82.1
Valarmathie et al. [11]     shape, margin and texture             CA = 94
Mu et al. [8]               shape, texture                        Az = 0.95
Rouhi et al. [12]           shape, texture                        CA = 96.47
Campos et al. [10]          ICA                                   CA = 97.3
Dong et al. [16]            mass area                             CA = 97.73
Boujelben et al. [14]       boundary of the mass area             CA = 97.9
Işikli Esener et al. [17]   LCP, statistical, frequency domain    CA = 94.67
Buciu and Gacsadi [18]      Gabor                                 Az = 0.78
Li et al. [22]              subsampled textons                    CA = 85.96
Varma and Zisserman [20]    traditional textons                   CA = 70
proposed approach           modified textons                      CA = 83
As discussed in Section 2, the shape of the mass is one of the important factors for the
characterisation of benign and malignant masses in mammograms [6]. In the literature review of this
paper, several methods have been discussed [7, 8, 11–13] that used the shape of the mass as a major
feature (with other features such as texture, intensity etc.) and provided promising results for
mammographic mass classification (benign versus malignant). In clinical practice, it is not always
possible for the boundary of the mass region to be provided so that the shape features can be
extracted and used for the classification. The current approach, which uses texture features (based
on texton modelling), differs from the shape-based approaches presented in the literature review: it
takes into account that the mass segmentation is not always available at the classification stage,
and therefore only the texture of the mass region is used to extract the features.
Campos et al. [10] used ICA to extract the features that were used for classifying the patches
extracted from the mammograms. They reported an accuracy of 97.3%; however, the ICA feature
extraction seems to be based on all data and no training/testing separation is used. In [16], several
features were extracted from the segmented mass area and a CA of 97.73% was reported as the best
result based on a single fold, with average results just above 90%. In [11], the best CA was reported
to be equal to 98% using shape, margin and texture features; however, a very low CA was reported
using only the texture features (∼60%) for the mass classification in mammograms. Similarly, good
classification results have been reported by Boujelben et al. [14], where the boundary information of
the mass region was exploited to classify the RoIs as benign or malignant. Işikli Esener et al. [17]
used LCP features in combination with statistical and frequency domain features and reported a CA
equal to 94.67% for classifying mammographic RoIs as normal, benign or malignant. In their work, they
used masses corresponding to only fatty tissues, whereas the dataset used for the current approach
contains randomly selected masses that belong to different tissue densities and are not restricted
to a particular tissue class. Other work closely related to the current approach is found in [18],
where Gabor wavelets were used to filter the images and then PCA was applied to reduce the data
dimensionality. By using
only the Gabor wavelets on the images, they reported results in terms of Az equal to 0.78 for the
classification of benign and malignant masses. The current approach of using various filter responses
in combination with the texton-based approach provided results in terms of Az equal to 0.89, which is
an improvement on the results achieved by Buciu and Gacsadi [18] using only Gabor wavelets.
As mentioned in Section 2, texton-based work has been done in the past for the classification of
benign and malignant masses by Li et al. [22], which showed a CA equal to 85.96%, comparable to that
of the developed approach.
6 Future directions
In the future, we will examine the effect of using different sizes of texton dictionary (different
number of clusters) to find the best combination of patch size and texton dictionary size in terms
of CA. In the method proposed by Varma and Zisserman [19], CA improved by increasing the number of
clusters, whereas in the current method the classification performance degraded beyond a certain
number of clusters. Further statistical evaluation needs to be performed in order to explore if the
difference between the texton-based approach [19] and the current patch-based method is significant.
Deep learning is a new area of machine learning and several techniques have been proposed in the
field of CAD system development [26–28] giving promising results. It should be noted that deep
learning tends to rely on large annotated datasets which are not always available. In addition, the
best results tend to be obtained using a mixture of deep learning and handcrafted features. In the
presented work, the primary focus was on exploiting traditional machine learning approaches (and
features) for providing improved results for the classification of benign and malignant masses. In
the future, we will investigate the effects of using deep learned features in combination with the
proposed feature set (texton histograms) for classifying masses as benign or malignant using
additional data.
7 Conclusions
In conclusion, we have proposed a modified texton-based approach to classify pre-detected
mammographic abnormalities as benign or malignant. Patches from filter responses were used to make
the texton dictionary, and texton frequency histograms from all 53 filter responses for each RoI were
aggregated to form features for classifying the benign and malignant masses. The developed model was
evaluated on a subset of the DDSM dataset. Results were comparable to alternative state-of-the-art
methods [22].
8 References
[1] National Health Service-Breast Screening: ‘Professional guidance’, 31 August 2016. Available at https://www.gov.uk/government/collections/breast-screening-professional-guidance
[2] Tabár, L., Dean, P.B.: ‘Breast cancer-the art and science of early detection with mammography’ (Thieme, New York, 2005), ISBN: 3-13-131
[3] Djaroudib, K., Ahmed, A.T., Zidani, A.: ‘Textural approach for mass abnormality segmentation in mammographic images’, 2014, arXiv preprint arXiv:1412.1506
[4] Elter, M., Horsch, A.: ‘CADx of mammographic masses and clustered microcalcifications: a review’, Med. Phys., 2009, 36, (6), pp. 2052–2068
[5] Oliver, A., Freixenet, J., Marti, J., et al.: ‘A review of automatic mass detection and segmentation in mammographic images’, Med. Image Anal., 2010, 14, (2), pp. 87–110
[6] American College of Radiology BI-RADS Committee and American College of Radiology: ‘Breast imaging reporting and data system’ (American College of Radiology, Reston, VA, USA, 1998)
[7] Mudigonda, N.R., Rangayyan, R., Desautels, J.L.: ‘Gradient and texture analysis for the classification of mammographic masses’, IEEE Trans. Med. Imaging, 2000, 19, (10), pp. 1032–1043
[8] Mu, T., Nandi, A.K., Rangayyan, R.M.: ‘Classification of breast masses using selected shape, edge-sharpness, and texture features with linear and kernel-based classifiers’, J. Digit. Imaging, 2008, 21, (2), pp. 153–169
[9] Ball, J.E., Bruce, L.M.: ‘Digital mammographic computer aided diagnosis (CAD) using adaptive level set segmentation’. 29th Annual Int. Conf. of the IEEE Engineering in Medicine and Biology Society, Lyon, France, 2007, pp. 4973–4978
[10] Campos, L., Silva, A., Barros, A.: ‘Diagnosis of breast cancer in digital mammograms using independent component analysis and neural networks’. Iberoamerican Congress on Pattern Recognition, Berlin, Germany, 2005, pp. 460–469
[11] Valarmathie, P., Sivakrithika, V., Dinakaran, K.: ‘Classification of mammogram masses using selected texture, shape and margin features with multilayer perceptron classifier’, Biomed. Res., 2016, pp. S310–S313
[12] Rouhi, R., Jafari, M., Kasaei, S., et al.: ‘Benign and malignant breast tumors classification based on region growing and CNN segmentation’, Expert Syst. Appl., 2015, 42, (3), pp. 990–1002
[13] Rangayyan, R.M., El-Faramawy, N.M., Desautels, J.L., et al.: ‘Measures of acutance and shape for classification of breast tumors’, IEEE Trans. Med. Imaging, 1997, 16, (6), pp. 799–810
[14] Boujelben, A., Chaabani, A.C., Tmar, H., et al.: ‘Feature extraction from contours shape for tumor analyzing in mammographic images’, Digit. Image Comput., Tech. Appl., 2009, pp. 395–399
[15] Ertas, G., Gulcur, H., Aribal, E., et al.: ‘Feature extraction from mammographic mass shapes and development of a mammogram database’. 2001 Proc. of the 23rd Annual Int. Conf. of the IEEE Engineering in Medicine and Biology Society, Istanbul, Turkey, 2001, Vol. 3, pp. 2752–2755
[16] Dong, M., Lu, X., Ma, Y., et al.: ‘An efficient approach for automated mass segmentation and classification in mammograms’, J. Digit. Imaging, 2015, 28, (5), pp. 613–625
[17] Işikli Esener, İ., Ergin, S., Yüksel, T.: ‘A new ensemble of features for breast cancer diagnosis’. 38th Int. Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 2015, pp. 1168–1173
[18] Buciu, I., Gacsadi, A.: ‘Directional features for automatic tumor classification of mammogram images’, Biomed. Signal Proc. Control, 2011, 6, (4), pp. 370–378
[19] Varma, M., Zisserman, A.: ‘A statistical approach to texture classification from single images’, Int. J. Comput. Vis., 2005, 62, (1–2), pp. 61–81
[20] Varma, M., Zisserman, A.: ‘A statistical approach to material classification using image patch exemplars’, IEEE Trans. Pattern Anal. Mach. Intell., 2009, 31, (11), pp. 2032–2047
[21] Kinoshita, S.K., Marques, P.A., Slaets, A.F.F., et al.: ‘Detection and characterization of mammographic masses by artificial neural network’, Digital Mammography, 1998, 13, pp. 489–490
[22] Li, Y., Chen, H., Rohde, G.K., Yao, C., et al.: ‘Texton analysis for mass classification in mammograms’, Pattern Recognit. Lett., 2015, 52, pp. 87–93
[23] Heath, M., Bowyer, K., Kopans, D., et al.: ‘The digital database for screening mammography’. Proc. of the 5th Int. Workshop on Digital Mammography, Medical Physics Publishing, Toronto, Canada, 2000, pp. 212–218
[24] Gabor, D.: ‘Theory of communication’, J. Inst. Electr. Eng. Part III, Radio Commun. Eng., 1946, 93, (26), pp. 429–441
[25] Eibe, F., Mark, A.H., Ian, H.W.: ‘The WEKA Workbench. Online appendix for ‘Data Mining: practical machine learning tools and techniques’’ (Morgan Kaufmann, Cambridge, MA, USA, 2016, 4th edn.)
[26] Jiao, Z., Gao, X., Wang, Y., et al.: ‘A deep feature based framework for breast masses classification’, Neurocomputing, 2016, 197, pp. 221–231
[27] Jadoon, M.M., Zhang, Q., Haq, I.U., et al.: ‘Three-class mammogram classification based on descriptive CNN features’, BioMed Res. Int., 2017, 2017, pp. 1–11
[28] Ribli, D., Horváth, A., Unger, Z., et al.: ‘Detecting and classifying lesions in mammograms with deep learning’, Sci. Rep., 2017, 8, Article number: 4165, p. 4165