
Mango Leaf Unhealthy Region Detection and Classification

K. Srunitha and D. Bharathi

Department of Computer Science and Engineering, Amrita School of Engineering,
Coimbatore, Amrita Vishwa Vidyapeetham, Amrita University, Coimbatore, India
cb.en.p2cvi15013@cb.students.amrita.edu,
d_bharathi@cb.amrita.edu

Abstract. Diseases in any plant decrease the productivity and quality of the
product. Identifying plant leaf diseases with the naked human eye is very
difficult. Image processing techniques can identify a diseased leaf by
preprocessing the image and classifying its unhealthy regions. This paper
presents an implementation of mango leaf unhealthy region detection and
classification. In the proposed work, k-means clustering is used for
segmentation and a multiclass SVM is used for disease classification. The
experimental results show the effectiveness of the proposed method in
recognizing disease-affected mango leaves.

Keywords: Multi class SVM · Image processing · k-means clustering

1 Introduction

India is well known for its plant cultivation and productivity; about three
quarters of the nation's population depends on it. Farmers here have a wide
variety of plants to choose from for cultivation. Much research is under way on
plant cultivation, aimed at improving the quality of production, reducing
expenditure, increasing profit and decreasing pesticide usage. The outcome of
cultivation depends on a combination of suitable soil, water, weather,
chemicals, etc.
The major difficulty is infection in plants, which reduces the yield and
degrades quality. Diagnosing diseases and curing them in time is very important
for reducing substandard products. Each plant disease depends on many
characteristics and behaviors, as well as on parameters such as environment,
nutrients and organisms, so some diseases are not easily distinguishable. The
visual symptoms of each disease vary with how severely the plant is affected.
Hence diagnosis is very difficult to carry out manually. This has motivated
developments in image processing and pattern recognition techniques, with which
it is possible to create a system that can recognize and classify the diseases
and thus help cure them. The use of pesticides damages the field and causes
health issues in humans.

© Springer International Publishing AG 2018


D. J. Hemanth and S. Smys (eds.), Computational Vision and Bio Inspired Computing,
Lecture Notes in Computational Vision and Biomechanics 28,
https://doi.org/10.1007/978-3-319-71767-8_35

In plants, the leaf is the part most commonly prone to diseases. The majority
of these are caused by fungi, bacteria and viruses. This paper handles four
types of mango leaf disease: Anthracnose, Red rust, Sooty mould, and Scab. The
diseases differ from one another in color, shape, type of causative organism,
etc.
An Anthracnose-affected mango leaf has dark brown spots surrounded by yellow
patches. The disease causes serious losses of young shoots. Red rust is caused
by an alga and leads to a serious reduction of the green regions of the leaf:
white patches cover the green regions and the leaf is damaged. Sooty mould shows
black velvety patches all over the leaf. It spreads easily and damages the whole
leaf. Scab is a fungal disease similar to Anthracnose; its spots are circular or
slightly angular and dark black in color. Identifying these diseases with the
naked human eye is very difficult. Thus image processing methods are introduced
(Figs. 1, 2, 3 and 4).

Fig. 1 Anthracnose

Fig. 2 Red rust



Fig. 3 Sooty mould

Fig. 4 Scab

To diagnose the diseases, images of mango leaves are captured using digital
cameras and mobile phones, and the disease spots are then used for
classification. The technique combines pattern recognition and image processing.
Image processing is used in plant leaf disease identification to locate the
disease spots and to recognize the texture and color of the affected area.
The paper is organized as follows. Section 1 gives a brief introduction to the
importance of identifying leaf diseases and describes the types of disease
handled in this paper. Section 2 gives a short literature survey and a brief
review of basic image processing techniques for disease identification.
Section 3 explains the proposed work in detail and Sect. 4 presents the
experimental results. Finally, conclusions are summarized in Sect. 5.

2 Literature Survey

Various methods for detecting plant leaf diseases using image processing
techniques, mainly in the segmentation and classification phases, are described
here.
Anand et al. [1] used a machine learning method to identify diseases of the
rice plant. The software prototype system uses the HSI model with boundary and
spot detection for image segmentation and a Self Organizing Map neural network
for classification. Pooja et al. [2] used a hybrid intelligent system to
identify grape leaf infections from color imagery. Three main phases were
involved. The first step was leaf color segmentation, in which a self-organizing
feature map together with a back-propagation neural network was used to
identify the colors of the leaf. The second step was segmenting the diseased
regions, and the last step was to identify and classify the diseases. A modified
self-organizing feature map optimized with a genetic algorithm was used for
segmentation, a Gabor wavelet was used to analyze the color features of the
diseased regions, and a support vector machine was used to classify the
diseases.
Krishnan and Sumithra [3] presented the identification and recognition of
diseases based on morphological operators, defining a system with four phases.
The first phase was image enhancement, which includes histogram analysis, HSI
transformation and intensity adjustment. The captured image was segmented using
the fuzzy c-means algorithm. The features extracted from the leaf were the
color, size and shape of the affected region. A back-propagation neural network
was used to classify the diseases.

Table 1 Summary of methods in literature survey

| Name of the algorithm | Advantages | Disadvantages |
|---|---|---|
| K-nearest neighbor (KNN) | Simple classifier that needs no training process. Applicable to a small dataset that has not been trained on | The time to compute distances increases with the number of training samples |
| Radial basis function (RBF) | Training phase is faster. Hidden layer is easier to interpret | Slower in execution when speed is a factor |
| Probabilistic neural networks (PNN) | Tolerant of noisy inputs. Instances can be classified by more than one output | Long training time. High complexity of network structure. Needs a lot of memory for training data |
| Back propagation neural network | Easy to implement. Applicable to a wide range of problems. Able to form arbitrarily complex nonlinear mappings | Learning can be slow. It is hard to know how many neurons and layers are required |
| Support vector machine (SVM) | Simple geometric interpretation and a sparse solution. Can be robust even when the training sample has some bias | Slow training. Structure of the algorithm is difficult to understand. A large number of support vectors from the training set is needed to perform the classification task |

Arivazhagan et al. [4] proposed a method for diagnosing the diseases found in
citrus plants. Citrus canker, Anthracnose, overwatering and citrus greening are
the major problems that affect the citrus plant. Disease identification and
recognition play a vital role in improving cultivation. The system has four main
stages. After acquisition of the leaf image, the RGB image of the citrus plant
is converted to different color spaces. K-means is used as the image
segmentation technique for locating the diseased regions; segmentation extracts
the region of interest. Texture features are extracted using the statistical
GLCM and color features using mean values. Finally, an SVM is used for
classification.
Naikwadi and Amoda [5] detected diseases in sugarcane. The input color image is
converted to the L*a*b* color space, and thresholding the a* component gives the
spot information. The diseased regions are extracted using the maximum standard
deviation. To estimate the severity of the disease, the white pixel area is
subtracted from the total segmented area. The system uses two kinds of feature,
color and texture: GLCM is used to extract the texture features and L*a*b* is
used for color. A support vector machine uses both kinds of feature to classify
the diseases.
Arivazhagan et al. [6] addressed diseases that are not easily identified with
the naked human eye, such as black leaf spot and sun scorch in orchid leaves.
The paper classifies diseases using morphological image segmentation. The area
of white pixels in the input leaf image is calculated to recognize the disease.
In the segmentation phase, preprocessing steps such as intensity adjustment and
histogram equalization, and filtering techniques such as the Gaussian filter,
disc filter and median filter, are used. Segmentation separates the infected and
uninfected areas of the leaf. Morphological operations such as opening, closing
and hole filling are used. The region of interest is chosen through a binary
mask. The complementary binary mask changes zeros to ones and vice versa; when
this complement mask is subtracted from the original, the region of interest is
found, and thresholding helps to identify the edges. Classification is based on
the white pixel area: the white pixel areas for both diseases are calculated in
advance and compared with that of the query leaf image to classify the disease
(Table 1).

3 Proposed Method

The proposed work is applicable to images of any size. Before segmentation and
classification, some preprocessing is carried out. The pipeline consists of an
image acquisition phase followed by image enhancement and then segmentation, in
which the region of interest is extracted. After segmentation, feature
extraction and finally classification are carried out. Color image segmentation
using k-means clustering is used for segmentation, the GLCM (gray-level
co-occurrence matrix) is used for feature extraction, and a support vector
machine (SVM) is used as the classifier. The architecture of the proposed method
is shown in Fig. 5.

3.1 Image Acquisition


The first step is to collect the input dataset. The images were collected from
an agricultural university. The dataset contains 5 classes, with the following
number of images per class: 61 Anthracnose, 50 Red rust, 55 Sooty mould,
75 Scab and 45 healthy mango leaf images. The collected images are passed to
preprocessing to separate the disease-infected region.

3.2 Preprocessing of Leaf Image


All captured input images are preprocessed because they are corrupted by
illumination variations, shadow and noise, which can cause the loss of
information that is useful for disease diagnosis. Preprocessing improves the
image features for further processing. It mainly includes contrast enhancement
and color space transformation. The original images are in the RGB (Red, Green,
Blue) color space and are converted to the HSI (Hue, Saturation, Intensity)
color space.
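
The RGB to HSI conversion mentioned above can be sketched with the commonly used geometric conversion formulas. This is a minimal Python illustration rather than the MATLAB code used in this work; the function name `rgb_to_hsi` is hypothetical, and the inputs are assumed to be scaled to the range [0, 1].

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0  # black pixel: hue and saturation set to 0 by convention
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; clamp against rounding error.
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0:
        hue = 0.0  # gray pixel: hue is undefined, 0 by convention
    else:
        theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity
```

For example, pure red maps to hue 0 with full saturation, and pure blue maps to a hue of about 240 degrees.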

Fig. 5 Architecture of proposed work

The HSI model re-represents the points of the RGB space. Hue describes the color
perceived by the viewer. Saturation measures how much white light is mixed with
the hue. Intensity is the amplitude of the light. Histogram stretching is
applied to each individual RGB color band and the pixel count is calculated. As
a result, the color images are enhanced. Algorithm 1 describes the image
extraction and preprocessing [10].
Step 1: The collected disease-infected mango leaf images are stored in a folder on a PC.
Step 2: Each image in the folder is read using MATLAB R2016a.
Step 3: The original images are converted into hue, saturation and value (HSV)
images.
Step 4: Image histograms, i.e., the pixel count (x, y), are calculated for the
individual RGB color bands.
Step 5: The threshold range of each color band is applied to its respective band.
Step 6: Regions smaller than 100 pixels are eliminated.
Step 7: A region-filled, size-filtered mask is created for the image.
Step 8: Regions smaller than 100 pixels are eliminated.
Step 9: The size-filtered mask is applied to the original RGB image.
Step 10: Finally, the masked original image of the specified color is displayed.
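
The thresholding and size-filtering steps (Steps 5, 6 and 9) can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the MATLAB implementation used here: the image is assumed to be a list of rows of (r, g, b) tuples, regions are assumed to be 4-connected, and the function names and threshold ranges are hypothetical.

```python
from collections import deque

def threshold_mask(image, ranges):
    """True where every color band of a pixel lies inside its (low, high) range."""
    return [[all(lo <= c <= hi for c, (lo, hi) in zip(px, ranges)) for px in row]
            for row in image]

def remove_small_regions(mask, min_size=100):
    """Keep only 4-connected True regions with at least min_size pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_size:
                for ry, rx in region:
                    out[ry][rx] = True
    return out

def apply_mask(image, mask):
    """Black out every pixel that is not selected by the mask."""
    return [[px if keep else (0, 0, 0) for px, keep in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]
```

With `min_size=100` this mirrors the 100-pixel limit of the steps above; a smaller value is convenient when trying the functions on a toy image.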

3.3 Segmentation
The mango leaf image is divided into multiple sets of pixels. There are
different types of segmentation: region based, edge based, threshold based,
feature based, clustering based and model based. Color image segmentation using
k-means clustering is used in our work. The main aim of this algorithm is to
compute the mean of each group of points and to assign every point to the group
whose mean is at minimum distance [11].
Color Image Segmentation using K-Means Clustering
Step 1: First, an image is taken as input. The input image is in the form of
pixels and is transformed into a feature space.
Step 2: Next, similar data points are grouped together using k-means clustering.
The distances are calculated using the Euclidean distance, where d is the
distance and p and q are two data points:
$$d(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2} = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$$

The data points with minimum distance are grouped together to form the clusters.
Step 3: After clustering, the mean color of each cluster is calculated and
remapped onto the image.
The output of this clustering has k = 3 clusters, corresponding to Red, Green
and Blue segments. These clusters are passed to the feature extraction phase.
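
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: pixels are assumed to be given as a flat list of color tuples, the initial centroids are supplied by the caller (an assumption, since the initialization is not specified here), and the function name `kmeans` is hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Cluster points (tuples) starting from the given initial centroids (tuples)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the centroid at minimum Euclidean distance.
        labels = [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid becomes the mean of its member points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids
```

Remapping the mean colors onto the image (Step 3) then amounts to replacing each pixel by `centroids[label]`.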

3.4 Feature Extraction and Classification


Feature extraction is a method for reducing the original data to a compact set
of descriptors. Features are the points of interest in the image, obtained by
measuring properties such as texture, color and shape. In the proposed method we
use GLCM (gray-level co-occurrence matrix) values such as Contrast, Correlation,
Energy, Homogeneity, Mean, Standard Deviation, Entropy, RMS, Variance,
Smoothness, Kurtosis and Skewness. GLCM is a statistical method of feature
extraction. The formulas for calculating the GLCM features are given below. Here
$N_g$ is the number of gray levels and $p_d(i, j)$ is the (i, j) entry of the
normalized GLCM [12].
Contrast is a measure of the gray-level variation between the reference pixel
and its neighbor. A large contrast indicates large intensity differences in the
GLCM.
$$\text{Contrast} = \sum_i \sum_j (i - j)^2 \, p_d(i, j) \tag{1}$$

Homogeneity shows how close the distribution of elements is to the diagonal of
the GLCM; when homogeneity increases, the contrast decreases.
$$\text{Homogeneity} = \sum_i \sum_j \frac{1}{1 + (i - j)^2} \, p_d(i, j) \tag{2}$$

Entropy is the degree of randomness present in an image. When all elements of
the co-occurrence matrix are equal, the entropy is high; it becomes small when
the elements are unequal.
$$\text{Entropy} = -\sum_i \sum_j p_d(i, j) \ln p_d(i, j) \tag{3}$$

Energy is derived from the Angular Second Moment (ASM), which measures the local
uniformity of pixels. The value of ASM is high when the pixels are similar.
$$\text{Energy} = \sqrt{\text{ASM}}, \qquad \text{ASM} = \sum_i \sum_j p_d^2(i, j) \tag{4}$$

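Correlation measures the linear dependency of gray levels in the co-occurrence
matrix:

$$\text{Correlation} = \sum_i \sum_j p_d(i, j) \, \frac{(i - \mu_x)(j - \mu_y)}{\sigma_x \sigma_y} \tag{5}$$

where $\mu_x$, $\mu_y$ and $\sigma_x$, $\sigma_y$ are the means and standard
deviations of the row and column marginal distributions of $p_d$.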
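
Equations (1) to (5) can be checked on a small example with the sketch below. It is a plain-Python illustration under stated assumptions: the image is a list of rows of integer gray levels, the co-occurrence offset defaults to the right-hand neighbor, the matrix is not symmetrized, and the function names `glcm` and `glcm_features` are hypothetical.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix p_d for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))  # assumes at least one valid pixel pair
    return [[c / total for c in row] for row in counts]

def glcm_features(p):
    """Contrast, homogeneity, entropy, ASM, energy and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    asm = sum(v * v for _, _, v in cells)
    mu_x = sum(i * v for i, _, v in cells)
    mu_y = sum(j * v for _, j, v in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * v for i, _, v in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * v for _, j, v in cells))
    if sd_x > 0 and sd_y > 0:
        correlation = sum((i - mu_x) * (j - mu_y) * v for i, j, v in cells) / (sd_x * sd_y)
    else:
        correlation = float("nan")  # undefined when a marginal has zero variance
    return {"contrast": contrast, "homogeneity": homogeneity, "entropy": entropy,
            "asm": asm, "energy": math.sqrt(asm), "correlation": correlation}
```

For the 4 × 4 test image `[[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]` with four gray levels, there are 12 horizontal pixel pairs, and the contrast works out to 7/12 and the ASM to 1/6.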

Mean is the average of pixel values present in an image.


$$\text{Mean} = \sum_i \sum_j (i - j) \, p_d(i, j) \tag{6}$$

Standard deviation is noted as:


$$\text{Standard deviation} = \sum_i \sum_j (i - j)^2 \, p_d(i, j) \tag{7}$$

Moment measures the asymmetry of the image:

$$\text{Moment} = \sum_i \sum_j (i - j)^3 \, p_d(i, j) \tag{8}$$

Kurtosis measures the relative peakedness or flatness of an image.

$$\text{Kurtosis} = \sum_i \sum_j (i - j)^4 \, p_d(i, j) \tag{9}$$

Skewness gives the textural intensity values. It can be low or high.

$$\text{Skewness} = \sum_{i=0}^{G-1} (i - \mu)^3 \, p(i) \tag{10}$$

Homogeneity of the diseased region is calculated through the inverse difference
moment (IDM).

$$\text{IDM} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \frac{1}{1 + (i - j)^2} \, p(i, j) \tag{11}$$

$$\text{Smoothness}(R) = 1 - \frac{1}{1 + \sigma^2} \tag{12}$$

$$\text{Angular second moment} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \{p(i, j)\}^2 \tag{13}$$
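
The histogram-based measures in Eqs. (10) and (12) can be illustrated with the short sketch below. It assumes that p(i) is the normalized gray-level histogram, that μ is its mean and that σ² is its variance; the function name `first_order_stats` is hypothetical. Skewness is computed exactly as written in Eq. (10), i.e., as the third central moment without normalization.

```python
def first_order_stats(hist):
    """hist: pixel counts per gray level 0..G-1; returns mean, variance, skewness, smoothness."""
    total = sum(hist)
    p = [count / total for count in hist]           # normalized histogram p(i)
    mean = sum(i * pi for i, pi in enumerate(p))
    variance = sum((i - mean) ** 2 * pi for i, pi in enumerate(p))
    skewness = sum((i - mean) ** 3 * pi for i, pi in enumerate(p))  # Eq. (10)
    smoothness = 1 - 1 / (1 + variance)                             # Eq. (12)
    return {"mean": mean, "variance": variance,
            "skewness": skewness, "smoothness": smoothness}
```

A symmetric histogram such as `[1, 2, 1]` gives zero skewness, and a constant image (zero variance) gives zero smoothness.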

The 13 features are extracted from the diseased mango leaf image and passed to
the classification algorithm. Two classifiers are compared here: multiclass SVM
(support vector machine) and ANN (artificial neural network). With a suitable
kernel, the SVM is a non-linear classifier and a recent method in machine
learning. SVMs are widely used in many pattern recognition problems, including
texture classification [14]. An SVM is designed to work with only two classes,
which it separates by maximizing the margin around the hyperplane. Support
vectors are the samples closest to the margin, and they determine the
hyperplane. Multiclass classification is built from several two-class SVMs using
the one-versus-all scheme. The SVM here uses the RBF (radial basis function)
kernel. For a K-class problem, one-versus-rest creates K separate binary
classifiers. The nth binary classifier is trained with the data from the nth
class as positive examples and the data from the remaining K − 1 classes as
negative examples. When a test image arrives, the class label is determined by
the classifier with the maximum output value. The radial basis function is used
as the kernel for constructing the hyperplanes. The RBF kernel on two feature
points $x$ and $x_j$ is defined as

$$k(x, x_j) = \exp\left(-\frac{\lVert x - x_j \rVert^2}{\sigma^2}\right)$$

$$k(x, x_j) = \exp\left(-\gamma \lVert x - x_j \rVert^2\right), \quad \text{where } \gamma = \frac{1}{\sigma^2}$$
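
The kernel definition above translates directly into code. The following is a minimal sketch that assumes feature points are given as equal-length sequences of numbers and uses γ = 1/σ² as stated; the function name `rbf_kernel` and the default value of `sigma` are illustrative choices.

```python
import math

def rbf_kernel(x, x_j, sigma=1.0):
    """RBF kernel exp(-gamma * ||x - x_j||^2) with gamma = 1 / sigma^2."""
    gamma = 1.0 / sigma ** 2
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_j))
    return math.exp(-gamma * squared_distance)
```

The kernel equals 1 for identical points and decays toward 0 as the points move apart; a smaller `sigma` (larger gamma) makes the decay faster.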

The RBF-kernel SVM has two parameters, gamma and C. The gamma parameter defines
how far the influence of a single training example reaches: a low value means
far and a high value means close. C trades off misclassification of training
examples against the simplicity of the decision surface. A low C makes the
decision surface smooth, while a high C aims to keep the misclassification of
training examples low. When the gamma value is high, C must be chosen carefully
to limit overfitting. WEKA 3.8 is used to obtain the accuracy and the confusion
matrix.
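
The one-versus-rest scheme described in this section can be sketched independently of any particular SVM library. The snippet below only illustrates how the binary training targets are formed and how the final label is chosen; the function names and the example decision values are hypothetical, and training the K binary SVMs themselves is assumed to happen elsewhere.

```python
def binary_targets(labels, positive_class):
    """Targets for one binary classifier: +1 for its own class, -1 for the rest."""
    return [1 if label == positive_class else -1 for label in labels]

def one_vs_rest_predict(decision_values):
    """Pick the class whose binary classifier gives the maximum output value."""
    return max(decision_values, key=decision_values.get)

# Hypothetical outputs of three of the binary classifiers for one test image.
example_outputs = {"Anthracnose": -0.8, "Red rust": 0.3, "Scab": 1.2}
predicted = one_vs_rest_predict(example_outputs)
```

Here `predicted` is the label of the classifier with the largest output, in line with the maximum-output rule stated above.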

4 Experimental Results

This section presents the results obtained from the experiment described above.
In the preprocessing step, the collected diseased leaf images are taken as input
and their quality is enhanced by resizing, noise removal and contrast
enhancement. During preprocessing, the original image is converted to a binary
image and the individual RGB color bands are computed using histograms. Finally,
the color threshold range is displayed (Fig. 6).

Fig. 6 Preprocessing for converting the original image into a binary image

A size-filtered mask is applied to the original RGB image so that only the
specified color is displayed (Fig. 7).

Fig. 7 Size-filtered mask that removes objects smaller than 100 pixels, and the
result of applying the mask

The preprocessed diseased images are then passed to segmentation using k-means
clustering. Here k = 3 (Red, Green, and Blue) (Fig. 8).

Fig. 8 Segmented mango leaf image with K = 3

After segmentation, features are extracted from each of these color bands using
the GLCM method. Thirteen Haralick features are extracted for each band, giving
a 3 × 13 feature vector in total. The feature vector for the red color band is
given below (Tables 2, 3 and 4).

Table 2 GLCM matrix


41781 264 383 442 567 212 46 30
371 1871 453 8 0 0 0 0
429 445 4679 550 18 0 0 0
389 28 534 2737 540 1 0 0
552 0 23 488 3114 337 0 0
209 0 0 6 316 1929 78 0
36 0 0 6 2 79 604 30
24 0 0 0 0 0 34 635

Table 3 Feature extraction using GLCM


Contrast 0.07887 0.46683 0.36758 0.54123 0.51277 0.69762 0.07887
Correlation 0.97832 0.86570 0.91019 0.75103 0.71032 0.87389 0.97832
Energy 0.76258 0.79672 0.75731 0.53823 0.89470 0.48725 0.76258
Homogeneity 0.97487 0.95919 0.96254 0.9 0.97168 0.91041 0.97487


Mean 14.8438 14.150115 16.4441 17.9716 17.1185 31.5603 14.843851
Standard deviation 47.8116 48.1395 51.4194 37.6635 35.52045 56.4596073417924 47.8116
Entropy 1.70987 1.36584 1.66789 2.58288 2.84317 2.98298 1.70987
RMS 5.57477 4.31361 5.34037 7.40369 10.4504 8.11404 5.57477
Variance 2150.69625 1632.21 2305.04 1306.81 1162.22 2844.32 2150.69
Smoothness 0.99999 0.99999 0.99999 0.99999 0.99999 0.99999 0.99999
Kurtosis 15.5977 15.7654 13.7926 10.4951 27.6032 4.40083 15.5977
Skewness 3.63201 3.67442 3.40252 2.58833 4.68201 1.61292 3.63201
IDM 255 255 255 255 255 255 255

The classification results are those of the multiclass SVM (support vector
machine) classifier. The accuracy of the SVM is obtained with the WEKA 3.8
software. The result of the SVM with WEKA is given below.

Table 4 Confusion matrix statistics obtained using WEKA


Dataset TP FP TN FN
Anthracnose 55 6 75 150
Red rust 48 2 88 148
Sooty mould 50 5 91 140
Scab 72 3 81 130
Healthy leaf 44 1 96 145
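
As a reading aid for Table 4, the sketch below computes the ratio TP / (TP + FP) for each row and for the pooled counts, taking the TP and FP columns at face value. Note that in every row TP + FP equals the number of images of that class given in Sect. 3.1, so the ratio can also be read as the fraction of each class's images that was recognized; that interpretation is an assumption, and the variable names are illustrative.

```python
# (TP, FP) per class, copied from Table 4.
table4 = {
    "Anthracnose": (55, 6),
    "Red rust": (48, 2),
    "Sooty mould": (50, 5),
    "Scab": (72, 3),
    "Healthy leaf": (44, 1),
}

# Per-class ratio TP / (TP + FP).
per_class = {name: tp / (tp + fp) for name, (tp, fp) in table4.items()}

# Pooled ratio over all classes.
total_tp = sum(tp for tp, _ in table4.values())
total_images = sum(tp + fp for tp, fp in table4.values())
pooled = total_tp / total_images

for name, ratio in per_class.items():
    print(f"{name}: {ratio:.3f}")
print(f"Pooled: {total_tp}/{total_images} = {pooled:.3f}")
```

For Red rust and Scab the per-class ratio is 0.96, and the pooled ratio is 269/286.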

5 Conclusion

The work consists of two main stages: segmentation followed by classification
using SVM. From the experimental analysis it is concluded that the proposed
method works efficiently for leaf disease recognition and classification. The
presence of multiple diseases in one region of the leaf and the variation in
their color, texture and shape made the segmentation and feature selection
phases difficult. The SVM performed well in the classification phase, achieving
an accuracy of up to 96%. From the implementation we conclude that the main
requirements for any disease detection system are speed and accuracy. The work
can be extended by recommending a suitable, timely organic treatment for each
disease. The amount of diseased area present on the leaf also remains to be
computed.

References
1. Anand, R., Veni, S., Aravinth, J.: An application of image processing techniques for
detection of diseases on brinjal leaves using K-means clustering method. In: IEEE
International Conference on Circuit, Power and Computing Technologies, ICCPCT (2016)
2. Pooja, A., Mamtha, R., Sowmya, V., Soman, K.P.: X-ray image classification based on
tumor using GURLS and LIBSVM. In: International Conference on Communications and
Signal Processing (ICCSP’16) (2016)
3. Krishnan, M., Sumithra, M.G.: A novel algorithm for detecting bacterial leaf scorch
(BLS) of shade trees using image processing. In: IEEE 11th Malaysia International
Conference on Communications (2013)
4. Arivazhagan, S., NewlinShebiah, R., Ananthi, S., Vishnu Varthini, S.: Detection of
unhealthy region of plant leaves and classification of plant leaf diseases using texture
features. AgricEngInt CIGR J. 15, 211–217 (2013)
5. Naikwadi, S., Amoda, N.: Advances in image processing for detection of plant diseases. Int.
J. Appl. Innov. Eng. Manag. 2(11) (2013)
6. Arivazhagan, S., NewlinShebiah, R., Ananthi, S., Vishnu Varthini, S.: Detection of
unhealthy region of plant leaves and classification of plant leaf diseases using texture feature.
CIGR 15(1), 211–217 (2013)
7. Amoda, N., Naikwadi, S.: Advances in image processing for detection of plant diseases. Int.
J. Appl. Innov. Eng. Manag. (IJAIEM) 2(11). ISSN: 2319-4847 (2013)
8. Jagtap, S.B., Hambarde, S.M.: Agricultural plant leaf disease detection and diagnosis using
image processing based on morphological feature extraction. IOSR J. VLSI Signal Process.
(IOSR-JVSP) 4(5), 24–30, Ver. I. e-ISSN: 2319-4200, p-ISSN: 2319-4197 (2014)
9. Gavhale, K.R., Gawande, U.: An overview of the research on plant leaves disease detection
using image processing techniques. IOSR J. Comput. Eng. (IOSR-JCE) 16(1), 10–16, Ver.
V. ISSN: 2278–8727 (2014)
10. Ratnasari, E.K., Mentari, M., Dewi, R.K., Hari Ginardi, R.V.: Sugarcane leaf disease
detection and severity estimation based on segmented spots image. In: IEEE. ICTS
978-1-4799-6858-9/14/$31.00 © 2014
11. Fadzil, W.M.N.W.M., Rizam, M.S.B.S., Jailani, R., Nooritawati, M.T.: Orchid leaf disease
detection using border segmentation techniques. In: 2014 IEEE Conference on Systems,
Process and Control (ICSPC 2014), Kuala Lumpur, Malaysia, 12–14 December 2014
12. Warne, P.P., Ganorkar, S.R.: Detection of diseases on cotton leaves using K-mean
clustering method (IRJET) 02(04). e-ISSN: 2395-0056 (2015)
13. Kaur, R., Kang, S.S.: An enhancement in classifier support vector machine to improve plant
disease detection. In: IEEE 3rd International Conference on MOOCs, Innovation and
Technology in Education (MITE), pp. 135–140 (2015)
14. Khirade, S.D., Patil, A.B.: Plant disease detection using image processing. Int. Conf.
Comput. Commun. Control Autom. 978-1-4799-6892-3/15 $31.00 © 2015 IEEE
15. Padmavathi, S., Saipreethy, M.S., Valliammai, V.: Indian sign language character
recognition using neural networks. In: IJCA Special Issue on Recent Trends in Pattern
Recognition and Image Analysis, vol. RTPRIA, pp. 40–45 (2013)
16. Dinesh Kumar, C.K., Manjusha, R., Latha, P.: Comparision of image classification methods
on event data. Int. J. Applied Eng. Res. 10, 29631–29640 (2015)
