Education and Society (शिक्षण आणि समाज) ISSN: 2278-6864

(UGC Care Journal) Vol-47, Issue-2, No.6S, April-June: 2023

DETECTION OF PLANT LEAF DISEASES USING MACHINE LEARNING ALGORITHM
M. Rajathi¹, S. Abinaya², Dr. R. Arumugam³
¹Assistant Professor, ²PG Student, ³Assistant Professor (SG)
¹,²Department of Software Engineering
³Department of Mathematics
Periyar Maniammai Institute of Science & Technology, Thanjavur, Tamil Nadu, India
Email: sixfaceraji2010@gmail.com, abisaravanan728@gmail.com, arumugamr2@gmail.com

Abstract
Plant diseases reduce the health and robustness of plants, both of which are
essential for agricultural productivity, so identifying them is vital to the agricultural
industry. These diseases are common, and if appropriate preventive action is not taken,
cultivation can suffer. In practice, disease detection today requires expert judgement
and physical examination, which is costly and time-consuming. This has led to the need
for computer-based detection. This paper presents an overview of feature extraction
using GLCM, HSV-dependent classification for locating infected leaf sections, and image
segmentation using K-means clustering. The proposed methodology identified and
classified plant diseases with an accuracy of 98% when employed with a Random Forest
classifier.
Keywords: Random Forest classifier, HSV dependent classification, K-means clustering

1. Introduction
It may be challenging for farmers in rural locations to identify the diseases that
could harm their crops. They cannot conveniently visit the agricultural office to
learn what the infection might be. Our main objective is to recognise the disease
present in a plant by examining its structure, using image processing and machine
learning. Pests and diseases destroy crops or plant parts, which decreases food
production and increases food insecurity. Furthermore, in many less developed
countries, little is known about the prevention or control of diseases and pests. Toxic
infections, inadequate disease control, and drastic climatic changes are some of the key
factors contributing to decreasing food output. Many laboratory-based techniques have
been employed to identify diseases, including polymerase chain reaction, gas
chromatography, mass spectrometry, thermography, and hyperspectral approaches.
These techniques are time-consuming and costly. Disease identification has recently
been carried out using server-based and mobile-based methods. Among other
technological advances, a high-resolution camera with high
performance, processing power, and an abundance of built-in accessories enable automatic
disease detection. Cutting-edge techniques such as machine learning and deep learning
algorithms have increased the accuracy of the results. Many studies have used machine
learning techniques such as random forests, artificial neural networks, support vector
machines (SVM), fuzzy logic, the K-means method, and convolutional neural networks for
the detection and diagnosis of plant diseases. Generally speaking, random forests are a
learning method for tasks such as classification and regression that builds a forest of
decision trees during the training phase. Random forests handle both numerical and
categorical data and, unlike single decision trees, mitigate the problem of overfitting
the training data set.
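
As a rough illustration of the random forest idea described above, the following sketch trains a forest of 100 decision trees with scikit-learn. The data is a synthetic stand-in (two Gaussian clusters with 13 features each), not the leaf data set used in this paper, and the parameter values are assumptions chosen only for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for leaf feature vectors (NOT the paper's data):
# 200 samples, 13 features, two classes separated by a shift in the mean.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 13)),
               rng.normal(2.0, 1.0, (100, 13))])
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Each tree is grown on a bootstrap sample of the training set;
# the forest predicts by majority vote over its trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

rf_accuracy = forest.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {rf_accuracy:.2f}")
```

The accuracy printed here describes only the synthetic clusters and says nothing about performance on real leaf images.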

Problem Definition
In India, agriculture is essential to the country's economic development. Almost
half of the workforce in India is employed in the agriculture sector. India is among
the world's largest producers of pulses, rice, wheat, spices, and spice products. The
growth of farmers' incomes depends on the quality of the products they produce, which
in turn relies on plant growth and yield. Identification of plant diseases is therefore
crucial in agriculture. Plants are highly likely to be affected by diseases that
inhibit growth, which in turn affects the farmer's ecosystem. An automated disease
detection method is preferable for discovering a plant disease early. Plant diseases
can manifest themselves in a variety of plant parts, including the leaves. Diagnosing
plant disease manually from photographs of the leaves takes a long time.
Computational tools must therefore be developed to automate disease classification
and identification from leaf photographs.

2. Literature Survey
P. R. Rothe et al. (2015) [1] focused on cotton leaf disease identification using
pattern recognition techniques based on snake segmentation. Godliver et al. (2014) [2]
proposed automated vision-based diagnosis of banana bacterial wilt disease and black
Sigatoka disease. Several authors have devised techniques for identifying plant
diseases from images of the leaves. They applied Otsu's thresholding, boundary
detection, and spot detection algorithms to isolate the infected area of the leaf.
They then extracted features such as colour, texture, morphology, and edges in order
to classify plant diseases.

Arumugam et al. [3] carried out a statistical study of a Markov model for the prediction
of coronavirus COVID-19 in India, and Arumugam and Rajathi [4] dealt with applications
of manpower with various stages in business using stochastic models. A BPNN is used
for classification, that is, to identify the plant disease. The features of the grey
level co-occurrence matrix (GLCM), the mean and standard deviation of the image
convolved with the Gabor filter, and the mean and standard deviation of the RGB and
YCbCr channels were extracted for classification. A support vector machine classifier
was used to categorise the data. The authors concluded that GLCM features can be used
to identify healthy leaves, while colour features and Gabor filter features are
considered the best for detecting leaf spots and anthracnose-affected leaves. S. Yun
et al. (2015) [5] focused on PNN-based crop disease recognition with leaf image
features and meteorological data. The visible and near-infrared (VNIR) and short-wave
infrared (SWIR) spectra were used in this investigation. The authors employed the
k-means clustering method in the spectral domain for the segmentation of leaves. They
introduced a novel grid removal algorithm with the aim of removing the grid from
hyperspectral photos. The authors' overall spectrum accuracy was 93%, with an accuracy
of 83% for their vegetation indicators in the VNIR spectral band. Although the
proposed method increased accuracy, it is too costly because it requires a
hyperspectral camera with bands. Caglayan, A. et al. (2013) [6] discussed a plant
recognition approach using shape and colour features in leaf images. As shown in
Fig. 1, labelled training data sets are converted into their respective feature
vectors by HoG feature extraction. These extracted feature vectors are saved with the
training data sets and are then used to train a Random Forest classifier [7, 8]. In
"Plant Disease Detection Using Machine Learning", Shima et al. (2018) [9] extracted
the edges of the photo using a Canny edge detector. Many authors have developed a
method for precisely predicting the degree of fruit infection. The network consists of
three blocks of pooling and convolutional layers. Hence, the cost of computing on the
network rises. Due to the significant number of false negative predictions, the
model's F1 score is 0.12, which is very low. S. S. Sannakki et al. [10] proposed
"Classification of Pomegranate Diseases Based on Back Propagation Neural Network",
which segments the defective area and uses colour and texture as the features.
M. Rajathi et al. [11] discussed the applications of mobile learning in the
higher educational institutions.

3. Problem Description
Existing System
The present method for diagnosing plant diseases is simple naked-eye observation
by plant experts, who locate and identify the diseases. Monitoring vast fields of crops
in this way is impractical, and this is where the proposed method is useful. Farmers in
some countries may also lack access to the necessary resources or may not be aware that
they can consult experts. Consulting professionals also costs extra time and money. In
those circumstances it would be beneficial to use the proposed technique for
monitoring large numbers of plants.
Disadvantages of Existing System
1. Diseases can be identified only by human experts
2. The procedure is slow
3. Significant time and physical space are consumed
4. The cost is high
Proposed System
The primary aim of this research is to identify plant diseases. Plant diseases are
detected using feature extraction, segmentation, and classification techniques.
Pictures of leaves from various plants are taken using a digital camera or a similar
device, and the images are then used to classify the damaged areas of the leaves. A
convolutional neural network and a deep neural network are utilised to detect plant
disease in the proposed framework. This study proposes an open-source, low-cost
software system for the precise diagnosis of plant diseases.
Advantages:
1. Quickly collects pertinent datasets
2. Graphical representation
4. Implementation
Pre-Processing Phase
Photographs of the plant leaves must be captured before any further processing. The
RGB photographs of the plant leaves were captured with a digital camera at a pixel
resolution of 568x1020. 75 data samples were collected, covering five different plant
diseases. Leaf images from a library created in MATLAB are converted to grayscale and
entered into the model.
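
The grayscale conversion mentioned above can be sketched in Python with NumPy. This is a minimal illustration and not the MATLAB routine used in the study: the luma weights are the standard ITU-R BT.601 values, the function name `rgb_to_gray` is hypothetical, and the input is a random array standing in for a photograph (568 rows by 1020 columns is an assumed reading of the stated resolution).

```python
import numpy as np

def rgb_to_gray(rgb):
    """Convert an H x W x 3 uint8 RGB image to an H x W uint8 grayscale image."""
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights
    gray = rgb.astype(np.float64) @ weights    # weighted sum over the colour axis
    return np.round(gray).astype(np.uint8)

# Hypothetical stand-in for a captured leaf photograph (random pixels).
rng = np.random.default_rng(0)
leaf_rgb = rng.integers(0, 256, size=(568, 1020, 3), dtype=np.uint8)

leaf_gray = rgb_to_gray(leaf_rgb)
print(leaf_gray.shape, leaf_gray.dtype)
```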
Image Segmentation
The photos are divided into regions by the image segmentation method. The two
primary categories of image segmentation methods are threshold-based segmentation
and region-based segmentation. As the next stage, the useful segments
must be retrieved. Not every segment carries a substantial amount of information. For
further analysis, only the segment that accounts for more than 50% of the data is
considered. The processing steps are shown in a generic block diagram. In this work,
the images are segmented using the region-based k-means segmentation technique, since
it is more noise-resistant and performs better in homogeneous regions. Segmenting the
supplied data set in this way produces clusters of pixels.
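
The k-means segmentation step can be sketched as follows. This is a plain NumPy version of Lloyd's algorithm applied to pixel colours, offered as an illustration under stated assumptions rather than the implementation used in the study: the function names, the choice of k = 2, and the synthetic two-tone "leaf" image are all hypothetical.

```python
import numpy as np

def nearest_centre(pixels, centres):
    """Index of the closest centre (squared Euclidean distance) for each pixel."""
    dists = ((pixels[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_segment(pixels, k=2, n_iter=20, seed=0):
    """Cluster an (N, C) array of pixel values into k groups."""
    rng = np.random.default_rng(seed)
    pixels = pixels.astype(np.float64)
    # Initialise the centres from k randomly chosen pixels.
    centres = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_centre(pixels, centres)      # assignment step
        for j in range(k):                            # update step
            members = pixels[labels == j]
            if len(members) > 0:
                centres[j] = members.mean(axis=0)
    return nearest_centre(pixels, centres), centres

# Synthetic 40 x 40 image: green left half, brown right half, plus noise.
rng = np.random.default_rng(42)
image = np.empty((40, 40, 3))
image[:, :20] = (40, 160, 40)
image[:, 20:] = (120, 80, 30)
image += rng.normal(0.0, 5.0, image.shape)

labels, centres = kmeans_segment(image.reshape(-1, 3), k=2)
segmented = labels.reshape(40, 40)
print(np.unique(segmented[:, :20]), np.unique(segmented[:, 20:]))
```

On real photographs the clusters are far less clean, and both k and the colour space used for clustering would need to be chosen by experiment.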
Feature Extraction
Many different attributes represent the distinguishing qualities of the leaves. Some
leaves are recognised by their distinct shapes, while others are distinguished by their
surface texture.
GLCM Algorithm:
This approach to texture analysis, first proposed by Haralick in 1973, is still one of
the most widely used. The fundamental idea of the technique is to generate features
based on a grey level co-occurrence matrix (GLCM), from which the co-occurrence
features are extracted. As the GLCM method at the time required only 13 features, only
13 features were utilised in the entire project. The procedure is as follows:
• Determine the size, in pixels, of the matrix in which the data is stored.
• Enter the pixel counts in the matrix P[i, j].
• Use the histogram technique to determine how similar the pixels in the matrix are.
• Normalise the elements of the matrix by dividing them by the total count.
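
The GLCM steps listed above can be sketched in NumPy as follows. This is a simplified illustration, not the study's code: it builds a non-symmetric matrix for a single non-negative pixel offset and computes only three of the Haralick-style features (contrast, energy, homogeneity). The function names, the quantisation into `levels` grey levels, and the 4 x 4 demo image are assumptions made for the example.

```python
import numpy as np

def glcm(gray, levels=8, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for one offset (dx, dy >= 0)."""
    # Quantise 0..255 intensities into `levels` grey levels.
    q = (gray.astype(np.int64) * levels) // 256
    h, w = q.shape
    ref = q[0:h - dy, 0:w - dx]   # reference pixels
    nbr = q[dy:h, dx:w]           # neighbours at the given offset
    P = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(P, (ref.ravel(), nbr.ravel()), 1)   # count co-occurring pairs
    return P / P.sum()                            # normalise to probabilities

def texture_features(P):
    """Three common texture features computed from a normalised GLCM."""
    i, j = np.indices(P.shape)
    return {
        "contrast": float(((i - j) ** 2 * P).sum()),
        "energy": float((P ** 2).sum()),
        "homogeneity": float((P / (1.0 + (i - j) ** 2)).sum()),
    }

# Tiny demo image with four grey levels (0, 64, 128, 192).
demo = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [0, 2, 2, 2],
                 [2, 2, 3, 3]]) * 64

P = glcm(demo, levels=4, dx=1, dy=0)   # horizontal neighbour, distance 1
features = texture_features(P)
print(features)
```

In practice the matrix is often made symmetric and averaged over several offsets and directions before features are computed; that refinement is omitted here for brevity.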
Classifier
The co-occurrence features of the leaves are extracted and compared, then saved in the
feature library for use in the classification process. Two classifiers are described
in this study: the proposed KNN classifier and the existing support vector machine
(SVM) classifier. SVMs are a set of related supervised learning methods used for both
classification and regression. A support vector machine constructs a hyperplane or
set of hyperplanes in a high- or infinite-dimensional space, which can be used for
classification, regression, or other tasks. K-NN is likewise a supervised learning
classifier.
The nearest-neighbour model is one of the easiest machine learning models to
understand. Consider for a moment the different kinds of labelled samples that exist.
Homogeneous in
nature is the idea of grouping similar objects together. Now suppose a label needs to
be assigned to an item that is unlabelled. The K-nearest neighbours method is well
suited to this, since it keeps track of every class that is currently available and
classifies a new item according to whichever category earns the most votes. KNN is
thus a viable way of assigning an unlabelled object to a known category. The chosen
value of K is the main factor determining the accuracy of the k-NN algorithm. A
further advantage of KNN is that it is a non-parametric algorithm and makes no
assumptions about the data being considered.
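
The voting rule described above can be sketched in a few lines of Python. The function `knn_predict` and the toy two-dimensional feature vectors are hypothetical and serve only to show how an unlabelled item takes the majority label of its K nearest neighbours.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Label x by majority vote among its k nearest training samples."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances to x
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D feature vectors (hypothetical, not taken from the paper).
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array(["healthy", "healthy", "healthy",
                    "diseased", "diseased", "diseased"])

print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))   # healthy
print(knn_predict(X_train, y_train, np.array([4.9, 5.0])))   # diseased
```

A small odd K avoids ties in two-class problems; larger values smooth the decision boundary at the cost of blurring small classes.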

Figure 1: Data Flow Diagram

Figure 2: RGB to HSV conversion of leaf
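
The RGB to HSV conversion in Figure 2 can be illustrated with Python's standard `colorsys` module. The sketch below flags pixels whose hue lies outside an assumed "healthy green" band; the band limits, the saturation threshold, and the function name `diseased_mask` are illustrative assumptions, not values reported in this study.

```python
import colorsys

import numpy as np

def diseased_mask(rgb, green_hue=(0.17, 0.45), min_saturation=0.2):
    """Boolean mask that is True where a pixel is NOT in the assumed green band."""
    h, w, _ = rgb.shape
    mask = np.zeros((h, w), dtype=bool)
    for r in range(h):
        for c in range(w):
            red, green, blue = rgb[r, c] / 255.0
            # colorsys returns hue, saturation and value, each in [0, 1].
            hue, sat, _val = colorsys.rgb_to_hsv(red, green, blue)
            is_green = green_hue[0] <= hue <= green_hue[1] and sat >= min_saturation
            mask[r, c] = not is_green
    return mask

# 2 x 2 demo: three healthy-green pixels and one brown (lesion-like) pixel.
green_px, brown_px = (40, 160, 40), (120, 80, 30)
leaf = np.array([[green_px, brown_px],
                 [green_px, green_px]], dtype=np.uint8)

mask = diseased_mask(leaf)
print(mask)
```

Because background pixels would also fall outside the green band, a real pipeline would first isolate the leaf, for example with the segmentation step, before applying such a mask.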

Sample code for detecting plant leaf diseases using a support vector machine (SVM)
classifier in Python:

# Import required libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix

# Load dataset (assumed layout: feature columns first, class label in the last column)
data = pd.read_csv('plant_leaf_dataset.csv')

# Separate the features (all columns but the last) from the labels (last column)
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values

# Split data into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train SVM classifier with a linear kernel
classifier = SVC(kernel='linear', random_state=0)
classifier.fit(X_train, y_train)

# Predict test set results
y_pred = classifier.predict(X_test)

# Evaluate performance: confusion matrix, then per-class precision, recall and F1
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
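
To make the evaluation step concrete, the sketch below derives accuracy, precision, recall, and F1 score directly from a confusion matrix laid out as scikit-learn's `confusion_matrix` returns it (rows are true classes, columns are predicted classes). The function name `per_class_metrics` and the 2 x 2 example matrix are hypothetical; the numbers are not results from this study.

```python
import numpy as np

def per_class_metrics(cm):
    """Accuracy plus per-class precision, recall and F1 from a confusion matrix
    whose rows are true classes and whose columns are predicted classes."""
    cm = np.asarray(cm, dtype=np.float64)
    tp = np.diag(cm)                 # correctly classified samples per class
    predicted = cm.sum(axis=0)       # column sums: times each class was predicted
    actual = cm.sum(axis=1)          # row sums: true samples per class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1

# Made-up example: 10 samples of each class, 3 misclassified in total.
example_cm = [[8, 2],
              [1, 9]]
accuracy, precision, recall, f1 = per_class_metrics(example_cm)
print(accuracy, precision, recall, f1)
```

A low F1 score alongside a high accuracy usually points to many false negatives or false positives in a minority class, which is why both figures are worth reporting.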

5. Conclusion & Future Work


In this study, methods for automatic computerised disease diagnosis in plants
using image segmentation and classification algorithms are presented. The algorithms
were tested on three diseases: Early Blight, Late Blight, and Bacterial Spot.
Identifying diseases is the main objective of the proposed technique. The HSV
transformation method was quite successful in detecting the diseased area of the
leaf, and the extraction of texture information from the GLCM considerably improved
its accuracy. As a result, the proposed approach produces accurate plant leaf disease
detection with low computational complexity. Our primary objective in this article
was to compare the KNN classifier and the SVM classifier in order to show that the
former is better at identifying plant leaf diseases than the latter. Early Blight,
Downy Mildew, White Fly, and Leaf Miner were among the five
distinct plant diseases on which the proposed algorithm was tested. Examining other
unstudied leaf diseases was another goal. According to the experimental results, the
proposed method is 98.56% accurate, compared to 97.6% for the existing system. Future
research can concentrate on enlarging the data set and improving accuracy.
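
A comparison of the KNN and SVM classifiers of the kind described above could be set up as in the following sketch, which uses 5-fold cross-validation in scikit-learn. The data is synthetic (two Gaussian clusters with 13 features), so the scores it prints are not the accuracies reported in this paper, and the model settings are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for extracted leaf features (NOT the paper's data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (60, 13)),
               rng.normal(1.5, 1.0, (60, 13))])
y = np.array([0] * 60 + [1] * 60)

models = {
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "SVM (linear)": SVC(kernel="linear", random_state=0),
}

# Mean accuracy over 5 stratified folds for each classifier.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}

for name, score in scores.items():
    print(f"{name}: mean accuracy {score:.3f}")
```

Evaluating both classifiers on identical folds keeps the comparison fair; on a data set as small as 75 samples, the spread across folds is worth reporting alongside the mean.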

References
1. P. R. Rothe and R. V. Kshirsagar (2015), "Cotton Leaf Disease Identification
using Pattern Recognition Techniques", International Conference on Pervasive
Computing (ICPC).
2. Godliver Owomugisha, John A. Quinn, Ernest Mwebaze and James Lwasa (2014),
"Automated Vision-Based Diagnosis of Banana Bacterial Wilt Disease and Black Sigatoka
Disease", Proceedings of the 1st International Conference on the Use of Mobile ICT in
Africa.
3. R. Arumugam and M. Rajathi, "A Markov Model for Prediction of Corona Virus
COVID-19 in India - A Statistical Study", Journal of Xidian University,
Vol. 14, Issue 4, pp. 1422-1426.
4. R. Arumugam and M. Rajathi (2017), "Applications of Manpower with various stages in
Business using Stochastic models", International Journal of Recent Trends in
Engineering and Research, Vol. 3, Issue 1, pp. 95-100.
5. S. Yun, W. Xianfeng, Z. Shanwen and Z. Chuanlei (2015), "PNN based crop disease
recognition with leaf image features and meteorological data", International Journal
of Agricultural and Biological Engineering, Vol. 8, No. 4, p. 60.
6. A. Caglayan, O. Guclu and A. B. Can (2013), "A plant recognition approach using
shape and color features in leaf images", International Conference on Image Analysis
and Processing, pp. 161-170, Springer, Berlin, Heidelberg.
7. P. Wang, K. Chen, L. Yao, B. Hu, X. Wu, J. Zhang, et al. (2016), "Multimodal
classification of mild cognitive impairment based on partial least squares".
8. X. Zhen, Z. Wang, A. Islam, I. Chan and S. Li (2014), "Direct estimation of cardiac
bi-ventricular volumes with regression forests", Medical Image Computing and
Computer-Assisted Intervention–MICCAI 2014.
9. Shima Ramesh, Ramachandra Hebbar, Niveditha M, Pooja R, Prasad Bhat N, Shashank N
and P V Vinod (2018), "Plant Disease Detection Using Machine Learning", International
Conference on Design Innovations for 3Cs Compute Communicate Control, IEEE, April
2018, ISBN 978-1-5386-7523-6, DOI 10.1109/ICDI3C.2018.00017.
10. S. S. Sannakki and V. S. Rajpurohit (2015), "Classification of Pomegranate Diseases
Based on Back Propagation Neural Network", International Research Journal of
Engineering and Technology (IRJET), Vol. 2, Issue 02.
11. M. Rajathi and R. Arumugam (2019), "Applications of Mobile Learning in the Higher
Educational Institutions Through Statistical Approach", International Journal of
Recent Technology and Engineering, Vol. 8, Issue 1, pp. 1431-1439.
