
UNIVERSITY OF ZIMBABWE

FACULTY OF ENGINEERING

DEPARTMENT OF ELECTRICAL ENGINEERING

MASTER OF SCIENCE IN COMMUNICATION ENGINEERING

Android Application for Crop Disease Diagnosis using image processing and
deep learning (Smart Agriculture)

Name: Katsande Munashe Brian

Registration Number: R109881C

Year: 2019

Supervisor: Dr Munochiveyi

Date of Submission: 17 July 2020


UNIVERSITY OF ZIMBABWE

DEPARTMENT OF ELECTRICAL ENGINEERING

The undersigned certify that this dissertation, prepared by Munashe Brian Katsande and
titled: Android Application for Crop Disease Diagnosis using image processing and deep
learning, meets the requirements of the Department of Electrical Engineering for the award of
the Master of Science (MSc) Degree in Communication Engineering.

------------------------------------------------ --------------------------------------------

Dr. Munochiveyi DATE

(Supervisor)

----------------------------------------------- ----------------------------------------

Dr. Marisa DATE

(Chairperson of Department)

-------------------------------------------- ------------------------------------

EXTERNAL EXAMINER DATE


Declaration

I declare that the work presented in this dissertation is my own and that no plagiarism has been committed.

________________________________

(Signature)

________________________________

(Name of the student)

_________________________________

(Student No)

Date: _________
Abstract

Plant diseases are a major threat to food security worldwide. A faster and more accurate approach
to the detection and diagnosis of diseases in crops would go a long way towards helping farmers save their
crops and increase yield. Recent developments in smartphone technology and deep neural networks
have allowed researchers to develop accurate and easy-to-use systems to help farmers in this
regard. In this dissertation, we developed an Android-based cotton crop disease detector using
deep convolutional networks and image processing. We made use of transfer learning with a
pretrained Inception v3 model: additional layers were added to the pretrained model and trained on
our dataset. The trained model was finally integrated into an Android mobile app, and experimental
results on the developed model achieved an average accuracy of 83%.
Acknowledgements

First and foremost, I would like to thank the LORD Almighty God for making all this possible,
for being my guide through this research.

My special thanks also go to my supervisor Dr Munochiveyi and the entire team at University of
Zimbabwe Electrical Engineering department for their mentorship during this program.

I would also like to extend my acknowledgement to my family and colleagues for the support
and invaluable assistance.
Table of contents


1.1 Introduction ...................................................................................................................... 1

1.2 Problem statement ............................................................................................................ 2

1.2.1 Aim ........................................................................................................................... 3

1.2.2 Objectives ................................................................................................................. 3

1.3 Dissertation Organization ................................................................................................. 3


2.1 Background and Literature Review.................................................................................. 4

2.2 Background ...................................................................................................................... 4

2.2.1 Deep learning approach ............................................................................................ 5

2.2.2 Convolutional neural networks (CNN) ..................................................................... 5

2.2.3 Hyperparameter tuning ........................................................................................... 11

2.2.4 Validation in deep learning ..................................................................................... 12

2.2.5 Deep learning architectures..................................................................................... 16

2.2.6 Transfer Learning........................................................................................... 20

2.3 Related work .................................................................................................................. 21


3.1 Methodology .................................................................................................................. 25

3.1.1 Research procedure ................................................................................................. 25

3.1.2 Building of the Dataset ........................................................................................... 25

3.1.3 Preprocessing of Images ......................................................................................... 29

3.1.4 Neural Network Design .......................................................................................... 30

3.1.5 Training of Neural Network.................................................................................... 33

3.1.6 Mobile application development............................................................................. 40



4.1 Results ............................................................................................................................ 44

4.1.1 Model training results ............................................................................................. 44

4.1.2 Android Mobile Application ................................................................................... 48


5.1 Conclusions .................................................................................................................... 51

5.2 Future Work ................................................................................................................... 52

References ..................................................................................................................................... 53

Appendix ....................................................................................................................................... 57
List of Figures

Figure 2:1 CNN Layers, source Stanford University [8]. ............................................................... 5


Figure 2:2: CNN architecture that classifies input images. Source Matlab [8] .............................. 6
Figure 2:3:Convolving an image with a filter, Source Stanford University [9] ............................. 7
Figure 2:4 Zero Padding ................................................................................................................. 8
Figure 2:5: Rectified Linear Unit .................................................................................................... 8
Figure 2:6: Max Pooling Operation [10] ........................................................................................ 9
Figure 2:7 Overfitting and underfitting......................................................................................... 10
Figure 2:8 Early stopping technique, Tuan-Ho Le [14] ................................................................ 11
Figure 2:9 Hold-out Validation ..................................................................................................... 13
Figure 2:10 Leave-One-Out cross validation ................................................................................ 13
Figure 2:11 K-Fold cross validation ............................................................................................. 14
Figure 2:12 Stratified K-Fold cross Validation ............................................................................ 15
Figure 2:13 Cross validation for Time series ................................................................................ 16
Figure 2:14: LeNet-5 architecture. Source: LeCun et al 1998 [17] ............................................ 17
Figure 2:15: AlexNet architecture Source:AlexNet 2012[18] ...................................................... 17
Figure 2:16 GoogleNet Architecture [20] ..................................................................................... 18
Figure 2:17 VGGnet Architecture [21] ......................................................................................... 18
Figure 2:18 ResNet Architecture [23]........................................................................................... 19
Figure 3:1 Research procedure ..................................................................................................... 25
Figure 3:2 Cotton leaves affected by Alternaria leaf spot, Source: PlantVillage[34] .................. 26
Figure 3:3 Cotton leaves affected by bacterial blight, Source: Clemson University[35] ............. 27
Figure 3:4 Dataset structure .......................................................................................................... 29
Figure 3:5: Deep neural network .................................................................................................. 31
Figure 3:6 Process flow diagram .................................................................................................. 33
Figure 3:7 : Portion of a diseased cotton leaf ............................................................................... 37
Figure 3:8 Mobile application flowchart ...................................................................................... 40
Figure 3:9: Application GUI layout .............................................................................................. 41
Figure 3:10: Android Studio ......................................................................................................... 42
Figure 3:11 Mobile Application.................................................................................................... 43
Figure 4:1: Training and validation losses vs epochs ................................................................... 45
Figure 4:2: Training and validation accuracies vs epochs ............................................................ 46
Figure 4:3 Holdout validation results ........................................................................................... 47
Figure 4:4 Model random test ....................................................................................................... 47
Figure 4:5 Mobile application ....................................................................................................... 49
Figure 4:6: Incorrectly classified images ...................................................................................... 50

List of Tables

Table 2.1 Comparison table .......................................................................................................... 20


Table 3.1 : Dataset ........................................................................................................................ 30
List of acronyms

Deep learning …………………………………………..DL

Machine learning…………………………......................ML

Neural Networks………………………………..............NN

Convolutional neural networks…………………………CNN

Artificial intelligence…………………………………....AI
1.1 Introduction

Zimbabwe is a landlocked country with a total land area of about 39 million hectares, of which over
80% is intended for agricultural purposes. Agriculture is the backbone of the country's economy
and the main source of livelihood for most of the population. Farming activities in Zimbabwe
supply 60% of the raw materials required by the industrial sector and contribute 40% of total export
earnings [1]. Zimbabwe's economic growth is therefore highly tied to the performance of its
agricultural sector. Because the agricultural sector is so vital to the country, it is of great
importance to protect crops from any form of threat.

Crop protection plays a key role in safeguarding crop productivity against drought, climate
change, animal pests and diseases. Crop diseases are a major threat to crop production, and their
rapid identification remains difficult in Zimbabwe due to a lack of resources and expertise. The
Food and Agriculture Organization of the United Nations (FAO) estimates that plant pathogens
lead to the loss of 20–40% of food production globally [2]. Early diagnosis of plant disease is
therefore important to avoid damage to crops and to decrease the cost involved in crop production.
Diseases and insect pests are major problems for plant cultivation, and mitigating plant diseases
serves the dual purpose of reducing pesticide use and increasing crop yield.

Farmers often do not have the expertise and tools, such as microscopes and DNA sequencing
equipment, to identify crop diseases properly, and it is prohibitively expensive and time consuming for them to
get assistance from a plant pathologist. Farmers, especially small-scale farmers, end up relying on
naked-eye observation to detect crop diseases. This method is not reliable because considerable
expertise and knowledge are required to identify crop diseases, and some diseases
look very similar, which makes them difficult to distinguish.

To prevent a situation where farmers make wrong decisions based on naked-eye
observations, a less expensive, automated and accurate method is needed. Deep learning
and computer vision solutions offer great opportunities for the automatic recognition of crop
diseases. These technologies play a big role in developing monitoring tools for crop diseases
and in warning communities of possible outbreaks.

Even though many farmers do not have the expertise and tools to detect crop diseases, the
majority of them possess a smartphone. Smartphones offer a novel way to help identify
crop diseases because of their increased computing power, high-resolution displays and
advanced high-definition cameras. Ericsson, a leading provider of Information and Communication
Technology (ICT), forecasts that mobile subscriptions will reach 9.3 billion in 2019, of which
5.6 billion will be smartphone subscriptions [3]. Hence, an Android-based mobile
application that automates the diagnosis of crop diseases by analyzing a picture of a crop
leaf using machine learning is a promising solution.

To this end, we propose an Android-based application that uses a deep learning approach to
identify and diagnose crop diseases. However, we limit our study to identifying and diagnosing cotton
leaf diseases. Cotton is threatened by different types of diseases, such as Alternaria leaf spot
(Alternaria macrospora) and bacterial blight, among others.

1.2 Problem statement

Cotton is one of the most important fiber crops in the world, providing the basic raw material for the
cotton textile industry. Zimbabwe produces approximately 123 000 tons of cotton annually; about 70%
is exported to the international market while 30% is used domestically [4]. Cotton
production in the country is done mainly by smallholder farmers. The semi-arid climate in the
country makes it favorable for peasant farmers to grow cotton. Cotton is mainly grown on
communal farms that are geographically dispersed in rural areas and are operated as family units.

Cotton is considered one of the most important cash crops, so the identification and diagnosis of
crop diseases in the field is critical to prevent yield losses and thereby increase
production. The goals of our proposed work are as follows:

1.2.1 Aim

• To develop an Android Application for Cotton Disease Diagnosis using image processing
and deep learning

1.2.2 Objectives

• Develop an algorithm to perform image processing on digital crop images


• Develop and train a deep learning model to classify plant diseases (bacterial blight and
Alternaria leaf spot) by analyzing an image of a plant leaf
• Integrate the image processing and deep learning algorithms into a usable Android application
that provides farmers with crop disease diagnoses and possible solutions.

1.3 Dissertation Organization

In Chapter 2 we present background information and summarize the work carried out to date to
show that neural networks can indeed provide a very powerful and accurate means of identifying
crop diseases based on leaf symptoms.

Chapter 3 discusses the implementation of the proposed techniques. It discusses in detail the
datasets, preprocessing techniques, deep learning approaches and performance measures used in
this study.

Chapter 4 presents the results of the techniques implemented in Chapter 3, as well as the
challenges encountered and practical possibilities for overcoming them.

Chapter 5 gives a summary of the accomplished work and suggests possible future
research areas in crop disease identification and classification.

2.1 Background and Literature Review

This chapter consists of two sections. The first section begins with an overview of deep
learning, and the second gives a comprehensive literature survey of related work by other
researchers on crop disease diagnosis.

2.2 Background

The main goal of this dissertation is to develop a system to enable farmers to accurately identify
different crop diseases using images taken from smartphone cameras. Over recent years deep
learning has proven to offer great opportunities for image classification and object recognition [5].

The approach involves training a neural network using a dataset of images collected from actual crop leaves
affected by a disease [6]. After the model has been trained, it can be integrated into a mobile
application or deployed on a cloud server for use as a crop disease detector.

Traditional computer vision was mainly based on image processing algorithms and methods. It
was mainly used to extract image features such as corners, edges and the color of objects.
The main challenge with this traditional approach for a classification task is that you must choose
which features to look for in each given image.

It becomes hard to cope when the number of features or classes becomes high. The
introduction of deep learning (DL) has pushed the limits of what is possible in image
classification. There is no longer a need to manually choose features. Simply put, if you want
a deep learning network to classify objects, you do not explicitly tell it which features to look for
in an image. You show it thousands of images and it trains itself to classify the given
images without hand-crafted feature engineering.

2.2.1 Deep learning approach

Deep learning (DL) is a technique for implementing machine learning (ML) in artificial
intelligence (AI). It was inspired by our understanding of the human brain, which uses
interconnections between neurons to learn, hence the name neural networks (NN) [7]. Neural
networks have a variety of applications in computer vision such as image classification, speech
recognition, face recognition and the analysis of big data. Increases in computational power and
the availability of more training data have made deep learning networks very successful and
have allowed researchers to develop and train neural networks more efficiently.

2.2.2 Convolutional neural networks (CNN)

Convolutional neural networks are currently one of the most prominent algorithms for deep
learning with image data. They have proven to deliver outstanding results in several applications,
most notably in image-related tasks. CNNs consist of multiple layers of connected neurons which
can learn features automatically from datasets.

CNNs use raw images as input to learn certain features and consist of an input layer, several
hidden layers and an output layer.

Figure 2:1 CNN Layers, source Stanford University [8].

Figure 2:1 shows a CNN in which the red input layer, consisting of an image, is transformed into a
3D arrangement. The height and width of the hidden layer correspond to the dimensions of the image,
and the depth consists of the three RGB channels.

A convolutional neural network consists of several principal types of layer, as shown in Figure 2:2.
The main building blocks are as follows:

1. Convolutional layer
2. Rectified Linear Unit Layer
3. Pooling Layer
4. Fully Connected Layer

Figure 2:2: CNN architecture that classifies input images. Source Matlab [8]

Convolutional layer

The name “Convolutional Neural Network” indicates that the network makes use of the convolution
operation. The primary purpose of this layer is to extract features from the input image. Figure
2:3 shows an example of how convolution works in this layer. Consider an image of dimensions 5
x 5 whose pixel values can be either 0 or 1. A filter (also called a kernel or feature detector) of
dimensions 3 x 3 is slid over the image, and at each position the values in the filter are multiplied
with the original pixel values of the image and summed.

Figure 2:3:Convolving an image with a filter, Source Stanford University [9]

After sliding the filter over the image, the result of the computation is a 3 x 3 matrix called
the activation map or feature map.
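To make the operation concrete, the short NumPy sketch below convolves a 5 x 5 binary image with a 3 x 3 filter using a stride of 1 and no padding, producing a 3 x 3 feature map; the image and filter values are illustrative and not taken from the figures above.

import numpy as np

# Illustrative 5 x 5 binary image and 3 x 3 filter (values are made up)
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

def convolve2d(img, k, stride=1):
    # Slide the filter over the image and sum the element-wise products at each position
    kh, kw = k.shape
    out_h = (img.shape[0] - kh) // stride + 1
    out_w = (img.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=int)
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * k)
    return out

print(convolve2d(image, kernel))   # prints the 3 x 3 feature map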

The convolution layer has the following hyperparameters:

1. Kernel or filter size

A kernel or filter is used to extract features from an input image.

2. Stride

The stride S denotes the number of pixels by which the filter window moves over the input
matrix. It controls how the filter convolves over the input: for example, when the stride is 1, the
filter moves one pixel at a time.

3. Padding

Padding involves symmetrically adding values around the input image matrix so that features at
the borders are preserved. The most common type of padding is zero padding, which adds zeroes to the
input matrix and allows the size of the output to be adjusted to your requirements.

Figure 2:4 Zero Padding

Figure 2:4 shows an input matrix of 32 x 32 x 3 with two borders of zero padding applied
around it to give a 36 x 36 x 3 padded matrix.
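The output dimensions of a convolutional layer follow directly from the filter size K, padding P and stride S applied to an input of width W: O = (W - K + 2P)/S + 1. The short sketch below (an illustration, not taken from the text) confirms that a 5 x 5 filter with a padding of 2 and a stride of 1 keeps a 32 x 32 input at 32 x 32.

def conv_output_size(w, k, p, s):
    # Spatial output size of a convolution: (W - K + 2P) / S + 1
    return (w - k + 2 * p) // s + 1

print(conv_output_size(32, 5, 2, 1))   # 32: the spatial size is preserved
print(conv_output_size(32, 3, 0, 1))   # 30: without padding the output shrinks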

Rectified Linear Unit Layer

The rectified linear unit (ReLU) layer is an additional operation carried out after the convolutional
layer. It replaces all negative values in the feature map with zero and returns positive values
unchanged. It is shown graphically in Figure 2:5 and defined mathematically as

F(x) = max(0, x)     (2:1)

Its main purpose is to introduce nonlinearity into the network. This layer applies an
activation function to the feature map from the convolutional layer and produces the
activation map as its output.

Figure 2:5: Rectified Linear Unit

Pooling Layer

This layer is used to reduce the spatial dimensions of the feature map generated by the
convolutional layer. Less spatial information means better computational performance, and fewer
parameters reduce the chance of overfitting. It also makes the model more robust through
translation invariance with respect to the position of features in the input image.

Figure 2:6: Max Pooling Operation [10]

Figure 2:6 shows the most common pooling approach, max pooling. The operation simply
involves sliding a window over the feature map and taking the maximum value in each window.
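A minimal NumPy sketch of 2 x 2 max pooling with a stride of 2 is given below; the feature-map values are illustrative only.

import numpy as np

# Illustrative 4 x 4 feature map
feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 2],
                        [3, 1, 2, 9],
                        [0, 2, 4, 8]])

def max_pool(fm, size=2, stride=2):
    # Slide a window over the feature map and keep the maximum value in each window
    out_h = (fm.shape[0] - size) // stride + 1
    out_w = (fm.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w), dtype=fm.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = fm[i * stride:i * stride + size,
                           j * stride:j * stride + size].max()
    return out

print(max_pool(feature_map))   # [[6 5]
                               #  [3 9]]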

Fully Connected Layer

After the features have been detected by the combination of the previous layers, the
fully connected layer is attached at the end of the network. It uses the detected features to
classify the input image into the various classes defined by the training dataset. The output of the
previous layers is flattened into a single vector of values, and these values are converted into the
probability that the input belongs to each class. In the case of crop disease identification, the fully
connected layer therefore outputs the probability that an input leaf image is affected by each
disease. An activation function is used at the end to produce a likelihood value between zero and
one for each possible class.
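For multi-class problems such as ours, the activation typically used at this final stage is softmax, which turns the raw class scores into probabilities that sum to one. A small sketch with illustrative scores for the three classes used later in this work:

import numpy as np

def softmax(scores):
    # Convert raw class scores into probabilities that sum to one
    exp = np.exp(scores - np.max(scores))   # subtract the max for numerical stability
    return exp / exp.sum()

# Illustrative scores for (bacterial blight, Alternaria leaf spot, healthy)
print(softmax(np.array([2.0, 1.0, 0.1])))   # approximately [0.66, 0.24, 0.10]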

Possible challenges with deep learning

Overfitting and underfitting are the main challenges in deep learning. A good model is one that
generalizes to new and unseen data, not only to the training data; generalization is a measure of how
well a model performs on unseen data [11]. Overfitting occurs when the model fits the training data too
well, effectively memorizing the existing data, as shown in Figure 2:7, which causes high
variance [12].

Figure 2:7 Overfitting and underfitting

Underfitting occurs when we have too little data to train the model relative to the number of
features, which causes it to perform poorly even on the training set, as shown in Figure 2:7.
Therefore, a tradeoff between underfitting and overfitting is needed.

Addressing Overfitting

To overcome the effects of overfitting and underfitting, techniques have been developed to
improve the performance of the model. These techniques are called regularization techniques and
include the following [13]:

1. Train with more data.

A larger training set helps the model to generalize better and reduces overfitting.

2. Early stopping

In this method the data is split into a training set and a validation set; during training, as soon as
the performance on the validation set starts getting worse while the training performance keeps
improving, training of the model is stopped immediately, as shown in Figure 2:8.

Figure 2:8 Early stopping technique, Tuan-Ho Le [14]

3. Dropout

This is the most frequently used regularization technique in the field of deep learning. Dropout
involves ignoring a set of neurons, chosen at random, during the training phase [14].

4. Data augmentation

This consists of transforming the geometry or intensity of the original images to make
them appear as new images. The operations include rotation, zooming, mirroring, cropping,
adjusting contrast and brightness values, as well as simulating new backgrounds.

2.2.3 Hyperparameter tuning

In order to enhance model performance, there is a set of hyperparameters that needs tuning. A
hyperparameter is any configurable variable whose value is set before training starts.
Tuning these hyperparameters can lead to a massive improvement in the
accuracy of the model [15]. Below, we briefly survey the main hyperparameters.

1. Learning rate

The learning rate controls how much the weights are updated at the end of each batch in the
optimization algorithm.

2. Number of epochs

The number of epochs is the number of times the learning network goes through the entire
training dataset. This number is increased until the validation and training errors of the model
have been sufficiently minimized.

3. Batch size

The entire dataset cannot be passed into the deep learning network all at once, so it is
divided into a number of batches or sets. The number of samples in each set is called the
batch size.

4. Number of hidden layers

Adding more layers in a model has been proven to generally improve accuracy to a certain
degree depending on the problem[16].

5. Weight initialization

It is necessary that initial weights are set for the first feed-forward pass. Two techniques are
generally used to initialize the parameters: zero initialization and random initialization.

6. Regularization

An appropriate regularization technique should be chosen to avoid overfitting in deep neural
networks. As discussed in the previous section, dropout is a preferred regularization method,
which involves ignoring randomly chosen neurons during the training phase.
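In Keras, most of these hyperparameters appear directly as arguments when a model is compiled and fitted. The fragment below is a generic sketch on random placeholder data, not code from this project; it simply shows where the learning rate, number of epochs, batch size and dropout regularization are set.

import numpy as np
import tensorflow as tf

# Tiny illustrative model; layer sizes and data shapes are placeholders
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(100,)),
    tf.keras.layers.Dropout(0.2),                  # regularization (dropout)
    tf.keras.layers.Dense(3, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),   # learning rate
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Random placeholder data, used only to show where epochs and batch size are configured
x_train = np.random.rand(500, 100)
y_train = tf.keras.utils.to_categorical(np.random.randint(0, 3, 500), 3)

model.fit(x_train, y_train,
          epochs=10,              # passes over the entire training set
          batch_size=32,          # samples per gradient update
          validation_split=0.1)   # portion of the data held out for validation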

2.2.4 Validation in deep learning

Model validation or testing is a process used to estimate the performance or accuracy of deep
learning models [15]. It is conducted after training has been completed to determine how
well the model will perform in the real world. It is not good practice to test a model on the same
dataset used for training, so, to know the actual performance of the model, it should be tested on an
unseen dataset, usually referred to as the test set.

Types of validation methods

1. Holdout Validation

This is a simple approach in which the entire dataset is divided into three parts (training,
validation and test sets). The model is trained on the training data, fine-tuned using the
validation data and tested on the test set.

Figure 2:9 Hold-out Validation

This method is mainly used when we only have one model to evaluate.

2. Leave-One-Out cross validation

In this method only one instance of the data is kept as test data and the model is trained on the
rest of the data. This process is repeated once for each data point: for example, with 500
instances of data it iterates 500 times, with one instance allocated for testing and the rest for
training, as shown in Figure 2:10.

Figure 2:10 Leave-One-Out cross validation

3. K-Fold cross Validation

K-fold cross validation is an improvement on the holdout validation method. The dataset is
divided into k subsets and the process is as follows:

1. Split the entire dataset randomly into k subsets or folds, as shown in Figure
2:11
2. Train the model on (k-1) subsets of the dataset and then test the model on the kth subset
3. Repeat the process until each of the k folds has been used as a test set
4. Take the average of the accuracies on the k test folds to get the model performance metric. This
is also called the cross-validation accuracy

Figure 2:11 K-Fold cross validation

The main advantage of this method is that it covers all the data instances, so the bias of the
performance estimate is very low.
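A minimal sketch of k-fold cross validation using scikit-learn (assumed available) on randomly generated placeholder data; the classifier here is a simple logistic regression used purely for illustration.

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

# Placeholder data: 100 samples, 4 features, 2 classes (illustrative only)
X = np.random.rand(100, 4)
y = np.random.randint(0, 2, 100)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, test_idx in kf.split(X):
    clf = LogisticRegression()
    clf.fit(X[train_idx], y[train_idx])                  # train on k-1 folds
    scores.append(clf.score(X[test_idx], y[test_idx]))   # test on the remaining fold

print("Cross-validation accuracy:", np.mean(scores))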

4. Stratified K-Fold cross Validation

This method is similar to the k-fold method; the only difference is that it ensures that each fold is
a good representative of the class distribution of the whole dataset.

Figure 2:12 Stratified K-Fold cross Validation

An example of stratified k fold cross validation is shown in Figure 2:12 above.

5. Cross validation for Time series

This is the most sophisticated of the validation methods. It consists of a series of test
subsets, where each corresponding training subset contains only observations that occurred prior to
those in the test subset, as shown in Figure 2:13.

Figure 2:13 Cross validation for Time series

2.2.5 Deep learning architectures

As discussed thus far, CNNs are based on three main types of layer: the convolution and pooling
layers, which act as feature extractors for the input image, and the fully connected layer, which acts
as a classifier. Several architectures have been developed over the past few years, with the number
and arrangement of layers varying from one architecture to another. This section discusses
several deep learning architectures, notably LeNet, AlexNet, GoogLeNet, VGGnet and ResNet.

LeNet

In 1994, one of the very first convolutional neural network architectures was proposed by Yann
LeCun and was named LeNet-5 [17]. This architecture is made up of seven layers: three
convolutional layers, two pooling layers, one fully connected layer and an output
layer.

Figure 2:14: LeNet-5 architecture. Source: LeCun et al 1998 [17]

Its first use case was the automatic classification of hand-written digits on bank cheques, where
LeNet-5 was able to achieve an error rate below 1% on the MNIST dataset.

AlexNet

In 2012, Alex Krizhevsky and his team released an architecture called AlexNet, which was an
advanced version of the LeNet architecture [18]. It consists of 5 convolutional layers
and 3 fully connected layers.

Figure 2:15: AlexNet architecture Source:AlexNet 2012[18]

In that same year, AlexNet won the ImageNet competition by a large margin. It achieved this
state-of-the-art performance because it used the novel ReLU activation, data
augmentation, dropout to reduce overfitting, and local response normalization.

GoogleNet or Inception v1

In 2014, Christian Szegedy and his team at Google published a network known as GoogLeNet [19], which
won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, an image
classification competition. It was able to outperform the other models because it introduced
a concept called inception modules; as a result, the model runs very fast, has about 12 times fewer
parameters than AlexNet and is more accurate. It also has lower power and memory use.

Figure 2:16 GoogleNet Architecture [20]

GoogLeNet has 22 layers in total, as shown in Figure 2:16, and is also called Inception v1. In later
work, Christian Szegedy and the team at Google proposed several upgrades to the
GoogLeNet (Inception v1) model which increased accuracy and reduced computational
complexity. This led to improved versions, namely Inception v2, Inception v3 and Inception v4.

VGGnet

The VGGnet architecture was proposed by Simonyan and Zisserman from the University of Oxford in
2014 [20]. This architecture was the first runner-up of the ImageNet Large Scale Visual
Recognition Challenge 2014 with a 7.3% error rate. VGGnet performed considerably better than
AlexNet by replacing AlexNet's large kernel filters with multiple 3x3 filters.

Figure 2:17 VGGnet Architecture [21]

It has two versions, VGG16 and VGG19. VGG16 is made up of 16 weight layers, consisting of 13
convolutional layers and 3 dense layers for classification. VGG19 consists of a total of 16
convolutional layers and 3 dense layers.

ResNet

In 2015 Microsoft published its own neural network known as ResNet [22]. ResNet was the
winner of the ImageNet Large Scale Visual Recognition Competition 2015 with an error rate of
3.57%.

Figure 2:18 ResNet Architecture [23]

ResNet was designed to allow networks with hundreds or even thousands of convolutional layers. Other
deep learning architectures drop off in effectiveness as additional layers are added, whereas this
architecture can add many layers while maintaining strong performance. It achieves this by using
identity-mapping (residual) building blocks, in which several paths of stacked layers have their
outputs merged with the identity input via addition.

Comparison

Table 2.1 Comparison table

Model        Year   Number of layers   Top-1/Top-5 Error %   Model Description

AlexNet      2012   8                  41.00/18.00           5 conv + 3 fc layers

GoogLeNet    2014   22                 29.81/10.04           21 conv + 1 fc layers

VGG-16       2014   16                 28.07/9.33            13 conv + 3 fc layers

VGG-19       2014   19                 27.30/9.00            16 conv + 3 fc layers

ResNet-50    2015   50                 22.85/6.71            49 conv + 1 fc layers

ResNet-152   2015   152                21.43/3.57            151 conv + 1 fc layers

2.2.6 Transfer Learning

In recent years, most applications of deep learning have relied on transfer learning; researchers
rarely train an entire deep learning network from the ground up. This is because obtaining
many images for a given class of diseases can be complicated, especially in an agricultural
context. Transfer learning is very useful in cases where there is insufficient data in a new
domain for a deep learning network to be trained from scratch.

Transfer learning is a machine learning method in which the knowledge of an already trained
model is applied to a different but related problem [24]. For example, you might only have 600 images of a
crop disease, but by leveraging the knowledge of an existing model such as GoogLeNet, which
has been trained on over a million images, you can use that model either as an initialization or as
a feature extractor for the task of interest. Furthermore, training a deep learning network from
scratch generally requires multiple GPUs, and the process is time-consuming and can
take up to three weeks [25]. This makes transfer learning an even easier and more practical approach to
consider.

There are different strategies for using a pre-trained model in transfer learning [26], as follows:

• Classifier - the pre-trained model is used directly to classify new images.
• Standalone feature extractor - the pre-trained model is used as an image preprocessor and
feature extractor.
• Integrated feature extractor - the pre-trained model is merged into a new model and new
layers are added on top.
• Weight initialization - the pre-trained model is merged into a new model and the layers
of the pre-trained model are trained together with the new model.

2.3 Related work

Crop disease detection is still an active area of research, and several techniques have been
implemented to detect crop diseases. The most common and simplest method used for the detection
of plant diseases is visual inspection by humans [27]. This traditional method is
based on characteristic plant disease symptoms like blights, wilts, rots and lesions. Other
techniques are performed by experts using different methods, including visible-light imaging,
thermal imaging and chlorophyll fluorescence imaging [28]. Traditional methods are not reliable
for detecting disease in crops because of the lack of uniformity and the need for expertise and
experience in the procedure. Differentiating closely related strains may also be difficult with
traditional methods.

The science of crop disease diagnosis has evolved from visual inspection to better techniques
based on the analysis of near-infrared reflectance of crop leaves [29]. These make use of
highly sensitive sensors to measure reflectance, temperature, biomass and fluorescence of crops
within different regions of the electromagnetic spectrum. Although these methods are more
reliable and objective than visual inspection, their classification accuracies can still be low, and
they have limitations that block their effective use for the detection of diseases in crops. The main
limitations include deployment costs, availability, processing speed and real-time diagnostic
capabilities.

Therefore, looking for less expensive, accurate and fast methods to automatically detect
diseases from the symptoms that appear on the plant leaf is of great significance. Techniques
capable of overcoming these challenges are needed to allow for the automation of disease
identification in crops. Computer vision and image processing solutions offer great opportunities

for the automatic recognition of crop diseases [30]. These technologies play a big role in
developing monitoring tools for pests and diseases and in warning communities of possible
outbreaks.

Rothe, P. R. and R. V. Kshirsagar [31] developed a system to identify and classify diseases in
cotton leaves using pattern recognition. The system was able to identify and classify three cotton
leaf diseases, namely Alternaria, bacterial leaf blight and Myrothecium, with an accuracy of 85
percent. The technique involved image preprocessing and segmentation to isolate the leaf spots
from the background; Hu's moments were then extracted as features for training an
adaptive neuro-fuzzy inference system.

Computing enrichment of images using image edges is another technique that has been
proposed for crop disease identification, specifically for cotton [32]. The images first undergo an
enrichment process using edge-detection segmentation techniques, followed by R, G,
B color feature image segmentation, which is carried out to obtain the target disease spots. To recognize
diseases, image features such as boundary, shape, color and texture are extracted from the
disease spots.

Another author developed a system for the diagnosis and classification of plant diseases using
K-means clustering for image segmentation and a neural network for classification [33]. K-means
clustering is used to classify image pixels into K classes based on a set of features, and is
mainly used to minimize the sum of squared distances between all points and the cluster
centers.

The K-means clustering algorithm follows the steps below:

1. Pick K cluster centers, either randomly or based on some heuristic.

2. Assign each pixel in the image to the cluster whose center is nearest to it.

3. Recompute the cluster centers by taking the average of all the pixels in each cluster.

Steps 2 and 3 are repeated until convergence is attained.
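A brief sketch of these steps applied to image pixels using scikit-learn's KMeans; the image here is a random placeholder array rather than data from the cited work.

import numpy as np
from sklearn.cluster import KMeans

# Placeholder leaf image as an H x W x 3 RGB array (random values for illustration)
image = np.random.randint(0, 256, size=(120, 160, 3), dtype=np.uint8)

# Each pixel becomes a 3-dimensional colour feature vector
pixels = image.reshape(-1, 3).astype(float)

# Steps 1-3: pick K centres, assign pixels to the nearest centre, recompute centres until convergence
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)

# Label map: each pixel is assigned to one of the K clusters (segments)
segmented = kmeans.labels_.reshape(image.shape[:2])
print(segmented.shape, np.unique(segmented))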

The efficiency of this algorithm is high, but the need to determine the number of clusters
K in advance makes fully automated use difficult.

Neural networks represent a huge breakthrough in pattern recognition and image classification,
and deep learning techniques have achieved significant success in several plant disease
detection studies. Bouaziz B, Amara J and Algergawy [30] proposed a deep learning-based
approach that automates the classification of banana leaf diseases. They made use of the
LeNet architecture as a deep neural network to classify between healthy and diseased banana
leaves. The researchers obtained encouraging results, which showed that the proposed method can
be used to accurately detect leaf diseases with little computational effort.

Sladojevic S, Arsenovic M, Anderla A, Culibrk D and Stefanovic D [34] developed a new
approach to recognize 13 types of crop diseases using deep learning networks. Experiments
on the developed model achieved an average accuracy of 96.3% on separate class tests.

Fuentes A, Yoon S, Kim SC and Park DS [35] also developed an approach based on deep
learning networks. They combined VGGnet and Residual Network (ResNet) with three
families of detectors, namely Faster Region-based Convolutional Neural Network (Faster R-CNN),
Region-based Fully Convolutional Network (R-FCN) and Single Shot Multibox Detector
(SSD), to recognize nine different tomato diseases and pests. The model was trained on a large
Tomato Diseases and Pests dataset, and experimental results showed that the proposed system
could effectively recognize the nine different types of diseases and pests with high accuracy.

Fujita E, Kawasaki Y, Uga H, Kagiwada S and Iyatomi H [36] developed a new practical
cucumber disease detection system using CNNs consisting of an input layer, convolutional
layers, pooling, local response normalization operations and an output layer. The system
attained an average accuracy of 82.3% under a 4-fold cross-validation strategy.

Mohanty SP, Hughes DP and Salathé M [37] evaluated the applicability of the AlexNet [18] and
GoogLeNet [19] deep learning models to the crop disease classification problem. They analyzed the
performance of these deep learning architectures on the PlantVillage dataset by training the models
from scratch in one case, and by leveraging already trained models using transfer learning in the
other. The overall accuracy obtained on this dataset was 85.53% when training from scratch and
99.34% in the case of transfer learning, showing strong promise of the deep learning
approach for similar prediction problems.

From the literature survey conducted in this section, we can conclude that convolutional neural
networks (CNNs) have achieved impressive results in the field of image classification. Hence, in
this dissertation we investigate the application of a deep learning-based approach to detect
diseases in cotton crops. In the next sections, we explain in detail the proposed method as
applied to cotton crop leaves.

3.1 Methodology

This chapter describes the entire procedure of developing the model for cotton crop disease
recognition using deep learning. This includes the design specifics and the entire process in
detail. The proposed solution and assumptions are thoroughly developed, starting with dataset
collection.

3.1.1 Research procedure

The following approach to the research was taken:

Figure 3:1 Research procedure

3.1.2 Building of the Dataset

To train a deep learning network for our research problem, a dataset of diseased and healthy
cotton leaf images had to be acquired. The dataset was put together by downloading high-quality
images of diseased cotton leaves from various sources on the internet and by taking pictures of
leaves with a mobile phone camera. Cotton plants are affected by diseases caused by various
pathogenic fungi, bacteria and viruses, as well as by damage from parasitic worms and
physiological disturbances, which are also classified as diseases. Cotton is threatened by
different types of diseases, such as Alternaria leaf spot (Alternaria macrospora) and bacterial
blight, among others. We limited our study to only two main diseases because of time
constraints. A brief description of the diseases of interest is as follows:

1. Alternaria leaf spot - Alternaria macrospora

Alternaria leaf spot is a common fungal disease of cotton [38]. The main symptoms include small,
circular, brown spots with purple margins, varying in size from 1 to 10 mm in diameter, as shown
in Figure 3:2. The spots gradually turn grayish as the cotton plant grows, producing irregular dead
areas on the leaf.

Figure 3:2 Cotton leaves affected by Alternaria leaf spot, Source: PlantVillage[34]

These symptoms are caused by a fungus called Alternaria macrospora, which survives on cotton
residues. The pathogen is air-borne and can also be spread by water splashing onto
healthy plants.

2. Bacterial blight

Bacterial blight is one of the most devastating cotton crop diseases. It is caused by a bacterium
called Xanthomonas citri subsp. malvacearum, which survives in infected crop debris and seeds. It
starts out as angular, waxy, water-soaked leaf spots with a red to brown border on leaves,
stems and bolls. As the plant grows, the spots gradually turn into brown necrotic areas, as shown
in Figure 3:3.

Figure 3:3 Cotton leaves affected by bacterial blight, Source: Clemson University[35]

If these diseases are left untreated, they will kill the plant. However, if they are diagnosed early,
they can be treated and the crop can be saved.

The third classification category is healthy cotton leaves, which allows the model to tell the
difference between a healthy and a diseased cotton leaf.

Dataset Annotation

After collecting the various cotton leaf images, there is a need to organize the images and
associate a label with each image. This involves removing duplicate images, identifying the diseases
on the leaves and organizing the images into folders corresponding to the different classes (i.e.
bacterial blight, Alternaria leaf spot and healthy leaves). This is a repetitive, time-consuming
but necessary step in building up a dataset.

Dataset Division

For the purposes of training and testing the model, three separate datasets are required. In this
process the dataset put together in the previous section was subdivided into the following sets:

1. Training set

This is the actual set used to train the model to learn its hidden parameters such as weights and
biases.

2. Validation set

The validation set is used to evaluate the model during training. It allows the training process to
be monitored, overfitting to be detected and the model hyperparameters, among others the
learning rate, batch size and number of epochs, to be fine-tuned manually.

3. Test set

This set is used when the training phase has been completed to evaluate the performance and
accuracy of the final trained model.

Table 3.1 Dataset Organization

Dataset split ratio

Research conducted on different separation ratios concluded that better results are obtained when
about 80% of the images are used for training and the remaining 20% is split equally
between validation and testing [37][39].
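One simple way to realise such an 80/10/10 split on a folder of images is sketched below using only the Python standard library; the folder names are hypothetical and used for illustration.

import os, random, shutil

def split_class_folder(src_dir, dst_root, cls, ratios=(0.8, 0.1, 0.1), seed=42):
    # Copy the images of one class into train/validation/test folders according to the ratios
    files = sorted(os.listdir(src_dir))
    random.Random(seed).shuffle(files)
    n_train = int(ratios[0] * len(files))
    n_val = int(ratios[1] * len(files))
    splits = {'train': files[:n_train],
              'validation': files[n_train:n_train + n_val],
              'test': files[n_train + n_val:]}
    for split_name, names in splits.items():
        out_dir = os.path.join(dst_root, split_name, cls)
        os.makedirs(out_dir, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(src_dir, name), out_dir)

# Hypothetical usage, one call per class folder:
# split_class_folder('dataset/bacterial_blight', 'dataset_split', 'bacterial_blight')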

Figure 3:4 Dataset structure: each class (healthy, Alternaria leaf spot, bacterial blight) is split into 80% training, 10% validation and 10% test images.

3.1.3 Preprocessing of Images

The next stage after building the dataset was preprocessing of the images. This step is very
important in the deep learning pipeline to make sure the training data is in a standard form before
being fed into a model for training. For example, images must be resized to match the input size
required by the network: 227 × 227 for AlexNet, 224 × 224 for DenseNet, ResNet and VGG, and 299 × 299
for Inception.
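As a short sketch, the Keras image utilities can perform this resizing when an image is loaded; the file path below is a placeholder and the 299 × 299 target matches the Inception input size used later.

import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

img = load_img('leaf.jpg', target_size=(299, 299))   # placeholder path; image is resized on load
x = img_to_array(img) / 255.0                        # scale pixel values to [0, 1]
x = np.expand_dims(x, axis=0)                        # add a batch dimension -> (1, 299, 299, 3)
print(x.shape)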

Data Augmentation

To enrich the dataset, several techniques were used to increase the number and diversity of the
available images. More augmented images increase the chance of the network learning more
features and being able to accurately distinguish one class from another [21]. The
image transformations included resizing, cropping, flipping, rotations, and color, contrast and
brightness enhancements.

The augmentation options used for training are as follows:

• Rotation - rotating an image randomly through various angles
• Brightness - varying the brightness of images to help the model adapt to variations in
lighting during training
• Zoom - scaling the input image by various factors
• Vertical/Horizontal Flip - randomly flipping images vertically and horizontally.
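These options map directly onto Keras's ImageDataGenerator; the parameter values in the sketch below are illustrative rather than the exact settings used during training, and the directory name is a placeholder.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings (exact training values may differ)
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,             # random rotations
    brightness_range=(0.7, 1.3),   # random brightness variation
    zoom_range=0.2,                # random zoom
    horizontal_flip=True,
    vertical_flip=True)

# Placeholder training directory organised as one sub-folder per class
train_generator = train_datagen.flow_from_directory(
    'dataset_split/train',
    target_size=(299, 299),
    batch_size=15,
    class_mode='categorical')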

Dataset composition

Table 3.2: Dataset

Class              Original images   Original and augmented images   Training images   Validation images   Test images
Healthy leaves     256               3870                            3096              387                 387
Bacterial Blight   430               5982                            4785              598                 598
Alternaria         430               5982                            4785              598                 598

3.1.4 Neural Network Design

Having acquired the images and preprocessed them, the next step was to design a model and
train it on those images.

Practical Considerations

In order to start developing the model, there are design choices that need to be taken into
consideration, the main one being the choice of model architecture. As discussed in the
previous chapter, it is now common practice to use a deep neural network that has been pretrained
on a very large dataset and then leverage its knowledge as an initialization for the task of
interest.

We discussed a number of architectures, namely LeNet [17], AlexNet [18], GoogLeNet or
Inception v1-v4 [19], VGGnet [20] and ResNet [22]. In comparison to VGGnet and
ResNet, Inception v3 has proven to use less computational power and to be economical in terms
of memory requirements, which is especially important considering that the model will run on a
mobile phone. For these reasons, Inception v3 was chosen as the architecture for this
research. We use this architecture as a feature extractor but modify and fine-tune it to support our
disease classes.

Model design

We used the layers of the Inception v3 pre-trained model as the feature extraction component of
our new model. The Inception model was loaded without its classifier part, and a new flatten layer,
dense layer, dropout layer and output layer were added, tailored to the requirements of
our new dataset to predict the probabilities of the 3 classes.

Figure 3:5 shows the block diagram design of our model. The block diagram was drawn using
Deep Learning Studio [40].

Figure 3:5: Deep neural network

All code was written in the Python programming language. For the implementation of the deep
neural network, the Keras [41] library, a Python-based deep learning library, was used. Keras runs
on a TensorFlow [42] backend, which was chosen because it offers high-performance numerical
computation. The full code is attached in the appendix section of this report.

The Keras implementation workflow was as follows:

1. Import the required libraries

import glob                                    # used below to count the dataset classes
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator

2. Load the dataset

# train_dir and test_dir hold the paths to the training and test folders;
# get_files() is a small helper (defined in the full code in the appendix) that counts
# the image files in a directory
train_samples = get_files(train_dir)
num_classes = len(glob.glob(train_dir + "/*"))   # one sub-folder per class
test_samples = get_files(test_dir)

print(num_classes, "Classes")
print(train_samples, "Train images")
print(test_samples, "Test images")

3. Instantiate the Inception v3 model and load the pre-trained weights

# Import the pretrained model
from tensorflow.keras.applications.inception_v3 import InceptionV3

# Instantiate the pre-trained model, load the ImageNet weights and drop the top (classifier) layers
pre_trained_model = InceptionV3(weights='imagenet', include_top=False)

4. Add new layers on top of the output of the pretrained Inception v3 model

model = tf.keras.Sequential([
    pre_trained_model,                       # Inception v3 base used as a feature extractor
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(rate=0.2),       # regularization
    tf.keras.layers.Dense(num_classes,       # number of classes counted in step 2
                          activation='softmax',
                          kernel_regularizer=tf.keras.regularizers.l2(0.0001))
])

5. Freeze all layers of the Inception v3 base so that only the new layers are trained

for layer in pre_trained_model.layers:
    layer.trainable = False

6. Train the new layers on your dataset
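A minimal sketch of this final step, assuming train_generator and validation_generator have been created with ImageDataGenerator from the training and validation folders; the hyperparameter values shown are the initial ones discussed in the next section.

# Compile the new model and train only the newly added layers
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_generator,
                    epochs=10,
                    validation_data=validation_generator)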

3.1.5 Training of Neural Network

All experiments in this section for the training of the neural networks were run on an 8th-generation
Intel Core i7 machine with 16 GB of RAM.

Figure 3:6 Process flow diagram

Initial Run

An initial run was done on the model with the pre-trained Inception base frozen. Training was
run with an initial batch size of 15, meaning that 15 image samples were used to train the network
in each step. 10 epochs were set for the initial run and a learning rate of 0.01 was configured
initially.

The results of the first run are shown below; the accuracy at the end of the 10th epoch was about
48%.
Epoch 1/10
4/4 [==============================] - 7s 2s/step - loss: 45.3216 - accuracy: 0.5179 - val_loss: 1.5160 - val_accuracy: 0.5000
Epoch 2/10
4/4 [==============================] - 6s 2s/step - loss: 1.8062 - accuracy: 0.4423 - val_loss: 0.6952 - val_accuracy: 0.5000
Epoch 3/10
4/4 [==============================] - 5s 1s/step - loss: 0.6935 - accuracy: 0.5000 - val_loss: 0.6933 - val_accuracy: 0.5000
Epoch 4/10
4/4 [==============================] - 5s 1s/step - loss: 0.6950 - accuracy: 0.4615 - val_loss: 0.6889 - val_accuracy: 0.5385
Epoch 5/10
4/4 [==============================] - 4s 1s/step - loss: 0.6920 - accuracy: 0.5536 - val_loss: 0.6898 - val_accuracy: 0.4808
Epoch 6/10
4/4 [==============================] - 5s 1s/step - loss: 0.6928 - accuracy: 0.5192 - val_loss: 0.6902 - val_accuracy: 0.4821
Epoch 7/10
4/4 [==============================] - 6s 2s/step - loss: 0.6955 - accuracy: 0.4808 - val_loss: 0.6885 - val_accuracy: 0.6154
Epoch 8/10
4/4 [==============================] - 6s 1s/step - loss: 0.6973 - accuracy: 0.4643 - val_loss: 0.7086 - val_accuracy: 0.4231
Epoch 9/10
4/4 [==============================] - 4s 1s/step - loss: 0.6903 - accuracy: 0.5625 - val_loss: 0.6824 - val_accuracy: 0.5769
Epoch 10/10
4/4 [==============================] - 4s 1s/step - loss: 0.6903 - accuracy: 0.5536 - val_loss: 0.6779 - val_accuracy: 0.4808

During the training phase, the new model's internal weights are automatically updated over
several iterations. With deep neural networks, the optimal hyperparameters for a given model
architecture are not known in advance, so several reruns were conducted after the
initial run, adjusting the learning rate downwards from the initial 0.01. It was observed that the accuracy
continued to increase as the learning rate was decreased.
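The reruns amount to recompiling the model with a smaller learning rate and, as described in the next paragraph, eventually unfreezing the Inception layers for end-to-end fine-tuning. A sketch using the variable names from Section 3.1.4:

# Unfreeze the pre-trained Inception v3 layers for end-to-end fine-tuning
for layer in pre_trained_model.layers:
    layer.trainable = True

# Recompile with a much smaller learning rate before retraining
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_generator,
                    epochs=100,
                    validation_data=validation_generator)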

Once the model had converged on the dataset with the base frozen, the entire model can be
unfrozen and retrained end-to-end with a very low learning rate. So, after a few reruns, the
remaining layers of the model were unfrozen so that they could be retrained once the initial
training of the new fully connected layers had been carried out. The model was retrained with a
learning rate of 0.0001. The accuracy was still trending upwards at the end of the 20th epoch, so
we decided to increase the number of epochs to 100. An accuracy of 94% was realized at the end
of the 100th epoch, as shown below:
Epoch 1/100
4/4 [==============================] - 4s 970ms/step - loss: 1.3901 - accuracy: 0.5000 - val_loss: 0.8051 - val_accuracy: 0.5357
Epoch 2/100
4/4 [==============================] - 3s 858ms/step - loss: 0.7290 - accuracy: 0.5577 - val_loss: 0.7698 - val_accuracy: 0.4615
Epoch 3/100
4/4 [==============================] - 3s 826ms/step - loss: 0.7099 - accuracy: 0.4808 - val_loss: 0.6755 - val_accuracy: 0.5000
Epoch 4/100
4/4 [==============================] - 3s 739ms/step - loss: 0.7009 - accuracy: 0.5192 - val_loss: 0.6062 - val_accuracy: 0.5962
Epoch 5/100
4/4 [==============================] - 3s 657ms/step - loss: 0.7127 - accuracy: 0.5192 - val_loss: 0.6857 - val_accuracy: 0.5000
Epoch 6/100
4/4 [==============================] - 3s 631ms/step - loss: 0.6617 - accuracy: 0.5577 - val_loss: 0.6251 - val_accuracy: 0.6429
Epoch 7/100
4/4 [==============================] - 4s 886ms/step - loss: 0.6557 - accuracy: 0.6250 - val_loss: 0.5577 - val_accuracy: 0.8077
Epoch 8/100
4/4 [==============================] - 3s 815ms/step - loss: 0.5721 - accuracy: 0.7115 - val_loss: 0.6436 - val_accuracy: 0.6538
Epoch 9/100
4/4 [==============================] - 3s 734ms/step - loss: 0.5747 - accuracy: 0.7083 - val_loss: 0.6114 - val_accuracy: 0.8077
Epoch 10/100
4/4 [==============================] - 3s 668ms/step - loss: 0.5524 - accuracy: 0.7679 - val_loss: 0.5184 - val_accuracy: 0.7692
Epoch 11/100
4/4 [==============================] - 3s 672ms/step - loss: 0.4717 - accuracy: 0.7857 - val_loss: 0.3824 - val_accuracy: 0.8393
Epoch 12/100
4/4 [==============================] - 3s 827ms/step - loss: 0.3626 - accuracy: 0.8958 - val_loss: 0.2591 - val_accuracy: 0.8654
Epoch 13/100
4/4 [==============================] - 4s 892ms/step - loss: 0.5225 - accuracy: 0.7143 - val_loss: 0.2240 - val_accuracy: 0.8462
Epoch 14/100
4/4 [==============================] - 3s 809ms/step - loss: 0.5016 - accuracy: 0.7692 - val_loss: 0.6410 - val_accuracy: 0.6538
Epoch 15/100
4/4 [==============================] - 3s 689ms/step - loss: 0.5667 - accuracy: 0.7115 - val_loss: 0.5442 - val_accuracy: 0.8846
Epoch 16/100
4/4 [==============================] - 3s 702ms/step - loss: 0.4755 - accuracy: 0.8036 - val_loss: 0.4380 - val_accuracy: 0.8750
Epoch 17/100
4/4 [==============================] - 4s 875ms/step - loss: 0.4145 - accuracy: 0.7708 - val_loss: 0.5058 - val_accuracy: 0.7885
Epoch 18/100
4/4 [==============================] - 4s 922ms/step - loss: 0.4231 - accuracy: 0.8036 - val_loss: 0.3618 - val_accuracy: 0.9615
Epoch 19/100
4/4 [==============================] - 3s 851ms/step - loss: 0.3340 - accuracy: 0.7885 - val_loss: 0.2095 - val_accuracy: 0.8462

Epoch 20/100
4/4 [==============================] - 3s 700ms/step - loss: 0.3615 - accuracy: 0.8269 - val_loss: 0.8572 - val_accuracy: 0.7692
Epoch 21/100
4/4 [==============================] - 3s 752ms/step - loss: 0.4675 - accuracy: 0.7885 - val_loss: 0.2954 - val_accuracy: 0.8393
Epoch 22/100
4/4 [==============================] - 4s 972ms/step - loss: 0.2378 - accuracy: 0.9423 - val_loss: 0.2364 - val_accuracy: 0.8846
Epoch 23/100
4/4 [==============================] - 4s 944ms/step - loss: 0.2098 - accuracy: 0.9231 - val_loss: 0.1516 - val_accuracy: 0.8077
Epoch 24/100
4/4 [==============================] - 3s 845ms/step - loss: 0.3335 - accuracy: 0.8269 - val_loss: 0.4717 - val_accuracy: 0.7500
Epoch 25/100
4/4 [==============================] - 3s 750ms/step - loss: 0.4870 - accuracy: 0.8393 - val_loss: 0.4265 - val_accuracy: 0.8846
Epoch 26/100
4/4 [==============================] - 3s 747ms/step - loss: 0.3897 - accuracy: 0.8269 - val_loss: 0.1966 - val_accuracy: 0.9821
Epoch 27/100
4/4 [==============================] - 4s 973ms/step - loss: 0.3193 - accuracy: 0.8750 - val_loss: 0.2592 - val_accuracy: 0.9038
Epoch 28/100
4/4 [==============================] - 4s 965ms/step - loss: 0.2445 - accuracy: 0.9231 - val_loss: 0.2284 - val_accuracy: 0.9231
Epoch 29/100
4/4 [==============================] - 4s 933ms/step - loss: 0.1709 - accuracy: 0.8846 - val_loss: 0.2603 - val_accuracy: 0.9038
Epoch 30/100
4/4 [==============================] - 3s 733ms/step - loss: 0.1317 - accuracy: 0.9423 - val_loss: 0.0628 - val_accuracy: 0.9808
Epoch 31/100
4/4 [==============================] - 3s 780ms/step - loss: 0.1961 - accuracy: 0.9286 - val_loss: 0.1056 - val_accuracy: 0.9821
Epoch 32/100
4/4 [==============================] - 4s 1s/step - loss: 0.1253 - accuracy: 0.9792 - val_loss: 0.0545 - val_accuracy: 0.9808
Epoch 33/100
4/4 [==============================] - 4s 1s/step - loss: 0.2667 - accuracy: 0.8846 - val_loss: 0.1180 - val_accuracy: 0.9423
Epoch 34/100
4/4 [==============================] - 4s 934ms/step - loss: 0.1796 - accuracy: 0.9286 - val_loss: 0.0400 - val_accuracy: 1.0000
Epoch 35/100
4/4 [==============================] - 3s 767ms/step - loss: 0.1691 - accuracy: 0.9423 - val_loss: 0.1338 - val_accuracy: 0.9231
Epoch 36/100
4/4 [==============================] - 3s 806ms/step - loss: 0.2065 - accuracy: 0.9615 - val_loss: 0.0853 - val_accuracy: 0.9464
Epoch 37/100
4/4 [==============================] - 4s 1s/step - loss: 0.0710 - accuracy: 0.9615 - val_loss: 0.0410 - val_accuracy: 0.9615
Epoch 38/100
4/4 [==============================] - 4s 1s/step - loss: 0.1896 - accuracy: 0.9423 - val_loss: 0.0509 - val_accuracy: 0.9231
Epoch 39/100
4/4 [==============================] - 4s 1s/step - loss: 0.1560 - accuracy: 0.9107 - val_loss: 0.2445 - val_accuracy: 0.9423
Epoch 40/100
4/4 [==============================] - 3s 815ms/step - loss: 0.1438 - accuracy: 0.9808 - val_loss: 0.1682 - val_accuracy: 0.9423
Epoch 41/100
4/4 [==============================] - 3s 861ms/step - loss: 0.0784 - accuracy: 0.9808 - val_loss: 0.0450 - val_accuracy: 0.9643
Epoch 42/100
4/4 [==============================] - 4s 1s/step - loss: 0.0457 - accuracy: 0.9808 - val_loss: 0.1717 - val_accuracy: 0.9808
Epoch 43/100
4/4 [==============================] - 4s 1s/step - loss: 0.1272 - accuracy: 0.9615 - val_loss: 0.0144 - val_accuracy: 1.0000
Epoch 44/100
4/4 [==============================] - 4s 1s/step - loss: 0.1300 - accuracy: 0.9423 - val_loss: 0.0865 - val_accuracy: 0.9808
Epoch 45/100
4/4 [==============================] - 3s 867ms/step - loss: 0.1603 - accuracy: 0.9464 - val_loss: 0.0300 - val_accuracy: 0.9038
Epoch 46/100
4/4 [==============================] - 3s 863ms/step - loss: 0.1419 - accuracy: 0.9231 - val_loss: 0.0896 - val_accuracy: 0.9643
Epoch 47/100
4/4 [==============================] - 4s 1s/step - loss: 0.1541 - accuracy: 0.9423 - val_loss: 0.0102 - val_accuracy: 1.0000
Epoch 48/100
4/4 [==============================] - 4s 1s/step - loss: 0.1119 - accuracy: 0.9423 - val_loss: 0.0147 - val_accuracy: 0.9423
Epoch 49/100
4/4 [==============================] - 4s 1s/step - loss: 0.0534 - accuracy: 0.9643 - val_loss: 0.0248 - val_accuracy: 0.9808
Epoch 50/100
4/4 [==============================] - 3s 847ms/step - loss: 0.1818 - accuracy: 0.8846 - val_loss: 0.0070 - val_accuracy: 0.9615
Epoch 51/100
4/4 [==============================] - 3s 852ms/step - loss: 0.0762 - accuracy: 0.9423 - val_loss: 0.0695 - val_accuracy: 0.9643
Epoch 52/100
4/4 [==============================] - 5s 1s/step - loss: 0.1203 - accuracy: 0.9643 - val_loss: 0.0092 - val_accuracy: 0.9615
Epoch 53/100
4/4 [==============================] - 4s 1s/step - loss: 0.0865 - accuracy: 0.9808 - val_loss: 0.0035 - val_accuracy: 1.0000
Epoch 54/100
4/4 [==============================] - 4s 1s/step - loss: 0.2169 - accuracy: 0.9423 - val_loss: 0.0168 - val_accuracy: 0.9808
Epoch 55/100
4/4 [==============================] - 3s 807ms/step - loss: 0.0694 - accuracy: 0.9808 - val_loss: 0.0710 - val_accuracy: 1.0000
Epoch 56/100
4/4 [==============================] - 3s 835ms/step - loss: 0.0528 - accuracy: 0.9615 - val_loss: 0.0392 - val_accuracy: 0.9821
Epoch 57/100
4/4 [==============================] - 5s 1s/step - loss: 0.0862 - accuracy: 0.9808 - val_loss: 0.1156 - val_accuracy: 0.9808
Epoch 58/100
4/4 [==============================] - 4s 1s/step - loss: 0.0347 - accuracy: 1.0000 - val_loss: 0.1593 - val_accuracy: 0.9808
Epoch 59/100
4/4 [==============================] - 4s 958ms/step - loss: 0.0778 - accuracy: 0.9615 - val_loss: 0.0099 - val_accuracy: 1.0000
Epoch 60/100
4/4 [==============================] - 3s 846ms/step - loss: 0.0644 - accuracy: 0.9808 - val_loss: 0.0246 - val_accuracy: 0.9808
Epoch 61/100
4/4 [==============================] - 3s 873ms/step - loss: 0.0819 - accuracy: 0.9808 - val_loss: 0.0874 - val_accuracy: 0.9643
Epoch 62/100
4/4 [==============================] - 4s 1s/step - loss: 0.0505 - accuracy: 0.9808 - val_loss: 0.0164 - val_accuracy: 1.0000
Epoch 63/100
4/4 [==============================] - 4s 1s/step - loss: 0.0718 - accuracy: 0.9643 - val_loss: 0.0090 - val_accuracy: 0.9808
Epoch 64/100
4/4 [==============================] - 4s 941ms/step - loss: 0.0076 - accuracy: 1.0000 - val_loss: 0.0063 - val_accuracy: 0.9615
Epoch 65/100
4/4 [==============================] - 4s 902ms/step - loss: 0.0671 - accuracy: 0.9808 - val_loss: 0.1119 - val_accuracy: 0.9808
Epoch 66/100
4/4 [==============================] - 4s 885ms/step - loss: 0.0200 - accuracy: 0.9808 - val_loss: 0.0362 - val_accuracy: 0.9821
Epoch 67/100

4/4 [==============================] - 4s 1s/step - loss: 0.1455 - accuracy: 0.9423 - val_loss: 0.0600 - val_accuracy: 0.9808
Epoch 68/100
4/4 [==============================] - 4s 1s/step - loss: 0.0683 - accuracy: 0.9643 - val_loss: 0.0139 - val_accuracy: 0.9808
Epoch 69/100
4/4 [==============================] - 4s 970ms/step - loss: 0.0416 - accuracy: 0.9808 - val_loss: 0.1689 - val_accuracy: 0.9231
Epoch 70/100
4/4 [==============================] - 3s 841ms/step - loss: 0.1758 - accuracy: 0.9231 - val_loss: 0.0333 - val_accuracy: 0.9808
Epoch 71/100
4/4 [==============================] - 3s 848ms/step - loss: 0.1218 - accuracy: 0.9231 - val_loss: 0.0219 - val_accuracy: 0.9821
Epoch 72/100
4/4 [==============================] - 4s 1s/step - loss: 0.0300 - accuracy: 1.0000 - val_loss: 0.0119 - val_accuracy: 1.0000
Epoch 73/100
4/4 [==============================] - 5s 1s/step - loss: 0.1110 - accuracy: 0.9423 - val_loss: 0.1007 - val_accuracy: 0.9423
Epoch 74/100
4/4 [==============================] - 4s 1s/step - loss: 0.0471 - accuracy: 0.9808 - val_loss: 0.0445 - val_accuracy: 0.9808
Epoch 75/100
4/4 [==============================] - 3s 769ms/step - loss: 0.1136 - accuracy: 0.9423 - val_loss: 0.0247 - val_accuracy: 0.9615
Epoch 76/100
4/4 [==============================] - 3s 816ms/step - loss: 0.0566 - accuracy: 0.9808 - val_loss: 2.9667e-04 - val_accuracy: 0.9821
Epoch 77/100
4/4 [==============================] - 4s 1s/step - loss: 0.3598 - accuracy: 0.9038 - val_loss: 0.0085 - val_accuracy: 1.0000
Epoch 78/100
4/4 [==============================] - 4s 1s/step - loss: 0.0917 - accuracy: 0.9464 - val_loss: 0.0264 - val_accuracy: 1.0000
Epoch 79/100
4/4 [==============================] - 4s 928ms/step - loss: 0.3549 - accuracy: 0.9038 - val_loss: 0.1176 - val_accuracy: 0.9423
Epoch 80/100
4/4 [==============================] - 3s 766ms/step - loss: 0.1450 - accuracy: 0.9615 - val_loss: 0.5958 - val_accuracy: 0.9423
Epoch 81/100
4/4 [==============================] - 3s 769ms/step - loss: 0.4566 - accuracy: 0.9038 - val_loss: 0.2373 - val_accuracy: 0.8929
Epoch 82/100
4/4 [==============================] - 4s 1s/step - loss: 0.0532 - accuracy: 1.0000 - val_loss: 0.0514 - val_accuracy: 0.9808
Epoch 83/100
4/4 [==============================] - 4s 968ms/step - loss: 0.1677 - accuracy: 0.9286 - val_loss: 0.0509 - val_accuracy: 0.9808
Epoch 84/100
4/4 [==============================] - 3s 841ms/step - loss: 0.1485 - accuracy: 0.9375 - val_loss: 0.0599 - val_accuracy: 1.0000
Epoch 85/100
4/4 [==============================] - 3s 823ms/step - loss: 0.0992 - accuracy: 0.9643 - val_loss: 0.0088 - val_accuracy: 0.9615
Epoch 86/100
4/4 [==============================] - 3s 784ms/step - loss: 0.1648 - accuracy: 0.9423 - val_loss: 0.0493 - val_accuracy: 0.9821
Epoch 87/100
4/4 [==============================] - 4s 1s/step - loss: 0.0651 - accuracy: 0.9821 - val_loss: 0.2135 - val_accuracy: 0.9808
Epoch 88/100
4/4 [==============================] - 4s 1s/step - loss: 0.1102 - accuracy: 0.9808 - val_loss: 0.0507 - val_accuracy: 0.9808
Epoch 89/100
4/4 [==============================] - 4s 944ms/step - loss: 0.0848 - accuracy: 0.9615 - val_loss: 0.0601 - val_accuracy: 0.9615
Epoch 90/100
4/4 [==============================] - 3s 842ms/step - loss: 0.0826 - accuracy: 0.9615 - val_loss: 0.0610 - val_accuracy: 0.9808
Epoch 91/100
4/4 [==============================] - 4s 912ms/step - loss: 0.0688 - accuracy: 0.9615 - val_loss: 0.0281 - val_accuracy: 1.0000
Epoch 92/100
4/4 [==============================] - 5s 1s/step - loss: 0.0646 - accuracy: 0.9464 - val_loss: 0.1384 - val_accuracy: 0.9615
Epoch 93/100
4/4 [==============================] - 4s 1s/step - loss: 0.0479 - accuracy: 0.9615 - val_loss: 0.1230 - val_accuracy: 0.9231
Epoch 94/100
4/4 [==============================] - 4s 987ms/step - loss: 0.0423 - accuracy: 0.9615 - val_loss: 0.1056 - val_accuracy: 0.9231
Epoch 95/100
4/4 [==============================] - 3s 866ms/step - loss: 0.2599 - accuracy: 0.9615 - val_loss: 0.1087 - val_accuracy: 0.9231
Epoch 96/100
4/4 [==============================] - 3s 841ms/step - loss: 0.1882 - accuracy: 0.9615 - val_loss: 0.1397 - val_accuracy: 0.9464
Epoch 97/100
4/4 [==============================] - 4s 1s/step - loss: 0.2571 - accuracy: 0.8462 - val_loss: 0.0876 - val_accuracy: 0.9615
Epoch 98/100
4/4 [==============================] - 4s 1s/step - loss: 0.0806 - accuracy: 0.9808 - val_loss: 0.4361 - val_accuracy: 0.9615
Epoch 99/100
4/4 [==============================] - 4s 1s/step - loss: 0.0746 - accuracy: 0.9821 - val_loss: 0.0488 - val_accuracy: 0.9615
Epoch 100/100
4/4 [==============================] - 3s 799ms/step - loss: 0.1323 - accuracy: 0.9423 - val_loss: 0.1112 - val_accuracy: 0.9423
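For reference, the unfreeze-and-fine-tune step described above can be sketched as follows. This is a minimal, illustrative sketch: it assumes the pre_Tmodel, model, train_generator and validation_generator objects defined earlier in this chapter, and the exact optimizer settings and callbacks used in the experiments may differ.

# Sketch of the fine-tuning step: unfreeze the Inception v3 base, recompile with a
# very low learning rate and retrain the whole model end-to-end for 100 epochs.
for layer in pre_Tmodel.layers:
    layer.trainable = True

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_generator,
                    epochs=100,
                    validation_data=validation_generator)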

Visualizing output after every layer

Matplotlib was used to visualize the output of an image at every layer. For example, for the
image in Figure 3:7, the output image can be visualized after each layer.

Figure 3:7 : Portion of a diseased cotton leaf

Convolutional layer 1 output

MaxPooling layer 1 output

Convolutional layer 2 output
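A minimal sketch of how these per-layer outputs can be obtained is shown below. It is illustrative only: it assumes the pre_Tmodel Inception v3 base defined earlier and a preprocessed image tensor img of shape (1, 256, 256, 3), as prepared in the appendix; treating index 0 as the first convolutional layer is an assumption.

# Sketch: compute and plot the feature maps of an intermediate layer for one image.
# Assumes pre_Tmodel (Inception v3 base) and a preprocessed image tensor `img`.
import matplotlib.pyplot as plt
import tensorflow as tf

# Build a model that maps the input image to the output of every layer in the base.
layer_outputs = [layer.output for layer in pre_Tmodel.layers[1:]]
activation_model = tf.keras.Model(inputs=pre_Tmodel.input, outputs=layer_outputs)

activations = activation_model.predict(img)
conv2d_1_features = activations[0]  # output of the first convolutional layer (assumed index)

# Plot the first 32 feature maps of that layer.
fig = plt.figure(figsize=(14, 7))
for i in range(32):
    fig.add_subplot(4, 8, i + 1)
    plt.axis('off')
    plt.title('filter' + str(i))
    plt.imshow(conv2d_1_features[0, :, :, i], cmap='viridis')
plt.show()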

Convert model to TensorFlow Lite

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model_file('crop.h5')
tfmodel = converter.convert()
open("output_model.tflite", "wb").write(tfmodel)

3.1.6 Mobile application development

All code in this section was written in the Kotlin programming language using the Android Studio IDE [43].

Figure 3:8 Mobile application flowchart

Application layout

The layout was developed using Extensible Markup Language (XML). The main components of
the layout are as follows:

• Upload photo – a button that allows the user to open the device gallery and select an image to classify.
• Take photo – a button that allows the user to open the device camera and take a photograph of the leaf to be classified.

Figure 3:9: Application GUI layout

• Image display panel – this panel displays the photograph taken with the camera or the image selected from the device gallery.
• Detect disease – this button initiates the classification process.
• Results – this panel displays the inference results.

Kotlin programming

For each button shown in Figure 3:9, Kotlin code was written in the MainActivity class to invoke the intended process. The code snippets for each button are as follows.

mCameraButton.setOnClickListener {
    val callCameraIntent = Intent(MediaStore.ACTION_IMAGE_CAPTURE)
    startActivityForResult(callCameraIntent, mCameraRequestCode)
}

mGalleryButton.setOnClickListener {
    val callGalleryIntent = Intent(Intent.ACTION_PICK)
    callGalleryIntent.type = "image/*"
    startActivityForResult(callGalleryIntent, mGalleryRequestCode)
}

mDetectButton.setOnClickListener {
    val results = mClassifier.recognizeImage(mBitmap).firstOrNull()
    mResultTextView.text = results?.title + "\n *Confidence:" + results?.confidence
}
Classifier class

The main function of this application is to classify cotton diseases, so a Kotlin Classifier class was developed to load the model produced in the previous section. The class is invoked every time the detect disease button is clicked; it takes an image as an argument and computes the appropriate disease class for that image. The full code is attached in the appendix of this report.

Figure 3:10: Android Studio

Final application

Figure 3:11 Mobile Application

4.1 Results

This section presents the results of the work carried out in the previous chapter.

4.1.1 Model training results

The results presented in this section relate to the training of our deep learning model on the collected image dataset. As mentioned in the previous chapter, we developed a cotton crop disease identification model based on transfer learning: the pretrained Inception v3 model was used as a feature extractor, and a new Flatten layer, Dense layer, Dropout layer and softmax output layer were added for the requirements of our research problem.

Our dataset was split into three subsets, namely a training set, a validation set and a test set. The training set is the subset of images on which the model's weights were fitted. During the training process, the model was periodically evaluated on the validation set, and the hyperparameters were tuned based on these periodic evaluation results.

The final evaluation of the model, after the training phase had been completed, was carried out using the test set. This is the most important step for estimating the working accuracy and generalizability of our model. Matplotlib was used to plot the training and validation losses versus epochs, and the training and validation accuracies versus epochs, after the training process was completed (a plotting sketch follows the list below).

• Train curve – plotted from the training dataset; it gives an idea of how well the model fitted the training data.
• Validation curve – plotted from the validation dataset; it gives an idea of how well the model generalizes to unseen data.
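A minimal sketch of this plotting step is shown below; it assumes the History object returned by model.fit (here called history), and the figure styling is illustrative.

# Sketch: plot training/validation loss and accuracy versus epochs from the Keras
# History object returned by model.fit (older Keras versions name the accuracy
# keys 'acc'/'val_acc' instead of 'accuracy'/'val_accuracy').
import matplotlib.pyplot as plt

epochs = range(1, len(history.history['loss']) + 1)

plt.figure()
plt.plot(epochs, history.history['loss'], label='Training loss')
plt.plot(epochs, history.history['val_loss'], label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.figure()
plt.plot(epochs, history.history['accuracy'], label='Training accuracy')
plt.plot(epochs, history.history['val_accuracy'], label='Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()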

Loss Graphs

The loss graphs of the training process.

Figure 4:1: Training and validation losses vs epochs

From Figure 4:1 we can see that a good fit was obtained up to around 70 epochs, identified by training and validation losses that decrease to a point of stability with a minimal generalization gap between the two curves. The performance of the model on the validation dataset began to degrade after about 70 epochs, so the training process was stopped there. Before 70 epochs the model has low variance and generalizes the data well; further training beyond this point increased the variance of the model, which means the model was no longer learning but overfitting, i.e. memorizing the training data.

Accuracy graph

Figure 4:2: Training and validation accuracies vs epochs

After fine-tuning the parameters of the model over several training iterations, an average overall accuracy of about 94% was achieved, as observed in Figure 4:2. The accuracy increased gradually until it converged at an average of about 94%, indicating that the model was able to generalize our data well.

Model performance on test dataset

To estimate the real-world performance of the model, holdout validation (testing) was done. This is conducted after training has been completed, to judge how well the model performs in practice, and it is the ultimate test of how the model generalizes to new, unseen data. The evaluation was conducted on the test dataset collected in Chapter 3 to check how well the trained model predicts the disease classes. The holdout validation results on the test data are shown in Figure 4:3 below.

Figure 4:3 Holdout validation results

From Figure 4:3 we can see that the holdout validation accuracy on the test dataset is lower than the training accuracy. This is expected, as the test dataset does not reflect the exact distribution of the training dataset. The average accuracy achieved on the test dataset is 83%, compared with about 94% on the training data.
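A minimal sketch of this holdout evaluation is shown below; it assumes the trained model and the test_generator created in Chapter 3 (older Keras versions use model.evaluate_generator instead of model.evaluate).

# Sketch: evaluate the trained model on the held-out test set.
# Assumes `model` and `test_generator` from Chapter 3.
test_loss, test_accuracy = model.evaluate(test_generator, verbose=1)
print("Test loss: {:.4f}".format(test_loss))
print("Test accuracy: {:.2%}".format(test_accuracy))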

Random test example

A random test was carried out on the model using an image that the model had not seen before; Figure 4:4 shows how the model was able to correctly predict the disease.

Figure 4:4 Model random test
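A minimal sketch of such a single-image prediction is shown below. The file path is a placeholder, and the sketch assumes the trained model and train_generator from Chapter 3; the class names are recovered from the generator's class_indices mapping.

# Sketch: classify a single unseen leaf image with the trained model.
# The file path is a placeholder; assumes `model` and `train_generator` exist.
import numpy as np
from keras.preprocessing import image

class_names = list(train_generator.class_indices.keys())

img = image.load_img('unseen_leaf.JPG', target_size=(256, 256))
img = image.img_to_array(img) / 255.0
img = np.expand_dims(img, axis=0)

probabilities = model.predict(img)[0]
print("Predicted class:", class_names[int(np.argmax(probabilities))])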

4.1.2 Android Mobile Application

An Android smartphone application was developed as part of this dissertation. The main aim of developing the Android application was to provide a user interface that farmers can use to determine the disease affecting their crop from an image. The farmer takes a photo of the affected cotton leaf using the smartphone camera and the image is processed by the application to detect the disease.

After the model was trained and tested, it was packaged into an Android mobile application. Several tests were conducted on the application by uploading unseen images of diseased cotton leaves. The mobile application was able to classify the disease correctly on most of the images and failed on a few. The image below shows the mobile app correctly classifying a cotton leaf disease.

The mobile application screenshots

Figure 4:5 Mobile application

Incorrectly classified images

On other test occasions the model classified some images incorrectly. It was observed that the incorrect classifications happened for the following reasons, among others:

1. Co-occurrence of more than one disease on a single leaf
2. A high level of background noise, i.e. soil occupying more than a reasonable proportion of the image
3. Multiple images of leaves belonging to different parts of the same plant

Figure 4:6: Incorrectly classified images

5.1 Conclusions

Detection of crop diseases at an early stage plays a significant role in food security. As a solution to this challenge, we devised an Android-based application that allows farmers to detect cotton leaf diseases with ease. We started this dissertation by reviewing research in machine learning and deep learning, and found that deep learning techniques have been very successful in image recognition tasks in terms of performance and accuracy. We first collected a dataset of healthy and diseased cotton leaf images and then subdivided it into training, validation and testing subsets.

The solution employed deep learning and image processing to analyze an image and classify it as either healthy, diseased with bacterial blight, or diseased with Alternaria leaf spot. Transfer learning was used: a pretrained model served as a feature extractor and additional layers were added to suit the requirements of our research problem. Transfer learning is very useful when the amount of training data is limited, which is usually the case for crop disease datasets. This technique allows our model to achieve greater generalizability because the base network has already been trained on, and has learned from, millions of images.

The new model was trained and tested on an image dataset that we collected. After several
iterations of training the network converged and the model was integrated into a mobile
application using TensorFlow lite for android.

The process pipeline was as follows:

1. Capturing an image with a mobile phone
2. Extracting the leaf from the image using image preprocessing
3. Running the cropped image through a classifier based on the deep learning model to identify the disease, if any

The application was trained to distinguish between two cotton leaf diseases, and an average accuracy of 83% was obtained. Despite a few incorrect classifications, the mobile application produced good results, which gives confidence that the approach is viable. This highlights that the dataset must reflect the reality of the operating environment and that data diversity is one of the key elements required to ensure model generalization and high accuracy.

The quality of training and of the results was compromised by the inability to acquire a large and diverse dataset of images. Ideally, a model should be trained on many diverse images so that it learns as many features as possible and can accurately distinguish one disease from another. Based on the work carried out in this dissertation, the approach taken can be developed further to help farmers diagnose crop diseases accurately and at an affordable price.

5.2 Future Work

Several future research directions can be taken to continue this work. For example, alternative pretrained models such as ResNet could be used as the base model to provide more robust classification, although this may come at the cost of increased training and application resources.

As mentioned before, the best improvement to this research would be to obtain a significantly larger and more diverse dataset, which would provide more variance in the training set. This would allow the model to learn more features, thereby improving the application's generalizability and accuracy.

In conclusion, deep learning is a promising solution for the detection and diagnosis of crop diseases, making it easier for farmers to accurately identify the disease affecting their crop. This allows them to take appropriate action in a timely manner to save their crop. The algorithms and methods used in this research are subject to further research to improve on the results obtained in this study.

References

[1] FAO, “FAO in Zimbabwe.” [Online]. Available: http://www.fao.org/zimbabwe/fao-in-zimbabwe/zimbabwe-at-a-glance/en/. [Accessed: 20-Nov-2019].

[2] E. Casadei and J. Albert, “Food and Agriculture Organization of the United Nations,” in Encyclopedia of Food Sciences and Nutrition, 2003, pp. 2587–2593.

[3] Ericsson, “Ericsson Mobility Report: On the Pulse of the Networked Society,” 2015. [Online]. Available: http://www.ericsson.com/res/docs/2015/ericsson-mobility-report-june-2015.pdf.

[4] T. Rusere, M. Chihuri, and C. Muzorori, “Zimbabwe’s Cotton Sector: Growth and Prospects Under Changing Trade Environment,” 2006.

[5] Z. Zhao and P. Zheng, “Object Detection with Deep Learning: A Review,” pp. 1–21, 2012.

[6] A. Kamilaris, “A review of the use of convolutional neural networks in agriculture,” 2018.

[7] M. A. Wani, F. A. Bhat, S. Afzal, and A. I. Khan, “Introduction to Deep Learning,” 2020, pp. 1–11.

[8] MATLAB, “Introduction to Deep Learning.” [Online]. Available: https://www.mathworks.com/videos/introduction-to-deep-learning-what-are-convolutional-neural-networks--1489512765771.htm. [Accessed: 20-Nov-2019].

[9] F.-F. Li and J. Johnson, “CS231n: Convolutional Neural Networks for Visual Recognition,” Stanford University.

[10] A. Hidayat and U. Darusalam, “Detection of Disease on Corn Plants Using Convolutional Neural Networks,” Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information), vol. 12, no. 1, pp. 51–56, 2019. DOI: http://dx.doi.org/10.21609/jiki.v12i1.695.

[11] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res., 2014.

[12] P. Domingos, “A Few Useful Things to Know About Machine Learning,” Commun. ACM, vol. 55, no. 10, pp. 79–88, 2012.

[13] Y. Bengio, I. Goodfellow, and A. Courville, “Chapter 7: Regularization,” in Deep Learning, 2016.

[14] P. Baldi and P. Sadowski, “Understanding dropout,” in Advances in Neural Information Processing Systems, 2013.

[15] A. Zheng, Evaluating Machine Learning Models: A Beginner’s Guide to Key Concepts and Pitfalls. 2015.

[16] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. 2015.

[17] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “LeNet,” Proc. IEEE, 1998.

[18] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “AlexNet,” Adv. Neural Inf. Process. Syst., 2012.

[19] C. Szegedy et al., “Going deeper with convolutions,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2015.

[20] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 2015.

[21] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012.

[22] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016.

[23] L. Deng, H. H. Chu, P. Shi, W. Wang, and X. Kong, “Region-based CNN method with deformable modules for visually classifying concrete cracks,” Appl. Sci., 2020.

[24] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?,” in Advances in Neural Information Processing Systems, 2014.

[25] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering, 2010.

[26] Handbook of Research on Machine Learning Applications and Trends. 2010.

[27] C. Bock, G. Poole, P. E. Parker, and T. Gottwald, “Plant Disease Severity Estimated Visually, by Digital Photography and Image Analysis, and by Hyperspectral Imaging,” CRC Crit. Rev. Plant Sci., vol. 29, Mar. 2010.

[28] A. M. Mutka and R. S. Bart, “Image-based phenotyping of plant disease symptoms,” vol. 5, pp. 1–8, Jan. 2015.

[29] “Present and Future Trends in Plant Disease Detection,” pp. 241–251, Feb. 2016.

[30] J. Amara, B. Bouaziz, and A. Algergawy, “A Deep Learning-based Approach for Banana Leaf Diseases Classification,” pp. 79–88, 2017.

[31] USDA, “Cotton Production and Consumption in Zimbabwe,” 2016.

[32] P. Revathi and M. Hemalatha, “Advance computing enrichment evaluation of cotton leaf spot disease detection using Image Edge detection,” in 2012 3rd International Conference on Computing, Communication and Networking Technologies, ICCCNT 2012, Jul. 2012.

[33] C. Usha Kumari, S. Jeevan Prasad, and G. Mounika, “Leaf disease detection: Feature extraction with k-means clustering and classification with ANN,” in Proceedings of the 3rd International Conference on Computing Methodologies and Communication, ICCMC 2019, 2019, pp. 1095–1098.

[34] S. Sladojevic, M. Arsenovic, A. Anderla, D. Culibrk, and D. Stefanovic, “Deep Neural Networks Based Recognition of Plant Diseases by Leaf Image Classification,” Comput. Intell. Neurosci., 2016.

[35] A. Fuentes, S. Yoon, S. C. Kim, and D. S. Park, “A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition,” Sensors (Switzerland), 2017.

[36] E. Fujita, Y. Kawasaki, H. Uga, S. Kagiwada, and H. Iyatomi, “Basic investigation on a robust and practical plant diagnostic system,” in Proceedings - 2016 15th IEEE International Conference on Machine Learning and Applications, ICMLA 2016, 2017.

[37] S. P. Mohanty, D. P. Hughes, and M. Salathé, “Using deep learning for image-based plant disease detection,” Front. Plant Sci., 2016.

[38] M. E. Palm and R. J. Hillocks, “Cotton Diseases,” Mycologia, 1994.

[39] L. Zhang, S. Wang, and B. Liu, “Deep learning for sentiment analysis: A survey,” Wiley Interdiscip. Rev. Data Min. Knowl. Discov., 2018.

[40] R. Gupta, “Deep Learning Studio.” [Online]. Available: https://deepcognition.ai/. [Accessed: 23-Feb-2020].

[41] F. Chollet, “Keras: The Python Deep Learning library,” Keras.io, 2015.

[42] M. Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015.

[43] Google, “Android Studio.” [Online]. Available: https://developer.android.com/studio. [Accessed: 19-Nov-2019].

Appendix

Mobile Application Code

MainActivity class

package com.example.cropdiag

import android.annotation.SuppressLint
import androidx.appcompat.app.AppCompatActivity
import android.os.Bundle
import android.widget.Toast
import android.app.Activity
//import android.widget.RelativeLayout
import android.content.Intent
import android.content.pm.ActivityInfo
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.graphics.Matrix
import android.os.Build
import android.provider.MediaStore
import android.provider.MediaStore.Images.Media.getBitmap
//import android.support.annotation.RequiresApi
//import android.support.v7.app.AppCompatActivity
import android.view.Gravity
import androidx.annotation.RequiresApi
//import com.example.cropdiag.R
import kotlinx.android.synthetic.main.activity_main.*
import java.io.IOException

@Suppress("DEPRECATION")
class MainActivity : AppCompatActivity() {

private lateinit var mClassifier: Classifier
private lateinit var mBitmap: Bitmap

private val mCameraRequestCode = 0


private val mGalleryRequestCode = 2

private val mInputSize = 256


private val mModelPath = "plant_disease_model.tflite"
private val mLabelPath = "plant_labels.txt"
private val mSamplePath = "soybean.JPG"

@SuppressLint("SetTextI18n")
@RequiresApi(Build.VERSION_CODES.O)
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
requestedOrientation = ActivityInfo.SCREEN_ORIENTATION_PORTRAIT
setContentView(R.layout.activity_main)
mClassifier = Classifier(assets, mModelPath, mLabelPath, mInputSize)

resources.assets.open(mSamplePath).use {
mBitmap = BitmapFactory.decodeStream(it)
mBitmap = Bitmap.createScaledBitmap(mBitmap, mInputSize, mInputSize, true)
mPhotoImageView.setImageBitmap(mBitmap)
}

mCameraButton.setOnClickListener {
val callCameraIntent = Intent(MediaStore.ACTION_IMAGE_CAPTURE)
startActivityForResult(callCameraIntent, mCameraRequestCode)
}

mGalleryButton.setOnClickListener {
val callGalleryIntent = Intent(Intent.ACTION_PICK)
callGalleryIntent.type = "image/*"
startActivityForResult(callGalleryIntent, mGalleryRequestCode)
}
mDetectButton.setOnClickListener {
val results = mClassifier.recognizeImage(mBitmap).firstOrNull()
mResultTextView.text= results?.title+"\n *Confidence:"+results?.confidence

}
}

@SuppressLint("SetTextI18n", "MissingSuperCall")
override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
if(requestCode == mCameraRequestCode){
// Handle the case where the camera capture is cancelled
if(resultCode == Activity.RESULT_OK && data != null) {
mBitmap = data.extras!!.get("data") as Bitmap
mBitmap = scaleImage(mBitmap)
val toast = Toast.makeText(this, ("Image crop to: w= ${mBitmap.width} h= ${mBitmap.height}"), Toast.LENGTH_LONG)
toast.setGravity(Gravity.BOTTOM, 0, 20)
toast.show()
mPhotoImageView.setImageBitmap(mBitmap)
mResultTextView.text= "Your photo image set now."
} else {
Toast.makeText(this, "Camera cancel..", Toast.LENGTH_LONG).show()
}
} else if(requestCode == mGalleryRequestCode) {
if (data != null) {
val uri = data.data

try {
mBitmap = getBitmap(this.contentResolver, uri)
} catch (e: IOException) {
e.printStackTrace()
}

println("Success!!!")
mBitmap = scaleImage(mBitmap)
mPhotoImageView.setImageBitmap(mBitmap)

}
} else {
Toast.makeText(this, "Unrecognized request code", Toast.LENGTH_LONG).show()

}
}

    private fun scaleImage(bitmap: Bitmap?): Bitmap {
        val originalWidth = bitmap!!.width
        val originalHeight = bitmap.height
        val scaleWidth = mInputSize.toFloat() / originalWidth
        val scaleHeight = mInputSize.toFloat() / originalHeight
        val matrix = Matrix()
        matrix.postScale(scaleWidth, scaleHeight)
        return Bitmap.createBitmap(bitmap, 0, 0, originalWidth, originalHeight, matrix, true)
    }
} // end of MainActivity class

Activity main class

<?xml version="1.0" encoding="utf-8"?>

<androidx.constraintlayout.widget.ConstraintLayout
xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:app="http://schemas.android.com/apk/res-auto"
xmlns:tools="http://schemas.android.com/tools"
android:layout_width="match_parent"
android:layout_height="match_parent"
tools:context="com.example.cropdiag.MainActivity">

<Button
android:id="@+id/mCameraButton"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:text="@string/buttonTakePhoto"
android:layout_marginBottom="8dp"
android:layout_marginTop="32dp"
app:layout_constraintStart_toStartOf="parent"
app:layout_constraintEnd_toEndOf="parent"
app:layout_constraintBottom_toBottomOf="parent"
app:layout_constraintHorizontal_bias="1.0"
android:layout_marginRight="32dp"
app:layout_constraintBottom_toTopOf="@+id/mPhotoImageView"
android:layout_marginEnd="32dp" />

<Button
android:id="@+id/mGalleryButton"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_marginBottom="8dp"

android:text="@string/buttonSelectPhoto"
app:layout_constraintBottom_toTopOf="@+id/mPhotoImageView"
app:layout_constraintEnd_toStartOf="@+id/mDetectButton"
app:layout_constraintHorizontal_bias="1.0"
app:layout_constraintStart_toEndOf="@+id/mCameraButton"
app:layout_constraintStart_toStartOf="parent" />

<ImageView
android:id="@+id/mPhotoImageView"
android:layout_width="350dp"
android:layout_height="400dp"
android:contentDescription="@string/descriptionImage"
app:layout_constraintEnd_toEndOf="parent"
app:layout_constraintStart_toStartOf="parent"
app:layout_constraintTop_toTopOf="parent"
app:srcCompat="@android:color/darker_gray"
app:layout_constraintVertical_chainStyle="packed"
android:layout_marginBottom="8dp"
app:layout_constraintBottom_toTopOf="@+id/mDetectButton"/>

<Button
android:text="@string/buttonDiagnose"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:id="@+id/mDetectButton"
app:layout_constraintStart_toStartOf="parent"
app:layout_constraintEnd_toEndOf="parent"
app:layout_constraintBottom_toBottomOf="parent"
app:layout_constraintBottom_toTopOf="@+id/mResultTextView"

android:layout_marginBottom="8dp"/>
<TextView
android:text="@string/defaultImage"
android:layout_width="0dp"
android:layout_height="75dp"
app:layout_constraintStart_toStartOf="parent"
android:layout_marginStart="32dp"
android:id="@+id/mResultTextView"
app:layout_constraintBottom_toBottomOf="parent"
android:layout_marginBottom="8dp"
app:layout_constraintEnd_toEndOf="parent"
android:layout_marginEnd="32dp"
android:textStyle="bold"
android:textAlignment="center"/>
</androidx.constraintlayout.widget.ConstraintLayout>

Android Manifest

<?xml version="1.0" encoding="utf-8"?>


<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:dist="http://schemas.android.com/apk/distribution"
package="com.example.cropdiag">

<dist:module dist:instant="true"/>

<application
android:allowBackup="true"

android:fullBackupContent="true"
android:icon="@mipmap/ic_launcher"
android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true"
android:theme="@style/AppTheme">
<activity android:name="com.example.cropdiag.MainActivity">
<intent-filter>
<action android:name="android.intent.action.MAIN"/>
<action android:name="android.intent.action.VIEW" />
<category android:name="android.intent.category.LAUNCHER"/>
</intent-filter>
</activity>
</application>

</manifest>

Classifier

package com.example.cropdiag

import android.content.res.AssetManager
import android.graphics.Bitmap
import android.util.Log
import org.tensorflow.lite.Interpreter
import java.io.FileInputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel
import java.util.*

@Suppress("DEPRECATION",
"NULLABILITY_MISMATCH_BASED_ON_JAVA_ANNOTATIONS")
class Classifier(assetManager: AssetManager, modelPath: String, labelPath: String, inputSize:
Int) {
private var interpret123: Interpreter
private var list: List<String>
private val inputsize: Int = inputSize
private val pixelsize: Int = 3
private val imagemean = 0
private val imagestd = 255.0f
private val maxresult = 3
private val threshold = 0.4f

data class Recognition(


var id: String = "",
var title: String = "",

var confidence: Float = 0F
) {
override fun toString(): String {
return "Title = $title, Confidence = $confidence)"
}
}

init {
interpret123 = Interpreter(loadModelFile(assetManager, modelPath))
list = loadLabelList(assetManager, labelPath)
}

private fun loadModelFile(assetManager: AssetManager, modelPath: String):


MappedByteBuffer {
val fileDescriptor = assetManager.openFd(modelPath)
val inputStream = FileInputStream(fileDescriptor.fileDescriptor)
val fileChannel = inputStream.channel
val startOffset = fileDescriptor.startOffset
val declaredLength = fileDescriptor.declaredLength
return fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength)
}

    private fun loadLabelList(assetManager: AssetManager, labelPath: String): List<String> {
        return assetManager.open(labelPath).bufferedReader().useLines { it.toList() }
    }

fun recognizeImage(bitmap: Bitmap): List<Recognition> {


val scaledBitmap = Bitmap.createScaledBitmap(bitmap, inputsize, inputsize, false)
val byteBuffer = convertBitmapToByteBuffer(scaledBitmap)
val result = Array(1) { FloatArray(list.size) }

interpret123.run(byteBuffer, result)
return getSortedResult(result)
}

private fun convertBitmapToByteBuffer(bitmap: Bitmap): ByteBuffer {


val byteBuffer = ByteBuffer.allocateDirect(4 * inputsize * inputsize * pixelsize)
byteBuffer.order(ByteOrder.nativeOrder())
val intValues = IntArray(inputsize * inputsize)

bitmap.getPixels(intValues, 0, bitmap.width, 0, 0, bitmap.width, bitmap.height)


var pixel = 0
for (i in 0 until inputsize) {
for (j in 0 until inputsize) {
val `val` = intValues[pixel++]

byteBuffer.putFloat((((`val`.shr(16) and 0xFF) - imagemean) / imagestd))


byteBuffer.putFloat((((`val`.shr(8) and 0xFF) - imagemean) / imagestd))
byteBuffer.putFloat((((`val` and 0xFF) - imagemean) / imagestd))
}
}
return byteBuffer
}

private fun getSortedResult(labelProbArray: Array<FloatArray>): List<Recognition> {


Log.d("Classifier", "List Size:(%d, %d,
%d)".format(labelProbArray.size,labelProbArray[0].size,list.size))

val pq = PriorityQueue(

maxresult,
Comparator<Recognition> {
(_, _, confidence1), (_, _, confidence2)
-> confidence1.compareTo(confidence2) * -1
})

for (i in list.indices) {
val confidence = labelProbArray[0][i]
if (confidence >= threshold) {
pq.add(
Recognition("" + i,
if (list.size > i) list[i] else "Unknown", confidence)
)
}
}
Log.d("Classifier", "pqsize:(%d)".format(pq.size))

val recognitions = ArrayList<Recognition>()


val recognitionsSize = pq.size.coerceAtMost(maxresult)
for (i in 0 until recognitionsSize) {
recognitions.add(pq.poll())
}
return recognitions
    }
} // end of Classifier class

Model

# Import Libraries
from __future__ import absolute_import, division, print_function, unicode_literals

import warnings
warnings.filterwarnings("ignore")

import os
import glob
import matplotlib.pyplot as plt

# Keras API
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, Activation, AveragePooling2D, BatchNormalization
from keras.preprocessing.image import ImageDataGenerator

import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras.optimizers import Adam

# Load data
zip_file = tf.keras.utils.get_file(origin='https://storage.googleapis.com/plantdata/PlantVillage.zip',
                                   fname='PlantVillage.zip', extract=True)

# Create the training and validation directories
data_dir = os.path.join(os.path.dirname(zip_file), 'PlantVillage')
train_dir = os.path.join(data_dir, 'train')
validation_dir = os.path.join(data_dir, 'validation')

import json

with open('Plant-Diseases-Detector-master/categories.json', 'r') as f:
    cat_to_name = json.load(f)

classes = list(cat_to_name.values())
print(classes)

# Image size and batch size used by the generators (values assumed from the
# preprocessing settings later in this script: 256x256 images, batch size 14).
IMAGE_SIZE = (256, 256)
BATCH_SIZE = 14

validation_datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)

validation_generator = validation_datagen.flow_from_directory(
    validation_dir,
    shuffle=False,
    seed=42,
    color_mode="rgb",
    class_mode="categorical",
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE)

do_data_augmentation = True  #@param {type:"boolean"}

if do_data_augmentation:

    train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
        rescale=1./255,
        rotation_range=40,
        horizontal_flip=True,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        fill_mode='nearest')
else:
    train_datagen = validation_datagen

train_generator = train_datagen.flow_from_directory(
    train_dir,
    subset="training",
    shuffle=True,
    seed=42,
    color_mode="rgb",
    class_mode="categorical",
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE)

# Preprocessing data.

train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   validation_split=0.2,  # validation split 20%.
                                   horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

img_width, img_height = 256, 256
input_shape = (img_width, img_height, 3)
batch_size = 14

# test_dir is assumed to point to the held-out test images
# (e.g. a 'test' folder alongside 'train' and 'validation').
test_dir = os.path.join(data_dir, 'test')

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(img_width, img_height),
                                                    batch_size=batch_size)

test_generator = test_datagen.flow_from_directory(test_dir, shuffle=True,
                                                  target_size=(img_width, img_height),
                                                  batch_size=batch_size)

from keras.preprocessing import image
import numpy as np

img1 = image.load_img('/PlantVillage/train/cotton Alternaria/image16.JPEG')
plt.imshow(img1)

# Preprocess image
img1 = image.load_img('/PlantVillage/train/cotton Alternaria/image16.JPEG', target_size=(256, 256))
img = image.img_to_array(img1)
img = img / 255
img = np.expand_dims(img, axis=0)

import matplotlib.image as mpimg

# conv2d_1_features is assumed to hold the first convolutional layer's activations for `img`,
# computed with an intermediate-activation model as sketched in Chapter 3 (layer visualization).
fig = plt.figure(figsize=(14, 7))
columns = 8
rows = 4
for i in range(columns * rows):
    # img = mpimg.imread()
    fig.add_subplot(rows, columns, i + 1)
    plt.axis('off')
    plt.title('filter' + str(i))
    plt.imshow(conv2d_1_features[0, :, :, i], cmap='viridis')  # Visualizing in color mode.
plt.show()

opt = keras.optimizers.Adam(lr=0.001)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

train = model.fit_generator(train_generator,
                            epochs=100,
                            steps_per_epoch=train_generator.samples // batch_size,
                            validation_data=validation_generator,
                            validation_steps=validation_generator.samples // batch_size,
                            verbose=1)

converter = tf.lite.TFLiteConverter.from_keras_model_file('crop.h5')
tfmodel = converter.convert()
open("output.tflite", "wb").write(tfmodel)
