
EGEE 529: Neural Networks
Neural Networks for Plant Disease Detection on Leaves
Jeffry Salazar (891884140)
5/14/20

The goal of this project is to create a shallow neural network to detect plant disease using leaf images of different crops. Since the input data consists of images, a convolutional neural network (CNN) will be implemented. I will first build a simple network with basic parameters, involving no more than a few convolution layers, with the goal of reaching above 90% accuracy. I will then retrain an existing network, AlexNet, which is a relatively shallow network, to test whether it achieves better accuracy than the network I created. If possible, I will try to implement the two networks on a mobile device.

The dataset used is known as PlantVillage; each image shows a single leaf on a grey table. The images in the dataset are also flipped and rotated and vary in quality. The dataset is organized in folders, with the label for each image given by the folder name. Below are a few examples from the dataset. From left to right: an apple leaf with cedar rust, a healthy tomato leaf, and a potato leaf with early blight.

Figure 1: Apple with Cedar Rust. Figure 2: Healthy Tomato. Figure 3: Potato with Early Blight.

The dataset consists of 38 labels:

Figure 4: The 38 dataset labels.


The software used is MATLAB, since its Deep Learning Toolbox makes creating neural networks quick and simple and is well documented. MATLAB can also convert MATLAB code into C code, which is useful for later porting the program to an Android device for a mobile application.

The training methods used for the neural network are stochastic gradient descent with momentum (sgdm) and the adaptive moment estimation optimizer (adam).

Stochastic gradient descent updates the network parameters (weights and biases) to minimize the loss function by taking small steps in the direction of the negative gradient of the loss.

Equation:

\theta_{l+1} = \theta_l - \alpha \nabla E(\theta_l) + \gamma (\theta_l - \theta_{l-1})

l is the iteration number

α > 0 is the learning rate

θ is the parameter vector (weights and biases)

∇E(θ) is the gradient of the loss function

γ determines the contribution of the previous gradient step to the current iteration (the momentum term)
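
As a concrete illustration, here is a minimal sketch of one sgdm update in MATLAB, using plain vectors rather than the Deep Learning Toolbox internals (the values and variable names are illustrative assumptions):

% One sgdm parameter update (illustrative sketch, not toolbox internals).
% theta, thetaPrev, and grad are assumed to be parameter/gradient vectors.
alpha = 0.0001;   % learning rate
gamma = 0.9;      % contribution of the previous step (momentum)
thetaNext = theta - alpha*grad + gamma*(theta - thetaPrev);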

The adaptive moment estimation (adam) optimizer updates the parameters (weights and biases) using element-wise moving averages of both the parameter gradients and their squared values.

Equations:

m_l = \beta_1 m_{l-1} + (1 - \beta_1) \nabla E(\theta_l)

v_l = \beta_2 v_{l-1} + (1 - \beta_2) [\nabla E(\theta_l)]^2

\theta_{l+1} = \theta_l - \frac{\alpha m_l}{\sqrt{v_l} + \epsilon}

α > 0 is the learning rate

θ is the parameter vector (weights and biases)

∇E(θ) is the gradient of the loss function

β₁ and β₂ are the decay rates of the moving averages

ϵ is a small constant to avoid division by zero
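
Similarly, a minimal MATLAB sketch of one adam update (illustrative values; m and v carry the moving averages between iterations):

% One adam parameter update (illustrative sketch, not toolbox internals).
% grad is the gradient of the loss at theta; m and v persist across iterations.
beta1 = 0.9; beta2 = 0.999; alpha = 0.0001; epsilon = 1e-8;
m = beta1*m + (1 - beta1)*grad;
v = beta2*v + (1 - beta2)*grad.^2;
theta = theta - alpha*m ./ (sqrt(v) + epsilon);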


My first attempt after learning how to make a simple neural network was designed as follows (written here as a runnable layer array):

layers = [
    imageInputLayer([256 256 3])
    convolution2dLayer([5 5],30,'Stride',[2 2])
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer([2 2])
    convolution2dLayer([4 4],20,'Stride',[2 2])
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer([2 2])
    fullyConnectedLayer(38)
    softmaxLayer
    classificationLayer];

The following training options were used, with stochastic gradient descent with momentum:
options = trainingOptions('sgdm', ...
'MaxEpochs',6, ...
'InitialLearnRate',.0001, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropPeriod',1,...
'LearnRateDropFactor',0.2,...
'Shuffle','every-epoch', ...
'MiniBatchSize',64, ...
'GradientThreshold',10,...
'ValidationFrequency',100, ...
'Verbose',false, ...
'Plots','training-progress',...
'ValidationData',imdsValid,...
'CheckpointPath', 'C:\NN');
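
With the layers and options defined, training proceeds from an image datastore. Here is a sketch of the setup, assuming the PlantVillage images are in a folder named 'PlantVillage' with one subfolder per label (the folder name and split ratio here are assumptions):

% Hypothetical datastore setup and training call (sketch).
imds = imageDatastore('PlantVillage', ...
    'IncludeSubfolders',true,'LabelSource','foldernames');
[imdsTrain,imdsValid] = splitEachLabel(imds,0.9,'randomized');
net = trainNetwork(imdsTrain,layers,options);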

Training results are shown below; validation accuracy plateaued at about 70%:

Figure 5


After testing how changes in the options modified the results, I was able to get the network to 80% accuracy, but no matter how I modified the options, I could not prevent the loss from plateauing. The only recommendation in the MATLAB documentation was to add another convolution layer. I modified the code to have 3 convolution layers and their mid layers (batch normalization, ReLU, and max pooling), but by mistake I ran the code with only the mid layers added, which gave the following training results.

Figure 6

After I noticed the mistake in my code, I corrected it to include the 3rd convolution layer, which resulted in slightly lower accuracy.

Figure 7


Next, I trained the same network using the adaptive moment estimation optimizer, specifying the learning rate information, L2 regularization factor, and mini-batch size. This brought the network back up to a higher accuracy and a smaller loss than the previous versions.

Figure 8

The confusion chart for the network trained above is shown below; it gave a test accuracy of 81.8%.

Figure 9
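
A typical way to compute the test accuracy and confusion chart in MATLAB (a sketch, assuming the trained network net and a test datastore imdsTest):

% Classify the held-out test set and compare against the true labels.
YPred = classify(net,imdsTest);
YTest = imdsTest.Labels;
accuracy = mean(YPred == YTest);
confusionchart(YTest,YPred);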


The filters for the 2 convolution layers of the latest version of the network are shown below. Figure 10 shows all 16 filters with all of their channels. Figure 11 shows the 32 filters that are applied to the image in figure 1.

Figure 10 Figure 11

After the image passes through the convolution layers, the outputs (activations) of those layers are images that highlight the features picked out by the filters, as shown below.

Figure 12
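
Visualizations like these can be produced with a few lines of MATLAB (a sketch; the layer index and image source are assumptions and depend on how the network was assembled):

% Show the first convolution layer's filters and its activations for one image.
w = net.Layers(2).Weights;         % conv filter weights (h x w x channels x filters)
montage(rescale(w));               % display the filters as an image grid
img = readimage(imdsTest,1);       % any sample image
act = activations(net,img,2);      % activations of the first conv layer
act = reshape(act,size(act,1),size(act,2),1,[]);  % one frame per filter
montage(rescale(act));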


One issue with the previous results was that the testing accuracy was much lower than the validation accuracy. This occurred because there were only a small number of testing images, and they did not cover all of the possible categories. Therefore, I randomly moved some of the training images to the testing set so that every category was covered, and then retrained with those images removed from the training set.
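
One way to guarantee that every category appears in the test set is to split per label (a sketch; the count moved per class is an assumption):

% Hold out a fixed number of images per label for testing, so that all 38
% categories are covered; the remainder stays in the training set.
[imdsTest,imdsTrain] = splitEachLabel(imds,2,'randomized');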

The next step was to move away from stochastic gradient descent with momentum (sgdm) and use only the adaptive moment estimation optimizer (adam), since the results were better when training with adam, at least with the settings I used for each training method.

The final version of the neural network I created is shown below:

Figure 13

Training options for the adam training method are seen below:

Figure 14
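
A representative trainingOptions call for adam (a sketch; the actual values used are those shown in figure 14 and are not reproduced in this text, so every value below is an assumption):

options = trainingOptions('adam', ...
    'InitialLearnRate',0.0001, ...      % assumed value
    'L2Regularization',0.0001, ...      % assumed value
    'MiniBatchSize',64, ...             % assumed value
    'Shuffle','every-epoch', ...
    'ValidationData',imdsValid, ...
    'Plots','training-progress');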


Given these options and the network created, I was able to get a validation accuracy of 98.44% (figure 15). The mini-batch training accuracy was closer to 100% (figure 15); this gap was an issue with all versions of my network, and there was no MATLAB documentation on how to get the validation accuracy to align with the mini-batch accuracy.

Figure 15

For the testing results of this network, I made sure I had at least one image from each category, since in the previous testing I had used only 33 images covering just a few categories. The accuracy came out to 98.41%, which is good given the testing images: only 1 out of 63 images was improperly labeled (62/63 ≈ 98.41%). One of the issues I had previously was the small number of test images: with a validation accuracy of 95.46%, the test accuracy came out to only 81.8%. With every category now represented, the test accuracy tracked the (higher) validation accuracy, so the small test set was no longer a significant issue.


Figure 16

Up until this point, my networks were only compared to their previous versions. Now that I had a CNN with high accuracy, the plan was to compare my network with a well-known pretrained image classification network. I chose AlexNet, since it is considered less accurate than other pretrained networks, which makes it a fair point of comparison for my network. To achieve this, I performed transfer learning with AlexNet.

Figure 17


AlexNet is a CNN that is 8 learnable layers deep (5 convolution layers followed by 3 fully connected layers). It is pretrained to classify images into 1000 object categories, so the network has learned rich feature representations for a wide range of images. To take full advantage of AlexNet with transfer learning, the only layers replaced were the last 3 layers of the network: the fully connected layer, the softmax layer, and the classification layer. The goal is to fine-tune the network and therefore avoid large changes to the original layers; to achieve this, the learning rate must be very small (less than 0.0001). On the new layers, specifically the fully connected layer, the weight learn rate factor and bias learn rate factor need to be large so the new layers learn faster than the transferred ones. Both the weight and bias learn rate factors were set to 20.
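
A sketch of this transfer-learning setup in MATLAB (the variable names are assumptions; 38 is the number of dataset classes):

% Replace AlexNet's last 3 layers with new ones sized for 38 classes.
net = alexnet;                              % requires the AlexNet support package
layersTL = [net.Layers(1:end-3)
    fullyConnectedLayer(38, ...
        'WeightLearnRateFactor',20, ...     % new layer learns faster
        'BiasLearnRateFactor',20)
    softmaxLayer
    classificationLayer];
% Note: alexnet expects 227x227x3 input, so the 256x256 leaf images would
% need resizing, e.g. with an augmentedImageDatastore.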

Shown below is the ‘AlexNet’ convolution neural network:

Figure 18

Training options for the sgdm training method are seen below:

Figure 19


After completing transfer learning on AlexNet using the sgdm training method, I was able to get a validation accuracy of 98.78% (figure 20).

Figure 20

The modified AlexNet had a slightly higher validation accuracy, a 0.34% difference compared to the network I created, which is only 4 convolution layers deep versus AlexNet's 5 convolution layers (8 learnable layers in total).

Test results for the modified AlexNet gave the same results as the network I created: the test accuracy was 98.41%, and it appears the same image was misclassified when comparing figure 16 to figure 21.

Figure 21


A recurring issue during development of my network was that the validation accuracy was always offset slightly below the mini-batch accuracy, where the goal is to have them overlap. During the transfer-learning training, I noticed that modifying the fully connected layer's weight and bias learn rate factors shifted the validation accuracy up to match, or even exceed, the mini-batch training accuracy.

To conclude, the goal of creating a neural network to detect plant disease using leaf images was achievable. The accuracy on the validation and test images was 98.4%. I was also able to build another network using transfer learning on a pretrained network, which achieved the same accuracy as the network I created. In the process I learned to adjust the weight and bias learn rate factors of the fully connected layer, which solved the problem of my validation accuracy always being slightly lower than the mini-batch accuracy. Applying those factors to my own network in a future version could push it closer to 100% accuracy. Other future goals are to convert it to a phone app and to find a better dataset, since this dataset was the only one I could find for my application. I want a dataset more specific to California native plants, with field pictures; the dataset used was of single leaves on a table, while the idea is to use the phone out in the field. Other improvements can be made during preprocessing, which will be part of my next steps in further developing my CNN.


