
Computers and Electronics in Agriculture 183 (2021) 106042


Original papers

A deep learning approach for RGB image-based powdery mildew disease detection on strawberry leaves

Jaemyung Shin a, Young K. Chang b,*, Brandon Heung c, Tri Nguyen-Quang b, Gordon W. Price b, Ahmad Al-Mallahi b

a Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, Alberta, Canada
b Department of Engineering, Faculty of Agriculture, Dalhousie University, Truro, Canada
c Department of Plant, Food, and Environmental Sciences, Faculty of Agriculture, Dalhousie University, Truro, Canada

* Corresponding author at: Department of Engineering, Faculty of Agriculture, Dalhousie University, Truro, NS B2N5E3, Canada. E-mail address: YoungChang@Dal.Ca (Y.K. Chang).
https://doi.org/10.1016/j.compag.2021.106042
Received 17 June 2020; Received in revised form 11 October 2020; Accepted 2 February 2021; Available online 27 February 2021
0168-1699/© 2021 Elsevier B.V. All rights reserved.

A R T I C L E I N F O A B S T R A C T

Keywords: In this study, Deep Learning (DL) was used to detect powdery mildew (PM), persistent fungal disease in
Disease detection strawberries to reduce the amount of unnecessary fungicide use, and the need for field scouts. This study opti­
Powdery mildew mised and evaluated several well-established learners, including AlexNet, SqueezeNet, GoogLeNet, ResNet-50,
Deep learning
SqueezeNet-MOD1, and SqueezeNet-MOD2. Data augmentation was carried out from among 1450 healthy and
Convolutional neural network
Artificial intelligence
infected leaf images to prevent overfitting and to consider the various shapes and direction of the leaves in the
field. A total of eight clockwise rotations (0◦ ; the original data, 45◦ , 90◦ , 135◦ , 180◦ , 225◦ , 270◦ , and 315◦ ) was
performed to generate 11,600 data points. Overall, the six DL algorithms that were used in this study showed on
average of >92% in classification accuracy (CA). ResNet-50 gave the highest CA of 98.11% in classifying the
healthy and infected leaves; however, considering the computation time, AlexNet had the fastest processing time,
at 40.73 s, to process 2320 images with a CA of 95.59%.When considering the memory requirements for
hardware deployment, SqueezeNet-MOD2 would be recommended for PM detection on strawberry leaves with a
CA of 92.61%.

1. Introduction

Fungal diseases are a significant challenge to strawberry production systems, of which powdery mildew (PM) is a particularly problematic pathogen. PM can easily infect strawberry plants in regions that experience warm temperatures and humid climatic conditions. PM sequentially attacks leaves, flowers, and fruits, which can lead to high levels of crop loss if the infection becomes severe. Initially, white mycelium begins to cover the leaf surface, reducing the amount of photosynthesis and causing water deficiency. During the early stage of infection, fewer than ten white mycelium spots appear with the formation of a white star-shaped dot; about five days later, white fungal colonies begin to cover the entire surface of the strawberry leaves (Adam and Somerville, 1996). As the infection progresses, leaves turn reddish-purple or develop small purple spots (Jacob et al., 2008). Infected fruit has a bitter taste, which in turn reduces its marketability and increases losses to the producer (Liu, 2017).

Due to the significant impacts of PM, effective methods for monitoring and management are required to control the spread of infection in strawberry fields. The primary method used to suppress PM is to apply fungicides as a spray; however, PM is prone to developing resistance to fungicides (McGrath, 2001). Furthermore, the excessive application of fungicides may lead to environmental degradation, such as soil erosion and/or the accumulation of toxic substances in soil (Kalia and Gosal, 2011).

The conventional method for monitoring plant pathogens is to hire disease specialists who can scout a field to identify the presence of diseases. However, this method is not economical in terms of labour and time, and it is challenging to accurately predict the spread of the disease over the field (Kobayashi et al., 2001). These challenges are compounded by a decreasing workforce within the agricultural sector (Frank et al., 2004). According to Statistics Canada, the number of people working in the agricultural industry has steadily decreased since 1991, while the average age of workers has gradually increased, reaching 55 years of age in 2016 (Statistics Canada, 2020).


In the 1980s, the concept of precision agriculture (PA) emerged as a major component of the third wave of the modern agricultural revolution (Robert, 2002). PA was initially used for the targeted allocation of fertilisers to suit different soil conditions (Rosa et al., 2000). Since then, PA has been developed for the automated guidance of agricultural vehicles and tools, autonomous machines and processes, research on farms, and the automated management of agricultural production systems (Zhang et al., 2002). Collecting data using sensors mounted on machines (i.e., unmanned ground vehicles [UGV], unmanned aerial vehicles [UAV], satellites, and airplanes) is non-destructive and applicable over large geographical areas. The analysis of ground- and aerial-based imagery is often a critical component of PA (Liaghat and Balasundram, 2010), especially given the importance of image identification and classification techniques in the practice.

Machine learning (ML), which includes deep learning (DL), can be used to detect pathogens, pests, nutrient deficiencies, and other abiotic stressors (Teke et al., 2013). In the case of ML without the use of DL, hereinafter referred to as "non-DL", several feature extraction methods (e.g., histogram of oriented gradients [HOG], speeded up robust features [SURF], and gray level co-occurrence matrix [GLCM]) have been evaluated and compared (Shin et al., 2020). The best extraction method can subsequently be used as an input for the learning algorithm. In comparison to non-DL approaches, DL is more effective in handling additional complexity and hierarchical structure in the data (Qi et al., 2017). The key aspect of DL is that it identifies and extracts the best features during the learning procedure (LeCun et al., 2015). The processing pipeline of DL is therefore simpler than that of non-DL, because the optimal feature representation is learned by the algorithm automatically (Alom et al., 2018). Furthermore, the testing time of DL, in which the algorithm processes data it has not seen during training, is typically faster than that of non-DL approaches (Chen et al., 2014). Fast processing time is critical for algorithms that will be deployed on hardware (e.g., a field-programmable gate array [FPGA] or a mobile application), as it can provide producers with results in a minimal amount of time and facilitate rapid management decisions.

DL can be categorised into three types according to the method and purpose of the study. The first type is unsupervised learning, which does not require data labels and is optimal for identifying representative features with little data; two examples are the deep belief network and the deep autoencoder (Vincent et al., 2010). The second type is the recurrent neural network, which fits well with sequential data processing, such as protein or polymer sequences. The last type is the convolutional neural network (CNN), which has generated excellent results in image recognition and natural language processing (Deng and Platt, 2014). Among the three types of DL, the CNN has been used in various disciplines, along with advanced computer vision (CV), and is the most popular for detecting crop diseases due to its high classification accuracy (CA) in image recognition (Szegedy et al., 2015, 2016).

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is an annual competition aimed at fostering the development of object detection and image classification algorithms using the annotated ImageNet dataset (Deng et al., 2009). Outcomes from the ILSVRC include milestone algorithms and techniques in the field of CV and DL. In this study, four established, state-of-the-art CNNs (AlexNet, SqueezeNet, GoogLeNet, and ResNet-50), along with two developed by the authors, SqueezeNet-MOD1 and SqueezeNet-MOD2, were evaluated and compared for PM detection on strawberry leaves.

The four selected CNNs, AlexNet, SqueezeNet, GoogLeNet, and ResNet-50, are commonly used in agricultural applications (Kamilaris and Prenafeta-Boldú, 2018; Jiang et al., 2019). For example, AlexNet (ILSVRC 2012 winner) and SqueezeNet were compared for detecting disease in tomatoes and yielded similar accuracies (Durmuş et al., 2017); however, for mobile applications with light memory requirements, SqueezeNet was found to be more appropriate. GoogLeNet (ILSVRC 2014 winner), also known as Inception, has also been used for disease detection in cassava. For example, Ramcharan et al. (2017) determined that Inception V3 was more effective than previous versions and was well suited to detecting diseases in cassava (Szegedy et al., 2016). However, they reported some limitations of Inception related to memory requirements and computational complexity. In studies comparing AlexNet with GoogLeNet, the differences were variable; for example, Mohanty et al. (2016) showed that GoogLeNet was more effective than AlexNet in detecting an array of 26 diseases found on 14 different crop species, whereas the accuracy rates for the two algorithms were similar when used to detect nine diseases common to tomatoes. Lastly, Fuentes et al. (2017) compared different versions of the ResNet algorithm (ILSVRC 2015 winner) for detecting tomato diseases, where it was determined that ResNet-50 outperformed ResNeXt-50 (ILSVRC 2016 winner).

Despite the application and success of different CNNs for detecting crop diseases, there is no existing study that demonstrates their applicability for detecting PM on strawberry leaves. Developing a DL technique that could rapidly assess PM status in strawberry fields would help inform producers when targeted applications of fungicides are required and reduce disease-scouting labour requirements. Hence, the objectives of this paper are to differentiate between PM-infected and uninfected strawberry leaves by (1) comparing the performance of supervised non-DLs and CNNs; (2) modifying the architectures of four CNNs (AlexNet, SqueezeNet, GoogLeNet, and ResNet-50) and developing SqueezeNet-MOD1 and SqueezeNet-MOD2; and (3) evaluating their performance with respect to CA, computation time, and memory requirements.

2. Materials and methods

2.1. Image datasets

Healthy and infected strawberry leaves were collected from Balamore Farm Ltd. (45°24′35.4″N, 63°34′26.3″W) and Millen Farms Ltd. (45°23′57.6″N, 63°33′31.1″W) in the summers of 2018 and 2019 under the supervision of disease specialists. The strawberry varieties were Ruby June and Albion. Over the plant growth stages, the formation of asexual conidia occurs when petioles develop, and mycelium begins to spread. Mycelium first appears on leaves and petioles and eventually spreads to the fruits. The data consisted of images acquired during the post-germination phase and prior to the fruit-bearing stage. After collection, the leaves were placed in coolers with icepacks in the field and preserved in an icebox during transportation. Images were taken immediately after arriving at the lab, with a person holding a digital single-lens reflex (DSLR) camera: an EOS 1300D (Canon Inc., Tokyo, Japan) at 3456 × 5184 pixels (raw CR2 format). The camera was set to a 1/15 s exposure time with the shutter priority option, ISO 100, and a 46 mm focal length, under 5,500 K colour temperature illumination mimicking direct sunlight. The total number of images collected from 2018 to 2019 consisted of 677 healthy and 773 infected leaves, for a total of 1450 leaves.


The computations were implemented in MATLAB R2019a (The MathWorks Inc., Natick, MA, U.S.A.) using an Intel® Core™ i7-8700 CPU @ 3.20 GHz with 48.0 GB RAM and a 64-bit Windows 10 operating system. All images were first cropped to 908 × 908 pixels and then resized to 227 × 227 pixels or 224 × 224 pixels, depending on the requirement of the architecture, to ensure that the relative size of the leaves was consistent across all images and to reduce the computational demand. The required input size for AlexNet, SqueezeNet, SqueezeNet-MOD1, and SqueezeNet-MOD2 was 227 × 227 pixels, whereas GoogLeNet and ResNet-50 required an input size of 224 × 224 pixels. Images were adjusted to ensure that the relative proportions of PM were uniform across all images, to reduce the computational complexity, and to improve the efficiency of the processing in terms of computational time.

Data augmentation in DL is essential for models to be generalised so that they can be applied more easily to real field situations (Taylor and Nitschke, 2017). In addition, Rasti et al. (2019) identified that at least 10,000 observations are required to optimise the overall performance of DL algorithms and to minimise issues related to model overfitting. Cawley and Talbot (2010) demonstrated that overfitting is mainly prevented through careful model selection; they examined the overfitting problem using leave-one-out cross-validation and investigated the effects of overfitting on model selection. Hence, data augmentation was performed on the whole dataset to improve the robustness of the architectures and increase the number of observations (Liu et al., 2018). There are two types of data augmentation: geometrical transformation (e.g., rotating, flipping, cropping, and resizing the images; Fadaee et al., 2017) and intensity transformation (e.g., noise, colour change, brightness enhancement; Fuentes et al., 2017). Among them, a rotation technique was applied in order to represent the different shapes and directions of the leaves in the field. Hence, data augmentation was carried out using clockwise rotations of 0° (the original images), 45°, 90°, 135°, 180°, 225°, 270°, and 315°. Therefore, the original 1450 observations were increased eight-fold, giving 5416 observations representing healthy leaves and 6184 observations representing infected leaves, for a total of 11,600 observations.
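As an illustration, the sketch below shows one way such an eight-fold rotation augmentation could be scripted in Python with PIL. The folder layout, the JPEG file extension, and the centre-crop handling of the 45° rotations are assumptions for demonstration; they are not the authors' MATLAB implementation.

```python
from pathlib import Path
from PIL import Image

# Clockwise rotation angles used for the eight-fold augmentation.
ANGLES = [0, 45, 90, 135, 180, 225, 270, 315]

def augment_leaf_images(src_dir: str, dst_dir: str, out_size: int = 227) -> None:
    """Centre-crop each image to a square, rotate it eight times, and resize."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for img_path in sorted(Path(src_dir).glob("*.jpg")):
        img = Image.open(img_path).convert("RGB")
        side = min(img.size)                      # square crop (908 x 908 in the paper)
        left = (img.width - side) // 2
        top = (img.height - side) // 2
        img = img.crop((left, top, left + side, top + side))
        for angle in ANGLES:
            # PIL rotates counter-clockwise, so the angle is negated for a
            # clockwise rotation; the 45-degree family clips the corners here,
            # which is one possible handling (the paper does not specify one).
            rotated = img.rotate(-angle, resample=Image.BILINEAR)
            rotated = rotated.resize((out_size, out_size), Image.BILINEAR)
            rotated.save(dst / f"{img_path.stem}_rot{angle:03d}.png")
```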
2.2. Comparisons of non-DLs and CNNs

The performance with respect to CA of the non-DL and CNN techniques was compared using the original dataset of 1450 images. In our recent study, Shin et al. (2020) compared the classification accuracies of three feature extraction techniques (HOG, SURF, and GLCM) combined with two supervised non-DLs (artificial neural network [ANN] and support vector machine [SVM]). The results of that study showed that the combinations of SURF with ANN and of GLCM with SVM were optimal for detecting PM on strawberry leaves (Shin et al., 2020).

2.3. Architectures of CNNs

A CNN has convolutional layers that extract features from the input images. The convolutional layers consist of a filter and an activation function, and the leaf images are classified as healthy or infected by the CNN constructed from the extracted features. AlexNet, SqueezeNet, GoogLeNet, and ResNet-50 all have different properties regarding depth, size, number of parameters, and image input size (Table 1).

Table 1
Depth, size, parameters, and image input size for each deep learning architecture.

Network      Depth   Size     Parameters (millions)   Image input size
AlexNet      8       227 MB   61.0                    227-by-227
SqueezeNet   18      4.6 MB   1.2                     227-by-227
GoogLeNet    22      27 MB    7.0                     224-by-224
ResNet-50    50      96 MB    25.6                    224-by-224
CNNs have several hyperparameters that need to be tuned through optimisation trials. The optimal hyperparameter combination for the CNNs was determined to be: iterations = 5, base learning rate = 0.001, max epochs = 10, and mini-batch size = 64. This combination of parameters was applied to all CNN algorithms. To identify and optimise the best classifier models, randomised hold-out validation was carried out, whereby the augmented dataset was randomly partitioned into 80% (9280 images) for model training and 20% (2320 images) for model testing. To ensure stability in the model evaluation metrics, the hold-out procedure was repeated five times and the average accuracy metrics were reported.

The learning rate is one of the most important hyperparameters to tune when training CNNs: since DL models are primarily trained by stochastic gradient descent optimisers, it tells the optimiser how far to move the weights in the direction opposite to the mini-batch gradient. Adjusting the learning rate can modify the behaviour of the model and yield different results. For example, if the learning rate is low, the optimisation time increases, but the training process is more stable. Conversely, if the learning rate is high, the optimisation time decreases, but the model may not converge on an optimal solution and, in some cases, may diverge (Sutskever et al., 2013). The number of epochs is also a crucial hyperparameter; it defines the number of times that the learning algorithm will process the entire training dataset. An epoch can be comprised of one or more batches, and the mini-batch size can be chosen according to the size of the data. Usually, a batch size of 32 is a good starting point, while batch sizes of 64, 128, and 256 can be applied as well (Keskar et al., 2016); identifying the optimal batch size within this range is generally suggested.
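To make the protocol concrete, the following Python sketch wires the stated hyperparameters (base learning rate 0.001, 10 epochs, mini-batch size 64) into a repeated, randomised 80/20 hold-out using PyTorch. The authors implemented the study in MATLAB R2019a, so the library choice, the "data/augmented" ImageFolder layout, and the use of an untrained SqueezeNet as the classifier are illustrative assumptions only.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, models, transforms

def run_holdout(repeat: int, data_dir: str = "data/augmented") -> float:
    """One randomised 80/20 hold-out run; returns the test classification accuracy."""
    tfm = transforms.Compose([transforms.Resize((227, 227)), transforms.ToTensor()])
    full = datasets.ImageFolder(data_dir, transform=tfm)        # two classes: healthy / infected
    n_train = int(0.8 * len(full))                              # 80% for training
    split_gen = torch.Generator().manual_seed(repeat)           # fresh random split per repeat
    train_set, test_set = random_split(full, [n_train, len(full) - n_train], generator=split_gen)
    train_loader = DataLoader(train_set, batch_size=64, shuffle=True)   # mini-batch size = 64
    test_loader = DataLoader(test_set, batch_size=64)

    model = models.squeezenet1_1(num_classes=2)                 # binary classifier
    optimiser = torch.optim.SGD(model.parameters(), lr=0.001)   # base learning rate = 0.001
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(10):                                         # max epochs = 10
        model.train()
        for x, y in train_loader:
            optimiser.zero_grad()
            loss_fn(model(x), y).backward()
            optimiser.step()

    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in test_loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total

# Five repeats of the hold-out procedure, then average the accuracies:
# mean_ca = sum(run_holdout(r) for r in range(5)) / 5
```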


2.3.1. AlexNet

AlexNet consists of five convolutional layers, three fully connected layers, and a softmax function. In the first convolutional layer, 96 kernels (i.e., filters) with a size of 11 × 11 × 3 are applied with a stride of four to the 227 × 227 input images. Among the five convolutional layers, the first and second are followed by overlapping max pooling, whereas the remaining convolutional layers are connected directly to each other. The last convolutional layer is followed by overlapping max pooling before connecting to the two fully connected layers. The second fully connected layer is linked to the softmax classifier with two class labels. The architecture of AlexNet in this study was changed from 1 × 1 × 1000 to 1 × 1 × 2 in the last fully-connected layer, which performs the binary classification (i.e., healthy and infected; Fig. 1).

Fig. 1. Architecture of AlexNet. Conv: Convolutional layer. MXP: Max pooling layer. FC: Fully Connected layer. (The figure traces the 227 × 227 input through five convolutional layers of 96, 256, 384, 384, and 256 filters with three 3 × 3, stride-2 max pooling layers, followed by fully connected layers of 9216, 4096, and 4096 units and a two-class softmax output.)
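The final-layer change described above (a 1000-way ImageNet output replaced by a two-way output) can be illustrated in Python as follows. Torchvision's pretrained AlexNet is used purely as a stand-in for the authors' MATLAB network, and the weights enum assumes a recent torchvision release.

```python
from torch import nn
from torchvision import models

# Load an ImageNet-pretrained AlexNet and replace its 1000-way output layer
# with a 2-way layer (healthy vs. infected), mirroring the 1 x 1 x 1000 to
# 1 x 1 x 2 change described above. This is an illustrative PyTorch analogue,
# not the authors' MATLAB layer graph.
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
in_features = model.classifier[6].in_features      # 4096 inputs to the last FC layer
model.classifier[6] = nn.Linear(in_features, 2)    # new binary classification head

# Only the new head starts from random weights; the earlier layers can be
# fine-tuned with the base learning rate of 0.001 reported in Section 2.3.
```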

2.3.2. GoogLeNet

GoogLeNet is a 22-layer deep network that has more hidden layers yet 12 times fewer parameters than AlexNet. The major characteristic of GoogLeNet is that it was the first CNN to introduce the inception module. Inception layers keep a high resolution for small details in the images while also covering a bigger area. Hence, the inception module applies 1 × 1, 3 × 3, and 5 × 5 convolutional filters in parallel and characterises objects across multiple scales (Fig. 2). These different filter sizes allow the network to process objects at multiple scales according to the needs of the task.

Fig. 2. Inception module in GoogLeNet. (The figure shows the previous layer feeding parallel 1 × 1, 3 × 3, and 5 × 5 convolutions whose outputs are merged by filter concatenation.)
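A simplified inception-style block corresponding to Fig. 2 is sketched below; the channel counts are illustrative assumptions, and GoogLeNet's dimension-reduction and pooling branches are omitted for brevity.

```python
import torch
from torch import nn

# A simplified Inception-style block mirroring Fig. 2: parallel 1x1, 3x3 and
# 5x5 convolutions whose outputs are concatenated along the channel axis.
class MiniInception(nn.Module):
    def __init__(self, in_ch: int, c1: int = 16, c3: int = 32, c5: int = 16):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.branch3 = nn.Conv2d(in_ch, c3, kernel_size=3, padding=1)  # keeps spatial size
        self.branch5 = nn.Conv2d(in_ch, c5, kernel_size=5, padding=2)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Filter concatenation: every branch sees the same input at a different scale.
        return torch.cat([self.act(self.branch1(x)),
                          self.act(self.branch3(x)),
                          self.act(self.branch5(x))], dim=1)

# x = torch.randn(1, 64, 28, 28); MiniInception(64)(x).shape -> (1, 64, 28, 28)
```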

2.3.3. ResNet-50

The original implementation of ResNet consists of 152 layers, and ResNet-50 is a smaller version of ResNet-152. ResNet-50 is a CNN that is trained on more than a million images from the ImageNet database and is 50 layers deep (He et al., 2016). Each ResNet block is either two or three layers deep. The input images for the network are 224 × 224 pixels and, except for the first layer (7 × 7 convolution window), all operations use a 3 × 3 convolution window or smaller. To reduce the number of calculations, 1 × 1, 3 × 3, and 1 × 1 convolution windows were built into the residual block. Because this process resembles a bottleneck, it is called a bottleneck layer (Fig. 3).

Fig. 3. ResNet-50 architecture used in this study. Conv: Convolutional layer. MXP: Max pooling layer. Avgpool: Average pool. FC: Fully Connected layer. (The figure shows the 224 × 224 input passing through a 7 × 7, stride-2 convolution with 64 filters, a 3 × 3, stride-2 max pooling layer, four stages of 1 × 1/3 × 3/1 × 1 bottleneck blocks repeated 3, 4, 6, and 3 times with 256, 512, 1024, and 2048 output channels, average pooling, a fully connected layer, and a softmax output.)
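The bottleneck structure referred to above can be written out as a small Python module, assuming an identity shortcut and illustrative channel counts; ResNet-50's actual blocks also use projection shortcuts and stride-2 downsampling, which are omitted here.

```python
import torch
from torch import nn

# A minimal bottleneck residual block in the spirit of ResNet-50: a 1x1
# convolution squeezes the channels, a 3x3 convolution processes them, and a
# 1x1 convolution expands them again before the skip connection is added.
class Bottleneck(nn.Module):
    def __init__(self, channels: int, squeeze: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, squeeze, kernel_size=1, bias=False),
            nn.BatchNorm2d(squeeze), nn.ReLU(inplace=True),
            nn.Conv2d(squeeze, squeeze, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(squeeze), nn.ReLU(inplace=True),
            nn.Conv2d(squeeze, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(x + self.block(x))  # identity shortcut
```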

2.3.4. SqueezeNet and SqueezeNet modifications

SqueezeNet can achieve levels of accuracy similar to those of AlexNet while being 510 times smaller in size (Iandola et al., 2016). According to Iandola et al. (2016), it is considered a feasible CNN for use on an FPGA and other hardware due to its limited memory requirements. Three points distinguish it from previous large CNNs: firstly, the number of parameters in the convolutional layers is reduced by replacing 3 × 3 filters with 1 × 1 filters; secondly, the number of input channels to the 3 × 3 filters is decreased; and lastly, downsampling late in the network gives the convolution layers large activation maps, a design strategy that can lead to a higher CA. The key layer introduced in SqueezeNet is the fire module, which consists of a squeeze layer and an expand layer; SqueezeNet is built from many fire modules and a few pooling layers (Fig. 4). The squeeze layer shrinks the features using 1 × 1 convolutional filters, and the features are then expanded with a combination of 1 × 1 and 3 × 3 convolutional filters. The fire module is defined by three parameters: the number of 1 × 1 filters in the squeeze layer, the number of 1 × 1 filters in the expand layer, and the number of 3 × 3 filters in the expand layer (Fig. 5). SqueezeNet was modified by adjusting the activation dimensions from 1 × 1 × 1000 to 1 × 1 × 2. The SqueezeNet algorithm implemented in this study was developed based on SqueezeNet version 1.1, which is similar to SqueezeNet version 1.0 but differs in how the max pooling is configured; version 1.1 requires 2.4 times less computation than version 1.0 without negatively impacting accuracy (Bressem et al., 2020). The purpose of modifying SqueezeNet was to improve its performance with respect to CA and testing time. In addition to modifying the end of the architecture, as was done for AlexNet, GoogLeNet, and ResNet-50, SqueezeNet was modified by changing the whole architecture to make the algorithm more applicable to hardware and to take advantage of its light memory footprint. SqueezeNet-MOD1 was developed by bypassing the seventh fire module (Fig. 4(b)), whereas SqueezeNet-MOD2 was developed by bypassing the second, fifth, and seventh fire modules (Fig. 4(c)).

Fig. 4. The architectures of SqueezeNet and the SqueezeNet modifications: (a) SqueezeNet (version 1.1), (b) SqueezeNet-MOD1 with bypassing of the Fire 7 module, (c) SqueezeNet-MOD2 with bypassing of the Fire 2, 5, and 7 modules. Conv: Convolutional layer. Fire: Fire module. Avgpool: Average pool.

Fig. 5. Fire module in the SqueezeNet architecture. ReLU: Rectified Linear Unit, which is an activation function. (The figure shows a squeeze layer of 1 × 1 convolution filters followed by ReLU, feeding an expand layer of 1 × 1 and 3 × 3 convolution filters followed by ReLU.)
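An illustrative Python version of the fire module in Fig. 5 is given below, together with a comment sketching one way the "bypassing" behind SqueezeNet-MOD1/MOD2 could be expressed. The channel counts and the torchvision layer index are assumptions; the authors built their modifications as a MATLAB layer graph.

```python
import torch
from torch import nn

# A sketch of the Fire module (Fig. 5): the squeeze layer shrinks the channel
# count with 1x1 convolutions, and the expand layer restores it with parallel
# 1x1 and 3x3 convolutions whose outputs are concatenated.
class Fire(nn.Module):
    def __init__(self, in_ch: int, squeeze_ch: int, expand1_ch: int, expand3_ch: int):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1 = nn.Conv2d(squeeze_ch, expand1_ch, kernel_size=1)
        self.expand3 = nn.Conv2d(squeeze_ch, expand3_ch, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.act(self.squeeze(x))
        return torch.cat([self.act(self.expand1(s)), self.act(self.expand3(s))], dim=1)

# "Bypassing" a Fire module, as done for SqueezeNet-MOD1/MOD2, amounts to
# removing it from the stack, e.g. with torchvision's squeezenet1_1 (the index
# below is an assumption about which entry corresponds to Fire 7):
# from torchvision import models
# net = models.squeezenet1_1(num_classes=2)
# net.features = nn.Sequential(*(m for i, m in enumerate(net.features) if i != 10))
```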


2.4. Evaluation of CNNs

A true positive (TP) represents the number of leaves that were correctly classified as infected, and a true negative (TN) represents the number of leaves that were correctly classified as non-infected. A false negative (FN) reflects the number of leaves that were incorrectly classified as non-infected and corresponds to a Type II error, which is related to recall/sensitivity. A false positive (FP) reflects the number of leaves that were incorrectly classified as infected and corresponds to a Type I error, which is related to precision. High values of FN indicate that infected leaves were incorrectly identified as healthy, while high values of FP indicate that healthy leaves were incorrectly identified as infected. The CA was the primary accuracy metric used because it is the most intuitive metric for gauging the overall performance of the model; however, it is only informative when the values of FP and FN are balanced. Lastly, precision, sensitivity/recall, specificity, and F1-score were calculated and used to compare the various models. A detailed explanation of these evaluation metrics is given in Hossin and Sulaiman (2015). Again, accuracy metrics were averaged over the five repeats of the randomised hold-out validation.
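These metrics follow directly from the four confusion-matrix counts. The small helper below computes them, with infected leaves treated as the positive class; the counts in the usage comment are hypothetical and do not come from this study.

```python
# Evaluation metrics computed from confusion-matrix counts (TP, TN, FP, FN),
# with "infected" as the positive class.
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp)     # penalised by Type I errors (FP)
    recall = tp / (tp + fn)        # sensitivity; penalised by Type II errors (FN)
    specificity = tn / (tn + fp)
    return {
        "CA": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "sensitivity_recall": recall,
        "specificity": specificity,
        "F1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical counts for a 2320-image test set:
# classification_metrics(tp=1130, tn=1100, fp=45, fn=45)["CA"] -> approx. 0.961
```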

2.5. Statistical analysis

2.5.1. Methods for factorial analysis

To determine which algorithm provided the highest CA and the shortest testing time, respectively, each CNN algorithm was considered as an input factor, and the CA and computation time were considered as responses (Eq. (1)). It should be noted that the statistical analyses of CA and testing time were performed separately. The statistical analysis was performed using Minitab version 18 (Minitab, LLC, State College, PA, U.S.A.). A normality test of the data was carried out before conducting a one-way ANOVA. If the p-value was < 0.05, a multiple means comparison (MMC) test was applied to compare the significant differences between treatment groups. Tukey's test was selected because it accounts for the magnitude of the experimental error and was used to compare the differences between groups.

The model for factorial analysis is represented as follows:

Y_ij = μ + α_i + ε_ij    (1)

where:
Y_ij = classification accuracy (%),
μ = overall mean,
α_i = effect of the "deep learning algorithm" factor on the response at the ith level,
ε_ij = the error term (uncontrollable and uncontrolled factors),
i = 1, 2, …, a; j = 1, 2, …, n.

In this study, a = 6 (number of algorithms) and n = 5 (replications).
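For reference, the one-way ANOVA implied by Eq. (1) can be reproduced outside Minitab. The Python sketch below computes the F-value and p-value from a = 6 groups of n = 5 replicate accuracies; the input arrays are placeholders, not the study's measurements.

```python
import numpy as np
from scipy import stats

def one_way_anova(groups):
    """One-way ANOVA for the model Y_ij = mu + alpha_i + eps_ij.

    `groups` is a list of 1-D NumPy arrays, one array of replicate CA values
    per algorithm. Returns (F-value, p-value).
    """
    a = len(groups)                       # number of treatment levels (algorithms)
    n_total = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()
    ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_treat = ss_treat / (a - 1)         # DF = a - 1 (5 in this study)
    ms_error = ss_error / (n_total - a)   # DF = N - a (30 - 6 = 24 in this study)
    f_value = ms_treat / ms_error
    p_value = stats.f.sf(f_value, a - 1, n_total - a)
    return f_value, p_value

# The same test is available directly as scipy.stats.f_oneway(*groups); if
# p < 0.05, a Tukey multiple-comparison test can then be used to group the
# algorithms that do not differ significantly.
```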


3. Results and discussion

3.1. Comparison of non-DLs and CNNs

For the ANN, SURF feature extraction was selected as the best combination, and for the SVM, GLCM feature extraction was selected as the best combination (Shin et al., 2020). In that study, both the SURF and GLCM feature extractions showed the best performance with the strawberry leaf dataset. The non-DL techniques showed an underfitting problem when an extra summer season of data was added through the data augmentation process, but showed a tendency to overfit with small datasets (Shin et al., 2020).

A statistically significant difference (p < 0.001) between the CA results of the ten ML techniques was observed (Table 2). Among the ten techniques, the pre-trained CNN models (AlexNet, SqueezeNet, GoogLeNet, and ResNet-50) showed significantly higher CAs. Furthermore, the modified SqueezeNet learners (i.e., SqueezeNet-MOD1 and SqueezeNet-MOD2) also performed better than the non-DLs. In comparison to Shin et al. (2020), there was a decrease in CA for the non-DL techniques; these differences were likely attributable to the larger dataset used in this study as well as to differences in the growing season between the studies. The addition of noise and orientation variables through data augmentation can cause a decrease in CA. For example, Ebrahim et al. (2018) investigated the use of image rotation as a means of data augmentation to train a CNN to classify cell images and demonstrated a decrease in model accuracy when a shallow network was used. In DL, on the other hand, a larger dataset makes the model more robust, and data augmentation is a way of achieving that. Overall, the results show that the CNN approaches performed significantly better than the non-DL approaches.

Table 2
Performance of non-DLs and CNNs using the original dataset (1450 images).

Type      Techniques         CA (%) [a]       Grouping [b]
non-DLs   SURF + ANN         85.44 ± 1.12     d
          GLCM + ANN         69.28 ± 0.92     e
          SURF + SVM         55.20 ± 2.28     g
          GLCM + SVM         62.28 ± 2.42     f
CNNs      AlexNet            96.55 ± 0.81     ab
          SqueezeNet         94.62 ± 0.79     abc
          GoogLeNet          94.42 ± 0.47     abc
          ResNet-50          98.01 ± 0.47     a
          SqueezeNet-MOD1    93.72 ± 2.75     bc
          SqueezeNet-MOD2    91.01 ± 0.90     c
[a] The standard deviation of five repetitions.
[b] Means that do not share a letter are significantly different.

3.2. Experimental performance with augmented images

All CA values from the six CNN algorithms were above 92%, suggesting that CNN algorithms show promise in detecting PM on strawberry leaves. ResNet-50 had the highest test CA of 98.11% and the highest precision of 98.46% (Table 3). SqueezeNet-MOD1, developed by the authors, had the next highest accuracy with a CA of 96.38%, which was slightly higher than the CA from GoogLeNet (Table 3). ResNet-50 outperformed the other CNN algorithms for the other evaluation metrics as well, including precision, sensitivity/recall, specificity, and F1-score.

An algorithm that generates results with high precision (i.e., low FP) could result in cost savings related to fungicides. If cost is a more important consideration than processing time, ResNet-50 would be recommended, as it has the highest precision (98.46%). Furthermore, ResNet-50 showed the highest sensitivity/recall (97.99%). High sensitivity/recall (i.e., low FN) indicates that fewer cases of PM are missed. In the early stages of diagnosis, sensitivity/recall could be prioritised when trying to find the maximum number of possibly infected leaves; indeed, if PM detection is the most important factor, the values of FN should be reduced. Specificity should be considered in the final diagnosis stage or when the effectiveness of treatment is low even after spraying the primary fungicide. Hence, a balance between sensitivity/recall and specificity is important to consider. The F1-score is useful for interpreting results where there is an uneven class distribution; if the gap between the FP and FN values is large, the F1-score should be considered first. Among the CNN techniques compared, ResNet-50 received the highest F1-score of 98.23% (Table 3).

Table 3
Performance of six CNN algorithms using the augmented dataset (11,600 images).

Metric               AlexNet (%)      SqueezeNet (%)   GoogLeNet (%)    ResNet-50 (%)    SqueezeNet-MOD1 (%)   SqueezeNet-MOD2 (%)
CA [a][b]            95.59 ± 0.49 b   95.80 ± 0.72 b   96.36 ± 1.16 b   98.11 ± 0.28 a   96.38 ± 0.57 b        92.61 ± 0.84 c
Precision            96.14 ± 1.93     96.90 ± 1.20     96.94 ± 2.12     98.46 ± 0.37     97.00 ± 1.21          92.96 ± 1.21
Sensitivity/recall   95.64 ± 1.83     95.29 ± 1.64     96.31 ± 2.64     97.99 ± 0.42     95.84 ± 1.77          93.48 ± 2.49
Specificity          95.54 ± 2.34     96.49 ± 1.41     96.74 ± 2.61     98.25 ± 0.37     96.30 ± 2.30          92.52 ± 1.96
F1-score             95.86 ± 0.44     96.07 ± 0.73     96.56 ± 1.18     98.23 ± 0.26     96.29 ± 0.26          94.10 ± 1.18
* The bold font in the original table marks the highest CA among the values.
[a] The standard deviation of five repetitions.
[b] Means that do not share a letter are significantly different.

The findings from this study are consistent with other studies that compared a variety of DL algorithms. For example, Wang et al. (2017) compared AlexNet, GoogLeNet, VGG, and ResNet-50 for classifying eight different disease types, where ResNet-50 was most effective in the detection of a variety of lung diseases. Similarly, Asad and Bais (2019) compared SegNet, UNET, VGG16, and ResNet-50 for predicting weed density in canola fields; among the four CNN algorithms, ResNet-50 showed the best performance, with an 82.88% mean intersection over union (IOU), a 98.69% frequency-weighted IOU, and a 99.48% accuracy. Cruz et al. (2019) evaluated the performance of six CNN algorithms (i.e., AlexNet, GoogLeNet, Inception V3, ResNet-50, ResNet-101, and SqueezeNet) to detect grapevine yellows disease in red grapevine and found that ResNet-50 was the most optimal algorithm, with a CA of 99.18%.


A significant difference (p < 0.001) in CA between the CNN algorithms was determined (Table 4); therefore, an MMC using Tukey's test was carried out. Here, it was determined that there were no significant differences among the algorithms, with the exception of ResNet-50. ResNet-50 showed a significantly higher CA (98.11%), which aligns with our expectations based on similar results reported by Bianco et al. (2018) (Table 3); however, the magnitude of the difference was small.

Table 4
Analysis of variance of CA of six CNN algorithms using the augmented dataset (11,600 images).

Source       DF [a]   SS [b]   MS [c]   F-value   p-value
Algorithms   5        81.12    16.22    30.30     <0.001
Error        24       12.85    0.53
Total        29       93.98
[a] Degrees of freedom.
[b] Sum of squares.
[c] Mean square.

3.3. Computation time

Computation time is also a crucial parameter that needs to be considered in order to facilitate real-time processing. Even though the popularity of DL has increased and computational capability has been improving, the amount of time required to train the algorithms is a critical consideration (Justus et al., 2018). The training and testing times shown in Table 5 were calculated from the training time per epoch and the number of epochs that should be performed to reach the desired level of accuracy. Training time was the time required to train the algorithms using 9280 images (i.e., 80% of the total augmented dataset), and testing time was the time required to test the algorithms using 2320 images (i.e., 20% of the total augmented dataset).

Table 5
Computation times of six CNN algorithms using the augmented dataset (11,600 images). Training time was calculated with 9280 images and testing time with 2320 images.

Algorithms        Training time (s) [a]   Testing time (s) [b]
AlexNet           3451.88 ± 40.22 [c]     40.73 ± 0.33 c [d]
SqueezeNet        3555.40 ± 261.15        73.40 ± 3.81 b
GoogLeNet         9365.25 ± 41.61         87.13 ± 2.38 b
ResNet-50         21738.29 ± 420.16       178.20 ± 8.51 a
SqueezeNet-MOD1   5141.80 ± 27.31         68.55 ± 3.99 b
SqueezeNet-MOD2   3832.87 ± 95.46         45.70 ± 0.85 c
* The bold font in the original table marks the shortest computation times among the values, which are not significantly different from each other.
[a] Training time (seconds) for processing 9280 images.
[b] Testing time (seconds) for processing 2320 images.
[c] The standard deviation of five repetitions.
[d] Means that do not share a letter are significantly different.

Table 6
Analysis of variance of computation time of six CNN algorithms using the augmented dataset (11,600 images).

Source       DF [a]   SS [b]    MS [c]    F-value   p-value
Algorithms   5        62,771    12554.2   127.64    <0.001
Error        24       2361      98.4
Total        29       65,131
[a] Degrees of freedom.
[b] Sum of squares.
[c] Mean square.

In Section 3.2, ResNet-50 was recommended based on having the highest CA; however, it also had the longest computation time compared to all other algorithms (Table 5). Significant differences in testing time (p < 0.001) between the CNN algorithms were observed (Table 6). AlexNet and SqueezeNet-MOD2 had the shortest computation times (Table 5); however, the optimal model should also consider CA and memory requirements. When considering computation time, AlexNet performed the fastest, with an average of 3451.88 s to train the algorithm and an average of 40.73 s to test the algorithm (Table 5). Both the training time and testing time of AlexNet excelled relative to all DL algorithms in our study.

Based on computational time, the findings from this study were consistent with those of Bianco et al. (2018). Shafiee et al. (2017) indicated that SqueezeNet is well suited for deployment in mobile hardware applications (Table 1). The processing time results for ResNet-50 are consistent with our expectation that it would require the longest computational time due to the model's complexity. In general, DL may require much longer training times compared to non-DL; however, this is potentially offset by the shorter testing time required (Busseti et al., 2012).

Overall, based on CA, computation time, and memory requirements for a hardware implementation, SqueezeNet would be recommended. When considering memory (Table 1), SqueezeNet required the least memory, and by removing one or three of the Fire modules for SqueezeNet-MOD1 and SqueezeNet-MOD2, respectively, the memory demands were further reduced. Both modifications of SqueezeNet had the smallest memory requirements in comparison to the other CNN algorithms presented in this study. Here, SqueezeNet-MOD1 could be selected when a high CA is required, and SqueezeNet-MOD2 could be selected when training and testing times need to be reduced and/or when memory requirements need to be minimised.

4. Conclusions

This study began with a comparison of non-DLs and CNNs on an image dataset consisting of PM-infected and uninfected strawberry leaves (1450 images). As we expected, CNNs performed better than non-DLs at distinguishing between infected and uninfected leaves, and this held when the dataset was increased and additional parameters were added. The overall goal of our study was to develop a deployable hardware system to detect PM in a strawberry field and provide recommendations for fungicide applications. Our testing was expanded by using an image-rotation data augmentation technique, leading to a total of 11,600 data points. ResNet-50 had a statistically higher CA (98.11%), but the CA among the rest of the algorithms was not significantly different. When considering potential hardware memory requirements, SqueezeNet-MOD2 had the lowest requirements. In terms of testing time, ResNet-50 required the most time, at 178.20 s, and AlexNet the least time, at 40.73 s, to process 2320 images. The testing time was significantly different among the CNN algorithms, with AlexNet and SqueezeNet-MOD2 having the shortest times.

The experimental results showed that CNN techniques are a promising tool for the development of a field-deployable strategy to detect PM on strawberry leaves. The ideal management scenario would be to prevent the disease at early onset; however, if the imagery is acquired during later stages of the disease, preventative practices may be unrealistic, and the disease may need to be treated with chemical applications. Using the proposed DL approach, it would be possible to install these algorithms on an agrochemical spraying hardware platform, an area of future research. Although this research was carried out in a laboratory setting where the lighting conditions and leaf orientation were kept constant, we fully recognise that the proposed techniques warrant future research with regard to their transferability to field conditions, where there are additional challenges related to irregular lighting conditions, leaf orientation, and overlapping leaves.

Specifically, this study could be extended to develop fully automated hardware (i.e., an FPGA or mobile application) in order to help producers who are struggling with PM disease. Future work will investigate the integration of the CNN algorithms with hardware and develop a disease management platform that would be easy for producers to use by providing accurate and fast results.

Funding

This research was funded by the Natural Science and Engineering Research Council of Canada (NSERC) Discovery Grants Program (RGPIN-2017-05815).


CRediT authorship contribution statement

Jaemyung Shin: Conceptualization, Methodology, Data curation, Writing - original draft, Investigation, Software. Young K. Chang: Conceptualization, Methodology, Data curation, Writing - review & editing, Software, Resources, Supervision. Brandon Heung: Conceptualization, Writing - review & editing. Tri Nguyen-Quang: Writing - review & editing, Supervision. Gordon W. Price: Writing - review & editing, Supervision. Ahmad Al-Mallahi: Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was also supported by the Nova Scotia Research and Innovation Graduate Scholarship Program and the Dalhousie Entrance/In-course Scholarship Programs. The authors would like to thank Millen Farms and Balamore Farm for providing field access for image collection and experiments.

References

Adam, L., Somerville, S.C., 1996. Genetic characterization of five powdery mildew disease resistance loci in Arabidopsis thaliana. Plant J. 9 (3), 341–356.
Alom, M.Z., Taha, T.M., Yakopcic, C., Westberg, S., Sidike, P., Nasrin, M.S., Asari, V.K., 2018. The history began from AlexNet: A comprehensive survey on deep learning approaches. arXiv preprint arXiv:1803.01164.
Asad, M.H., Bais, A., 2019. Weed detection in canola fields using maximum likelihood classification and deep convolutional neural network. Information Processing in Agriculture.
Bianco, S., Cadene, R., Celona, L., Napoletano, P., 2018. Benchmark analysis of representative deep neural network architectures. IEEE Access 6, 64270–64277.
Bressem, K.K., Adams, L., Erxleben, C., Hamm, B., Niehues, S., Vahldiek, J., 2020. Comparing different deep learning architectures for classification of chest radiographs. arXiv preprint arXiv:2002.08991.
Busseti, E., Osband, I., Wong, S., 2012. Deep learning for time series modeling. Technical report. Stanford University, pp. 1–5.
Cawley, G.C., Talbot, N.L., 2010. On over-fitting in model selection and subsequent selection bias in performance evaluation. J. Mach. Learn. Res. 11, 2079–2107.
Chen, Y., Lin, Z., Zhao, X., Wang, G., Gu, Y., 2014. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 7 (6), 2094–2107.
Cruz, A., Ampatzidis, Y., Pierro, R., Materazzi, A., Panattoni, A., De Bellis, L., Luvisi, A., 2019. Detection of grapevine yellows symptoms in Vitis vinifera L. with artificial intelligence. Comput. Electron. Agric. 157, 63–76.
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L., 2009. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp. 248–255.
Deng, L., Platt, J.C., 2014. Ensemble deep learning for speech recognition. In: Fifteenth Annual Conference of the International Speech Communication Association.
Durmuş, H., Güneş, E.O., Kırcı, M., 2017. Disease detection on the leaves of the tomato plants by using deep learning. In: 2017 6th International Conference on Agro-Geoinformatics, IEEE, pp. 1–5.
Ebrahim, M., Alsmirat, M., Al-Ayyoub, M., 2018. Performance study of augmentation techniques for HEp2 CNN classification. In: 2018 9th International Conference on Information and Communication Systems (ICICS), IEEE, pp. 163–168.
Fadaee, M., Bisazza, A., Monz, C., 2017. Data augmentation for low-resource neural machine translation. arXiv preprint arXiv:1705.00440.
Frank, A.L., McKnight, R., Kirkhorn, S.R., Gunderson, P., 2004. Issues of agricultural safety and health. Annu. Rev. Public Health 25, 225–245.
Fuentes, A., Yoon, S., Kim, S., Park, D., 2017. A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors 17 (9), 2022.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
Hossin, M., Sulaiman, M.N., 2015. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Manage. Process 5 (2), 1.
Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K., 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360.
Jacob, D., David, D.R., Sztjenberg, A., Elad, Y., 2008. Conditions for development of powdery mildew of tomato caused by Oidium neolycopersici. Phytopathology 98 (3), 270–281.
Jiang, B., He, J., Yang, S., Fu, H., Li, T., Song, H., He, D., 2019. Fusion of machine vision technology and AlexNet-CNNs deep learning network for the detection of postharvest apple pesticide residues. Artif. Intell. Agric. 1, 1–8.
Justus, D., Brennan, J., Bonner, S., McGough, A.S., 2018. Predicting the computational cost of deep learning models. In: 2018 IEEE International Conference on Big Data (Big Data), IEEE, pp. 3873–3882.
Kalia, A., Gosal, S.K., 2011. Effect of pesticide application on soil microorganisms. Arch. Agron. Soil Sci. 57 (6), 569–596.
Kamilaris, A., Prenafeta-Boldú, F.X., 2018. Deep learning in agriculture: A survey. Comput. Electron. Agric. 147, 70–90.
Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P., 2016. On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836.
Kobayashi, T., Kanda, E., Kitada, K., Ishiguro, K., Torigoe, Y., 2001. Detection of rice panicle blast with multispectral radiometer and the potential of using airborne multispectral scanners. Phytopathology 91 (3), 316–323.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521 (7553), 436.
Liaghat, S., Balasundram, S.K., 2010. A review: The role of remote sensing in precision agriculture. Am. J. Agric. Biolog. Sci. 5 (1), 50–55.
Liu, B., 2017. Sustainable strawberry production and management including control of strawberry powdery mildew.
Liu, B., Zhang, Y., He, D., Li, Y., 2018. Identification of apple leaf diseases based on deep convolutional neural networks. Symmetry 10 (1), 11.
McGrath, M.T., 2001. Fungicide resistance in cucurbit powdery mildew: experiences and challenges. Plant Dis. 85 (3), 236–245.
Mohanty, S.P., Hughes, D.P., Salathé, M., 2016. Using deep learning for image-based plant disease detection. Front. Plant Sci. 7, 1419.
Qi, C.R., Yi, L., Su, H., Guibas, L.J., 2017. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, pp. 5099–5108.
Ramcharan, A., Baranowski, K., McCloskey, P., Ahmed, B., Legg, J., Hughes, D.P., 2017. Deep learning for image-based cassava disease detection. Front. Plant Sci. 8, 1852.
Rasti, P., Ahmad, A., Samiei, S., Belin, E., Rousseau, D., 2019. Supervised image classification by scattering transform with application to weed detection in culture crops of high density. Remote Sens. 11 (3), 249.
Robert, P.C., 2002. Precision agriculture: a challenge for crop nutrition management. In: Progress in Plant Nutrition: Plenary Lectures of the XIV International Plant Nutrition Colloquium. Springer, Dordrecht, pp. 143–149.
Rosa, U.A., Upadhyaya, S.K., Koller, M., Josiah, M., Pettygrove, S., 2000. Precision farming in a tomato production system. In: Proceedings of the 5th International Conference on Precision Agriculture, Bloomington, Minnesota, USA, 16-19 July, 2000. American Society of Agronomy, pp. 1–15.
Shafiee, M.J., Li, F., Chwyl, B., Wong, A., 2017. SquishedNets: Squishing SqueezeNet further for edge device scenarios via deep evolutionary synthesis. arXiv preprint arXiv:1711.07459.
Shin, J., Chang, Y.K., Heung, B., Nguyen-Quang, T., Price, G.W., Al-Mallahi, A., 2020. Effect of directional augmentation using supervised machine learning technologies: A case study of strawberry powdery mildew detection. Biosyst. Eng. 194, 49–60.
Statistics Canada, 2020. Table 32-10-0169-01 Number of farm operators by sex, age and paid non-farm work, historical data. Retrieved from https://doi.org/10.25318/3210016901-eng.
Sutskever, I., Martens, J., Dahl, G., Hinton, G., 2013. On the importance of initialization and momentum in deep learning. In: International Conference on Machine Learning, pp. 1139–1147.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Rabinovich, A., 2015. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z., 2016. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826.
Taylor, L., Nitschke, G., 2017. Improving deep learning using generic data augmentation. arXiv preprint arXiv:1708.06020.
Teke, M., Deveci, H.S., Haliloğlu, O., Gürbüz, S.Z., Sakarya, U., 2013. A short survey of hyperspectral remote sensing applications in agriculture. In: 2013 6th International Conference on Recent Advances in Space Technologies (RAST), IEEE, pp. 171–176.
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A., 2010. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11 (Dec), 3371–3408.
Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R.M., 2017. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2097–2106.
Zhang, N., Wang, M., Wang, N., 2002. Precision agriculture—a worldwide overview. Comput. Electron. Agric. 36 (2–3), 113–132.
