Keywords: Disease recognition; Batch normalization; SE-Inception; Multi-scale feature extraction; Model implementation

Abstract: Aiming at the diseases of tomato and eggplant, we present a solanaceae disease recognition model based on SE-Inception. Our model uses batch normalization (BN) layers to accelerate network convergence. Besides, the SE-Inception structure and a multi-scale feature extraction module are adopted to improve the accuracy of the model. Our sample data set consists of 4 disease categories: whitefly, powdery mildew, yellow smut and cotton blight. We also add healthy leaves to it. In order to reduce overfitting, the data set is expanded by the data enhancement methods of translation, rotation and flip. Experiments show that the average recognition accuracy of this model is 98.29% and the model size is 14.68 MB on our constructed dataset. In addition, in order to verify the robustness of this model, it was also verified on the public PlantVillage data set, where the top-1 accuracy, top-5 accuracy and size of our proposed model are 99.27%, 99.99% and 14.8 MB, respectively. Moreover, we implemented a solanaceae disease image recognition system using this model on Android. The average recognition accuracy and the recognition time of a single photo are 95.09% and 227 ms, respectively. Our constructed model has a small number of parameters while maintaining high accuracy, which can meet the needs of automatic recognition of disease images on mobile devices. Data and code are available at https://github.com/Jujube-sun/diseaseRecognition.
⁎ Corresponding author at: P.O. Box 121, China Agricultural University, 17 Tsinghua East Road, Beijing 100083, PR China. E-mail address: lizb@cau.edu.cn (Z. Li).
https://doi.org/10.1016/j.compag.2020.105792
Received 16 May 2020; Received in revised form 10 August 2020; Accepted 12 September 2020
Available online 24 September 2020
0168-1699/ © 2020 Elsevier B.V. All rights reserved.
Z. Li, et al. Computers and Electronics in Agriculture 178 (2020) 105792
training time, which is difficult to deploy on the mobile devices. With the popularity of mobile devices, some scholars have also proposed lightweight networks such as MobileNetV1 (Howard et al. 2017), MobileNetV2 (Sandler et al. 2018), ShuffleNetV1 (Zhang et al. 2018) and ShuffleNetV2 (Ma et al. 2018). Xiaoqing et al. (2019) identify tomato leaf disease images based on an improved multi-scale AlexNet, with an accuracy of 92.7% and a model size of just 29.9 MB. Yang et al. (2019) compare MobileNetV1 and InceptionV3 (Szegedy et al. 2016) to realize plant disease recognition on mobile devices; the average recognition rates on the PlantVillage data set are 95.02% and 95.62%, respectively.

In order to solve the problem of large model size, this paper proposes a new solanaceous disease recognition method based on SE-Inception, inspired by GoogLeNet. Our model combines multi-scale feature extraction, SENet (Hu et al. 2017), InceptionV2 and batch normalization (Ioffe and Szegedy, 2015). The proposed model was trained and tested on our constructed dataset and the public PlantVillage dataset (Hughes and Salathe, 2015), and we also compared it with some existing lightweight networks in terms of recognition results and model size. Our model performs well compared with the others.

The remainder of this paper is organized in the following manner. Section 2 introduces the structure of the experimental data and the data preprocessing methods; in Section 3, we propose the network structure of our model; experimental results are described in Section 4; model implementation is described in Section 5. Finally, the paper is summarized in Section 6.

2. Datasets

2.1. Data acquisition

In this paper, we selected two datasets for the experiment. The first dataset was our constructed solanaceous disease dataset, and the second one was the PlantVillage dataset.

Our constructed solanaceous disease data consists of two parts. One part is from the AI Challenger 2018 (https://challenger.ai/competition/pdr2018) Crop Disease Challenge (1315 photos). The other part was taken under natural light in Xinyuan Sunshine Plantation Park, Yongqing County, Langfang City, Hebei Province (520 photos). In order to restore the real natural environment, we adopted multiple-angle shots and took pictures in the morning and afternoon. The shooting equipment was a Sony RX100M3 camera and a Huawei Honor 10 mobile phone. A total of 5 types of image samples of solanaceae were collected, including 4 kinds of diseases (powdery mildew, whitefly, cotton blight and yellow smut) and healthy leaves. The background of the images taken in the Xinyuan Sunshine Plantation Park is more complex than that of the images from the AI Challenger. An example of the image samples is shown in Fig. 1.

PlantVillage (http://www.plantvillage.org) is a plant disease data set. It contains a large number of plant disease images, covering 13 kinds of plants and 26 types of disease leaves in total. PlantVillage has a total of 38 classes and 54,305 images of plant disease leaves. In our experiment, we shuffled the dataset and divided it into a training set, a validation set and a test set according to the ratio of 6:2:2. The size of the original pictures is normalized to 224 × 224 as input for model training.

2.2. Data augmentation

By counting the total number of samples and their distribution across categories, we found that the samples of our constructed dataset are imbalanced: it contains 434 yellow smut, 161 cotton blight, 386 powdery mildew, 104 whitefly and 750 healthy-leaf images. Since unbalanced data affects the recognition performance of deep learning models (Buda et al. 2018), data enhancement is performed for the categories with a small amount of data. Color is one of the key features for disease identification, so the color information of the original pictures cannot be changed while augmenting the data. Based on the Keras framework, the following three data enhancement methods are mainly adopted: (1) Random flip: flip along the horizontal and vertical directions of the image. (2) Random angle rotation: rotate by a certain angle with the image center as the origin. (3) Image offset: shift the entire image along the horizontal or vertical direction by a certain distance. The enhanced data set distribution is: 690 yellow smut, 644 cotton blight, 674 powdery mildew, 602 whitefly and 750 healthy-leaf images. A detailed report of the dataset before and after applying the augmentation process is shown in Table 1.

3. Architecture of our constructed model

Our model uses the network structure of GoogLeNet as a reference to construct a new, lighter convolutional neural network with BN layers, a multi-scale feature extraction module and the SE-Inception (Szegedy et al. 2016, Hu et al. 2017) structure. In order to improve operating efficiency, the model needs to reduce its memory requirements while ensuring recognition accuracy.

3.1. Multi-scale feature extraction

Due to the different morphology and features of different diseases, a multi-scale feature extraction module is proposed, which uses convolution kernels of different scales to extract features from the input pictures. The multi-scale feature extraction module can extract multiple local features simultaneously. In a convolutional neural network, the low-level convolutions retain the original information of the picture as much as possible, mainly extracting simple features such as the color, texture and edges of the image, while the features extracted by the high-level convolutions are abstract and global (Yu et al. 2017).

The first layer in the GoogLeNet model uses a 7 × 7 large-scale convolution kernel. Generally, a large-scale convolution is used at the
Table 1
Detailed report of the constructed dataset before and after applying the augmentation process (columns: Class Name, Original (AI), Original (Xinyuan), Original, After augmentation).
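The three augmentation operations in Section 2.2 change geometry while leaving color untouched. The following NumPy sketch shows label- and color-preserving variants of them; rotation is restricted to right angles here for simplicity, whereas the paper's Keras pipeline rotates by arbitrary angles, and the wrap-around padding of the shift is an assumption of this sketch.

```python
import numpy as np

def random_flip(img, rng):
    """Flip along the horizontal and/or vertical direction of the image."""
    if rng.random() < 0.5:
        img = np.flip(img, axis=0)   # vertical flip
    if rng.random() < 0.5:
        img = np.flip(img, axis=1)   # horizontal flip
    return img

def random_rotate90(img, rng):
    """Rotate about the image center; right angles only in this sketch."""
    return np.rot90(img, k=int(rng.integers(0, 4)), axes=(0, 1))

def random_shift(img, rng, max_frac=0.1):
    """Shift the whole image horizontally/vertically (wrap-around padding)."""
    h, w = img.shape[:2]
    dy = int(rng.integers(-int(h * max_frac), int(h * max_frac) + 1))
    dx = int(rng.integers(-int(w * max_frac), int(w * max_frac) + 1))
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

rng = np.random.default_rng(42)
img = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
aug = random_shift(random_rotate90(random_flip(img, rng), rng), rng)
print(aug.shape)   # (224, 224, 3)
print(aug.dtype)   # uint8 -- pixel (color) values are untouched
```

All three operations only rearrange pixels, so the set of color values in the augmented image is identical to that of the original, as required by the color-preservation constraint above.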
whitefly are relatively small and scattered, while cotton blight is more obvious. (2) The characterization information of different diseases is similar. The powdery mildew and whitefly disease spots in Fig. 1 are relatively small and scattered. The small color and texture differences (fine-grained characteristics) are the key to distinguishing these diseases.

In summary, the identification of different solanaceous diseases needs to consider both coarse-grained features (the size of the lesion) and fine-grained features (small differences of color and texture). In addition, the comprehensive extraction of multiple features is the key to characterizing a disease. Therefore, convolution kernels of different sizes are set on the first layer of the model to improve the response of the bottom layer to features of different granularity. Four different convolution kernels of 1 × 1, 3 × 3, 5 × 5 and 7 × 7 are used. The numbers of small convolution kernels (1 × 1, 3 × 3) and large convolution kernels (5 × 5, 7 × 7) are 32 and 16, respectively. The feature maps obtained after the convolution operations are merged into a tensor and passed on. The specific structure is shown in Fig. 2.

Fig. 2. Multi-scale feature extraction module.

presents the batch variance. After that, the data is normalized to obtain data \hat{x}_i with mean 0 and variance 1.

\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}    (3)

Finally, the original feature distribution is restored through reconstruction:

y_i = \gamma \hat{x}_i + \beta = \mathrm{BN}_{\gamma,\beta}(x_i)    (4)

\gamma = \sqrt{\mathrm{Var}[x_i]}    (5)

\beta = E[x_i]    (6)

where \gamma and \beta are the parameters to be learned, \mathrm{Var} represents the variance, and E represents the mean. When \gamma and \beta in formula (4) take the values of formulas (5) and (6), respectively, the original characteristic distribution of a given layer can be restored.

3.3. InceptionV2

Fig. 3. Structure of InceptionV1 and InceptionV2.
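As Fig. 3 indicates, InceptionV2 (Szegedy et al. 2016) replaces the 5 × 5 convolution branch of InceptionV1 with two stacked 3 × 3 convolutions, which cover the same 5 × 5 receptive field with fewer parameters. A quick sketch of the saving; the channel count below is illustrative, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    """Weight count of one k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

c = 64  # illustrative channel width

# InceptionV1-style branch: one 5 x 5 convolution
p_5x5 = conv_params(5, c, c)                           # 25 * c * c

# InceptionV2-style branch: two stacked 3 x 3 convolutions,
# which together also have a 5 x 5 receptive field
p_3x3x2 = conv_params(3, c, c) + conv_params(3, c, c)  # 18 * c * c

print(p_5x5, p_3x3x2)        # 102400 73728
print(1 - p_3x3x2 / p_5x5)   # roughly 0.28: a ~28% parameter saving
```

The 18/25 ratio is independent of the channel width, so the factorization saves about 28% of the branch's parameters (and proportionally its multiply-accumulates) at any layer size.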
(Figure: structure of the SE module, showing the squeeze operation Fsq, the excitation operation Fex and the scale operation Fscale, with 1 × 1 × C and 1 × 1 × C/r intermediate feature maps.)
Table 2
Structure of the base model (columns: Type, Patch Size/Stride, Output Size).
Table 3
The influence of different modules on the model (columns: Models, Size (MB), Epoch, Parameters, FLOPs, Accuracy).
Table 4
Classification results of SE-Inception. FLOPs are estimated for an input of 3 × 224 × 224 (columns: Model, FLOPs, Parameters, Accuracy (%), Size (MB)).
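FLOPs figures such as those in Table 4 are commonly estimated by summing, over all convolutional layers, the multiply-accumulate count H_out × W_out × K² × C_in × C_out, doubled if multiplications and additions are counted separately. A sketch of such an estimate for a single hypothetical layer; the layer sizes below are illustrative, not the paper's:

```python
def conv_flops(h_out, w_out, k, c_in, c_out, mac_factor=2):
    """FLOPs of one k x k convolution layer; mac_factor=2 counts each
    multiply-accumulate as two floating-point operations."""
    return mac_factor * h_out * w_out * k * k * c_in * c_out

# Hypothetical first layer on a 3 x 224 x 224 input:
# 7 x 7 convolution, stride 2, 3 -> 64 channels, 112 x 112 output
flops = conv_flops(112, 112, 7, 3, 64)
print(flops)  # 236027904, i.e. ~0.24 GFLOPs for this one layer
```

Summing such terms over every layer of a network yields totals on the scale reported in Table 4; different tools differ mainly in whether `mac_factor` is 1 or 2.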
Fig. 14. Loss curve of the Solanaceae validation set.
Fig. 15. Precision curve of the PlantVillage training set.
Table 5
Classification results on our constructed dataset. FLOPs are estimated for an input of 3 × 224 × 224 (columns: Model, FLOPs, Parameters, Accuracy (%), Size (MB)).
height and width of the original feature map and the number of channels, respectively. Fex stands for the excitation operation, which passes the global features obtained by the Fsq operation through a fully connected layer, a ReLU activation layer, a fully connected layer and a Sigmoid layer in sequence, and learns the weight coefficient of each channel. The formula is shown below.

Fig. 16. Loss curve of the PlantVillage training set.
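The squeeze-and-excitation recalibration described above (Hu et al. 2017) can be sketched in NumPy. The weight matrices here are random stand-ins for the two learned fully connected layers, and the reduction ratio r = 4 and feature-map sizes are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """x: feature map of shape (H, W, C).
    w1: (C, C//r) and w2: (C//r, C) stand in for the two FC layers."""
    # Squeeze (Fsq): global average pooling -> one descriptor per channel
    z = x.mean(axis=(0, 1))                        # shape (C,)
    # Excitation (Fex): FC -> ReLU -> FC -> Sigmoid
    s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)      # shape (C,), values in (0, 1)
    # Scale (Fscale): reweight each channel of the input
    return x * s                                   # broadcasts over H and W

rng = np.random.default_rng(0)
C, r = 16, 4
x = rng.normal(size=(8, 8, C))
w1 = rng.normal(size=(C, C // r))
w2 = rng.normal(size=(C // r, C))
y = se_block(x, w1, w2)
print(y.shape)  # (8, 8, 16)
```

Each output channel is the corresponding input channel multiplied by a single learned coefficient in (0, 1), which is how the block emphasizes informative channels and suppresses the rest.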
Table 6
Detailed results for the specific diseases (columns: Model, Index, Yellow Smut, Cotton Blight, Powdery Mildew, Whitefly, Healthy).
(Flowchart branch labels: less than threshold; greater than or equal to threshold.)
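These branch labels correspond to the decision rule described in Section 5: a prediction is returned only when the largest class probability reaches the 0.8 cut-off, otherwise the user is asked to re-enter an image. A minimal sketch of that rule; the label strings are illustrative, not necessarily those used by the app:

```python
THRESHOLD = 0.8  # cut-off used by the system described in Section 5

# Illustrative labels for the five classes of the constructed dataset
LABELS = ["yellow smut", "cotton blight", "powdery mildew", "whitefly", "healthy"]

def decide(probs):
    """probs: per-class probabilities from the model, summing to 1.
    Returns the predicted label, or None to request a new photo."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] >= THRESHOLD:
        return LABELS[best]   # greater than or equal to threshold
    return None               # less than threshold: ask the user to re-enter

print(decide([0.02, 0.03, 0.90, 0.03, 0.02]))  # powdery mildew
print(decide([0.30, 0.30, 0.20, 0.10, 0.10]))  # None
```

Using `>=` on the maximum probability matches the paper's wording ("greater than or equal to 0.8" returns a result); everything below the cut-off is treated as an unreliable prediction.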
4.4. Evaluation of model structure

In order to explore the influence of the multi-scale feature extraction module, the BN layer and SENet in the constructed model, experiments were carried out on our constructed dataset. Our constructed dataset has a total of 3360 images, which are divided into a training set and a validation set according to a ratio of 8:2. The structure of the base model is shown in Table 2. It contains 2 convolutional layers, 2 maximum pooling layers, 3 InceptionV2 modules, 1 maximum pooling layer and 1 fully connected layer. The base model and the models we compare it to use the same training parameters and training methods, and these models were trained according to the training parameters in Section 4.2. The accuracy and loss curves of the training set are shown in Figs. 7 and 8, respectively. Fig. 9 shows the accuracy curves of the different models on the validation set, and Fig. 10 shows their loss curves on the validation set.

4.4.1. Evaluation of multi-scale feature extraction

In order to enhance the model's extraction of disease features at different scales, the first convolution operation of the base model was replaced with the multi-scale feature extraction module. This module performs feature extraction on the input picture through parallel 1 × 1, 3 × 3, 5 × 5 and 7 × 7 convolutions. In Figs. 7 and 9, Base represents the network without the multi-scale feature extraction module, and Base_Multi represents the model after adding the module. Comparing these curves, it can be concluded that the multi-scale feature extraction module contributes to extracting disease features at different scales, thereby improving the recognition accuracy of the model. We can also see from Figs. 8 and 10 that the loss decreases after adding this module. The model's recognition accuracy rate is 91.03%, which is higher than the benchmark model's 85.65%. Besides, the model size is just 13.5 MB.

4.4.2. Evaluation of BN

We added BN to the base model to explore its effect on the model. The blue curve represents Base_BN and the purple one represents Base in Figs. 7-10. After adding BN, the performance of the model on the training set improved significantly, according to Figs. 7 and 8; besides, the convergence time of the model was reduced. However, we found that its performance on the validation set was not very good. Then, based on the multi-scale feature extraction module, a BN layer was added after each convolutional layer to explore the effect of combining BN with the multi-scale feature extraction module on recognition. Comparing the orange curve with the green one in Figs. 7 and 8, we found that the accuracy of Base_Multi_BN increased and its loss declined on the training set, and the model training time was shortened by adding the BN layer. It can be seen from Fig. 9 that the average recognition accuracy rate is 96.58%, which is 5.7% higher than Base_Multi. After adding the BN layer, the size of the Base_Multi_BN model is 13.7 MB; compared with the previous Base_Multi model, its size hardly increases.

4.4.3. Evaluation of SENet

In order to further improve the recognition performance of the model, we combined the original InceptionV2 structure and SENet into SE-InceptionV2. As shown by Base_Multi_BN_SENet in Figs. 7 and 8, the final model performed slightly worse than Base_Multi_BN and Base_BN on the training set. Base_Multi_BN_SENet in Fig. 9 is the
healthy leaves and the number of categories is 38. According to the training parameters in Section 4.2, we trained the compared models from scratch.

The accuracy and loss curves of the training set are shown in Figs. 15 and 16, respectively, and the accuracy and loss curves of the validation set in Figs. 17 and 18. According to Figs. 15 and 16, our model performs better than GoogLeNet. Our number of iterations is higher than that of the other models, and our model's performance on the training set is slightly inferior to theirs. But, as can be seen from Fig. 17, the recognition accuracy of our model is higher than that of MobileNetV1, MobileNetV2, MobileNetV3, GoogLeNet and ShuffleNetV2, and as can be seen from Fig. 18, the loss of our model is lower than that of the other models. Validation set accuracy and model size on PlantVillage for the different models are shown in Table 7; FLOPs, parameters, top-1 and top-5 accuracy are shown in Table 8.

It can be seen from Table 7 that the accuracy rate of our model on the validation set is 99.27%, which is 1.34% higher than MobileNetV2. The validation-set accuracy of MobileNetV1, MobileNetV2, MobileNetV3, GoogLeNet and ShuffleNetV2 is 97.26%, 97.93%, 97.89%, 96.52% and 95.32%, respectively. The size of our model is 14.8 MB, which is smaller than the others. The results of the models on the test set can be seen from Table 8: our top-1 and top-5 accuracy are the highest among these models, and the parameter count of our model is 1.29 M, less than MobileNetV1, V2, V3 and GoogLeNet. The weakness refers to the high FLOPs. The experimental results prove that the model is robust and has good performance on both our constructed dataset and the public dataset.

We used SE-Inception to develop a Solanaceae disease recognition system based on the Android platform. Our system was deployed on a Huawei Honor 10 mobile phone. The design process of the system is shown in Fig. 19. Our model requires the input to be a color image. After the user uploads a photo of any size, the system unifies the image size to 224 × 224 × 3 through scaling. The system returns the label corresponding to the maximum probability value as the result to the user. We set a threshold value of 0.8 in the system: when the probability value of the largest category label is greater than or equal to 0.8, the recognition result is returned to the user; when the probability value is less than 0.8, the user is asked to re-enter an image. Users can upload images in two ways: shooting and local uploading. In addition to displaying the recognition results directly to the user, the results can also be saved locally in the form of screenshots for users to view. The system operation interface is shown in Fig. 20.

We used 367 images to test the system, and the test results are shown in Table 9. It can be seen from Table 9 that the accuracy and system size of our model are 95.09% and 84.84 MB, respectively. The system contains the TensorFlow framework, so its size is larger than the model size. Although ShuffleNetV2 has the fastest recognition speed, its accuracy is lower. The accuracy of our model is almost the same as that of MobileNetV3, but our system size is 25.16 MB less. The average recognition time of a single picture in our system is 227 ms.

6. Conclusion

This paper proposes a solanaceous disease identification model based on SE-Inception, which well satisfies mobile devices' needs for disease identification models. By using batch normalization layers after each convolutional layer in the model, training time is greatly reduced and training stability is also improved. At the same time, the multi-scale feature extraction module is used to improve the recognition accuracy of the model for different diseases. In addition, the SE module is added to the model so that the channel information can be fully utilized to improve the recognition rate.

In order to verify the effectiveness of the model, we conducted experiments on the PlantVillage dataset and our constructed dataset, and compared it with some common lightweight network models. The experimental results show that our model can better balance recognition accuracy and the memory consumption required for operation. It has high operating efficiency, and its average recognition accuracy on our constructed dataset and the public PlantVillage data set reaches 98.29% and 99.27%, with model sizes of 14.68 MB and 14.8 MB, respectively. The weakness of our model is its 1.16 GFLOPs; although this is smaller than GoogLeNet, it is still large compared with models like MobileNetV1, V2 and V3. We also developed a Solanaceae disease recognition system based on this model, which achieves a recognition speed of 4 frames/s on a common Android platform and 95.09% recognition accuracy on the test set, initially meeting the demand for Solanaceae disease recognition in production on the mobile platform.

In future work, we will further adjust the model structure to reduce its FLOPs. In a word, this model achieves higher accuracy while occupying a smaller space, laying the foundation for deployment on mobile devices and providing methodological guidance for the automatic identification of diseases in the agricultural field.

CRediT authorship contribution statement

Zhenbo Li: Conceptualization, Supervision, Formal analysis. Yongbo Yang: Methodology, Software, Writing - original draft, Writing - review & editing. Ye Li: Validation, Visualization. RuoHao Guo: Investigation. Jinqi Yang: Data curation. Jun Yue: Resources.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

Our deepest gratitude goes to the anonymous reviewers and editors for their careful work that has helped improve this paper substantially. This study is supported by Hebei Province science and technology plan project under grant no. 18047405D, "Integration and demonstration of Internet of Things technology for quality and safety management of facility vegetables".

References

Buda, M., Maki, A., Mazurowski, M.A., 2018. A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks 106, 249-259.
Xiaoqing, G., Taojie, F., Xin, S., 2019. Tomato leaf diseases recognition based on improved Multi-Scale AlexNet. Trans. Chin. Soc. Agric. Eng. 35 (13), 162-169.
Hassanien, A.E., Gaber, T., Mokhtar, U., Hefny, H., 2017. An improved moth flame optimization algorithm based on rough sets for tomato diseases detection. Comput. Electron. Agric. 136, 86-96.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: IEEE Conference on Computer Vision & Pattern Recognition.
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H., 2017. MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861.
Hu, J., Shen, L., Albanie, S., Sun, G., Wu, E., 2017. Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 7132-7144.
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q., 2017. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4700-4708.
Hughes, D.P., Salathe, M., 2015. An open access repository of images on plant health to enable the development of mobile disease diagnostics. arXiv preprint arXiv:1511.08060.
Ioffe, S., Szegedy, C., 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167.
Kamal, M.M., Masazhar, A.N.I., Rahman, F.A., 2018. Classification of leaf disease from image processing technique. Indonesian J. Electr. Eng. Comput. Sci. 10 (1), 191-200.
Krizhevsky, A., Sutskever, I., Hinton, G., 2012. ImageNet classification with deep convolutional neural networks. In: International Conference on Neural Information
Processing Systems, 1097-1105.
Liang, Q., Xiang, S., Hu, Y., Coppola, G., Zhang, D., Sun, W., 2019. PD2SE-Net: computer-assisted plant disease diagnosis and severity estimation network. Comput. Electron. Agric. 157, 518-529.
Liran, W., Jun, Y., Zhenbo, L., Guangjie, K., Haiping, Q., 2017. Multi-classification detection method of plant leaf disease based on kernel function SVM. Trans. Chin. Soc. Agric. Mach. 48 (S1), 166-171.
Ma, N., Zhang, X., Zheng, H.T., Sun, J., 2018. ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), 116-131.
Mohanty, S.P., Hughes, D.P., Salathé, M., 2016. Using deep learning for image-based plant disease detection. Front. Plant Sci. 7 (1419).
Nachtigall, L.G., Araujo, R.M., Nachtigall, G.R., 2016. Classification of apple tree disorders using convolutional neural networks. In: 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), IEEE, 472-476.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C., 2018. MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4510-4520.
Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.A., 2017. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Thirty-First AAAI Conference on Artificial Intelligence.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A., 2015. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1-9.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z., 2016. Rethinking the Inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2818-2826.
Too, E.C., Yujian, L., Njuki, S., Yingchun, L., 2019. A comparative study of fine-tuning deep learning models for plant disease identification. Comput. Electron. Agric. 161, 272-279.
Yang, L., Quan, F., Shuzhi, W., 2019. Plant disease identification method based on lightweight CNN and mobile application. Trans. Chin. Soc. Agric. Eng. 35 (17), 194-204.
Yongquan, X., Bing, W., Jun, Z., Haipeng, H., Jingru, S., 2018. Identification of wheat leaf disease based on random forest method. J. Graph. 39 (01), 57-62.
Yu, W., Yang, K., Yao, H., Sun, X., Xu, P., 2017. Exploiting the complementary strengths of multi-layer CNN features for image retrieval. Neurocomputing 237, 235-241.
Zhang, X., Zhou, X., Lin, M., Sun, J., 2018. ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6848-6856.
Zhong, Y., Zhao, M., 2020. Research on deep learning in apple leaf disease recognition. Comput. Electron. Agric. 168, 105146.