Computers and Electronics in Agriculture 183 (2021) 106064


Detecting soybean leaf disease from synthetic image using multi-feature fusion Faster R-CNN

Keke Zhang (a), Qiufeng Wu (b,*), Yiping Chen (a)

(a) College of Engineering, Northeast Agricultural University, Harbin 150030, China
(b) College of Arts and Sciences, Northeast Agricultural University, Harbin 150030, China

ARTICLE INFO

Keywords: Soybean leaf disease detection; Synthetic image; Complex scene; Multi-feature fusion

ABSTRACT

Accurate detection of soybean leaf disease in the soybean field is essential for soybean quality and the agricultural economy. Although much work has been done on identifying soybean leaf disease, the task of detecting soybean leaf disease in complex scenes has received little attention because of insufficient datasets and technical difficulties. This paper first develops a synthetic soybean leaf disease image dataset to tackle the problem of insufficient data. Further, detecting soybean leaf disease in complex scenes requires the detection model to precisely discriminate various features, such as the features of healthy and diseased leaves or the features of leaves with different diseases. This paper therefore designs a multi-feature fusion Faster R-CNN (MF3 R-CNN) to address this intractable problem. We obtain an optimal mean average precision of 83.34% on a real test dataset. Moreover, the experimental results indicate that the MF3 R-CNN trained only on the synthetic dataset is effective in detecting soybean leaf disease in complex scenes and is superior to the state-of-the-art.

1. Introduction

Soybean is a widely cultivated plant that provides protein and oil for people, and the demand for soybean is increasing with population growth. It is therefore essential to enlarge soybean yield. However, soybean leaf disease causes considerable yield losses and at the same time poses a threat to food security. Thus, accurate and real-time detection of soybean leaf disease is urgently needed.

In the field of soybean leaf disease identification based on computer vision, early traditional work usually consists of two independent processes, i.e., feature extraction and disease classification. For example, researchers first extracted canopy color features (Guan et al., 2016) or diseased-spot features (Qi et al., 2006; Ma et al., 2017; Li et al., 2019b), then fed the extracted features to a classifier to recognize the soybean leaf disease. However, the appearance of one disease differs markedly across its stages, while the symptoms of some different diseases look similar. These factors give such traditional recognition methods poor generality on the soybean leaf disease identification problem. With the development of computer vision, identifying soybean leaf disease via deep learning has become a research hotspot. Since the public dataset Plant Village (Hughes et al., 2015) does not include soybean leaf images, most researchers collect soybean leaf disease images themselves. Moreover, studies on soybean leaf disease identification are relatively few.

Deep learning studies on soybean leaf disease identification can be divided into two types according to the soybean leaf images used. In the first, the collected images contain only one diseased soybean leaf (Wu et al., 2019; Gui et al., 2019), and the authors identified the disease directly with an improved Convolutional Neural Network (CNN). In the second, the collected images include multiple soybean leaves (Tetila et al., 2019; Jiang et al., 2019; Xiong et al., 2020); here the authors first used segmentation algorithms to split the original image into images containing only a single soybean leaf (Tetila et al., 2019; Xiong et al., 2020) or images containing the diseased leaf region (Jiang et al., 2019), and then applied a specially designed CNN to recognize the disease category. In addition, some researchers have worked on identifying soybean pest images via CNN (Sun et al., 2020) or estimating soybean leaf defoliation with a CNN (Silval et al., 2019).

To our knowledge, there is no prior work on detecting soybean leaf disease in complex scenes. A "complex scene" in our work refers to a soybean leaf disease image that contains multiple diseased leaves, multiple soybean plants, weeds, soil, massive healthy leaves and so on. Furthermore, the target diseased leaves occupy only relatively small regions of the whole image, and it is common for leaves to cover each other.

* Corresponding author.
E-mail address: qfwu@neau.edu.cn (Q. Wu).

https://doi.org/10.1016/j.compag.2021.106064
Received 21 June 2020; Received in revised form 21 September 2020; Accepted 23 February 2021
Available online 5 March 2021
0168-1699/© 2021 Elsevier B.V. All rights reserved.

Fig. 1. Overview of detecting soybean leaf disease in complex scenes via MF3 R-CNN and synthetic images.

In the real soybean field, soybean disease is usually monitored with a fixed camera or a smart phone, so the collected images mostly show complex scenes. Thus, detecting soybean leaf disease in complex scenes is essential, and a detection model suited to such images is urgently needed. There are two reasons for the lack of work on soybean leaf disease detection in complex scenes. The first is that detecting soybean leaf disease is harder than merely recognizing it, since the detection task must not only recognize the disease category of every diseased soybean leaf but also locate it. The second is that soybean leaf disease images in complex scenes are difficult to collect. On the one hand, image collection consumes massive manpower, material resources and time. On the other hand, such images must be collected carefully: if the occlusion between diseased soybean leaves in the collected images is severe, the detection model will struggle to learn comprehensive features of each disease.

To tackle the above problems, first, synthetic soybean leaf disease images were generated to address the problem of insufficient data. Second, the multi-feature fusion Faster R-CNN (MF3 R-CNN) was proposed to detect soybean leaf disease in complex scenes, and the MF3 R-CNN model was trained only on the synthetic images generated in the first step. Third, the trained detection model was evaluated on real soybean leaf disease images collected from the soybean field and the internet. An overview of detecting soybean leaf disease in complex scenes via MF3 R-CNN and synthetic images is illustrated in Fig. 1.

As is well known, deep learning based methods require large datasets, yet collecting a sufficient image dataset is expensive and time-consuming. To tackle the problem of insufficient image datasets, researchers tend to use synthetic images to train deep models. Image synthesis methods can be divided into two types according to whether the synthesis technology utilizes real images. In the first type, the synthesis process involves real images. For example, Silval et al. (2019) used synthetic images to estimate soybean leaf defoliation; a synthetic defoliation image is generated by removing leaf-belonging pixels from the real image in three different manners. Galbally et al. (2015) applied synthetic images to recognize on-line signatures, generating synthetic signatures by combining real signature images and pen-up trajectories. Gupta et al. (2016) utilized synthetic images for locating text in natural images, designing the synthetic images by automatically fitting text to natural images.

In the other type, real images are not involved in the synthesis process. In more detail, synthesis without real images follows two approaches: relying on modern graphics engines and designing special procedures. For instance, Hattori et al. (2015) established a scene-specific pedestrian detector with synthetic images, creating a simulated scene by combining pedestrian renderings and scene geometry. Mancini et al. (2016) trained an obstacle detector on synthetic images, using Unreal Engine 4 with the Urban City pack to create synthetic scenarios. Richter et al. (2016) used synthetic images to train a semantic segmentation system, exploring the use of modern games such as Grand Theft Auto to generate large pixel-wise ground truth images. Barth et al. (2018) applied synthetic images to segmenting sweet pepper images, using PlantFactory 2015 Studio to create virtual sweet pepper plants. Wang et al. (2019) estimated head pose from synthetic images generated with the commercial software FaceGen. Even though synthesizing images with graphics software is convenient, some researchers prefer to synthesize images by designing specific procedures. For example, Cicco et al. (2017) utilized synthetic images to detect crops and weeds, procedurally generating a large synthetic dataset by randomizing crop, weed species, soil and lighting conditions. Rahnemoonfar and Sheppard (2017) adopted synthetic images to build a fruit counting model, designing a procedure to create variable-sized tomatoes in synthetic images. Režnáková et al. (2017) utilized synthetic images in recognizing on-line handwritten gestures, designing a synthetic image generation system to create virtual handwritten gestures. The key idea of these image synthesis methods is consistent, namely to create synthetic images for a certain task, but the implementations differ. The aforementioned works train their models on synthetic images and achieve acceptable results, which supports the rationality and effectiveness of using synthetic images in deep learning based tasks. Thus, in this work, real soybean-field complex background images and soybean leaf disease images in simple scenes were used to generate the synthetic soybean leaf disease images.

In this study, our main contributions can be summarized as follows: (1) The MF3 R-CNN model is designed to identify soybean leaf disease in complex scenes; it is trained only on synthetic soybean leaf disease images and evaluated on real images, and our experimental results show that its performance is superior to the state-of-the-art. (2) The synthetic soybean leaf disease image dataset was generated, and it has been experimentally shown that the synthesis method proposed in this work is effective. (3) Five novel feature fusion methods in MF3 R-CNN are presented.

The remainder of this paper is organized as follows. Section 2 introduces the details of the synthetic soybean leaf disease images and the construction of the MF3 R-CNN model. Section 3 presents the experiments and results. Section 4 concludes the paper.


Fig. 2. The real image dataset.

2. Materials and methods

2.1. Synthetic image generation

In this section, we present the proposed method of synthesizing soybean leaf disease images.

2.1.1. Description of the real image dataset
By consulting plant protection experts and conducting field research in soybean fields, three common soybean leaf diseases of the season, namely virus disease, frogeye leaf spot and bacterial spot, were selected for study. From July to September 2019, the soybean leaf disease image dataset in simple scenes and the soybean-field complex background image dataset were collected from the soybean experimental base of Northeast Agricultural University and Xiangyang farm, Harbin, China. The real image dataset is shown in Fig. 2.

The first column of Fig. 2 displays the soybean bacterial spot images. The symptoms of soybean bacterial spot on the leaves are as follows: the spot begins as a small, chlorotic, irregular, water-soaked lesion; in the later stage it expands, a yellow halo appears at its edge, and its middle turns dark brown. The disease spots are mostly irregular, and spots at the leaf edge often merge into patches, which can cause local leaf death. The second column of Fig. 2 shows the soybean frogeye leaf spot images.

Fig. 3. Diagram of the generation of the synthetic soybean disease image.


Frogeye leaf spot appears as mostly round, oval or irregular spots on the leaves. The center of a spot is grayish white and its edge reddish brown; in the late stage the center turns gray and the edge dark brown. The third column of Fig. 2 shows the soybean virus disease images. The symptom of soybean virus disease is systemic shrinkage of the leaves: the leaves are distorted and deformed, the mesophyll bears dark green sore-like protrusions, and the leaf margins curl downward. The last two columns of Fig. 2 display the soybean-field complex background images. As shown there, the complex background images contain multiple soybean plants, weeds, soil, massive healthy leaves and so on.

2.1.2. Details of the synthesis process
The pipeline for generating a synthetic soybean leaf disease image can be summarized as follows (see also Fig. 3). First, the coordinates of the midpoints of the major veins of leaves with less severe occlusion in the background image were obtained: an interactive program was provided to the agricultural experts, who were asked to determine which leaves should be selected, and the corresponding midpoint coordinates were then returned and saved automatically. Then, one or several coordinates were randomly selected and marked in the background image (see the red circle in the marked image in Fig. 3). Next, the annotated background images were saved in different folders, and soybean leaf disease images in simple scenes were randomly placed into a folder with a marked background image. Finally, the diseased leaf image was merged at the location of the annotated leaf region in the background image. The process framed by the blue box in Fig. 3 is programmatic and implemented in Matlab; the merging of images is manual and performed in Photoshop.
plants, weeds, soil, massive healthy leaves and so on.

2.1.3. Data augmentation
Data augmentation was applied to the synthetic soybean leaf disease image dataset; this not only enlarges the training dataset but also enhances the robustness of the detection model (Krizhevsky et al., 2012; Ma et al., 2018; Cruz et al., 2019; KC et al., 2019; Liang et al., 2019). In this work, augmentation techniques such as reflection, rotation and color adjustment are applied (Krizhevsky et al., 2012; Zhang et al., 2016; Jia et al., 2017; Cruz et al., 2019; Liang et al., 2019; Zhang et al., 2019); a sketch follows the table below. The resulting training dataset is given in Table 1. The first column of Table 1 indicates the category of diseased soybean leaves contained in the image. The total number of target objects in the training dataset is 2200.

Table 1
The details of the training dataset.

Disease in image                       Synthetic dataset   Augmented dataset   Total
Virus disease                          50                  500                 550
Frogeye leaf spot                      50                  500                 550
Bacterial spot                         50                  500                 550
Frogeye leaf spot and bacterial spot   50                  500                 550
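As a rough illustration of this augmentation step, the sketch below applies reflection, rotation and color adjustment with PIL. The transformation ranges are assumptions rather than the paper's exact settings, and in a detection task the bounding-box annotations would have to be transformed together with the image.

import random
from PIL import Image, ImageEnhance, ImageOps

def augment(image):
    if random.random() < 0.5:                 # reflection
        image = ImageOps.mirror(image)
    angle = random.choice([0, 90, 180, 270])  # rotation
    image = image.rotate(angle, expand=True)
    factor = random.uniform(0.8, 1.2)         # color adjustment: brightness
    image = ImageEnhance.Brightness(image).enhance(factor)
    factor = random.uniform(0.8, 1.2)         # color adjustment: saturation
    image = ImageEnhance.Color(image).enhance(factor)
    return image

# Ten augmented variants of one synthetic image (consistent with Table 1,
# where each group of 50 synthetic images yields 500 augmented images).
variants = [augment(Image.open("synthetic_001.jpg")) for _ in range(10)]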
In this work, the soybean leaf disease detection model is trained on synthetic images, while testing is conducted on real images. The test dataset consists of real soybean leaf disease images collected from the soybean field (25 images) and the internet (5 images); some test images are displayed in Fig. 4.

2.2. Multi-feature fusion Faster R-CNN (MF3 R-CNN)

2.2.1. Motivations
In the real soybean field, soybean disease is usually monitored with a fixed camera or a smart phone, and the collected images mostly show complex scenes, that is, images containing multiple diseased leaves, multiple soybean plants, weeds, soil, massive healthy leaves and so on. Thus, detecting soybean leaf disease in complex scenes is critical, and a detection model suited to such images is urgently needed.

Detecting soybean leaf disease in complex scenes requires the detection model to learn comprehensive features and precisely discriminate among various features, for example, the features of healthy and diseased leaves, the features of leaves with different diseases, and the features of background elements such as soybean plants and weeds. In addition, in the field of object detection, Faster R-CNN (Ren et al., 2015) is a strong detector and has been successfully applied in various tasks (Xue et al., 2018; Yang et al., 2018; Wang et al., 2018; Zheng et al., 2018; Li et al., 2019a). Faster R-CNN is a two-stage network, which identifies region proposals in the first stage and classifies the objects within those proposals in the second stage. One-stage detectors, represented by YOLO (Redmon et al., 2016; Redmon and Farhadi, 2017) and SSD (Liu et al., 2016), skip the second stage of Faster R-CNN and directly regress the default anchors into detection boxes (Liu et al., 2018).

Fig. 4. The real soybean leaf disease test image dataset.


Fig. 5. Operating mechanism of the MF3 R-CNN.

Fig. 6. All detectors in this work: (a) baseline, (b) early skip, (c) halfway skip 1, (d) halfway skip 2, (e) late skip, (f) fusion skip.


Fig. 7. The process of determining anchor boxes.

One-stage detectors usually run faster than two-stage detectors, but they may not reach the same level of accuracy, especially when the differences between categories are small. Further, multi-scale feature representations and context are essential for accurate visual recognition (Bell et al., 2016). Features at different layers carry different information: lower layers contain finer visual details, such as location and contour, while higher layers encode stronger semantic information (Liu et al., 2018). Thus, this paper designs a multi-feature fusion Faster R-CNN (MF3 R-CNN) to detect soybean leaf disease in complex scenes.

2.2.2. Implementation details and architecture visualization
In this work, we improved the Faster R-CNN model by improving its feature extraction network. The original Faster R-CNN uses VGG-16 (Simonyan and Zisserman, 2015) to extract features from input images. Here, we first selected ResNet-50 (He et al., 2016) as the feature extraction network and then optimized its structure. ResNet-50 was chosen because, on the one hand, it is an excellent backbone that performs strongly in feature representation and is widely used in object detection tasks (Lin et al., 2017; Liu et al., 2018; Lin et al., 2018; Li et al., 2018); on the other hand, it offers a better trade-off between detection performance and computational complexity than ResNet-101 or ResNet-152. In the following, ResNet-50 is denoted as ResNet.

To implement the "multi-feature fusion" in MF3 R-CNN, skip connections were first introduced into the structure of ResNet; the generated fusion feature maps were then fed to the region proposal network (RPN) and the RoI pooling layer. The operating mechanism of MF3 R-CNN is given in Fig. 5.

Five skip connection methods are explored in this work, named early skip (Fig. 6(b)), halfway skip 1 (Fig. 6(c)), halfway skip 2 (Fig. 6(d)), late skip (Fig. 6(e)) and fusion skip (Fig. 6(f)). Note that, for clarity of the figure, the parts of the network shared by all detectors are omitted and only the differing connections are drawn. The difference among early skip, halfway skip 1, halfway skip 2 and late skip is the location of the layer connected to the feature extraction layer defined in the baseline detector; the difference between fusion skip and the other skip methods is the complexity of the skip connection.

The detector using the original ResNet as the feature extraction network is denoted as baseline, and its architecture is shown in Fig. 6(a). When using ResNet as the feature extraction network, the feature extraction layer is the final convolutional layer of the 4th stage of ResNet (He et al., 2017). Since ResNet has too many layers to draw individually, each building block is simplified to a rose-red rectangle with a label such as "res3b" or "res3c". The implementation details of the five skip connection methods are as follows (a code sketch of one variant follows the list):

(1) Early skip. First, "res3b" and "res3d" are concatenated in depth, and an average pooling layer is added to reduce the dimension. Next, the "res4f" layer is concatenated with the average pooling layer in depth. The fused feature maps are then fed into the RPN and the RoI pooling layer. In addition, the "res5a", "res5b" and "res5c" building blocks are deleted to improve computational efficiency.
(2) Halfway skip 1. The "res4a" layer is concatenated with the "res4f" layer in depth, and the fused feature maps are fed into the RPN and the RoI pooling layer.
(3) Halfway skip 2. The "res4c" layer is concatenated with the "res4f" layer in depth, and the fused feature maps are shared by the RPN and the RoI pooling layer.
(4) Late skip. The "res4e" layer is concatenated with the "res4f" layer in depth, and the fused feature maps are sent to the RPN and the RoI pooling layer.
(5) Fusion skip. The fusion skip mode is a combination of the early skip method and the halfway skip 2 method.
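The paper's detectors are implemented in Matlab; purely as an illustration of the fusion idea, the following PyTorch sketch wires up a halfway-skip-2-style backbone in which the "res4c" output is concatenated with the "res4f" output along the channel (depth) axis, producing the fused maps that would be shared by the RPN and the RoI pooling layer. The concatenation is possible because stage 4 of ResNet-50 downsamples only once (at "res4a"), so both feature maps have the same spatial size; the exact wiring is our reading of Fig. 6(d), not the authors' code.

import torch
import torchvision

class HalfwaySkip2Backbone(torch.nn.Module):
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet50(weights=None)
        self.stem = torch.nn.Sequential(
            resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool)
        self.stage2 = resnet.layer1            # res2a-res2c
        self.stage3 = resnet.layer2            # res3a-res3d
        self.res4_a_to_c = resnet.layer3[:3]   # res4a-res4c
        self.res4_d_to_f = resnet.layer3[3:]   # res4d-res4f

    def forward(self, x):
        x = self.stage3(self.stage2(self.stem(x)))
        res4c = self.res4_a_to_c(x)
        res4f = self.res4_d_to_f(res4c)
        # Depth concatenation of the two stage-4 feature maps.
        return torch.cat([res4c, res4f], dim=1)

fused = HalfwaySkip2Backbone()(torch.randn(1, 3, 600, 800))
print(fused.shape)   # torch.Size([1, 2048, 38, 50]): 1024 + 1024 channels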
3. Experiments and results

3.1. Implementation details

The proposed MF3 R-CNN is implemented in Matlab with an NVIDIA TITAN Xp GPU. The training parameters are as follows: the optimizer is stochastic gradient descent (SGD) with a momentum of 0.9; the initial learning rate is 0.001 and is divided by 2 every 2 epochs; and the number of epochs is 10.
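Restated as a minimal PyTorch sketch (the actual training ran in Matlab, and the model below is only a placeholder), the schedule is:

import torch

model = torch.nn.Conv2d(3, 8, 3)   # placeholder for the detector
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

for epoch in range(10):
    # ... one pass over the synthetic training set would go here ...
    scheduler.step()   # divides the learning rate by 2 every 2 epochs
    print(epoch, scheduler.get_last_lr())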
In this work, the number and scales of the anchor boxes are determined by a clustering algorithm, as in Redmon and Farhadi (2017); a sketch is given below. The ground-truth box distribution of the training dataset was visualized (see Fig. 7(a)) to better understand the range of object sizes. Fig. 7(a) shows that most groups of objects are similar in size and shape. Fig. 7(b) displays the mean IoU (Intersection-over-Union) versus the number of anchor boxes. Combining Fig. 7(a) and Fig. 7(b), the number of anchor boxes is set to 4, and the sizes of the 4 anchor boxes are determined by the clustering results.
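A minimal sketch of that dimension-clustering step, in the spirit of Redmon and Farhadi (2017): k-means on the (width, height) pairs of the ground-truth boxes with distance d = 1 - IoU, where IoU is computed between co-centered boxes. The random data at the bottom merely demonstrates the call; in the paper the inputs would be the 2200 training targets, with k = 4.

import numpy as np

def iou_wh(boxes, centroids):
    # IoU between co-centered rectangles; boxes (N, 2), centroids (k, 2).
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def cluster_anchors(boxes, k=4, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)  # max IoU = min d
        for i in range(k):             # keep the old centroid if a cluster empties
            members = boxes[assign == i]
            if len(members):
                centroids[i] = members.mean(axis=0)
    mean_iou = iou_wh(boxes, centroids)[np.arange(len(boxes)), assign].mean()
    return centroids, mean_iou

boxes = np.abs(np.random.default_rng(1).normal(120, 30, size=(2200, 2)))
anchors, mean_iou = cluster_anchors(boxes, k=4)
print(np.round(anchors), round(float(mean_iou), 3))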


Table 2
Detection performance comparison between the baseline and MF3 R-CNN.

Detection model   AP (%)                                               mAP (%)
                  Virus disease   Frogeye leaf spot   Bacterial spot
Baseline          75.34           66.94               53.22            65.17
Early skip        84.32           67.48               57.26            69.69
Halfway skip 1    89.66           71.80               61.05            74.17
Halfway skip 2    91.80           72.27               85.96            83.34
Late skip         91.72           72.11               67.90            77.24
Fusion skip       88.44           70.87               51.60            70.30

3.2. Experiments between baseline and MF3 R-CNN

In this work, 30 test images collected from the soybean field (25 images) and the internet (5 images) were used to evaluate the baseline model and the MF3 R-CNN variants introduced in Section 2.2.2. Mean Average Precision (mAP) is selected as the primary evaluation metric in the comparison experiments; as argued in Ren et al. (2015), mAP is the actual metric for object detection, rather than an object-proposal proxy metric.

The detection performance comparison between the baseline model and MF3 R-CNN is presented in Table 2, and a visualization of their detection results is displayed in Fig. 8. In Table 2, the first column lists all detection models in this work, and the remaining columns present the average precision (AP) for each of the three diseases and the mAP of each model. As can be seen from Table 2, the halfway skip 2 model obtains the highest mAP, 83.34%, which exceeds the baseline model by 18.17 percentage points. The mAP of the early skip model, the weakest MF3 R-CNN variant, is 69.69%, which still exceeds the baseline by 4.52 percentage points. Fig. 8 shows comparative detection results on a real soybean leaf disease image containing five diseased leaves and two kinds of diseases: frogeye leaf spot (two objects) and virus disease (three objects). The ground-truth image with annotations and the detection results of the baseline model and MF3 R-CNN are displayed. Note that, in the annotations of each image, the class label of each disease is shortened to its first word for convenience; for example, "virus disease" becomes "virus" and "frogeye leaf spot" becomes "frogeye". It can be seen that the baseline model fails to detect the biggest leaf infected by virus disease, whereas every MF3 R-CNN model detects all diseased soybean leaves.

It has thus been experimentally shown that the proposed MF3 R-CNN trained only on the synthetic image dataset can effectively detect soybean leaf disease in complex scenes. In addition, the designed halfway skip 2 model achieves the highest mAP, 83.34%, among all models, exceeding the baseline by 18.17 percentage points. The following sections therefore build on the halfway skip 2 model.

Table 3
Detection performance comparison between the MF3 R-CNN and the state-of-the-art.

Model          AP (%)                                               mAP (%)
               Virus disease   Frogeye leaf spot   Bacterial spot
Faster R-CNN   75.04           52.55               30.91            52.83
YOLO v2        44.58           13.71               19.05            25.78
MF3 R-CNN      91.80           72.27               85.96            83.34

Fig. 8. Visualization of the detection results between the baseline and MF3 R-CNN.


Fig. 9. Visualization of the comparison results with other approaches.

Fig. 10. Visualization of test results.

3.3. Comparison with state-of-the-art

Comparison experiments with two state-of-the-art detection models, Faster R-CNN and YOLO v2, were conducted. We train all models on the synthetic soybean leaf disease images and test them on the real images. The detection results are presented in Table 3, where "MF3 R-CNN" refers to the halfway skip 2 model. In Table 3, the mAPs of MF3 R-CNN, Faster R-CNN and YOLO v2 are 83.34%, 52.83% and 25.78%, respectively. Meanwhile, a visualization of the comparison results is given in Fig. 9; "MF3 R-CNN" in Fig. 9 likewise refers to the halfway skip 2 model. The ground-truth image in the first row is the same as the one in Fig. 8 and was collected from the soybean field; it contains five diseased leaves of two kinds of diseases, frogeye leaf spot (two objects) and virus disease (three objects). The sketch below illustrates how the reported AP and mAP values are computed.

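For reference, a per-class average precision can be computed from score-ranked detections as the area under the precision-recall curve; mAP is then the mean of the per-class APs over the three diseases. In the sketch below, matching detections to ground truth by IoU is assumed to have produced the true-positive flags, and the numbers are invented for the demonstration.

import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / (np.arange(len(tp)) + 1)
    # Step-wise area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

ap_virus = average_precision([0.9, 0.8, 0.6, 0.5], [1, 1, 0, 1],
                             num_ground_truth=3)
print(ap_virus)   # 0.9167; mAP = mean of the per-class APs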

The MF3 R-CNN model precisely detected all diseased soybean leaves; however, the Faster R-CNN model mistakenly detected a healthy leaf as a leaf with frogeye leaf spot, and the YOLO v2 model detected only two leaves with virus disease. The ground-truth image in the second row was collected from the internet and includes only one leaf, with bacterial spot. The MF3 R-CNN precisely detected the diseased soybean leaf, whereas the detection results of the Faster R-CNN and YOLO v2 models are both wrong. The experimental results show that the performance of the proposed MF3 R-CNN in detecting soybean leaf disease in complex scenes is superior to the state-of-the-art.

3.4. Visualization of the detection results

The detection results of the optimal MF3 R-CNN are visualized in Fig. 10. As can be seen, the MF3 R-CNN can effectively identify and locate the virus disease, frogeye leaf spot and bacterial spot in the natural soybean field. Most images in Fig. 10 contain multiple soybean plants and multiple diseased leaves, and the MF3 R-CNN recognizes and locates the soybean leaf diseases accurately. In addition, for distant objects and some partially occluded objects, the MF3 R-CNN also achieves satisfactory performance.

3.5. Evaluation of the multi-classification performance of MF3 R-CNN

To quantitatively analyze the multi-classification performance of the MF3 R-CNN, the confusion matrix of the test results is presented in Table 4, together with the precision rate and recall rate of each disease category. The "Null" column gives the number of missed targets, and the "Null" row gives the number of detected objects that are not included in the ground-truth images.

Table 4
Classification confusion matrix for soybean leaf disease detection in complex scenes.

Disease category     Virus disease   Frogeye leaf spot   Bacterial spot   Null   Recall rate (%)
Virus disease        27              0                   1                1      93.10
Frogeye leaf spot    0               43                  5                4      87.76
Bacterial spot       0               0                   21               1      95.45
Null                 1               6                   5                –      –
Precision rate (%)   96.43           87.76               65.63            –      –

In Table 4, the recall rate and precision rate for virus disease are 93.10% and 96.43%, respectively, so the classification performance of MF3 R-CNN in identifying virus disease is excellent (a worked sketch follows below). However, the classification performance for frogeye leaf spot and bacterial spot still needs to be improved; the reasons for the failure cases are discussed alongside the result visualizations.

For the frogeye leaf spot and the bacterial spot, the precision rate in particular needs to be improved. The main reason for confusing frogeye leaf spot with bacterial spot is that it is hard to decide which disease dominates when the two diseases appear on one leaf simultaneously. In the example in Fig. 11, the disease category corresponding to the enlarged partial image is frogeye leaf spot, but the detector classified it as bacterial spot with a confidence probability of 0.5925. In the enlarged image there are many tiny gray-white spots (belonging to frogeye leaf spot) and two obvious irregular brown lesions (belonging to bacterial spot). However, the frogeye leaf spot symptom occupies the larger region of the leaf and, according to the identification of the plant protection experts, the disease of the enlarged leaf should be frogeye leaf spot.

Fig. 11. Sample image of a wrong classification situation.

Fig. 12. Sample image of a missing detection situation.
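The recall and precision rates in Table 4 follow directly from the matrix. The short sketch below reproduces the virus disease figures as a worked example; the other classes are computed the same way from their rows and columns.

# Virus disease row of Table 4: 27 correct, 0 confused with frogeye leaf
# spot, 1 confused with bacterial spot, 1 missed ("Null" column); the "Null"
# row contributes 1 spurious virus detection.
tp, confused, missed, false_alarms = 27, 1, 1, 1

recall = tp / (tp + confused + missed)   # 27 / 29 = 93.10%
precision = tp / (tp + false_alarms)     # 27 / 28 = 96.43%
print(f"recall {recall:.2%}, precision {precision:.2%}")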


A missing detection situation is shown in Fig. 12: the image on the upper right is a leaf with frogeye leaf spot, which is partially occluded and whose spot area is very small; the image on the lower right is a leaf with bacterial spot, which is partially occluded and appears in a side posture. For these reasons, the detection model did not detect the two diseased leaves.

3.6. Discussion

Because of insufficient datasets and technical difficulties, research on soybean leaf disease detection in complex scenes is scarce. This work therefore proposes to use a synthetic image dataset to train the MF3 R-CNN model. The comparison experiment between the baseline model and MF3 R-CNN indicates that MF3 R-CNN beats the baseline by a large margin (see Table 2). In addition, the comparison between MF3 R-CNN and the state-of-the-art shows that MF3 R-CNN is superior (see Table 3 and Fig. 9). The MF3 R-CNN trained only on the synthetic image dataset performs well on the real test dataset, which not only proves the effectiveness of the designed model and the synthetic images, but also shows that the model generalizes. However, the classification performance of the MF3 R-CNN needs further improvement, especially for identifying bacterial spot. When multiple diseases appear on one leaf or the diseased leaves are seriously occluded, the MF3 R-CNN may run into trouble.

4. Conclusion

In this work, we target the problem of detecting soybean leaf disease in complex scenes and make improvements in both dataset and model. First, we generate synthetic soybean leaf disease images to tackle the insufficient-dataset problem. Second, the novel MF3 R-CNN model is designed, which mixes multiple features by connecting different layers of the feature extraction network in a skipping manner. Third, the MF3 R-CNN model is trained on the synthetic image dataset and tested on the real image dataset. The optimal mAP obtained by MF3 R-CNN is 83.34%, and the experimental results indicate that the proposed MF3 R-CNN trained only on synthetic data can effectively detect soybean leaf disease in complex scenes and is superior to the state-of-the-art. We hope the key idea of this work can inspire other computer vision tasks. In future work, other diseases in the soybean field will be added, and the designed model will be packaged and embedded in a smart phone application.

CRediT authorship contribution statement

Keke Zhang: Conceptualization, Methodology, Software, Writing - original draft. Qiufeng Wu: Supervision, Writing - review & editing, Project administration, Funding acquisition. Yiping Chen: Data curation, Validation, Visualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the National Key Application Research and Development Program in China under Grant 2018YFD0300105-2 and the Harbin Applied Technology Research and Development Program under Grant 2017RAQXJ096.

References

Barth, R., Ijsselmuiden, J., Hemming, J., et al., 2018. Data synthesis methods for semantic segmentation in agriculture: A Capsicum annuum dataset. Comput. Electron. Agric. 44, 284–296.
Bell, S., Zitnick, C.L., Bala, K., et al., 2016. Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2874–2883.
Cicco, M.D., Potena, C., Grisetti, G., et al., 2017. Automatic model based dataset generation for fast and accurate crop and weeds detection. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5188–5195.
Cruz, A., Ampatzidis, Y., Pierro, R., et al., 2019. Detection of grapevine yellows symptoms in Vitis vinifera L. with artificial intelligence. Comput. Electron. Agric. 157, 63–76.
Galbally, J., Diaz-Cabrera, M., Ferrer, M.A., et al., 2015. On-line signature recognition through the combination of real dynamic data and synthetically generated static data. Pattern Recogn. 48, 2921–2934.
Guan, H., Li, J., Ma, X., et al., 2016. Recognition of soybean nutrient deficiency based on color characteristics of canopy. J. Northeast A&F Uni. (Natural Science Edition) 44 (12), 136–142.
Gui, J., Wu, Z., Li, K., 2019. Hyperspectral imaging for early detection of soybean mosaic disease based on convolutional neural network model. J. Zhejiang Univ. (Agric. & Life Sci.) 45 (2), 256–262.
Gupta, A., Vedaldi, A., Zisserman, A., 2016. Synthetic data for text localisation in natural images. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2315–2324.
Hattori, H., Boddeti, V.N., Kitani, K., et al., 2015. Learning scene-specific pedestrian detectors without real data. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3819–3827.
He, K., Gkioxari, G., Dollar, P., et al., 2017. Mask R-CNN. In: IEEE International Conference on Computer Vision, pp. 2980–2988.
He, K., Zhang, X., Ren, S., et al., 2016. Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
Hughes, D., Salathé, M., et al., 2015. An open access repository of images on plant health to enable the development of mobile disease diagnostics. arXiv preprint arXiv:1511.08060.
Jia, S., Wang, P., Jia, P., et al., 2017. Research on data augmentation for image classification based on convolution neural network. In: Chinese Automation Congress, pp. 4165–4170.
Jiang, F., Li, Y., Yu, D., et al., 2019. Soybean disease detection system based on convolutional neural network under Caffe framework. Acta Agriculturae Zhejiangensis 31 (7), 1177–1183.
KC, K., Yin, Z., Wu, M., et al., 2019. Depthwise separable convolution architectures for plant disease classification. Comput. Electron. Agric. 165, 104948.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. ImageNet classification with deep convolutional neural networks. In: International Conference on Neural Information Processing Systems, pp. 1097–1105.
Li, C., Song, D., Tong, R., et al., 2019a. Illumination-aware Faster R-CNN for robust multispectral pedestrian detection. Pattern Recogn. 85, 161–171.
Li, J., Shi, C., Shan, Q., et al., 2019b. Visual identification system of soybean frogeye leaf spot based on SURF feature extraction. Soybean Science 38 (1), 90–96.
Li, Z., Pao, C., Yu, G., et al., 2018. DetNet: A backbone network for object detection. arXiv:1804.06215v2.
Liang, Q., Xiang, S., Hu, Y., et al., 2019. PD2SE-Net: Computer-assisted plant disease diagnosis and severity estimation network. Comput. Electron. Agric. 157, 518–529.
Lin, T.Y., Dollár, P., Girshick, R., et al., 2017. Feature pyramid networks for object detection. arXiv:1612.03144v2.
Lin, T.Y., Goyal, P., Girshick, R., et al., 2018. Focal loss for dense object detection. arXiv:1708.02002v2.
Liu, W., Anguelov, D., Erhan, D., et al., 2016. SSD: Single shot multibox detector. In: European Conference on Computer Vision, pp. 21–37.
Liu, W., Liao, S., Hu, W., et al., 2018. Learning efficient single-stage pedestrian detectors by asymptotic localization fitting. In: European Conference on Computer Vision.
Ma, J., Du, K., Zheng, F., et al., 2018. A recognition method for cucumber diseases using leaf symptom images based on deep convolutional neural network. Comput. Electron. Agric. 154, 18–24.
Ma, X., Guan, H., Qi, G., et al., 2017. Diagnosis model of soybean leaf diseases based on improved cascade neural network. Trans. Chinese Soc. Agric. Mach. 48 (1), 163–168.
Mancini, M., Costante, G., Valigi, P., et al., 2016. Fast robust monocular depth estimation for obstacle detection with fully convolutional networks. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 4296–4303.
Qi, G., Ma, X., Guan, H., 2006. Extraction of the image of soybean target leaf spot based on improved genetic algorithm. Trans. Chinese Soc. Agric. Eng. 25 (05), 142–145.
Rahnemoonfar, M., Sheppard, C., 2017. Deep count: Fruit counting based on deep simulated learning. Sensors 17, 905.
Redmon, J., Divvala, S., Girshick, R., et al., 2016. You only look once: Unified, real-time object detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788.
Redmon, J., Farhadi, A., 2017. YOLO9000: Better, faster, stronger. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 6517–6525.
Ren, S., He, K., Girshick, R., et al., 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99.
Režnáková, M., Tencer, L., Plamondon, R., et al., 2017. Forgetting of unused classes in missing data environment using automatically generated data: Application to on-line handwritten gesture command recognition. Pattern Recogn. 72, 355–367.


Richter, S.R., Vineet, V., Roth, S., et al., 2016. Playing for data: ground truth from computer games. In: European Conference on Computer Vision, pp. 102–118.
Silval, A., Bressan, P.O., Goncalves, D.N., et al., 2019. Estimating soybean leaf defoliation using convolutional neural networks and synthetic images. Comput. Electron. Agric. 156, 360–368.
Simonyan, K., Zisserman, A., 2015. Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations.
Sun, P., Chen, G., Cao, L., 2020. Image recognition of soybean pests based on attention convolutional neural network. J. Chinese Agric. Mechanization 41 (2), 171–176.
Tetila, E.C., Machado, B.B., Menezes, G.K., et al., 2019. Automatic recognition of soybean leaf diseases using UAV images and deep convolutional neural networks. IEEE Geosci. Remote Sens. Lett. 99, 1–5.
Wang, D., Tang, J., Zhu, W., et al., 2018. Dairy goat detection based on Faster R-CNN from surveillance video. Comput. Electron. Agric. 154, 443–449.
Wang, Y., Liang, W., Shen, J., et al., 2019. A deep coarse-to-fine network for head pose estimation from synthetic data. Pattern Recogn. 94, 196–206.
Wu, Q., Zhang, K., Meng, J., 2019. Identification of soybean leaf diseases via deep learning. J. Inst. Eng. (India): Series A 100 (4), 659–666.
Xiong, J., Dai, S., Ou, J., et al., 2020. Leaf deficiency symptoms detection method of soybean based on deep learning. Trans. Chinese Soc. Agric. Mach. 51 (1), 195–202.
Xue, Y., Zhu, X., Zheng, C., et al., 2018. Lactating sow postures recognition from depth image of videos based on improved Faster R-CNN. Trans. Chinese Soc. Agric. Eng. 9 (34), 189–196.
Yang, Q., Xiao, D., Lin, S., 2018. Feeding behavior recognition for group-housed pigs with the Faster R-CNN. Comput. Electron. Agric. 155, 453–460.
Zhang, L., Yang, F., Zhang, Y.D., et al., 2016. Road crack detection using deep convolutional neural network. In: International Conference on Image Processing, pp. 3708–3712.
Zhang, L., Mohamed, A.A., Chai, R., et al., 2019. Automated deep learning method for whole-breast segmentation in diffusion-weighted breast MRI. J. Magn. Reson. Imaging 51 (2).
Zheng, C., Zhu, X., Yang, X., et al., 2018. Automatic recognition of lactating sow postures from depth images by deep learning detector. Comput. Electron. Agric. 147, 51–63.
