
Smart Monitoring of Crops using Generative
Adversarial Networks

H. Kerdegari, M. Razaak, V. Argyriou, and P. Remagnino

The Robot Vision Team (RoViT), Kingston University, UK


{h.kerdegari, manzoor.razaak}@kingston.ac.uk

Abstract. Unmanned aerial vehicles (UAV) are used in precision agriculture (PA) to enable aerial monitoring of farmlands. Intelligent methods are required to pinpoint weed infestations and make an optimal choice of pesticide. A UAV can fly a multispectral camera and collect data. However, the classification of multispectral images using supervised machine learning algorithms such as convolutional neural networks (CNN) requires a large amount of training data. This is a common drawback in deep learning. Our method makes use of a semi-supervised generative adversarial network (GAN), providing a pixel-wise classification for all the acquired multispectral images. It consists of a generator network that provides photo-realistic images as extra training data to a multi-class classifier acting as a discriminator and trained on small amounts of labeled data. The performance of the proposed semi-supervised GAN is evaluated on the weedNet dataset, consisting of multispectral crop and weed images collected by a micro aerial vehicle (MAV). Results indicate that high classification accuracy can be achieved and show the potential of GAN-based methods for the challenging task of multispectral image classification.

Keywords: Generative adversarial networks (GAN) · Semi-supervised GAN · Multispectral images · Classification · Unmanned aerial vehicles (UAV).

1 Introduction

Weed infestation is a major challenge for the agriculture sector. Early detection
and removal of weed can greatly improve crop yield. Traditional methods of weed
removal are time consuming: they require farmers to physically survey, identify
and treat infested areas. Farmers use UAVs equipped with cameras (RGB, multispectral or hyperspectral) to obtain a better view of their farms and identify specific weed infestations. Ultimately, the goal is to implement more effective treatment measures and reduce, if not entirely eliminate, weed infestation [1].
RGB cameras and multispectral sensors flown by UAV have proven to be
useful in early weed detection. Figure 1 illustrates an example with spectral
images of a farm. Captured data is then analyzed by computer vision algorithms
to detect the presence of weed.

Fig. 1. A weed detection system: a UAV with a mounted multispectral camera, a 5G transmission system, and a data processing centre hosting the captured image data and weed detection algorithms.

Studies that utilize RGB cameras mainly apply feature extraction techniques for the detection and classification of weed from crop. Hung et al. [2] proposed a feature learning based approach with a bank of
image filters to extract image statistics and feed them to a linear classifier in order to detect the presence of weed in images captured by an RGB camera mounted on a UAV. In [3], the authors used a commercial camera operating in the visible spectrum for ultra-high resolution image acquisition over wheat fields. Their study calculated six different vegetation indices based on the RGB spectrum and achieved a high accuracy, above 85%, in the detection and classification of vegetation. Although RGB image analysis methods can be successfully used for weed identification and classification [4], RGB images captured by UAV have a few limitations in crop-weed disambiguation. Results in [5] showed that better
performance for weed detection via RGB images was obtained for larger weed plants. To achieve high-accuracy feature detection and learning, an
expensive high resolution camera is necessary to capture sufficient details of the
crop and weed. Further, RGB images capture less information at higher altitudes
and it was observed that vegetation indices decrease as altitude increases [6].
Due to the limitations of the visible spectrum (RGB band), the use of additional spectral bands on the infrared side of the spectrum has been shown to provide detailed information that enables accurate calculation of vegetation indices and crop-weed classification [7]. Therefore, multispectral cameras are increasingly
used for crop growth monitoring. Health analysis can be implemented through
extraction of vegetation indices, such as the normalized difference vegetation
index (NDVI), green normalized difference vegetation index (GNDVI) and soil
adjusted vegetation index (SAVI) [8]. These indices are computed from different
spectral bands captured by the multispectral camera and can be further utilized
to analyze vegetation conditions.
Studies employing multispectral cameras apply different approaches for weed
detection. For instance, in [9], an object detection based image analysis method was proposed that identifies objects in crop rows and applies classification tech-
niques to discriminate crop and weed from the spectral images. Spectral index
variation is an approach that considers spectral reflectance variation to discrim-
inate between crop and weed [6]. Various statistical methods for weed detection from multispectral images have been proposed, including Mahalanobis distance computation between vegetation rates [10] and partial least squares discriminant analysis classification models [11], which have shown detection accuracies above 80%. Among machine learning approaches, the support vector machine (SVM) [12] was found to achieve better accuracy than the decision tree (DT) method [13], where multispectral images were classified with the help of NDVI thresholding.
More recently, deep learning methods have been explored for crop-weed dis-
ambiguation. In particular, the basic convolutional neural network (CNN) has
gained ground in the analysis and classification of remote sensing data such as
multispectral images [14]. For instance, Sa et al. [15] applied a cascaded CNN, SegNet, to multispectral image datasets for classifying sugar beet crop from weed. They trained six models on different spectral channels and achieved a classification F1 score of 0.8. The study was further extended with a sliding window approach on orthomosaic maps of the farm, applying a deep neural network and achieving improved accuracy [16]. Similarly, several other studies applied CNN based methods for weed classification with images captured from both UAV and ground based vehicles, achieving high accuracy [17], [18].
However, a large amount of training data is an inherent requirement of deep learning methods. The lack of large corpora of specific labeled data is a challenge in general, and for multispectral data in particular. Furthermore, collecting large corpora of multispectral image data with UAV platforms for a crop-weed classification system is time consuming and expensive. To address this challenge, this paper utilizes a semi-supervised version of generative adversarial networks (GAN) [19]. This method generates photo-realistic crop-weed images that can be employed to augment the training data. In the presented GAN based semi-supervised classification method, a generator creates a large number of realistic images, in turn forcing the discriminator to learn better features for more accurate pixel classification. To the best of our knowledge, the application of GAN methods to multispectral image classification is not well explored, and our work addresses this research gap. The main contributions of this paper are:
– the first application of a semi-supervised GAN to the classification of multispectral images acquired by UAV,
– an investigation of the effect of limited annotated data on the multispectral image classification task.
Section 2 presents the proposed approach, providing a brief background on GAN and semi-supervised GAN, and then describes the design and structure of the proposed model for semi-supervised learning in a system overview. Section 3 deals with the experimental results, where results on the weedNet dataset [15] are presented. Finally, Section 4 concludes the paper.

2 Proposed Approach
This section presents a brief background on GAN and semi-supervised GAN, and then describes the proposed network architecture for semi-supervised pixel-wise classification of multispectral crop/weed imaging data.

2.1 Semi-supervised Generative Adversarial Network


The GAN framework was first introduced by Goodfellow et al. [19] to train deep generative models. A GAN usually contains two networks: a generative (G) network and a discriminative (D) network. Both networks are trained simultaneously in an adversarial manner, where G tries to generate fake samples that are as realistic as possible, and D tries to disambiguate between real and fake data. The following formulation shows the competition between G and D in a two-player minimax game with value function V(D, G):

\[
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_x}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))] \qquad (1)
\]

where E denotes the expected value. G transforms a noise variable z, sampled from the distribution p_z, into G(z); the distribution of the generated samples should converge to the data distribution p_x. D is trained to maximize log(D(x)) while G is trained to minimize log(1 − D(G(z))).
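To make the adversarial objective of Eq. (1) concrete, the following is a minimal training-step sketch in TensorFlow/Keras (the stack described in Section 2.2). It assumes a canonical binary discriminator that returns a scalar real/fake probability per image; the non-saturating generator loss is a standard substitute for directly minimizing log(1 − D(G(z))), and all names are illustrative rather than the authors' actual code.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def gan_train_step(generator, discriminator, g_opt, d_opt, real_images, noise_dim=100):
    # Sample noise z ~ p_z (uniform, as in Fig. 3) and generate fake images G(z).
    batch = tf.shape(real_images)[0]
    z = tf.random.uniform([batch, noise_dim], -1.0, 1.0)
    with tf.GradientTape() as d_tape, tf.GradientTape() as g_tape:
        fake_images = generator(z, training=True)
        d_real = discriminator(real_images, training=True)
        d_fake = discriminator(fake_images, training=True)
        # D maximizes log D(x) + log(1 - D(G(z))), i.e. minimizes these two BCE terms.
        d_loss = bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)
        # G minimizes log(1 - D(G(z))); the non-saturating variant below instead
        # maximizes log D(G(z)), which gives stronger gradients early in training.
        g_loss = bce(tf.ones_like(d_fake), d_fake)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return d_loss, g_loss
```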
Unlike a typical GAN, where the discriminator is a binary classifier discriminating real and fake images, a semi-supervised GAN implements a multiclass classifier. In semi-supervised learning, where class labels are not available for all training images, it is convenient to leverage unlabeled data to estimate a proper prior, to be used by a classifier to enhance performance. This paper extends the typical GAN by replacing the traditional discriminator D with a fully convolutional multiclass classifier which, instead of predicting whether a sample x belongs to the data distribution (real or fake), assigns each input image pixel a label y from the n classes (i.e. crop, weed or background) or marks it as a fake sample (an extra, (n + 1)-th class). More specifically, the D network predicts the confidence for the n classes at each image pixel, and a softmax is employed to obtain the probability of sample x belonging to each class.
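As a sketch of how such an (n + 1)-class discriminator can be trained, the loss below combines a supervised pixel-wise cross-entropy term on the labeled subset with unsupervised terms that push real pixels away from the fake class and generated pixels toward it. This follows common semi-supervised GAN loss formulations; the equal weighting and the tensor shapes are assumptions, not the authors' published code.

```python
import tensorflow as tf

NUM_CLASSES = 3   # crop, weed, background; index NUM_CLASSES is the extra fake class

cce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def discriminator_loss(d_logits_labeled, pixel_labels,
                       d_logits_unlabeled, d_logits_fake):
    """Pixel-wise semi-supervised loss over n + 1 = 4 classes.

    d_logits_*: (batch, H, W, NUM_CLASSES + 1) raw per-pixel scores.
    pixel_labels: (batch, H, W) integer ground truth for the labeled subset.
    """
    # Supervised term: labeled pixels must be assigned their true class.
    supervised = cce(pixel_labels, d_logits_labeled)
    # Unlabeled real pixels should fall in one of the n real classes,
    # i.e. receive low probability for the fake class.
    p_unlabeled = tf.nn.softmax(d_logits_unlabeled, axis=-1)
    p_real = 1.0 - p_unlabeled[..., NUM_CLASSES]
    unsupervised_real = -tf.reduce_mean(tf.math.log(p_real + 1e-8))
    # Generated pixels should be assigned the fake class.
    fake_labels = tf.fill(tf.shape(d_logits_fake)[:3], NUM_CLASSES)
    unsupervised_fake = cce(fake_labels, d_logits_fake)
    return supervised + unsupervised_real + unsupervised_fake
```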
Figure 2 presents a schematic description of the semi-supervised GAN architecture, in which three inputs (generated multispectral data, unlabeled data and a small amount of labeled data) are fed into the discriminator. Note that our GAN formulation differs from the typical GAN, where the discriminator is a binary classifier discriminating real/fake images; our discriminator performs multiclass pixel categorization.

2.2 System Overview


The details of our semi-supervised GAN architecture, including both the generator and the discriminator, are presented in this section.

Fig. 2. The semi-supervised GAN architecture. Random noise is used by the generator to generate an image. The discriminator uses generated data, unlabeled data and labeled data to learn class confidences, producing a confidence map for each class as well as a label for fake data.

The generator network, shown in Figure 3, takes noise sampled from a uniform distribution as input and, through a series of four convolutional layers, generates a fake image resembling samples from the real data distribution. The discriminator network processes the generated images, unlabeled images and a small number of labeled multispectral images to learn class confidences, producing a confidence map for each class as well as a label for fake data.
The underlying idea is that adding a large number of fake multispectral images forces real samples to be close in the feature space, which, in turn, improves classification accuracy. Our semi-supervised GAN formulation extends the canonical GAN, whose discriminator is a binary real/fake classifier, by implementing a pixel-wise multiclass classifier.
Note that the proposed semi-supervised GAN adopts the DCGAN [20] architecture with a modification in the last layer of the discriminator, replacing the sigmoid activation function with a softmax to enable pixel-wise multiclass classification. All the networks are implemented using the Keras library with a TensorFlow backend. The standard Adam optimizer with momentum is used to optimize both the discriminator and the generator, with the learning rate and momentum (β1) set to 0.0002 and 0.5, respectively. A batch size of 32 and batch normalization are used for both networks. The ReLU activation function is applied in all generator layers except the output, which uses Tanh; the LeakyReLU activation is used in all discriminator layers. In the experiments, no data augmentation or post-processing is performed.
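A minimal Keras sketch of these choices is given below. The feature-map counts (256, 128, 64, 32, then a single output channel), the activations and the optimizer settings follow the text and Fig. 3; the initial spatial size, kernel sizes and strides are assumptions, since the paper does not specify them.

```python
from tensorflow.keras import layers, models, optimizers

def build_generator(noise_dim=100, channels=1):
    # Transposed-convolution blocks with 256, 128, 64 and 32 feature maps,
    # batch normalization and ReLU, then a Tanh single-channel output.
    return models.Sequential([
        layers.Dense(4 * 4 * 256, input_shape=(noise_dim,)),
        layers.Reshape((4, 4, 256)),
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(128, 4, strides=2, padding='same'),
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(64, 4, strides=2, padding='same'),
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(32, 4, strides=2, padding='same'),
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(channels, 4, strides=2, padding='same',
                               activation='tanh'),
    ])

def build_discriminator(input_shape=(64, 64, 1), num_classes=3):
    # Fully convolutional pixel classifier with LeakyReLU throughout and a
    # softmax over n + 1 = 4 classes (crop, weed, background, fake) per pixel.
    return models.Sequential([
        layers.Conv2D(32, 4, padding='same', input_shape=input_shape),
        layers.LeakyReLU(0.2), layers.BatchNormalization(),
        layers.Conv2D(64, 4, padding='same'),
        layers.LeakyReLU(0.2), layers.BatchNormalization(),
        layers.Conv2D(num_classes + 1, 1, padding='same', activation='softmax'),
    ])

# Adam with learning rate 0.0002 and momentum (beta_1) 0.5, as stated above.
g_opt = optimizers.Adam(learning_rate=2e-4, beta_1=0.5)
d_opt = optimizers.Adam(learning_rate=2e-4, beta_1=0.5)
```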
Fig. 3. The network architecture of our semi-supervised GAN. The noise is a vector of size 100, sampled from a uniform distribution, and is used as input to the generator. The four convolutional layers have 256, 128, 64 and 32 feature maps, respectively, followed by a single-channel output.

During testing, the discriminator network is used on its own as a pixel-wise multiclass classifier. Given a test image, the softmax layer of the discriminator outputs a set of probabilities for each pixel belonging to the semantic classes and, accordingly, the label with the highest probability is assigned to the pixel. Figure 4 shows some generated images in different channels. Interestingly, these images indicate that the semi-supervised GAN framework is able to learn spatial object patterns, for example crop and weed shapes.
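A sketch of this test-time use of the discriminator follows; the function name and class-index order are illustrative assumptions.

```python
import numpy as np

def predict_pixel_labels(discriminator, image):
    """Assign each pixel the class with the highest softmax probability.

    image: (H, W, C) array; returns an (H, W) integer label map, e.g.
    0 = crop, 1 = weed, 2 = background (index order is an assumption).
    """
    probs = discriminator.predict(image[np.newaxis, ...])[0]  # (H, W, n + 1)
    # Drop the fake class at test time and take the most probable real class.
    return np.argmax(probs[..., :-1], axis=-1)
```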

Fig. 4. Images generated by the generator of the semi-supervised GAN on the weedNet dataset. Interestingly, patterns related to crops and weeds can be observed in the NDVI, Red and NIR channels, which highlights the effectiveness of the approach.

3 Experimental Results

This section presents the experimental setup, followed by a quantitative and qualitative evaluation of the proposed method.

The proposed method is evaluated on the weedNet [15] dataset, collected by a micro aerial vehicle (MAV) equipped with a 4-band Sequoia multispectral camera. The multispectral images were captured over a sugar beet field at a height of 2 meters. The dataset contains only the NIR and Red channels, due to difficulties in registering the other bands. From corresponding NIR and Red channel images, the normalized difference vegetation index (NDVI), given by

\[
\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \qquad (2)
\]
is extracted, indicating the difference between soil and plants. Therefore, each training/test image consists of the 790nm NIR channel, the 660nm Red channel and NDVI imagery.
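For illustration, Eq. (2) can be computed per pixel with a few lines of NumPy (a hedged sketch; the epsilon guard against division by zero is an implementation detail, not part of the dataset pipeline):

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """NDVI from the 790nm NIR and 660nm Red reflectance channels."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    # (NIR - Red) / (NIR + Red), guarded against all-dark pixels.
    return (nir - red) / (nir + red + eps)
```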
The dataset contains images of crop only, weed only, or crop-weed combinations, along with their corresponding pixel-wise annotations. For semi-supervised training, different percentages of pixel-wise annotated images (50%, 40% and 30%) are used as labeled data for the discriminator, and the remaining images are used without pixel-wise annotations. The F1 score, the harmonic mean of precision and recall, is employed as the evaluation metric:

\[
F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \qquad (3)
\]

where precision = TP/(TP + FP) and recall = TP/(TP + FN), and TP, FP and FN denote the numbers of true positives, false positives and false negatives, respectively.
number of true positive, false positive and false negative, respectively. Quanti-
tative results of our method on weedNet are shown in Table 1. F1 measure with
a varying number of input channels and different amount of labelled data are
used as evaluation metric in this experiment. Considering the difficulty of the
dataset, all models (including different channels + different amount of labeled
data) perform reasonably well (about 80% for all classes). As shown in Table 1,
two input channels (Red and NIR) yield higher performance compared to single
channels as they contain more useful features to be used by the semi-supervised
GAN network. However, using 3 channels (NDVI + Red + NIR) did not improve
performance as NDVI depends on NIR and Red channels rather than captur-
ing new information. Furthermore, the network was evaluated by reducing the

Table 1. Results on the weedNet dataset using 50%, 40% and 30% of labeled data with different numbers of channels for the semi-supervised GAN, and the cascaded CNN [15] with fully labeled data. Higher F1 values indicate better classification performance.

                        Semi-supervised GAN                     Cascaded CNN
Labeled data        50%            40%            30%           Fully labeled
Channel          Crop   Weed    Crop   Weed    Crop   Weed     Crop   Weed
Red              0.831  0.814   0.822  0.813   0.792  0.813    0.923  0.845
NIR              0.839  0.823   0.800  0.821   0.782  0.733    0.942  0.839
NDVI             0.826  0.803   0.817  0.790   0.788  0.812    0.952  0.849
Red + NIR        0.857  0.865   0.837  0.834   0.823  0.815    0.971  0.851
Red + NIR + NDVI 0.852  0.831   0.847  0.821   0.816  0.812    0.979  0.816

Furthermore, the network was evaluated by reducing the amount of labeled data from 50% in steps of 10% down to 30%, to find out how this affects classification performance. As expected, larger amounts of labeled data result in better performance, as can be seen by comparing the 50%, 40% and 30% results in Table 1.
Results of the decoder-encoder cascaded CNN on the weedNet dataset [15] are shown in the last column of Table 1. Compared with our semi-supervised GAN, it achieved higher accuracy using fully labeled data. However, we showed that the semi-supervised GAN is able to achieve a good accuracy of about 80% with limited training data.
Qualitative results on some sample images are depicted in Figure 5. As shown, each row contains the original Red channel, the NIR channel, NDVI imagery, the semi-supervised GAN probability output and the corresponding ground truth. The probability of each class is mapped to red, green and black, representing weed, crop and background, respectively. There are some noticeable weed and crop misclassification areas in the images, occurring mostly where crop and weed are surrounded by each other. This misclassification suggests that the network captures high-level features such as shape and texture in addition to low-level features.

Fig. 5. Qualitative results on some sample images from the weedNet test set. The first three columns are the input data to the semi-supervised GAN, the fourth column shows the results of the semi-supervised GAN using 30% of labeled data, and the last column is the ground truth.

4 Conclusion

This paper presents a semi-supervised framework, based on generative adversarial networks (GAN), for the classification of multispectral images. The semi-supervised GAN network is trained on the weedNet dataset, captured by an MAV over a sugar beet field. The performance of the system was evaluated using the F1 score metric, varying the number of input channels and the amount of labeled data. Results showed an F1 score of about 0.85 for two channels with 50% labeled data. Compared with the weedNet paper [15], which utilized all the labeled data to train its decoder-encoder cascaded CNN for crop/weed classification, this paper demonstrated that even with limited labeled data the semi-supervised GAN network can classify crop and weed with relatively good accuracy. Additionally, the presented model generates synthetic images that could be used as additional multispectral data for other classifiers.
Future work includes adapting the algorithm for a near real-time application involving the transmission of aerial farm images from a UAV to a processing server over a 5G wireless network. One drawback of this work is that our semi-supervised GAN algorithm was tested on only one multispectral dataset with two channels (to the best of our knowledge, weedNet is the only publicly available multispectral plant/weed dataset). To overcome this limitation, future work will involve collecting multispectral imagery with multiple channels using a UAV over fields that contain both plants and weeds. We will then be able to test the proposed method on a multispectral dataset with more channels, such as Infrared, Red edge, Red, Green and Blue, and investigate the effect of each channel, separately and in combination, on classification performance.
Acknowledgement. This work is part of the 5GRIT project supported by
the Department for Digital, Culture, Media and Sport (DCMS), UK, through
their 5G Testbeds Program.

References

1. Lottes, P., Khanna, R., Pfeifer, J., Siegwart, R. and Stachniss, C.: UAV-based crop
and weed classification for smart farming. In: IEEE International Conference on
Robotics and Automation (ICRA), pp.3024-3031 (2017)
2. Hung, C., Xu, Z. and Sukkarieh, S.: Feature learning based approach for weed
classification using high resolution aerial images from a digital camera mounted on
a UAV. In: Remote Sensing, 6(12), pp.12037-12054 (2014)
3. Torres-Sánchez, J., Peña, J.M., de Castro, A.I. and López-Granados, F.: Multi-temporal
mapping of the vegetation fraction in early-season wheat fields using images from
UAV. In: Computers and Electronics in Agriculture, 103, pp.104-113 (2014)
4. Herrera, P.J., Dorado, J. and Ribeiro, A.: A novel approach for weed type classifi-
cation based on shape descriptors and a fuzzy decision-making method. In: Sensors,
14(8), pp.15304-15324 (2014)
5. Peña, J.M., Torres-Sánchez, J., Serrano-Pérez, A., de Castro, A.I. and López-Granados, F.: Quantifying efficacy and limits of unmanned aerial vehicle (UAV) technology for weed seedling detection as affected by sensor resolution. In: Sensors, 15(3), pp.5609-5626 (2015)
6. Samseemoung, G., Soni, P., Jayasuriya, H.P. and Salokhe, V.M.: Application of
low altitude remote sensing (LARS) platform for monitoring crop growth and weed
infestation in a soybean plantation. In: Precision Agriculture, 13(6), pp.611-627
(2012)
7. López-Granados, F., Torres-Sánchez, J., Serrano-Pérez, A., de Castro, A.I., Mesas-Carrascosa, F.J. and Peña, J.M.: Early season weed mapping in sunflower using
UAV technology: variability of herbicide treatment maps against weed thresholds.
In: Precision Agriculture, 17(2), pp.183-199 (2016)
8. Bannari, A., Morin, D., Bonn, F. and Huete, A.R.: A review of vegetation indices.
In: Remote sensing reviews, 13(1-2), pp.95-120 (1995)
9. Peña, J.M., Torres-Sánchez, J., de Castro, A.I., Kelly, M. and López-Granados, F.:
Weed mapping in early-season maize fields using object-based analysis of unmanned
aerial vehicle (UAV) images. In: PloS one, 8(10), p.e77151 (2013)
10. Louargant, M., Villette, S., Jones, G., Vigneau, N., Paoli, J.N. and Gée, C.: Weed
detection by UAV: Simulation of the impact of spectral mixing in multispectral
images. In: Precision Agriculture, 18(6), pp.932-951 (2017)
11. Herrmann, I., Shapira, U., Kinast, S., Karnieli, A. and Bonfil, D.J.: Ground-level
hyperspectral imagery for detecting weeds in wheat fields. In: Precision agriculture,
14(6), pp.637-659 (2013)
12. Ishida, T., Kurihara, J., Viray, F.A., Namuco, S.B., Paringit, E.C., Perez, G.J.,
Takahashi, Y. and Marciano, J.J.: A novel approach for vegetation classification
using UAV-based hyperspectral imaging. In: Computers and Electronics in Agricul-
ture, 144, pp.80-85 (2018)
13. Natividade, J., Prado, J. and Marques, L.: Low-cost multi-spectral vegetation clas-
sification using an Unmanned Aerial Vehicle. In: IEEE International Conference on
Autonomous Robot Systems and Competitions (ICARSC), pp. 336-342 (2017)
14. Mahdianpari, M., Salehi, B., Rezaee, M., Mohammadimanesh, F. and Zhang, Y.:
Very deep convolutional neural networks for complex land cover mapping using
multispectral remote sensing imagery. Remote Sensing, 10(7), p.1119 (2018)
15. Sa, I., Chen, Z., Popović, M., Khanna, R., Liebisch, F., Nieto, J. and Siegwart, R.: weedNet: Dense Semantic Weed Classification Using Multispectral Images and MAV for Smart Farming. In: IEEE Robotics and Automation Letters, 3(1), pp.588-595 (2018)
16. Sa, I., Popović, M., Khanna, R., Chen, Z., Lottes, P., Liebisch, F., Nieto, J., Stach-
niss, C., Walter, A. and Siegwart, R.: WeedMap: A large-scale semantic weed map-
ping framework using aerial multispectral imaging and deep neural network for
precision farming. In: Remote Sensing, 10(9), p.1423 (2018)
17. Bah, M.D., Dericquebourg, E., Hafiane, A. and Canals, R.: Deep learning based
classification system for identifying weeds using high-resolution UAV imagery. In:
Science and Information Conference, Springer, pp.176-187 (2018)
18. Lottes, P., Behley, J., Milioto, A. and Stachniss, C.: Fully Convolutional Networks
with Sequential Information for Robust Crop and Weed Detection in Precision Farm-
ing. In: arXiv preprint, arXiv:1806.03412 (2018)
19. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A. and Bengio, Y.: Generative adversarial nets. In: Advances in neural
information processing systems, pp. 2672-2680 (2014)
20. Radford, A., Metz, L. and Chintala, S.: Unsupervised representation learn-
ing with deep convolutional generative adversarial networks. In: arXiv preprint,
arXiv:1511.06434 (2015)
