IFAC AGRICONTROL 2019
December 4-6, 2019, Sydney, Australia
P. Ganesh et al. / IFAC PapersOnLine 52-30 (2019) 70–75
2.2 Dataset
The image data to train and validate the models was acquired from an orange orchard in Citra, Florida, USA. The images were acquired in 2018, just ahead of the commercial harvesting season, using a consumer-grade digital camera in natural lighting. The original images were of size 2816×1880 pixels. The fruit count per image was observed to be about 60, giving a relatively low pixel count per fruit. In an effort to reduce the number of training images to be acquired, the original images were divided into sub-images of size 256×256 pixels while retaining the original pixel density.

From this cache, 150 sub-images with varying levels of lighting conditions, occlusions, and overlapping fruits were arbitrarily chosen to train the neural network. Although it is well known, it must be emphasized that, due to the limited dynamic range of digital cameras, it is important to choose images corresponding to different illumination conditions, since the perceived color of the fruit changes depending on the position of the fruit on the tree and the position of the sun. Further, the selected images were manually annotated using the VGG Image Annotator (VIA) (Dutta et al., 2016). Fig. 2 shows one such sample image with the manually generated masks using polygon-shaped regions. Note that only the fruits that were clearly visible in the image were labelled in the manual annotation process.

Fig. 2. Image on the right is the input training image and the one on the left shows the masks generated manually using VIA.

Fig. 3 shows a sample image pair used to train the neural network. The RGB input is the mean-pixel-subtracted original sub-image, and the HSV input is the original sub-image converted to the HSV color space.

Fig. 3. Input images to the neural network: mean-pixel-subtracted RGB image (left) and HSV image (right).

2.3 Training

The implementation of Mask R-CNN uses ResNet-101 (He et al., 2016) as a feature extractor. Being a deep network with 101 layers, it has millions of trainable parameters, which means that the number of labelled images required will also be large. In order to limit the number of images of oranges needed to train the entire network, transfer learning is adopted. In ML, transfer learning refers to applying knowledge gained in solving one problem to solve a different but related problem. This means that instead of training the network parameters from scratch, we can use the weights of a network trained on another dataset as a starting point for further fine-tuning the weights for our problem of identifying oranges. This not only reduces the number of images needed to train the network but also decreases the time required to train the models, as only a limited amount of hand-labelled training data is required.

To this end, the Common Objects in Context (COCO) dataset (Lin et al., 2014) is employed, which consists of more than 120,000 images from 80 object categories (including oranges). The network weights for Mask R-CNN trained on the COCO dataset are freely available and are adopted for this work. In order to fine-tune the weights for detecting oranges, we only have to train the weights for the RPN, classifier, and mask generation portions of the network. However, this process is valid only for training the network with RGB input data, as the COCO dataset contains only RGB images. In order to train the network for the other two cases (i.e., HSV and RGB+HSV input data), we re-train the RPN, classifier, mask generation, and first three convolution layers while keeping the other parameters constant.

All three models were trained for 40 epochs on a computer with an Intel Xeon processor and four NVIDIA GTX 1070 Ti Graphics Processing Units (GPUs). Each model took about four hours to train and validate.

3. RESULTS

The performance of the presented deep learning framework is evaluated on a test dataset of 200 randomly selected images using each of the three trained models. The metrics selected to validate the fruit detection performance are precision, recall, and F1 score. Precision is the fraction of relevant instances among all retrieved instances, while recall is the fraction of relevant instances that were retrieved out of all relevant instances. Roughly, precision is an indicator of false positives among the retrieved instances, and recall is an indicator of false negatives. In fruit detection, higher precision relates to higher correctness of detection, and higher recall corresponds to higher detection efficiency. The F1 score is the harmonic mean of precision and recall.
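These three metrics reduce to simple ratios of true-positive, false-positive, and false-negative counts. A minimal sketch of their computation (the function and variable names are illustrative, not taken from the paper):

```python
def detection_metrics(tp, fp, fn):
    """Compute precision, recall, and F1 score from detection counts.

    tp: detections matched to a labelled fruit (true positives)
    fp: detections with no matching label (false positives)
    fn: labelled fruits that were missed (false negatives)
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: 90 correct detections, 10 spurious detections, 10 missed fruits
p, r, f1 = detection_metrics(tp=90, fp=10, fn=10)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.9 0.9 0.9
```

Because F1 is a harmonic mean, it is dragged toward the lower of the two metrics, penalizing a model that trades many false positives for high recall (or vice versa).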
Fig. 4. Validation images (first column) and the corresponding detection and segmentation results using RGB, HSV, and
RGB+HSV input data.
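The per-image true-positive, false-positive, and false-negative counts behind results like these are commonly obtained by matching each predicted region to an unmatched ground-truth annotation with sufficient overlap (intersection over union, IoU). The paper does not state its matching rule, so the following is a sketch under a common convention (greedy matching, 0.5 IoU threshold; all names are illustrative):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_detections(preds, truths, thresh=0.5):
    """Greedily match predictions to ground truth; return (tp, fp, fn)."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        # Best remaining ground-truth box for this prediction, if any.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thresh:
            unmatched.remove(best)
            tp += 1
    fp = len(preds) - tp   # predictions left without a ground-truth match
    fn = len(unmatched)    # ground-truth fruits no prediction claimed
    return tp, fp, fn
```

For instance segmentation the same matching can be done on binary masks instead of boxes, with IoU computed over pixel sets.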
797–804. IEEE, 2015.
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 2961–2969, 2017.
Dino Ienco, Raffaele Gaetano, Claire Dupaquier, and Pierre Maurel. Land cover classification via multitemporal spatial data by deep recurrent neural networks. IEEE Geoscience and Remote Sensing Letters, 14(10):1685–1689, 2017.
Andreas Kamilaris and Francesc X Prenafeta-Boldu. Deep learning in agriculture: A survey. Computers and Electronics in Agriculture, 147:70–90, 2018.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
Nataliia Kussul, Mykola Lavreniuk, Sergii Skakun, and Andrii Shelestov. Deep learning classification of land cover and crop types using remote sensing data. IEEE Geoscience and Remote Sensing Letters, 14(5):778–782, 2017.
Kentaro Kuwata and Ryosuke Shibasaki. Estimating crop yields with deep learning and remotely sensed data. In 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pages 858–861. IEEE, 2015.
Konstantinos Liakos, Patrizia Busato, Dimitrios Moshou, Simon Pearson, and Dionysis Bochtis. Machine learning in agriculture: A review. Sensors, 18(8):2674, 2018.
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, 2014.
Xu Liu, Steven W Chen, Shreyas Aditya, Nivedha Sivakumar, Sandeep Dcunha, Chao Qu, Camillo J Taylor, Jnaneshwar Das, and Vijay Kumar. Robust fruit counting: Combining deep learning, tracking, and structure from motion. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1045–1052. IEEE, 2018.
Francois PS Luus, Brian P Salmon, Frans Van den Bergh, and Bodhaswar Tikanath Jugpershad Maharaj. Multiview deep learning for land-use classification. IEEE Geoscience and Remote Sensing Letters, 12(12):2448–2452, 2015.
Chris McCool, Tristan Perez, and Ben Upcroft. Mixtures of lightweight deep convolutional neural networks: applied to agricultural robotics. IEEE Robotics and Automation Letters, 2(3):1344–1351, 2017.
Andres Milioto, Philipp Lottes, and Cyrill Stachniss. Real-time blob-wise sugar beets vs weeds classification for monitoring fields using convolutional neural networks. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 4:41, 2017.
Dinh Ho Tong Minh, Dino Ienco, Raffaele Gaetano, Nathalie Lalande, Emile Ndikumana, Faycal Osman, and Pierre Maurel. Deep recurrent neural networks for mapping winter vegetation quality coverage via multi-temporal sar sentinel-1. arXiv preprint arXiv:1708.03694, 2017.
Sharada P Mohanty, David P Hughes, and Marcel Salathé. Using deep learning for image-based plant disease detection. Frontiers in plant science, 7:1419, 2016.
Anders Krogh Mortensen, Mads Dyrmann, Henrik Karstoft, Rasmus Nyholm Jørgensen, René Gislum, et al. Semantic segmentation of mixed crops using deep convolutional neural network. In CIGR-AgEng Conference, 26-29 June 2016, Aarhus, Denmark. Abstracts and Full papers, pages 1–6. Organising Committee, CIGR 2016, 2016.
Horea Mureşan and Mihai Oltean. Fruit recognition from images using deep learning. Acta Universitatis Sapientiae, Informatica, 10(1):26–42, 2018.
Sarah Taghavi Namin, Mohammad Esmaeilzadeh, Mohammad Najafi, Tim B Brown, and Justin O Borevitz. Deep phenotyping: deep learning for temporal phenotype/genotype classification. Plant methods, 14(1):66, 2018.
Michael P Pound, Jonathan A Atkinson, Alexandra J Townsend, Michael H Wilson, Marcus Griffiths, Aaron S Jackson, Adrian Bulat, Georgios Tzimiropoulos, Darren M Wells, Erik H Murchie, et al. Deep machine learning provides state-of-the-art performance in image-based plant phenotyping. Gigascience, 6(10):gix083, 2017.
Maryam Rahnemoonfar and Clay Sheppard. Deep count: fruit counting based on deep simulated learning. Sensors, 17(4):905, 2017.
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91–99, 2015.
Inkyu Sa, Zongyuan Ge, Feras Dayoub, Ben Upcroft, Tristan Perez, and Chris McCool. Deepfruits: A fruit detection system using deep neural networks. Sensors, 16(8):1222, 2016.
Mayanda Mega Santoni, Dana Indra Sensuse, Aniati Murni Arymurthy, and Mohamad Ivan Fanany. Cattle race classification using gray level co-occurrence matrix convolutional neural networks. Procedia Computer Science, 59:493–502, 2015.
Srdjan Sladojevic, Marko Arsenovic, Andras Anderla, Dubravko Culibrk, and Darko Stefanovic. Deep neural networks based recognition of plant diseases by leaf image classification. Computational intelligence and neuroscience, 2016, 2016.
Madeleine Stein, Suchet Bargoti, and James Underwood. Image based mango fruit detection, localisation and yield estimation using multiple view geometry. Sensors, 16(11):1915, 2016.
Hulya Yalcin. Plant phenology recognition using deep learning: Deep-pheno. In 2017 6th International Conference on Agro-Geoinformatics, pages 1–5. IEEE, 2017.
Yu-Dong Zhang, Zhengchao Dong, Xianqing Chen, Wenjuan Jia, Sidan Du, Khan Muhammad, and Shui-Hua Wang. Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation. Multimedia Tools and Applications, 78(3):3613–3632, 2019.