
2019 3rd International Conference on Informatics and Computational Sciences (ICICoS)

Deep Convolutional Adversarial Network-Based Feature Learning for Tea Clones Identifications

Endang Suryawati, Research Center for Informatics, Indonesian Institute of Sciences (LIPI), Bandung, Indonesia (enda029@lipi.go.id)
Vicky Zilvan, Research Center for Informatics, Indonesian Institute of Sciences (LIPI), Bandung, Indonesia (vick001@lipi.go.id)
R. Sandra Yuwana, Research Center for Informatics, Indonesian Institute of Sciences (LIPI), Bandung, Indonesia (rade014@lipi.go.id)
Ana Heryana, Research Center for Informatics, Indonesian Institute of Sciences (LIPI), Bandung, Indonesia (anah0021@lipi.go.id)
Dadan Rohdiana, Research Institute for Tea and Cinchona, Indonesian Agency for Agricultural Research and Development, Gambung, Indonesia (rohdiana@ritc.id)
Hilman F. Pardede, Research Center for Informatics, Indonesian Institute of Sciences (LIPI), Bandung, Indonesia (hilm001@lipi.go.id)

Abstract—Tea is a commodity that has a strategic role in the Indonesian economy. The cultivation of tea plants is very important in order to maintain this superior commodity, with respect to increasing the production and/or improving the quality of tea. In a tea plantation management system, it is essential to identify the types of tea clones planted in the field, but it requires human experts to distinguish one type of clone from another. An automatic clone identification system is expected to make the identification easy, fast, accurate, and easily accessible for common farmers. In this work, we propose an unsupervised feature learning algorithm derived from the Deep Convolutional Generative Adversarial Network (DCGAN) for automatic tea clone identification. The use of unsupervised learning enables us to utilize unlabeled data. Our experiments suggest the effectiveness of our method for the tea clone detection task.

Index Terms—Generative Adversarial Network, feature learning, object recognition, Convolutional Neural Network, DCGAN

I. INTRODUCTION

Tea is one of the favorite drinks in the world, including Indonesia. This commodity has a strategic role in the Indonesian economy. The demand for tea production, which increases every year, makes tea an important commodity that may affect the country's foreign exchange sources. Attention to the cultivation of tea plants is therefore very important in order to maintain superiority over other countries. A lot of effort has been made to produce tea clones with a good production rate and/or high quality of tea. Superior clones are the ones with high productivity, resistance to pests and diseases, tolerance of drought, and high catechin content. The Tea and Quinine Research Center Institute has developed a variety of superior tea clones, called the GMB (Gambung) Clone series. Some of them are the superior clones of the GMB 1-11 tea series of Assamica varieties and the superior clones of the GMBS 1-5 series of Sinensis varieties.

In a tea plantation management system, which allows one clone to be cross-bred with other clones, it is essential to identify each type of tea clone planted in a field. The GMB clones can be identified visually through physical size, texture, bone structure, and leaf color. This identification process requires experts to be able to distinguish between their characteristics. In a plantation area, it takes high effort to identify superior tea clones manually. An application that identifies the GMB clones automatically is expected to make this identification task easy, fast, and accurate.

An automatic tea clone identification application can be developed using machine learning technologies, as in object recognition. Other applications of object recognition are face recognition [1], [2], vehicle number plate recognition [3], handwritten character recognition [4], etc. In agriculture, object recognition has been applied to inspect the quality of agricultural products. Some of them are a machine vision-based agricultural product grading system developed by Qinghua Su et al. to predict potato features [5], the fruit grading system by Sapan Naik & Bankim Patel [6], and the strawberry grading systems using K-means clustering [7] and Support Vector Machine [8]. Meanwhile, Riyadi, S. et al. [9] developed an estimation system for mangosteen fruit maturity by using a Support Vector Machine with RGB features. For tea plants in particular, image object recognition has been applied to the identification of tea leaf diseases using machine learning methods such as Support Vector Machine (SVM) [10] and multi-layer perceptron (MLP) [11].

The successful application of machine learning techniques in object recognition still has limitations when they are used for general purposes due to data variations.


For conventional machine learning techniques, robust and discriminative features are designed and extracted from image data so that they are easily separated by a machine learning algorithm through a feature extraction process. The process of extracting features may require a sophisticated and complex computational process.

Deep learning technology can provide solutions to the limitations of conventional machine learning methods. Deep learning [12] is a branch of machine learning that has proven to be very useful for images, sounds, and videos. With predefined deep network architectures, we could train the network to automatically learn "good features" from raw data. One deep learning architecture that has been widely applied to image data and provides a satisfactory level of accuracy is the Convolutional Neural Network (CNN) [13]–[15]. Several applications for detecting tea leaf diseases through images of tea leaves have also implemented deep learning techniques [16], [17].

Most object image recognition tasks still implement supervised learning. Supervised learning provides satisfying results, but deep learning usually requires a large amount of labeled data. In reality, it is difficult to get large amounts of labeled data, while there is a lot of unlabeled data around us that should be able to be utilized. One way to utilize the unlabeled data is to use unsupervised learning techniques. The results of this unsupervised learning could be used as features, and object classification could then be trained with supervised learning techniques [12] on a smaller amount of labeled data. Several unsupervised feature learning techniques were proposed in previous studies in deep learning frameworks, such as the autoencoder [18], [19] and DCGAN [20].

In previous studies, unsupervised feature learning using a convolutional autoencoder for the identification of tea diseases has been implemented [18]. However, the autoencoder tends to just memorize the input data and hence is prone to overfitting. The generative adversarial network (GAN) [21] is a recent architecture in deep learning. GAN works by competing two networks (called the G and D networks), where only the D network knows the real data, against each other so that the G network is able to predict the underlying distribution of the real data without ever seeing it. Therefore, we could prevent the network from overfitting. GAN has also been implemented using convolutional neural networks [20]. However, GAN is sensitive to initializations, and the outcome of the G network may be unpredictable.

In this paper, we propose an encoder DCGAN for unsupervised feature learning for tea clone identification applications. The idea is to feed both the G and D networks with some knowledge of the data instead of letting the G network learn blindly. To do this, we append an encoder network to DCGAN and then train it in a similar manner as normal GAN networks. Here, we only apply the method to the identification of two types of Gambung clones, namely GMB 3 [22] and GMB 9 [23]. Representations of features generated by the DCGAN architecture are used to train supervised learning techniques. Our results show that the proposed method outperforms the autoencoder.

This paper is structured as follows. Section II describes DCGAN as unsupervised feature learning. In Section III, we describe our proposed method. The experimental setup is briefly explained in Section IV. Section V describes the results and the discussion of these results. Section VI explains the main conclusions and possible future work to be carried out.

II. DEEP CONVOLUTIONAL GENERATIVE ADVERSARIAL NETWORK AS UNSUPERVISED FEATURE LEARNING

Radford et al. tried to bridge the gap between the success of the CNN architecture in supervised learning and unsupervised learning by carrying out the DCGAN (Deep Convolutional Generative Adversarial Network) architecture. They built DCGAN by applying CNN to the GAN (Generative Adversarial Network) architecture [21] to build an unsupervised representation learning model.

CNN is applied to the DCGAN architecture to replace the multilayer perceptrons used by GAN. It is one way to overcome the weaknesses of GAN, which is known to be unstable in the training process. Other settings for stabilizing DCGAN are replacing max-pooling layers with strided convolutions in the discriminator and fractional-strided convolutions in the generator, using batch normalization in the discriminator and the generator to stabilize learning, using ReLU activation on all layers in the generator except at the output (tanh), and using LeakyReLU activation on all layers in the discriminator.

Fig. 1. The Original DCGAN Generator

Figure 1 shows the DCGAN generator architecture. The generator input is uniform noise, i.e., random numbers in a 100-dimensional vector. The convolutional representation form (width, length, and number of feature maps) is the projection result of the input. The first projection is a 4x4x1024 layer, and the final layer produces a 64x64 pixel image. The discriminator reads this 64x64 pixel image, then learns whether the generated images are real or fake.
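To make the architecture above concrete, the following is a minimal Keras sketch of a DCGAN-style generator and discriminator. Only the elements stated in the text (100-dimensional noise, a 4x4x1024 projection, a 64x64x3 output, strided and fractional-strided convolutions, batch normalization, ReLU/LeakyReLU, a tanh generator output, and a sigmoid real-vs-fake output) come from the description above; kernel sizes, strides, and the intermediate filter counts are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import Sequential, layers

def build_dcgan_generator(latent_dim=100):
    # Project 100-dim uniform noise to a 4x4x1024 tensor, then upsample with
    # fractional-strided (transposed) convolutions to a 64x64x3 image.
    return Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(4 * 4 * 1024),
        layers.Reshape((4, 4, 1024)),
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(512, 5, strides=2, padding="same"),  # 8x8
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(256, 5, strides=2, padding="same"),  # 16x16
        layers.BatchNormalization(), layers.ReLU(),
        layers.Conv2DTranspose(128, 5, strides=2, padding="same"),  # 32x32
        layers.Conv2DTranspose(3, 5, strides=2, padding="same",
                               activation="tanh"),                  # 64x64x3, tanh output
    ])

def build_dcgan_discriminator():
    # Strided convolutions replace max-pooling; LeakyReLU on every layer;
    # a sigmoid output gives the real-vs-fake decision.
    return Sequential([
        layers.Input(shape=(64, 64, 3)),
        layers.Conv2D(128, 5, strides=2, padding="same"), layers.LeakyReLU(0.2),
        layers.Conv2D(256, 5, strides=2, padding="same"),
        layers.BatchNormalization(), layers.LeakyReLU(0.2),
        layers.Conv2D(512, 5, strides=2, padding="same"),
        layers.BatchNormalization(), layers.LeakyReLU(0.2),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ])
```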


Fig. 2. DCGAN network as a feature extractor for classification tasks.

Fig. 3. The proposed Generator of DCGAN

Fig. 4. The proposed Discriminator of DCGAN


Fig. 5. The Fully-Connected Classifier

III. THE PROPOSED METHOD

In Radford's study, image classification tasks use the discriminator model as the feature extractor, whereas in our work the G network serves as the feature extractor. In their study, features from all discriminator layers are flattened and concatenated into a 28672-dimensional vector, and an L2-SVM classifier is then trained on these features. The L2-SVM classifier shows that the DCGAN model can outperform all K-means based approaches with an accuracy of 84.3% and a smaller number of features (512).

In this study, we propose a feature learning architecture by appending an encoder to the DCGAN network as a feature extractor to identify Gambung tea clones. In DCGAN, as G never really "sees" the data, it is very sensitive to initialization. There are a few configurations that differ between our architecture and theirs. Fig. 2 shows the proposed DCGAN architecture as feature learning for classification tasks.
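As an illustration of the Fig. 2 pipeline, the sketch below wires a trained feature extractor into a small fully-connected classifier. Here `feature_extractor` is a hypothetical placeholder for the trained G-side network (it is not code from the paper); the 4096-unit layer sizes follow the VGG-style head described later, and the remaining choices are assumptions. Freezing the extractor keeps the unsupervised features fixed while only the classifier head is trained on the labeled images.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_classifier_on_features(feature_extractor, num_classes=2, lr=1e-5):
    # Freeze the unsupervised feature extractor and train only the
    # fully-connected head on the labeled tea-leaf images.
    feature_extractor.trainable = False
    inputs = tf.keras.Input(shape=feature_extractor.input_shape[1:])
    x = feature_extractor(inputs, training=False)
    x = layers.Flatten()(x)
    x = layers.Dense(4096, activation="relu")(x)   # VGG-style head (sizes assumed)
    x = layers.Dense(4096, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```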


DCGAN handles the feature learning process for the object. DCGAN consists of generator and discriminator networks. The generator learns features of the image through a presumptive distribution of the data and generates an image based on this distribution. The discriminator assesses the generated images and decides whether they are real or fake images. Both networks are trained in an adversarial manner. Based on the result of the discriminator's assessment, the generator adapts the presumptive distribution to generate better images, until the discriminator cannot distinguish between the real image and the fake image. In our method, we append an encoder at the top of the generator network, so the generator, in some sense, could have a better initialization for the presumptive distribution.
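The adversarial training described above can be sketched as a single update step. This is a generic GAN training step with standard binary cross-entropy losses, given here as an assumed illustration rather than the paper's exact training code; the Adam settings mirror those reported in Section IV.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
d_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

@tf.function
def train_step(generator, discriminator, real_images, gen_inputs):
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(gen_inputs, training=True)
        real_pred = discriminator(real_images, training=True)
        fake_pred = discriminator(fake_images, training=True)
        # D learns to label real images as 1 and generated images as 0.
        d_loss = bce(tf.ones_like(real_pred), real_pred) + \
                 bce(tf.zeros_like(fake_pred), fake_pred)
        # G learns to make D label its generated images as real.
        g_loss = bce(tf.ones_like(fake_pred), fake_pred)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return d_loss, g_loss
```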
Fig. 3 shows the proposed generator architecture. The generator input is 64x64x16 and is flattened into a 32768-dimensional vector, which we reshape into an 8x8x512 tensor. Some convolution processes then map the 8x8x512 input into the 64x64x3 output. The discriminator, shown in Fig. 4, consists of three convolution layers, three dropout layers, one flatten layer, and one fully-connected layer with a sigmoid function for the classification stage. The discriminator receives a 64x64x3 input that undergoes several convolution processes. The classifier (Fig. 5) has one flatten layer and the last three fully-connected layers of the VGG architecture, without dropout layers.
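A rough sketch of the proposed generator (Fig. 3) and discriminator (Fig. 4) is given below. Only the elements named in the text (64x64x16 input, a flattened vector, an 8x8x512 tensor, a 64x64x3 output, three convolution and three dropout layers, a flatten layer, and a sigmoid output) come from the description; the dense projection, kernel sizes, strides, activations, and dropout rate are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import Sequential, layers

def build_proposed_generator():
    return Sequential([
        layers.Input(shape=(64, 64, 16)),       # encoder output fed to the generator
        layers.Flatten(),                        # flattened input vector
        layers.Dense(8 * 8 * 512),               # projection to 8*8*512 = 32768 units
                                                 # (the exact mapping is not specified in the text)
        layers.Reshape((8, 8, 512)),
        layers.Conv2DTranspose(256, 5, strides=2, padding="same", activation="relu"),  # 16x16
        layers.Conv2DTranspose(128, 5, strides=2, padding="same", activation="relu"),  # 32x32
        layers.Conv2DTranspose(3, 5, strides=2, padding="same", activation="tanh"),    # 64x64x3
    ])

def build_proposed_discriminator():
    model = Sequential([layers.Input(shape=(64, 64, 3))])
    for filters in (64, 128, 256):               # three convolution + three dropout layers
        model.add(layers.Conv2D(filters, 5, strides=2, padding="same"))
        model.add(layers.LeakyReLU(0.2))         # activation and dropout rate are assumptions
        model.add(layers.Dropout(0.3))
    model.add(layers.Flatten())
    model.add(layers.Dense(1, activation="sigmoid"))  # real-vs-fake output
    return model
```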
IV. EXPERIMENTAL SETUP

In our experiment, we use images from the Gambung Clone dataset. This dataset consists of two types of Gambung clones, GMB 3 and GMB 9. Fig. 6 shows sample images of both clone types. At first glance they look similar, but they actually differ in shape, size, texture, and color. We take the leaf pictures and collect them according to their class. In total, the dataset contains 1297 leaf images. We divided the dataset into three groups: training, testing, and validation. 80% of the dataset is used as training data and the rest (20%) as testing data. For validation data, we use 10% of the testing data. There is no pre-processing for each image, except resizing the image to 64x64.

For the experiments, we first train the DCGAN using the generator and discriminator architectures shown in Fig. 3 and Fig. 4 to build the feature learning model. After the DCGAN is trained, we use G as the feature extractor, and the output of G is fed as features to the fully-connected layers (Fig. 5) for classification.

For the feature learning process, this experiment uses various numbers of epochs: 500, 1000, 1500, and 2000. We use the Adam optimizer with a learning rate of 0.0002 and momentum of 0.5. For the encoder, we use a latent dimension of 100 and a batch size of 128. In the classification task, we use various learning rates for the Adam optimizer (1e-3, 1e-4, 1e-5), a batch size of 128, and train the classifier for 2 epochs.
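A minimal sketch of this setup is shown below, assuming a directory layout with one folder per clone (the paths data/GMB3 and data/GMB9 are hypothetical) and TensorFlow 2.10 or later for the subset="both" option.

```python
import tensorflow as tf

IMG_SIZE, BATCH_SIZE, LATENT_DIM = 64, 128, 100

# 80/20 train/test split; images are only resized to 64x64, no other pre-processing.
train_ds, test_ds = tf.keras.utils.image_dataset_from_directory(
    "data", label_mode="binary", image_size=(IMG_SIZE, IMG_SIZE),
    batch_size=BATCH_SIZE, validation_split=0.2, subset="both", seed=42)

# Hold out roughly 10% of the test batches as validation data.
num_val_batches = max(1, int(0.1 * int(test_ds.cardinality())))
val_ds = test_ds.take(num_val_batches)

# Optimizer settings reported above: Adam with lr 0.0002 and momentum (beta_1) 0.5
# for DCGAN training; the classifier uses Adam with lr in {1e-3, 1e-4, 1e-5}.
dcgan_optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5)
classifier_optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)
```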
In the same scheme, we build feature learning models using two autoencoder architectures as our baselines. These autoencoders refer to Pardede et al. [18]: a fully-connected (MLP) autoencoder and a convolutional autoencoder. We also evaluate each model with the same classifier. The feature learning parameter setting that is used in the DCGAN architecture and gives the best performance for the classification task is reused by the autoencoders for the same task. The best parameter settings are 2000 feature learning epochs with a classifier learning rate of 1e-5.

Fig. 6. Sample of GMB 3 and GMB 9 images

TABLE I. The classifier performance when using encoder DCGAN as the feature learning model with varied classifier learning rates

Learning rate   Performance   Number of feature learning epochs
                               500      1000     1500     2000
Adam(1e-3)      Accuracy       83.47    85.95    80.99    86.78
                Loss            0.86     0.69     0.61     0.48
Adam(1e-4)      Accuracy       76.00    86.78    84.30    85.95
                Loss            0.75     0.42     0.69     0.38
Adam(1e-5)      Accuracy       82.64    87.60    84.30    91.74
                Loss            0.40     0.35     0.43     0.32

TABLE II. The performance comparison of the classifier when using different feature learning architectures

Architecture                    Accuracy (%)   Loss
proposed encoder DCGAN          91.74          0.32
fully-connected Autoencoder     60.33          0.66
convolutional Autoencoder       72.73          0.58

V. RESULTS AND DISCUSSIONS

Table I shows the performance of the fully-connected layer classifier with various learning rates, using the encoder DCGAN as the feature learning model. The classifier achieves the best performance when the feature learning model is trained for 2000 epochs and the classifier uses a learning rate of 1e-5. This configuration achieves the highest accuracy (91.74%) and the lowest loss (0.32) among the configurations shown in Table I. The results in Table I also show that overall the configurations give a satisfactory performance with an accuracy above 80%, except when the feature learning model is trained for 500 epochs with a classifier learning rate of 1e-4. This configuration gives an accuracy of 76% and a loss of 0.75.

There is a tendency that the number of epochs affects the performance of the model.


The highest performance is achieved when the model is trained using the highest number of epochs (2000) for all variations of the classifier learning rate. However, it cannot be concluded that an increase in the number of epochs increases the performance of the model linearly. Nevertheless, the experimental results show that the increase in accuracy occurs when the number of epochs increases from 500 to 1000. Accuracy decreases when the model is trained for 1500 epochs and increases again when the model is trained for 2000 epochs.

In our experiment, the best classifier learning rate is 1e-5, as the model gives the highest performance and the lowest loss. The choice of the learning rate may affect the optimum number of epochs, since the learning rate affects how fast the model converges. The learning rate controls how the model responds to the problem by estimating the error gradient when updating the model's weights. Choosing the learning rate can be a challenge in the model building process.

Our proposed encoder DCGAN architecture supports the classifier in giving better performance than the autoencoder architectures as a feature learning model, as shown in Table II. Each autoencoder is trained using the same number of epochs as the DCGAN. This indicates that, with the same number of training epochs, the encoder DCGAN tends to converge faster than the autoencoders.

VI. CONCLUSIONS

In this study, we proposed the Encoder-Deep Convolutional Generative Adversarial Network as unsupervised feature learning for tea clone identification tasks. The encoder is put at the top of the generator to produce features for a three fully-connected-layer classifier. The results show that the DCGAN architecture can give better results than the autoencoder as feature learning.

In the future, we would also like to investigate various deep learning architectures to improve the performance of our unsupervised feature learning architectures. Adding more training data, increasing the depth of the encoder, or variations on the encoder are among our interests for future studies.

ACKNOWLEDGMENT

This paper is partially funded by Insinas Grant 2019 (Contract Number: 091/P/PRL-LIPI/INSINAS-1/II/2019) from the Indonesian Ministry of Research, Technology, and Higher Education. The experiments in this research are conducted on the High Performance Computing (HPC) facility at the Research Center for Informatics, Indonesian Institute of Sciences. We thank our fellow researchers in the Research Center for Informatics, LIPI, who have provided assistance in this study.

REFERENCES

[1] A. ElSayed, A. Mahmood, and T. Sobh, "Effect of super resolution on high dimensional features for unsupervised face recognition in the wild," in 2017 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), pp. 1–5, Oct 2017.
[2] P. Rasti, T. Uiboupin, S. Escalera, and G. Anbarjafari, "Convolutional neural network super resolution for face recognition in surveillance monitoring," in Articulated Motion and Deformable Objects (F. J. Perales and J. Kittler, eds.), (Cham), pp. 175–184, Springer International Publishing, 2016.
[3] H. Rajput, T. Som, and S. Kar, "An automated vehicle license plate recognition system," Computer, vol. 48, pp. 56–61, Aug 2015.
[4] P. A. Khaustov, V. G. Spitsyn, and E. I. Maksimova, "Algorithm for optical handwritten characters recognition based on structural components extraction," in 2016 11th International Forum on Strategic Technology (IFOST), pp. 355–358, June 2016.
[5] Q. Su, N. Kondo, M. Li, H. Sun, and D. Al Riza, "Potato feature prediction based on machine vision and 3d model rebuilding," Computers and Electronics in Agriculture, vol. 137, pp. 41–51, 05 2017.
[6] S. Naik and B. Patel, "Machine vision based fruit classification and grading - a review," International Journal of Computer Applications, vol. 170, pp. 22–34, Jul 2017.
[7] X. Liming and Z. Yanchao, "Automated strawberry grading system based on image processing," Computers and Electronics in Agriculture, vol. 71, 04 2010.
[8] O. Mahendra, H. Pardede, R. Sustika, and R. Budiarianto Suryo Kusumo, "Comparison of features for strawberry grading classification with novel dataset," pp. 7–12, 11 2018.
[9] S. Riyadi, A. Zuhri, T. Hariadi, I. Prabasari, and N. Utama, "Optimized estimation of mangosteen maturity stage using svm and color features combination approach," International Journal of Applied Engineering Research, vol. 12, pp. 15034–15038, 01 2017.
[10] M. S. Hossain, R. M. Mou, M. M. Hasan, S. Chakraborty, and M. A. Razzak, "Recognition and detection of tea leaf's diseases using support vector machine," in 2018 IEEE 14th International Colloquium on Signal Processing Its Applications (CSPA), pp. 150–154, March 2018.
[11] B. Chandra Karmokar, M. Ullah, M. Siddiquee, and K. Alam, "Tea leaf diseases recognition using neural network ensemble," International Journal of Computer Applications, vol. 114, pp. 975–8887, 03 2015.
[12] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, pp. 436–444, 5 2015.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, NIPS'12, (USA), pp. 1097–1105, Curran Associates Inc., 2012.
[14] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," CoRR, vol. abs/1409.4842, 2014.
[15] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR, vol. abs/1409.1556, 2014.
[16] J. Chen, Q. Liu, and L. Gao, "Visual tea leaf disease recognition using a convolutional neural network model," Symmetry, vol. 11, no. 3, 2019.
[17] X. Sun, S. Mu, Y. Xu, Z. Cao, and T. Su, "Image recognition of tea leaf diseases based on convolutional neural network," CoRR, vol. abs/1901.02694, 2019.
[18] H. F. Pardede, E. Suryawati, R. Sustika, and V. Zilvan, "Unsupervised convolutional autoencoder-based feature learning for automatic detection of plant diseases," in 2018 International Conference on Computer, Control, Informatics and its Applications (IC3INA), pp. 158–162, Nov 2018.
[19] T. Wen and Z. Zhang, "Deep convolution neural network and autoencoders-based unsupervised feature learning of eeg signals," IEEE Access, vol. 6, pp. 25399–25410, 2018.
[20] A. Radford, L. Metz, and S. Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks," 2015.
[21] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS'14, (Cambridge, MA, USA), pp. 2672–2680, MIT Press, 2014.
[22] B. Sriyadi, "Gmb 3," September 2013.
[23] B. Sriyadi, "Gmb 9," September 2013.

