Abstract—As each tea clone may produce a different quality of tea, it is important to identify clones in the field. Tea clone identification is one application of ICT in agriculture. Tea clones may have very similar characteristics, so a good amount of data is required to train machine learning-based classifiers that perform well. However, in many cases only a small amount of data is available. To overcome this, we propose an encoder-based feature reduction that produces RGB-based bottleneck features. The output features are then fed into an SVM classifier. We evaluate our features on the classification of two tea clones of the Gambung Assamica (GMB) series. Our experimental results show that the proposed features achieve better performance than full-dimensional RGB features.

Index Terms—Dimensionality reduction, bottleneck features, Support Vector Machine (SVM), tea clones.

I. INTRODUCTION

Indonesia is one of the world's tea-producing countries, and many varieties of tea currently exist. In Indonesia, the Research Institute for Tea and Cinchona (RITC) in Gambung, West Java, is one of the centers of excellence for tea and quinine studies. RITC has produced a series of superior tea clones: 11 Gambung Assamica (GMB) clones and 5 Gambung Sinensis (GMBS) clones. The clones have almost the same characteristics, yet different clones produce different quality and quantity of tea. The high physical similarity between clones makes it difficult for non-experts such as farmers to distinguish them, so expertise is needed to differentiate between the clone varieties. Currently, few experts can identify these clones, and they still identify them manually. Automatic identification would help resolve this problem. Research on identifying plant types using neural networks has shown many successes [1]–[5]. These studies use the Multi-Layer Perceptron (MLP) approach and deep learning.

MLP and deep learning have already been applied in agriculture. For MLP, Kusumo et al. [6] proposed the detection of disease in maize using SVM, Decision Tree (DT), Random Forest (RF), and Naive Bayes (NB) with various image processing features. Their results show that RGB features give the best accuracy compared to the other extracted features. This result serves as a reference for our experiment, in which RGB data are used as the comparison. Specifically for tea plants, Zhang et al. [7] conducted real-time monitoring of tea leaf harvest time, and Gejima et al. [8] studied tea leaf quality using the RGB model of a tea leaf image. Many of these studies need large and varied datasets to achieve good results.

Deep learning for plants was studied by Sun et al. [9], who used a CNN to identify tea leaf disease and obtained an accuracy of 93.75%. Pardede et al. [10] used a convolutional autoencoder for the detection of plant diseases. The results show that the use of conventional autoencoders with more hidden layers gives better results. The use of convolutional networks indicates better performance.

One reliable machine learning method for classification is the Support Vector Machine (SVM). SVM is a machine learning method for classification and regression on labeled data [11]. Its advantages include that it works effectively in high-dimensional spaces, that it remains effective when the number of dimensions is greater than the number of samples, and that it uses memory efficiently because its decision function depends only on a subset of the training points, called support vectors [11]. Research on plants using SVM includes the work of Saranka et al. [12], who monitored the fermentation of black tea during processing using SVM as a classifier.

In recent years, data have become very valuable because they can provide useful information. For tea leaves, there is currently no specific digital dataset for GMB and GMBS tea leaf clones. In this study, we collected a dataset of Gambung Assamica (GMB) series tea leaves manually by photographing the leaves directly. However, collecting data from scratch requires considerable effort and cost. Original image data are high-dimensional. This is a challenge because it affects the size of the dataset. If the data have many dimensions, machine learning techniques may not perform optimally when trained. Dimension reduction is a way to overcome this problem.
in dimensions. We use a linear kernel for the SVM. The penalty parameter (C) is set to 1; this parameter controls how heavily misclassified training samples, including outliers, are penalized, so that misclassification of the training data is kept low.
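The role of the penalty parameter C in a linear SVM can be illustrated with a short sketch. The source does not state which solver or library was used, so this example swaps in plain subgradient descent on the soft-margin (hinge loss) objective, written in pure Python. The toy 2-D data, the learning rate, the epoch count, and the function names are all hypothetical.

```python
def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Fit w and b by subgradient descent on the soft-margin objective

        0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i + b))

    where every label y_i is -1 or +1. A larger C penalizes margin
    violations (including outliers) more heavily.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5 * ||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # margin violated: the hinge term is active
                for j, xj in enumerate(xi):
                    grad_w[j] -= C * yi * xj
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, X):
    """Classify each sample by the sign of the linear decision function."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1
            for xi in X]


# Hypothetical, linearly separable toy data standing in for two clone classes.
X_train = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
           (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
y_train = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_train, y_train, C=1.0)
print(predict(w, b, X_train))  # [1, 1, 1, -1, -1, -1]
```

With a linear kernel the decision function is simply w . x + b, so the classifier is fully described by the weight vector and the bias.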
TABLE III
THE PERFORMANCE OF THE PROPOSED METHOD WHEN THE NUMBER OF EPOCHS IS VARIED

Number of epochs    Accuracy (%) by feature reduction
                    FR 20%    FR 50%    FR 80%
50                  86.7      85.9      88.4
100                 85.1      88.4      86.7
150                 90.9      91.7      88.4
200                 83.4      91.3      90.9

TABLE IV
THE PERFORMANCE COMPARISON OF THE SVM CLASSIFIER USING INPUT WITH FR (I.E., FR 20%, 50%, AND 80%) AND WITHOUT FR (RGB) WHEN THE NUMBER OF EPOCHS IS 200

Input for classifier    Epochs    Accuracy (%)
FR 20%                  200       83.4
FR 50%                  200       91.3
FR 80%                  200       90.9
RGB data                200       80.1

feature of around 50% of the size of the image, it will produce an accuracy of 91.3% at epoch 200. This is better than the features reduced to 20% and 80% of the image size. It is also 11.2 percentage points higher than the accuracy obtained with RGB images (80.1%).

In the future, we would also like to identify and classify all classes of the GMB tea clone series, which consists of 11 classes. We will also focus on using other features such as texture, shape, and leaf angle. Because a weakness of RGB images is their sensitivity to light, the use of more robust features may produce better classification results.

ACKNOWLEDGMENT

The authors would like to thank Hilman F. Pardede for discussions and inputs. This paper is partially funded by Insinas Grant 2019 (Contract Number: 091/P/PRL-LIPI/INSINAS-1/II/2019) from the Indonesian Ministry of Research, Technology, and Higher Education. The experiments in this research were conducted on the High Performance Computing (HPC) facilities of the Research Center for Informatics, Indonesian Institute of Sciences (LIPI). We thank our fellow researchers in the Research Center for Informatics, LIPI, who provided assistance in this study.

REFERENCES

[1] B. C. Karmokar, M. S. Ullah, M. K. Siddiquee, and K. M. R. Alam, “Tea leaf diseases recognition using neural network ensemble,” International Journal of Computer Applications, vol. 114, no. 17, 2015.
[2] E. Suryawati, R. Sustika, R. S. Yuwana, A. Subekti, and H. F. Pardede, “Deep structured convolutional neural network for tomato diseases detection,” in 2018 International Conference on Advanced Computer Science and Information Systems (ICACSIS), pp. 385–390, IEEE, 2018.
[3] D. Moshou, E. Vrindts, B. De Ketelaere, J. De Baerdemaeker, and H. Ramon, “A neural network based plant classifier,” Computers and Electronics in Agriculture, vol. 31, no. 1, pp. 5–16, 2001.
[4] M. Dyrmann, H. Karstoft, and H. S. Midtiby, “Plant species classification using deep convolutional neural network,” Biosystems Engineering, vol. 151, pp. 72–80, 2016.
[5] S. G. Wu, F. S. Bao, E. Y. Xu, Y.-X. Wang, Y.-F. Chang, and Q.-L. Xiang, “A leaf recognition algorithm for plant classification using probabilistic neural network,” in 2007 IEEE International Symposium on Signal Processing and Information Technology, pp. 11–16, IEEE, 2007.
[6] B. S. Kusumo, A. Heryana, O. Mahendra, and H. F. Pardede, “Machine learning-based for automatic detection of corn-plant diseases using image processing,” in 2018 International Conference on Computer, Control, Informatics and its Applications (IC3INA), pp. 93–97, IEEE, 2018.
[7] L. Zhang, H. Zhang, Y. Chen, S. Dai, X. Li, I. Kenji, Z. Liu, and M. Li, “Real-time monitoring of optimum timing for harvesting fresh tea leaves based on machine vision,” International Journal of Agricultural and Biological Engineering, vol. 12, no. 1, pp. 6–9, 2019.
[8] Y. Gejima, M. Nagata, et al., “Basic study on Kamairicha tea leaves quality judgment system,” pp. 1–10, 2000.
[9] X. Sun, S. Mu, Y. Xu, Z. Cao, and T. Su, “Image recognition of tea leaf diseases based on convolutional neural network,” arXiv preprint arXiv:1901.02694, 2019.
[10] H. F. Pardede, E. Suryawati, R. Sustika, and V. Zilvan, “Unsupervised convolutional autoencoder-based feature learning for automatic detection of plant diseases,” in 2018 International Conference on Computer, Control, Informatics and its Applications (IC3INA), pp. 158–162, IEEE, 2018.
[11] M. S. Hossain, R. M. Mou, M. M. Hasan, S. Chakraborty, and M. A. Razzak, “Recognition and detection of tea leaf’s diseases using support vector machine,” in 2018 IEEE 14th International Colloquium on Signal Processing & Its Applications (CSPA), pp. 150–154, IEEE, 2018.
[12] S. Saranka, T. Kartheeswaran, D. Wanniarachchi, and W. Wanniarachchi, “Monitoring fermentation of black tea with image processing techniques,” 2016.
[13] L. Van Der Maaten, E. Postma, and J. Van den Herik, “Dimensionality reduction: a comparative,” J Mach Learn Res, vol. 10, no. 66-71, p. 13, 2009.
[14] W.-L. Chao, “Dimensionality reduction,” Graduate Institute of Communication Engineering, National Taiwan University, Tech. Rep., 2011.
[15] M. D. Patil and S. S. Sane, “Dimension reduction: A review,” International Journal of Computer Applications, vol. 92, no. 16, pp. 23–29, 2014.
[16] G. Sasikala, R. Kowsalya, and M. Punithavalli, “A comparative study of dimension reduction techniques for content-based image retrieval,” The Int. J. of Multimedia & Its Applications, vol. 2, no. 3, 2010.
[17] Y. Mao, K. Balasubramanian, and G. Lebanon, “Dimensionality reduction for text using domain knowledge,” in Proceedings of the 23rd International Conference on Computational Linguistics: Posters, pp. 801–809, Association for Computational Linguistics, 2010.
[18] H. Kim, P. Howland, and H. Park, “Dimension reduction in text classification with support vector machines,” Journal of Machine Learning Research, vol. 6, no. Jan, pp. 37–53, 2005.
[19] M. Shafiei, S. Wang, R. Zhang, E. Milios, B. Tang, J. Tougas, and R. Spiteri, “Document representation and dimension reduction for text clustering,” in 2007 IEEE 23rd International Conference on Data Engineering Workshop, pp. 770–779, IEEE, 2007.
[20] C. Ding, X. He, H. Zha, and H. D. Simon, “Adaptive dimension reduction for clustering high dimensional data,” in 2002 IEEE International Conference on Data Mining, 2002. Proceedings., pp. 147–154, IEEE, 2002.
[21] W. Zhao and S. Du, “Spectral–spatial feature extraction for hyperspectral image classification: A dimension reduction and deep learning approach,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 8, pp. 4544–4554, 2016.
[22] Y. Wang, H. Yao, and S. Zhao, “Auto-encoder based dimensionality reduction,” Neurocomputing, vol. 184, pp. 232–242, 2016.
[23] M. Prawira-Atmaja, H. Khomaini, H. Maulana, S. Harianto, D. Rohdiana, et al., “Changes in chlorophyll and polyphenols content in Camellia sinensis var. sinensis at different stage of leaf maturity,” in IOP Conference Series: Earth and Environmental Science, vol. 131, p. 012010, IOP Publishing, 2018.
[24] B. Sriyadi and H. Khomaeni, “Klon teh sinensis unggul GMBS 1, GMBS 2, GMBS 3, GMBS 4, dan GMBS 5,” in Prosiding Seminar Nasional Pertemuan Teknis Teh Nasional: Teknologi Terkini Untuk Mendukung Sustainable Tea, pp. 7–24, 2009.