Original papers
ARTICLE INFO

Keywords:
Sweetness
Non-destructive detection
Watermelon image processing
Acoustic signal processing
Machine learning

ABSTRACT

Sweetness is an essential factor for assessing the internal quality of fresh watermelon. In this paper, a fusion non-destructive method for classifying watermelon sweetness based on acoustic signal and image processing techniques is proposed. Tapping sound signals, watermelon rind patterns, and weight are considered as features. The application of the three features is inspired by techniques that are used by farmers to estimate watermelon maturity. Machine learning (ML) techniques are applied to develop sweetness classification models. Eight classification-based ML techniques are used: Naïve Bayes, K-nearest neighbors, Decision tree, Random forest, Artificial neural network, Logistic regression, Support vector machine, and Gradient boosted trees. The classification performance of the applied ML models is evaluated using accuracy, precision, recall, F-measure, and the area under the receiver operating characteristic curve (AUC). The results show that the proposed method can reliably classify watermelon sweetness. The highest classification accuracy, 92%, is obtained by Gradient boosted trees.
* Corresponding author.
E-mail address: supaporn.kit@mahidol.ac.th (S. Kiattisin).
https://doi.org/10.1016/j.compag.2020.105938
Received 30 April 2020; Received in revised form 8 October 2020; Accepted 1 December 2020
Available online 7 January 2021
0168-1699/© 2020 Elsevier B.V. All rights reserved.
K. Chawgien and S. Kiattisin Computers and Electronics in Agriculture 181 (2021) 105938
<9     100    SSC < 9 °Brix classified as unsweet
≥9     100    SSC ≥ 9 °Brix classified as sweet

An overview of this framework is illustrated in Fig. 1. There are three main steps which are pre-processing, modelling, and performance
where x_i is a class feature of an image that can have more than one class. As mentioned, the entropy can be determined from the probability of each class, written in general form in Eq. (4). A high entropy value indicates a high level of information that differs from the average content of an image. Because the entropy reflects an image's textures, the entropy function can be used to characterise the rind patterns of an input watermelon image.
$$H(X) = -\sum_{i=1}^{n} P(x_i)\log_2 P(x_i) \tag{4}$$
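
As a minimal sketch of Eq. (4), the entropy can be computed from the relative frequency of each distinct value in an image. The sketch assumes that the classes x_i are the grey levels of a flattened greyscale image, which this section does not state explicitly; the function name `shannon_entropy` and the sample pixel lists are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Return H(X) = -sum_i P(x_i) * log2 P(x_i) for a flat list of values."""
    counts = Counter(pixels)   # occurrences of each distinct value x_i
    total = len(pixels)
    # P(x_i) is estimated as the relative frequency of x_i in the list.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical 8-bit grey levels (assumption: flattened rind-image patches).
uniform_patch = [128] * 16      # one grey level only
striped_patch = [0, 255] * 8    # two equally frequent grey levels

print(shannon_entropy(uniform_patch))
print(shannon_entropy(striped_patch))
```

A patch with a single grey level has zero entropy, while a patch split evenly between two grey levels has an entropy of one bit, in line with higher entropy indicating more varied texture.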
performance of a ML model.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{7}$$

$$F\text{-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{8}$$
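
Eqs. (7) and (8) can be evaluated directly from confusion-matrix counts. The sketch below is illustrative only: the counts are hypothetical and not taken from the study, the function name `precision_recall_f` is an assumption, and recall is taken in its standard form TP / (TP + FN), which is not reproduced in this section.

```python
def precision_recall_f(tp, fp, fn):
    """Compute precision (Eq. 7), recall, and F-measure (Eq. 8) from counts."""
    precision = tp / (tp + fp)   # Eq. (7)
    recall = tp / (tp + fn)      # standard definition (assumed here)
    f_measure = 2 * precision * recall / (precision + recall)   # Eq. (8)
    return precision, recall, f_measure


# Hypothetical counts: 45 true positives, 5 false positives, 10 false negatives.
precision, recall, f_measure = precision_recall_f(tp=45, fp=5, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} F-measure={f_measure:.3f}")
```

Because the F-measure is the harmonic mean of precision and recall, it always lies between the two values and is pulled towards the lower one.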
Moreover, the area under the receiver operating characteristic curve (AUC) was also considered as an index for evaluating the performance of the applied ML models in this study. The AUC value is the area under the ROC curve, where the ROC curve is a graphical plot of recall against the false positive rate (FPR). FPR is defined in Eq. (9).
$$\text{FPR} = \frac{FP}{FP + TN} \tag{9}$$
The AUC value varies between 0 and 1, where 1 represents a perfect prediction while values below 0.5 show insufficient prediction of a model. An ML model which has an AUC value above 0.9 is outstanding, 0.8–0.9 is excellent, and 0.7–0.8 is acceptable (Hosmer and Lemeshow, 2000).

Fig. 8. TNR and TPR of the applied ML models.
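
The AUC described above can be approximated by sweeping a decision threshold over the model scores, recording (FPR, recall) pairs with FPR as in Eq. (9), and integrating with the trapezoidal rule. This is a generic sketch, not the authors' evaluation code; the function name `roc_auc`, the labels, and the scores are hypothetical, and both classes are assumed to be present.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve (recall vs. FPR) using the trapezoidal rule.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model outputs, higher meaning more likely positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    tp = fp = 0
    points = [(0.0, 0.0)]              # (FPR, recall) at the strictest threshold
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold past every sample that shares this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))   # FPR = FP / (FP + TN), Eq. (9)

    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and scores for six samples.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(labels, scores), 3))
```

With these sample values, eight of the nine positive-negative pairs are ranked correctly, so the area is 8/9; a model that ranks every positive above every negative gives an area of 1.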
3. Results and discussion

3.1. Comparison of the machine learning models

The classification performance of the eight classification-based ML techniques is evaluated. The confusion matrices of each model based on the dataset are summarised in Table 4. First, the true positive rate (TPR) and true negative rate (TNR) are employed to evaluate the model performance. High TPR and TNR values demonstrate the capability of the ML models for the sweetness classification. TPR is the proportion of actual sweet watermelon samples (SSC ≥ 9 °Brix) which are predicted as sweet. TNR is the proportion of actual unsweet samples (SSC < 9 °Brix) which are predicted as unsweet.

Fig. 8 shows TPR and TNR of the applied ML models. The highest TPR, 0.93, was obtained by Gradient boosted trees, SVM and Naïve Bayes, indicating that 93 out of 100 sweet samples were classified correctly. On the other hand, the highest TNR of 0.92 (92 out of 100 unsweet samples were classified correctly) was obtained by Random forest, closely followed by Gradient boosted trees and ANN. Decision tree was found to be the model with the lowest classification performance. Moreover, although Naïve Bayes had the largest TPR, the model was not recommended for the classification because of its low TNR.

Fig. 9 shows the bar charts of the accuracy, precision, recall, and F-measure of the applied ML models. In terms of the accuracy, all models obtained good performance with an accuracy over 85%. The result indicates that this proposed method, which employs Fmax, Entropy, and weight as input predictors, can successfully classify the sweetness of watermelon. The ML models which had an accuracy over 90% were Gradient boosted trees, SVM, Logistic regression, and Random forest. Among these high-accuracy models, Gradient boosted trees had the highest accuracy (92%). The accuracy of SVM and Random forest was slightly lower than that of Gradient boosted trees, with the same value of 91.5%. For the precision and recall, the highest precision was obtained by SVM, Gradient boosted trees and Naïve Bayes while the highest recall was obtained by Gradient boosted trees, Random forest, and ANN. It is worth noting that models which have high precision generally have low recall, as seen, for example, with Naïve Bayes and K-nearest neighbors. This is because precision and recall are in tension, which means that increasing one indicator decreases the other. Therefore, to comprehensively evaluate the performance of each model, the F-measure, which is the joint consideration of the two indicators, is used to examine the classification performance. As shown, Random forest provided the best performance with the highest F-measure of 91.7%, closely followed by SVM (91.4%) and Gradient boosted trees (91.0%). In contrast, KNN and Decision tree had relatively low values in terms of accuracy and F-measure, showing that the two models did not perform well in the classification.

AUC values of the eight ML models are provided in Fig. 10. As seen, except for Decision tree, the AUC values of all the ML models were larger than 0.90, which could be considered as outstanding (Hosmer and Lemeshow, 2000). The high AUC values confirm the reliability of the proposed method for classifying the watermelon sweetness. The three highest AUC values were obtained by Random forest, Gradient boosted trees and SVM.

Based on the comparison using various indicators, it was found that Gradient boosted trees, Random forest, and SVM had superior performance in terms of the accuracy, F-measure and AUC. The results show that these three models can be successfully used to classify the sweetness of watermelon based on the input features. On the other hand, K-nearest neighbors and Decision tree are not recommended for the classification since these models have a comparatively low level of classification performance. Finally, all the performance measures discussed above are shown in Table 5.

Table 4
Confusion matrices of different ML models.

ML models                        Actual condition   Predicted condition
                                                    Unsweet   Sweet
Gradient boosted trees (GBT)     Unsweet            91        7
                                 Sweet              9         93
Support vector machine (SVM)     Unsweet            90        7
                                 Sweet              10        93
Logistic regression              Unsweet            90        8
                                 Sweet              10        92
Artificial neural network (ANN)  Unsweet            91        11
                                 Sweet              9         89
Random forest                    Unsweet            92        9
                                 Sweet              8         91
Decision tree                    Unsweet            88        13
                                 Sweet              12        87
K-nearest neighbors (KNN)        Unsweet            81        19
                                 Sweet              10        90
Naïve Bayes                      Unsweet            86        7
                                 Sweet              14        93

3.2. Classification performance using different combined features

This section presents the influence of using different combinations of the input features on the performance of the sweetness classification. For the further discussion in this section, the classification model obtained by Gradient boosted trees was used, as this model provided the highest accuracy. Two cases of different combined features are shown in Table 6. Accuracy and AUC were used to evaluate the performance of each case.

It was observed that the model using all three features had the
accuracy of 92% while the model using only Fmax and Entropy value had the accuracy of 87%. The results showed that the classification performance was lower if the number of input variables was reduced. Thus, all three features are recommended for use in this proposed method in order to obtain the highest classification performance. However, it was also observed that the method can achieve high accuracy using only two features (Fmax, Entropy value). The AUC of the classification model with the two features reached 0.92, which was still considered as outstanding discrimination (Hosmer and Lemeshow, 2000). This demonstrates that even when the watermelon weight is unknown, the classification performance remains highly accurate. The results support the possibility of applying the proposed method in portable instruments such as smartphones, with which it may be difficult to measure watermelon weight.
4. Conclusion
- A good classification capability can be obtained even when only Fmax and Entropy value are used in this proposed method. The use of two features is suitable for portable detection instruments which may have limitations for weighing watermelons. However, all three features are recommended for use in order to achieve higher accuracy.

Although the results of this study are based on the sample set from one farm in Thailand with a single classification point, the results indicate that the proposed method can be used to classify the sweetness of watermelon effectively. However, the results in this study may not be adequate to conclude that the most effective ML technique can be applied for all watermelon varieties. To apply this proposed method to other watermelon varieties, modifications regarding tuning the hyper-parameters of the ML models might be needed. More samples are recommended to improve the generalisation of the classification model. Moreover, deep-learning techniques such as Convolutional neural networks (CNN) could also be used to train the image dataset. However, sufficient learning samples need to be ensured to achieve high-quality results.

As this proposed method does not require advanced technologies, it is convenient to implement in portable and commercial on-line detection equipment. Moreover, the concept of this proposed method can be applied for classifying other fruits or objects which emit sound signals when hit and have surface details/textures, such as melons, or gems and mineral aggregates. In addition, this method can be easily applied as a mobile application on smartphones so that everyone can use it to classify watermelon sweetness. The improvements of the camera and microphone in smartphones allow users to interact with the mobile devices more conveniently. Instead of selecting watermelons based on human judgement, the mobile application can provide a better way to estimate the sweetness of watermelon more precisely.

Declaration of Competing Interest

The authors declared that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.compag.2020.105938.

References

Ahmad Syazwan, N., Shah Rizam, M.S.B., Nooritawati, M.T., 2012. Categorization of watermelon maturity level based on rind features. Procedia Eng. 41, 1398–1404. https://doi.org/10.1016/j.proeng.2012.07.327.
Akashi, K., Mifune, Y., Morita, K., Ishitsuka, S., Tsujimoto, H., Ishihara, T., 2017. Spatial accumulation pattern of citrulline and other nutrients in immature and mature watermelon fruits. J. Sci. Food Agric. 97, 479–487. https://doi.org/10.1002/jsfa.7749.
Arendse, E., Fawole, O.A., Magwaza, L.S., Opara, U.L., 2018. Non-destructive prediction of internal and external quality attributes of fruit with thick rind: A review. J. Food Eng. 217, 11–23. https://doi.org/10.1016/j.jfoodeng.2017.08.009.
Collins, J.K., Wu, G., Perkins-Veazie, P., Spears, K., Claypool, P.L., Baker, R.A., Clevidence, B.A., 2007. Watermelon consumption increases plasma arginine concentrations in adults. Nutrition 23, 261–266. https://doi.org/10.1016/j.nut.2007.01.005.
De, I., Sil, J., 2012. Entropy based fuzzy classification of images on quality assessment. J. King Saud University - Computer and Information Sciences 24, 165–173. https://doi.org/10.1016/j.jksuci.2012.05.001.
El-Bendary, N., El Hariri, E., Hassanien, A.E., Badr, A., 2015. Using machine learning techniques for evaluating tomato ripeness. Expert Syst. Appl. 42, 1892–1905. https://doi.org/10.1016/j.eswa.2014.09.057.
Gouveia, L.T.D., Costa, F., Senger, L.J., Albertini, M.K., Mello, R.F.D., 2011. Entropy-Based Approach to Analyze and Classify Mineral Aggregates. J. Comput. Civil Eng. 25, 75–84. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000071.
Hosmer, D.W., Lemeshow, S., 2000. Applied Logistic Regression, 2nd ed. Wiley, New York. https://doi.org/10.1080/00401706.1992.10485291.
Jie, D., Wei, X., 2018. Review on the recent progress of non-destructive detection technology for internal quality of watermelon. Comput. Electron. Agric. 151, 156–164. https://doi.org/10.1016/j.compag.2018.05.031.
Jie, D., Xie, L., Rao, X., Ying, Y., 2014. Using visible and near infrared diffuse transmittance technique to predict soluble solids content of watermelon in an on-line detection system. Postharvest Biol. Technol. 90, 1–6. https://doi.org/10.1016/j.postharvbio.2013.11.009.
Krawozyk, A., Turowski, J., 1987. The Mathematical Theory of Communication. IEEE Trans. Magn. 23, 3032–3037. https://doi.org/10.1109/TMAG.1987.1065451.
Kyriacou, M.C., Leskovar, D.I., Colla, G., Rouphael, Y., 2018. Watermelon and melon fruit quality: The genotypic and agro-environmental factors implicated. Sci. Hortic. 234, 393–408. https://doi.org/10.1016/j.scienta.2018.01.032.
Kyriacou, M.C., Soteriou, G.A., Rouphael, Y., Siomos, A.S., Gerasopoulos, D., 2016. Configuration of watermelon fruit quality in response to rootstock-mediated harvest maturity and postharvest storage. J. Sci. Food Agric. 96, 2400–2409. https://doi.org/10.1002/jsfa.7356.
Lee, W., Xiang, D., 2001. Information-theoretic measures for anomaly detection. Proc. IEEE Computer Society Symposium on Research in Security and Privacy 130–143. https://doi.org/10.1109/secpri.2001.924294.
Mao, J., Yu, Y., Rao, X., Wang, J., 2016. Firmness prediction and modeling by optimizing acoustic device for watermelons. J. Food Eng. 168, 1–6. https://doi.org/10.1016/j.jfoodeng.2015.07.009.
Menon, S. V, Rao, T V Ramana, Doshi, B.R., 2012. Enzyme Activities during the Development and Ripening of Watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai) Fruit.
Mohd Ali, M., Hashim, N., Bejo, S.K., Shamsudin, R., 2017. Rapid and nondestructive techniques for internal and external quality evaluation of watermelons: A review. Sci. Hortic. 225, 689–699. https://doi.org/10.1016/j.scienta.2017.08.012.
Muhammad Jawad, U., Gao, L., Gebremeskel, H., Safdar, L.B., Yuan, P., Zhao, S., Xuqiang, L., Nan, H., Hongju, Z., Liu, W., 2020. Expression pattern of sugars and organic acids regulatory genes during watermelon fruit development. Sci. Hortic. 265, 109102. https://doi.org/10.1016/j.scienta.2019.109102.
Zaki, M.J., Jr. Wagner Meira, 2020. Data Mining and Analysis: Fundamental Concepts and Algorithms, 2nd ed, Personality and Social Psychology Bulletin. Cambridge University Press. https://doi.org/10.1145/3054925.
Zhang, H., Ge, Y., 2016. Dynamics of sugar-metabolic enzymes and sugars accumulation during watermelon (Citrullus lanatus) fruit development. Pak. J. Bot. 48, 2535–2538.
Zhu, Q., Gao, P., Liu, S., Zhu, Z., Amanullah, S., Davis, A.R., Luan, F., 2017. Comparative transcriptome analysis of two contrasting watermelon genotypes during fruit development and ripening. BMC Genomics 18, 1–20. https://doi.org/10.1186/s12864-016-3442-3.