
Classification of Broccoli Quality using Fuzzy K-Nearest Neighbor

Abstract— This study proposes a method for classifying broccoli quality from images using the Fuzzy K-Nearest Neighbor (F-KNN) algorithm. Broccoli is used because it is a herbaceous vegetable with a unique color and shape, so visual assessment is still limited. The data used in this study were taken from two cities in Indonesia, Pasuruan and Malang Regency, and consist of good-quality and bad-quality samples. The total data used is 120 images, 60 for each quality. In pre-processing, several stages are carried out to prepare the images for the next process. Feature extraction is done through two algorithms, HSI (Hue, Saturation, and Intensity) and LBP (Local Binary Pattern), and the resulting data are then divided into training and testing data. F-KNN is used as the classifier. The accuracy obtained in this study reached 94.4%. This value indicates that using both feature extraction methods together with the classification algorithm produces good accuracy on the training and testing data with a 40:60 scenario. This result shows the potential of the feature extraction and the F-KNN algorithm for classifying a large number of broccoli qualities.

Keywords—Fuzzy K-Nearest Neighbor, HSI Features Extraction, LBP Feature Extraction, Broccoli Classification.

I. INTRODUCTION
Broccoli (Brassica oleracea var. Italica) is a herbaceous vegetable that is rich in nutrients and is often consumed by the public. Broccoli is easy to cultivate in cool, wet climates. A good time to plant broccoli is at the beginning of the rainy season or the beginning of the dry season. However, broccoli can be grown all year round if it is cultivated intensively.

This intensive treatment aims to maintain the quality of broccoli so that consumers get optimal nutrition from this vegetable. Currently, the quality of broccoli is usually rated visually by farmers based on the size, color, shape, and longevity of the broccoli after harvest. However, this visual rating is limited in detecting more subtle quality differences in broccoli.

Digital image processing is now often developed by researchers as a method of classification or identification through images, both on fruits and vegetables. Research [1] succeeded in classifying the maturity level of Golek Mango using the Fuzzy C-Means classification method and obtained an accuracy of 83.3%. Digital image processing was also developed for detecting and classifying citrus fruit diseases, as was done by [2]. In research conducted by [3], digital imagery is also used to measure the dimensions of sunflowers. Image-based classification is also used by [4] to differentiate beef and pork. Imagery is also applied to the classification of domestic chicken egg grades [5].

In digital image processing, feature extraction is needed to extract the features that distinguish an object from other objects. Common features include shape, color, texture, and size. Texture feature extraction with Local Binary Pattern (LBP) has been used to classify papaya [6] and other multi-class fruits [7]. A similar implementation of texture feature extraction has also been carried out by [8] for tuning LBP parameters on lace images. Researchers [9] and [10] also use similar feature extraction in their research by combining it with other features.

In addition to texture feature extraction, color feature extraction is also needed to distinguish an object with a certain color. Research [11] used Hue, Saturation, Intensity (HSI) color feature extraction to detect bleeding in wireless capsule endoscope images using the Support Vector Machine (SVM) classification algorithm. A classification algorithm is also needed because it is an important step in classifying an object. One such algorithm is the Fuzzy K-Nearest Neighbor, a classifier that is widely used for classification, both of images [12]–[15] and of data [16]. Classification to determine the quality of an object using this algorithm has been carried out by [17] to classify the quality of milkfish, and by [18] to classify soybean plant diseases in leaf images with the highest accuracy of 83.33%.

Based on the literature study, LBP texture feature extraction and HSI color feature extraction have been widely used for digital image research and produce high accuracy. Good quality broccoli has a tight texture and a fresh white color, and vice versa. Objectively classifying broccoli's quality by using feature extraction and classifier algorithms can help improve consistency and accuracy in assessing the quality of vegetables. This can provide benefits in supporting the production of higher-quality vegetables and ensuring the optimal availability of nutrients for consumers.

In this study, to classify the quality of broccoli, we used LBP texture feature extraction and HSI color feature extraction on the images obtained from pre-processing. After that, we used the Fuzzy K-Nearest Neighbor (F-KNN) classifier. The research cited above has never performed a quality classification of broccoli based on color and texture, which is the gap this study addresses.
The upcoming section will explain the methodology employed in this study, along with a comprehensive description of each stage involved. The findings and subsequent discussion will be presented in Section III, followed by a concluding section in Section IV.

II. METHODOLOGY
In this section, we describe the methods and data we used in this study. Figure 1 shows the five stages, starting from the input image, pre-processing, feature extraction, classification, and output, which we will explain in detail in the following subsections.

Fig. 1 Research Methodology

A. Dataset
The dataset was taken from 2 regions in Indonesia, Pasuruan and Malang Regency, and consists of good quality and bad quality broccoli, as shown in Figure 2. Each quality consists of 60 images, so 120 images are used in total. The image size used is 350 x 350 pixels in JPG format.

Fig. 2 Broccoli quality (a) good quality, and (b) bad quality

B. Pre-processing Data
After inputting the dataset, the next step is pre-processing, which aims to convert the data into numbers. This process requires several steps in which the image is separated from the background. It starts with the input RGB image, which we convert into an HSI image. After that, the process continues by separating the HSI channels, then selecting the best image from the Hue, Saturation, and Intensity channel images. Next, we select the intensity to be processed into texture extraction. Figure 3 shows the results of the color character values of each image.

Fig. 3 Channel separation based on each channel (a) Hue, (b) Saturation, and (c) Intensity.

Once the channels have been separated, we choose the channel with the best results and carry out the adjustment process. After that, we apply a median filter to reduce noise in the broccoli image and continue with the threshold process to minimize the color variance of black and white pixels. Furthermore, we carry out a complementing process to invert the binary values and perform a segmentation process to return the original image. Figure 4 shows the results of the adjustment, threshold, and segmentation processes.

Fig. 4 (a) Adjustment, (b) Threshold, and (c) Segmentation process

C. Features Extraction
In this study, color and texture features are extracted to classify the quality of broccoli. Feature extraction based on color uses HSI (Hue, Saturation, and Intensity) and texture extraction uses Local Binary Pattern (LBP).

1) HSI (Hue, Saturation, and Intensity)
Hue, Saturation, and Intensity (HSI) separate the color information from the original colors in image processing. HSI is very different from the three-dimensional color space of RGB and best describes color in terms that are practical for human interpretation. Hue is the attribute that names a color, such as red, green, yellow, orange, or blue. Saturation is the strength or purity of the color. Brightness is a subjective descriptor that is practically impossible to measure, whereas intensity is the most useful descriptor of a monochromatic image and is measurable and easy to interpret.

The RGB and HSI models will be used in image feature extraction. Figure 5 is an explanation of the HSI space and how to convert RGB colors to HSI colors [19].
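As a rough illustration of the RGB-to-HSI conversion and channel separation described in this section (Equations (1) to (4)), the following Python/NumPy sketch may help. It is not the implementation used in this study: the function name `rgb_to_hsi`, the assumption that the input is scaled to [0, 1], and the handling of gray and black pixels (hue and saturation set to 0) are illustrative choices.

```python
import numpy as np

def rgb_to_hsi(rgb):
    """Convert an RGB image (H x W x 3, values in [0, 1]) to HSI.

    Returns an H x W x 3 array: hue in degrees [0, 360),
    saturation and intensity in [0, 1].
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    # Equation (2): angle theta from the RGB differences
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0))
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    theta = np.degrees(np.arccos(np.clip(ratio, -1.0, 1.0)))

    # Equation (1): hue is theta when B <= G, otherwise 360 - theta
    # (standard form in Gonzalez and Woods [19])
    hue = np.where(b <= g, theta, 360.0 - theta)
    # Assumption: hue is undefined for gray pixels, so it is set to 0 there
    hue = np.where(den > 0, hue, 0.0)

    # Equation (4): intensity is the mean of the three channels
    total = r + g + b
    intensity = total / 3.0

    # Equation (3): saturation; assumption: black pixels get saturation 0
    min_rgb = np.minimum(np.minimum(r, g), b)
    frac = np.divide(min_rgb, total, out=np.zeros_like(total), where=total > 0)
    saturation = np.where(total > 0, 1.0 - 3.0 * frac, 0.0)

    return np.stack([hue, saturation, intensity], axis=-1)
```

Separating the channels then amounts to indexing the last axis, for example `hsi[..., 2]` for the intensity image that is passed on to texture extraction.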
Fig. 5 The HSI color model based on circular color planes [19]

To convert RGB images to HSI, we use Equations (1) to (4). Equation (1) gives the hue H, where the value of θ is obtained from Equation (2). The saturation value is determined by Equation (3) and the intensity is determined by Equation (4).

$$H = \begin{cases} \theta, & B \le G \\ 360^\circ - \theta, & B > G \end{cases} \tag{1}$$

$$\theta = \cos^{-1}\left\{ \frac{\frac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^2 + (R-B)(G-B)\right]^{1/2}} \right\} \tag{2}$$

$$S = 1 - \frac{3}{R+G+B}\left[\min(R, G, B)\right] \tag{3}$$

$$I = \frac{1}{3}(R + G + B) \tag{4}$$

2) Local Binary Pattern (LBP)
Local Binary Pattern (LBP) is a method used for texture recognition on a grayscale surface based on the difference between the neighboring pixels and the central pixel. LBP is widely used in various applications. LBP works by labeling each pixel in the image by thresholding its neighborhood against the pixel itself and expressing the result as a binary code [20]. The basic LBP operator works on a 3x3 window of the image using 8 neighboring pixels, as shown in Figure 6.

Fig. 6 The example of LBP Calculation [21]

LBP calculation can be written by Equation (5) and Equation (6).

$$LBP_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p \tag{5}$$

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{6}$$

where,
x_c and y_c = coordinates of the center pixel of the neighborhood
p = index of the circular sampling point
P = the number of sampling points
g_p = the gray value of sampling point p
g_c = center pixel value
s = sign (binary code).

Before carrying out feature extraction using the LBP method, the segmented image data is converted into a grayscale image. Furthermore, the grayscale image pixel values are extracted using LBP [21].

D. Classification using Fuzzy K-Nearest Neighbor (F-KNN)
Object classification is an important area of research and practical applications in a variety of fields, including pattern recognition and artificial intelligence, statistics, cognitive psychology, vision analysis, and medicine. Pattern recognition and pattern classification are often based on data where the sample size of each class is small. In many circumstances, the K-Nearest Neighbor (KNN) algorithm is used to perform the classification. This decision rule provides a simple nonparametric procedure for assigning a class label to an input pattern based on the class labels represented by its K nearest neighbors [22].

The Fuzzy K-Nearest Neighbor (F-KNN) algorithm is a development of K-NN. Rather than assigning a pattern to a particular class, the F-KNN algorithm assigns class memberships to the pattern. The basis of this algorithm is to determine membership as a function of the pattern's distance from its K nearest neighbors and of those neighbors' memberships in the possible classes [23].

The Fuzzy K-Nearest Neighbor (F-KNN) classification process has several stages. The first is determining the class membership values of the labeled neighbors. Equation (7) is used in determining the membership value.

$$u_j(x) = \begin{cases} 0.51 + \left(\dfrac{n_j}{K}\right) \times 0.49, & j = i \\ \left(\dfrac{n_j}{K}\right) \times 0.49, & j \ne i \end{cases} \tag{7}$$

where,
u_j(x) = membership value of labeled sample x in class j, where i is the known class of x
n_j = the number of neighbors in the KNN set that belong to class j
K = the number of nearest neighbors used

To determine the membership value of each class in the test data, the process uses Equation (8).

$$u_i(x) = \frac{\sum_{j=1}^{k} u_{ij} \left( \dfrac{1}{\lVert x - x_j \rVert^{\frac{2}{m-1}}} \right)}{\sum_{j=1}^{k} \left( \dfrac{1}{\lVert x - x_j \rVert^{\frac{2}{m-1}}} \right)} \tag{8}$$

where,
u_i(x) = membership value of data x in class i
k = the number of nearest neighbors used
u_ij = class i membership value of the j-th neighbor
||x − x_j|| = distance from data x to its j-th nearest neighbor x_j
m = the weight exponent, with m > 1
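The two F-KNN stages, membership initialization (Equation (7)) and test-sample membership (Equation (8)), can be sketched in Python/NumPy as follows. This is a minimal sketch under stated assumptions rather than the code used in this study: the function names `init_memberships` and `fknn_predict` are hypothetical, Euclidean distance is assumed for ||x − x_j||, class labels are assumed to be integers starting at 0, and a small constant is added to avoid division by zero.

```python
import numpy as np

def init_memberships(X_train, y_train, n_classes, k):
    """Equation (7): fuzzy membership of each training sample in each class."""
    n = len(X_train)
    U = np.zeros((n, n_classes))
    for idx in range(n):
        d = np.linalg.norm(X_train - X_train[idx], axis=1)
        d[idx] = np.inf                    # exclude the sample itself
        nn = np.argsort(d)[:k]             # its k nearest training neighbors
        counts = np.bincount(y_train[nn], minlength=n_classes)  # n_j per class
        U[idx] = 0.49 * counts / k
        U[idx, y_train[idx]] += 0.51       # extra weight for the sample's own class
    return U

def fknn_predict(x, X_train, U, k, m=2.0):
    """Equation (8): class memberships of a query x; returns (label, memberships)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nn = np.argsort(d)[:k]                 # k nearest training samples
    # Inverse-distance weights 1 / ||x - x_j||^(2/(m-1));
    # the 1e-12 term is an assumption to guard against zero distance
    w = 1.0 / (d[nn] ** (2.0 / (m - 1.0)) + 1e-12)
    u = (U[nn] * w[:, None]).sum(axis=0) / w.sum()
    return int(np.argmax(u)), u
```

A call such as `fknn_predict(x, X_train, U, k=3, m=2.0)` returns the class with the largest membership together with the full membership vector, which sums to 1. Larger values of `m` flatten the distance weights, so the neighbors contribute more evenly.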
III. RESULT AND DISCUSSION
In this study, we created a broccoli quality classification system that displays all processes in image processing. Figure 7 shows the processing, starting from the color process, preprocessing, feature extraction, and the results of classification by machine learning.

Fig. 7 The final look Broccoli Quality Identification System

In this study, we use the F-KNN algorithm for training and testing data by simulating several ratios. The first scenario was carried out with a ratio of 40:60 with a total of 50 training data and 70 testing data. The second scenario was carried out with a ratio of 10:90 with a total of 20 training data and 100 testing data. Furthermore, we tested without extraction of LBP texture features and without extraction of HSI color features.

TABLE I. CLASSIFICATION ACCURACY BASED ON FUZZY K-NN

No  Scenario  Accuracy
1   40:60     94.4%
2   10:90     72%

Table 1 shows the accuracy results of the classification. The accuracy differs with the ratio of training to testing data: the less training data and the more testing data, the lower the resulting accuracy.

TABLE II. TESTING OF K-VALUES

No  K-Values  Accuracy
1   3         100%
2   5         100%
3   7         100%
4   9         100%

Table 2 shows the accuracy of the test for the k value. The level of accuracy in these tests is influenced by the amount of training data. Even though the value of k increases, the accuracy obtained is 100% and tends to be stable.

TABLE III. TESTING OF M-VALUES

No  M-Values  Accuracy
1   3         100%
2   5         96.6%
3   7         93.3%
4   9         85%

Table 3 shows the accuracy of testing the value of M. Classification using Fuzzy K-Nearest Neighbor does not affect the accuracy value but affects the membership value of each testing datum for each class of image data. The M variable is the weight exponent used in the Fuzzy K-Nearest Neighbor function; it determines how strongly the distance between neighbors counts when calculating the effect of neighbors on membership values. A large value of M will make the membership value lower, affecting the determination of class results in the classification of broccoli quality based on color and texture.

The results of our scenario testing in Table 1 show the best results when compared to the experiment of Febri et al. [24], which tested the Fuzzy K-NN algorithm with a 70:30 scenario, resulting in an accuracy of 86.66%.

Regarding color features, the color features we use are HSI (Hue, Saturation, and Intensity). Before extracting color features, a segmentation process is needed, performing morphological operations to overcome noise in the image, such as the presence of parts of broccoli leaves. By testing without LBP feature extraction, we get an accuracy of 56.6% for the values k=3 and m=2. Meanwhile, we get 93.3% accuracy for the values k=9 and m=8 without using the HSI feature.

In identifying the quality of broccoli, the accuracy obtained without using HSI color feature extraction is better than without LBP texture feature extraction. To increase the accuracy value, we combine the two features so that the accuracy can reach 94.4% with a 40:60 scenario using F-KNN. The more features extracted from an image, the more complex the algorithm that must be used, because each feature extracted adds another dimension to the data that must be processed and analyzed. As the number of features increases, the complexity of data processing also increases. More complex algorithms are required to learn more complex patterns and to make more informed decisions based on those features. Complex algorithms can lead to longer computation times and increase programming complexity.

IV. CONCLUSION
In this paper, we propose a method for identifying the quality of broccoli using the F-KNN method. This system classifies the quality of broccoli into good quality and bad quality. The accuracy obtained with HSI and LBP feature extraction can reach 94.4%. This value is the best accuracy of the 40:60 scenario. The more training data used, the greater the accuracy value. In our research, only 120 broccoli images were taken. Therefore, we suggest that further research add data to increase the effectiveness of the method used.

REFERENCES
[1] I. Habiburrohman, E. Suryani, and Wiharto, “Maturity Classification of Golek Mango using Fuzzy C Means Clustering Method based on HSI and YCbCr Color Space Transformation,” in 2022 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS), 2022, pp. 401–406. doi: 10.1109/ICIMCIS56303.2022.10017524.
[2] Z. Iqbal, M. A. Khan, M. Sharif, J. H. Shah, M. H. ur Rehman, and K. Javed, “An automated detection and classification of citrus plant diseases using image processing techniques: A review,” Comput Electron Agric, vol. 153, pp. 12–32, 2018, doi: 10.1016/j.compag.2018.07.032.
[3] S. Sunoj et al., “Sunflower floral dimension measurements using digital image processing,” Comput Electron Agric, vol. 151, pp. 403–415, 2018, doi: 10.1016/j.compag.2018.06.026.
[4] Elvia Budianita, Jasril Jasril, and Lestari Handayani, “Implementasi Pengolahan Citra dan Klasifikasi K-Nearest Neighbour Untuk
Membangun Aplikasi Pembeda Daging Sapi dan Babi,” Jurnal Sains dan Teknologi Industri, vol. 12, no. 2, 2015.
[5] Nur Ibrahim, Tasya Fikriyah Bacheramsyah, Bambang Hidayat, and Sjafril Darana, “Pengklasifikasian Grade Telur Ayam Negeri menggunakan Klasifikasi K-Nearest Neighbor berbasis Android,” Elkomika Jurnal, vol. 6, no. 2, pp. 288–302, 2018.
[6] C. A. Sari et al., “Papaya Fruit Type Classification using LBP Features Extraction and Naive Bayes Classifier,” in 2020 International Seminar on Application for Technology of Information and Communication (iSemantic), 2020, pp. 28–33. doi: 10.1109/iSemantic50169.2020.9234240.
[7] H.-L. Kuang, L. L. H. Chan, and H. Yan, “Multi-class fruit detection based on multiple color channels,” in 2015 International Conference on Wavelet Analysis and Pattern Recognition (ICWAPR), 2015, pp. 1–7. doi: 10.1109/ICWAPR.2015.7295917.
[8] V. T. Hoang, A. Porebski, N. Vandenbroucke, and D. Hamad, “LBP parameter tuning for texture analysis of lace images,” in 2016 International Image Processing, Applications and Systems (IPAS), 2016, pp. 1–6. doi: 10.1109/IPAS.2016.7880063.
[9] H. Kang, L. XueFei, and Z. Wenhui, “An adaptive fusion panoramic image mosaic algorithm based on circular LBP feature and HSV color system,” in 2020 IEEE International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), 2020, pp. 94–100. doi: 10.1109/ICIBA50161.2020.9277348.
[10] W. Huang, Y. Huang, Z. Wu, J. Yin, and Q. Chen, “A Multi-Kernel Mode Using a Local Binary Pattern and Random Patch Convolution for Hyperspectral Image Classification,” IEEE J Sel Top Appl Earth Obs Remote Sens, vol. 14, pp. 4607–4620, 2021, doi: 10.1109/JSTARS.2021.3076198.
[11] E. Tuba, S. Tomic, M. Beko, D. Zivkovic, and M. Tuba, “Bleeding Detection in Wireless Capsule Endoscopy Images Using Texture and Color Features,” in 2018 26th Telecommunications Forum (TELFOR), 2018, pp. 1–4. doi: 10.1109/TELFOR.2018.8611939.
[12] F. Liantoni, “Klasifikasi Daun Dengan Perbaikan Fitur Citra Menggunakan Metode K-Nearest Neighbor,” Ultimatics: Jurnal Teknik Informatika, vol. 7, no. 2, pp. 98–104, 2016.
[13] G. Singh Sodhi and J. Singh Sodhi, “A Robust Invariant Image-Based Paper-Currency Recognition Based on F-kNN,” in 2021 International Conference on Intelligent Technology, System and Service for Internet of Everything (ITSS-IoE), 2021, pp. 1–6. doi: 10.1109/ITSS-IoE53029.2021.9615287.
[14] S. Chandrasekaran, V. Dutt, N. Vyas, and A. Anand, “Fuzzy KNN Implementation for Early Parkinson’s Disease Prediction,” in 2023 7th International Conference on Computing Methodologies and Communication (ICCMC), 2023, pp. 896–901. doi: 10.1109/ICCMC56507.2023.10083522.
[15] Novita, Suyanto, and Y. Prasti Eko, “Bonferroni Mean Fuzzy K-Nearest Neighbors Based Handwritten Chinese Character Recognition,” in 2021 International Conference on Data Science and Its Applications (ICoDSA), 2021, pp. 118–123. doi: 10.1109/ICoDSA53588.2021.9617488.
[16] S. Patikar, P. Saha, S. Neogy, and C. Chowdhury, “An Approach towards prediction of Diabetes using Modified Fuzzy K Nearest Neighbor,” in 2020 IEEE International Conference on Computing, Power and Communication Technologies (GUCON), 2020, pp. 73–76. doi: 10.1109/GUCON48875.2020.9231066.
[17] Y. Anggoro, B. Setiawan, and P. Adikara, “Implementasi Metode Fuzzy K-Nearest Neighbor Untuk Klasifikasi Penyakit Tanaman Kedelai Pada Citra Daun,” Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer, vol. 2, pp. 2381–2389, 2018.
[18] A. A. I. Wiratmaka, I. F. Rozi, and R. A. Asmara, “Klasifikasi Kualitas Tanaman Cabai Menggunakan Metode Fuzzy K-Nearest Neighbor (FKNN),” Jurnal Informatika Polinema, vol. 3, no. 3, Mar. 2017, doi: 10.33795/jip.v3i3.25.
[19] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd ed., Pearson International Edition. Pearson Prentice Hall.
[20] E. Prakasa, “Texture Feature Extraction by Using Local Binary Pattern,” INKOM Journal of Informatics, Control Systems, and Computers, vol. 9, no. 2, pp. 45–48, 2015, doi: 10.14203/j.inkom.420.
[21] Yeni Herdiyeni and Ni Kadek Sri Wahyuni, “Mobile Application for Indonesian Medicinal Plants Identification using Fuzzy Local Binary Pattern and Fuzzy Color Histogram,” in International Conference on Advance Computer Science and Information System, 2012, pp. 301–306.
[22] J. M. Keller, M. R. Gray, and J. A. Givens, “A fuzzy K-nearest neighbor algorithm,” IEEE Trans Syst Man Cybern, vol. SMC-15, no. 4, pp. 580–585, 1985, doi: 10.1109/TSMC.1985.6313426.
[23] F. C.-H. Rhee and C. Hwang, “An interval type-2 fuzzy K-nearest neighbor,” in The 12th IEEE International Conference on Fuzzy Systems, 2003. FUZZ ’03., 2003, pp. 802–807 vol.2. doi: 10.1109/FUZZ.2003.1206532.
[24] L. Febri and N. A. Fitri, “Fuzzy K-Nearest Neighbor pada Klasifikasi Kematangan Cabai Berdasarkan Fitur HSV Citra,” Jurnal Ilmiah Penelitian dan Pembelajaran Informatika, vol. 3, no. 2, 2018.
