
AIL303m Project

Support Vector Machine: Classification based on lemon quality
Lab Project Report
Student
Chu Dang Huong ; Dang Hoang Phong ; Nguyen Thanh Long
Advisor
Tran Van Ha

ABSTRACT
In the agriculture industry, the classification of fruit quality plays a crucial
role in ensuring consumer satisfaction and maintaining high standards of
produce. The grading of lemons based on their quality is an important
aspect of this process. With the advancement of technology, machine
learning techniques can be employed to automate the classification of
lemon quality, resulting in efficient and accurate outcomes. In this study, we
propose the use of Support Vector Machines (SVM) to classify lemons based
on various quality attributes such as color, size, texture, and shape. By
utilizing publicly available datasets and digital images of lemons, we will
preprocess and extract relevant features to train the SVM model. This
approach aims to enhance the classification accuracy and improve the
overall efficiency of the grading process in the agriculture industry. The
results of this study can contribute to optimized resource allocation and
waste reduction, ultimately benefiting both producers and consumers.

Keywords: Support Vector Machine, Classification, Feature extraction, Agriculture industry.
I. Introduction
I.1 Problem and motivation
I.2 Contribution
II. Data preparation
II.1 Data information
II.2 Data processing
III. Method
IV. Implementation
V. Result
VI. Conclusions and Future Work
VII. References
I. Introduction
1.1 Problem and Motivation.
Lemons are fruits that are in high demand and widely used in daily life.
Ensuring the quality of lemons is extremely important for both producers and
consumers, as it directly affects market value and the interests of consumers.
In this project, we will address the problem of classifying lemons based on
their quality using a machine learning approach. The goal is to develop a
model that can accurately predict the quality of lemons, thereby assisting in
the selection process and improving product quality.

1.2 Contributions.
In this project, we will apply Support Vector Machines (SVM) to classify
lemons based on different attributes, including size, color, texture, and other
relevant quality factors. This will support the accurate prediction of lemon
quality automatically, helping to optimize resources and product quality for
the benefit of both sellers and consumers. As students new to machine
learning projects, we are excited to learn and explore the applications of
SVM in this classification challenge. We will strive to improve and refine our
model to achieve high accuracy and create a model that can be applied in
various real-world situations.
II. Data preparation
2.1 Datasets

While studying the problem, we found a dataset with features suitable for
classification. It contains 2,528 images of about 300x300 pixels in three
classes: 951 images of bad quality lemons, 1125 images of good quality lemons,
and 452 images of an empty background. Samples from each class are shown in
the figures below.

Figure 1: Data of bad quality lemon.


Figure 2: Data of empty background.

Figure 3: Data of good quality lemon.
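
As a quick check of the class balance, the per-class counts quoted above can be tallied in a few lines. This is only a sketch: the class names used as dictionary keys are illustrative labels, not necessarily the folder names in the dataset.

```python
# Class counts as stated in the report text.
# The key names are illustrative labels (an assumption), not dataset folder names.
class_counts = {
    "bad_quality": 951,
    "good_quality": 1125,
    "empty_background": 452,
}

total = sum(class_counts.values())

# Share of each class, useful for spotting imbalance before training.
class_shares = {name: count / total for name, count in class_counts.items()}

for name, count in class_counts.items():
    print(f"{name:>16}: {count:5d} images ({class_shares[name]:.1%})")
print(f"{'total':>16}: {total:5d} images")
```

The empty-background class is the smallest of the three, which may be worth keeping in mind when reading per-class results.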


2.2 Data processing

In this project, we focus on image-based lemon quality classification.

Having decided to solve the problem with machine learning, we consider
the data processing step particularly important: it ensures that
consistent, simplified and unbiased data is available to the
classification model. We performed the following steps:

First, each image is resized to a uniform size (64x64) to reduce
variation in the input data.

Next, the pixel values of each image are normalized to the range 0 to 1
so that all features share the same scale. This matters because an
SVM is sensitive to the scale of its input features, and normalization
keeps features with larger values from dominating the model.

Finally, to speed up the computation and reduce the cost of the
model, we flatten each image from a two-dimensional array into
a one-dimensional vector, append it to the data array, and assign
the corresponding class label.
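
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: it assumes 8-bit RGB images already loaded as NumPy arrays, uses a simple nearest-neighbour resize to stay self-contained (an image library's resize function would be the more usual choice), and the names `resize_nearest`, `preprocess` and `build_dataset` are hypothetical.

```python
import numpy as np

TARGET_SIZE = 64  # side length stated in the report


def resize_nearest(image: np.ndarray, size: int = TARGET_SIZE) -> np.ndarray:
    """Resize an H x W (x C) image to size x size by nearest-neighbour sampling.

    Assumption: a stand-in for whatever resize routine the project used.
    """
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size  # source row for each output row
    cols = np.arange(size) * width // size   # source column for each output column
    return image[rows[:, None], cols]


def preprocess(image: np.ndarray) -> np.ndarray:
    """Resize, scale 8-bit pixel values to [0, 1], and flatten to a 1-D vector."""
    resized = resize_nearest(image)
    scaled = resized.astype(np.float32) / 255.0
    return scaled.reshape(-1)


def build_dataset(images, labels):
    """Stack the preprocessed vectors into a feature matrix X and a label array y."""
    X = np.stack([preprocess(img) for img in images])
    y = np.asarray(labels)
    return X, y


# Synthetic stand-ins for 300x300 RGB photos; the integer labels are arbitrary.
rng = np.random.default_rng(0)
fake_images = [
    rng.integers(0, 256, size=(300, 300, 3), dtype=np.uint8) for _ in range(4)
]
fake_labels = [0, 1, 2, 1]

X, y = build_dataset(fake_images, fake_labels)
print(X.shape, X.dtype)  # (4, 12288) float32
```

With three colour channels each image becomes a vector of 64 x 64 x 3 = 12288 values; if the images were converted to grayscale first, the vector would have 4096 values instead.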

The data before and after processing are shown below (Figure 4
and Figure 5).
Figure 4: Data before processing.

Figure 5: Data after processing.
