in apple products
Fadel Thariq
Gifari
18/423103/PA/18186
2023
Undergraduate Program in Computer Science
Department of Computer Science and Electronics
Faculty of Mathematics and Natural Sciences
Universitas Gadjah Mada
Yogyakarta
2023
APPROVAL PAGE
Proposed by:
Supervisor:
Secondary Supervisor:
I have reviewed this
document
Yogya
2023.06.08 05:52:22+07'00'
Azhari, Drs., MT., Dr
CONTENTS
Approval Page
Contents
List of Figures
Abstract
1. Introduction
1.1 Background
1.2 Research Problem
1.3 Research Scope
1.4 Research Objective
1.6 Research Advantages
1.7 Proposal Organization
2. Literature Review
3. Basic Theories
3.1 Image Classification using Transfer Learning
3.2 ResNet-50 Implementations and Architecture
3.3 Fruits 360 Dataset
3.4 Data Preprocessing for Images
3.4.1 Histogram Equalization
3.4.2 Affine Transformations
3.5 Optimization Methods for ResNet-50
3.7 Example of Setting Parameters
3.8 Example of Importing the Required Libraries
3.9 Example and Preprocessing Specifics
4. Research Methodology
4.1 Research Description
4.2 Research Stages
4.3 Research Workflow
4.4 Dataset Preprocessing
4.4 Google Colab Usage
4.6 Implementation and Tools
4.6.1 Tools and Materials
4.6.2 Implementation of Machine Learning Models of ResNet-50
4.6.3 Model Evaluation Plan
5. Research Schedule
References
LIST OF FIGURES
ABSTRACT
Keywords: high accuracy, VGG-16, CNN, models, nutritional deficiency, machine learning, variables, transfer learning.
CHAPTER 1
INTRODUCTION
1.1 Background
In agricultural research on fruit crops, nutrients such as potassium, calcium,
and nitrogen are essential for producing healthy crops from well-tended plants.
Image classification makes it possible to identify the type of deficiency
affecting a crop from images of its leaves, which shortens the time the owner of
the plant needs to devise a plan for raising its nutrient levels. This research
runs a simulation on the unlabeled data of an available dataset, starting from an
untrained model that is then trained on that same dataset.
Existing research on crop image classification is commonly used to detect
whether produce is rotten or fresh and to gauge the ripeness of the fruit from
its chlorophyll color. This research instead aims to analyze the leaves for
deficiencies based on their coloration, which provides the criteria on which the
model is trained to segment leaf colors for nutrient deficiency classification.
Past research on image classification for crops or their plants has tended to
compare two states of the plant or crop: ripe or unripe, rotten or fresh, viable
or unviable for the land, and so forth. Little to no work observes the natural
discrepancies of the produce from its own leaves using algorithms such as CNN
and ResNet-50. Rather than creating a new model entirely from scratch, this
research uses a pre-trained model as the basis for the new model, together with
an available dataset of pictures of different leaves showing multiple
colorations on their surfaces.
1.2 Research Problem
Regarding research methods and complications in image classification for
agricultural use, previous research has mostly addressed the broad
identification of an apple's viability, using a dataset of close-up views that
show the varying colors of different types of apples. Put simply, this research
narrows the result to the coloration of the leaf: nutrient deficiency
classification that focuses on regions of the leaf image, based on a varied
dataset with multiple leaves and crops from differing environments.
Although this research aims to be more specific, previous research that
focused on coloration will be useful as a reference point for the intensity of
the filters used in the model, which then translates into the classification of
each nutrient. With each iteration, the overall accuracy is expected to increase.
1.4 Research Objective
The primary objective of this research is to create a model that produces
sustainable and modifiable classification results. Furthermore, the simplicity of
the results will allow the model to improve its accuracy over time with a
specified dataset, rather than experimenting with a variety of datasets that do
not meet the criteria for the nutrients' colours. The model is expected to have a
reduced epoch loss and an accuracy between 70% and 80%.
It is also important that the data preprocessing phase extracts the desired
features, such as colors and leaf shapes, followed by resizing each image
uniformly in accordance with the initial filters used to highlight the colour
anomalies. Blurry or unclear images will also be augmented using pixelating
techniques that enlarge a select few of the images, and finally the edges will be
smoothed for increased recognition.
From there, the results of this model can lead to multiple uses, such as an
automated system that creates a step-by-step process for restoring the health of
the plant, or a personalized application open to the public for personal
greenhouses and house plants. It is also possible to create introductory
commercial businesses that provide the nutrients needed, based on what the owner
has learned.
1.6 Proposal Organization
This proposal is organized as follows. Chapter II contains the literature
review of work comparable to this research topic. Chapter III discusses the
initial methods and theories used to preprocess the data and create the model.
Chapter IV delves into the specifics of the methods to be used and implemented
throughout the course of the model and its results. Finally, Chapter V provides
the research schedule, in order, until the final result is submitted.
CHAPTER 2
LITERATURE REVIEW
In agriculture, Convolutional Neural Networks (CNNs) are widely used for both
economic and practical analysis. On the practical side, a CNN is able to detect
anomalies within a specific crop field and segment the desired results in order
to quantify each factor involved in the practice of agriculture, specifically to
identify the components that increase or decrease the overall growth of a crop
field. For example, in research conducted by Cambridge University, a set of
aerial photos was taken of an abandoned plantation field. Using a training data
ratio of 85:15, it was possible to remotely label and differentiate sugarcane
from soil on the plantation field, while miscellaneous items were labeled as
others.
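An 85:15 split of this kind can be sketched as follows. This is a minimal illustration rather than the cited study's code: the function name `split_85_15`, the seed, and the placeholder file names are all assumptions introduced here.

```python
import random

def split_85_15(items, train_fraction=0.85, seed=0):
    """Shuffle a copy of items and split it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the aerial photos.
photos = [f"photo_{i:03d}.png" for i in range(100)]
train, test = split_85_15(photos)
print(len(train), len(test))
```

Shuffling before cutting keeps the two parts from inheriting any ordering in the original file list.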
Figure 1.1
From the result, it can be seen that the matrix displays the labels in an
orderly manner. Data such as this can be applied when searching for adaptable and
feasible land suitable for mass-produced crops (Kamilaris, 2018). The pictures
are split in order to test the overall accuracy and loss rate of the CNN model.
It is advisable to first build a program that has all of the attributes of a CNN
and run it on a sample dataset unrelated to the topic, to avoid any baseline
errors. In addition, multiple parameters were defined, including sigmoid and
feedforward functions, which increase the overall efficiency of the model before
it is trained iteratively. Defining each parameter individually allows the
program to predict the overall accuracy of the model when presented with 3
classes containing 100 images each, while a form of kernel allows each image to
be separated individually, with the goal of preventing errors during
classification.
precipitation that plays a role in the overall growth of the crops. Research
done by Dong-A University focused on the contents of the crops, mainly tomatoes,
to determine nutrient deficiency within them.
Figure 2.1
The data above indicate that deficiencies of the three micronutrients are
prevalent in the tomatoes. By adding multiple convolutional layers in a
supervised learning setting, the data were labeled for each form of tomato to
indicate its deficiency. According to the researchers, in order to set proper
boundaries for the image segmentation to operate successfully, it is first
required to “tune the Inception‐ResNet v2 model based on the CNN architecture”
(Tran, 2019, p. 6), followed by the implementation of normalization to deal with
the vanishing gradient. The objects displayed within the images are the tomato
fruit and leaves.
Figure 2.2
For the experimental results, the images showed that nitrogen deficiency is
detected only in the tomato leaves, while calcium and other nutrient
deficiencies are found on the tomato fruit itself, with a resulting accuracy of
87.27% and with each of the values producing its own individual maximum of
0.989, 0.999, and 1.0 (Tran, 2019). The results of the supervised learning proved
to be substantially accurate. The data acquired from the tomato research can be
used by farmers to determine which type of tomato seed is preferable to the
existing alternatives. From this it will be possible to select tomato seeds that
result in a hybrid produce that avoids deficiencies developing within the
tomatoes.
from the fruit ripeness dataset. Because there are only 140 photos in each dataset
for each image category, the data augmentation procedure is carried out using
ImageDataGenerator from the Keras library in order to reduce overfitting. ReLU
is the activation method employed in the MLP block. The MLP block output layer
employs the 8-class softmax activation function. The second design uses the
VGG16 model and batch normalisation for transfer learning. With the architecture
outlined in the proposed transfer learning sub-chapter, this research
demonstrates the impact of regularization approaches on transfer learning in
decreasing overfitting on the fruit maturity dataset. For training, validation,
and testing, there are 100, 20, and 20 photos for each category, respectively.
ModelCheckpoint was used with monitor = val_accuracy and mode = max. Table 1
compares the accuracy, precision, recall, and F-measure values for dropout rates
between 0.3 and 0.7.
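The four metrics named above can be computed from predicted and true labels as in the sketch below. It is an illustration only: the function `classification_metrics` and the sample labels are hypothetical and are not taken from the cited study.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure

# Hypothetical labels for a handful of test photos.
y_true = ["ripe", "ripe", "unripe", "ripe", "unripe", "unripe"]
y_pred = ["ripe", "unripe", "unripe", "ripe", "ripe", "unripe"]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred, positive="ripe")
print(acc, prec, rec, f1)
```

For a multi-class problem such as the 8-class case, the same function can be called once per class and the results averaged.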
Purbaditya et al. (2018) | Detection of Harbour Porpoise with Low-Level Feature Extraction and Deep Learning Based Classification | Deep Learning Extraction | undefined | Undefined
CHAPTER 3
BASIC THEORIES
For transfer learning using Keras, ResNet50 is more sensitive than VGG-16 to
the edges of an object, as shown below.
Figure 3.1
The comparison shows that ResNet50 has an advantage.
When comparing transfer learning with ResNet-50 and VGG-16, it is important
to note that each features a different hierarchical clustering for identifying
which object corresponds to which features, as shown in the visualization below.
Figure 3.2
The image above indicates that some features within the object are similar, so
each architecture responds according to its accuracy and inner workings. It can
be said that ResNet-50 is conditionally better for this research because it
picks up more features on the animal than VGG-16 does.
a CONV2D layer to match up the inputs and outputs. In detail, the ResNet-50
model contains 5 stages that consist of both a convolution block and an identity
block; each block has 3 convolutional layers.
As seen above, there are multiple layers, such as ReLU, that together account
for the functions ResNet-50 performs.
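The idea behind an identity block can be sketched in NumPy as below. This is a simplified illustration under stated assumptions: dense matrices stand in for the three convolutions, batch normalization is left out, and the array sizes and the name `identity_block` are chosen here for brevity.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def identity_block(x, w1, w2, w3):
    """Three stacked layers plus a skip connection: relu(F(x) + x).

    Dense matrices stand in for the three convolutions so the sketch
    stays short; real ResNet-50 blocks use 1x1, 3x3 and 1x1 convolutions
    with batch normalization.
    """
    out = relu(x @ w1)
    out = relu(out @ w2)
    out = out @ w3
    return relu(out + x)  # the shortcut adds the unchanged input

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))            # 4 samples, 8 features (assumed sizes)
w1 = rng.normal(size=(8, 2)) * 0.1     # bottleneck: 8 -> 2
w2 = rng.normal(size=(2, 2)) * 0.1
w3 = rng.normal(size=(2, 8)) * 0.1     # expand back: 2 -> 8
y = identity_block(x, w1, w2, w3)
print(y.shape)
```

When the three layers output zeros, the block simply passes `relu(x)` through, which is why stacking many such blocks does not make the network harder to train than a shallower one.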
The Fruits 360 dataset provides the content needed for this research: a
variety of fruit images covering a 360-degree view of each object. Each image is
a uniform 100x100 pixels, with a training set of 67692 images and a test set of
22688 images.
It can be inferred from the dataset above that each fruit is labelled
differently based on its species and angle. That will allow the ResNet algorithm
to review its accuracy based on each angle and filter to detect the nutritional
deficiency. In this dataset only the types of apples will be taken into account,
such as Braeburn, Crimson Snow, Golden, etc.
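Restricting the dataset to apple varieties could look like the sketch below. It assumes that each class folder is named "Apple" followed by the variety; the folder listing and the helper `select_apple_classes` are hypothetical.

```python
def select_apple_classes(class_names):
    """Keep only the class folders whose name starts with 'Apple'."""
    return sorted(name for name in class_names if name.startswith("Apple"))

# Hypothetical listing of class folders; in practice this could come from
# os.listdir() on the training directory.
class_names = ["Apple Braeburn", "Banana", "Apple Crimson Snow",
               "Apple Golden 1", "Cherry 1"]
apple_classes = select_apple_classes(class_names)
print(apple_classes)
```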
For the preprocessing stage, the images in the dataset are transformed using
a multitude of functions, such as histogram equalization, with the appropriate
functions and values.
Histogram equalization improves the contrast of each image in the dataset; in
essence, it stretches out the intensity range of each image. The histogram also
assists the model in identifying the color balance in the form of a graph.
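A minimal NumPy sketch of histogram equalization on an 8-bit grayscale image is given below. It is one common lookup-table formulation, not necessarily the routine this research will use; the function name `equalize_histogram` and the synthetic test image are assumptions.

```python
import numpy as np

def equalize_histogram(gray):
    """Histogram-equalize an 8-bit grayscale image with a CDF lookup table."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest level present
    denom = gray.size - cdf_min
    if denom == 0:                     # constant image: nothing to stretch
        return gray.copy()
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[gray]

# Low-contrast synthetic image with values squeezed into 100..149.
rng = np.random.default_rng(0)
img = rng.integers(100, 150, size=(100, 100), dtype=np.uint8)
eq = equalize_histogram(img)
print(img.min(), img.max(), "->", eq.min(), eq.max())
```

For colour images, a frequent choice is to equalize only a luminance channel so that the hues are not shifted.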
As seen from the image above, equalization helps spread the red, green, and
blue components of an image. Furthermore, the specific equation by which the
equalization is calculated is as follows.
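The equation is not reproduced in this text. The standard textbook form of the mapping is given here as an assumption about what was intended, where $r_k$ is the $k$-th intensity level, $n_j$ the number of pixels with level $j$, $N$ the total number of pixels, and $L$ the number of intensity levels (256 for 8-bit images):

```latex
s_k = T(r_k) = (L - 1) \sum_{j=0}^{k} \frac{n_j}{N}, \qquad k = 0, 1, \dots, L - 1
```

Each input level $r_k$ is mapped to the output level $s_k$, which is proportional to the cumulative fraction of pixels at or below that level.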
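# T is assumed to be a 3x3 homogeneous affine matrix; OpenCV needs its top 2x3 part.
T_opencv = np.float32(T.flatten()[:6].reshape(2, 3))
# Assumed missing step: img is the BGR image loaded earlier with cv2.imread.
img_transformed = cv2.warpAffine(img, T_opencv, (img.shape[1], img.shape[0]))
plt.imshow(cv2.cvtColor(img_transformed, cv2.COLOR_BGR2RGB))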
3.5 Optimization Methods for ResNet-50
The optimization approach applicable to ResNet-50 can be divided into
multiple parts, including but not limited to padding and stride. In ResNet-50, a
fixed padding is commonly used rather than a customized one, in order to prevent
the loss of pixels within the image. The following formula is used in Python for
the model.
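How padding and stride determine the output size of a convolution can be expressed with the usual relation floor((n + 2p - k) / s) + 1. The sketch below is illustrative and is not the formula from the original proposal; the function name `conv_output_size` is an assumption.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# First ResNet-50 convolution: 7x7 kernel, stride 2, padding 3 on a 224x224 input.
print(conv_output_size(224, kernel_size=7, stride=2, padding=3))  # 112
# A 3x3 kernel with stride 1 and padding 1 keeps the size unchanged.
print(conv_output_size(56, kernel_size=3, stride=1, padding=1))   # 56
```

Without the padding of 3, the first call would return 109 instead of 112, which is the pixel loss at the borders that fixed padding avoids.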
TEST_SIZE = 0.05
RANDOM_STATE = 1567
BATCH_SIZE = 80
NO_EPOCHS = 200
NUM_CLASSES = 3
SAMPLE_SIZE =
/'
TRAIN_FOLDER = './train/'
TEST_FOLDER = './test/'
IMG_SIZE = 250
To set the parameters, it is important to first set the image size, then the
number of classes, which are the nitrogen, phosphorus, and iron deficiencies, and
then to separate the test and training images into different folders so that the
program can detect them by itself.
As shown in the above example, the libraries for normalization and
preprocessing of the dataset must be imported first, as Keras is needed to
standardize each image in the dataset.
Next, training on the dataset associates each deficiency with a color on the
apple: white for phosphorus, green for nitrogen, and yellow for iron. For
example, each single picture of an apple will produce a result like the example
below:
Therefore the output of the predictions will be percentage-based for each
deficiency, meaning 3 classes with differing scores that determine whether the
apple qualifies as deficient in a specific nutrient.
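A percentage per class is typically obtained by applying a softmax to the raw scores of the final layer. The sketch below assumes this; the raw scores and the helper `to_percentages` are hypothetical values and names used for illustration.

```python
import numpy as np

CLASS_NAMES = ["nitrogen", "phosphorus", "iron"]  # the three deficiency classes

def to_percentages(logits):
    """Turn raw model scores into percentages that sum to 100 (softmax)."""
    shifted = logits - np.max(logits)       # subtract the max for numerical stability
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    return 100.0 * probs

# Hypothetical raw scores for one apple image.
logits = np.array([2.0, 0.5, 1.0])
percentages = to_percentages(logits)
for name, pct in zip(CLASS_NAMES, percentages):
    print(f"{name}: {pct:.1f}%")
predicted = CLASS_NAMES[int(np.argmax(percentages))]
```

The class with the highest percentage is reported as the predicted deficiency.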
import splitfolders

# Copy the dataset into train, validation and test folders in a 65/30/5 ratio.
splitfolders.ratio("/kaggle/input//fruits360/",
                   output="output",
                   seed=1000, ratio=(.65, .3, .05), group_prefix=None,
                   move=False)
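# Read the image in colour and resize it to IMG_SIZE x IMG_SIZE.
img = cv2.imread(path, cv2.IMREAD_COLOR)
img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
# Store the image together with its label, then shuffle the collected samples.
data_df.append([np.array(img), np.array(label)])
shuffle(data_df)
return data_df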
batch_size=BATCH_SIZE,
epochs=NO_EPOCHS,
verbose=1,
validation_data=(X_val, y_val))
axs[i].set_title('Actual'+str(key_list[val_list.index(label[i])])+'Predicted'+key_list[val_list.index(y_pred_l)])
# axs[i,1].imshow(masks[i])
# axs[i,1].set_title('Mask')
plt.show()
test_plot(a,b)
CHAPTER 4
RESEARCH METHODOLOGY
In detail, the model will detect the different colourations that are apparent
on the outside of the fruit, focusing only on the skin color that corresponds to
a certain nutrient. The percentage at which a nutritional deficiency is detected
will be based on how frequently discoloration appears on the skin of the
produce, and from there the model will classify whether the produce is
nutritionally deficient or not.
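One simple way to quantify how much of the skin shows a given colour is the fraction of pixels that fall inside a colour range. The sketch below assumes plain RGB thresholds; the bounds, the synthetic image, and the function `color_ratio` are hypothetical and would need tuning on real data.

```python
import numpy as np

def color_ratio(rgb, lower, upper):
    """Fraction of pixels whose R, G and B values all lie within [lower, upper]."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    mask = np.all((rgb >= lower) & (rgb <= upper), axis=-1)
    return float(mask.mean())

# Synthetic 10x10 "apple skin": red everywhere, with a 5x4 yellow patch.
img = np.zeros((10, 10, 3), dtype=np.uint8)
img[...] = (200, 30, 30)
img[:5, :4] = (230, 220, 40)

# Hypothetical RGB bounds for "yellow"; real thresholds would need tuning.
ratio = color_ratio(img, lower=(200, 180, 0), upper=(255, 255, 120))
print(f"yellow ratio: {ratio:.2f}")
```

A ratio like this could serve as a sanity check next to the model's own prediction, since it is easy to inspect by eye.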
To start the research, it is advisable to use Google Colab to import all the
necessary libraries, which in this case is Keras, with a Convolutional Neural
Network approach that compares the accuracy on training versus test data. The
overall performance of the model will be judged over multiple iterations of its
training and test results, with the aim of increasing its accuracy.
Figure 4.2 Research Stages
The next stage uses transfer learning from previously trained, similar models
in order to create a more sophisticated model for a new task: detecting
colouration and discoloration, which determines whether or not the produce falls
under the nutritional deficiency category.
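The principle of transfer learning, keeping a pre-trained base fixed and training only a new output layer, can be sketched in NumPy as below. This is a toy illustration under stated assumptions: a fixed random projection stands in for the pre-trained ResNet-50 base, the data are synthetic, and every name and size is chosen here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained base: a fixed projection that is never updated.
W_base = rng.normal(size=(12, 6))
W_base_before = W_base.copy()

def extract_features(x):
    """Frozen feature extractor: ReLU of a fixed linear map."""
    return np.maximum(x @ W_base, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    return float(-np.log(probs[np.arange(len(y)), y] + 1e-12).mean())

# Synthetic data: 60 samples, 3 classes (one per deficiency type).
X = rng.normal(size=(60, 12))
y = np.argmax(X[:, :3], axis=1)

features = extract_features(X)          # computed once; the base is frozen
W_head = np.zeros((6, 3))               # only this new layer is trained
losses = []
for _ in range(200):
    probs = softmax(features @ W_head)
    losses.append(cross_entropy(probs, y))
    grad = features.T @ (probs - np.eye(3)[y]) / len(y)
    W_head -= 0.05 * grad               # gradient step on the head only

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In a Keras workflow the same idea is usually expressed by loading the pre-trained network without its top layer, marking it as non-trainable, and adding a new classification layer on top.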
Table 4.1: Example of a categorical Fruits 360 dataset that has been
classified under different objectives.
Normalisation is performed using the first normal form. Table 4.3 below shows
a brief example of the first normal form used in the dataset of this research.
Example:
def normalize_images(images):
    normalized_images = np.zeros_like(images.astype(float))
    num_images = images.shape[0]
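The function above is cut off in this text. A complete version might look like the following sketch, which assumes min-max scaling of the whole batch to the range [0, 1]; that choice and the sample batch are assumptions, not the author's confirmed method.

```python
import numpy as np

def normalize_images(images):
    """Min-max scale a batch of images to the range [0, 1]."""
    normalized_images = np.zeros_like(images.astype(float))
    num_images = images.shape[0]
    minimum_value = float(images.min())
    maximum_value = float(images.max())
    value_range = maximum_value - minimum_value
    if value_range == 0:                 # constant batch: avoid dividing by zero
        return normalized_images
    for i in range(num_images):
        normalized_images[i] = (images[i] - minimum_value) / value_range
    return normalized_images

# Hypothetical batch of two 2x2 grayscale images.
batch = np.array([[[0, 64], [128, 255]],
                  [[255, 255], [0, 0]]], dtype=np.uint8)
scaled = normalize_images(batch)
print(scaled.min(), scaled.max())
```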
Standardization is performed only if needed, in case there are values that are
distant from the normal form. This is done using the z-score conversion. The
following shows an example of the z-score conversion used for standardization of
the order value in the dataset of this research (average order = 16, standard
deviation = 2, current order = 15). Below is the basic equation for
standardization:
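The equation is not reproduced in this text. The standard definition of the z-score is given here as an assumption about what was intended, with $x$ the observed value, $\mu$ the mean, and $\sigma$ the standard deviation:

```latex
z = \frac{x - \mu}{\sigma}
```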
Example:
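Using the values quoted in the text, the conversion can be worked through as in this short sketch. The function name `z_score` is an assumption introduced for illustration.

```python
def z_score(value, mean, std):
    """Standardize a value: how many standard deviations it lies from the mean."""
    return (value - mean) / std

# Values from the text: average order = 16, standard deviation = 2, current order = 15.
z = z_score(15, mean=16, std=2)
print(z)  # -0.5
```

A result of -0.5 means the current order lies half a standard deviation below the average.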
The model will mainly be created in Google Colab, as it is well suited to
creating a Python-based machine learning program, because it already has built-in
libraries and commands for image formatting and classification.
4.6 Implementation
1. Processor : 8th Gen Intel Core i7-8750H 2.2 GHz (up to 4.1 GHz)
2. RAM : 16 GB
3. GPU : NVIDIA GeForce GTX 1060
4. Operating System : Windows 11 Home 64-bit v. 21H1
This research will use the following software and supporting libraries.
1. Python 3.10.0
2. Anaconda 2.1.1
3. NumPy 1.22.3
4. Keras 2.7
5. Matplotlib 3.1
6. TensorFlow 2.0
7. ResNet-50
8. Fruits 360 Dataset
Keras is a simplified library well suited to applying all the preprocessing
techniques to each picture in the dataset. Implementing machine learning for
color detection using ResNet-50 will allow the model to systematically and
automatically classify which percentage of color loss corresponds to which
category of deficiency.
4.6.3 Model Evaluation Plan
The evaluation of the model will be based on how many deficiencies are
consistently detected across all types of apples and their subspecies. The
percentage, based on the consistency of the deficiencies detected from the
colors, will be the basis for the accuracy of the model.
Accuracy       To be researched
Color Ratio    To be researched
Table 5.1 : Proposed Research Schedule
2023
Activity May June July
1 2 3 4 1 2 3 4 1 2 3 4
Model and Dataset Research
Thesis completion over time
REFERENCES
Bank, W., 2021. Overview. [online] World Bank. Available at: [Accessed 18 April
2021].
Bhattacharya, P., Wulf, S. and Zölzer, U. (2018) (PDF) detection of Harbour Porpoise with
low-level feature extraction ... Available at:
https://www.researchgate.net/publication/332902381_Detection_of_Harbour_Porpoise_with_Low-Level_Feature_Extraction_and_Deep_Learning_Based_Classification
(Accessed: November 4, 2022).
Chui, M. et al., 2019. Notes from the AI frontier: Applications and value
of deep learning. McKinsey & Company. Available at: [Accessed April 18, 2021].
Garg, A. (2022) Image classification using resnet-50 Deep learning model, Analytics
Vidhya. Available at:
https://www.analyticsvidhya.com/blog/2022/09/image-classification-in-stl-10-dataset-using-resnet-50-deep-learning-model/
(Accessed: November 3, 2022).
Jan, B. et al., 2017. Deep learning in big data Analytics: A comparative study.
Computers & Electrical Engineering. Available at: [Accessed April 18, 2021].
Kamilaris, A. & Prenafeta-Boldú, F.X., 2018. A review of the use of convolutional
neural networks in agriculture. The Journal of Agricultural Science, 156(3),
pp.312–322.
Kudrov, M. (no date) Example of the resnet-50 result classification. | download scientific
https://www.researchgate.net/figure/Example-of-the-ResNet-50-result-classification_fig5_
4, 2022).