
SUMMER INTERNSHIP REPORT

On

“Fresh and Rotten Fruit Classification”

submitted in partial fulfillment of the requirement for


the award of the degree of

Bachelor of Technology

In

Computer Science and Engineering

By

PALAK ARORA
Enroll. No. A50105218032

Under the guidance of

Dr. Vikas Thada
Amity University, ASET (CSE)

Ms. Ruchi Kamra
Amity University, ASET (CSE)

Department of Computer Science and Engineering


Amity School of Engineering and Technology
Amity University Haryana
Gurgaon, India
June, 2021

DECLARATION

I, Palak Arora (A50105218032), student of Bachelor of Technology in the Department of
Computer Science and Engineering, Amity School of Engineering and Technology, Amity
University Haryana, hereby declare that I am fully responsible for the information and results
provided in this project report titled “Fresh and Rotten Fruit Classification”, submitted to the
Department of Computer Science and Engineering, Amity School of Engineering and
Technology, Amity University Haryana, Gurgaon, in partial fulfilment of the requirement
for the award of the degree of Bachelor of Technology in Computer Science and Engineering.

Date: September 2021


Signature(s)

PALAK ARORA
A50105218032


CERTIFICATE

This is to certify that Palak Arora (Enrolment No. A50105218032), student of Bachelor of
Technology, Department of Computer Science and Engineering, Amity School of Engineering
and Technology, Amity University Haryana, has done her Summer Internship Project entitled
“Fresh and Rotten Fruit Classification” under my guidance and supervision during May 2021
to July 2021. The work was satisfactory. She has shown complete dedication and devotion to
the given project work.

Date: July 2021

Dr. Vikas Thada
Amity University, ASET (CSE)

Ms. Ruchi Kamra
Amity University, ASET (CSE)

Head
Department of Computer Science & Engineering
Amity School of Engineering and Technology
Amity University Haryana, Gurgaon

Acknowledgement
“You just have to have the guidance to lead you in the direction until you can do it yourself.”
I consider it my proud privilege to have undertaken this endeavour under the inspiring guidance
of Dr. Vikas Thada, Assistant Professor, ASET, Amity University Haryana, the supervisor of
my project, and Ms. Ruchi Kamra, Assistant Professor, ASET, Amity University Haryana, the
co-supervisor of my project. I take this opportunity to express my deep sense of gratitude to
them for their valuable guidance and constructive criticism during the course of this project
and for critically reviewing the entire development phase. I thank both of them for believing
in me.

ABSTRACT

Detecting rotten fruit has become significant in the agricultural industry. Usually, fresh and
rotten fruits are sorted by humans, which is not effective for fruit farmers. Human beings
become tired after doing the same task many times, but machines do not. Thus, the project
proposes an approach that reduces human effort, cost and production time by identifying
defects in fruits. If these defects are not detected, the defective fruits may contaminate good
fruits. Hence, we propose a model to avoid the spread of rottenness. The proposed model
classifies input fruit images as fresh or rotten. In this work, we have used three types of
fruit: apple, banana and orange. A Convolutional Neural Network (CNN) is used to extract
features from the input fruit images, and Softmax is used to classify the images into fresh
and rotten fruits. The performance of the proposed model is evaluated on a dataset
downloaded from Kaggle, on which it achieves an accuracy of 97.29%. The results show that
the proposed CNN model can effectively classify fresh and rotten fruits.

List of Figures

Figure No. Figure Name

Figure 1.1 Components of Intelligence
Figure 2.1 Anaconda Logo
Figure 2.2 Anaconda Navigator
Figure 2.3 Jupyter Notebook Logo
Figure 2.4 Jupyter Notebook
Figure 2.5 Python DL Libraries
Figure 3.1 Fresh and rotten fruits available in dataset
Figure 3.2 Training Dataset
Figure 3.3 Test Dataset
Figure 3.4 Libraries Used
Figure 3.5 Initializing CNN
Figure 3.6 Convolution and Max Pooling layers
Figure 3.7 Flatten layer
Figure 3.8 Dense layers
Figure 3.9 Compiling CNN
Figure 3.10 Image pre-processing
Figure 3.11 Data pre-processing
Figure 3.12 Fitting the model
Figure 3.13 Epochs
Figure 3.14 Model Evaluation
Figure 4.1 Rotten Apple Prediction
Figure 4.2 Fresh Orange Prediction
Figure 4.3 Rotten Banana Prediction
Figure 4.4 Fruit classification
Figure 4.5 Training and validation loss
Figure 4.6 Training and validation accuracy
Figure 4.7 Saving the model
Figure 5.1 Architecture

Contents
Declaration
Certificate
Acknowledgement
Abstract
List of Figures
1. INTRODUCTION
1.1 What is Intelligence?
1.2 Project Overview
1.3 Project Background
1.4 Project Objective
1.5 Impact, Significance & Contribution
2. MATERIALS & METHODOLOGY
2.1 Development Environment
2.1.1 Anaconda
2.1.2 Jupyter Notebook
2.2 Technologies Used
2.2.1 Python
2.2.2 Deep Learning
2.2.3 Libraries Used
3. IMPLEMENTATION
Dataset
3.1 Importing Libraries
3.2 Initializing CNN
3.3 Adding Layers
3.4 Flatten the layers
3.5 Adding Dense Layers
3.6 Compiling CNN
3.7 Data Extraction and Preprocessing
3.8 Model Fitting and Evaluation
4. RESULTS
4.1 Model Predictions
4.1.1 Steps for predictions
4.2 Graphs
4.3 Saving the model
5. ARCHITECTURE
5.1 Architecture Flow
6. CONCLUSION & FUTURE SCOPE
6.1 Conclusion
6.2 Future Scope
7. REFERENCES

CHAPTER 1

INTRODUCTION
This chapter outlines the overall structure of the proposed system, states the contribution of
the project, and discusses the difficulties and issues that arise in fruit classification. Lastly,
the project objective and project scope are listed and discussed.

1.1 What is Intelligence?


Intelligence is the ability of a system to calculate, reason, perceive relationships and
analogies, learn from experience, store and retrieve information from memory, solve problems,
comprehend complex ideas, use natural language fluently, classify, generalize, and adapt to
new situations.
We can say a machine or a system is artificially intelligent when it is equipped with at
least one, and at most all, of these forms of intelligence.

Intelligence is intangible. It is composed of:

• Reasoning

• Learning

• Problem Solving

• Perception

• Linguistic Intelligence

Fig 1.1: Components of Intelligence

1.2 Project Overview
“Fresh and Rotten Fruit Classification” is an important task for many industrial
applications. A fruit classification system may be used to help a supermarket cashier
identify whether a fruit is fresh or rotten. It may also be used to help people decide
whether specific fruit species meet their dietary requirements. In this project we
used deep learning for our predictions. Deep learning is an artificial
intelligence (AI) function that imitates the workings of the human brain in processing data
and creating patterns for use in decision making. It is also known as deep neural learning
or deep neural networks.

In our project, we propose an efficient framework for “Fresh and Rotten Fruit
Classification” using deep learning, built on a Convolutional Neural Network (CNN). The
work uses the shape and shading of the fruit to recognise each picture, and the CNN
predicts whether the fruit is fresh or not. We train the network in a supervised manner:
images of the fruits are the input to the network, and the labels of the fruits are the
output of the network. After successful training, the CNN model is able to predict the
label of a fruit correctly.

1.3 Project Background


“Fresh and Rotten Fruit Classification” determines whether a fruit is rotten or fresh. It is
necessary to know that the fruit we eat is fresh, because rotten fruit can cause many
problems and can lead to nausea, vomiting or indigestion, as well as other food poisoning
symptoms. A rotten fruit can often be recognized with the naked eye from the outside, but
some fruits are only partially rotten: some look fresh from the outside but taste different
or are rotten inside, while others look rotten from the outside (for example, because they
were kept in the sun, exposed to pollution, or are out of season) but are fresh inside. This
model helps us recognize whether a fruit is rotten or fresh.

The project was developed using the TensorFlow and Keras frameworks. TensorFlow
is Google's open-source AI framework for machine learning and high-performance
numerical computation. TensorFlow is a Python library that invokes C++ to construct and
execute dataflow graphs. It supports many classification and regression algorithms and,
more generally, deep learning and neural networks, whereas Keras is a neural network
library. It is an API designed for human beings, not machines. Keras offers simple and
consistent high-level APIs and follows best practices to reduce the cognitive load for its
users. Both frameworks thus provide high-level APIs for building and training models with
ease.

In this project, we are doing fruit classification, and a great way to use deep learning to
classify images is to build a CNN. A CNN is a type of neural network model that allows us
to extract higher-level representations of image content, and it is most commonly applied to
analyze visual imagery. The Keras library in Python makes it simple to build a CNN. The model
type we use in this project is Sequential. Sequential is the easiest way to build a
model in Keras and TensorFlow. It allows us to build a model layer by layer. A CNN uses a
special technique called convolution. In mathematics, convolution is an
operation on two functions that produces a third function expressing how the shape of
one is modified by the other.

1.4 Project Objective

The main objectives and steps of the project are as follows:

• Loading the dataset – The dataset used in this project is taken from Kaggle and is
already divided into a training set and a test set, with 10,901 images in the training
set and 2,698 in the test set (see Chapter 3). The dataset covers three fruits (apple,
orange, banana) in both the rotten and fresh categories, hence it has 6 classes.

• Exploratory data analysis (EDA) – It helps us understand the data better and spot
patterns in it. The most important variable to explore in the data is the target variable.

• Data pre-processing – In deep learning, the input images must be reshaped to the input
shape that the model expects before training.

• Building the model – In this project we used a Sequential CNN model.
We added two Conv2D layers and two MaxPooling2D layers, followed by a Flatten layer.

• Compiling the model – After adding the convolutional, pooling, flatten
and hidden layers, we compile the model. Compilation takes three parameters:
optimizer, loss and metrics.

• Training the model – After compilation, the next step is to train the model using the
“fit()” function.

• Using our model to make predictions – After training, we can predict using the “predict”
function. For each image, the predict function returns an array of 6 probabilities, one
for each class.

• Saving the model – At the end we save the model to a file with the .h5 extension.

1.5 Impact, Significance & Contribution


Once the prediction model is successfully developed, it will bring many benefits to the
market, because shop retailers and customers will be able to buy fresh fruit that is
worth the money. Fruit at home also sometimes spoils, and there are many
uses for rotten fruit, listed below:
o Use it as a fertilizer.
o Bake bread with it (banana bread).
o Turn it into a smoothie.
o Use rotten fruits in sangria.
o Make a cobbler with it.
o Make jams.
o Give it to your plants.
o Freeze it.
o Use it in pancakes.

CHAPTER 2

MATERIALS & METHODOLOGY


In the previous chapter, we learned about the existing and proposed prediction systems and
discussed the project. Now that we have a clear view of the proposed project, we
will discuss the development methodology, environment and technologies used for the
development.

2.1 Development Environment

Hardware Configuration:

Operating System: Microsoft Windows 10 (64-bit)
Processor: Intel Core i5
RAM: 8 GB

Software Requirement:

Anaconda
Jupyter Notebook
Python 3.8.5
2.1.1 Anaconda

Anaconda is a free and open-source distribution of the Python and R programming languages
for scientific computing that aims to simplify package management and deployment. It is used
for data science, machine learning, deep learning, etc. With more than 300 libraries
available for data science, it is a convenient choice for any programmer working in the
field.

Anaconda comes with a wide variety of tools to easily collect data from various sources using
various machine learning and AI algorithms. It provides an easily manageable
environment setup from which a project can be deployed with a single click. It is
developed and maintained by Anaconda, Inc., which was founded by Peter Wang and Travis
Oliphant in 2012. As an Anaconda, Inc. product, it is also known as Anaconda
Distribution or Anaconda Individual Edition.

Fig 2.1: Anaconda Logo

Anaconda Navigator is a desktop graphical user interface (GUI) included in the Anaconda
distribution that allows users to launch applications and manage conda packages, environments
and channels without using command-line commands. Navigator can search for packages on
Anaconda Cloud or in a local Anaconda Repository, install them in an environment, run the
packages and update them.

The following applications are available by default in Navigator:

• JupyterLab
• Jupyter Notebook
• QtConsole
• Spyder
• Glue
• Orange
• RStudio
• Visual Studio Code

Fig 2.2: Anaconda Navigator

2.1.2 Jupyter Notebook

Fig 2.3: Jupyter Notebook Logo

The Jupyter Notebook is a web-based interactive computational environment for creating
Jupyter Notebook documents. The term “notebook” can refer to several different
entities, such as the Jupyter web application, the Jupyter Python web server, or the Jupyter
document format, depending on the context. A Jupyter Notebook document is a JSON
document, following a versioned schema, that contains an ordered list of input/output cells
which can contain code, text, mathematics, plots and rich media, and usually ends with the
“.ipynb” extension. In 2014, Fernando Perez announced a spin-off project from IPython called
Project Jupyter. IPython continues to exist as a Python shell and a kernel for Jupyter, while the
notebook and other language-agnostic parts of IPython moved under the Jupyter name.
Jupyter is language agnostic and supports execution environments in several dozen
languages, among them Julia, R, Haskell, Ruby and, of course, Python.

Fig 2.4: Jupyter Notebook

2.2 Technologies Used

• Python
• Deep learning
• Libraries Used

2.2.1 Python

Python is a high-level, interpreted and dynamically typed programming language. It was
created by Guido van Rossum and first released in 1991. Python is easy to learn and
powerful. Its nature makes it well suited to scripting and rapid application development in
many areas and on most platforms. Data science and machine learning are two fields that
are growing day by day, in large part because of Python.
Python has many different libraries, such as sklearn, pandas, Matplotlib and
many more. These libraries are used in the project. The standard library contains built-in
modules that provide access to system functionality such as file I/O. Python is meant to be an
easily readable language. Its formatting is visually uncluttered, and it often uses English
keywords where other languages use punctuation. Unlike many other languages, it does not
use curly braces to delimit blocks, and semicolons after statements are
optional. It has fewer syntactic exceptions and special cases than C or Pascal.

Fig 2.5: Python Deep Learning Libraries

2.2.2 Deep Learning

Deep learning (also known as deep structured learning) is part of a broader family of machine
learning methods based on artificial neural networks with representation learning. Learning
can be supervised, semi-supervised or unsupervised.

Deep-learning architectures such as deep neural networks, deep belief networks, deep
reinforcement learning, recurrent neural networks and convolutional neural networks have
been applied to fields including computer vision, speech recognition, natural language
processing, machine translation, bioinformatics, drug design, medical image analysis,
material inspection and board game programs, where they have produced results comparable
to and in some cases surpassing human expert performance.

Deep learning drives many artificial intelligence (AI) applications and services that improve
automation, performing analytical and physical tasks without human intervention. Deep
learning technology lies behind everyday products and services (such as digital assistants,
voice-enabled TV remotes, and credit card fraud detection) as well as emerging technologies
(such as self-driving cars).

2.2.2.1 How Does Deep Learning Work?

Deep learning neural networks, or artificial neural networks, attempt to mimic the human
brain through a combination of data inputs, weights, and biases. These elements work together
to accurately recognize, classify, and describe objects within the data.

Deep neural networks consist of multiple layers of interconnected nodes, each building upon
the previous layer to refine and optimize the prediction or categorization. This progression of
computations through the network is called forward propagation. The input and output layers
of a deep neural network are called visible layers. The input layer is where the deep learning
model ingests the data for processing, and the output layer is where the final prediction or
classification is made.
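
Forward propagation can be illustrated with a minimal sketch in plain Python. The inputs, weights and biases below are hypothetical values chosen only for illustration; they are not taken from the trained model in this project.

```python
def relu(x):
    """Rectified linear unit: negative values become zero."""
    return max(0.0, x)


def identity(x):
    """No activation; used here for the output layer of the toy network."""
    return x


def dense_forward(inputs, weights, biases, activation):
    """Compute weighted sum + bias for every neuron, then apply the activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(weighted_sum))
    return outputs


# Hypothetical toy network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
inputs = [1.0, 2.0]
hidden_weights = [[0.5, -1.0], [1.0, 1.0]]
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, 2.0]]
output_biases = [0.5]

# Each layer builds on the output of the previous one (forward propagation).
hidden = dense_forward(inputs, hidden_weights, hidden_biases, relu)
output = dense_forward(hidden, output_weights, output_biases, identity)

print(hidden)  # [0.0, 2.0]
print(output)  # [4.5]
```

The first hidden neuron computes a negative weighted sum, so ReLU sets it to zero, and only the second hidden neuron contributes to the output.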

2.2.2.2 Types of Deep Neural Network

The description above covers the simplest type of deep neural network in the simplest terms.
However, deep learning algorithms are incredibly complex, and there are different types of
neural networks to address specific problems or datasets. For example:

• Convolutional neural networks (CNNs), used primarily in computer vision and
image classification applications, can detect features and patterns within an image,
enabling tasks like object detection or recognition. In 2015, a CNN bested a human
in an object recognition challenge for the first time.
• Recurrent neural networks (RNNs) are typically used in natural language and speech
recognition applications, as they leverage sequential or time series data.

We used a convolutional neural network (CNN):

Deep learning has proved to be a very powerful tool because of its ability to handle large
amounts of data. Interest in using hidden layers has surpassed traditional techniques,
especially in pattern recognition. One of the most popular types of deep neural network is
the convolutional neural network.

A convolutional neural network (CNN/ConvNet) is a class of deep neural networks most
commonly applied to analyse visual imagery. It uses a special technique called convolution.
In mathematics, convolution is an operation on two functions that produces
a third function expressing how the shape of one is modified by the other.
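
As a rough illustration of how a filter is applied to an image, the sketch below slides a 3 X 3 filter over a small greyscale image in plain Python. The image and the filter values are hypothetical. Note that, like most deep learning libraries, the sketch does not flip the filter, so strictly speaking it computes a cross-correlation, which is what CNN layers usually call convolution.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over the image (no padding, stride 1)."""
    k = len(kernel)
    out_height = len(image) - k + 1
    out_width = len(image[0]) - k + 1
    output = []
    for i in range(out_height):
        row = []
        for j in range(out_width):
            # Multiply the kernel with the image patch under it and sum up.
            total = 0
            for a in range(k):
                for b in range(k):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4 x 6 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Hypothetical 3 x 3 filter that responds to vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
# [0, 3, 3, 0]
# [0, 3, 3, 0]
```

The feature map is non-zero only where the filter overlaps the boundary between the dark and bright regions, which is how a filter "detects" an edge.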

Convolutional neural networks are composed of multiple layers of artificial neurons. Artificial
neurons, a rough imitation of their biological counterparts, are mathematical functions that
calculate the weighted sum of multiple inputs and output an activation value. When you input
an image into a ConvNet, each layer generates several activation maps that are passed on to
the next layer.

The first layer usually extracts basic features such as horizontal or diagonal edges. This output
is passed on to the next layer which detects more complex features such as corners or
combinational edges. As we move deeper into the network it can identify even more complex
features such as objects, faces, etc. Based on the activation map of the final convolution layer,
the classification layer outputs a set of confidence scores (values between 0 and 1) that specify
how likely the image is to belong to a “class.”

2.2.3 Libraries Used

Libraries in programming languages are collections of pre-compiled and non-volatile
routines used by programs. These routines, sometimes called modules, can include
configuration data, documentation, message templates, etc.

The libraries we used in this project are listed below-

• Pandas: pandas is a fast, powerful, flexible and easy-to-use open-source data analysis
and manipulation tool, built for the Python programming language. It helps with default
and customized indexing of dataframes, handling missing data, reshaping and pivoting
data sets, slicing and subsetting large datasets, and manipulating and merging
datasets.

• NumPy: NumPy is a Python library for working with arrays. It provides
sophisticated, high-performance multi-dimensional arrays, along with tools for
evaluating and manipulating these arrays. The library is fast and easy to work with, and it
helps users with array computation.

• Matplotlib: Matplotlib is a plotting library for the Python programming language and
its numerical mathematics extension NumPy. It provides an object-oriented API for
embedding plots into applications using general-purpose GUI toolkits.

• Keras: Keras is an API designed for human beings, not machines. Keras follows best
practices for reducing cognitive load: it offers consistent and simple APIs, it minimizes
the number of user actions required for common use cases, and it provides clear and
actionable error messages. It also has extensive documentation and developer guides.
Keras is a powerful and easy-to-use free open-source Python library for developing and
evaluating deep learning models. It wraps the efficient numerical computation libraries
Theano and TensorFlow and allows you to define and train neural network models in
just a few lines of code.

• TensorFlow: TensorFlow is a framework created by Google for creating deep learning
models. It is an open-source artificial intelligence library that uses data flow graphs to build
models. It allows developers to create large-scale neural networks with many layers.
TensorFlow is mainly used for classification, perception, understanding,
discovering, prediction and creation.

CHAPTER 3

IMPLEMENTATION
This chapter explains the implementation of the project in detail. The steps involved in the
project were outlined in Section 1.4; here we follow those steps in order.

Dataset

Datasets for fresh and rotten fruit classification are easily available on dataset repositories
such as Kaggle and UCI. For our project we took the “Fruits fresh and rotten for
classification” dataset available on Kaggle.

The dataset consists of three fruits (apple, orange and banana), and each fruit has two classes,
fresh and rotten, so in all the dataset has 6 classes.

Fig 3.1: Fresh and rotten fruits available in dataset

The dataset is already divided into a training set and a test set. The training set consists of
a total of 10,901 images: 1,693 images of fresh apples, 1,581 images of fresh bananas, 1,466
images of fresh oranges, 2,342 images of rotten apples, 2,224 images of rotten bananas and
1,595 images of rotten oranges.

Fig 3.2: Training Dataset

The test set consists of 2,698 images: 395 images of fresh apples, 381 images of
fresh bananas, 388 images of fresh oranges, 601 images of rotten apples, 530 images
of rotten bananas and 403 images of rotten oranges.

Fig 3.3: Test Dataset

3.1 Importing Libraries

The first step in the implementation process is to import all the libraries (discussed in Chapter
2). We also imported the modules and layers that the code needs to run.

Fig 3.4: Libraries used

3.2 Initializing CNN

After importing the libraries, we initialize the Sequential model.

Sequential is the easiest way to build a model in Keras. It allows you to build a model layer by
layer.

Here we initialized a “classifier” object of the Sequential class.

Fig 3.5: Initializing CNN

3.3 Adding Layers

After initialization, we add the layers one by one using the add() method of the Sequential
class. The add() method is used to add every layer, including the convolutional and
max-pooling layers.

First, we add a convolutional layer. We call add() and pass the convolutional layer as its
argument. The convolutional layer takes several arguments, described below:

o Filters- Filters detect spatial patterns such as edges in an image by detecting changes
in the intensity values of the image. In this project we use 32 filters
of size 3 X 3.
o Input_shape- This parameter decides the size of the images that are
fed into the neural network.
Syntax ~ input_shape=(height, width, channels)
We have to specify the height and width of the image in input_shape. In our implementation
we took the height and width as 64 and the number of channels as 3 (red, green, blue), which
means our images are colour images; for greyscale images the number of channels is 1.

o Activation- The activation function is an important parameter. The relationship between
the pixels of an image and its class is not linear, so a CNN needs non-linearity. The
activation function introduces this non-linearity. In this project we used the ReLU
function, which worked best for our project.

After adding the convolutional layer, we add a pooling layer, which is max pooling.

Max pooling is a pooling operation that selects the maximum element from the region of the
feature map covered by the filter. In max pooling we just need to specify the size of the pooling
matrix. The conventional size is 2 X 2, which we use here. The main aim of using max
pooling is to reduce the size of the feature maps, which lowers the amount of computation and
the risk of overfitting.

In our project we used two sequences of convolution and max pooling. The procedure is the same,
but in the second convolutional layer we do not have to specify the input_shape.

Fig 3.6: Convolution and Max pooling layers
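
To make the max pooling operation concrete, here is a minimal sketch in plain Python of 2 X 2 max pooling on a small feature map. The feature map values are hypothetical; in the project itself this operation is performed by the Keras max-pooling layer.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value from each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4 x 4 feature map produced by a convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

A 4 X 4 map becomes a 2 X 2 map, so each pooling step halves the height and width while keeping the strongest response in every window.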

3.4 Flatten the layers

Feature extraction and non-linearity are covered above. Now we convert the 2D
feature maps into a 1D vector using a flatten layer. The flatten layer does not
need any arguments; we simply use the Flatten function.

Fig 3.7: Flatten layer
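
The arithmetic below sketches how the image size shrinks through the two convolution and max-pooling steps and how long the flattened vector becomes. It assumes the Keras defaults (no padding, stride 1 for convolution, non-overlapping 2 X 2 pooling) and assumes that the second convolutional layer also has 32 filters, which the report does not state.

```python
def conv_output_size(size, kernel=3):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1


def pool_output_size(size, pool=2):
    """Output width/height of non-overlapping pooling (remainder is dropped)."""
    return size // pool


def flatten(feature_maps):
    """Turn a height x width x channels nested list into one flat list."""
    return [value for row in feature_maps for pixel in row for value in pixel]


# Trace the spatial size of a 64 x 64 input through the feature extractor.
sizes = [64]
sizes.append(conv_output_size(sizes[-1]))  # first 3 x 3 convolution
sizes.append(pool_output_size(sizes[-1]))  # first 2 x 2 max pooling
sizes.append(conv_output_size(sizes[-1]))  # second 3 x 3 convolution
sizes.append(pool_output_size(sizes[-1]))  # second 2 x 2 max pooling
print(sizes)  # [64, 62, 31, 29, 14]

filters_in_last_conv = 32  # assumption: not stated in the report
flattened_length = sizes[-1] * sizes[-1] * filters_in_last_conv
print(flattened_length)  # 6272

# Flattening a tiny 2 x 2 x 2 example.
tiny = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
flat = flatten(tiny)
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Under these assumptions, the flatten layer hands a vector of 6,272 values to the first dense layer.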

3.5 Adding Dense Layers

After the above steps, we pass the flattened data to an artificial neural network, and for
this we use dense layers.

The dense layer is a fully connected neural network layer, which means each
neuron in the dense layer receives input from all neurons of the previous layer. Dense
layers are used to classify the image based on the output of the convolutional layers.
To add dense layers we call the add() method and pass it a Dense layer, which has the
following parameters.
o Units- The number of neurons in the layer. In this project we used
6 dense layers with different numbers of neurons.
o Activation- We used “relu” as the activation function, which is well suited to dense
(hidden) layers.

The last dense layer is the output layer, where units is equal to six because we have six
classes. The activation function used here is “softmax” because this is a categorical
classification problem, and softmax turns the outputs into one probability per class.

Fig 3.8: Dense layers
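
The softmax function used in the output layer can be sketched in plain Python as follows. The six raw scores and the order of the class names are hypothetical and serve only to show how scores become probabilities.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the largest score keeps the exponentials numerically stable.
    exponentials = [math.exp(score - largest) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


# Hypothetical class order and raw outputs of the last dense layer.
class_names = [
    "fresh apple", "fresh banana", "fresh orange",
    "rotten apple", "rotten banana", "rotten orange",
]
scores = [2.0, 0.5, 0.1, 3.5, 0.2, 0.3]

probabilities = softmax(scores)
predicted = class_names[probabilities.index(max(probabilities))]

print([round(p, 3) for p in probabilities])
print(predicted)  # rotten apple
```

The class with the highest score also receives the highest probability, so the predicted label is simply the position of the largest value.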

3.6 Compiling CNN

After adding all the layers, we compile the model as a single piece so that all the layers
above work together. For this we use the compile method of the Sequential class. In this
method we pass several parameters, listed below:

o Optimizers- They are used to optimize the model, that is, to make training efficient.
Here we use the “adam” optimizer because it has an adaptive learning rate: during
training it uses the gradients of the loss function to adapt the step size for each
weight automatically, so that accuracy increases gradually. With adam we do not have
to tune the learning rate manually; a default initial value is used, and the effective
step sizes adapt as training progresses.

o Loss- The loss function is passed during the compile stage. It is
one of the important components of a neural network. Loss is the
prediction error of the neural network, and the method used to calculate it is
called the loss function.

❖ Types of loss function available in Keras/TensorFlow-

➢ For binary classification the loss function used is binary
cross-entropy. It calculates the cross-entropy loss between
the predicted classes and the true classes.
➢ For multiclass classification the loss function used is
categorical cross-entropy. It also computes the cross-entropy
loss between the true classes and the predicted classes.

We used categorical cross-entropy as the loss function because our project deals with
multi-class classification. It measures the error between the predictions and the actual
labels and passes that information into training so that the weights and biases can be
updated and our training accuracy increases.

o Metrics- These are the performance metrics used to evaluate the model during
training, for example accuracy or loss. Here
we evaluate on the basis of accuracy.
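
As an illustration of the loss described above, the following plain-Python sketch computes categorical cross-entropy for one image. The predicted probabilities are hypothetical; the small constant added inside the logarithm is only there to avoid taking the logarithm of zero.

```python
import math


def categorical_cross_entropy(true_one_hot, predicted_probs):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    epsilon = 1e-12  # guards against log(0)
    return -sum(
        t * math.log(p + epsilon)
        for t, p in zip(true_one_hot, predicted_probs)
    )


# The true class is the fourth of six classes (one-hot encoded).
true_label = [0, 0, 0, 1, 0, 0]

# Hypothetical predictions: one confident and correct, one unsure.
confident = [0.01, 0.01, 0.01, 0.95, 0.01, 0.01]
unsure = [0.20, 0.20, 0.20, 0.10, 0.20, 0.10]

confident_loss = categorical_cross_entropy(true_label, confident)
unsure_loss = categorical_cross_entropy(true_label, unsure)

print(round(confident_loss, 4))  # 0.0513
print(round(unsure_loss, 4))     # 2.3026
```

The loss is small when the model assigns a high probability to the true class and grows quickly as that probability falls, which is the signal used to update the weights and biases.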

Fig 3.9: Compiling CNN
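
Putting Sections 3.2 to 3.6 together, the model could be built roughly as in the sketch below. This is not the exact code shown in the figures: it assumes TensorFlow 2.x is installed, and the number of filters in the second convolutional layer and the number of neurons in the hidden dense layers are assumptions, since the text does not list them.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# 3.2 Initialize the Sequential model.
classifier = Sequential()

# 3.3 Two convolution + max-pooling sequences (32 filters of 3 x 3, ReLU).
classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation="relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Conv2D(32, (3, 3), activation="relu"))  # 32 filters is an assumption
classifier.add(MaxPooling2D(pool_size=(2, 2)))

# 3.4 Flatten the feature maps into a 1D vector.
classifier.add(Flatten())

# 3.5 Six dense layers in total: five hidden layers plus the output layer.
# The hidden layer widths below are hypothetical.
for units in (256, 128, 64, 32, 16):
    classifier.add(Dense(units=units, activation="relu"))
classifier.add(Dense(units=6, activation="softmax"))  # one unit per class

# 3.6 Compile with the adam optimizer, categorical cross-entropy and accuracy.
classifier.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

classifier.summary()
```

Calling summary() prints the output shape and parameter count of each layer, which is a quick way to check that the layers were added in the intended order.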

3.7 Data Extraction and Preprocessing

The model compilation part is now done. In our CNN workflow we do the steps in reverse
order: first we created the layers and compiled them, and now we pre-process
the images present in the dataset.
For this we use the preprocessing.image module provided by TensorFlow and Keras, which
contains the class ImageDataGenerator for image pre-processing. We need to
create one object of this class for the training dataset and one for the testing dataset.
ImageDataGenerator has the following parameters:
o Rescale- It is used to rescale the pixel values into the range 0-1.
o Shear Range- It is used to apply a random shear (a slanting distortion, not a shift) to
the images.
o Zoom Range- It is used to zoom the images randomly by up to a certain percentage so
that features can be extracted properly.
o Horizontal flip- It is used to flip the images horizontally at random so that the position
of objects in the images changes and training does not become biased. We do not want
the model to learn that a feature depends on position.
o For the test dataset we only rescale the images, because testing should use the
original images.

Fig 3.10: Image Pre-processing
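
Two of these operations, rescaling and horizontal flipping, are simple enough to sketch in plain Python. The sketch assumes 8-bit pixel values (0 to 255), so dividing by 255 maps them into the range 0-1; the tiny image is hypothetical, and in the project these steps are performed by ImageDataGenerator.

```python
def rescale(image, maximum=255.0):
    """Map pixel values from 0..maximum into the range 0..1."""
    return [[pixel / maximum for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


# Hypothetical 2 x 3 greyscale image with 8-bit pixel values.
image = [
    [0, 51, 255],
    [102, 204, 153],
]

scaled = rescale(image)
flipped = horizontal_flip(image)

print(scaled)   # [[0.0, 0.2, 1.0], [0.4, 0.8, 0.6]]
print(flipped)  # [[255, 51, 0], [153, 204, 102]]
```

Rescaling is applied to both training and test images, whereas flipping is an augmentation that only makes sense for the training images.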

After pre-processing the images, we use another method of the ImageDataGenerator class,
flow_from_directory. This method is used when we do not want our labels to be
written separately in other files. With flow_from_directory we do not have
to specify the labels: it extracts the label from the name of each folder and assigns this
label to the images present in that folder. We just need to pass the following parameters:

o Path of training data folder- If the training data is in the same folder as the code, the
folder name is enough; a relative or absolute path can also be passed.
o Target size- The same as the input size we passed to the layers.
o Batch size- The number of images in each batch; we used a batch size of 11.
o Class mode- Depends on whether the problem is multi-class or binary classification;
here we used categorical because we have a multi-class classification
problem.

We follow the same procedure for the test dataset.

Fig 3.11: Data Pre-processing
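
The way labels are taken from folder names can be illustrated with a short standard-library sketch. It mimics the idea only and is not the Keras implementation; the helper labels_from_folders, the folder names and the file names are hypothetical. Class indices are assigned in sorted folder-name order.

```python
import tempfile
from pathlib import Path

def labels_from_folders(root):
    """Map each image file under root to the index of its parent folder."""
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_indices = {name: index for index, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for image_path in sorted((root / name).glob("*.png")):
            samples.append((image_path.name, class_indices[name]))
    return class_indices, samples

# Build a tiny hypothetical dataset: one folder per class, empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for folder, files in {"freshapples": ["a1.png", "a2.png"],
                          "rottenapples": ["r1.png"]}.items():
        (Path(tmp) / folder).mkdir()
        for file_name in files:
            (Path(tmp) / folder / file_name).touch()
    class_indices, samples = labels_from_folders(tmp)

print(class_indices)
print(samples)
```

No separate label file is needed: placing an image in a folder is what labels it.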

3.8 Model Fitting and Evaluation

The training and testing images are now ready, so the final step is to fit the model
on them. We use the fit method and pass the following
parameters-
o Training set- Pass the training set, on which the CNN is trained, as the first
parameter.
o Number of images- Pass the number of images in the training
dataset, which is 10,901.
o Batch size- The batch size is chosen so that it divides the training set
exactly and every image in the training set is used for training. Here
10,901 = 11 × 991, so the batch size is set to 11.

o Epochs- The number of epochs is the number of complete passes the model
makes over the entire training dataset. Accuracy typically increases as
the epochs progress.
o Validation data- The test dataset is used as the validation data.
o Validation steps- Equal to the number of images in the test set divided
by the batch size.

Fig 3.12: Fitting the model

Fig 3.13: Epochs

Training runs for 20 epochs, so the model iterates over the data 20 times and reports the
training accuracy and loss and the validation accuracy and loss for each epoch.
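
The arithmetic behind these settings can be checked with a few lines of Python. The image count, batch size and number of epochs are the values stated above; the helper steps_for is a hypothetical name introduced for this sketch.

```python
def steps_for(num_images, batch_size):
    """Batches needed to see every image once; fails if the split is uneven."""
    steps, remainder = divmod(num_images, batch_size)
    if remainder:
        raise ValueError(f"{batch_size} does not divide {num_images} evenly")
    return steps

TRAIN_IMAGES = 10_901   # from the report
BATCH_SIZE = 11         # from the report
EPOCHS = 20             # from the report

steps_per_epoch = steps_for(TRAIN_IMAGES, BATCH_SIZE)
total_batches = steps_per_epoch * EPOCHS
print(steps_per_epoch, total_batches)  # 991 19820
```

The same helper applied to the number of test images gives the validation steps.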

To evaluate the model, we use the evaluate function, passing the test dataset and setting
verbose to zero.

Fig 3.14: Model Evaluation

On evaluating the test set we obtained a loss of 0.098 and an accuracy of 97%, which
indicates that the model performs well on this dataset.

CHAPTER 4

RESULTS

4.1 Model Predictions

With the model trained, we now check whether it correctly predicts if a fruit is fresh or
rotten.

4.1.1 Steps for predictions:


o Load the image using the load_img function provided by TensorFlow.
o Next, convert the image using img_to_array, which converts a PIL
Image instance to a NumPy array.
o After converting the image to a NumPy array, manually scale the
pixel data by dividing by 255.
o Next, convert the resulting array into a NumPy array.
o After that, expand the shape of the image by adding a new axis for the batch dimension.
o After all the pre-processing, predict the result using
predict_classes.
o As the final step, pass the predicted image to the display function we created (Fig
3.13), which shows the final outcome.
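
The array-handling steps above can be sketched with NumPy as follows. The class names, their order, the 64x64 image size and the probabilities are assumptions made for illustration; in practice the probabilities come from the model. Taking the argmax of the predicted probabilities gives the same result as predict_classes, which is no longer available on Sequential models in recent TensorFlow releases.

```python
import numpy as np

# Assumed class order; the real order comes from the training generator.
CLASS_NAMES = ["freshapples", "freshbanana", "freshoranges",
               "rottenapples", "rottenbanana", "rottenoranges"]

def prepare(image_array):
    """Scale 0-255 pixels to 0-1 and add a leading batch axis."""
    scaled = np.asarray(image_array, dtype="float32") / 255.0
    return np.expand_dims(scaled, axis=0)

def decode(probabilities):
    """Return the class name with the highest predicted probability."""
    return CLASS_NAMES[int(np.argmax(probabilities, axis=-1)[0])]

# Hypothetical 64x64 RGB image filled with mid-grey pixels.
batch = prepare(np.full((64, 64, 3), 128, dtype=np.uint8))

# Stand-in for model.predict(batch): one row of made-up class probabilities.
fake_probabilities = np.array([[0.02, 0.01, 0.02, 0.90, 0.03, 0.02]])
print(batch.shape, decode(fake_probabilities))
```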

Fig 4.1: Rotten Apple Predictions

Fig 4.2: Fresh Orange Predictions

Fig 4.3: Rotten Banana Predictions

Fig 4.4: Fruit classification

4.2 Graphs
The following graphs show the training and validation loss and accuracy. The loss
decreases for both training and validation, while the accuracy
increases for both.

Fig 4.5: Training and validation loss

The training loss indicates how well the model is fitting the training data, while the validation loss
indicates how well the model fits new data. From the above graph we can observe that the loss decreases
at every epoch for both training and validation.

Fig 4.6: Training and validation accuracy.

The above graph shows the accuracy on the training and validation (test) datasets. Both training and
validation accuracy increase gradually, and the curves give no indication of overfitting or
underfitting.
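
The same reading of the curves can be expressed as two simple checks on the per-epoch history. The sketch below assumes a dictionary shaped like the history that Keras returns from fit when accuracy is the metric; the numbers are hypothetical and are not the values from our run.

```python
def generalization_gap(history):
    """Final-epoch training accuracy minus validation accuracy."""
    return history["accuracy"][-1] - history["val_accuracy"][-1]

def is_decreasing(values):
    """True if every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))

# Hypothetical per-epoch values; NOT the numbers from the report's run.
history = {
    "loss": [0.90, 0.45, 0.25, 0.15],
    "val_loss": [0.80, 0.50, 0.30, 0.20],
    "accuracy": [0.65, 0.84, 0.92, 0.96],
    "val_accuracy": [0.70, 0.82, 0.90, 0.95],
}
gap = generalization_gap(history)
print(round(gap, 2), is_decreasing(history["loss"]), is_decreasing(history["val_loss"]))
```

A small gap together with a falling validation loss is what the absence of overfitting looks like; a large gap, or a validation loss that starts rising while the training loss keeps falling, would suggest overfitting.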

4.3 Saving the model


After prediction and evaluation, we save the model using the save function,
passing the model name with the .h5 extension.

Fig 4.7: saving the model

CHAPTER 5

ARCHITECTURE
5.1 Architecture Flow

Fig 5.1 shows the architecture of the model used in our experiment. We first add a
convolutional layer followed by a pooling layer, then five hidden layers, and finally the
output layer.

Fig 5.1 Architecture
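
A layer stack of this shape could be written in Keras roughly as follows. This is a sketch under stated assumptions, not the exact model from our experiment: the input size, filter count, unit counts, optimizer and the number of classes (six) are illustrative choices, and the hidden layers are assumed to be fully connected.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 6             # assumption: fresh/rotten x apple, banana, orange
INPUT_SHAPE = (64, 64, 3)   # assumption: illustrative input size

model = tf.keras.Sequential([
    tf.keras.Input(shape=INPUT_SHAPE),
    layers.Conv2D(32, kernel_size=3, activation="relu"),   # convolutional layer
    layers.MaxPooling2D(pool_size=2),                      # pooling layer
    layers.Flatten(),
    # Five hidden (fully connected) layers; the unit counts are illustrative.
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),       # output layer
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```

The softmax output has one unit per class, which is what the categorical class mode and categorical cross-entropy loss expect.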

CHAPTER 6

Conclusion & Future Scope


6.1 Conclusion

The classification of fresh and rotten fruits is very important in agriculture. In our
work, we introduced a model based on CNN and concentrated on building transfer learning
models for the task of classifying fresh and rotten fruits. The results show that the
proposed CNN model can classify fresh and rotten fruits reliably and with high accuracy.
The proposed convolutional neural network model can thus automate a classification task
that is otherwise done by humans, and so reduce human error when classifying fresh and
rotten fruits. The proposed CNN model attains an accuracy of
97.29%.

6.2 Future Scope

Future work includes extending the classification to more varieties of fruit, so that
every fruit farmer can use the system. The proposed work is most
useful to fruit farmers for classifying fresh and rotten fruits in their yield, so that
they can get a better price at market.

REFERENCES
[1] https://www.kaggle.com/sriramr/fruits-fresh-and-rotten-for-classification

[2] https://www.iieta.org/journals/ria/paper/10.18280/ria.340512

[3] https://towardsdatascience.com/understanding-cnn-convolutional-neural-network-69fd626ee7d4

[4] https://www.researchgate.net/publication/347299542_Fresh_and_Rotten_Fruits_Classification_Using_CNN_and_Transfer_Learning
