Proceedings of 2020 IEEE Applied Signal Processing Conference (ASPCON)

A novel approach to detect and classify fruits using ShuffleNet V2

Sourodip Ghosh, Md. Jashim Mondal, Sourish Sen, Soham Chatterjee, Nilanjan Kar Roy, and Suprava Patnaik
Dept. of Electronics Engineering, KIIT University, Bhubaneshwar, India
sourodip.ghosh02@gmail.com, jashimmondal007@gmail.com, sourishsen7@gmail.com, 1704302@kiit.ac.in, nilan3107@gmail.com, suprava.patnaikfet@kiit.ac.in

Abstract—In the proposed context, we present an identification and classification approach for fruits across 41 unique classes. We utilize a pre-trained Convolutional Neural Network architecture, ShuffleNet V2, chosen for its efficiency in building convolutional blocks at low cost while using more feature channels. The model, when tested on the proposed dataset, achieved a test accuracy of 96.24%, thereby taking a step beyond the research of past authors on fruit detection via convolutional learning and feature re-usability. The outcomes are assessed using several evaluation parameters, such as Precision, Sensitivity, F-Score, and ROC score. Moreover, a visualization of the predicted images was generated to support the evaluation.

Index Terms—Convolutional Neural Networks, Fruit classification, ShuffleNet V2

I. INTRODUCTION

Images are the most fundamental means by which humans visually group fruits. Factors influencing fruit quality are difficult and costly to measure visually, and are easily affected by physical variables, leading to inconsistent assessment and subjective outcomes. Traditionally, fruit examiners have performed quality assessment through experience and observation. This strategy is fundamentally inconsistent and unreliable, and decisions are seldom equivalent among examiners. Under such conditions, examining fruits against several prospective criteria is a laborious task; machine vision frameworks are therefore well suited to fruit assessment. Computer-aided procedures for recognizing a fruit depend on four essential features that characterize the item: intensity, color, shape, and texture. Arivazhagan et al. [1] propose an effective combination of color and texture features for fruit recognition. Recognition is performed by a minimum-distance classifier based on statistical and co-occurrence features obtained from wavelet-transformed sub-bands. Zawbaa et al. [2] propose a fruit detection framework to group and distinguish between classes of fruits. In this paper, we propose a classification analysis of fruits using ShuffleNet V2, a light-weight CNN architecture, and evaluate the performance with feature assessments and evaluation metrics.

II. LITERATURE SURVEY

In this section, we review past work on fruit recognition using deep learning. Bargoti et al. [3] use a framework to recognize fruits in an orchard. They employ the Faster R-CNN framework for detection in orchards, including apples, almonds, and mangoes, and achieve an F1-score greater than 0.9 for apples and mangoes. The Faster Region-based convolutional network is explained in detail by Ren et al. [4]. Puttemans et al. [5] worked on methods for automated harvesting of fruits, with a technique for recognizing ripe strawberries and apples in plantations. Barth et al. [6] describe a technique based on image synthesis: they created a synthetic dataset using empirical measurements and 3D plant models, and trained deep learning models on these synthetic datasets. This was the first synthetic approach of its kind, advancing deep learning with 3D synthetic models. An ongoing survey of the field shows that Convolutional Neural Networks have made striking progress in image representation and object localization. CNNs have been extensively applied to pattern recognition problems such as image classification and object detection [7] [8] [9]. A Faster Region-based CNN approach has been utilized for customized fruit detection and recognition from images. The system is trained using RGB and NIR (near infra-red) images, and by combining these images the framework achieves better performance [10]. Sünderhauf et al. [11], in 2014, presented a CNN model to extract features for plant classification, with a score of 0.249 in the LifeCLEF 2014 campaign. Naik et al. [12] highlighted the excess time and

ISBN: 978-1-7281-6882-1 PART: CFP20P52-ART
Authorized licensed use limited to: KIIT University. Downloaded on February 10,2022 at 03:48:35 UTC from IEEE Xplore. Restrictions apply.

energy loss involved in controlling fruit quality, and therefore discuss methods such as SURF, HOG, and LBP, along with machine learning algorithms such as SVM, ANN, and KNN, and a few CNN architectures. The review contains a detailed comparison, in the form of a table, summarizing the best classifiers for fruit detection up to 2017. Pre-processing and grayscale image processing were applied in order to develop the data; the accuracy achieved on the testing set was 96.3%.

III. PROPOSED METHOD

In this research, we present a fruit classification approach using ShuffleNet V2 [13]. This architecture reduces complexity by limiting the memory access cost (MAC) to a great extent; the MAC reaches its lower bound when the numbers of input and output channels are equal. The re-use of features through mapping is an advantage also found in DenseNet [14]. The highly efficient model makes building blocks simpler by using more feature channels, and the network can be scaled with ease. This permits the model to achieve a state-of-the-art trade-off between accuracy and computation speed. The light-weight CNN model is an advantage in situations where computational capacity, MAC budget, and system specifications are constrained: it allows models to run in a compact, mobile environment with a limited FLOP budget. The workflow for the proposed method is shown in Fig. 1.

Fig. 1: Workflow for this experiment

A. Data

The dataset utilized for model training is Fruits 360 [15], an extensively used dataset for fruit classification with deep learning. It contains high-quality images of 131 classes in total. We restricted the number of classes to 41 to keep within the scope of the research; the remaining classes were discarded, as a majority of them contain vegetable images. The 41 classes were also selected manually with respect to their prevalence compared to the other, rarer fruit classes. The dataset contains 29,347 images in total, organized into training and test directories by the dataset authors. The images were captured with 6 different cameras on a rotational basis, so that models can be trained with a robust feature-evaluation ability. The number of images per class is listed in Table I.

TABLE I: Number of images for the proposed method from the Fruits 360 dataset.

Class index | Class | Training images | Validation images | Test images
0 | Apple Braeburn | 492 | 75 | 89
1 | Apple Crimson Snow | 444 | 66 | 82
2 | Golden Apple-I | 480 | 74 | 86
3 | Golden Apple-II | 492 | 86 | 78
4 | Golden Apple-III | 481 | 73 | 88
5 | Granny Smith Apple | 492 | 80 | 84
6 | Apple Pink Lady | 456 | 80 | 72
7 | Red Apple-I | 492 | 83 | 81
8 | Red Apple-II | 492 | 88 | 76
9 | Red Apple-III | 429 | 80 | 64
10 | Red Delicious Apple | 490 | 75 | 91
11 | Red Yellow Apple I | 492 | 83 | 81
12 | Red Yellow Apple II | 672 | 107 | 112
13 | Banana | 490 | 78 | 88
14 | Banana Lady Finger | 450 | 68 | 86
15 | Banana Red | 490 | 84 | 82
16 | Cherry I | 492 | 83 | 81
17 | Cherry II | 738 | 120 | 126
18 | Cherry Rainier | 738 | 113 | 133
19 | Wax Black Cherry | 492 | 81 | 85
20 | Wax Red Cherry | 492 | 86 | 78
21 | Wax Yellow Cherry | 492 | 85 | 79
22 | Grape Blue | 984 | 157 | 171
23 | Grape Pink | 492 | 87 | 77
24 | White Grape | 490 | 85 | 81
25 | White Grape II | 490 | 74 | 92
26 | White Grape III | 492 | 89 | 75
27 | White Grape IV | 471 | 80 | 78
28 | Grapefruit Pink | 490 | 81 | 85
29 | Grapefruit White | 492 | 83 | 81
30 | Guava | 490 | 71 | 95
31 | Lychee | 490 | 85 | 81
32 | Mango | 490 | 74 | 92
33 | Mango Red | 426 | 69 | 73
34 | Orange | 479 | 78 | 82
35 | Pineapple | 490 | 96 | 70
36 | Pineapple Mini | 493 | 82 | 81
37 | Raspberry | 490 | 77 | 89
38 | Redcurrant | 492 | 90 | 74
39 | Strawberry | 492 | 76 | 88
40 | Strawberry Wedge | 738 | 118 | 128

B. System specifications

The model was trained on a single 12 GB NVIDIA Tesla K80 GPU, which provides 1.8 TFLOPS and 12 GB RAM with 4992 CUDA cores, making the overall computation about 10x faster than on a typical CPU. The training data arrived in large batches, hence the GPU was preferred over an average CPU. This makes it practical to experiment with extremely large datasets, such as the data involved in our proposed investigation.

C. Data pre-processing

The images in the various fruit classes were high-dimensional and well suited to feature assessment, which is fundamental for model building. The images were originally of size 100 x 100; they were resized to 32 x 32 to satisfy the input-layer criteria of the ShuffleNet V2 architecture. The images were converted from BGR channel encoding (as loaded by the OpenCV library) to RGB, and the pixel values were normalized to the range 0-1. The final number of images before model training is shown in Table II.

TABLE II: Number of images after data pre-processing.

              | Training Set | Testing Set | Validation Set
No. of images | 22,232       | 3,615       | 3,500
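The pre-processing steps described above (resize from 100 x 100 to 32 x 32, BGR-to-RGB conversion, normalization to 0-1) can be sketched as follows. This is a minimal illustration, not the authors' code; a plain NumPy nearest-neighbour resize stands in for `cv2.resize` so the sketch is dependency-free, and the input is assumed to be an H x W x 3 uint8 array as loaded by `cv2.imread` (BGR order):

```python
import numpy as np

def preprocess(img_bgr):
    # img_bgr: HxWx3 uint8 array in BGR channel order (as cv2.imread returns).
    # Nearest-neighbour resize to 32x32 (cv2.resize would normally be used).
    h, w = img_bgr.shape[:2]
    rows = np.arange(32) * h // 32
    cols = np.arange(32) * w // 32
    img = img_bgr[rows][:, cols]
    # BGR -> RGB by reversing the channel axis.
    img = img[..., ::-1]
    # Normalize pixel values to [0, 1].
    return img.astype(np.float32) / 255.0
```

Applied to every image, this yields the 32 x 32 x 3 float tensors fed to the input layer.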


D. Model Architecture

ShuffleNet V2, a light-weight CNN architecture, was used for model training. The rationale behind selecting this model is that it incorporates feature re-usability with few trainable parameters, which suits large datasets such as Fruits 360. This reduces model complexity and training time while retaining state-of-the-art performance on the evaluation metrics.

Parameters such as the scale factor, pooling, number of shuffle units, and bottleneck ratio were set, and the number of classes was 41. The number of hidden layers was adjusted with custom inputs using hyperparameter tuning to obtain optimum results from the model. The network was created by stacking 6 building blocks, each containing two convolutional layers. No layers were frozen during model execution. The 'ReLU' activation function was used in each activation layer, followed by a final 'softmax' layer with 41 neurons, corresponding to the number of output classes. The Adam optimizer [16] was used for error-rate reduction. The weight-update equations are given below.

Initialisation of weights:

    ρ_m ← 1,  ρ_v ← 1,  m ← 0,  v ← 0                                  (1)

Update rules for the Adam optimiser:

    ρ_m ← β_m ρ_m                                                      (2)
    ρ_v ← β_v ρ_v                                                      (3)
    m ← β_m m + (1 − β_m) ∇_w J                                        (4)
    v ← β_v v + (1 − β_v) (∇_w J ⊙ ∇_w J)                              (5)
    w ← w − α (m / (√v + ε)) · (√(1 − ρ_v) / (1 − ρ_m))                (6)

where m and v represent the first and second moment vectors, respectively. Similarly, β_m and β_v represent the exponential decay rates for the first and second moment vectors, respectively. ρ_m and ρ_v specify the adaptive-learning-rate time-decay factors; they are similar to momentum and relate to the memory of prior weight updates. α in Eq. 6 represents the learning rate or step size, and ∇_w J represents the gradient of the cost function J. ε in Eq. 6 is a small value that prevents division by zero. Eq. 1 gives the initial values before the updates, and Eq. 2 to Eq. 5 describe how the moments and decay factors change; the final change in the weight parameter is shown in Eq. 6. In Eq. 5, ⊙ refers to element-wise multiplication, and in Eq. 6 the operations under the root are likewise handled element-wise.

The Adam optimizer is a blend of RMSProp and AdaGrad with moments; changing the first moving average to Nesterov-accelerated momentum yields a variant that converges toward the global minimum in even less training time. On this particular dataset, Adam outperformed all other optimizers with its updated weights.

The model was trained for 30 epochs. The training and validation accuracy and loss are shown in Fig. 2.

Fig. 2: Accuracy and loss curves for training and validation of ShuffleNet V2

E. Result Analysis

The ShuffleNet V2 model was evaluated on the test set. The model correctly classifies 3,479 out of 3,615 images in the test set; the test accuracy achieved is therefore 96.24%. The confusion matrix shows the class-wise detection performance of the model, as given in Fig. 3.

Fig. 3: Confusion matrix for ShuffleNet V2

Table III contains the class-wise Precision and Recall values on the test data. All classes had a high chance of being predicted correctly, with little interference. The Receiver Operating Characteristic (ROC) score of the model was calculated to be 99.64%, suggesting that the model is well placed to locate the best threshold while categorizing classes based on their feature differences.
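The Adam update rules (Eqs. 1-6) can be written out as a single NumPy step. This is a minimal sketch for illustration, not the training code used in the paper; the defaults β_m = 0.9, β_v = 0.999, ε = 1e-8 assumed here are the common Adam settings:

```python
import numpy as np

def adam_step(w, grad, state, alpha=0.001, beta_m=0.9, beta_v=0.999, eps=1e-8):
    """One Adam update following Eqs. 2-6; `state` holds (rho_m, rho_v, m, v)."""
    rho_m, rho_v, m, v = state
    rho_m *= beta_m                               # Eq. 2
    rho_v *= beta_v                               # Eq. 3
    m = beta_m * m + (1 - beta_m) * grad          # Eq. 4
    v = beta_v * v + (1 - beta_v) * grad * grad   # Eq. 5 (element-wise square)
    # Eq. 6: bias-corrected, element-wise update.
    w = w - alpha * (m / (np.sqrt(v) + eps)) * (np.sqrt(1 - rho_v) / (1 - rho_m))
    return w, (rho_m, rho_v, m, v)

# Initialisation as in Eq. 1.
w = np.array([1.0, -2.0])
state = (1.0, 1.0, np.zeros_like(w), np.zeros_like(w))
# Minimise J(w) = 0.5 * ||w||^2, whose gradient with respect to w is w itself.
for _ in range(100):
    w, state = adam_step(w, w, state)
```

Each step moves the weights by roughly α regardless of the gradient scale, which is the adaptive behaviour the text attributes to the RMSProp/AdaGrad blend.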

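Per-class Precision and Sensitivity (recall) of the kind reported below can be computed directly from a confusion matrix. A minimal NumPy sketch with made-up labels (not the paper's test data; scikit-learn's `classification_report` would give equivalent numbers):

```python
import numpy as np

def per_class_metrics(y_true, y_pred, n_classes):
    """Confusion matrix plus per-class precision, sensitivity (recall) and F-score."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                    # rows: true class, columns: predicted class
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)   # TP / (TP + FP)
    recall = tp / np.maximum(cm.sum(axis=1), 1)      # TP / (TP + FN), i.e. sensitivity
    f_score = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return cm, precision, recall, f_score

# Toy example with 3 classes (illustrative labels only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
cm, prec, rec, f1 = per_class_metrics(y_true, y_pred, 3)
```

Averaging the per-class precision and recall over all 41 classes yields the macro-averaged scores quoted in the text.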

TABLE III: Precision and Sensitivity scores of the ShuffleNet V2 model.

Class index | Class | Precision | Sensitivity
0 | Apple Braeburn | 1.00 | 1.00
1 | Apple Crimson Snow | 1.00 | 0.93
2 | Golden Apple-I | 1.00 | 1.00
3 | Golden Apple-II | 1.00 | 1.00
4 | Golden Apple-III | 1.00 | 1.00
5 | Granny Smith Apple | 1.00 | 1.00
6 | Apple Pink Lady | 0.92 | 1.00
7 | Red Apple-I | 1.00 | 1.00
8 | Red Apple-II | 1.00 | 0.91
9 | Red Apple-III | 1.00 | 1.00
10 | Red Delicious Apple | 1.00 | 1.00
11 | Red Yellow Apple I | 1.00 | 1.00
12 | Red Yellow Apple II | 1.00 | 1.00
13 | Banana | 1.00 | 1.00
14 | Banana Lady Finger | 1.00 | 1.00
15 | Banana Red | 0.90 | 1.00
16 | Cherry I | 1.00 | 1.00
17 | Cherry II | 0.98 | 1.00
18 | Cherry Rainier | 1.00 | 1.00
19 | Wax Black Cherry | 1.00 | 1.00
20 | Wax Red Cherry | 1.00 | 1.00
21 | Wax Yellow Cherry | 1.00 | 1.00
22 | Grape Blue | 1.00 | 1.00
23 | Grape Pink | 1.00 | 1.00
24 | White Grape | 1.00 | 1.00
25 | White Grape II | 1.00 | 1.00
26 | White Grape III | 1.00 | 1.00
27 | White Grape IV | 1.00 | 1.00
28 | Grapefruit Pink | 1.00 | 1.00
29 | Grapefruit White | 1.00 | 1.00
30 | Guava | 1.00 | 0.98
31 | Lychee | 1.00 | 0.98
32 | Mango | 0.99 | 1.00
33 | Mango Red | 1.00 | 1.00
34 | Orange | 0.93 | 1.00
35 | Pineapple | 1.00 | 1.00
36 | Pineapple Mini | 1.00 | 1.00
37 | Raspberry | 1.00 | 1.00
38 | Redcurrant | 1.00 | 1.00
39 | Strawberry | 1.00 | 1.00
40 | Strawberry Wedge | 1.00 | 0.94

The model had an average precision of 0.993 and an average sensitivity of 0.992; the average F-Score was therefore evaluated as 0.993. The low validation loss indicates a very low chance of model overfitting.

A test-prediction visualization was generated that marks labels in green if the predicted label matches the true label, and in red if it does not. Across all 41 classes, the model predicted correct labels with high probability, as demonstrated in Fig. 4. A red label in Fig. 4 indicates a mis-classification: a prediction of the Apple Pink Lady label for an image whose true class is Apple Crimson Snow. The Precision of the Apple Pink Lady class (index 6 in Table III) and the Sensitivity of the Apple Crimson Snow class (index 1 in Table III) are 0.92 and 0.93, respectively. Moreover, the features of these classes are very closely related; the model therefore had a slightly lower chance of predicting the class label of this test image compared to the rest of the classes.

Fig. 4: Prediction results (if predicted label == true label: color = green, else color = red)

CONCLUSION

Convolutional neural networks and image processing have been widely used to address the fruit classification problem. We introduce a classification and analysis approach for fruits using ShuffleNet V2, a pre-trained convolutional neural network architecture. The experiments showed high performance across several model assessment parameters. Compared to previous methods, both the accuracy and the misclassification rate are greatly improved, taking a further step toward solving the fruit classification problem.

REFERENCES

[1] S. Arivazhagan, R. N. Shebiah, S. S. Nidhyanandhan, and L. Ganesan, "Fruit recognition using color and texture features," Journal of Emerging Trends in Computing and Information Sciences, vol. 1, no. 2, pp. 90–94, 2010.
[2] H. M. Zawbaa, M. Abbass, M. Hazman, and A. E. Hassenian, "Automatic fruit image recognition system based on shape and color features," in International Conference on Advanced Machine Learning Technologies and Applications, pp. 278–290, Springer, 2014.
[3] S. Bargoti and J. Underwood, "Deep fruit detection in orchards," in 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 3626–3633, IEEE, 2017.
[4] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," in Advances in Neural Information Processing Systems, pp. 91–99, 2015.
[5] S. Puttemans, Y. Vanbrabant, L. Tits, and T. Goedemé, "Automated visual fruit detection for harvest estimation and robotic harvesting," in 2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA), pp. 1–6, IEEE, 2016.
[6] R. Barth, J. IJsselmuiden, J. Hemming, and E. J. Van Henten, "Data synthesis methods for semantic segmentation in agriculture: A Capsicum annuum dataset," Computers and Electronics in Agriculture, vol. 144, pp. 284–296, 2018.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
[8] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, 2015.
[9] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[10] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object detection via region-based fully convolutional networks," in Advances in Neural Information Processing Systems, pp. 379–387, 2016.


[11] N. Sünderhauf, C. McCool, B. Upcroft, and T. Perez, "Fine-grained plant classification using convolutional neural networks for feature extraction," in CLEF (Working Notes), pp. 756–762, 2014.
[12] S. Naik and B. Patel, "Machine vision based fruit classification and grading — a review," International Journal of Computer Applications, vol. 170, no. 9, pp. 22–34, 2017.
[13] N. Ma, X. Zhang, H.-T. Zheng, and J. Sun, "ShuffleNet V2: Practical guidelines for efficient CNN architecture design," in Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131, 2018.
[14] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708, 2017.
[15] H. Mureşan and M. Oltean, "Fruit recognition from images using deep learning," Acta Universitatis Sapientiae, Informatica, vol. 10, no. 1, pp. 26–42, 2018.
[16] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.

