
Non-Invasive Setup for Grape Maturation Classification using Deep Learning

Authors’ list
Rodrigo Pereira Ramos (corresponding author) ORCID: 0000-0002-9243-3887, Federal University of São
Francisco Valley (Univasf) – College of Electrical Engineering, Juazeiro - BA, Brazil, e-mail:
rodrigo.ramos@univasf.edu.br
Jéssica Santana Gomes, Juazeiro - BA, Brazil, e-mail: jessicasgomes.3007@gmail.com
Ricardo Menezes Prates ORCID: 0000-0002-1580-9828, Federal University of São Francisco Valley
(Univasf) – College of Electrical Engineering, Juazeiro - BA, Brazil, e-mail: ricardo.prates@univasf.edu.br
Eduardo F. Simas Filho ORCID: 0000-0001-8707-785X, Federal University of Bahia (UFBA) – Department
of Electrical Engineering, Salvador - BA, Brazil, e-mail: eduardo.simas@ufba.br
Barbara Janet Teruel Mederos ORCID: 0000-0002-5102-6716, State University of Campinas (UNICAMP)
– College of Agricultural Engineering, Campinas - SP, Brazil, e-mail: barbara.teruel@feagri.unicamp.br
Daniel dos Santos Costa ORCID: 0000-0001-7703-3183, Federal University of São Francisco Valley
(Univasf) – College of Agricultural and Environmental Engineering, Juazeiro - BA, Brazil, e-mail:
daniel.costa@univasf.edu.br

Abstract

Background
The San Francisco Valley region of Brazil is known worldwide for its fruit production and exportation, especially grapes and wines. The grapes are of high quality due not only to their excellent morphological characteristics but also to their pleasant taste. These features result from the climatic conditions of the region. In addition to the favorable climate for grape cultivation, harvesting at the right time strongly influences fruit properties.

Results
This work aims to determine the maturation stage of Syrah and Cabernet Sauvignon grapes with the aid of deep learning models. These algorithms were chosen because the techniques commonly used to find the ideal harvesting point are invasive, expensive, and slow to produce results. In this work, Convolutional Neural Networks (CNNs) were used in an image classification system in which grape images were acquired, pre-processed, and classified according to their maturation stage. Images were acquired under varying illuminants, which were considered as parameters of the classification models, as were the different post-harvesting weeks. The best models achieved maturation classification accuracies of 93.41% and 72.66% for Syrah and Cabernet Sauvignon, respectively.

Conclusions
Wine grapes were correctly classified with respect to harvesting time, with high accuracy, using computationally intelligent algorithms, corroborating the chemometric results.

Keywords
Deep learning, Grape maturation, Image processing, Post-harvest

Declarations
Funding – This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior -
Brasil (CAPES) - Finance Code 001.
Conflicts of interest/Competing interests – The authors declare that they have no conflict of interest.
Availability of data and material – The datasets generated during and/or analysed during the current study are
available from the corresponding author on reasonable request.
Code availability – Not applicable

This article has been accepted for publication and undergone full peer review but has not been
through the copyediting, typesetting, pagination and proofreading process which may lead to
differences between this version and the Version of Record. Please cite this article as doi:
10.1002/jsfa.10824
This article is protected by copyright. All rights reserved.

Introduction

The current worldwide demand for food faces challenges regarding sustainability and the production of high-quality supplies. In particular, the viticulture sector (wines and juices) has been paying close attention to quality as a way of meeting market requirements. Thus, determining the right harvesting time with good precision is crucial to obtain grapes, and consequently wines, of high quality and increased added value.

To be used for wine production, grapes must be harvested at the optimum point of ripeness.1 The optimal ripening of grapes involves different events and is usually accompanied by a change in the color of the epicarp, due to a modification in the concentration of pigments in surface tissues.2 Such color change, resulting from biochemical reactions inside the fruits, may be related to maturity.3 For example, in red grapes the still-green fruits have a higher proportion of chlorophyll, which gradually decreases with maturity while the concentration of anthocyanins and flavonoids increases.4,5,6

Monitoring of maturity and quality attributes often involves analytical techniques that are time-consuming, destructive, and sometimes require sophisticated equipment. Analyses of total soluble solids (TSS), total anthocyanins, and flavonoids are examples of such techniques. With the purpose of developing non-destructive alternatives, the use of electronic instrumentation and machine learning has been under consideration in recent years for determining grape maturation.7,8 Skin color change has been correlated with grape maturation using hyperspectral imaging and Vis-NIR spectroscopy, in comparison with physiological and biochemical events.9,10 However, such methods require very costly equipment for image and spectra acquisition.

A simpler yet powerful technique for determining the optimal harvesting time is the use of images acquired from conventional optical cameras combined with artificial intelligence systems. This can be done by combining automatic image acquisition, image processing techniques, and machine learning tools. Specifically, deep learning (DL) algorithms such as Convolutional Neural Networks (CNNs), a class of models mainly used for image classification, generate higher-level abstractions of the input data through convolutional layers.11 CNNs have been widely employed in several areas; in agriculture, however, such works are recent.

The present paper develops a framework for non-invasive grape classification based on maturation stage using deep learning algorithms. The grapes were collected in a vineyard in the São Francisco Valley (SFV) region of


Brazil. An apparatus for image acquisition was used in combination with a computational CNN system for

classification, considering standard RGB images from photos taken with different illumination conditions. Several

harvesting times were tested, and the system generated outputs related to the maturation stages of the fruits. It is

important to state that CNN algorithms rely not on image colors but on image textures, which is why color features were not taken into consideration in this article.

The present paper is structured as follows. Section Related Works explores existing methods for estimating grape

maturation levels. Section Materials and Methods details the data acquisition process as well as the convolutional neural

network architectures employed in this study. Section Results and Discussions presents the experimental analysis and

results. Finally, we conclude the article and discuss future lines of work in Section Conclusions.

Related Works

Some works have analyzed fruit attributes using machine learning and images. Zuñiga et al.12 analyzed grapevine

maturity using seed images in a pattern recognition procedure with traditional neural networks. Seed images were

acquired in a controlled environment and an invariant color model was used to segment the images and collect several

feature descriptors that were fed to the neural network. An accuracy of 86% was reported on the test set, with a total of 257 seed samples.

Seng et al.13 developed an image database of different varieties in several maturation stages. They employed some

traditional machine learning algorithms – support vector machines (SVM), k nearest neighbor (k-NN), logistic

regression, classification tree, boosted tree, and social adaptive ensemble (SAE) – to detect grape varieties using

different image color spaces, such as Red-Green-Blue (RGB), YCbCr, HSV and Lab. The best classification rates, using

SVM and Lab, yielded 84.4% and 89.1%, for white and red cultivars, respectively.

Deep learning with CNN algorithms was used to detect, segment, and track grapes in natural images taken with low-cost cameras.14 With a dataset comprising 408 grapes, the authors reached an F1-score of 0.91 in the

segmentation process, using ResNet 101 architecture for feature extraction followed by a Mask R-CNN for

segmentation. For detection tasks, three CNN networks were employed, with the best scores being reached with Mask

R-CNN. However, the authors did not consider maturation analysis in the work.


A robotic vision system based on a CNN framework named Faster R-CNN was used to estimate the ripeness of sweet peppers.15 Two approaches were employed: a multi-class task with one detector per ripeness level, and parallel layers, one for detection and another for ripeness estimation. In both cases, a VGG-16 architecture with transfer learning was used, reaching F1-scores of 0.77 and 0.72 for the parallel and multi-class approaches, respectively.

Hang et al.16 proposed a CNN model to classify apple trunk tree diseases. They generated a dataset composed of 607

images divided into three classes and used five CNN architectures for classification comparison. The best results were

reached using a VGGNet architecture with modified loss function, with which accuracy of 94.5% was obtained.

Deep learning was also used for grapevine phenotyping.17 The authors developed a sensing structure based on an Intel RGB-D sensor to automatically estimate canopy volume and to detect and count bunches using four CNN architectures. A maximum accuracy of 91.52% was reached with the VGG-19 architecture.

Other computational systems, based either on traditional classification algorithms or on deep learning techniques, have been widely employed for various agricultural and food challenges, as extensively reported in the survey by Kamilaris et al.18 However, to the best of the authors' knowledge, no approach combining convolutional neural networks with an image dataset acquired under different illuminant spectra has yet been proposed to classify grape ripeness.

Materials and Methods

This section provides detailed information on the experimental procedures and materials used in this work. Both hardware and software arrangements were made to provide reliable image acquisition in a controlled fashion and to achieve trustworthy data analysis. Figure 1 shows a flowchart summarizing the methodology used in this work. Briefly, the images were first acquired and subsequently adjusted in size and pixel-normalized. The two separate datasets were then split according to harvesting week and fed to both deep learning models (VGG-19 and the proposed one). Data were divided into training, validation, and test subsets, from which the classification performance was extracted; determination of the quality attributes total soluble solids, anthocyanins, and flavonoids was also performed.


Fig. 1 Diagram of the proposed methodology for grape maturation classification.

Image acquisition

Samples of wine grapes (Vitis vinifera L.) were collected from a vineyard located in the city of Lagoa Grande - PE (9.05363; -40.19868), in the SFV region. A total of 432 Syrah and 576 Cabernet Sauvignon berries were harvested, spread over six dates for Syrah and eight dates for Cabernet Sauvignon, with 72 berries per date. The Syrah cultivar is precocious in the climatic conditions of the SFV region, hence its smaller number of harvests. Fruit collection was carried out on a weekly basis from May to August 2017, covering various stages of fruit maturity. These stages were classified according to the harvesting week, with each week, or set of weeks, considered a different class representing a given stage. To obtain representative samples, a stratified sampling method was adopted. A middle row of the vineyard was chosen, and from it six plants at the center were selected, with two bunches marked per plant. The selected bunches were divided into three positions: top, middle, and bottom. From each position two berries were removed, making a total of six berries per bunch. These samples were taken as representative of the bunch. It should be noted that it is impracticable to measure TSS, total anthocyanins, and yellow flavonoids in an entire bunch.

The individual berries were packed in waterproof packaging and stored on ice for transportation, immediate processing, and acquisition of the spectra. Before each measurement, it was necessary to wait for the samples to stabilize at a room

temperature of 25 °C. The individual berries were submitted to the reference measurements, namely total soluble solids,

total anthocyanins and yellow flavonoids.10


Individual grape images were obtained with a Canon EOS REBEL T5i camera (ISO-100, f/5.6, 48 mm focal distance, 0.8 s exposure time) mounted on top of a box painted matte black inside. On the top of this dark chamber there were fifteen 3 W LED illuminants, equally distributed among red, green, blue, warm white, and cool white, at an angular distance of 120°, as shown in Figure 2b. For each fruit, five RGB (Red, Green, Blue) images were acquired (as shown in Figure 3), one for each illuminant turned on at a different time, resulting in a database of 360 images for each harvest week. Different illuminants were used to exploit the various interactions between visible light sources and the fruits and to assess how they could improve the classification models.

The hardware setup for the image acquisition step is shown in Figure 2a. The above-mentioned dataset, containing all

grape images used in this work, will be available on request.

Fig. 2 Image acquisition setup: (a) Dark chamber outside view; (b) LED positioning inside the chamber.

Fig. 3 Grapes illuminated by different light spectra: (a) red; (b) green; (c) blue; (d) cool white; (e) warm white.


Convolutional Neural Networks

In the present work, two CNN architectures were employed, as described in detail in the following subsections, with the number of outputs equal to the number of classes (harvesting weeks). All computational experiments were implemented in Python using Keras.19

VGG-19

An acronym for Very Deep Convolutional Networks for Large-Scale Visual Recognition,20 VGG-19 is a CNN model developed by researchers at the University of Oxford. As shown in Figure 4, the network has 19 layers: 16 convolutional layers and 3 fully connected layers. It can classify images into different categories and, as it is trained on the ImageNet database, its input image size is 224 x 224 pixels. VGG-19 is considered a simple network because its convolutional layers use kernels of dimension 3 x 3. Although its convolution operations are smaller than those of other architectures, it still has about 144 million parameters, a quantity that entails a high computational cost. Max-pooling layers are used to reduce the number of weights to be learned and to avoid overfitting.

Fig. 4 VGG-19 CNN architecture.


A pre-trained network model was used, with transfer learning, to accelerate the learning process and improve classification performance. For this work, only the last two layers, namely the FC1 layer and the output layer, were fine-tuned on the grape image dataset, representing a total of 102,780,932 trainable parameters.
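The fine-tuning setup above can be sketched in Keras. The layer name `fc1` follows `tf.keras.applications.VGG19`; `num_classes` stands for the number of harvesting weeks, and dataset handling is omitted. This is a sketch under those assumptions, not the authors' exact code.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_vgg19_transfer(num_classes, weights="imagenet"):
    """VGG-19 with only FC1 and a new softmax output left trainable."""
    base = tf.keras.applications.VGG19(weights=weights, include_top=True)
    # Reuse the network up to its first fully connected layer (fc1),
    # discarding the original fc2 and 1000-class ImageNet head.
    fc1 = base.get_layer("fc1").output
    out = layers.Dense(num_classes, activation="softmax", name="grape_output")(fc1)
    model = models.Model(base.input, out)
    # Freeze everything except fc1 and the new output layer.
    for layer in model.layers:
        layer.trainable = layer.name in ("fc1", "grape_output")
    return model
```

With four output classes (the Cabernet Sauvignon four-class scenario), the trainable weights of fc1 (25088 × 4096 + 4096) plus the output layer (4096 × 4 + 4) add up to the 102,780,932 quoted above.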

Proposed architecture

Although VGG-19 is a robust, powerful CNN architecture, even with transfer learning its training stage is very time-consuming and demands considerable computational effort. In this article we propose a simplified CNN architecture aimed at decreasing training costs, with reduced complexity compared to VGG-19 while keeping similar classification performance.

The proposed CNN architecture is shown in Figure 5. It is composed of 10 convolutional layers interlaced with max-pooling layers, a fully connected (FC) layer with 512 units at the end, and an output layer with softmax activation. The CNN outputs the probability of each input image belonging to a given class (harvesting week). The total number of trainable parameters is 704,643, only 0.68% of the number for the VGG-19 model.

Fig. 5 Proposed CNN architecture.
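A Keras sketch of this architecture follows. The per-layer filter counts are illustrative assumptions (the exact configuration is given in Figure 5 and this sketch will not reproduce the 704,643-parameter count); only the overall structure, ten 3 × 3 convolutions interlaced with max-pooling, a 512-unit FC layer, and a softmax output, follows the description above.

```python
from tensorflow.keras import layers, models

def build_proposed_cnn(num_classes,
                       filters=(16, 16, 32, 32, 64, 64, 128, 128, 128, 128)):
    """Ten conv layers interlaced with max-pooling, FC(512), softmax output."""
    model = models.Sequential([layers.Input((224, 224, 3))])
    for i, f in enumerate(filters):
        model.add(layers.Conv2D(f, 3, padding="same", activation="relu"))
        if i % 2 == 1:  # pool after every pair of convolutions
            model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(512, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model
```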

Image adjustment

The original images were obtained at high dimensions (3456 x 5184 pixels over 3 RGB channels with 8-bit resolution each), in which the grapes were almost imperceptible, filling only a small area of the image. Therefore, a resizing step was performed so that the fruit stayed in the center of the image and was completely contained in it. The shape of ImageNet input images is 224 x 224 pixels per RGB channel, and those dimensions are used by the


VGG-19 network. To match these dimensions, the images were cropped keeping the fruit in the center while preserving pixel resolution. The resizing step returned 8-bit images of 224 x 224 pixels.

After cropping and resizing, the images were normalized to the interval [0, 1] and separated according to their classes, with each class representing a different week. One-hot encoding was used to provide class labels, and the data were saved in tensor format. The data were then divided into training, validation, and testing sets in the proportions of 44.44%, 22.33%, and 33.33%, respectively.
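The adjustment steps, center crop to 224 × 224, scaling to [0, 1], and one-hot labels, can be sketched with NumPy. The function names are illustrative, and the crop assumes the berry has already been framed near the image centre, as described above.

```python
import numpy as np

def center_crop_normalize(img, size=224):
    """Crop a size x size window around the image centre and scale pixels to [0, 1]."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    crop = img[top:top + size, left:left + size]
    return crop.astype(np.float32) / 255.0

def one_hot(labels, num_classes):
    """Class indices -> one-hot encoded label matrix."""
    return np.eye(num_classes, dtype=np.float32)[labels]
```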

Learning task and model evaluation

Both CNN architectures were trained for 50 epochs with a batch size of 32. The loss function used was the well-known categorical cross-entropy, and the CNN weights were optimized with the Adam optimizer21 at a learning rate of 10⁻³.
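In Keras, this training configuration amounts to the following compile/fit calls. The model below is a minimal hypothetical stand-in, and `x_train`, `y_train`, `x_val`, `y_val` are placeholders for the actual tensors.

```python
from tensorflow.keras import layers, models, optimizers

# Minimal stand-in model; the real models are VGG-19 and the proposed CNN.
model = models.Sequential([
    layers.Input((224, 224, 3)),
    layers.Flatten(),
    layers.Dense(6, activation="softmax"),
])
model.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=50, batch_size=32)
```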

To evaluate both CNN models, confusion matrix, accuracy, and receiver operating characteristic (ROC) curves were

used. A confusion matrix shows the actual and predicted classification numbers for each class in the model. The

accuracy rate is a measure of the model prediction capacity on a given data set. It gives the rate of correct outputs of the

model. Mathematically, accuracy can be defined as Acc = (TP + TN)/Total, where Total represents the sum of all confusion matrix elements, TP the number of true positives, and TN the number of true negatives. A ROC curve is a

graphical representation of the performance of a binary classifier system with varying classification thresholds. It is a

plot of TP rates (sensitivity) on the vertical axis versus FP rates (1-specificity) on the horizontal axis. The perfect model

would produce a step curve, with unity area.
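The accuracy computation generalizes from the binary formula above to any confusion matrix, since correct predictions sit on the diagonal. A small NumPy sketch (function name illustrative):

```python
import numpy as np

def accuracy_from_confusion(cm):
    """Accuracy = correct predictions (matrix diagonal) / all samples."""
    cm = np.asarray(cm)
    return float(np.trace(cm) / cm.sum())
```

For a 2 x 2 confusion matrix this reduces to (TP + TN)/Total.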

Determination of quality attributes

The determination of TSS was performed from direct reading on a digital refractometer (HI 96804, Hanna Instruments,

USA), with a measurement range between 0 and 85 °Brix and accuracy of ±0.2 °Brix, using two drops of the sample,

with the results expressed in °Brix.


The anthocyanin (Anth) and flavonoid (Flav) contents in the grape skin were determined according to Francis,22 whose single-pH method shows no difference from the AOAC method. A mass of 0.5 g of skin and pulp was weighed, and 25 mL of 95% ethanol extraction solution acidified with HCl (1.5 N) at a ratio of 85:15 was added. The samples were macerated for one minute and then transferred to containers sheltered from light and stored in a refrigerator. After 24 hours, the supernatant was collected for reading on a spectrophotometer (Spectronic BioMate 5 UV-Vis, Thermo Electron, UK) at 535 nm for anthocyanins and 374 nm for flavonoids. The results were expressed in mg 100 g⁻¹ using equations (1) and (2):

Anthocyanins = Abs × dilution factor / E1 (1)

Flavonoids = Abs × dilution factor / E2 (2)

where the dilution factor is 5000, the extinction coefficient for anthocyanins (E1%) is 98.2, and the extinction coefficient for flavonoids (E2%) is 76.6.
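Equations (1) and (2) translate directly into code; the function names below are illustrative:

```python
def pigment_content(absorbance, extinction_coeff, dilution_factor=5000):
    """Pigment content in mg per 100 g (Francis' single-pH method)."""
    return absorbance * dilution_factor / extinction_coeff

def anthocyanins(absorbance):
    # Absorbance read at 535 nm; E1% = 98.2
    return pigment_content(absorbance, 98.2)

def flavonoids(absorbance):
    # Absorbance read at 374 nm; E2% = 76.6
    return pigment_content(absorbance, 76.6)
```

For example, an absorbance of 0.5 at 535 nm gives 0.5 × 5000 / 98.2 ≈ 25.46 mg 100 g⁻¹ of anthocyanins.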

Results and Discussions

Syrah cultivar

Two scenarios were considered to evaluate the capacity of the CNN models to predict maturation stages. The first considered six classes, one per harvesting week. In the second scenario, the dataset was modified to use three classes, each encompassing grape samples from two consecutive weeks: the first class comprises grape images from the first and second harvesting weeks, the second class from the third and fourth weeks, and the last class from the fifth and sixth weeks. The motivation for this modified dataset is explained later.

For the first case, there were a total of 2160 grape images, with 360 for each of the six classes or weeks (72 samples per

week x 5 LEDs). For graphical illustration, Figures 6a and 6b show the VGG-19 accuracy and loss evolution during training and validation in the six-class scenario. Each curve legend shows the best value obtained in each case; note the validation accuracy of 85.15%.

Figure 6c illustrates the normalized confusion matrix for VGG-19 model considering the test dataset. Actual values are

presented on the vertical axis and predicted values on the horizontal axis. The harvesting weeks are represented as

classes {0, 1, ..., 5}. It can be seen that the classifier performed better in some classes (weeks), as it was the case for


weeks 1, 3, and 4 (classes 0, 2, and 3, respectively), and had decreased performance for weeks 2, 5, and 6 (classes 1, 4,

and 5), resulting in an average accuracy of 86.33% on the test dataset.

Fig. 6 VGG-19 (a) accuracy, (b) loss and (c) normalized confusion matrix for the training and validation stages in the six-class scenario for the Syrah cultivar. The values shown in the legends represent the best models within 50 epochs.

The performance of the proposed CNN, in terms of accuracy and loss curves, is shown in Figures 7a and 7b for the six-class scenario. The best model achieved a validation accuracy of 88.10%, better than VGG-19 with transfer learning. Figure 7c shows the normalized confusion matrix of the proposed model. A test accuracy of 89.62% was reached, outperforming the VGG-19 results. A behavior similar to that of the VGG-19 model was observed for the proposed CNN, with some weeks yielding more correct predictions than others.

Fig. 7 Proposed CNN (a) accuracy, (b) loss and (c) normalized confusion matrix for the training and validation stages in the six-class scenario for the Syrah cultivar. The values shown in the legends represent the best models within 50 epochs.


Confusion matrix results from both VGG-19 and the proposed model showed an interesting pattern when considering a one-to-one relation between class labels and harvesting weeks. Clearly, grape images from some weeks are classified with rather low performance. One possible explanation is that two consecutive weeks may involve only small changes in the visible characteristics of the fruits, leading to weaker activation maps in the CNN models. Figure 8 illustrates selected examples of the most representative fruit images for each class. As can be seen, images from neighboring classes share more similar patterns than others, which suggests a possible fusion of classes. Therefore, a new scenario was considered in which neighboring classes were gathered, resulting in a three-class dataset.

Fig. 8 Examples of grape images taken from each class, representing the most common patterns for each class: (a) First

week; (b) second week; (c) third week; (d) fourth week; (e) fifth week; (f) sixth week.

The confusion matrices of both the VGG-19 and the proposed model for the three-class scenario are shown in Figure 9. ROC curves are plotted in Figure 10 for both deep learning models, together with the area under the ROC curve (AUC). The proposed model slightly outperforms the VGG-19 model, as perceived from how close its curves are to the unit step function.


Fig. 9 Normalized confusion matrices for Syrah cultivar and three classes scenario: (a) VGG-19 model; (b) Proposed

CNN model.

Fig. 10 ROC curves for Syrah cultivar and three classes scenario: (a) VGG-19 model; (b) Proposed CNN model.

As a matter of illustration, Figure 11 shows some image samples that were incorrectly classified by the two deep learning models. As can be noticed, some images are easily confused, with samples from one harvesting week classified as belonging to another class.


Fig. 11 Examples of incorrectly classified Syrah grape images: (a) sample from the first week classified as second

week; (b) sample from the second week classified as third week; (c) sample from the third week classified as fourth

week.

The average accuracy results for the two considered scenarios are shown in Table 1 for both the VGG-19 and the proposed CNN models. As can be noticed, the fusion of classes resulted in better accuracy for both models, as anticipated. Moreover, VGG-19 was once again outperformed by the simpler proposed CNN model. It is worth adding that, although VGG-19 is a very robust and efficient CNN, it is a highly hardware-consuming architecture and did not fit the present dataset well.

Table 1 Validation and test average accuracy results for the two considered scenarios of the Syrah cultivar.

Scenario         VGG-19                    Proposed CNN
                 Val. Acc.    Test Acc.    Val. Acc.    Test Acc.
Six Classes      85.15%       86.33%       88.10%       89.62%
Three Classes    93.93%       91.30%       92.28%       93.41%

Chemometric results corroborate these findings, as can be observed in Figure 12, which shows boxplots of the mean and standard deviation of TSS, anthocyanins, and flavonoids, together with the linear correlation for each maturation class. As expected, the quality attributes increase during ripening, with accumulation of sugars (TSS) and pigments (anthocyanins and flavonoids) in Syrah grapes. Moreover, it is observed that decreasing the number of ripening classes


(harvesting weeks) increased the correlation for TSS and flavonoids, mirroring the improvement observed in the CNN models for the Syrah cultivar.

Fig. 12 Boxplot of mean, standard deviation, and correlation of ripeness classes for quality attributes TSS (a) and (d),

anthocyanins (b) and (e), and flavonoids (c) and (f) of the Syrah cultivar.

Cabernet Sauvignon cultivar

The same approaches applied to the Syrah cultivar were used for Cabernet Sauvignon, with some adaptations. For this cultivar, eight harvesting weeks were employed because of its different physicochemical characteristics. In the first scenario, eight class labels were used in the CNN models, one per harvesting week. In the second scenario, the same strategy of combining neighboring weeks was adopted, leading to a four-class scenario.


The results of the CNN models applied to the Cabernet Sauvignon dataset are summarized in Table 2, which shows average validation and test accuracy values for both models. Gathering neighboring classes improves VGG-19 test performance only modestly but substantially enhances the test average accuracy of the proposed CNN model. It can be inferred that the proposed model is more sensitive to the fusion of harvesting weeks than the VGG-19 model, a behavior similar to that seen for the Syrah cultivar. ROC curves are shown in Figure 13 and confirm these results. The better results of VGG-19 compared to the proposed CNN model may be related to its higher number of layers.

Table 2 Validation and test average accuracy results for the two considered scenarios of Cabernet Sauvignon cultivar.

Scenario         VGG-19                    Proposed CNN
                 Val. Acc.    Test Acc.    Val. Acc.    Test Acc.
Eight Classes    80.69%       76.23%       64.52%       65.30%
Four Classes     81.63%       80.97%       79.28%       72.66%

Fig. 13 ROC curves for Cabernet Sauvignon cultivar and four classes scenario: (a) VGG-19 model; (b) Proposed CNN

model.

As shown in Tables 1 and 2, both network models performed better on the Syrah dataset than on the Cabernet Sauvignon images. This can be explained by looking at images of Cabernet Sauvignon grapes representing the average


characteristic of each week, as illustrated in Figure 14. After the initial maturation stages, the samples are very similar,

and the differences are almost imperceptible after some weeks.

Fig. 14 Examples of grape images taken from each class, representing the most common patterns for each class: (a) first week; (b) second week; (c) third week; (d) fourth week; (e) fifth week; (f) sixth week; (g) seventh week; (h) eighth week.

Figure 15 shows boxplots of the mean and standard deviation of TSS, anthocyanins, and flavonoids, together with the linear correlation for each maturation class. The quality attributes increase during ripening, with accumulation of sugars (TSS) and pigments (anthocyanins and flavonoids) in Cabernet Sauvignon grapes. It is also observed that decreasing the number of ripening classes did not affect the correlation of the quality attributes, unlike the behavior of the network models for the Cabernet Sauvignon cultivar.


Fig. 15 Boxplot of mean, standard deviation, and correlation of ripeness classes for quality attributes TSS (a) and (d), anthocyanins (b) and (e), and flavonoids (c) and (f) of the Cabernet Sauvignon cultivar.

Although this is an important research topic with direct application to grape production, there are very few works in the specialized literature on the application of machine learning to classify grapes by maturation stage. Visible and near-infrared (Vis-NIR) spectroscopy was used to classify grapes from a local Chinese cultivar using linear discriminant analysis (LDA), artificial neural networks (ANN), and support vector machines (SVM).23 Precision results reached 100% when combining LDA with principal component analysis (PCA), but no accuracy information was provided by the authors. Despite the good results, the cited work employed a costly and complex measurement technique, in contrast with the simple RGB photographs used here.

Kangune et al.24 used CNN and SVM algorithms to classify grape ripeness of a local cultivar from India. Two classes were considered, ripe and unripe, and classifier outcomes were obtained from color and morphological shape features of the fruits. The authors reported validation accuracies of 79.49% for the CNN and 69% for the SVM, which are lower than the results presented here for the Syrah cultivar but higher than those obtained for Cabernet Sauvignon.

Conclusions

This study presented a procedure for classifying the maturation stage of wine grapes from two cultivars, Syrah and Cabernet Sauvignon, using convolutional neural networks. Images of grapes were acquired under five different illuminants and used as inputs to two CNN models for classification based on the harvesting week, representing different maturation stages. Two scenarios were considered for both cultivars: one comprising all harvesting weeks and another gathering samples from neighboring weeks. The idea was to merge similar samples into each class to improve classification rates.
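The class-merging step amounts to a simple relabeling of harvest weeks onto a reduced label set. The sketch below is a hypothetical reconstruction, assuming consecutive pairs of the eight weekly labels are merged into four classes:

```python
def merge_neighboring_weeks(week_label, weeks_per_class=2):
    """Map a 1-based harvest-week label onto a merged 0-based class index."""
    return (week_label - 1) // weeks_per_class

# Eight weekly labels collapsed into four classes: weeks 1-2 -> 0, 3-4 -> 1, ...
merged = [merge_neighboring_weeks(w) for w in range(1, 9)]
print(merged)  # → [0, 0, 1, 1, 2, 2, 3, 3]
```

Each merged class then receives roughly twice as many training samples, which helps when neighboring weeks are visually indistinguishable.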

A simple CNN model was proposed, and its performance was compared to a pre-trained VGG-19 model. The proposed network outperformed the VGG-19 model in almost all cases considered, showing an average test accuracy of 93.41% for the Syrah dataset in a three-class scenario and 72.66% for the Cabernet Sauvignon dataset in a four-class scenario. One may infer that harvesting at two-week intervals is sufficient, and indeed better, for training the model, a behavior corroborated by the chemometric results. In addition to reducing the time spent on harvesting and acquiring the images, the monetary expenditure would also be lower, since the personnel responsible for collecting grape samples would need less time for the task.

This work suggests the possibility of developing an apparatus, comprising both software and hardware stages, to be used in fieldwork for classifying the maturation stage of wine grapes.

References


1. Jackson, D., Lombard, P., 1993. Environmental and management practices affecting grape composition and wine quality – A review. American Journal of Enology and Viticulture, 44, 409–430.

2. Agati, G., Pinelli, P., Ebner, S.C., Romani, A., Cartelat, A., Cerovic, Z.G., 2005. Nondestructive evaluation of anthocyanins in olive (Olea europaea) fruits by in situ chlorophyll fluorescence spectroscopy. Journal of Agricultural and Food Chemistry, 53, 1354–1363.

3. Choong, T.S.Y., Abbas, S., Shariff, A.R., Halim, R., Ismail, M.H.S., Yunus, R., Ali, S., Ahmadun, F.R., 2006. Digital image processing of palm oil fruits. International Journal of Food Engineering, 1556–1560.

4. Agati, G., Meyer, S., Matteini, P., Cerovic, Z.G., 2007. Assessment of anthocyanins in grape (Vitis vinifera L.) berries using a noninvasive chlorophyll fluorescence method. Journal of Agricultural and Food Chemistry, 55, 1053–1061.

5. Agati, G., Traversi, M.L., Cerovic, Z.G., 2008. Chlorophyll fluorescence imaging for the noninvasive assessment of anthocyanins in whole grape (Vitis vinifera L.) bunches. Photochemistry and Photobiology, 84, 1431–1434.

6. Hagen, S.F., Solhaug, K.A., Bengtsson, G.B., Borge, G.I.A., Bilger, W., 2006. Chlorophyll fluorescence as a tool for non-destructive estimation of anthocyanins and total flavonoids in apples. Postharvest Biology and Technology, 41, 156–163.

7. Ribera-Fonseca, A., Noferini, M., Jorquera-Fontena, E., Rombolà, A.D., 2016. Assessment of technological maturity parameters and anthocyanins in berries of cv. Sangiovese (Vitis vinifera L.) by a portable Vis-NIR device. Scientia Horticulturae, 209, 229–235.

8. ElMasry, G.M., Nakauchi, S., 2016. Image analysis operations applied to hyperspectral images for non-invasive sensing of food quality – A comprehensive review. Biosystems Engineering, 142, 53–82.

9. Nogales-Bueno, J., Hernández-Hierro, J.M., Rodríguez-Pulido, F.J., Heredia, F.J., 2014. Determination of technological maturity of grapes and total phenolic compounds of grape skins in red and white cultivars during ripening by near infrared hyperspectral image: A preliminary approach. Food Chemistry, 152, 586–591.


10. dos Santos Costa, D., Mesa, N.F.O., Freire, M.S., Ramos, R.P., Mederos, B.J.T., 2019. Development of predictive models for quality and maturation stage attributes of wine grapes using Vis-NIR reflectance spectroscopy. Postharvest Biology and Technology, 150, 166–178.

11. Sze, V., Chen, Y.H., Yang, T.J., Emer, J.S., 2017. Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE, 105, 2295–2329.

12. Zuñiga, A., Mora, M., Oyarce, M., Fredes, C., 2014. Grape maturity estimation based on seed images and neural networks. Engineering Applications of Artificial Intelligence, 35, 95–104.

13. Seng, K.P., Ang, L.M., Schmidtke, L.M., Rogiers, S.Y., 2018. Computer vision and machine learning for viticulture technology. IEEE Access, 6, 67494–67510.

14. Santos, T.T., de Souza, L.L., Santos, A.A.d., Avila, S., 2019. Grape detection, segmentation and tracking using deep neural networks and three-dimensional association. arXiv preprint arXiv:1907.11819, 1–36.

15. Halstead, M., McCool, C., Denman, S., Perez, T., Fookes, C., 2018. Fruit quantity and ripeness estimation using a robotic vision system. IEEE Robotics and Automation Letters, 3, 2995–3002.

16. Hang, J., Zhang, D., Chen, P., Zhang, J., Wang, B., 2019. Identification of apple tree trunk diseases based on improved convolutional neural network with fused loss functions, in: International Conference on Intelligent Computing, Springer, pp. 274–283.

17. Milella, A., Marani, R., Petitti, A., Reina, G., 2019. In-field high throughput grapevine phenotyping with a consumer-grade depth camera. Computers and Electronics in Agriculture, 156, 293–306.

18. Kamilaris, A., Prenafeta-Boldú, F.X., 2018. Deep learning in agriculture: A survey. Computers and Electronics in Agriculture, 147, 70–90.

19. Chollet, F., et al., 2015. Keras. https://keras.io. Accessed February 2020.

20. Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 1–22.

21. Kingma, D.P., Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 1–15.


22. Francis, F.J., 1982. Analysis of anthocyanins. In: Markakis, P. (Ed.), Anthocyanins as Food Colors. Academic Press, New York, pp. 181–207.

23. Lv, G., Yang, H., Xu, N., Mouazen, A.M., 2012. Identification of less-ripen, ripen, and over-ripen grapes during harvest time based on visible and near-infrared (Vis-NIR) spectroscopy, in: 2012 2nd International Conference on Consumer Electronics, Communications and Networks (CECNet), IEEE, pp. 1067–1070.

24. Kangune, K., Kulkarni, V., Kosamkar, P., 2019. Grapes ripeness estimation using convolutional neural network and support vector machine, in: 2019 Global Conference for Advancement in Technology (GCAT), IEEE, pp. 1–5.
