
Computers and Electronics in Agriculture 179 (2020) 105807


Early detection of grapevine leafroll disease in a red-berried wine grape cultivar using hyperspectral imaging
Zongmei Gao a, Lav R. Khot a, *, Rayapati A. Naidu b, Qin Zhang a, *
a Department of Biological Systems Engineering, Center for Precision and Automated Agricultural Systems, Washington State University, Prosser, WA 99350, USA
b Department of Plant Pathology, Irrigated Agriculture Research and Extension Center, Prosser, WA 99350, USA

ARTICLE INFO

Keywords: Hyperspectral imaging; Disease detection; Vitis vinifera; Grapevine leafroll disease; Grapevine leafroll-associated virus 3

ABSTRACT

This research was conducted to examine the potential use of hyperspectral imaging for non-destructive detection of Grapevine leafroll-associated virus 3 (GLRaV-3) during asymptomatic and symptomatic stages of grapevine leafroll disease (GLD) in a red-berried wine grape (Vitis vinifera) cultivar. Cabernet Sauvignon vines tested positive and negative for GLRaV-3 were used in this study. Leaves from infected and non-infected vines were detached at five phenological stages and individual leaf images were acquired by a hyperspectral imager in 2017, 2018, and 2019 seasons. Those images were then preprocessed using spectra normalization and Monte-Carlo method for eliminating spectral sample outliers. Least absolute shrinkage and selection operator was used to select feature wavelengths for each phenological stage within three-season datasets. The sensitivity of selected feature wavelengths was evaluated based on analysis of variance and linear regression. Six salient wavelengths (690, 715, 731, 1409, 1425 and 1582 nm) were determined as sensitive wavebands for detecting virus symptoms in leaf samples. The detectability of GLD using those six salient wavelengths was evaluated using least squares-support vector machine. The classification accuracy was found between 66.67 and 89.93% for test datasets collected at first asymptomatic stage over three seasons. These results indicated that the hyperspectral imaging technique has the potential for nondestructively detecting virus-infected grapevines during asymptomatic stages.

1. Introduction

Grapevine leafroll disease (GLD), one of the most economically devastating viral diseases infecting wine grapes (Vitis vinifera L.), has been reported in all viticultural regions of the world (Naidu et al., 2014). GLD seriously affects vine vigor and physiology, causing uneven ripening, reduced yield, and lower berry quality through reduced sugar content. Six distinct viral species, designated as grapevine leafroll-associated virus (GLRaV) 1, 2, 3, 4, 7, and 13, have been reported in grapevines (Dolja et al., 2017). Among them, GLRaV-3 is the most widespread and is recognized as the main etiological agent of grapevine leafroll disease.

GLD produces distinct foliar symptoms in red- and white-berried wine grape cultivars (Naidu et al., 2014). It is difficult to identify the specific GLRaV in affected vineyards based on symptoms alone. Also, grapevine red blotch virus and a wide range of stressors are known to cause red leaf symptoms that mimic GLD in many red-berried wine grape cultivars (Gohil et al., 2016; Adiputra et al., 2018). Therefore, reliable identification of GLRaV-infected vines is important for implementing appropriate management decisions and preventing spread of the disease.

Currently, GLD management involves control of the vector, roguing of symptomatic vines, and re-planting (Burger et al., 2017). Visual screening is one of the conventional methods for identifying diseased vines. It aims to discover infected vines based on plant disease symptoms such as lesions, blight, galls, cankers, wilts, and rots (Mahlein, 2016). This method requires a trained specialist, which may introduce subjectivity into disease detection, and it is only applicable for identifying the symptomatic stages of infected vines. Biological indexing can also be used for detecting GLD symptoms. However, biological indexing is labor-intensive, requires a large area of land to grow grafted vines, and takes multiple seasons for vines to grow and express disease symptoms in a field setting (Rayapati et al., 2008). Importantly, biological indexing is influenced by various factors under field conditions, and symptoms do not reveal the identity of a specific virus present in test vines. These difficulties are circumvented with the use of serological and molecular diagnostic assays for the specific and sensitive detection of viruses (Rowhani et al., 2017). These approaches

* Corresponding authors.
E-mail addresses: lav.khot@wsu.edu (L.R. Khot), qin.zhang@wsu.edu (Q. Zhang).

https://doi.org/10.1016/j.compag.2020.105807
Received 22 April 2020; Received in revised form 31 July 2020; Accepted 26 September 2020
Available online 16 October 2020
0168-1699/© 2020 Elsevier B.V. All rights reserved.

Fig. 1. Typical leaf samples tested positive for GLRaV-3 at (a) stage 1, (b) stage 2, (c) stage 3, (d) stage 4, (e) stage 5, and (f) healthy leaf (tested negative for GLRaV-3) at stage 5 of data collected during 2017, 2018 and 2019 seasons.

are scalable, but the uneven distribution of viruses in an infected plant may affect the reliability of the assay. Additionally, serological and molecular diagnostic assays are destructive methods. They also need expertise and facilities for reliable identification and interpretation of test results. Thus, large-scale scouting and testing would be costly for growers.

Optical sensing, including broadband multispectral and hyperspectral techniques, has been shown to be useful in detecting crop diseases (Sankaran et al., 2013). Additionally, researchers have developed optical and biosensor-based systems to detect crop diseases (Tereshchenko et al., 2017; Tereshchenko et al., 2020). Among them, hyperspectral imaging (HSI) is a very powerful tool as it can obtain both spectral and spatial information covering hundreds of wavebands. Such hyperspectral data can aid in identifying the most sensitive wavelengths, potentially enabling a low-cost, miniaturized version of an optical sensor.

Hyperspectral imaging acquires spectral and spatial information of a test object within wide wavelength ranges, i.e., from 400 to 3000 nm (Knauer et al., 2017; Al-Saddik et al., 2018). Most HSI applications in agriculture are still carried out under laboratory conditions due to the complex environmental conditions in the field (Kicherer et al., 2017; Gutiérrez et al., 2018; Polder et al., 2019). Also, HSI creates high-dimensional datasets that need adequate computing resources and time to process the data in the field. Therefore, most prior studies have focused on selecting key feature wavelengths that can be used for detecting crop stresses using simplified optical systems.

Many unsupervised and supervised techniques have been applied to select feature wavelengths. The common unsupervised techniques include principal component analysis (PCA), multi-cluster feature selection (MCFS), maximum information and minimum redundancy (MIMR), and genetic algorithms (GAs, Bhardwaj and Patra, 2018). The common supervised techniques include successive projections algorithm (SPA), uninformative variable elimination (UVE), spectrum derivative and band ratio (Mahlein et al., 2013; Gao et al., 2019; Shao et al., 2019). Overall, the range and number of selected feature wavelengths depend on the crop and stressor to be detected. For example, Siedliska et al. (2018) selected 19 wavelengths from 617 to 2332 nm to detect fungal infections in strawberry fruit. Feature wavelengths were selected using the second derivative method. Cao et al. (2019) identified six feature wavelengths (529, 641, 698, 749, 856, and 979 nm) through hyperspectral imaging in the range of 383–1032 nm to detect waterlogging stress of oilseed rape leaves. Features were selected using successive projections algorithm (SPA). Jarolmasjed et al. (2019) applied hyperspectral imaging to detect apple fire blight disease, with images in the range 350–2500 nm for two seasons. The feature wavelengths were selected based on normalized difference spectral indices (NDSIs).

Pertinent to GLD detection, prior studies have applied optical sensing as a non-destructive method in red-berried wine grape cultivars in commercial vineyards (Naidu et al., 2009; MacDonald et al., 2016; Sinha et al., 2019). These studies focused on the detection of virus infection during the post-veraison, symptomatic stage. Since GLD symptoms are expressed in a phenological stage-specific manner, detection of virus infection during asymptomatic stages can greatly help in implementing GLD management decisions. Consequently, this study was undertaken to assess the feasibility of applying the HSI technique for early detection of virus infections in a red-berried wine grape cultivar. Specific study objectives were to: (1) identify the important wavelengths for early detection of Grapevine leafroll-associated virus 3 (GLRaV-3) causing leafroll disease symptoms; and (2) evaluate the sensitivity and accuracy of the selected wavelengths for identifying virus-infected vines.

2. Materials and methods

2.1. Samples

A red-berried wine grape cultivar (Cabernet Sauvignon) planted in 2013 at an experimental site located at Prosser, WA (46.2° N, 119.8° W), was selected for the study. Five grapevines that tested positive for GLRaV-3 using the RT-PCR technique and five adjacent vines that tested negative were selected for optical data collection (Naidu et al., 2009). The vines were


Fig. 2. Experimental and data analysis procedure outline. ROI: region of interest; LASSO: least absolute shrinkage and selection operator; ANOVA: analysis of
variance; LS-SVM: least squares-support vector machine.

maintained using standard viticulture practices, which included a spur training system with vertical shoot positioned (VSP) trellis, manual pruning, drip irrigation, and pesticide weed control. On each sampling day, individual leaves were detached in the morning and kept in a cool insulated container (humidity 21% and temperature 4.2 °C) until transported to the laboratory, where they were kept in a refrigerator at 21% relative humidity and 3.1 °C for about 3 h.

After leaf samples were collected from virus positive (designated hereafter as infected) and negative (designated hereafter as healthy) vines at each of the five phenological stages (Coombe, 1995) in a growing season, all leaf samples were washed and imaged using the hyperspectral imager. Leaf samples were collected according to the following rules. At growth stage 1 (S1), there are typically about 5 leaves per shoot, which is about 10 cm long with inflorescences; the maturing leaves were asymptomatic, and this stage was considered as early growth stage sampling in this study. At growth stage 2 (S2), shoots have about 16 leaves per shoot, with the beginning of flowering and first flower caps loosening. At stage 3 (S3), shoots have 17–20 leaves per shoot with 50% caps off. Young berries begin to enlarge (>2 mm diam.) at growth stage 4 (S4). Although asymptomatic at stages 2, 3, and 4, the leaves tend to become tougher and thicker compared to those of the early stage, so these stages were considered as the mid growth stage. At growth stage 5 (S5), berries begin to color and enlarge. Also, maturing leaves begin to show red and reddish-purple discolorations in the interveinal areas and near the basal part of the shoots. Stage 5 was thus defined as the symptomatic growth stage.

In each of the three testing seasons, a total of 500 leaf samples were collected from both infected and healthy vines. In every growth stage, 50 leaf samples each from healthy and infected vines were collected and analyzed for three seasons. Specifically, 10 leaf samples were collected from each grapevine, with one single leaf detached from the lower 4th or 5th node of a shoot. Because the vines were spur-trained on a VSP trellis, 5 leaves were detached from each side, i.e., 10 samples per vine. Fig. 1 shows typical examples of GLD infected and healthy leaves collected in those stages.

2.2. Hyperspectral imaging system and data acquisition

Fig. 2 shows three major steps, i.e., data collection, image pre-processing, and feature wavelengths selection and evaluation, used in conducting this study.

An HSI system with a spectral range of 517 to 1729 nm and a spectral resolution of 8.3 nm was used in this study (Fig. 3). It consists of an imaging spectrograph (Micro-Hyperspec NIR X-Series, Headwall Photonics Inc., Fitchburg, MA), a conveyor belt operated by a stepper motor, and a lighting bulb (150 W quartz tungsten halogen) that illuminates the sample using an elliptical reflector. The HSI system is a line-scanning system that scans a sample as it moves on the conveyor belt during image acquisition.

Fig. 3. Hyperspectral imaging system used in this study.

During data acquisition, parameters of the HSI system were set as follows: spectrograph exposure time of 6.0 × 10⁻³ s, conveyor belt travel speed of 7.0 × 10⁻³ m·s⁻¹, belt travel distance from 0.05 to 0.20 m, and distance between lens and objects of 0.352 m. For calibrating the image, the white and dark reference images were taken prior to imaging vine leaf samples. The white reference image was taken by scanning a Teflon® white panel that had been roughened to provide a diffuse reflective surface. Then, the dark reference image was obtained by covering the lens with the cap. For each leaf sample, the upper surface of the leaf was imaged under the lens.
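As an illustration of how white and dark reference images of this kind are commonly used, the following minimal Python sketch applies the flat-field form (raw − dark) / (white − dark), which is the form Equation (1) in Section 2.3 takes. It is not the authors' MATLAB implementation: the function name `calibrate_reflectance`, the flat list layout, and the digital numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def calibrate_reflectance(raw, white, dark):
    """Apply (raw - dark) / (white - dark) element-wise to equal-length lists.

    Each list holds digital numbers for the same pixels/bands; the flat layout
    is an assumption of this sketch, not the instrument's data format.
    """
    if not (len(raw) == len(white) == len(dark)):
        raise ValueError("raw, white and dark must have the same length")
    corrected = []
    for r, w, d in zip(raw, white, dark):
        span = w - d
        if span == 0:
            # A pixel where white and dark agree cannot be calibrated.
            raise ValueError("white and dark references are equal for a pixel")
        corrected.append((r - d) / span)
    return corrected


# Made-up digital numbers for three pixels at one band.
raw_dn = [600, 1100, 2100]
white_dn = [2100, 2100, 2100]
dark_dn = [100, 100, 100]

reflectance = calibrate_reflectance(raw_dn, white_dn, dark_dn)
print(reflectance)  # [0.25, 0.5, 1.0]
```

The result is a fraction between 0 and 1 when the raw value lies between the two references; multiplying by 100 would give percent reflectance.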


2.3. Hyperspectral data pre-processing

The reflectance calibration (Equation (1)) converts the measured raw digital numbers to percent reflectance. This step is performed to account for the background spectral response of the instrument and the dark current of the camera (Park and Lu, 2015).

\[ I_c(x, y, \lambda) = \frac{I_{raw}(x, y, \lambda) - I_{dark}(x, y, \lambda)}{I_{white}(x, y, \lambda) - I_{dark}(x, y, \lambda)} \tag{1} \]

where I_white and I_dark are the white and dark references, I_raw is a measured raw leaf image and I_c is the corrected image. The x and y are the spatial coordinates, and λ is the wavelength.

The dimensions of a hyperspectral image were 320 × 413 spatial pixels in each of 148 spectral wavelengths. To select the small infected symptomatic spots from symptomatic stage (S5) leaves and avoid the veins in the leaves, ten ROIs, each with 10 × 10 pixels, were extracted from both healthy and infected leaf images at each stage in three seasons. Those ROIs were selected manually and equally located on the right and left sides of the main vein from the leaf stem. The spectral reflectance values of all pixels in an ROI were calculated and averaged as one sample. Because the HSI sensor was recalibrated by the manufacturer prior to the 2018 season, there was a 1 to 2 nm shift at some of the spectral wavelength bands. Therefore, the spectral wavelength bands were averaged for the 2017 and 2018 seasons and the mean value of each wavelength band was used for further analysis. For example, the first selected waveband of 600 nm in the 2017 season corresponded to 598 nm in the 2018 season, so the averaged waveband of 599 nm was used for further analysis. The hyperspectral data was also restricted to wavelengths between 599 and 1599 nm (122 bands), which had an enhanced signal-to-noise ratio. Then the spectral data was normalized using Equation (2) (Sankaran et al., 2011).

\[ R_{norm(i)} = \frac{R_i}{\sqrt{R_i^2 + R_{i+1}^2 + \cdots + R_n^2}} \tag{2} \]

where R_norm(i) is the normalized reflectance for the ith wavelength in a sample, R_i is the measured reflectance of the ith wavelength in a sample, and i varies from 1 to 122 (n), referring to each wavelength of a sample. The normalization step was implemented in MATLAB® (2019b, The MathWorks Inc., Natick, MA).

2.4. Outlier identification and elimination

A sample outlier is a value that deviates extremely from other observations in the dataset (Hodge and Austin, 2004). A variety of methods have been used for identification of outliers. For example, the normal probability plot and model-based methods (i.e., box-and-whisker plot) are based on the hypothesis that the outlier is the point that is farther away from the mean value of the central distribution. Such an approach is carried out without taking the dependent variables into account. Also, numerous outliers falsify the measurement of central location and distribution of samples, resulting in incorrect results when multiple outliers exist in the dataset. This occurrence is termed the masking effect. To overcome the masking effect and acquire a clear boundary between normal samples and outliers, this study employed the Monte-Carlo method (Zhang et al., 2016). The analysis was performed in MATLAB® (ver. 2019b) with the following key steps: i) determining the number of principal components (PCs) by partial least squares-discriminant analysis cross-validation, ii) dividing the sample dataset randomly into training and test datasets with a ratio of 70:30, and building a prediction model to obtain the prediction errors for samples in the test dataset, iii) repeating the above 15,000 times to get the prediction errors for each sample, and iv) computing the mean value and standard deviation (SD) of the prediction errors for each sample.

Generally, the predictive error of a y outlier has a large mean value, while an x outlier has a small mean value of predictive residuals but a larger SD. The input data of x and y were respectively normalized reflectance and leaf class from each sample.

2.5. Feature wavelength selection

Spectral information covers wavelengths from 599 to 1599 nm, typically characterized by a high degree of redundancy between adjacent wavelengths. Thus, selection of feature wavelengths was an important step in facilitating the design of an optimized multispectral imaging detection system. This study applied the Least Absolute Shrinkage and Selection Operator (LASSO, Zhang et al., 2018), a regularized regression technique capable of controlling the variance, to estimate the value of regression coefficients (β_j) by minimizing the following objective function.

\[ \mathrm{LASSO} = \sum_{i=1}^{n} \Big( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \gamma \sum_{j=1}^{p} \left| \beta_j \right| \tag{3} \]

where y and x were the leaf classes and normalized reflectance of wavelengths in the function; n and p were the number of samples and wavelengths; β_j was the coefficient, α was the intercept and γ was a penalty term which controlled the amount of shrinkage.

In general, γ is a nonnegative regularization parameter corresponding to one value from 0 to 1. As γ increases, the number of nonzero components of β decreases, i.e., the number of variables decreases. With the value of γ undetermined, LASSO calculates the largest value of γ, and as a result the smallest number of variables will be obtained. Since the most informative wavelengths were sought, the γ value was not predetermined in this study. The coefficients of a regularized linear regression model using 10-fold cross-validation were identified, with the mean squared error (MSE) within one standard deviation of the minimum MSE. Then the absolute values of the coefficients were calculated and ranked to get the top ten values, which corresponded to the top ten important feature wavelengths. In this study, the calculation of LASSO was also implemented in MATLAB®.

Data binning is often used to reduce the effects of minor observation errors and improve the model sensitivity. In this process, the original data values are grouped into a given small interval (bin) and replaced by representative values of that interval (Cannistra, 2011). Comparing the feature wavelengths from datasets without and with the binning process could improve the wavelength sensitivity. Therefore, in this study, the binning process was carried out for each stage in each season for validating the reliability of the selected feature wavelengths. The reflectance dataset was binned within 25 nm and the binned dataset was used for selecting feature wavelengths using LASSO.

2.6. Robustness of selected feature wavelengths

2.6.1. Sensitivity

The sensitivity of feature wavelengths was evaluated using analysis of variance (ANOVA) and linear regression, which provided the significance of the variable importance. In this study, ANOVA was used to determine whether a wavelength would be significant with respect to the healthy or GLD infected leaf condition. A linear regression was then used to assess the mean change in the normalized reflectance value of a wavelength (Y) given the change in the leaf condition (X), i.e., healthy or infected, to determine the importance of the feature wavelength.

2.6.2. Accuracy

The accuracy of the feature wavelengths was evaluated using a least squares support vector machine (LS-SVM) classifier. This classifier was used to find a hyperplane, an optimal surface with the maximal distance between the classes from each margin (Pantazi et al., 2016), in the multidimensional space to separate the healthy and GLD infected leaf samples. The outlier-removed sample datasets were divided into training and test sets with a ratio of 70:30 for each stage of every season. The training set was used to build the LS-SVM model; the test set was used to validate the


model. The normalized reflectance of the selected feature wavelengths was input as X and the sample label (healthy '1' and infected '2') was input as Y. Based on the predicted and actual Y values from the LS-SVM model, the confusion matrix was obtained. The classification accuracy was calculated (Eq. (4)) from the confusion matrix as the number of correctly identified samples out of the total samples, and was used to evaluate the performance of the classification models.

\[ \mathrm{Accuracy} = \frac{\text{Correctly identified sample numbers}}{\text{Total sample numbers}} \times 100\% \tag{4} \]

This procedure was also performed in MATLAB®.

3. Results and discussion

3.1. Outlier identification

The distribution of the mean value and SD of the dataset from stage 1 in the 2017 season, as an example for describing the Monte-Carlo method-based outlier identification, is shown in Fig. 4. Specifically, the outliers were significantly different from the majority of samples, with large mean value and SD. From Fig. 4, the observed cut-off values of mean value and SD were respectively 0.024 and 0.8. Also, samples numbered 57, 227, 293, 294, 333, 876, 882, and 883 had larger mean and SD values than the cut-off values, so they were considered as outliers. The respective cut-off values varied for each stage in each growth season. Analogously, outlier identification and elimination were performed for the other datasets, and the sample numbers after removing outliers are summarized in Table 1.

Fig. 4. Outliers identification based on Monte-Carlo method for spectral samples at grapevine growth stage 1 of 2017 season.

Fig. 5. Mean spectra and standard deviation (SD) of healthy and GLRaV-3 infected leaf samples at phenological growth stage 1 in 2018 season. For each class type, the reflectance values of all samples were averaged to one value as the representative spectra, and the SD values were calculated based on the reflectance values of all samples of that class.

3.2. Spectral analysis

The average spectral reflectance of outlier-removed data was calculated for each stage in three seasons. As an example, the spectra of healthy and infected samples from stage 1 collected in the 2018 season are shown in Fig. 5. The general trends of the healthy and infected spectral reflectance curves were quite similar. The absorption peaks or valleys of the spectral curves indicated the changing pattern of spectra, which was related to the physical and chemical properties of leaves. Importantly, part of the spectrum and the standard deviation of healthy and GLD infected samples overlapped, which indicates small differences between the healthy and infected samples. The overlapping of standard deviations across the whole wavelength range indicated that infected samples cannot be sensitively and accurately distinguished from healthy ones based on an individual wavelength. However, this does not preclude good classification when a few more bands are analyzed.

Overall, the main variation between healthy and infected spectra ranged from 750 to 1300 nm, and the same observations were obtained for the other asymptomatic stages (S2, S3, and S4). The reason for strong reflection in the near infrared (NIR) wavelength range (750–1300 nm) could be related to the cell structure of leaves (Xu et al., 2007). In terms of cytopathology, GLRaV-3 is restricted to the phloem of infected hosts and the virus is unevenly distributed in the hosts' organs and tissues, which has great effects on sieve tubes, companion cells and phloem parenchyma cells (Maree et al., 2013). As a result, GLRaV-3 caused an extreme reduction in leaf photosynthesis. Furthermore, there was spectral variation between healthy and infected samples of the symptomatic stage (S5) in the shortwave infrared wavelength range (1450 to 1599 nm). This has been related to the water content and leaf biochemicals, i.e., protein, lignin, and

Table 1
Outlier removed spectral samples (number of samples at each growth stage).

Season   Stage 1     Stage 2     Stage 3     Stage 4     Stage 5
         H    In     H    In     H    In     H    In     H    In
2017     495  497    497  495    490  500    492  488    485  484
2018     480  488    485  491    486  484    495  490    495  470
2019     493  496    497  495    494  488    488  491    494  480

Note: H: healthy sample; In: infected sample.
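The Monte-Carlo screening behind the counts in Table 1 (Section 2.4) can be illustrated with a small Python sketch. It is a simplified stand-in rather than the authors' MATLAB procedure: a one-variable least-squares line replaces the PLS-DA model, 2000 random 70:30 splits replace the 15,000 repetitions, the data are synthetic, and the names `fit_line` and `monte_carlo_error_stats` are hypothetical. What it keeps is the core idea: each sample's prediction error is recorded only when it falls in the test split, and the per-sample mean and SD of those errors are then inspected.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def monte_carlo_error_stats(xs, ys, n_runs=2000, train_frac=0.7, seed=0):
    """Return (mean, SD) of absolute test-set prediction error per sample."""
    rng = random.Random(seed)
    n = len(xs)
    n_train = round(train_frac * n)
    errors = [[] for _ in range(n)]
    indices = list(range(n))
    for _ in range(n_runs):
        rng.shuffle(indices)  # random 70:30 split for this run
        train, test = indices[:n_train], indices[n_train:]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:  # errors are recorded only for held-out samples
            errors[i].append(abs(ys[i] - (slope * xs[i] + intercept)))
    return [(statistics.fmean(e), statistics.pstdev(e)) for e in errors]


# Synthetic data: a clean linear trend with one injected y-outlier (index 10).
toy_x = [float(i) for i in range(20)]
toy_y = [2.0 * x + 1.0 for x in toy_x]
toy_y[10] += 8.0

stats = monte_carlo_error_stats(toy_x, toy_y)
suspect = max(range(len(stats)), key=lambda i: stats[i][0])
print(suspect)  # 10
```

In this toy case the injected sample has by far the largest mean error, which matches the description of a y outlier in Section 2.4. In practice the cut-off values for mean and SD would be read from a plot such as Fig. 4 rather than taken from a single maximum.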


cellulose. The biological attributes of these wavelengths were consistent with previous research reporting physiological changes in grapevine leaves, i.e., in carbohydrates, water, sugar, and starch, during the progression of grapevine leafroll virus infection (Mannini and Digiaro, 2017).

3.3. Feature wavelength selection

LASSO based feature wavelengths selected from 122 variables for each growth stage in three seasons are summarized in Table 2. The feature wavelengths from the binned dataset are also summarized in Table 2.

Table 2
Feature wavelengths (nm) selected using LASSO from the datasets of each phenological growth stage in three seasons.

Season      Stage  Feature wavelengths (nm)
2017        S1     599, 607, 698, 706, 731, 855, 863, 1103, 1425, 1591
2017        S2     682, 822, 830, 905, 930, 1144, 1335, 1351, 1376, 1508
2017        S3     615, 624, 706, 764, 1029, 1335, 1343, 1376, 1384, 1409
2017        S4     607, 640, 731, 748, 797, 855, 954, 1111, 1409, 1434
2017        S5     607, 640, 715, 756, 888, 896, 1062, 1211, 1525, 1574
2018        S1     615, 657, 682, 698, 764, 772, 781, 1401, 1458, 1591
2018        S2     599, 624, 690, 698, 715, 1012, 1136, 1392, 1442, 1516
2018        S3     698, 921, 930, 971, 979, 1252, 1268, 1285, 1301, 1558
2018        S4     632, 731, 822, 830, 938, 1045, 1492, 1516, 1591, 1599
2018        S5     599, 640, 649, 682, 690, 1392, 1425, 1492, 1566, 1582
2019        S1     615, 640, 682, 690, 715, 731, 1227, 1384, 1417, 1582
2019        S2     599, 607, 649, 839, 913, 1169, 1401, 1425, 1483, 1500
2019        S3     607, 624, 682, 715, 723, 1202, 1392, 1442, 1450, 1591
2019        S4     640, 665, 682, 698, 715, 830, 930, 1533, 1574, 1582
2019        S5     599, 640, 649, 665, 673, 706, 1128, 1409, 1558, 1582
Staged FW   S1     615, 682, 698, 731, 772, 1591
Staged FW   S2     599, 690, 830, 1508
Staged FW   S3     624, 1384
Staged FW   S4     640, 731, 830, 1582
Staged FW   S5     599, 640, 649, 673, 690, 1566, 1582
2017*       S1     617, 654, 687, 720, 753, 860, 1099, 1404, 1441, 1577
2017*       S2     790, 827, 926, 996, 1132, 1165, 1338, 1371, 1511, 1577
2017*       S3     654, 687, 790, 860, 1033, 1132, 1239, 1338, 1371, 1404
2017*       S4     617, 654, 687, 790, 860, 959, 1099, 1239, 1404, 1441
2017*       S5     617, 654, 753, 893, 1066, 1202, 1441, 1478, 1511, 1577
2018*       S1     617, 654, 687, 720, 753, 790, 1099, 1441, 1478, 1577
2018*       S2     687, 720, 790, 893, 996, 1033, 1165, 1404, 1441, 1511
2018*       S3     654, 827, 926, 996, 1165, 1305, 1404, 1441, 1511, 1577
2018*       S4     654, 720, 753, 827, 1066, 1371, 1404, 1478, 1511, 1577
2018*       S5     617, 654, 687, 720, 827, 996, 1404, 1441, 1511, 1577
2019*       S1     654, 687, 720, 753, 860, 1239, 1371, 1404, 1478, 1577
2019*       S2     687, 753, 860, 1066, 1165, 1305, 1371, 1478, 1544, 1577
2019*       S3     617, 687, 720, 753, 893, 1202, 1404, 1441, 1511, 1577
2019*       S4     617, 720, 753, 827, 926, 959, 1239, 1441, 1511, 1577
2019*       S5     617, 654, 687, 720, 753, 790, 1132, 1404, 1478, 1577
Staged FW*  S1     617, 654, 687, 720, 753, 860, 1099, 1404, 1441, 1478, 1577
Staged FW*  S2     687, 790, 996, 1371, 1511, 1577, 1165
Staged FW*  S3     654, 687, 1404, 1441, 1511, 1577
Staged FW*  S4     617, 654, 720, 753, 827, 959, 1239, 1404, 1441, 1511, 1577
Staged FW*  S5     617, 654, 687, 720, 753, 1404, 1441, 1478, 1511, 1577

Note: * denotes the binned dataset; FW: feature wavelength.

Comparing wavelengths selected from all growth stages in three seasons, a feature wavelength that appeared three or more times was considered a key feature wavelength. There were 19 key feature wavelengths (599, 607, 615, 624, 640, 649, 682, 690, 698, 706, 715, 731, 830, 930, 1392, 1409, 1425, 1582 and 1591 nm) determined for the dataset without the binning process. Similarly, for the binned dataset, a feature wavelength that appeared eight or more times was considered a key feature wavelength. There were nine wavelengths (617, 654, 687, 720, 753, 1404, 1441, 1511 and 1577 nm) selected as the key feature wavelengths. Additionally, the feature wavelengths at each stage in three seasons were compared for datasets without and with the binning process and summarized as the staged feature wavelengths in Table 2. Feature wavelengths that appeared two or more times, or that had an adjacent wavelength also selected, were determined as the staged feature wavelengths. Furthermore, staged feature wavelengths that appeared two or more times (dataset without binning) and three or more times (binned dataset) across the five stages were considered key feature wavelengths. There were six (599, 640, 690, 731, 830, and 1582 nm) key feature wavelengths from the dataset without the binning process. Similarly, nine wavelengths (617, 654, 687, 720, 753, 1404, 1441, 1511, and 1577 nm) were determined as the key feature wavelengths from the dataset with the binning process. Also, comparing the staged feature wavelengths from asymptomatic stages (stage 1 to stage 4), the 615, 690, 731, and 830 nm wavelengths appeared twice, or once together with an adjacent wavelength (with a spectral band resolution difference of 8.3 nm). Comparing these four feature wavelengths with those of the symptomatic stage (stage 5), we found the wavelength of 690 nm appearing in both asymptomatic and symptomatic stages. The 690 nm band is attributed to the chlorophyll content of leaves, which suggests that the chlorophyll content of GLD infected leaves might change during infection progression. This is consistent with the findings of Hadaway (2019), which stated that the chlorophyll content in infected grapevine leaves was reduced compared with healthy leaves. Finally, comparing datasets with and without the binning process for all growth stages in three seasons, wavelengths whose difference was within one bin (25 nm) were selected as common wavelengths. Eight wavelengths (615, 649, 690, 715, 731, 1409, 1425 and 1582 nm) were determined as the common wavelengths.

The major constituents in a leaf are chlorophyll, carbohydrate, water, starch, and cellulose. According to Blackburn (2006), for the spectral interval between 530 nm and 630 nm, a narrow band around 700 nm, and between 710 nm and 800 nm, the spectral behavior of the scattered radiation from the green leaf interior was mainly determined by the absorption spectra of pigments, such as chlorophyll in most leaves. Thus, the selected feature wavelengths from 615 nm to 731 nm could be attributed to chlorophyll content. The wavelength at 1409 nm may have resulted from water absorption (Darvishzadeh et al., 2011). The wavelengths of 1425 and 1582 nm have been reported as related to lignin and starch absorption, respectively (Mobasheri and Rahimzadegan, 2012).

3.4. Robustness of feature wavelengths

3.4.1. Sensitivity

The P-value and regression coefficient between the dependent variable Y (normalized reflectance of an individual common wavelength) and the independent variable X (leaf class: '1' for healthy samples, '2' for infected samples) were calculated for each stage in three seasons and summarized in supplemental Table 3. Common wavelengths with a P-value less than or equal to 0.05 were considered significant with respect to the leaf class.

From the significant common wavelengths selected based on the P-values, the five largest absolute regression coefficient values were considered, for each growth stage in three seasons, to select five significant common wavelengths. All the significant and important


Table 4
Classification accuracy of the LS-SVM models based on salient wavelengths for each growth stage in three seasons.
                     Training                            Test
                     2017        2018        2019        2017        2018        2019
Stage  Class         C1    C2    C1    C2    C1    C2    C1    C2    C1    C2    C1    C2
S1     C1            302   30    297   45    235   108   142   21    107   31    91    59
       C2            12    350   66    270   106   243   9     126   37    115   40    107
       Accuracy (%)  93.95       83.63       69.08       89.93       76.55       66.67
S2     C1            309   33    257   77    229   118   114   41    106   45    88    62
       C2            52    300   61    288   117   230   21    122   41    101   55    93
       Accuracy (%)  87.75       79.80       66.14       79.19       70.65       60.74
S3     C1            248   99    267   72    260   91    92    51    105   42    102   41
       C2            74    272   55    285   92    244   44    110   29    115   51    101
       Accuracy (%)  75.04       81.30       73.36       68.01       75.60       68.81
S4     C1            253   96    308   47    282   67    104   39    113   27    107   32
       C2            84    253   54    281   63    273   47    104   23    132   44    111
       Accuracy (%)  73.76       85.36       81.02       70.75       83.05       74.15
S5     C1            315   19    348   2     351   1     134   17    142   3     142   0
       C2            42    302   23    303   7     323   30    110   9     135   8     142
       Accuracy (%)  91.00       96.30       98.83       83.85       95.85       97.26

Note: C1: class 1 of healthy leaf samples; C2: class 2 of the infected leaf samples.
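
The accuracy rows in Table 4 are consistent with the usual definition of overall accuracy, i.e., the sum of the diagonal counts of each 2 x 2 block divided by the total number of samples in that block. The short Python sketch below checks this for three S1 blocks copied from the table. The function name `accuracy_percent` and the variable names are illustrative assumptions, not part of the original study, and the sketch does not reproduce the LS-SVM classifier itself.

```python
def accuracy_percent(confusion):
    """Overall accuracy (%) from a square confusion matrix given as nested lists."""
    # Correctly classified samples lie on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


# Stage S1 counts copied from Table 4 (rows C1, C2; columns C1, C2).
s1_train_2017 = [[302, 30], [12, 350]]
s1_test_2017 = [[142, 21], [9, 126]]
s1_test_2019 = [[91, 59], [40, 107]]

for name, block in [
    ("S1 training 2017", s1_train_2017),
    ("S1 test 2017", s1_test_2017),
    ("S1 test 2019", s1_test_2019),
]:
    print(f"{name}: {accuracy_percent(block):.2f}%")
```

Under this definition the three blocks give 93.95, 89.93 and 66.67%, matching the corresponding accuracy entries in the table.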

common wavelengths for all the stages in the 2017, 2018 and 2019 seasons were then compared, and wavelengths that appeared seven times were considered the salient wavelengths. Finally, six wavelengths (690, 715, 731, 1409, 1425 and 1582 nm) were determined as the salient wavelengths. Our prior studies found that the 701, 726, and 1600 nm wavelengths played pivotal roles in identifying GLD-infected leaves (Naidu et al., 2009; Sinha et al., 2019); these are close to the 690, 731, and 1582 nm salient wavelengths finalized in this study.

3.4.2. Accuracy

The classification accuracy of the salient wavelengths was evaluated with the LS-SVM classifier for each stage in the three seasons (Table 4). Model performance varied across stages and seasons. The classification accuracy of the training set ranged from 66.14 to 98.83% overall, with 69.08 to 93.95% for the early growth stage (S1), 66.14 to 87.75% for the mid growth stages (S2, S3 and S4), and 91.00 to 98.83% for the symptomatic growth stage (S5). Overall, the early stage S1 performed well, with test accuracies of 89.93, 76.55 and 66.67% in the 2017, 2018 and 2019 seasons, respectively.

In GLRaV-3-infected grapevine leaves, the contents of pigments, such as chlorophylls, carotenoids and anthocyanins, and of carbohydrates were found to be affected (Mannini and Digiaro, 2017). Specifically, there was an overall reduction in chlorophyll and carotenoids, and an accumulation of anthocyanins and carbohydrates, in GLRaV-3-infected leaves (Gutha et al., 2010). These changes could lead to lower photosynthetic rates, stomatal conductance and transpiration in leaves (Bertamini et al., 2004). Therefore, the symptomatic stage S5 achieved higher classification accuracies, while the early stage without symptoms had lower accuracies. Overall, the results showed that the LS-SVM classifier performed well when used to classify healthy and infected leaves at the early growth stage (S1), which indicates the possibility of identifying virus-infected leaves at an early stage using hyperspectral imaging. The selected salient wavelengths (690, 715, 731, 1409, 1425 and 1582 nm) provide a basis for identifying GLD-infected leaves during asymptomatic stages. The future direction of this study will be to develop a customized multispectral sensing module based on the selected salient wavelengths.

3.5. Conclusions

Hyperspectral imagery was useful for non-destructive detection of GLD-infected leaves during asymptomatic and symptomatic stages in a red-berried wine grape cultivar. This study validated the feasibility of applying this technique for early detection of virus infections. The combination of six salient wavelengths (690, 715, 731, 1409, 1425 and 1582 nm) identified from hyperspectral images could be used for early detection of GLRaV-3-infected grapevines. Virus-infected leaves from the early phenological growth stage (S1) showed acceptable performance, with LS-SVM classifier accuracies ranging from 66.67 to 89.93%. However, GLRaV-3 infection is a complex process that depends on many factors, such as the virus, the cultivar, and other biotic and abiotic factors (Naidu et al., 2009), and additional investigation of potential confounding effects is needed prior to the development of a miniaturized sensing system.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was supported in part by United States Department of Agriculture National Institute for Food and Agriculture Project Funds (Project # 1005756, 1001246, WNP00006, and WNP00745). We would like to thank Dr. Sridhar Jarugula for his help in completion of this study. Zongmei Gao would like to thank China Scholarship Council (CSC) for sponsoring her study at Washington State University.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.compag.2020.105807.

References

Adiputra, J., Kesoju, S.R., Naidu, R.A., 2018. The Relative Occurrence of Grapevine leafroll-associated virus 3 and Grapevine red blotch virus in Washington State Vineyards. Plant Dis. 102 (11), 2129–2135.
Al-Saddik, H., Laybros, A., Billiot, B., Cointault, F., 2018. Using image texture and spectral reflectance analysis to detect Yellowness and Esca in grapevines at leaf-level. Remote Sens. 10 (4), 618.
Bertamini, M., Muthuchelian, K., Nedunchezhian, N., 2004. Effect of grapevine leafroll on the photosynthesis of field grown grapevine plants (Vitis vinifera L. cv. Lagrein). J. Phytopathol. 152 (3), 145–152.
Bhardwaj, K., Patra, S., 2018. An unsupervised technique for optimal feature selection in attribute profiles for spectral-spatial classification of hyperspectral images. ISPRS J. Photogramm. Remote Sens. 138, 139–150.
Blackburn, G.A., 2006. Hyperspectral remote sensing of plant pigments. J. Exp. Bot. 58 (4), 855–867.
Burger, J.T., Maree, H.J., Gouveia, P., Naidu, R.A., 2017. Grapevine leafroll-associated virus 3. In: Grapevine Viruses: Molecular Biology, Diagnostics and Management. Springer, Cham, pp. 167–195.


Cannistra, Steve, 2011. Small explanation of binning in image processing. http://www.starrywonders.com/binning.html.
Cao, H., Yang, Y., Zhang, W., Wan, Q., Xu, L., Ge, D., Huang, B., 2019. Detection of waterlogging stress based on hyperspectral images of oilseed rape leaves (Brassica napus L.). Comput. Electron. Agric. 159, 59–68.
Coombe, B.G., 1995. Growth Stages of the Grapevine: Adoption of a system for identifying grapevine growth stages. Aust. J. Grape Wine Res. 1 (2), 104–110.
Darvishzadeh, R., Atzberger, C., Skidmore, A., Schlerf, M., 2011. Mapping grassland leaf area index with airborne hyperspectral imagery: A comparison study of statistical approaches and inversion of radiative transfer models. ISPRS J. Photogramm. Remote Sens. 66 (6), 894–906.
Dolja, V.V., Meng, B., Martelli, G.P., 2017. Evolutionary aspects of grapevine virology. In: Grapevine Viruses: Molecular Biology, Diagnostics and Management. Springer, Cham, pp. 659–688.
Gao, Z., Zhao, Y., Khot, L.R., Hoheisel, G.-A., Zhang, Q., 2019. Optical sensing for early spring freeze related blueberry bud damage detection: Hyperspectral imaging for salient spectral wavelengths identification. Comput. Electron. Agric. 167, 105025.
Gohil, H., Nita, M., Pavlis, G., Ward, D., 2016. Red leaves in the vineyard: Biotic and abiotic causes. Rutgers New Jersey Agricultural Experiment Station, New Brunswick, NJ.
Gutha, L.R., Casassa, L.F., Harbertson, J.F., Naidu, R.A., 2010. Modulation of flavonoid biosynthetic pathway genes and anthocyanins due to virus infection in grapevine (Vitis vinifera L.) leaves. BMC Plant Biol. 10 (1), 187.
Gutiérrez, S., Fernández-Novales, J., Diago, M.P., Tardaguila, J., 2018. On-the-go hyperspectral imaging under field conditions and machine learning for the classification of grapevine varieties. Front. Plant Sci. 9, 1102.
Hadaway, K.M., 2019. Studies on cost analysis of viral diagnostics and red leaf symptoms in grapevines (Unpublished Master's thesis). Washington State University, Prosser, WA, USA.
Hodge, V., Austin, J., 2004. A survey of outlier detection methodologies. Artif. Intell. Rev. 22 (2), 85–126.
Jarolmasjed, S., Kostick, S., Si, Y., Quiros, J., Marzougui, A., Evans, K., Sankaran, S., 2019. High-Throughput Phenotyping of Fire Blight Disease Symptoms Using Sensing Techniques in Apple. Front. Plant Sci. 10, 576.
Kicherer, A., Herzog, K., Bendel, N., Klück, H.C., Backhaus, A., Wieland, M., Petry, W., 2017. Phenoliner: a new field phenotyping platform for grapevine research. Sensors 17 (7), 1625.
Knauer, U., Matros, A., Petrovic, T., Zanker, T., Scott, E.S., Seiffert, U., 2017. Improved classification accuracy of powdery mildew infection levels of wine grapes by spatial-spectral analysis of hyperspectral images. Plant Methods 13 (1), 47.
MacDonald, S.L., Staid, M., Staid, M., Cooper, M.L., 2016. Remote hyperspectral imaging of grapevine leafroll-associated virus 3 in cabernet sauvignon vineyards. Comput. Electron. Agric. 130, 109–117.
Mahlein, A.K., Rumpf, T., Welke, P., Dehne, H.W., Plümer, L., Steiner, U., Oerke, E.C., 2013. Development of spectral indices for detecting and identifying plant diseases. Remote Sens. Environ. 128, 21–30.
Mahlein, A.K., 2016. Present and future trends in plant disease detection. Plant Dis. 100 (2), 1–11.
Mannini, F., Digiaro, M., 2017. The effects of viruses and viral diseases on grapes and wine. In: Grapevine Viruses: Molecular Biology, Diagnostics and Management. Springer, Cham, pp. 453–482.
Maree, H.J., Almeida, R.P., Bester, R., Chooi, K.M., Cohen, D., Dolja, V.V., Naidu, R.A., 2013. Grapevine leafroll-associated virus 3. Front. Microbiol. 4, 82.
Mobasheri, M.R., Rahimzadegan, M., 2012. Introduction to Protein Absorption Lines Index for Relative Assessment of Green Leaves Protein Content Using EO-1 Hyperion Datasets. J. Agr. Sci. Tech 14, 135–147.
Naidu, R.A., Perry, E.M., Pierce, F.J., Mekuria, T., 2009. The potential of spectral reflectance technique for the detection of Grapevine leafroll-associated virus-3 in two red-berried wine grape cultivars. Comput. Electron. Agric. 66 (1), 38–45.
Naidu, R., Rowhani, A., Fuchs, M., Golino, D., Martelli, G.P., 2014. Grapevine Leafroll: A Complex Viral Disease Affecting a High-Value Fruit Crop. Plant Dis. 98 (9), 1172–1185.
Pantazi, X.E., Moshou, D., Bravo, C., 2016. Active learning system for weed species recognition based on hyperspectral sensing. Biosyst. Eng. 146, 193–202.
Park, B., Lu, R. (Eds.), 2015. Hyperspectral imaging technology in food and agriculture. Springer, New York.
Polder, G., Blok, P.M., de Villiers, H., van der Wolf, J.M., Kamp, J., 2019. Potato virus Y detection in seed potatoes using deep learning on hyperspectral images. Front. Plant Sci. 10, 209.
Rayapati, A.N., O'Neil, S., Walsh, D., 2008. Grapevine leafroll disease. WSU Extension Bulletin EB 2027E, 20.
Rowhani, A., Osman, F., Daubert, S.D., Al Rwahnih, M., Saldarelli, P., 2017. Polymerase Chain Reaction Methods for the Detection of Grapevine Viruses and Viroids. In: Grapevine Viruses: Molecular Biology, Diagnostics and Management. Springer, Cham, pp. 431–450.
Sankaran, S., Mishra, A., Maja, J.M., Ehsani, R., 2011. Visible-near infrared spectroscopy for detection of Huanglongbing in citrus orchards. Comput. Electron. Agric. 77 (2), 127–134.
Sankaran, S., Maja, J., Buchanon, S., Ehsani, R., 2013. Huanglongbing (Citrus Greening) Detection Using Visible, Near Infrared and Thermal Imaging Techniques. Sensors 13 (2), 2117–2130.
Shao, Y., Xuan, G., Hu, Z., Gao, Z., Liu, L., 2019. Determination of the bruise degree for cherry using Vis-NIR reflection spectroscopy coupled with multivariate analysis. PLoS ONE 14 (9).
Siedliska, A., Baranowski, P., Zubik, M., Mazurek, W., Sosnowska, B., 2018. Detection of fungal infections in strawberry fruit by VNIR/SWIR hyperspectral imaging. Postharvest Biol. Technol. 139, 115–126.
Sinha, R., Khot, L.R., Rathnayake, A.P., Gao, Z., Naidu, R.A., 2019. Visible-near infrared spectroradiometry-based detection of grapevine leafroll-associated virus 3 in a red-fruited wine grape cultivar. Comput. Electron. Agric. 162, 165–173.
Tereshchenko, A., Fedorenko, V., Smyntyna, V., Konup, I., Konup, A., Eriksson, M., Bechelany, M., 2017. ZnO films formed by atomic layer deposition as an optical biosensor platform for the detection of Grapevine virus A-type proteins. Biosens. Bioelectron. 92, 763–769.
Tereshchenko, A., Yazdi, G.R., Konup, I., Smyntyna, V., Khranovskyy, V., Yakimova, R., Ramanavicius, A., 2020. Application of ZnO nanorods based whispering gallery mode resonator in optical immunosensors. Colloids Surf. B: Biointerfaces 110999.
Xu, H.R., Ying, Y.B., Fu, X.P., Zhu, S.P., 2007. Near-infrared spectroscopy in detecting leaf miner damage on tomato leaf. Biosyst. Eng. 96 (4), 447–454.
Zhang, L., Wang, D., Gao, R., Li, P., Zhang, W., Mao, J., Zhang, Q., 2016. Improvement on enhanced Monte-Carlo outlier detection method. Chemometr. Intell. Lab. Syst. 151, 89–94.
Zhang, R., Zhang, F., Chen, W., Yao, H., Ge, J., Wu, S., Du, Y., 2018. A new strategy of least absolute shrinkage and selection operator coupled with sampling error profile analysis for wavelength selection. Chemometr. Intell. Lab. Syst. 175, 47–54.
