I C G (HLB) U Vis-Nir S T: Dentification of Itrus Reening Sing A Pectroscopy Echnique

IDENTIFICATION OF CITRUS GREENING (HLB)
USING A VIS-NIR SPECTROSCOPY TECHNIQUE

A. R. Mishra, D. Karimi, R. Ehsani, W. S. Lee
ABSTRACT. Citrus greening, also known as Huanglongbing or HLB, is a major threat to the U.S. citrus industry. Current-
ly, scouting and visual inspection are used for screening infected trees. However, this is a time-consuming and expensive
method for HLB disease detection. Moreover, as it is subjective, the current method exhibits high detection error rates.
The objective of this study was to investigate the potential of visible and near-infrared (VIS-NIR) spectroscopy for identify-
ing HLB-infected citrus trees. The spectral data from infected and healthy orange trees of the Valencia variety were col-
lected from four different orchards in Florida. Two different spectroradiometers with a spectral range of 350 to 2500 nm
were used to collect the canopy reflectance spectral data. Three classification techniques were used to classify the data: k-
nearest neighbors (KNN), logistic regression (LR), and support vector machines (SVM). Analysis showed that using only
one canopy reflectance observation per tree was inadequate. None of the classification methods was successful in discrim-
inating healthy trees from HLB-infected trees because of the large variability in the canopy reflectance spectral data.
When five spectra from the same tree were used for classification, the SVM and weighted KNN methods classified spectra
with 3.0% and 6.5% error rates, respectively. The results from this study indicate that canopy VIS-NIR spectral reflectance
data can be used to detect HLB-infected citrus trees; however, high classification accuracy (>90%) requires multiple
measurements from a single tree.
Keywords. Support vector machines, k-Nearest neighbors, Logistic regression, Citrus greening.
C
itrus greening, also known as Huanglongbing healthy leaves. Yellow, angular blotching has been consid-
(HLB) (Candidatus Liberibacter asiaticus), is a ered a symptom specific to HLB disease and consists of
systemic bacterial disease transmitted by Asian blotches of yellow on dark greenish-gray leaves. HLB
citrus psyllids (Diaphorina citri Kuwayama). The symptoms appear on leafy shoots as they grow from
disease is causing substantial economic losses to the citrus branches and after leaves have matured normally. By the
industry by shortening the life span of infected trees. In the time these symptoms are apparent, a tree can already be se-
infected citrus orchards, trees decline and the productive verely affected. Takushi et al. (2007) reported that the
duration of fruit bearing is reduced. As no cure has been re- starch content of HLB-infected leaves can be 20 times
ported for citrus greening so far, removing the infected higher than that of healthy leaves. Etxeberria et al. (2009)
trees can reduce the spread of HLB. The elimination and studied the anatomical distribution of abnormally high lev-
removal of infected trees due to citrus canker (Xanthomo- els of starch in HLB-infected Valencia orange trees. They
nas citri subsp. citri) and HLB contributed to a gross loss reported phloem collapse in HLB-infected leaves in addi-
of 8,061 ha (19,918 acres) in Florida (NASS, 2009). Mura- tion to starch accumulation. They found multiple starch
ro (2007) reported that the total costs of growing citrus fruit grains per chloroplast in HLB-infected leaf palisade cells,
increased from $4,096 to $5,643 per ha due to HLB. whereas healthy leaf chloroplasts had a small number of li-
The first visible sign of citrus greening may either be pid inclusions and occasional smaller starch grains. They
leaf yellowing or mottling, which is not uniform and can further reported that HLB-infected leaves have a corky tex-
affect one branch without having any effect on another ture due to the thicker photosynthetic cell walls.
(Schneider, 1968). Figure 1 shows HLB-infected and Accurate diagnosis of HLB is essential before applying
control strategies like tree removal. HLB diagnosis is diffi-
cult to make based on field observations, as the symptoms
Submitted for review in March 2011 as manuscript number IET 8453; resemble nutrient deficiencies such as iron or zinc deficien-
approved for publication by the Information & Electrical Technologies cy (Etxeberria et al., 2007). Electron microscopy and bioas-
Division of ASABE in February 2012. say can be used to diagnose HLB, but these methods are
The authors are Ashish R. Mishra, ASABE Student Member, Grad-
uate Research Assistant, Davood Karimi, ASABE Member, Postdoctoral time-consuming and impractical (Chung and Brlansky,
Researcher, Reza Ehsani, ASABE Member, Associate Professor, Citrus 2005). Molecular methods such as real-time polymerase
Research and Education Center (CREC), University of Florida, Lake Al- chain reaction (PCR) based assays are used to detect the
fred, Florida; and Won Suk Lee, ASABE Member, Associate Professor, presence of HLB. However, identification of suspected
Department of Agricultural and Biological Engineering, University of
Florida, Gainesville, Florida. Corresponding author: Reza Ehsani, Citrus plants by foliar and fruit symptoms is required by a trained
Research and Education Center (CREC), 700 Experiment Station Rd., scouting crew prior to real-time PCR assays. Because the
Lake Alfred, FL 33850; phone: 863-956-1151, ext. 1228; e-mail: ehsani@ current HLB detection methods are expensive, time-
ufl.edu.
Transactions of the ASABE

Vol. 55(2): 711-720 © 2012 American Society of Agricultural and Biological Engineers ISSN 2151-0032 711
nifera L.) that causes leaf roll disease. They used vegeta-
tion indices to discriminate this disease and reported an ac-
curacy of 75% with discriminant analysis. In another study,
Sclerotinia rot disease caused by the fungus Sclerotinia
sclerotinia was investigated on celery (Apium graveolens)
by Huang and Apan (2006) under field conditions using
hyperspectral data. The authors concluded that the VIS-
NIR range from 400 to 1300 nm was as good as the full
spectrum (400 to 2500 nm) to detect this disease. Smith et
al. (2005) studied plant stress caused by elevated levels of
natural gas in the soil, dilute herbicide solution, and ex-
treme shade. They found that the red edge position in the
spectral data was strongly correlated with chlorophyll con-
(a) tent across all treatments. The ratio of reflectance centered
on the wavelengths at 670 and 560 nm was used to detect
increases in red pigmentation in gas and herbicide stressed
leaves. Stress due to extreme shade could be distinguished
from the stress caused by natural gas and herbicide from
the change in the spectrum. Liu et al. (2007) characterized
and estimated rice brown spot disease severity using step-
wise regression, principal component regression, and par-
tial least squares regression. They concluded that it was
feasible to estimate the disease severity using hyperspectral
reflectance from the leaves. Poole et al. (2008) investigated
HLB discrimination with VIS/NIR spectroscopy in the lab.
They found that reflectance associated with chlorophyll ab-
sorption decreased in HLB-infected leaves. Their partial
(b)
Figure 1. (a) HLB-infected leaves and (b) healthy leaves. least squares (PLS) analysis showed a larger absorption
band at 612 nm, which was affected by chlorosis in HLB-
consuming, and tedious, a rapid and reliable method to dis- infected leaves. Lee et al. (2008) used aerial hyperspectral
tinguish HLB-infected trees from healthy trees is necessary. imaging to detect HLB by applying spectral angle mapping
Sankaran et al. (2010) reviewed various visible (VIS) (SAM) and spectral feature fitting (SFF) methods. They re-
and near-infrared (NIR) spectroscopy studies that were ported that it was difficult to obtain good results with SAM
used in plant disease detection. Previous studies have and SFF because of the positioning errors in GPS ground
shown that VIS-NIR spectroscopy has the potential to iden- truthing and aerial imaging, and the spectral similarity be-
tify plant anomalies due to disease or malnutrition (Mu- tween non-symptomatic HLB-infected trees and healthy
hammed, 2005; Yang et al., 2007; Naidu et al., 2009; trees. Delalieux et al. (2007) used hyperspectral imaging
Huang and Apan, 2006; Liu et al., 2007; Poole et al., 2008; and parametric approaches such as logistic regression, par-
Delalieux et al., 2007). The VIS-NIR spectrum of a leaf tial least squares, discriminant analysis, and tree-based
contains information on plant pigment concentration, leaf modeling to detect biotic stress (apple scab; Venturia in-
cellular structure, and leaf moisture content (Borengasser et qequalis (Cooke) G. Winter) in apple trees.
al., 2001). Muhammed (2005) investigated fungal disease A method for timely detection of HLB in the field can
severity in wheat using canopy reflectance. He used 120 assist growers to better manage and control the disease, re-
hyperspectral crop reflectance data vectors of 164 spectral sulting in significant production and economic benefits.
bands corresponding to fungal disease infected trees and The long-term goal of this study is to develop a ground-
healthy trees. The author used a nearest neighbor classifier based method to detect HLB at the early stages of devel-
and characterized the disease severity successfully. VIS- opment in citrus orchards. The specific objective of this
NIR spectroscopy was utilized in characterizing insect in- study was to investigate the possibility of identifying HLB-
festation in a rice canopy caused by brown planthoppers infected trees using VIS-NIR spectroscopy and to compare
(Nilaparvata lugens Stål) and leaffolders (Cnaphalocrosis three classification techniques to analyze the spectral data.
medinalis Guenee (Lep., Pyralidae)) by Yang et al. (2007).
The authors used linear correlation intensity analysis and
correlation coefficients to characterize infestation severity MATERIALS AND METHODS
over the range of 350 to 2,400 nm. They found that the cor-
FIELD EXPERIMENTS
relation coefficient (r) at 426 nm was the highest (r =
A total of 1,239 spectra were collected from 135 (80
0.878), and the coefficient of determination (R2) of a ratio
HLB, 55 healthy) Valencia orange trees. Table 1 shows de-
index (RNIR/RRed) was the highest (R2 = 0.922, p < 0.001)
tailed information on the data used in this study. All the
with brown planthopper insect infestation. Naidu et al.
healthy and HLB-infected trees in this study were 15 to 20
(2009) investigated viral infection in grapevines (Vitis vi-
years old. Spectral data were collected using two portable
712 TRANSACTIONS OF THE ASABE

Table 1. Spectral data used in this study.
Location Data Collection No. of Trees No. of Spectra Spectroradiometer
in Florida Date HLB Healthy HLB Healthy Type
Lake Alfred 13 June 2007 2 2 119 122 ASD
Lake Alfred 25 June 2007 4 4 90 60 ASD
Lake Placid 22 February 2007 4 4 24 24 ASD
Immokalee 16 January 2009 10 5 100 50 SpectraVista
Immokalee 5 February 2009 10 5 100 50 SpectraVista
Immokalee 27 February 2009 10 5 100 50 SpectraVista
Southern Garden 24 March 2009 40 30 200 150 SpectraVista
spectroradiometers. The first series of data were collected Both spectroradiometers provided relative reflectance val-
with a FieldSpec 3 spectroradiometer (Analytical Spectral ues of the tree canopy based on the reference value. The da-
Devices - ASD, Boulder, Colo.). Reflectance data from 350 ta analysis described in this article was performed in
to 2,500 nm were collected using the spectroradiometer and Matlab (Mathworks, Inc., Natick, Mass.).
transmitted wirelessly to a laptop computer. The equipment
was set up to use an average of ten scans to represent a sin- SPECTRAL PRETREATMENT AND FEATURE SELECTION
gle observation. Bare fiber optic cable with 25° field of Multiplicative scatter correction (MSC) was performed
view (FOV) was used for data collection. The ASD Field- on the NIR portion of the spectra (Martens and Naes,
Spec spectroradiometer has a circular field of view. The in- 2001). Then the local mean and the first and second deriva-
tegration time was optimized using the optimization option tives were calculated at 25 nm intervals from 375 to 1,325
within the software. Optimization values depend on the re- nm, from 1,500 to 1,750 nm, and from 2,050 to 2,300 nm
sponse to light in a particular spectral region. The bare fiber (removing the noisy regions). At each of these 61 points,
optic cable was placed at a distance of approximately 50 to the local mean was computed by averaging the spectral re-
80 cm from the tree canopy, where the canopy was in the flectance values for that wavelength and its four neighbors.
range of FOV of the sensor. During data collection, the area For example, the local mean at 375 nm was computed by
scanned by the spectroradiometer was approximately 385 averaging from 373 to 377 nm. In order to avoid noise am-
to 988 cm2. plification, which results from differentiation, the Savitzky-
The second series of data were collected using a SVC Golay method (Orfanidis, 1996) was used to compute the
HR-1024 portable spectroradiometer (Spectra Vista Corp., first and second derivatives. A quadratic polynomial and a
Poughkeepsie, N.Y.). The spectral range of the SVC is 350 window size of 21 were used. Since three values were
to 2,500 nm with spectral resolutions of 3.5 nm (at 350 to computed for each of the 61 points mentioned above (i.e.,
1,000 nm), 9.5 nm (at 1,000-1,850 nm), and 6.5 nm (at the mean and the first and second derivatives), each spec-
1,850 to 2,500 nm), which was similar to the ASD Field- trum was represented by a feature vector of 183 elements.
Spec spectroradiometer. The SVC HR-1024 spectroradiom- Because the number of features was significantly high and
eter communicated with a handheld PDA through a wire- many of these features could be correlated, principal com-
less connection. All the data were collected with 4° FOV. ponent analysis (PCA) was performed to reduce the number
The FOV of the SVC spectroradiometer was rectangular, of features (fig. 2). The 183 spectral elements were reduced
which covered an area of 4.8 × 9.4 cm from a distance of 1 to 25 principal components (PCs) that accounted for more
m; the area was approximately 45 cm2. The SVC spectrora- than 90% of the variance within the data. From figure 2, it
diometer had a minimum integration time of 1 ms. Under can be seen that the first 25 PCs contribute about 90% of
field conditions, the scan time was set at 4 s. Laboratory the variance. A preliminary analysis was performed to see
tests were conducted with various objects to compare the whether increasing the number of PCs would improve the
spectral signatures of the ASD FieldSpec and SVC HR- classification results. No significant improvements were
1024 spectroradiometers. Similar results were obtained obtained by increasing the number of PCs beyond 25. Fig-
from both spectroradiometers. ure 3 shows a plot of PCA loadings for the first two princi-
Dark current measurements were automatically taken pal components.
immediately prior to the reference or target scans by both
the ASD FieldSpec and SVC spectroradiometers. The FOV
for both sensors would only cover a small section of a
branch. The spectral data were collected from sunlit cano-
pies between 11:00 a.m. to 2:00 p.m. with no additional ar-
tificial light. Spectral reflectance was collected from tree
canopies from sunlit sides of the tree canopy to maximize
the incident light intensity. In the case of HLB-infected
trees, the reflectance spectral data were collected only from
branches with symptomatic leaves. These symptomatic
leaves were marked by trained scouts and confirmed by
PCR test before data collection. Reference readings of a
white panel were collected every 10 min to reduce errors in
reflectance caused by changes in atmospheric conditions.
Figure 2. Variance of data explained by principal component analysis.
55(2): 711-720 713

(a)
(b)
(c)
Figure 3. (a) Sample spectrum, (b) loading coefficients for the first PC, and (c) loading coefficients for the second PC.
are then used to find the k training samples that are closest
CLASSIFICATION METHODS to the unknown sample, and a prediction is made based on
The vectors of the principal components were used as these nearest neighbors. For a classification problem, this
the input to the classification algorithms. The output classes can be done simply by a majority vote. A more sophisticat-
obtained from the classification models were the “diseased” ed approach, used in this study, was to give a different
and “healthy” trees. For each classification method, 75% of weight to the contribution of each of the k-nearest neigh-
the data were used for training, while 25% of the data were bors in the prediction. Commonly, a weight that is inversely
used for testing the accuracy of the classifier. proportional to the square of the Euclidian distance is used
(Mitchell, 1997):
Weighted k-Nearest Neighbors (KNN)
Weighted k-nearest neighbors (KNN) is an instance- K
based classification algorithm. This type of algorithm does f̂ ( xunknown ) = argmax
v∈V
 wi δ ( v, f ( xi ) )
not develop an explicit function or model for predicting the i =1 (1)
target classes. Instead, all the training samples are stored, where
and computation is delayed until a new unknown sample 1
must be classified. For a new sample, the Euclidian dis- wi =
n
 ( xunknown − xij )
j 2
tance between its feature vector and the feature vectors of
the training samples is computed. The computed distances j =1
(2)

In equations 1 and 2, f̂ indicates the predicted class for only two target classes. However, it can be generalized to
the new unknown feature vector xunknown, x ij denotes the jth solve problems that involve more than two classes. For lin-
element of the ith feature vector, wi are the weights, and the early separable data, such as the data that are shown in fig-
function δ is defined as follows: ure 5a, one can draw many different hyperplanes that can
separate the data. The goal in SVM is to find a hyperplane
1 if a = b that separates the data with the largest possible margin, as
δ ( a,b ) = 
shown in figure 5b. SVM uses numeric labels 1 and -1 to
0 otherwise (3)
identify the two classes. In this article, the following nota-
A k of 12 was used in this study based on preliminary anal- tion is used to show the SVM classifier (Webb, 2002):
ysis of the data.
(
y = gw ( x ) = f wT x + w0 ) (5)
Logistic Regression (LR)
Logistic regression provides a model for the probability The parameters w and w0 define the decision boundary
of occurrence of an event by fitting the data to a logistic (i.e., the separating hyperplane). The intercept w0 is a sca-
curve. As shown in figure 4, this curve maps the entire real lar, while w and the feature vector x are n-dimensional vec-
axis onto the interval [0,1], making it ideal for modeling the tors. In equation 5, f(t) = 1 if t ≥ 0 and f(t) = -1 if t < 0. The
probability of an event. When used in classification, the equation defining the separating hyperplane is as follows:
variable z is usually defined as a linear combination of the
wT x + w0 = 0 (6)
features (Witten and Frank, 2005):
The geometric margin of the ith example from the sepa-
( )
g θ ( x ) = f θT x =
1
1 + e −θ
T
x rating hyperplane can be calculated using the following
(4) equation:
where θ is the parameter vector, x is the feature vector, and  wT x i + w0 
Di = y i  
f(t) = 1 if t ≥ 0 and f(t) = 0 if t < 0. A batch gradient descent  w 
approach was used for training the model, as shown in the   (7)
i i t
following pseudo code: Here, a positive D would mean that y and w x + b are of
Do until convergence {
for all j
N
θj = θj +ε  xij ( yi − gθ ( xi ) )
i =1
end
}
where N is the number of features, ε is the learning rate,
and y is the true label (either 1 or 0). A sufficiently small ε
will ensure that the global optimum will be achieved, alt-
hough the computation time will increase by decreasing the
value of ε. In this study, a value of ε = 0.001 was used.
Support Vector Machines (SVM) (a)
Support vector machines (SVM) is one of the most suc-
cessful machine learning algorithms that is widely used in
various fields (Byvatov and Schneider, 2003; Rumpf et al.,
2010). The basic SVM solves a classification problem with
(b)
Figure 5. (a) Example of a linearly separable set of data, and (b) the
Figure 4. Logistic curve describing the logistic regression model. maximum margin classifier for this data set.
55(2): 711-720 715

the same sign, or equivalently, the ith example is correctly N
1 N
classified. The geometric margin of the classifier with re- max α  αi − 2  yi y j αi α j ( xi ,x j )
spect to the set of all training samples, {(x1,y1), ..., (xN,yN)}, i =1 i, j =1
is defined as the minimum of the geometric margins for αi ≥ 0, i = 1, ,N
each sample: 
D = min Di such that:  N
i =1:N (8) 
i
 αi y = 0
 i =1 (12)
Therefore, the goal of SVM is to find a separating hy-
perplane that maximizes D. In mathematical terms, SVM In equation 12, αi are the Lagrange multipliers, and
will pose the following optimization problem: (xi,xj) indicates the inner product between the ith and jth
feature vectors. The interesting point is that all of the La-
max w,w0 D
grange multipliers are zero except for the ones that corre-

such that: 
( 0 )
 y i wT x i + w ≥ D, i = 1, , N spond to the support vectors. Once the above optimization
problem is solved, w and w0 can be found using the follow-
 w = 1 ing equations:
(9)
 N
The second constraint, w = 1 , is a non-convex con-
straint, and this optimization problem cannot be directly
 w=

 αi y i xi
i =1
solved. It can be shown (Gunn, 1998) that this optimization 
problem can be replaced by the following equivalent prob-  max i,yi =−1wT xi + min i,y i =1wT xi
 w0 = −
lem:  2 (13)
1 2
min w,w0 w It is important to note that the new optimization problem
2
in equation 12 is only in terms of the inner products of the
( )
such that: y i wT xi + w0 ≥ 1, i = 1, ,N
(10)
feature vectors. Moreover, with the new definition of w in
terms of Lagrange multipliers (eq. 13), for a new example
In equation 10, yi(wTxi + w0) is called the functional x, the prediction of the SVM classifier can be written as fol-
margin and is shown by di. The difference between the lows:
functional margin and the geometric margin is that multi- N 
plying w and w0 by a constant will not change the geomet-
 
y = gw ( x ) = f  αi yi x,xi + w0 ( )

ric margin but will change the functional margin. The func-  i =1  (14)
tional and geometric margins are related as follows:
In other words, the entire algorithm can be written in terms
di of the inner products of the feature vectors.
Di =
w (11) In many practical applications, the data are not linearly
separable, or even if they are, the optimal margin classifier
Equation 10 means that the optimal margin classifier described so far may not be the best choice because it can
sought by SVM can be found by minimizing the norm of w, be very sensitive to outliers. Therefore, a modified version
under the constraint that the functional margin for all ex- of the SVM problem is defined as follows (Webb, 2002):
amples is at least equal to 1. It is easy to see that the mini-
N
mum margin always belongs to the points that are on the 1 2
edge of the widest strip that separates the data, as shown in min w,w0 w + C  ξi
2 i =1
figure 5b. Only for these points yi(wTxi + w0) = 1; for the
remaining points, yi(wTxi + w0) > 1. These points are called
the “support vectors” and usually comprise a very small such that: 
( 0 i )
 yi wT xi + w ≥ 1 − ξ , i = 1, ,N
fraction of the total number of points, resulting in a huge ξ

 i ≥ 0, i = 1, ,N (15)
reduction in the computational cost.
Although the optimization problem in equation 10 can With this definition, some of the examples are allowed
be solved by commercially available quadratic program- to fall on the wrong side of the separating hyperplane.
ming code, an easier equivalent problem can be found by However, for each such example, a cost (ξi) will be consid-
using the method of Lagrange multipliers (Strang, 1991). ered. The parameter C determines the importance that we
Using this method, the following dual optimization prob- attach to these errors. Once again, the method of Lagrange
lem will be obtained: multipliers can convert this problem into an easier one. The
resulting problem is as follows:

N
1 N The parameters C and γ in the problem are unknown.
max α  αi − 2  yi y j αi α j ( xi ,x j ) Therefore, the first step in solving the problem is to find the
i =1 i, j =1 optimum values for these parameters. The grid search pro-
C ≥ αi ≥ 0, i = 1, ,N cedure suggested by Hsu et al. (2008) was used for this
 purpose. First, the approximate values of the optimum C
such that:  N and γ were found on a coarse grid with C = 2-8, 2-4, ..., 230
i
 αi y = 0 and γ = 2-14, 2-10, ..., 224. After finding the approximate val-
 i =1 (16) ues for the best C and γ, a finer grid was defined to search
This is the same as the previous optimization problem in for more accurate values of the optimum C and γ.
equation 12, except that the first constraint has changed Reducing the Classification Error by
from αi ≥ 0 to C ≥ αi ≥ 0. Using Multiple Measurements
Although the SVM algorithm described so far is already The goal of the present study was to evaluate classifica-
a very powerful method, significant improvements would tion algorithms that can be used for automatic detection of
result by introducing kernels. The idea is to map the feature HLB-infected citrus trees in the field. The preliminary re-
vector xi into a higher-dimensional space using a mapping sults indicated that the classification accuracy from the sta-
function Φ(xi). As mentioned before, the entire SVM algo- tistical methods used to classify HLB-infected trees were
rithm can be written in terms of the inner products of the low when data with single measurements were considered
feature vectors. After mapping feature vectors using Φ, all for analysis (data not shown). This could be due to variabil-
instances of the inner products of the feature vectors, (xi,xj), ity in the experimental conditions, such as the amount of
will be replaced by the inner product of the corresponding sunlight, orientation of the leaves with respect to the sensor,
mapped feature vectors, (Φ(xi),Φ(xj)). This inner product is and sensor measuring angle. Therefore, it was hypothesized
called a kernel: that high detection accuracy can be achieved through mul-
( ) ( ( ) ( ))
K xi ,x j = Φ xi ,Φ x j
(17)
tiple measurements on a single tree. In this study, we inves-
tigated the accuracy of the three classification methods
(KNN, LR, and SVM) using spectral data from one, three,
The idea of kernels significantly improves the SVM or five measurements from different canopy areas of the
technique in terms of both its accuracy and the scope of same tree.
problems that it can solve. Various different kernels have When more than one measurement was presented to the
been introduced and used in practice (Cherkassky and Mu- classifier, a simple method based on majority voting can be
lier, 2007). Following the suggestion of Hsu et al. (2008), a used to determine the classification group. The tree is la-
Gaussian kernel was used in this study: beled an HLB-infected tree when more than half of the
measurements from that tree are classified as HLB infected.
( )
 2
K xi ,x j = exp  γ xi − x j  An improved approach is to consider the confidence of the
  (18) classifier for its prediction on each of the measurements.
For example, if a classifier predicts “infected” on two out
Therefore, the SVM problem that was solved in this
of five measurements with very high confidence and
study was of the following form:
“healthy” on the three remaining measurements with very
N N low confidence, it would be more reasonable to classify the
 αi − 2  yi y j αi α j K ( xi ,x j )
1
max α tree as “infected.” Our analysis was based on the majority
i =1 i, j =1 with confidence of the classifier.
C ≥ αi ≥ 0, i = 1, ,N For the KNN algorithm, the following equation was
 used to estimate the confidence of each prediction:
such that:  N i
 αi y = 0 KNN confidence=
 i =1
 wi ( for those among the k nearest neighbors that
( 
)
2
have the same class as the predicted class )
where: K xi ,x j = exp  γ xi − x j 
  (19)
÷  wi ( for all of the k nearest neighbors )
(20)
To solve this problem, the sequential minimal optimiza-
tion (SMO) algorithm was used. This algorithm solves the where wi is as defined in equation 2. For the LR method,
above optimization problem by a coordinate ascent ap- the value of the function gθ(x) in equation 4 can be used to
proach. Because of the second constraint in the problem, calculate a measure of confidence in prediction. The closer
N gθ(x) is to either 0 or 1, the more confident is the prediction.
i.e.,  αi yi = 0 , it is not possible to change only one of the Therefore, the value gθ ( x ) − 0.5 was used to evaluate the
i =1
Lagrange multipliers. The SMO algorithm optimizes the confidence of the predictions by the LR method:
function with respect to two Lagrange multipliers simulta- LR confidence = gθ ( x ) − 0.5
neously. The advantage of this approach is that the inner- (21)
most loop is very fast compared to other algorithms (Platt, For the SVM classifier, the value of the function gw(x) in
1998). equation 14 can be used as a measure of confidence in pre-
55(2): 711-720 717

diction. The larger the absolute value of this function, the
more confident is the prediction. Therefore, g w ( x ) was
used to evaluate the confidence of predictions by the SVM
method:
SVM confidence = g w ( x ) (22)
RESULTS AND DISCUSSION

Figure 6 shows an example of the collected spectra for
both healthy and HLB-infected leaves. Each observation is
an average of ten measurements. In general, the reflectance
in the visible region (400 to 700 nm) of HLB-infected (a)
leaves was higher than that of healthy leaves. This finding
is accordance with Poole et al. (2008) and supports our hy-
pothesis that due to the chlorosis in HLB-infected trees,
they absorb less visible light, resulting higher reflectance in
the visible region. In the NIR region, the canopy reflec-
tance of HLB-infected leaves in most cases was lower than
that of healthy leaves.
For the SVM method, the first step was to find the opti-
mum values for C and γ. As mentioned before, this was
performed by a gird search. Figure 7 shows typical contour
graphs that were used to find the optimum parameter val-
ues. First, the classification error was evaluated on a coarse
grid (fig. 7a) to find approximate estimates for C and γ. (b)
Then a finer grid was defined (fig. 7b) and used to deter-
mine accurate values of C and γ that minimized the classi- Figure 7. Contour plots of classification error for finding the
optimum values for the parameters C and γ.
fication error. In the specific case shown in figure 7, the
lowest classification error was approximately 18%, which
trees that were classified as HLB infected) and the false
was obtained for C = 28 and γ = 34.
negative rate (i.e., the percentage of HLB-infected trees
Analysis showed that MSC correction of the spectra did
that were classified as healthy).
not improve the classification accuracy. In other words,
It can be seen from table 2 that the classification error
when MSC, as described earlier in the Spectral Pretreat-
with a single spectrum was relatively large. However, when
ment and Feature Selection section, was removed as a pre-
three or five spectra from the same tree were used, the clas-
treatment procedure, the classification error did not im-
sification error decreased. The SVM method demonstrated
prove for any of the three classification techniques. Table 2
lower classification errors than the other two methods, es-
shows the classification error for each of the three classifi-
pecially with five spectra. For the SVM and KNN methods,
cation techniques without MSC correction. The table shows
the false negative error rate was slightly higher than the
the percentage of misclassified citrus trees when one, three,
false positive error rate, whereas for the LR technique, the
or five spectral measurements from each tree were used for
reverse was true. Because the amount of data was limited, it
classification. In addition to the overall error, the table also
was decided to build the classifiers using the pooled data
shows false positive rate (i.e., the percentage of healthy
from all seven field experiments. As previously mentioned,
75% of the entire dataset was used to build each classifier,
and the rest of the data were used to evaluate the classifiers.
The errors mentioned in table 2 are the average classifica-
Table 2. Average classification error for the three
classification techniques.
Logistic k-Nearest Support Vector
Type of Error Regression Neighbors Machines
One measurement
False positive (%) 37 20 13
False negative (%) 33 25 23
Overall error (%) 35 23 18
Three measurements
False positive (%) 25 8.8 4.5
False negative (%) 21 13 7.5
Overall error (%) 23 11 6.2
Five measurements
False positive (%) 20 5.8 2.9
Figure 6. Representative spectra from leaves of two healthy and two False negative (%) 17 7.1 3.0
HLB-infected trees. Overall error (%) 19 6.5 3.0

tion errors for the testing data from all seven field experi- light and other environmental factors can produce noise
ments. The classification error can also be computed for the that might reduce the classification accuracy. Under these
testing data from each field experiment separately. Our conditions, multiple measurements will be necessary to en-
analysis showed that the classification error was very close sure acceptable classification accuracy.
for different field experiments. For field experiments that
provided more data (i.e., 13 June 2007 and 24 March ACKNOWLEDGEMENTS
2009), the classification error was slightly less than that of This research was supported under Project No. 81086 of
the field experiments that provided less data. We assume the Florida Citrus Production Research Advisory Council,
that this was because more spectra from the field experi- Peter McClure, Chairman. The authors would like to thank
ments with a larger amount of data were used in building Dr. Sindhuja Sankaran and Sherrie Buchanon for their as-
the classifier, thus biasing the classifier toward the pattern sistance in this study.
in the data from those field experiments.
The classifiers obtained by the weighted KNN, logistic
regression, and SVM algorithms can be programmed on a
microcontroller. The three algorithms were also very differ-
REFERENCES
Borengasser, M., T. R. Gottwald, and T. Riley. 2001. Spectral
ent in terms of computation time. For SVM-based classifi- reflectance of citrus canker. Proc. Florida State Hort. Soc.
cation, the most time-consuming step was to find the opti- 114: 77-79.
mum values for C and γ. The grid search method for Byvatov, E., and G. Schneider. 2003. Support vector machine
optimal value selection used in this study took several applications in bioinformatics. Applied Bioinformatics 2(2):
hours (about 8 h) on a PC. Although there are faster meth- 67-77.
ods for finding C and γ (Hsu et al., 2008), this step is inher- Cherkassky, V., and F. Mulier. 2007. Learning from Data:
ently time consuming. The batch gradient descent algo- Concepts, Theory, and Methods. 2nd ed. Hoboken, N.J.: John
rithm described previously for the logistic regression Wiley and Sons.
Chung, K. R., and R. H. Brlansky. 2005. Citrus diseases exotic to
method took approximately 12 s to complete on a PC with
Florida: Huanglongbing (citrus greening). Extension
a 4.8 GHz processor and 512 MB of RAM. As mentioned Publication PP-210. Gainesville, Fla.: University of Florida,
before, the KNN method does not produce any model from IFAS.
the data. Therefore, for the KNN algorithm, there is no Delalieux, S., J. K. Aardt, W., E. Schrevens, and P. Coppin. 2007.
computation until a new spectrum is to be classified. Once Detection of biotic stress (Venturia inaequalis) in apple trees
the model (i.e., the classifier) is obtained, the computation using hyperspectral data: Non-parametric statistical
time for classification of a new spectrum is very fast for approaches and physiological implications. European J. Agron.
SVM and logistic regression. On the same PC as mentioned 27(1): 130-143.
above, the time to make a prediction based on three spectra Etxeberria, E., P. Gonzalez, W. Dawson, and T. Spann. 2007. An
iodine-based starch test to assist in selecting leaves for HLB
from the same tree was only 0.06 ms for the logistic regres-
testing. Extension Publication HS-1122. Gainesville, Fla.:
sion classifier, whereas for the weighted KNN and SVM University of Florida, IFAS.
methods, this time was 5.5 and 5.0 ms, respectively. There- Etxeberria, E., P. Gonzalez, D. Achor, and L. G. Albrigo. 2009.
fore, the logistic regression method was computationally Anatomical distribution of abnormally high levels of starch in
much faster. However, because the KNN and SVM algo- HLB-affected Valencia oranges trees. Physiological and
rithms yielded better classification accuracy than logistic Molecular Plant Pathology 74(1): 76-83.
regression, they are preferable algorithms for HLB detec- Gunn, S. R. 1998. Support vector machines for classification and
tion in citrus groves. regression. Southampton, U.K.: University of Southampton,
Department of Electrical and Computer Science. Available at:
http://meandmyheart.files.wordpress.com/2009/02/svm_
gunn1.pdf. Accessed 1 August 2010.
CONCLUSION Hsu, C. W., C. C. Chang, and C. J. Lin. 2008. A practical guide to
The goal of this study was to evaluate machine learning support vector classification. Taipei, Taiwan: National Taiwan
techniques for rapid detection of HLB-infected citrus trees. University, Department of Computer Science and Information
Engineering. Available at: www.csie.ntu.edu.tw/~cjlin/
Canopy reflectance spectra were measured on infected and
papers/guide/guide.pdf. Accessed 1 August 2010.
healthy trees using a spectroradiometer, and three common Huang, J. F., and A. Apan. 2006. Detection of sclerotinia rot
classification algorithms (logistic regression, k-nearest disease on celery using hyperspectral data and partial least
neighbor, support vector machines) were used to distin- squares regression. J. Spatial Sci. 51(2): 129-142.
guish the infected trees from the healthy trees. The results Lee, W. S., M. R. Ehsani, and L. G. Albrigo. 2008. Citrus
indicated that a single measurement was insufficient for ac- greening (Huanglongbing) detection using aerial hyperspectral
curate detection of the infected trees with the three ap- imaging. In Proc. 9th Intl. Conf. on Precision Agriculture.
proaches used in this study. The classification error was be- International Society of Precision Agriculture.
tween 18% and 35% using a single spectrum. However, Liu, Z., J., J. S. Huang, R. Tao, W. Zhou, and L. Zhang. 2007.
Characterizing and estimating rice brown spot disease severity
using multiple spectral measurements from the tree canopy
using stepwise regression, principal component regression, and
of a single tree, the classification accuracy increased. The partial least square regression. J. Zhejiang University Sci. B
SVM method showed an accuracy greater than 97% when 8(10): 738-744.
it was provided with five spectra from different leaves on Martens, H., and T. Naes. 2001. Multivariate calibration by data
the same tree. Under real field conditions, variation of sun- compression. In Near-Infrared Technology in the Agricultural
55(2): 711-720 719

and Food Industries, 59-100. 2nd ed. P. Williams and K. leaves. In Proc. Intl. Research Conf. on Huanglongbing. St.
Norris, eds. St. Paul, Minn.: American Association of Cereal Paul, Minn.: Plant Management Network.
Chemists. Rumpf, T., A. K. Mahlein, U, Steiner, E. C. Oerke, H. W. Dehne,
Mitchell, T. M. 1997. Machine Learning. 1st ed. New York, N.Y.: and L. Plumer. 2010. Early detection and classification of plant
McGraw-Hill. diseases with support vector machines based on hyperspectral
Muhammed, H. H. 2005. Hyperspectral crop reflectance data for reflectance. Computer and Electronics in Agric. 74(1): 91-99.
characterising and estimating fungal disease severity in wheat. Sankaran, S., A. Mishra, R. Ehsani, and C. Dima. 2010. A review
Biosystems Eng. 91(1): 9-20. of advanced techniques for detecting plant diseases. Computer
Muraro, R. P. 2007. Summary of 2006-2007 citrus budget for the and Electronics in Agric. 72(1): 1-13.
central Florida production region. Extension Publication FE- Schneider, H. 1968. Anatomy of greening-diseased sweet orange
802. Gainesville, Fla.: University of Florida, IFAS. shoots. Phytopathology 58: 1155-1160.
Naidu, R. A., E. M. Perry, F. J. Pierce, and T. Mekuria. 2009. The Smith, K. L., M. D. Steven, and J. J. Collis. 2005. Plant spectral
potential of spectral reflectance technique for the detection of responses to gas leaks and other stresses. Intl. J. Remote
grapevine leafroll-associated virus-3 in two red-berried wine Sensing 26(18): 4067-4081.
grape cultivars. Computer and Electronics in Agric. 66(1): 38- Strang, G. 1991. Calculus. Wellesley, Mass.: Wellesley-
45. Cambridge Press. Available at: http://ocw.mit.edu/ans7870/
NASS. 2009. Florida annual statistical bulletin. Washington, D.C.: resources/Strang/Edited/Calculus/Calculus.pdf. Accessed 1
USDA National Agriculture Statistics Service. Available at: August 2010.
www.nass.usda.gov/Statistics_by_State/Florida. Accessed 29 Takushi, T., T. Toyozato, S. Kawano, S. Taba, A. Ooshiro, and M.
July 2010. Numazawa. 2007. Starch method for simple, rapid diagnosis of
Orfanidis, S. J. 1996. Introduction to Signal Processing. Prentice citrus Huanglongbing using iodine to detect high accumulation
Hall Signal Processing Series. Upper Saddle River, N.J.: of starch in citrus leaves. Japanese J. Phytopathology 73: 3-8.
Prentice Hall. Webb, A. R. 2002. Statistical Pattern Recognition. 2nd ed.
Platt, J. 1998. Fast training of support vector machines using Chichester, U.K.: John Wiley and Sons.
sequential minimal optimization. In Advances in Kernel Witten, I. H., and E. Frank. 2005. Data Mining: Practical
Methods: Support Vector Learning, 41-65. B. Scholkopf, C. Machine Learning Tools and Techniques. 2nd ed. San
Burges, and A. Smola, eds. Cambridge, Mass.: MIT Press. Francisco, Cal.: Morgan Kaufmann.
Poole, G., W. Windham, G. Heitschmidt, B. Park, and T. Yang, C. M., C. H. Cheng, and R. K. Chen. 2007. Changes in
Gottwald. 2008. Visible/near-infrared spectroscopy for spectral characteristics of rice canopy infested with brown
discrimination of HLB-infected citrus leaves from healthy planthopper and leaffolder. Crop Sci. 47(1): 329-335.

I C G (HLB) U Vis-Nir S T: Dentification of Itrus Reening Sing A Pectroscopy Echnique

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

I C G (HLB) U Vis-Nir S T: Dentification of Itrus Reening Sing A Pectroscopy Echnique

Uploaded by

Copyright:

Available Formats

IDENTIFICATION OF CITRUS GREENING (HLB)

USING A VIS-NIR SPECTROSCOPY TECHNIQUE

Transactions of the ASABE

712 TRANSACTIONS OF THE ASABE

55(2): 711-720 713

714 TRANSACTIONS OF THE ASABE

55(2): 711-720 715

fraction of the total number of points, resulting in a huge ξ

716 TRANSACTIONS OF THE ASABE

55(2): 711-720 717

RESULTS AND DISCUSSION

718 TRANSACTIONS OF THE ASABE

55(2): 711-720 719

720 TRANSACTIONS OF THE ASABE

You might also like