
Agricultural and Forest Meteorology 310 (2021) 108629


journal homepage: www.elsevier.com/locate/agrformet

An LSTM neural network for improving wheat yield estimates by integrating remote sensing data and meteorological data in the Guanzhong Plain, PR China
Huiren Tian a,b, Pengxin Wang a,b,*, Kevin Tansey c, Jingqi Zhang a,b, Shuyu Zhang d, Hongmei Li d

a College of Information and Electrical Engineering, China Agricultural University, East Campus, Qinghua East Road No. 17, Haidian P. O. Box 116, Beijing 100083, PR China
b Key Laboratory of Remote Sensing for Agri-Hazards, Ministry of Agriculture and Rural Affairs, Beijing, PR China
c School of Geography, Geology and the Environment; Centre for Landscape and Climate Research, University of Leicester, Leicester LE1 7RH, United Kingdom
d Shaanxi Provincial Meteorological Bureau, Xi’an 710014, PR China

ARTICLE INFO

Keywords:
Vegetation temperature condition index (VTCI)
Leaf area index (LAI)
Meteorological data
Long short-term memory (LSTM)
Yield estimation

ABSTRACT

Crop growth condition and production play an important role in food management and economic development. Therefore, accurate and timely yield estimation is of vital importance for regional food security. The long short-term memory (LSTM) model is a deep network structure that can incorporate crop growth processes, and it has been proven to accommodate different types and representations of data, recognize sequential patterns over long time spans, and capture complex nonlinear relationships. The LSTM model was developed to estimate wheat yield in the Guanzhong Plain by integrating meteorological data and two remotely sensed indices, the vegetation temperature condition index (VTCI) and the leaf area index (LAI), at the main growth stages. Because the LSTM model can memorize time series information, we adopted different time steps to estimate wheat yield. The results showed that the accuracy of yield estimation was highest (RMSE = 357.77 kg/ha and R2 = 0.83) under two time steps and the input combination of meteorological data and the two remotely sensed indices. We compared the yield estimation accuracy of the optimal LSTM model with that of the back propagation neural network (BPNN) and the support vector machine (SVM). The LSTM model outperformed BPNN (R2 = 0.42 and RMSE = 812.83 kg/ha) and SVM (R2 = 0.41 and RMSE = 867.70 kg/ha), because its recurrent neural network structure can incorporate nonlinear relationships between multi-feature inputs and yield. To further validate the robustness of the optimal LSTM method, the correlations between estimated yield and measured yield at the irrigated sites and the rain-fed sites from 2008 to 2016 were analyzed, and the results demonstrated that the proposed model can serve as an effective approach for different types of sampling sites and has better adaptability to interannual fluctuations of climate. Our findings demonstrated a reliable and promising approach for improving yield estimation.

1. Introduction

Wheat provides the most calories and protein for world food supply among the top three cereals (wheat, rice and maize) (Cai et al., 2019). As the global population grows and living standards rise, production of staple foods such as wheat is expected to increase by 60% towards 2050 (Alexandratos and Bruinsma 2012; Feng et al., 2019). Monitoring of crop conditions is currently one of the major challenges in agricultural research and is important for the economic development of any nation (Prasad et al., 2006; Kuwata and Shibasaki, 2016). Therefore, timely, reliable and accurate wheat yield estimates are significant for decision-making regarding regional and global food security.

In recent decades, extensive studies have been conducted on crop yield estimation. Traditionally, crop yield estimates were based on the field survey information that farmers provided during the growing season. However, they have difficulty of upscaling to larger areas and

* Corresponding author at: College of Information and Electrical Engineering, China Agricultural University, East Campus, Qinghua East Road No. 17, Haidian P. O.
Box 116, Beijing 100083, PR China.
E-mail addresses: wangpx@cau.edu.cn (P. Wang), kjt7@le.ac.uk (K. Tansey).

https://doi.org/10.1016/j.agrformet.2021.108629
Received 24 June 2020; Received in revised form 19 July 2021; Accepted 27 August 2021
Available online 8 September 2021
0168-1923/© 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
H. Tian et al. Agricultural and Forest Meteorology 310 (2021) 108629

consuming time and labor (Burke and Lobell, 2017; Leroux et al., 2019).

Another way of estimating yield is to use crop models that simulate crop growth and development dynamically according to crop genotypes, field management practices, soil characteristics and meteorological data (Araya et al., 2015; Anothai et al., 2013; Fang et al., 2008). However, limitations in field data availability and quality have restricted model performance at the regional scale (Chen et al., 2018). Thus, how to improve model performance and the accuracy of yield estimation has become a key issue. Compared to crop growth models, satellite remote sensing has been widely used to monitor large-scale crop areas and to estimate yield owing to its ability to acquire the needed information and its repetitive coverage at relatively low cost (Doraiswamy et al., 2003; Becker-Reshef et al., 2010; Bolton and Friedl, 2013).

A variety of remote sensing based indicators have long been used in regressions between vegetation indices and crop yield. The most extensively used indices include the normalized difference vegetation index (NDVI) and the enhanced vegetation index (EVI) (Prasad et al., 2006; Becker-Reshef et al., 2010; Sun et al., 2020). In addition, the vegetation condition index (VCI), temperature condition index (TCI), vegetation temperature condition index (VTCI) and leaf area index (LAI) were found to correlate closely with crop yield and can be used to monitor crop growth conditions and estimate yield (Unganai and Kogan, 1998; Liu and Kogan, 2002; Huang et al., 2015; Xie et al., 2017). However, there are limitations in using a vegetation index alone as the main indicator of final crop yields. Leroux et al. (2016) pointed out that the main limitation is the indirect link between yield and spectral data. Apart from this limitation, droughts during sensitive crop growth stages can lead to yield reductions. To overcome the limitation, many ways have been proposed to incorporate remote sensing information in yield estimation, including detecting crop dynamics, deriving integrated indices, and developing phenology-based predictors (Li et al., 2019). However, the dynamics of crop growth and the influences of external environmental factors were often ignored. Therefore, how to better integrate remote sensing information with other environmental factors for yield estimation needs further study. In this paper, we choose the VTCI and LAI at each wheat growth stage, which reflect water stress and the potential amount of photosynthesis and dry matter accumulation, respectively, for estimating crop growth conditions and yield, in combination with average precipitation and temperature.

Previous studies have relied on linear regression models or process-based models to establish the relationship between crop yield and observed variables (Edreira and Otegui, 2012; Lobell et al., 2014). However, because crop production is influenced by a variety of interrelated factors, it is difficult to describe their relationships by conventional methods (Paswan and Begum, 2013; Johnson et al., 2016). Integration of remotely sensed data and machine learning algorithms offers a cost- and time-effective approach for spatial prediction of crop yield compared to conventional approaches (Sami et al., 2018). Thus, machine learning methods provide alternatives to traditional regression approaches and have come highly recommended for handling the complicated factors and relationships between different variables and crop yield. As Kaul et al. (2005) report, artificial neural network (ANN) models produce more accurate yield predictions than regression models and proved to be a superior methodology for accurately predicting crop yield. Cai et al. (2019) stated that machine-learning based methods outperformed regression methods in modeling crop yield. In addition, neural networks are widely used among the machine learning methods because of their strong ability to model the complex patterns hidden in data (Kamilaris and Prenafeta-Boldú, 2018; Xiao et al., 2019). A variety of neural networks have emerged in recent years to make yield estimates, such as the back propagation neural network (BPNN), the spiking neural network (SNN), the convolutional neural network (CNN) and the long short-term memory (LSTM). Tian et al. (2020) used an improved particle swarm optimization algorithm (IPSO)-BPNN to estimate winter wheat yield in the Guanzhong Plain, PR China, and focused on the effect of VTCI and LAI at different growth stages on wheat yield. Because BPNN is a traditional network with a simple structure, its ability to solve nonlinear and complex approximation problems is relatively weak, resulting in lower accuracy of yield estimation (R2 = 0.34). CNN and LSTM, two popular types of deep neural network (DNN), have recently attracted a lot of attention for yield prediction. LSTM is a special kind of recurrent neural network (RNN), which can remember information for much longer time periods and can capture complex, nonlinear relationships due to its recurrent structure and gating mechanisms that regulate the information flow into and out of the cell, and its large capacity to deal with sequential data (Hochreiter and Schmidhuber, 1997; Greff et al., 2015; Cunha et al., 2018). In contrast to classical neural networks, LSTM has feedback connections that enable the processing of input sequences of arbitrary length, and it is commonly preferred for classifying, processing and making predictions based on time-series data. It has been mainly applied in domains related to sequential data, such as sea surface temperature prediction (Xiao et al., 2019), runoff prediction (Kratzert et al., 2018), water table depth prediction in agricultural areas (Zhang et al., 2018), and crop yield prediction (Jiang et al., 2019). Haider et al. (2019) employed Robust-LOWESS as a smoothing function in conjunction with LSTM for wheat yield prediction in Pakistan; the results revealed a significant improvement in model accuracy in comparison to the autoregressive integrated moving average (ARIMA) model and RNN. Sun et al. (2019) proposed a CNN-LSTM model that leverages spatial-temporal features to predict soybean yield; the results proved the effectiveness and advantages of the proposed approach, and they also achieved promising results in predicting corn yield in the U.S. Corn Belt (Sun et al., 2020). The impressive application results of LSTM in crop yield prediction proved that it can capture not only the variation trend of data but also the dependence relationships within time series data. However, several limitations still exist. First, the application of LSTM to time series data in the field of yield estimation is relatively rare (You et al., 2017; Maimaitijiang et al., 2020). Second, most of the previous research focused on the model performance. We not only benefit from deep learning in terms of improved model performance, but also specifically consider the internal parameters of the model (i.e. time step), thereby placing confidence in the model. Third, previous research preferred employing common remote sensing data as the features; the innovation of this research lies in using, as the model input, remote sensing data at different stages (i.e. VTCI) that fully consider the climate characteristics of the study area. Fourth, few studies have verified the adaptability and robustness of the model at the sampling-site scale under different farming systems. Consequently, we explore the adaptability and robustness of the model at different sampling sites using yield data obtained from ground survey experiments in the past ten years.

Taking into account the aforementioned limitations in yield estimation, in this study we developed an LSTM deep neural network that takes advantage of multi-feature inputs, based on remote sensing data and meteorological data, and multiple time steps to improve the accuracy of estimating yield at county level in the Guanzhong Plain. We aimed to specifically answer the following four research questions in this study: (i) What combinations of input data (i.e. remote sensing data and meteorological data) will achieve the best performance of estimating wheat yield in the Guanzhong Plain? (ii) As more data are added as inputs with the progression of time steps, how does the estimation of wheat yield improve over the time steps? (iii) How does the LSTM method compare with the machine-learning based methods for modeling crop yield? (iv) How does the LSTM method perform in terms of robustness and adaptability across different farming systems? The main objectives of this paper included the following: (1) Develop an LSTM neural network model integrating remote sensing, meteorology, and phenology information to achieve accurate wheat yield estimation. (2) Compare with other machine learning approaches to examine performance of the LSTM model. (3) Evaluate the applicability,


effectiveness, and robustness of the proposed LSTM model at the rain-fed and irrigation sampling sites.

2. Materials and methods

2.1. Study area

The study area, Guanzhong Plain, is located in the central part of Shaanxi province, PR China and has an average elevation of 500 m (Fig. 1). This region extends from 106°22′ E to 110°24′ E and from 33°57′ N to 35°39′ N. The climate of the plain is continental monsoon with annual rainfall ranging from 500 to 700 mm, characterized by four distinct seasons, with warm and rainy summers and cold and dry winters. The average annual temperature is approximately 13 °C. The plain has a flat terrain and fertile soil, and is in a typical warm temperate transitional zone between semi-humid and semi-arid climates. According to soil formation conditions, processes and properties, the soils can be divided into loess soil, cinnamon soil, fluvo-aquic soil and other soil types. Loess soil is the most important agricultural soil in the territory. It is widely distributed in the second and third terraces of the Plain. The soil is loose, preserves fertility and moisture, and is convenient for farming. Cinnamon soil is sticky and heavy, retaining water and fertilizer. The soil texture in the territory is mainly medium soil, heavy soil and light soil, with good aeration, water storage and fertilizer retention, good cultivability, long cultivating period, and wide variety of planting. The prevailing planting pattern in the irrigated grain crop area is winter wheat in rotation with summer maize, while in the rain-fed grain crop area the prevailing planting pattern is winter wheat. Winter wheat is generally sown in early or mid-October and is harvested at the beginning of June in the following year. The main growth stages of winter wheat are the green-up stage (from early March to mid-March), the jointing stage (from late March to mid-April), the heading-filling stage (from late April to early May), and the milk stage (from mid-May to late May) (Sun et al., 2008; Xie et al., 2017). In this study, we select three representative sampling sites in the Guanzhong Plain as our study sites, as shown in Fig. 1.

2.2. Remotely sensed data

2.2.1. Remotely sensed VTCI

To calculate VTCI, the LST product with 1 km spatial resolution (MYD11A1, MODIS/Aqua Land Surface Temperature/Emissivity Daily L3 Global 1 km SIN Grid V006) and surface reflectance data product with 1 km spatial resolution (MYD09GA, MODIS/Aqua Surface Reflectance Daily L2G Global 1 km SIN Grid V006) for tiles h26v05 and h27v05 (h for horizontal and v for vertical) during the main winter wheat growth stages from March to May of 2007–2017 were downloaded from the National Aeronautics and Space Administration (NASA)’s Earth Observing System Data and Information System website (http://reverb.echo.nasa.gov/reverb/). The initial LST and surface reflectance images were spliced and resampled, and their projection was converted from sinusoidal to a Lambert azimuthal projection using the MODIS reprojection tool (MRT). The corresponding NDVI was calculated from the reflectances in bands 1 and 2.

The VTCI is defined as follows (Wang et al., 2001; Sun et al., 2008; Wan et al., 2004):

$$\mathrm{VTCI} = \frac{\mathrm{LST}_{\max}(\mathrm{NDVI}_i) - \mathrm{LST}(\mathrm{NDVI}_i)}{\mathrm{LST}_{\max}(\mathrm{NDVI}_i) - \mathrm{LST}_{\min}(\mathrm{NDVI}_i)} \tag{1}$$

$$\mathrm{LST}_{\max}(\mathrm{NDVI}_i) = a + b\,\mathrm{NDVI}_i \tag{2}$$

$$\mathrm{LST}_{\min}(\mathrm{NDVI}_i) = a' + b'\,\mathrm{NDVI}_i \tag{3}$$

where $\mathrm{LST}_{\max}(\mathrm{NDVI}_i)$ and $\mathrm{LST}_{\min}(\mathrm{NDVI}_i)$ are the warm and cold edges, i.e., the maximum and minimum LST values of the pixels that have the same NDVI value in a study region, respectively, and the coefficients $a$, $b$, $a'$ and $b'$ are determined according to the scatter plots of the LST and NDVI.

The values of warm and cold edges are important for monitoring drought, and the VTCI drought monitoring results at different time scales show that the method at the scale of 10 days is more accurate and practical (Lin et al., 2016). In this study, the warm edges were determined using the multiyear maximum value composite (MVC) NDVI and LST at 10-day intervals, and the cold edges were determined using the multiyear MVC NDVI and the multiyear maximum-minimum value composite LST (Sun et al., 2008). Thus, VTCI time series data at 10-day intervals were generated. According to the start and end dates of the four growth stages of winter wheat, the 10-day VTCI images were converted to VTCI images at the four growth stages by calculating, pixel by pixel, the average values of the 10-day VTCIs for intervals belonging to each growth stage. Then, the VTCI values of counties were generated by calculating the average VTCI values of pixels located within each county.

2.2.2. Remotely sensed LAI

LAI plays an important role in vegetation processes such as photosynthesis and transpiration, and time series LAI can reflect growth status of crops (Campos-Taberner et al., 2016). The annual time series LAI during 2007–2017 were generated using 4-day MODIS LAI data product

Fig. 1. The location of the study area, winter wheat planting areas, county boundaries, city boundaries and sampling sites.
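As a rough numerical illustration of Eqs. (1)-(3) and of the growth-stage averaging described in Section 2.2.1, the following Python sketch computes VTCI for one pixel. The function names and the edge coefficients in the example are hypothetical placeholders chosen for illustration; they are not the coefficients fitted in this study.

```python
def vtci(lst, ndvi, a, b, a_prime, b_prime):
    """Vegetation temperature condition index for one pixel (Eqs. 1-3).

    a, b             : warm-edge coefficients, LST_max(NDVI) = a + b * NDVI
    a_prime, b_prime : cold-edge coefficients, LST_min(NDVI) = a' + b' * NDVI
    """
    lst_max = a + b * ndvi              # Eq. (2), warm edge
    lst_min = a_prime + b_prime * ndvi  # Eq. (3), cold edge
    if lst_max == lst_min:
        raise ValueError("warm and cold edges coincide at this NDVI")
    return (lst_max - lst) / (lst_max - lst_min)  # Eq. (1)


def stage_mean(ten_day_values):
    """Average the 10-day VTCI values that belong to one growth stage."""
    return sum(ten_day_values) / len(ten_day_values)


# Hypothetical edge coefficients and pixel values (illustrative only).
example_vtci = vtci(lst=300.0, ndvi=0.5, a=320.0, b=-20.0,
                    a_prime=280.0, b_prime=10.0)
example_stage = stage_mean([0.2, 0.4, 0.6])
```

With these placeholder numbers the warm edge is 310 and the cold edge is 285, so a pixel at 300 lies 40% of the way from the warm edge to the cold edge. Values near 0 indicate a pixel close to the warm edge and values near 1 a pixel close to the cold edge.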


with a spatial resolution of 500 m (MCD15A3H, MODIS/Terra+Aqua Leaf Area Index/FPAR 4-Day L4 Global 500 m SIN Grid V006) for tiles h26v05 and h27v05. However, cloud contamination results in inconsistency and discontinuity in the spatial and temporal domains (Xun et al., 2018). The Savitzky-Golay filter, proposed by Savitzky and Golay (1964) for smoothing and computing derivatives of a set of consecutive values, was therefore used to remove the sudden drops. The effect of filtering at a survey site during days of year 61–161 in 2017 is shown in Tian et al. (2020). In this study, the LAI time series data from 2007–2017 were generated using the filtered 4-day MODIS LAI data product. The maximum value of the LAI over the 10-day interval was taken as the 10-day time series MODIS LAI. Then, the 10-day time series MODIS LAI was composited into LAI images at the four winter wheat growth and development stages using the maximum method, and LAI values of counties were generated using the average method. Due to the inconsistent spatial resolution between LAI and VTCI, all the VTCI images were resampled to a spatial resolution of 500 m.

2.3. Meteorological data and wheat yield data

During the winter wheat growing seasons of 2007–2017, the daily precipitation and temperature in 23 counties of the plain were measured and recorded by the China Meteorological Administration (http://data.cma.cn/). The average precipitation and average temperature during the growth stages were calculated from the daily precipitation and temperature before being fed into the model. The winter wheat yield data of the counties in the Guanzhong Plain from 2007 to 2017 were acquired from the Shaanxi Rural Yearbooks. The sampling site-level measured yield data were obtained by conducting ground surveys during the main growing stages of winter wheat from 2008–2016. Since the LSTM model requires time series input and ground survey experiments are restricted by many factors such as weather, the available yield values at the sampling-site scale were limited. Therefore, three sampling sites were selected in the study, comprising two irrigated sampling sites and one rain-fed sampling site (Fig. 1). The detailed steps of the ground survey experiment to obtain sampling site-level measured yield data are as follows. First, three 60 × 60 cm wheat-growing plots at each sampling site were selected to calculate the total spike number of each quadrat. Then, the wheat plants of the samples were dried in the sun and threshed. Last, the grain weight after drying was measured to calculate the wheat yield at the irrigated and rain-fed sampling sites.

2.4. Methods

2.4.1. LSTM deep neural network model

The LSTM architecture is a special kind of RNN with an appropriate gradient-based learning algorithm, which was first proposed by Hochreiter and Schmidhuber in 1997 to overcome the error back-flow problems (Hochreiter and Schmidhuber, 1997). The LSTM model is organized in the form of a chain structure as shown in Fig. 2.

The core of LSTM lies in the state of the neural network units, which, like a conveyor belt, is transmitted through the entire chain structure and flows in the direction of the arrows. The LSTM uses three gates to control the cell state ($C_t$) and output ($h_t$), including a forget gate $g_t^f$ (Gers et al., 2000), an input gate $g_t^i$ and an output gate $g_t^o$. These gates control how much information can pass through and how much information can be retained (LeCun et al., 2015). A gate is similar to a neural network layer or a series of matrices, which contain different individual weights for pointwise multiplication operations.

The first step of the gating mechanism between cells is to determine what information should be forgotten from the state of the neural network unit, which is implemented by a sigmoid function. When the state of the neural network unit is $C_{t-1}$, the forget gate reads the previous output $h_{t-1}$ and the new input $x_t$, and outputs a number between 0 and 1, where 0 means completely discard the information and 1 means completely retain the information. The calculation formula is shown in Eq. (4),

$$g_t^f = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{4}$$

where $\sigma$ is the sigmoid function, and $W_f$ and $b_f$ are the weight matrices

Fig. 2. The structure of the Long Short-Term Memory (LSTM) neural network (modified from Olah, 2015).
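The gating mechanism of Fig. 2 and Eqs. (4)-(9) can be traced in a few lines of code. The sketch below is a plain-Python illustration of a single LSTM time step, not the implementation used in this study; the helper names (`sigmoid`, `affine`, `lstm_step`) and the `params` layout are assumptions made for the example.

```python
import math


def sigmoid(x):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, bias, vec):
    """Return W . vec + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b_k
            for row, b_k in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (4)-(9).

    params maps "f", "i", "n" and "o" to a (W, b) pair that acts on the
    concatenation [h_{t-1}, x_t] (hypothetical layout for this sketch).
    """
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    g_f = [sigmoid(v) for v in affine(*params["f"], z)]    # Eq. (4), forget gate
    g_i = [sigmoid(v) for v in affine(*params["i"], z)]    # Eq. (5), input gate
    n_t = [math.tanh(v) for v in affine(*params["n"], z)]  # Eq. (6), candidate
    c_t = [c * f + n * i
           for c, f, n, i in zip(c_prev, g_f, n_t, g_i)]   # Eq. (7), new cell state
    g_o = [sigmoid(v) for v in affine(*params["o"], z)]    # Eq. (8), output gate
    h_t = [o * math.tanh(c) for o, c in zip(g_o, c_t)]     # Eq. (9), new output
    return h_t, c_t


# Toy example: one hidden unit, one input feature, all weights and biases zero,
# so every gate outputs 0.5 and the candidate state is 0.
zero = ([[0.0, 0.0]], [0.0])
toy_params = {"f": zero, "i": zero, "n": zero, "o": zero}
h_new, c_new = lstm_step([1.0], [0.0], [2.0], toy_params)
```

In this toy case the forget gate keeps half of the old cell state (2.0 becomes 1.0) and the output is half of tanh(1.0), which makes the roles of Eqs. (7) and (9) easy to verify by hand.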


and bias of the forget gate, respectively.

The following step is to determine what new information is stored in the neural network unit. This step includes two parts. First, a sigmoid layer called an "input gate" decides which values to update; then, the tanh function creates a candidate state vector $N_t$ that can be added. The calculation formulas are shown in Eqs. (5) and (6),

$$g_t^i = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{5}$$

$$N_t = \tanh\left(W_n \cdot [h_{t-1}, x_t] + b_n\right) \tag{6}$$

where $W_i$ and $W_n$ are the weight matrices of the input gate and input candidate element, respectively, and $b_i$ and $b_n$ are the corresponding biases.

The third step is to update the cell state from the old cell state $C_{t-1}$ to the new cell state $C_t$. The calculation formula is shown in Eq. (7), where $C_{t-1} \times g_t^f$ indicates which information of the old unit state is stored, and $N_t \times g_t^i$ indicates the new candidate value.

$$C_t = C_{t-1} \times g_t^f + N_t \times g_t^i \tag{7}$$

The final step is to determine the output of the LSTM. First, a sigmoid layer is run, which determines which parts of the unit state are output; then, the tanh of the unit state is multiplied by the output of the sigmoid gate (normalizing the output value) to obtain the new output of the unit. The calculation formulas are shown in Eqs. (8) and (9).

$$g_t^o = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{8}$$

$$h_t = g_t^o \times \tanh(C_t) \tag{9}$$

Here, $W_o$ and $b_o$ are the weight matrices and bias of the output gate, respectively.

In this study, we build a 5-layer deep neural network model for wheat yield estimation as shown in Fig. 3, which includes an input layer, two LSTM layers, a dense layer and an output layer. The input is a time series with VTCI and LAI at the four growth stages of winter wheat, the average precipitation, and the average temperature during the main growth stage. The input of the whole network is expressed as (n_samples, n_time_steps, n_features). In our method, n_samples is the batch size for training; setting a suitable size can speed up the calculation, and the best performance was obtained with a batch size of 48, found by trial and error. Three schemes are set up for n_time_steps, namely 1, 2 and 3, and n_features is set to 10. The output of the model is the estimated wheat yield. All the input data were first normalized by the min-max method before being fed into the model. A dropout mechanism is applied to the inputs of the dense layer to prevent overfitting, and the dropout rate is set to 0.5 empirically (Srivastava et al., 2014). There is no general rule of thumb for the number of hidden nodes that should be used, and it has to be figured out on a case-by-case basis, specifically by trial and error. Therefore, the performance of the LSTM model was evaluated with two LSTM layers with 50 hidden nodes and 30 hidden nodes, respectively. For the network parameter optimization, we employed the gradient descent-based optimizer Adam with a learning rate of 0.001. The data were split into 80% for training and 20% for testing.

2.4.2. BP neural network and support vector machine

The BP neural network was developed by Rumelhart et al. (1986); it is one of the most popular techniques in the field of neural networks and is the most widely used algorithm for supervised learning with multi-layered feed-forward networks. The principle behind the BP neural network involves revising the weights and thresholds of the network with the steepest gradient descent method to minimize the error by modifying the weights of each layer of neurons. A general BP model has three layers: an input layer, a hidden layer, and an output layer.

Support vector machine (SVM) originated in the 1990s and is a supervised statistical learning algorithm proposed by Vapnik (1995), which has a more mature theoretical basis and better learning performance. When SVM is applied to time series data, it has a great prediction capacity (Kuwata and Shibasaki, 2016). The basic idea is to map the vectors of covariates into a higher dimensional feature space by nonlinear transformations (Zhou et al., 2019). Generally, the dimension of the feature space is very high or even infinite, resulting in a huge increase in the amount of computation after the space transformation and the curse of dimensionality. To solve this problem, an inner product operation is introduced in the high dimensional space, and this operation can be realized by the kernel function. According to Mercer theorem, a series of kernel functions can be constructed, such as linear

Fig. 3. The architecture of the LSTM model for yield estimation.
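To make the input layout of Fig. 3 concrete, the sketch below shows min-max normalization and one possible way of arranging yearly feature vectors into an array of shape (n_samples, n_time_steps, n_features). It is a minimal illustration under stated assumptions: the paper does not spell out how samples are windowed, so the sketch assumes that each sample covers consecutive years of one county and is labelled with the yield of the last year in the window. The function names are hypothetical. The count of 10 features is consistent with VTCI and LAI at four growth stages plus average precipitation and average temperature.

```python
def min_max_normalize(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def make_windows(yearly_features, yearly_yield, n_time_steps):
    """Build (samples, labels) from one county's consecutive years.

    Each sample has shape (n_time_steps, n_features); its label is the
    yield of the last year in the window (an assumption of this sketch).
    """
    samples, labels = [], []
    for end in range(n_time_steps, len(yearly_features) + 1):
        samples.append(yearly_features[end - n_time_steps:end])
        labels.append(yearly_yield[end - 1])
    return samples, labels


# Placeholder data: 11 years (e.g. 2007-2017) with 10 features per year.
features = [[float(year)] * 10 for year in range(11)]
yields = [float(year) for year in range(11)]
samples, labels = make_windows(features, yields, n_time_steps=2)
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

With two time steps, 11 years yield 10 windows of shape (2, 10) for a county. Increasing n_time_steps gives each sample a longer history but reduces the number of windows available for training.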


kernel, radial basis function kernel and sigmoid kernel. In this study, we used the radial basis function as the kernel function and optimized the hyper-parameters with a grid search. The input variables and output variable are the same as those of the LSTM. Similarly, 80% of the dataset was used for training and 20% of the dataset was used for testing.

Several indices, mean relative error (MRE), root mean square error (RMSE) and the coefficient of determination (R2), were employed to quantitatively evaluate the performance of different models and identify the better model. The smaller the MRE and RMSE, the better the model performance, whereas the larger the R2, the closer the estimated value is to the real one.

2.4.3. Model testing method

To answer the research questions proposed in this paper, two groups of experiments were designed. The first group of experiments was designed to answer the first research question: what combination(s) of input data (i.e. remote sensing data and meteorological data) will achieve the best performance of estimating wheat yield in the Guanzhong Plain? Since the potential input data include both meteorological data and remote sensing data (i.e. VTCI and LAI), we applied the LSTM model to the following two combinations of inputs: (1) remote sensing data (i.e. VTCI and LAI); and (2) remote sensing data combined with meteorological data (i.e. VTCI combined with LAI and meteorological data).

The second group of experiments was designed to answer the second research question (As more data are added as inputs with the progression of time steps, how does the estimation of wheat yield improve over the time steps?). We developed three schemes of time steps: (1) one time step; (2) two time steps; (3) three time steps. By comparing the R2 and RMSE of the above two input combinations and three time step schemes, we can find which combination and time step scheme performs best and answer the research questions.

Based on the two groups of experiments, we could get the best LSTM model performance of estimating wheat yield. We applied the two machine learning methods mentioned above (i.e. SVM and BPNN) to

Fig. 4. Comparisons of yield estimation results based on different input combinations and different time steps of the LSTM model: (a) one time step, (b) two time steps and (c) three time steps. The solid line indicates where the estimated yield equals the official yield record, and the dashed lines indicate an estimation bias of less than 1000 kg/ha. ‘R2P’ represents the R2 value of the pink points and ‘R2g’ represents the R2 value of the green points; the RMSE subscripts follow the same convention. (For interpretation of the references to
colour in this figure legend, the reader is referred to the web version of this article.)
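The accuracy indices reported in Fig. 4 and named in Section 2.4.2 (MRE, RMSE and R2) can be computed as in the following Python sketch. The paper does not give their formulas, so the definitions here are assumptions: MRE is taken as the mean of absolute relative errors, and R2 as 1 minus the ratio of residual to total sum of squares (the squared Pearson correlation is another common convention). The yield values in the example are made up for illustration.

```python
import math


def rmse(observed, estimated):
    """Root mean square error, in the units of the inputs (e.g. kg/ha)."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mre(observed, estimated):
    """Mean relative error as a fraction; multiply by 100 for percent."""
    n = len(observed)
    return sum(abs(o - e) / abs(o) for o, e in zip(observed, estimated)) / n


def r_squared(observed, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up yields in kg/ha (illustrative only).
official = [4000.0, 5000.0, 6000.0]
predicted = [4100.0, 4900.0, 6000.0]
example_rmse = rmse(official, predicted)
example_mre = mre(official, predicted)
example_r2 = r_squared(official, predicted)
```

For these placeholder values the two 100 kg/ha errors give an RMSE of about 81.6 kg/ha, an MRE of 1.5% and an R2 of 0.99.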


compare against the best LSTM model performance of yield estimation in the Guanzhong Plain. SVM and BPNN were chosen as comparison methods for the LSTM because SVM is easy to train and robust against over-fitting, and BPNN is one of the most classic algorithms for realizing a nonlinear mapping from input data to output data. Results from the three methods can be compared and validated against each other, which answers the third research question (How does the LSTM method compare with machine-learning based methods for modeling crop yield?).

Finally, to test the robustness and practical performance of the model, we conducted experiments at sampling sites under different farming systems to answer the fourth research question.

3. Results and analysis

3.1. Yield estimation under different time steps and input combinations

The scatter plots of estimated yields against official yield records for all the testing datasets at county level in the Guanzhong Plain are shown in Fig. 4. The results showed that the LSTM model incorporating VTCI, LAI and meteorological data inputs achieved the highest estimation accuracy (RMSE = 357.77 kg/ha and R² = 0.83) when the time step was set to 2. The LSTM models with remote sensing data (VTCI and LAI) inputs under the three time step schemes (one, two and three) captured 61%, 76% and 75% of yield variations, with RMSEs of 580.14 kg/ha, 510.59 kg/ha and 501.59 kg/ha, respectively. The LSTM models with all inputs (VTCI combined with LAI and meteorological data) under the same three schemes captured 77%, 83% and 82% of yield variations, with RMSEs of 514.13 kg/ha, 357.77 kg/ha and 466.53 kg/ha, respectively. Regardless of the time step, the models with more input variables achieved higher estimation accuracy. In the scatter plots of Fig. 4, the estimation results with meteorological data integrated into the LSTM (green points) are clearly better than those without meteorological data (pink points). In addition, with meteorological data integrated into the LSTM model, the green points in Fig. 4b and c (two and three time steps) are distributed evenly on both sides of the solid line, while the estimation results in Fig. 4a lie mostly below the solid line, which suggests a tendency to underestimate yield. The results indicated that integrating remote sensing data (VTCI and LAI) with meteorological data (the average precipitation and average temperature) can capture a wider range of influences on crop growth and grain formation processes, providing extra and unique information beyond what the remote sensing data offer for yield estimation. Previous research also showed that model performance improves when meteorological data are effectively integrated into crop yield estimation (Schwalbert et al., 2020; Maimaitijiang et al., 2020; Cai et al., 2019).

Generally, changes in climate are one of the factors leading to the interannual fluctuations of winter wheat yields. However, the interactions of how crops respond to the changes of meteorological data, as indicated by the remote sensing information, were shown to be critical factors (Jiang et al., 2019). Integrating multisource data, including remote sensing data and meteorological data, to consider both the interactive and the external factors related to crop growth can improve the accuracy of yield estimation. In addition, most of the yield estimation biases lie within ±1000 kg/ha, which suggests that the LSTM model can approximate complex functions and directly learn the mapping from the input data to the statistical yield output. This "end-to-end" relationship includes the process of crop growth and can provide robust yield estimation results.

We extended our analysis to explore the sensitivity of the estimation model to the time step, using remote sensing data combined with meteorological data as inputs. Generally, better estimation performance is achieved with more time steps. However, we noticed that the LSTM model achieved the highest estimation accuracy with two time steps (R² = 0.83), higher than with one time step (R² = 0.77) and three time steps (R² = 0.82). This suggested that winter wheat growth and yield are related not only to the weather conditions and remote sensing information of the main growth stages in the current year, but are also closely related to the weather conditions and remote sensing information of previous years. In particular, when the time step was set to 2, that is, when the information of the input feature variables from the previous two years was considered, the accuracy of yield estimation was highest. In other words, the LSTM estimation model is most sensitive to two time steps, which capture more features related to wheat yields. If the time step is set longer, the errors from input samples gradually accumulate, which results in a decline in the accuracy of yield estimation. The correlation between the estimated yields and the official yield records becomes progressively weaker as the time step grows longer.

Focusing on the spatial patterns of correlation between different combinations of inputs and the yield in different cities (Fig. 5), the LSTM model achieved the highest estimation accuracy by integrating VTCI, LAI and meteorological data with two time steps at city level, except for Xi’an city. Adding meteorological data to the LSTM model can improve the estimation performance, consistent with the county-scale results, which indicated that meteorological data provide unique additional information. We noticed that the LSTM model performed better in Xi’an city than in the other three cities. Among these cities, the LSTM model performed worst in Weinan city (RMSE = 697.71 kg/ha, MRE = 15.84%). Weinan city is located in the east of the Guanzhong Plain, where the distribution of wheat is scattered, which explains why the correlation between estimated yields and official yield records was lowest there. Another reason is that most of the official yield records in Weinan city are low-yield samples; the lack of high-yield samples leads to imbalanced data and introduces estimation errors, which suggests that the LSTM model is less effective for estimating yield in low-yield areas (2000–3500 kg/ha), as shown in Fig. 4.

Fig. 5. Performance of LSTM yield estimation based on the (a) RMSE and (b) MRE using different input combinations at city level.
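
The time step discussed in Section 3.1 controls how many consecutive years of input features the LSTM sees for each yield estimate. A minimal sketch of how such input sequences could be assembled is given below in Python with NumPy. The function name `make_sequences`, the array layout, and the choice that each window ends at (and includes) the target year are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np


def make_sequences(features, yields, time_steps):
    """Build LSTM-style input windows for one county (hypothetical helper).

    features: array of shape (n_years, n_features), rows ordered by year
              (e.g. VTCI, LAI, precipitation and temperature per year).
    yields:   array of shape (n_years,), the yield of each year.
    Returns X with shape (n_samples, time_steps, n_features) and y with
    shape (n_samples,), where n_samples = n_years - time_steps + 1.
    """
    features = np.asarray(features, dtype=float)
    yields = np.asarray(yields, dtype=float)
    n_years = features.shape[0]
    if not 1 <= time_steps <= n_years:
        raise ValueError("time_steps must be between 1 and n_years")
    # Assumption: each window ends at the target year and includes it.
    windows = [features[end - time_steps + 1:end + 1]
               for end in range(time_steps - 1, n_years)]
    return np.stack(windows), yields[time_steps - 1:]


# Synthetic placeholder data (not from the study): 6 years, 2 features.
demo_features = np.arange(12).reshape(6, 2)
demo_yields = np.arange(6) * 100.0
X2, y2 = make_sequences(demo_features, demo_yields, time_steps=2)
print(X2.shape, y2.shape)
```

With six years of data and two time steps, this layout produces five samples, each holding two years of features. Longer time steps leave fewer training samples from the same record, which is one practical cost of increasing the time step.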

3.2. Performance comparison with conventional machine learning methods

We applied the SVM and BPNN methods using multisource input data, and the proposed LSTM method using multisource input data and two time steps; the estimation performance of the three methods is compared in Fig. 6. LSTM performed best and showed large improvements relative to SVM and BPNN. SVM provided the lowest estimation accuracy, with an RMSE of 867.70 kg/ha and an R² of 0.41, in agreement with previous findings that machine learning methods are limited in simulating the relationship between crop yield and input data relative to deep learning approaches (Lin et al., 2020; Wolanin et al., 2020; Maimaitijiang et al., 2020). BPNN provided a slightly better result than SVM, with an RMSE of 812.83 kg/ha and an R² of 0.42. Compared with SVM and BPNN, LSTM reduced the estimation RMSE by 455.06–509.93 kg/ha, which demonstrates the effectiveness of its learning process and indicates that it is a practical approach for yield estimation in the region.
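
The RMSE reduction of 455.06–509.93 kg/ha follows directly from the reported RMSE values of the three methods. The short Python sketch below recomputes it and shows one common way to calculate RMSE and R². Treating R² as the coefficient of determination is an assumption (the squared correlation coefficient is another common definition), and the function names are illustrative only.

```python
import numpy as np


def rmse(observed, estimated):
    """Root mean square error between observed and estimated yields."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((estimated - observed) ** 2)))


def r_squared(observed, estimated):
    """Assumption: coefficient of determination, 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    ss_res = np.sum((observed - estimated) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# RMSE values (kg/ha) as reported in the text for the three methods.
reported_rmse = {"SVM": 867.70, "BPNN": 812.83, "LSTM": 357.77}

# Reduction achieved by LSTM relative to each comparison method.
reductions = {
    method: round(value - reported_rmse["LSTM"], 2)
    for method, value in reported_rmse.items()
    if method != "LSTM"
}
print(reductions)
```

The two differences, relative to BPNN and relative to SVM, give the lower and upper ends of the quoted range.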
The distribution of estimation errors in 2017 shown in Fig. 7 indicated that the bias of the LSTM model was centered at zero and followed a normal distribution. However, the errors of SVM were mostly distributed on the negative side, indicating underestimation of county-level wheat yield in 2017. Although the errors of BPNN were evenly distributed on both sides of zero, their magnitudes were greater than those of the LSTM model. These results further showed that the proposed LSTM model has superior performance. For the BPNN and SVM models, the estimation accuracy is insufficient because their capacity for analyzing complex and nonlinear relations among long time series variables is not as strong as that of LSTM. Overall, these improvements demonstrated that the optimal LSTM model performs best in terms of temporal learning capability.
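
The reading of Fig. 7 (errors centered at zero versus errors shifted to the negative side) can be summarized numerically by the mean error and the share of negative errors. The Python sketch below illustrates this with made-up yields; the function name and the sample values are hypothetical and are not the study's data. The sign convention, error = estimated yield minus official yield, is also an assumption.

```python
import numpy as np


def error_summary(official, estimated):
    """Summarize estimation errors (hypothetical helper).

    Assumed sign convention: error = estimated - official, so a negative
    mean error indicates underestimation on average.
    """
    official = np.asarray(official, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    errors = estimated - official
    return {
        "mean_error": float(errors.mean()),
        "share_negative": float(np.mean(errors < 0)),
    }


# Illustrative values in kg/ha, invented for this example only.
official = [4000.0, 4500.0, 5000.0, 5500.0]
estimated = [3800.0, 4400.0, 5100.0, 5300.0]
summary = error_summary(official, estimated)
print(summary)
```

A mean error near zero with roughly half of the errors negative corresponds to the centered pattern described for LSTM and BPNN, while a clearly negative mean error with most errors below zero corresponds to the underestimation described for SVM.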

Fig. 6. Estimation performance comparison among different methods based on the RMSE and R².

Fig. 7. Histogram of county-level wheat yield estimation errors for 2017 by different methods.

3.3. Validation of model robustness at different sampling sites

The winter wheat fields in the Guanzhong Plain are divided into irrigated fields and rain-fed fields. Because drought is one of the common natural disasters in this region, the growth and development of winter wheat are easily affected by drought. Spring irrigation of wheat generally takes place at the turning green stage. Under the influence of field irrigation, the correlation between soil moisture and grain yield in an irrigated wheat field differs from that in a rain-fed wheat field, so the growth and development rates of winter wheat differ between irrigated and rain-fed fields. From the turning green stage to the heading-filling stage, the growth rate in irrigated fields is generally higher than that in rain-fed fields. To further validate the proposed method, the yield estimation accuracy at the irrigation sites and the rain-fed site (Fig. 1) was calculated separately based on the optimal LSTM model and is shown in Fig. 8. The results indicated that the optimal LSTM model has good generalization ability across the irrigation sites and the rain-fed site, with the highest R² of 0.79 and lowest RMSE of 1489.63 kg/ha. Since precipitation in 2015 reached a historical high during the main growth and development period of winter wheat, the Meteorological Administration reported that winter wheat yield reached its highest level in 2015, and the measured yield time series fluctuated strongly in that year. The low performance in 2015 indicated that the model is constrained by the variability in the training record: when climate conditions are not included in the historical or training record, the out-of-sample estimations for extreme climate conditions, as in 2015, may not perform well. Therefore, the ability of the LSTM model to capture fluctuation information caused by extreme weather events is weak. However, the LSTM model can provide promising yield estimation results along with interannual climate
variations. Focusing on the curves in Fig. 8, we find that, as time progresses, the LSTM model simulates the yield trend well, which indicates that the LSTM model adapts well to interannual fluctuations of climate.

Fig. 8. Model performance for estimating yield at the (a) irrigation site near the north of Fufeng county, (b) irrigation site at Luqiao town, Sanyuan county and (c) rain-fed site at Pucun town, Qishan county. The black dashed line separates the data into two sets: the training and testing sets.

4. Discussion

Our results highlighted the potential of the LSTM model for winter wheat yield estimation through conflating high-dimensional data, including growth stage, temporal and spatial information from remote sensing data and meteorological data. LSTM treated yield estimation as a sequence-to-sequence problem. Although the LSTM model achieved the best performance for wheat yield estimation compared to our previous work, there are still several limitations that need to be considered. The LSTM model captures the empirical relationships between input variables and crop growth in neurons. Neural network modeling transforms primitive input variables into high-level representations through nonlinear activation and squashing functions, which weakens the traceability and interpretability of the LSTM model (Jiang et al., 2019). We can, therefore, further study advanced analytical tools such as attention networks to illustrate the machine learning process. In addition, the difference between temporal linking architectures and the improvement of model structure will be studied in future work to understand the growth of wheat. On the other hand, a potential limitation of the LSTM model lies in the relatively large trial-and-error effort needed to determine the optimal hyper-parameters. We need to automate the search for the best hyper-parameters to address this problem in the future.

The data-intensive nature of LSTMs (as of any deep learning model) is a potential barrier to applying them to data-scarce problems. Due to the scarcity of ground-measured yield data and the lack of historical yield data, the information available to the LSTM model during training is insufficient. Although the wheat yields of the irrigation sites and the rain-fed site were used in this study to further validate the performance of the LSTM model, there are few time series yield data at the sampling sites. With the accumulation of more monitoring data in the future, more learning samples can be obtained, so that the LSTM model can learn better data characteristics. In addition, more and more large-sample data sets are emerging, and an increase in extreme and uncertain events is characteristic of the most recent climate scenarios (Sun et al., 2019), which can help the LSTM model learn various cases and will catalyze future applications of LSTMs. With such data, the LSTM model could estimate winter wheat yield more accurately for various types of sites, under different management and under extreme weather events.

Another prospect for future research is the spatiotemporal scalability of the LSTM model. In the future, we will investigate coupling the LSTM neural network with a CNN to handle the spatially and temporally dependent prediction of crop production. Furthermore, field-scale yield estimation based on detailed farming management data and input features with improved temporal resolution, such as daily weather, can be further evaluated in terms of model performance. The deep learning approach based on LSTM for learning the spatiotemporal heterogeneity of crop growth holds great promise for gaining an improved understanding of the effects of climate change on agricultural production. A systematic interpretation, or interpretability in general, of the network internals would increase the trust in data-driven approaches, especially LSTMs, leading to their use in more applications in the near future.

5. Conclusions

Due to the nonlinear and non-stationary nature of wheat yield formation, a timely and robust wheat yield model based on remote sensing and meteorological data was developed in this study. We established yield estimation models with different time steps and different combinations of input data based on the LSTM. The results indicated that the LSTM model with two time steps and integrating remote sensing data and meteorological data can achieve the highest estimation accuracy (R² = 0.83 and RMSE = 357.77 kg/ha). Furthermore, the yield estimates were upscaled from county level to city level to confirm the model applicability for city-level yield estimation; the LSTM model performance varied at city level, with the yield estimation accuracy of the proposed method differing by region. The results corresponded well with the spatial pattern of a high-yield region in the west and a low-yield region in the east.

Two different approaches, BPNN and SVM, were compared with the optimal LSTM model using several indices, including the RMSE and R². The LSTM model significantly improved the precision of estimation and could be applied to monitoring crop growth and estimating yield. To further validate the robustness of the optimal LSTM model, the proposed method was applied to yield estimation at different sampling sites, including the rain-fed site and the irrigation sites, using the same hyper-parameters. The correlation between estimated yield and measured yield demonstrated that the LSTM model can achieve the optimal estimation performance for different types of sampling sites, and we find that the LSTM model has
better adaptability to interannual fluctuations of climate. The findings in this study demonstrated the potential of applying a deep-learning model in the field of agriculture to estimating and managing wheat yield in real time.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant 41871336. This work was also supported by UK Research and Innovation (UKRI) funding from a Science & Technology Facilities Council grant administered through Rothamsted Research (No. SM008 CAU). The work was further supported by a Royal Society-Newton Mobility Grant (UK).

References

Alexandratos, N., Bruinsma, J., 2012. World Agriculture Towards 2030/2050: the 2012 Revision. ESA Working Paper 12-03. United Nations Food and Agriculture Organization, Rome.
Anothai, J., Soler, C.M.T., Green, A., Trout, T.J., Hoogenboom, G., 2013. Evaluation of two evapotranspiration approaches simulated with the CSM-CERES-Maize model under different irrigation strategies and the impact on maize growth, development and soil moisture content for semi-arid conditions. Agric. For. Meteorol. 176, 64–76. https://doi.org/10.1016/j.agrformet.2013.03.001.
Araya, A., Hoogenboom, G., Luedeling, E., Hadgu, K.M., Kisekka, I., Martorano, L.G., 2015. Assessment of maize growth and yield using crop models under present and future climate in southwestern Ethiopia. Agric. For. Meteorol. 214-215, 252–265. https://doi.org/10.1016/j.agrformet.2015.08.259.
Becker-Reshef, I., Vermote, E.F., Lindeman, M., Justice, C., 2010. A generalized regression-based model for forecasting winter wheat yields in Kansas and Ukraine using MODIS data. Remote Sens. Environ. 114 (6), 1312–1323. https://doi.org/10.1016/j.rse.2010.01.010.
Bolton, D.K., Friedl, M.A., 2013. Forecasting crop yield using remotely sensed vegetation indices and crop phenology metrics. Agric. For. Meteorol. 173, 74–84. https://doi.org/10.1016/j.agrformet.2013.01.007.
Burke, M., Lobell, D.B., 2017. Satellite-based assessment of yield variation and its determinants in smallholder African systems. PNAS 114, 2189–2194. https://doi.org/10.1073/pnas.1616919114.
Cai, Y., Guan, K., Lobell, D., Potgieter, A., Wang, S., Peng, J., Xu, T., Asseng, S., Zhang, Y., You, L., Peng, B., 2019. Integrating satellite and climate data to predict wheat yield in Australia using machine learning approaches. Agric. For. Meteorol. 274, 144–159. https://doi.org/10.1016/j.agrformet.2019.03.010.
Campos-Taberner, M., García-Haro, F.J., Camps-Valls, G., Grau-Muedra, G., Nutini, F., Crema, A., Boschetti, M., 2016. Multitemporal and multiresolution leaf area index retrieval for operational local rice crop monitoring. Remote Sens. Environ. 187, 102–118. https://doi.org/10.1016/j.rse.2016.10.009.
Chen, Y., Zhang, Z., Tao, F., 2018. Improving regional winter wheat yield estimation through assimilation of phenology and leaf area index from remote sensing data. Eur. J. Agron. 101, 163–173. https://doi.org/10.1016/j.eja.2018.09.006.
Cunha, R.L.F., Silva, B., Netto, M.A.S., 2018. A scalable machine learning system for pre-season agriculture yield forecast. In: Proceedings of the 2018 IEEE 14th International Conference on e-Science (e-Science), pp. 423–430. https://doi.org/10.1109/eScience.2018.00131.
Doraiswamy, P.C., Moulin, S., Cook, P.W., Stern, A., 2003. Crop yield assessment from remote sensing. Photogramm. Eng. Remote Sens. 69 (6), 665–674. https://doi.org/10.14358/PERS.69.6.665.
Edreira, J.I.R., Otegui, M.E., 2012. Heat stress in temperate and tropical maize hybrids: differences in crop growth, biomass partitioning and reserves use. Field Crop. Res. 130, 87–98. https://doi.org/10.1016/j.fcr.2012.02.009.
Fang, H., Liang, S., Hoogenboom, G., Teasdale, J., Cavigelli, M., 2008. Corn yield estimation through assimilation of remotely sensed data into the CSM-CERES-Maize model. Int. J. Remote Sens. 29 (10), 3011–3032. https://doi.org/10.1080/01431160701408386.
Feng, P., Wang, B., Liu, D., Waters, C., Yu, Q., 2019. Incorporating machine learning with biophysical model can improve the evaluation of climate extremes impacts on wheat yield in south-eastern Australia. Agric. For. Meteorol. 275, 100–113. https://doi.org/10.1016/j.agrformet.2019.05.018.
Gers, F.A., Schmidhuber, J., Cummins, F., 2000. Learning to forget: continual prediction with LSTM. Neural Comput. 12 (10), 2451–2471. https://doi.org/10.1049/cp:19991218.
Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., Schmidhuber, J., 2015. LSTM: a search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28 (10), 2222–2232. https://doi.org/10.1109/TNNLS.2016.2582924.
Haider, S.A., Naqvi, S.R., Akram, T., Umar, G.A., Shahzad, A., Sial, M.R., Khaliq, S., Kamran, M., 2019. LSTM neural network based forecasting model for wheat production in Pakistan. Agronomy 9 (2), 72. https://doi.org/10.3390/agronomy9020072.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9 (8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735.
Huang, J., Tian, L., Liang, S., Ma, H., Becker-Reshef, I., Huang, Y., Su, W., Zhang, X., Zhu, D., Wu, W., 2015. Improving winter wheat yield estimation by assimilation of the leaf area index from Landsat TM and MODIS data into the WOFOST model. Agric. For. Meteorol. 204, 106–121. https://doi.org/10.1016/j.agrformet.2015.02.001.
Jiang, H., Hu, H., Zhang, R., Xu, J., Xu, J., Huang, J., Wang, S., Ying, Y., Lin, T., 2019. A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: a case study of the US Corn Belt at the county level. Glob. Change Biol. 1–13. https://doi.org/10.1111/gcb.14885.
Johnson, M.D., Hsieh, W.W., Cannon, A.J., Davidson, A., Bedard, F., 2016. Crop yield forecasting on the Canadian prairies by remotely sensed vegetation indices and machine learning methods. Agric. For. Meteorol. 218-219, 74–84. https://doi.org/10.1016/j.agrformet.2015.11.003.
Kamilaris, A., Prenafeta-Boldú, F.X., 2018. Deep learning in agriculture: a survey. Comput. Electron. Agric. 147 (1), 70–90. https://doi.org/10.1016/j.compag.2018.02.016.
Kaul, M., Hill, R.L., Walthall, C., 2005. Artificial neural networks for corn and soybean yield prediction. Agric. Syst. 85 (1), 1–18. https://doi.org/10.1016/j.agsy.2004.07.009.
Kratzert, F., Klotz, D., Brenner, C., Schulz, K., Herrnegger, M., 2018. Rainfall-runoff modelling using long-short-term-memory (LSTM) networks. Hydrol. Earth Syst. Sci. 22 (11), 6005–6022. https://doi.org/10.5194/hess-22-6005-2018.
Kuwata, K., Shibasaki, R., 2016. Estimating corn yield in the United States with MODIS EVI and machine learning methods. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 8 (3), 131–136. https://doi.org/10.5194/isprs-annals-III-8-131-2016.
Lecun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444. https://doi.org/10.1038/nature14539.
Leroux, L., Castets, M., Baron, C., Escorihuela, M.J., Bégué, A., Lo, S., 2019. Maize yield estimation in West Africa from crop process-induced combinations of multi-domain remote sensing indices. Eur. J. Agron. 108, 11–26. https://doi.org/10.1016/j.eja.2019.04.007.
Leroux, L., Baron, C., Zoungrana, B., Traore, S.B., Lo Seen, D., Begue, A., 2016. Crop monitoring using vegetation and thermal indices for yield estimates: case study of a rainfed cereal in semi-arid West Africa. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 9, 347–362. https://doi.org/10.1109/JSTARS.2015.2501343.
Li, Y., Guan, K., Yu, A., Peng, B., Zhao, L., Li, B., Peng, J., 2019. Toward building a transparent statistical model for improving crop yield prediction: modeling rainfed corn in the U.S. Field Crop. Res. 234, 55–65. https://doi.org/10.1016/j.fcr.2019.02.005.
Lin, Q., Wang, P., Zhang, S., Li, L., Jing, Y., Liu, J., 2016. Applicability of vegetation temperature condition index for drought monitoring at different time scales. Arid Zone Res. 33 (1), 186–192. https://doi.org/10.13866/j.azr.2016.01.24 (in Chinese with English abstract).
Lin, T., Zhong, R., Wang, Y., Xu, J., Jiang, H., Xu, J., Ying, Y., Rodriguez, L., Ting, K.C., Li, H., 2020. Deepcropnet: a deep spatial-temporal learning framework for county-level corn yield estimation. Environ. Res. Lett. 15 (3), 034016 https://doi.org/10.1088/1748-9326/ab66cb.
Liu, W., Kogan, F., 2002. Monitoring Brazilian soybean production using NOAA/AVHRR based vegetation condition indices. Int. J. Remote Sens. 23, 1161–1179. https://doi.org/10.1080/01431160110076126.
Lobell, D.B., Roberts, M.J., Schlenker, W., Braun, N., Little, B.B., Rejesus, R.M., Hammer, G.L., 2014. Greater sensitivity to drought accompanies maize yield increase in the U.S. Midwest. Science 344 (6183), 516–519. https://doi.org/10.1126/science.1251423.
Maimaitijiang, M., Sagan, V., Sidike, P., Hartling, S., Esposito, F., Fritschi, F.B., 2020. Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sens. Environ. 237, 111599 https://doi.org/10.1016/j.rse.2019.111599.
Olah, C., 2015. Understanding LSTM networks. http://colah.github.io/posts/2015-08-Understanding-LSTMs.
Paswan, R.P., Begum, S.A., 2013. Regression and neural networks models for prediction of crop production. Int. J. Sci. Eng. Res. 4 (9), 98–107.
Prasad, A.K., Chai, L., Singh, R.P., Kafatos, M., 2006. Crop yield estimation model for Iowa using remote sensing and surface parameters. Int. J. Appl. Earth Obs. Geoinf. 8 (1), 26–33. https://doi.org/10.1016/j.jag.2005.06.002.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323 (6088), 533–536. https://doi.org/10.1038/323533a0.
Sami, K., John, F., Andrew, K., Nathan, D., Scott, S., 2018. Integration of high resolution remotely sensed data and machine learning techniques for spatial prediction of soil properties and corn yield. Comput. Electron. Agric. 153, 213–225. https://doi.org/10.1016/j.compag.2018.07.016.
Savitzky, A., Golay, M.J.E., 1964. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 36 (8), 1627–1639. https://doi.org/10.1021/ac60214a047.
Schwalbert, R.A., Amado, T., Corassa, G., Pott, L.P., Prasad, P.V.V., Ciampitti, I.A., 2020. Satellite-based soybean yield forecast: integrating machine learning and weather data for improving crop yield prediction in southern Brazil. Agric. For. Meteorol. 284, 107886 https://doi.org/10.1016/j.agrformet.2019.107886.
Sepp, H., Jürgen, S., 1997. Long short-term memory. Neural Comput. 9, 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15 (56), 1929–1958.
Sun, J., Di, L., Sun, Z., Shen, Y., Lai, Z., 2019. County-level soybean yield prediction using deep CNN-LSTM model. Sensors 19 (20), 4363. https://doi.org/10.3390/s19204363.
Sun, J., Lai, Z., Di, L., Sun, Z., Tao, J., Shen, Y., 2020. Multilevel deep learning network for county-level corn yield estimation in the U.S. Corn Belt. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 13, 5048–5060. https://doi.org/10.1109/JSTARS.2020.3019046.
Sun, W., Wang, P., Zhang, S., Zhu, D., Liu, J., Chen, J., Yang, H., 2008. Using the vegetation temperature condition index for time series drought occurrence monitoring in the Guanzhong Plain, PR China. Int. J. Remote Sens. 29, 5133–5144. https://doi.org/10.1080/01431160802036557.
Tian, H., Wang, P., Tansey, K., Zhang, S., Zhang, J., Li, H., 2020. An IPSO-BP neural network for estimating wheat yield using two remotely sensed variables in the Guanzhong Plain, PR China. Comput. Electron. Agric. 169, 105180 https://doi.org/10.1016/j.compag.2019.105180.
Unganai, L.S., Kogan, F.N., 1998. Drought monitoring and corn yield estimation in Southern Africa from AVHRR data. Remote Sens. Environ. 63 (3), 219–232. https://doi.org/10.1016/s0034-4257(97)00132-6.
Vapnik, V.N., 1995. The Nature of Statistical Learning Theory. Springer, pp. 10–45.
Wan, Z., Wang, P., Li, X., 2004. Using MODIS Land surface temperature and normalized difference vegetation index products for monitoring drought in the southern Great Plains, USA. Int. J. Remote Sens. 25, 61–72. https://doi.org/10.1080/0143116031000115328.
Wang, P., Gong, J., Li, X., 2001. Vegetation temperature condition index and its application for drought monitoring. Geomat. Inform. Sci. Wuhan Univ. 26 (5), 412–418. https://doi.org/10.13203/j.whugis2001.05.007 (in Chinese with English abstract).
Wolanin, A., Mateo-García, G., Camps-Valls, G., Gómez-Chova, L., Meroni, M., Duveiller, G., Liang, Y., Guanter, L., 2020. Estimating and understanding crop yields with explainable deep learning in the Indian Wheat Belt. Environ. Res. Lett. 15 (2), 024019 https://doi.org/10.1088/1748-9326/ab68ac.
Xiao, C., Chen, N., Hu, C., Wang, K., Gong, J., Chen, Z., 2019. Short and mid-term sea surface temperature prediction using time-series satellite data and LSTM-AdaBoost combination approach. Remote Sens. Environ. 233, 13–58. https://doi.org/10.1016/j.rse.2019.111358.
Xie, Y., Wang, P., Bai, X., Khan, J., Zhang, S., Li, L., Wang, L., 2017. Assimilation of the leaf area index and vegetation temperature condition index for winter wheat yield estimation using Landsat imagery and the CERES-Wheat model. Agric. For. Meteorol. 246, 194–206. https://doi.org/10.1016/j.agrformet.2017.06.015.
Xun, L., Wang, P., Li, L., Wang, L., Kong, Q., 2018. Identifying crop planting areas using Fourier-transformed feature of time series MODIS leaf area index and sparse-representation-based classification in the North China Plain. Int. J. Remote Sens. 1–19. https://doi.org/10.1080/01431161.2018.1492181.
You, J., Li, X., Low, M., Lobell, D., Ermon, S., 2017. Deep Gaussian process for crop yield prediction based on remote sensing data. Proc. Thirty-First AAAI Conf. Artif. Intel. 31, 4559–4566. https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14435.
Zhang, J., Zhu, Y., Zhang, X., Ye, M., Yang, J., 2018. Developing a long short-term memory (LSTM) based model for predicting water table depth in agricultural areas. J. Hydrol. 561, 918–929. https://doi.org/10.1016/j.jhydrol.2018.04.065.
Zhou, Z., Morel, J., Parsons, D., Kucheryavskiy, S.V., Gustavssom, A., 2019. Estimation of yield and quality of legume and grass mixtures using partial least squares and support vector machine analysis of spectral data. Comput. Electron. Agric. 162, 246–253. https://doi.org/10.1016/j.compag.2019.03.038.