Agricultural and Forest Meteorology 237–238 (2017) 105–122

Contents lists available at ScienceDirect

Agricultural and Forest Meteorology


journal homepage: www.elsevier.com/locate/agrformet

Meteorological drought forecasting for ungauged areas based on machine learning: Using long-range climate forecast and remote sensing data

Jinyoung Rhee a, Jungho Im b,∗
a Climate Research Department, APEC Climate Center, Busan, Republic of Korea
b School of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea

ARTICLE INFO

Article history:
Received 28 September 2016
Received in revised form 31 January 2017
Accepted 6 February 2017
Available online 10 February 2017

Keywords:
Drought forecasting
Machine learning
Climate forecast data
Remote sensing
Spatial interpolation

ABSTRACT

A high-resolution drought forecast model for ungauged areas was developed in this study. The Standardized Precipitation Index (SPI) and Standardized Precipitation Evapotranspiration Index (SPEI) with 3-, 6-, 9-, and 12-month time scales were forecasted with 1–6-month lead times at 0.05 × 0.05° resolution. The use of long-range climate forecast data was compared to the use of climatological data for periods with no observation data. Machine learning models utilizing drought-related variables based on remote sensing data were compared to the spatial interpolation of Kriging. Two performance measures were used; one is producer’s drought accuracy, defined as the number of correctly classified samples in extreme, severe, and moderate drought classes over the total number of samples in those classes, and the other is user’s drought accuracy, defined as the number of correctly classified samples in drought classes over the total number of samples classified to those classes. One of the machine learning models, extremely randomized trees, performed the best in most cases in terms of producer’s accuracy reaching up to 64%, while spatial interpolation performed better in terms of user’s accuracy up to 44%. The contribution of long-range climate forecast data was not significant under the conditions used in this study, but further improvement is expected if forecast skill is improved or a more sophisticated downscaling method is used. Simulated decreases of forecast error in precipitation and mean temperature were tested: the simulated decrease of forecast error in precipitation improves drought forecast while the decrease of forecast error in mean temperature does not contribute much. Although there is still some room for improvement, the developed model can be used for drought-related decision making in ungauged areas.

© 2017 Elsevier B.V. All rights reserved.
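
The two accuracy measures defined in the abstract can be illustrated with a short sketch. The Python below is not the authors' code: the function names and the SPI values are hypothetical, the class thresholds follow the McKee et al. (1993) classification that the paper adopts (Table 1), and a sample is assumed to count as "correctly classified" only when the forecast class equals the observed class exactly.

```python
from typing import Sequence, Tuple

# Drought classes counted by both measures (moderate, severe, extreme).
DROUGHT_CLASSES = {"MD", "SD", "ED"}


def classify(index_value: float) -> str:
    """Map an SPI/SPEI value to a class label (thresholds after McKee et al., 1993)."""
    if index_value >= 2.0:
        return "EW"
    if index_value >= 1.5:
        return "VW"
    if index_value >= 1.0:
        return "MW"
    if index_value > -1.0:
        return "NN"
    if index_value > -1.5:
        return "MD"
    if index_value > -2.0:
        return "SD"
    return "ED"


def drought_accuracies(observed: Sequence[str],
                       forecast: Sequence[str]) -> Tuple[float, float]:
    """Return (producer's, user's) drought accuracy for paired class labels."""
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")
    # Assumption: "correct" means an exact match of the drought class.
    correct = sum(1 for o, f in zip(observed, forecast)
                  if o in DROUGHT_CLASSES and o == f)
    n_observed = sum(1 for o in observed if o in DROUGHT_CLASSES)
    n_forecast = sum(1 for f in forecast if f in DROUGHT_CLASSES)
    producers = correct / n_observed if n_observed else float("nan")
    users = correct / n_forecast if n_forecast else float("nan")
    return producers, users


# Hypothetical SPI values at six stations (not data from the study).
observed_spi = [-2.3, -1.7, -1.2, -0.4, 0.3, -1.1]
forecast_spi = [-2.1, -1.2, -1.3, -1.1, -1.0, 0.2]
observed_cls = [classify(v) for v in observed_spi]
forecast_cls = [classify(v) for v in forecast_spi]
producers, users = drought_accuracies(observed_cls, forecast_cls)
print(f"producer's = {producers:.2f}, user's = {users:.2f}")
```

In this toy case two of the four observed drought samples are forecast in the right class (producer's accuracy 0.50), while five samples are forecast as drought (user's accuracy 0.40), which shows how missed droughts and false alarms are penalized by different measures.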

∗ Corresponding author.
E-mail address: ersgis@unist.ac.kr (J. Im).
http://dx.doi.org/10.1016/j.agrformet.2017.02.011
0168-1923/© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Droughts have caused significant losses and damages. Defining the types of drought helps to monitor droughts and to build strategies for the preparation and response to droughts (Ward et al., 2006; Harou et al., 2010). Wilhite and Glantz (1985) addressed the conceptual definition of drought as well as the operational definition of drought. The operational definition includes the onset, severity, termination, and frequency of drought. They also defined meteorological, agricultural, hydrological, and socio-economic droughts based on their review of more than 150 studies. More recently, Mishra and Singh (2010) reviewed the definitions of previous studies. In this study, we adopted the definition of the Intergovernmental Panel on Climate Change Special Report (IPCC SREX, 2012): “a period of abnormally dry weather long enough to cause a serious hydrological imbalance.”

Real-time or near real-time drought monitoring is invaluable for timely response to drought events. Drought early warning systems help to perform drought assessment and monitoring, and to make appropriate decisions in a timely manner by providing drought information, and to ultimately reduce drought damages. Drought forecasts, if feasible, can provide drought information in advance, enabling the reduction of drought impacts by securing appropriate resources and planning effective allocation of them.

There are not many existing drought forecasting systems, but a variety of methods have been studied to provide drought forecast data to be used for drought-related decision making. Mishra and Singh (2011) listed the components of drought
forecasting as hydro-meteorological variables, drought indices, large-scale climate indices, methodologies, and output. The hydro-meteorological variables used for drought forecasting can be determined according to the type of drought of interest; precipitation is the most important variable for meteorological drought, while soil moisture and reservoir level are essential for agricultural and hydrological droughts, respectively. Drought indices such as the Standardized Precipitation Index (SPI; McKee et al., 1993) and the Palmer Drought Severity Index (PDSI; Palmer, 1965), or the combination of any related variables, are used as indicators for drought conditions. Large-scale climate variabilities such as the El Niño-Southern Oscillation (ENSO) and Arctic Oscillation (AO) may be used for long-range drought forecast. Methodologies of drought forecasting include regression models (e.g., Leilah and Al-Khateeb, 2005), time series models (e.g., Han et al., 2010), probability models (e.g., Cancelliere et al., 2007), artificial neural network models (e.g., Morid et al., 2007), and hybrid models (e.g., Mishra et al., 2007), and each method produces outputs to quantify the drought condition determining the initiation and termination of drought, the nature of severity, and the probability of occurrence, among other values (Mishra and Singh, 2011).

Long-range climate forecast data for variables such as precipitation, air temperature, and relative humidity with lead times up to 6 months can be used for drought forecasting. The seasonal drought outlook of the National Oceanic and Atmospheric Administration (NOAA) of the US has already used 3-month long-range climate forecast data as well as the 1-month precipitation and temperature forecast of the Climate Prediction Center (CPC), short-term climate forecast data of the Weather Prediction Center, soil moisture model results, the probability of the termination and reduction of PDSI, climatology, and initial conditions to produce seasonal drought outlook results (CPC, 2016). Drought forecast data based on long-range climate forecasts are sometimes reprocessed to be easily understood by end users; Steinemann (2006) developed the Forecast Precipitation Index (FPI) so that decision makers can easily use drought forecast results after conducting a questionnaire survey on water resources managers to bridge the gap between climate science and social decisions.

Each method introduced above has its own advantages to predict drought-related variables. In order to forecast drought indices that can be calculated based on hydro-meteorological variables from climate model outputs, the use of long-range climate forecast is a good choice since it is not necessary to use any proxy variable or to depend on any indirect method. Mo and Lyon (2015) recently performed meteorological drought prediction based on SPI using the North American Multi-Model Ensemble (NMME) and the Global Precipitation Climatology Center (GPCC) precipitation data at 1 × 1° (about 100 × 100 km near the equator). This resolution, however, is too coarse to provide detailed drought information for decision makers. Weather station data can replace the GPCC precipitation. In this case, there should be a model to provide information for ungauged areas between the locations of weather stations.

In this study, a high resolution meteorological drought forecast model was developed to provide drought forecast information based on SPI and Standardized Precipitation-Evapotranspiration Index (SPEI) for ungauged areas. The high resolution (0.05 × 0.05°) of the developed model enables the provision of detailed data for regional and local uses and ultimately the reduction of drought impact. The objectives of this study are to (1) develop a drought forecast model based on the combination of remote sensing and long-range forecast data using machine learning for ungauged areas, and (2) provide improved ranges of drought forecast in case of the improvement of forecasting skill of the long-range forecast data.

2. Study area and data

2.1. Study area

The study area is South Korea, located in northeast Asia (Fig. 1). The area of South Korea is about 100,284 km². About 19% of the area is composed of rice paddies and other crop lands, and about 64% consists of forests (KOSIS, 2016a). The annual average temperature of the five main cities, Seoul, Incheon, Gangneung, Mokpo, and Busan, ranges from 11.4 to 14.1 °C, and average annual precipitation ranges from 1108 to 1445 mm (KOSIS, 2016b). The temporal scope of this study is from January 2003 to August 2015, and lead times of one to six months were used.

The methodology of this study may be applied to any region with appropriate input variables related to drought. South Korea is located in the mid-latitude region with partial influence of large-scale atmosphere-ocean interactions such as ENSO, and the forecast skill of the long-range climate forecast data is also known to be lower compared to tropical regions (e.g., Wang et al., 2001; Wang et al., 2004). The study area was thus selected to examine the performance of machine learning models under the limited skill of the long-range forecast data. The study area will be expanded to areas with distinct influence of large-scale atmosphere-ocean interactions such as Southeast Asia in the following studies.

There have been serious historical droughts in South Korea. The events in 1939, 1968, 1978, and 1982 were considered as the most extreme drought events before the 1990s (Sim, 2009). After the 1990s, the area experienced a nation-wide drought during 1994–1995 (Park and Schubert, 1997). The 2001 and 2008–2009 drought events were also recorded as devastating ones especially in Kangwon Province located in the northeast of South Korea where more than 50,000 residents experienced drinking water shortage (Sim, 2009). More recently, a short-term spring drought occurred in 2012 resulting in less than 30% of normal precipitation, and a long-term drought initiated in 2013 continued to 2015, especially in the northwest part of South Korea.

2.2. Data

2.2.1. Automatic synoptic observation system (ASOS) data

Drought conditions in this study were measured using drought indices, and the reference drought index values were calculated from the ASOS data. Monthly precipitation and temperature data from 61 ASOS stations in the study area (Fig. 1) were obtained from the Korea Meteorological Administration for the period of January 1973–August 2015. Potential evapotranspiration data were obtained from mean air temperature using the Thornthwaite method (Thornthwaite, 1948). Although this method could be misleading during winter seasons in the study area because it assumes zero potential evapotranspiration for temperatures below zero degrees Celsius, physically based methods such as Penman-Monteith or Hargreaves could not be used because they require at least the minimum and maximum temperature, while the long-range climate forecast provides only mean temperature.

2.2.2. Drought indicators

Many drought indices have been developed since the mid-20th century. The most widely used drought indices include the PDSI developed in the 1960s and the SPI developed in the 1990s. The Standardized Precipitation Evapotranspiration Index (SPEI), which was developed by Vicente-Serrano et al. (2010) considering not only the precipitation used for SPI but also atmospheric moisture demand, has also been used in many studies. Usually drought index values are calculated at weather station locations from time series

Fig. 1. Study area of South Korea.
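
Section 2.2.1 derives potential evapotranspiration from mean air temperature with the Thornthwaite method and notes that the method returns zero for sub-zero temperatures. The following Python sketch shows the standard textbook form of that calculation; it is not the authors' implementation (the paper later reports using the R SPEI package), and the function name, the station temperatures, and the day-length values are hypothetical inputs chosen for illustration.

```python
from typing import List, Sequence


def thornthwaite_pet(monthly_temp_c: Sequence[float],
                     day_length_h: Sequence[float],
                     days_in_month: Sequence[int]) -> List[float]:
    """Monthly potential evapotranspiration (mm) for one year, January to December.

    monthly_temp_c: mean air temperature per month (deg C)
    day_length_h:   mean day length per month (hours), supplied by the caller
    days_in_month:  number of days per month
    """
    if not (len(monthly_temp_c) == len(day_length_h) == len(days_in_month) == 12):
        raise ValueError("expected 12 monthly values for each input")

    # Annual heat index: only months with positive temperature contribute.
    heat_index = sum((t / 5.0) ** 1.514 for t in monthly_temp_c if t > 0)
    if heat_index == 0:
        return [0.0] * 12

    # Empirical exponent as a cubic function of the heat index.
    a = (6.75e-7 * heat_index ** 3 - 7.71e-5 * heat_index ** 2
         + 1.792e-2 * heat_index + 0.49239)

    pet = []
    for t, hours, days in zip(monthly_temp_c, day_length_h, days_in_month):
        if t <= 0:
            # The method assumes no evapotranspiration at or below freezing.
            pet.append(0.0)
        else:
            unadjusted = 16.0 * (10.0 * t / heat_index) ** a
            # Correct for actual day length and month length.
            pet.append(unadjusted * (hours / 12.0) * (days / 30.0))
    return pet


# Hypothetical mid-latitude station (values are illustrative, not ASOS data).
temps = [-2.4, 0.4, 5.7, 12.5, 17.8, 22.2, 24.9, 25.7, 21.2, 14.8, 7.2, 0.4]
day_lengths = [9.8, 10.8, 12.0, 13.2, 14.2, 14.7, 14.5, 13.5, 12.4, 11.2, 10.1, 9.5]
days = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
pet = thornthwaite_pet(temps, day_lengths, days)
print([round(v, 1) for v in pet])
```

The January value is exactly zero because the mean temperature is negative, which is the winter limitation the text describes. Refinements for very high temperatures are deliberately left out of this sketch.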

data of meteorological variables to assess and monitor drought conditions.

While many drought indices exist, each with their own advantages and disadvantages (Mishra and Singh, 2010), we used SPI and SPEI in this study. The SPI was recommended for meteorological drought monitoring by the World Meteorological Organization (WMO) (Hayes et al., 2011). SPEI can be calculated using a process similar to that of SPI, but it considers both precipitation and potential evapotranspiration, and reflects water availability in a more realistic way.

2.2.2.1. Standardized Precipitation Index. SPI can be calculated by accumulating precipitation over the desired time scale (e.g., 3 months for 3-month SPI, 6 months for 6-month SPI), fitting the precipitation data to an appropriate probability distribution such as gamma or Pearson type-III (Guttman, 1999), and converting it to a Gaussian probability distribution for the standardization. The gamma probability distribution was used in this study as it is an appropriate distribution for fitting precipitation data in the South Korea region (Lee and Kim, 2012).

Since it uses precipitation data only, it can be applied in many areas that lack data for various variables. Spatial and temporal comparisons are also possible due to the standardization. Since the time scale is determined by the user, SPI with a specific time scale can be used for monitoring appropriate types of drought as the drought condition becomes aggravated. The mean and standard deviation of SPI are zero and one, respectively. The drought classes suggested by McKee et al. (1993) were used in this study (Table 1).

Table 1
Drought condition classifications for SPI and SPEI (McKee et al., 1993).

| Classification | Index Value |
| --- | --- |
| Extremely wet (EW) | ≥2.00 |
| Very wet (VW) | From 1.50 to 1.99 |
| Moderately wet (MW) | From 1.00 to 1.49 |
| Near normal (NN) | From 0.99 to −0.99 |
| Moderate drought (MD) | From −1.00 to −1.49 |
| Severe drought (SD) | From −1.50 to −1.99 |
| Extreme drought (ED) | ≤−2.00 |

Based on the two-parameter gamma distribution, 3-, 6-, 9-, and 12-month SPI values for 61 ASOS stations were calculated as reference data. The thirty-year period from 1981 to 2010 was used as the calibration period for parameter estimation. In order to compare with the machine learning models, SPI values using long-range forecast were also calculated for the weather station locations and then spatially interpolated.

2.2.2.2. Standardized Precipitation Evapotranspiration Index. SPEI uses the same drought classification as SPI (Table 1). SPEI is based on a conceptual water balance where the difference between precipitation and evapotranspiration equals the summation of runoff, groundwater, and soil storage. The standardization enables spatial and temporal comparisons of the index. In this study, the R SPEI package (version 1.6; http://cran.r-project.org/web/packages/SPEI/) was used. Potential evapotranspiration was also obtained based on the Thornthwaite method when calculating SPEI. As with SPI, 3-, 6-, 9-, and 12-month SPEI values for 61 ASOS stations were calculated as reference data using the thirty-year period from 1981 to 2010 as the calibration period for parameter estimation. In order to compare SPEI with the machine learning models, SPEI values using long-range forecast were calculated for the weather station locations and then spatially interpolated.

2.2.3. Remote sensing data related to drought

Drought-related variables can be monitored using remote sensing data (e.g., AghaKouchak et al., 2015; McVicar and Jupp, 1998). Since remote sensing data provide spatially and temporally continuous information, their use may compensate for the limitation of point-based observation data for hydro-meteorological variables as well as enable drought monitoring in ungauged areas. Hydro-meteorological variables include precipitation, potential or actual evapotranspiration, soil moisture, runoff, reservoir level, streamflow, and temperature. Precipitation is the most suitable variable for meteorological drought monitoring. Streamflow or reservoir level data may be used for monitoring hydrological drought. Soil moisture provides valuable information to assess agricultural drought. Precipitation and evapotranspiration can be considered as drivers controlling drought conditions, while other variables of

Table 2
Remote sensing data used in this study.

| Product | Temporal Resolution | Spatial Resolution | Variable | Data Use Start Date | Data Use End Date | Latency |
| --- | --- | --- | --- | --- | --- | --- |
| TRMM 3B43 | Monthly | 0.25 × 0.25° | Precipitation | Jan 2003 | June 2014 | Archived past data |
| GPM IMERG | Monthly | 0.1 × 0.1° | Precipitation | July 2014 | Apr 2015 | 4 months |
| GPM IMERG | Daily | 0.1 × 0.1° | Precipitation | May 2015 | Aug 2015 | 18 h (Late Run) |
| MCD43C4 | 16-day (produced every 8 days with 16 days of acquisition) | 0.05 × 0.05° | Bands 1–7 Surface Reflectance | Jan 2003 | Aug 2015 | NA |
| MYD11C3 | Monthly | 0.05 × 0.05° | Land Surface Temperature (Day, Night) | Jan 2003 | Aug 2015 | NA |
| MYD13C2 | Monthly | 0.05 × 0.05° | NDVI | Jan 2003 | Aug 2015 | NA |
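
Table 2 lists MCD43C4 as a 16-day product, and Section 2.2.3 states that the 16-day data were converted to monthly data using the number of days of each 16-day period that belong to the month. One possible reading of that rule is a day-weighted mean, sketched below in Python. This is an assumption about the weighting, not the authors' code; the function names, the composite start dates, and the reflectance values are hypothetical, and the sketch treats the composites as consecutive, non-overlapping 16-day periods.

```python
import calendar
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple


def days_in_month_overlap(start: date, length_days: int,
                          year: int, month: int) -> int:
    """Count the days of a composite period that fall inside (year, month)."""
    period_end = start + timedelta(days=length_days - 1)
    month_start = date(year, month, 1)
    month_end = date(year, month, calendar.monthrange(year, month)[1])
    first = max(start, month_start)
    last = min(period_end, month_end)
    return max((last - first).days + 1, 0)


def monthly_from_composites(composites: Iterable[Tuple[date, Optional[float]]],
                            year: int, month: int,
                            length_days: int = 16) -> Optional[float]:
    """Day-weighted mean of composite values for one month.

    composites: (start_date, value) pairs; value None marks missing data.
    Returns None when no valid composite overlaps the month.
    """
    weighted_sum = 0.0
    total_days = 0
    for start, value in composites:
        if value is None:
            continue
        n_days = days_in_month_overlap(start, length_days, year, month)
        weighted_sum += n_days * value
        total_days += n_days
    return weighted_sum / total_days if total_days else None


# Hypothetical composites around July 2015 (values are illustrative only).
composites = [
    (date(2015, 6, 26), 0.30),  # 26 Jun to 11 Jul: 11 days in July
    (date(2015, 7, 12), 0.40),  # 12 Jul to 27 Jul: 16 days in July
    (date(2015, 7, 28), 0.50),  # 28 Jul to 12 Aug: 4 days in July
]
july_value = monthly_from_composites(composites, 2015, 7)
print(round(july_value, 4))
```

The same weighting idea would apply to the daily GPM IMERG data that were merged into monthly values, with a period length of one day.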

streamflow and reservoir level are response variables determined by the drought. Both types of variables are useful for effective drought studies. In this study, precipitation (PRCP), daytime land surface temperature (LST DAY), nighttime land surface temperature (LST NIGHT), potential evapotranspiration (PET) estimated using mean temperature, the Normalized Difference Vegetation Index (NDVI), and the Normalized Difference Water Index (NDWI) were used (Table 2).

Tropical Rainfall Measuring Mission (TRMM; Huffman et al., 2007) data developed by National Aeronautics and Space Administration (NASA) and Japan Aerospace Exploration Agency (JAXA) as well as Global Precipitation Measurement (GPM; Huffman et al., 2015) data were used for precipitation. The TRMM 3B43 data were obtained from the Goddard Earth Sciences Data and Information Service Center (GES DISC), USA; it combines three-hour TRMM 3B42 data, the Climate Anomaly Monitoring System (CAMS)’s global gridded precipitation, and Global Precipitation Climatology Center (GPCC)’s global gridded observed precipitation. Because of TRMM’s descent after July 2014 due to a lack of fuel, GPM Integrated Multi-satellite Retrievals for GPM (IMERG) data obtained from the Precipitation Measurement Missions (PMM) of NASA, USA were used from July 2014. Because the monthly data had a latency of 4 months at the time of the data analysis of this study, daily data were merged into monthly data from May to August 2015.

Daytime and nighttime Land Surface Temperature (LST) data were obtained from the MYD11C3 Land Surface Temperature and Emissivity product of the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor onboard NASA’s Aqua satellite (Wan, 1999). The data were obtained from the Earth Observing System Data and Information System of NASA, USA.

NDVI data were obtained from the MYD13C2 Vegetation Indices product of the MODIS sensor onboard the Aqua satellite (Huete et al., 1999). Reflectance values of the red and near infrared regions are used to monitor the vitality of vegetation in terms of the changes in chlorophyll and spongy mesophyll content of vegetation (Ke et al., 2015). The relationship between NDVI and drought, however, depends on the phenological phase, the type of vegetation, and the length of drought (Ji and Peters, 2003; Vicente-Serrano et al., 2013; Yagci et al., 2015; Muriithi et al., 2016). Thus, NDWI, which was developed by Gao (1996) and shows high correlations with drought conditions (Rhee et al., 2010), was also used in this study. It uses two bands, 0.86 and 1.24 µm, which show similar levels of scattering by vegetation (Gao, 1996). The 1.24 µm channel is more sensitive to water existing in vegetation compared to other channels with longer wavelengths because the energy absorption by water inside vegetation is stronger in this region (Gao, 1996). NDWI is known to be less sensitive to atmospheric effects compared to NDVI and detects the overall amount of water existing in layers of leaves (Gao, 1996); it can be very useful for drought studies along with NDVI (e.g., Anderson et al., 2010; Chakraborty and Shehgal, 2010; Gu et al., 2007). Surface reflectance data were obtained from the MCD43C4 Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance product of the MODIS sensor (Strahler et al., 1999). This product combines data from both the Aqua and Terra satellites. Only data from seven of the 36 MODIS bands are provided through this product. The 16-day data were converted to monthly data using the numbers of days of the 16-day period belonging to each month.

2.2.4. Large-scale climate index

Drought events are sometimes connected to large-scale atmosphere and ocean interactions even in mid-latitude regions. Kang (1998) examined the statistical correlations between interannual variations of temperature and precipitation in South Korea and the Southern Oscillation Index (SOI). High winter temperatures and low summer temperatures, as well as enhanced summer precipitation, were observed during El Niño years. Cha (1999) also reported that there have been low summer temperatures and high winter temperatures during El Niño years, as well as increased precipitation. Kim (2004) performed the Rotated Empirical Orthogonal Function (REOF) analysis of 500 hPa geopotential height showing the first ENSO mode and the second Arctic Oscillation (AO) mode.

In this study, the Multivariate ENSO Index (MEI; Wolter and Timlin, 2011) and Arctic Oscillation Index (AOI; Higgins et al., 2002) were examined for their contributions to drought forecasting. Among various ENSO indices, the MEI was adopted in this study. It uses six variables: sea-level pressure, zonal and meridional wind speed, sea surface temperature (SST), surface air temperature, and total cloudiness fraction of the sky. The MEI values are extracted as the first EOF mode. MEI data from 1950 to August 2015 were obtained from NOAA Earth System Research Laboratory. AOI data from 1950 to August 2015 were obtained from NOAA CPC. Correlations between the meteorological drought indices and both MEI and AOI were calculated because existing studies did not provide consistent time lags between large-scale signals and drought occurrence in the study area (e.g., Kang, 1998; Cha, 1999; Kim, 2004). Although Pearson’s correlations were generally low (data not shown), a one-year time lag was applied for both MEI and AOI since it provided statistically significant results with a 90% confidence level.

2.2.5. Long-range climate forecast data

Precipitation and 2-m temperature data with 1–6 month lead times were obtained from six individual Global Climate Models (GCMs) used for APEC Climate Center (APCC) Multi-Model Ensemble (MME). They are MSC CanCM3 (Meteorological Service of Canada Climate Centre third generation coupled global climate model), MSC CanCM4 (Meteorological Service of Canada Climate Centre fourth generation coupled global climate model), POAMA (Predictive Ocean Atmosphere Model for Australia), NCEP CFSv2 (National Centers for Environmental Prediction Coupled Forecast

Table 3
Six Global Climate Models (GCM) used in this study.

| Product | Reference | Temporal/Spatial Resolution | Variable | Hindcast | Forecast |
| --- | --- | --- | --- | --- | --- |
| MSC CanCM3 | Scinocca et al. (2008), Verseghy (2000) | Monthly, 2.5 × 2.5° | Precipitation, 2 m Air Temperature | Jan. 1982–Jan. 2011 | Jan. 2013–Aug. 2015 |
| MSC CanCM4 | Merryfield et al. (2013) | | | Feb. 1981–Jan. 2011 | Jan. 2013–Aug. 2015 |
| NASA | Rienecker et al. (2008), Vernieres et al. (2012) | | | Dec. 1981–Nov. 2012 | Jan. 2013–Aug. 2015 |
| NCEP CFSv2 | Saha et al. (2014) | | | Mar. 1982–Sep. 2009 | Jan. 2013–Aug. 2015 |
| PNU | Sun and Ahn (2011) | | | Mar. 1980–Dec. 2012 | Jan. 2013–Aug. 2015 |
| POAMA | Alves et al. (2003) | | | Jan. 1983–Dec. 2011 | Jan. 2013–Aug. 2015 |
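
The six model outputs in Table 3 enter the drought forecast through the percent-increment calibration of Eqs. (3) and (4) in Section 3.2. The Python sketch below shows one way that calculation could look. It rests on two assumptions that the text does not spell out: the model prediction anomaly is taken as the forecast minus the model's own (hindcast) climatology, and the six percent increments are combined by a simple mean. The function names and all numbers are hypothetical.

```python
from statistics import mean
from typing import Mapping


def percent_increment(model_forecast: float, model_climatology: float) -> float:
    """Eq. (3): M_a* = M_a / M_c, assuming M_a = forecast - model climatology."""
    if model_climatology == 0:
        raise ValueError("model climatology must be non-zero")
    anomaly = model_forecast - model_climatology
    return anomaly / model_climatology


def calibrated_forecast(obs_climatology: float,
                        forecasts: Mapping[str, float],
                        climatologies: Mapping[str, float]) -> float:
    """Eq. (4): M_cal = O_c * (1 + M_a*), using the ensemble-mean increment."""
    increments = [percent_increment(forecasts[name], climatologies[name])
                  for name in forecasts]
    # Assumption: the multi-model ensemble is an unweighted mean.
    return obs_climatology * (1.0 + mean(increments))


# Hypothetical monthly precipitation (mm/day) for one station and one lead time.
forecasts = {"CanCM3": 3.0, "CanCM4": 4.4, "NASA": 2.7,
             "CFSv2": 3.6, "PNU": 5.0, "POAMA": 3.4}
climatologies = {"CanCM3": 4.0, "CanCM4": 4.0, "NASA": 3.0,
                 "CFSv2": 4.0, "PNU": 5.0, "POAMA": 4.0}
observed_climatology = 6.0

calibrated = calibrated_forecast(observed_climatology, forecasts, climatologies)
print(round(calibrated, 3))
```

Because only the relative departure from each model's climatology is transferred to the observed climatology, systematic model biases in the mean cancel out, which is the purpose of the bias correction described in the text.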

System version 2), PNU (Pusan National University), and NASA (National Aeronautics and Space Administration) (Table 3). Units are mm/day and Kelvin (K) for precipitation and temperature, respectively.

3. Methodology

We focused on the meteorological drought caused by a prolonged period of a lack of precipitation. SPI and SPEI with 3-, 6-, 9-, and 12-month time scales were used (SPI3, SPI6, SPI9, SPI12, SPEI3, SPEI6, SPEI9, and SPEI12) for analyses. Lead times from 1 to 6 months were used. When calculating the index, observation data were used when available (past period), and the remaining period was filled with long-range forecast data (future period). This means that the period with observation data is longer with a shorter lead time, and more persistency or memory of the index will appear with a shorter lead time. For the SPI9 forecast with 3-month lead time as an example, observation data with a 6-month period and long-range forecast data for the following 3-month period are used.

3.1. Drought category classification accuracy measures

Drought forecasting results can be obtained as a form of drought index value, or its corresponding category. Since decision making regarding drought is usually done based on drought classes, the accuracy of drought category forecasting was used when comparing different forecasting models.

There are several measures in the accuracy assessment of classifications: overall accuracy, user’s accuracy, and producer’s accuracy. The user’s accuracy is used as a measure of the error of commission that occurs when the samples are classified to a class incorrectly, while the producer’s accuracy is used as a measure of the error of omission that occurs when the samples are not classified to a class (Jensen, 2005). Two types of cost may occur in drought forecasting; these are when a serious drought condition was forecasted but actually did not happen, and when a serious drought happened but it was not forecasted beforehand. Considering the tremendous economic cost of drought events that have not been forecasted, it was assumed that the error of omission is more serious than the error of commission and the producer’s accuracy is more appropriate to use in this study. Overall accuracy was not used because it may be misleading due to the inclusion of all wet classes. Thus, producer’s accuracy was used as the main accuracy measure for drought category classification, and only drought classes of extreme drought (ED), severe drought (SD), and moderate drought (MD) were considered (producer’s Drought Accuracy hereafter):

$$\text{Producer's Drought Accuracy} = \frac{\text{Number of samples correctly classified to ED, SD, or MD classes}}{\text{Total number of samples in ED, SD, or MD classes}} \qquad (1)$$

User’s accuracy was also used as a supplementary accuracy measure for drought category classification, to provide false positive prediction information (user’s Drought Accuracy hereafter):

$$\text{User's Drought Accuracy} = \frac{\text{Number of samples correctly classified to ED, SD, or MD classes}}{\text{Total number of samples classified to ED, SD, or MD classes}} \qquad (2)$$

3.2. Drought forecasting at gauge locations

The use of long-range forecast data was explored for drought forecasting and compared to the baseline using climatology as performed by Mo and Lyon (2015). Instead of using 1 × 1° grids, data for 61 ASOS weather stations were used. For the station locations, drought index values of SPI and SPEI with 3-, 6-, 9-, and 12-month time scales were calculated. For each lead time, observation data were used for the past period, and two methods were used to fill the future period: one method uses long-range forecast data (F-method hereafter) and the other method uses climatology (baseline; C-method hereafter).

The climatology data were obtained from the median value of 100 samples randomly derived from the observation data of the same month. The long-range forecast data were used in a way that the percent increment of modeled precipitation or temperature anomaly is applied to observational climatology data to form the calibrated (bias-corrected) model data (Quan et al., 2012). Before being combined with observation data, ensembles of percent increment of precipitation or temperature from six individual models were obtained (Quan et al., 2012).

$$M_a^{*} = M_a / M_c \qquad (3)$$

$$M_{cal} = O_c \, (1 + M_a^{*}) \qquad (4)$$

where $M_a^{*}$ is the percent increment of model prediction anomaly, $M_a$ is model prediction anomaly, $M_c$ is model climatology, $O_c$ is observation climatology, and $M_{cal}$ is calibrated model prediction.

3.3. Drought forecasting for ungauged areas

The use of machine learning models was tested to obtain spatially distributed drought forecasting data (ML-method hereafter) and a spatial interpolation was compared as a baseline method (I-method hereafter). Kriging (Gaussian process) was used as the spatial interpolation baseline method (Cressie, 1993); it has been widely used to estimate spatially distributed information in many studies. Leave-one-out cross-validation of spatial interpolation was used to obtain estimated values at weather station locations for comparison.

The use of long-range forecast data was also compared to the use of climatology in ungauged areas. Four different combinations were tested: F-ML-method (long-range forecast data and machine learning), C-ML-method (climatology data with machine learning),
F-I-method (long-range forecast data with spatial interpolation), and C-I-method (climatology data with spatial interpolation). Machine learning models and the spatial interpolation model were first developed using observation remote sensing data assuming perfect forecasts (PF-ML model and PF-I model hereafter) and the performance of the models using long-range forecast data or climatological data was tested.

Drought forecasting was performed by machine learning in this study using the scikit-learn Python library. Machine learning models provide results that are stable against outliers and especially useful when used with remote sensing data (Mishra and Desai, 2006). Three machine learning models, decision trees (DT), random forest (RF), and extremely randomized trees (ERT), were tested. The machine learning models use the remote sensing data introduced in Section 2.2.3 (LST DAY, LST NIGHT, NDVI, NDWI) as well as large-scale climate indices (MEI, AOI), the long-range forecast or climatological data combined with remote sensing data (PRCP, PET), and temporal and topographical information (MONTH, ELEV) (Fig. 2). The drought-related remote sensing variables were used to provide initial conditions of ungauged areas and to help estimate future drought conditions of the areas. The long-range forecast or climatological data are combined with remote sensing data in the same way that they are combined with weather station data; only the observation data in this case are not weather station data but historical remote sensing data.

Feature scaling was applied to LST DAY, LST NIGHT, NDVI, NDWI, PRCP, and PET; the time series of these variables are normalized using the maximum and minimum values for each month and location. This is to separate weather impacts from the environmental component (Kogan, 1995). Seasonality can also be removed by this normalization. Separate spatial analyses were not performed; data for all 61 weather station locations were pooled and examined together assuming spatial homogeneity of the study area. This assumption may not be realistic and it will be examined in further studies. Monthly analyses were not performed either. Instead, MONTH was included as a discrete input variable to the models.

sometimes shows overfitting. MAX DEPTH values of 3, 5, 10, and the full development were tested and the parameter value showing the best model performance was selected in each case.

3.3.2. Random forest

RF is based on Classification and Regression Trees (CART), a representative decision tree algorithm (Breiman, 2001). While the DT model uses only one tree, the RF model uses an ensemble of trees to avoid a well-known problem of CART (i.e., overfitting and sensitivity to training data configuration). Thus, it gained great popularity in remote sensing applications during the past decade (Li et al., 2013; Kim et al., 2014, 2015; Han et al., 2015; Park et al., 2016). Randomness is used during the training processes: during the process of bagging (bootstrap aggregating), a random sample of input vectors, with the same size as the original set of input vectors, is created allowing repetition and is used for each tree. Thus input vectors for each tree are independent but have the same probability distribution. All trees vote for the final class or produce a probabilistic classification result. The use of feature bagging, where only a subset of input variables is considered at each split, ensures stability against outliers and noise.

RF also provides the relative importance of input variables through the Out-Of-Bag (OOB) errors when a variable is perturbed. OOB errors are the differences between actual and estimated values of input vectors that are not used for training. Unlike cross-validation errors, the OOB process provides unbiased error estimations without using a part of the training data for estimating errors (Breiman, 2001). In this study, however, cross validation was performed for all models leaving each year out to avoid overfitting since the characteristics of input vectors may be different each year and the length of observation data is short. Leave-one-year-out cross validation was used instead of leave-one-out cross validation because the length of training data is relatively short and the characteristics of data of a certain year may be different from those of other years.

The number of trees (NUM TREES) is another important param-
Elevation data (ELEV) was also included as an input variable to use eter for RF and ERT models in addition to the MAX DEPTH of each
topographical information about the area. tree. Since RF and ERT models use an ensemble of trees, overfitting
Only SPI and SPEI values below −0.5 (the value slightly on the is less likely even with fully developed trees; each tree is trained
drier side of the Near Normal category) were used for training in with a subset of input vectors created to minimize correlations with
order to avoid the tendency that forecast results are more assigned subsets of input vectors for other trees. NUM TREES of 10, 50, 100,
to wetter classes. It would be ideal to have training data with evenly and 200 were tested and the parameter value showing the best
distributed classes, but it is not the case because SPI and SPEI values model performance was selected in each case.
follow the gaussian distribution. The full range of data is used for
performance evaluation.
3.3.3. Extremely randomized trees
ERT is similar to RF, but additional randomness is added to the
3.3.1. Decision trees
way that input vectors are split at each leaf. In ERT models, input
DT is used to develop either a classification or a regression model
vectors are split in random ways and the split showing the best
and has been widely used in remote sensing applications (Jensen
result is chosen later.
and Im 2007; Rhee et al., 2008; Im et al., 2012; Lu et al., 2014; Tor-
bick and Corbiere 2015). Since it is a white-box model, the resultant
tree can be explicitly examined. A tree consists of several leaves and 3.3.4. Cross validation
branches. Input data is split at each leaf node in a way to optimize Relationships between input variables and the target variable
the parameter of a split function, until all subsets have the same may be different each year, since the target variable is drought cat-
value of the target variable or the estimation is no longer improved. egory and its behavior is not well defined due to the short length
The Gini impurity index was used as a split function: of remote sensing records. In order to avoid overfitting, leave-one-
out cross validation was performed not with each sample but with

m
  each year. It may be called as leave-one year-out cross validation
Gini = fi (1 − fi ) , i ∈ 1, 2, . . ., m (5)
in this study. Results of the leave-one year-out cross validation
i=1
were compared to 10-fold with shuffle cross validation, leave-
where m is the number of classes and fi is the fraction of samples one sample-out cross validation, and leave-one station-out cross
classified as class i. validation, and all other results outperformed the leave-one year-
Sensitivity analysis was performed for the most important out cross validation results (data not shown). This indicates that
parameters. The maximum depth of the tree (MAX DEPTH) is unknown patterns may appear in the future and the leave-one
important for DT models. The tree will be fully developed unless year-out cross validation is more appropriate to evaluate model
MAX DEPTH is set to a certain value, and a fully developed tree performance.
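The leave-one-year-out cross validation of an ERT regression model can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the synthetic data, and the fixed hyperparameters are hypothetical, and the filtering of training samples to index values below −0.5 is omitted for brevity.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import LeaveOneGroupOut


def leave_one_year_out(X, y, years, n_trees=100, max_depth=None, seed=0):
    """Fit one ERT regressor per held-out year; return {year: predictions}."""
    predictions = {}
    splitter = LeaveOneGroupOut()
    # Each split holds out all samples that share one calendar year.
    for train_idx, test_idx in splitter.split(X, y, groups=years):
        model = ExtraTreesRegressor(
            n_estimators=n_trees,   # corresponds to NUM TREES in the text
            max_depth=max_depth,    # corresponds to MAX DEPTH; None = fully grown
            random_state=seed,
        )
        model.fit(X[train_idx], y[train_idx])
        held_out_year = int(years[test_idx][0])
        predictions[held_out_year] = model.predict(X[test_idx])
    return predictions


# Synthetic stand-in data (NOT from the paper): 13 years x 12 samples, 3 inputs.
rng = np.random.default_rng(42)
years = np.repeat(np.arange(2003, 2016), 12)
X = rng.normal(size=(years.size, 3))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=years.size)

preds = leave_one_year_out(X, y, years, n_trees=50)
```

Grouping by year means that no sample from the evaluated year is seen during training, which is the property the text relies on when it argues that this scheme is a more appropriate test than shuffled k-fold or leave-one-sample-out validation.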
J. Rhee, J. Im / Agricultural and Forest Meteorology 237–238 (2017) 105–122 111

Fig. 2. Structure of each machine learning model.
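Before the inputs shown in Fig. 2 reach the models, Section 3.3 normalizes the satellite and climate variables with the minimum and maximum for each month and location. A minimal pandas sketch is given below. It assumes the usual min-max form (x − min) / (max − min), which the text implies but does not spell out; the column names `station`, `month`, and `year` and the sample values are hypothetical.

```python
import pandas as pd


def minmax_by_month_and_location(df, columns, keys=("station", "month")):
    """Rescale each column to [0, 1] within every (station, month) group."""
    cols = list(columns)
    grouped = df.groupby(list(keys))[cols]
    lo = grouped.transform("min")
    hi = grouped.transform("max")
    # Leave NaN where a group has no variation, to avoid division by zero.
    span = (hi - lo).where(hi > lo)
    scaled = df.copy()
    scaled[cols] = (df[cols] - lo) / span
    return scaled


# Made-up values for one hypothetical station and two calendar months.
df = pd.DataFrame({
    "station": ["A"] * 6,
    "year": [2003, 2004, 2005, 2003, 2004, 2005],
    "month": [1, 1, 1, 7, 7, 7],
    "NDVI": [0.25, 0.75, 0.50, 0.60, 0.80, 0.70],
})
scaled = minmax_by_month_and_location(df, ["NDVI"])
```

Because each value is compared only with other years of the same calendar month at the same location, the seasonal cycle drops out, consistent with the statement that seasonality can be removed by this normalization.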

Since data from January 2003 to August 2015 were used, 13 models were developed for each method, index, time scale, and lead time. The performance evaluation in the next section is done using the average of these models in each case. The relative importance of input variables was also evaluated using the average of the results from 13 models in each case.

4. Results and discussion

4.1. Accuracy at gauge locations

Drought Accuracy values were compared between the C-method and F-method for SPI and SPEI with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months (Figs. 3 and 4). For SPI3 and SPEI3, only 1- and 2-month lead time forecasts retain observation data. For SPI6 and SPEI6, 1- to 5-month lead time forecasts retain observation data. Longer past periods exist for 9- and 12-month time scales.

In all cases where at least one month of past period is included, the C-method showed higher producer’s Drought Accuracy compared to the F-method (Fig. 3). This indicates that drought forecasts based on climatological data performed better than the forecasts based on long-range forecast data. It may be because the bias-correction method of the GCM results using the percent increment of anomaly is too simple; appropriate downscaling methods should be applied to use long-range forecast data for the study area. In cases with no observation data included (SPI3 and SPEI3 forecasts with 3–6-month lead times as well as SPI6 and SPEI6 forecasts with 6-month lead time), Drought Accuracy was zero, meaning no forecast skill for both the C-method and the F-method. Drought Accuracy values larger than zero observed here are mostly due to the memory (persistence) of the drought index itself.

The F-method showed higher user’s Drought Accuracy values than the C-method for some cases (SPI6 with shorter lead times, SPI9, SPEI9, and SPI12; Fig. 4). The user’s Drought Accuracy values for SPI3 with lead times longer than 4 months are not available because none of the samples were classified to drought classes. It should be noted though that user’s Drought Accuracy values can be misleading when only a few samples were classified to drought; for example, the 50% user’s Drought Accuracy of SPEI3 with 6-month lead time was obtained because only two samples were classified to drought and one of them was correctly classified (Fig. 4; sample number data not shown).

Fig. 3. Producer’s Drought Accuracy of SPI and SPEI forecasts with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months based on C-method and F-method.

Fig. 4. User’s Drought Accuracy of SPI and SPEI forecasts with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months based on C-method and F-method.

4.2. Accuracy for ungauged areas

Machine learning models can be either classification models or regression models. If regression models are developed, drought classes can be derived from the regression results using the drought class definitions (Table 1). Machine learning DT, RF, and ERT models were developed as both classification and regression models in this study, and the regression models mostly performed better. Only Drought Accuracy values derived from the regression models were further examined.

The Drought Accuracy of C-I, F-I, C-ML (DT, RF, ERT models), and F-ML (DT, RF, ERT models) methods were compared for SPI and SPEI with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months (Figs. 5 and 6). For the C-I and F-I methods, producer’s Drought Accuracy values for both SPI and SPEI change with lead time as they do at gauge locations (Figs. 3 and 5). The only difference is that the accuracy values are smaller due to spatial interpolation. The machine learning-based methods show quite different changes with lead time. Although the accuracy of the C-I and F-I methods decreases with increased lead time, many of the methods still show somewhat high producer’s Drought Accuracy with longer lead times for SPI and SPEI with 9- and 12-month time scales and also for SPEI3 and SPEI6 (Fig. 5).

Methods based on machine learning mostly performed better than the methods based on spatial interpolation. While there was not much difference between methods using climatological data and long-range forecast data, methods using climatological data performed slightly better in many cases (Fig. 5). As a result, the C-ML (ERT model) method and the F-ML (ERT model) method performed the best except for the 6-month lead SPI6 forecast. In that case, C-ML (RF model) performed the best (Fig. 5). The uncertainty ranges of producer’s Drought Accuracy are provided for the C-ML (ERT model) method using mean ± one standard deviation, showing that the method performs significantly better than many other methods (Fig. 5).

For user’s Drought Accuracy, methods based on machine learning show consistently low accuracy (Fig. 6). Methods based on spatial interpolation perform better than methods based on machine learning for drought indices with longer time scales (SPI9, SPEI9, SPI12, and SPEI12) as well as for drought indices with shorter time scales (SPI3, SPEI3, SPI6, and SPEI6) with shorter lead times (Fig. 6). The uncertainty ranges of user’s Drought Accuracy are provided for the F-I method using mean ± one standard deviation, showing that the method performs significantly better than many other methods (Fig. 6). Low user’s Drought Accuracy by machine learning implies that non-drought samples were not effectively learned by machine learning compared to the drought training samples, resulting in overestimation of drought conditions.

Weather station density may affect the results because fewer stations provide less information for spatial interpolation with increased distances between stations, and also a smaller dataset for training of machine learning models. Instead of using 61 weather stations, 41 and 21 randomly selected stations were tested; the results were similar though.

4.3. Improved ranges for ungauged areas

What if the skill of the long-range forecast improves? Different levels of decrease in forecast error were assumed using observation data, and their performance was examined (Figs. 7 and 8). Three simulation scenarios were designed:

• F1: 30% decrease of forecast error in precipitation
• F2: 30% decrease of forecast error in mean temperature
• F3: 30% decrease of forecast error in precipitation and mean temperature

Producer’s Drought Accuracy was improved by F1-ML and F3-ML methods compared to F-ML methods, while F2-ML methods produced similar results to F-ML methods (Fig. 7). The results indicate that the decrease of forecast error in precipitation contributed to the improvement of drought forecast. For methods based on spatial interpolation, F1-I, F2-I, and F3-I methods showed even lower accuracy values than the F-I method in many cases (Fig. 7). The shorter the time scale of SPI or SPEI, the larger the improved range of the accuracy (Table 4). User’s Drought Accuracy was also improved by F1-ML and F3-ML methods compared to F-ML methods in most cases, and F1-I, F2-I, and F3-I methods showed even lower accuracy values than the F-I method (Fig. 8; Table 5).

4.4. Relative importance of input variables

The machine learning models used in this study provide relative importance information about the input variables. The relative importance of input variables of the ERT model was examined for SPI and SPEI with 3-, 6-, 9-, and 12-month time scales and 1–6-month

Fig. 5. Producer’s Drought Accuracy of SPI and SPEI forecasts with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months based on C-I, F-I, C-ML (DT, RF, ERT
model), and F-ML (DT, RF, ERT model) methods. The blue background indicates the uncertainty ranges of the accuracy using mean ± one standard deviation for the C-ML
(ERT model) method. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. User’s Drought Accuracy of SPI and SPEI forecasts with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months based on C-I, F-I, C-ML (DT, RF, ERT model),
and F-ML (DT, RF, ERT model) methods. The red background indicates the uncertainty ranges of the accuracy using mean ± one standard deviation for the F-I method. (For
interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
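The two measures plotted in Figs. 5 and 6 can be computed from paired observed and forecast classes. The sketch below follows the definitions given in the abstract (correctly classified samples in the extreme, severe, and moderate drought classes, divided by the number of observed or of forecast drought samples). The class labels, the function name, and the five example samples are hypothetical.

```python
import numpy as np

DROUGHT_CLASSES = ("extreme", "severe", "moderate")


def drought_accuracies(observed, predicted):
    """Return (producer's, user's) drought accuracy as fractions."""
    obs = np.asarray(observed)
    pred = np.asarray(predicted)
    obs_is_drought = np.isin(obs, DROUGHT_CLASSES)
    pred_is_drought = np.isin(pred, DROUGHT_CLASSES)
    # A sample counts as correct only if the exact drought class matches.
    n_correct = np.sum((obs == pred) & obs_is_drought)
    producers = n_correct / obs_is_drought.sum() if obs_is_drought.any() else float("nan")
    # Undefined when nothing was forecast as drought.
    users = n_correct / pred_is_drought.sum() if pred_is_drought.any() else float("nan")
    return float(producers), float(users)


observed = ["moderate", "severe", "normal", "extreme", "normal"]
predicted = ["moderate", "moderate", "normal", "normal", "normal"]
producers_acc, users_acc = drought_accuracies(observed, predicted)
```

In this made-up case two samples are forecast as drought and one of them is correct, so the user's accuracy is 0.5, while only one of three observed drought samples is captured, so the producer's accuracy is 1/3. This is the situation the text warns about, where a user's accuracy based on very few forecast drought samples can look misleadingly high.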

Fig. 7. Producer’s Drought Accuracy of SPI and SPEI forecasts with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months based on C-I, F-I, F1-I, F2-I, F3-I, C-ML
(ERT model), F-ML (ERT model), F1-ML (ERT model), F2-ML (ERT model), and F3-ML (ERT model) methods.

Fig. 8. User’s Drought Accuracy of SPI and SPEI forecasts with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months based on C-I, F-I, F1-I, F2-I, F3-I, C-ML (DT model), F-ML (DT model), F1-ML (DT model), F2-ML (DT model), and F3-ML (DT model) methods.

lead times (Fig. 9). The relative importance values are given using the importance of the most important variable as 100.

The most important variable in all cases is PRCP. It is not surprising that PET also ranked high for SPEI forecast, especially for SPEI9 and SPEI12 (PET is also on the right-axis for them in Fig. 9f and h). This is because SPEI is originally calculated using precipitation and potential evapotranspiration.

The relative importance of other input variables varies with drought index, time scale, and lead time of the forecast. If only input variables with greater than 10% relative importance for at least one lead time are considered, MONTH, NDVI, MEI, and LST DAY were chosen as important variables in addition to PET and PRCP. Temporal information MONTH appears more important for SPI3 and SPEI3 forecasts with longer lead times while NDVI appears more important for SPI3 and SPEI3 forecasts with 3-month lead time (Fig. 9a and b). LST DAY has more than 10% relative importance for SPI6 and SPEI9 forecasts with 2-month lead time (Fig. 9c and f). The large-scale climate index MEI also appears relatively important for drought index forecast with shorter time scales, which are SPI3,

Fig. 9. Relative importance of input variables for SPI and SPEI forecasts with 3-, 6-, 9-, and 12-month time scales and lead times of 1–6 months based on ERT model.
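The scaling used for Fig. 9, where the most important variable is set to 100 and variables above 10% are singled out, can be sketched as below. The raw importance values are invented for illustration and are not results from the paper. In scikit-learn, tree ensembles expose raw importances through the `feature_importances_` attribute, which is impurity-based; whether the authors used that attribute is an assumption.

```python
import numpy as np


def relative_importance(raw_importances):
    """Scale importances so that the largest value equals 100."""
    raw = np.asarray(raw_importances, dtype=float)
    return raw / raw.max() * 100.0


# Hypothetical raw importances for four of the paper's input variables.
names = ["PRCP", "PET", "MONTH", "NDVI"]
raw = [0.50, 0.25, 0.04, 0.10]

rel = relative_importance(raw)
# Keep variables whose relative importance exceeds 10 (percent of the maximum).
important = [n for n, r in zip(names, rel) if r > 10.0]
```

With these made-up numbers PRCP scores 100, PET 50, NDVI 20, and MONTH 8, so MONTH would fall below the 10% cut-off.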

Table 4
Improved ranges of producer’s Drought Accuracy (%) for ungauged areas for F1-ML (ERT model) method.

Lead time (month)   SPI3     SPI6     SPI9     SPI12    SPEI3    SPEI6    SPEI9    SPEI12
1                   5.51     2.74     −0.18    1.01     4.09     1.6      −0.36    −0.2
2                   8.05     3.97     −0.67    2.55     5.88     2.93     0.9      1.41
3                   11.54    3.11     1.88     5.19     10.15    3.72     2.75     1.6
4                   7.78     9        −0.1     2.71     12.6     3.83     0.49     1.8
5                   5.64     3.94     2.01     5.44     15.53    4.94     0.65     1.07
6                   10.09    −5.83    0.21     7.5      15.39    3.67     1.54     2.98

Table 5
Improved ranges of user’s Drought Accuracy (%) for ungauged areas for F1-ML (DT model) method.

Lead time (month)   SPI3     SPI6     SPI9     SPI12    SPEI3    SPEI6    SPEI9    SPEI12
1                   2.6      −0.03    0.37     0.45     2.99     2        0.19     −0.07
2                   5.46     0.91     0.45     0.46     2.25     1.96     0.19     1
3                   −11.01   1.11     0.34     1.3      11.15    2.1      0.93     1.31
4                   15.54    1.12     −0.11    1.19     12.17    3.25     0.88     0.62
5                   3.92     1.95     0.98     0.41     4.88     3.41     1        0.02
6                   5.56     1.4      1.32     0.33     3.48     1.8      1.74     0.25
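Tables 4 and 5 report accuracy gains under scenario F1, in which the forecast error in precipitation is reduced by 30% using observation data. The paper does not give the formula, so the sketch below shows one plausible construction and should be read as an assumption: the error is taken as forecast minus observation, and the improved forecast keeps 70% of that error. The function name and the numbers are hypothetical.

```python
import numpy as np


def reduce_forecast_error(forecast, observed, reduction=0.30):
    """Return a forecast whose error against observations is scaled down."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    error = forecast - observed
    # Assumed form: keep (1 - reduction) of the original error.
    return observed + (1.0 - reduction) * error


# Made-up monthly precipitation values (mm), not data from the study.
observed_prcp = [100.0, 50.0, 0.0]
forecast_prcp = [130.0, 40.0, 10.0]
improved_prcp = reduce_forecast_error(forecast_prcp, observed_prcp)
```

Under this construction scenario F2 would apply the same function to mean temperature, and F3 to both variables, before the drought indices are recomputed.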

SPEI3, and SPEI6, especially for the SPEI3 forecast with all lead times (Fig. 9b).

4.5. Drought forecast maps

Although the machine learning models performed better than spatial interpolation for drought index forecast in terms of producer’s Drought Accuracy, there is still some room for improvement. The resultant drought forecast maps for August 2015 with 1–3-month lead times were created for SPI3, SPEI3, SPI6, and SPEI6 (Figs. 10 and 11). In August 2015, the study area was in Moderate Drought when drought index values for 61 ASOS weather stations were averaged, while the northwest part of the study area including Seoul, Incheon, and Gyeonggi was in Extreme Drought when drought index values for 6 weather stations were averaged (Table 6).

Drought conditions at weather stations (circles on maps) are actual observation values of August 2015, while spatially distributed values are forecasts (Figs. 10 and 11). Drought conditions in the 1-month lead SPI3 forecast look close to the observation data except in the southeast part of the study area, where the no-drought conditions could not be forecasted correctly (Fig. 10a). For the 1-month lead SPEI3 forecast, on the other hand, the Extreme Drought in the northwest seems not to be adequately forecasted (Fig. 10d). Similar conditions were observed for SPI6 and SPEI6 too (Fig. 11). Spatial interpolation generally predicted less serious drought conditions overall (data not shown).

4.6. Comparisons of the methods

4.6.1. Climatology and long-range climate forecast
Although not in all cases, the methods using climatological data (C, C-I, and C-ML) performed better than the methods using long-range forecast data (F, F-I, and F-ML) for SPI and SPEI forecast in many cases in terms of producer’s Drought Accuracy. This may be because the simple bias-correction method of long-range forecast data used here could not reflect the spatial variability within the study area. More sophisticated downscaling methods need to be tested in further study. The improvement of drought forecast in ungauged areas, however, can be expected with the enhancement of the forecast skill of precipitation, as was shown with the assumed forecast data scenarios.

4.6.2. Spatial interpolation and machine learning models
Three machine learning models, DT, RF, and ERT, were tested in this study and the ERT model generally performed the best in terms of producer’s Drought Accuracy. Both the C-ML (ERT model) and F-ML (ERT model) methods produced higher Drought Accuracy than the C-I and F-I methods. This means machine learning models could provide more promising drought forecast data compared to spatial interpolation techniques, such as the Kriging technique used in this study.

In many cases, machine learning models (C-ML and F-ML methods based on the ERT model) even performed better than F1-I, F2-I, and F3-I using spatial interpolation with assumed enhanced forecast data. It was shown that the use of machine learning models supplements the current shortcomings of long-range forecast data in this study area.

4.6.3. Limitations
This study still has some limitations. It may not be appropriate to use climatological data due to the non-stationary characteristics of the changing future climate, although in many cases the methods using climatological data perform better than the methods using long-range climate forecast data for drought indices such as SPI3 and SPEI3. Fortunately, the methods using long-range climate forecast data produced Drought Accuracy values comparable to those of the methods using climatological data, especially those based on machine learning. This implies that machine learning-based drought prediction would be more promising with improved climate forecasts.

Soil moisture data could not be included due to very large discrepancies between satellite-based soil moisture estimates and flux observation data. The use of a simple bias-correction method for long-range forecast data can also be a weakness of the study. These limitations will be addressed in further studies. Additional input variables for initial drought conditions and geographical information will be explored.

5. Conclusion

In this study, drought forecasting using long-range forecast data was tested and compared to the method using climatological data. In order to obtain drought forecast information in ungauged areas, machine learning of long-range forecast data or climatological data combined with remote sensing data was performed and the performance was compared to the spatial interpolation of Kriging. One of the machine learning models, the ERT model, performed the best in most cases in terms of producer’s Drought Accuracy. Although the contribution of long-range forecast data was not significant under the conditions used in this study, further improvement is expected. The use of machine learning models supplements the limitations of

Fig. 10. 1- to 3-month lead SPI3 and SPEI3 forecast in the study area for August 2015.

Fig. 11. 1- to 3-month lead SPI6 and SPEI6 forecast in the study area for August 2015.

Table 6
Drought index values in August 2015.

Drought index   Average of 61 stations (entire study area)   Average of 6 stations (northwest part)
SPI3            −1.35                                         −2.12
SPEI3           −1.38                                         −2.69
SPI6            −1.38                                         −2.64
SPEI6           −1.47                                         −3.09
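The drought categories quoted for August 2015 can be reproduced from the index values in Table 6 with a simple threshold rule. The class boundaries used below (−1.0, −1.5, and −2.0) are the conventional SPI limits and are an assumption here, because Table 1 with the paper's own class definitions is not reproduced in this section. The function name is hypothetical.

```python
def drought_class(index_value):
    """Map an SPI or SPEI value to a drought class (assumed thresholds)."""
    if index_value <= -2.0:
        return "Extreme Drought"
    if index_value <= -1.5:
        return "Severe Drought"
    if index_value <= -1.0:
        return "Moderate Drought"
    return "No Drought"


# Values copied from Table 6 (average of 61 stations, average of 6 stations).
table6 = {
    "SPI3": (-1.35, -2.12),
    "SPEI3": (-1.38, -2.69),
    "SPI6": (-1.38, -2.64),
    "SPEI6": (-1.47, -3.09),
}
classes = {name: tuple(drought_class(v) for v in values) for name, values in table6.items()}
```

Under these assumed thresholds every study-area average falls in Moderate Drought and every northwest average in Extreme Drought, which agrees with the description in Section 4.5.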

the simple bias-correction method used in this study for drought forecast in ungauged areas.

For high-resolution drought forecasting in ungauged areas, it is recommended to use the machine learning ERT model with climatological data or long-range climate forecast data for drought indices with longer time scales. This model produces drought forecast data spatially distributed at 0.05° × 0.05° resolution. The producer’s Drought Accuracy of the machine learning-based methods with long-range climate forecast data reaches up to 56% for SPI forecast and 64% for SPEI forecast.

Drought forecast skill using long-range forecast data can be improved if the forecast skill of long-range forecast data is improved. The assumed enhanced forecasts F1-ML and F3-ML showed improved Drought Accuracy in many cases. The drought forecast skill based on long-range forecast data may also be improved if more sophisticated bias-correction and downscaling methods are used. This deserves examination in further studies.

The level of Drought Accuracy derived in this study may be used as a baseline for comparison in future studies trying to produce high-resolution drought forecast data. Although there is still some room for improvement, the developed model may be used for drought-related decision making in ungauged areas.

Acknowledgment

This research was supported by the APEC Climate Center and the National Disaster Management Research Institute of Korea and by National Space Lab Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT, & Future Planning (Grant: NRF-2013M1A3A3A02042391).

References

AghaKouchak, A., Farahmand, A., Melton, F.S., Teixeira, J., Anderson, M.C., Wardlow, B.D., Hain, C.R., 2015. Remote sensing of drought: Progress, challenges, and opportunities. Rev. Geophys. 53, 452–480, http://dx.doi.org/10.1002/2014RG000456.
Alves, O., Wang, G., Zhong, A., Smith, N., Tseitkin, F., Warren, G., Schiller, A., Godfrey, S., Meyers, G., 2003. POAMA: Bureau of Meteorology operational coupled model seasonal forecast system. In: Stone, R., Partridge, I. (Eds.), Science for Drought: Proceedings of the National Drought Forum. Brisbane, Apr 2003, Department of Primary Industries, pp. 49–56.
Anderson, L.O., Malhi, Y., Aragao, L.E.O.C., Ladle, R., Arai, E., Barbier, N., Philips, O., 2010. Remote sensing detection of droughts in Amazonian forest canopies. New Phytol. 187, 733–750.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
CPC (Climate Prediction Center), 2016. Seasonal Drought Outlook. http://www.cpc.ncep.noaa.gov/products/expert_assessment/sdo_summary.html.
Cancelliere, A., Mauro, G.D., Bonaccorso, B., Rossi, G., 2007. Drought forecasting using the Standardized Precipitation Index. Water Resour. Manag. 21, 801–819.
Cha, E.-J., 2012. Summer 2012. Short Changma, drought, heatwave, tropical night, heavy rain, and typhoon. J. Korean Soc. Hazard Mitig. 12 (3), 4–9 (In Korean).
Chakraborty, A., Shehgal, V.K., 2010. Assessment of agricultural drought using MODIS derived normalized difference water index. J. Agric. Phys. 10, 28–36.
Cressie, N.A.C., 1993. Statistics for Spatial Data, Revised ed. John Wiley & Sons Inc., pp. 928.
Gao, B.-C., 1996. NDWI – a normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 58, 257–266.
Gu, Y., Brown, J.F., Verdin, J.P., Wardlow, B., 2007. A five-year analysis of MODIS NDVI and NDWI for grassland drought assessment over the central Great Plains of the United States. Geophys. Res. Lett. 34, L06407, http://dx.doi.org/10.1029/2006GL029127.
Guttman, N.B., 1999. Accepting the Standardized Precipitation Index: a calculation algorithm. J. Am. Water Resour. Assoc. 35, 311–322.
Han, P., Wang, P.X., Zhang, S.Y., Zhu, D.H., 2010. Drought forecasting based on the remote sensing data using ARIMA models. Math. Comput. Model. 51, 1398–1403.
Han, H., Lee, S., Im, J., Kim, M., Lee, M.-I., Ahn, M.H., Chung, S.-R., 2015. Detection of convective initiation using Meteorological Imager onboard Communication, Ocean, and Meteorological Satellite based on machine learning approaches. Remote Sens. 7, 9184–9204.
Harou, J., Medellin-Azuara, J., Zhu, T., Tanaka, S., Lund, J., Stine, S., Olivares, M., Jenkins, M., 2010. Economic consequences of optimized water management for a prolonged, severe drought in California. Water Resour. Res. 46, W05522, http://dx.doi.org/10.1029/2008WR007681.
Hayes, M., Svoboda, M., Wall, N., Widhalm, M., 2011. The Lincoln declaration on drought indices: universal meteorological drought index recommended. Bull. Am. Meteorol. Soc. 92 (4), 485–488.
Higgins, R.W., Leetmaa, A., Kousky, V.E., 2002. Relationships between climate variability and winter temperature extremes in the United States. J. Clim. 15, 1555–1572.
Huete, A., Justice, C., van Leeuwen, W., 1999. MODIS Vegetation Index (MOD 13) Algorithm Theoretical Basis Document, Version 3. Goddard Space Flight Center, National Aeronautics and Space Administration, 30 April 1999. 120 p.
Huffman, G.J., Adler, R.F., Bolvin, D.T., Gu, G., Nelkin, E.J., Bowman, K.P., Stocker, E.F., Wolff, D.B., 2007. The TRMM multi-satellite precipitation analysis: quasi-global, multi-year, combined-sensor precipitation estimates at fine scale. J. Hydrometeorol. 8, 33–55.
Huffman, G.J., Bolvin, D.T., Braithwaite, D., Hsu, K., Joyce, R., Kidd, C., Nelkin, E.J., Xie, P., 2015. Algorithm Theoretical Basis Document (ATBD) Version 4.5. NASA Global Precipitation Measurement (GPM) Integrated Multi-satellitE for GPM (IMERG) National Aeronautics and Space Administration, 16 November 2015. 30 p.
IPCC SREX, 2012. Managing the risks of extreme events and disasters to advance climate change adaptation. In: Field, C.B., Barros, V., Stocker, T.F., Qin, D., Dokken, D.J., Ebi, K.L., Mastrandrea, M.D., Mach, K.J., Plattner, G.-K., Allen, S.K., Tignor, M., Midgley, P.M. (Eds.), A Special Report of Working Groups I and II of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, UK, and New York, NY, USA, 582 p.
Im, J., Jensen, J., Jensen, R., Gladden, J., Waugh, J., Serrato, M., 2012. Vegetation cover analysis of hazardous waste sites in Utah and Arizona using hyperspectral remote sensing. Remote Sens. 4, 327–353.
Jensen, J.R., Im, J., 2007. Remote sensing change detection in urban environments. In: Jensen, R.R., Gatrell, J.D., McLean, D. (Eds.), Geo-Spatial Technologies in Urban Environments: Policy, Practice, and Pixels, 2nd edition. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 7–31.
Jensen, J.R., 2005. Introductory Digital Image Processing: A Remote Sensing Perspective, 3rd ed. Prentice Hall, Upper Saddle River, New Jersey, USA, pp. 544.
Ji, L., Peters, A.J., 2003. Assessing vegetation response to drought in the northern Great Plains using vegetation and drought indices. Remote Sens. Environ. 87, 85–98.
KOSIS (Korean Statistical Information System), 2016a. Land Use Statistics, Available online at http://kosis.kr.
KOSIS (Korean Statistical Information System), 2016b. Climatology, Available online at http://kosis.kr.
Kang, I.S., 1998. Relationship between El Nino and climate variability in Korean Peninsula. Atmos. J. Korean Meteorol. Soc. 34 (3), 390–396 (In Korean).
Ke, Y., Im, J., Lee, J., Gong, H., Ryu, Y., 2015. Characteristics of Landsat 8 OLI-derived NDVI by comparison with multiple satellite sensors and in-situ observations. Remote Sens. Environ. 164, 314–329.
Kim, Y.H., Im, J., Ha, H.K., Choi, J.-K., Ha, S., 2014. Machine learning approaches to coastal water quality monitoring using GOCI data. GISci. Remote Sens. 51, 158–174.
Kim, M., Im, J., Han, H., Kim, J., Lee, S., Shin, M., Kim, H.-C., 2015. Landfast sea ice monitoring using multisensory fusion in the Antarctic. GISci. Remote Sens. 52, 239–256.
Kim, S., 2004. The Pattern Change of Atmospheric Circulation in the North Hemisphere Associated with Spring Precipitation in Korea. Master’s Thesis. Gongju University. Department of Atmospheric Science, pp. 61.
Kogan, F.N., 1995. Application of vegetation index and brightness temperature for drought detection. Adv. Space Res. 15, 91–100.
Lee, J.H., Kim, C.J., 2012. A multimodel assessment of the climate change effect on the drought severity-duration-frequency relationship. Hydrol. Process. 27, 2800–2813, http://dx.doi.org/10.1002/hyp.9390.
Leilah, A.A., Al-Khateeb, S.A., 2005. Statistical analysis of wheat yield under drought conditions. J. Arid Environ. 61, 483–496.
Li, M., Im, J., Beier, C., 2013. Machine learning approaches for forest classification and change analysis using multi-temporal Landsat TM images over Huntington Wildlife Forest. GISci. Remote Sens. 50, 361–384.
Lu, Z., Im, J., Rhee, J., Hodgson, M., 2014. Building type classification using spatial and landscape attributes derived from LiDAR remote sensing data. Landsc. Urban Plan. 130, 134–148.
McKee, T.B., Doesken, N.J., Kleist, J., 1993. The relationship of drought frequency and duration of time scales. In: Proc. of the 8th Conf. of Applied Climatology, Anaheim CA, USA, Amer. Meteor. Soc, pp. 179–184.
Merryfield, W.J., Lee, W.-S., Boer, G.J., Kharin, V.V., Scinocca, J.F., Flato, G.M., Ajayamohan, R.S., Fyfe, J.C., 2013. The Canadian seasonal to interannual prediction system. Part I: models and initialization. Mon. Weather Rev. 141, 2910–2945.
Mishra, A.K., Desai, V.R., 2006. Drought forecasting using feed-forward recursive neural network. Ecol. Model. 198, 127–138.
Mishra, A.K., Singh, V.P., 2010. A review of drought concept. J. Hydrol. 391, 202–216.
Mishra, A.K., Singh, V.P., 2011. Drought modeling – a review. J. Hydrol. 403, 157–175.
Mishra, A.K., Desai, V.R., Singh, V.P., 2007. Drought forecasting using a hybrid stochastic and neural network model. J. Hydrol. Eng. ASCE 12 (6), 626–638.
Mo, K.C., Lyon, B., 2015. Global meteorological drought prediction using the North American multi-model ensemble. J. Hydrometeorol. 16, 1409–1424.

Morid, S., Smakhtin, V., Bagherzadeh, K., 2007. Drought forecasting using artificial neural networks and time series of drought indices. Int. J. Climatol. 27, 2103–2111.
Muriithi, F., Yu, D., Robila, S., 2016. Vegetation response to intensive commercial horticulture and environmental changes within watersheds in central highlands, Kenya, using AVHRR NDVI data. GISci. Remote Sens. 53, 1–21.
Palmer, W.C., 1965. Meteorological drought. In: U.S. Weather Bureau Research Paper 45, pp. 58.
Park, C.-K., Schubert, S.D., 1997. On the nature of the 1994 East Asian summer drought. J. Clim. 10, 1056–1070.
Park, S., Im, J., Jang, E., Rhee, J., 2016. Drought assessment and monitoring through blending of multi-sensor indices using machine learning approaches for different climate regions. Agric. Forest Meteorol. 216, 157–169.
Quan, X.-W., Hoerling, M.P., Lyon, B., Kumar, A., Bell, M.A., Tippett, M.K., Wang, H., 2012. Prospects for dynamical prediction of meteorological drought. J. Appl. Meteorol. Climatol. 51, 1238–1252.
Rhee, J., Im, J., Carbone, G.J., Jensen, J.R., 2008. Delineation of climate regions using in-situ and remotely-sensed data for the Carolinas. Remote Sens. Environ. 112, 3099–3111.
Rhee, J., Im, J., Carbone, G.J., 2010. Monitoring agricultural drought for arid and humid regions using multi-sensor remote sensing data. Remote Sens. Environ. 114, 2875–2887.
Rienecker, M.M., Suarez, M.J., Todling, R., Bacmeister, J., Takacs, L., Liu, H.-C., Gu, W., Sienkiewicz, M., Koster, R.D., Gelaro, R., Stajner, I., Nielsen, J.E., 2008. NASA, 118, NASA/TM-2008-104606.
Saha, S., Moorthi, S., Wu, X., Wang, J., Nadiga, S., Tripp, P., Behringer, D., Hou, Y.-T., Chuang, H., Iredell, M., Ek, M., Meng, J., Yang, R., Mendez, M.P., van den Dool, H., Zhang, Q., Wang, W., Chen, M., Becker, E., 2014. The NCEP Climate Forecast System version 2. J. Clim. 27, 2185–2208.
Scinocca, J.F., Mcfarlane, N.A., Lazare, M., Li, J., Plummer, D., 2008. Technical note: the CCCma third generation AGCM and its extension into the middle atmosphere. Atmos. Chem. Phys. 8, 7055–7074.
Sim, K.O., 2009. A Study on Extreme Drought. National Disaster Management Institute, pp. 170 (In Korean).
Steinemann, A.C., 2006. Using climate forecasts for drought management. J. Appl. Meteorol. Climatol. 45, 1353–1361.
Strahler, A.H., Lucht, W., Schaaf, C.B., Tsang, T., Gao, F., Li, X., Muller, J.-P., Lewis, P., Barnsley, M.J., 1999. MODIS BRDF/Albedo Product: Algorithm Theoretical Basis Document, Version 5.0. Goddard Space Flight Center, National Aeronautics and Space Administration, April 1999. 53 p.
Sun, J., Ahn, J.B., 2011. A GCM-based forecasting model for the landfall of tropical cyclones in China. Adv. Atmos. Sci. 28, 1049.
Thornthwaite, C.W., 1948. An approach toward a rational classification of climate. Geogr. Rev. 38 (1), 55–94, http://dx.doi.org/10.2307/210739.
Torbick, N., Corbiere, M., 2015. Mapping urban sprawl and impervious surfaces in the northeast United States for the past four decades. GISci. Remote Sens. 52, 746–764.
Vernieres, G., Rienecker, M.M., Kovach, R., Keppenne, C.L., 2012. In: Suarez, M.J. (Ed.), The GEOS-iODAS: Description and Evaluation, Technical Report Series on Global Modeling and Data Assimilation, vol. 30. National Aeronautics and Space Administration, Greenbelt, Maryland, 61 p.
Verseghy, D.L., 2000. The Canadian Land Surface Scheme (CLASS): its history and future. Atmos. Ocean 38, 1–13.
Vicente-Serrano, S.M., Begueria, S., Lopez-Moreno, J.I., 2010. A multiscalar drought index sensitive to global warming: the Standardized Precipitation Evapotranspiration Index. J. Clim. 23, 1696–1718.
Vicente-Serrano, S.M., Gouveia, C., Camarero, J.J., Begueria, S., Trigo, R., Lopez-Moreno, J.I., Azorin-Molina, C., Pasho, E., Lorenzo-Lacruz, J., Revuelto, J., Moran-Tejeda, E., Sanchez-Lorenzo, A., 2013. Response of vegetation to drought time-scales across global land biomes. Proc. Natl. Acad. Sci. U. S. A. 110, 52–57.
Wan, Z., 1999. MODIS Land-Surface Temperature Algorithm Theoretical Basis Document (LST ATBD) Version 3.3. Goddard Space Flight Center, National Aeronautics and Space Administration, April 1999. 75 p.
Wang, G., Kleeman, R., Smith, N., Tseitkin, F., 2001. The BMRC coupled general circulation model ENSO forecast system. Mon. Weather Rev. 130, 975–991.
Wang, B., Kang, I.-S., Lee, J.-Y., 2004. Ensemble simulations of Asian-Australian monsoon variability by 11 AGCMs. J. Clim. 17, 803–818.
Ward, F., Hurd, B., Rahmani, T., Gollehon, N., 2006. Economic impacts of federal policy responses to drought in the Rio Grande Basin. Water Resour. Res. 42, W03420, http://dx.doi.org/10.1029/2005WR004427.
Wilhite, D., Glantz, M.R., 1985. Understanding the drought phenomenon: the role of definitions. Water Int. 10, 111–120.
Wolter, K., Timlin, M.S., 2011. El Nino/southern oscillation behavior since 1871 as diagnosed in an extended multivariate ENSO index (MEI.ext). Int. J. Climatol. 31, 1074–1087.
Yagci, A., Di, L., Deng, M., 2015. The effect of corn-soybean rotation on the NDVI-based drought indicators: a case study in Iowa, USA, using vegetation condition index. GISci. Remote Sens. 52, 290–314.
