
[Downloaded free from http://www.ijph.in on Tuesday, June 29, 2021, IP: 183.171.126.85]

Original Article

Forecasting Incidence of Dengue in Rajasthan, Using Time Series Analyses
Sunil Bhatnagar1, *Vivek Lal2, Shiv D. Gupta3, Om P. Gupta4
1OSD (ME), Government of Rajasthan, 2Assistant Professor, Institute of Health Management Research, 3Director, Institute of Health Management Research, Jaipur, 4Director (Public Health), Directorate of Medical Health and Family Welfare, Government of Rajasthan, India

Abstract
Aim: To develop a prediction model for dengue fever/dengue haemorrhagic fever (DF/DHF) using time series
data over the past decade in Rajasthan and to forecast monthly DF/DHF incidence for 2011. Materials and
Methods: A seasonal autoregressive integrated moving average (SARIMA) model was used for statistical modeling.
Results: From January 2001 to December 2010, the reported DF/DHF cases showed a cyclical pattern with seasonal
variation. The SARIMA (0,0,1) (0,1,1)12 model had the lowest normalized Bayesian information criterion (BIC) of 9.426,
with a mean absolute percentage error (MAPE) of 263.361, and appeared to be the best model. The proportion of variance
explained by the model was 54.3%. Adequacy of the model was established through the Ljung–Box test (Q statistic 4.910
and P-value 0.996), which showed no significant correlation between residuals at different lag times. The forecast for
the year 2011 showed a seasonal peak in the month of October with an estimated 546 cases. Conclusion: Application
of the SARIMA model may be useful for forecasting cases and impending outbreaks of DF/DHF and other infectious
diseases that exhibit a seasonal pattern.

Key words: Dengue, SARIMA model, Time series analyses

*Corresponding Author: Dr. Vivek Lal, Assistant Professor, Institute of Health Management Research, 1, Prabhu Dayal Marg, Sanganer Airport, Jaipur - 302 011, India. E-mail: doc.lal@gmail.com

Access this article online
Website: www.ijph.in
DOI: 10.4103/0019-557X.106415
PMID: ***

Indian Journal of Public Health, Volume 56, Issue 4, October-December, 2012

Introduction

An estimated 50 million dengue infections occur annually and approximately 2.5 billion people live in dengue endemic countries.1 Dengue fever (DF) inflicts a significant health, economic, and social burden on the populations of these endemic areas. The World Health Organization (WHO) South-East Asia Region and Western Pacific Region bear nearly 75% of the global disease burden due to dengue.2

In India, the disease reflects cyclic patterns, which over the years have increased in frequency and geographical extent. Over the past decade, the cases of dengue have increased more than 20 times, from 650 cases in 2000 to 15,535 in 2009.2 The case fatality rate is significantly high compared with other infectious diseases. Although available data are largely derived from hospitalized cases, which represent dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS), the burden due to uncomplicated DF is nevertheless considerable.

Current dengue prevention strategies are weak as they are reactive rather than anticipatory. As a result, they may often be implemented late, thereby reducing the opportunities for preventing transmission and controlling the epidemic. The Asia Pacific Dengue Strategic Plan (2008–15) has been prepared to aid countries to reverse the rising trend of dengue by enhancing their preparedness to detect, characterize, and contain outbreaks rapidly and to stop the spread to new areas.3 Detailed information about when and where DF/DHF outbreaks occurred in the past can be a useful guide to the potential magnitude and severity of future epidemics. Forecasting incidence
of DF/DHF enables suitable allocation of resources for improved public health interventions.

The outbreaks of DF/DHF can be predicted by epidemiological modeling, thus enabling health systems to be in readiness to manage outbreaks. Time series analysis has been increasingly used in epidemiological research on infectious diseases such as influenza,4 malaria,5-7 and dengue.8-13

The objective of the present study was to develop a prediction model for DF/DHF using time series data over the past decade in Rajasthan and to forecast the monthly DF/DHF incidence for the year 2011.

Materials and Methods

Reported monthly DF/DHF cases from all the districts of Rajasthan for the period January 2001 through December 2010 were obtained from the Directorate of Health and Family Welfare, Government of Rajasthan.

Autoregressive integrated moving average (ARIMA) models have been used for statistical modeling and for analyzing time series data containing ordinary or seasonal trends to develop a predictive forecasting model.14 The ARIMA approach was first popularized by Box and Jenkins,15 and such models are often referred to as Box–Jenkins models. The ARIMA procedure provides a comprehensive set of tools for univariate time series model identification, parameter estimation, and forecasting, and it offers great flexibility in analysis, which has contributed to its popularity in several areas of research and practice. An ARIMA model may include autoregressive (p) terms, differencing (d) terms, and moving average (q) terms and is represented by ARIMA (p, d, q).

The ARIMA models can be extended to handle seasonal components of a data series. Seasonal ARIMA (SARIMA) is an extension of the method to a series in which a pattern repeats seasonally over time and is represented as SARIMA (p, d, q) (P, D, Q)s. Analogous to the simple ARIMA parameters, these are: seasonal autoregressive (P), seasonal differencing (D), and seasonal moving average (Q) parameters; s defines the number of time periods until the pattern repeats again (for monthly data it is 12).

Statistical forecasting methods are based on the assumption that the time series can be rendered approximately stationary. A stationary time series is one whose statistical properties, such as mean and variance, are constant over time. Seasonality usually causes the series to be nonstationary because the average values at some particular times within the seasonal span may differ from the average values at other times.

SPSS version 19.0 was used to determine the best-fitting model. The series was made stationary by means of seasonal and nonseasonal differencing. The orders of autoregression (AR) and moving average (MA) were identified using the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the differenced series.

Several logical combinations of criteria were considered in the search for better models. From among several models, the most suitable was selected based on three measures, namely, normalized Bayesian information criterion (BIC), mean absolute percentage error (MAPE), and stationary R-squared. Whereas lower values of BIC and MAPE were preferred, a higher value of stationary R-squared suggested a greater proportion of variance of the dependent variable explained by the model.

Before using the model for forecasting, it was checked for adequacy. A model is adequate if the residuals left over after fitting the model are simply white noise. This was checked by examining the ACF and PACF of the residuals. Further, the Ljung–Box test was used to provide an indication of whether the model was correctly specified. A significance value (P-value) less than 0.05 was taken to indicate the presence of structure in the observed series that was not accounted for by the model; therefore, we rejected any model with a significant value.

After the best model was identified, forecasts of monthly values for the year 2011 were made.

Results

The time series plot of the reported DF/DHF cases displayed seasonal fluctuations and was therefore deemed nonstationary. Large autocorrelations were recorded for lags 1, 12, and 24 with values 0.6, 0.4, and 0.3, respectively. The sharp decrease in autocorrelation values after lag 1 indicated no evidence of a long-term trend; consequently, there was no need to include a first-lag difference term in the SARIMA model structure (d = 0). In contrast, large autocorrelation values were registered at annual lags (and their multiples), which indicated the need to include a 12-month difference term in the models (s = 12, D = 1) [Figure 1]. The ACF and PACF plots of the differenced series provided further support for these conclusions [Figure 1]. Therefore, a SARIMA (p,0,q) (P,1,Q)12 was selected as the basic structure of the candidate model.

Among the statistical models, SARIMA (0,0,1) (0,1,1)12 was selected as the best model, with the lowest normalized BIC of 9.426 and a MAPE of 263.361 [Table 1]. The model explained 54.3% of the variance of the series (stationary R-squared). The model parameters were significant (P-value <0.001), with the MA term at seasonal lag 1 estimated as β = 0.756 (SE = 0.135).

The Ljung–Box test (Q statistic 4.910 and P-value 0.996) suggested that there was no significant autocorrelation between residuals at different lag times and that the residuals were white noise. This was further corroborated by plotting the ACF and PACF of the residuals [Figure 2].

Moreover, the same model was also returned by the SPSS Expert Modeler. Having tested its validity, the prediction model was used to forecast incidence of DF/DHF cases in the upcoming season in 2011. Figure 3 shows the month-wise trends of DF/DHF over the past 10 years and for 2011. The forecast showed a seasonality similar to that of previous years, with a peak in the month of October and an estimated 546 cases (95% CI 311–781). The momentum in dengue would begin in August 2011, peak in October, and then wane toward December [Figure 3].

Table 1: Normalized Bayesian information criterion (BIC), mean absolute percentage error (MAPE) and stationary R-squared values of SARIMA models

Models                      MAPE       Normalized BIC   Stationary R-squared
SARIMA (0,0,1) (0,1,1)12    263.361    9.426            0.543
SARIMA (0,0,1) (0,1,0)12    299.842    9.809            0.330
SARIMA (1,0,0) (1,1,1)12    245.526    9.590            0.516
SARIMA (1,0,1) (1,1,0)12    198.894    9.727            0.445
SARIMA (1,0,0) (1,1,0)12    160.090    9.761            0.394

Figure 1: Autocorrelation function (ACF) and partial autocorrelation function (PACF) of data and transformed series. Footnote: (a) shows time series data without any difference; (b) shows transformed series with seasonal difference (1, period 12); (c) shows transformed series with nonseasonal difference (1); (d) shows transformed series with both seasonal (1, period 12) and nonseasonal difference (1)

Figure 2: Autocorrelation function (ACF) and partial autocorrelation function (PACF) of residuals

Discussion and Conclusion

ARIMA models are useful in modeling the temporal dependence structure of a time series as they explicitly assume temporal dependence between observations.16 Particularly for seasonal diseases, ARIMA models have been shown to be adequate tools for use in epidemiological surveillance.17 Our study provides an example of applying a SARIMA model to forecast incidence of DF/DHF. Although these models have been utilized to forecast DF/DHF incidence in several countries,8-13 such an analysis has not been undertaken in an Indian situation before.

Among all candidate models, SARIMA (0,0,1) (0,1,1)12 was the most suitable predictive model in our study; it showed the highest stationary R-squared and the lowest normalized BIC, although its MAPE was not the lowest among the candidates [Table 1]. In a recent study in Brazil, a SARIMA (2,1,3) (1,1,1)12 model offered the best fit for the dengue incidence data.8 However, in a previous study by Luz et al.,10 for monitoring dengue incidence in Rio de Janeiro, Brazil, no seasonal differencing was reported and a SARIMA (2,0,0) (1,0,0)12 model was deemed the best fit. Choudhury et al.9 reported SARIMA (1,0,0) (1,1,1)12 as the most suitable model for forecasting dengue incidence
in Dhaka, Bangladesh. Separate studies undertaken to forecast DF/DHF incidence in northern, southern, and north-eastern Thailand have yielded SARIMA (2,0,1) (0,2,0)12, ARIMA (1,0,1), and SARIMA (2,1,0) (0,1,1)12 models as most suitable.11-13

Our findings corroborated that DF/DHF cases followed a seasonal pattern during the past decade 2001–10. The model indicated that there would again be a seasonal spurt in these cases with a peak in October 2011. This is also consistent with the data from previous years with regard to the timing of the peak.

However, the predictions may not be credible for forecasting the number of dengue cases in epidemic years, as an epidemic could be a consequence of a lack of immunity in a population exposed for the first time to a given dengue viral serotype.8

More importantly, meteorological factors such as temperature, humidity, and rainfall have considerable impact on dengue transmission, and climate variables introduced into models can increase their predictive power.18

The forecasting models are based on reported cases, which represent severe, laboratory-confirmed cases of DHF/DSS admitted to hospitals. ARIMA modeling is a useful tool for interpreting surveillance data and forecasting cases to help guide timely prevention and control measures. In addition, the usefulness of forecasting expected numbers of infectious disease cases may lie in providing decision-makers a clearer idea of the variability to be expected among future observations.19

Further research is recommended to evaluate the effectiveness of integrating the forecasting model into the existing disease control program in terms of its impact in reducing the disease occurrence.

Figure 3: Observed and forecast incident cases of dengue fever/dengue hemorrhagic fever (DF/DHF), Rajasthan, 2001–2011

References
1. World Health Organization. Dengue and dengue haemorrhagic fever. Factsheet No 117, revised May 2008. Geneva: World Health Organization; 2008. Available from: http://www.who.int/mediacentre/factsheets/fs117/en/. [Last accessed on 2011 Apr 8].
2. World Health Organization. Situation update of dengue in the SEA Region, Geneva: World Health Organization; 2010. Available from: http://www.searo.who.int/LinkFiles/Dengue_Dengue_update_SEA_2010.pdf. [Last accessed on 2011 Apr 8].
3. World Health Organization and the Special Programme for Research and Training in Tropical Diseases (TDR). Dengue: Guidelines for diagnosis, treatment, prevention and control- New edition. Geneva: World Health Organization; 2009. Available from: http://whqlibdoc.who.int/publications/2009/9789241547871_eng.pdf. [Last accessed on 2011 Apr 8].
4. Soebiyanto RP, Adimi F, Kiang RK. Modeling and predicting seasonal influenza transmission in warm regions using climatological parameters. PLoS One 2010;5:e9450.
5. Wangdi K, Singhasivanon P, Silawan T, Lawpoolsri S, White NJ, Kaewkungwal J. Development of temporal modelling for forecasting and prediction of malaria infections using time-series and ARIMAX analyses: A case study in endemic districts of Bhutan. Malar J 2010;9:251.
6. Loha E, Lindtjørn B. Model variations in predicting incidence of Plasmodium falciparum malaria using 1998-2007 morbidity and meteorological data from south Ethiopia. Malar J 2010;9:166.
7. Tian L, Bi Y, Ho SC, Liu W, Liang S, Goggins WB, et al. One-year delayed effect of fog on malaria transmission: A time-series analysis in the rain forest area of Mengla County, south-west China. Malar J 2008;7:110.
8. Martinez EZ, Silva EA. Predicting the number of cases of dengue infection in Ribeirão Preto, São Paulo State, Brazil, using a SARIMA model. Cad Saude Publica 2011;27:1809-18.
9. Choudhury MA, Banu S, Islam MA. Forecasting dengue incidence in Dhaka, Bangladesh: A time series analysis. Dengue Bull 2008;32:29-37.
10. Luz PM, Mendes BV, Codeço CT, Struchiner CJ, Galvani AP. Time series analysis of dengue incidence in Rio de Janeiro, Brazil. Am J Trop Med Hyg 2008;79:933-9.
11. Silawan T, Singhasivanon P, Kaewkungwal J, Nimmanitya S, Suwonkerd W. Temporal patterns and forecast of dengue infection in Northeastern Thailand. Southeast Asian J Trop Med Public Health 2008;39:90-8.
12. Wongkoon S, Pollar M, Jaroensutasinee M, Jaroensutasinee K. Predicting DHF incidence in Northern Thailand using time series analysis technique. International Journal of Biological and Life Sciences 2008;4:117-121.
13. Promprou S, Jaroensutasinee M, Jaroensutasinee K. Forecasting dengue haemorrhagic fever cases in southern Thailand using ARIMA models. Dengue Bull 2006;30:99-106.
14. Peter JD. Time Series: A biostatistical introduction. Oxford Statistical Science Series-5. 1990.
15. Box GE, Jenkins GM. Time series analysis: Forecasting and control. San Francisco: Holden-Day; 1976.
16. Helfenstein U. The use of transfer function models, intervention analysis and related time series methods in epidemiology. Int J Epidemiol 1991;20:808-15.
17. Nobre FF, Monteiro AB, Telles PR, Williamson GD. Dynamic linear model and SARIMA: A comparison of their forecasting performance in epidemiology. Stat Med 2001;20:3051-69.
18. World Health Organization. Using climate to predict infectious disease outbreaks: A review. Geneva: World Health Organization; 2004.
19. Allard R. Use of time-series analysis in infectious disease surveillance. Bull World Health Organ 1998;76:327-33.

Cite this article as: Bhatnagar S, Lal V, Gupta SD, Gupta OP. Forecasting incidence of dengue in Rajasthan, using time series analyses. Indian J Public Health 2012;56:281-5.
Source of Support: Nil. Conflict of Interest: No.
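
For readers who prefer the operator form, the SARIMA (0,0,1) (0,1,1)12 structure selected in the Results can be written out as below. This is only a sketch: it assumes the Box–Jenkins sign convention and no constant term, neither of which the article states. The symbols y_t (monthly DF/DHF count), B (backshift operator), θ1, Θ1 and ε_t are notation introduced here for illustration. Software packages may report MA coefficients with the opposite sign.

```latex
% B is the backshift operator: B y_t = y_{t-1}
% Seasonal difference (D = 1, s = 12) on the left; nonseasonal MA(1) and seasonal MA(1) on the right
(1 - B^{12})\, y_t = (1 - \theta_1 B)\,(1 - \Theta_1 B^{12})\, \varepsilon_t

% Expanded form
y_t = y_{t-12} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \Theta_1 \varepsilon_{t-12} + \theta_1 \Theta_1 \varepsilon_{t-13}
```

Under this form, each month's count is anchored to the same month of the previous year, which matches the 12-month differencing argued for in the Results.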
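
The identification and adequacy checks described in Materials and Methods (seasonal differencing, inspection of autocorrelations, and the Ljung–Box statistic) can be sketched in plain Python as follows. This is a minimal illustration, not the SPSS procedure used in the study. The function names and the monthly counts are hypothetical, and the P-value step (comparing Q with a chi-square distribution) is left out because the article does not state how many lags were used.

```python
def seasonal_difference(series, period=12):
    """Return y[t] - y[t - period] for every t >= period (one seasonal difference, D = 1)."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag, computed about the series mean."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)  # zero for a constant series, which has no defined ACF
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic: Q = n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    acf = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))


# Hypothetical monthly counts (NOT the Rajasthan data): one 12-month pattern
# repeated for three years, with a single irregular month added.
pattern = [2, 1, 1, 3, 5, 9, 20, 60, 150, 300, 120, 30]
counts = pattern * 3
counts[30] += 7

diffed = seasonal_difference(counts, period=12)
print(len(diffed))              # number of observations left after differencing
print(sample_acf(counts, 12))   # raw series: look for a large value at lag 12
print(ljung_box_q(diffed, 6))   # Q statistic on the differenced series
```

A small Q relative to its chi-square reference value is what the article interprets as residuals behaving like white noise.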
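
Two of the reported numbers are easier to interpret with a little arithmetic: the MAPE used for model selection, and the 95% confidence interval around the October 2011 forecast. The sketch below is illustrative only. The observed and fitted counts are hypothetical, skipping zero-count months in the MAPE is an assumption (the article does not say how such months were handled), and the standard error is backed out assuming a symmetric normal-based interval with z = 1.96.

```python
def mape(observed, fitted):
    """Mean absolute percentage error, skipping months with zero observed cases (assumption)."""
    terms = [abs(o - f) / abs(o) for o, f in zip(observed, fitted) if o != 0]
    if not terms:
        raise ValueError("MAPE is undefined when every observed value is zero")
    return 100.0 * sum(terms) / len(terms)


def implied_standard_error(lower, upper, z=1.96):
    """Standard error implied by a symmetric interval, assuming a normal approximation."""
    return (upper - lower) / (2.0 * z)


# Hypothetical observed and fitted monthly counts (NOT the Rajasthan data).
observed = [2, 4, 100, 400]
fitted = [6, 2, 110, 380]
print(mape(observed, fitted))

# Reported forecast for October 2011: 546 cases (95% CI 311-781).
point, lower, upper = 546, 311, 781
half_width = (upper - lower) / 2
print(point - half_width, point + half_width)  # reproduces the interval limits
print(implied_standard_error(lower, upper))
```

In the hypothetical data, the month with only 2 observed cases contributes a 200% error on its own. This shows how low-count months can inflate MAPE in a strongly seasonal series even when the peak months are fitted closely.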
