
Environ Model Assess (2013) 18:559–565

DOI 10.1007/s10666-013-9364-4

Estimation of Water Demand in Iran Based on SARIMA Models
Habib Allah Mombeni · Sadegh Rezaei ·
Saralees Nadarajah · Mahsa Emami

Received: 31 August 2012 / Accepted: 4 March 2013 / Published online: 27 March 2013
© Springer Science+Business Media Dordrecht 2013

Abstract The generation of synthetic residential water demands that can reproduce essential statistical features of historical residential water consumption is essential for planning, design, and operation of water resource systems. Most residential water consumption series are seasonal and nonstationary. We employ the seasonal autoregressive integrated moving average (SARIMA) model. We fit this model to monthly residential water consumption in Iran from May 2001 to March 2010. We find that a three-parameter log-logistic distribution fits the model residuals adequately. We forecast values for 1 year ahead using the fitted SARIMA model.

Keywords Forecast · SARIMA model · Water demand

H. A. Mombeni · S. Rezaei
Amirkabir (Polytechnic) University, Tehran, Iran

S. Nadarajah
University of Manchester, Manchester, M13 9PL, UK
e-mail: saralees.nadarajah@manchester.ac.uk

M. Emami
Payam Noor University (PNU), Ahvaz, Iran

1 Introduction

The scarcity of water resources is a crucial problem in almost all contemporary societies. Even in areas where there are adequate quantities of water, the problem of scarcity is usually confronted by deterioration of water quality, resulting in increasing costs for certain (mainly indoor) water uses. The problem of water scarcity has manifested itself in different ways in recent years. The most common ways are (1) increased cost of water usage, (2) intensified competition over access to water resources, and (3) social insecurity (outbreak of diseases) due to the lack of water.

Urbanization, population growth, industrial development of cities, and rising living standards in the world have led to a growing trend in per capita consumption of water. Because of the economic investment needed to develop new water resources, accurate prediction of water demand is becoming more important than ever.

In the framework of water demand management, it is vital to analyze and to understand the characteristics of water demand: how demand is formulated, which factors determine it, how demand responds to changes in income and relative prices, and eventually how future demand will be shaped [6, 25]. As a result, the analysis of demand is an essential component in designing effective water demand management [5, 12].

Furthermore, the analysis of water demand is crucial in determining water prices and evaluating investment projects. Both issues are of extreme importance in the current water policy. There are several approaches to forecasting water demand [4, 19, 29]. Empirical models can take into account the effects of earlier events, and as a result, they can even explain the past fairly well [6, 16]. However, new unforeseen events will occur and the future will always appear more uncertain than the past [21]. Furthermore, as Clements and Hendry [21] note, economic forecasts end as a mixture of science (based on econometric systems that embody consolidated economic knowledge) and art, namely judgements about perturbations from recent unexpected events.

Fig. 1 The time series plot for residential water consumption in Iran (left, West Azarbaijan Province; middle, Isfahan Province; and right,
Khuzestan Province)

This short note examines water demand analysis in a region with specific characteristics such as intensified scarcity, frequent and lasting drought periods, expanding urbanization, etc. Indeed, water issues in Iran are representative of many other urban areas. We focus on residential use, since it appears to be more problematic (requiring water of high cost and of severe scarcity) because of the quality standards which have to be met. Residential water consumption includes everyday uses such as drinking, cooking, bathing, toilet flushing, washing clothes and dishes, etc.

The rate of population growth, migration for work, long periods of warm years, and the dust factor have increased water demand in Iran. So, finding an appropriate model to forecast water demand in the future is essential.

The purpose of this short note is to find a simple model for water demand. This model is not dependent on price or household income. We use known time series models and statistical properties of the data to find the best model [11, 13–15, 28].

Figure 1 shows the water consumption data. We can see that water demand is higher in warmer months. A seasonal autoregressive integrated moving average (ARIMA) model is therefore needed. The seasonal time series ARIMA (SARIMA) model was introduced by Box and Jenkins [9]. It has been applied successfully for forecasting economic, marketing, and social problems. While this model has the advantage of accurate forecasting over short periods, it also has the limitation that at least 50 observations are needed [9].

The results of this short note are organized as follows. Section 2 introduces the SARIMA model. The fitting of this model and the results are discussed in Section 3. Finally, Section 4 notes some conclusions.

Table 1 Parameter estimates for the three-parameter log-logistic distribution

Province          Shape parameter   Scale parameter   Location parameter
Khuzestan         0.02177           15.83             −7,541,011
Isfahan           0.01905           17.39             −35,658,510
West Azarbaijan   0.03090           16.35             −12,329,398

2 Mathematical Formulation of the SARIMA Model

A variety of different forecasting approaches are available to forecast time series data. No single model is universally applicable [3, 22]. The approach presented here is based on Box and Jenkins' ARIMA model.

2.1 SARIMA Model

Many time series data contain a seasonal periodic component. To deal with seasonality, the ARIMA model is extended to a general multiplicative SARIMA model defined as follows (Box et al. [17]):

$$\Phi_P(B^S)\,\phi_p(B)\,(1 - B^S)^D (1 - B)^d Z_t = \Theta_Q(B^S)\,\theta_q(B)\,e_t, \qquad (1)$$

where $\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$ is the autoregressive operator, $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$ is the operator of moving averages, $\Phi_P(B) = 1 - \Phi_1 B - \Phi_2 B^2 - \cdots - \Phi_P B^P$ is the seasonal autoregressive operator, $\Theta_Q(B) = 1 - \Theta_1 B - \Theta_2 B^2 - \cdots - \Theta_Q B^Q$ is the seasonal operator of moving averages, p is the order of
process AR, q is the order of process MA, d is the order of difference, P is the order of seasonal process AR, Q is the order of seasonal process MA, D is the order of seasonal differencing, S is the seasonal length, $Z_t$ is an appropriately transformed water demand in period t, $(1 - B)^d$ and $(1 - B^S)^D$ are the nonseasonal and seasonal differencing operators, respectively, B is the backward shift operator, and $e_t$ is a purely random process. If D is not zero, then seasonal differencing is involved. The model in (1) is called a SARIMA model of order $(p, d, q) \times (P, D, Q)_S$. If d is nonzero, a simple differencing can be used to remove trend. Seasonal differencing can be used to remove seasonality.

Fig. 2 ACF plot for residential water consumption in Iran (left, West Azarbaijan Province; middle, Isfahan Province; and right, Khuzestan Province)

2.2 Box and Jenkins' ARIMA Modeling Procedure

The fitting procedure of Box and Jenkins' ARIMA model involves a five-stage iterative process:

1. Preparation of data including transformations and differencing;
2. Identification of the ARIMA $(p, d, q) \times (P, D, Q)_S$ structure;
3. Estimation of the unknown parameters;
4. Goodness-of-fit tests on the estimated residuals;
5. Forecast future outcomes.

The best model could be determined by using criteria like the root mean squared error, the mean absolute error, the mean absolute percentage error, the mean percentage error, and the Akaike information criterion (AIC). In the current study, the AIC is used [10, 23, 24, 27, 28].

Fig. 3 PACF plot for residential water consumption in Iran (left, West Azarbaijan Province; middle, Isfahan Province; and right, Khuzestan Province)

3 Application to Modeling of Residential Water Consumption

We model time series data of monthly residential water consumption in Iran. We consider three provinces of Iran with different features and different weather conditions: Khuzestan Province located in southern Iran and the weather there
is dry and hot, Isfahan Province located in the center and the weather there is relatively mild, and West Azarbaijan Province located in northwestern Iran and the weather there can be cold and wet.

Fig. 4 Probability plot of residuals (left, West Azarbaijan Province; middle, Isfahan Province; and right, Khuzestan Province)

Khuzestan Province is one of the 31 provinces of Iran. It is in the southwest of the country, bordering Iraq's Basra Province and the Persian Gulf. Its capital is Ahwaz, and it covers an area of 63,238 km². According to the most recent census taken in 2011, the province had an estimated population of 4,277,998 inhabitants. The province of Khuzestan can be basically divided into two regions: the rolling hills and mountainous regions north of the Ahwaz ridge, and the plains and marsh lands to its south. The area is irrigated by the Karoun, Karkheh, Jarahi, and Maroun rivers. The climate of Khuzestan is generally hot and occasionally humid, particularly in the south, while winters are much more pleasant and dry. Summertime temperatures routinely exceed 50 °C, and in the winter, temperatures can drop below freezing, with occasional snowfall, all the way south to Ahwaz. Khuzestan Province is known to hold the hottest temperatures on record for a populated city anywhere in the world. Sandstorms and dust storms are frequent in its arid, desert-style terrain.

Isfahan Province is located in the center of Iran. Its capital is the city of Esfahan. The province of Isfahan covers an area of approximately 107,027 km². According to the census in the year 2011, the population of the province was 4,815,863, of which approximately 83.3 % were urban residents and 16.7 % resided in rural areas. The literacy rate was 88.65 %. The province experiences a moderate and dry climate on the whole, ranging between 40.6 and 10.6 °C on a cold day in the winter season. The average annual temperature has been recorded as 16.7 °C and the annual rainfall on average has been reported as 116.9 mm. The city of Esfahan, however, experiences an excellent climate, with four distinct seasons.

West Azarbaijan Province is in the northwest of Iran. Its capital city is Urmia. The province has an area of 37,059 km². It is the 13th largest province in the country and makes up 2.25 % of the total area. With a population of 2,873,459, it is the nation's eighth most populous province. The province is mainly influenced by moist air flowing from the ocean. But in some winter months, cold air masses from the north cause a significant reduction in temperature.

Fig. 5 Histograms of residuals (left, West Azarbaijan Province; middle, Isfahan Province; and right, Khuzestan Province)

The data are water consumption measurements in cubic meters for every 2 months, so we have six data points every year. Khuzestan Province and West Azarbaijan Province have 60 data points, from May 2001 to March 2010. Isfahan Province has 54 data points, from May 2002 to March 2010. The first step of the analysis is to plot the data sets. The time series plots for the three provinces are shown in Fig. 1.
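With six observations per year, the seasonal length in the notation of Section 2.1 is S = 6. As a minimal sketch (not the authors' code), the nonseasonal and seasonal differencing operators (1 − B) and (1 − B^6) from (1) can be applied with NumPy as shown here. The series `z`, its trend, and the bimonthly `pattern` are hypothetical values chosen for illustration; they are not the consumption data.

```python
import numpy as np

S = 6  # seasonal length: six bimonthly observations per year


def difference(z, lag=1):
    """Apply (1 - B^lag) to a 1-D series: w_t = z_t - z_{t-lag}."""
    z = np.asarray(z, dtype=float)
    return z[lag:] - z[:-lag]


def regular_and_seasonal_difference(z, season=S):
    """Apply (1 - B)(1 - B^season), i.e. d = 1 and D = 1."""
    return difference(difference(z, lag=season), lag=1)


# Hypothetical series: linear trend plus a fixed bimonthly pattern
# (illustrative numbers only, not the paper's data).
t = np.arange(60)
pattern = np.array([0.0, 2.0, 5.0, 3.0, -1.0, -2.0])
z = 100.0 + 0.5 * t + pattern[t % S]

w = regular_and_seasonal_difference(z)
print(len(w))             # number of values left after differencing
print(np.allclose(w, 0))  # True when trend and fixed pattern are removed
```

Each application of (1 − B^k) shortens the series by k values, so a 60-point series leaves 53 values after both operators. For a purely deterministic trend and seasonal pattern, as in this toy series, the differenced values are all zero; for real data they form the series to which a stationary ARMA model is fitted.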

Table 2 Parameter estimates of the SARIMA model fitted to residential water consumption in Iran

Province          Parameter   Estimate   Standard error   T       p value
Khuzestan         φ1          0.4421     0.1379           3.21    0.002
                  Φ1          −0.5227    0.1351           −3.87   0.000
Isfahan           θ1          −0.4428    0.1490           −2.97   0.005
                  Φ1          0.4367     0.1502           2.91    0.006
West Azarbaijan   θ1          0.8402     0.0784           10.72   0.000
                  Θ1          0.6184     0.1317           4.69    0.000

The main reason for careful modeling of the data is to generate synthetic residential water demand for the future. This requires a realistic distributional model for the residuals. After exploring a number of possible distributions, we found that a three-parameter log-logistic distribution performed fairly well. The three-parameter log-logistic distribution is also known as the generalized log-logistic distribution or the shifted log-logistic distribution [2, 8]. It can be obtained from the log-logistic distribution by addition of a shift parameter δ: if X has a log-logistic distribution, then X + δ has a shifted log-logistic distribution. In other words, Y has a shifted log-logistic distribution if log(Y − δ) has a logistic distribution. The shift parameter adds a location parameter to the scale and shape parameters of the (unshifted) log-logistic distribution. The properties of this distribution are straightforward to derive from those of the log-logistic distribution. However, an alternative parameterization, similar to that used for the generalized Pareto and generalized extreme value distributions, gives more interpretable parameters and also aids estimation.

The three-parameter log-logistic distribution has its probability density function and cumulative distribution function specified by [7, 20]:

$$f(x) = \frac{\alpha}{\beta}\left(\frac{x - \gamma}{\beta}\right)^{\alpha - 1}\left[1 + \left(\frac{x - \gamma}{\beta}\right)^{\alpha}\right]^{-2}$$

and

$$F(x) = \left[1 + \left(\frac{x - \gamma}{\beta}\right)^{-\alpha}\right]^{-1},$$

respectively, where α is the shape parameter, β is the scale parameter, and γ is the location parameter (γ = 0 yields the two-parameter log-logistic distribution). The three-parameter log-logistic distribution is used in hydrology for modeling flood frequency [1, 16, 26].

We compared the results of fitting different distributions by the Kolmogorov–Smirnov and Anderson–Darling statistics. We found that the three-parameter log-logistic distribution gave the best fit. Probability plots of the residuals and histograms of the residuals showing the best fitting log-logistic distribution [17, 18, 29, 30] are shown in Figs. 4 and 5, respectively. The parameter estimates of the log-logistic distribution are given in Table 1.

In Figs. 2 and 3, we show the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the water consumption data for the three provinces. The figures show the presence of a nonstationary pattern. So, differencing is necessary to produce a new series that is compatible with the assumption of stationarity. The data were preprocessed using first-order regular differencing and first-order seasonal differencing in order to remove trends and seasonality. We then applied the theory of stationary processes for the new series and hence for the original process. Figures 4 and 5 show no evidence of serial correlation.

For residential water consumption in Khuzestan Province, the fitted model with the lowest AIC is SARIMA$(1, 1, 0)(1, 1, 0)_6$. The AIC value is 0.48. For residential water consumption in Isfahan Province, the fitted model with the lowest AIC is SARIMA$(0, 1, 1)(1, 1, 0)_6$. The AIC value is −0.39. For residential water consumption in West Azarbaijan Province, the fitted model with the lowest AIC is SARIMA$(0, 1, 1)(0, 1, 1)_6$. The AIC value is −0.42. Table 2 summarizes the estimates of the fitted models.
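For readers who wish to evaluate the three-parameter log-logistic distribution numerically, the following sketch implements the density f(x), the distribution function taken as F(x) = [1 + ((x − γ)/β)^(−α)]^(−1), and the quantile function obtained by inverting F. The function names and the parameter values in the example are assumptions made for illustration; they are not the estimates in Table 1, and this is not the authors' fitting code.

```python
import numpy as np


def llogis3_pdf(x, alpha, beta, gamma=0.0):
    """Density f(x) of the three-parameter log-logistic; zero for x <= gamma."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    inside = x > gamma
    z = (x[inside] - gamma) / beta
    out[inside] = (alpha / beta) * z ** (alpha - 1.0) / (1.0 + z ** alpha) ** 2
    return out


def llogis3_cdf(x, alpha, beta, gamma=0.0):
    """Distribution function F(x); zero for x <= gamma."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    inside = x > gamma
    z = (x[inside] - gamma) / beta
    out[inside] = 1.0 / (1.0 + z ** (-alpha))
    return out


def llogis3_ppf(u, alpha, beta, gamma=0.0):
    """Quantile function: solve F(x) = u for x, with 0 <= u < 1."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return gamma + beta * (u / (1.0 - u)) ** (1.0 / alpha)


# Hypothetical parameter values (shape, scale, location) for illustration.
alpha, beta, gamma = 3.0, 2.0, 1.0

x = np.linspace(1.5, 6.0, 10)
u = llogis3_cdf(x, alpha, beta, gamma)
print(np.allclose(llogis3_ppf(u, alpha, beta, gamma), x))  # round trip F^-1(F(x))

# Inverse-transform sampling: push uniform draws through the quantile function.
rng = np.random.default_rng(seed=0)
draws = llogis3_ppf(rng.uniform(size=1000), alpha, beta, gamma)
```

In this parameterization the median is γ + β, because F(γ + β) = 1/2 for any shape α. Draws generated this way could serve as synthetic residuals once the parameters have been estimated.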

Fig. 6 Comparison of mean for simulated vs. observed (left, West Azarbaijan Province; middle, Isfahan Province; and right, Khuzestan Province)

Fig. 7 Comparison of standard deviation for simulated vs. observed (left, West Azarbaijan Province; middle, Isfahan Province; and right,
Khuzestan Province)
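The simulated series behind Figs. 6 and 7 can be produced by running a fitted model forward as a recursion. The sketch below does this for the Khuzestan estimates in Table 2 (nonseasonal AR coefficient 0.4421, seasonal AR coefficient −0.5227) under a SARIMA(1,1,0)(1,1,0) structure with seasonal length 6. Several ingredients are assumptions and are not taken from the paper: Gaussian innovations with an arbitrary scale stand in for the log-logistic residual model, the starting values `z_init` are invented, and the function name `simulate_sarima_110_110` is hypothetical.

```python
import numpy as np

PHI = 0.4421        # nonseasonal AR(1) estimate, Khuzestan (Table 2)
SEAS_PHI = -0.5227  # seasonal AR(1) estimate, Khuzestan (Table 2)
S = 6               # six bimonthly observations per year


def simulate_sarima_110_110(z_init, innovations, phi=PHI, seas_phi=SEAS_PHI, s=S):
    """Extend z_init under (1 - seas_phi B^s)(1 - phi B)(1 - B^s)(1 - B) Z_t = e_t."""
    z = [float(v) for v in z_init]
    n0 = len(z)
    if n0 < 2 * (s + 1):
        raise ValueError("need at least 2 * (s + 1) starting values")
    # w_t = (1 - B)(1 - B^s) Z_t; undefined for the first s + 1 time points.
    w = [np.nan] * (s + 1)
    for t in range(s + 1, n0):
        w.append(z[t] - z[t - 1] - z[t - s] + z[t - s - 1])
    for e in innovations:
        t = len(z)
        # Expanded AR polynomial: 1 - phi B - seas_phi B^s + phi*seas_phi B^(s+1).
        w_t = phi * w[t - 1] + seas_phi * w[t - s] - phi * seas_phi * w[t - s - 1] + e
        # Undo the differencing: Z_t = Z_{t-1} + Z_{t-s} - Z_{t-s-1} + w_t.
        z.append(z[t - 1] + z[t - s] - z[t - s - 1] + w_t)
        w.append(w_t)
    return np.array(z[n0:])


rng = np.random.default_rng(seed=1)
# Hypothetical start-up values on a transformed scale (not the paper's data).
t0 = np.arange(2 * (S + 1))
z_init = 10.0 + 0.3 * np.sin(2 * np.pi * t0 / S)

burn_in, years = 2 * S, 10
e = rng.normal(scale=0.05, size=burn_in + years * S)  # assumed Gaussian innovations
sim = simulate_sarima_110_110(z_init, e)[burn_in:]    # discard the initial years

by_period = sim.reshape(years, S)     # rows: years, columns: position in the year
print(by_period.mean(axis=0))         # simulated mean per bimonthly period
print(by_period.std(axis=0, ddof=1))  # simulated standard deviation per period
```

The per-period means and standard deviations computed at the end are the kind of summary that can be set against the observed values. Replacing the Gaussian draws with draws from the fitted residual distribution would bring the sketch closer to the procedure described in the text.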

The fitted SARIMA$(1,1,0)(1,1,0)_6$, SARIMA$(0,1,1)(1,1,0)_6$, and SARIMA$(0,1,1)(0,1,1)_6$ models are

$$(1 + 0.5227B^6)(1 - 0.4421B)(1 - B^6)(1 - B)Z_t = e_t,$$

$$(1 - 0.4367B^6)(1 - B^6)(1 - B)Z_t = (1 + 0.4428B)e_t,$$

and

$$(1 - B^6)(1 - B)Z_t = (1 - 0.6184B^6)(1 - 0.3315B)e_t,$$

respectively, where $e_t$ is a purely random process.

Although the fitted models are seasonal, the residuals should be stationary, so the standard confidence limits still apply. The residual plots did not show any pattern, providing evidence that the fitted SARIMA models are adequate.

Finally, we perform a simulation study to test the validity of the fitted models. We simulate 10 years of water demand from the fitted models. It is customary to simulate several extra years and throw out the initial years. We then compute the mean and standard deviation of water demand from the simulated data. Figures 6 and 7 compare these with the observed values. It is clear that the fitted model closely reproduces the main statistical characteristics, indicating that the model can be used for predicting future water demand. The forecast values for 1 year ahead are shown in Table 3.

Table 3 Forecast values for residential water consumption in Iran

Year   Month       Khuzestan    Isfahan      West Azarbaijan
2011   May         33,245,451   38,481,335   18,693,153
       July        35,033,334   41,982,426   19,689,571
       September   35,589,043   47,935,583   22,637,376
       November    32,964,526   42,972,561   21,166,657
       January     30,871,814   35,882,957   19,102,702
       March       31,039,135   33,678,502   17,851,194

4 Conclusions

Generation of synthetic residential water demand is important for planning, design, and operation of water resources systems. SARIMA models are powerful tools for modeling periodic hydrologic series in general and residential water demand series in particular. In this short note, we have applied SARIMA models to water demand data from Iran. The best fitting models were found to be SARIMA$(1,1,0)(1,1,0)_6$, SARIMA$(0,1,1)(1,1,0)_6$, and SARIMA$(0,1,1)(0,1,1)_6$ with residuals described by a three-parameter log-logistic distribution.

The methodology presented is useful for modeling and prediction of residential water demand. The results allow practitioners and planners to explore realistic decision-making scenarios for designing effective water demand management.

Acknowledgments The authors would like to thank the Editor for careful reading and for comments which greatly improved the paper.

References

1. Robson, A., & Reed, D. (1999). Statistical procedures for flood frequency estimation. In: Flood estimation handbook (Vol. 3). Wallingford: Institute of Hydrology.
2. Geskus, B. (2001). Methods for estimating the AIDS incubation time distribution when date of seroconversion is censored. Statistics in Medicine, 20, 795–812.
3. Bowerman, B.L., & O'Connell, R.T. (1993). Forecasting and time series: an applied approach. Belmont: Duxbury Press.
4. Granger, C.W. (1980). Forecasting methods. New York: Academic.
5. Baumann, D.D., Boland, J.J., Hanemann, W.M. (1998). Urban water demand management and planning. New York: McGraw-Hill.
6. Arbues, F., Garcia-Valinas, M.A., Martinez-Espineira, R. (2003). Estimation of residential water demand: a state-of-the-art review. Journal of Socio Economics, 32, 81–102.
7. Ashkar, F., & Mahdi, S. (2006). Fitting the log-logistic distribution by generalized moments. Journal of Hydrology, 328, 694–703.
8. Venter, G. (1994). Introduction to selected papers from the variability in reserves prize program. Casualty Actuarial Society Forum, 1, 91–101.
9. Box, G.E.P., & Jenkins, G.M. (1976). Time series analysis: forecasting and control. San Francisco: Holden-Day.
10. Akaike, H. (1973). Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika, 60, 255–266.
11. Chen, H.-L., & Rao, A.R. (2002). Testing hydrologic time series for stationarity. Journal of Hydrologic Engineering, 7, 129–136.
12. Briscoe, J. (1997). Managing water as an economic good. In M. Kay, T. Franks, L. Smith (Eds.), Water: economics, management and demand (pp. 339–361). London: E & F Spon.
13. Salas, J.D. (1993). Analysis and modeling of hydrologic time series. In D.R. Maidment (Ed.), Handbook of hydrology (Sections 19.5–19.9). New York: McGraw-Hill.
14. Salas, J.D., & Obeysekera, J.T.B. (1992). Conceptual basis of seasonal streamflow time series models. Journal of Hydraulic Engineering, 118, 1186–1194.
15. Hamilton, J.D. (1994). Time series analysis. Princeton: Princeton University Press.
16. Dalhuisen, J.M., Florax, R.J.G.M., de Groot, H.L.F., Nijkamp, P. (2003). Price and income elasticities of residential water demand: a meta-analysis. Land Economics, 79, 292–308.
17. Alkasasbeh, M., & Raqab, M.Z. (2009). Estimation of the generalized distribution parameters: comparative study. Statistical Methodology, 8, 262–279.
18. Ahmad, M.I., Sinclair, C.D., Werritty, A. (1987). A log-logistic flood frequency analysis. Journal of Hydrology, 98, 205–224.
19. Chambers, M.J. (1990). Forecasting with demand systems: a comparative study. Journal of Econometrics, 44, 363–376.
20. Shoukri, M.M., Mian, I.U.M., Tracy, D.S. (1988). Sampling properties of estimators of the log-logistic distribution with application to Canadian precipitation data. Canadian Journal of Statistics, 16, 223–236.
21. Clements, M.P., & Hendry, D.F. (2003). Economic forecasting: some lessons from recent research. Economic Modeling, 20, 301–329.
22. Brockwell, P.J., & Davis, R.A. (2001). Introduction to time series and forecasting (pp. 22–60). New York: Springer.
23. Brockwell, P.J., & Davis, R.A. (1991). Time series: theory and methods (2nd Ed.). New York: Springer.
24. Brockwell, P.J. (2000). Continuous-time ARMA processes. In C.R. Rao & D.N. Shanbhag (Eds.), Stochastic processes: theory and methods. Handbook of statistics (Vol. 19, pp. 249–276). Amsterdam: North-Holland.
25. Griffin, R., & Sickles, R. (2001). Demand specification for municipal water management: evaluation of the Stone-Geary form. Land Economics, 77, 399–422.
26. Hosking, J.R.M., & Wallis, J.R. (1997). Regional frequency analysis: an approach based on L-moments. New York: Cambridge University Press.
27. Tsay, R.S., & Tiao, G.C. (1984). Consistent estimates of autoregressive parameters and extended sample autocorrelation function for stationary and nonstationary ARMA models. Journal of the American Statistical Association, 79, 84–96.
28. Tsay, R.S. (2005). Analysis of financial time series (2nd Ed.). New York: Wiley.
29. Gardiner, V., & Herrington, P. (1986). The basis and practices of water demand forecasting. Norwich: GeoBooks.
30. Singh, V.P., & Ahmad, M.A. (2004). Comparative evaluation of the estimators of the three-parameter generalized Pareto distribution. Journal of Statistical Computation and Simulation, 74, 91–106.
31. Wei, W.W. (2006). Time series analysis, univariate and multivariate methods (2nd Ed.). New York: Pearson.
