Original Title: Markov vs Arima

Uploaded by DenBagoes

requirements for the award of the degree of

Master of Engineering (Civil Hydraulic & Hydrology)

Universiti Teknologi Malaysia

JANUARY 2012

DEDICATION

Mr. Muhammad bin Ismail

and

Madam Siti Maznah binti Abdullah

and

My inspiration

throughout the entire creation of this thesis.

ACKNOWLEDGEMENT

Assalammualaikum w.b.t.

Alhamdulillah, all praise to Allah S.W.T for the gift of life and for what I have achieved today.

Appreciation goes to my family for their prayers and their moral and financial support. May Allah reward you abundantly.

My sincere and deepest gratitude goes to my supervisor, Dr. Sobri Harun, for his guidance, encouragement and support in completing this master project.

My gratitude also goes to Dr. Muhammad Askari for his invaluable suggestions, guidance, and encouragement.

Last but not least, to all my lecturers, classmates and friends: their help and support are greatly appreciated and will be remembered forever, InsyaALLAH. Thank you all.

ABSTRACT

Streamflow forecasting plays an important role in flood mitigation and in water resources allocation and management. Inaccurate forecasts cause losses to water resources managers and users. The suitability of a forecasting method depends on the type and amount of available data. Thus, the objectives of this study are to propose streamflow forecasting methods using Markov and ARIMA models and to assess the forecasting accuracy of both models. Streamflow data of Sungai Bernam, Selangor were used. Minitab and Microsoft Excel were used to build the ARIMA and Markov models respectively. The performance criteria used in this study to assess the forecasting accuracy of the models were the Mean Absolute Percentage Error (MAPE), the Root Mean Squared Error (RMSE) and the Chi-square test of normality. The tentative ARIMA model that best fits the criteria and meets the requirements is ARIMA(1,1,1)(0,1,1)₁₂. Based on the performance criteria, the ARIMA model forecasts better than the Markov model in this study. Therefore, the ARIMA model is able to predict the future monthly streamflow of Sungai Bernam accurately.

ABSTRAK

River flow forecasting plays an important role in flood control and water management. Inaccurate forecasts cause losses to water resources managers and to users. The suitability of a forecasting method depends on the type and amount of available data. Hence, the objectives of this study are to propose river flow forecasting methods using Markov and ARIMA models and to examine the forecasting accuracy of the Markov and ARIMA models. Streamflow data of Sungai Bernam were used. Minitab was used to build the ARIMA model and Microsoft Excel was used to build the Markov model. The performance criteria used in this study to examine the forecasting accuracy of the different models were the Mean Absolute Percentage Error (MAPE), the Root Mean Squared Error (RMSE) and the Chi-squared test. The tentative ARIMA model that best fits the criteria and meets the requirements is ARIMA(1,1,1)(0,1,1)₁₂. Based on the performance criteria, the ARIMA model forecasts better than the Markov model. Therefore, the ARIMA model is able to predict the future streamflow of Sungai Bernam accurately.

TABLE OF CONTENTS

DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES
LIST OF ABBREVIATIONS

CHAPTER 1 INTRODUCTION
1.1 Background of Study
1.2 Problem Statement
1.3
1.4
1.5 Scope of Study

CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2
2.3
2.4
2.4.1 Markov Model
2.4.2 ARIMA Theory
2.4.3 ARIMA Algorithms
2.4.3.1 AR Model
2.4.3.2 MA Model
2.4.3.3 ARMA Model
2.4.3.4 ARIMA Model
2.5
2.6
2.7 Concluding Remarks

CHAPTER 3 METHODOLOGY
3.1 Introduction
3.2 Markov Model
3.2.1
3.2.2 Identification of Distribution
3.2.3
3.2.4
3.3 ARIMA Model
3.3.1 Model Assumptions
3.3.1.1 Data Stationarity
3.3.1.2 Normal Distribution
3.3.1.3 Outlier
3.3.1.4 Missing Data
3.3.2 Model Procedure
3.3.2.1 Model Identification
3.3.2.2 Parameter Estimation
3.3.2.3 Diagnostic Checking
3.3.3
3.4 Minitab Procedure

CHAPTER 4
4.1 Introduction
4.2
4.3 Markov Model
4.3.1
4.3.2 Identification of Distribution
4.3.3
4.3.4
4.3.5
4.4 ARIMA Model
4.4.1 Model Identification
4.4.2 Parameter Estimation
4.4.3 Diagnostic Checking
4.4.4
4.4.5

CHAPTER 5
5.1 Conclusion
5.2 Recommendations

REFERENCES
APPENDICES A-G

LIST OF FIGURES

2.1 Value of time series with forecast function at 50% probability limits
4.17 Model Comparison


LIST OF ABBREVIATIONS

ACF      Autocorrelation Function
AD       Anderson Darling
AR       Autoregressive
ARIMA    Autoregressive Integrated Moving Average
DF       Degree of Freedom
K-S      Kolmogorov-Smirnov
LSE
MA       Moving Average
MAPE     Mean Absolute Percentage Error
PACF     Partial Autocorrelation Function
RMSE     Root Mean Squared Error
R²       Coefficient of Determination
         Standard Deviation
SE       Standard Error
Sg.      Sungai
χ²       Chi-square

CHAPTER 1

INTRODUCTION

1.1

Background of Study

Predictions of future events and conditions are called forecasts, and the act of making such predictions is called forecasting. In many types of organizations, forecasting is very important because predictions of future events must be incorporated into the decision-making process. In forecasting events that will occur in the future, a forecaster must rely on information concerning events that have occurred in the past.

that can be used to describe it. Then, this pattern is extrapolated or extended into the

future. This forecasting technique rests on the assumption that the pattern that has been

identified will continue in the future to give good predictions. If the data pattern that has

been identified does not persist in the future, this indicates that the forecasting technique

used is likely to produce inaccurate predictions (Bowerman and O'Connell, 1993).

Most forecasting problems involve the use of time series data. In this study, time series are used to prepare forecasts. A time series is formed from measurements of a variable taken at regular intervals over time. It is a stochastic process, which amounts to a sequence of random variables. Hydrologic streamflow data fall under the category of time series (Gupta, 1989). Future values of a time series can be forecast from its current and past values, and this approach can be used to forecast streamflow (Box and Jenkins, 1976). Time series plots can reveal patterns such as randomness, trends, level shifts, periods or cycles, unusual observations, or a combination of these.

Streamflow forecasting plays an important role in flood mitigation and in water resources allocation and management. In water management, high-quality streamflow forecasts and their efficient use can give considerable economic and social benefits. Short-term forecasting, such as hourly and daily forecasting, is crucial for flood warning and defense, while long-term forecasting, which is based on monthly, seasonal or annual time series, is very useful for reservoir operation, irrigation management decisions, drought mitigation and managing river treaties (Shalamu, 2009).

Recently, owing to the increase in data availability from metering stations, real-time data retrieval, and increasing computational capability together with the development of more robust methods and computer techniques, time series models have become quite popular in streamflow forecasting (Wang, 2006). A considerable number of forecasting models and methodologies have been developed and applied in streamflow forecasting because of the importance of hydrologic forecasting. In this study, Markov and ARIMA models are used to model monthly streamflow processes.

The Markov process considers that the value of streamflow at one time is correlated with the value of the streamflow at an earlier period (i.e. a serial correlation or autocorrelation exists in the time series). In a first-order Markov process, this correlation exists between two successive values of the events (Gupta, 1989).

The first-order Markov model states that the value of a variable x in one time period depends on the value of x in the preceding time period plus a random component. Thus, the synthetic streamflow represents a sequence of numbers, each of which consists of two parts, a deterministic part and a random part (Gupta, 1989).

method of Box-Jenkins time series is accurate for short-term forecasting but less accurate for long-term forecasting. Usually, the forecast tends to become flat over a sufficiently long period. The ARIMA model ignores independent variables completely and uses past and present values of the dependent variable to produce accurate short-term forecasts (Hendranata, 2003).

the dependent. The purpose of this model is to determine good statistical relationships between the variables being predicted and the historical values of these variables, so that forecasting can be performed with the model (Hendranata, 2003).

1.2

Problem Statement

Many time series forecasting methods can be used to predict streamflow. However, not all of these methods produce accurate forecasts. Inaccurate forecasts cause losses to water resources managers and users. The suitability of a forecasting method depends on the type and amount of available data. The ARIMA and Markov models must therefore be examined to determine their ability to provide accurate and reasonable monthly streamflow forecasts. Through statistical methods, the accuracy of both models in forecasting monthly streamflow is tested and evaluated. The ARIMA and Markov modeling approaches were applied to the data set to further investigate the behavioral change in the streamflow. The results of the study can be used as a reference for flood control, as Markov and ARIMA models are best suited to short-term forecasting.

1.3

reservoir operation management. Stochastic data generation aims to provide alternative hydrologic data sequences that are likely to occur in the future, in order to assess the reliability of alternative system designs and policies and to understand the variability in future system performance. It is also very important to develop a stochastic hydrologic model to generate monthly streamflows and thus to estimate future streamflows. Through this model, it is hoped that the problem of water shortage can be reduced. Forecasting can also be used to give warning of extreme events such as drought (Joomizan, 2010).

1.4

The aim of this paper is to forecast streamflow by using appropriate time series

modeling approach. To achieve this aim, the following objectives have been identified:

models.

1.5

Scope of Study

In this study, two time series models, the Markov model and the ARIMA model, are used to predict the behavior of streamflow. Streamflow data of Sungai Bernam, Selangor for the period 1960 to 2010 were used to apply the models. The study area, located in southeast Perak and northeast Selangor, is a semi-developed area of 186 km².

Monthly streamflow data were obtained from station Sg. Bernam at Tanjung Malim (Station No. 3615412) and were collected from the Department of Irrigation and Drainage, Kuala Lumpur. Minitab 15 is used for the ARIMA model and Microsoft Excel is used for the Markov model.

CHAPTER 2

LITERATURE REVIEW

2.1

Introduction

Generally, surface water hydrology is the basis for engineering design and water sources. High streamflow may cause disasters such as floods and erosion, and short-term forecasting is needed to control them. Meanwhile, low streamflow can disrupt water supply for domestic users, industry, hydroelectric power generation and irrigation; here, long-term forecasting is useful. Therefore, the ability to forecast streamflow accurately is valuable in water flow management and flood control.

Time series have long been modeled and forecast using different statistical methods. Commonly used time series forecasting models are ARIMA, moving average, exponential smoothing, regression analysis, and Fourier series analysis. In this study, Markov and ARIMA models are used to predict monthly streamflow.

2.2

variable of interest (Montgomery et al., 2008). Time series models have become popular

in recent years since the publication of the book by Box and Jenkins (1970), and the

subsequent development of computer software for applying these models (Bell, 1984).

The time can be a discrete value, a time interval or a continuous function. The

hydrologic data of streamflows, precipitation, groundwater or lake levels, water

temperatures, or oxygen concentration fall under the category of time series. These data

can be deterministic, random, or a combination of the two (Gupta, 1989).

the observations are assumed to be independent. However, a great deal of data in

business, economics, engineering and natural sciences occur in the form of time series

where observations are dependent. The systematic approach available for answering the

mathematical and statistical questions posed by these series of dependent observations is

called time series analysis. The objective of time series analysis is generally to

understand and identify the stochastic process that produced the observed series and then

to forecast future values of a series from past values alone (Akgun, 2003).

known as the serial correlation coefficient or the autocorrelation coefficient. This parameter indicates the dependence between successive values of a time series. The coefficient is determined for successive values (elements) and also for elements that are various time intervals apart; this interval is known as the lag period. A graph of the autocorrelation coefficient against the lag period is known as the correlogram. If a correlogram shows zero or nearly zero values for all lag periods, the process is purely random. A value close to 1 suggests a dominating deterministic process (Gupta, 1989).
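As an illustration of the correlogram described above, the following minimal Python sketch computes autocorrelation coefficients for a range of lag periods. The function names are hypothetical, the estimator is the common sample form (lagged cross-products about the overall mean divided by the total sum of squares), which may differ in detail from the one used by Gupta (1989), and the ±2/√n band used to judge "nearly zero" is a rule of thumb assumed here rather than taken from the thesis.

```python
import math
import random


def autocorrelation(series, lag):
    """Sample lag-k autocorrelation coefficient about the overall mean."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[i] - mean) * (series[i + lag] - mean)
                for i in range(n - lag))
    return numer / denom


def correlogram(series, max_lag):
    """Autocorrelation coefficients for lag periods 1..max_lag."""
    return [autocorrelation(series, k) for k in range(1, max_lag + 1)]


def looks_random(series, max_lag):
    """Assumed rule of thumb: all coefficients lie within +/- 2/sqrt(n)."""
    bound = 2.0 / math.sqrt(len(series))
    return all(abs(r) <= bound for r in correlogram(series, max_lag))


if __name__ == "__main__":
    rng = random.Random(1)
    noise = [rng.gauss(0.0, 1.0) for _ in range(500)]   # purely random series
    trend = [0.1 * i for i in range(500)]               # deterministic series
    print("noise:", [round(r, 3) for r in correlogram(noise, 3)])
    print("trend:", [round(r, 3) for r in correlogram(trend, 3)])
```

For the deterministic trend the coefficients stay close to 1 at short lags, while for the random series they scatter around zero. A constant series has zero variance and is not handled by this sketch.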

The analysis of a time series in the frequency domain is done by the spectral

density that identifies the cyclic nature or periodicity in the series. The density indicates

the cycle in the deterministic data. In a purely random process it oscillates randomly.

The purpose of streamflow synthesis, however, is not to analyze a time series but to

generate the data based on the series. This does not require the decomposition of the

time series by the analysis above but an understanding of its statistical properties to

reproduce series of similar statistical characteristics (Gupta, 1989).

2.3

Most forecasting problems involve the use of time series data. Montgomery et al.

(2008) stated that forecasting problems are often classified as short-term, medium-term,

and long-term. Short-term forecasting problems involve predicting events only a few

time periods (days, weeks, months) into the future. Medium-term forecasts extend from

one to two years into the future, and long-term forecasting problems can extend beyond

that by many years. Short-term and medium-term forecasts are used for operations

management and development of projects while long-term forecasts can be used for

strategic planning.

In this study, Markov and ARIMA models are applied to long-term forecasting, although both are known to be best suited to short-term forecasting. Normally, short-term and medium-term forecasts are based on identifying, modeling, and extrapolating the patterns found in historical data. These historical data usually exhibit inertia and do not change very drastically. Therefore, statistical methods are very useful for short-term and medium-term forecasting (Montgomery et al., 2008).

The use at time $t$ of available observations from a time series to forecast its value at some future time can provide a basis for (1) economic and business planning, (2) production planning, (3) inventory and production control, and (4) control and optimization of industrial processes (Box et al., 1994). As originally described by Brown (1962), forecasts are usually needed over a period known as the lead time, which varies with each problem. Usually, forecasts are made at time $t$ by taking the current month's value $Y_t$ and the previous months' values $Y_1, Y_2, \ldots, Y_{t-1}$ to forecast the values $F_{t+1}, F_{t+2}, \ldots, F_{t+m}$ at future times.

accuracy of the forecasts may be expressed by calculating a convenient set of probability limits on either side of each forecast, such as 50% and 95%. This means that the realized value of the time series, when it eventually occurs, will lie within these limits with the stated probability. To illustrate, Figure 2.1 shows the values of a time series with the forecast function made from origin t for lead time l, together with the 50% probability limits.

Figure 2.1: Value of time series with forecast function at 50% probability limits

(Source: Box et al., 1994)
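To make the idea of probability limits concrete, the sketch below computes symmetric 50% and 95% limits around a single forecast. It assumes, as is common but not stated in the text above, that the forecast error is normally distributed with a known standard error; the function name and the numbers in the usage example are hypothetical.

```python
from statistics import NormalDist


def probability_limits(forecast, std_error, level):
    """Symmetric limits expected to contain the realized value with
    probability `level`, assuming a normally distributed forecast error."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. about 0.674 for 50%
    return forecast - z * std_error, forecast + z * std_error


if __name__ == "__main__":
    # Hypothetical forecast of 100 flow units with a standard error of 10.
    for level in (0.50, 0.95):
        low, high = probability_limits(100.0, 10.0, level)
        print(f"{level:.0%} limits: {low:.2f} to {high:.2f}")
```

The 95% limits are roughly three times wider than the 50% limits, since the corresponding normal quantiles are about 1.96 and 0.674.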

2.4

fully random because it has been observed that a low flow tends to follow low flow and

a high flow tends to follow high flow. The word stochastic is used to denote the

randomness in statistics but in hydrology it refers to a partial random sequence as well.

Therefore, the streamflow data that represent time series is actually involving a

stochastic process. Various stochastic processes are used for generating the hydrologic

data (Gupta, 1989).

Stochastic modeling of hydrologic time series has been widely used for planning

and management of water resources systems such as for reservoir sizing and forecasting

the occurrence of future hydrologic events. For example, stochastic models are used to

generate synthetic series of water supply that may occur in the future which are then

utilized for estimating the probability distribution of key decision parameters such as

reservoir storage size. Furthermore, stochastic models can be used for forecasting water

supplies and water demands in days, weeks, months and years in advance (Fortin et al.,

2004).

The previous rainfall and streamflow records can be utilized as model inputs for

forecasting the next time step ahead of the streamflow (Mohd Shafiek et al., 2005). This

study employs the previous streamflow records to forecast the streamflow discharge of

the following month.

Several stochastic models can be used for synthetic generation and forecasting of hydrological processes. Hydrologic processes such as monthly streamflow may be well represented by stationary linear models such as the Markov process or autoregressive (AR) and autoregressive integrated moving average (ARIMA) models. These models are usually capable of preserving the historical annual statistics, such as the mean, variance, skewness and covariance (Fortin et al., 2004). In this study, Markov and ARIMA models are used to predict future monthly streamflow.

The Markov process considers that the value of an event (i.e. streamflow) at one time is correlated with the value of the event at an earlier period (i.e. a serial correlation or autocorrelation exists in the time series). In a first-order Markov process, this correlation exists between two successive values of the events. The first-order Markov model, which constitutes the classic approach in synthetic hydrology, states that the value of a variable x in one time period depends on the value of x in the preceding time period plus a random component. Thus the synthetic flow for a stream represents a sequence of numbers, each of which consists of two parts:

$$x_i = d_i + e_i \tag{2.1}$$

where $x_i$ is the flow at the $i$th time (the $i$th number of a time series), $d_i$ is the deterministic part at the $i$th time, and $e_i$ is the random part at the $i$th time. The values of $e_i$ are tied to the historical data by ensuring that they belong to the same frequency distribution and possess similar statistical properties (mean, deviation, skewness) as the historical series (Gupta, 1989).

The various forms and combinations of the deterministic and random components are recognized as different models. The single-season (annual) flow model of lag 1 is the simplest model; it assumes that the magnitude of the current flow is significantly correlated with the previous flow value only. On the other hand, multiple-season models divide the yearly flow into seasons or months (Gupta, 1989).
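A single-season lag-one model of this kind can be sketched in a few lines of Python. The recursion used here, x(i+1) = mean + r1·(x(i) − mean) + t·S·√(1 − r1²) with t a standard normal deviate, is the widely used textbook form of the lag-one Markov generator; it is assumed for illustration and is not claimed to be the exact formulation applied in this thesis. The function name, parameters and example values are hypothetical.

```python
import math
import random


def generate_flows(mean, std, r1, n, seed=None, start=None):
    """Generate n synthetic flows with a lag-one Markov (AR(1)-type) model.

    Each value is a deterministic part (persistence from the previous flow)
    plus a random part scaled to preserve the standard deviation.
    """
    rng = random.Random(seed)
    previous = mean if start is None else start
    flows = []
    for _ in range(n):
        deterministic = mean + r1 * (previous - mean)
        random_part = rng.gauss(0.0, 1.0) * std * math.sqrt(1.0 - r1 ** 2)
        previous = deterministic + random_part
        flows.append(previous)
    return flows


if __name__ == "__main__":
    # Hypothetical statistics: mean 50, standard deviation 12, r1 = 0.6.
    print([round(q, 1) for q in generate_flows(50.0, 12.0, 0.6, 6, seed=42)])
```

Because the random part is drawn from a normal distribution, the sketch can produce negative flows when the standard deviation is large relative to the mean; in practice a transformed or skewed distribution is often used to avoid this.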

The first-order Markov model has been successfully applied to many problems. Examples include modeling sequential data using Markov chains, and solving control problems posed in the Markov decision process (MDP) framework. If the Markov model's parameters are estimated from data, the standard maximum likelihood estimates consider the first-order (single-step) transitions only. For many problems, however, the first-order conditional independence assumptions are not satisfied; as a result, the higher-order transition probabilities can be poorly approximated by the learned model (Joomizan, 2010).

The assumption of first order Markovian processes for representing the inflow

process of a reservoir has generally been considered in the literature as adequate for

most purposes. The development of models incorporating other approaches results in

extremely complex transition probability matrices (Wurbs, 2005).

introduced by Box and Jenkins (Box et al., 1994). As such, some authors refer to this modeling approach as the Box-Jenkins model. The Box-Jenkins model is a stationary time series model. A time series generated from zero-mean, finite-variance, uncorrelated variables is called a white noise series, from which many useful models can be constructed.

The ARIMA modeling is essentially an exploratory data-oriented approach that

has the flexibility of fitting an appropriate model which is adapted from the structure of

the data itself. The stochastic nature of the time series can be approximately modeled

with the aid of autocorrelation function and partial autocorrelation function; from which

information such as trend, random variables, periodic components, cyclic patterns and

serial correlation can be discovered. As a result, forecasts of the future values of the

series, with some degree of accuracy can be readily obtained (Ho and Xie, 1998).

computer technology today, the iterative model building process, and hence accurate forecasting, is made simpler by the many user-friendly statistical software packages such as SAS, Statgraphics, Statistica and Minitab. An iterative three-stage process of model identification, parameter estimation and diagnostic checking is required to determine the adequacy of the proposed model (Ho and Xie, 1998).

and moving average (MA) parts. The AR part describes the relationship between present and past observations. The MA part represents the autocorrelation structure of the error. The I part represents the level of differencing applied to the series to eliminate non-stationarity (Hasmida, 2009). The model is usually denoted by (p,d,q)(P,D,Q), where p denotes the order of the autoregressive component, d the order of differencing, q the order of the moving average component, and (P,D,Q) the corresponding seasonal components.

2.4.3.1 AR Model

The AR(p) model expresses the current value of a time series as a linear combination of p previous values and a white noise term (random shock). Bell (1984) expressed the current value of the time series in an AR(p) model as:

$$Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + a_t \tag{2.2}$$

where $\phi_1, \ldots, \phi_p$ are the AR(p) parameters, $a_t$ is the random shock at time $t$, normally distributed with zero mean and variance $\sigma_a^2$, and $p$ is the order of the AR(p) model.

Using the backshift operator $B$, defined by $B Y_t = Y_{t-1}$, (2.2) can be written as:

$$(1 - \phi_1 B - \cdots - \phi_p B^p) Y_t = a_t \tag{2.3}$$

or

$$\phi(B) Y_t = a_t.$$

2.4.3.2 MA Model

The MA(q) model expresses the current value of a time series as a linear combination of the current and q previous values of a white noise process. The (purely) moving average (MA) model is (Bell, 1984):

$$Y_t = a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} \tag{2.4}$$

or

$$Y_t = (1 - \theta_1 B - \cdots - \theta_q B^q) a_t \tag{2.5}$$

or

$$Y_t = \theta(B) a_t,$$

where $q$ is the order of the MA(q) model and the coefficients $\theta_1, \ldots, \theta_q$ are its parameters.

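2.4.3.3 ARMA Model

To increase flexibility when fitting actual time series, the autoregressive and moving average operators are combined to give the ARMA(p,q) model (Bell, 1984):

$$Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} \tag{2.6}$$

or

$$(1 - \phi_1 B - \cdots - \phi_p B^p) Y_t = (1 - \theta_1 B - \cdots - \theta_q B^q) a_t \tag{2.7}$$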

A series that is explained both by its own lagged values and by lagged noise terms is called an autoregressive moving-average model of order (p,q). This class of stationary time series models is important and useful, especially in real-life situations. If the process is stationary, a suitable ARMA model can be used to represent the data. If it is nonstationary, differencing is applied to make the series stationary, which leads to the ARIMA model (Akgun, 2003).
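The ARMA recursion can be followed step by step with a small Python sketch. It applies the difference-equation form of the ARMA(p,q) model to a given sequence of random shocks, treating all values before the start of the series as zero, which is a simplifying assumption made here. The function name and the coefficient values in the example are hypothetical.

```python
import random


def arma_from_shocks(phi, theta, shocks):
    """Build Y_t = sum(phi_i * Y_{t-i}) + a_t - sum(theta_j * a_{t-j}).

    phi and theta are lists of AR and MA parameters; values of Y and a
    before the first time step are taken as zero.
    """
    y = []
    for t, a_t in enumerate(shocks):
        ar_part = sum(p * y[t - 1 - i]
                      for i, p in enumerate(phi) if t - 1 - i >= 0)
        ma_part = sum(th * shocks[t - 1 - j]
                      for j, th in enumerate(theta) if t - 1 - j >= 0)
        y.append(ar_part + a_t - ma_part)
    return y


if __name__ == "__main__":
    rng = random.Random(0)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(8)]
    # Hypothetical ARMA(1,1) with phi_1 = 0.5 and theta_1 = 0.4.
    print([round(v, 3) for v in arma_from_shocks([0.5], [0.4], shocks)])
```

Passing an empty `theta` gives a pure AR(p) series and an empty `phi` gives a pure MA(q) series, so the same function covers the AR and MA models as special cases.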

2.4.3.4 ARIMA model

The first of these conditions implies that the series $Y_t$ following (2.6) is stationary. In practice $Y_t$ may well be nonstationary, but with a stationary first difference,

$$Y_t - Y_{t-1} = (1 - B) Y_t.$$

If $(1 - B) Y_t$ is nonstationary, we may need to take the second difference,

$$Y_t - 2Y_{t-1} + Y_{t-2} = (1 - B)[(1 - B) Y_t] = (1 - B)^2 Y_t.$$

In general, we may need to take the $d$th difference $(1 - B)^d Y_t$ (although $d$ is rarely larger than 2). Substituting $(1 - B)^d Y_t$ for $Y_t$ in (2.7) yields the ARIMA(p,d,q) model (Bell, 1984):

$$(1 - \phi_1 B - \cdots - \phi_p B^p)(1 - B)^d Y_t = (1 - \theta_1 B - \cdots - \theta_q B^q) a_t \tag{2.8}$$

or

$$\phi(B)(1 - B)^d Y_t = \theta(B) a_t.$$
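The differencing operation can be checked numerically with a short Python sketch. The helper below applies the first difference d times, so that d = 2 reproduces the second difference written out above; the function name and the sample values are hypothetical.

```python
def difference(series, d=1):
    """Apply the operator (1 - B) to the series d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


if __name__ == "__main__":
    y = [1, 4, 9, 16, 25]            # a quadratic trend
    print(difference(y, 1))          # linear after one difference
    print(difference(y, 2))          # constant after two differences
```

Each application of the difference shortens the series by one value, which is why long records are helpful when a higher order of differencing is needed.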

seasonal ARIMA(p,d,q)(P,D,Q)s model is advantageous. The seasonal time series is transformed into a stationary time series with non-periodic trend components. A multiplicative seasonal ARIMA model can be expressed as (Lee and Ko, 2011):

$$(1 - \phi_1 B - \cdots - \phi_p B^p)(1 - \Phi_1 B^s - \cdots - \Phi_P B^{Ps})(1 - B^s)^D Y_t = (1 - \theta_1 B - \cdots - \theta_q B^q)(1 - \Theta_1 B^s - \cdots - \Theta_Q B^{Qs}) a_t \tag{2.9}$$

or

$$\phi(B)\Phi(B^s)(1 - B^s)^D Y_t = \theta(B)\Theta(B^s) a_t,$$

where $D$ is the order of seasonal differencing, and $\Phi(B^s)$ and $\Theta(B^s)$ are the seasonal AR(P) and MA(Q) operators respectively, which are defined as:

$$\Phi(B^s) = 1 - \Phi_1 B^s - \cdots - \Phi_P B^{Ps}$$

$$\Theta(B^s) = 1 - \Theta_1 B^s - \cdots - \Theta_Q B^{Qs}$$

where $\Phi_1, \ldots, \Phi_P$ are the seasonal AR(P) parameters and $\Theta_1, \ldots, \Theta_Q$ are the seasonal MA(Q) parameters.
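Two pieces of the seasonal model lend themselves to a quick numerical illustration: the seasonal difference and the product of the non-seasonal and seasonal operators. The Python sketch below is illustrative only; the function names are hypothetical, operators are represented as lists of coefficients indexed by the power of B, and the parameter values 0.5 and 0.25 are made up rather than estimated from the Sungai Bernam data.

```python
def seasonal_difference(series, s, D=1):
    """Apply the operator (1 - B**s) to the series D times."""
    out = list(series)
    for _ in range(D):
        out = [out[i] - out[i - s] for i in range(s, len(out))]
    return out


def multiply_lag_polynomials(a, b):
    """Coefficients of a(B) * b(B); index k holds the coefficient of B**k."""
    product = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            product[i + j] += a_i * b_j
    return product


if __name__ == "__main__":
    # A perfectly periodic monthly pattern vanishes after one seasonal difference.
    monthly = [m % 12 for m in range(36)]
    print(seasonal_difference(monthly, 12))

    # (1 - 0.5 B)(1 - 0.25 B**12) for a monthly series (s = 12).
    theta = [1.0, -0.5]
    big_theta = [1.0] + [0.0] * 11 + [-0.25]
    combined = multiply_lag_polynomials(theta, big_theta)
    print({k: c for k, c in enumerate(combined) if c != 0.0})
```

The product shows why the model is called multiplicative: besides the terms at lags 1 and 12, a cross term appears at lag 13 whose coefficient is the product of the two parameters.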

To illustrate forecasting with ARIMA models, we shall use (2.9) written as:

$$Y_{n+l} = \varphi_1 Y_{n+l-1} + \cdots + \varphi_{p+d} Y_{n+l-p-d} + a_{n+l} - \theta_1 a_{n+l-1} - \cdots - \theta_q a_{n+l-q} \tag{2.10}$$

for $t = n + l$. We shall assume we want to forecast $Y_{n+l}$ for $l = 1, 2, \ldots$ using data $Y_n, Y_{n-1}, \ldots$. For simplicity, we are assuming for now that the data set is long enough so that we

2.5

Naadimuthu and Lee (1982) proposed a first-order, or lag-one, serially correlated inflow. This means that the inflow of each month depends only on the inflow of the previous month, forming a Markov chain. The Markov chain method is a stochastic method that can be used to produce new time series of inflow discharge based on the available time series data (Adib and Majd, 2009).

According to Heiko (2000), Markov chains are stochastic processes that can be parameterized by empirically estimating transition probabilities between discrete states in the observed systems. A Markov chain of the first order is one in which each next state depends only on the immediately preceding one. Markov chains of second or higher order are processes in which the next state depends on two or more preceding ones.
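Estimating first-order transition probabilities empirically amounts to counting how often each state is followed by each other state. The Python sketch below shows this for a sequence of discrete states; the function name is hypothetical, and the way continuous flows would be assigned to states (for example low and high flow classes) is left as an assumption outside the sketch.

```python
def transition_matrix(states, n_states):
    """Estimate first-order transition probabilities from a state sequence.

    states is a list of integer state labels in 0..n_states-1; row i of the
    result holds the estimated probabilities of moving from state i.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left in the record gets a row of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


if __name__ == "__main__":
    # Hypothetical record: 0 = low-flow month, 1 = high-flow month.
    record = [0, 0, 1, 0, 1, 1]
    for row in transition_matrix(record, 2):
        print([round(p, 3) for p in row])
```

Each non-empty row sums to one, so the matrix can be used directly to simulate a new state sequence by sampling the next state from the row of the current one.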

which families of three-parameter Weibull distributions describe monthly streamflow

probabilistically, conditioned on streamflow in the preceding month.

2.6

Tang et al. (1991) stated that the ARIMA model is only good for short-term forecasting since it builds its forecasts on previous observations. The ARIMA model needs long memory series, that is, more inputs, to provide more accurate forecasts. For long memory series, more training patterns result in more accurate forecasts. The Box-Jenkins model does not work well, or does not work at all, for short input series.

Ho and Xie (1998) showed that the ARIMA model is a viable alternative that gives satisfactory results for repairable system reliability forecasting. Ayob and Amat (2004) used ARIMA to represent water use behavior at Universiti Teknologi Malaysia. The ARIMA modeling method can also be applied to analyze water quality and rainfall-runoff data recorded over a long period for the Johor River (Hasmida, 2009).

Maia et al. (2008) demonstrated that ARIMA performed satisfactorily in forecasting interval series with either linear or non-linear behavior and is a useful forecasting alternative for interval-valued time series. However, the hybrid model combining ARIMA and an artificial neural network had better average performance.

the monthly streamflow forecasting of the Zayandehrud River in western Isfahan

province, Iran (Modarres, 2007). Nazuha (2010) used ARIMA to analyze monthly

Malaysia crude oil production. Besides that, Yurekli et al. (2004) used ARIMA to

simulate monthly maximum data of Cekerek Stream.

2.7

Concluding Remarks

hydrological process. Stochastic models can provide alternative hydrologic data

sequences that are likely to occur in the future to assess the reliability of alternative

systems designs and policies, and to understand the variability in future system

performance.

resources management. Hydrologic processes such as monthly streamflow may be well

represented by stationary linear models such as Markov process or autoregressive (AR)

and autoregressive integrated moving average (ARIMA) models.

CHAPTER 3

METHODOLOGY

3.1

Introduction

Various stochastic processes are used for generating the hydrologic data of

streamflow. The models either developed or used in order to carry out this study are of

different types in terms of their purposes, capabilities, interfaces, inputs, and outputs.

These mainly include water balance model, reservoir simulation, and stochastic models.

with each of the models are presented in the following sections. The computation work

used the available historical data taken from Department of Irrigation and Drainage. The

relevant data is used in deriving the forecasting models. Markov and ARIMA modeling

methods have been proposed for streamflow forecasting of Sungai Bernam. The method used to assess the forecasting accuracy of these models is also discussed.

3.2

Markov Model

Gupta (1989) stated that the general Markov procedure of data synthesis comprises:

1. Determining statistical parameters from the analysis of the historical record
2. Identifying the frequency distribution of the historical data
3. Generating random numbers of the same distribution and statistical characteristics
4. Constituting the deterministic part considering the persistence (influence of previous flows) and combining with the random part.

Four parameters that are important in a synthetic study are the mean flow, standard deviation, coefficient of skewness and correlation coefficient. The sample mean flow is (Gupta, 1989):

\bar{q} = \frac{1}{n} \sum_{i=1}^{n} q_i    (3.1)

Where,
\bar{q} = mean observed (historical) flow
n = total numbers (values) of flow
q_i = ith number of observed flow

The sample estimate of the variance or standard deviation, S, which is a measure

of the variability of the data is given by (Gupta, 1989):

(3.2)

given by (Gupta, 1989):

(3.3)

any time is affected by the flow at another time. The K-lag coefficient, in which the

effect extends by K time units is given by (Gupta, 1989):

(3.4)

The one-lag serial coefficient, in which the current flow is affected only by the previous flow, can be obtained by substituting K = 1. Additional lags should be included only as long as they produce a model that explains more about the pattern of flows than one with fewer lags does (Fiering and Jackson, 1971).
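As a rough illustration of these four statistics, the sketch below computes them for a short series. It assumes the common sample estimators (an n − 1 divisor for the standard deviation, the usual small-sample skewness correction, and a lag-K coefficient normalised by the full sum of squares); the forms in Gupta (1989) may differ in detail, and the function names and flow values are hypothetical.

```python
import math

def sample_stats(q):
    """Return (mean, standard deviation, coefficient of skewness) of a flow series."""
    n = len(q)
    mean = sum(q) / n
    # Sample standard deviation with an (n - 1) divisor (assumed convention).
    s = math.sqrt(sum((x - mean) ** 2 for x in q) / (n - 1))
    # Adjusted sample skewness (assumed convention).
    skew = n * sum((x - mean) ** 3 for x in q) / ((n - 1) * (n - 2) * s ** 3)
    return mean, s, skew

def lag_k_correlation(q, k=1):
    """Lag-K serial correlation coefficient of a flow series."""
    n = len(q)
    mean = sum(q) / n
    num = sum((q[i] - mean) * (q[i + k] - mean) for i in range(n - k))
    den = sum((x - mean) ** 2 for x in q)
    return num / den

flows = [8.2, 9.1, 12.4, 10.3, 7.6, 6.9, 9.8, 11.2]  # hypothetical monthly flows, m3/s
print(sample_stats(flows), lag_k_correlation(flows, k=1))
```

Setting k = 1 gives the one-lag coefficient used by the Markov model.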


3.2.2

Identification of Distribution

Generally, the distributions used in streamflow generation are normal, lognormal and gamma families. The bell-shaped, or normal, distribution is most extensively

used in statistical applications because the sum of variables derived from any

distribution tends to be distributed normally according to the central limit theorem. To

test normality, the historical values of flow are plotted against the percentage of values

in the record that are equal to or greater than the plotted value. The flows are arranged in

descending order. For each value xi, the percent is computed by 100(n − i + 1)/n where

i is the rank of value xi and n is the number of historic values. If the plot is a straight

line, the distribution is normal. The coefficient of skewness also should be close to zero,

since the normal distribution has no skewness (Gupta, 1989).

distribution. The log-normal distribution is positively skewed, which matches the characteristics of many hydrologic variables. This distribution is suitable for low-flow studies because small changes in low values produce large changes in their logarithmic values. A straight-line plot indicates the log-normal distribution, while the skewness calculated from the logarithms of the values should be close to zero (Gupta, 1989).

flows show appreciable skewness. However, this distribution cannot be used when multiple lags exist, that is, when a flow is affected by many previous flows. Normally, historical data do not clearly fit any of these distributions. The choice is made based on the purpose, economics and any other considerations (Gupta, 1989).

3.2.3

Gupta (1989) stated that random numbers can be generated either by a computer-based pseudorandom-number generator or from random number tables. The random numbers should belong to the same distribution as the historical record so that the generated flows have similar characteristics. Normal random numbers have zero mean and unit standard deviation, while log-normal random numbers have both mean and standard deviation equal to one.

3.2.4

q_i = \bar{q} + r_1 (q_{i-1} - \bar{q}) + t_i S \sqrt{1 - r_1^2}    (3.5)

where t_i is a random variate from an appropriate distribution with a mean of zero and variance of unity; and i is the ith position in the series from 1 to N years.

A model on the same lines for monthly flows, developed by Thomas and Fiering

has the following form (Maass et al., 1962):

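q_{i,j} = \bar{q}_j + b_j (q_{i-1,j-1} - \bar{q}_{j-1}) + t_{i,j} S_j \sqrt{1 - r_j^2}    (3.6)

Where,
q_{i,j} = flow in ith month from the beginning, for jth month of the year
q_{i-1,j-1} = flow in the (i−1)th month from the beginning, for the (j−1)th month of the year
b_j = regression coefficient for estimating flow in the jth month from the (j−1)th month = r_j S_j / S_{j-1} (12 values)
S_j = standard deviation of flows in the jth month
t_{i,j} = random variate from an appropriate distribution

3.3

ARIMA Model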

ARIMA models have become common practice for the specification of stationary time-dependent input processes since the work of Box and Jenkins (1970). ARIMA models are usually used as discrete-time processes (Leemis, 1998), and hence the data from a trace are interpreted as a count process for ARIMA fitting. Some assumptions were made before fitting the ARIMA model. In addition, specific procedures must be followed when fitting ARIMA models to time series.

3.3.1

Model Assumptions

Before performing the ARIMA modelling, some assumptions were made (Hasmida, 2009):

1. The data are stationary
2. The data have a normal distribution
3. No outliers exist in the data
4. No data are missing

tentatively identify Box-Jenkins model, we must first determine whether the time series we wish to forecast is stationary. The stationarity of the monthly streamflow data was examined by graphical representation of the data. The original data were plotted against time, in months. A time series is stationary if its statistical properties (for example, the mean and the variance) are essentially constant through time (Bowerman and O'Connell, 1993). In other words, stationary models assume that the process remains in equilibrium about a constant mean level, which is the case when the plot shows the data fluctuating around a constant mean (Box et al., 1994). The other graphical method applied in this study is to examine the ACF and PACF plots of the original data. Stationary data have randomly distributed ACF and PACF plots.

A transformation might be required for a nonstationary series, and this can be done using the differencing method (Box et al., 1994; Shumway, 1988). This process is represented in the ARIMA modelling approach by the I (Integrated) component, written as d in ARIMA notation. The level of differencing depends on how far the data are from stationarity, and might be 0, 1, 2 or higher. Level 0 means that no differencing is applied to the data. Level 1 means that the first difference is needed, and level 2 means that the second difference is needed. Higher levels of differencing might be applied to nonstationary and complex data (Hasmida, 2009).
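The differencing levels described above can be sketched in a few lines. This is only an illustration with hypothetical function names; a lag of 1 gives the ordinary (non-seasonal) difference, and a lag of 12 gives a seasonal difference for monthly data.

```python
def difference(series, lag=1):
    """Return the lagged difference y[t] - y[t - lag]; the result is `lag` values shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def difference_d(series, d=1, lag=1):
    """Apply the lagged difference d times (d = 0 leaves the series unchanged)."""
    out = list(series)
    for _ in range(d):
        out = difference(out, lag)
    return out

y = [1, 4, 9, 16, 25]             # hypothetical series with a curved trend
print(difference_d(y, d=1))       # first difference
print(difference_d(y, d=2))       # second difference
```

Each level of differencing shortens the series, which is one practical reason not to difference more than necessary.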

Data with a normal distribution follow a bell-shaped curve. The bell-shaped curve is concentrated in the center and decreases on either side. This means that the data have less of a tendency to produce unusually extreme values, compared to some other distributions. In addition, the bell-shaped curve is symmetric, which means that the probability of deviations from the mean is comparable in either direction (Hasmida, 2009).

transformation that can be applied are the log transformation method and the Box-Cox transformation method. The Box-Cox method is applied if the log transformation is not capable of transforming the data into a normal distribution (Hasmida, 2009).
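For reference, the Box-Cox transformation is commonly defined as (x^λ − 1)/λ for λ ≠ 0 and ln x for λ = 0, so the log transformation is its limiting case. The sketch below assumes that standard definition; the function name and the choice of λ are hypothetical, and in practice λ is usually estimated from the data.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a single positive value x for a given lambda."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(x)            # log transformation as the special case
    return (x ** lam - 1.0) / lam

flows = [4.0, 9.0, 16.0]              # hypothetical flows, m3/s
print([box_cox(q, 0.5) for q in flows])
print([box_cox(q, 0) for q in flows])
```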


3.3.1.3 Outlier

(Moore and McCabe, 1999). The presence of an outlier always indicates some sort of

problem. This can be a case which does not fit the model under study or an error in

measurement. Outliers are often easy to spot in histograms. For example, the point on

the far left in the above figure is an outlier. This data point should be removed because it is also a sign of nonstationary data (Hasmida, 2009).

Yaffee and McGee (2000) suggested that missing values in a data series should be replaced using a theoretically defensible algorithm. A crude missing data replacement method is to plug in the mean for the overall series. A less crude algorithm is to use the mean of the period within the series in which the observation is missing. Another algorithm is to take the mean of the adjacent observations. In exponential smoothing, a missing value is often replaced by the one-step-ahead forecast from the previous observation. Other forms of interpolation employ linear splines, cubic splines, or step function estimation of the missing data.
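Two of the simpler replacement rules listed above can be sketched as follows. The sketch marks missing observations with None, handles only isolated gaps in the adjacent-mean rule, and uses hypothetical function names and values.

```python
def fill_with_series_mean(series):
    """Replace every missing value (None) with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]

def fill_with_adjacent_mean(series):
    """Replace an isolated missing value with the mean of its two neighbours."""
    filled = list(series)
    for i in range(1, len(series) - 1):
        before, after = series[i - 1], series[i + 1]
        if series[i] is None and before is not None and after is not None:
            filled[i] = (before + after) / 2
    return filled

flows = [8.0, None, 10.0, 12.0]       # hypothetical monthly flows with one gap
print(fill_with_series_mean(flows))
print(fill_with_adjacent_mean(flows))
```

Gaps at the ends of the series, or runs of several missing months, are left untouched by the adjacent-mean rule and need one of the other methods.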

In order to handle missing data for this study, linear regression between flow of

study area station and flow of adjacent station is used. If data still cannot be obtained,

regression between streamflow and rainfall for that station is used to get the missing

data.


3.3.2 Model Procedure

The ARIMA modeling procedure for fitting ARIMA models to time series,

which was developed by Box and Jenkins (1976), consists of three iterative steps: model

identification; parameter estimation; and diagnostic checking. Figure 3.1 depicts the

process of ARIMA modeling. The procedure is itemized as follows:

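[Figure 3.1 flowchart: original streamflow data; model identification; parameter estimation; diagnostic checking; is the model adequate? If no, return to model identification; if yes, proceed to the streamflow forecast.]

Figure 3.1: Flowchart of ARIMA modeling (Lee and Ko, 2011)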

time series plot or ACF. In the ACF, large autocorrelations that do not die out indicate that differencing may be required to give a constant mean. A seasonal pattern that repeats every kth time interval suggests taking the kth difference to remove a portion of the pattern. Most series should not require more than two difference operations or orders. Be careful not to overdifference. If spikes in the ACF die out rapidly, there is no need for further differencing.

Next, examine the ACF and PACF of the stationary data in order to identify which autoregressive or moving average terms are suggested. The following general guidelines (SPSS, 1993), based on the graphical method, were applied in the identification process:

i. Nonstationary series have an ACF that remains significant for half a dozen or more lags, rather than quickly declining to 0. Such a series must be differenced until it is stationary before it can be identified.

ii. Autoregressive processes have an exponentially declining ACF and spikes in the first one or more lags of the PACF. The number of spikes indicates the order of the autoregression.

iii. Moving average processes have spikes in the first one or more lags of the ACF and an exponentially declining PACF. The number of spikes indicates the order of the moving average.

iv. Mixed (ARMA) processes typically show exponential declines in both the ACF and the PACF.

At the identification stage, the sign of the ACF or PACF and the speed with which an exponentially declining ACF or PACF approaches 0 depend upon the sign and actual value of the AR and MA coefficients (SPSS, 1993).
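The sample ACF and PACF that these guidelines refer to can be computed as in the sketch below. It assumes the usual sample autocorrelation estimator and obtains the PACF from the ACF with the Durbin-Levinson recursion; statistical packages such as Minitab may use slightly different estimators, and the function names are hypothetical.

```python
def acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

def pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, max_lag)
    out = [1.0]
    phi_prev = []                      # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

series = [5.0, 7.0, 6.0, 9.0, 8.0, 11.0, 10.0, 13.0]   # hypothetical data
print(acf(series, 3))
print(pacf(series, 3))
```

Spikes are usually judged against approximate 5% significance limits of about ±2/√n.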


3.3.2.2 Parameter Estimation

Once the tentative model is formulated, the related model parameters are estimated using the least squares scheme. Parameters are estimated so that the gradient of the forecasting errors with respect to the historical data is zero. The primary objective of parameter estimation is to minimize the forecasting error and determine both the model and its parameters (Lee and Ko, 2011). Each parameter of a tentative ARIMA model can be tested using t-values and p-values. The t-value is calculated by dividing the coefficient by its standard error.

Then, a diagnostic test was conducted to ensure that the essential modeling assumptions are satisfied for a given model. When the parameters have been well estimated, the accuracy of the tentative model is validated by examining the ACF and PACF of the residuals. The residuals should resemble a white noise process. Furthermore, the Q-statistic test is applied to confirm the tentative model (O'Donovan, 1983). If the calculated value of Q exceeds the critical value of χ² obtained from the chi-square tables, the tentative model is inadequate (Lee and Ko, 2011).

Furthermore, at this stage the Ljung-Box test is used for testing whether the residuals are white noise. The null hypothesis is that the residuals are white noise. In other words, the residual series should be independent, homoscedastic (having constant variance), and normally distributed. We reject the null hypothesis if the p-value of the chi-square statistic is less than the alpha level of 5%.
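A minimal sketch of the Ljung-Box statistic is given below, assuming the standard form Q = n(n + 2) Σ r_k² / (n − k) summed over lags k = 1 to m, where r_k is the lag-k autocorrelation of the residuals. The function name is hypothetical and the residuals are illustrative only.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic of a residual series for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((e - mean) ** 2 for e in residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(
            (residuals[t] - mean) * (residuals[t + k] - mean) for t in range(n - k)
        ) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

residuals = [0.3, -0.1, 0.4, -0.5, 0.2, -0.2, 0.1, -0.3]   # hypothetical residuals
print(ljung_box_q(residuals, max_lag=3))
```

Q is then compared with the chi-square critical value whose degrees of freedom equal the number of lags minus the number of estimated parameters; a Q above the critical value (equivalently, a p-value below alpha) leads to rejection of the white-noise hypothesis.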


These steps are repeated until an adequate model is identified. When the steps in

ARIMA modeling are completed, a specific ARIMA model is applied to predict the

future monthly streamflow for 1 year ahead.

For ARIMA modeling, the statistical software Minitab version 15 was used. Using Minitab, the ARIMA modeling steps can be summarized as follows:

If a seasonal pattern of ACF and PACF is still found from step No. 6, then go to step No. 5

7. Apply the rest of the procedures, which are estimation, diagnostic checking and forecasting, according to step No. 6 until the best forecasting pattern is obtained.

3.4

multicriterion performance evaluation procedure was used in this study. The following

indices were used to evaluate the performance of the models (Shalamu, 2009):

(3.7)

(3.8)

3. Chi-Squared Test:

(3.9)


where,

Yi = the observed flow

Fi = the forecasted flow
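The three indices named in this chapter (MAPE, RMSE and a chi-squared statistic) can be sketched as below in terms of Yi and Fi. The MAPE and RMSE forms are the standard ones; the chi-squared form, the sum of (Yi − Fi)² / Fi, is an assumption, since the exact expression of Equation 3.9 is not reproduced here. Function names and values are hypothetical.

```python
import math

def mape(observed, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(observed)
    return 100.0 / n * sum(abs((y - f) / y) for y, f in zip(observed, forecast))

def rmse(observed, forecast):
    """Root mean square error, in the units of the flow."""
    n = len(observed)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(observed, forecast)) / n)

def chi_square(observed, forecast):
    """Chi-squared statistic with the forecast as the expected value (assumed form)."""
    return sum((y - f) ** 2 / f for y, f in zip(observed, forecast))

observed = [10.0, 20.0]    # hypothetical observed flows, m3/s
forecast = [8.0, 25.0]     # hypothetical forecast flows, m3/s
print(mape(observed, forecast), rmse(observed, forecast), chi_square(observed, forecast))
```

Lower values of all three indices indicate a closer match between forecast and observed flows.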

CHAPTER 4

4.1

Introduction

This chapter gives a detailed description of the analysis of the time series data using both the Markov and ARIMA modeling methods for streamflow forecasting. Most of the computation work for the ARIMA and Markov models was carried out using Minitab and Microsoft Excel, respectively. Both methods are used to model the streamflow of Sungai Bernam at Tanjung Malim, Selangor (Station No. 3615412). The models are checked to obtain an adequate model for streamflow forecasting.

Data from January 1960 to December 2010 were used in deriving the stochastic and forecasting models. The 552 months of data from January 1960 to December 2005 are used as the calibration set for both models. The remaining 60 months of data, from January 2006 to December 2010, are used as the validation set.

4.2

Some data values are missing in the data series for Sungai Bernam streamflow at Tanjung Malim (Station No. 3615412). To handle missing data in this study, linear regression between the flow at the study area station and the flow at an adjacent station is used. The regression line is taken as the best way to predict y from x. Where streamflow data for Sungai Bernam at Tanjung Malim were missing, streamflow data from the adjacent station at Jam. Skc (Station No. 3813411) were used. For example, data are missing for January 1962, February 1962 and March 1962. Streamflow data from adjacent months (preceding and following) at both stations are used to obtain the regression line for estimating the missing data. This is shown in Figure 4.1.

The missing monthly data at Station Tanjung Malim for January, February and March 1962 can be completed using the linear regression equation y = 0.126x + 2.513, with a coefficient of determination, R², of 0.845, in which y and x represent the flow at Station Tanjung Malim (m³/s) and Jam. Skc (m³/s), respectively.
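The regression step can be sketched as follows: an ordinary least squares line is fitted to paired flows from the two stations, and the fitted line is then applied to the adjacent-station flow for the missing month. The function names and the sample input are hypothetical; only the coefficients 0.126 and 2.513 come from the text.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept; also returns R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

def estimate_tanjung_malim(flow_jam_skc):
    """Apply the regression reported in the text, y = 0.126x + 2.513 (flows in m3/s)."""
    return 0.126 * flow_jam_skc + 2.513

print(estimate_tanjung_malim(50.0))   # hypothetical adjacent-station flow of 50 m3/s
```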

If data still cannot be obtained, perhaps because the adjacent streamflow station also had missing data for that month, rainfall data from an adjacent station can be used to obtain a regression equation for estimating the missing streamflow data. For example, data are missing from February 1993 to May 1993 at both Tg. Malim and Jam. Skc. Rainfall data from adjacent months (preceding and following) at Station Ldg. Katoyang at Tg. Malim (Station No. 3714152) are regressed against the flow data of Station Jam. Skc, as shown in Figure 4.2. The linear regression equation was found to be y = 0.146x + 10.43, with a coefficient of determination, R², of 0.603, in which y represents the flow at Station Jam. Skc (m³/s) and x represents the rainfall at Station Ldg. Katoyang (mm).

Once the streamflow data for February 1993 to May 1993 at Station Jam. Skc are known, they can be used to estimate the missing data at Station Tg. Malim from the regression between the two streamflow records, y = 0.112x + 3.673, with a coefficient of determination, R², of 0.892, in which y and x represent the flow at Station Tanjung Malim (m³/s) and Jam. Skc (m³/s), respectively. Figure 4.3 shows the regression line for this equation.

After replacing all the missing data with appropriate estimation data from the

linear regression method, streamflow data of Sungai Bernam is shown in Appendix A.

4.3

Markov Model

which are: (1) determination of statistical parameters from the analysis of the historical record, (2) identifying the frequency distribution of the historical data, (3) generating random numbers of the same distribution and statistical characteristics and (4) constituting the deterministic part and combining it with the random part.

4.3.1 Statistical Parameters of Historical Data

The sample mean flow for the 612 months of data is 9.75 m³/s. The sample standard deviation, S, is 4.66, the skewness is 1.2, the standard error is 0.18863 and the coefficient of variation is 0.47828. These statistical parameters can be calculated using Microsoft Excel or obtained from the EasyFit software. The descriptive statistics from EasyFit are shown in Figure 4.4.

data from January 1960 to December 2005, which uses 552 data points, is shown in Table 4.1.

Table 4.1: Parameters of Monthly Historical Data

i      qj         S²            Sj         Rj          Sj-1          bj      qj-1
Jan    0.049549   9.07979E-05   -          -           -             0.001   0.06
Feb    0.04537    8.89268E-05   0.00943    0.4901265   3.639813919   0.001   0.05
Mar    0.046522   9.69723E-05   -          -           -             0.002   0.05
Apr    0.05187    9.10128E-05   0.00954    0.408       3.69337796    0.001   0.05
May    0.054888   5.21161E-05   0.007219   0.303       3.822355866   0.001   0.05
Jun    0.0488     6.94571E-05   0.008334   0.515       2.990121105   0.001   0.05
Jul    0.046073   7.22414E-05   0.008499   0.541       3.349038581   0.001   0.05
Aug    0.047227   7.71759E-05   0.008785   0.585       3.27283605    0.002   0.05
Sep    0.053852   7.21758E-05   0.008496   0.406       3.447681936   0.001   0.05
Oct    0.059644   7.62886E-05   0.008734   0.369       3.761513315   0.001   0.05
Nov    0.065038   6.89806E-05   0.008305   0.294       4.175448792   0.001   0.06
Dec    0.059643   0.000101211   0.01006    0.3699155   4.738293291   0.001   0.07

4.3.2 Identification of Distribution

In this study, statistical tests are used to select the probability distribution. The Kolmogorov-Smirnov (K-S) test, the Anderson-Darling (AD) test and the chi-squared test can all be used as goodness-of-fit tests. The K-S test is used in preference here as it is more powerful and robust. The best-fitting distribution can be found using the EasyFit application. The K-S goodness-of-fit statistic for the normal distribution is 0.13466, at ranking 42, while for the lognormal distribution it is 0.05954, at ranking 2. The AD goodness-of-fit statistic for the normal distribution is 139.43, at ranking 41, while for the lognormal distribution it is 34.169, at ranking 6. The best-fitting distribution for the streamflow data of Sungai Bernam is the lognormal distribution (Figure 4.5 and Figure 4.6).
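To make the K-S comparison concrete, the sketch below computes the K-S statistic (the largest gap between the empirical and fitted cumulative distributions) for a normal and a lognormal fit. It assumes simple moment-based parameter estimates, which are not necessarily those used by EasyFit, so it will not reproduce the figures quoted above; the function names and sample are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of the sample and a fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_normal(sample):
    """K-S statistic against a normal distribution fitted by mean and standard deviation."""
    dist = NormalDist(mean(sample), stdev(sample))
    return ks_statistic(sample, dist.cdf)

def ks_lognormal(sample):
    """K-S statistic against a lognormal distribution fitted on the logarithms."""
    logs = [math.log(x) for x in sample]
    dist = NormalDist(mean(logs), stdev(logs))
    return ks_statistic(sample, lambda x: dist.cdf(math.log(x)))

flows = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]   # hypothetical, strongly right-skewed
print(ks_normal(flows), ks_lognormal(flows))
```

A smaller statistic indicates a closer fit, which is the sense in which the lognormal distribution ranks above the normal one for positively skewed flows.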

[Figure 4.5: histogram of the streamflow data; x-axis: Flow, q (m³/s)]

hydrologic variables. This distribution is suitable for low-flow studies because small

changes in low values produce large changes in their logarithmic values.

[Figure 4.6: x-axis: Flow, q (m³/s); legend: Sample]

As the distribution is log-normal, the logarithms of the values are used and the results are finally converted back to flows. As an example, the observed streamflow data in logarithmic values for 1960 to 1970 are shown in Table 4.2, while the data for the other years (1971-2005) can be found in Appendix B. These data act as the calibration set for obtaining the parameters of the historical data in order to model the future streamflow.

Table 4.2: Observed streamflow data in logarithmic values, 1960-1970

i     1960    1961    1962    1963    1964    1965    1966    1967    1968    1969    1970
Jan   0.056   0.052   0.059   0.050   0.056   0.046   0.058   0.065   0.040   0.054   0.054
Feb   0.051   0.044   0.046   0.045   0.047   0.044   0.048   0.054   0.037   0.047   0.034
Mar   0.058   0.045   0.056   0.045   0.050   0.050   0.052   0.047   0.031   0.040   0.036
Apr   0.064   0.051   0.057   0.044   0.052   0.066   0.058   0.060   0.041   0.046   0.045
May   0.055   0.055   0.055   0.044   0.056   0.068   0.045   0.059   0.059   0.060   0.052
Jun   0.046   0.051   0.046   0.046   0.045   0.050   0.052   0.043   0.050   0.050   0.038
Jul   0.057   0.046   0.045   0.045   0.057   0.039   0.053   0.043   0.043   0.037   0.043
Aug   0.049   0.051   0.049   0.053   0.048   0.043   0.054   0.044   0.043   0.044   0.045
Sep   0.058   0.058   0.056   0.060   0.060   0.053   0.057   0.055   0.055   0.042   0.054
Oct   0.057   0.056   0.069   0.070   0.058   0.069   0.072   0.060   0.057   0.059   0.055
Nov   0.063   0.060   0.075   0.079   0.065   0.067   0.077   0.076   0.058   0.056   0.058
Dec   0.065   0.066   0.058   0.066   0.064   0.072   0.072   0.055   0.060   0.053   0.061

4.3.3

RAND( ). To get the random normal deviate, t, of mean equal to 1 and unit standard deviation, we use the inverse error function, erf⁻¹(z):

(4.1)

log-normal distribution:

(4.2)

(4.3)

As log-normal random numbers have both mean and standard deviation equal to one, Equation 4.3 becomes:

(4.4)

(4.5)

2006 is shown in Table 4.3, while the random number generation for the other years (2007-2010) can be found in Appendix C.

Table 4.3: Generation of Random Number for Year 2006

i           RAND( )    2·RAND( ) − 1   erf⁻¹       ti,j
January     0.699645   0.399289        0.370085    1.523379
February    0.45481    -0.090379       -0.08027    0.886483
March       0.063732   -0.872536       -1.0558     -0.49313
April       0.224711   -0.550577       -0.53482    0.243657
May         0.236038   -0.527923       -0.50847    0.280915
June        0.471912   -0.056176       -0.04983    0.929536
July        0.999341   0.998683        1.443813    3.041859
August      0.533139   0.066278        0.058805    1.083163
September   0.095672   -0.808656       -0.91763    -0.29772
October     0.044674   -0.910651       -1.15355    -0.63136
November    0.997494   0.994989        1.429319    3.021363
December    0.407816   -0.184368       -0.16487    0.766834
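The values in Table 4.3 are consistent with t = 1 + √2·erf⁻¹(2u − 1), where u is the RAND( ) value. A minimal sketch of that calculation is given below. It relies on the identity that √2·erf⁻¹(2u − 1) equals the standard normal quantile of u, so the standard library's NormalDist can be used in place of an explicit inverse error function; the function name is hypothetical.

```python
from statistics import NormalDist

def deviate_from_uniform(u, mean=1.0, sd=1.0):
    """Map a uniform number u in (0, 1) to a normal deviate with the given mean and sd."""
    z = NormalDist().inv_cdf(u)       # equals sqrt(2) * erfinv(2u - 1)
    return mean + sd * z

t_january = deviate_from_uniform(0.699645)    # RAND( ) value for January 2006
print(round(t_january, 3))
```

For values of u near the middle of the range this agrees with the tabulated deviates to about three decimals; for u close to 0 or 1 the result depends on how erf⁻¹ is approximated, so tail values may differ from those in the table.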

(influence of previous flows) and combining with the random part to develop monthly streamflow model for year 2006 is shown in Table 4.4, while the streamflow model for the other years (2007-2010) can be found in Appendix D.

The Markov model for monthly flows, developed by Thomas and Fiering, has the following form (Maass et al., 1962):

q_{i,j} = \bar{q}_j + b_j (q_{i-1,j-1} - \bar{q}_{j-1}) + t_{i,j} S_j \sqrt{1 - r_j^2}    (4.6)

We will use Equation 4.6 to develop Markov model for monthly flows. Flow in

ith month from the beginning, for jth month of the year can be modeled by adding mean

of flow of jth month of the year (January to December) with deterministic and random

component.
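One step of this calculation can be sketched as follows, assuming the Thomas-Fiering form in which the deterministic part is the monthly mean plus bj times the previous month's departure from its own mean, and the random part is ti,j Sj √(1 − rj²). The function name is hypothetical; the inputs are the February values from Table 4.1, the January mean, the January 2006 model flow and the February deviate from Table 4.3.

```python
import math

def thomas_fiering_flow(q_prev, mean_j, mean_prev, b_j, s_j, r_j, t_ij):
    """Model flow for one month: deterministic component plus random component."""
    deterministic = mean_j + b_j * (q_prev - mean_prev)
    random_part = t_ij * s_j * math.sqrt(1.0 - r_j ** 2)
    return deterministic + random_part

feb_2006 = thomas_fiering_flow(
    q_prev=0.063,        # January 2006 model flow (log value)
    mean_j=0.04537,      # February mean
    mean_prev=0.049549,  # January mean
    b_j=0.001,
    s_j=0.00943,
    r_j=0.4901265,
    t_ij=0.886483,       # February 2006 deviate
)
print(round(feb_2006, 3))   # 0.053
```

The result matches the February model flow (log value) listed for 2006; repeating the step month by month, with each result fed in as the next q_prev, generates the whole year.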

Table 4.4: Monthly Streamflow Model for Year 2006

       Deterministic Component               Random Component                 Model flow
i      qi-1,j-1   qj+bj(qi-1,j-1-qj-1)       ti,j        Sjti,j√(1-rj²)       qi,j (Log)   qi,j (m³/s)
Jan    0.049549   0.049541669                1.523379    0.013                0.063        13.533
Feb    0.063      0.045386033                0.886483    0.007                0.053        9.077
Mar    0.053      0.04653475                 -0.49313    -0.004               0.043        5.641
Apr    0.043      0.051865643                0.243657    0.002                0.054        9.604
May    0.054      0.054889433                0.280915    0.002                0.057        10.807
Jun    0.057      0.048803168                0.929536    0.007                0.055        10.210
Jul    0.055      0.046082272                3.041859    0.022                0.068        16.422
Aug    0.068      0.04726108                 1.083163    0.008                0.055        10.014
Sep    0.055      0.053859993                -0.29772    -0.002               0.052        8.642
Oct    0.052      0.059642058                -0.63136    -0.005               0.055        9.821
Nov    0.055      0.065034911                3.021363    0.024                0.089        32.326
Dec    0.089      0.059661808                0.766834    0.007                0.067        15.849

The streamflow modelled using the Markov model is compared with the observed streamflow of the validation set, which consists of 60 months of data from January 2006 to December 2010. Graphically, Figure 4.8 shows that the Markov model does not work well for streamflow forecasting of Sungai Bernam because it does not match the actual streamflow well.

some forecast evaluation measures like Root Mean Square Error (RMSE), Chi-square Test and Mean Absolute Percentage Error (MAPE). The results are summarized in Table 4.5 and the details of the calculation can be found in Appendix E.

Table 4.5

Performance Evaluation Procedure   Markov model
MAPE                               53.66
RMSE                               7.29
Chi-square test                    250.99

4.4

ARIMA Model

In this study, an appropriate tentative ARIMA model for Sg. Bernam streamflow is investigated. Examination of the autocorrelation function (ACF) and partial autocorrelation function (PACF) provides a thorough basis for analyzing the system behavior under time independence, and suggests the appropriate parameters to include in the model.

The tentative models are checked and the best one is selected for streamflow forecasting. As mentioned in the previous chapter, ARIMA modeling follows three important stages, which are shown in the flow diagram of the Box-Jenkins methodology (Figure 4.9).

[Figure 4.9 flow diagram:
1. Tentative identification (ACF and PACF)
2. Parameter estimation (testing parameters)
3. Diagnostic checking (normal distribution of residual): is the model adequate? If no, return to step 1; if yes, continue to step 4
4. Forecasting (forecast calculation)]

4.4.1 Model Identification

(ACF) and sample partial autocorrelation function (PACF) to determine whether the series is stationary or not, and then decide which functional form best fits the data and is an appropriate model for it. In practice, the sample ACF and PACF are random variables and will not give the same picture as the theoretical functions. This makes model identification more difficult and can involve much trial and error (Nazuha et al., 2010).

The most common method of checking stationarity is to examine the time series plot of the data. Stationarity means that the data fluctuate around a constant mean. If the time series plot is found to be nonstationary, differencing needs to be applied. Figure 4.10 shows that the data are nonstationary, so a non-seasonal difference (d = 1, lag k = 1) needs to be applied. Based on graphical examination, Figure 4.11 shows that the data are stationary after applying the non-seasonal difference.

[Figure 4.10: time series plot of the original monthly streamflow data; y-axis: Streamflow, Yt (m³/s); x-axis: Month and Year, from January 1960]

[Figure 4.11: time series plot of the streamflow data after non-seasonal differencing; x-axis: Month and Year, from January 1960]

The next step is to identify the values of p and q, the orders of the AR (p) and MA (q) components, for both the seasonal and non-seasonal series. For this purpose, the ACF and PACF coefficients are computed. Table 4.6 gives the general theoretical patterns for identification of the likely model:

Model                                                            ACF                    PACF
MA(q): moving average of order q                                 Cuts off after lag q   Dies down
AR(p): autoregressive of order p                                 Dies down              Cuts off after lag p
ARMA(p,q): mixed autoregressive-moving average of order (p,q)    Dies down              Dies down
AR(p) or MA(q)                                                   Cuts off after lag q   Cuts off after lag p
No order AR or MA (White Noise or Random process)                No spike               No spike

[Figure 4.12: autocorrelation function (ACF) plot, with 5% significance limits for the autocorrelations; axes: Autocorrelation against Lag]

[Figure 4.13: partial autocorrelation function (PACF) plot, with 5% significance limits for the partial autocorrelations; axes: Partial Autocorrelation against Lag]

As can be seen from Figure 4.12 and Figure 4.13, the ACF and PACF die down gradually. Based on this pattern, the values of p, d and q were determined as ARIMA (1, 1, 1). A seasonal pattern in the data is identified from the ACF correlogram. As the ACF indicates a seasonal pattern, a seasonal difference (D = 1, lag k = 12) needs to be applied.

[Figure 4.14: autocorrelation function (ACF) plot after seasonal differencing, with 5% significance limits for the autocorrelations; axes: Autocorrelation against Lag]

[Figure 4.15: partial autocorrelation function (PACF) plot after seasonal differencing, with 5% significance limits for the partial autocorrelations; axes: Partial Autocorrelation against Lag]

After applying the seasonal difference, Figure 4.14 shows that the ACF cuts off after lag 12, while Figure 4.15 shows that the PACF dies down. For seasonal ARIMA, the general notation is ARIMA (p, d, q)(P, D, Q)S. Based on this pattern, the values of P, D and Q were determined as ARIMA (0, 1, 1)12. However, to make sure that the right model has been identified, a second tentative model, ARIMA (1, 1, 1)12, is also considered.

Each parameter of a tentative ARIMA model can be tested using t-values and p-values. The t-value is calculated by dividing the coefficient by its standard error. The standard error (SE) of a coefficient is the standard deviation of the estimate of a regression coefficient. It measures how precisely the data can estimate the coefficient's unknown value. Its value is always positive, and smaller values indicate a more precise estimate. The standard error of a coefficient helps determine whether the value of the coefficient is significantly different from zero. If the p-value associated with the t-statistic is less than the alpha level, we can conclude that the coefficient is significantly different from zero.

From Table 4.7, the standard error of the SAR 12 coefficient is large relative to the value of the coefficient itself, so the t-value of 1.26 is too small to declare statistical significance. We reject the null hypothesis if |t| > t(α/2, df), where df = n − np. Here |tcalc| (= 1.26) < ttable (= 2.25). The resulting p-value is also much greater than the common alpha level. Therefore, the null hypothesis cannot be rejected, and we conclude that this coefficient does not differ from zero. In Table 4.8, which gives the parameter estimates for ARIMA (1,1,1)(0,1,1)12, every coefficient has |tcalc| > ttable (= 2.25) and a p-value less than the alpha level. Hence, the null hypothesis can be rejected, and we conclude that the coefficients are significantly different from zero.

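Table 4.7: Parameter estimates for ARIMA (1,1,1)(1,1,1)12

Type      Coefficient   SE Coefficient   t-value   p-value
AR 1      0.2782        0.0520           5.35      0.000
SAR 12    0.0589        0.0467           1.26      0.208
MA 1      0.8765        0.0256           34.24     0.000
SMA 12    0.9537        0.0206           46.25     0.000

Table 4.8: Parameter estimates for ARIMA (1,1,1)(0,1,1)12

Type      Coefficient   SE Coefficient   t-value   p-value
AR 1      0.2894        0.0516           5.61      0.000
MA 1      0.8788        0.0248           35.41     0.000
SMA 12    0.9553        0.0184           51.98     0.000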

4.4.3 Diagnostic Checking

The next step of the time series modeling approach is diagnostic checking. It examines the adequacy of the chosen tentative model by checking that the modeling assumptions are satisfied. Several procedures can be applied to check whether the model satisfies the stability or stationarity condition, as required in stochastic modeling work (Ayob and Amat, 2004).

At this stage, the Ljung-Box test is used to test whether the residuals are white noise. The null hypothesis is that the residuals are white noise. In other words, the residual series should be independent, homoscedastic (having constant variance), and normally distributed. The null hypothesis is rejected if the p-value of the Chi-Square statistic is less than the alpha level of 5%.

In this study, both tentative ARIMA models have p-values less than the alpha level at every lag tested, as shown in Table 4.9 and Table 4.10. Strictly, the null hypothesis is therefore rejected for both models, which indicates that some autocorrelation remains in the residuals. Since this test does not favour either model, both tentative models are compared further using the error measures in Table 4.11.
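The Ljung-Box statistic is Q = n(n+2) Σ r_k² / (n − k), summed over lags k = 1 to h, where r_k is the lag-k autocorrelation of the residuals. The sketch below is a minimal pure-Python version written for illustration only; the function names and the toy residual series are assumptions, and it is not the routine Minitab uses. For a fitted ARIMA model, Q is compared with a chi-square distribution whose degrees of freedom are the number of lags minus the number of estimated parameters.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


# Hypothetical residuals that alternate in sign, so they are strongly autocorrelated.
toy_residuals = [1.0, -1.0] * 20
q_lag1 = ljung_box_q(toy_residuals, max_lag=1)

# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
reject_white_noise = q_lag1 > 3.841
```

A large Q (equivalently, a small p-value) is evidence against white-noise residuals, which is why small p-values in a Ljung-Box table signal remaining autocorrelation.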

Table 4.9: Modified Box-Pierce (Ljung-Box) Chi-Square statistic
for ARIMA (1,1,1)(1,1,1)12

Lag          12      24      36      48
Chi-Square   21.2    61.8    82.7    98.1
DF           8       20      32      44
p-Value      0.007   0.000   0.000   0.000

Table 4.10: Modified Box-Pierce (Ljung-Box) Chi-Square statistic
for ARIMA (1,1,1)(0,1,1)12

Lag          12      24      36      48
Chi-Square   23.1    62.2    82.7    97.9
DF           9       21      33      45
p-Value      0.006   0.000   0.000   0.000

Besides that, the best tentative model can be determined by comparing the Least Square Error (LSE) and the Root Mean Square Error (RMSE). The results for the tentative models are summarized in Table 4.11. The best fit in the least-squares sense minimizes the sum of squared residuals, a residual being the difference between an observed value and the fitted value provided by a model. RMSE, the square root of the mean squared residual, is also a good measure of accuracy. The smaller the values of LSE and RMSE, the more accurate the tentative model.

Table 4.11: LSE and RMSE Test for ARIMA Tentative Model

Test    ARIMA (1,1,1)(1,1,1)12    ARIMA (1,1,1)(0,1,1)12
LSE     1798                      1760
RMSE    5.5                       5.4
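The two rows of Table 4.11 are linked by RMSE = sqrt(LSE / n) when LSE is read as the sum of squared errors. The short check below assumes n = 60, the number of months in the validation set; that value of n and the function name are assumptions, although the result is consistent with the tabulated figures.

```python
import math


def rmse_from_sse(sse, n):
    """RMSE obtained from a sum of squared errors over n observations."""
    return math.sqrt(sse / n)


n_months = 60  # assumption: length of the validation set (January 2006 to December 2010)

rmse_111_111 = rmse_from_sse(1798, n_months)  # ARIMA (1,1,1)(1,1,1)12
rmse_111_011 = rmse_from_sse(1760, n_months)  # ARIMA (1,1,1)(0,1,1)12

print(round(rmse_111_111, 1), round(rmse_111_011, 1))
```

Because both measures come from the same squared residuals, they always rank the two tentative models in the same order.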

Of the two tentative models, the one that best fits the criteria and meets the requirements is ARIMA (1,1,1)(0,1,1)12. Forecasting is based on this chosen model. The model identified as the best fit for Sg. Bernam streamflow is:

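$$(1 - \phi_1 B)(1 - B)(1 - B^{12})\,Y_t = (1 - \theta_1 B)(1 - \theta_2 B^{12})\,a_t \tag{4.7}$$

Rewriting the model, we have the following:

$$(1 - \phi_1 B)(1 - B - B^{12} + B^{13})\,Y_t = (1 - \theta_1 B - \theta_2 B^{12} + \theta_1\theta_2 B^{13})\,a_t$$

$$(1 - B - B^{12} + B^{13} - \phi_1 B + \phi_1 B^{2} + \phi_1 B^{13} - \phi_1 B^{14})\,Y_t = (1 - \theta_1 B - \theta_2 B^{12} + \theta_1\theta_2 B^{13})\,a_t$$

$$(1 - (1+\phi_1)B + \phi_1 B^{2} - B^{12} + (1+\phi_1)B^{13} - \phi_1 B^{14})\,Y_t = (1 - \theta_1 B - \theta_2 B^{12} + \theta_1\theta_2 B^{13})\,a_t$$

$$Y_t - (1+\phi_1)Y_{t-1} + \phi_1 Y_{t-2} - Y_{t-12} + (1+\phi_1)Y_{t-13} - \phi_1 Y_{t-14} = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-12} + \theta_1\theta_2 a_{t-13}$$

$$Y_t = (1+\phi_1)Y_{t-1} - \phi_1 Y_{t-2} + Y_{t-12} - (1+\phi_1)Y_{t-13} + \phi_1 Y_{t-14} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-12} + \theta_1\theta_2 a_{t-13}$$

Note that:

AR 1, $\phi_1$ = 0.2894

MA 1, $\theta_1$ = 0.8788

SMA 12, $\theta_2$ = 0.9553

$$Y_t = (1 + 0.2894)Y_{t-1} - 0.2894\,Y_{t-2} + Y_{t-12} - (1 + 0.2894)Y_{t-13} + 0.2894\,Y_{t-14} + a_t - 0.8788\,a_{t-1} - 0.9553\,a_{t-12} + (0.8788 \times 0.9553)\,a_{t-13}$$

$$Y_t = 1.2894\,Y_{t-1} - 0.2894\,Y_{t-2} + Y_{t-12} - 1.2894\,Y_{t-13} + 0.2894\,Y_{t-14} + a_t - 0.8788\,a_{t-1} - 0.9553\,a_{t-12} + 0.8395\,a_{t-13}$$

$$Y_t = Y_{t-12} + [1.2894\,Y_{t-1} - 1.2894\,Y_{t-13} - 0.2894\,Y_{t-2} + 0.2894\,Y_{t-14}] + [a_t - 0.8788\,a_{t-1} - 0.9553\,a_{t-12} + 0.8395\,a_{t-13}] \tag{4.8}$$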

Equation (4.8) can be used for streamflow forecasting with the ARIMA model. Equation (4.8) also shows that the forecast for time period t is the sum of (1) the value of the time series in the same month of the previous year; (2) a trend component determined by the difference between the previous month's value and the value for that month a year earlier ($Y_{t-1} - Y_{t-13}$), and by the corresponding difference for the month before that ($Y_{t-2} - Y_{t-14}$); and (3) the effects of the random shocks (or residuals) of periods t, t-1, t-12 and t-13 on the forecast.
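The three components of Equation (4.8) can be sketched as a small Python function. This is an illustrative reading of the equation, not the Minitab implementation: the function name, the list-based interface and the choice to set the current shock a_t to its expected value of zero are assumptions.

```python
PHI_1 = 0.2894    # AR 1 estimate
THETA_1 = 0.8788  # MA 1 estimate
THETA_2 = 0.9553  # SMA 12 estimate


def forecast_next(flows, residuals):
    """One-step forecast of Y_t following Equation (4.8).

    flows:     observed flows up to Y_{t-1} (at least 14 values)
    residuals: past shocks aligned with flows (at least 13 values)
    The unknown current shock a_t is taken as zero (assumption).
    """
    if len(flows) < 14 or len(residuals) < 13:
        raise ValueError("need at least 14 flows and 13 residuals")
    # (1) same month of the previous year
    seasonal = flows[-12]
    # (2) trend component from year-on-year differences
    trend = (1 + PHI_1) * (flows[-1] - flows[-13]) - PHI_1 * (flows[-2] - flows[-14])
    # (3) effect of past random shocks
    shocks = (-THETA_1 * residuals[-1]
              - THETA_2 * residuals[-12]
              + THETA_1 * THETA_2 * residuals[-13])
    return seasonal + trend + shocks


# Hypothetical input: a steadily rising series with no past shocks.
example = forecast_next([float(v) for v in range(14)], [0.0] * 14)
```

With a perfectly linear input and zero shocks, the function simply continues the line, which is a quick sanity check that the signs and lags match the equation.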

In this study, Minitab is used to develop the ARIMA model for monthly flows. As an example, the monthly streamflow model developed using Minitab for 2006 to 2007 is shown in Table 4.12, while the streamflow model for the other years (2008-2010) can be found in Appendix F.

Table 4.12: Monthly streamflow model using Minitab, 2006 to 2007

i           Actual Flow (m3/s)   Model Flow (m3/s)   Residual    Fit       Coefficient
Jan 2006    13.08                9.6732              *           *         0.289364
Feb 2006    8.12                 7.1884              *           *         0.878761
Mar 2006    6.11                 7.2612              *           *         0.955283
Apr 2006    29.72                9.0165              *           *
May 2006    29.22                9.9281              *           *
Jun 2006    17.82                7.6110              *           *
Jul 2006    7.94                 6.7046              *           *
Aug 2006    9.95                 7.0851              *           *
Sep 2006    28.05                9.5168              *           *
Oct 2006    17.63                12.2889             *           *
Nov 2006    17.72                15.2005             *           *
Dec 2006    11.23                12.3581             *           *
Jan 2007    9.05                 7.9227              *           *
Feb 2007    6.80                 6.6970              -1.57988    7.5299
Mar 2007    7.62                 7.1341              -1.39072    7.6507
Apr 2007    13.46                8.9949              -1.05700    9.4570
May 2007    12.05                9.9369              -0.14946    10.2195
Jun 2007    11.38                7.6286              1.10867     7.3913
Jul 2007    13.06                6.7248              -1.04180    7.8818
Aug 2007    8.95                 7.1060              1.04920     7.2208
Sep 2007    9.36                 9.5379              0.26505     11.0050
Oct 2007    14.33                12.3101             -2.99026    13.4603
Nov 2007    14.26                15.2217             -3.58500    15.6250
Dec 2007    8.24                 12.3794             4.03841     11.4816


4.4.5

The streamflow modelled by the ARIMA model is compared with the observed streamflow that has been set aside as the validation set, consisting of 60 monthly values from January 2006 to December 2010. Graphically, Figure 4.16 suggests that the ARIMA model works quite well for streamflow forecasting for Sungai Bernam, because many of the model values match the actual streamflow well. The ability of the ARIMA model in streamflow forecasting is then inspected using several forecast evaluation measures.

As in the validation of the Markov model, forecast evaluation measures such as Root Mean Square Error (RMSE), the Chi-square test and Mean Absolute Percentage Error (MAPE) are used to examine the accuracy of the ARIMA model. The results are summarized in Table 4.13, and the details of the calculation can be found in Appendix G.

4.5

Table 4.13: Forecast evaluation measures for the ARIMA model

Performance Evaluation Procedure    ARIMA model
MAPE                                27.50
RMSE                                5.41
Chi-square test                     191.11

The Markov model is compared with the ARIMA model to inspect the accuracy of the two models in forecasting ability. The observed streamflow data set aside as the validation set, 60 monthly values from January 2006 to December 2010, is used as the benchmark for the comparison. From graphical examination of Figure 4.17, the ARIMA model is better for streamflow forecasting for Sungai Bernam, because more of the ARIMA model values match the actual streamflow.

The Markov model tends to forecast higher streamflow than the actual data. In terms of accuracy, the Markov model is not as good as the ARIMA model because it cannot reproduce the exact or a similar pattern to the actual one. However, these high values are still useful as a reference guideline for preventing damage due to flooding. The Markov model can be used for short-term forecasting, such as hourly and daily forecasting, in order to give more accurate flood warnings.

Meanwhile, if the forecast streamflow is lower than the actual data, flood occurrence cannot be estimated. Lower streamflow forecasts are needed in some agricultural applications to make sure that plants have sufficient water and grow well.

Over a short period, the ARIMA model can reproduce the exact or a similar pattern to the actual one. ARIMA cannot forecast accurately over a longer period, as it is best suited to short-term forecasting; the forecast usually tends to become flat over a sufficiently long period. Because it is good at short-term forecasting, the ARIMA model can also be used for flood control.

In order to inspect the forecasting accuracy of the different models, the performance evaluation measures, namely MAPE, RMSE and the Chi-square test, are compared for the Markov and ARIMA models. Table 4.14 shows the comparison of MAPE, RMSE and the Chi-square test for each model.

Table 4.14: Comparison of MAPE, RMSE and Chi-square test for the Markov and ARIMA models

Performance Evaluation Procedure    Markov model    ARIMA model
MAPE                                53.66           27.50
RMSE                                7.29            5.4156
Chi-squared test                    250.99          191.11

The minimum value of MAPE, RMSE and the Chi-square test indicates the best model for streamflow forecasting. The results of the performance evaluation show that ARIMA has the lower value for all three measures. Therefore, in this study, the ARIMA model performs better than the Markov model for streamflow forecasting.
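The three evaluation measures can be sketched in Python as follows. The sample values are the first three actual and model flows listed in Appendix E and are used only to illustrate the arithmetic. The function names are hypothetical, and the chi-square form shown, with the model value in the denominator, is an assumption, since the exact form used in the appendix calculations is not restated here.

```python
import math


def mape(actual, model):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - m) / a for a, m in zip(actual, model)) / len(actual)


def rmse(actual, model):
    """Root mean square error."""
    return math.sqrt(sum((a - m) ** 2 for a, m in zip(actual, model)) / len(actual))


def chi_square(actual, model):
    """Chi-square statistic; the model flow is treated as the expected value (assumption)."""
    return sum((a - m) ** 2 / m for a, m in zip(actual, model))


actual_flow = [13.08, 8.12, 6.11]   # m3/s
model_flow = [13.533, 9.077, 5.641]  # m3/s

scores = {
    "MAPE": mape(actual_flow, model_flow),
    "RMSE": rmse(actual_flow, model_flow),
    "Chi-square": chi_square(actual_flow, model_flow),
}
```

All three measures shrink as the model flows approach the actual flows, so the model with the smaller values on the same validation data is preferred.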

In this study, one reason that the ARIMA model is better than the Markov model is that the historical data for Sg. Bernam are non-stationary. If the historical data were stationary, the Markov model might have an advantage because it is a probabilistic method in which the transition from one state to another depends on probability. The Markov model cannot remove non-stationarity from the data, whereas an advantage of the ARIMA model is that it can transform non-stationary data into stationary data through differencing.

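The ARIMA model is selected as the best fit because it has the minimum mean squared forecast error, a criterion often used in statistical practice. Therefore, for forecasting one period ahead, which is $Y_{t+1}$, the equation is as follows:

$$Y_{t+1} = Y_{t-11} + [1.2894\,Y_{t} - 1.2894\,Y_{t-12} - 0.2894\,Y_{t-1} + 0.2894\,Y_{t-13}] + [a_{t+1} - 0.8788\,a_{t} - 0.9553\,a_{t-11} + 0.8395\,a_{t-12}] \tag{4.9}$$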

Using Minitab, future values of the time series can easily be forecast from current and past values. Figure 4.18 compares the pattern of the actual and model streamflow for Sungai Bernam. The first 5 years, from January 2006 to December 2010, are the calibration period. The time series plot reveals the cyclic pattern of the ARIMA model. The model flows follow the pattern of the observed streamflow quite well, although the data are non-stationary for several years.

[Figure 4.18: time series plot of actual (Yt-actual) and model (Yt-model) streamflow, Yt (m3/s), against month, January 2006 to January 2015]

The next 5 years are the streamflow forecast by the ARIMA model, covering the 60 months from January 2011 to December 2015. The figure shows that the model forecasts well at first, but the same streamflow pattern is repeated over the longer period. This is because the ARIMA model is best suited to short-term forecasting, since its forecasts are based on previous observations. For short-term forecasting, the Box-Jenkins model can reproduce the details of the original series well. ARIMA cannot forecast accurately over a longer period.
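The repeating pattern in long-horizon forecasts can be reproduced with a simplified sketch. It iterates the autoregressive and differencing part of the selected model with every shock set to zero, which is an assumption made to isolate the effect; the function names are hypothetical, and the starting history is the January 2006 to February 2007 actual flows from Table 4.12, used only as example input.

```python
PHI_1 = 0.2894  # AR 1 estimate reported for the selected model


def extend_forecast(history, steps):
    """Iterate the forecast recursion with all shocks set to zero (assumption)."""
    if len(history) < 14:
        raise ValueError("need at least 14 past values")
    y = list(history)
    for _ in range(steps):
        nxt = y[-12] + (1 + PHI_1) * (y[-1] - y[-13]) - PHI_1 * (y[-2] - y[-14])
        y.append(nxt)
    return y


def seasonal_change(y, t):
    """Month-to-month change at t minus the same change one year earlier."""
    return (y[t] - y[t - 1]) - (y[t - 12] - y[t - 13])


history = [13.08, 8.12, 6.11, 29.72, 29.22, 17.82, 7.94,
           9.95, 28.05, 17.63, 17.72, 11.23, 9.05, 6.80]
path = extend_forecast(history, steps=120)

# This quantity shrinks by a factor of PHI_1 at every step, so it vanishes quickly.
late_change = seasonal_change(path, len(path) - 1)
```

Once that quantity has died out, each forecast year has the same month-to-month shape as the year before, which is the repeated pattern seen in the long-range forecasts.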

CHAPTER 5

4.1 Conclusion

This study has fulfilled its objectives, which were to propose streamflow forecasting methods using Markov and ARIMA models and then to inspect the forecasting accuracy of both models. The Box-Jenkins or ARIMA model is one of the most popular time series forecasting methods. The Markov model has its own advantages in forecasting.

In this study, the tentative model that best fits the criteria and meets the requirements is ARIMA (1,1,1)(0,1,1)12. Analysis of the forecast values using the performance evaluation procedures shows that the ARIMA model is better than the Markov model for forecasting Sg. Bernam streamflow, with lower values of MAPE, RMSE and the Chi-square test. Therefore, the ARIMA model is able to predict the future monthly streamflow of Sungai Bernam more accurately.

The critical part of modeling with ARIMA is the identification of the best tentative model. The tentative model that has been identified is then tested and checked to confirm that it is the best fit.

The Markov model also has some advantage because it forecasts higher streamflow than the actual streamflow. Higher streamflow can cause disasters such as floods. Therefore, the Markov model can be used for flood control.

Both Markov and ARIMA models are good for short-term forecasting. The results show that both models forecast well for the earlier period but cannot forecast accurately for a longer period.

Although both models are good for short-term forecasting and not for long-term forecasting, comparison between the two models shows that ARIMA gives more accurate forecasts.

4.2 Recommendations

Based on the results, both Markov and ARIMA models can be used for streamflow forecasting. However, there are some weaknesses that can be overcome. The following recommendations can be used to increase the accuracy of streamflow forecasting:

1. The amount of data, or equivalently the number of training patterns, also affects forecast performance. For long-memory series, more training patterns result in more accurate forecasts. To forecast accurately, use a long input series.

2. To control floods efficiently, use the Markov model for short-term forecasting, because short-term forecasting is very useful for flood control.

3. Use the ARIMA model for short-term forecasting only, including for streamflow forecasting.

4. Compare the streamflow forecasts with those of other time series forecasting methods such as exponential smoothing, regression analysis or Fourier series analysis.

5. Forecast the time series after removing the outliers.

6. Use a hybrid model combining ARIMA and an artificial neural network in streamflow forecasting.


REFERENCES

And Simulation of it by Dynamic Programming and Markov Chain Method.

American-Eurasian J. Agric. & Environ. Sci., 5(6), 796-803.

Middle East Technical University.

Ayob, K. and Amat, S. D. (2004). Water Use Trend at Universiti Teknologi Malaysia: Application of ARIMA Model. Jurnal Teknologi, 41(B): 47-56.

Mathematics and Economics 3, pp. 241-255.

Applied Approach. Third Edition. Duxbury Press.

Box, G. E. P. and Jenkins, G. M. (1970). Time Series Analysis: Forecasting and Control.

Holden Day, San Francisco.

Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and Control.

Holden Day, San Francisco.

Forecasting and Control. Third Edition. Prentice Hall.

Prentice Hall, Englewood Cliffs, N. J.


Dalphin, R. J. (1987). Markov-Weibull Model of Monthly Streamflow. Journal of Water

Resources Planning and Management, Vol. 113, No. 1.

Monograph 1. American Geophysicists Union. Washington, D. C.

Fortin, V., Perreault, L. and Salas, J. D. (2004). Retrospective Analysis and Forecasting

of Streamflows Using a Shifting Model. Journal of Hydrology, Vol. 296,

135-163.

Hasmida, H. (2009). Water Quality Trend at The Upper Part of Johor River in Relation

to Rainfall and Runoff Pattern. Universiti Teknologi Malaysia.

Heiko, B. (2000). Markov Chain Model for Vegetation Dynamics. Ecological Modeling,

Vol. 126, pp. 139-154.

Keuangan Sektor Publik FEUI

Ho, S. L. and Xie, M. (1998). The Use of ARIMA Models for Reliability Forecasting

and Analysis. Computers ind. Engng, Vol. 35, Nos 1-2, pp. 213-216.

Joomizan, N. (2010). Reservoir Storage Simulation and Forecasting Models for Muda

Irrigation Scheme, Malaysia. Universiti Teknologi Malaysia.

Lee, C. and Ko, C. (2011). Short-term Load Forecasting Using Lifting Scheme and

ARIMA Models. Expert Systems with Applications, Vol. 38, pp. 5902-5911.


Conference, ed. D. J. Medeiros, E. F. Watson, J. S. Carson, and M. S.

Manivannan, 1522. Piscataway, New Jersey: Institute of Electrical and

Electronics Engineers, Inc.

Maass, A., Hufschmidt, M. M., Dorfman, R., Thomas, H. A., Marglin, S. A., Fair and G.

M. (1962). The Design of Water-Resource Systems. Harvard University Press,

Cambridge, Mass., pp 467

for Interval-valued Time Series. Neurocomputing, Vol. 71, pp. 3344-3352.

Modarres, R. (2007). Streamflow Drought Time Series Forecasting. Stoch Environ Res

Risk Assess.

Using Simplified Rule-Based Fuzzy Logic System. Journal-The Institution of

Engineers, Malaysia, Vol. 66, No. 4.

Analysis and Forecasting. John Wiley & Sons, Inc.

Edition. New York: W. H. Freeman.

Resources Systems. Mathematical Modelling, Vol. 3, pp. 117-136.

Nazuha, M., Ruzaidah, S. and Zamzulani, M. (2010). Malaysia Crude Oil Production

Estimation: an Application of ARIMA Model. International Conference on

Science and Social Research (CSSR 2010)


O'Donovan, T. M. (1983). Short Term Forecasting: An Introduction to the Box-Jenkins

Approach. New York: Wiley.

Shalamu, A. (2009). Monthly and Seasonal Streamflow Forecasting in the Rio Grande

Basin. New Mexico State University

Englewood Cliffs, New Jersey.

Tang, Z., Almeida, C. and Fishwick, P. A. (1991). Time Series Forecasting Using

Neural Networks vs. Box-Jenkins Methodology. Simulation.

IOS Press, Amsterdam.

Models. Texas Water Resources Institute, TR-282, pp. 27-131.

Yafee, R. and McGee, M. (2000). Introduction to Time Series Analysis and Forecasting

with Application of SAS and SPSS. Academic Press, Inc., New York.

Streamflow Based on Stochastic Approaches. Journal of Spatial Hydrology,

Vol.4.


APPENDIX A

i

1960

1961

1962

1963

1964

1965

1966

1967

1968

1969

1970

1971

1972

1973

1974

1975

1976

1977

1978

1979

1980

1981

1982

1983

1984

1985

1986

1987

1988

1989

1990

1991

1992

1993

1994

1995

1996

1997

1998

1999

2000

2001

2002

2003

2004

2005

2006

2007

2008

2009

2010

Jan

10.62

8.72

11.98

8.06

10.31

6.65

11.41

14.99

4.87

9.61

9.69

17.40

8.09

6.27

3.62

8.49

7.88

6.84

5.14

4.08

4.05

8.07

4.95

4.02

5.86

9.55

7.95

4.38

4.81

4.56

5.80

3.89

7.37

7.10

10.66

10.85

12.29

16.24

15.95

9.18

14.05

13.58

6.04

7.45

8.08

3.87

13.08

9.05

11.29

9.73

6.83

Feb

8.38

5.95

6.66

6.35

7.08

6.12

7.38

9.69

4.12

6.95

3.49

6.48

8.04

5.26

7.44

7.38

4.36

4.90

3.54

4.33

3.95

6.99

3.93

2.47

9.04

8.80

5.79

4.07

7.86

2.91

3.14

3.51

6.28

8.00

10.16

9.37

11.50

19.71

16.02

9.77

11.67

9.33

4.35

6.91

6.60

2.73

8.12

6.80

6.76

9.67

4.86

Mac

11.23

6.26

10.28

6.37

7.90

7.96

8.96

6.91

2.94

4.83

3.83

8.20

7.84

5.29

6.80

10.91

5.59

3.34

3.36

3.87

4.91

4.38

5.26

3.84

8.07

9.26

4.94

3.76

6.48

6.13

2.65

5.83

5.56

7.56

10.32

11.74

12.12

20.96

14.69

11.69

15.70

8.05

4.84

5.85

6.67

4.38

6.11

7.62

9.58

15.10

4.36

Apr

14.08

8.40

11.06

6.02

8.73

15.45

11.25

12.13

5.14

6.75

6.48

5.99

9.31

9.88

8.65

11.46

6.05

3.80

6.39

5.70

5.01

10.73

8.94

3.18

8.10

6.99

8.87

5.67

6.22

11.96

3.46

7.66

7.18

11.41

10.59

13.89

18.45

20.22

14.42

11.82

12.87

13.36

11.45

7.34

8.68

4.51

29.72

13.46

12.86

13.72

7.18

May

10.04

10.07

10.07

6.20

10.35

16.39

6.45

11.64

11.72

12.50

8.71

7.06

11.60

11.09

9.52

11.50

4.92

5.61

7.93

5.86

10.07

11.90

9.40

5.42

9.83

11.31

6.31

6.03

9.92

11.79

10.15

11.97

9.65

12.62

10.87

14.36

15.91

17.51

14.90

13.56

9.26

10.84

12.75

9.23

11.12

6.26

29.22

12.05

9.73

8.75

6.17

Jun

6.68

8.50

6.62

6.76

6.44

8.23

8.70

5.89

8.16

8.24

4.41

5.06

9.16

8.87

7.24

8.29

7.92

6.51

3.67

5.24

9.37

7.37

6.94

3.65

9.93

5.53

4.32

4.51

12.09

8.73

5.94

9.24

5.64

7.69

10.78

14.15

16.73

20.14

15.72

9.54

7.45

7.30

7.89

6.69

4.13

6.07

17.82

11.38

12.28

7.31

7.51

Jul

11.01

6.84

6.36

6.45

11.09

4.64

9.38

5.79

5.64

4.26

5.73

4.77

5.88

5.04

7.54

10.70

5.94

4.10

3.99

5.30

6.51

4.83

4.84

4.39

5.98

5.34

3.21

4.75

8.45

8.27

4.55

5.59

6.94

8.96

8.43

13.34

13.12

19.05

16.14

7.52

4.21

5.72

6.99

7.32

6.49

4.56

7.94

13.06

10.89

8.05

7.45

Aug

7.87

8.27

7.79

9.29

7.52

5.81

9.49

5.94

5.91

6.21

6.49

8.17

5.75

7.74

7.79

6.90

8.74

3.73

2.99

5.01

8.79

4.62

6.55

5.68

5.07

4.85

3.68

9.03

8.57

5.09

3.31

5.09

6.01

6.15

10.89

16.59

14.24

15.69

20.16

8.99

9.25

5.05

7.26

6.71

4.44

5.63

9.95

8.95

7.83

9.03

8.04

Sep

11.11

11.27

10.54

12.09

12.39

9.32

11.04

9.91

9.91

5.52

9.79

11.53

8.74

7.42

8.43

11.33

7.55

4.38

4.72

8.86

9.64

9.23

7.73

10.02

5.64

7.14

7.39

12.35

18.19

8.02

6.83

7.83

6.43

9.60

16.08

12.99

13.39

18.91

23.75

10.50

8.85

8.66

8.96

8.80

11.62

3.53

28.05

9.36

9.85

10.08

7.16

Oct

10.83

10.47

16.97

17.82

11.24

16.98

19.17

12.40

11.05

11.84

9.96

6.94

11.88

10.81

7.11

6.14

14.43

18.96

7.86

7.79

12.48

7.44

10.01

5.57

7.82

12.75

14.19

18.64

9.24

12.06

16.42

13.20

7.65

12.40

14.22

13.99

20.88

21.15

19.72

13.37

9.31

6.43

15.31

13.25

14.57

14.99

17.63

14.33

13.14

7.99

6.30

Nov

13.83

12.04

21.14

23.87

14.58

15.76

22.08

21.74

11.56

10.30

11.16

14.11

16.21

9.60

8.15

10.71

12.72

13.17

20.75

12.88

8.48

15.02

18.93

7.60

14.58

21.16

13.57

12.26

11.01

17.90

15.36

13.55

12.25

12.87

13.97

17.08

17.08

26.92

27.30

11.77

14.58

10.47

16.94

19.15

21.65

16.39

17.72

14.26

16.74

12.73

9.56

Dec

14.62

15.52

11.29

15.37

14.11

18.94

18.80

9.84

12.16

9.08

12.61

18.31

9.22

7.13

8.68

13.69

7.66

8.54

5.60

6.31

19.07

10.03

8.03

8.18

19.78

15.96

8.74

9.13

7.39

9.49

11.83

8.27

8.49

16.91

16.15

16.12

29.78

18.46

18.59

16.19

19.88

7.83

7.30

9.72

8.07

18.46

11.23

8.24

10.96

6.88

11.01


APPENDIX B

i

1960

1961

1962

1963

1964

1965

1966

1967

1968

1969

1970

1971

1972

1973

1974

1975

1976

1977

1978

1979

1980

1981

1982

1983

1984

1985

1986

1987

1988

1989

1990

1991

1992

1993

1994

1995

1996

1997

1998

1999

2000

2001

2002

2003

2004

2005

Mean

Jan

0.056

0.052

0.059

0.050

0.056

0.046

0.058

0.065

0.040

0.054

0.054

0.069

0.050

0.045

0.035

0.051

0.049

0.046

0.041

0.037

0.037

0.050

0.040

0.036

0.043

0.054

0.050

0.038

0.040

0.039

0.043

0.036

0.048

0.047

0.056

0.057

0.060

0.068

0.067

0.053

0.064

0.063

0.044

0.048

0.050

0.036

0.050

Feb

0.051

0.044

0.046

0.045

0.047

0.044

0.048

0.054

0.037

0.047

0.034

0.045

0.050

0.041

0.048

0.048

0.038

0.040

0.034

0.038

0.036

0.047

0.036

0.029

0.053

0.052

0.043

0.037

0.049

0.031

0.032

0.034

0.045

0.050

0.055

0.053

0.058

0.073

0.067

0.054

0.059

0.053

0.038

0.047

0.046

0.030

0.045

Mac

0.058

0.045

0.056

0.045

0.050

0.050

0.052

0.047

0.031

0.040

0.036

0.050

0.049

0.041

0.046

0.057

0.042

0.033

0.033

0.036

0.040

0.038

0.041

0.036

0.050

0.053

0.040

0.035

0.045

0.044

0.030

0.043

0.042

0.049

0.056

0.059

0.060

0.075

0.065

0.059

0.067

0.050

0.040

0.043

0.046

0.038

0.047

Apr

0.064

0.051

0.057

0.044

0.052

0.066

0.058

0.060

0.041

0.046

0.045

0.044

0.053

0.055

0.052

0.058

0.044

0.035

0.045

0.043

0.040

0.057

0.052

0.033

0.050

0.047

0.052

0.043

0.045

0.059

0.034

0.049

0.047

0.058

0.056

0.063

0.071

0.074

0.064

0.059

0.061

0.062

0.058

0.048

0.052

0.038

0.052

May

0.055

0.055

0.055

0.044

0.056

0.068

0.045

0.059

0.059

0.060

0.052

0.047

0.059

0.057

0.054

0.058

0.040

0.042

0.050

0.043

0.055

0.059

0.053

0.042

0.055

0.058

0.045

0.044

0.055

0.059

0.055

0.059

0.054

0.061

0.057

0.064

0.067

0.070

0.065

0.063

0.053

0.057

0.061

0.053

0.058

0.045

0.055

Jun

0.046

0.051

0.046

0.046

0.045

0.050

0.052

0.043

0.050

0.050

0.038

0.041

0.053

0.052

0.048

0.051

0.050

0.045

0.035

0.041

0.053

0.048

0.047

0.035

0.055

0.042

0.038

0.038

0.060

0.052

0.044

0.053

0.043

0.049

0.057

0.064

0.068

0.074

0.067

0.054

0.048

0.048

0.050

0.046

0.037

0.044

0.049

Jul

0.057

0.046

0.045

0.045

0.057

0.039

0.053

0.043

0.043

0.037

0.043

0.039

0.043

0.040

0.049

0.057

0.044

0.037

0.036

0.041

0.045

0.040

0.040

0.038

0.044

0.042

0.033

0.039

0.051

0.051

0.039

0.042

0.047

0.052

0.051

0.062

0.062

0.072

0.067

0.048

0.037

0.043

0.047

0.048

0.045

0.039

0.046

Aug

0.049

0.051

0.049

0.053

0.048

0.043

0.054

0.044

0.043

0.044

0.045

0.050

0.043

0.049

0.049

0.047

0.052

0.035

0.032

0.040

0.052

0.039

0.046

0.043

0.041

0.040

0.035

0.053

0.051

0.041

0.033

0.041

0.044

0.044

0.057

0.068

0.064

0.067

0.074

0.052

0.053

0.040

0.048

0.046

0.038

0.043

0.047

Sep

0.058

0.058

0.056

0.060

0.060

0.053

0.057

0.055

0.055

0.042

0.054

0.058

0.052

0.048

0.051

0.058

0.049

0.038

0.039

0.052

0.054

0.053

0.049

0.055

0.043

0.047

0.048

0.060

0.071

0.050

0.046

0.049

0.045

0.054

0.067

0.061

0.062

0.072

0.079

0.056

0.052

0.052

0.052

0.052

0.059

0.034

0.054

Oct

0.057

0.056

0.069

0.070

0.058

0.069

0.072

0.060

0.057

0.059

0.055

0.047

0.059

0.057

0.047

0.044

0.064

0.072

0.049

0.049

0.060

0.048

0.055

0.042

0.049

0.061

0.064

0.071

0.053

0.060

0.068

0.062

0.049

0.060

0.064

0.063

0.075

0.075

0.073

0.062

0.053

0.045

0.066

0.062

0.065

0.065

0.060

Nov

0.063

0.060

0.075

0.079

0.065

0.067

0.077

0.076

0.058

0.056

0.058

0.064

0.067

0.054

0.050

0.057

0.061

0.062

0.075

0.061

0.051

0.065

0.072

0.049

0.065

0.075

0.063

0.060

0.057

0.070

0.066

0.063

0.060

0.061

0.063

0.069

0.069

0.083

0.083

0.059

0.065

0.056

0.069

0.072

0.076

0.068

0.065

Dec

0.065

0.066

0.058

0.066

0.064

0.072

0.072

0.055

0.060

0.053

0.061

0.071

0.053

0.047

0.052

0.063

0.049

0.051

0.042

0.045

0.072

0.055

0.050

0.050

0.073

0.067

0.052

0.053

0.048

0.054

0.059

0.051

0.051

0.069

0.067

0.067

0.086

0.071

0.071

0.067

0.073

0.049

0.048

0.054

0.050

0.071

0.060


APPENDIX C

i

RAND ( )

erf -1

ti,j

Jan-06

Feb-06

Mar-06

Apr-06

May-06

Jun-06

Jul-06

Aug-06

Sep-06

Oct-06

Nov-06

Dec-06

Jan-07

Feb-07

Mar-07

Apr-07

May-07

Jun-07

Jul-07

Aug-07

Sep-07

Oct-07

Nov-07

Dec-07

Jan-08

Feb-08

Mar-08

Apr-08

May-08

Jun-08

Jul-08

Aug-08

Sep-08

Oct-08

Nov-08

Dec-08

Jan-09

Feb-09

Mar-09

Apr-09

May-09

Jun-09

Jul-09

Aug-09

Sep-09

Oct-09

Nov-09

Dec-09

Jan-10

Feb-10

Mar-10

Apr-10

May-10

Jun-10

Jul-10

Aug-10

Sep-10

Oct-10

Nov-10

Dec-10

0.699645

0.45481

0.063732

0.224711

0.236038

0.471912

0.999341

0.533139

0.095672

0.044674

0.997494

0.407816

0.656401

0.32176

0.733219

0.724521

0.401592

0.010641

0.096817

0.516508

0.053638

0.222905

0.612597

0.663435

0.143889

0.070315

0.523247

0.919276

0.705168

0.237308

0.877403

0.425101

0.402188

0.338947

0.687608

0.014286

0.684203

0.305343

0.627906

0.641724

0.751243

0.729118

0.289185

0.954236

0.428914

0.264273

0.687481

0.765445

0.846072

0.27472

0.555255

0.800866

0.779092

0.847218

0.420992

0.996074

0.600695

0.32158

0.630127

0.323203

0.399289

-0.090379

-0.872536

-0.550577

-0.527923

-0.056176

0.998683

0.066278

-0.808656

-0.910651

0.994989

-0.184368

0.312802

-0.35648

0.466438

0.449041

-0.196816

-0.978717

-0.806366

0.033016

-0.892724

-0.554191

0.225195

0.32687

-0.712222

-0.85937

0.046495

0.838551

0.410335

-0.525384

0.754806

-0.149797

-0.195624

-0.322107

0.375216

-0.971427

0.368406

-0.389314

0.255813

0.283447

0.502486

0.458237

-0.421629

0.908473

-0.142173

-0.471453

0.374963

0.530889

0.692144

-0.45056

0.110509

0.601733

0.558183

0.694435

-0.158017

0.992148

0.20139

-0.35684

0.260254

-0.353593

0.370085

-0.08027

-1.0558

-0.53482

-0.50847

-0.04983

1.443813

0.058805

-0.91763

-1.15355

1.429319

-0.16487

0.284724

-0.32724

0.440226

0.421663

-0.17623

-1.36824

-0.91316

0.029268

-1.1059

-0.53909

0.2023

0.298297

-0.75074

-1.02497

0.041228

0.978848

0.381358

-0.50556

0.819547

-0.13354

-0.17514

-0.29369

0.345832

-1.34224

0.339046

-0.35998

0.230738

0.256729

0.479699

0.431438

-0.39299

1.147587

-0.12667

-0.44563

0.34558

0.511878

0.720449

-0.42327

0.098252

0.597223

0.543827

0.723842

-0.14097

1.418338

0.180416

-0.32759

0.234893

-0.32439

1.523379

0.886483

-0.49313

0.243657

0.280915

0.929536

3.041859

1.083163

-0.29772

-0.63136

3.021363

0.766834

1.402661

0.537217

1.622573

1.596322

0.750771

-0.93498

-0.2914

1.041391

-0.56398

0.237618

1.286095

1.421856

-0.0617

-0.44952

1.058306

2.384299

1.539321

0.28503

2.159015

0.81114

0.752312

0.584661

1.489081

-0.89822

1.479484

0.490905

1.326313

1.36307

1.678397

1.610146

0.444235

2.622933

0.820859

0.369778

1.488724

1.723905

2.018868

0.401403

1.138949

1.8446

1.769087

2.023667

0.800643

3.005833

1.255146

0.536714

1.332189

0.541241


APPENDIX D

Month, i

Jan-06

Feb-06

Mar-06

Apr-06

May-06

Jun-06

Jul-06

Aug-06

Sep-06

Oct-06

Nov-06

Dec-06

Jan-07

Feb-07

Mar-07

Apr-07

May-07

Jun-07

Jul-07

Aug-07

Sep-07

Oct-07

Nov-07

Dec-07

Jan-08

Feb-08

Mar-08

Apr-08

May-08

Jun-08

Jul-08

Aug-08

Sep-08

Oct-08

Nov-08

Dec-08

Jan-09

Feb-09

Mar-09

Apr-09

May-09

Jun-09

Jul-09

Aug-09

Sep-09

Oct-09

Nov-09

Dec-09

Jan-10

Feb-10

Mar-10

Apr-10

May-10

Jun-10

Jul-10

Aug-10

Sep-10

Oct-10

Nov-10

Dec-10

Deterministic Component

Random Component

Model Flow

qi-1,j-1

qj+bj(qi-1,j-1-qj-1)

ti,j

Sjti,j(1-rj2)

qi,j (Log)

0.050

0.063

0.053

0.043

0.054

0.057

0.055

0.068

0.055

0.052

0.055

0.089

0.067

0.062

0.050

0.060

0.066

0.060

0.042

0.044

0.055

0.049

0.062

0.075

0.073

0.049

0.042

0.055

0.073

0.065

0.051

0.062

0.053

0.060

0.064

0.077

0.051

0.062

0.049

0.057

0.064

0.066

0.060

0.049

0.066

0.060

0.063

0.077

0.076

0.067

0.049

0.056

0.068

0.067

0.063

0.052

0.069

0.064

0.064

0.076

0.049541669

0.045386033

0.04653475

0.051865643

0.054889433

0.048803168

0.046082272

0.04726108

0.053859993

0.059642058

0.065034911

0.059661808

0.049559131

0.045384746

0.046529892

0.051883571

0.054896185

0.04880782

0.046063986

0.047223647

0.053859658

0.059640286

0.065039039

0.059650993

0.049565308

0.04536888

0.046516145

0.051878774

0.05490011

0.048815617

0.046075966

0.047251163

0.053858046

0.059649041

0.065040692

0.059652259

0.049543394

0.045385559

0.046529249

0.051881059

0.054895021

0.048816984

0.046088968

0.047231941

0.053870927

0.059649508

0.065039672

0.059652256

0.049568162

0.045391438

0.046528015

0.05187947

0.05489742

0.048817884

0.046093027

0.047235947

0.053873657

0.0596524

0.065040467

0.059651281

1.523379

0.886483

-0.49313

0.243657

0.280915

0.929536

3.041859

1.083163

-0.29772

-0.63136

3.021363

0.766834

1.402661

0.537217

1.622573

1.596322

0.750771

-0.93498

-0.2914

1.041391

-0.56398

0.237618

1.286095

1.421856

-0.0617

-0.44952

1.058306

2.384299

1.539321

0.28503

2.159015

0.81114

0.752312

0.584661

1.489081

-0.89822

1.479484

0.490905

1.326313

1.36307

1.678397

1.610146

0.444235

2.622933

0.820859

0.369778

1.488724

1.723905

2.018868

0.401403

1.138949

1.8446

1.769087

2.023667

0.800643

3.005833

1.255146

0.536714

1.332189

0.541241

0.013

0.007

-0.004

0.002

0.002

0.007

0.022

0.008

-0.002

-0.005

0.024

0.007

0.012

0.004

0.013

0.014

0.005

-0.007

-0.002

0.007

-0.004

0.002

0.010

0.013

-0.001

-0.004

0.009

0.021

0.011

0.002

0.015

0.006

0.006

0.005

0.012

-0.008

0.013

0.004

0.011

0.012

0.012

0.012

0.003

0.019

0.006

0.003

0.012

0.016

0.017

0.003

0.009

0.016

0.012

0.014

0.006

0.021

0.010

0.004

0.011

0.005

0.063

0.053

0.043

0.054

0.057

0.055

0.068

0.055

0.052

0.055

0.089

0.067

0.062

0.050

0.060

0.066

0.060

0.042

0.044

0.055

0.049

0.062

0.075

0.073

0.049

0.042

0.055

0.073

0.065

0.051

0.062

0.053

0.060

0.064

0.077

0.051

0.062

0.049

0.057

0.064

0.066

0.060

0.049

0.066

0.060

0.063

0.077

0.076

0.067

0.049

0.056

0.068

0.067

0.063

0.052

0.069

0.064

0.064

0.076

0.065


APPENDIX E

| Month (i) | Actual Flow (m³/s) | Model Flow (m³/s) | MAPE | RMSE | Chi-square Test |
|---|---|---|---|---|---|
| Jan-06 | 13.08 | 13.533 | 3.462 | 0.205001 | 0.015148 |
| Feb-06 | 8.12 | 9.077 | 11.786 | 0.915831 | 0.100896 |
| Mar-06 | 6.11 | 5.641 | 7.681 | 0.220272 | 0.039051 |
| Apr-06 | 29.72 | 9.604 | 67.685 | 404.6583 | 42.13488 |
| May-06 | 29.22 | 10.807 | 63.015 | 339.0362 | 31.37171 |
| Jun-06 | 17.82 | 10.210 | 42.706 | 57.91601 | 5.672621 |
| Jul-06 | 7.94 | 16.422 | 106.822 | 71.93914 | 4.380738 |
| Aug-06 | 9.95 | 10.014 | 0.644 | 0.00411 | 0.00041 |
| Sep-06 | 28.05 | 8.642 | 69.192 | 376.688 | 43.59034 |
| Oct-06 | 17.63 | 9.821 | 44.298 | 61.00476 | 6.211446 |
| Nov-06 | 17.72 | 32.326 | 82.432 | 213.3443 | 6.599874 |
| Dec-06 | 11.23 | 15.849 | 41.109 | 21.31845 | 1.345112 |
| Jan-07 | 9.05 | 13.020 | 43.872 | 15.7641 | 1.210723 |
| Feb-07 | 6.80 | 7.992 | 17.535 | 1.421735 | 0.177887 |
| Mar-07 | 7.62 | 12.065 | 58.336 | 19.76006 | 1.637769 |
| Apr-07 | 13.46 | 15.262 | 13.384 | 3.245474 | 0.212657 |
| May-07 | 12.05 | 12.299 | 2.069 | 0.06214 | 0.005052 |
| Jun-07 | 11.38 | 5.514 | 51.550 | 34.41474 | 6.241801 |
| Jul-07 | 13.06 | 6.059 | 53.607 | 49.01556 | 8.089859 |
| Aug-07 | 8.95 | 9.874 | 10.339 | 0.855987 | 0.086689 |
| Sep-07 | 9.36 | 7.877 | 15.846 | 2.199906 | 0.27929 |
| Oct-07 | 14.33 | 13.038 | 9.013 | 1.668293 | 0.127953 |
| Nov-07 | 14.26 | 21.161 | 48.391 | 47.61862 | 2.250341 |
| Dec-07 | 8.24 | 19.599 | 137.858 | 129.0379 | 6.58374 |
| Jan-08 | 11.29 | 7.719 | 31.625 | 12.74858 | 1.651481 |
| Feb-08 | 6.76 | 5.384 | 20.351 | 1.89258 | 0.3515 |
| Mar-08 | 9.58 | 10.032 | 4.721 | 0.20453 | 0.020387 |
| Apr-08 | 12.86 | 19.404 | 50.886 | 42.82308 | 2.206928 |
| May-08 | 9.73 | 15.098 | 55.171 | 28.81687 | 1.908638 |
| Jun-08 | 12.28 | 8.379 | 31.768 | 15.21902 | 1.816363 |
| Jul-08 | 10.89 | 13.007 | 19.439 | 4.481376 | 0.344538 |
| Aug-08 | 7.83 | 9.219 | 17.734 | 1.928084 | 0.209153 |
| Sep-08 | 9.85 | 12.127 | 23.119 | 5.185165 | 0.427585 |
| Oct-08 | 13.14 | 14.502 | 10.398 | 1.865887 | 0.12866 |
| Nov-08 | 16.74 | 22.302 | 33.224 | 30.93221 | 1.386991 |
| Dec-08 | 10.96 | 8.531 | 22.161 | 5.899487 | 0.691526 |
| Jan-09 | 9.73 | 13.343 | 37.128 | 13.05078 | 0.97813 |
| Feb-09 | 9.67 | 7.856 | 18.764 | 3.292187 | 0.41909 |
| Mar-09 | 15.10 | 10.970 | 27.352 | 17.05788 | 1.554974 |
| Apr-09 | 13.72 | 14.160 | 3.205 | 0.193326 | 0.013653 |
| May-09 | 8.75 | 15.629 | 78.619 | 47.32323 | 3.027875 |
| Jun-09 | 7.31 | 12.423 | 69.942 | 26.14056 | 2.104243 |
| Jul-09 | 8.05 | 7.800 | 3.111 | 0.062732 | 0.008043 |
| Aug-09 | 9.03 | 15.337 | 69.845 | 39.77884 | 2.593644 |
| Sep-09 | 10.08 | 12.388 | 22.897 | 5.327024 | 0.430014 |
| Oct-09 | 7.99 | 13.587 | 70.046 | 31.32252 | 2.305389 |
| Nov-09 | 12.73 | 22.299 | 75.168 | 91.56381 | 4.106203 |
| Dec-09 | 6.88 | 21.522 | 212.820 | 214.3895 | 9.961389 |
| Jan-10 | 6.83 | 15.834 | 131.827 | 81.06853 | 5.119965 |
| Feb-10 | 4.86 | 7.597 | 56.317 | 7.491085 | 0.98606 |
| Mar-10 | 4.36 | 10.312 | 136.513 | 35.42605 | 3.435427 |
| Apr-10 | 7.18 | 16.493 | 129.701 | 86.72375 | 5.258356 |
| May-10 | 6.17 | 15.985 | 159.084 | 96.34386 | 6.026957 |
| Jun-10 | 7.51 | 13.908 | 85.191 | 40.93266 | 2.943131 |
| Jul-10 | 7.45 | 8.744 | 17.405 | 1.680274 | 0.192169 |
| Aug-10 | 8.04 | 16.912 | 110.346 | 78.70912 | 4.65409 |
| Sep-10 | 7.16 | 14.091 | 96.798 | 48.03522 | 3.408991 |
| Oct-10 | 6.30 | 14.296 | 126.927 | 63.94217 | 4.472611 |
| Nov-10 | 9.56 | 21.417 | 124.026 | 140.5846 | 6.564209 |
| Dec-10 | 11.01 | 14.672 | 33.261 | 13.41047 | 0.914016 |
| Overall | | | 53.659 | 7.29 | 250.9884 |
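The per-month entries in the MAPE, RMSE and Chi-square Test columns of Appendix E can be reproduced with the short Python sketch below. The formulas are an assumption inferred from the tabulated values, since the appendix does not state them: the MAPE entry appears to be the absolute percentage error, the RMSE entry the squared error, and the chi-square entry the squared error divided by the model flow. The function names `error_terms` and `summarize` are hypothetical. Results differ slightly from the table in the last digits because the printed model flows are rounded to three decimals.

```python
import math


def error_terms(actual, model):
    """Per-month terms (formulas inferred from the table, not stated in it)."""
    ape = abs(actual - model) / actual * 100.0   # absolute percentage error, %
    squared_error = (actual - model) ** 2        # squared error, (m3/s)^2
    chi_term = squared_error / model             # chi-square contribution
    return ape, squared_error, chi_term


def summarize(actuals, models):
    """Aggregate the per-month terms into MAPE, RMSE and a chi-square sum."""
    terms = [error_terms(a, m) for a, m in zip(actuals, models)]
    n = len(terms)
    mape = sum(t[0] for t in terms) / n
    rmse = math.sqrt(sum(t[1] for t in terms) / n)
    chi_square = sum(t[2] for t in terms)
    return mape, rmse, chi_square


# First three months of Appendix E (Jan-06 to Mar-06).
actual_flow = [13.08, 8.12, 6.11]
model_flow = [13.533, 9.077, 5.641]

for a, m in zip(actual_flow, model_flow):
    ape, se, chi = error_terms(a, m)
    print(f"APE={ape:.3f}%  SE={se:.6f}  chi={chi:.6f}")

mape, rmse, chi_square = summarize(actual_flow, model_flow)
print(f"MAPE={mape:.3f}%  RMSE={rmse:.3f}  chi-square={chi_square:.4f}")
```

Applied to all 60 months, the same aggregation (mean of the percentage errors, square root of the mean squared error, and sum of the chi-square terms) is consistent with the final row of the table.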


APPENDIX F

| Month (i) | Actual Flow (m³/s) | Model Flow (m³/s) | Residual | Fit |
|---|---|---|---|---|
| Jan-06 | 13.08 | 9.6732 | * | * |
| Feb-06 | 8.12 | 7.1884 | * | * |
| Mar-06 | 6.11 | 7.2612 | * | * |
| Apr-06 | 29.72 | 9.0165 | * | * |
| May-06 | 29.22 | 9.9281 | * | * |
| Jun-06 | 17.82 | 7.6110 | * | * |
| Jul-06 | 7.94 | 6.7046 | * | * |
| Aug-06 | 9.95 | 7.0851 | * | * |
| Sep-06 | 28.05 | 9.5168 | * | * |
| Oct-06 | 17.63 | 12.2889 | * | * |
| Nov-06 | 17.72 | 15.2005 | * | * |
| Dec-06 | 11.23 | 12.3581 | * | * |
| Jan-07 | 9.05 | 7.9227 | * | * |
| Feb-07 | 6.80 | 6.6970 | -1.57988 | 7.5299 |
| Mar-07 | 7.62 | 7.1341 | -1.39072 | 7.6507 |
| Apr-07 | 13.46 | 8.9949 | -1.05700 | 9.4570 |
| May-07 | 12.05 | 9.9369 | -0.14946 | 10.2195 |
| Jun-07 | 11.38 | 7.6286 | 1.10867 | 7.3913 |
| Jul-07 | 13.06 | 6.7248 | -1.04180 | 7.8818 |
| Aug-07 | 8.95 | 7.1060 | 1.04920 | 7.2208 |
| Sep-07 | 9.36 | 9.5379 | 0.26505 | 11.0050 |
| Oct-07 | 14.33 | 12.3101 | -2.99026 | 13.4603 |
| Nov-07 | 14.26 | 15.2217 | -3.58500 | 15.6250 |
| Dec-07 | 8.24 | 12.3794 | 4.03841 | 11.4816 |
| Jan-08 | 11.29 | 7.9439 | 1.99786 | 9.9828 |
| Feb-08 | 6.76 | 6.7182 | -1.67458 | 8.3305 |
| Mar-08 | 9.58 | 7.1553 | 2.57792 | 7.7005 |
| Apr-08 | 12.86 | 9.0161 | 0.10621 | 10.9538 |
| May-08 | 9.73 | 9.9581 | -1.42906 | 11.4991 |
| Jun-08 | 12.28 | 7.6499 | -1.18154 | 7.8015 |
| Jul-08 | 10.89 | 6.7460 | -1.02019 | 7.3802 |
| Aug-08 | 7.83 | 7.1273 | 0.57523 | 7.2148 |
| Sep-08 | 9.85 | 9.5592 | -0.37209 | 10.9121 |
| Oct-08 | 13.14 | 12.3314 | 3.89633 | 13.0737 |
| Nov-08 | 16.74 | 15.2429 | 3.01737 | 18.1226 |
| Dec-08 | 10.96 | 12.4006 | -4.56349 | 15.8535 |
| Jan-09 | 9.73 | 7.9651 | -1.32522 | 9.3852 |
| Feb-09 | 9.67 | 6.7394 | -0.91615 | 7.2661 |
| Mar-09 | 15.10 | 7.1765 | -1.58516 | 7.9552 |
| Apr-09 | 13.72 | 9.0373 | -3.54478 | 9.5648 |
| May-09 | 8.75 | 9.9794 | -3.07188 | 9.2719 |
| Jun-09 | 7.31 | 7.6711 | 1.04294 | 5.7171 |
| Jul-09 | 8.05 | 6.7673 | -0.27357 | 6.7266 |
| Aug-09 | 9.03 | 7.1485 | 2.58702 | 6.7039 |
| Sep-09 | 10.08 | 9.5804 | 1.07675 | 11.0133 |
| Oct-09 | 7.99 | 12.3526 | 4.26644 | 13.5536 |
| Nov-09 | 12.73 | 15.2642 | 5.43986 | 18.4266 |
| Dec-09 | 6.88 | 12.4218 | -1.30056 | 16.6716 |
| Jan-10 | 6.83 | 7.9864 | -0.79690 | 11.1109 |
| Feb-10 | 4.86 | 6.7607 | -1.45611 | 8.5383 |
| Mar-10 | 4.36 | 7.1978 | -0.78662 | 8.6866 |
| Apr-10 | 7.18 | 9.0586 | -1.79769 | 10.5277 |
| May-10 | 6.17 | 10.0006 | -0.43999 | 10.7900 |
| Jun-10 | 7.51 | 7.6923 | -1.69829 | 10.7900 |
| Jul-10 | 7.45 | 6.7885 | 3.62122 | 7.4688 |
| Aug-10 | 8.04 | 7.1698 | -1.95910 | 9.4791 |
| Sep-10 | 7.16 | 9.6017 | 1.06042 | 11.3296 |
| Oct-10 | 6.30 | 12.3739 | -3.37562 | 14.6156 |
| Nov-10 | 9.56 | 15.2854 | -2.06698 | 16.6470 |
| Dec-10 | 11.01 | 12.4431 | 1.18330 | 12.9267 |

Coefficient: 0.289364; 0.878761; 0.955283


APPENDIX G

| Month (i) | Actual Flow (m³/s) | Model Flow (m³/s) | MAPE | RMSE | Chi-square Test |
|---|---|---|---|---|---|
| Jan-06 | 13.08 | 9.6732 | 26.046 | 11.606 | 1.200 |
| Feb-06 | 8.12 | 7.1884 | 11.473 | 0.868 | 0.121 |
| Mar-06 | 6.11 | 7.2612 | 18.841 | 1.325 | 0.183 |
| Apr-06 | 29.72 | 9.0165 | 69.662 | 428.633 | 47.538 |
| May-06 | 29.22 | 9.9281 | 66.023 | 372.178 | 37.487 |
| Jun-06 | 17.82 | 7.6110 | 57.290 | 104.224 | 13.694 |
| Jul-06 | 7.94 | 6.7046 | 15.559 | 1.526 | 0.228 |
| Aug-06 | 9.95 | 7.0851 | 28.793 | 8.208 | 1.158 |
| Sep-06 | 28.05 | 9.5168 | 66.072 | 343.480 | 36.092 |
| Oct-06 | 17.63 | 12.2889 | 30.303 | 28.547 | 2.323 |
| Nov-06 | 17.72 | 15.2005 | 14.215 | 6.344 | 0.417 |
| Dec-06 | 11.23 | 12.3581 | 10.029 | 1.269 | 0.103 |
| Jan-07 | 9.05 | 7.9227 | 12.457 | 1.271 | 0.160 |
| Feb-07 | 6.80 | 6.6970 | 1.515 | 0.011 | 0.002 |
| Mar-07 | 7.62 | 7.1341 | 6.377 | 0.236 | 0.033 |
| Apr-07 | 13.46 | 8.9949 | 33.173 | 19.937 | 2.217 |
| May-07 | 12.05 | 9.9369 | 17.536 | 4.465 | 0.449 |
| Jun-07 | 11.38 | 7.6286 | 32.965 | 14.073 | 1.845 |
| Jul-07 | 13.06 | 6.7248 | 48.508 | 40.135 | 5.968 |
| Aug-07 | 8.95 | 7.1060 | 20.594 | 3.397 | 0.478 |
| Sep-07 | 9.36 | 9.5379 | 1.901 | 0.032 | 0.003 |
| Oct-07 | 14.33 | 12.3101 | 14.095 | 4.080 | 0.331 |
| Nov-07 | 14.26 | 15.2217 | 6.744 | 0.925 | 0.061 |
| Dec-07 | 8.24 | 12.3794 | 50.235 | 17.134 | 1.384 |
| Jan-08 | 11.29 | 7.9439 | 29.638 | 11.196 | 1.409 |
| Feb-08 | 6.76 | 6.7182 | 0.618 | 0.002 | 0.000 |
| Mar-08 | 9.58 | 7.1553 | 25.310 | 5.879 | 0.822 |
| Apr-08 | 12.86 | 9.0161 | 29.890 | 14.776 | 1.639 |
| May-08 | 9.73 | 9.9581 | 2.345 | 0.052 | 0.005 |
| Jun-08 | 12.28 | 7.6499 | 37.705 | 21.438 | 2.802 |
| Jul-08 | 10.89 | 6.7460 | 38.053 | 17.172 | 2.546 |
| Aug-08 | 7.83 | 7.1273 | 8.975 | 0.494 | 0.069 |
| Sep-08 | 9.85 | 9.5592 | 2.948 | 0.084 | 0.009 |
| Oct-08 | 13.14 | 12.3314 | 6.129 | 0.648 | 0.053 |
| Nov-08 | 16.74 | 15.2429 | 8.943 | 2.241 | 0.147 |
| Dec-08 | 10.96 | 12.4006 | 13.144 | 2.075 | 0.167 |
| Jan-09 | 9.73 | 7.9651 | 18.138 | 3.115 | 0.391 |
| Feb-09 | 9.67 | 6.7394 | 30.306 | 8.588 | 1.274 |
| Mar-09 | 15.10 | 7.1765 | 52.473 | 62.781 | 8.748 |
| Apr-09 | 13.72 | 9.0373 | 34.130 | 21.927 | 2.426 |
| May-09 | 8.75 | 9.9794 | 14.050 | 1.511 | 0.151 |
| Jun-09 | 7.31 | 7.6711 | 4.940 | 0.130 | 0.017 |
| Jul-09 | 8.05 | 6.7673 | 15.934 | 1.645 | 0.243 |
| Aug-09 | 9.03 | 7.1485 | 20.836 | 3.540 | 0.495 |
| Sep-09 | 10.08 | 9.5804 | 4.956 | 0.250 | 0.026 |
| Oct-09 | 7.99 | 12.3526 | 54.601 | 19.032 | 1.541 |
| Nov-09 | 12.73 | 15.2642 | 19.907 | 6.422 | 0.421 |
| Dec-09 | 6.88 | 12.4218 | 80.550 | 30.712 | 2.472 |
| Jan-10 | 6.83 | 7.9864 | 16.931 | 1.337 | 0.167 |
| Feb-10 | 4.86 | 6.7607 | 39.109 | 3.613 | 0.534 |
| Mar-10 | 4.36 | 7.1978 | 65.087 | 8.053 | 1.119 |
| Apr-10 | 7.18 | 9.0586 | 26.164 | 3.529 | 0.390 |
| May-10 | 6.17 | 10.0006 | 62.085 | 14.674 | 1.467 |
| Jun-10 | 7.51 | 7.6923 | 2.428 | 0.033 | 0.004 |
| Jul-10 | 7.45 | 6.7885 | 8.848 | 0.434 | 0.064 |
| Aug-10 | 8.04 | 7.1698 | 10.824 | 0.757 | 0.106 |
| Sep-10 | 7.16 | 9.6017 | 34.102 | 5.962 | 0.621 |
| Oct-10 | 6.30 | 12.3739 | 96.411 | 36.892 | 2.981 |
| Nov-10 | 9.56 | 15.2854 | 59.889 | 32.780 | 2.145 |
| Dec-10 | 11.01 | 12.4431 | 13.016 | 2.054 | 0.165 |
| Overall | | | 27.497 | 5.416 | 191.114 |
