ARIMA Model Building

serensweekly.blogspot.com/2015/07/arima-model-building.html

Dear Data Enthusiast,

In this entry I will introduce ARIMA model building. ARIMA is one of
the tools for building time series models.

ARIMA has three parameters (P, D, Q).

P indicates the number of autoregressive terms.
D indicates the number of nonseasonal differences needed for stationarity.
Q indicates the number of lagged forecast errors in the prediction equation.
In practice, P, D and Q usually take the values 0, 1 or 2.
We find the best parameters by trying out models such as the ones below.

ARIMA(P,D,Q)
ARIMA(1,0,0) is a first-order autoregressive model.
ARIMA(2,0,0) is a second-order autoregressive model.
ARIMA(0,1,0) is a random walk model.
ARIMA(1,1,0) is a differenced first-order autoregressive model.
ARIMA(0,1,1) is a simple exponential smoothing model.
ARIMA(0,2,1) is a linear exponential smoothing model.
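
As a rough illustration of two entries in this list, the sketch below simulates a first-order autoregressive series with arima.sim() and fits ARIMA(1,0,0) to it, then fits ARIMA(0,1,0) to the log of R's built-in AirPassengers data. The seed, the coefficient 0.7, the sample size and the choice of AirPassengers are assumptions made only for this example.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# ARIMA(1,0,0): simulate a first-order autoregressive series and refit it
x <- arima.sim(model = list(ar = 0.7), n = 500)
fit_ar1 <- arima(x, order = c(1, 0, 0))
ar1_hat <- unname(coef(fit_ar1)["ar1"])  # estimate should land near 0.7

# ARIMA(0,1,0): a random walk has no coefficients to estimate, and its
# forecast for every future point is simply the last observed value
y <- log(AirPassengers)
fit_rw <- arima(y, order = c(0, 1, 0))
rw_forecast <- predict(fit_rw, n.ahead = 3)$pred
last_value <- y[length(y)]

ar1_hat
rw_forecast
last_value
```

The random walk forecast being flat is what "ARIMA(0,1,0) is a random walk model" means in practice: the best guess for tomorrow is today's value.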

A model with no differencing (D = 0) assumes that the original series is
stationary (mean reverting). It normally includes a constant term, which represents the mean of the series.

A model with one order of differencing (D = 1) assumes that the original series has a constant
average trend (like the random walk model). A constant term is included if the series has a
non-zero average trend.

A model with two orders of differencing (D = 2) assumes that the original series has a time-varying
trend. It normally does not include a constant term. Note that the constant term is a separate
choice from P, D and Q.
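
One way to see the constant term at work with D = 1 is sketched below. R's arima() ignores the mean once the series is differenced, so this sketch passes a time index through xreg to act as the drift (constant) term. Using AirPassengers and the variable name t_idx are assumptions for the example.

```r
y <- log(AirPassengers)

# Average month-to-month change of the series (the "average trend")
avg_change <- mean(diff(y))

# Random walk with drift: ARIMA(0,1,0) plus a linear time index as regressor.
# The regressor's coefficient plays the role of the constant term.
t_idx <- seq_along(y)
fit_drift <- arima(y, order = c(0, 1, 0), xreg = t_idx)
drift_hat <- unname(coef(fit_drift)[1])

c(avg_change = avg_change, drift_hat = drift_hat)
```

The two numbers should be very close, which is the sense in which the constant in a D = 1 model represents the average trend.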

Good starting values for the parameters can be found in R with ACF and PACF plots. If both the ACF
and PACF plots decrease only gradually, we need to make the time series stationary by differencing,
which means giving "D" a nonzero value.

From the ACF and PACF plots we estimate approximate parameter values.
However, we still need to explore more (P,D,Q) combinations. The combination with the
lowest AIC and BIC should be our choice.
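
The search over combinations can be sketched as a small grid in R. This is a minimal example, assuming the log of the built-in AirPassengers data and a fixed D = 1, since AIC and BIC values are only comparable between models fitted to the same differenced data. The helper name score is made up for the example.

```r
y <- log(AirPassengers)

# All (P, Q) pairs from 0 to 2, with D held fixed at 1
grid <- expand.grid(P = 0:2, D = 1, Q = 0:2)

# Fit one model and return its AIC and BIC (NA if the fit fails)
score <- function(p, d, q) {
  fit <- tryCatch(arima(y, order = c(p, d, q)), error = function(e) NULL)
  if (is.null(fit)) c(AIC = NA, BIC = NA) else c(AIC = AIC(fit), BIC = BIC(fit))
}

scores  <- t(mapply(score, grid$P, grid$D, grid$Q))
results <- cbind(grid, scores)

results[order(results$AIC), ]             # candidates, best AIC first
best <- results[which.min(results$AIC), ]
best
```

AIC and BIC do not always agree, because BIC penalises extra parameters more heavily. When they differ, the BIC choice is the more parsimonious model.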

Once we have the final parameters, we can make predictions on the future time series
points. We can also visualize the trends.

Tricks for ARIMA model building!

The ADF (Augmented Dickey-Fuller) test helps us decide whether our series is
stationary or non-stationary. There are two tricks to apply before the ADF test. First, we
stabilise unequal variances by taking the log of the series. Second, we difference the series
to address the trend component.
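
The effect of the two tricks can be checked with a few lines of base R. The sketch below assumes the built-in AirPassengers data and arbitrary split dates; it is only meant to show the direction of the effect, not a formal test.

```r
x <- AirPassengers  # built-in monthly series, used here as an assumption

# Trick 1: compare the spread of monthly changes early and late in the sample
early <- window(x, end = c(1952, 12))
late  <- window(x, start = c(1957, 1))
ratio_raw <- sd(diff(late)) / sd(diff(early))            # raw scale
ratio_log <- sd(diff(log(late))) / sd(diff(log(early)))  # log scale

# Trick 2: compare how much the mean shifts between the two halves of the
# sample, before and after differencing
y  <- log(x)
dy <- diff(y)
level_shift <- abs(mean(window(y, start = c(1955, 1))) -
                   mean(window(y, end = c(1954, 12))))
diff_shift  <- abs(mean(window(dy, start = c(1955, 1))) -
                   mean(window(dy, end = c(1954, 12))))

c(ratio_raw = ratio_raw, ratio_log = ratio_log,
  level_shift = level_shift, diff_shift = diff_shift)
```

On the raw scale the later changes are much more spread out than the early ones, while on the log scale the two spreads are far closer. Likewise the mean of the log series moves a lot between the two halves, but the mean of the differenced series barely moves.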

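ADF test (R code; adf.test() comes from the tseries package, and Series stands for the time series being modelled, e.g. the built-in AirPassengers data)

library(tseries)
adf.test(diff(log(Series)), alternative="stationary", k=0)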

If the result shows a p-value below the conventional 0.05 level, we reject the null hypothesis of a
unit root in favour of the alternative, so the differenced log series is stationary. Because one
difference was enough to reach stationarity, the D component of the ARIMA model should be 1.

ACF plot (R code)

If the autocorrelation function decays very slowly, the time series is not stationary.

acf(log(Series))

pacf(diff(log(Series)))

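ACF and PACF (partial ACF) plots can point to suitable (P,D,Q) values: a PACF that cuts off after a few lags suggests the value of P, and an ACF that cuts off after a few lags suggests the value of Q.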
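
The same information can be read off numerically rather than from the plots. The sketch below, assuming the built-in AirPassengers data, lists the lags at which the ACF and PACF of the differenced log series fall outside the approximate 95% bound that R draws as dashed lines.

```r
y <- diff(log(AirPassengers))  # differenced log series, as in the ADF step

a <- acf(y, lag.max = 24, plot = FALSE)
p <- pacf(y, lag.max = 24, plot = FALSE)

# Approximate 95% bound that R draws as dashed lines on the plots
bound <- qnorm(0.975) / sqrt(length(y))

# Lags are stored in years for a monthly series, so convert to months
acf_lags  <- round(as.numeric(a$lag) * 12)[-1]  # drop lag 0
acf_vals  <- as.numeric(a$acf)[-1]
pacf_lags <- round(as.numeric(p$lag) * 12)      # PACF starts at lag 1
pacf_vals <- as.numeric(p$acf)

sig_acf  <- acf_lags[abs(acf_vals) > bound]     # lags with notable ACF
sig_pacf <- pacf_lags[abs(pacf_vals) > bound]   # lags with notable PACF
sig_acf
sig_pacf
```

A spike at lag 12 in this output points to yearly seasonality, which is one reason to consider a seasonal component with period 12 for monthly data.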

Once we have decided on optimal parameters, we fit ARIMA model and predict the future
time series points. Then, we can visualise the future prediction based on training data.

Fitting a seasonal ARIMA model to monthly data (seasonal period of 12):

(fit <- arima(log(Series), order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12)))

10-year prediction:

pred <- predict(fit, n.ahead = 10*12)

Plotting the series together with the forecast (exp() undoes the log transform):

ts.plot(Series, exp(pred$pred), log = "y", lty = c(1,3))
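
The forecast can also be shown with an uncertainty band, since predict() returns standard errors alongside the point forecast. The sketch below refits the same model on the built-in AirPassengers data so that it runs on its own. The 1.96 multiplier assumes roughly normal forecast errors on the log scale, and the names point, lower and upper are made up for the example.

```r
y <- log(AirPassengers)
fit <- arima(y, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
pred <- predict(fit, n.ahead = 10 * 12)

# Build the band on the log scale, then back-transform with exp()
point <- exp(pred$pred)
lower <- exp(pred$pred - 1.96 * pred$se)
upper <- exp(pred$pred + 1.96 * pred$se)

# Observed series (solid), forecast (dotted) and approximate 95% band (dashed)
ts.plot(AirPassengers, point, lower, upper,
        log = "y", lty = c(1, 3, 2, 2))
```

The band widens as the horizon grows, which is a useful reminder that a 10-year forecast from 12 years of data is far less certain than the first few months of it.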

Warm regards..

REF: http://www.analyticsvidhya.com/blog/2015/03/framework-application-build-arima-model/
REF: http://people.duke.edu/~rnau/411arim.htm
