Time Series
Figure 1 Plot of the Time Series of National Government Expenditures (Jan. 2012 – Oct. 2016)
From the plot in Figure 1, it can be seen that the expenditures follow an increasing trend over the
past years. There also seems to be seasonal variation in the monthly expenditures: there is a peak
towards the middle of the calendar year (around May to June) and at the end of the year, and a trough
at the beginning of the year and just after the middle of the year (June to July). The seasonal
fluctuations and the random fluctuations both seem to be roughly constant in size over time, so the
time series could be described using an additive model.
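An additive model treats each observation as trend + seasonal effect + random component. The sketch below illustrates this with base R's decompose(); the series demo_ts and its monthly pattern are synthetic stand-ins chosen for illustration, not the expenditure data.

```r
# Synthetic stand-in series (assumption: NOT the actual expenditure data).
# Five full years of monthly values: linear trend + fixed monthly pattern.
monthly_pattern <- c(-8, -3, 1, 4, 9, 6, -7, -4, -2, 0, 1, 3)  # sums to zero
t_index <- 1:60
demo_ts <- ts(100 + 0.5 * t_index + rep(monthly_pattern, 5),
              start = c(2012, 1), frequency = 12)

# Additive decomposition: observed = trend + seasonal + random
dec <- decompose(demo_ts, type = "additive")

# Estimated seasonal effect for each calendar month (January to December)
seasonal_effects <- round(dec$figure, 2)
print(seasonal_effects)

# Months with the largest and smallest estimated seasonal effect
peak_month   <- month.abb[which.max(dec$figure)]
trough_month <- month.abb[which.min(dec$figure)]
print(c(peak = peak_month, trough = trough_month))
```

If the seasonal swings grew with the level of the series, type = "multiplicative" would be the more natural choice.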
Upon checking the boxplot of the time series data of the national government expenditures, no
outliers were found as shown in Figure 2.
Figure 2 Boxplot of the Time Series of the National Government Expenditures
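The boxplot check can also be done numerically. As a minimal sketch, boxplot.stats() applies the same 1.5 x IQR rule used to draw a boxplot and returns any flagged points; the vector values below is hypothetical and only illustrates the call.

```r
# Hypothetical values (assumption: for illustration only, not the expenditure data)
values <- c(12, 15, 14, 13, 16, 15, 14, 13, 15, 14, 40)

# boxplot.stats() uses the same 1.5 * IQR rule that boxplot() draws
stats_out <- boxplot.stats(values)
outliers  <- stats_out$out

print(outliers)         # points a boxplot would draw individually
length(outliers) == 0   # TRUE would correspond to "no outliers found"
```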
To generate an ARIMA model, the time series must be stationary. The R code below determines
the number of differences needed to achieve stationarity:
>ndiffs(PHts, alpha=0.05, test="adf", max.d=3)
The code returned zero, which means that no differencing is needed to make the time series
stationary. To formally check the stationarity of the time series, the Augmented Dickey-Fuller
test is used, with the R code given below:
>adf.test(PHts, alternative="stationary")
The resulting p-value of the test is less than 0.01. This implies that, at the 5% significance level,
there is sufficient evidence to reject the null hypothesis that the time series data of the national
government expenditure is not stationary. Another requirement for the time series data to be used in
the ARIMA model is that the data across periods must be correlated. This is checked with the
autocorrelation test in R, the Box-Ljung test, using the code below:
>Box.test(PHts, type="Ljung-Box")
With the resulting p-value of 0.008819 from the test, at the 5% significance level there is
sufficient evidence to reject the null hypothesis that the monthly expenditures in the time series data
are not autocorrelated. Since the time series data of the national government expenditures satisfy the
conditions above, the selection of an ARIMA model can proceed.
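To make the Box-Ljung statistic concrete, the sketch below computes it by hand and compares it with Box.test(). The series x is simulated (an assumption standing in for PHts) and ljung_box_q is a hypothetical helper name. Note that Box.test() defaults to lag = 1, so a call without a lag argument only examines the first autocorrelation.

```r
# Synthetic autocorrelated series (assumption: stand-in for PHts)
set.seed(1)
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = 58))

# Ljung-Box statistic: Q = n (n + 2) * sum_{k=1..h} r_k^2 / (n - k)
ljung_box_q <- function(x, lag = 1) {
  n <- length(x)
  r <- acf(x, lag.max = lag, plot = FALSE)$acf[-1]  # drop lag 0
  k <- seq_len(lag)
  n * (n + 2) * sum(r^2 / (n - k))
}

q_manual <- ljung_box_q(x, lag = 1)
p_manual <- pchisq(q_manual, df = 1, lower.tail = FALSE)

# Same test through the built-in function (lag = 1 is the default)
bt <- Box.test(x, lag = 1, type = "Ljung-Box")
print(c(manual = q_manual, builtin = as.numeric(bt$statistic)))
```

A larger lag (for example 12 for monthly data) would test several autocorrelations jointly; that choice is a suggestion, not something taken from the analysis above.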
IV. Methodology on selecting ARIMA Model
A. Selecting an ARIMA Model
Since the time series used in this study is already stationary (d = 0), the next step is to find an
appropriate ARIMA model. This means that appropriate values of p and q for an ARIMA(p,d,q) model
must be identified. This can be done by examining first the correlogram and then the partial correlogram
of the stationary time series. The R code below generates the correlogram and the partial correlogram
using the autocorrelation function (acf) and the partial autocorrelation function (pacf),
respectively:
>acf(PHts, lag.max=20)
>pacf(PHts, lag.max=20)
Figure 4 Partial correlogram of PHts
In Figures 3 and 4, the correlogram and the partial correlogram of the time series are shown,
respectively. From the correlogram, the autocorrelations at lags 1, 3, 5, 6 and 12 exceed the Bartlett
bounds, while all other autocorrelations do not. On the other hand, the partial correlogram shows that
the partial autocorrelations at lags 1, 6 and 12 exceed the Bartlett bounds, while all other partial
autocorrelations do not. Also, the seasonal fluctuations that were implied in the plot of the time series
at the beginning of this paper are manifested in the partial correlogram at lags 1, 6 and 12.
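The lags that cross the bounds can also be listed programmatically instead of being read off the plots. The sketch below assumes the bounds are the approximate 95% limits of plus or minus 1.96/sqrt(n) that acf() and pacf() draw by default; the series x is a synthetic seasonal pattern, not the expenditure data.

```r
# Synthetic monthly pattern repeated over five years (assumption: stand-in for PHts)
x <- rep(c(-8, -3, 1, 4, 9, 6, -7, -4, -2, 0, 1, 3), 5)
n <- length(x)

# Approximate 95% bounds under a white-noise null: +/- 1.96 / sqrt(n)
bound <- qnorm(0.975) / sqrt(n)

# Sample autocorrelations and partial autocorrelations for lags 1..20
r_acf  <- as.numeric(acf(x, lag.max = 20, plot = FALSE)$acf)[-1]  # drop lag 0
r_pacf <- as.numeric(pacf(x, lag.max = 20, plot = FALSE)$acf)

# Lags whose (partial) autocorrelation falls outside the bounds
acf_lags  <- which(abs(r_acf)  > bound)
pacf_lags <- which(abs(r_pacf) > bound)

print(list(bound = bound, acf_lags = acf_lags, pacf_lags = pacf_lags))
```

A plain numeric vector is used so that lags are counted in months; for a ts object with frequency 12, acf() labels the horizontal axis in years instead.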
From the above, many candidate ARIMA models are possible. To start with, and taking into
consideration the principle of parsimony, ARIMA(1,0,0)x(1,0,0) with period 12 and non-zero mean,
written here as AR(1)(1), will be used as the preliminary model. It will be justified at the end of this
paper as the final model using the required diagnostic checks and, in turn, will be used in forecasting.
The AR(1)(1) model with non-zero mean is an autoregressive model of order p=1 with a seasonal
autoregressive term of order P=1. The equation for this model is given below:
$$y_t = \phi\, y_{t-1} + \vartheta\, y_{t-12} + a_t + c$$
where $y_t$ is the amount of the national government expenditure at time $t$, $\phi$ is the coefficient
of the past value $y_{t-1}$, $\vartheta$ is the coefficient of the past value $y_{t-12}$, $a_t$ is the
error term and $c$ is the intercept.
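For reference, R's arima() fits a seasonal model in multiplicative form and applies it to the series minus its mean. A sketch of how that form expands is given below; the symbols $\mu$ (the series mean) and $B$ (the backshift operator, $B y_t = y_{t-1}$) are notation introduced here and do not appear in the text above.

```latex
\begin{aligned}
(1 - \phi B)\,(1 - \vartheta B^{12})\,(y_t - \mu) &= a_t \\
y_t &= c + \phi\, y_{t-1} + \vartheta\, y_{t-12} - \phi\vartheta\, y_{t-13} + a_t,
\qquad c = \mu\,(1 - \phi)(1 - \vartheta)
\end{aligned}
```

Read this way, the additive equation above agrees with the fitted form except for the cross term at lag 13 and the relation between $c$ and $\mu$.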
B. Coefficients of the model
Using the R software, the coefficients for AR(1)(1) model can be generated with the given code
below:
>PHmod <- arima(PHts, order = c(1,0,0), seasonal = c(1,0,0), include.mean = TRUE)
>coeftest(PHmod)
From the result above, at the 5% significance level there is sufficient evidence to reject the null
hypothesis that each coefficient of the AR(1)(1) model is equal to zero. The coefficients are therefore
statistically significant.
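The significance test for each coefficient amounts to dividing the estimate by its standard error and comparing the ratio with a standard normal distribution. The sketch below reproduces that calculation with base R only, on a simulated AR(1) series; the data and object names are assumptions for illustration, and the layout of coef_table only approximates a coefficient-test table.

```r
# Synthetic AR(1) series around a non-zero mean (assumption: stand-in for PHts)
set.seed(42)
x <- arima.sim(model = list(ar = 0.5), n = 200) + 10

fit <- arima(x, order = c(1, 0, 0), include.mean = TRUE, method = "ML")

# z statistic = estimate / standard error; two-sided p-value from the normal distribution
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
z   <- est / se
p   <- 2 * pnorm(abs(z), lower.tail = FALSE)

coef_table <- cbind(Estimate = est, `Std. Error` = se,
                    `z value` = z, `Pr(>|z|)` = p)
print(round(coef_table, 4))
```

A p-value below 0.05 in the last column corresponds to rejecting the null hypothesis that the coefficient is zero at the 5% significance level.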
C. Diagnostic Checking
The following diagnostic checks were applied to the AR(1)(1) model of this study, with their results.
a. Stationarity – As stated in the section on Data Characteristics, the time series
used for the model is stationary.
b. Test for Autocorrelation using Box-Ljung Test – The test is used to know if the residuals of
the model are autocorrelated. The null hypothesis for this test is that there is no
autocorrelation among the residuals. The R code for this test is given below:
>Box.test(PHmod$residuals, type = "Ljung-Box")
Given the above result, at the 5% significance level, there is insufficient evidence to reject
the null hypothesis that the residuals of the model are not autocorrelated.
c. Test for Normality using Jarque-Bera test – the residuals must be normally distributed. The
null hypothesis for this test is that the distribution is normal for the data points. Using the R
code below, the result of the test for normality of the residuals of the model is generated:
>jarque.bera.test(PHmod$residuals)
Given the above result, at the 5% significance level, there is insufficient evidence to reject
the null hypothesis that the residuals of the model are normally distributed.
d. Test for Homoscedasticity using ARCH test – the residuals of the model must not be
heteroscedastic. The null hypothesis for this test is that the residuals are homoscedastic.
Using the R code below, the result for the ARCH test is given as:
>ArchTest(PHmod$residuals)
Given the above result, at the 5% significance level, there is insufficient evidence to reject
the null hypothesis that the residuals of the model are homoscedastic.
e. Acf/ correlogram – the autocorrelations of the residuals at all lags must not exceed the
Bartlett bounds in the correlogram. Using the R code below, the correlogram is generated:
>acf(PHmod$residuals)
From the result above, reflected in Figure 5, the autocorrelation of the residuals at each lag
does not exceed the Bartlett bounds.
f. Pacf/ partial correlogram – the partial autocorrelations of the residuals at all lags must not
exceed the Bartlett bounds in the partial correlogram. Using the R code below, the partial
correlogram is generated:
>pacf(PHmod$residuals)
Figure 6 Partial Correlogram of the residuals of the model AR(1)(1)
From the result above, reflected in Figure 6, the partial autocorrelation of the residuals at
each lag does not exceed the Bartlett bounds.
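The normality and homoscedasticity checks above can be reproduced with base R alone, which helps show what each statistic measures. The functions jarque_bera and arch_lm below are hypothetical helpers written for this sketch (they are not the tseries or FinTS implementations), and res is simulated noise standing in for PHmod$residuals.

```r
# Jarque-Bera: n/6 * (S^2 + (K - 3)^2 / 4), chi-square with 2 df under normality,
# where S is the sample skewness and K the sample kurtosis
jarque_bera <- function(x) {
  n    <- length(x)
  d    <- x - mean(x)
  m2   <- mean(d^2)
  skew <- mean(d^3) / m2^1.5
  kurt <- mean(d^4) / m2^2
  stat <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  list(statistic = stat, p.value = pchisq(stat, df = 2, lower.tail = FALSE))
}

# ARCH LM: regress squared values on their own lags; (rows used) * R^2 is
# compared with a chi-square distribution with `lags` degrees of freedom
arch_lm <- function(x, lags = 12) {
  sq   <- embed(x^2, lags + 1)   # column 1 = current value, columns 2.. = lags 1..lags
  fit  <- lm(sq[, 1] ~ sq[, -1])
  stat <- summary(fit)$r.squared * nrow(sq)
  list(statistic = stat, df = lags,
       p.value = pchisq(stat, df = lags, lower.tail = FALSE))
}

# Simulated residuals (assumption: stand-in for PHmod$residuals)
set.seed(7)
res <- rnorm(58)

jb   <- jarque_bera(res)
arch <- arch_lm(res, lags = 12)
print(c(JB_p = jb$p.value, ARCH_p = arch$p.value))
```

In both tests a large p-value means the null hypothesis (normality, or no ARCH effect) is not rejected; it does not prove that the null hypothesis is true.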
Given that the model satisfies the required diagnostic checks mentioned above, the
model can be used in forecasting. Substituting the significant coefficients of the model
generated, the final model becomes:
$$y_t = 0.34591\, y_{t-1} + 0.73985\, y_{t-12} + 173{,}190$$
The model is interpreted as: the amount of the month's National Government expenditures
is affected by the preceding month's expenditure by a factor of 0.34591 and by the same month
of the preceding year by a factor of 0.73985, with an intercept of 173,190.
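One point worth verifying before using the equation above: stats::arima() labels the estimated series mean as "intercept" and fits the seasonal term multiplicatively. If 173,190 is that reported value (an assumption, since the output table is not reproduced here), the constant and the lag-13 cross term of the expanded equation would follow as in this sketch. The function predict_next and its inputs are hypothetical.

```r
# Coefficients as quoted in the text
phi      <- 0.34591   # non-seasonal AR(1) coefficient
vartheta <- 0.73985   # seasonal AR(1) coefficient (lag 12)
mu       <- 173190    # ASSUMPTION: the "intercept" printed by arima(), i.e. the series mean

# Expanding (1 - phi*B)(1 - vartheta*B^12)(y_t - mu) = a_t
constant   <- mu * (1 - phi) * (1 - vartheta)
cross_term <- -phi * vartheta   # coefficient on y_{t-13}

# One-step prediction from past values (inputs below are hypothetical, not actual data)
predict_next <- function(y_lag1, y_lag12, y_lag13) {
  constant + phi * y_lag1 + vartheta * y_lag12 + cross_term * y_lag13
}

print(c(constant = constant, cross_term = cross_term))
print(predict_next(mu, mu, mu))
```

As a sanity check on the algebra, feeding the mean in for all three past values returns the mean itself, which is what a stationary model should predict.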
V. Forecasting using the In-sample data
To start with the forecasting, the original time series data must be split into in-sample and
out-of-sample data points. For this study, the in-sample data include the data points of the original
series from January 2012 to May 2016, so that at least 50 data points are retained, whereas the
out-of-sample data include the remaining data points from June 2016 to October 2016. Using the R code
below, the forecasts for the next 5 months and the next 12 months of National Government expenditures
are generated:
>PHts_in <- window(PHts, frequency=12, start=c(2012,1), end=c(2016,5))
>PH <- arima(PHts_in, order=c(1,0,0), seasonal=c(1,0,0), include.mean=TRUE)
>plot(PH)
Figure 7 Time series plot of the AR(1)(1) model using the In-Sample data
>coeftest(PH)
Figure 8 Forecasts from AR(1)(1) model using the In-Sample data points for succeeding 5 months
(June 2016 to October 2016)
Figure 9 Forecasts from AR(1)(1) model using the In-Sample data points for succeeding 12 months
(June 2016 to May 2017)
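The in-sample and out-of-sample split also allows the 5-month forecasts to be scored against the held-out values. The sketch below shows one way to do this with base R only (stats::arima and predict), on a simulated monthly series; the data, the object names and the choice of MAE and MAPE as accuracy measures are assumptions for illustration.

```r
# Synthetic monthly series, Jan 2012 to Oct 2016 (assumption: stand-in for PHts)
set.seed(2016)
ar_poly <- c(0.3, rep(0, 10), 0.6, -0.3 * 0.6)  # expanded seasonal AR(1)(1) polynomial
sim     <- arima.sim(model = list(ar = ar_poly), n = 58) + 170
demo_ts <- ts(as.numeric(sim), start = c(2012, 1), frequency = 12)

# In-sample: Jan 2012 to May 2016; out-of-sample: Jun 2016 to Oct 2016
train <- window(demo_ts, end = c(2016, 5))
test  <- window(demo_ts, start = c(2016, 6))

fit <- arima(train, order = c(1, 0, 0), seasonal = c(1, 0, 0),
             include.mean = TRUE, method = "ML")

# Point forecasts and approximate 95% limits for the held-out months
fc    <- predict(fit, n.ahead = length(test))
lower <- fc$pred - qnorm(0.975) * fc$se
upper <- fc$pred + qnorm(0.975) * fc$se

# Accuracy on the held-out months
errors <- as.numeric(test) - as.numeric(fc$pred)
mae    <- mean(abs(errors))
mape   <- mean(abs(errors / as.numeric(test))) * 100
print(c(MAE = mae, MAPE = mape))
```

With only five held-out points these measures are rough, but they give a first indication of how far the forecasts fall from the observed values.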
Acf(PHts, lag.max = 20) #requires forecast package
Pacf(PHts, lag.max = 20) #requires forecast package
#ARIMA model
PHmod <- Arima(PHts, order = c(1,0,0), seasonal = c(1,0,0), include.mean = TRUE)
#requires forecast package
coeftest(PHmod) #requires lmtest package
#Diagnostic Checking
jarque.bera.test(PHmod$residuals) #requires tseries package
Box.test(PHmod$residuals, type="Ljung-Box")
#Residual analysis
plot(PHmod$residuals)
Acf(PHmod$residuals)
Pacf(PHmod$residuals)
ArchTest(PHmod$residuals) #requires FinTS package
# Suppose in sample until May 2016, out sample = June 2016 to October 2016
PHts_in <- window(PHts, frequency=12, start=c(2012,1), end=c(2016,5))
PHts_in
ndiffs(PHts_in, alpha=0.05, test="adf", max.d=3)
adf.test(PHts_in, alternative="stationary")
Acf(PHts_in)
Pacf(PHts_in)
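#Model fitted on the in-sample data, as given in the forecasting section
PH <- arima(PHts_in, order=c(1,0,0), seasonal=c(1,0,0), include.mean=TRUE)
summary(PH)
plot(PH)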
#Diagnostic checking
plot(PH$residuals)
Acf(PH$residuals)
Pacf(PH$residuals)
jarque.bera.test(PH$residuals)
ArchTest(PH$residuals, lag=12) #requires FinTS package
Box.test(PH$residuals, type="Ljung-Box")
#Forecast (requires forecast package; forecast() and plot() dispatch to the ARIMA methods)
#5 months worth of forecast
forecast(PH, h=5, level=c(0.95))
plot(forecast(PH, h=5, level=c(0.95)))
#1 year worth of forecast
forecast(PH, h=12, level=c(0.95))
plot(forecast(PH, h=12, level=c(0.95)))
VII. References
SDAD-RS. (2016, November 29). Statistical Data | Bureau of the Treasury. Retrieved December 2016,
from Bureau of the Treasury: http://www.treasury.gov.ph/?page_id=746
Coghlan, A. (2016, July). A Little Book of R for Time Series. Retrieved December 2016, from Read the
Docs: https://media.readthedocs.org/pdf/a-little-book-of-r-for-time-series/latest/a-little-book-of-r-for-time-series.pdf