
Time Series Models for Business and Economic Forecasting
Andrew Flores, Jeremy Flores, Shannon Park, JunGyo Kim

2023-11-17
library(AER)

1. Exploratory Data Analysis. (a) Briefly discuss the question you are trying to
answer.

From October 1973 through April 1996, how do past monthly European spot prices for black pepper and
white pepper (US dollars per ton) affect future spot prices for white pepper?

(b) Cite the dataset and give a summary of what the dataset is about

The PepperPrice dataset, loaded with data(“PepperPrice”) from the AER package, is a monthly multiple time
series running from 1973(10) to 1996(4). It contains two variables: the spot price for black pepper and the
spot price for white pepper (both in US dollars per ton).
Sources:
Franses, P.H. (1998). Time Series Models for Business and Economic Forecasting. Cambridge, UK:
Cambridge University Press.
Franses, P.H., van Dijk, D. and Opschoor, A. (2014). Time Series Models for Business and Economic
Forecasting, 2nd ed. Cambridge, UK: Cambridge University Press.

(c) First check for completeness and consistency of the data (if there are NAs or
missing observations, replace with the value of the previous observation; make
a note of this)

data("PepperPrice")
which(is.na(PepperPrice))

## integer(0)

View(PepperPrice[116:120, ])

There are no NAs, so the dataset is complete and no values had to be replaced. The precision of the recorded
values is not uniform, however, which is a potential source of inconsistency: most values are rounded to
whole dollars, but not all. The black pepper price is rounded to whole dollars after observation 115, and the
white pepper price from observation 40 onward. Most white pepper prices are additionally rounded so that
the final digit is a 5 or a 0.
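
No replacement was needed here, but the rule stated in the task (replace each NA with the previous observation) could be sketched in base R as follows. The helper name locf and the example vector are illustrative assumptions, not part of the original analysis.

```r
# Last observation carried forward (LOCF) in base R.
# `locf` is a hypothetical helper name; the input below is made-up data.
locf <- function(x) {
  # Index of the most recent non-missing value at each position (0 if none yet)
  idx <- cummax(seq_along(x) * !is.na(x))
  idx[idx == 0] <- NA          # leading NAs have no previous value to carry
  x[idx]
}

example <- c(1200, NA, NA, 1350, NA)
filled  <- locf(example)
print(filled)
```

A leading NA stays NA because there is no earlier observation to carry forward; how to treat that case would be a separate decision.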

(d) Provide descriptive analyses of your variables. This should include the
histogram with overlying density, boxplots, cross correlation. All figures/statistics
must include comments.

# freq = FALSE puts the histogram on the density scale so the density curve can be overlaid
hist(PepperPrice[,1], main = "Histogram of Black Pepper Spot Price",
     xlab = "Spot Price in US Dollars per ton", freq = FALSE)
lines(density(PepperPrice[,1]), col = "blue", lwd = 1)

[Figure: Histogram of Black Pepper Spot Price with overlaid density curve. x-axis: Spot Price in US Dollars per ton; y-axis: Density.]

# Same plot for the white pepper series
hist(PepperPrice[,2], main = "Histogram of White Pepper Spot Price",
     xlab = "Spot Price in US Dollars per ton", freq = FALSE)
lines(density(PepperPrice[,2]), col = "blue", lwd = 1)

[Figure: Histogram of White Pepper Spot Price with overlaid density curve. x-axis: Spot Price in US Dollars per ton; y-axis: Density.]

boxplot(PepperPrice[,1], main = "Boxplot of Black Pepper Spot Price")

[Figure: Boxplot of Black Pepper Spot Price.]

boxplot(PepperPrice[,2], main = "Boxplot of White Pepper Spot Price")

[Figure: Boxplot of White Pepper Spot Price.]

summary(PepperPrice)

##      black          white
##  Min.   : 884   Min.   :1230
##  1st Qu.:1292   1st Qu.:1758
##  Median :1768   Median :2640
##  Mean   :2077   Mean   :2875
##  3rd Qu.:2382   3rd Qu.:3400
##  Max.   :4963   Max.   :6887

black <- PepperPrice[,1]
white <- PepperPrice[,2]
# ccf() draws the plot; store the result so it can be printed without plotting twice
pepper_ccf <- ccf(black, white)
print(pepper_ccf)

[Figure: Cross-correlation plot "black & white". x-axis: Lag; y-axis: ACF.]

##
## Autocorrelations of series ’X’, by lag
##
## -1.7500 -1.6667 -1.5833 -1.5000 -1.4167 -1.3333 -1.2500 -1.1667 -1.0833 -1.0000
## 0.396 0.437 0.477 0.516 0.553 0.590 0.626 0.662 0.697 0.733
## -0.9167 -0.8333 -0.7500 -0.6667 -0.5833 -0.5000 -0.4167 -0.3333 -0.2500 -0.1667
## 0.768 0.800 0.826 0.848 0.866 0.881 0.896 0.912 0.928 0.942
## -0.0833 0.0000 0.0833 0.1667 0.2500 0.3333 0.4167 0.5000 0.5833 0.6667
## 0.952 0.952 0.938 0.919 0.900 0.884 0.870 0.854 0.838 0.821
## 0.7500 0.8333 0.9167 1.0000 1.0833 1.1667 1.2500 1.3333 1.4167 1.5000
## 0.801 0.778 0.757 0.734 0.711 0.687 0.660 0.632 0.602 0.568
## 1.5833 1.6667 1.7500
## 0.531 0.492 0.452

Both histograms are right-skewed and the boxplots show a substantial number of high-price outliers, which
may indicate non-normality and heteroskedasticity of the data. A transformation (e.g. log) may help
normalize the data. In the cross-correlation function (CCF) above, lags are measured in years because the
series is monthly, so the plot covers 21 months on either side of lag 0. The cross-correlation between black
and white pepper prices is positive at every lag shown and lies well above the 5% significance bound. It
peaks at about 0.95 around lag 0 and dies down extremely slowly as the lag moves away from 0, so every lag
shown is significant at the 5% level.
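
The effect of a log transformation on right skew can be illustrated with a small base R sketch. The data here are simulated lognormal values standing in for prices (an assumption for illustration, not the PepperPrice series), and skewness is a hypothetical helper computing the third standardized moment.

```r
set.seed(42)
# Hypothetical right-skewed "prices" (lognormal); NOT the PepperPrice data
prices <- rlnorm(500, meanlog = 7.5, sdlog = 0.5)

# Sample skewness: third central moment divided by variance^(3/2)
skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}

skew_raw <- skewness(prices)       # clearly positive for right-skewed data
skew_log <- skewness(log(prices))  # close to zero after the log transform
print(c(raw = skew_raw, log = skew_log))
```

Whether the same improvement holds for the pepper prices would have to be checked on the actual series, for example by repeating the histograms on log(PepperPrice).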

2. Data PreProcessing. (a) With tsdisplay or ggtsdisplay, for each variable, use
its time series plot, ACF and PACF to comment on its stationarity (you can also
decompose the time series; note if there is seasonality). To supplement this, use
the appropriate Dickey-Fuller (unit root) test, to determine whether or not it is
stationary. Note using its PACF what the suspected order might be.

is.ts(PepperPrice)

## [1] TRUE

library(forecast)
library(tseries)

pepper.ts <- ts(PepperPrice, start = c(1973, 10), end = c(1996, 4), frequency = 12)

tsdisplay(pepper.ts[,1], main = "Time Series Plot of Black Pepper Spot Prices",
          ylab = "Spot Prices $/ton")

[Figure: Time Series Plot of Black Pepper Spot Prices (y-axis: Spot Prices $/ton), with ACF and PACF panels below (x-axis: Lag).]

tsdisplay(pepper.ts[,2], main = "Time Series Plot of White Pepper Spot Prices",
          ylab = "Spot Prices $/ton")

[Figure: Time Series Plot of White Pepper Spot Prices (y-axis: Spot Prices $/ton), with ACF and PACF panels below (x-axis: Lag).]

adf.test(pepper.ts[,1])

##
## Augmented Dickey-Fuller Test
##
## data: pepper.ts[, 1]
## Dickey-Fuller = -1.6434, Lag order = 6, p-value = 0.7262
## alternative hypothesis: stationary

adf.test(pepper.ts[,2])

##
## Augmented Dickey-Fuller Test
##
## data: pepper.ts[, 2]
## Dickey-Fuller = -1.6001, Lag order = 6, p-value = 0.7444
## alternative hypothesis: stationary

nsdiffs(pepper.ts[,1])

## [1] 0

nsdiffs(pepper.ts[,2])

## [1] 0

The time series plot of black pepper spot prices shows non-stationary “wandering” behavior, with no clear
trend or linear relationship between the prices and time. Since the p-values of the Augmented Dickey-Fuller
tests for black and white pepper spot prices (0.7262 and 0.7444) are above 0.05, we fail to reject the null
hypothesis of a unit root, so we treat both series as non-stationary. Moreover, nsdiffs returns 0 for both
series, so no seasonal differencing is indicated. This suggests that there are no repetitive fluctuations in
the data caused by external factors like weather, holidays, or other recurring events.
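
The idea behind the unit-root test can be sketched with the simplest Dickey-Fuller regression in base R. This is a simplified illustration on simulated data, not a reproduction of adf.test, which also includes a linear trend and lagged differences; df_stat is a hypothetical helper name.

```r
set.seed(123)
n <- 300
# Simulated series (assumptions for illustration; NOT the pepper prices)
random_walk <- cumsum(rnorm(n))                              # has a unit root
stationary  <- as.numeric(arima.sim(list(ar = 0.5), n = n))  # stationary AR(1)

# Dickey-Fuller regression with a constant:
#   diff(y)_t = a + rho * y_{t-1} + e_t
# The test statistic is the t-ratio on rho.
df_stat <- function(y) {
  dy   <- diff(y)
  ylag <- y[-length(y)]
  fit  <- lm(dy ~ ylag)
  unname(coef(summary(fit))["ylag", "t value"])
}

stat_rw <- df_stat(random_walk)
stat_ar <- df_stat(stationary)
print(c(random_walk = stat_rw, stationary = stat_ar))
```

The statistic is compared with Dickey-Fuller critical values (about -2.86 at the 5% level for the constant-only case) rather than the usual t table. A strongly negative value, as expected for the stationary series, rejects the unit root.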

(b) If it is not stationary, determine the level of differencing to make our series
stationary. We can use the ndiffs function which performs a unit-root test to
determine this. After this, difference your data to ascertain a stationary time series.
Re-do part a) for your differenced time series and comment on the time series
plot, ACF and PACF. Recall that the time series models we’ve observed rely on
stationarity.

ndiffs(pepper.ts[,1], alpha = 0.05, test = "adf")

## [1] 1

ndiffs(pepper.ts[,2], alpha = 0.05, test = "adf")

## [1] 1

par(mfrow=c(2,2))
plot(pepper.ts[,1],main = "Original Time Series for Black",
ylab = "Spot Prices $/ton")
plot(diff(pepper.ts[,1]), main = "Differenced Time Series for Black",
ylab = "Spot Prices $/ton")
plot(pepper.ts[,2], main = "Original Time Series for White",
ylab = "Spot Prices $/ton")
plot(diff(pepper.ts[,2]), main = "Differenced Time Series for White",
ylab = "Spot Prices $/ton")

[Figure: 2x2 panel of time series plots (x-axis: Time; y-axis: Spot Prices $/ton). Top row: Original Time Series for Black, Differenced Time Series for Black. Bottom row: Original Time Series for White, Differenced Time Series for White.]

tsdisplay(diff(pepper.ts[,1]), main = "Time Series Plot of Black Pepper Spot Prices")

[Figure: Time Series Plot of Black Pepper Spot Prices (first difference), with ACF and PACF panels below (x-axis: Lag).]

tsdisplay(diff(pepper.ts[,2]), main = "Time Series Plot of White Pepper Spot Prices")

[Figure: Time Series Plot of White Pepper Spot Prices (first difference), with ACF and PACF panels below (x-axis: Lag).]

adf.test(diff(pepper.ts[,1]))

##
## Augmented Dickey-Fuller Test
##
## data: diff(pepper.ts[, 1])
## Dickey-Fuller = -4.973, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary

adf.test(diff(pepper.ts[,2]))

##
## Augmented Dickey-Fuller Test
##
## data: diff(pepper.ts[, 2])
## Dickey-Fuller = -5.8575, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary

The ndiffs results tell us that each pepper spot price series should be differenced once to become stationary.
Differencing means subtracting the previous observation from each observation, which removes a unit root
and turns any linear trend into a constant. After differencing once, both series oscillate around a constant
mean (mean reverting), and the ADF tests on the differenced series reject the unit-root null (reported
p-value 0.01 for both), indicating that differencing once was sufficient. Compared with the ACF of the
original series, the ACF after differencing shows far fewer significant lags. Yet, since a few lags in the later
periods (9, 20, 33) remain significant, we need additional identification criteria such as AIC and BIC to
validate our model.
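
What first differencing does can be checked with a short base R sketch on a simulated random walk (an assumption for illustration, not the pepper data): the difference y_t - y_{t-1} recovers the underlying shocks.

```r
set.seed(7)
shocks <- rnorm(271)          # hypothetical white-noise shocks
walk   <- cumsum(shocks)      # random walk: y_t = y_{t-1} + shock_t

# First difference by hand: each observation minus the previous one
manual_diff <- walk[-1] - walk[-length(walk)]

# diff() computes the same thing
same_as_diff <- identical(manual_diff, diff(walk))

# The differenced random walk is just the shocks from the second one onward
recovers_shocks <- isTRUE(all.equal(manual_diff, shocks[-1]))

print(c(same_as_diff = same_as_diff, recovers_shocks = recovers_shocks))
```

One observation is lost in the process, so a series of length n has n - 1 first differences.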

3. Feature Generation, Model Testing and Forecasting. (a) Fit an AR(p) model
to the data (using part 2(a), AIC or some built in R function)

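library(dynlm)
# Fit AR(i) models for white pepper with i = 1, ..., 10 lags and print each AIC.
# After the loop, whitemodel holds the last fit (10 lags).
for (i in 1:10) {
  whitemodel <- dynlm(white ~ L(white, 1:i), data = pepper.ts)
  print(AIC(whitemodel))
}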

## [1] 3698.992
## [1] 3664.556
## [1] 3652.391
## [1] 3641.28
## [1] 3630.263
## [1] 3617.854
## [1] 3603.561
## [1] 3592.878
## [1] 3577.351
## [1] 3558.311

auto.arima(pepper.ts[,2])

## Series: pepper.ts[, 2]
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## 0.2924
## s.e. 0.0552
##
## sigma^2 = 47316: log likelihood = -1835.88
## AIC=3675.76 AICc=3675.8 BIC=3682.96

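Based on the PACF of the original (undifferenced) series, we would use two lags for the white pepper model.
However, in the AIC loop above the AIC keeps decreasing as lags are added, all the way to 10 lags. Note that
each dynlm fit drops additional initial observations as the lag length grows, so these AIC values are
computed on different samples and are not strictly comparable. We take the steadily decreasing AIC as a
sign of an underlying moving-average component, so we proceed with the model suggested by auto.arima,
ARIMA(0,1,1), which has no autoregressive lags for white pepper.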
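
One way to compare AIC across lag lengths on equal footing is to fit every candidate on the same observations. The base R sketch below does this with embed() on a simulated AR(2) series; the data, p_max and the variable names are assumptions for illustration, not the white pepper series or the authors' procedure.

```r
set.seed(1)
# Hypothetical stationary AR(2) series; NOT the white pepper data
y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.3)), n = 300))

p_max  <- 6
# Column 1 is y_t; column j + 1 is y_{t-j}. Every row has all p_max lags,
# so each model below is estimated on the same n - p_max observations.
lagmat <- embed(y, p_max + 1)

aic_by_order <- sapply(1:p_max, function(p) {
  fit <- lm(lagmat[, 1] ~ lagmat[, 2:(p + 1), drop = FALSE])
  AIC(fit)
})

names(aic_by_order) <- paste0("AR(", 1:p_max, ")")
print(round(aic_by_order, 2))
best_order <- which.min(aic_by_order)
```

With a common sample, a lower AIC reflects fit and parsimony rather than a shrinking number of observations.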

(b) Plot and comment on the ACF of the residuals of the model chosen in 3(a). If
the model is properly fit, then we should see no autocorrelations in the residuals.
Carry out a formal test for autocorrelation and comment on the results.

# whitemodel is the last model fitted in the loop in 3(a), i.e. the 10-lag fit
residuals <- resid(whitemodel)
acf(residuals)

[Figure: ACF plot "Series residuals". x-axis: Lag; y-axis: ACF.]

auto.arima selects ARIMA(0,1,1) for white pepper prices, which has no autoregressive component. Thus,
the ARDL model reduces to a finite distributed lag (FDL) model. This conflicts with the lag choice suggested
by the PACF and the AIC loop, and the conflict can affect the choice of model order. The ACF plotted above
is for the residuals of whitemodel, which after the loop in 3(a) holds the 10-lag fit. There appear to be no
significant spikes above the significance bound beyond lag 0. Because the selected model has no lags of
white pepper, there is no AR regression whose residuals we could pass to the Breusch-Godfrey test (bgtest),
so we did not carry out a formal test for autocorrelation here.
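
For residuals of an ARIMA fit, one commonly used formal check is the Ljung-Box test, available in base R as Box.test. The sketch below applies it to a simulated ARIMA(0,1,1) series; the data and the choice of 12 lags are assumptions for illustration, and nothing here is a result from the original analysis.

```r
set.seed(2023)
# Hypothetical ARIMA(0,1,1) series: cumulated MA(1) noise; NOT the white pepper data
y <- cumsum(as.numeric(arima.sim(list(ma = 0.3), n = 270)))

# Fit the same model class that auto.arima suggested for white pepper
fit <- arima(y, order = c(0, 1, 1))
res <- residuals(fit)

# Ljung-Box test on the residuals; fitdf = 1 accounts for the one
# estimated MA coefficient, so the test has lag - fitdf degrees of freedom
lb <- Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 1)
print(lb)
```

A large p-value would mean no evidence of remaining autocorrelation in the residuals; a small one would suggest the model is missing some dynamics.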

(c) Using the appropriate predictors, fit an ARDL(p,q) model to the data and
repeat step (b) in part 3.

auto.arima(pepper.ts[,1])

## Series: pepper.ts[, 1]
## ARIMA(2,1,1)
##
## Coefficients:
## ar1 ar2 ma1
## -0.5154 0.2976 0.9481
## s.e. 0.0698 0.0640 0.0346
##
## sigma^2 = 17902: log likelihood = -1703.83
## AIC=3415.65 AICc=3415.81 BIC=3430.05

ARDLmodel <- dynlm(white ~ L(black, 1:2), data = pepper.ts)
ARDLresiduals<-resid(ARDLmodel)
acf(ARDLresiduals)

[Figure: ACF plot "Series ARDLresiduals". x-axis: Lag; y-axis: ACF.]

bgtest(ARDLmodel, order =1)

##
## Breusch-Godfrey test for serial correlation of order up to 1
##
## data: ARDLmodel
## LM test = 209.61, df = 1, p-value < 2.2e-16

bgtest(ARDLmodel, order =2)

##
## Breusch-Godfrey test for serial correlation of order up to 2
##
## data: ARDLmodel
## LM test = 211.04, df = 2, p-value < 2.2e-16

The results of bgtest indicate that there is serial correlation in the residuals of our FDL model at orders up
to 2 (p-value < 2.2e-16 in both tests). This would need to be addressed, for example with Newey-West (HAC)
standard errors. The autocorrelation probably stems from the underlying moving average mentioned in 3(a).
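
The Newey-West correction can be sketched by hand in base R. This is a minimal version with Bartlett weights and a fixed truncation lag, run on simulated data with autocorrelated errors; newey_west, the lag L = 4 and the data are assumptions for illustration, and packaged implementations may use different defaults such as automatic lag selection.

```r
set.seed(99)
n <- 200
# Hypothetical regression with autocorrelated regressor and errors; NOT the pepper data
x <- as.numeric(arima.sim(list(ar = 0.7), n = n))
u <- as.numeric(arima.sim(list(ar = 0.7), n = n))
y <- 1 + 2 * x + u

fit <- lm(y ~ x)

# Newey-West covariance of the OLS coefficients with Bartlett weights
newey_west <- function(fit, L) {
  X  <- model.matrix(fit)
  e  <- residuals(fit)
  Xe <- X * e                      # row t holds x_t * e_t
  n  <- nrow(X)
  S  <- crossprod(Xe)              # lag-0 term (White's estimator)
  for (l in seq_len(L)) {
    w <- 1 - l / (L + 1)           # Bartlett weight, declining in the lag
    G <- crossprod(Xe[(l + 1):n, , drop = FALSE],
                   Xe[1:(n - l), , drop = FALSE])
    S <- S + w * (G + t(G))
  }
  bread <- solve(crossprod(X))
  bread %*% S %*% bread
}

se_ols <- sqrt(diag(vcov(fit)))
se_hac <- sqrt(diag(newey_west(fit, L = 4)))
print(rbind(OLS = se_ols, NeweyWest = se_hac))
```

The coefficient estimates are unchanged; only the standard errors, and therefore the t-tests, are adjusted for serial correlation.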

4. Provide a brief summary of your findings and state which model performs
better.

The question we aimed to answer was “From October 1973 through April 1996, how do past monthly
European spot prices for black pepper and white pepper (US dollars per ton) affect future spot prices for
white pepper?” We used time series plots, the ACF and PACF, and the Augmented Dickey-Fuller (unit root)
test to check for stationarity. Then, using the ndiffs and auto.arima functions, we concluded that there are
no autoregressive lags for our AR model “whitemodel” and that there are two lags for the black pepper
variable. The underlying moving average led us to change our ARDL model to an FDL model, and the
comparison between the AR model and the FDL model is inconclusive as to which performs better. Thus,
we need to account for the moving-average component before proceeding with our models.

5. Suggest any limitations faced or improvements which could’ve been made to
the model based on your findings, which should be supplemented with statistical
tests (e.g. degree of freedom restrictions, reverse causality).

We have not performed the Hausman test to ascertain whether endogeneity is present; if it is, we would
need to add instruments. Since we only had two variables, there is a high chance of omitted variables.
Moreover, the CCF shows that the two series are closely related at both leads and lags, which is consistent
with possible reverse causality. Black and white pepper may be substitute goods: when the price of black
pepper increases, customers are likelier to replace black pepper with white pepper, and vice versa. This
may lead to endogeneity, which causes biased and inconsistent estimates and incorrect interpretations. We
could also have used VARselect to further confirm the number of lags to be tested.

The remaining limitations were noted earlier. The histograms are right-skewed and the boxplots show a
significant number of outliers, which may indicate non-normality and heteroskedasticity of the data; a
transformation (e.g. log) may help normalize the data. auto.arima selects ARIMA(0,1,1) for white pepper
prices, so there is no autoregressive component and the ARDL model is an FDL model. This conflicting
result can affect the choice of model order, and we did not carry out a Breusch-Godfrey test on the white
pepper model. Finally, the results of bgtest indicate that there is autocorrelation in our FDL model up to
the 2nd lag, which would need to be corrected with Newey-West standard errors.
