
HOMEWORK – 2

SABITRA RUDRA

17.5 f)
Fit and plot your better ARIMA forecast for 4 quarters
library(forecast)
df <- read.csv("DeptStoreSales.csv")
sales <- ts(df$Sales, start = c(1, 1), frequency = 4)

# Partition: first 20 quarters for training, remaining 4 quarters for validation
nTrain <- 20
nValid <- length(sales) - nTrain
salesTrain <- window(sales, start = c(1, 1), end = c(1, nTrain))
salesValid <- window(sales, start = c(1, nTrain + 1), end = c(1, nTrain + nValid))

# ACF/PACF after lag-1 differencing (removes trend)
salesTrain %>% diff() %>% Acf(main = "ACF after differencing (removing trend)")
salesTrain %>% diff() %>% Pacf(main = "PACF after differencing (removing trend)")

# ACF/PACF after lag-4 differencing (removes quarterly seasonality)
salesTrain %>% diff(lag = 4) %>% Acf(main = "ACF after differencing with lag = 4 (seasonality)")
salesTrain %>% diff(lag = 4) %>% Pacf(main = "PACF after differencing with lag = 4 (seasonality)")

# Manually specified seasonal ARIMA(0,1,0)(0,1,0)[4]
modeldos <- Arima(salesTrain, order = c(0, 1, 0), seasonal = c(0, 1, 0))
modeldos

# Automatically selected model for comparison
model.fit <- auto.arima(salesTrain)
model.fit

We can observe the ACF and PACF plots after differencing without a lag (removing trend), and then after differencing with lag = 4 (the quarterly seasonality).
The ARIMA model has six parameters: (p,d,q) for the non-seasonal component and (P,D,Q) for the seasonal component.
[Figure: ACF and PACF after differencing with lag = 4]

Looking at these two plots, where differencing with lag = 4 accounts for the seasonality, there is no geometric decay at the yearly intervals (lags that are multiples of 4) in the ACF plot and no significant spike at those lags in the PACF plot. Hence, for P in (P,D,Q):

P = 0

Likewise, there are no significant spikes at the yearly-interval lags in the ACF plot and no seasonal geometric decay in the PACF plot, hence Q = 0.
Since we performed differencing with lag = 4, D = 1.

Hence (P,D,Q) is (0,1,0)
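
To complete the "fit and plot" part of the question, here is a minimal sketch using the objects defined above (the manually specified modeldos is used, but the same call works for model.fit from auto.arima):

# Forecast the 4 validation quarters with the ARIMA(0,1,0)(0,1,0)[4] model
modeldos.pred <- forecast(modeldos, h = nValid, level = 0)

# Plot the training fit and forecasts, overlaying the actual validation values
plot(modeldos.pred, main = "ARIMA(0,1,0)(0,1,0)[4] forecast, 4 quarters ahead")
lines(salesValid, col = "red")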

18.1
a. Create a time plot for the pre-event AIR time series. What time series components
appear from the plot?
Based on the time plot for the pre-event AIR series, there is a seasonal component as well as a general upward trend in the number of people travelling.

b. Figure 18.6 shows a time plot of the seasonally adjusted pre-September-11 AIR series. Which of the following smoothing methods would be adequate for forecasting this series?
• Moving average (with what window width?)
Not adequate, since the time series has a trend.
• Simple exponential smoothing
Not adequate, since the time series has a trend.
• Holt exponential smoothing
Adequate: the series has been seasonally adjusted, and this method accounts for the trend in the time series (see the sketch after this list).
• Holt–Winter’s exponential smoothing
Not adequate: this method addresses both trend and seasonality, and the seasonality has already been removed.
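
For reference, a minimal sketch of Holt exponential smoothing, assuming the seasonally adjusted pre-September-11 series is available as a ts object named airAdj (a hypothetical name; the AIR data are not loaded in this assignment):

# airAdj: hypothetical name for the seasonally adjusted pre-event AIR series
holtAir <- holt(airAdj, h = 12)   # captures level and trend, no seasonal term
plot(holtAir)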

18.2 Relation between Moving Average and Exponential Smoothing. Assume that
we apply a moving average to a series, using a very short window span. If we wanted
to achieve an equivalent result using simple exponential smoothing, what value should
the smoothing coefficient take?
The smoothing constant should be close to 1 in order to obtain an equivalent result with simple exponential smoothing. Because the moving average uses a very short window span, older values carry very little weight; a smoothing constant close to 1 likewise places most of the weight on the most recent values, producing a similar outcome.
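
As an illustration, here is a minimal sketch on a made-up series (a common rule of thumb relates a window of width w to a smoothing constant of roughly 2/(w+1), so a very short window corresponds to an alpha near 1):

set.seed(1)
y <- ts(cumsum(rnorm(40)))                        # hypothetical example series

# Trailing moving average with a very short window (w = 2)
maSmooth <- stats::filter(y, rep(1/2, 2), sides = 1)

# Simple exponential smoothing with a smoothing constant near 1
sesFit <- ses(y, alpha = 0.9)
sesSmooth <- sesFit$fitted

# The two smoothed series track each other closely
plot(y, col = "grey", main = "Short-window MA vs. SES with alpha near 1")
lines(maSmooth, col = "blue")
lines(sesSmooth, col = "red", lty = 2)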

18.5
a. Which of the following methods would not be suitable for forecasting this series?
• Moving average of raw series
Not suitable: the series has both trend and seasonality.
• Moving average of deseasonalized series
Not suitable: even after deseasonalizing, the series still has a trend, which a moving average cannot capture.
• Simple exponential smoothing of the raw series
Not suitable: the series has both trend and seasonality.
• Double exponential smoothing of the raw series
Not suitable: it does not take seasonality into account.
• Holt–Winter’s exponential smoothing of the raw series
Suitable: this method addresses both trend and seasonality.
• Regression model fit to the raw series
Suitable: this method can capture both trend and seasonality.
• Random walk fit to the raw series
Not suitable: a random walk (naive) forecast does not capture trend or seasonality.

b. i. Run this method on the data.

storeSales <- read.csv("DeptStoreSales.csv")
storeSalesTS <- ts(storeSales$Sales, freq = 4)

# Partition: last 4 quarters held out for validation
nValid <- 4
trainLength1 <- length(storeSalesTS) - nValid
salesTrain <- window(storeSalesTS, end = c(1, trainLength1))
salesValid <- window(storeSalesTS, start = c(1, trainLength1 + 1))

# Holt-Winters-type exponential smoothing fit on the training period
hwSales <- ets(salesTrain, restrict = FALSE, model = "ZZZ",
               alpha = 0.2, beta = 0.15, gamma = 0.05)
hwSales.pred <- forecast(hwSales, h = nValid, level = 0)
hwSales.pred
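
A quick plot (a sketch using the objects defined above) to compare the forecasts with the held-out quarters:

# Overlay the actual validation values on the forecast plot
plot(hwSales.pred, main = "Holt-Winter's forecasts for quarters 21-24")
lines(salesValid, col = "red")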
ii. The forecasts for the validation set are given in Table 18.3. Compute the MAPE values for
the forecasts of quarters 21 and 22.

# accuracy() needs the forecast object (hwSales.pred), not the fitted model
Q21 <- window(storeSalesTS, start = c(1, 21), end = c(1, 21))
accuracy(hwSales.pred, Q21)

Q22 <- window(storeSalesTS, start = c(1, 22), end = c(1, 22))
accuracy(hwSales.pred, Q22)

The MAPE values are about 1.66 for both quarter 21 and quarter 22, with the test set performing better than the training set in both cases.
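
As a cross-check, the MAPE of a single quarter is just its absolute percentage error, which can be computed directly from the forecast and actual values (a minimal sketch using the objects defined above):

# Absolute percentage error, 100 * |actual - forecast| / actual, per quarter
ape <- 100 * abs(salesValid - hwSales.pred$mean) / salesValid
ape[1:2]   # quarters 21 and 22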

c. The fit and residuals from the exponential smoothing are shown in Figure 18.8.
Using all the information thus far, which model is more suitable for forecasting quarters 21
and 22?
Comparing the mean absolute percentage errors of the other exponential smoothing models (lowest 3.63) with that of the Holt–Winter’s method (1.66), we can conclude that the latter model is better suited to forecasting quarters 21 and 22.

18.9

a. Which forecasting method would you choose if you had to choose the same method
for all series? Why?
A quick look at the data shows both seasonality and trend, so Holt–Winter’s exponential smoothing would be the best choice: it handles both components directly, without any additional preprocessing of the series.
b. Fortified wine has the largest market share of the above six types of wine. You are asked to
focus on fortified wine sales alone, and produce as accurate as possible forecasts for the next
2 months.
• Start by partitioning the data using the period until December 1993 as the training set.
wine <- read.csv("AustralianWines.csv")
fWineTS1 <- ts(wine$Fortified, start = c(1980, 1), freq = 12)
fwineTS <- na.omit(fWineTS1)

# Hold out the final 12 months for validation (training runs through December 1993)
nValid <- 12
trainLength2 <- length(fwineTS) - nValid
wineTrain <- window(fwineTS, start = c(1980, 1), end = c(1980, trainLength2))
wineValid <- window(fwineTS, start = c(1980, trainLength2 + 1),
                    end = c(1980, trainLength2 + nValid))

• Apply Holt–Winter’s exponential smoothing to sales with an appropriate season length (use
the default values for the smoothing constants).

# Exponential smoothing fit on the training period (multiplicative seasonality, no trend)
hwWine <- ets(wineTrain, restrict = FALSE, model = "ZNM", alpha = 0.2)
hWine.pred <- forecast(hwWine, h = nValid, level = 0)
hWine.pred
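
For the 2-month-ahead forecasts that part (b) ultimately asks for, a minimal sketch (keeping the same model specification as above) would refit on the full series and forecast h = 2:

# Refit the chosen specification on the full series and forecast the next 2 months
hwWineFull <- ets(fwineTS, restrict = FALSE, model = "ZNM", alpha = 0.2)
forecast(hwWineFull, h = 2, level = 0)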

• c. Create an ACF plot for the residuals from the Holt–Winter’s exponential smoothing
until lag 12.

Residual.ts <- hWine.pred$residuals
Acf(Residual.ts, lag.max = 12, main = "")

i. Examining this plot, which of the following statements are reasonable conclusions?
• Decembers (month 12) are not captured well by the model.
Reasonable: the model often under-predicts December values by some of the largest margins.
• There is a strong correlation between sales on the same calendar month.
Reasonable: the peaks and valleys for a given calendar month roughly line up with those of the same month in past years.
• The model does not capture the seasonality well.
Unreasonable: while the fit is not perfect, the Holt–Winter’s model does capture the seasonality here.
• We should first deseasonalize the data and then apply Holt–Winter’s exponential smoothing.
Unreasonable: that is not necessary, because the model is already equipped to handle seasonality.

ii. How can you handle the above effect without adding another layer to your model?
autoplot(decompose(fwineTS))

Looking at the decomposition of the original series, it contains both trend and seasonality. We could detrend and deseasonalize the data by differencing twice (a lag-12 seasonal difference followed by a lag-1 difference) and then apply simple exponential smoothing to the differenced series.
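
A minimal sketch of that approach, using the series already loaded above:

# Remove seasonality (lag-12 difference) and trend (lag-1 difference),
# then apply simple exponential smoothing to the doubly differenced series
wineDiff <- diff(diff(fwineTS, lag = 12), lag = 1)
sesWine <- ses(wineDiff, h = 2)
autoplot(sesWine)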
