
In [97]: library('urca')

library("Hmisc")
library('Metrics')
library('zoo')
library('tseries')
library('forecast')
library('fpp2')
library('astsa')
library('dynlm')
library('FinTS')
library('tsutils')

In [98]: load('Data_for_test2.rda')

In [99]: head(Data_for_test2)

A tibble: 6 × 5
 Year Month     v1    v2    v3
<dbl> <dbl>  <dbl> <dbl> <dbl>
 2013     1 453.90  17.9   350
 2013     2 261.88  16.8   220
 2013     3 381.86  15.0   320
 2013     4 581.41  17.9   400
 2013     5 552.45  13.7   320
 2013     6 498.62  16.1   210

In [100]: attach(Data_for_test2)

The following objects are masked _by_ .GlobalEnv:

v1, v2, v3

The following objects are masked from Data_for_test2 (pos = 3):

Month, v1, v2, v3, Year

The following objects are masked from Data_for_test2 (pos = 4):

Month, v1, v2, v3, Year

In [101]: time <- seq_along(v1)

In [102]: length(time)

108

In [103]: head(cbind(time, Year, Month, v1, v2, v3))

A Time Series: 6 × 6
         time Year Month     v1   v2  v3
Jan 2013    1 2013     1 453.90 17.9 350
Feb 2013    2 2013     2 261.88 16.8 220
Mar 2013    3 2013     3 381.86 15.0 320
Apr 2013    4 2013     4 581.41 17.9 400
May 2013    5 2013     5 552.45 13.7 320
Jun 2013    6 2013     6 498.62 16.1 210

In [104]: v1 <- ts(v1, start = c(2013, 1), frequency = 12)

v2 <- ts(v2, start = c(2013, 1), frequency = 12)
v3 <- ts(v3, start = c(2013, 1), frequency = 12)
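A quick sanity check on the ts construction (a sketch, not part of the original output): 108 monthly observations starting in January 2013 should end in December 2021.

frequency(v1)  # 12 (monthly)
end(v1)        # expected: 2021 12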

Question 1: variance of v2: 849.336728971963

In [105]: var(v2)

849.336728971963

Question 2: Generate the year-over-year growth rate of v1 (%). Standard deviation of the growth rate: 18.952%

In [106]: sd(((v1 - Lag(v1, 12))/Lag(v1, 12))[13:108])

0.189528840927677
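The same quantity can be reproduced with base R's lag() on the ts object instead of Hmisc::Lag (a sketch); multiplying by 100 expresses the answer in percent.

growth <- 100 * (v1 - stats::lag(v1, -12)) / stats::lag(v1, -12)  # year-over-year growth of v1, in %
sd(growth)  # should be about 18.95, i.e. 100 times the 0.1895 above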

Question 3: Regress v3 on a time trend. Estimated change of v3 per month: 0.4813

In [107]: q3 <- lm(v3 ~ time)


summary(q3)

Call:
lm(formula = v3 ~ time)

Residuals:
Min 1Q Median 3Q Max
-178.241 -47.499 2.099 51.605 171.941

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 312.1748 14.1740 22.025 <2e-16 ***
time 0.4813 0.2257 2.132 0.0353 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 73.14 on 106 degrees of freedom


Multiple R-squared: 0.04112, Adjusted R-squared: 0.03208
F-statistic: 4.546 on 1 and 106 DF, p-value: 0.0353

Question 4: Regress v1 on a quadratic time trend. At which significance level are both slopes significant? 1%

In [108]: q4 <- lm(v1 ~ I(time^2) + I(time))


summary(q4)

Call:
lm(formula = v1 ~ I(time^2) + I(time))

Residuals:
Min 1Q Median 3Q Max
-147.072 -26.652 -0.508 21.318 173.291

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 410.056537 15.713895 26.095 < 2e-16 ***
I(time^2) 0.033924 0.005915 5.735 9.46e-08 ***
I(time) -0.620013 0.665488 -0.932 0.354
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 53.43 on 105 degrees of freedom


Multiple R-squared: 0.784, Adjusted R-squared: 0.7799
F-statistic: 190.6 on 2 and 105 DF, p-value: < 2.2e-16

Question 5: Generate seasonal dummies for the 12 months automatically. Regress v1 on the seasonal dummies, with December as the base. Estimated January coefficient: -38.668

In [109]: dummy <- seasdummy(108, 12)


q5 <- lm(v1 ~ dummy)
summary(q5)

Call:
lm(formula = v1 ~ dummy)

Residuals:
Min 1Q Median 3Q Max
-246.04 -76.12 -17.02 87.32 201.06

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 539.881 39.101 13.807 <2e-16 ***
dummy1 -38.668 55.297 -0.699 0.4861
dummy2 -93.982 55.297 -1.700 0.0924 .
dummy3 -58.111 55.297 -1.051 0.2960
dummy4 -21.062 55.297 -0.381 0.7041
dummy5 -35.881 55.297 -0.649 0.5180
dummy6 -19.361 55.297 -0.350 0.7270
dummy7 -35.882 55.297 -0.649 0.5180
dummy8 -1.177 55.297 -0.021 0.9831
dummy9 -14.639 55.297 -0.265 0.7918
dummy10 -24.711 55.297 -0.447 0.6560
dummy11 -15.106 55.297 -0.273 0.7853
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 117.3 on 96 degrees of freedom


Multiple R-squared: 0.04823, Adjusted R-squared: -0.06083
F-statistic: 0.4423 on 11 and 96 DF, p-value: 0.9329
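An equivalent construction without tsutils (a sketch using base R only): turn the month cycle of the ts into a factor and set December as the base level; the January coefficient should again come out near -38.668.

m <- factor(cycle(v1), labels = month.abb)  # 1..12 mapped to Jan..Dec
m <- relevel(m, ref = 'Dec')                # December becomes the base
summary(lm(v1 ~ m))                         # coefficient mJan should be close to -38.668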

Question 6: Generate seasonal dummies for the 12 months automatically. Regress v1 on the time trend and the seasonal dummies, with December as the base. At 5%, how many of the January, February and March coefficients are significant? 1 (February)

In [110]: q6 <- lm(v1 ~ time + dummy)


summary(q6)

Call:
lm(formula = v1 ~ time + dummy)

Residuals:
Min 1Q Median 3Q Max
-136.195 -37.907 -0.579 34.972 209.053

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 356.804 23.367 15.270 <2e-16 ***
time 3.051 0.190 16.057 <2e-16 ***
dummy1 -5.104 28.920 -0.176 0.8603
dummy2 -63.469 28.907 -2.196 0.0306 *
dummy3 -30.650 28.895 -1.061 0.2915
dummy4 3.348 28.884 0.116 0.9080
dummy5 -14.522 28.875 -0.503 0.6162
dummy6 -1.053 28.867 -0.036 0.9710
dummy7 -20.626 28.860 -0.715 0.4766
dummy8 11.028 28.855 0.382 0.7032
dummy9 -5.485 28.850 -0.190 0.8496
dummy10 -18.609 28.847 -0.645 0.5204
dummy11 -12.054 28.845 -0.418 0.6770
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 61.19 on 95 degrees of freedom


Multiple R-squared: 0.7437, Adjusted R-squared: 0.7114
F-statistic: 22.98 on 12 and 95 DF, p-value: < 2.2e-16

Question 7: Decompose v1 with an additive model. Among February, March, April and May, which month has the greatest seasonal coefficient? April (2.576094)

In [111]: decom7 <- decompose(v1, type = 'additive')


decom7$seasonal

A Time Series: 9 × 12 (the seasonal component is identical in every year, 2013-2021):
     Jan        Feb        Mar      Apr        May      Jun       Jul       Aug      Sep       Oct      Nov       Dec
5.205260 -35.218646 -11.615677 2.576094 -13.872500 8.185104 -6.536667 28.250677 8.733906 -5.212240 2.930052 16.574635
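Since the seasonal component repeats every year, the 12 distinct monthly effects can also be read from the figure element of the decomposition (a sketch):

round(decom7$figure, 3)  # one value per month; April should be about 2.576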

Question 8: Decompose v1 with a multiplicative model. Trend component for July 2013: 450.2654

In [112]: decom8 <- decompose(v1, type = 'm')


decom8$trend

A Time Series: 9 × 12
          Jan      Feb      Mar      Apr      May      Jun      Jul      Aug      Sep      Oct      Nov      Dec
2013       NA       NA       NA       NA       NA       NA 450.2654 452.8338 453.5171 442.6963 425.6167 412.3771
2014 405.8804 400.0363 389.4117 380.0750 376.1088 368.6950 355.2900 347.7683 346.8267 348.3654 348.7492 346.8529
2015 345.7929 348.8742 357.2538 368.3533 380.5058 395.4650 413.3271 425.1229 433.9754 441.0379 447.5729 459.0967
2016 469.4296 470.0271 465.1171 464.4354 464.4417 462.9496 459.5692 459.2504 460.3046 463.2108 469.0729 473.4137
2017 477.6354 482.9908 487.2308 487.1912 488.0646 489.7533 490.6237 495.8875 502.2962 502.9129 499.5987 494.5850
2018 491.4392 490.9717 492.8304 496.9867 499.9225 502.7504 508.4362 512.6933 516.1079 522.0638 529.9912 537.9921
2019 544.2200 550.3117 556.6592 563.3246 570.0638 578.4817 586.6600 592.6504 598.0417 603.1812 608.7879 615.0804
2020 621.5446 627.9238 634.1996 640.1408 645.6563 649.2892 652.7162 656.5825 659.1337 661.4042 663.4546 665.3096
2021 667.3204 669.7083 672.1775 674.7287 677.6513 681.0796       NA       NA       NA       NA       NA       NA
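A single trend value can be pulled out with window() instead of scanning the table (a sketch):

window(decom8$trend, start = c(2013, 7), end = c(2013, 7))  # expected: about 450.2654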

Question 9: Partial autocorrelation of v1 at order 2: 0.084 is correct; 0.1667 is wrong (it is the lag label 2/12 in years, not a PACF value)

In [113]: pacf(v1[1:108])
pacf(v1[1:108], plot = FALSE)

Partial autocorrelations of series 'v1[1:108]', by lag

1 2 3 4 5 6 7 8 9 10 11
0.876 0.084 0.283 0.215 -0.063 -0.084 0.211 -0.045 -0.024 -0.039 0.048
12 13 14 15 16 17 18 19 20
-0.016 -0.054 0.028 -0.067 -0.172 -0.016 -0.073 0.142 -0.022

In [114]: pacf(v1)
pacf(v1, plot = FALSE)

Partial autocorrelations of series 'v1', by lag

0.0833 0.1667 0.2500 0.3333 0.4167 0.5000 0.5833 0.6667 0.7500 0.8333 0.9167
0.876 0.084 0.283 0.215 -0.063 -0.084 0.211 -0.045 -0.024 -0.039 0.048
1.0000 1.0833 1.1667 1.2500 1.3333 1.4167 1.5000 1.5833 1.6667
-0.016 -0.054 0.028 -0.067 -0.172 -0.016 -0.073 0.142 -0.022
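The order-2 value can also be read off programmatically from the acf object returned by pacf() (a sketch):

pacf(v1, plot = FALSE)$acf[2]  # should be about 0.084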

Question 10: Apply the Dickey-Fuller test to v1, with trend and no lags. At 5%, what is the conclusion? The trend is significant and there is no unit root

In [115]: summary(ur.df(v1, type = 'trend', lags = 0))

###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################

Test regression trend

Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt)

Residuals:
Min 1Q Median 3Q Max
-148.791 -16.327 1.425 17.433 212.651

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 129.52070 27.58379 4.696 8.15e-06 ***
z.lag.1 -0.38335 0.07573 -5.062 1.80e-06 ***
tt 1.25517 0.27430 4.576 1.32e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 47.32 on 104 degrees of freedom


Multiple R-squared: 0.1998, Adjusted R-squared: 0.1844
F-statistic: 12.98 on 2 and 104 DF, p-value: 9.258e-06

Value of test-statistic is: -5.0622 8.7611 12.9832

Critical values for test statistics:


1pct 5pct 10pct
tau3 -3.99 -3.43 -3.13
phi2 6.22 4.75 4.07
phi3 8.43 6.49 5.47
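Reading the output above: tau3 = -5.0622 is below the 5% critical value -3.43, so the unit-root null is rejected, and phi3 = 12.9832 > 6.49 says the trend term matters. The conclusion can be cross-checked with tseries::adf.test, which fits the same constant-plus-trend regression (a sketch; k = 0 means no lagged differences).

adf.test(v1, k = 0)  # small p-value, consistent with tau3 above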

Question 11: Apply the Dickey-Fuller test to the difference of v1, with drift (the regression below is run with a trend term) and 4 lags. At 5%, conclusion: the trend is insignificant and there is no unit root

In [116]: summary(ur.df(diff(v1), type = 'trend', lags = 4))

###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################

Test regression trend

Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
Min 1Q Median 3Q Max
-122.254 -18.471 0.572 12.530 133.122

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -9.71739 8.21223 -1.183 0.2396
z.lag.1 -1.74041 0.33162 -5.248 9.33e-07 ***
tt 0.23333 0.13270 1.758 0.0819 .
z.diff.lag1 0.29177 0.28448 1.026 0.3077
z.diff.lag2 -0.04328 0.21241 -0.204 0.8390
z.diff.lag3 -0.25254 0.14280 -1.769 0.0802 .
z.diff.lag4 -0.13921 0.08435 -1.650 0.1022
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 38.55 on 95 degrees of freedom


Multiple R-squared: 0.7184, Adjusted R-squared: 0.7006
F-statistic: 40.39 on 6 and 95 DF, p-value: < 2.2e-16

Value of test-statistic is: -5.2482 9.3939 14.0593

Critical values for test statistics:


1pct 5pct 10pct
tau3 -3.99 -3.43 -3.13
phi2 6.22 4.75 4.07
phi3 8.43 6.49 5.47

Question 12: Fit an ARMA(1,1) model to the difference of v2, estimated by least squares, with drift. At 5%, what is the conclusion? AR(1) is insignificant and MA(1) is significant

In [117]: summary(arma(diff(v2), order = c(1,1), include.intercept = TRUE))

Call:
arma(x = diff(v2), order = c(1, 1), include.intercept = TRUE)

Model:
ARMA(1,1)

Residuals:
Min 1Q Median 3Q Max
-32.2284 -4.7637 -2.5516 0.6518 53.2114

Coefficient(s):
Estimate Std. Error t value Pr(>|t|)
ar1 -0.2213 0.1337 -1.655 0.0979 .
ma1 -0.5442 0.1115 -4.879 1.07e-06 ***
intercept 1.0945 0.6000 1.824 0.0681 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Fit:
sigma^2 estimated as 176.6, Conditional Sum-of-Squares = 18547.76, AIC = 863.29

Question 13: Fit an ARMA(1,1) model to the difference of v2, with drift. The last residual: 2.66871769

In [118]: summary(Arima(v2, order = c(1,1,1), include.drift = TRUE))

Series: v2
ARIMA(1,1,1) with drift

Coefficients:
ar1 ma1 drift
-0.2193 -0.5397 0.8916
s.e. 0.1337 0.1131 0.4868

sigma^2 = 178.4: log likelihood = -427.96


AIC=863.93 AICc=864.32 BIC=874.62

Training set error measures:
                      ME     RMSE      MAE       MPE    MAPE      MASE          ACF1
Training set -0.03917564 13.10755 7.786111 -19.80584 32.51068 0.6226295 -0.0002143489

In [119]: Arima(v2, order = c(1,1,1), include.drift = TRUE)$residuals

A Time Series: 9 × 12
              Jan          Feb          Mar          Apr          May          Jun          Jul          Aug          Sep          Oct          Nov          Dec
2013   0.01700835  -1.57196940  -3.60517612  -0.42282269  -4.85648567  -2.21603231  -7.75292279  -2.48559443  -0.42959015   0.02219384  -0.23396589  -1.38185599
2014  -1.09875573  -6.00476695  -8.41483135   3.29132457  -2.40575517  -2.52578082  -3.23113902  -3.15037383  -3.20944572   4.59289215  -8.36369601  -2.30095047
2015  -2.75733113  -1.14812369  -3.61215133   3.65893756  -3.33356822  -3.65811680  -2.85193821  -3.20451537  -2.14829914  -2.17124908  -4.78097192  -2.71577527
2016   0.57594420  -2.16233913  -5.09276071   1.73786574  -2.01163772  -3.37453541   7.48186027  -2.94661532  -7.17549131  -3.45204802   2.01010481  -2.11551837
2017  -3.90867786  -5.91606933  -2.82843391   5.82476872  -5.78917067  -3.01670188  -1.59180659  -1.85859203  -5.29029188  15.65584044  -1.58601503  -7.38128206
2018  -4.21926932   7.84252058 -11.66424734  -4.95411318 -16.20164707  14.44235693   0.49385233  -4.59557789  -2.50874366  20.34383524  -7.57333546  -6.78624275
2019   0.67361756   1.30697704 -25.26244237  -4.22578690   0.24022836  13.26445379  -2.01424287  -4.23003773  -2.78255360  -3.57936084  -1.56024963  44.04348119
2020 -10.41671147 -15.55965339 -32.15375955 -10.14101083  -3.24446813  39.94939874  -5.26231560 -11.22351827  44.94277805 -28.72804709  53.32787463  43.47026598
2021  25.10547107  -9.41311618 -28.72310029  18.85081925   4.70442462  22.00163109   8.21878080   0.56056203   4.97423658   1.91323818   0.12625759   2.66871769
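The last residual can be pulled out directly instead of reading it off the table (a sketch):

tail(Arima(v2, order = c(1,1,1), include.drift = TRUE)$residuals, 1)  # about 2.6687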

Question 14: Estimate an ARIMA(2,1,2) model for v2, with drift, by maximum likelihood. Estimated AR(2) coefficient: 0.0344

In [120]: summary(Arima(v2, order = c(2,1,2), include.drift = TRUE))

Series: v2
ARIMA(2,1,2) with drift

Coefficients:
ar1 ar2 ma1 ma2 drift
-0.3260 0.0344 -0.4259 -0.0946 0.8916
s.e. 0.7577 0.2136 0.7521 0.4083 0.4788

sigma^2 = 181.5: log likelihood = -427.84


AIC=867.68 AICc=868.52 BIC=883.72

Training set error measures:
                      ME     RMSE      MAE       MPE    MAPE      MASE        ACF1
Training set -0.04260786 13.09288 7.834108 -19.74718 32.57282 0.6264676 -0.00598096

Question 15: Estimate an ARIMA(2,1,1) model for v2, with drift, by maximum likelihood. Test the residuals for serial correlation at 5%: p-value 0.02, so reject H0; the residuals are serially correlated

In [121]: checkresiduals(Arima(v2, order = c(2,1,1), include.drift = TRUE))

Ljung-Box test

data: Residuals from ARIMA(2,1,1) with drift


Q* = 33.236, df = 19, p-value = 0.02257

Model df: 3. Total lags used: 22
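checkresiduals() is running a Ljung-Box test on the residuals; roughly the same result should come from the explicit call below (a sketch; fitdf = 3 matches the 3 estimated parameters and lag = 22 the total lags reported above).

fit15 <- Arima(v2, order = c(2,1,1), include.drift = TRUE)
Box.test(residuals(fit15), lag = 22, type = 'Ljung-Box', fitdf = 3)  # p-value about 0.023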

Question 16: Consider the model … and test for ARCH(1). At 5%: the chi-squared statistic exceeds the critical value, so there is an ARCH effect (chisq crit = 0.0039)

In [122]: ArchTest(diff(v2), lags = 1, demean = TRUE)

ARCH LM-test; Null hypothesis: no ARCH effects

data: diff(v2)
Chi-squared = 24.875, df = 1, p-value = 6.118e-07
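The LM statistic can be reproduced by hand (a sketch): regress the squared, demeaned differences on their own first lag; n times the R-squared of that regression is approximately the chi-squared value above.

u  <- as.numeric(diff(v2) - mean(diff(v2)))  # demeaned changes of v2
u2 <- u^2                                    # squared series
summary(lm(u2[-1] ~ u2[-length(u2)]))        # ARCH(1) auxiliary regression; n * R-squared is roughly 24.9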

Question 17: Estimate an ARIMA(2,1,1) model for v2, with drift, by maximum likelihood. Forecast of v2 for February 2022: 112.7339

In [123]: forecast(Arima(v2, order = c(2,1,1), include.drift = TRUE), h = 4)

Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
Jan 2022 111.3208 94.13509 128.5066 85.03750 137.6042
Feb 2022 112.7339 95.03159 130.4362 85.66055 139.8073
Mar 2022 113.4048 94.21030 132.5994 84.04932 142.7604
Apr 2022 114.3647 94.22565 134.5037 83.56470 145.1647

Question 18: Fit a seasonal autoregression SAR(1)(1)[6] to the difference of v2. Which statement is correct? Negative AR(1) and positive seasonal AR(1)

In [124]: summary(ur.df(diff(v2), type = 'none', selectlags = 'AIC'))

###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################

Test regression none

Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)

Residuals:
Min 1Q Median 3Q Max
-32.320 -3.113 -0.226 3.636 51.530

Coefficients:
Estimate Std. Error t value Pr(>|t|)
z.lag.1 -1.89146 0.17042 -11.099 <2e-16 ***
z.diff.lag 0.20953 0.09637 2.174 0.032 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.8 on 103 degrees of freedom


Multiple R-squared: 0.7914, Adjusted R-squared: 0.7873
F-statistic: 195.4 on 2 and 103 DF, p-value: < 2.2e-16

Value of test-statistic is: -11.0987

Critical values for test statistics:


1pct 5pct 10pct
tau1 -2.58 -1.95 -1.62

In [125]: summary(Arima(diff(v2), order = c(1,0,0), include.mean = FALSE, seasonal = list(order = c(1,0,0), period = 6)))

Series: diff(v2)
ARIMA(1,0,0)(1,0,0)[6] with zero mean

Coefficients:
ar1 sar1
-0.5917 0.1589
s.e. 0.0790 0.0957

sigma^2 = 190.2: log likelihood = -431.89


AIC=869.78 AICc=870.01 BIC=877.79

Training set error measures:


ME RMSE MAE MPE MAPE MASE ACF1
Training set 1.188448 13.66175 7.96681 Inf Inf 0.6457739 -0.1550759

Question 19: Fit a seasonal moving average SMA(2)(2)[6] to the difference of v2. Coefficient of MA(2): 0.2639

In [126]: summary(Arima(diff(v2), order = c(0,0,2), include.mean = FALSE, seasonal = list(order = c(0,0,2), period = 6)))

Series: diff(v2)
ARIMA(0,0,2)(0,0,2)[6] with zero mean

Coefficients:
ma1 ma2 sma1 sma2
-0.8408 0.2639 0.1378 0.4146
s.e. 0.0974 0.0907 0.0976 0.1036

sigma^2 = 155.4: log likelihood = -421.29


AIC=852.57 AICc=853.16 BIC=865.93

Training set error measures:


ME RMSE MAE MPE MAPE MASE ACF1
Training set 1.333478 12.23153 6.941837 Inf Inf 0.5626916 -0.03630357

Question 20: Fit a multiplicative Holt-Winters model to v1. Trend (slope) coefficient: 2.4226310

In [132]: HoltWinters(v1, seasonal = 'm')

Holt-Winters exponential smoothing with trend and multiplicative seasonal component.

Call:
HoltWinters(x = v1, seasonal = "m")

Smoothing parameters:
alpha: 0.5174382
beta : 0.03114415
gamma: 0.7133889

Coefficients:
[,1]
a 666.5654117
b 2.4226310
s1 1.0599292
s2 0.9909977
s3 0.9902070
s4 0.9956686
s5 0.9832787
s6 1.0032476
s7 1.0096582
s8 1.0272691
s9 1.0475569
s10 1.0611815
s11 1.0787814
s12 1.0938726
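The fitted smoother can be passed straight to forecast() for out-of-sample values (a sketch); the level a and slope b above are the starting point of those forecasts.

hw <- HoltWinters(v1, seasonal = 'm')
forecast(hw, h = 12)  # forecasts for the next 12 months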
