EXAM1 - Muhibbul Arman Mannan: List Ls

EXAM1 - Muhibbul Arman Mannan
01/11/1023
INSTRUCTIONS: INCLUDE CODES AS WELL AS ANSWERS/COMMENTS IN THE R MARKDOWN UNDER THE CODE CHUNK. DON’T DELETE THE ** AND PLACE YOUR ANSWERS BETWEEN THEM
Example: ANSWER/COMMENT: THIS IS THE ANSWER
remove(list=ls())
library(fpp3)
## Warning: package ’fpp3’ was built under R version 4.2.3
## -- Attaching packages ---------------------------------------------- fpp3 0.5 --
## v tibble 3.2.1 v tsibble 1.1.3

## v dplyr 1.1.1 v tsibbledata 0.4.1
## v tidyr 1.3.0 v feasts 0.3.1
## v lubridate 1.9.2 v fable 0.3.3
## v ggplot2 3.4.2 v fabletools 0.3.3
## Warning: package ’tibble’ was built under R version 4.2.3
## Warning: package ’dplyr’ was built under R version 4.2.3
## Warning: package ’tidyr’ was built under R version 4.2.2
## Warning: package ’lubridate’ was built under R version 4.2.2
## Warning: package ’ggplot2’ was built under R version 4.2.3
## Warning: package ’tsibble’ was built under R version 4.2.3
## Warning: package ’tsibbledata’ was built under R version 4.2.3
## Warning: package ’feasts’ was built under R version 4.2.3
## Warning: package ’fabletools’ was built under R version 4.2.3
## Warning: package ’fable’ was built under R version 4.2.3
## -- Conflicts ------------------------------------------------- fpp3_conflicts --
## x lubridate::date() masks base::date()
## x dplyr::filter() masks stats::filter()
## x tsibble::intersect() masks base::intersect()
## x tsibble::interval() masks lubridate::interval()
## x dplyr::lag() masks stats::lag()
## x tsibble::setdiff() masks base::setdiff()
## x tsibble::union() masks base::union()
library(tsibble)# do not forget to call the library, even for data.
We will be using the GDP information in the global_economy dataset.
View(global_economy)
PART 1
a) Plot the GDP per capita over time for 3 countries of your choice. Which countries did you
choose? AND How has GDP per capita changed over time for these 3 countries?
ANSWER/COMMENT: I selected three countries: the United States, Canada and Australia. GDP per capita has increased consistently over time in all three, which indicates an upward trend.
my_choice <- c("USA", "CAN", "AUS")
global_economy %>%
  filter(Code %in% my_choice) %>%
  as_tsibble(key = Code, index = Year) %>%
  ggplot(aes(x = Year, y = GDP / Population, color = Code)) +
  geom_line() +
  labs(
    title = "GDP Over Time",
    y = "GDP",
    x = "Year"
  )
[Figure: "GDP Over Time": line plot of GDP / Population (y-axis labelled GDP, 0 to 60000) against Year (1960 to 2000), colored by Code (AUS, CAN, USA).]
b) Out of ALL the countries in the dataset, which one had the highest GDP per capita in
2017? Filter and mutate the data as needed.
ANSWER/COMMENT: Luxembourg had the highest GDP per capita in 2017, at about 104,103 (GDP / Population).
highest_gdp <- global_economy %>%
  filter(Year == 2017) %>%
  mutate(GdpPerPoppulation = GDP / Population) %>%
  arrange(desc(GdpPerPoppulation)) %>%
  head(1)
highest_gdp
## # A tsibble: 1 x 10 [1Y]
## # Key: Country [1]
## Country Code Year GDP Growth CPI Imports Exports Popul~1 GdpPe~2
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Luxembourg LUX 2017 6.24e10 2.30 111. 194. 230. 599449 104103.
## # ... with abbreviated variable names 1: Population, 2: GdpPerPoppulation
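As a rough cross-check of the value above, GDP per capita can be recomputed by hand from the printed row. This is only a sketch: the GDP figure used here is the rounded 6.24e10 shown in the output, so the result agrees with the printed 104103 only approximately. The variable names are illustrative.

```r
# Values copied from the printed row for Luxembourg in 2017
gdp <- 6.24e10        # rounded to three significant figures in the printout
population <- 599449

# GDP per capita is GDP divided by population
gdp_per_capita <- gdp / population
print(round(gdp_per_capita))
```

The small gap to the printed value comes from the rounding of GDP in the tsibble printout, not from a different formula.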
PART 2.
Use the canadian_gas data (monthly Canadian gas production in billions of cubic metres, January 1960 –
February 2005).
a) Plot Volume using autoplot, gg_subseries, gg_season to look at the effect of changing
seasonality over time. What do you observe with the seasonality?
ANSWER/COMMENT: The Canadian gas data show an upward trend and strong seasonality, with production lower in summer and higher in winter. The seasonal swings became much larger from about 1975 to 1990, when the gap between summer and winter output widened.
canadian_gas %>%
  autoplot(Volume)+
  labs(title = "Gas production of canada by monthly",
       subtitle = "autoplot()",
       y = " bcm")+
  theme_replace()+
  geom_line(col = "black")
[Figure: "Gas production of canada by monthly", subtitle "autoplot()": bcm (10 to 20) against Month [1M], 1960 Jan to 2000 Jan.]
canadian_gas %>%
  gg_subseries(Volume)+
  labs(title = "Gas production of canada by monthly",
       subtitle = "gg_subseries()",
       y = " bcm")
[Figure: "Gas production of canada by monthly", subtitle "gg_subseries()": one panel per month (Jan to Dec), bcm (5 to 20) against Month, 1960 to 2000.]
canadian_gas %>%
  gg_season(Volume)+
  labs(title = "Monthly Gas Production of Canada",
       subtitle = "gg_season()",
       y = "bcm")
[Figure: "Monthly Gas Production of Canada", subtitle "gg_season()": bcm (10 to 20) against Month (Jan to Dec), one line per year (legend: 1969, 1979, 1989, 1999).]
b) Do an STL decomposition of the data. You will need to choose a seasonal window to allow
for the changing shape of the seasonal component
canadian_gas %>%
  model(
    STL(Volume ~ trend(window = 21) +
          season(window = 13),
        robust = TRUE)) %>%
  components() %>%
  autoplot()+
  labs(title = "decomposition of canadian gas production")
[Figure: "decomposition of canadian gas production", Volume = trend + season_year + remainder: four panels (Volume, trend, season_year, remainder) against Month.]
c) How does seasonal SHAPE change over time? Plot season_year using gg_season().
ANSWER/COMMENT: The seasonal shape is not constant. It is nearly flat at the start of the series in 1960, and the seasonal swings become more pronounced from about 1975, so the seasonal pattern changes gradually over time.
canadian_gas %>%
gg_season(Volume)+
labs(title = "Monthly Gas Production of Canada",
subtitle = "gg_season()",
y = "bcm")
[Figure: "Monthly Gas Production of Canada", subtitle "gg_season()": bcm (10 to 20) against Month (Jan to Dec), one line per year (legend: 1969, 1979, 1989, 1999).]
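Part c) asks for a plot of season_year, whereas the chunk above plots Volume. A minimal sketch of the season_year version follows, assuming the same STL windows as in part b) and that the fpp3 package is installed. The object names gas_components and season_plot are illustrative and not part of the original submission.

```r
library(fpp3)

# Fit the same STL specification used in part b) and keep its components
gas_components <- canadian_gas %>%
  model(
    STL(Volume ~ trend(window = 21) + season(window = 13),
        robust = TRUE)
  ) %>%
  components()

# Plot the seasonal component by month, with one line per year
season_plot <- gas_components %>%
  gg_season(season_year) +
  labs(title = "Seasonal component of Canadian gas production",
       subtitle = "gg_season(season_year)",
       y = "bcm")

print(season_plot)
```

Plotting the seasonal component rather than Volume removes the trend from the picture, so any change in the seasonal shape across years is easier to see.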
d) Can you produce a plausible seasonally adjusted series?
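ANSWER/COMMENT: Yes. The season_adjust column of the STL components gives the seasonally adjusted series (Volume with the seasonal component removed). It follows the trend without the within-year swings.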
ad_plot <- canadian_gas %>%
  model(
    STL(Volume ~ trend(window = 21) +
          season(window = 13),
        robust = TRUE)
  ) %>%
  components() %>%
  ggplot(aes(x = Month)) +
  geom_line(aes(y = Volume, colour = "original data")) +
  geom_line(aes(y = season_adjust, colour = "Seasonally Adjusted data")) +
  geom_line(aes(y = trend, colour = "Trend Component")) +
  labs(title = "seasonally adjusted series")
ad_plot
[Figure: "seasonally adjusted series": Volume against Month, with three lines (legend "colour": original data, Seasonally Adjusted data, Trend Component).]
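The seasonally adjusted series plotted above should be the original series minus the seasonal component, which for an additive STL decomposition is the same as trend plus remainder. The following sketch checks that identity numerically, assuming the fpp3 package is installed and the same STL windows as above. The names gas_components, gap_vs_subtraction and gap_vs_trend_remainder are illustrative.

```r
library(fpp3)

# Refit the STL decomposition used above and extract its components
gas_components <- canadian_gas %>%
  model(
    STL(Volume ~ trend(window = 21) + season(window = 13),
        robust = TRUE)
  ) %>%
  components()

# Largest absolute difference between season_adjust and Volume - season_year
gap_vs_subtraction <- max(abs(
  gas_components$season_adjust -
    (gas_components$Volume - gas_components$season_year)
))

# Largest absolute difference between season_adjust and trend + remainder
gap_vs_trend_remainder <- max(abs(
  gas_components$season_adjust -
    (gas_components$trend + gas_components$remainder)
))

print(c(gap_vs_subtraction, gap_vs_trend_remainder))
```

Both gaps should be zero up to floating-point error if the decomposition is additive as stated in the plot subtitle (Volume = trend + season_year + remainder).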
PART 3
Aus Retail Time Series. We will use the aus_retail dataset. Using the code below, get a series (it picks a series at random using the sample() function):
set.seed(1234567)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
head(myseries)
## # A tsibble: 6 x 5 [1M]
## # Key: State, Industry [1]
## State Industry Serie~1 Month Turno~2
## <chr> <chr> <chr> <mth> <dbl>
## 1 Victoria Cafes, restaurants and takeaway food servic~ A33494~ 1982 Apr 85.1
## 2 Victoria Cafes, restaurants and takeaway food servic~ A33494~ 1982 May 85.1
## 3 Victoria Cafes, restaurants and takeaway food servic~ A33494~ 1982 Jun 82.8
## 4 Victoria Cafes, restaurants and takeaway food servic~ A33494~ 1982 Jul 82.1
## 5 Victoria Cafes, restaurants and takeaway food servic~ A33494~ 1982 Aug 81.8
## 6 Victoria Cafes, restaurants and takeaway food servic~ A33494~ 1982 Sep 84.6
## # ... with abbreviated variable names 1: ‘Series ID‘, 2: Turnover
# remove NA's in the series with below:
myseries = myseries %>% filter(!is.na(`Series ID`))
nrow(myseries)
## [1] 441
# rename the column name `Series ID` with MyRandomSeries

rename(myseries, MyRandomSeries = `Series ID`)
## # A tsibble: 441 x 5 [1M]

## # Key: State, Industry [1]
## State Industry MyRan~1 Month Turno~2
## <chr> <chr> <chr> <mth> <dbl>
## 1 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Apr 85.1
## 2 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 May 85.1
## 3 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Jun 82.8
## 4 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Jul 82.1
## 5 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Aug 81.8
## 6 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Sep 84.6
## 7 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Oct 91.7
## 8 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Nov 97.7
## 9 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1982 Dec 109.
## 10 Victoria Cafes, restaurants and takeaway food servi~ A33494~ 1983 Jan 94.6
## # ... with 431 more rows, and abbreviated variable names 1: MyRandomSeries,
## # 2: Turnover
a) Run a linear regression of Turnover on its trend. (Hint: use the TSLM() and trend() functions.)
my_model <- myseries %>%
  model(TSLM(Turnover ~ trend()))
report(my_model)
## Series: Turnover
## Model: TSLM
##
## Residuals:
## Min 1Q Median 3Q Max
## -125.471 -50.951 -9.889 48.598 242.364
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -23.529 6.121 -3.844 0.000139 ***
## trend() 1.921 0.024 80.057 < 2e-16 ***
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Residual standard error: 64.17 on 439 degrees of freedom
## Multiple R-squared: 0.9359, Adjusted R-squared: 0.9357
## F-statistic: 6409 on 1 and 439 DF, p-value: < 2.22e-16
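b) Forecast for the next 3 years. What are the values for the next 3 years? Are they monthly values?
ANSWER/COMMENT: Yes, they are monthly values. With h = 36 the forecast covers 36 months starting in 2019 Jan. The point forecasts start at about 826 and rise by about 1.9 per month.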
forecast_result <- forecast(my_model, h = 36)

forecast_result
## # A fable: 36 x 6 [1M]
## # Key: State, Industry, .model [1]
## State Industry .model Month Turnover .mean
## <chr> <chr> <chr> <mth> <dist> <dbl>
## 1 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Jan N(826, 4155) 826.
## 2 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Feb N(828, 4155) 828.
## 3 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Mar N(830, 4155) 830.
## 4 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Apr N(832, 4155) 832.
## 5 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 May N(833, 4156) 833.
## 6 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Jun N(835, 4156) 835.
## 7 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Jul N(837, 4156) 837.
## 8 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Aug N(839, 4156) 839.
## 9 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Sep N(841, 4157) 841.
## 10 Victoria Cafes, restaurants and takeaway ~ TSLM(~ 2019 Oct N(843, 4157) 843.
## # ... with 26 more rows
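The forecast means and variances printed above can be reproduced approximately by hand from the regression output. The sketch below is a worked check, not the fable implementation: it assumes that trend() counts 1, 2, ..., n over the 441 fitted months, and it uses the rounded coefficients and residual standard error from the report() output, so results match the printed table only to within rounding. All variable names are illustrative.

```r
# Coefficients and residual standard error copied from the report() output
intercept <- -23.529
slope <- 1.921
sigma <- 64.17

n <- 441                    # months in the fitted series (1982 Apr to 2018 Dec)
h <- 1:36                   # three years of monthly horizons
t_future <- n + h           # assumed trend values for the forecast months

# Point forecasts from the fitted line
point_forecast <- intercept + slope * t_future

# Forecast variance for simple linear regression on a 1..n trend
t_bar <- (n + 1) / 2
s_xx <- n * (n^2 - 1) / 12  # sum of squared deviations of 1..n from their mean
forecast_var <- sigma^2 * (1 + 1 / n + (t_future - t_bar)^2 / s_xx)

# 95% prediction interval for the first forecast month (2019 Jan)
half_width <- qnorm(0.975) * sqrt(forecast_var[1])
interval_95 <- c(point_forecast[1] - half_width, point_forecast[1] + half_width)

print(round(head(point_forecast, 3), 1))
print(round(forecast_var[1]))
print(round(interval_95))
```

The first point forecast and variance come out close to the printed N(826, 4155). The variance is dominated by the residual variance (64.17 squared), which is why it barely changes across the 36 months.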
c) Autoplot the forecast with original data
myseries %>%
autoplot(Turnover) +
autolayer(forecast_result, series = "Turnover") +
labs(title = "TS plot with Forecast", x = "Time", y = "Turnover")
## Warning in distributional::geom_hilo_ribbon(intvl_mapping, data =
## dplyr::anti_join(interval_data, : Ignoring unknown parameters: ‘series‘
## Warning in distributional::geom_hilo_linerange(intvl_mapping, data =
## dplyr::semi_join(interval_data, : Ignoring unknown parameters: ‘series‘
## Warning in geom_line(mapping = mapping, data = dplyr::anti_join(object, :
## Ignoring unknown parameters: ‘series‘
## Warning in ggplot2::geom_point(mapping = mapping, data =
## dplyr::semi_join(object, : Ignoring unknown parameters: ‘series‘
[Figure: "TS plot with Forecast": Turnover (300 to 900) against Time (1990 Jan to 2020 Jan), with 80% and 95% forecast intervals (legend: level).]
d) Get the residuals, does it satisfy requirements for white noise error terms? Hint:
gg_tsresiduals()
The first panel shows the innovation residuals of the damped-trend ETS model (MAdM), and they do not look like white noise. The ACF plot also shows spikes outside the 5% significance bounds, which indicates that the residuals are not white noise.
residual_diagnostic_model <- myseries %>%
  model(MAdM = ETS(Turnover ~ error("M") + trend("Ad") + season("M")))
residual_diagnostic_model %>% gg_tsresiduals() + labs(title = "Residual diagnostics")
[Figure: "Residual diagnostics" from gg_tsresiduals(): innovation residuals against Month (1990 Jan to 2020 Jan), ACF of the residuals against lag [1M] (6 to 24), and a histogram of .resid (count).]
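For reading the ACF panel, the significance bounds on an ACF plot are approximately ±1.96/sqrt(T), where T is the number of observations. The sketch below computes that bound for T = 441 (the nrow(myseries) value reported earlier) and then shows how a Ljung-Box test is called in base R. The test is run on synthetic white noise purely as an illustration; it does not use the retail residuals and says nothing about them.

```r
# Approximate 95% bounds for sample autocorrelations of white noise
n <- 441                                # observations in myseries
acf_bound <- qnorm(0.975) / sqrt(n)     # bounds are +/- this value
print(round(acf_bound, 3))

# Illustration on synthetic white noise (NOT the retail residuals)
set.seed(1)
synthetic_noise <- rnorm(n)
lb <- Box.test(synthetic_noise, lag = 24, type = "Ljung-Box")
print(lb$p.value)
```

ACF spikes that are larger in absolute value than acf_bound, or a small Ljung-Box p-value, would count as evidence against white noise.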