The webpages for the texts and some help on using R for time series analysis can be found at https://nickpoison.github.io/.
If we have a trend stationary time series, we use detrending to recover the stationary component.
If we have a random walk time series, we use differencing to obtain a stationary time series.
Our time series needs to be stationary for averaging values over time to make sense.
We use the sample autocorrelation to measure (estimate) the dependence between values.
When we use autocorrelation, we are assuming that the dependence between values is constant over the time interval. This requires:
stationarity in mean
stationarity in autocorrelation
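These two requirements can be illustrated by simulation; a minimal sketch (series names and parameters are illustrative) comparing a stationary series with a random walk:

```r
set.seed(1)
n <- 500
wn <- rnorm(n)          # white noise: stationary in mean and autocorrelation
rw <- cumsum(rnorm(n))  # random walk: not stationary in mean

# the sample mean of white noise is stable across halves of the series ...
mean(wn[1:250]); mean(wn[251:500])
# ... while the random walk's local mean wanders
mean(rw[1:250]); mean(rw[251:500])

# differencing the random walk recovers its white-noise increments
acf(diff(rw), plot = FALSE)$acf[2]  # small lag-1 autocorrelation
```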
The Johnson & Johnson series has a mean that increases exponentially over time, and the increase in the
magnitude of the fluctuations around this trend causes changes in the covariance function; the variance of the
process, for example, clearly increases as one progresses over the length of the series.
Johnson and Johnson quarterly earnings per share, 84 quarters (21 years) measured from the first quarter of
1960 to the last quarter of 1980.
Note the gradually increasing underlying trend and the rather regular variation superimposed on the trend that
seems to repeat over quarters.
data(
list = "jj",
package = "astsa"
)
astsa::tsplot(
x = jj,
col = 4,
type="o",
ylab = "Quarterly Earnings per Share"
)
The global temperature series shown contains some evidence of a trend over time.
data(
list = "globtemp",
package = "astsa"
)
astsa::tsplot(
x = globtemp,
col = 4,
type = "o",
ylab = "Global Temperature Deviations"
)
Trend stationary
The trend stationary model is the easiest form of nonstationarity to work with: it has stationary behavior around a trend.
$$x_t = \mu_t + y_t$$
where $\mu_t$ is the trend and $y_t$ is a stationary process.
Frequently we will estimate the trend, then find the stationary process by working with the residuals
$$\hat{y}_t = x_t - \hat{\mu}_t$$
For example, with a linear trend,
$$x_t = \mu_t + y_t, \qquad \mu_t = \beta_0 + \beta_1 t$$
data(
list = "chicken",
package = "astsa"
)
lm(
formula = chicken ~ time(chicken)
)
##
## Call:
## lm(formula = chicken ~ time(chicken))
##
## Coefficients:
## (Intercept) time(chicken)
## -7131.022 3.592
astsa::tsplot(
x = chicken,
main = "original time series"
)
$$\hat{y}_t = x_t + 7131 - 3.59\,t$$
plot(
x = time(chicken),
y = chicken - predict(
object = lm(
formula = chicken ~ time(chicken)
)
),
type = "l"
)
astsa::tsplot(
x = diff(
x = chicken
),
main = "first difference"
)
For a random walk with drift,
$$x_t = \delta + x_{t-1} + w_t,$$
where $\delta$ is the drift and $w_t$ is white noise. Since $\delta$ is constant and $E(w_t) = 0$, the difference of consecutive observations, $\nabla x_t = \delta + w_t$, has constant expected value.
Similarly, if $y_t$ is stationary, then $z_t = y_t - y_{t-1}$ is stationary, with autocovariance
$$\gamma_z(h) = \mathrm{cov}(z_{t+h}, z_t) = \gamma_y(h) - \gamma_y(h+1) - \gamma_y(h-1) + \gamma_y(h) = 2\gamma_y(h) - \gamma_y(h+1) - \gamma_y(h-1)$$
A disadvantage of differencing is that it does not provide an estimate of the stationary component yt .
Use differencing when you want a stationary time series from a non-stationary time series.
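As a sanity check on this advice (a sketch with a made-up drift $\delta = 0.2$), differencing a simulated random walk with drift yields a series whose mean estimates the drift:

```r
set.seed(42)
delta <- 0.2
w <- rnorm(500)          # white noise
x <- cumsum(delta + w)   # random walk with drift: x_t = delta + x_{t-1} + w_t
mean(diff(x))            # roughly constant-mean series; the mean estimates delta
```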
If $x_t = \mu_t + y_t$ and $\mu_t = \beta_0 + \beta_1 t$, then
$$\nabla x_t = (\beta_0 + \beta_1 t + y_t) - (\beta_0 + \beta_1 (t-1) + y_{t-1}) = \beta_1 + y_t - y_{t-1}$$
differencing notation
$$\nabla x_t = x_t - x_{t-1}$$
The backshift operator $B$ shifts a series back one step, $Bx_t = x_{t-1}$, so
$$B^2 x_t = B(Bx_t) = B(x_{t-1}) = x_{t-2}, \qquad B^k x_t = x_{t-k}$$
$$B^{-1}Bx_t = x_t = BB^{-1}x_t \qquad (B^{-1} \text{ is the forward shift operator})$$
$$B^0 x_t = x_t$$
$$\nabla x_t = (B^0 - B)x_t$$
$$\nabla^2 x_t = (B^0 - B)^2 x_t = (B^0 - 2B + B^2)x_t = x_t - 2x_{t-1} + x_{t-2}$$
Equivalently,
$$\nabla^2 x_t = \nabla(x_t - x_{t-1}) = (x_t - x_{t-1}) - (x_{t-1} - x_{t-2}) = x_t - 2x_{t-1} + x_{t-2}$$
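The second-difference identity can be checked numerically; in R, `diff(x, differences = 2)` computes $\nabla^2 x_t$:

```r
x <- c(1, 4, 9, 16, 25)                 # x_t = t^2, so the second difference is constant
d2 <- diff(x, differences = 2)
manual <- x[3:5] - 2 * x[2:4] + x[1:3]  # x_t - 2*x_{t-1} + x_{t-2} by hand
d2      # 2 2 2
manual  # 2 2 2
```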
Other filters, formed by averaging values near x t , can produce adjusted series that eliminate other kinds of
unwanted fluctuations.
The differencing technique is an important component of the ARIMA model of Box and Jenkins.
The differenced series does not contain the long (five-year) cycle we observe in the detrended series.
The differenced series exhibits an annual cycle that was obscured in the original or detrended data.
plot the autocorrelation of the time series, detrended time series, and the differences
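The first block of autocorrelation values below corresponds to the original series; the call that produces it is not shown, but it would presumably parallel the detrended and differenced versions that follow:

```r
astsa::acf1(
  series = chicken,
  max.lag = 48,
  main = "chicken"
)
```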
## [1] 0.99 0.97 0.95 0.93 0.91 0.89 0.87 0.86 0.84 0.82 0.80 0.78 0.75 0.73 0.71
## [16] 0.68 0.66 0.63 0.61 0.59 0.57 0.55 0.53 0.50 0.48 0.46 0.44 0.43 0.41 0.40
## [31] 0.38 0.37 0.37 0.36 0.35 0.34 0.33 0.31 0.30 0.28 0.27 0.26 0.25 0.24 0.23
## [46] 0.22 0.21 0.20
astsa::acf1(
series = astsa::detrend(
series = chicken
),
max.lag = 48,
main = "detrended"
)
## [1] 0.97 0.91 0.83 0.75 0.68 0.61 0.56 0.51 0.48 0.46 0.43 0.39
## [13] 0.33 0.26 0.20 0.14 0.08 0.03 0.00 -0.03 -0.04 -0.05 -0.07 -0.10
## [25] -0.13 -0.18 -0.21 -0.24 -0.25 -0.25 -0.23 -0.20 -0.16 -0.13 -0.11 -0.10
## [37] -0.11 -0.13 -0.14 -0.16 -0.17 -0.16 -0.15 -0.13 -0.10 -0.08 -0.05 -0.04
astsa::acf1(
series = diff(
x = chicken
),
max.lag = 48,
main = "first difference"
)
## [1] 0.72 0.39 0.09 -0.07 -0.16 -0.20 -0.27 -0.23 -0.11 0.09 0.26 0.33
## [13] 0.20 0.07 -0.03 -0.10 -0.19 -0.25 -0.29 -0.20 -0.08 0.08 0.16 0.18
## [25] 0.08 -0.06 -0.21 -0.31 -0.40 -0.40 -0.33 -0.18 0.02 0.20 0.30 0.35
## [37] 0.26 0.13 -0.02 -0.14 -0.23 -0.21 -0.18 -0.11 -0.03 0.08 0.21 0.33
Rather than detrend the data, it would be more appropriate to use differencing to coerce it into stationarity.
In this case it appears that the differenced process shows minimal autocorrelation, which may imply the global
temperature series is nearly a random walk with drift.
It is interesting to note that if the series is a random walk with drift, the mean of the differenced series, which is
an estimate of the drift, is about .008, or an increase of about one degree centigrade per 100 years.
data(
list = c("globtemp","gtemp"),
package = "astsa"
)
astsa::tsplot(
x = globtemp
)
astsa::tsplot(
x = gtemp
)
#par(mfrow=c(2,1))
astsa::tsplot(
x = diff(
x = globtemp
),
type = "o"
)
astsa::tsplot(
x = diff(
x = gtemp
),
type = "o"
)
mean(
x = diff(
x = globtemp
)
) # drift estimate = .008
## [1] 0.007925926
mean(
x = diff(
x = gtemp
)
) # drift estimate = .0066
## [1] 0.006589147
astsa::acf1(
series = diff(
x = globtemp
),
max.lag = 48,
main = ""
)
## [1] -0.24 -0.19 -0.08 0.20 -0.15 -0.03 0.03 0.14 -0.16 0.11 -0.05 0.00
## [13] -0.13 0.14 -0.01 -0.08 0.00 0.19 -0.07 0.02 -0.02 0.08 -0.12 -0.07
## [25] 0.10 0.13 -0.15 -0.01 0.09 0.00 -0.09 0.07 -0.03 -0.13 0.06 -0.06
## [37] 0.09 0.01 0.09 -0.06 -0.12 0.00 0.13 -0.03 0.00 0.01 0.10 -0.06
astsa::acf1(
series = diff(
x = gtemp
),
max.lag = 48,
main = ""
)
## [1] -0.29 -0.16 -0.12 0.22 -0.15 0.02 0.03 0.11 -0.20 0.15 0.04 -0.07
## [13] -0.17 0.15 0.06 -0.08 0.00 0.14 -0.14 0.04 0.00 0.11 -0.13 -0.03
## [25] 0.08 0.10 -0.23 0.07 0.07 -0.01 -0.11 0.15 -0.05 -0.10 0.02 -0.03
## [37] 0.06 0.00 0.07 -0.05 -0.12 0.04 0.13 -0.03 -0.04 -0.01 0.11 -0.09
log-transformations
Frequently, a log transformation of a time series will equalize the variability over time, especially when larger fluctuations tend to appear with larger observed values:
$$y_t = \log(x_t)$$
Box-Cox transformation
Frequently we use the Box-Cox transformation to obtain a variable that looks closer to normally distributed, or to improve a variable as an input for another time series model:
$$y_t = \begin{cases} \dfrac{x_t^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[1ex] \log(x_t) & \text{if } \lambda = 0 \end{cases}$$
Varves (annual layers of glacial sediment) are used as proxies for paleoclimatic parameters, such as temperature, because, in a warm year, more sand and silt are deposited from the receding glacier. The plot shows the thicknesses of the yearly varves collected from one location in Massachusetts for 634 years, beginning 11,834 years ago.
data(
list = "varve",
package = "astsa"
)
#layout(matrix(1:4,2), widths=c(2.5,1))
astsa::tsplot(
x = varve,
main = "",
ylab = "",
col = 4
)
mtext(
text = "varve",
side = 3,
line = 0.5,
cex = 1.2,
font = 2,
adj = 0
)
Because the variation in thicknesses increases in proportion to the amount deposited, a logarithmic
transformation could remove the nonstationarity observable in the variance as a function of time. It is clear that
this improvement has occurred.
astsa::tsplot(
x = log(varve),
main = "",
ylab = "",
col = 4
)
mtext(
text = "log(varve)",
side = 3,
line = 0.5,
cex = 1.2,
font = 2,
adj = 0
)
We may also plot the histogram and normal Q-Q plot of the original and transformed data to argue that the approximation to normality is improved.
normal plots of the time series and the log-transformed time series
hist(
x = varve
)
qqnorm(
y = varve,
main = "",
col = 4
)
qqline(
y = varve,
col = 2,
lwd = 2
)
hist(
x = log(varve)
)
qqnorm(
y = log(varve),
main = "",
col = 4
)
qqline(
y = log(varve),
col = 2,
lwd = 2
)
The autocorrelation function tells us whether a substantial linear relation exists between the series and its own
lagged values. The ACF gives a profile of the linear correlation at all possible lags and shows which values of h
lead to the best predictability.
This idea is restricted to linear predictability, however, and ignores possible non-linear relationships between a time series and its lags.
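To make "linear correlation at lag h" concrete, the sample autocorrelation can be computed directly and checked against `stats::acf`; a sketch (the function name is ours):

```r
# sample autocorrelation at lag h: lagged cross-products about the mean,
# divided by the lag-0 sum of squares
sample_acf <- function(x, h) {
  n <- length(x)
  xbar <- mean(x)
  sum((x[(1 + h):n] - xbar) * (x[1:(n - h)] - xbar)) / sum((x - xbar)^2)
}

set.seed(7)
x <- arima.sim(model = list(ar = 0.8), n = 200)  # AR(1): strong lag-1 dependence
sample_acf(x, 1)
acf(x, plot = FALSE)$acf[2]  # same value at lag 1
```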
The sample autocorrelations are displayed in the upper right-hand corner and superimposed on the
scatterplots are locally weighted scatterplot smoothing (lowess) lines that can be used to help discover any
nonlinearities.
data(
list = c("soi","rec"),
package = "astsa"
)
We notice that lags 1, 12, 2, and 11 have the strongest correlations. SOI is measured monthly, so lag 12 corresponds to the same month in the previous year.
astsa::lag1.plot(
series = soi,
max.lag = 12,
col = astsa::astsa.col(
col = 4,
alpha = 0.3
),
cex = 1.5,
pch = 20
)
In a previous video we established a relationship between SOI and the recruitment time series.
The negative correlation signs indicate that increases (decreases) in SOI lead to decreases (increases) in
recruitment.
The curvature in the lowess lines leads us to conjecture that different signs of SOI have different impacts on recruitment.
astsa::lag2.plot(
series1 = soi,
series2 = rec,
max.lag = 8,
col = astsa::astsa.col(
col = 4,
alpha = 0.3
),
cex = 1.5,
pch = 20
)
Let’s expand this model with a dummy variable to incorporate the positive/negative findings for SOI:
$$D_t = \begin{cases} 0 & \text{if } S_t < 0 \\ 1 & \text{if } S_t \geq 0 \end{cases}$$
The regression $R_t = \beta_0 + \beta_1 S_{t-6} + \beta_2 D_{t-6} + \beta_3 D_{t-6} S_{t-6} + w_t$ then splits into two cases:
$$R_t = \begin{cases} \beta_0 + \beta_1 S_{t-6} + w_t & \text{if } S_{t-6} < 0 \\ (\beta_0 + \beta_2) + (\beta_1 + \beta_3) S_{t-6} + w_t & \text{if } S_{t-6} \geq 0 \end{cases}$$
dummy = ifelse(
test = soi < 0,
yes = 0,
no = 1
)
fish = ts.intersect(
rec = rec,
soiL6 = lag(
x = soi,
k = -6
),
dL6 = lag(
x = dummy,
k = -6
),
dframe = TRUE
)
lm_fish <- lm(
formula = rec~ soiL6*dL6,
data = fish,
na.action = NULL
)
summary(
object = lm_fish
)
##
## Call:
## lm(formula = rec ~ soiL6 * dL6, data = fish, na.action = NULL)
##
## Residuals:
## Min 1Q Median 3Q Max
## -63.291 -15.821 2.224 15.791 61.788
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.479 2.865 25.998 < 2e-16 ***
## soiL6 -15.358 7.401 -2.075 0.0386 *
## dL6 -1.139 3.711 -0.307 0.7590
## soiL6:dL6 -51.244 9.523 -5.381 1.2e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21.84 on 443 degrees of freedom
## Multiple R-squared: 0.4024, Adjusted R-squared: 0.3984
## F-statistic: 99.43 on 3 and 443 DF, p-value: < 2.2e-16
astsa::tsplot(
x = fish$soiL6,
y = fish$rec,
type = 'p',
col = 4,
ylab = 'rec',
xlab = 'soiL6'
)
lines(
x = lowess(
x = fish$soiL6,
y = fish$rec
),
col = 4,
lwd = 2
)
points(
x = fish$soiL6,
y = fitted(
object = lm_fish
),
pch = '+',
col = 6
)
astsa::tsplot(
x = resid(
object = lm_fish
)
) # not shown ...
astsa::acf1(
series = resid(
object = lm_fish
)
) # ... but obviously not noise
## [1] 0.69 0.62 0.49 0.37 0.24 0.15 0.08 0.00 -0.03 -0.10 -0.13 -0.16
## [13] -0.17 -0.23 -0.24 -0.23 -0.23 -0.22 -0.17 -0.09 -0.05 0.01 0.05 0.06
## [25] 0.09 0.07 0.10 0.06 0.02 -0.02 -0.02 -0.02 -0.03 -0.02 0.00 0.01
## [37] -0.01 -0.04 -0.07 -0.05 -0.06 -0.03 -0.02 0.01 0.04 0.04 0.08 0.08
The trigonometric identities and the orthogonality of sines and cosines enable regression to estimate a periodic signal.
$$\cos\left(2\pi x + \frac{3\pi}{5}\right) = \cos(2\pi x)\cos\left(\frac{3\pi}{5}\right) - \sin(2\pi x)\sin\left(\frac{3\pi}{5}\right)$$
$$2\cos\left(2\pi x + \frac{3\pi}{5}\right) = 2\cos\left(\frac{3\pi}{5}\right)\cos(2\pi x) - 2\sin\left(\frac{3\pi}{5}\right)\sin(2\pi x)$$
$$2\cos\left(2\pi x + \frac{3\pi}{5}\right) \approx -0.618034\cos(2\pi x) - 1.902113\sin(2\pi x)$$
set.seed(
seed = 823
) # so you can reproduce these results
x = 2*cos(x = 2*pi*(1:500)/50 + 0.6*pi) + rnorm(n = 500,mean = 0,sd = 5)
z1 = cos(
x = 2*pi*(1:500)/50
)
z2 = sin(
x = 2*pi*(1:500)/50
)
M_trig <- data.frame(
x = x,
z1 = z1,
z2 = z2
)
lm_trig <- lm(
formula = x ~ 0 + z1 + z2,
data = M_trig
)
summary(
object = lm_trig
) # zero to exclude the intercept
##
## Call:
## lm(formula = x ~ 0 + z1 + z2, data = M_trig)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.1836 -2.9692 -0.0714 3.4311 14.0427
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z1 -0.6126 0.2986 -2.052 0.0407 *
## z2 -1.6664 0.2986 -5.581 3.94e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.721 on 498 degrees of freedom
## Multiple R-squared: 0.06629, Adjusted R-squared: 0.06254
## F-statistic: 17.68 on 2 and 498 DF, p-value: 3.828e-08
astsa::tsplot(
x = x,
col = 4
)
astsa::tsplot(
x = x,
col = astsa::astsa.col(
col = 4,
alpha = 0.7
),
ylab = expression(hat(x))
)
lines(
x = fitted(
object = lm_trig
),
col = 2,
lwd = 2
)
set.seed(
seed = 823
) # so you can reproduce these results
x = 2*cos(x = 2*pi*(1:(1e6))/(1e5) + 0.6*pi) + rnorm(n = (1e6),mean = 0,sd = 5)
z1 = cos(
x = 2*pi*(1:(1e6))/(1e5)
)
z2 = sin(
x = 2*pi*(1:(1e6))/(1e5)
)
M_trig <- data.frame(
x = x,
z1 = z1,
z2 = z2
)
lm_trig <- lm(
formula = x ~ 0 + z1 + z2,
data = M_trig
)
summary(
object = lm_trig
) # zero to exclude the intercept
##
## Call:
## lm(formula = x ~ 0 + z1 + z2, data = M_trig)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.8745 -3.3805 -0.0108 3.3583 25.4331
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z1 -0.617281 0.007065 -87.38 <2e-16 ***
## z2 -1.908289 0.007065 -270.12 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.995 on 999998 degrees of freedom
## Multiple R-squared: 0.07459, Adjusted R-squared: 0.07458
## F-statistic: 4.03e+04 on 2 and 999998 DF, p-value: < 2.2e-16
Write the estimated model as a single trigonometric function. In general we write a sinusoid using sine, but since the original function is written as a cosine we will use cosine (sine and cosine are shifts of each other). Since the author used $2\pi$ in every trigonometric function, we take the point of view that the period/frequency is known; the amplitude and phase shift are unknown. Since there is no intercept, there is no vertical offset.
$$-0.6172813\cos(2\pi t) - 1.9082887\sin(2\pi t) = \hat{A}\cos(2\pi t + \hat{\theta}) = \hat{A}\cos(2\pi t)\cos(\hat{\theta}) - \hat{A}\sin(2\pi t)\sin(\hat{\theta})$$
Matching coefficients gives
$$\hat{A}\cos(\hat{\theta}) = -0.6172813, \qquad -\hat{A}\sin(\hat{\theta}) = -1.9082887$$
$$\hat{A}^2\cos^2(\hat{\theta}) + \hat{A}^2\sin^2(\hat{\theta}) = \hat{A}^2 = (-0.6172813)^2 + (-1.9082887)^2 = 4.022602$$
$$|\hat{A}| = 2.005643$$
$$\cos(\hat{\theta}) = -0.3077723, \qquad \sin(\hat{\theta}) = 0.9514600$$
$$\hat{\theta} = \cos^{-1}(-0.3077723) = 1.883647$$
$$\hat{A}\cos(2\pi t + \hat{\theta}) = 2.005643\cos(2\pi t + 1.883647)$$
Compare with the true model:
$$A\cos(2\pi t + \theta) = 2\cos\left(2\pi t + \frac{3\pi}{5}\right), \qquad \frac{3\pi}{5} \approx 1.884956$$
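The algebra above can be mechanized: with fitted coefficients $b_1$ (cosine) and $b_2$ (sine), the amplitude is $\sqrt{b_1^2 + b_2^2}$ and, since $\hat{A}\cos\hat{\theta} = b_1$ and $-\hat{A}\sin\hat{\theta} = b_2$, the phase is `atan2(-b2, b1)`. A sketch (the function name is ours):

```r
# recover amplitude and phase of b1*cos(2*pi*t) + b2*sin(2*pi*t) = A*cos(2*pi*t + theta)
amp_phase <- function(b1, b2) {
  c(amplitude = sqrt(b1^2 + b2^2), phase = atan2(-b2, b1))
}

amp_phase(-0.6172813, -1.9082887)  # close to the true amplitude 2 and phase 3*pi/5
```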
t0 <- seq(
from = 0,
to = 1,
length = 10000
)
x_correct <- 2*cos(2*pi*t0 + 3*pi/5)
x_estimated <- 2.005643*cos(2*pi*t0 + 1.883647)
library(ggplot2)
M <- data.frame(
t0 = t0,
correct_model = x_correct,
estimated_model = x_estimated
)
M <- tidyr::gather(
data = M,
key = "model",
value = "x",
-t0
)
ggplot(M) +
aes(x = t0,y = x,group = model,color = model) +
geom_line() +
theme_bw()