
Examples of Time Series

Lecture 1

What is a Time Series?

Mr. Google:

Time series are an old idea: the city of Barcelona has stored extensive data about its citizens
since the 13th century.

From https://mapr.com/blog/time-series-data-new-big-data/
What Time Series Are
or
How is Time Series Data Different
from all Other Data?

Hints: In most of your courses

• The data points were pretty much independent.
• The order in which the observations (data points) arrived was not important.
Example:

Independent bivariate data (x, y) = (chocolate consumption, Nobel prize winners)


Order of observations does not matter (cross-sectional data)

from http://www.businessinsider.com/chocolate-consumption-vs-nobel-prizes-2014-4?r=UK&IR=T
• Time Series Data: a series of values recorded in time.

• Most common: equally spaced time points:

X_t, X_{t+h}, X_{t+2h}, X_{t+3h}, …, or X(t), X(t+h), X(t+2h), …

h = time between observations, or sampling interval;
1/h = sampling frequency, or sampling rate.

• Order: very important! Observations are dependent!

• Notation: the index gives the position in the series, not the actual time: X_1, X_2, X_3, …, X_n

• Goal of time series analysis: forecasting.

• Examples: class enrollments in PSTAT 120A per quarter;
population of the US per year;
daily sales at Costco, etc.
Examples of Time Series Data

Weekly Kitten Weight:

http://shorthair.candra.info/domestic-shorthair-cat-weight-chart.html
Time series forecasting is performed in nearly every
organization that works with quantifiable data:
• Retail stores use it to forecast sales.
• Energy companies use it to forecast reserves, production, demand, and prices.
• Educational institutions use it to forecast enrollment.
• International financial organizations such as the World Bank and the
International Monetary Fund use it to forecast inflation and economic activity.
• Transportation companies use it to forecast future travel.
• Banks and lending institutions use it to forecast new home purchases.
• Venture capital firms use it to forecast market potential and evaluate business plans.
• Meteorologists use it to predict precipitation, temperatures, etc.
A comment on time scales:

• Modern technology allows data to be recorded on very frequent time scales:
✓ Stock data are available at the ticker level.
✓ Online and in-store purchases are recorded in real time.
In choosing a time scale, one must consider the scale of the required
forecasts and the level of noise in the data.
▪ To forecast next-day sales at a grocery store, would you use
minute-by-minute sales data or daily aggregates?
• Many modern time series are long:
▪ Weekly interest rates, daily closing stock prices, the electrical
activity of the heart observed at millisecond intervals.
• Examples of short time series:
▪ Annual data for UCSB, IIM starting 1991: fewer than 30 observations;
▪ Annual data for country X, starting 1948: fewer than 100 observations.
US Population, 1790-1990, ten-year intervals

X_t = population of the US at year t:
t = 1790: X_t = 3,929,214
t = 1800: X_t = 5,308,483
…
t = 1990: X_t = 248,709,873

Note: looks like an exponential trend. Non-stationary time series.


US Population, 1790-1990, ten-year intervals

(taking a closer look: discrete points recording the population of the US every 10 years, starting from 1790)
Log(US Population), 1790-1990, ten-year intervals

X_t = population of the US (in millions) at year t, t = 1790, 1800, …, 1990

V_t = log(X_t), natural logarithm:
Not a straight line, not so simple, so not an exponential trend!

R commands:
> uspop <- read.table(" ")
> View(uspop)
> pop = ts(uspop)
> plot(pop)
> logpop = log(pop)
> plot(logpop)
Sqrt(US Population), 1790-1990, ten-year intervals

X_t = population of the US (in millions) at year t, t = 1790, 1800, …, 1990

V_t = √X_t, square root.
Almost a straight line: better, but not perfect!

Commands used in R:
> uspop <- read.table(" ")
> pop = ts(uspop)
> sqrtpop = sqrt(pop)
> plot(sqrtpop)
International Airline Data.
Monthly totals of international passengers (1/1949 - 12/1960)

X_t = total number of passengers (in thousands) taking flights on the airline in month t.

Non-stationary seasonal time series:
• upward trend (linear?)
• seasonal (low in winter, high in summer)
• variability increases with time
International Airline Data.
Monthly totals of international passengers (1/1949 - 12/1960)

X_t = total number of passengers (in thousands) taking flights on the airline in month t.

V_t = log(X_t), t = 1, 2, …, 144, natural logarithms of the AIRPASS.DAT file.
REMEMBER: The main difference between time series and
other statistical samples:
• dependent observations that
▪ become available at equally spaced time intervals &
▪ are time-ordered

Goals of time series analysis are explanatory and predictive:

• understand or model the stochastic mechanism that gives rise to an observed series
• forecast future values of a series based on the history of that series
Examples from W'17 Projects

PROJECT: ARIMA Modeling of the Consumer Price Index

Results given by different models
PROJECT: Time Series Analysis on Monthly Milk Production
Based on SARIMA Model

Data: Monthly milk production (pounds per cow) from Jan. 1962 to Dec. 1975
PROJECT: Atmospheric CO2 Concentration with Forecasts

Data: monthly mean atmospheric carbon dioxide concentrations
measured in parts per million. The data were collected at the Mauna Loa
Observatory in Hawaii, beginning in March 1958.
Source: National Oceanic & Atmospheric Administration (NOAA) website
OBJECTIVES OF TIME SERIES ANALYSIS
• Understanding the dynamic or time-
dependent structure of the observations of a
single series (univariate analysis)
• Forecasting of future observations
• Ascertaining the leading, lagging and feedback
relationships among several series
(multivariate analysis)

STEPS IN TIME SERIES ANALYSIS
• Model Identification
– Time series plot of the series
– Check for the existence of a trend or seasonality
– Check for sharp changes in behavior
– Check for possible outliers
• Remove the trend and the seasonal component to get stationary
residuals.
• Estimation
– MME (method of moments estimation)
– MLE (maximum likelihood estimation)
• Diagnostic Checking
– Normality of error terms
– Independence of error terms
– Constant error variance (homoscedasticity)
• Forecasting
– Exponential smoothing methods
– Minimum MSE forecasting
CHARACTERISTICS OF A SERIES
• For a time series $Y_t$, $t = 0, \pm 1, \pm 2, \ldots$

THE MEAN FUNCTION:
$\mu_t = E[Y_t]$; exists iff $E|Y_t| < \infty$.
The expected value of the process at time t.

THE VARIANCE FUNCTION:
$\sigma_t^2 = \operatorname{Var}(Y_t) = E\left[(Y_t - \mu_t)^2\right] = E[Y_t^2] - \mu_t^2 \ge 0$
CHARACTERISTICS OF A SERIES
• THE AUTOCOVARIANCE FUNCTION:
$\gamma_{t,s} = \operatorname{Cov}(Y_t, Y_s) = E[(Y_t - \mu_t)(Y_s - \mu_s)] = E[Y_t Y_s] - \mu_t \mu_s$, for $t, s = 0, \pm 1, \pm 2, \ldots$
Covariance between the value at time t and the value at time s of a stochastic process $Y_t$.

• THE AUTOCORRELATION FUNCTION:
$\rho_{t,s} = \operatorname{Corr}(Y_t, Y_s) = \dfrac{\gamma_{t,s}}{\sigma_t \sigma_s}$, with $-1 \le \rho_{t,s} \le 1$
The correlation of the series with itself.
EXAMPLE
• Moving average process: Let $\varepsilon_t \sim \text{i.i.d.}(0, 1)$, and
$X_t = \varepsilon_t + 0.5\,\varepsilon_{t-1}$
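The mean and autocovariance functions of this process can be computed directly from the definitions on the preceding slides:

```latex
\begin{aligned}
E[X_t] &= E[\varepsilon_t] + 0.5\,E[\varepsilon_{t-1}] = 0 \\
\operatorname{Var}(X_t) &= \operatorname{Var}(\varepsilon_t) + 0.25\,\operatorname{Var}(\varepsilon_{t-1}) = 1 + 0.25 = 1.25 \\
\gamma_1 = \operatorname{Cov}(X_t, X_{t-1})
  &= E\big[(\varepsilon_t + 0.5\,\varepsilon_{t-1})(\varepsilon_{t-1} + 0.5\,\varepsilon_{t-2})\big] = 0.5 \\
\gamma_h = \operatorname{Cov}(X_t, X_{t-h}) &= 0 \quad \text{for } h \ge 2
\end{aligned}
```

None of these quantities depends on t, only on the lag, so this moving average process is (weakly) stationary.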
EXAMPLE
• RANDOM WALK: Let $e_1, e_2, \ldots$ be a sequence of
i.i.d. r.v.s with mean 0 and variance $\sigma_e^2$. The
observed time series
$Y_t,\ t = 1, 2, \ldots, n$
is obtained as
$Y_1 = e_1$
$Y_2 = e_1 + e_2 \;\Rightarrow\; Y_2 = Y_1 + e_2$
$Y_3 = e_1 + e_2 + e_3 \;\Rightarrow\; Y_3 = Y_2 + e_3$
$\vdots$
$Y_t = e_1 + \cdots + e_t \;\Rightarrow\; Y_t = Y_{t-1} + e_t$
A RULE ON THE COVARIANCE
• If $c_1, c_2, \ldots, c_m$ and $d_1, d_2, \ldots, d_n$ are constants
and $t_1, t_2, \ldots, t_m$ and $s_1, s_2, \ldots, s_n$ are time points,
then
$\operatorname{Cov}\left(\sum_{i=1}^{m} c_i Y_{t_i},\ \sum_{j=1}^{n} d_j Y_{s_j}\right) = \sum_{i=1}^{m} \sum_{j=1}^{n} c_i d_j \operatorname{Cov}\left(Y_{t_i}, Y_{s_j}\right)$
and
$\operatorname{Var}\left(\sum_{i=1}^{m} c_i Y_{t_i}\right) = \sum_{i=1}^{m} c_i^2 \operatorname{Var}\left(Y_{t_i}\right) + 2 \sum_{i=2}^{m} \sum_{j=1}^{i-1} c_i c_j \operatorname{Cov}\left(Y_{t_i}, Y_{t_j}\right)$
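For instance, applying this rule to the random walk $Y_t = e_1 + \cdots + e_t$ of the previous slide, with all coefficients equal to 1, gives for $t \le s$:

```latex
\operatorname{Cov}(Y_t, Y_s)
  = \operatorname{Cov}\!\left(\sum_{i=1}^{t} e_i,\ \sum_{j=1}^{s} e_j\right)
  = \sum_{i=1}^{t} \sum_{j=1}^{s} \operatorname{Cov}(e_i, e_j)
  = \sum_{i=1}^{t} \sigma_e^2
  = t\,\sigma_e^2
```

since $\operatorname{Cov}(e_i, e_j) = 0$ for $i \ne j$. In general, $\operatorname{Cov}(Y_t, Y_s) = \min(t, s)\,\sigma_e^2$, which depends on the actual times t and s, not just on their difference.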
JOINT PDF OF A TIME SERIES
• Remember that
$F_{X_1}(x_1)$ : the marginal cdf
$f_{X_1}(x_1)$ : the marginal pdf
$F_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ : the joint cdf
$f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ : the joint pdf
JOINT PDF OF A TIME SERIES
• For the observed time series, say we have two time points, t and s.
• The marginal pdfs: $f_{Y_t}(y_t)$ and $f_{Y_s}(y_s)$
• The joint pdf: $f_{Y_t, Y_s}(y_t, y_s) \ne f_{Y_t}(y_t) \cdot f_{Y_s}(y_s)$ unless $Y_t$ and $Y_s$ are independent
JOINT PDF OF A TIME SERIES
• Since we have only one observation for each r.v.
Yt, inference is too complicated if distributions
(or moments) change for all t (i.e., change over
time). So, we need a simplification.

[Figure: one realized series plotted against t = 1, …, 12; each plotted point, e.g. the values of the r.v.s Y_4 and Y_6, is a single observation of a distinct random variable.]
JOINT PDF OF A TIME SERIES
• To be able to identify the structure of the
series, we need the joint pdf of Y1, Y2,…, Yn.
However, we have only one sample. That is,
one observation from each random variable.
Therefore, it is very difficult to identify the
joint distribution. Hence, we need an
assumption to simplify our problem. This
simplifying assumption is known as
STATIONARITY.
STATIONARITY
• The most vital and common assumption in
time series analysis.
• The basic idea of stationarity is that the
probability laws governing the process do not
change with time.
• The process is in statistical equilibrium.

TYPES OF STATIONARITY
• STRICT (STRONG OR COMPLETE) STATIONARY
PROCESS: Consider a finite set of r.v.s
$\{Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}\}$ from a stochastic process
$\{Y(\omega, t);\ t = 0, \pm 1, \pm 2, \ldots\}$.
• The n-dimensional distribution function is
defined by
$F_{Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}}(y_{t_1}, y_{t_2}, \ldots, y_{t_n}) = P\{\omega : Y_{t_1} \le y_1, \ldots, Y_{t_n} \le y_n\}$
where $y_i$, $i = 1, 2, \ldots, n$, are any real numbers.
STRONG STATIONARITY
• A process is said to be first-order stationary in
distribution if its one-dimensional distribution
function is time-invariant, i.e.,
$F_{Y_{t_1}}(y_1) = F_{Y_{t_1+k}}(y_1)$ for any $t_1$ and $k$.
• Second-order stationary in distribution if
$F_{Y_{t_1}, Y_{t_2}}(y_1, y_2) = F_{Y_{t_1+k}, Y_{t_2+k}}(y_1, y_2)$ for any $t_1, t_2$ and $k$.
• n-th-order stationary in distribution if
$F_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) = F_{Y_{t_1+k}, \ldots, Y_{t_n+k}}(y_1, \ldots, y_n)$ for any $t_1, \ldots, t_n$ and $k$.
STRONG STATIONARITY
n-th-order stationarity in distribution, for all n, = strong stationarity

⇒ Shifting the time origin by an amount k has
no effect on the joint distribution, which must
therefore depend only on the time intervals
between $t_1, t_2, \ldots, t_n$, not on absolute time t.
STRONG STATIONARITY
• So, for a strong stationary process:
i) $f_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) = f_{Y_{t_1+k}, \ldots, Y_{t_n+k}}(y_1, \ldots, y_n)$
ii) $E[Y_t] = E[Y_{t+k}] \;\Rightarrow\; \mu_t = \mu_{t+k} = \mu,\ \forall t, k$
The expected value of the series is constant over time, not a function of time.
iii) $\operatorname{Var}(Y_t) = \operatorname{Var}(Y_{t+k}) \;\Rightarrow\; \sigma_t^2 = \sigma_{t+k}^2 = \sigma^2,\ \forall t, k$
The variance of the series is constant over time: homoscedastic.
iv) $\operatorname{Cov}(Y_t, Y_s) = \operatorname{Cov}(Y_{t+k}, Y_{s+k}) \;\Rightarrow\; \gamma_{t,s} = \gamma_{t+k, s+k},\ \forall t, k$
so $\gamma_{t,s} = \gamma_{t-s} = \gamma_h$, where $h = |t - s|$.
Not constant in general: it does not depend on time itself, but on the time interval between observations, which we call the "lag".
STRONG STATIONARITY

[Figure: a series $Y_1, Y_2, \ldots, Y_n$ plotted against t = 1, …, 12; under stationarity every observation has the same variance $\sigma^2$.]

$\operatorname{Cov}(Y_2, Y_1) = \gamma_{2,1} = \gamma_1$
$\operatorname{Cov}(Y_3, Y_2) = \gamma_{3,2} = \gamma_1$
$\operatorname{Cov}(Y_n, Y_{n-1}) = \gamma_{n,(n-1)} = \gamma_1$  (affected only by the time lag)

$\operatorname{Cov}(Y_3, Y_1) = \gamma_{3,1} = \gamma_2$
$\operatorname{Cov}(Y_1, Y_3) = \gamma_{1,3} = \gamma_{-2} = \gamma_2$
STRONG STATIONARITY
v) $\operatorname{Corr}(Y_t, Y_s) = \operatorname{Corr}(Y_{t+k}, Y_{s+k}) \;\Rightarrow\; \rho_{t,s} = \rho_{t+k, s+k},\ \forall t, k$
so $\rho_{t,s} = \rho_{t-s} = \rho_h$, where $h = |t - s|$.
Letting $s = t - k$:
$\rho_{t, t-k} = \rho_{t+k, t} = \rho_k,\ \forall t, k$
• It is usually impossible to verify a distribution,
particularly a joint distribution function, from
an observed time series. So, we use a weaker
sense of stationarity.
WEAK STATIONARITY
• WEAK (COVARIANCE) STATIONARITY, OR
STATIONARITY IN THE WIDE SENSE: A time series is
said to be covariance stationary if its first- and
second-order moments are unaffected by a
change of time origin.
• That is, we have constant mean and variance,
with covariance and correlation being
functions of the time difference only.
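In symbols, $\{Y_t\}$ is weakly stationary if:

```latex
\begin{aligned}
E[Y_t] &= \mu \quad \text{for all } t, \\
\operatorname{Var}(Y_t) &= \sigma^2 < \infty \quad \text{for all } t, \\
\operatorname{Cov}(Y_t, Y_{t-k}) &= \gamma_k \quad \text{for all } t \text{ (a function of the lag } k \text{ only).}
\end{aligned}
```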
WEAK STATIONARITY

From now on, when we say "stationary", we mean weak stationarity.
EXAMPLE
• Consider a time series $\{Y_t\}$ where
$Y_t = e_t$
and $e_t \sim \text{i.i.d.}(0, \sigma_e^2)$. Is the process stationary?

(Also known as a "White Noise" process)
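Checking the weak-stationarity conditions directly:

```latex
\begin{aligned}
E[Y_t] &= E[e_t] = 0 \quad \text{(constant)}, \\
\operatorname{Var}(Y_t) &= \operatorname{Var}(e_t) = \sigma_e^2 \quad \text{(constant)}, \\
\operatorname{Cov}(Y_t, Y_s) &= E[e_t e_s] = 0 \quad \text{for } t \ne s.
\end{aligned}
```

All three conditions hold, so white noise is weakly stationary.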
EXAMPLE
• RANDOM WALK
$Y_t = e_1 + e_2 + \cdots + e_t$
where $e_t \sim \text{i.i.d.}(0, \sigma_e^2)$. Is the process $\{Y_t\}$ stationary?
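Checking the conditions: the mean is constant, but the variance is not.

```latex
E[Y_t] = \sum_{i=1}^{t} E[e_i] = 0,
\qquad
\operatorname{Var}(Y_t) = \sum_{i=1}^{t} \operatorname{Var}(e_i) = t\,\sigma_e^2.
```

The variance grows with t, so the random walk is not stationary.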
EXAMPLE
• Suppose that the time series has the form
$Y_t = a + bt + e_t$
where a and b are constants and $e_t \sim \text{i.i.d.}(0, \sigma_e^2)$.
(Also known as a linear trend)

Is $\{Y_t\}$ stationary?
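Here the variance is constant, but the mean function is not:

```latex
E[Y_t] = a + bt + E[e_t] = a + bt,
\qquad
\operatorname{Var}(Y_t) = \operatorname{Var}(e_t) = \sigma_e^2.
```

Unless b = 0, the mean depends on t, so a linear trend is not stationary.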
EXAMPLE
$Y_t = (-1)^t\, e_t$
where $e_t \sim \text{i.i.d.}(0, \sigma_e^2)$. Is the process $\{Y_t\}$ stationary?
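Checking the conditions for this alternating-sign process:

```latex
\begin{aligned}
E[Y_t] &= (-1)^t E[e_t] = 0, \\
\operatorname{Var}(Y_t) &= \operatorname{Var}(e_t) = \sigma_e^2, \\
\operatorname{Cov}(Y_t, Y_s) &= (-1)^{t+s} E[e_t e_s] = 0 \quad \text{for } t \ne s.
\end{aligned}
```

The mean, variance, and autocovariance are all free of t, so the process is weakly stationary.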
STRONG VERSUS WEAK STATIONARITY
• Strict stationarity means that the joint distribution depends
only on the 'difference' h, not on the times (t1, . . . , tk) themselves.
• Finite variance is not assumed in the definition of strong
stationarity; therefore, strict stationarity does not
necessarily imply weak stationarity. For example, processes
like i.i.d. Cauchy are strictly stationary but not weakly
stationary.
• A nonlinear function of a strictly stationary series is still
strictly stationary, but this is not true for weak stationarity.
For example, the square of a covariance stationary process
may not have finite variance.
• Weak stationarity usually does not imply strict stationarity,
as higher moments of the process may depend on time t.
STRONG VERSUS WEAK STATIONARITY
• If the process {Xt} is a Gaussian time series, which
means that the finite-dimensional distributions of {Xt}
are all multivariate normal, weak stationarity
also implies strict stationarity. This is because a
multivariate normal distribution is fully
characterized by its first two moments.
STRONG VERSUS WEAK STATIONARITY
• For example, white noise is weakly stationary but may
not be strictly stationary, while Gaussian white
noise is strictly stationary. Also, general white
noise implies only uncorrelatedness, while Gaussian
white noise also implies independence, because
for a Gaussian process, uncorrelatedness implies
independence. Therefore, Gaussian white
noise is just i.i.d. N(0, σ²).
STATIONARITY AND NONSTATIONARITY
• Stationary and nonstationary processes are
very different in their properties, and they
require different inference procedures. We
will discuss this in detail throughout this course.
