
Characteristics of Time Series
DS226o Financial Analytics

Instructor
Dr Shashi Jain
Department of Management Studies
Indian Institute of Science
Bangalore
Learning Objective

• Introduce the basic concepts for time series that will serve as the foundation for the development of models related to time series analysis.

Time Series for Business Analytics


Topics Covered
• Nature of time series data

• White Noise

• Measure of Dependence: Autocorrelation and cross-correlation

• Stationary time series



Time Series

• A time series can be defined as a collection of random variables indexed according to the order in which they are obtained in time.

• The primary objective of time series analysis is to develop mathematical models that provide plausible descriptions for sample data.



Stochastic Process
• A time series is a sequence of random variables x_1, x_2, x_3, ..., where x_1 denotes the value taken by the series at time point 1, x_2 the value at time point 2, and so on.

• A collection of random variables {x_t}, indexed by t, is referred to as a stochastic process.

• In this course, t is discrete.



Classification of TS
Time-domain vs. Frequency-domain

• Time-domain approach: How does what happened today affect what will happen tomorrow?

• These approaches view the investigation of lagged relationships as most important, e.g. autocorrelation analysis.

• Frequency-domain approach: What is the economic cycle through periods of expansion and recession?

• These approaches view the investigation of cycles as most important, e.g. spectral analysis and wavelet analysis.



Classification of TS
• Univariate vs. multivariate:

• A time series containing records of a single variable is termed univariate; if records of more than one variable are considered, it is termed multivariate.

• Linear vs. non-linear:

• A time series model is said to be linear or non-linear depending on whether the current value of the series is a linear or non-linear function of past observations.

• Discrete vs. continuous:

• In a continuous time series, observations are measured at every instant of time, whereas a discrete time series contains observations measured at discrete points in time.



Example of Nonlinear TS



Components of TS: Trend
• Trend: The general tendency of a time series to increase, decrease or
stagnate over a long period of time.



Components of TS: Seasonality
• Seasonal variation: Fluctuations that recur within a year, season by season, typically caused by climate and weather conditions, customs, etc.



Components of TS: Cyclical Variation
• This component describes the medium-term changes caused by
circumstances, which repeat in cycles.



Components of TS: Irregular Variation
• Irregular or random variations in a time series are caused by unpredictable influences, which are not regular and do not repeat in a particular pattern.

• These variations are caused by incidents such as war, strike, earthquake, flood, revolution, etc.

• There is no defined statistical technique for measuring random fluctuations in a time series.



Wold Decomposition
• The essence of linear time series theory is expressed by the Wold decomposition:

• any covariance-stationary stochastic process can be separated into the sum of two processes:

• a deterministic one that is a linear function of its own past values, and

• a stochastic one that is a linear function of current and past values of an uncorrelated random variable (white noise).

• Once these two pieces have been found, there is nothing more that can be said about the system.



Additive vs Multiplicative
• Additive model: Y(t) = T(t) + S(t) + C(t) + I(t)

• Assumption: These four components are independent of each other.

• Multiplicative model: Y(t) = T(t) × S(t) × C(t) × I(t)

• Assumption: These four components of a time series are not necessarily independent, and they can affect one another.



Going Deeper into Time Series
White Noise
• What if there was no dependence between x_t and x_{t-1}, x_{t-2}, …?
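
A minimal sketch of a series with no dependence between successive values, assuming Gaussian white noise with zero mean and unit variance. The distribution, the seed, and the helper name `white_noise` are illustrative choices and do not come from the slides.

```python
import random

def white_noise(n, sigma=1.0, seed=0):
    """Return n independent draws from N(0, sigma^2)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

w = white_noise(1000)
sample_mean = sum(w) / len(w)
print(f"sample mean of w_t: {sample_mean:.3f}")
```

Because each draw is independent of the others, knowing earlier values of `w` gives no help in predicting later ones.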



White noise



What is IID ?
• Independent: Two random variables X and Y are independent if knowing
the value of one of them does not change the probabilities for the other
one. 

• P(X|Y) = P(X)

• For example, suppose I have a coin and have been recording the outcomes of its tosses:

• HHTHTTT

• What is the probability of H on the next toss?

• Would it be different if the record was TTHHHT?

• Does it matter whether the coin is biased or unbiased?



Example of dependence
• Suppose you have two independent random variables C_1 and C_2

• More concretely, you have two coins

• H results in a value of 1 and T in a value of -1

• Now construct X = C_1 [so X takes a value of 1 or -1]

• Y = C_1 + C_2 [so Y takes a value of -2, 0, or 2]

• If you noticed the outcome of the first coin, and it was H, what is the probability that Y = -2?

• If you didn’t notice the outcome of the first coin, what is the probability that Y = -2?
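
The two questions can be checked by enumerating the four equally likely outcomes. This sketch assumes both coins are fair, takes X to be the value of the first coin, and takes Y to be the sum of the two coin values (an assumed reading of the slide); the helper `probability` is introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Each coin shows H (+1) or T (-1); the coins are fair and independent (assumption).
outcomes = list(product([1, -1], repeat=2))  # (first coin, second coin)
prob = Fraction(1, len(outcomes))            # each of the 4 pairs is equally likely

def probability(event):
    """Total probability of the outcomes for which event(c1, c2) is true."""
    return sum(prob for c1, c2 in outcomes if event(c1, c2))

# Unconditional: P(Y = -2), where Y = c1 + c2
p_y = probability(lambda c1, c2: c1 + c2 == -2)

# Conditional on the first coin being H: P(Y = -2 | X = 1)
p_x = probability(lambda c1, c2: c1 == 1)
p_joint = probability(lambda c1, c2: c1 == 1 and c1 + c2 == -2)
p_y_given_x = p_joint / p_x

print(p_y, p_y_given_x)
```

Observing the first coin changes the probability of Y = -2, so X and Y are dependent even though the two coins themselves are independent.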



All about dependence
• The whole point of time series analysis (TSA) is to find out whether there is dependence

• Dependence on what?

• If I have observed a certain set of events, can they help me better predict the future?

• Does the future depend on past events?



• So, when do we know we cannot predict the future?

• If we observe a TS and can say with confidence that it is not white noise, then past information is valuable.

• If it is white noise, past information doesn’t help at all.

• It is just like a sequence of coin tosses!



What is identically distributed in iid?



Next steps

• We start the other way round [focusing on linear time series]

• We start with white noise and see which operations produce some of the TS we discussed earlier.

• Basically, we see how the dependence is created



Introducing correlations in TS

• One way to introduce correlation is by using a moving average.

• For example, replace the current value w_t with the average of itself and its immediate neighbors, i.e., $v_t = \tfrac{1}{3}(w_{t-1} + w_t + w_{t+1})$
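
A small simulation can illustrate how averaging creates correlation. The sketch below assumes Gaussian white noise and an equal-weight three-point average of each value with its two immediate neighbors; the sample size, seed, and helper name `lag1_correlation` are illustrative.

```python
import random

rng = random.Random(1)
n = 5000
w = [rng.gauss(0.0, 1.0) for _ in range(n)]  # white noise w_t

# Three-point centered moving average: v_t = (w_{t-1} + w_t + w_{t+1}) / 3
v = [(w[t - 1] + w[t] + w[t + 1]) / 3 for t in range(1, n - 1)]

def lag1_correlation(x):
    """Sample correlation between x_t and x_{t+1}."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(len(x) - 1))
    den = sum((xt - m) ** 2 for xt in x)
    return num / den

r_w = lag1_correlation(w)
r_v = lag1_correlation(v)
print(f"lag-1 correlation: white noise {r_w:.2f}, moving average {r_v:.2f}")
```

Neighboring averages share two of their three noise terms, so for this equal-weight average the theoretical lag-1 autocorrelation is 2/3, whereas for white noise it is 0.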
Introducing Periodicity
Using wt we construct an output as:



Introducing periodicity
• Consider a model with an underlying signal plus added noise
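
As a rough illustration, the sketch below adds Gaussian white noise to a cosine wave. The cosine form, amplitude, period, and noise level are hypothetical choices made for this example; the slides do not specify the signal.

```python
import math
import random

def signal_plus_noise(n, amplitude=2.0, period=50, sigma=1.0, seed=0):
    """Return (signal, series) with series[t] = signal[t] + w_t."""
    rng = random.Random(seed)
    signal = [amplitude * math.cos(2 * math.pi * t / period) for t in range(n)]
    series = [s + rng.gauss(0.0, sigma) for s in signal]
    return signal, series

signal, x = signal_plus_noise(200)

# With sigma = 0 the noise vanishes and the series is the signal itself
clean_signal, clean_x = signal_plus_noise(200, sigma=0.0)
```

Increasing `sigma` makes the periodic pattern progressively harder to see in the observed series `x`.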



Signal in noise
• Adding noise to a signal is an example of an additive model

• Such models are of interest in many areas; for instance, in economics the signal x_t may be a trend or a seasonal component.

• Such models are the motivation for state space models.



Joint distribution
• P(X < x, Y < y)

• How do we recover this distribution?

• Using some statistical approach, given a set of pairs (x, y).



Poll Question
• We have a past series of observations for the following two cases.

• The distribution of which of these two can be recovered from the sample?

a. 1

b. 2

c. 1&2

d. Neither



Stationary time series
• You have a time series {x_0, x_1, x_2, x_3, …}. You create two sets of samples:

• {x_0, x_1, x_2, ..., x_10}

• {x_3, x_4, x_5, ..., x_13}

• You recover the joint distributions and notice that F(x_0, x_1, …, x_10) = F(x_3, x_4, …, x_13)

• In fact, for any lead l, you observe the same: F(x_0, x_1, …, x_10) = F(x_{0+l}, x_{1+l}, …, x_{10+l})



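Basic statistical measures for time series
• A complete description of a univariate time series can be provided by the joint distribution function.

• The time series is represented as n random variables $x_{t_1}, x_{t_2}, \dots, x_{t_n}$ that are observed at times $t_1, t_2, \dots, t_n$

• Then the probability that the values of the series are jointly less than the constants $c_1, c_2, \dots, c_n$ is given by $F(c_1, c_2, \dots, c_n) = P(x_{t_1} \le c_1, x_{t_2} \le c_2, \dots, x_{t_n} \le c_n)$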



iid series

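• When the random variables are iid, the joint distribution simplifies to the product of the marginals: $P(x_1 \le c_1, \dots, x_n \le c_n) = \prod_{t=1}^{n} P(x_t \le c_t)$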

• …



Mean of time series
• The mean of a white noise series is $E[w_t] = 0$ for all t

• Mean of moving average series?

• Mean of signal (deterministic func of time) + noise?
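
Both questions can be answered using linearity of expectation. The sketch below assumes zero-mean white noise $w_t$, the three-point moving average $v_t = \tfrac{1}{3}(w_{t-1} + w_t + w_{t+1})$, and a deterministic signal written as $s_t$ (a symbol introduced here for illustration).

```latex
\begin{align*}
E[v_t] &= \tfrac{1}{3}\bigl(E[w_{t-1}] + E[w_t] + E[w_{t+1}]\bigr) = 0, \\
E[s_t + w_t] &= s_t + E[w_t] = s_t .
\end{align*}
```

Under these assumptions the moving average has a constant mean of 0, while the mean of signal plus noise follows the signal and therefore changes with t.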



Autocovariance for time series



Autocorrelation for time series



Examples: Autocovariance

• Autocovariance of white noise



Example: MA



Observations MA

• The smoothing operation introduces a covariance

• The covariance decreases as the separation between time points increases

• The covariance in this case depends only on the time separation, or lag, and not on the absolute values of t or s.
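
These observations can be made concrete for one specific case. Assuming the equal-weight three-point moving average $v_t = \tfrac{1}{3}(w_{t-1} + w_t + w_{t+1})$ of white noise with variance $\sigma_w^2$, counting the noise terms shared by $v_s$ and $v_t$ gives:

```latex
\gamma_v(s,t) = \operatorname{cov}(v_s, v_t) =
\begin{cases}
\tfrac{3}{9}\,\sigma_w^2, & s = t, \\
\tfrac{2}{9}\,\sigma_w^2, & |s - t| = 1, \\
\tfrac{1}{9}\,\sigma_w^2, & |s - t| = 2, \\
0, & |s - t| > 2.
\end{cases}
```

The value depends on s and t only through |s − t|, and it shrinks as the separation grows because fewer noise terms are shared.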



Cross covariance and correlation



Stationary time series
• No assumptions about the behavior of the time series have been made in the examples discussed so far

• However, we usually need to assume some sort of regularity in the behavior of a time series.

• Here we introduce the notion of regularity through a concept called stationarity



Strictly stationary time series



Weak stationarity



Weak stationarity
• For a weakly stationary series, the autocovariance depends only on the lag h = |s − t|, and not on the absolute values of s or t, so it can be written as γ(h).



Examples of stationarity

• White noise:

• Moving average:



ACF of MA



Estimation of correlation
• Computing autocorrelations from sampled points is non-trivial.

• If we want to compute the mean of x_t from a sampled time series, we have only a single realization at t to work with.

• We do not have iid copies of x_t, i.e., x_{1t}, x_{2t}, …



Estimation of correlation
• Assumption of stationarity becomes critical.

• If the TS is stationary, E[x_t] = constant

• So instead of averaging over the population at t, we can average over the sampled x at different time points, i.e. $\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t$



Sample Auto-covariance

• Note that the above sum runs only up to n − h, since x_{t+h} is not available for larger t.

• The above is not an unbiased estimator
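
A sketch of the estimator these notes describe, assuming the common form that sums the n − h available products and divides by n rather than n − h (which is one reason it is biased). The function names `sample_autocov` and `sample_acf` are illustrative.

```python
def sample_autocov(x, h):
    """gamma_hat(h) = (1/n) * sum_{t=1}^{n-h} (x_{t+h} - xbar)(x_t - xbar)."""
    n = len(x)
    xbar = sum(x) / n
    return sum((x[t + h] - xbar) * (x[t] - xbar) for t in range(n - h)) / n

def sample_acf(x, h):
    """rho_hat(h) = gamma_hat(h) / gamma_hat(0)."""
    return sample_autocov(x, h) / sample_autocov(x, 0)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_autocov(x, 0), sample_autocov(x, 1), sample_acf(x, 1))
```

Dividing by n at every lag keeps the estimated autocovariance function non-negative definite, which is a common reason for preferring it over the n − h version.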



Large sample distribution of ACF

• Under the conditions that x_t is a linear process

• $E[w_t^4] < \infty$



White noise and ACF
• Most models reduce a time series to a combination of a white noise series and a deterministic series

• After a model has been fitted to a given dataset, one would like to test whether the residuals are truly white noise

• We know that for white noise:

• The estimated ACF at any nonzero lag will be approximately normally distributed around mean 0,

• with a standard deviation that shrinks to 0 as the sample size increases.



Is it an ACF for white noise?
• If our hypothesis is that a random variable has mean 0 and standard deviation 1/sqrt(n), then

• If the sampled value satisfies $|\hat{\rho}(h)| < 1.96/\sqrt{n}$, you accept that it is, with some certainty, equal to 0

• Else, we know with good certainty that it is not 0 (but what it is, we do not know!)
ACF of white noise
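
The check can be sketched on simulated data. The example assumes Gaussian white noise, a sample ACF that divides by n, and the approximate 95% band of 1.96/sqrt(n); the sample size, seed, and number of lags are arbitrary.

```python
import math
import random

def sample_acf(x, h):
    """Sample autocorrelation at lag h, dividing by n as in the biased estimator."""
    n = len(x)
    xbar = sum(x) / n
    gamma_h = sum((x[t + h] - xbar) * (x[t] - xbar) for t in range(n - h)) / n
    gamma_0 = sum((xt - xbar) ** 2 for xt in x) / n
    return gamma_h / gamma_0

rng = random.Random(42)
n = 500
w = [rng.gauss(0.0, 1.0) for _ in range(n)]  # simulated white noise

bound = 1.96 / math.sqrt(n)                  # approximate 95% band
lags = range(1, 21)
outside = [h for h in lags if abs(sample_acf(w, h)) > bound]
print(f"band = +/-{bound:.3f}, lags outside the band: {outside}")
```

Even for true white noise, roughly 1 lag in 20 is expected to fall outside the band by chance, so an occasional exceedance is not by itself evidence against white noise.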



Estimating cross covariance



Example for ACF
• We construct an experiment by tossing a fair coin:

• Let x_t = 1 when H is obtained and x_t = -1 otherwise

• x_0 = -1



Example of ACF



Theoretical ACF vs sample-based ACF
Theoretical ACF at h:



Exploratory Data Analysis for Time Series
Linear process
• Most of the models we will consider are going to be linear, or linear approximations.

• Consider a dependent time series x_t that is influenced by independent series, say



Fitting global temperature deviations

• Note that we are assuming that the errors, w_t, are an iid normal sequence, which may not be true!

• Linear regression is a simple approach to supervised learning. It assumes that the dependence between y and X is linear.



Finding the coefficient

• Or

• Where there are n observations.



Residual sum of squares



Minimizing RSS
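
As an illustration of minimizing the residual sum of squares, the sketch below fits a straight-line trend x_t = β0 + β1·t + w_t using the closed-form least-squares solution. The single-regressor setup, the time index t = 1..n, and the function name `fit_trend` are assumptions made for this example rather than the model on the slide.

```python
def fit_trend(x):
    """Least-squares fit of x_t = beta0 + beta1 * t + w_t for t = 1..n.

    Returns (beta0, beta1, rss), where rss is the residual sum of squares.
    """
    n = len(x)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    s_tx = sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    beta1 = s_tx / s_tt              # slope that minimizes the RSS
    beta0 = x_bar - beta1 * t_bar    # intercept that minimizes the RSS
    rss = sum((xi - beta0 - beta1 * ti) ** 2 for ti, xi in zip(t, x))
    return beta0, beta1, rss

# Noise-free check: a series that lies exactly on the line 2 + 3t
beta0, beta1, rss = fit_trend([2 + 3 * t for t in range(1, 11)])
print(beta0, beta1, rss)
```

With several regressors the same idea applies, but the minimizer comes from solving the normal equations instead of these two scalar formulas.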



Variance of beta
• $\hat{\beta}$ is an unbiased estimator of β

• The variance of the error is estimated as $s_w^2 = SSE / (n - q)$

• $s_w^2$ is an unbiased estimator of $\sigma_w^2$

• Therefore



Reducing the model

• Under the normal assumption, t above has a t-distribution with n-q degrees of freedom.

• Here n is the number of samples and q is the number of independent variables.



Reducing the model
• Suppose a proposed model specifies that only a subset of r < q independent variables,

• say, z_{t,1:r} = {z_{t1}, z_{t2}, . . ., z_{tr}}, is influencing the dependent variable x_t. The reduced model is

• The null hypothesis in this case is

• H0: β_{r+1} = ··· = β_q = 0.



Reducing the model

• The null hypothesis is rejected at level α if the F statistic exceeds the 1 − α percentile of the F distribution with q − r numerator and n − q − 1 denominator degrees of freedom.
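
A sketch of how the test statistic could be computed, assuming the standard form that compares the error sum of squares of the reduced model with that of the full model, using the degrees of freedom stated above. The function name and the numbers are hypothetical, and the critical value would still have to come from an F table or a statistics library.

```python
def f_statistic(sse_reduced, sse_full, n, q, r):
    """F statistic for H0: beta_{r+1} = ... = beta_q = 0.

    Uses q - r numerator and n - q - 1 denominator degrees of freedom.
    """
    numerator = (sse_reduced - sse_full) / (q - r)
    denominator = sse_full / (n - q - 1)
    return numerator / denominator

# Hypothetical numbers, for illustration only
F = f_statistic(sse_reduced=120.0, sse_full=100.0, n=50, q=3, r=1)
print(F)
```

A large F means that dropping the extra q − r variables increases the error sum of squares by more than noise alone would explain.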



Akaike’s Information Criterion
• Suppose we consider a normal regression model with k coefficients

• Akaike suggested balancing goodness of fit against the number of parameters in the model



Example



EDA



Which model is better?



And hypothesis testing
• We want to test whether a model with only the trend is adequate compared with the full model.

• The null hypothesis is rejected at level α if the F statistic exceeds the 1 − α percentile of the F distribution with q − r numerator and n − q − 1 denominator degrees of freedom.

