Introduction
Time series data have a natural temporal ordering. This makes time series
analysis distinct from other common data analysis problems, in which there is no
natural ordering of the observations (e.g. explaining people's wages by reference
to their education level, where the individuals' data could be entered in any order).
Time series analysis is also distinct from spatial data analysis where the
observations typically relate to geographical locations (e.g. accounting for house
prices by the location as well as the intrinsic characteristics of the houses). A time
series model will generally reflect the fact that observations close together in time
will be more closely related than observations further apart. In addition, time series
models will often make use of the natural one-way ordering of time so that values
for a given period will be expressed as deriving in some way from past values,
rather than from future values.
There are several types of data analysis available for time series, appropriate for different purposes:
General exploration
Description
Models for time series data can have many forms and represent
different stochastic processes. When modeling variations in the level of a process,
three broad classes of practical importance are the autoregressive (AR) models,
the integrated (I) models, and the moving average (MA) models. These three
classes depend linearly on previous data points. Combinations of these ideas
produce autoregressive moving average (ARMA) and autoregressive integrated
moving average (ARIMA) models. The autoregressive fractionally integrated
moving average (ARFIMA) model generalizes the former three. Extensions of
these classes to deal with vector-valued data are available under the heading of
multivariate time-series models and sometimes the preceding acronyms are
extended by including an initial "V" for "vector". An additional set of extensions of
these models is available for use where the observed time-series is driven by
some "forcing" time-series (which may not have a causal effect on the observed
series): the distinction from the multivariate case is that the forcing series may be
deterministic or under the experimenter's control. For these models, the acronyms
are extended with a final "X" for "exogenous".
Among other types of non-linear time series models, there are models to represent
the changes of variance over time (heteroskedasticity). These models are called
autoregressive conditional heteroskedasticity (ARCH) models, and the collection
comprises a wide variety of representations (GARCH, TARCH, EGARCH, FIGARCH, CGARCH, etc.). Here
changes in variability are related to, or predicted by, recent past values of the
observed series. This is in contrast to other possible representations of locally-
varying variability, where the variability might be modelled as being driven by a
separate time-varying process, as in a doubly stochastic model.
In recent work on model-free analyses, wavelet transform based methods (for
example locally stationary wavelets and wavelet decomposed neural networks)
have gained favor. Multiscale (often referred to as multiresolution) techniques
decompose a given time series, attempting to illustrate time dependence at
multiple scales.
Notation
Y = {Y_t : t ∈ T}, where T is the index set.
Conditions
There are two sets of conditions under which much of the theory is built:
Stationary process
Ergodicity
In most cases, where only a single time series exists to be analysed, the
variance of the e's is estimated by fitting a trend of the form x_t = at + b + e_t,
removing the fitted values at + b to leave the e's as residuals, and calculating
the variance of the e's from those residuals. This is often the only way of
estimating the variance of the e's.
One particular special case of great interest, the (global) temperature time
series, is known not to be homogeneous in time: apart from anything else, the
number of weather observations has (generally) increased with time, and thus
the error associated with estimating the global temperature from a limited set
of observations has decreased with time. This can be taken into account when
fitting a trend to the data, as described above.
Once we know the "noise" of the series, we can then assess the significance
of the trend by making the null hypothesis that the trend, a, is not significantly
different from 0. From the above discussion of trends in random data with
known variance, we know the distribution of trends to be expected from
random (trendless) data. If the calculated trend, a, is larger in magnitude than
the critical value, V, then the trend is deemed significantly different from zero
at significance level S.
It is harder to see a trend in a noisy time series. For example, if the true series is
0, 1, 2, 3, ... (rising by one each period), all plus some independent normally
distributed "noise" e of standard deviation E, and we have a sample series of
length 50, then if E = 0.1 the trend
will be obvious; if E = 100 the trend will probably be visible; but if E = 10000 the
trend will be buried in the noise.
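To make this concrete, here is a minimal sketch in Python (NumPy assumed; the slope, noise levels and series length are taken from the example above) that fits the trend x_t = at + b by least squares, estimates the variance of the e's from the residuals, and compares the fitted slope against its standard error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated series: true trend a = 1 per step, plus Gaussian noise of sd E.
n = 50
t = np.arange(n)
for E in (0.1, 100.0, 10000.0):
    x = 1.0 * t + rng.normal(scale=E, size=n)

    # Least-squares fit of x_t = a*t + b; the residuals estimate the e's.
    (a, b), *_ = np.linalg.lstsq(np.column_stack([t, np.ones(n)]), x, rcond=None)
    resid = x - (a * t + b)
    var_e = resid.var(ddof=2)  # two parameters (a, b) were fitted

    # Standard error of the slope; |a| much larger than se suggests a real trend.
    se_a = np.sqrt(var_e / ((t - t.mean()) ** 2).sum())
    print(f"E={E:>7}: a_hat={a: .3f}, se(a)={se_a:.3f}, |a|/se={abs(a)/se_a:.2f}")
```

The ratio |a|/se shrinks as E grows, so the fitted trend at E = 10000 is typically statistically indistinguishable from zero, matching the description above.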
Review Question
3. In what way does the trend neutralize the noise (seasonal, cyclical and
irregular oscillations)?
Time Series Analysis with Moving Average
Given a series of numbers and a fixed subset size, the moving average can be
obtained by first taking the average of the first subset. The fixed subset size is
then shifted forward, creating a new subset of numbers, which is averaged. This
process is repeated over the entire data series. The plot line connecting all the
(fixed) averages is the moving average. Thus, a moving average is not a single
number, but it is a set of numbers, each of which is the average of the
corresponding subset of a larger set of data points. A moving average may also
use unequal weights for each data value in the subset to emphasize particular
values in the subset.
When calculating successive values, a new value comes into the sum and an
old value drops out, meaning a full summation each time is unnecessary: for a
simple moving average of window size n, SMA_M = SMA_{M−1} + (p_M − p_{M−n})/n.
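As an illustration, here is a short Python sketch of a simple moving average computed with this incremental update (the sample data are invented):

```python
from collections import deque

def moving_average(data, n):
    """Simple moving average of window size n, updated incrementally:
    each step adds the newest value to the running sum and drops the
    oldest, so no full re-summation is needed."""
    window = deque()
    total = 0.0
    out = []
    for x in data:
        window.append(x)
        total += x
        if len(window) > n:
            total -= window.popleft()  # old value drops out of the sum
        if len(window) == n:
            out.append(total / n)
    return out

print(moving_average([1, 2, 3, 4, 5, 6], 3))  # [2.0, 3.0, 4.0, 5.0]
```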
In technical analysis there are various popular values for n, like 10 days,
40 days, or 200 days. The period selected depends on the kind of
movement one is concentrating on, such as short, intermediate, or long
term. In any case moving average levels are interpreted as support in a
rising market, or resistance in a falling market.
In all cases a moving average lags behind the latest data point, simply
from the nature of its smoothing. An SMA can lag to an undesirable
extent, and can be disproportionately influenced by old data points
dropping out of the average. This is addressed by giving extra weight to
more recent data points, as in the weighted and exponential moving
averages.
In some data acquisition systems, the data arrives in an ordered data stream and
the statistician would like to get the average of all of the data up until the current
data point. For example, an investor may want the average price of all of the stock
transactions for a particular stock up until the current time. As each new
transaction occurs, the average price at the time of the transaction can be
calculated for all of the transactions up to that point using the cumulative average.
This is the cumulative average, which is typically an unweighted average of the
sequence of i values x_1, ..., x_i up to the current time:

CA_i = (x_1 + x_2 + ⋯ + x_i) / i
The brute force method to calculate this would be to store all of the data and
calculate the sum and divide by the number of data points every time a new
data point arrived. However, it is possible to simply update the cumulative
average as a new value x_{i+1} becomes available, using the formula:

CA_{i+1} = CA_i + (x_{i+1} − CA_i) / (i + 1)
Thus the current cumulative average for a new data point is equal to the
previous cumulative average plus the difference between the latest data point
and the previous average divided by the number of points received so far.
When all of the data points arrive (i = N), the cumulative average will equal
the final average.
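A sketch of this update in Python (the price list is invented for illustration); each new point costs only one subtraction, one division and one addition:

```python
def cumulative_averages(stream):
    """Update the cumulative average in O(1) per point:
    CA_{i+1} = CA_i + (x_{i+1} - CA_i) / (i + 1)."""
    ca = 0.0
    for i, x in enumerate(stream, start=1):
        ca += (x - ca) / i
        yield ca

prices = [10.0, 11.0, 9.0, 12.0]
print(list(cumulative_averages(prices)))  # [10.0, 10.5, 10.0, 10.5]
```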
A weighted average is any average that has multiplying factors to give different
weights to different data points. Mathematically, the moving average is
the convolution of the data points with a moving average function; in technical
analysis, a weighted moving average (WMA) has the specific meaning of
weights that decrease arithmetically. In an n-day WMA the latest day has weight
n, the second latest n − 1, etc., down to one:

WMA_M = (n p_M + (n − 1) p_{M−1} + ⋯ + p_{M−n+1}) / (n + (n − 1) + ⋯ + 1)

The denominator is a triangle number equal to n(n + 1)/2.
[Figure: WMA weights for n = 15]
When calculating the WMA across successive values, it can be noted that the
difference between the numerators of WMA_{M+1} and WMA_M is
n p_{M+1} − p_M − ⋯ − p_{M−n+1}. If we denote the sum p_M + ⋯ + p_{M−n+1} by
Total_M, then

Total_{M+1} = Total_M + p_{M+1} − p_{M−n+1}
Numerator_{M+1} = Numerator_M + n p_{M+1} − Total_M
WMA_{M+1} = Numerator_{M+1} / (n(n + 1)/2)
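The following Python sketch implements the WMA with this incremental update (the sample prices are invented); Total_M and the numerator are carried between steps rather than recomputed:

```python
def weighted_moving_averages(prices, n):
    """n-day WMA with arithmetically decreasing weights n, n-1, ..., 1,
    updated incrementally:
    Numerator_{M+1} = Numerator_M + n*p_{M+1} - Total_M
    Total_{M+1}     = Total_M + p_{M+1} - p_{M-n+1}"""
    denom = n * (n + 1) / 2  # triangle number: n + (n-1) + ... + 1
    total = sum(prices[:n])  # plain sum of the current window
    numer = sum((i + 1) * p for i, p in enumerate(prices[:n]))
    out = [numer / denom]
    for m in range(n, len(prices)):
        numer += n * prices[m] - total
        total += prices[m] - prices[m - n]
        out.append(numer / denom)
    return out

print(weighted_moving_averages([1, 2, 3, 4, 5], 3))
# [2.333..., 3.333..., 4.333...]; e.g. (3*3 + 2*2 + 1*1)/6 = 14/6
```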
Review Question
1. Explain the term moving average. Why is this method more reliable than the
trend for analysing a time series?
Introduction
The simplest form of exponential smoothing is given by s_0 = x_0 and
s_n = αx_n + (1 − α)s_{n−1} for n > 0, where α is the smoothing factor,
0 < α < 1. Values of α close to one have less of a smoothing effect and give
greater weight to recent changes in the data, while values of α closer to zero
have a greater smoothing effect and are less responsive to recent changes. There is
no formally correct procedure for choosing α. Sometimes the statistician's
judgment is used to choose an appropriate factor. Alternatively, a statistical
technique may be used to optimize the value of α. For example, the method of
least squares might be used to determine the value of α for which the sum of
the squared one-step prediction errors (s_{n−1} − x_n)² is minimized.
Why is it "exponential"?
Simple exponential smoothing does not do well when there is a trend in the data.[1]
In such situations, double exponential smoothing can be used.
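A minimal Python sketch of simple exponential smoothing, together with a crude grid search for α minimizing the sum of squared one-step errors as described above (the data values are invented; a grid search is just one simple way to carry out the optimization):

```python
def exp_smooth(x, alpha):
    """Simple exponential smoothing: s_0 = x_0, s_n = alpha*x_n + (1-alpha)*s_{n-1}."""
    s = [x[0]]
    for xn in x[1:]:
        s.append(alpha * xn + (1 - alpha) * s[-1])
    return s

def sse_one_step(x, alpha):
    """Sum of squared one-step prediction errors (s_{n-1} - x_n)^2."""
    s = exp_smooth(x, alpha)
    return sum((s[n - 1] - x[n]) ** 2 for n in range(1, len(x)))

data = [3.0, 5.0, 9.0, 20.0, 12.0, 17.0, 22.0, 23.0, 51.0, 41.0]
best = min((a / 100 for a in range(1, 100)), key=lambda a: sse_one_step(data, a))
print(best, exp_smooth(data, best)[-3:])
```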
Review Question
Given a time series of data X_t, the ARMA model is a tool for understanding and,
perhaps, predicting future values in this series. The model consists of two parts,
an autoregressive (AR) part and a moving average (MA) part. The model is usually
then referred to as the ARMA(p, q) model, where p is the order of the
autoregressive part and q is the order of the moving average part.
Autoregressive model
The notation AR(p) refers to the autoregressive model of order p. The AR(p)
model is written

X_t = c + φ_1 X_{t−1} + ⋯ + φ_p X_{t−p} + ε_t

where φ_1, ..., φ_p are the parameters of the model, c is a constant, and ε_t is
white noise.
Some constraints are necessary on the values of the parameters of this model
in order that the model remains stationary. For example, processes in the
AR(1) model with |φ1| ≥ 1 are not stationary.
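To illustrate the stationarity constraint, here is a small Python sketch (NumPy assumed; the parameter values are invented) that simulates an AR(1) process and shows how the variance behaves for |φ1| < 1 versus φ1 = 1:

```python
import numpy as np

def simulate_ar1(phi, c=0.0, n=500, seed=0):
    """Simulate X_t = c + phi * X_{t-1} + eps_t with unit-variance white noise."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = c + phi * x[t - 1] + eps[t]
    return x

# |phi| < 1: stationary, variance settles near 1/(1 - phi^2);
# phi = 1: a random walk, whose variance grows without bound.
for phi in (0.5, 0.9, 1.0):
    x = simulate_ar1(phi)
    print(f"phi={phi}: var(first half)={x[:250].var():.1f}, "
          f"var(second half)={x[250:].var():.1f}")
```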
Autoregressive moving average model
In some texts the models will be specified in terms of the lag operator L. In these
terms the AR(p) model is given by

ε_t = (1 − φ_1 L − φ_2 L² − ⋯ − φ_p L^p) X_t

or more concisely,

φ(L) X_t = ε_t
Fitting models
Applications
When one of the terms is zero, it is usual to drop AR, I or MA. For example,
an I(1) model is ARIMA(0,1,0), and an MA(1) model is ARIMA(0,0,1).
Definition
In these cases the ARIMA model can be viewed as a "cascade" of two models.
The first is non-stationary:

Y_t = (1 − L)^d X_t

while the second is wide-sense stationary:

φ(L) Y_t = θ(L) ε_t
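A small Python sketch of this cascade view (NumPy assumed; the AR coefficient is invented): an ARIMA(1,1,0)-style series is built by integrating a stationary AR(1), and applying (1 − L) once recovers the stationary part exactly:

```python
import numpy as np

rng = np.random.default_rng(1)

# The first difference Y_t follows a stationary AR(1);
# X_t is its cumulative sum (the "integration" step).
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.6 * y[t - 1] + rng.normal()
x = np.cumsum(y)

# Cascade view: differencing once undoes the integration, leaving the
# stationary series, which can then be fitted with an ordinary AR/ARMA model.
y_recovered = np.diff(x)
print(np.allclose(y_recovered, y[1:]))  # True: differencing inverts the cumsum
```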
Description
There are algorithms, not discussed here, for estimating the partial autocorrelation
based on the sample autocorrelations. These algorithms derive from the exact
theoretical relation between the partial autocorrelation function and the
autocorrelation function.
Partial autocorrelation plots (Box and Jenkins, pp. 64–65, 1970) are a commonly
used tool for model identification in Box-Jenkins models. Specifically, partial
autocorrelations are useful in identifying the order of an autoregressive model. The
partial autocorrelation of an AR(p) process is zero at lag p+1 and greater. If the
sample autocorrelation plot indicates that an AR model may be appropriate, then
the sample partial autocorrelation plot is examined to help identify the order. One
looks for the point on the plot where the partial autocorrelations for all higher lags
are essentially zero. Placing on the plot an indication of the sampling uncertainty
of the sample PACF is helpful for this purpose: this is usually constructed on the
basis that the true value of the PACF, at any given positive lag, is zero. This can
be formalised as described below.
An approximate test that a given partial correlation is zero (at a 5% significance
level) is given by comparing the sample partial autocorrelations against the critical
region with upper and lower limits given by ±1.96/√n, where n is the record length
(number of points) of the time-series being analysed. This approximation relies on
the assumption that the record length is moderately large (say n>30) and that the
underlying process has a multivariate normal distribution.
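As a sketch of the procedure described above, the following Python code (NumPy assumed; the AR(2) coefficients are invented) estimates the sample autocorrelations, derives the partial autocorrelations via the Durbin-Levinson recursion (one of the algorithms alluded to above), and flags lags outside the ±1.96/√n limits; for an AR(2) process the flagged lags should essentially stop after lag 2:

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations rho_0..rho_nlags of a series x."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x * x)
    return np.array([1.0] +
                    [np.sum(x[k:] * x[:-k]) / denom for k in range(1, nlags + 1)])

def pacf_durbin_levinson(acf):
    """Partial autocorrelations from the autocorrelations via the
    Durbin-Levinson recursion: phi_{k,k} is the PACF at lag k."""
    nlags = len(acf) - 1
    pacf = np.zeros(nlags + 1)
    pacf[0] = 1.0
    phi_prev = np.array([])
    for k in range(1, nlags + 1):
        if k == 1:
            phi_k = acf[1]
            phi = np.array([phi_k])
        else:
            num = acf[k] - np.sum(phi_prev * acf[k - 1:0:-1])
            den = 1.0 - np.sum(phi_prev * acf[1:k])
            phi_k = num / den
            phi = np.append(phi_prev - phi_k * phi_prev[::-1], phi_k)
        pacf[k] = phi_k
        phi_prev = phi
    return pacf

# Simulate an AR(2) process; its PACF should cut off after lag 2.
rng = np.random.default_rng(2)
n = 1000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + rng.normal()

pacf = pacf_durbin_levinson(sample_acf(x, 10))
bound = 1.96 / np.sqrt(n)  # approximate 5% limits, +/- 1.96/sqrt(n)
for k in range(1, 11):
    flag = "*" if abs(pacf[k]) > bound else " "
    print(f"lag {k:2d}: {pacf[k]: .3f} {flag}")
```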
Review Questions