
Time Series in Brief

Soumen Manna
Goa Institute of Management, Goa, India

1. Introduction

In regression analysis, we have an output or response variable Y and a set of input or causal variables X =
(X1 , X2 , . . . , Xn ), and we try to establish a linear relationship between Y and X from the available data,
which consists of past observations of X and Y . However, there are many processes where the output
variable Y is observable but the input variables X are not. In most cases the reason is that the input
variables are too many and we have no exact information about their occurrence. So we are left with only
a sequence of Y observations over time. Analysing the pattern of this sequence of Y observations over time
and building a mathematical model from it for future prediction are the main objectives of Time Series
Analysis.

Definition 1.1. A time series is a sequence of measurements of the same variable collected over time. Most
often, the measurements are made at regular time intervals.

One difference from standard linear regression is that the data are not necessarily independent and not
necessarily identically distributed. One defining characteristic of a time series is that it is a list of observations
where the ordering matters. Ordering is very important because there is dependency and changing the order
could change the meaning of the data.

Important characteristics of a time series

Some important questions to consider when first looking at a time series are:

• Is there a trend, meaning that, on average, the measurements tend to increase (or decrease) over time?

• Is there seasonality, meaning that there is a regularly repeating pattern of highs and lows related to
calendar time such as seasons, quarters, months, days of the week, and so on?

• Are there outliers? In regression, outliers are far away from your line. With time series data, your
outliers are far away from your other data.

• Is there a long-run cycle or period unrelated to seasonality factors?

• Is there constant variance over time, or is the variance non-constant?

• Are there any abrupt changes to either the level of the series or the variance?

There are many models available to analyse and forecast time series data. No single model explains a
time series perfectly; you have to try several plausible models and check which fits the data best. In this
note, we learn some fundamental models that are good enough to explain most time series data. Going
forward, we denote a time series by {Yt }.



2. Moving Average Smoothing Models

Moving average models are classical methods for identifying patterns in time series data. There are many
ways to define a moving average model. In this note, we adopt three basic moving average models and
illustrate each with a small toy series.

2.1. Simple Moving Average (SMA)


Simple Moving Average of order k is called k-SMA. It can be defined in many ways.

Type 1: The prediction Ŷt at time point t is the average of the past k data points:

Ŷt = (Yt−1 + Yt−2 + . . . + Yt−k )/k.

Type 2: The prediction Ŷt at time point t is the average of the t-th and the past (k − 1) data points:

Ŷt = (Yt + Yt−1 + . . . + Yt−(k−1) )/k.

R Code. SMA(ts, n = 2) (from the TTR package) implements this formula with k = 2.

Type 3: The prediction Ŷt at time point t is the average of the t-th data point together with the previous
(k − 1)/2 and the future (k − 1)/2 data points; k is taken to be odd in this case:

Ŷt = (Yt−(k−1)/2 + . . . + Yt + . . . + Yt+(k−1)/2 )/k.

R Code. filter(ts, filter = c(1/3, 1/3, 1/3), sides = 2) implements this formula for k = 3.
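
As a small illustration, here is a minimal sketch computing all three SMA types on a toy series (assuming the TTR package for SMA(); the series itself is hypothetical):

library(TTR)

y <- c(5, 8, 6, 9, 7, 10, 8)                        # hypothetical toy series, 7 points
k <- 2
type2 <- SMA(y, n = k)                              # Type 2: average of current and past k-1 points
type1 <- c(NA, head(type2, -1))                     # Type 1: the same averages, shifted one step
type3 <- stats::filter(y, rep(1/3, 3), sides = 2)   # Type 3: centered average with k = 3
cbind(y, type1, type2, type3)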

2.2. Weighted Moving Average (WMA)


Weighted Moving Average of order k is called k-WMA. Instead of giving equal weight to the past k data
points, here we put higher weights on the immediate past (or future) observations and lower weights on the
distant past (or future) observations. We use k weights a1 , a2 , . . . , ak with a1 + a2 + . . . + ak = 1. It can be
defined in many ways.
Type 1: The prediction Ŷt at time point t is the weighted average of the past k data points:

Ŷt = a1Yt−1 + a2Yt−2 + . . . + akYt−k .

If we take k = 2 with a1 = 2/3 and a2 = 1/3, then Ŷt = (2/3)Yt−1 + (1/3)Yt−2 .

Type 2: The prediction Ŷt at time point t is the weighted average of the t-th and the past (k − 1) data points:

Ŷt = a1Yt + a2Yt−1 + . . . + akYt−(k−1) .

If we take k = 2 with a1 = 2/3 and a2 = 1/3, then Ŷt = (2/3)Yt + (1/3)Yt−1 .

R Code. WMA(ts, n = 2, wts = 1:2) (from the TTR package) implements this formula; the weights are normalized internally, so wts = 1:2 puts weight 2/3 on the most recent point and 1/3 on the one before.
Type 3: The prediction Ŷt at time point t is the weighted average of the t-th data point together with the
previous (k − 1)/2 and the future (k − 1)/2 data points; k is taken to be odd in this case:

Ŷt = at−(k−1)/2 Yt−(k−1)/2 + . . . + at Yt + . . . + at+(k−1)/2 Yt+(k−1)/2 .

If we take k = 3 with at−1 = 1/4, at = 1/2 and at+1 = 1/4, then Ŷt = (1/4)Yt−1 + (1/2)Yt + (1/4)Yt+1 .

R Code. filter(ts, filter = c(1/4, 1/2, 1/4), sides = 2) implements this formula.
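
A minimal sketch of the Type 2 and Type 3 weighted averages on the same hypothetical toy series (again assuming the TTR package for WMA()):

library(TTR)

y <- c(5, 8, 6, 9, 7, 10, 8)
type2 <- WMA(y, n = 2, wts = 1:2)                       # weights 2/3 (current), 1/3 (previous)
type3 <- stats::filter(y, c(1/4, 1/2, 1/4), sides = 2)  # centered weights for k = 3
cbind(y, type2, type3)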

2.3. Exponential Moving Average (EMA)


Like SMA and WMA, EMA can be defined in many ways. However, in this note, we adopt the definition R
uses. The prediction Ŷt at time point t is a weighted average of the original Yt and the previous prediction
Ŷt−1 . Thus, the Exponential Moving Average of order k is

Ŷt = αYt + (1 − α)Ŷt−1

   = αYt + (1 − α)[αYt−1 + (1 − α)Ŷt−2 ]

   = αYt + α(1 − α)Yt−1 + (1 − α)²Ŷt−2

   ⋮

   = αYt + α(1 − α)Yt−1 + α(1 − α)²Yt−2 + · · · + (1 − α)ᵏŶt−k

R uses α = 2/(k + 1) in its implementation.

R Code. EMA(ts, n = 2) # here n is the order k; with n = k = 2, α = 2/3.
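
A minimal sketch comparing the recursion above with TTR's EMA(); note that EMA() seeds its recursion with a simple average of the first n points, so the first few values may differ from a recursion seeded with Y1:

library(TTR)

y <- c(5, 8, 6, 9, 7, 10, 8)  # hypothetical toy series
k <- 2
alpha <- 2 / (k + 1)
ema <- numeric(length(y))
ema[1] <- y[1]  # seed the recursion with the first observation
for (t in 2:length(y)) {
  ema[t] <- alpha * y[t] + (1 - alpha) * ema[t - 1]
}
cbind(y, ema, EMA(y, n = k))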

3. ARIMA models

Moving average models are good for understanding patterns in existing data. But when it comes to
forecasting, moving average methods are not adequate. We need more advanced models like ARIMA and
SARIMA for future predictions. However, one cannot apply these models to every kind of time series
data: they are applicable only if the series is stationary.
A stationary time series is one whose properties do not depend on the time at which the series is observed.
Thus, time series with trends, or with seasonality, are not stationary — the trend and seasonality will affect
the value of the time series at different times. On the other hand, a white noise series is stationary — it does
not matter when you observe it, it should look much the same at any point in time.
Some cases can be confusing — a time series with cyclic behaviour (but with no trend or seasonality) is
stationary. This is because the cycles are not of a fixed length, so before we observe the series we cannot be
sure where the peaks and troughs of the cycles will be.
In general, a stationary time series will have no predictable patterns in the long-term. Time plots will show
the series to be roughly horizontal (although some cyclic behaviour is possible), with constant variance.

Figure 1: Which of these series are stationary? (a) Google stock price for 200 consecutive days; (b)
Daily change in the Google stock price for 200 consecutive days; (c) Annual number of strikes in the US;
(d) Monthly sales of new one-family houses sold in the US; (e) Annual price of a dozen eggs in the US
(constant dollars); (f) Monthly total of pigs slaughtered in Victoria, Australia; (g) Annual total of lynx trapped
in the McKenzie River district of north-west Canada; (h) Monthly Australian beer production; (i) Monthly
Australian electricity production.

Consider the nine series plotted in Figure 1. Which of these do you think are stationary?
Obvious seasonality rules out series (d), (h) and (i). Trends and changing levels rule out series (a), (c),
(e), (f) and (i). Increasing variance also rules out (i). That leaves only (b) and (g) as stationary series. At first
glance, the strong cycles in series (g) might appear to make it non-stationary. But these cycles are aperiodic —
they are caused when the lynx population becomes too large for the available feed, so that they stop breeding
and the population falls to low numbers, then the regeneration of their food sources allows the population to
grow again, and so on. In the long-term, the timing of these cycles is not predictable. Hence the series is
stationary.

If the time series data are generated by one of the following processes (with suitable restrictions on the parameters), then the series is stationary.

Autoregressive or AR(p) models: when the time series is generated from its own past p values.

Yt = a0 + a1Yt−1 + a2Yt−2 + . . . + apYt−p + ε (3.1)


where the ai ’s are parameters and ε is the random term, also known as white noise. From the ACF and
PACF graphs, we can identify an AR(p) process: the ACF decays gradually and the PACF cuts off after
p significant lags.
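
For instance, we can simulate an AR(1) process in base R and verify this signature (the coefficient 0.7 is arbitrary):

set.seed(1)
ar1 <- arima.sim(model = list(ar = 0.7), n = 200)  # simulate an AR(1) process
acf(ar1)   # decays gradually
pacf(ar1)  # cuts off after lag 1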

Moving Average or MA(q) models: when the time series is generated from its past q error values.

Yt = wt + b0 + b1 wt−1 + b2 wt−2 + . . . + bq wt−q (3.2)


where the bi ’s are parameters and the wi ’s are past error terms. From the ACF and PACF graphs, we can
identify an MA(q) process: the PACF decays gradually and the ACF cuts off after q significant lags.

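A simulated MA(1) process shows the mirrored pattern (again, the coefficient 0.7 is arbitrary):

set.seed(2)
ma1 <- arima.sim(model = list(ma = 0.7), n = 200)  # simulate an MA(1) process
acf(ma1)   # cuts off after lag 1
pacf(ma1)  # decays gradually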

Autoregressive Moving Average or ARMA(p, q) models: when the time series is generated from its past
p original values and its past q error values.

Yt = c + a1Yt−1 + a2Yt−2 + . . . + apYt−p + b1 wt−1 + b2 wt−2 + . . . + bq wt−q + wt (3.3)


where c, the ai ’s and the bi ’s are parameters. From the ACF and PACF graphs, we can identify an
ARMA(p, q) process: both the ACF and the PACF decay gradually.
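
A simulated ARMA(1, 1) process shows both graphs tailing off (coefficients arbitrary):

set.seed(3)
arma11 <- arima.sim(model = list(ar = 0.6, ma = 0.4), n = 200)  # simulate an ARMA(1,1) process
acf(arma11)   # decays gradually
pacf(arma11)  # decays gradually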

Therefore, when we have only the time series data {yt }, we can identify a probable model to fit from the
ACF and PACF graphs. The following table summarizes the relationship between the ACF and PACF
graphs and the AR(p), MA(q) and ARMA(p, q) models.

Model          ACF                      PACF
AR(p)          decays gradually         cuts off after lag p
MA(q)          cuts off after lag q     decays gradually
ARMA(p, q)     decays gradually         decays gradually

R Code. The acf2(ts) function (from the astsa package) displays the ACF and PACF graphs together.

3.1. Non-Seasonal ARIMA models


When a time series {yt } is stationary or has only a trend (but no seasonal) component, we can model it
under the non-seasonal Autoregressive Integrated Moving Average or simply, ARIMA setup. If the original time
series {yt } has a trend component, then we first have to transform {yt } into a series that is stationary.
The transformation is done by differencing the original time series. If the trend of {yt } is linear,
then the differenced series {zt }, where zt = yt − yt−1 , becomes stationary. If the trend of {yt } is quadratic,
then the twice-differenced series {zt∗ }, where zt∗ = zt − zt−1 = (yt − yt−1 ) − (yt−1 − yt−2 ), becomes stationary,
and so on. In practice, almost all non-seasonal time series with trend become stationary after at most two
differences.

An ARIMA model has 3 parameters p, d and q, and is denoted as an ARIMA(p, d, q) model. The
parameters p, d and q represent the AR order, the differencing order and the MA order respectively. Thus,

• An AR(2) model is denoted by ARIMA(2,0,0).

• An MA(2) model is denoted by ARIMA(0,0,2).

• An ARMA(2,1) model is denoted by ARIMA(2,0,1).

• A model with one AR term, a first difference, and one MA term is denoted by ARIMA(1,1,1).

Please note, in the last model ARIMA(1,1,1), one AR term and one MA term are applied to the
transformed time series {zt }, where zt = yt − yt−1 .

R Code. The following functions are important for building an ARIMA(p, d, q) model; a short workflow sketch follows the list.

1. decompose(ts, type = "additive") decomposes the time series ts into three components, namely
trend, seasonal and random parts. If you find an insignificant seasonal component, then you may
apply an ARIMA(p, d, q) model for some d ≥ 0.

2. ndiffs(ts) (from the forecast package) tells you how many non-seasonal differences are required to make a series stationary.

3. sarima(ts, p, d, q) (from the astsa package) builds an ARIMA(p, d, q) model. Note that here you have to pass the
original time series ts, not the differenced series; you only need to supply the appropriate p, d, q values.
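
Putting these together, here is a minimal workflow sketch on a simulated series (the model orders are chosen to match the simulation rather than discovered from the data):

library(forecast)   # assumed for ndiffs()
library(astsa)      # assumed for acf2() and sarima()

set.seed(4)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.6), n = 200)  # an ARIMA(1,1,0) series
ndiffs(y)                  # suggests the number of differences needed (here, typically 1)
acf2(diff(y))              # inspect the ACF/PACF of the differenced series
fit <- sarima(y, 1, 1, 0)  # fit on the original series, with d = 1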

3.2. Seasonal ARIMA models
Seasonality in a time series is a regular pattern of changes that repeats over S time periods, where S defines
the number of time periods until the pattern repeats again.
For example, there is seasonality in monthly data for which high values tend always to occur in some
particular months and low values tend always to occur in other particular months. In this case, S = 12 (months
per year) is the span of the periodic seasonal behavior. For quarterly data, S = 4 time periods per year.

In a seasonal ARIMA model, seasonal AR and MA terms predict using data values and errors at times
with lags that are multiples of S (the span of the seasonality).

A seasonal ARIMA model is denoted by ARIMA(p, d, q) × (P, D, Q)S . Here, the parameters p, d and q
represent the non-seasonal AR order, differencing order and MA order respectively, and P, D and Q represent
the seasonal AR order, differencing order and MA order respectively. S represents the number of seasons in
the data; in other words, there are S data points in each year of data. For example,

• With monthly data (and S = 12), a seasonal first order autoregressive model, i.e., ARIMA(0, 0, 0) ×
(1, 0, 0)12 would use Yt−12 to predict Yt . For instance, if we were selling cooling fans we might predict
this August’s sales using last August’s sales. (This relationship of predicting using last year’s data
would hold for any month of the year.)
• A seasonal second order autoregressive model, i.e., ARIMA(0, 0, 0) × (2, 0, 0)12 would use Yt−24 and
Yt−12 to predict Yt . Here we would predict this August’s values from the past two Augusts.
• A seasonal first order MA(1) model, i.e., ARIMA(0, 0, 0) × (0, 0, 1)12 would use wt−12 to predict Yt .
• A seasonal second order MA(2) model, i.e., ARIMA(0, 0, 0) × (0, 0, 2)4 would use wt−8 and wt−4 to
predict Yt .

Differencing

Almost by definition, it may be necessary to examine differenced data when we have seasonality. Seasonality
usually causes the series to be non-stationary because the average values at some particular times within the
seasonal span (months, for example) may be different than the average values at other times. For instance, our
sales of cooling fans will always be higher in the summer months.

Seasonal differencing
Seasonal differencing is defined as a difference between a value and a value with lag that is a multiple of S.

• With S = 12, which may occur with monthly data, a seasonal difference transforms the original
series {Yt } to the differenced series {Yt − Yt−12 }.
• With S = 4, which may occur with quarterly data, a seasonal difference transforms the original series
{Yt } to the differenced series {Yt − Yt−4 }.

Seasonal differencing removes seasonal trend in the data.

Non-Seasonal differencing
If trend is still present in the data, we may need non-seasonal differencing. Often (not always) a first difference
(non-seasonal) will “detrend” the data.
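
In base R, both kinds of difference are one-liners; AirPassengers (a built-in monthly series) is used here only as a stand-in:

y12 <- diff(AirPassengers, lag = 12)  # seasonal difference: removes the seasonal pattern
y1  <- diff(y12)                      # a further first difference, if trend remains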

The d and D terms of an ARIMA(p, d, q) × (P, D, Q)S model are identified from how many non-seasonal
and seasonal differences are required to make the series stationary. The other parameters are identified
from the ACF and PACF graphs. For example, consider a model with a non-seasonal MA(1) term, a seasonal
MA(1) term, no differencing, no AR terms and seasonal period S = 12. We can represent this model by
ARIMA(0, 0, 1) × (0, 0, 1)12 .

Consider the ACF and PACF graphs of the model ARIMA(0, 0, 1) × (0, 0, 1)12 . The ACF cuts off after
the first lag within the span of the first 12 lags, indicating the non-seasonal MA(1) term, and a significant
spike repeating at lag 12 indicates the seasonal MA(1) term. Next we discuss the steps to follow for
building a seasonal ARIMA model.
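
One way to see this signature is to simulate the process: multiplying out the MA polynomials, (1 + θB)(1 + ΘB¹²) = 1 + θB + ΘB¹² + θΘB¹³, gives an MA(13) representation that base R can simulate (θ = 0.5 and Θ = 0.7 are arbitrary; acf2() is assumed from the astsa package):

library(astsa)

set.seed(5)
theta <- 0.5; Theta <- 0.7
ma_coefs <- c(theta, rep(0, 10), Theta, theta * Theta)  # lags 1..13 of (1 + θB)(1 + ΘB^12)
y <- arima.sim(model = list(ma = ma_coefs), n = 240)
acf2(y)  # expect ACF spikes at lags 1 and 12, with a tapering PACF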

Identifying a Seasonal Model


Step 1: Do a time series plot of the data

Examine it for features such as trend and seasonality. You'll know whether you've gathered seasonal data
(months, quarters, etc.), so look at the pattern across those time units (months, etc.) to see if there is indeed a
seasonal pattern.

Step 2: Do any necessary differencing

If there is seasonality and no trend, then take a difference of lag S. For instance, take a 12th difference for
monthly data with seasonality. Seasonality will appear in the ACF by tapering slowly at multiples of S.

If there is linear trend and no obvious seasonality, then take a first difference. If there is a curved trend,
consider a transformation of the data before differencing.

If there is both trend and seasonality, apply a seasonal difference to the data and then re-evaluate the trend.
If a trend remains, then take first differences.

If there is neither obvious trend nor seasonality, don’t take any differences.

Step 3: Examine the ACF and PACF of the differenced data (if differencing is necessary)

We're using this information to determine possible models. This can be tricky, involving some
(educated) guessing. Some basic guidance:

• Non-seasonal terms: Examine the early lags (1, 2, 3, . . . ) to judge non-seasonal terms. Spikes in the
ACF (at low lags) with a tapering PACF indicate non-seasonal MA terms. Spikes in the PACF (at low
lags) with a tapering ACF indicate possible non-seasonal AR terms.

• Seasonal terms: Examine the patterns across lags that are multiples of S. For example, for monthly
data, look at lags 12, 24, 36, and so on (you probably won't need to look beyond the first two or
three seasonal multiples). Judge the ACF and PACF at the seasonal lags in the same way you do for the
earlier lags.

Step 4: Estimate the model(s) that might be reasonable on the basis of step 3

Don’t forget to include any differencing that you did before looking at the ACF and PACF. In the R
software, specify the original series as the data and then indicate the desired differencing when specifying
parameters in the sarima command that you’re using.

Step 5: Examine the residuals (with ACF, Normal Q-Q plot, Ljung-Box statistics etc.) to see if the model
seems good

Compare AIC or BIC values to choose among several models. If things don’t look good here, go back to
Step 3 (or maybe even Step 2).

Step 6: Forecast the future

If Step 5 looks good, you can forecast future observations of the series. In the R software, you may use
the sarima.for function for forecasting.

Remark 3.1. When forecasting to validate a fitted model, do not rely on a single test set for validation.
Create a set of train-test datasets, forecast on all of these test sets, and collect error measurements
like MSE, MAPE etc. Select as your best model the one whose average MSE or MAPE measurement
is minimum.
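
A minimal sketch of this idea with rolling train-test splits, using base R's arima() for brevity (the (0, 1, 1) × (0, 1, 1)12 orders and the three splits are illustrative assumptions):

y <- log(AirPassengers)
h <- 12
n <- length(y)
origins <- seq(n - 3 * h, n - h, by = h)  # three expanding train/test splits
mse <- sapply(origins, function(o) {
  fit <- arima(ts(y[1:o], frequency = 12),
               order = c(0, 1, 1), seasonal = c(0, 1, 1))
  fc <- predict(fit, n.ahead = h)$pred    # forecast the next h points
  mean((y[(o + 1):(o + h)] - fc)^2)       # MSE on this test window
})
mean(mse)  # average MSE; compare this across candidate models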

R Code. The following functions are important for building a seasonal ARIMA(p, d, q) × (P, D, Q)S model.

• sarima(ts, p, d, q, P, D, Q, S) function is used for building an ARIMA(p, d, q) × (P, D, Q)S model.

• sarima.for(ts, h, p, d, q, P, D, Q, S) function is used for forecasting an ARIMA(p, d, q) × (P, D, Q)S model h steps ahead.
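
As a final sketch, the classic airline-passenger data ties the steps together; the ARIMA(0, 1, 1) × (0, 1, 1)12 orders below are the textbook choice for this series, stated here as an assumption rather than derived:

library(astsa)

y <- log(AirPassengers)                 # log transform stabilizes the growing variance
fit <- sarima(y, 0, 1, 1, 0, 1, 1, 12)  # fits the model and prints residual diagnostics
sarima.for(y, n.ahead = 12, 0, 1, 1, 0, 1, 1, 12)  # forecast 12 months ahead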

Resources

1. https://online.stat.psu.edu/stat510/

2. https://otexts.com/fpp3/

3. https://rstudio-pubs-static.s3.amazonaws.com/19198_aba6dbcabd0748159e3f395cc02c0f0c.html
4. https://rpubs.com/ryankelly/tsa4

