You are on page 1of 62

Module MP(C)

Methods for predicting numerical variables


Simon Malinowski

M1 Miage - M1 Data Science, Univ. Rennes 1

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 1 / 55


Schedule of this course

1 Part 1 : Time series prediction


I 5h CM
I 8h TD/TP (including a graded work)
2 Part 2 : Prediction using other variables (regression models)
I 5h CM
I 8h TD/TP
I 4h Project (graded)
→ Kaggle competition : Lien

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 2 / 55


Time series prediction
Simon Malinowski

M1 Miage, Univ. Rennes 1

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 3 / 55


1 Introduction : context, definitions

2 Time series decomposition

3 Time series prediction

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 4 / 55


Time series examples

electricity consumption of users every minute or hour


number of people in the public transport every day
average daily temperature in a given city

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 5 / 55


Our goal: predict the next value(s)

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 6 / 55


Our goal: predict the next value(s)
pollution

355
350
345
340
335
330
325

20 22 24 26 28

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 7 / 55


What is a time series?

Definition
A time series is a finite series of numerical data points that represents
the evolution of a certain quantity over reguar time stamps

Also called sometimes chronolgical series


Denoted by X (t) = x1 , x2 , . . . , xn , or also X (t) = x(1), x(2), . . . , x(n)

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 8 / 55


Time series

The frequency of the observations (time stamps) can be:


hourly
daily
weekly
monthly
quarterly
yearly
whatever, provided that it is regular
The measured quantity is numerical

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 9 / 55


Graphical representations

1) Global representation : the points (t, xt ) are depicted on a plot


Example 1 : number of unemployed people registered to ”Pôle Emploi” in
France every month from 2001 to 2014
5000
4500
Nombre de chômeurs

4000
3500
3000

2000 2005 2010 2015

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 10 / 55


Graphical representations

1) Global representation : the points (t, xt ) are depicted on a plot


Example 2 : Monthly average temperature in Nottingham
Period 1920-1940
65
60
55
Temperature(F)

50
45
40
35
30

1920 1925 1930 1935 1940

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 11 / 55


Graphical representations
1) Global representation : the points (t, xt ) are depicted on a plot
Example 3 : Average monthly air trafic
Period 1949-1961
600
500
Traffic aérien

400
300
200
100

1950 1952 1954 1956 1958 1960

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 12 / 55


Graphical representations
2) By-period representation : superimposed periods
Unemployment data
6000

1996
1997
2001
2007
5500

2012
2013
2014
5000
chom[1:12]

4500
4000
3500
3000

2 4 6 8 10 12

Index

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 13 / 55


Graphical representations

2) By-period representation : superimposed periods

1964
1968
70

1972
1975
60
Temperature(F)

50
40
30
20
10

2 4 6 8 10 12

Index

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 13 / 55


1 Introduction : context, definitions

2 Time series decomposition

3 Time series prediction

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 14 / 55


Analyzing and understanding a time series

Time series often represent a complex phenomenon, difficult to analyze in


a straightforward manner
The methods that we will use to predict a time series are based on the
decomposition of a TS into elements that are simpler to
handle with
control
The elements that are considered are :
1 the trend Tt
2 the seasonal component St
3 the residual component Et

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 15 / 55


Decomposition models

Additive model
We assume that Xt can be written as

Xt = Tt + St + Et

Multiplicative model
We assume that Xt can be written as

Xt = (Tt ∗ St ) + Et

Xt = Tt ∗ St ∗ Et

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 16 / 55


The trend

Definition
This component represents the global evolution of the considered
phemomenon

Assumptions about this component


slowly varying
deterministic
can be estimated as a (simple) mathematical function

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 17 / 55


Trend or not trend ?

600
500
AirPassengers

400
300
200
100

1950 1952 1954 1956 1958 1960

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 18 / 55


Trend or not trend ?

600
500
AirPassengers

400
300
200
100

1950 1952 1954 1956 1958 1960

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 19 / 55


Trend or not trend ?

582
581
580
LakeHuron

579
578
577
576

1880 1900 1920 1940 1960

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 20 / 55


Trend or not trend ?

582
581
580
LakeHuron

579
578
577
576

1880 1900 1920 1940 1960

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 21 / 55


Kinds of trends

1 Linear: Tt = a + b × t
2 Polynomial : Tt = a0 + a1 × t + · · · + an × t n
3 Logarithmic : Tt = a + b × log t
4 Exponential : Tt = a × exp(bt)

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 22 / 55


The seasonal component

Definition
The seasonal component represents periodic fluctuations around the mean
inside a period (of length p) and that occur almost identically from period
to period

Example : Temperature in Nottingham


65
60
55
Temperature(F)

50
45
40
35
30

1920 1925 1930 1935 1940

Time

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 23 / 55


The seasonal component
Definition
The seasonal component represents periodic fluctuations around the mean
inside a period (of length p) and that occur almost identically from period
to period

Example : Temperature in Nottingham


60
55
Température moy

50
45
40

2 4 6 8 10 12

Mois

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 23 / 55


The seasonal component

This component is completetly defined by p seasonal coefficients


s1 , . . . sp that represent the seasonal profile.
We have St = s1 , . . . , sp , s1 , . . . , sp , . . .
Hence, the seasonal component St is the repetition of these p coefficients
over each period.

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 24 / 55


The residual component Et

This component represents irregular and ”unpredictable” fluctuations.


It is a random component, considered often as a zero-mean random
variable

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 25 / 55


Determining the composition of a time series

All time series are not composed of both a trend and a seasonal component
The typical kinds of time series are
completely random : Xt = Et
with a trend component only : Xt = Tt + Et
with a seasonal component only : Xt = St + Et
with both a trend and a seasonal component : Xt = Tt + St + Et

Next step is a tool to help you determine in which case we are

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 26 / 55


Autocorrelation function

Let Xt = x1 , . . . , xn be a time series, et k an integer ∈ [0, n − 1].


The autocorrelation function of Xt is defined as :
n
X
(xt − x)(xt−k − x)
cov (xt , xt−k ) t=k+1
ρ(k) = = n
var (xt ) X
(xt − x)2
t=1

Properties :
ρ(0) = 1
ρ(k) ≤ 1, ∀k > 0

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 27 / 55


Autocorrelation function
For a completely random time series (Xt = Et ),
p
|ρ(l)| < 1.96 ∗ 1/n, ∀l > 0
3
2
1
bbg

0
-1
-2
-3

0 200 400 600 800 1000

Index

Series bbg
1.0
0.8
0.6
ACF

0.4
0.2
0.0

0 5 10 15 20 25 30

Lag

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 28 / 55


Autocorrelation function
For a time series with a trend (Xt = Tt + Et ), we have:

ρ(l) → 1, ∀l > 0
2.0
1.8
1.6
yy

1.4
1.2
1.0

0 200 400 600 800 1000

xx

Series yy
1.0
0.8
0.6
ACF

0.4
0.2
0.0

0 5 10 15 20 25 30

Lag

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 29 / 55


Autocorrelation function
For a time series with a trend (Xt = Tt + Et ), we have:

ρ(l) → 1, ∀l > 0
20
15
yy

10
5
0

0 50 100 150 200

xx

Series yy
1.0
0.8
0.6
ACF

0.4
0.2
0.0

0 5 10 15 20

Lag

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 29 / 55


Autocorrelation function
For a time series of the kind xt = cos( 2Tπ t) (with a seasonal component),
we have :

ρ(l) → cos( t), ∀l > 0
T
0.0 0.5 1.0
yy

-1.0

0 50 100 150 200

xx

Series yy
0.0 0.5 1.0
ACF

-1.0

0 5 10 15 20

Lag

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 30 / 55


Autocorrelation function

Example for a time series with a period of length 4


80
60
yy

40
20

0 20 40 60 80

Index

Series yy
1.0
0.5
ACF

0.0
-0.5

0 5 10 15

Lag

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 31 / 55


Autocorrelation function
And for a series with both a trend and a seasonal component:

xt = a + b × t + cos( t)
T
25
20
15
zz

10
5
0

0 50 100 150 200

xx

Series zz
1.0
0.8
0.6
ACF

0.4
0.2
0.0

0 5 10 15 20

Lag

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 32 / 55


Detection of trend and/or seasonality

Summary:
1 Intuition about the phenomenon
2 Graphical representations of the time series
3 Analysis of the autocorrelation function
I slow decrease of the coefficients → trend
I periodicity of the coefficients → seasonality
I combination of both : trend + seasonality

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 33 / 55


Seasonal component: additive or multiplicative ?

For an additive model, the time series stays inside a constant ”strip”
around the trend

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 34 / 55


Seasonal component: additive or multiplicative ?

For a multiplicative model, the ”strip” arounf the mean is not


constant over time, i.e. the seasonal components depend on the trend

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 34 / 55


1 Introduction : context, definitions

2 Time series decomposition

3 Time series prediction


Series witout trend nor seasonality

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 35 / 55


Example

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 36 / 55


Prediction of a series witout trend nor seasonality

It is the simplest case, rarely used in practice. But it allows to understand


and apprehend some important things that will be used later.
Hypothesis: the time series takes its values around a stable level a , it has
no trend an no seasonal component.

Xt = a + et ,
with et a residual component.
Problem : how to estimate a in order to predict the future values of Xt ?

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 37 / 55


Prediction of a series witout trend nor seasonality

Let Xt = x1 , . . . , xn a series witout trend nor seasonality. We aim at


predicting the next value : x̂n+1 (and why not the next ones).
1 Naı̈ve : the last value xn is used as the prediction at time n + 1

x̂n+1 = xn

+ next value is likely to be close to the previous one


− only the previous value is used to predict (the past information is lost)
On our example:

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 38 / 55


Prediction of a series witout trend nor seasonality

2 prediction is the mean of all past values :


n
1X
x̂n+1 = xt
n
t=1

+ all past values are used for prediction



On our example:

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 39 / 55


Prediction using the mean of past values

Prévisions des ventes par moyenne de toutes observations passées

Ventes réelles
Prévision
350
ventes

300
250
200

5 10 15

Index

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 40 / 55


Prediction of a series witout trend nor seasonality

3 prediction is the median of all past values


+ all available data is taken into account
+ extreme values have less influence
− the close past is as important than the far past
On our example:

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 41 / 55


Prediction using the median of past values

Ventes réelles
Prévision par moyenne
Prévision par médiane
350
ventes

300
250
200

5 10 15

Index

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 42 / 55


Prévision d’une série sans tendance, ni saisonnalité

4 Weighted sum of all past values


Pn
ωi x i
x̂n+1 = Pi=1
n
i=1 ωi

+ we can decide to give more influence to some past values


+ all available data is used
− choice of ωi is tedious
On our example:

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 43 / 55


Prediction of a series witout trend nor seasonality

5 Simple exponential smoothing


+ more influence to recent values
+ all available data is used
+ flexible and easy to apply
Principle:

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 44 / 55


Simple exponential smoothing : analysis

x̂n+1 = α xn + (1 − α) x̂n

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 45 / 55


Simple exponential smoothing : analysis

x̂n+1 = α xn + (1 − α) x̂n
= α xn + (1 − α) ( α xn−1 + (1 − α) x̂n−1 )

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 45 / 55


Simple exponential smoothing : analysis

x̂n+1 = α xn + (1 − α) x̂n
= α xn + (1 − α) ( α xn−1 + (1 − α) x̂n−1 )
= α xn + α(1 − α) xn−1 + (1 − α)2 (αxn−2 + (1 − α)x̂n−2 )

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 45 / 55


Simple exponential smoothing : analysis

x̂n+1 = α xn + (1 − α) x̂n
= α xn + (1 − α) ( α xn−1 + (1 − α) x̂n−1 )
= α xn + α(1 − α) xn−1 + (1 − α)2 (αxn−2 + (1 − α)x̂n−2 )
= ...
= α xn + α(1 − α) xn−1 + α(1 − α)2 xn−2 + · · · +
α(1 − α)k xn−k + α(1 − α)n−1 x1 + (1 − α)n x̂1

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 45 / 55


Simple exponential smoothing : analysis
alpha = 0.9
alpha = 0.7
alpha = 0.5

0.8
alpha = 0.3
alpha = 0.1
Poids associé pour la prévision

0.6
0.4
0.2
0.0

2 4 6 8 10 12 14

Indice observation passée

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 46 / 55


Influence of the parameter α

if α is big (close to 1), we give a lot of influence to recent values. At


the limit (α = 1), I predict the last observed value
→ predictions are fluctuating; quick reaction to changes in the data

if α is small (close to 0), more influence is given to past values (tends


toward the mean of all past values)
→ predictions are stable but slow reactions to changes in the data

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 47 / 55


Simple Exponential Smoothing

Ventes réelles
Prévision Lissage expo simple alpha = 0.7
Prévision Lissage expo simple alpha = 0.3
350
ventes

300
250
200

5 10 15

Index

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 48 / 55


Simple Exponential smoothing : example

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 49 / 55


Choice of α and x̂0

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 50 / 55


Auto-regressive models (AR)

Principle :
p
X
x̂n = β0 + βk xn−k
k=1

p is the order of the model : AR(p)


the coefficients βj are estimated based on the original series

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 51 / 55


Auto-regressive models : example

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 52 / 55


How to choose the order p ?

For a given p, when a model is fitted (coefficients are estimated), we can


compute the BIC score (Bayesian Information Criterion).
It is a tradeoff between accuracy of the intermediate predictions and the
complexity of the model (number of coefficients)
The best value of p for a given time series can be chosen according to this
score (the lower BIC, the better)

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 53 / 55


Summary

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 54 / 55


Sélection de la meilleure méthode

Simon Malinowski (Univ. Rennes 1/IRISA) January 28, 2021 55 / 55

You might also like