
State Space Modelling For UK LFS Unemployment

Gary Brown, Ping Zong


Time Series Analysis Branch
Office for National Statistics
Jan Angenendt
  Knowledge, Analysis and Intelligence
HM Revenue and Customs
Moshe Feder
Southampton Statistical Sciences Research Institute (S3RI)
University of Southampton
Overview

• Introduction
• LFS Rolling Quarterly Data
• State Space Model
• The General State Space Model
• The Specific Model Proposed for UK LFS
• Results
• Further Work
Introduction

• A State Space Model (SSM) represents a structural time series approach to capturing the characteristics of a time series.
• As with X-12-ARIMA, an SSM can decompose a time series into trend, seasonal and irregular components.
• A key advantage of using an SSM:
• it allows explicit modelling of unobservable components
Aims of the SSM Project

• Currently the UK LFS publishes a single estimate for each rolling quarter, based on a rotating panel design with five waves of interviews
• The aim of the SSM project is to model the
complex LFS structure by fitting wave-specific
rolling quarter data, to better account for:
• sampling error autocorrelation (between wave-specific
estimates)
• rotation group bias (systematic differences between
wave-specific estimates)
LFS Sample Design

• The LFS sample, around 120,000 people in 40,000 households, is split into 190 Interviewer Areas (IAs).
• Each IA is split into 13 weekly ‘stints’ – in this way
a representative sample is achieved every 13
weeks.
• To weight the sample, the 13 weeks are allocated
to ‘months’ in a 4-4-5 pattern.
LFS Sample Design (cont)

• Interviews are (approximately) split by mode:
• First interview – face-to-face
• Second interview (13 weeks later) – telephone
• Third, fourth and fifth interviews (each 13 weeks after the previous) – telephone.

• After the fifth interview (wave 5) households drop out of the survey and are replaced with a new set of households (wave 1).
Data Structure

Table 1: Rolling quarterly estimates

Rolling quarter   Wave 1    Wave 2    Wave 3    Wave 4    Wave 5
May-Jul           Ym-j,w1   Ym-j,w2   Ym-j,w3   Ym-j,w4   Ym-j,w5
Jun-Aug           Yj-a,w1   Yj-a,w2   Yj-a,w3   Yj-a,w4   Yj-a,w5
Jul-Sep           Yj-s,w1   Yj-s,w2   Yj-s,w3   Yj-s,w4   Yj-s,w5
Aug-Oct           Ya-o,w1   Ya-o,w2   Ya-o,w3   Ya-o,w4   Ya-o,w5
Sep-Nov           Ys-n,w1   Ys-n,w2   Ys-n,w3   Ys-n,w4   Ys-n,w5
Oct-Dec           Yo-d,w1   Yo-d,w2   Yo-d,w3   Yo-d,w4   Yo-d,w5
Rolling quarterly estimate

• Each three-month period yields a representative sample, and each month is in three of these
• For example, survey responses from June are included in three rolling quarterly estimates: (Apr, May, Jun), (May, Jun, Jul), (Jun, Jul, Aug)
• Given this structure, the overall quarterly unemployment estimate at time t is a combination of three months:

$$Y_t = \frac{ILO_t + ILO_{t-1} + ILO_{t-2}}{EA_t + EA_{t-1} + EA_{t-2}}$$

• where ‘Y’ = unemployment rate, ‘ILO’ = ILO unemployed, and ‘EA’ = economically active.
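As a rough illustration of the pooled three-month ratio, the sketch below computes a rolling rate from monthly counts. It assumes the rolling quarter ends at month t; the function name `rolling_quarter_rate` and the monthly figures are made up for illustration and are not LFS data.

```python
import numpy as np

def rolling_quarter_rate(ilo, ea):
    """Rolling three-month rate: pooled ILO unemployed over pooled
    economically active, for each window of three consecutive months."""
    ilo = np.asarray(ilo, dtype=float)
    ea = np.asarray(ea, dtype=float)
    window = np.ones(3)
    pooled_ilo = np.convolve(ilo, window, mode="valid")  # ILO_{t-2} + ILO_{t-1} + ILO_t
    pooled_ea = np.convolve(ea, window, mode="valid")    # EA_{t-2} + EA_{t-1} + EA_t
    return pooled_ilo / pooled_ea

# Made-up monthly counts (thousands), for illustration only
ilo_monthly = [240, 250, 260, 255]
ea_monthly = [3000, 3010, 3020, 3030]
rates = rolling_quarter_rate(ilo_monthly, ea_monthly)
```

Four months give two overlapping rolling quarters, which is why consecutive estimates share two thirds of their sample.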
Table 2: Sample rotation in wave-specific data

Central month   Stint     Wave 1     Wave 2     Wave 3      Wave 4     Wave 5
July            week 10   Moshe      Nigel      Oscar       Ping       Quentin
                week 11   Mary       Nuovella   Olivia      Penelope   Queenie
                week 12   Mark       Nat        Owen        Paul       Quinlan
                week 13   Maxine     Naomi      Olga        Pam        Quanita
August          week 1    Andrew     Brian      Craig       David      Eric
                week 2    Amy        Bella      Catherine   Dominica   Edwina
                week 3    Albert     Bill       Charles     Danny      Edgar
                week 4    Anthony    Brenda     Carys       Davina     Emily
                week 5    Amanda     Ben        Callum      Dominic    Edward
September       week 6    Frederic   Giovanna   Helen       Iris       Janice
                week 7    Fred       Gary       Harry       Ian        Jan
                week 8    Fenella    Gemima     Hannah      Irene      Jacky
                week 9    Frank      Geoff      Henry       Iqbal      Jeremy
October         week 10   Lionel     Moshe      Nigel       Oscar      Ping
                week 11   Lorna      Mary       Nuovella    Olivia     Penelope
                week 12   Larry      Mark       Nat         Owen       Paul
                week 13   Lesley     Maxine     Naomi       Olga       Pam
Sample rotation

The sample rotation means:
• A given wave does not include the same households (samples) in each rolling quarter.
• The same households (cohort) reappear in the next wave one quarter later.
• Data collection methods differ between waves.
These characteristics need to be accounted for.
• Using the SSM approach for UK LFS unemployment enables this to happen
State space model

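The General State Space Model (GSSM) is:

$$y_t = Z_t \alpha_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, H_t) \qquad (1)$$
$$\alpha_t = T \alpha_{t-1} + \eta_t, \qquad \eta_t \sim N(0, Q_t) \qquad (2)$$

• where:
• (1) is the measurement equation for the observations $y_t$
• $\alpha_t$ is the state vector, which evolves according to the transition equation (2)
• Z, T, H and Q are matrices
• $\varepsilon_t$ and $\eta_t$ are error terms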

Compare the General Linear Regression Model (GLRM):

$$y_t = Z_t \beta + \varepsilon_t \qquad (3)$$
Comparing SSM and GLRM

• In (1) and (3), y is a function of time, but
• GLRM: the coefficient is $\beta$, which is fixed
• SSM: the coefficient is $\alpha_t$, which varies over time
• Hence, GLRM is a static regression model and SSM is a dynamic regression model
• In the SSM, each coefficient $\alpha_t$ varies according to a random walk $\alpha_t = \alpha_{t-1} + \eta_t$, which gives a state vector of the form $\alpha_t = T\alpha_{t-1} + \eta_t$, as in Equation (2).
• So, equation (1) expresses the dynamic regression process, and equation (2) expresses the dynamic change condition
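The contrast between a fixed and a time-varying coefficient can be sketched with a scalar simulation. This is only an illustration under simple assumptions (scalar state, T = 1, Gaussian errors); the function name and parameter values are hypothetical.

```python
import numpy as np

def simulate_dynamic_regression(z, sigma_eps, sigma_eta, alpha0=0.0, seed=0):
    """Simulate y_t = z_t * alpha_t + eps_t with a random-walk coefficient
    alpha_t = alpha_{t-1} + eta_t (scalar state, T = 1)."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    n = len(z)
    alpha = np.empty(n)
    prev = alpha0
    for t in range(n):
        prev = prev + rng.normal(0.0, sigma_eta)         # transition equation (2)
        alpha[t] = prev
    y = z * alpha + rng.normal(0.0, sigma_eps, size=n)   # measurement equation (1)
    return y, alpha

z = np.ones(50)
# Dynamic case: the coefficient drifts over time
y_dyn, alpha_dyn = simulate_dynamic_regression(z, 0.1, 0.05, alpha0=1.0)
# Static case: zero state variance reduces the model to a fixed coefficient, as in (3)
y_static, alpha_static = simulate_dynamic_regression(z, 0.1, 0.0, alpha0=1.0)
```

Setting the state disturbance variance to zero recovers the static regression, so the GLRM can be read as a special case of the SSM.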
SSM with Signal and Noise

The SSM can be expressed as two parts:
• signal $\theta_t$
• noise $e_t$

$$y_t = \theta_t + e_t \qquad (4)$$

where:
• $y_t$ is the design-unbiased survey estimate
• $\theta_t$ is the signal - the unknown population quantity
• $e_t$ is the noise - the survey errors
Signal $\theta_t$ - Basic Structural Model (BSM)

$$\theta_t = L_t + S_t \qquad (4a)$$
$$L_t = L_{t-1} + R_{t-1} + \eta_{L,t} \qquad (4b)$$
$$R_t = R_{t-1} + \eta_{R,t} \qquad (4c)$$
$$S_t = -\sum_{j=1}^{11} S_{t-j} + \eta_{S,t} \qquad (4d) \text{ if using dummy seasonality}$$

where:
• $L_t$ is the level
• $R_t$ is the slope
• $S_t$ is the seasonal
• $\eta_{L,t}$, $\eta_{R,t}$, $\eta_{S,t}$ are white noise terms
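The recursions (4a) to (4d) can be simulated directly. The sketch below is illustrative only: the function name, the starting values (level and slope at zero) and the initial seasonal pattern are assumptions, not values from the LFS model.

```python
import numpy as np

def simulate_bsm(n, sigma_L, sigma_R, sigma_S, seas_init, seed=0):
    """Simulate a BSM signal with dummy seasonality (monthly, 11 lags).

    seas_init holds S_{t-1}, ..., S_{t-11} (newest first) before the first step.
    """
    rng = np.random.default_rng(seed)
    level, slope = 0.0, 0.0
    past_seas = list(seas_init)
    theta = np.empty(n)
    seasonal = np.empty(n)
    for t in range(n):
        level = level + slope + rng.normal(0.0, sigma_L)   # (4b), uses R_{t-1}
        slope = slope + rng.normal(0.0, sigma_R)           # (4c)
        s_t = -sum(past_seas) + rng.normal(0.0, sigma_S)   # (4d)
        past_seas = [s_t] + past_seas[:-1]                 # keep the 11 most recent
        seasonal[t] = s_t
        theta[t] = level + s_t                             # (4a)
    return theta, seasonal

# With sigma_S = 0 the seasonal pattern is fixed and sums to zero over any 12 months
theta, seasonal = simulate_bsm(36, 0.1, 0.01, 0.0, seas_init=np.arange(11.0))
```

A non-zero seasonal variance lets the pattern evolve slowly, while the sum over a year stays zero in expectation.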
Noise $e_t$ - the Extended SSM model

$$e_t = \sum_{j=1}^{p} \rho_j e_{t-j} + \eta_{e,t} \qquad (5)$$

where:
• $\rho_j$ is the coefficient of the AR process
• $e_{t-j}$ is the sampling error at lag j
• $\eta_{e,t}$ is white noise

The standard assumption is independence of errors

$$E[\eta_{L,t}] = E[\eta_{R,t}] = E[\eta_{S,t}] = E[\eta_{e,t}] = 0$$

$$Var[\eta_{e,t}] = \sigma^2_e, \quad Var[\eta_{L,t}] = \sigma^2_L, \quad Var[\eta_{R,t}] = \sigma^2_R, \quad Var[\eta_{S,t}] = \sigma^2_S$$

BSM + Extended SSM

• Both signal and noise have their own measurement equation ($Z_{BSM,t}$ and $Z_{e,t}$) and state vector ($\alpha_{BSM,t}$ and $\alpha_{e,t}$) in the transition equation, ie
- measurement equation for signal: $\theta_t = Z_{BSM,t}\,\alpha_{BSM,t}$
- state vector for signal: $\alpha_{BSM,t} = (L_t, R_t, S_t)'$
- measurement equation for noise: $e_t = Z_{e,t}\,\alpha_{e,t} + \varepsilon_t$
- state vector for noise if AR(4): $\alpha_{e,t} = (e_t, e_{t-1}, e_{t-2}, e_{t-3})'$
• These two parts, signal and noise, are brought together to form a complete SSM.
The Specific SSM Proposed for UK LFS
State Vector ($L_t$, $R_t$, $S_t$, $e_t$)
• As survey responses from month t are included in estimates based on three representative samples centred at (t-1, t, t+1), the state vector will not only consider parameters at time t but will take all three time periods (t-1, t, t+1) into account
• Each of the original $L_t$, $R_t$, $S_t$ and $e_t$ will include:
for level: $L^*_{t-1}$, $L^*_t$ and $L^*_{t+1}$
for slope: $R^*_{t-1}$, $R^*_t$ and $R^*_{t+1}$
for seasonal: $S^*_{t-1}$, $S^*_t$ and $S^*_{t+1}$
for sample error: $e^*_{t-1}$, $e^*_t$ and $e^*_{t+1}$
State vector (Lt, Rt, St, et) - cont

• The slope (R) will be the same at the three different levels, so only the t+1 value is kept.
• Also, because there are five waves, and each wave includes three time periods (t-1, t, t+1) for sample error, there are in total 15 sample error state variables in our model.
• Total = 30 state variables for the state vector (15 for the signal and 15 for the sample error).
Survey error structure (et)
• Survey errors do not overlap in the wave-structured data but do appear between two quarters across waves (cohorts), ie
• someone interviewed in wave i at time t will be interviewed in wave (i+1) at time t+3 – the sample errors will be correlated, so are defined as follows.

$$\begin{aligned}
e_{t,w2} &= \rho\, e_{(t-3),w1} + \eta_{t,w2} \\
e_{t,w3} &= \rho\, e_{(t-3),w2} + \eta_{t,w3} \\
e_{t,w4} &= \rho\, e_{(t-3),w3} + \eta_{t,w4} \\
e_{t,w5} &= \rho\, e_{(t-3),w4} + \eta_{t,w5} \\
e_{t,w1} &= \rho^{*} e_{(t-3),w5} + \eta_{t,w1}
\end{aligned} \qquad (6)$$
Building the SSM for UK LFS

• Matrices are used as the basic method for building the SSM
• The main SSM matrices/vectors for UK LFS are:
• observation matrices (Z)
• transition matrices (T)
• covariance matrices (Q)
• state vectors ($\alpha_t$)
• disturbance vectors ($\eta_t$)
Observation matrices (Z)

• Observation matrices:
• $Z_{BSM}$ (signal) is a 5x15 matrix for SSM with dummy seasonality
• $Z_e$ (noise) is a 5x15 matrix

$$Z_{BSM} = \frac{1}{3}\begin{pmatrix}
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0 \\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0 \\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0 \\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0 \\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0
\end{pmatrix}$$

$$Z_e = \frac{1}{3}\begin{pmatrix} I_{5\times5} & I_{5\times5} & I_{5\times5} \end{pmatrix}$$
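The two observation matrices can be assembled in a few lines of NumPy. This is a sketch under an assumed ordering of the signal state (three levels, one slope, then eleven seasonal terms with the three current ones first); the variable names are illustrative.

```python
import numpy as np

# Assumed order of the 15 signal states:
# [L_{t-1}, L_t, L_{t+1}, R_{t+1}, S_{t-1}, S_t, S_{t+1}, S_{t-2}, ..., S_{t-9}]
n_waves, n_bsm = 5, 15

row = np.zeros(n_bsm)
row[[0, 1, 2]] = 1.0   # average the three levels
row[[4, 5, 6]] = 1.0   # average the three seasonal terms
Z_bsm = np.tile(row, (n_waves, 1)) / 3.0       # same signal for every wave

I5 = np.eye(n_waves)
Z_e = np.hstack([I5, I5, I5]) / 3.0            # each wave averages its own three errors

Z = np.hstack([Z_bsm, Z_e])                    # 5 x 30 observation matrix

# A flat level of 6 with no seasonal or error component is observed as 6 in every wave
alpha = np.zeros(30)
alpha[:3] = 6.0
observed = Z @ alpha
```

The slope column is zero in `Z_bsm` because the slope affects the observations only through the level.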
State vectors ($\alpha_t$, $\alpha_{t-1}$) and transition matrix (T): BSM

• State vectors ($\alpha_{BSM,t}$), transition matrix ($T_{BSM}$) + disturbance

$$\alpha_{BSM,t} = T_{BSM}\,\alpha_{BSM,t-1} + \eta_{BSM,t}$$

with

$$\alpha_{BSM,t} = (L_{t-1}, L_t, L_{t+1}, R_{t+1}, S_{t-1}, S_t, S_{t+1}, S_{t-2}, \ldots, S_{t-9})'$$
$$\eta_{BSM,t} = (0, 0, \eta^{L}_t, \eta^{R}_t, 0, 0, \eta^{S}_t, 0, \ldots, 0)'$$

$$T_{BSM} = \begin{pmatrix}
0&1&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&1&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&1&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&0&0&1&0&0&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&1&0&0&0&0&0&0&0&0 \\
0&0&0&0&-1&-1&-1&-1&-1&-1&-1&-1&-1&-1&-1 \\
0&0&0&0&1&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&0&1&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&0&0&1&0&0&0&0&0&0 \\
0&0&0&0&0&0&0&0&0&1&0&0&0&0&0 \\
0&0&0&0&0&0&0&0&0&0&1&0&0&0&0 \\
0&0&0&0&0&0&0&0&0&0&0&1&0&0&0 \\
0&0&0&0&0&0&0&0&0&0&0&0&1&0&0 \\
0&0&0&0&0&0&0&0&0&0&0&0&0&1&0
\end{pmatrix}$$
State vectors ($\alpha_t$, $\alpha_{t-1}$) and transition matrix (T): e

• State vectors ($\alpha_{e,t}$), transition matrix ($T_e$) + disturbance

$$\alpha_{e,t} = T_e\,\alpha_{e,t-1} + \eta_{e,t}$$

with

$$\alpha_{e,t} = (e^{w1}_{t-1}, \ldots, e^{w5}_{t-1},\; e^{w1}_{t}, \ldots, e^{w5}_{t},\; e^{w1}_{t+1}, \ldots, e^{w5}_{t+1})'$$
$$\eta_{e,t} = (0, \ldots, 0,\; \eta^{e,w1}_t, \eta^{e,w2}_t, \eta^{e,w3}_t, \eta^{e,w4}_t, \eta^{e,w5}_t)'$$

$$T_e = \begin{pmatrix}
0&0&0&0&0&1&0&0&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&1&0&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&0&1&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&0&0&1&0&0&0&0&0&0 \\
0&0&0&0&0&0&0&0&0&1&0&0&0&0&0 \\
0&0&0&0&0&0&0&0&0&0&1&0&0&0&0 \\
0&0&0&0&0&0&0&0&0&0&0&1&0&0&0 \\
0&0&0&0&0&0&0&0&0&0&0&0&1&0&0 \\
0&0&0&0&0&0&0&0&0&0&0&0&0&1&0 \\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&1 \\
0&0&0&0&\rho^{*}&0&0&0&0&0&0&0&0&0&0 \\
\rho&0&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&\rho&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&\rho&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&\rho&0&0&0&0&0&0&0&0&0&0&0
\end{pmatrix}$$
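Join Signal and Noise - block all matrices together
• Observation matrices: $Z = (Z_{BSM,t} \;\; Z_{e,t})$
• State vectors: $\alpha_t = (\alpha_{BSM,t} \;\; \alpha_{e,t})$
• Disturbance vectors: $\eta_t = (\eta_{BSM,t} \;\; \eta_{e,t})$
• Transition matrices:

$$T_{(30\times30)} = \begin{pmatrix} T_{BSM\,(15\times15)} & 0_{15\times15} \\ 0_{15\times15} & T_{e\,(15\times15)} \end{pmatrix}$$

where $T_{BSM}$ is the 15x15 matrix and $T_e$ is the 15x15 matrix

$$T_e = \begin{pmatrix} 0_{5\times5} & I_{5\times5} & 0_{5\times5} \\ 0_{5\times5} & 0_{5\times5} & I_{5\times5} \\ AR_{5\times5} & 0_{5\times5} & 0_{5\times5} \end{pmatrix}
\quad \text{with} \quad
AR_{5\times5} = \begin{pmatrix}
0&0&0&0&\rho^{*} \\
\rho&0&0&0&0 \\
0&\rho&0&0&0 \\
0&0&\rho&0&0 \\
0&0&0&\rho&0
\end{pmatrix}$$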

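Covariance matrices (Q)

$$Q_{(30\times30)} = \begin{pmatrix} Q_{BSM,t\,(15\times15)} & 0_{(15\times15)} \\ 0_{(15\times15)} & Q_{e,t\,(15\times15)} \end{pmatrix}$$

where

$$Q_{BSM,t\,(15\times15)} = \begin{pmatrix} Q_{\eta,t\,(L,R)\,(4\times4)} & 0_{4\times11} \\ 0_{11\times4} & \sigma^2_{S,t\,(11\times11)} \end{pmatrix}$$

and with

$$Q_{\eta,t\,(L,R)\,(4\times4)} = \begin{pmatrix} 0_{2\times2} & 0_{2\times2} \\ 0_{2\times2} & Q_{LR} \end{pmatrix},
\qquad
Q_{LR} = \begin{pmatrix} \sigma^2_L & 0 \\ 0 & \sigma^2_R \end{pmatrix}$$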
Covariance matrices (Q) – cont.

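and

$$Q_{e,t\,(15\times15)} = \begin{pmatrix} 0_{5\times5} & 0_{5\times5} & 0_{5\times5} \\ 0_{5\times5} & 0_{5\times5} & 0_{5\times5} \\ 0_{5\times5} & 0_{5\times5} & VC_{5\times5} \end{pmatrix}$$

with

$$VC_{(5\times5)} = \sigma^2_e\, I_{(5\times5)}$$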
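Because the disturbances are assumed independent, the covariance matrix is diagonal and can be built from a single vector. In the sketch below the position of the seasonal variance (the seventh state, matching where the seasonal disturbance enters) is an assumption, as are the function name and the numeric values.

```python
import numpy as np

def build_Q(sig2_L, sig2_R, sig2_S, sig2_e):
    """30x30 diagonal disturbance covariance matrix.

    Non-zero entries sit only where a disturbance enters the state:
    newest level, slope, newest seasonal, and the five newest wave errors.
    """
    q = np.zeros(30)
    q[2] = sig2_L        # eta_L enters the newest level
    q[3] = sig2_R        # eta_R enters the slope
    q[6] = sig2_S        # eta_S enters the newest seasonal (assumed position)
    q[25:30] = sig2_e    # VC = sig2_e * I_5 in the last error block
    return np.diag(q)

Q = build_Q(sig2_L=0.0, sig2_R=0.01, sig2_S=0.02, sig2_e=0.5)  # illustrative values
```

All other diagonal entries are zero because the corresponding states are pure copies of earlier values.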
The Model Estimate Setting
• As long as all SSM matrices are set appropriately and all parameters in the model are known, the state vector can be predicted, filtered and smoothed using the Kalman Filter
• In practice, these parameters are unknown, so all of them need to be initialised:
• $\alpha_{t-1}$ in the state vector
• $(\sigma^2_L, \sigma^2_R, \sigma^2_S, \sigma^2_e)$ in the disturbance matrix
• AR parameters ($\rho$ and $\rho^{*}$) in the transition matrix
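For readers unfamiliar with the Kalman Filter, a minimal textbook version of the prediction and update recursions for equations (1) and (2) is sketched below. It assumes time-invariant Z, T, H and Q and omits smoothing and likelihood evaluation; it is not the implementation used in the project.

```python
import numpy as np

def kalman_filter(y, Z, T, H, Q, a0, P0):
    """Return the filtered state means for observations y (one row per time point)."""
    a, P = a0.copy(), P0.copy()
    filtered = []
    for y_t in y:
        # Prediction step: push the state and its covariance through the transition
        a = T @ a
        P = T @ P @ T.T + Q
        # Update step: correct the prediction with the new observation
        v = y_t - Z @ a                      # one-step-ahead prediction error
        F = Z @ P @ Z.T + H                  # prediction error covariance
        K = P @ Z.T @ np.linalg.inv(F)       # Kalman gain
        a = a + K @ v
        P = P - K @ Z @ P
        filtered.append(a.copy())
    return np.array(filtered)

# Toy local-level example: near-noiseless observations are tracked almost exactly
y = np.array([[1.0], [2.0], [3.0]])
filtered = kalman_filter(
    y,
    Z=np.array([[1.0]]), T=np.array([[1.0]]),
    H=np.array([[1e-8]]), Q=np.array([[1.0]]),
    a0=np.array([0.0]), P0=np.array([[1e4]]),
)
```

The large initial variance `P0` plays the role of a diffuse prior, so the first observation dominates the first filtered estimate.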


Initialisation for ($\alpha_{t-1}$) in the state vector

• For non-stationary components:
• the means of the non-stationary components are initialised at zero
• the associated variances are initialised with a very large value (ie 10000)
• For stationary components:
• the mean of the stationary component (sampling error $e_t$) is initialised with its unconditional mean
• the variance of the stationary component is initialised with its own pseudo-error variance
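The initialisation rules above translate into an initial mean vector and a diagonal initial covariance matrix. The sketch assumes 15 non-stationary signal states followed by the sampling-error states, a zero unconditional mean for the sampling errors, and a hypothetical input `err_var` holding the pseudo-error variances.

```python
import numpy as np

def initial_state(err_var, n_bsm=15, diffuse_var=1e4):
    """Initial mean a0 and covariance P0 for the joint state vector.

    err_var: pseudo-error variances for the stationary sampling-error states
             (hypothetical input, one value per error state).
    """
    err_var = np.asarray(err_var, dtype=float)
    # Non-stationary states start at zero; the sampling errors start at their
    # unconditional mean, taken here to be zero
    a0 = np.zeros(n_bsm + err_var.size)
    # Large variance for non-stationary states, pseudo-error variance otherwise
    P0 = np.diag(np.concatenate([np.full(n_bsm, diffuse_var), err_var]))
    return a0, P0

a0, P0 = initial_state(np.full(15, 0.5))   # 0.5 is an illustrative variance
```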
Initialisation for ($\eta_t$) in the disturbance vector

• All of $\sigma^2_L$, $\sigma^2_R$, $\sigma^2_S$, $\sigma^2_e$ in the disturbance vector can be estimated using Maximum Likelihood in the model.
• There are two approaches to estimating these parameters:
• The hyper-parameters approach assumes that all parameters are unknown and are estimated simultaneously in the SSM model.
• The pseudo-error approach is different ...
Initialisation in the pseudo-error approach
Different approaches for different parameters
• $\eta_{L,t}$, $\eta_{R,t}$, $\eta_{S,t}$ are treated as unknown parameters, and $\eta_{e,t}$ is treated as a known parameter (estimated in a separate process)
• Initialised variance values for the unknown parameter vector (in Q matrices):
• $\sigma^2_{L,t} = 0$
• $\sigma^2_R$ and $\sigma^2_S$ are set based on a separate estimation process (‘Proc ucm’ in SAS, ‘StructTS’ in DLM/R)
• $\sigma^2_e$ is obtained based on calculation of the autocorrelation through the pseudo-error process
• AR parameters, $\rho$ and $\rho^{*}$, are estimated using Yule-Walker equations and substituted into SSM
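The Yule-Walker step can be illustrated on a simulated series. The sketch below is a generic AR(p) estimator applied to a made-up AR(1) process with coefficient 0.6; it does not reproduce the pseudo-error calculation, and the function name is hypothetical.

```python
import numpy as np

def yule_walker(x, p):
    """Estimate AR(p) coefficients and innovation variance from sample autocovariances."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased sample autocovariances at lags 0..p
    acov = np.array([x[: n - k] @ x[k:] / n for k in range(p + 1)])
    # Toeplitz system R * phi = r
    R = np.array([[acov[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(R, acov[1:])
    sigma2 = acov[0] - phi @ acov[1:]
    return phi, sigma2

# Simulated AR(1) series with coefficient 0.6 and unit innovation variance
rng = np.random.default_rng(1)
e = np.zeros(5000)
for t in range(1, len(e)):
    e[t] = 0.6 * e[t - 1] + rng.normal()

phi_hat, sigma2_hat = yule_walker(e, p=1)
```

The estimated coefficient would then be substituted into the transition matrix as a fixed value rather than being estimated inside the SSM.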
Simulation results

[Figures: 1. Trend prediction; 2. Seasonality prediction; 3. Sample error prediction]
Further work

The project is not complete – work remaining:
• test whether including (t-1, t, t+1) in the model is necessary (through comparison analysis)
• test whether the proposed method for sample error estimation is correct
• test an approach to estimating the AR(1) coefficients of the pseudo-error that is consistent with the one used in the SSM
• consider MA models
• consider including rotation group bias and the claimant count
Any questions?

PING.ZONG@ONS.GSI.GOV.UK
Appendix

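• The trigonometric seasonality model uses sines and cosines.

$$S_t = \sum_{j=1}^{6} \gamma_{j,t}$$

$$\gamma_{j,t} = \gamma_{j,t-1}\cos\lambda_j + \gamma^{*}_{j,t-1}\sin\lambda_j + \omega_{j,t}$$
$$\gamma^{*}_{j,t} = -\gamma_{j,t-1}\sin\lambda_j + \gamma^{*}_{j,t-1}\cos\lambda_j + \omega^{*}_{j,t}$$

where $E[\omega_{j,t}] = 0$, $E[\omega^{*}_{j,t}] = 0$, $Var[\omega_{j,t}] = Var[\omega^{*}_{j,t}] = \sigma^2_{\omega}$ and $\lambda_j = \frac{\pi j}{6}$ for j = 1,...,6.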
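Without disturbances, each harmonic is a pure rotation, so the seasonal pattern repeats every 12 months and sums to zero over a year. The sketch below checks this numerically; the starting values are arbitrary and the function name is hypothetical.

```python
import numpy as np

def trig_seasonal_step(gamma, gamma_star, s=12):
    """Advance all harmonics by one month with the disturbances set to zero."""
    j = np.arange(1, len(gamma) + 1)
    lam = 2.0 * np.pi * j / s                      # lambda_j = pi * j / 6 for s = 12
    new_gamma = gamma * np.cos(lam) + gamma_star * np.sin(lam)
    new_star = -gamma * np.sin(lam) + gamma_star * np.cos(lam)
    return new_gamma, new_star

# Arbitrary starting values for the six harmonics
gamma0 = np.array([1.0, 0.5, -0.3, 0.2, 0.1, -0.4])
gamma_star0 = np.array([0.2, -0.1, 0.4, 0.0, 0.3, 0.0])

g, gs = gamma0, gamma_star0
seasonal = []
for _ in range(12):
    seasonal.append(g.sum())                       # S_t is the sum of the gamma_j
    g, gs = trig_seasonal_step(g, gs)
```

After twelve steps `g` returns to `gamma0`, and the twelve values in `seasonal` sum to zero up to rounding error.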
The observation matrix ($Z_{BSM}$) (5 x 17 matrix)

$$Z_{BSM} = \frac{1}{3}\begin{pmatrix}
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0 \\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0 \\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0 \\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0 \\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0
\end{pmatrix}$$

• Total state vector ($\alpha_t$) - 32 variables:

$$\alpha_t = (L_{t-1}, L_t, L_{t+1}, R_{t+1}, S_{t-1}, S_t, \gamma_{1,t+1}, \ldots, \gamma_{6,t+1}, \gamma^{*}_{1,t+1}, \ldots, \gamma^{*}_{5,t+1},$$
$$\qquad e_{t-1,w1}, \ldots, e_{t-1,w5},\; e_{t,w1}, \ldots, e_{t,w5},\; e_{t+1,w1}, \ldots, e_{t+1,w5})'$$
The transition matrix ($T_{BSM}$) (17 x 17 matrix)

$$\alpha_{BSM,t} = T_{BSM}\,\alpha_{BSM,t-1} + \eta_{BSM,t}$$

with

$$\alpha_{BSM,t} = (L_{t-1}, L_t, L_{t+1}, R_{t+1}, S_{t-1}, S_t, \gamma_{1,t+1}, \ldots, \gamma_{6,t+1}, \gamma^{*}_{1,t+1}, \ldots, \gamma^{*}_{5,t+1})'$$
$$\eta_{BSM,t} = (0, 0, \eta^{L}_t, \eta^{R}_t, 0, 0, \omega_{1,t}, \ldots, \omega_{6,t}, \omega^{*}_{1,t}, \ldots, \omega^{*}_{5,t})'$$

and, writing $c_j = \cos\lambda_j$ and $s_j = \sin\lambda_j$,

$$T_{BSM} = \begin{pmatrix}
0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&1&1&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&1&0&0&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&0&0&1&0&0&0&0&0&0&0&0&0&0&0 \\
0&0&0&0&0&0&1&1&1&1&1&1&0&0&0&0&0 \\
0&0&0&0&0&0&c_1&0&0&0&0&0&s_1&0&0&0&0 \\
0&0&0&0&0&0&0&c_2&0&0&0&0&0&s_2&0&0&0 \\
0&0&0&0&0&0&0&0&c_3&0&0&0&0&0&s_3&0&0 \\
0&0&0&0&0&0&0&0&0&c_4&0&0&0&0&0&s_4&0 \\
0&0&0&0&0&0&0&0&0&0&c_5&0&0&0&0&0&s_5 \\
0&0&0&0&0&0&0&0&0&0&0&c_6&0&0&0&0&0 \\
0&0&0&0&0&0&-s_1&0&0&0&0&0&c_1&0&0&0&0 \\
0&0&0&0&0&0&0&-s_2&0&0&0&0&0&c_2&0&0&0 \\
0&0&0&0&0&0&0&0&-s_3&0&0&0&0&0&c_3&0&0 \\
0&0&0&0&0&0&0&0&0&-s_4&0&0&0&0&0&c_4&0 \\
0&0&0&0&0&0&0&0&0&0&-s_5&0&0&0&0&0&c_5
\end{pmatrix}$$
