
Journal of Applied Statistics

ISSN: 0266-4763 (Print) 1360-0532 (Online) Journal homepage: https://www.tandfonline.com/loi/cjas20

Integer autoregressive models with structural breaks

Akanksha S. Kashikar, Neelabh Rohan & T.V. Ramanathan

To cite this article: Akanksha S. Kashikar, Neelabh Rohan & T.V. Ramanathan (2013) Integer
autoregressive models with structural breaks, Journal of Applied Statistics, 40:12, 2653-2669, DOI:
10.1080/02664763.2013.823920

To link to this article: https://doi.org/10.1080/02664763.2013.823920

Published online: 01 Aug 2013.

Journal of Applied Statistics, 2013
Vol. 40, No. 12, 2653–2669, http://dx.doi.org/10.1080/02664763.2013.823920

Integer autoregressive models with structural breaks

Akanksha S. Kashikar^a, Neelabh Rohan^b* and T.V. Ramanathan^a

^a Department of Statistics & Centre for Advanced Studies, University of Pune, Pune 411 007, India;
^b Indian Statistical Institute, North East Centre, Tezpur, Assam 784028, India

(Received 1 April 2013; accepted 8 July 2013)

Even though integer-valued time series are common in practice, methods for their analysis have been
developed only in the recent past. Several models for stationary processes with discrete marginal
distributions have been proposed in the literature. Such models assume that the parameters remain
constant throughout the time period. However, this need not be true in practice. In this paper, we
introduce non-stationary integer-valued autoregressive (INAR) models with structural breaks to model
situations where the parameters of the INAR process do not remain constant over time. Such models are
useful for modelling count data time series with structural breaks. Bayesian and Markov chain Monte
Carlo (MCMC) procedures for the estimation of the parameters and break points of such models are
discussed. We illustrate the model and estimation procedure with the help of a simulation study. The
proposed model is applied to two real biometrical data sets.

Keywords: INAR models; Gibbs sampling; MCMC; structural break

Mathematics Subject Classification: 65C40; 62M10; 62P10

1. Introduction
Stationary integer-valued autoregressive (INAR) models, developed by Alzaid and Al-Osh [2],
Al-Osh and Alzaid [1] and McKenzie [23], have received considerable attention in modelling
time series of count data with discrete marginal distributions; see, for example, MacDonald and
Zucchini [20]. An important class of such models is formed when the counts are specified in
terms of a Poisson distribution; see Freeland and McCabe [13] and McCabe and Martin [21]. For
modelling the spread of a communicable disease, INAR models are preferred among the class of
count data time series models, as they have a natural interpretation in terms of the spread of the
disease [6]. For example, in the case of the INAR (1) model X_t = α_t ∘ X_{t−1} + ε_t, if X_{t−1} is the number of
new cases of the disease that develop during the time interval (t − 2, t − 1], then α_t will be the probability
that each of these new cases transmits the infection independently of the other X_{t−1} − 1 individuals,
and ε_t becomes the number of new cases arriving in the time interval (t − 1, t] infected from other

∗ Corresponding author. Email: neelabhrohan@gmail.com

© 2013 Taylor & Francis



sources. Discrete Autoregressive Moving Average (DARMA) models given by Jacobs and Lewis
[17,18] do not have such an interpretation. Recent applications of INAR models in economics
can be found in Böckenholt [3], Brännäs and Shahiduzzaman [4] and Gourieroux and Jasiak [14]
among others. Some of the applications of INAR models in medical sciences are available in
Cardinal et al. [6] and Franke and Seligmann [12]. Recently, Enciso-Mora et al. [10] considered
the inclusion of explanatory variables into the INAR models to extend the applicability of such
models. Zheng et al. [33] allowed the parameters of an INAR model to vary stochastically by
introducing the pth order random coefficient INAR models. However, they did not suggest any
method of estimation for such models. Taylor [30] considered a stationary Markov-switching
INAR model and applied it to financial time series data.
In some real situations, for example in epidemics, the parameters of the process do not remain
constant throughout the time period. Generally, the daily number of affected cases is very low in
the first phase of the epidemic; it then increases and finally decreases as the epidemic recedes,
either naturally or as a result of human intervention. Therefore, the stationary INAR models
may not provide a good fit in such a situation. Recently, Franke et al. [11] and Pap and Szabo
[25] considered the problem of testing the change point in the structure of an integer-valued time
series using cumulative sum tests. Szabo [28] introduced a change point detection procedure for
INAR (p) model. Neal and Subba Rao [24] also considered a piecewise constant INAR model.
However, these authors do not provide a formal procedure for the estimation of multiple break
points and other parameters in INAR models.
In this article, we propose an INAR model with structural breaks by allowing the parameters
of the process to vary over time. The process remains stationary within each regime, that is,
between two successive structural breaks, and is non-stationary in general. A Markov chain
Monte Carlo (MCMC) and Gibbs sampling approach has been suggested for the estimation of
the parameters and break points of the process. MCMC methods have been found to be useful
in several branches of statistics and are particularly well suited to integer-valued time series. The
suitability of the model and the estimation procedure has been established with the help of a
simulation study.
We apply the proposed model and the estimation procedure to two real biometrical data sets,
viz., Schizophrenic patient data and H1N1 data. Both the data sets witness a structural break,
which has been estimated by our model along with the other parameters. The fit of the model in
both the cases is found to be reasonably good.
The rest of the paper is organized as follows. In Section 2, we introduce the model and discuss
a Bayesian procedure for the estimation of parameters and break points. Section 3 reviews some
of the methods of model selection in the case of INAR models. Simulation results are reported
in Section 4. Section 5 deals with empirical applications of the model. Some concluding remarks
are given in Section 6. Details about the MCMC algorithm are provided in the appendix.

2. The model and estimation


Let {Yt} denote a non-negative integer-valued time series. Then, {Yt} is said to follow an INAR (p)
model [1] if

    Y_t = \sum_{i=1}^{p} \alpha_i \circ Y_{t-i} + \epsilon_t,

where {εt} is a sequence of non-negative integer-valued independent and identically distributed
(i.i.d.) random variables with mean λ and '∘' denotes Steutel and van Harn's convolution
operator (also known as the binomial thinning operator), defined by

    \alpha \circ X = \sum_{j=1}^{X} Z_j.

Here, the Zj's are independent Bernoulli variables with success probability α.
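Binomial thinning is straightforward to simulate directly from this definition. The following sketch is ours (not code from the paper) and the function names are illustrative; it generates a path of a stationary Poisson INAR (1) process, whose marginal mean is λ/(1 − α).

```python
import math
import random

def thin(alpha, x, rng):
    """Binomial thinning: alpha ∘ x is a sum of x independent Bernoulli(alpha) draws."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def poisson(lam, rng):
    """Poisson(lam) draw via Knuth's multiplication method (fine for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_inar1(alpha, lam, n, seed=0):
    """Simulate Y_t = alpha ∘ Y_{t-1} + eps_t with Poisson(lam) innovations."""
    rng = random.Random(seed)
    y = [poisson(lam, rng)]  # rough start; the stationary mean is lam/(1-alpha)
    for _ in range(1, n):
        y.append(thin(alpha, y[-1], rng) + poisson(lam, rng))
    return y

path = simulate_inar1(alpha=0.5, lam=2.0, n=500)
print(sum(path) / len(path))  # should be near lam/(1-alpha) = 4
```

The sample mean of a long path should hover around λ/(1 − α), which is a quick sanity check on the simulator.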


Suppose that the time series {Yt} has m break points and hence (m + 1) different regimes. Let
\boldsymbol{\tau}_m = (τ1, τ2, . . . , τm) denote the vector of unknown break points. Then, {Yt} is said to follow an
INAR (1) model with m break points if

    Y_t = \begin{cases}
        \alpha_1 \circ Y_{t-1} + \epsilon_t & \text{for } t \le \tau_1, \\
        \alpha_2 \circ Y_{t-1} + \epsilon_t & \text{for } \tau_1 < t \le \tau_2, \\
        \quad \vdots \\
        \alpha_{m+1} \circ Y_{t-1} + \epsilon_t & \text{for } t > \tau_m.
    \end{cases}    (1)

Let {εt} be a sequence of independent Poisson random variables with mean λ1 if t ≤ τ1,
λj if τ_{j−1} < t ≤ τj for j = 2, . . . , m, and λ_{m+1} if t > τm. We denote θ = (α, λ), where α =
(α1, α2, . . . , α_{m+1}) and λ = (λ1, λ2, . . . , λ_{m+1}). Notice that the process defined in Equation (1)
is non-stationary in general. However, it is stationary within each regime if 0 ≤ αj < 1, j =
1, 2, . . . , m + 1. We call the process {Yt} defined in Equation (1) an INAR (1) with m breaks.
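As a concrete illustration (our sketch, not code from the paper), a path of an INAR (1) process with m breaks can be simulated by switching the regime parameters (αj, λj) at the break points; the parameter values in the example follow model M1 of the simulation study in Section 4.

```python
import bisect
import math
import random

def poisson(lam, rng):
    """Poisson(lam) draw via Knuth's multiplication method."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def thin(alpha, x, rng):
    """Binomial thinning operator alpha ∘ x."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def simulate_inar1_breaks(alphas, lams, taus, n, seed=0):
    """Simulate Equation (1): regime j+1 applies for tau_j < t <= tau_{j+1}.

    alphas, lams : lists of length m+1 with the regime parameters
    taus         : sorted list of the m break points (1-based times)
    """
    rng = random.Random(seed)
    y = [poisson(lams[0], rng)]
    for t in range(2, n + 1):
        j = bisect.bisect_left(taus, t)  # 0-based regime index of time t
        y.append(thin(alphas[j], y[-1], rng) + poisson(lams[j], rng))
    return y

# One break at tau_1 = 50, as in model M1 of the simulation study
path = simulate_inar1_breaks([0.3, 0.5], [1.0, 2.0], [50], n=100)
```

`bisect_left` returns 0 for all t ≤ τ1 and 1 for t > τ1, matching the regime boundaries in Equation (1).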
An INAR (p) process with m breaks can be defined in a similar way as follows:

    Y_t = \begin{cases}
        \sum_{i=1}^{p} \alpha_{1i} \circ Y_{t-i} + \epsilon_t & \text{for } t \le \tau_1, \\
        \sum_{i=1}^{p} \alpha_{2i} \circ Y_{t-i} + \epsilon_t & \text{for } \tau_1 < t \le \tau_2, \\
        \quad \vdots \\
        \sum_{i=1}^{p} \alpha_{m+1,i} \circ Y_{t-i} + \epsilon_t & \text{for } t > \tau_m.
    \end{cases}

Towards the estimation of the break points, we define a new discrete random variable st, which
takes values in {1, 2, . . . , m + 1}. This random variable st represents the state of the system at
time t, i.e. st = k indicates that the observation Yt evolved in the kth regime. Thus, the model
in Equation (1) is essentially a state-space model. With this formulation we can see that st is a
Markov chain with the transition probability matrix
    P = \begin{pmatrix}
        p_{11} & p_{12} & 0      & \cdots & 0 \\
        0      & p_{22} & p_{23} & \cdots & 0 \\
        \vdots & \vdots & \ddots & \ddots & \vdots \\
        0      & 0      & \cdots & p_{mm} & p_{m,m+1} \\
        0      & 0      & \cdots & 0      & 1
    \end{pmatrix},

where p_{ij} = P(st = j | s_{t−1} = i) is the probability of moving to regime j at time t given that the
regime at time t − 1 was i, and p_{i,i+1} = 1 − p_{ii}. The sole purpose of this new formulation in the
form of a hidden Markov model [7,8] is to facilitate the estimation of the break points. Harvey
[15] had introduced a ‘state-space’ model for Poisson count time series by considering the state
equation as a random walk in the mean of the process. However, such a model cannot accommodate
the type of structural breaks introduced in Equation (1). This is because, in case of Equation (1),

the state of the process does not change at each time point. In fact, it remains the same until the
next break occurs.
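The structure of P (moves only to the next regime, with the last regime absorbing) is easy to make concrete. The helper below is our illustrative sketch; the stay probabilities p_ii passed in are hypothetical values, since in the estimation they are sampled rather than fixed.

```python
def break_transition_matrix(stay_probs):
    """Build the (m+1) x (m+1) transition matrix P for a model with m breaks.

    stay_probs: [p_11, ..., p_mm]; from regime i the chain either stays
    (probability p_ii) or moves to regime i+1, and regime m+1 is absorbing.
    """
    m = len(stay_probs)
    P = [[0.0] * (m + 1) for _ in range(m + 1)]
    for i, p in enumerate(stay_probs):
        P[i][i] = p            # stay in regime i
        P[i][i + 1] = 1.0 - p  # jump to the next regime
    P[m][m] = 1.0              # last regime is absorbing
    return P

P = break_transition_matrix([0.95, 0.9])  # m = 2 breaks, illustrative values
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)
```

Every row sums to one, and the zero pattern enforces that the regimes are visited in order, exactly as in the matrix displayed above.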
Let Yn = (Y1 , Y2 , . . . , Yn ) denote the time series data and Sn = (s1 , s2 , . . . , sn ) the states of all
the observations. We assume that s1 = 1 and sn = m + 1 (m < n), that is, the first observation
belongs to the first regime and the last observation to the last regime. The parameters to be
estimated are θ and \boldsymbol{\tau}_m (or Sn). Notice that estimating the vector of states Sn is equivalent to
estimating the break points.
In the simple, stationary Poisson INAR (1) model with parameters λ and α, Y1 ∼
Poisson(λ/(1 − α)). Furthermore, the conditional distribution of Yt given Yt−1 = yt−1 is a sum
of two independent random variables: a binomial random variable with parameters (α, yt−1 ) and
a Poisson(λ) random variable. Using these facts, we obtain

    f(Y_1 \mid s_1 = 1, \theta) = \frac{e^{-\lambda_1/(1-\alpha_1)}\,[\lambda_1/(1-\alpha_1)]^{Y_1}}{Y_1!},    (2)

    f(Y_t \mid Y_{t-1}, s_t = j, \theta) = \sum_{i=0}^{\min(Y_{t-1},\,Y_t)} \binom{Y_{t-1}}{i} \alpha_j^{i} (1-\alpha_j)^{Y_{t-1}-i} \, \frac{e^{-\lambda_j}\,\lambda_j^{Y_t-i}}{(Y_t-i)!} \quad \forall\, t \ge 2.    (3)

Using Equations (2) and (3), the likelihood function may be written as

    L(\boldsymbol{\tau}_m, \theta \mid Y_n) = \frac{e^{-\lambda_1/(1-\alpha_1)}\,[\lambda_1/(1-\alpha_1)]^{Y_1}}{Y_1!} \times \prod_{j=1}^{m+1} \prod_{t=\tau_{j-1}+1}^{\tau_j} \sum_{i=0}^{\min(Y_{t-1},\,Y_t)} \binom{Y_{t-1}}{i} \alpha_j^{i} (1-\alpha_j)^{Y_{t-1}-i} \, \frac{e^{-\lambda_j}\,\lambda_j^{Y_t-i}}{(Y_t-i)!},

where τ0 = 1 and τ_{m+1} = n. In general, for a Poisson INAR (p) model with m breaks, the likelihood
function will be

    L(\boldsymbol{\tau}_m, \Theta \mid Y_n) = \frac{e^{-\lambda_1/(1-\alpha_1)}\,[\lambda_1/(1-\alpha_1)]^{Y_1}}{Y_1!} \times \prod_{j=1}^{m+1} \prod_{t=\tau_{j-1}+1}^{\tau_j} \sum_{i_1,i_2,\ldots,i_p} \left[ \prod_{k=1}^{p} \binom{Y_{t-k}}{i_k} \alpha_{jk}^{i_k} (1-\alpha_{jk})^{Y_{t-k}-i_k} \right] \frac{e^{-\lambda_j}\,\lambda_j^{Y_t - \sum_{k=1}^{p} i_k}}{(Y_t - \sum_{k=1}^{p} i_k)!},    (4)

where (i_1, i_2, . . . , i_p) are such that 0 ≤ i_k ≤ min(Y_t, Y_{t−k}) and \sum_{k=1}^{p} i_k ≤ Y_t. We take Y_{t−k} = 0 if
t ≤ k. In this case, Θ = (θ_1, θ_2, . . . , θ_{m+1}), where θ_j = (α_j, λ_j) and α_j = (α_{j1}, α_{j2}, . . . , α_{jp}) denote
the parameters of the jth regime.
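For the INAR (1) case, the likelihood is cheap to evaluate once the break points are fixed. The log-likelihood sketch below is ours (not the authors' code); it combines Equation (2) for the first observation with the transition pmf of Equation (3) within each regime.

```python
import math

def trans_pmf(y_t, y_prev, alpha, lam):
    """f(Y_t | Y_{t-1}) for one regime, Equation (3)."""
    total = 0.0
    for i in range(min(y_prev, y_t) + 1):
        total += (math.comb(y_prev, i) * alpha**i * (1 - alpha)**(y_prev - i)
                  * math.exp(-lam) * lam**(y_t - i) / math.factorial(y_t - i))
    return total

def loglik_inar1_breaks(y, alphas, lams, taus):
    """log L(tau_m, theta | Y_n) for an INAR (1) with breaks.

    Regime j covers tau_{j-1} < t <= tau_j, with tau_0 = 1 and tau_{m+1} = n;
    Y_1 ~ Poisson(lam_1 / (1 - alpha_1)) as in Equation (2).
    """
    mu = lams[0] / (1 - alphas[0])
    ll = -mu + y[0] * math.log(mu) - math.lgamma(y[0] + 1)
    bounds = [1] + list(taus) + [len(y)]
    for j in range(len(alphas)):
        for t in range(bounds[j] + 1, bounds[j + 1] + 1):  # 1-based times
            ll += math.log(trans_pmf(y[t - 1], y[t - 2], alphas[j], lams[j]))
    return ll

y = [2, 3, 1, 4, 2, 6, 7, 5, 8, 6]  # toy series with a suspected break at t = 5
print(loglik_inar1_breaks(y, [0.3, 0.5], [1.0, 3.0], [5]))
```

Evaluating this function over candidate break points gives a direct, if brute-force, feel for where the likelihood concentrates; the MCMC scheme below explores the same surface more systematically.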
It may not be feasible to estimate the parameters of this model using the maximum likelihood
method. Therefore, the estimation of the model parameters and break points is carried out in three
steps using the following MCMC and Gibbs sampling scheme.

(1) Simulation of Sn from the posterior mass function of (Sn | Yn, Θ, P).
(2) Simulation of P from the posterior density of (P | Yn, Θ, Sn).
(3) Simulation of Θ from the posterior density of (Θ | Yn, P, Sn).

The simulation of Sn and P in Steps 1 and 2 of the above Gibbs sampling scheme is the same as in
Chib [8]. To proceed towards Step 3, we first assume independent priors for the parameters. Let p(Θ) denote

the prior for the parameters. Then,

    p(\Theta) = \prod_{j=1}^{m+1} p(\alpha_j) \times p(\lambda_j \mid a_j, b_j),

where (a_j, b_j) are constants. The priors for α_j and λ_j are

    p(\alpha_j) \propto \begin{cases} 1 & \text{if } \sum_{i=1}^{p} \alpha_{ji} < 1 \text{ and } 0 \le \alpha_{ji} < 1, \\ 0 & \text{otherwise}, \end{cases}    (5)

    \lambda_j \sim \text{Gamma}(a_j, b_j).

The priors for α_j are chosen to ensure the stationarity of the process in each regime. The conditional
posterior density can be written as

    p(\Theta \mid Y_n, P, S_n) \propto p(\Theta)\, L(\boldsymbol{\tau}_m, \Theta \mid Y_n).

Due to the complicated nature of the likelihood and joint posterior density, it is not possible
to sample directly from the conditional posterior densities of the parameters. Therefore, we use
the Metropolis-Hastings within Gibbs sampling technique [29] to sample from the conditional
posterior densities of all the parameters. We use the single-move Gibbs sampler algorithm, that is,
we sample from the individual conditional density of each parameter in one move. More details
about sampling from the conditional densities are given in the Appendix.
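To fix ideas, a single-move random-walk Metropolis-Hastings update for one regime's λ_j, conditional on the current states, might look as follows. This is our illustrative sketch, not the authors' code: it combines a Gamma(a_j, b_j) prior with the regime's likelihood terms from Equation (3), and the function names, proposal scale and toy data are hypothetical. The full sampler described in the Appendix also updates α_j, S_n and P.

```python
import math
import random

def log_target_lambda(lam, a, b, pairs, alpha):
    """Log of the Gamma(a, b) prior kernel times the regime's likelihood terms."""
    if lam <= 0:
        return -math.inf
    lp = (a - 1) * math.log(lam) - b * lam  # Gamma(a, b) prior kernel
    for y_t, y_prev in pairs:               # Equation (3) contributions
        s = 0.0
        for i in range(min(y_prev, y_t) + 1):
            s += (math.comb(y_prev, i) * alpha**i * (1 - alpha)**(y_prev - i)
                  * math.exp(-lam) * lam**(y_t - i) / math.factorial(y_t - i))
        lp += math.log(s)
    return lp

def mh_step_lambda(lam, a, b, pairs, alpha, rng, step=0.5):
    """One random-walk Metropolis-Hastings update of lambda_j."""
    prop = lam + rng.gauss(0.0, step)
    log_ratio = (log_target_lambda(prop, a, b, pairs, alpha)
                 - log_target_lambda(lam, a, b, pairs, alpha))
    return prop if math.log(rng.random()) < log_ratio else lam

rng = random.Random(3)
pairs = [(3, 2), (4, 3), (2, 4), (5, 2)]  # toy (Y_t, Y_{t-1}) pairs in one regime
lam = 1.0
for _ in range(200):
    lam = mh_step_lambda(lam, a=1.0, b=1.0, pairs=pairs, alpha=0.3, rng=rng)
```

Proposals falling at or below zero are automatically rejected, since the log target is minus infinity there, so the chain stays on the support of λ_j.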

3. Selection of m and p
The technique described above assumes that the number of breaks m is known. However, when it
is not known, models with different numbers of break points may be compared to decide
upon the number of breaks [8]. Also, the hypothesis testing approach developed by Rohan and
Ramanathan [26] can be adopted for the selection among different break point models.
To select the order of the INAR model, various criteria are available in the literature. Bu and
McCabe [5] have suggested that the order of the INAR model can be decided using the sample
autocorrelation/partial autocorrelation functions or using the Akaike/Bayesian information cri-
terion (AIC/BIC). Cardinal et al. [6] used relative forecast error (RFE) criterion to measure the
forecasting performance of the model. The RFE is defined as

    \text{RFE} = \frac{|\text{observed value} - \text{forecasted value}|}{\text{observed value}}.    (6)
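For instance (our sketch, with illustrative numbers), the single-period RFE and the summary version of Cardinal et al. [6] based on sums:

```python
def rfe(observed, forecast):
    """Relative forecast error for a single period, Equation (6)."""
    if observed == 0:
        raise ValueError("RFE is undefined when the observed value is zero")
    return abs(observed - forecast) / observed

def summary_rfe(observed, forecasts):
    """Summary RFE: sums of values in place of the single-period values."""
    return abs(sum(observed) - sum(forecasts)) / sum(observed)

print(rfe(10, 8))                      # → 0.2
print(summary_rfe([10, 20], [8, 25]))  # → 0.1
```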

RFE is well defined only if the observed value is nonzero. It can be used to select the optimal
model from different classes of models by comparing their forecasting potential. It can also be used
for order selection in INAR models. Cardinal et al. [6] used a summary measure by putting the
sum of observed values and the sum of forecasted values in place of the 'observed value' and
'forecasted value' for a single period in Equation (6). We may use the average values as well. Another
criterion may be obtained by replacing the absolute difference in the numerator on the right-hand
side of Equation (6) by the squared difference. Silva [27] studied the following three criteria for model
selection. The value of p, which minimizes the final prediction error (FPE), AIC or the corrected

Akaike information criterion (AICC), is chosen, where

    \text{FPE}(p) = \hat{V}_p \left(1 + \frac{2p}{N}\right),

    \text{AIC}(p) = N \log(\hat{V}_p) + 2(p + 1)

    \text{and} \quad \text{AICC}(p) = N \log(\hat{V}_p) + N\,\frac{1 + p/N}{1 - (p + 2)/N}.

Here, N is the total number of observations and \hat{V}_p is the estimator of the variance of the one-step-ahead
linear prediction error. This may be estimated in one of the following ways:

    \hat{V}_p = \hat{\lambda}\,\frac{1 - \sum_{i=1}^{p} \hat{\alpha}_i^2}{1 - \sum_{i=1}^{p} \hat{\alpha}_i}    (7)

or

    \hat{V}_p = \hat{R}(0) - \sum_{i=1}^{p} \hat{\alpha}_i \hat{R}(i)    (8)

or

    \hat{V}_p = \text{Var}\left( X_t - \sum_{i=1}^{p} \hat{\alpha}_i X_{t-i} - \hat{\lambda} \right), \quad t = p + 1, \ldots, N,    (9)

where R̂ (i) is the sample autocovariance function at lag i of the observation set. For INAR (p)
models, it is difficult to calculate the likelihood function. Instead, the spectral density function
can be easily computed. Therefore, Hurvich and Tsai [16] used the Whittle criterion [31,32]
to approximate the likelihood through the spectral density function of the model and deduced
the AICC criterion on the basis of this approximation. AIC and FPE do not have such strong
justification. Hence, the AICC is preferred to AIC.
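Putting the pieces together, the criteria are simple functions of a fitted (α̂, λ̂). The sketch below is ours, with illustrative parameter values; it uses Equation (7) for V̂_p and the Hurvich-Tsai form of the AICC with the factor N in the penalty term.

```python
import math

def v_hat_eq7(alphas, lam):
    """One-step prediction error variance estimate, Equation (7)."""
    return lam * (1 - sum(a * a for a in alphas)) / (1 - sum(alphas))

def fpe(v, p, n):
    """Final prediction error."""
    return v * (1 + 2 * p / n)

def aic(v, p, n):
    """Akaike information criterion."""
    return n * math.log(v) + 2 * (p + 1)

def aicc(v, p, n):
    """Corrected AIC; the penalty grows sharply as p approaches n."""
    return n * math.log(v) + n * (1 + p / n) / (1 - (p + 2) / n)

v1 = v_hat_eq7([0.3], lam=2.0)        # hypothetical INAR (1) fit
v2 = v_hat_eq7([0.25, 0.1], lam=1.8)  # hypothetical INAR (2) fit
n = 100
for p, v in [(1, v1), (2, v2)]:
    print(p, round(fpe(v, p, n), 3), round(aic(v, p, n), 3), round(aicc(v, p, n), 3))
```

Whichever order p minimizes the chosen criterion is selected; with these hypothetical fits, the comparison mirrors the per-regime calculations reported in Table 6.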

4. Simulation study
We carried out a simulation study to assess the performance of the proposed model and the
Bayesian estimation procedure. Three data sets were generated, each of size 100 from three
different Poisson INAR models with structural breaks: an INAR (1) with one break (M1 ), INAR
(1) with two breaks (M2 ) and an INAR (2) with one break (M3 ). The details about the actual values
of the parameters of all the three models chosen for the simulation study are given in Table 1.
Here, the notations used for the parameters are the same as those defined in Section 2 for the INAR
(p) model with breaks. For each data set, the estimates of the parameters and break points were
obtained using an MCMC run of 10,000 iterations, after discarding a burn-in sample of 10,000 to ensure
the convergence of the chain. All the estimates were obtained after ensuring the convergence and
good mixing of the Gibbs sampler chains. The trace and autocorrelation function plots of all
the converged chains are available from the authors on request. The priors chosen for the
parameters are the same as in Equation (5). The values of aj and bj in the prior of λj were taken as 1 for all j.
The posterior means and standard deviations of the parameters obtained from the Gibbs sam-
pling run are reported in Table 1. All the parameter estimates are close to their actual values. The
break points are efficiently estimated in all the three data sets. These results support the validity
of the proposed model and the estimation procedure.
The above simulation study has been carried out to judge the performance of the proposed model
and estimation procedure for different number of break points and parameters in the model. To
Table 1. Simulation results using models M1 to M3.

Model*  α11    α21    α12    α22    α31    λ1     λ2     λ3     Break 1  Break 2
M1      0.3    0.5    –      –      –      1.0    2.0    –      50       –
        0.261  0.543  –      –      –      1.008  1.951  –      49.73    –
        0.157  0.087  –      –      –      0.120  0.594  –      0.863    –
M2      0.5    0.75   –      –      0.75   0.15   2.0    3.5    30       50
        0.488  0.753  –      –      0.743  0.169  1.940  3.537  30.240   52.141
        0.184  0.189  –      –      0.142  0.124  0.347  1.036  0.521    1.769
M3      0.1    0.2    0.6    0.7    –      0.5    2.0    –      50       –
        0.142  0.175  0.560  0.690  –      0.563  1.819  –      49.345   –
        0.098  0.093  0.248  0.300  –      0.045  0.346  –      2.297    –

Note: *For each model, the first, second and third rows represent the actual value, posterior mean and posterior standard
deviation of parameters, respectively.

Table 2. Simulation results using models M4 to M11.

Model*  α11    α21    λ1     λ2     Break point
M4      0.1    0.9    1.0    1.0    50
        0.080  0.910  0.987  1.182  50.521
        0.065  0.032  0.205  0.408  0.134
M5      0.3    0.7    1.0    1.0    50
        0.238  0.671  1.192  1.366  46.081
        0.099  0.081  0.201  0.342  3.531
M6      0.7    0.3    1.0    1.0    50
        0.667  0.225  1.393  1.070  50.011
        0.066  0.108  0.259  0.197  0.731
M7      0.9    0.1    1.0    1.0    50
        0.880  0.074  1.161  1.146  49.998
        0.031  0.074  0.268  0.172  0.014
M8      0.5    0.5    0.5    2.0    50
        0.701  0.363  0.301  2.371  49.840
        0.080  0.120  0.096  0.476  0.499
M9      0.5    0.5    1.0    2.0    50
        0.501  0.582  1.094  1.964  45.622
        0.119  0.084  0.327  0.463  1.411
M10     0.5    0.5    2.0    0.5    50
        0.541  0.393  1.727  0.430  55.013
        0.088  0.152  0.370  0.113  3.809
M11     0.5    0.5    2.0    1.0    50
        0.536  0.349  2.064  0.979  48.557
        0.103  0.109  0.479  0.192  2.613

Note: *For each model, the first, second and third rows represent the actual value,
posterior mean and posterior standard deviation of parameters, respectively.

check the validity of the proposed estimation procedure further, another simulation study was
carried out. For computational simplicity, we have restricted ourselves to the INAR (1) model
with one break. However, several combinations of the parameter values were chosen to ensure
the wide applicability of the procedure. Also, the parameter values were chosen in such a way
Table 3. Simulation results using models M12 to M15 (with changed priors).

Model*  α11    α21    λ1     λ2     Break point
M12     0.5    0.5    0.5    2.0    50
        0.360  0.394  0.444  2.289  48.142
        0.142  0.124  0.127  0.495  1.744
M13     0.5    0.5    1.0    2.0    50
        0.470  0.447  1.903  2.349  44.599
        0.146  0.094  0.317  0.451  1.571
M14     0.5    0.5    2.0    0.5    50
        0.442  0.401  2.335  0.608  48.913
        0.138  0.143  0.688  0.168  5.739
M15     0.5    0.5    2.0    1.0    50
        0.466  0.501  2.441  0.866  49.769
        0.111  0.102  0.537  0.178  2.475

Note: *For each model, the first, second and third rows represent the actual value,
posterior mean and posterior standard deviation of parameters, respectively.

that they are not far away from each other in the two adjacent regimes. The purpose behind this is
to ensure that using the suggested procedure, it is possible to estimate the break point accurately,
even if the break point is not so prominent. For example, in models M4 to M7 (Table 2), the
parameters λ1 and λ2 are kept the same in both the regimes and various combinations of α11
and α21 were used to generate the data. All the data sets generated were of size 100 with a break
at 50. Similarly, in models M8 to M11, parameters α11 and α21 were kept the same, but λ was
allowed to change. For all these models, we used the same priors as in models M1 to M3. It
is evident from Table 2 that the Bayes estimates of the parameters are quite close to their actual
values. In particular, the estimates of the break points are not far from their actual values, even for
the models in which there is little difference between the parameter values of the two
regimes. This is noteworthy, as the model is able to capture the break point even if the change is
moderate. This feature may be quite useful for estimation and forecasting in real situations,
where the break points are often not very obvious.
In any Bayesian procedure, the strength of the influence of the priors is a matter for investigation.
In all the above simulation studies, the prior chosen for the α parameters was very weak, as defined
in Equation (5). However, the priors for λ and the transition probabilities were Gamma (a, b)
and Beta (23, 1), respectively, which are relatively stronger. Therefore, to check the influence
of priors, we used weaker priors to estimate the models M12 to M15. Notice that the parameters
of these models are the same as those of models M8 to M11, in which the λ parameters vary and
the α parameters remain constant. The priors for λ and the transition probabilities were taken as the
indicator function of λ > 0 and the Uniform (0, 1), respectively. We did not repeat the estimation
procedure for models M4 to M7, since in these models the λ's are kept constant and the α parameters are
allowed to vary; we already have weak priors for α in all the simulations. The estimation results
are reported in Table 3. It can be seen that the stronger priors have very little, in fact no, influence
on our estimation procedure. The results of Table 3 are more or less similar to those of
Table 2. Also, the estimation procedure was carried out with different starting values and the
final estimates remained unaffected in all the cases.

5. Empirical applications
The main purpose of the proposed model and estimation procedure in Section 2 is to apply it to
the real data sets. We have shown the applications of the proposed model with the help of two

Figure 1. Schizophrenic patient data.

Table 4. Parameter estimates for Schizophrenic patient data.

Parameter Posterior mean Posterior standard deviation

α1 0.347 0.039
α2 0.300 0.094
λ1 44.203 2.709
λ2 23.411 3.271
Break point 79.923 1.124

real data sets. The notations used in both the applications are the same as those of the INAR (1) model
with structural breaks defined in Equation (1).

5.1 Schizophrenic patient data


This data set contains the daily observations of the score achieved by a schizophrenic patient on a
test of perceptual speed [22,24]. The data consist of 120 consecutive daily scores. From the 61st
day onwards the patient began receiving a tranquilizer (chlorpromazine) that could be expected
to reduce perceptual speed. The data are presented in Figure 1. The data indicate a reduction
in the patient's score after the 60th day, and after the 80th day the speed is significantly reduced. Neal and
Subba Rao [24] guessed the structural break to be at the 60th day and hence fitted two different INAR
(1) models to the two parts of the data. However, they did not provide any method to formally
estimate the break point.
We fitted the INAR (1) model with a break to these data and estimated the break point as well
as the parameters in both the regimes. The priors used for αj, j = 1, 2, in both the regimes were Beta
(1,1). For λ1 and λ2, we used Gamma (40,1) and Gamma (20,1) priors, respectively. A burn-in
period of 10,000 was considered. The posterior estimates along with their standard deviations
are reported in Table 4. All the estimates have quite low standard deviations. The break point is
estimated at the 80th point and not at 60 as guessed by Neal and Subba Rao [24]. The histogram
and posterior density of the break point in Figure 2 are heavily concentrated at the 80th point,
showing a significant break. Notice that after getting the tranquilizer, the patient's score started
decreasing, which was the basis on which Neal and Subba Rao [24] guessed the break. However,
a mere decrease in the score cannot be regarded as a break point. It is also evident from the plot of the
scores (Figure 1) that the actual significant decrease in the score of the patient occurred after the 80th day.

Figure 2. Histogram and posterior density of the break point (Schizophrenic data).
Figure 3. Fit of the model (Schizophrenic data).

This is equivalent to saying that although the tranquilizer started taking effect from the very first day,
the patient showed significant improvement only 20 days after first receiving it, that is, after the 80th day.
The fit of the model, as shown in Figure 3, supports our model with a break at 80. The red curve
in the plot shows the fitted values. The fit is quite good and is able to capture all the jumps in the
data correctly. Moreover, we also fitted the INAR (1) model with a break at 60 and found that
the mean squared error of the predicted values of the INAR (1) model with a break at 60 is 139.045,
while that of the model with a break at 80 is 117.452. This clearly indicates the better performance of
the fitted INAR (1) model with a break at 80.
Figure 4 shows the time series plots of the Gibbs sampling chains of parameters. The MCMC
algorithm exhibits good mixing. The posterior densities of the parameters are given in Figure 5.
The posterior densities of the parameters αj, j = 1, 2, do not differ greatly. The change
in regime is taken care of by the parameters λj, j = 1, 2, which are significantly different in
the two regimes.

Figure 4. Time series plots of the parameter chains in MCMC (Schizophrenic data).
Figure 5. Posterior densities of parameters in both the regimes (Schizophrenic data).

5.2 H1N1 data


The first H1N1 (swine flu) pandemic death in India was recorded in Pune, in August 2009. The
spread was very rapid and the number of deaths among the affected was alarmingly high. This
created a panic, and the local government authorities immediately attempted to curb the spread in
various ways. Kale et al. [19] modelled the number of daily cases of H1N1 in Pune with the help
of several time series models, including INAR. However, none of the stationary INAR models
provided a satisfactory fit.
The data considered here consist of the daily number of cases of H1N1 in Pune, from 15
July to 31 August 2009. The latter half of the time series plot of the data (Figure 6) indicates a

Figure 6. H1N1 data.

Table 5. Parameter estimates for H1N1 data.

Parameter Posterior mean Posterior standard deviation

α1 0.315 0.114
α2 0.275 0.036
λ1 3.601 0.653
λ2 19.332 1.209
Break point 21.598 0.493
Figure 7. Time series plots of the parameter chains in MCMC (H1N1 data).

sudden increase in the number of cases, indicating a structural break. In such a situation, fitting
a simple INAR model to the entire data may result in erroneous forecasts and hence can mislead
the authorities in future planning to fight the epidemic.
We fitted an INAR (1) model with one structural break to the given data. The priors assumed
for the parameters αj, j = 1, 2, were the same as for the schizophrenic patient data, that is, Beta (1,1),
while for λ1 and λ2 the priors were Gamma (2,1) and Gamma (15,1), respectively. The estimates of all the
parameters are reported in Table 5. All the estimates were obtained after the convergence of the
Gibbs sampling chain. We left a burn-in sample of 10,000. The time series plots of the parameter

Figure 8. Posterior densities of the parameters in both the regimes (H1N1 data).

chains are shown in Figure 7. These plots indicate that the Gibbs sampling chains are well mixed.
Figure 8 depicts the posterior densities of the parameters in both the regimes. All the posterior
densities are unimodal and approximately symmetric. There is not much change in the posterior
density of the parameters αj, j = 1, 2, in the two regimes. However, the parameter λj shows
a highly significant shift from regime 1 to regime 2.
The estimate of the break point is obtained at 22. Figure 9 represents the histogram and the
posterior density plot of the break point. The posterior mode is clearly 22, which is very close to
the posterior mean of the break point distribution. The break point density is bimodal; however,
Figure 9. Histogram and posterior density of the break point (H1N1 data).
Table 6. Model selection criteria for H1N1 data.

        Regime 1                        Regime 2
        Model     FPE    AIC    AICC    Model     FPE     AIC     AICC
V1      INAR(1)   5.19   36.66  33.88   INAR(1)   26.47   90.53   87.69
        INAR(2)   5.91   39.66  35.01   INAR(2)   28.38   92.60   87.86
V2      INAR(1)   9.22   48.75  45.97   INAR(1)   507.39  170.26  167.43
        INAR(2)   8.90   48.24  43.60   INAR(2)   525.44  171.40  166.67
V3      INAR(1)   9.56   49.50  46.72   INAR(1)   485.40  169.07  166.23
        INAR(2)   8.87   48.18  43.53   INAR(2)   463.18  168.00  163.26
Figure 10. Fit of the model (H1N1 data).

both the modes are at the subsequent points (21 and 22). It is evident from Figure 6 that after 21st
day, the number of positive cases increased drastically. This drastic change is easily captured in
our model as indicated by high estimated value of the parameter λ2 .
To justify the choice of INAR (1) model, we compute the model selection criteria described in
Section 3 (see Table 6) for the INAR (1) and INAR (2) models in both the regimes. The plots of
autocorrelation and partial autocorrelation functions in the two regimes did not suggest the higher
order INAR models for the given data. The results are reported in Table 6. Here, V 1, V 2 and V 3
represent the V̂p , computed using the three formulae in Equations (7)–(9), respectively. Notice
that most of the criteria are suggesting the INAR (1) model in both the regimes.
The fit of the model is reasonably good, as depicted in Figure 10. The model is able to capture the peaks and troughs correctly. The effect of the structural break model is clearly visible, as the fitted values also shift towards the higher side in the second regime.

6. Concluding remarks
In this paper, we have developed the Poisson INAR (p) process with structural breaks and applied it to two biometrical data sets. The proposed procedure can be applied to any non-negative integer-valued time series in which structural breaks are possible. It can be very useful in modelling the number of cases affected by an epidemic, since such data sets generally exhibit breaks, as seen in the H1N1 data. The techniques used here can be easily adapted to other INAR processes, such as the geometric and negative binomial. It would also be interesting to extend the proposed methodology to general integer autoregressive moving average processes

with structural breaks. Studying some of the probabilistic properties of such processes is also worth
pursuing.

Acknowledgements
The authors are thankful to the two anonymous referees for their comments.

References
[1] M. Al-Osh and A.A. Alzaid, An integer-valued pth-order autoregressive structure (INAR(p)) process, J. Appl. Probab.
27 (1990), pp. 314–347.
[2] A.A. Alzaid and M. Al-Osh, First-order integer-valued autoregressive (INAR(1)) processes, J. Time Series Anal.
8 (1987), pp. 261–275.
[3] U. Böckenholt, Analysing state dependences in emotional experiences by dynamic count data models, J. Appl. Stat.
52 (2003), pp. 213–226.
[4] K. Brännäs and Q. Shahiduzzaman, Integer-valued moving average modelling of the number of transactions in
stocks, Working Paper Economic Studies 637, Umeå University, Umeå, 2004.
[5] R. Bu and B. McCabe, Model selection, estimation and forecasting in INAR(p) models: A likelihood-based Markov
chain approach, Int. J. Forecast. 24 (2008), pp. 151–162.
[6] M. Cardinal, R. Roy, and J. Lambert, On the application of integer-valued time series models for the analysis of
disease incidence, Stat. Med. 18 (1999), pp. 2025–2039.
[7] S. Chib, Calculating posterior distributions and modal estimates in Markov mixture models, J. Econ. 75 (1996),
pp. 79–98.
[8] S. Chib, Estimation and comparison of multiple change point models, J. Econ. 86 (1998), pp. 221–241.
[9] S. Chib and E. Greenberg, Understanding the Metropolis-Hastings algorithm, Am. Stat. 49 (1995), pp. 327–335.
[10] V. Enciso-Mora, P. Neal, and T. Subba Rao, Integer valued AR processes with explanatory variables, Sankhya 71
(2009), pp. 248–263.
[11] J. Franke, C.T. Kirch, and J.T. Kamgaing, Change points in time series of counts, J. Time Series Anal. 33 (2012),
pp. 757–770.
[12] J. Franke and T. Seligmann, Conditional maximum likelihood estimates for INAR(1) processes and their application
to modelling epileptic seizure counts, in Time Series Analysis, T. Subba Rao, ed., Chapman and Hall, London, 1993,
pp. 310–330.
[13] R. Freeland and B. McCabe, Analysis of low count time series data by Poisson autoregression, J. Time Series Anal.
25 (2004), pp. 701–722.
[14] C. Gourieroux and J. Jasiak, Heterogeneous INAR(1) model with application to car insurance, Insur.: Math. Econ.
34 (2004), pp. 177–192.
[15] A. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press,
Cambridge, 1989.
[16] C. Hurvich and C.L. Tsai, Regression and time series model selection in small samples, Biometrika 76 (1989),
pp. 297–307.
[17] P.A. Jacobs and P.A.W. Lewis, Discrete time series generated by mixtures I: Correlational and runs properties,
J. R. Stat. Soc. B 40 (1978), pp. 94–105.
[18] P.A. Jacobs and P.A.W. Lewis, Discrete time series generated by mixtures II: Asymptotic properties, J. R. Stat. Soc.
B 40 (1978), pp. 222–228.
[19] M.M. Kale, T.V. Ramanathan, K. Amrutkar, R. Chauhan, A. Kashikar, and R. Rizhwani, Statistical Modelling and
Analysis of Influenza Data, Project report submitted to the National Institute of Virology, Pune, 2010.
[20] I.L. MacDonald and W. Zucchini, Hidden Markov and Other Models for Discrete-Valued Time Series, Chapman and
Hall, London, 1997.
[21] B.P.M. McCabe and G.M. Martin, Bayesian predictions of low count time series, Int. J. Forecast. 22 (2005),
pp. 315–330.
[22] R. McCleary and R.A. Hay, Applied Time Series Analysis for the Social Sciences, Sage Publications, London, 1980.
[23] E. McKenzie, Some simple models for discrete variate time series, J. Am. Water Resour. Assoc. 21 (1985),
pp. 645–650.
[24] P. Neal and T. Subba Rao, MCMC for integer-valued ARMA processes, J. Time Series Anal. 28 (2007), pp. 92–110.
[25] G. Pap and T.T. Szabo, Change detection in INAR(p) processes against various alternative hypotheses, Commun.
Stat. – Theory and Methods 42 (2013), pp. 1386–1405. Available at http://arxiv.org/abs/1111.2532
[26] N. Rohan and T.V. Ramanathan, Asymmetric volatility models with structural breaks, Commun. Stat. Simul. Comput.
41 (2012), pp. 1519–1543.

[27] I. Silva, Contributions to the analysis of discrete-valued time series, Ph.D. thesis, University of Porto, 2005.
[28] T.T. Szabo, Test statistics for parameter changes in INAR (p) models and a simulation study, Aust. J. Stat. 40 (2011),
pp. 265–280.
[29] M.A. Tanner, Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood
Functions, Springer-Verlag, New York, 1996.
[30] N. Taylor, Autoregressive Hidden Markov Switching Models of Count Data, Cardiff Business School, Cardiff
University, Cardiff, 2002 (preprint).
[31] P. Whittle, Estimation and information in stationary time series, Ark. Mat. 2 (1953), pp. 423–434.
[32] P. Whittle, Some recent contributions to the theory of stationary processes (appendix 2), in A Study in the Analysis
of Stationary Time Series, H. Wold, ed., Almquist and Wiksell, Stockholm, 1954, pp. 196–228.
[33] H. Zheng, I.V. Basawa, and S. Datta, Inference for pth order random coefficient integer-valued autoregressive
processes, J. Time Series Anal. 27 (2006), pp. 411–440.

Appendix. Details of the MCMC algorithm


In this appendix, we provide details about sampling from the conditional posterior densities in the three steps described in Section 2. Let St = (s1 , s2 , . . . , st ) denote the states up to time t and St+1 = (st+1 , st+2 , . . . , sn ) the states of the system after time t. Similarly, let Yt = (Y1 , Y2 , . . . , Yt ) denote the data up to time t. We then proceed recursively, first specifying initial values of the parameters Θ and the transition probability matrix P, through the following three steps:
Step 1: Simulation of Sn
Simulation of Sn is completed by drawing a sequence of values of st from {1, 2, . . . , m + 1} using the mass function p(st | Yn , St+1 , Θ, P), which can be written as

p(st | Yn , St+1 , Θ, P) ∝ p(st | Yt , Θ, P) p(st+1 | st , P).    (A1)
The mass function in Equation (A1) is written in a reverse time order as this makes it easier to sample st for each t. Notice
that in the reverse time order, it is possible to write the conditional mass function as proportional to two nice probabilities.
More details about the method can be found in Chib [8, Section 2]. The second probability here can be readily obtained
from the transition probability matrix. The first probability is calculated recursively for all t, using the following steps:
(1) Prediction step:

    p(st = k | Yt−1 , Θ, P) = Σ_{j=k−1}^{k} pjk p(st−1 = j | Yt−1 , Θ, P).

(2) Update step:

    p(st = k | Yt , Θ, P) = p(st = k | Yt−1 , Θ, P) f (Yt | Yt−1 , st = k, Θ) / [ Σ_{j=k−1}^{k} p(st = j | Yt−1 , Θ, P) f (Yt | Yt−1 , st = j, Θ) ].

The calculation is initialized at t = 1 by setting p(s1 | Y0 , Θ, P) to be the mass function degenerate at 1, which yields p(s1 | Y1 , Θ, P) via the update step. In the update step, p(st = j | Yt−1 , Θ, P), j = k − 1, k, is obtained from the prediction step, and f (Yt | Yt−1 , st = j, Θ) is calculated using Equation (3) for all t. In this way, the update-step probabilities are computed for every t. With these probabilities, the states are simulated according to Equation (A1), starting from time n, setting sn equal to m + 1 and working backwards.
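The prediction, update and backward-sampling recursions of Step 1 can be sketched as follows. The function is generic: cond_dens stands in for the conditional density f (Yt | Yt−1 , st = k, Θ) of Equation (3), and the toy example uses a plain Poisson density per regime rather than the full INAR likelihood.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

def ffbs_states(y, cond_dens, P, m):
    """Forward-filter, backward-sample the states s_1, ..., s_n.

    cond_dens(t, k) returns f(Y_t | Y_{t-1}, s_t = k+1, Theta); P is the
    (m+1) x (m+1) transition matrix of the change-point chain, in which a
    state can only stay put or move one step forward, so the matrix product
    below reduces to the two-term sum (j = k-1, k) of the prediction step.
    """
    n = len(y)
    filt = np.zeros((n, m + 1))           # p(s_t = k | Y_t, Theta, P)
    filt[0, 0] = 1.0                      # p(s_1 | Y_0) is degenerate at state 1
    for t in range(1, n):
        pred = filt[t - 1] @ P            # prediction step
        upd = pred * np.array([cond_dens(t, k) for k in range(m + 1)])
        filt[t] = upd / upd.sum()         # update step
    s = np.empty(n, dtype=int)
    s[-1] = m                             # s_n is fixed at the last regime, m+1
    for t in range(n - 2, -1, -1):        # backward sampling via Eq. (A1)
        w = filt[t] * P[:, s[t + 1]]
        s[t] = rng.choice(m + 1, p=w / w.sum())
    return s + 1                          # report states on the 1, ..., m+1 scale

# Toy run: one break (m = 1), regime densities taken as plain Poissons.
y = np.array([1, 2, 1, 2, 9, 10, 8, 11])
dens = lambda t, k: poisson.pmf(y[t], [1.5, 9.0][k])
P = np.array([[0.9, 0.1],
              [0.0, 1.0]])
print(ffbs_states(y, dens, P, m=1))       # a non-decreasing state sequence
```

Because the transition matrix forbids moves back to an earlier regime, any sampled sequence is automatically non-decreasing, which is exactly the change-point structure of Chib [8].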
Step 2: Simulation of P
Once the states Sn are simulated, the full posterior distribution p(P | Yn , Sn , Θ) of P becomes independent of (Yn , Θ) and depends only on Sn . We assume a Beta(a, b) prior for pii . Multiplying the prior by the likelihood gives the posterior density p(pii | Sn ) as Beta(a + nii , b + 1), where nii is the number of one-step transitions from i to i in Sn . We take a = 20 and b = 1 in all the calculations. The transition probabilities can now be easily simulated.
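Step 2 amounts to conjugate Beta updates, which can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_transition_probs(s, m, a=20.0, b=1.0):
    """Draw each diagonal entry p_ii from its Beta(a + n_ii, b + 1) posterior,
    where n_ii is the number of one-step i -> i transitions in the sampled
    state sequence s (states labelled 1, ..., m+1); the last state absorbs."""
    s = np.asarray(s)
    P = np.zeros((m + 1, m + 1))
    for i in range(m):
        n_ii = int(np.sum((s[:-1] == i + 1) & (s[1:] == i + 1)))
        P[i, i] = rng.beta(a + n_ii, b + 1)
        P[i, i + 1] = 1.0 - P[i, i]       # only a move to the next regime is allowed
    P[m, m] = 1.0
    return P

# Example: a sampled state path with one break after the fourth observation.
print(sample_transition_probs([1, 1, 1, 1, 2, 2, 2], m=1))
```

With a = 20 and b = 1, the prior favours long regime durations, reflecting the belief that breaks are rare.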
Step 3: Simulation of 
Let Θ−αji = Θ − {αji } and Θ−λj = Θ − {λj }, i = 1, 2, . . . , p, j = 1, 2, . . . , m + 1. Then the conditional posterior densities of the individual parameters are given by

p(αji | Θ−αji , Yn , P, Sn ) ∝ p(αj | Θ−αji ) × L(Yn | Sn , Θ),

p(λj | Θ−λj , Yn , P, Sn ) ∝ p(λj ) × L(Yn | Sn , Θ).


Here, the expressions on the right-hand sides of both conditional densities are obtained as functions of αji and λj , respectively, by conditioning on the other parameters. The priors and L(Yn | Sn , Θ) are as defined in Equations (4) and (5). The parameters are sampled using the single-move Gibbs sampler in the following steps:
Journal of Applied Statistics 2669

(1) Sample αji from p(αji | Θ−αji , Yn , P, Sn ), i = 1, 2, . . . , p, j = 1, 2, . . . , m + 1.
(2) Sample λj from p(λj | Θ−λj , Yn , P, Sn ), j = 1, 2, . . . , m + 1.

The Metropolis–Hastings algorithm [9] is used to sample from the above conditional densities, the details of which are as follows.
Starting with αji(k) (the superscript k indexes the iteration of the estimation loop), iterate the following steps:

(1) Propose a new αji∗ , where αji∗ ∼ q(αji(k) ).
(2) Accept αji∗ with probability

    min{ [p(αji∗ | Θ−αji , Yn , P, Sn )/q(αji∗ )] / [p(αji(k) | Θ−αji , Yn , P, Sn )/q(αji(k) )], 1 }.

Here q(·) is the candidate-generating density. The candidate density used to generate αji is Uniform [0,1]. The parameter λj is simulated similarly, using a Gamma candidate density.
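The Metropolis–Hastings update above can be sketched generically as follows. The toy target below is only a stand-in for the actual conditional posterior p(αji | Θ−αji , Yn , P, Sn ), which would require the full INAR likelihood L(Yn | Sn , Θ).

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_step(theta_cur, log_post, propose, log_q):
    """One Metropolis-Hastings update [9]: draw theta* ~ q, then accept it
    with probability min{ [p(theta*)/q(theta*)] / [p(theta)/q(theta)], 1 }."""
    theta_new = propose()
    log_ratio = (log_post(theta_new) - log_q(theta_new)) \
                - (log_post(theta_cur) - log_q(theta_cur))
    return theta_new if np.log(rng.uniform()) < min(log_ratio, 0.0) else theta_cur

# Toy stand-in for the conditional posterior of an alpha_ji: a Beta(2,5)-shaped
# target on (0,1), sampled with a Uniform[0,1] candidate (log q = 0) as above.
log_post = lambda a: np.log(a) + 4.0 * np.log1p(-a)
a, draws = 0.5, []
for _ in range(4000):
    a = mh_step(a, log_post, propose=rng.uniform, log_q=lambda x: 0.0)
    draws.append(a)
print(np.mean(draws))  # settles near the Beta(2,5) mean, 2/7 ≈ 0.286
```

Because the Uniform[0,1] candidate does not depend on the current state, this is an independence sampler and the candidate-density terms cancel, so the acceptance ratio reduces to a ratio of posteriors.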
Iterating the above three steps a large number of times yields realizations of the break points in the form of Sn . The Bayes estimates of the parameters are obtained as the means of a large number of sampled observations after convergence of the chain.
