Journal of Applied Statistics

ISSN: 0266-4763 (Print) 1360-0532 (Online) Journal homepage: http://www.tandfonline.com/loi/cjas20

Sensitivity analysis of longitudinal count responses: a local influence approach and application to medical data

Alejandra Tapia, Viviana Giampaoli, Maria del Pilar Diaz & Victor Leiva

To cite this article: Alejandra Tapia, Viviana Giampaoli, Maria del Pilar Diaz & Victor Leiva (2018):
Sensitivity analysis of longitudinal count responses: a local influence approach and application to
medical data, Journal of Applied Statistics, DOI: 10.1080/02664763.2018.1531978

To link to this article: https://doi.org/10.1080/02664763.2018.1531978

Published online: 15 Oct 2018.

Alejandra Tapia (a), Viviana Giampaoli (b), Maria del Pilar Diaz (c) and Victor Leiva (d)

(a) Institute of Statistics, Faculty of Economic and Administration Sciences, Universidad Austral de Chile, Valdivia, Chile; (b) Institute of Mathematics and Statistics, Universidade de São Paulo, São Paulo, Brazil; (c) School of Nutrition, Faculty of Medical Sciences and INICSA-CONICET, Universidad Nacional de Córdoba, Córdoba, Argentina; (d) School of Industrial Engineering, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile

ABSTRACT
Longitudinal count responses are often analyzed with a Poisson mixed model. However, under overdispersion, these responses are better described by a negative binomial mixed model. Estimators of the corresponding parameters are usually obtained by the maximum likelihood method. To investigate the stability of these maximum likelihood estimators, we propose a methodology of sensitivity analysis using local influence. As count responses are discrete, we are unable to perturb them with the standard scheme used in local influence. Then, we consider an appropriate perturbation for the means of these responses. The proposed methodology is useful in different applications, but particularly when medical data are analyzed, because the removal of influential cases can change the statistical results and then the medical decision. We study the performance of the methodology by using Monte Carlo simulation and apply it to real medical data related to epilepsy and headache. All of these numerical studies show the good performance and potential of the proposed methodology.

ARTICLE HISTORY
Received 2 March 2018
Accepted 25 September 2018

KEYWORDS
Approximation of integrals; local influence; longitudinal data; Monte Carlo and Metropolis-Hastings methods; Poisson and negative binomial distributions

1. Introduction
Longitudinal count responses are often described by a Poisson mixed model. However,
under overdispersion, the negative binomial mixed model provides better results [2,10,15].
The Poisson mixed model belongs to the family of generalized linear mixed models
–GLMM– [17,25]. The negative binomial mixed model is also a member of this family
if its precision parameter is fixed. GLMM accommodate the existing correlation structure
in longitudinal responses by means of random effects. Conditional on these effects, the
responses follow a distribution of the exponential family with means related to the lin-
ear predictor by a link function. Generally, it is assumed that the random effects follow a
multivariate normal distribution.

CONTACT Victor Leiva victorleivasanchez@gmail.com
© 2018 Informa UK Limited, trading as Taylor & Francis Group

Estimation of parameters in GLMM is usually performed by the maximum likelihood –ML– method. Nevertheless, the corresponding likelihood function includes mathematically intractable integrals. To address this intractability, the Laplace approximation –AL– [34] and adaptive Gauss-Hermite quadrature –AGHQ– [28] have been used, as well as other methods, such as restricted pseudo-likelihood –RPL– [41] and penalized quasi-likelihood [3].
After the estimation procedure, a diagnostic analysis should be considered in any sta-
tistical modeling. A method to perform such an analysis is the case-deletion technique,
which consists of studying the stability of estimates after removing individual observations
[8,42]. However, the most used method is the local influence technique, which studies the
stability of estimates under perturbations in the model or data. This technique, introduced
by Cook [6], was applied to the normal linear model and to several other statistical models. For
example, some studies on local influence consider mixed models [21], elliptical time series
and regression models [22,23], negative binomial models including zero-inflation [13,37],
missing data models [45], multivariate regression models [11,24], semiparametric mod-
els [16], spatial models [1,9,14], generalized linear type, restricted and varying precision
models [19,20,35], and survival analysis models [18].
As local influence is a likelihood-based technique, its use in GLMM has the same problem of intractability indicated above, which was solved by Ouwens et al. [27]. As mentioned, Zhu and Lee [45] applied local influence to missing data models and there defined a type of likelihood displacement known as the Q-displacement function. Based on this work, Zhu and Lee [46] suggested considering the random effects of GLMM as missing data and employed the conditional expectation of the complete-data likelihood function for estimating their parameters and assessing local influence. In this approach, the local influence technique was applied as usual (adapted to mixed models) considering schemes of perturbation based on: (i) case-weights (within and among clusters, as well as among clusters and random effects); (ii) random effects variance; (iii) explanatory variable; and (iv) response. However, note that the local influence technique is based on derivatives, which require continuity of the explanatory and response variables. Therefore, local influence for models with count responses cannot be conducted by perturbing these responses in the usual manner, although studying this type of perturbation is crucial for count models.
Selecting an appropriate perturbation in local influence is relevant because arbitrary
perturbations may provoke unreliable results when identifying influential observations.
Zhu et al. [44] derived a methodology for choosing an appropriate perturbation scheme
based on the observed-data log-likelihood function. Applications of this methodology to:
structural equation models [4], generalized linear models with missing covariates [36],
symmetric generalized linear models [40] and capital asset pricing models [12], are some
examples. In the case of GLMM, Chen et al. [5] derived local influence selecting an appropriate perturbation. The most recent work on local influence in GLMM is attributed to Rakhmawati et al. [33]. To the best of our knowledge, local influence to assess how a longitudinal count response affects the estimates of parameters has not been addressed to date. This is particularly important in medicine when count data related to, for example, the number of epileptic seizures or of headaches suffered by patients, are analyzed. Moreover, as the Poisson distribution is not suitable in the presence of overdispersion, mixed models and their local influence diagnostics addressing overdispersion, in the case of longitudinal count data, should be based on the negative binomial distribution.
The main objective of this work is to derive a methodology of sensitivity analysis using
the local influence technique in Poisson and negative binomial mixed models. Specifically,
we detect how their responses affect the estimates of parameters by using an appropriate
perturbation of means with the Q-displacement function. We emphasize that, as the count
response is discrete, we are unable to perturb it as usual in local influence and then we
perturb its mean. This is a way of indirectly perturbing count responses in both models.
Note that this means that we are able to detect influential observations (measurements), but not
influential subjects, which allows us to avoid misleading ML estimates and, consequently,
inaccurate decisions. Also, this is of practical importance because, beyond identifying
influential measurements, it improves the process of data validation and model selection.
The performance of the derived methodology is evaluated by Monte Carlo (MC) simula-
tions. In addition, we apply the methodology to two real data sets related to epilepsy [39]
and headache [26]. The numerical studies of this work are performed with a computational
routine implemented by the authors in the R software; see www.R-project.org and [30].
This article is organized as follows. In Section 2, we present the local influence approach
for Poisson and negative binomial mixed models. We derive an appropriate perturbation
scheme for the mean to indirectly evaluate the effect of their count responses in both mod-
els. In Section 3, an MC simulation study is carried out to assess the performance of the
methodology derived in Section 2, whereas an application with two real medical data sets
is conducted in Section 4. Finally, some concluding remarks as well as proposals for future
research are described in Section 5. Background about the Poisson and negative binomial
mixed models, as well as mathematical results, are presented in the appendices.

2. The local influence approach


Consider the negative binomial mixed model detailed in Appendix 1 as follows. We assume a count response $Y_{ij}$ with negative binomial distribution of dispersion parameter $1/\phi$, for $j = 1, \ldots, n_i$ and $i = 1, \ldots, I$. If $1/\phi \to 0$, we have a Poisson mixed model. Also, we assume that the random effects correspond to a $p_2 \times 1$ vector $b_i$ with distribution $\mathrm{N}_{p_2}(0_{p_2 \times 1}, \Sigma)$, where $0_{p_2 \times 1}$ is a $p_2 \times 1$ vector of zeros and the $p_2 \times p_2$ variance-covariance matrix $\Sigma$ (of rank $p_2$) depends on a $p_3 \times 1$ vector $\gamma$ of unknown variance components. In this framework, according to (A2) of Appendix 1, we also consider $p_1 \times 1$ and $p_2 \times 1$ vectors of covariate values, denoted by $x_{ij} = (x_{ij1}, \ldots, x_{ijp_1})^\top$ and $z_{ij} = (z_{ij1}, \ldots, z_{ijp_2})^\top$, for the fixed (regression) and random effects, respectively, and the $p_1 \times 1$ vector $\beta$ of unknown regression coefficients.
Let $\psi = (\phi, \beta^\top, \gamma^\top)^\top$ be an $M \times 1$ parameter vector to be estimated, with $M = 1 + p_1 + p_3$, and $y_o$ be the observed data set. Then, for the negative binomial mixed model, the log-likelihood function of $\psi$ based on observed data is defined in (A3) of Appendix 1.
To address the intractability of high-dimensional integrals in this observed-data log-likelihood function, Zhu and Lee [46] considered the random effects as an unobserved data set, namely $y_u$. Thus, we denote the complete data set as $y_c = (y_o^\top, y_u^\top)^\top$. Therefore, from the observed-data likelihood function defined in (A3), we obtain the corresponding complete-data log-likelihood function for $\psi$ given by

$$\ell(\psi; y_c) = \sum_{i=1}^{I} \left( \sum_{j=1}^{n_i} \left[ \phi\,(y_{ij}\theta_{ij} - d(\theta_{ij})) + c(y_{ij}, \phi) \right] - \frac{1}{2}\, b_i^\top \Sigma^{-1} b_i - \frac{1}{2} \log(\det(\Sigma)) \right), \tag{1}$$

where $\theta_{ij}$ and the functions $d$, $c$ are defined in (A1) of Appendix 1 and $\theta_{ij}$ depends on $x_{ij}$, $z_{ij}$ and $\beta$. Note that (1) is a relatively simple expression to develop the local influence approach.
Consider a perturbation vector $\omega \in \Omega \subset \mathbb{R}^q$, with $q = \sum_{i=1}^{I} n_i$, such that $\ell(\psi, \omega; y_c)$ based on (1) is the complete-data log-likelihood function for $\psi$ under the perturbed model. Assume that there is a non-perturbation vector $\omega_0 \in \Omega \subset \mathbb{R}^q$ such that $\ell(\psi, \omega_0; y_c) = \ell(\psi; y_c)$, for all $\psi$. Let $\hat\psi(\omega)$ be the ML estimate of $\psi$ for the perturbed model that maximizes

$$Q(\psi, \omega)|_{\psi=\hat\psi} = \mathrm{E}\big(\ell(\psi, \omega; Y_c) \mid Y_o = y_o\big)\big|_{\psi=\hat\psi}, \tag{2}$$

where $\hat\psi$ is the ML estimator of $\psi$. Note that the expectation expressed in (2) is calculated with respect to the conditional probability mass function $p_{Y_u \mid Y_o = y_o}(y_u; \hat\psi)$; see Appendix 1. To assess the influence of $\omega \in \Omega \subset \mathbb{R}^q$, Zhu and Lee [46] used the Q-displacement function based on (2) given by

$$f_Q(\omega) = 2\left( Q(\hat\psi)|_{\psi=\hat\psi} - Q(\hat\psi(\omega))|_{\psi=\hat\psi} \right), \tag{3}$$

where $Q(\hat\psi)|_{\psi=\hat\psi} = Q(\hat\psi, \omega_0)|_{\psi=\hat\psi}$ and $Q(\hat\psi(\omega))|_{\psi=\hat\psi} = Q(\hat\psi(\omega), \omega_0)|_{\psi=\hat\psi}$. Then, similarly to Cook [6, 7] and Zhu and Lee [45], the influence graph of $f_Q(\omega)$ expressed in (3) is defined as $\alpha(\omega) = (\omega^\top, f_Q(\omega))^\top$. The normal curvature $C_{f_Q,h}$ of $\alpha(\omega)$ at $\omega_0$, in the direction of a unit vector $h \in \mathbb{R}^q$, is used to summarize the local behavior of $f_Q(\omega)$. It can be proven [45] that this normal curvature is given by

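$$C_{f_Q,h} = -2\, h^\top \ddot{Q}_{\omega_0} h = 2\, h^\top \Delta_{\omega_0}^\top \big(-\ddot{Q}_\psi(\hat\psi)\big)^{-1} \Delta_{\omega_0} h, \tag{4}$$

where

$$\ddot{Q}_{\omega_0} = \left.\frac{\partial^2 Q(\hat\psi(\omega))}{\partial\omega\,\partial\omega^\top}\right|_{\psi=\hat\psi,\,\omega=\omega_0}, \quad \ddot{Q}_\psi(\hat\psi) = \left.\frac{\partial^2 Q(\psi)}{\partial\psi\,\partial\psi^\top}\right|_{\psi=\hat\psi}, \quad \Delta_{\omega_0} = \left.\frac{\partial^2 Q(\psi, \omega)}{\partial\psi\,\partial\omega^\top}\right|_{\psi=\hat\psi,\,\omega=\omega_0}, \tag{5}$$

are $q \times q$ and $M \times M$ positive semidefinite matrices and an $M \times q$ perturbation matrix, respectively. Now, as (4) is not invariant under reparametrization of $\psi$ and can assume any value, Zhu and Lee [45] used the normal conformal curvature [29] $B_{f_Q,h}$ of $\alpha(\omega)$ at $\omega_0$, in the direction of a unit vector $h \in \mathbb{R}^q$ as in (4), given by

$$B_{f_Q,h} = \frac{-2\, h^\top \ddot{Q}_{\omega_0} h}{\mathrm{tr}(-2\ddot{Q}_{\omega_0})}, \tag{6}$$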

where $\mathrm{tr}(A)$ is the trace of the matrix $A$ and $\ddot{Q}_{\omega_0}$ is defined in (5). Note that (6) is invariant under reparametrization of $\psi$ and assumes values in the interval [0, 1] [29].
Unlike Cook [6, 7], Zhu and Lee [45] studied (6) in the direction of the observation $j$. Then, letting $Q = -2\ddot{Q}_{\omega_0}/\mathrm{tr}(-2\ddot{Q}_{\omega_0})$, it is possible to define the aggregate contribution vector [21,29,45] given by

$$M(0) = \sum_{m=1}^{M} \lambda_m\, (e_m \odot e_m), \tag{7}$$

where $(\lambda_1, e_1), \ldots, (\lambda_M, e_M)$ are pairs of eigenvalues and eigenvectors of $Q$, such that $\lambda_1 \ge \cdots \ge \lambda_M > \lambda_{M+1} = \cdots = \lambda_q = 0$, $(e_1, \ldots, e_M)$ is an orthonormal basis of $\mathbb{R}^M$, and $e_m \odot e_m$ is the vector of squared components of $e_m$. The $j$th component of $M(0)$ defined in (7) is denoted by $M(0)_j$. Then, the observation $j$ is influential if $M(0)_j > \overline{M}(0) + 2\,\mathrm{SD}(M(0))$, where $\overline{M}(0) = (1/q)\sum_{j=1}^{q} M(0)_j$ and $\mathrm{SD}(M(0))$ is the standard deviation (SD) of $M(0)_1, \ldots, M(0)_q$.
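
The cutoff rule stated above can be sketched in a few lines of Python. This is only an illustration, not the authors' R routine: the function name `flag_influential`, the example vector and the use of the sample SD (denominator q - 1) are assumptions, since the text does not say which SD estimator is used.

```python
import statistics

def flag_influential(m0, n_sd=2.0):
    """Return (1-based indices j with M(0)_j > mean + n_sd * SD, cutoff)."""
    mean = statistics.fmean(m0)
    sd = statistics.stdev(m0)  # assumption: sample SD (denominator q - 1)
    cutoff = mean + n_sd * sd
    flagged = [j for j, value in enumerate(m0, start=1) if value > cutoff]
    return flagged, cutoff

# Hypothetical aggregate contribution vector: 19 small entries and one large one.
m0 = [0.01] * 19 + [0.81]
flagged, cutoff = flag_influential(m0)
print(flagged, round(cutoff, 4))
```

In this toy vector only the last observation exceeds the cutoff, so it is the only one reported as potentially influential.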
In order to obtain $B_{f_Q,h}$ defined in (6), it is necessary to calculate the matrices $-\ddot{Q}_\psi(\hat\psi)$ and $\Delta_{\omega_0}$ defined in (5). However, as the conditional expectation included in the blocks of these matrices cannot be calculated in closed form, Zhu and Lee [46] used the classical MC method of integration for its approximation; see technical details in Appendices 2–4. In this section, we present only the derivatives related to $-\ddot{Q}_\psi(\hat\psi)$, since they do not depend on the proposed perturbation scheme. Thus, the derivatives involved in $-\ddot{Q}_\psi(\hat\psi)$ are presented in Appendix 5.
The methodology for selecting an appropriate perturbation is important in the local influence approach. For GLMM, Chen et al. [5] derived this methodology as follows. Consider the perturbed model $P = \{p_{Y_c}(y_c; \psi, \omega) : \omega \in \Omega \subset \mathbb{R}^q\}$, where $p_{Y_c}(y_c; \psi, \omega)$ is the joint probability mass function of $y_c$ under $\ell(\psi, \omega; y_c)$. Then, the corresponding Fisher expected information matrix with respect to the perturbation vector $\omega$, under $P$, is given by

$$G(\omega) = (g_{ij}(\omega))_{q \times q}, \tag{8}$$

with

$$g_{ij}(\omega) = \mathrm{E}\left( \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_i}\, \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_j} \right),$$

where the expectation is calculated using $p_{Y_c}(y_c; \psi, \omega)$. The diagonal elements of $G(\omega)$ defined in (8) are the variances of the scores with respect to the components of $\omega$, and indicate the amount of perturbation introduced by the corresponding components of $\omega$. The off-diagonal elements of $G(\omega)$ are the covariances of the scores with respect to the components of $\omega$, and represent the association between the different components of $\omega$. Thus, a perturbation is appropriate if it satisfies the following conditions: (i) $G(\omega)$ is of full rank in a small neighborhood of $\omega_0$, to avoid redundant components of $\omega$; (ii) the off-diagonal components are as small as possible, to avoid a strong association between the components of $\omega$ and, consequently, perturbations with strong ambiguous effects; and (iii) the differences between the diagonal components are as small as possible, so that the amounts of perturbation introduced by the components of $\omega$ are uniform. In practical applications, the appropriate perturbation satisfying conditions (i)–(iii) is the one verifying $G(\omega_0) = a I_q$, with $a > 0$ and $I_q$ being the $q \times q$ identity matrix. Now, if $G(\omega_0) \neq a I_q$, we can choose a new $q \times 1$ perturbation vector given by

$$\tilde\omega = \omega_0 + a^{-1/2}\, G(\omega_0)^{1/2} (\omega - \omega_0), \tag{9}$$

such that $G(\tilde\omega)$ evaluated at $\omega = \omega_0$ is equal to $a I_q$, that is, the perturbed model $\tilde{P} = \{p_{Y_c}(y_c; \psi, \omega(\tilde\omega)) : \tilde\omega \in \tilde\Omega \subset \mathbb{R}^q\}$ is considered, where from (9)

$$\omega = \omega_0 + a^{1/2}\, G(\omega_0)^{-1/2} (\tilde\omega - \omega_0) \tag{10}$$

and $\tilde\Omega = \{\omega_0 + a^{-1/2}\, G(\omega_0)^{1/2} (\omega - \omega_0) : \omega \in \Omega \subset \mathbb{R}^q\}$. Note that, in order to emphasize the dependence of $\omega$ on $\tilde\omega$, we use the notation $\omega(\tilde\omega)$, such that $\omega$ defined in (10) is now written as

$$\omega(\tilde\omega) = \omega_0 + a^{1/2}\, G(\omega_0)^{-1/2} (\tilde\omega - \omega_0). \tag{11}$$
In order to obtain the perturbation vector ω̃, it is necessary to calculate the matrix G(ω0 ).
However, as the expectation involved in the blocks of G(ω0 ) cannot be calculated in closed
form, Chen et al. [5] used the classical MC method of integration for approximating it.
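
As an illustration of the change of variable in (9) and its inverse in (11), the following Python sketch handles the special case in which G(ω0) is diagonal with positive entries, so that its matrix square root reduces to element-wise square roots. The function names and the numerical values are hypothetical and serve only to check that the two maps are inverses of each other.

```python
import math

def to_tilde(omega, omega0, g_diag, a=1.0):
    """Equation (9) for a diagonal G(omega0) with positive entries g_diag."""
    return [w0 + math.sqrt(g / a) * (w - w0)
            for w, w0, g in zip(omega, omega0, g_diag)]

def from_tilde(omega_tilde, omega0, g_diag, a=1.0):
    """Equation (11): maps the re-scaled perturbation back to omega."""
    return [w0 + math.sqrt(a / g) * (wt - w0)
            for wt, w0, g in zip(omega_tilde, omega0, g_diag)]

# Hypothetical values: three observations, no-perturbation vector of ones.
omega0 = [1.0, 1.0, 1.0]
g_diag = [4.0, 0.25, 1.0]
omega = [1.1, 0.9, 1.05]

omega_tilde = to_tilde(omega, omega0, g_diag)
recovered = from_tilde(omega_tilde, omega0, g_diag)
print(omega_tilde, recovered)
```

Components with a large diagonal entry of G(ω0) are stretched by (9), while those with a small entry are shrunk, which is how the re-scaled perturbation equalizes the amount of perturbation across components.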
As the count responses are discrete random variables, we modify their values by introducing a perturbation in their means. Note that there are several ways to perturb the response mean and to assess local influence and stability of the ML estimates. However, arbitrarily perturbing these means may result in misleading inference about the influential observations. In this context, we use an appropriate quadratic multiplicative perturbation of the response mean for both models, given in (15), instead of an inappropriate perturbation such as that expressed in (12). Observe that this last perturbation does not impose any restriction on the perturbation vector, whereas the appropriate perturbation depends on a matrix G, which guarantees its appropriateness for both models, according to the elements of G defined in (13) and (14) for negative binomial and Poisson models, respectively. The scheme adopted is multiplicative because we believe it is interesting to analyze an inflation or deflation of the mean (response mean of the Poisson model), and it is quadratic to ensure that the perturbed value remains positive.
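First, we consider a perturbation scheme for the response mean of Poisson and negative binomial mixed models given by

$$\mu_{ij}(\omega_{ij}) = \mu_{ij}\,\omega_{ij}^2, \quad j = 1, \ldots, n_i, \; i = 1, \ldots, I. \tag{12}$$

Thus, $\omega_0 = 1_{q \times 1}$, where $1_{q \times 1}$ is a $q \times 1$ vector of ones. Then, by applying the methodology for selecting an appropriate perturbation of Chen et al. [5] in the negative binomial mixed model, the matrix $G(\omega_0)$ is formed by the elements

$$g_{ij}(\omega_0) = \mathrm{E}\left( \left. \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_i}\, \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_j} \right|_{\omega=\omega_0} \right), \tag{13}$$

where, for $i = j$,

$$\left. \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_i}\, \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_j} \right|_{\omega=\omega_0} = \frac{6\phi y_{ij}\mu_{ij} + 2\phi^2 y_{ij} + 2\phi^2 \mu_{ij} - 2\phi\mu_{ij}^2}{(\mu_{ij} + \phi)^2},$$

and, for $i \neq j$,

$$\left. \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_i}\, \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_j} \right|_{\omega=\omega_0} = 0,$$
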
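with $i, j = 1, \ldots, q$. In the case of the Poisson mixed model, the matrix $G(\omega_0)$ is formed by the elements

$$g_{ij}(\omega_0) = \mathrm{E}\left( \left. \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_i}\, \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_j} \right|_{\omega=\omega_0} \right), \tag{14}$$

where, for $i = j$,

$$\left. \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_i}\, \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_j} \right|_{\omega=\omega_0} = 2(y_{ij} + \mu_{ij}),$$

and, for $i \neq j$,

$$\left. \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_i}\, \frac{\partial \log(p_{Y_c}(y_c; \psi, \omega))}{\partial\omega_j} \right|_{\omega=\omega_0} = 0,$$

with $i, j = 1, \ldots, q$. Since for both models the elements of $G(\omega_0)$ defined in (13) and (14) provide $G(\omega_0) \neq a I_q$, with $a > 0$, we choose a new perturbation vector by considering (9). Thus, an appropriate quadratic multiplicative perturbation scheme of the response mean is established as

$$\mu_{ij}(\omega_{ij}(\tilde\omega_{ij})) = \mu_{ij}\,\omega_{ij}(\tilde\omega_{ij})^2, \quad i, j = 1, \ldots, q, \tag{15}$$

where $\omega_{ij}(\tilde\omega_{ij}) = 1 + g_{ij}(\omega_0)^{-1/2}(\tilde\omega_{ij} - 1)$, such that $G(\omega_0) = a I_q$, with $a = 1$. Under (15), the derivatives included in $\Delta_{\omega_0}$ for the Poisson and negative binomial mixed models are given in Appendix 6. As mentioned in matrix terms in (11), we use the notation $\omega_{ij}(\tilde\omega_{ij})$ in (15) because $\omega_{ij}$ is expressed as a function of $\tilde\omega_{ij}$ based on the transformation given in (9).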
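
To make the diagonal expressions displayed with (13) and (14) and the perturbed mean in (15) concrete, a minimal Python sketch follows. It evaluates the expressions for given values of the response and mean, that is, before the expectation is taken; the function names and numerical inputs are hypothetical, and the value of g passed to `perturbed_mean` is a placeholder rather than an MC approximation.

```python
def g_diag_negbin(y, mu, phi):
    """Diagonal expression displayed with (13), before taking the expectation."""
    num = 6 * phi * y * mu + 2 * phi**2 * y + 2 * phi**2 * mu - 2 * phi * mu**2
    return num / (mu + phi) ** 2

def g_diag_poisson(y, mu):
    """Diagonal expression displayed with (14), before taking the expectation."""
    return 2.0 * (y + mu)

def perturbed_mean(mu, g, omega_tilde):
    """Equation (15) with a = 1; needs g > 0 so that g**(-1/2) is real."""
    if g <= 0:
        raise ValueError("g must be positive")
    omega = 1.0 + g ** -0.5 * (omega_tilde - 1.0)
    return mu * omega**2

print(g_diag_negbin(y=3, mu=2.0, phi=5.0))   # positive value
print(g_diag_negbin(y=0, mu=8.0, phi=5.0))   # negative: y = 0 and mu > phi
print(g_diag_poisson(y=3, mu=2.0))
print(perturbed_mean(mu=2.0, g=4.0, omega_tilde=1.025))
```

The second call shows that the negative binomial expression becomes negative when the response is zero and the mean exceeds φ, in which case the square root in (15) is not defined; the guard in `perturbed_mean` makes that failure explicit.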

3. Simulation study
We carry out an MC simulation study with R = 100 replications under Poisson and negative binomial mixed models. The scenarios of the simulation are defined by the combination of the sample sizes $q = 90$ ($I = 30$, $n_i = 3$), $q = 360$ ($I = 60$, $n_i = 6$), $q = 1080$ ($I = 120$, $n_i = 9$), and the values of the perturbation $\tilde\omega_{ij} = 0.975, 1.025$. For the mentioned models, we consider the systematic component given by

$$\log(\mu_{ij}) = \beta_0 + \beta_1 x_{ij} + b_i, \quad j = 1, \ldots, n_i, \; i = 1, \ldots, I, \quad q = \sum_{i=1}^{I} n_i, \tag{16}$$

where $x_{ij}$ are the values of a covariate $X \sim \mathrm{U}(0, 1)$. Also, $\beta_0$, $\beta_1$ are the regression coefficients and $b_i$ is the random intercept with distribution $\mathrm{N}(0, \sigma^2)$. The true values of the parameters are $\beta_0 = 0.5$, $\beta_1 = 0.5$ and $\sigma^2 = 0.1, 0.5, 1.0$. In addition, for the negative binomial mixed model, we consider $\phi = 5.0$. We generate the values $y_{ij}$ of the count response from the Poisson and negative binomial distributions with parameter given by (16). To obtain each replication, we randomly choose the group to perturb, namely $i$, and generate again its $n_i$ observations $y_{ij}$ as follows. Given a value of $\tilde\omega_{ij} = 0.975, 1.025$, we obtain the values $y_{ij}$ from the Poisson and negative binomial distributions with parameter $\mu_{ij}(\omega_{ij}(\tilde\omega_{ij})) = \mu_{ij}\,\omega_{ij}(\tilde\omega_{ij})^2$, where

$$\omega_{ij}(\tilde\omega_{ij}) = 1 + g_{ij}(\omega_0)^{-1/2}(\tilde\omega_{ij} - 1), \tag{17}$$

with $g_{ij}(\omega_0)$ being approximated by $T = 2000$ additional observations of $b_i$ generated from a $\mathrm{N}(0, \sigma^2)$ distribution.
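
The data-generating step for the Poisson scenario can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' R routine: the perturbed group is drawn directly from the perturbed mean (instead of being generated and then re-generated), the Poisson sampler is Knuth's multiplication method, and g is approximated by averaging 4μ(b) over T normal draws of b, which assumes that the expectation of 2(Y + μ) is taken over both the response and the random effect. All function names are hypothetical.

```python
import math
import random

def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_poisson_data(I, n_i, beta0, beta1, sigma2, omega_tilde, T=2000, seed=1):
    """Return (index of perturbed group, list of (i, j, x, y)) under (16) and (17)."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    perturbed_group = rng.randrange(I)
    data = []
    for i in range(I):
        b_i = rng.gauss(0.0, sigma)
        for j in range(n_i):
            x = rng.random()  # covariate X ~ U(0, 1)
            mu = math.exp(beta0 + beta1 * x + b_i)
            if i == perturbed_group:
                # Assumption: g ~ average over T draws of b of E[2(Y + mu) | b] = 4 mu(b).
                g = sum(4.0 * math.exp(beta0 + beta1 * x + rng.gauss(0.0, sigma))
                        for _ in range(T)) / T
                omega = 1.0 + g ** -0.5 * (omega_tilde - 1.0)
                mu *= omega**2
            data.append((i, j, x, poisson_draw(rng, mu)))
    return perturbed_group, data

group, data = simulate_poisson_data(I=30, n_i=3, beta0=0.5, beta1=0.5,
                                    sigma2=0.5, omega_tilde=1.025)
print(group, len(data))
```

The call reproduces the dimensions of the smallest scenario (q = 90); the negative binomial case would need a different sampler and the diagonal expression of (13) instead.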
For each replication, we obtain the ML estimate $\hat\psi$ of $\psi = (\beta_0, \beta_1, \sigma^2)^\top$. Based on $\hat\psi$, we generate S = 500 observations of $b_i$ through the Metropolis-Hastings algorithm. The first M0 = 100 observations are discarded as the burn-in phase, while the remaining 400 observations are used for approximating $-\ddot{Q}_\psi(\hat\psi)$ and $\Delta_{\omega_0}$. In addition, we generate T = 2000 observations of $b_i$ from a $\mathrm{N}(0, \sigma^2)$ distribution for approximating $g_{ij}(\omega_0)$ given in (17).
A correct detection occurs when the observation is identified as influential and the value of the perturbed count response differs from the value of the non-perturbed count response. Tables 1 and 2 display the correct detection percentages for each simulation scenario under the Poisson and negative binomial mixed models, respectively. Table 2 shows some NAs in cases where there were difficulties with certain replications during the computations. These difficulties appear because some diagonal elements of the matrix G are negative, since in some replications $y_{ij} = 0$ and $\alpha \le \mu_{ij}$; see the diagonal components of the matrix G given in (13). In the Poisson mixed model, for each $\tilde\omega_{ij}$, as $\sigma^2$ increases, the correct detection percentages increase, reaching values greater than 99%, particularly for large q. In the negative binomial mixed model, for each $\tilde\omega_{ij}$, as $\sigma^2$ increases, the correct detection percentages are similar to those of the Poisson mixed model, reaching values greater than 96%, also for large q. In summary, for both models, the percentages of correct detection are very satisfactory, showing that the proposed methodology is able to detect the perturbed observations as influential.
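
The Metropolis-Hastings step used to draw the random intercept given the observed counts can be sketched as below for the Poisson model with a single random intercept. The text does not state the proposal distribution, so the random-walk normal proposal, its step size, the starting value and the function names are all assumptions made for illustration; only S = 500 and the burn-in of 100 come from the simulation design.

```python
import math
import random

def log_target(b, y, eta, sigma2):
    """Log of p(b | y) up to a constant: Poisson likelihood times N(0, sigma2) prior."""
    loglik = sum(yj * (ej + b) - math.exp(ej + b) for yj, ej in zip(y, eta))
    return loglik - 0.5 * b * b / sigma2

def mh_random_intercept(y, eta, sigma2, S=500, burn_in=100, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings draws of b_i (proposal is an assumption)."""
    rng = random.Random(seed)
    b = 0.0
    current = log_target(b, y, eta, sigma2)
    draws = []
    for _ in range(S):
        proposal = b + rng.gauss(0.0, step)
        candidate = log_target(proposal, y, eta, sigma2)
        # Accept with probability min(1, target ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            b, current = proposal, candidate
        draws.append(b)
    return draws[burn_in:]

# Hypothetical cluster: three counts with fixed-effect predictors beta0 + beta1 * x_ij.
y = [3, 5, 4]
eta = [0.6, 0.8, 0.7]
draws = mh_random_intercept(y, eta, sigma2=0.5)
print(len(draws), sum(draws) / len(draws))
```

Averages of functions of the retained draws would then stand in for the conditional expectations that cannot be computed in closed form.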

Table 1. Percentages of correct detection for each simulation scenario under the Poisson mixed model.

                        ω̃ij = 0.975                  ω̃ij = 1.025
    q     I   ni   σ²=0.1  σ²=0.5  σ²=1.0     σ²=0.1  σ²=0.5  σ²=1.0
   90    30    3      1      43      81          2      47      74
                      3      53      82          3      59      89
                      1      37      75          2      40      75
  360    60    6     91      90      95         89      95      96
                     31      87      94         33      92      94
                     16      86      91         18      85      94
                     65      90      93         66      91      94
                     76      86      88         78      90      89
                     12      84      92         11      92      94
 1080   120    9     35      94      95         76      89      99
                     88      91      94         83      95      94
                     50      95      91         78      90      96
                     42      91      93         86      94      94
                     76      92      94         89      93      94
                     82      91      91         82      90      91
                     89      88      92         89      92      97
                     36      93      96         82      86      95
                     27      92      94         74      91      94
Table 2. Percentages of correct detection for each simulation scenario under the negative binomial mixed model.

                        ω̃ij = 0.975                  ω̃ij = 1.025
    q     I   ni   σ²=0.1  σ²=0.5  σ²=1.0     σ²=0.1  σ²=0.5  σ²=1.0
   90    30    3      2       0       0          3       0       0
                      0       0       0          0       0       0
                      0       0       0          0       0       0
  360    60    6     91      92     NAs         88      91     NAs
                      5       1     NAs          4       0     NAs
                      5       0     NAs          2       0     NAs
                     33      49     NAs         32      42     NAs
                     87      92     NAs         90      87     NAs
                      1       0     NAs          1       0     NAs
 1080   120    9      6       1     NAs          7       0     NAs
                     93      65     NAs         89      69     NAs
                      2       0     NAs          0       0     NAs
                      3       5     NAs          3       4     NAs
                     48      50     NAs         52      51     NAs
                     90      84     NAs         93      88     NAs
                     96      80     NAs         92      74     NAs
                      2       1     NAs          1       1     NAs
                      1       0     NAs          5       1     NAs

4. Applications to real data


4.1. Epilepsy data
A balanced data set derived from a clinical trial with 59 epileptic patients was presented by Thall and Vail [39]. The objective of that study was to analyze whether or not a new drug reduces epileptic seizures. The response of interest is the number of epileptic seizures experienced by each patient during a period of two weeks before each of four visits to the clinic (V1, V2, V3, V4). Patients suffering from epileptic seizures were randomly assigned to receive a new drug or a placebo, as an adjuvant to standard chemotherapy.
Figure 1 shows profile plots (first row) and boxplots (second row) for the number of epileptic seizures in the group treated with the new drug (right) and the placebo group (left). Table 3 provides a descriptive summary for the number of epileptic seizures in V1, V2, V3, V4. From the figure and table, note that (i) there are no clear patterns in the profiles that differentiate the responses of both groups, so that a random intercept and slope are not discarded in the modeling and will be considered; (ii) no important changes are detected in the response means or in the dispersion, bearing in mind that no explanatory variables are used in this exploratory data analysis; and (iii) several atypical data are identified.

Figure 1. Profile plot (first row) and boxplots (second row) for epilepsy data.

For the data reported by Thall and Vail [39], widely used in the statistical literature [3,17,46], we consider a Poisson mixed model with systematic component given by

$$\log(\mu_{ij}) = \beta_0 + \beta_1 \mathrm{Base}_{ij} + \beta_2 \mathrm{Trt}_{ij} + \beta_3 \mathrm{Base}_{ij}\,\mathrm{Trt}_{ij} + \beta_4 \mathrm{Age}_{ij} + \beta_5 \mathrm{Visit}_{ij} + b_{i0} + b_{i1} \mathrm{Visit}_{ij}, \tag{18}$$

for $j = 1, \ldots, 4$, $i = 1, \ldots, 59$, and $q = 236$, where ‘Base’ is the logarithm of the number of seizures in a period before the clinical trial; ‘Trt’ is an indicator variable of the treatment (Trt = 1) and placebo (Trt = 0) groups; ‘Base Trt’ is the interaction between ‘Base’ and ‘Trt’; ‘Age’ is the logarithm of age in years; and ‘Visit’ is a variable indicating the periods (Visit = −0.3, −0.1, 0.1, 0.3). We assume that the random effects $b_{i0}$ and $b_{i1}$ are independent, with $b_{i0} \sim \mathrm{N}(0, \sigma_0^2)$ and $b_{i1} \sim \mathrm{N}(0, \sigma_1^2)$.
The ML estimates of the components of the parameter $\psi = (\beta_0, \beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \sigma_0, \sigma_1)^\top$ of the model with systematic component defined in (18) are presented in Table 4.
Table 3. Descriptive summary of the number of epileptic seizures in the indicated clinic visit and treatment, where SE denotes standard error.

Group      Statistic     V1     V2     V3     V4
Placebo    Mean         9.36   8.29   8.71   7.96
           Sample SE    1.92   1.54   2.76   1.44
Treatment  Mean         8.58   8.42   8.12   6.71
           Sample SE    3.28   2.13   2.50   2.02

Figure 2. Index plots of M(0)j for epilepsy data.

Table 4. Estimates, p-values and PE for epilepsy data.

Dropped observations                    Patient   Parameter   Estimate   p-value    PE
None                                    –         β0          −1.346     0.251      –
                                                  β1           0.884     < 0.001    –
                                                  β2          −0.927     0.020      –
                                                  β3           0.338     0.094      –
                                                  β4           0.470     0.172      –
                                                  β5          −0.267     0.087      –
                                                  σ0           0.499     –          –
                                                  σ1           0.736     –          –
{169,170,171,172}                       43        β0          −1.150     0.338      14.559
                                                  β1           0.885     < 0.001    0.096
                                                  β2          −0.895     0.026      3.512
                                                  β3           0.309     0.133      8.499
                                                  β4           0.411     0.244      12.564
                                                  β5          −0.297     0.060      11.251
                                                  σ0           0.502     –          0.628
                                                  σ1           0.733     –          0.510
{57,58,59,60}/{169,170,171,172}         15/43     β0          −1.130     0.350      16.088
                                                  β1           0.909     < 0.001    2.860
                                                  β2          −0.860     0.036      7.260
                                                  β3           0.282     0.185      16.400
                                                  β4           0.395     0.268      15.916
                                                  β5          −0.271     0.092      1.353
                                                  σ0           0.507     –          1.499
                                                  σ1           0.734     –          0.310
{20}/{57,58,59,60}/{169,170,171,172}    5/15/43   β0          −1.195     0.325      11.231
                                                  β1           0.894     < 0.001    1.114
                                                  β2          −0.884     0.031      4.568
                                                  β3           0.301     0.159      10.740
                                                  β4           0.421     0.239      10.486
                                                  β5          −0.303     0.054      13.342
                                                  σ0           0.508     –          1.824
                                                  σ1           0.692     –          5.995

Based on $\hat\psi$, we generate S = 10000 observations of $b_{i0}$ and $b_{i1}$ through the Metropolis-Hastings algorithm. The first M0 = 1000 observations are discarded as the burn-in phase, while the remaining 9000 observations are used for approximating $-\ddot{Q}_\psi(\hat\psi)$ and $\Delta_{\omega_0}$. In addition, we generate T = 2000 observations of $b_{i0}$ from the $\mathrm{N}(0, \sigma_0^2)$ distribution and T = 2000 observations of $b_{i1}$ from the $\mathrm{N}(0, \sigma_1^2)$ distribution for approximating $g_{ij}(\omega_0)$. Figure 2 shows an index plot of $M(0)_j$, with $M(0)_j$ corresponding to the $j$th component of the aggregate contribution vector. From this figure, note that observations corresponding to patient #5 (case #20), patient #43 (cases #169, #170, #171, #172) and patient #15 (cases #57, #58, #59, #60) were detected as potentially influential. To evaluate the magnitude of the impact exerted by the sets of potentially influential observations on the ML estimates, we compare the original ML estimates with those obtained by removing the observations of the patient combinations through the percentage error (PE) in absolute value given by

$$\mathrm{PE} = \left| (\hat\psi_k - \hat\psi_k^*)/\hat\psi_k \right| \times 100\%, \tag{19}$$

where $\hat\psi_k$ is the estimate obtained from the fit of the model with all observations and $\hat\psi_k^*$ is the estimate obtained from the fit of the model removing the combinations of influential observations and sets of influential observations (when they belong to the same patient), for $k = 1, \ldots, 8$. Table 4 shows the results of the largest values of PE defined in (19) for each possible combination. By removing the cases #169, #170, #171 and #172 of patient #43, the estimates of β0, β3, β4 and β5 present the largest variations, between about 8% and 15%. By removing the cases #57, #58, #59 and #60 of patient #15 and the cases #169, #170, #171 and #172 of patient #43, the estimates of β0, β3 and β4 present variations of about 16%. By removing the case #20 of patient #5, the cases #57, #58, #59 and #60 of patient #15, and the cases #169, #170, #171 and #172 of patient #43, the estimates of β0, β3, β4 and β5 present variations between about 10% and 13%. With respect to this last combination, it should be noted that the p-value associated with the t-test for β5 decreases, becoming very close to the significance level of 5%. In summary, the removal of the cases of the patient combinations leads to variations mostly in the estimates of β0, β3, β4 and β5, which are the most sensitive to the appropriate quadratic multiplicative perturbation scheme of means. However, at a significance level of 5%, no inferential changes were detected.
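
The percentage error in (19) is simple to compute. The short Python sketch below applies it to a few of the rounded estimates reported in Table 4 (full fit versus the fit without the cases of patient #43); the function name and dictionary layout are illustrative only. Because the table reports estimates rounded to three decimals, the values obtained this way agree with the tabulated PE only approximately.

```python
def percentage_error(full_fit, reduced_fit):
    """Equation (19): absolute relative change of an estimate, in percent."""
    if full_fit == 0:
        raise ValueError("PE is undefined when the full-data estimate is zero")
    return abs((full_fit - reduced_fit) / full_fit) * 100.0

# Rounded estimates from Table 4: all data vs. dropping cases {169, 170, 171, 172}.
full = {"beta0": -1.346, "beta3": 0.338, "beta4": 0.470}
reduced = {"beta0": -1.150, "beta3": 0.309, "beta4": 0.411}

pe = {name: percentage_error(full[name], reduced[name]) for name in full}
for name, value in pe.items():
    print(name, round(value, 2))
```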

4.2. Headache data


McKnight and van den Eeden [26] presented a set of unbalanced data derived from a two-treatment, double-blinded crossover trial including a total of 27 patients. This study was designed to examine whether aspartame causes headache in patients who believe they experience headache due to the use of this drug. The treatment regimens were randomly assigned to patients and each regimen began with a placebo administration period of seven days, usually called the run-in period, followed by four periods of seven days each alternating aspartame and placebo treatments. To eliminate the effects of the previous treatment, the periods were separated by one day. The response of interest is the number of headaches suffered by patients at the end of each of the five periods (P1, P2, P3, P4, P5). All periods lasted seven days, except for some patients, for whom they were shorter.
Once again we conduct an exploratory data analysis. Figure 3 shows profile plots (left) and boxplots (right) for the number of headaches in each placebo or aspartame group at the end of each period of time. Table 5 provides a descriptive summary for the number of headaches in P1, P2, P3, P4, P5. Our exploratory analysis of the headache data gives results similar to those of the application with epilepsy data. From the figure and table, note that (i) there are no clear patterns in the profiles that differentiate the responses of both groups, so that a random intercept and slope are not discarded in the modeling and will be considered; (ii) no important changes are detected in the response means or in the dispersion, bearing in mind that no explanatory variables are used in this exploratory data analysis; and (iii) some atypical data are identified.
Figure 3. Profile plot (left) and boxplots (right) for headache data.

Table 5. Descriptive summary for the number of headaches in the indicated period and treatment, where
SE denotes standard error.

Treatment   Statistic    P1     P2     P3     P4     P5
Placebo     Mean         1.50   1.67   1.93   1.21   1.33
            Sample SE    0.37   0.53   0.50   0.37   0.75
Aspartame   Mean         –      2.36   1.64   2.00   1.62
            Sample SE    –      0.58   0.51   0.91   0.45

For the data reported by McKnight and van den Eeden [26], also widely used in the
statistical literature [27,33,43], we consider a negative binomial mixed model with
systematic component given by

\[
\log(\mu_{ij}) = \log(t_{ij}) + \beta_0 + \beta_1\,\mathrm{Asp}_{ij} + b_i, \tag{20}
\]

for j = 1, . . . , ni , i = 1, . . . , 27 and q = 122, where Asp is an indicator variable of the
treatment, that is, Asp = 1 for aspartame and Asp = 0 for placebo. In addition, β0 and β1 are
the regression coefficients, bi is the random intercept with N(0, σ 2 ) distribution and tij is
the number of days in the period.
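As a minimal sketch of how the offset log(tij) enters (20), the following Python snippet evaluates the conditional mean for a given number of days. The function name `conditional_mean` and the choice bi = 0 (a "typical" patient) are illustrative assumptions; the coefficient values are the rounded full-data estimates reported in Table 6 and are used only as an example.

```python
import math

def conditional_mean(days, asp, beta0, beta1, b_i):
    """Conditional mean mu_ij implied by (20):
    log(mu_ij) = log(t_ij) + beta0 + beta1 * Asp_ij + b_i."""
    if days <= 0:
        raise ValueError("days must be positive")
    return days * math.exp(beta0 + beta1 * asp + b_i)

# Rounded full-data estimates from Table 6 (illustration only).
beta0, beta1 = -1.715, 0.284

# Expected number of headaches over a seven-day period for b_i = 0.
mu_placebo = conditional_mean(7, 0, beta0, beta1, 0.0)
mu_aspartame = conditional_mean(7, 1, beta0, beta1, 0.0)

# With equal exposure, the ratio of means reduces to exp(beta1).
rate_ratio = mu_aspartame / mu_placebo
```

Because the offset has a fixed coefficient of one, halving the number of days halves the expected count, which is how periods of shorter duration are accommodated in this sketch.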
The ML estimates of the components of the parameter vector ψ = (β0 , β1 , σ 2 , φ) of
the model with systematic component defined in (20) are presented in Table 6. To
approximate −Q̈ψ (ψ̂), Δω0 and gij (ω0 ), we use the same values of S, M0 and T as in the
epilepsy data application. Figure 4 shows the index plot of M(0)j , with M(0)j defined in the
previous application. Note that the following cases are detected as potentially influential:
cases #2 and #5 of patient #1; case #6 of patient #2; cases #10 and #13 of patient #3;
case #33 of patient #7; cases #58, #59, #60, #61 and #62 of patient #13; case #76 of
patient #17; case #88 of patient #20; cases #91, #92, #93, #94 and #95 of patient #21;
cases #102, #103, #104 and #105 of patient #23; cases #111, #112, #113, #114 and #115 of
patient #25; and cases #120 and #121 of patient #27. Table 6 displays the PE results for
the indicated cases of the patient combinations for which the largest values of PE were
obtained.

Table 6. Estimates, p-values and PE for headache data.

Dropped cases                      Patient              Parameter   Estimate   p-value   PE
None                               –                    β0           −1.715    < 0.001   –
                                                        β1            0.284    0.049     –
                                                        σ             0.688              –
                                                        φ           102.456
{6,33}/{58,59,60,61,62}/           2/7/13/21/25/27      β0           −1.749    < 0.001   1.988
{91,92,93,94,95}/                                       β1            0.067    0.715     76.519
{111,112,113,114,115}/{120,121}                         σ             0.587              14.705
                                                        φ            82.469
{6,33}/{58,59,60,61,62}/           2/7/13/20/21/25/27   β0           −1.738    < 0.001   1.317
{88}/{91,92,93,94,95}/                                  β1            0.054    0.767     80.938
{111,112,113,114,115}/{120,121}                         σ             0.627              8.837
                                                        φ           109.655

Figure 4. Index plots of M(0)j for headache data.

By removing case #6 of patient #2, case #33 of patient #7, cases #58, #59, #60, #61 and
#62 of patient #13, cases #91, #92, #93, #94 and #95 of patient #21, cases #111, #112,
#113, #114 and #115 of patient #25 and cases #120 and #121 of patient #27, the estimate
of β1 presents a variation of approximately 76%. By additionally removing case #88 of
patient #20, the estimate of β1 presents a variation of approximately 80%. In both cases,
the p-value associated with β1 becomes greater than 0.05, so that the coefficient of the
treatment indicator variable is no longer significant. In summary, the removal of
observations of patient combinations leads to changes in the inference on β1 , which is the
most sensitive to the appropriate quadratic multiplicative perturbation scheme of means.
In addition, in this application, unlike the study with epilepsy data, inferential changes
were detected, showing the relevance of using the diagnostic approach presented in this
paper.
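The PE column of Table 6 can be approximately reproduced with a short calculation. The sketch below assumes that PE denotes the absolute relative change, in percent, between the estimate from the full data and the estimate after dropping cases; the function name `percent_change` is illustrative. Since the inputs are the rounded estimates printed in Table 6, the results agree with the published PE values only up to rounding.

```python
def percent_change(full_estimate, reduced_estimate):
    """Absolute relative change, in percent, of an estimate after dropping cases."""
    if full_estimate == 0:
        raise ValueError("relative change is undefined for a zero estimate")
    return abs((full_estimate - reduced_estimate) / full_estimate) * 100.0

# Rounded estimates copied from Table 6.
full = {"beta0": -1.715, "beta1": 0.284, "sigma": 0.688}
first_combination = {"beta0": -1.749, "beta1": 0.067, "sigma": 0.587}
second_combination = {"beta0": -1.738, "beta1": 0.054, "sigma": 0.627}

pe_first = {name: percent_change(full[name], first_combination[name]) for name in full}
pe_second = {name: percent_change(full[name], second_combination[name]) for name in full}

for name in full:
    print(f"{name}: {pe_first[name]:.2f}% / {pe_second[name]:.2f}%")
```

In both combinations the change in β1 dominates, in line with the discussion above.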

5. Concluding remarks and future research


In this article, we have emphasized the importance of evaluating the stability of the max-
imum likelihood estimates for the parameters of Poisson and negative binomial mixed
models. For this evaluation, we have implemented a sensitivity analysis methodology using
the local influence approach based on the Q-displacement function. Due to the discrete
nature of the count responses, we have explored the appropriate quadratic multiplicative
perturbation scheme of means, as a way of indirectly assessing the perturbation in the
responses of both models. It should be noted that it is possible to perturb the means in
different ways, but based on the selection of a perturbation considered in this work, we
have derived an appropriate perturbation for the response mean of the underlying models.
It is important to note that the proposed perturbation scheme is framed in the perspective
of individual perturbation, because we are interested in the impact of the count responses
on the maximum likelihood estimates of both models. Complementarily, given the existing
correlation structure in the longitudinal count responses, some perturbation scheme could
be studied from the perspective of global perturbation, that is, perturbing simultaneously
all observations of the group or individual, but its interpretation could change.
The performance of the proposed methodology has been evaluated by Monte Carlo
simulations, showing its adequacy in detecting influential cases under an appropriate per-
turbation for the count response mean. In addition, we have applied the methodology
empirically through two real data sets widely studied in the medical literature related to
epilepsy and headache. For the balanced data set related to epilepsy, we have identified
cases having some impact on the maximum likelihood estimates for the parameters of
the Poisson mixed model, but no inferential changes were found. This adds new results
and complements the findings presented in [45,46], for the discrete response case. For the
unbalanced data set related to headache, we have detected cases having a high impact on the
maximum likelihood estimates and, unlike the epilepsy data, inferential changes
were found with the negative binomial mixed model when these cases were removed. As
in the case of the Poisson model, this provides new findings and complements the works by
Xu et al. [43], Ouwens et al. [27], Xiang et al. [42] and Rakhmawati et al. [33]. Our study has
shown that the proposed methodology is useful, giving new information related to the fit
of Poisson and negative binomial mixed models. It goes beyond the identification of influ-
ential observations, when considering response perturbation, because it has allowed us to
analyze how the count responses can affect the parameter estimates of the models. Further-
more, the proposed methodology has complemented the diagnostic tools already known
for the different sources of uncertainty in these models that are widely used in practice,
and particularly in medicine.
The proposed perturbation scheme for the means allowed us to detect the influence
of each count response for each individual, that is, the influence of measurements rather
than of individuals is evaluated. As we have used a model for longitudinal measurements, a
count response of an individual can be detected as influential, while another measurement
of the same individual may not be influential. Thus, with the proposed scheme we are able
to judge whether a count response is potentially influential or not. This is an advantage of
our proposal over the proposals of Ouwens et al. [27], for both models, and of Rakhmawati
et al. [33], for the Poisson mixed model, who performed local influence analysis for detecting
influential individuals. Our proposal is a complement to the works of Zhu and Lee [45,46], who
carried out local influence analysis only for the Poisson mixed model and under perturbation
schemes that do not consider the response. For other models involving the Poisson
distribution in their responses, alternative ways of perturbing these responses are also
unknown; see [31–33]. For the negative binomial mixed model, to
the best of our knowledge, no works on local influence under any perturbation scheme have
been published. For negative binomial regression models, Svetliza and Paula [37,38]
derived local influence under a case-weight perturbation scheme, whereas Garay et al. [13]
developed local influence considering a zero-inflated version. This opens new options for
future research.

Acknowledgements
The authors thank the Editors and two referees for their constructive comments on an earlier version
of this manuscript which resulted in this improved version. This study was financed in part by the
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code
001, with HPC resources provided by the Information Technology Superintendence of the University
of São Paulo, and also by CNPq from Brazil; as well as by the Chilean Council for Scientific and
Technology Research (CONICYT) through fellowship ‘Becas-Chile’ (A. Tapia) and FONDECYT
1160868 Grant (V. Leiva) from the Chilean government.

Disclosure statement
No potential conflict of interest was reported by the authors.

ORCID
Alejandra Tapia http://orcid.org/0000-0003-0762-7618
Viviana Giampaoli http://orcid.org/0000-0001-7812-6963
Maria del Pilar Diaz http://orcid.org/0000-0001-5207-4253
Victor Leiva http://orcid.org/0000-0003-4755-3270

References
[1] R.A.B. Assumpção, M.A. Uribe-Opazo, and M. Galea, Analysis of local influence in geostatistics
using Student-t distribution, J. Appl. Stat. 41 (2014), pp. 2323–2341.
[2] J.G. Booth, G. Casella, H. Friedl, and J.P. Hobert, Negative binomial loglinear mixed models,
Stat. Modelling 3 (2003), pp. 179–191.
[3] N.E. Breslow and D.G. Clayton, Approximate inference in generalized linear mixed models, J.
Am. Stat. Assoc. 88 (1993), pp. 9–25.
[4] F. Chen, H.-T. Zhu, and S.-Y. Lee, Perturbation selection and local influence analysis for nonlinear
structural equation model, Psychometrika 74 (2009), pp. 493–516.
[5] F. Chen, H.-T. Zhu, X.-Y. Song, and S.-Y. Lee, Perturbation selection and local influence analysis
for generalized linear mixed models, J. Comput. Graph. Stat. 19 (2010), pp. 826–842.
[6] R.D. Cook, Assessment of local influence, J. R. Stat. Soc. B 48 (1986), pp. 133–169.
[7] R.D. Cook, Influence assessment, J. Appl. Stat. 14 (1987), pp. 117–131.
[8] R.D. Cook and S. Weisberg, Residuals and Influence in Regression, Chapman and Hall, London,
UK, 1982.
[9] F. De Bastiani, A.H.M.A. Cysneiros, M.A. Uribe-Opazo, and M. Galea, Influence diagnostics in
elliptical spatial linear models, TEST 24 (2015), pp. 322–340.
[10] E. Demidenko, Mixed Models: Theory and Applications with R, Wiley, New Jersey, US,
2013.
[11] J. Díaz-García, M. Galea, and V. Leiva, Influence diagnostics for elliptical multivariate linear
regression models, Comm. Statist. Theory Methods 32 (2003), pp. 625–641.
[12] M. Galea and P. Giménez, Local influence diagnostics for the test of mean-variance efficiency and
systematic risks in the capital asset pricing model, Statist. Papers (2016), in press.
[13] A.M. Garay, E.M. Hashimoto, E.M.M. Ortega, and V.H. Lachos, On estimation and influence
diagnostics for zero-inflated negative binomial regression models, Comput. Stat. Data Anal. 55
(2011), pp. 1304–1318.
[14] F. Garcia-Papani, V. Leiva, M.A. Uribe-Opazo, and R.G. Aykroyd, Birnbaum-Saunders spatial
regression models: Diagnostics and application to chemical data, Chemometr. Intell. Lab. Syst.
177 (2018), pp. 114–128.
[15] J.M. Hilbe, Negative Binomial Regression, Cambridge University Press, New York, US, 2007.
[16] G. Ibacache-Pulgar, G.A. Paula, and F.J.A. Cysneiros, Semiparametric additive models under
symmetric distributions, TEST 22 (2013), pp. 103–121.
[17] J. Jiang, Linear and Generalized Linear Mixed Models and Their Applications, Springer, Califor-
nia, US, 2007.
[18] J. Leão, V. Leiva, H. Saulo, and V. Tomazella, Incorporation of frailties into a cure rate regression
model and its diagnostics and application to melanoma data, Stat. Med. (2018), in press.
[19] V. Leiva, S. Liu, L. Shi, and F.J.A. Cysneiros, Diagnostics in elliptical regression models with
stochastic restrictions applied to econometrics, J. Appl. Stat. 43 (2016), pp. 627–642.
[20] V. Leiva, M. Santos-Neto, F.J.A. Cysneiros, and M. Barros, Birnbaum-Saunders statistical
modelling: A new approach, Stat. Modelling 14 (2014), pp. 21–48.
[21] E. Lesaffre and G. Verbeke, Local influence in linear mixed models, Biometrics 54 (1998), pp.
570–582.
[22] S. Liu, On local influence in elliptical linear regression models, Statist. Papers 41 (2000), pp.
211–224.
[23] S. Liu, On diagnostics in conditionally heteroskedastics time series models under elliptical distri-
butions, J. Appl. Probab. 41 (2004), pp. 393–406.
[24] C. Marchant, V. Leiva, F.J.A. Cysneiros, and J.F. Vivanco, Diagnostics in multivariate generalized
Birnbaum-Saunders regression models, J. Appl. Stat. 43 (2016), pp. 2829–2849.
[25] S. McCulloch and S. Searle, Generalized, Linear and Mixed Models, Wiley, New York, US,
2001.
[26] B. McKnight and S.K. van den Eeden, A conditional analysis for two-treatment multiple-period
crossover designs with binomial or poisson outcomes and subjects who drop out, Stat. Med. 12
(1993), pp. 825–834.

[27] F.E.S. Ouwens, M.J.N.M. Tan, and M.P.F. Berger, Local influence to detect influential data
structures for generalized linear mixed models, Biometrics 57 (2001), pp. 1166–1172.
[28] J.C. Pinheiro and E.C. Chao, Efficient Laplacian and adaptive Gaussian quadrature algorithms
for multilevel generalized linear mixed models, J. Comput. Graph. Stat. 15 (2006), pp. 58–81.
[29] W.Y. Poon and Y.S. Poon, Conformal normal curvature and assessment of local influence, J. R.
Stat. Soc. B 61 (1999), pp. 51–61.
[30] R Core Team, R: A Language and Environment for Statistical Computing, (2016). R Foundation
for Statistical Computing, Vienna, Austria.
[31] T.W. Rakhmawati, G. Molenberghs, G. Verbeke, and C. Faes, Local influence diagnostics for
incomplete overdispersed longitudinal counts, J. Appl. Stat. 43 (2016), pp. 1722–1737.
[32] T.W. Rakhmawati, G. Molenberghs, G. Verbeke, and C. Faes, Local influence diagnostics for
hierarchical count data models with overdispersion and excess zeros, Biom. J. 58 (2016), pp.
1390–1408.
[33] T.W. Rakhmawati, G. Molenberghs, G. Verbeke, and C. Faes, Local influence diagnostics for
generalized linear mixed models with overdispersion, J. Appl. Stat. 44 (2017), pp. 620–641.
[34] S.W. Raudenbush, M. Yang, and M. Yosef, Maximum likelihood for generalized linear models
with nested random effects via high-order, multivariate laplace approximation, J. Comput. Graph.
Stat. 9 (2000), pp. 141–157.
[35] M. Santos-Neto, F.J.A. Cysneiros, V. Leiva, and M. Barros, Reparameterized Birnbaum-Saunders
regression models with varying precision, Electron. J. Stat. 10 (2016), pp. 2825–2855.
[36] X. Shi, H. Zhu, and J.G. Ibrahim, Local influence for generalized linear models with missing
covariates, Biometrics 65 (2009), pp. 1164–1174.
[37] C.F. Svetliza and G.A. Paula, On diagnostics in log-linear negative binomial models, J. Stat.
Comput. Simul. 71 (2001), pp. 231–244.
[38] C.F. Svetliza and G.A. Paula, Diagnostics in nonlinear negative binomial Models, Comm. Statist.
Theory Methods 32 (2003), pp. 1227–1250.
[39] P.F. Thall and S.C. Vail, Some covariance models for longitudinal count data with overdispersion,
Biometrics 46 (1990), pp. 657–671.
[40] C. Villegas, G.A. Paula, F.J.A. Cysneiros, and M. Galea, Influence diagnostics in generalized
symmetric linear models, Comput. Stat. Data Anal. 59 (2013), pp. 161–170.
[41] R. Wolfinger and M. O’Connell, Generalized linear mixed models: A pseudo-likelihood approach,
J. Stat. Comput. Simul. 48 (1993), pp. 233–243.
[42] L. Xiang, S.K. Tse, and A.H. Lee, Influence diagnostics for generalized linear mixed models:
Applications to clustered data, Comput. Stat. Data Anal. 40 (2002), pp. 759–774.
[43] L. Xu, S.Y. Lee, and W.Y. Poon, Deletion measures for generalized linear mixed effects models,
Comput. Stat. Data Anal. 51 (2006), pp. 1131–1146.
[44] H. Zhu, J.G. Ibrahim, S. Lee, and H. Zhang, Perturbation selection and influence measures in
local influence analysis, Ann. Statist. 35 (2007), pp. 2565–2588.
[45] H.-T. Zhu and S.-Y. Lee, Local influence for incomplete-data models, J. R. Stat. Soc. B 63 (2001),
pp. 111–126.
[46] H.-T. Zhu and S.-Y. Lee, Local influence for generalized linear mixed models, Canad. J. Statist.
31 (2003), pp. 293–309.

Appendix 1. Background
Consider the count responses $Y_{ij}$, for $j = 1, \ldots, n_i$, $i = 1, \ldots, I$, and a $p_2 \times 1$ random vector $b_i$
with distribution $N_{p_2}(0_{p_2 \times 1}, \Sigma)$, where $0_{p_2 \times 1}$ is a $p_2 \times 1$ vector of zeros and the $p_2 \times p_2$
variance-covariance matrix $\Sigma$ (of rank $p_2$) depends on a $p_3 \times 1$ vector $\gamma$ of unknown variance components.
Assume that the conditional distribution of $Y_{ij}$ given $b_i$ belongs to the exponential family with probability
mass function given by
\[
p_{Y_{ij} \mid b_i}(y_{ij}) = \exp\big(\phi(y_{ij}\theta_{ij} - d(\theta_{ij})) + c(y_{ij}, \phi)\big), \quad j = 1, \ldots, n_i, \; i = 1, \ldots, I, \tag{A1}
\]
where $\theta_{ij}$ is the canonical parameter, $d$ and $c$ are known functions and $\phi$ is the corresponding precision
parameter. Under the usual regularity conditions, $\mathrm{E}(Y_{ij} \mid b_i) = \mu_{ij} = \dot{d}(\theta_{ij})$ and
$\mathrm{Var}(Y_{ij} \mid b_i) = \ddot{d}(\theta_{ij})/\phi$, where $\dot{d}(\theta_{ij}) = \partial d(\theta_{ij})/\partial\theta_{ij}$ and
$\ddot{d}(\theta_{ij}) = \partial^2 d(\theta_{ij})/\partial\theta_{ij}^2$. Thus, GLMM are defined from (A1)
with their systematic component expressed as
\[
g(\mu_{ij}) = \eta_{ij}, \quad k(\eta_{ij}) = \theta_{ij}, \quad \eta_{ij} = x_{ij}^\top \beta + z_{ij}^\top b_i, \quad j = 1, \ldots, n_i, \; i = 1, \ldots, I, \tag{A2}
\]
where $x_{ij} = (x_{ij1}, \ldots, x_{ijp_1})^\top$ and $z_{ij} = (z_{ij1}, \ldots, z_{ijp_2})^\top$ are $p_1 \times 1$ and $p_2 \times 1$ vectors, respectively,
with the values of covariates, and $\beta$ is a $p_1 \times 1$ vector of unknown regression coefficients. Also, $g$
and $k$ are known continuous differentiable functions, such that $k(u) = \dot{d}^{-1}(g^{-1}(u))$, where $\dot{d}^{-1}$ and
$g^{-1}$ are the inverse functions of $\dot{d}$ and $g$, respectively.
Let $\psi = (\phi, \beta^\top, \gamma^\top)^\top$ be an $M \times 1$ unknown parameter vector to be estimated in a GLMM,
with $M = 1 + p_1 + p_3$, and $y_o$ the observed data set. Then, the log-likelihood function for $\psi$ based
on the observed data is given by
\[
\ell(\psi \mid y_o) = \sum_{i=1}^{I} \log\left( \int_{\mathbb{R}^{p_2}} \prod_{j=1}^{n_i} \exp\big(\phi(y_{ij}\theta_{ij} - d(\theta_{ij})) + c(y_{ij}, \phi)\big)
\times \frac{1}{(2\pi)^{p_2/2}} \det(\Sigma)^{-1/2} \exp\left(-\frac{1}{2} b_i^\top \Sigma^{-1} b_i\right) \mathrm{d}b_i \right), \tag{A3}
\]
whose integrals are intractable mathematically, as mentioned in the introduction, so that approximations
are necessary to obtain the corresponding ML estimates.
If the count responses $Y_{ij}$ follow the Poisson distribution, then $Y_{ij}$ given $b_i$ has a probability mass
function defined as
\[
p_{Y_{ij} \mid b_i}(y_{ij}) = \exp\big(y_{ij}\log(\mu_{ij}) - \mu_{ij} - \log(y_{ij}!)\big), \quad j = 1, \ldots, n_i, \; i = 1, \ldots, I, \tag{A4}
\]
with mean and variance given by $\mathrm{E}(Y_{ij} \mid b_i) = \mathrm{Var}(Y_{ij} \mid b_i) = \mu_{ij}$. Thus, the Poisson mixed model is
defined from (A4) with systematic component given by
\[
\log(\mu_{ij}) = \eta_{ij}, \quad \eta_{ij} = x_{ij}^\top \beta + z_{ij}^\top b_i, \quad j = 1, \ldots, n_i, \; i = 1, \ldots, I.
\]
As mentioned, for fixed values of the precision parameter $\phi$, the negative binomial distribution is
a member of the exponential family. Therefore, we are still within the framework of GLMM. If the
count responses $Y_{ij}$, with $j = 1, \ldots, n_i$ and $i = 1, \ldots, I$, follow a negative binomial distribution, then
$Y_{ij}$ given $b_i$ has a probability mass function defined as
\[
p_{Y_{ij} \mid b_i}(y_{ij}) = \exp\left( \log\left(\frac{\Gamma(y_{ij} + \phi)}{\Gamma(\phi)\, y_{ij}!}\right) + y_{ij}\log\left(\frac{\mu_{ij}}{\mu_{ij} + \phi}\right) + \phi\log\left(\frac{\phi}{\mu_{ij} + \phi}\right) \right), \tag{A5}
\]
with mean and variance given by $\mathrm{E}(Y_{ij} \mid b_i) = \mu_{ij}$ and $\mathrm{Var}(Y_{ij} \mid b_i) = \mu_{ij} + \mu_{ij}^2/\phi$, respectively. Thus,
the negative binomial mixed model is defined from (A5) with systematic component given by
\[
\log(\mu_{ij}) = \eta_{ij}, \quad \eta_{ij} = x_{ij}^\top \beta + z_{ij}^\top b_i, \quad j = 1, \ldots, n_i, \; i = 1, \ldots, I.
\]
Recall that $1/\phi$ quantifies the amount of overdispersion. Then, when $1/\phi \to 0$, we are in the
presence of a Poisson mixed model.
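The stated moments and the Poisson limit can be checked numerically. The following Python sketch, which uses only the standard library, evaluates the negative binomial log-probability in the mean and precision parameterization with mean μ and variance μ + μ²/φ, and compares it with the Poisson log-probability for a large φ. The function names and the numerical values μ = 2 and φ = 1.5 are illustrative assumptions.

```python
import math

def nb_log_pmf(y, mu, phi):
    """Negative binomial log-probability with mean mu and precision phi."""
    return (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
            + y * math.log(mu / (mu + phi))
            + phi * math.log(phi / (mu + phi)))

def poisson_log_pmf(y, mu):
    """Poisson log-probability with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def moments(log_pmf, upper=2000):
    """Total mass, mean and variance by direct summation over 0..upper-1."""
    probs = [math.exp(log_pmf(y)) for y in range(upper)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

mu, phi = 2.0, 1.5  # illustrative values
total, mean, var = moments(lambda y: nb_log_pmf(y, mu, phi))

# As 1/phi -> 0, the negative binomial log-probability approaches the Poisson one.
gap = abs(nb_log_pmf(3, mu, 1e6) - poisson_log_pmf(3, mu))
```

Working on the log scale with `math.lgamma` avoids overflow of the gamma function for large counts.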

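Appendix 2. Approximations for −Q̈ψ (ψ̂) and Δω0

Since the regularity conditions allow the exchange of order between integration and differentiation,
the matrices given by (5) can be expressed as
\[
-\ddot{Q}_{\psi}(\hat{\psi}) = \mathrm{E}\left[ -\frac{\partial^2 \ell(\psi; y_c)}{\partial\psi\,\partial\psi^\top} \right]_{\psi=\hat{\psi}}, \qquad
\Delta_{\omega_0} = \mathrm{E}\left[ \frac{\partial^2 \ell(\psi, \omega; y_c)}{\partial\psi\,\partial\omega^\top} \right]_{\psi=\hat{\psi},\,\omega=\omega_0}, \tag{A6}
\]
respectively. However, the conditional expectation in the blocks of (A6) cannot be calculated
in closed form. Therefore, Zhu and Lee [46] used the classical MC method of integration to
solve this intractability in the following way. Let $\{y_u^{(s)} : s = 1, \ldots, S\}$ be a random sample generated
from $p_{Y_u \mid Y_c = y_c}(y_u; \hat{\psi})$. Then,
\[
-\ddot{Q}_{\psi}(\hat{\psi}) \approx \frac{1}{M_0 - S} \sum_{s=M_0+1}^{S} \left. \frac{\partial^2 \ell(\psi; y_c, y_u^{(s)})}{\partial\psi\,\partial\psi^\top} \right|_{\psi=\hat{\psi}}, \qquad
\Delta_{\omega_0} \approx \frac{1}{S - M_0} \sum_{s=M_0+1}^{S} \left. \frac{\partial^2 \ell(\psi, \omega; y_c, y_u^{(s)})}{\partial\psi\,\partial\omega^\top} \right|_{\psi=\hat{\psi},\,\omega=\omega_0},
\]
where $M_0$ is the number of initial observations (for example, the first 1000) that are excluded from the averages.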
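The averaging step can be sketched in a few lines of Python. The helper below is a generic illustration, not the authors' implementation: `mc_average` and `matrix_fn` are hypothetical names, and `matrix_fn` stands for whatever function returns the matrix of interest (for instance, a negative Hessian block) evaluated at one simulated draw. Averaging the negative Hessian over the retained draws is the same as dividing the sum of Hessians by M0 − S.

```python
def mc_average(draws, burn_in, matrix_fn):
    """Average matrix_fn(draw) over the draws s = burn_in + 1, ..., S."""
    kept = draws[burn_in:]
    if not kept:
        raise ValueError("no draws left after discarding the initial ones")
    total = None
    for draw in kept:
        matrix = matrix_fn(draw)
        if total is None:
            total = [row[:] for row in matrix]  # copy the first matrix
        else:
            for r, row in enumerate(matrix):
                for c, value in enumerate(row):
                    total[r][c] += value
    return [[value / len(kept) for value in row] for row in total]

# Toy illustration: 1 x 1 "matrix" equal to the draw itself.
draws = [10.0, 10.0, 1.0, 3.0]             # the first two play the role of M0 = 2
estimate = mc_average(draws, 2, lambda b: [[b]])
```

Only the draws after the first M0 contribute, so early values that may still depend on the starting point do not affect the estimate.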

Appendix 3. Conditional probability mass function pbi | Y c


Let
\[
p_{b_i \mid Y_c = y_c}(b_i; \psi) \propto \exp\left( -\frac{1}{2} b_i^\top \Sigma^{-1} b_i + \phi \sum_{j=1}^{n_i} \Big[ y_{ij}\, k(x_{ij}^\top \beta + z_{ij}^\top b_i) - d\big(k(x_{ij}^\top \beta + z_{ij}^\top b_i)\big) \Big] \right).
\]
Then, generating the observations $\{y_u^{(s)} : s = 1, \ldots, S\}$ is not trivial. Zhu and Lee [46] used the
Metropolis-Hastings algorithm to do it as described below.
The algorithm is initialized from an arbitrary value $b_i^{(0)}$. Then, in the $r$th iteration, follow the
steps:

(1) Given the current value $b_i^{(r-1)}$, generate a new candidate as $b_i \sim N_{p_2}(b_i^{(r-1)}, \Omega_i(0))$, where
\[
\Omega_i(0) = \Omega_i(b_i)\big|_{b_i=0} = \left. \left( \Sigma^{-1} + \phi \sum_{j=1}^{n_i} \ddot{d}(k(\eta_{ij}))\, \dot{k}^2(\eta_{ij})\, z_{ij} z_{ij}^\top \right)^{-1} \right|_{b_i=0},
\]
with the notation $\dot{k}$ being analogous to $\dot{d}$ defined in (A1).

(2) Obtain $u$ from $U \sim \mathrm{U}(0, 1)$. If $u \le \varphi(b_i^{(r-1)}, b_i)$, then $b_i^{(r)} = b_i$. Otherwise, consider
$b_i^{(r)} = b_i^{(r-1)}$, where
\[
\varphi(b_i^{(r-1)}, b_i) = \min\left\{ \frac{p_{b_i \mid Y_c = y_c}(b_i; \psi)}{p_{b_i \mid Y_c = y_c}(b_i^{(r-1)}; \psi)},\, 1 \right\}
\]
is the probability of accepting a new candidate.

(3) Repeat steps 1 and 2 until obtaining the values requested.

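For the Poisson mixed model, we have
\[
p_{b_i \mid Y_c = y_c}(b_i; \psi) \propto \exp\left( -\frac{1}{2} b_i^\top \Sigma^{-1} b_i + \sum_{j=1}^{n_i} \Big[ y_{ij}(x_{ij}^\top \beta + z_{ij}^\top b_i) - \exp(x_{ij}^\top \beta + z_{ij}^\top b_i) \Big] \right)
\]
and
\[
\Omega_i(0) = \Omega_i(b_i)\big|_{b_i=0} = \left. \left( \Sigma^{-1} + \sum_{j=1}^{n_i} \exp(x_{ij}^\top \beta + z_{ij}^\top b_i)\, z_{ij} z_{ij}^\top \right)^{-1} \right|_{b_i=0}.
\]
For the negative binomial mixed model, we get
\[
p_{b_i \mid Y_c = y_c}(b_i; \psi) \propto \exp\left( -\frac{1}{2} b_i^\top \Sigma^{-1} b_i + \sum_{j=1}^{n_i} \left[ y_{ij}\log\left(\frac{\exp(x_{ij}^\top \beta + z_{ij}^\top b_i)}{\exp(x_{ij}^\top \beta + z_{ij}^\top b_i) + \phi}\right) + \phi\log\left(\frac{\phi}{\exp(x_{ij}^\top \beta + z_{ij}^\top b_i) + \phi}\right) \right] \right)
\]
and
\[
\Omega_i(0) = \Omega_i(b_i)\big|_{b_i=0} = \left. \left( \Sigma^{-1} + \sum_{j=1}^{n_i} \frac{\phi \exp(x_{ij}^\top \beta + z_{ij}^\top b_i)}{\exp(x_{ij}^\top \beta + z_{ij}^\top b_i) + \phi}\, z_{ij} z_{ij}^\top \right)^{-1} \right|_{b_i=0}.
\]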
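The steps of this appendix can be sketched in Python for the simplest case of a scalar random intercept (p2 = 1 and zij = 1). This is a minimal illustration under stated assumptions and not the authors' code: all function names are hypothetical, `eta_fixed` holds the fixed-effect part of the linear predictor for one individual, `sigma2` is the variance of the random intercept, and the proposal variance is the expression of this appendix evaluated at bi = 0.

```python
import math
import random

def log_target_poisson(b, y, eta_fixed, sigma2):
    """Log conditional density of b_i up to a constant (Poisson case)."""
    total = -0.5 * b * b / sigma2
    for yj, e in zip(y, eta_fixed):
        total += yj * (e + b) - math.exp(e + b)
    return total

def log_target_negbin(b, y, eta_fixed, sigma2, phi):
    """Same quantity for the negative binomial case with precision phi."""
    total = -0.5 * b * b / sigma2
    for yj, e in zip(y, eta_fixed):
        mu = math.exp(e + b)
        total += yj * math.log(mu / (mu + phi)) + phi * math.log(phi / (mu + phi))
    return total

def proposal_variance(eta_fixed, sigma2, phi=None):
    """Proposal variance at b_i = 0; phi=None selects the Poisson formula."""
    if phi is None:
        curvature = sum(math.exp(e) for e in eta_fixed)
    else:
        curvature = sum(phi * math.exp(e) / (math.exp(e) + phi) for e in eta_fixed)
    return 1.0 / (1.0 / sigma2 + curvature)

def metropolis_hastings(log_target, b0, prop_var, n_draws, rng):
    """Steps (1)-(3): normal proposal centred at the current value."""
    draws, b, lp = [], b0, log_target(b0)
    sd = math.sqrt(prop_var)
    for _ in range(n_draws):
        candidate = rng.gauss(b, sd)                      # step (1)
        lp_candidate = log_target(candidate)
        accept = math.exp(min(0.0, lp_candidate - lp))    # min(ratio, 1)
        if rng.random() <= accept:                        # step (2)
            b, lp = candidate, lp_candidate
        draws.append(b)
    return draws

# Hypothetical individual: four zero counts, x_ij' beta = 0 and sigma^2 = 1.
y, eta_fixed, sigma2 = [0, 0, 0, 0], [0.0, 0.0, 0.0, 0.0], 1.0
rng = random.Random(2018)
draws = metropolis_hastings(
    lambda b: log_target_poisson(b, y, eta_fixed, sigma2),
    b0=0.0, prop_var=proposal_variance(eta_fixed, sigma2), n_draws=6000, rng=rng)
kept = draws[1000:]  # discard the initial draws
posterior_mean = sum(kept) / len(kept)
```

The acceptance ratio is computed on the log scale to avoid underflow. Because the normal proposal is symmetric, only the ratio of target densities appears in the acceptance probability.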

Appendix 4. Approximation for G(ω0 )


The expectation involved in the blocks of $G(\omega_0)$ cannot be calculated in closed form. Therefore,
Chen et al. [5] used the classical MC method of integration as follows. Generate a random sample
$\{b_i^{(t)}; t = 1, \ldots, T\}$ from a normal distribution with zero mean vector and variance-covariance matrix $\Sigma$.
Then, approximate the elements of $G(\omega_0)$ by
\[
g_{ij}(\omega_0) \approx \frac{1}{T} \sum_{t=1}^{T} \left. \frac{\partial \log\big(p_{Y_c, b}(y_c, b_i^{(t)}; \psi, \omega)\big)}{\partial\omega_i}\, \frac{\partial \log\big(p_{Y_c, b}(y_c, b_i^{(t)}; \psi, \omega)\big)}{\partial\omega_j} \right|_{\psi=\hat{\psi},\,\omega=\omega_0}.
\]
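This approximation is an average of outer products of score vectors with respect to the perturbation. A generic Python sketch follows. The names `estimate_G`, `score_fn` and `sample_b` are hypothetical, and the toy score used in the example (yj − μj, which arises from a simple multiplicative perturbation ωj μj of a Poisson mean evaluated at ωj = 1) is an assumption chosen for illustration; it is not the perturbation scheme selected in the paper.

```python
import math
import random

def estimate_G(score_fn, sample_b, T, rng):
    """Monte Carlo average of score * score' over T simulated random effects."""
    G = None
    for _ in range(T):
        score = score_fn(sample_b(rng))
        if G is None:
            G = [[0.0] * len(score) for _ in score]
        for i, si in enumerate(score):
            for j, sj in enumerate(score):
                G[i][j] += si * sj
    return [[value / T for value in row] for row in G]

# Toy example: two responses of one individual with a random intercept b.
y = [1, 3]                    # hypothetical counts
eta_fixed = [0.0, 0.5]        # hypothetical values of x_ij' beta
sigma = 0.7                   # hypothetical standard deviation of b

def toy_score(b):
    # d/d(omega_j) of y_j*log(omega_j*mu_j) - omega_j*mu_j at omega_j = 1
    return [yj - math.exp(e + b) for yj, e in zip(y, eta_fixed)]

G_hat = estimate_G(toy_score, lambda rng: rng.gauss(0.0, sigma), 2000, random.Random(1))
```

By construction the estimate is symmetric with non-negative diagonal elements, as required of a metric matrix for the perturbation.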

Appendix 5. Derivatives related to −Q̈ψ (ψ̂)


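For the Poisson mixed model, we have
\[
\begin{aligned}
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\beta\,\partial\beta^\top} &= \sum_{i=1}^{I} \sum_{j=1}^{n_i} \mu_{ij}\, x_{ij} x_{ij}^\top, \\
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\beta\,\partial\gamma^\top} &= 0_{p_1 \times p_3}, \\
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\gamma\,\partial\beta^\top} &= 0_{p_3 \times p_1}, \\
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\gamma\,\partial\gamma^\top} &= \left[ -\frac{I}{2}\left(\Sigma^{-1} \otimes \Sigma^{-1}\right) + \sum_{i=1}^{I} \Sigma^{-1} b_i b_i^\top \Sigma^{-1} \otimes \Sigma^{-1} \right]_{p_3 \times p_3}.
\end{aligned}
\]
For the negative binomial mixed model, we get
\[
\begin{aligned}
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\beta\,\partial\beta^\top} &= \sum_{i=1}^{I} \sum_{j=1}^{n_i} \frac{\phi(\phi + y_{ij})\mu_{ij}}{(\mu_{ij} + \phi)^2}\, x_{ij} x_{ij}^\top, \\
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\beta\,\partial\gamma^\top} &= 0_{p_1 \times p_3}, \\
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\gamma\,\partial\beta^\top} &= 0_{p_3 \times p_1}, \\
-\frac{\partial^2 \ell(\psi; y_c)}{\partial\gamma\,\partial\gamma^\top} &= \left[ -\frac{I}{2}\left(\Sigma^{-1} \otimes \Sigma^{-1}\right) + \sum_{i=1}^{I} \Sigma^{-1} b_i b_i^\top \Sigma^{-1} \otimes \Sigma^{-1} \right]_{p_3 \times p_3}.
\end{aligned}
\]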
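The weights multiplying the outer products of the covariate vectors in the β blocks, namely μij for the Poisson model and φ(φ + yij)μij/(μij + φ)² for the negative binomial model, can be verified numerically. The Python sketch below compares each closed form with a central finite difference of the per-observation log-likelihood with respect to the linear predictor ηij. The function names and the values η = 0.3, y = 4 and φ = 2.5 are illustrative assumptions.

```python
import math

def poisson_loglik(eta, y):
    """Per-observation Poisson log-likelihood in eta, dropping constants."""
    return y * eta - math.exp(eta)

def negbin_loglik(eta, y, phi):
    """Per-observation negative binomial log-likelihood in eta, dropping constants."""
    mu = math.exp(eta)
    return y * math.log(mu / (mu + phi)) + phi * math.log(phi / (mu + phi))

def neg_second_derivative(f, eta, h=1e-4):
    """Central finite-difference approximation of -f''(eta)."""
    return -(f(eta + h) - 2.0 * f(eta) + f(eta - h)) / (h * h)

def poisson_weight(mu):
    return mu

def negbin_weight(mu, y, phi):
    return phi * (phi + y) * mu / (mu + phi) ** 2

eta, y, phi = 0.3, 4, 2.5  # illustrative values
mu = math.exp(eta)

poisson_numeric = neg_second_derivative(lambda e: poisson_loglik(e, y), eta)
negbin_numeric = neg_second_derivative(lambda e: negbin_loglik(e, y, phi), eta)

print(poisson_numeric, poisson_weight(mu))
print(negbin_numeric, negbin_weight(mu, y, phi))
```

Since ηij is linear in β, the chain rule turns each scalar weight into the corresponding term of the sum by multiplying it by the outer product of the covariate vector with itself.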

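Appendix 6. Derivatives related to Δω0

For the Poisson mixed model, we have
\[
\frac{\partial^2 \ell(\psi, \tilde{\omega}; y_c)}{\partial\beta\,\partial\tilde{\omega}_{ij}} = -2\, g_{ij}(\omega_0)^{-1/2} \mu_{ij}\, x_{ij}, \qquad
\frac{\partial^2 \ell(\psi, \tilde{\omega}; y_c)}{\partial\gamma\,\partial\tilde{\omega}_{ij}} = 0_{p_3 \times q}.
\]
For the negative binomial mixed model, we get
\[
\frac{\partial^2 \ell(\psi, \tilde{\omega}; y_c)}{\partial\beta\,\partial\tilde{\omega}_{ij}} = -\frac{2\, g_{ij}(\omega_0)^{-1/2} \phi(\phi + y_{ij})\mu_{ij}\, x_{ij}}{(\mu_{ij} + \phi)^2}, \qquad
\frac{\partial^2 \ell(\psi, \tilde{\omega}; y_c)}{\partial\gamma\,\partial\tilde{\omega}_{ij}} = 0_{p_3 \times q}.
\]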
