DATA ANALYSIS USING
HIERARCHICAL GENERALIZED
LINEAR MODELS WITH R
Youngjo Lee
Lars Rönnegård
Maengseok Noh
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted
a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
List of notations
Preface
1 Introduction
1.1 Motivating examples
1.2 Regarding criticisms of the h-likelihood
1.3 R code
1.4 Exercises
3.4 Extended likelihood principle
3.5 Laplace approximations for the integrals
3.6 Street magician
3.7 H-likelihood and empirical Bayes
3.8 Exercises
5 HGLMs modeling in R
5.1 Examples
5.2 R code
5.3 Exercises
References
Symbol Description
y Response vector
X Model matrix for fixed effects
Z Model matrix for random effects
β Fixed effects
v Random effect on canonical scale
u Random effect on the original scale
n Number of observations
m Number of levels in the random effect
h(·) Hierarchical log-likelihood
L(·; ·) Likelihood with notation parameters; data
fθ (y) Density function for y having parameters θ
θ A generic parameter indicating any fixed effect to be estimated
φ Dispersion component for the mean model
λ Dispersion component for the random effects
g(·) Link function for the linear predictor
r(·) Link function for random effects
η Linear predictor in a GLM
µ Expectation of y
s Linearized working response in IWLS
V Marginal variance matrix used in linear mixed models
V (·) GLM variance function
I(.) Information matrix
pd Estimated number of parameters
T Transpose
δ Augmented effect vector
γ Regression coefficient for dispersion
Preface
follow different distributions, which can also fit factor and structural
equation models. The frailtyHL package is used for survival analysis us-
ing frailty models, which is an extension of Cox’s proportional hazards
model to allow random effects. The jointdhglm package allows joint mod-
els for HGLMs and survival time and competing risk models. In Chapter
10, we introduce variable selection methods via random-effect models.
In the same chapter we also study random-effect models with discrete
random effects and show that hypothesis testing can be expressed in
terms of prediction of discrete random effects (e.g., null or alternative),
and how the h-likelihood gives a general extension of the likelihood-ratio
test to multiple testing. Model-checking techniques and model-selection
tools based on the h-likelihood modeling approach add further insight to
the data analysis.
It is an advantage to have studied linear mixed models and GLMs before
reading this book. Nevertheless, GLMs are briefly introduced in Chapter
2 together with a short review of GLM theory, for a reader who wishes
to freshen up on the topic. The majority of data sets used in the book
are available at URL
http://cran.r-project.org/package=mdhglm
for the R package mdhglm (Lee et al., 2016b). Several different examples
are presented throughout the book, while the longitudinal epilepsy data
presented by Thall and Vail (1990) is an example dataset used recur-
rently throughout the book from Chapter 1 to Chapter 6 allowing the
reader to follow the model development from a basic GLM to a more
advanced DHGLM.
We are grateful to Prof. Emmanuel Lesaffre, Dr. Smart Sarpong, Dr.
Ildo Ha, Dr. Moudud Alam, Dr. Xia Shen, Mr. Jengseop Han, Mr. Dae-
han Kim, Mr. Hyunseong Park and the late Dr. Marek Molas for their
numerous useful comments and suggestions.
Introduction
iii) we can use model-checking tools for linear regression and generalized
linear models (GLMs), making assumptions in all parts of an HGLM
checkable.
The marginal likelihood is used for inference on the fixed effects in both
the classical frequentist and h-likelihood approaches, but the marginal
likelihood involves multiple integration over the random effects, which is
most often not feasible to compute. For such cases the adjusted profile
h-likelihood, a Laplace approximation of the marginal likelihood, is used
in the h-likelihood approach. Because the random effects are integrated
out in a marginal likelihood, the classical frequentist method does not
allow any direct inference on the random effects.
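The idea of replacing the integral by a Laplace approximation can be illustrated on a toy example. The following is a minimal sketch, not the adjusted profile h-likelihood as implemented in any package: it assumes a single cluster with Poisson counts and one Gaussian random intercept, and the names h_cluster, find_mode, laplace_loglik, and exact_loglik, as well as the data values, are hypothetical.

```r
# Hierarchical log-likelihood for one cluster (illustrative assumption):
# h(v) = log f(y | v) + log f(v), with y_j | v ~ Poisson(exp(b0 + v))
# and v ~ N(0, lambda_v).
h_cluster <- function(v, y, b0, lambda_v) {
  sapply(v, function(vi) {
    sum(dpois(y, lambda = exp(b0 + vi), log = TRUE)) +
      dnorm(vi, mean = 0, sd = sqrt(lambda_v), log = TRUE)
  })
}

# Mode of h in v, found by one-dimensional optimization
find_mode <- function(y, b0, lambda_v) {
  optimize(h_cluster, interval = c(-10, 10), y = y, b0 = b0,
           lambda_v = lambda_v, maximum = TRUE)$maximum
}

# First-order Laplace approximation of the marginal log-likelihood
laplace_loglik <- function(y, b0, lambda_v) {
  v_hat <- find_mode(y, b0, lambda_v)
  # Negative second derivative of h with respect to v at the mode
  neg_hess <- length(y) * exp(b0 + v_hat) + 1 / lambda_v
  h_cluster(v_hat, y, b0, lambda_v) - 0.5 * log(neg_hess / (2 * pi))
}

# Marginal log-likelihood by numerical integration over a wide finite range;
# the integrand is rescaled by its maximum for numerical stability
exact_loglik <- function(y, b0, lambda_v) {
  h_max <- h_cluster(find_mode(y, b0, lambda_v), y, b0, lambda_v)
  scaled <- function(v) exp(h_cluster(v, y, b0, lambda_v) - h_max)
  h_max + log(integrate(scaled, lower = -10, upper = 10)$value)
}

y <- c(3, 5, 4, 6)  # made-up counts for one cluster
print(c(laplace = laplace_loglik(y, b0 = 1, lambda_v = 0.5),
        exact   = exact_loglik(y, b0 = 1, lambda_v = 0.5)))
```

With one random effect the integral is easy to compute directly, so the two values can be compared; with many correlated random effects only the approximation remains practical.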
Bayesians assume priors for the parameters, and for inference they often
rely on Markov chain Monte Carlo (MCMC) computations (Lesaffre and
Lawson, 2012). The h-likelihood allows complex models to be fitted by
maximizing likelihoods for fixed unknown parameters. So for a person
who does not wish to express prior beliefs, there are both philosophical
and computational advantages of using HGLMs. However, this book does
not focus on the philosophical advantages of using the h-likelihood but
rather on the practical advantages for applied users who have a reasonable
statistical background in linear models and GLMs and wish to enhance
their data analysis skills for more general types of data.
In this chapter we introduce a few examples to show the strength of
the h-likelihood approach. Table 1.1 lists the classes of models, the
chapters where they are first introduced, and the available R packages.
Various packages have been developed to cover these model classes. From
Table 1.1, we see that the dhglm package has been developed to cover a
wide class of models, from GLMs to DHGLMs. Detailed descriptions of
the dhglm package are presented in Chapter 7, where the full structure of
the model classes is described.
Figure 1.1 shows the evolution of the model classes presented in this
book together with their acronyms. Figure 1.1 also shows the
building-block structure of the h-likelihood; once you have captured the
ideas at one level you can go to a deeper level of modeling. This book
aims to show how complicated statistical model classes can be built by
combining interconnected GLMs and augmented GLMs, and how inference
can be made in a single framework of the h-likelihood. For readers who
want a more detailed description of theories and algorithms on HGLMs
and on survival analysis, we suggest the monographs by Lee, Nelder, and
Pawitan (2017) and Ha, Jeong, and Lee (2017). This book shows how to
analyze examples from these two books using available R packages and
also presents new examples.
[Figure 1.1: diagram of the model classes, including the generalized linear model (GLM), joint GLM (Ch 4), generalized linear mixed model (GLMM, Ch 4-5), factor analysis (Ch 7), multiple testing (Ch 10), DHGLM, and multivariate DHGLM (MDHGLM, Ch 7).]
Table 1.1 Model classes presented in the book including chapter number and available R packages

Model class                                        R package       Developer                     Chapter
GLM                                                glm() function                                2
  Nelder and Wedderburn (1972)                     dhglm           Lee and Noh (2016)
Joint GLM (Nelder and Lee, 1991)                   dhglm           Lee and Noh (2016)            4
GLMM                                               dhglm           Lee and Noh (2016)            4, 5
  Breslow and Clayton (1993)                       lme4            Bates and Maechler (2009)
                                                   hglm            Alam et al. (2015)
HGLM                                               dhglm           Lee and Noh (2016)            2, 3, 4, 5
  Lee and Nelder (1996)                            hglm            Alam et al. (2015)
Spatial HGLM                                       dhglm           Lee and Noh (2016)            5
  Lee and Nelder (2001b)                           spaMM           Rousset et al. (2016)
DHGLM (Lee and Nelder, 2006; Noh and Lee, 2017)    dhglm           Lee and Noh (2016)            6
Multivariate DHGLM                                 mdhglm          Lee, Molas, and Noh (2016b)   7
  Lee, Molas, and Noh (2016a)                      mixAK           Komarek (2015)
  Lee, Nelder, and Pawitan (2017)                  mmm             Asar and Ilk (2014)
Frailty HGLM                                       frailtyHL       Ha et al. (2012)              8
  Ha, Lee, and Song (2001)                         coxme           Therneau (2015)
                                                   survival        Therneau and Lumley (2015)
Joint DHGLM                                        jointdhglm      Ha, Lee, and Noh (2015)       8
  Henderson et al. (2000)                          JM              Rizopoulos (2015)
1.1 Motivating examples
Thall and Vail (1990) presented longitudinal data from a clinical trial of
59 epileptics, who were randomized to a new drug or a placebo (T=1 or
T=0). Baseline data were available at the start of the trial; the
covariates included the logarithm of the average number of epileptic
seizures recorded in the 8-week period preceding the trial (B), the
logarithm of age (A), and the clinic visit (V: a linear trend, coded
(-3,-1,1,3)). The multivariate response variable (y) consists of the
seizure counts during the 2-week periods before each of four visits to
the clinic.
The data can be retrieved from the R package dhglm (see R code at the
end of the chapter). It is a good idea at this stage to have a look at
the data and get acquainted with them. The boxplot of the number of
seizures (Figure 1.2) shows no clear difference between the two
treatment groups. In Figure 1.3 the number of seizures per visit is
plotted for each patient; the lines in this spaghetti plot indicate
longitudinal patient effects. Investigate the data further before running
the models below.
A simple first preliminary analysis could be to analyze the data (ignoring
that there are repeated measurements on each patient) with a GLM
having a Poisson distributed response using the R function glm. Let $y_{ij}$
be the corresponding response variable for patient $i\,(= 1, \dots, 59)$ and
visit $j\,(= 1, \dots, 4)$. We consider a Poisson GLM with log-link function
modeled as
\[ \log(\mu_{ij}) = \beta_0 + x_{B_i}\beta_B + x_{T_i}\beta_T + x_{A_i}\beta_A + x_{V_j}\beta_V + x_{B_i T_i}\beta_{BT}, \tag{1.1} \]
where $\beta_0$, $\beta_B$, $\beta_T$, $\beta_A$, $\beta_V$, and $\beta_{BT}$ are fixed effects for the intercept,
[Figure 1.2: x-axis: Treatment; y-axis: log(Seizure counts + 1)]
Figure 1.2 Boxplot of the logarithm of seizure counts (new drug = 1, placebo
= 0).
Call:
glm(formula=y~B+T+A+B:T+V,family=poisson(link=log),
data=epilepsy)
[Figure 1.3: x-axis: Visit; y-axis: log(Seizure counts + 1)]
Figure 1.3 Number of seizures per visit for each patient. There are 59 patients
and each line shows the logarithm of seizure counts + 1 for each patient.
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-5.0677  -1.4468  -0.2655   0.8164  11.1387
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.79763    0.40729  -6.869 6.47e-12 ***
B            0.94952    0.04356  21.797  < 2e-16 ***
T           -1.34112    0.15674  -8.556  < 2e-16 ***
A            0.89705    0.11644   7.704 1.32e-14 ***
V           -0.02936    0.01014  -2.895  0.00379 **
B:T          0.56223    0.06350   8.855  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Random effects:
 Groups  Name        Variance Std.Dev.
 patient (Intercept) 0.2515   0.5015
Number of obs: 236, groups:  patient, 59
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.37817    1.17746  -1.170  0.24181
B            0.88442    0.13075   6.764 1.34e-11 ***
T           -0.93291    0.39883  -2.339  0.01933 *
A            0.48450    0.34584   1.401  0.16123
V           -0.02936    0.01009  -2.910  0.00362 **
B:T          0.33827    0.20247   1.671  0.09477 .
The output gives a variance for the random patient effect equal to 0.25.
In GLMMs the random effects are always assumed normally distributed.
The same model can be fitted within the h-likelihood framework using
the R package hglm, which gives the output (essential parts of the output
shown):
Call:
hglm2.formula(meanmodel=y~B+T+A+B:T+V+(1|patient),
data=epilepsy,family=poisson(link = log), fix.disp = 1)
----------
MEAN MODEL
----------
----------------
DISPERSION MODEL
----------------
The output gives a variance for the random patient effect equal to 0.27,
similar to that from the lme4 package. In Chapter 4, we compare
similarities and differences between HGLM estimates and those from lme4.
Unlike with the lme4 package, it is also possible to add non-Gaussian
random effects to further model the over-dispersion. With
marginal-likelihood inference, however, subject-specific inferences cannot
be made. For subject-specific inferences, we need to estimate the random
patient effects $v_i$ via the h-likelihood.
With the dhglm package it is possible to allow different distributions for
different random effects. For example, in the previous GLMM, a gamma
distributed random effect can be included for each observation, which
gives a conditional distribution of the response that can be shown to
be negative binomial, while patient effects are modeled as normally
distributed:
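\[ E(y_{ij} \mid v_i, v_{ij}) = \mu_{ij} \quad \text{and} \quad \mathrm{var}(y_{ij} \mid v_i, v_{ij}) = \mu_{ij}, \]
with log-link function modeled as
\[ \log(\mu_{ij}) = \beta_0 + x_{B_i}\beta_B + x_{T_i}\beta_T + x_{A_i}\beta_A + x_{V_j}\beta_V + x_{B_i T_i}\beta_{BT} + v_i + v_{ij}, \tag{1.3} \]
where $v_i \sim N(0, \lambda_1)$, $u_{ij} = \exp(v_{ij}) \sim G(\lambda_2)$ and $G(\lambda_2)$ is a gamma
distribution with $E(u_{ij}) = 1$ and $\mathrm{var}(u_{ij}) = \lambda_2$. Then, this model is
equivalent to the negative binomial HGLM such that the conditional
distribution of $y_{ij} \mid v_i$ is the negative binomial distribution with the
probability function
\[ \binom{y_{ij} + 1/\lambda_2 - 1}{1/\lambda_2 - 1} \left( \frac{\lambda_2}{1 + \mu^*_{ij}\lambda_2} \right)^{y_{ij}} \left( \frac{1}{1 + \mu^*_{ij}\lambda_2} \right)^{1/\lambda_2} \mu_{ij}^{*\,y_{ij}}, \]
where $\mu^*_{ij} = \mu_{ij}/u_{ij} = E(y_{ij} \mid v_i)$ is the mean of $y_{ij}$ given the patient effect only.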
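The equivalence between the Poisson model with a gamma random effect and the negative binomial distribution can be checked numerically. The sketch below is only an illustration under assumed values: the function names mix_prob and nb_prob and the numbers used for y, mu_star, and lambda2 are hypothetical, and mu_star stands for the mean of the response given the patient effect.

```r
# Probability of y under the Poisson-gamma mixture, obtained by integrating
# the Poisson probability over u ~ Gamma with mean 1 and variance lambda2
mix_prob <- function(y, mu_star, lambda2) {
  integrand <- function(u) {
    dpois(y, lambda = mu_star * u) *
      dgamma(u, shape = 1 / lambda2, scale = lambda2)
  }
  integrate(integrand, lower = 0, upper = Inf)$value
}

# Closed-form negative binomial probability function.
# choose(y + k - 1, y) equals the binomial coefficient with lower index k - 1.
nb_prob <- function(y, mu_star, lambda2) {
  k <- 1 / lambda2
  choose(y + k - 1, y) *
    (lambda2 / (1 + mu_star * lambda2))^y *
    (1 / (1 + mu_star * lambda2))^k *
    mu_star^y
}

y <- 3; mu_star <- 4.2; lambda2 <- 0.5  # made-up illustrative values
print(c(mixture = mix_prob(y, mu_star, lambda2),
        formula = nb_prob(y, mu_star, lambda2),
        dnbinom = dnbinom(y, size = 1 / lambda2, mu = mu_star)))
```

The three numbers should agree up to the accuracy of the numerical integration; dnbinom with size = 1/lambda2 and mu = mu_star is the base R parametrization of the same distribution.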
Call:
hglm2.formula(meanmodel=y~B+T+A+B:T+V+(1|id)+
(1|patient),data=epilepsy,family=poisson(link=log),
rand.family=list(Gamma(link=log),gaussian()),
fix.disp = 1)
----------
MEAN MODEL
----------
----------------
DISPERSION MODEL
----------------
The variance component for the random patient effect is now 0.24 with
additional saturated random effects picking up some of the over-disper-
sion. By adding gamma saturated random effects we can model over-
dispersion (extra Poisson variation). The example shows how modeling
with GLMMs can be further extended using HGLMs. This can be viewed
as an extension of Poisson GLMMs to negative-binomial GLMMs, where
repeated observations on each patient follow the negative-binomial dis-
tribution rather than the Poisson distribution.
Dispersion modeling is an important, but often challenging task in statis-
tics. Even for simple models without random effects, iterative algorithms
are needed to compute maximum likelihood (ML) estimates. An example
is the heteroscedastic linear model
\[ y \sim N(X\beta, \exp(X_d \beta_d)) \]
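To make the need for iteration concrete, the following sketch fits a heteroscedastic linear model with one covariate by alternating two steps: a weighted least squares fit for the mean and a gamma GLM with log link for the squared residuals. This is one possible scheme that solves the ML score equations at convergence, written under stated assumptions; the function name fit_hetero, the use of the same covariate in both parts, and the simulated data are hypothetical and are not taken from the dhglm package.

```r
# Alternating algorithm for y ~ N(b0 + b1 * x, exp(g0 + g1 * x))
fit_hetero <- function(x, y, max_iter = 100, tol = 1e-6) {
  phi <- rep(var(y), length(y))   # start from a constant variance
  beta_d <- c(0, 0)
  for (iter in seq_len(max_iter)) {
    # Mean model: weighted least squares with weights 1 / variance
    mean_fit <- lm(y ~ x, weights = 1 / phi)
    d <- (y - fitted(mean_fit))^2  # squared raw residuals
    # Dispersion model: gamma GLM with log link for the squared residuals
    disp_fit <- glm(d ~ x, family = Gamma(link = "log"))
    beta_d_new <- coef(disp_fit)
    phi <- fitted(disp_fit)        # updated variances
    converged <- max(abs(beta_d_new - beta_d)) < tol
    beta_d <- beta_d_new
    if (converged) break
  }
  list(beta = coef(mean_fit), beta_d = beta_d, iterations = iter)
}

# Simulated example with known coefficients (assumed values)
set.seed(1)
n <- 2000
x <- runif(n)
y <- 1 + 2 * x + sqrt(exp(-1 + 1.5 * x)) * rnorm(n)
fit <- fit_hetero(x, y)
print(fit$beta)    # estimates for the mean model
print(fit$beta_d)  # estimates for the log-variance model
```

The mean and dispersion parts are each fitted by a standard GLM routine, which is the kind of interconnected structure that the later chapters build on.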
[Figure 1.4: two histograms; x-axis of the first panel: Litter Size]
Figure 1.4 Distributions of observed litter sizes and the number of observations
per sow.
Likelihood is used in both the frequentist and Bayesian worlds, and it has
been the central concept in statistical modeling and inference for almost
a century. However, frequentists using the classical likelihood cannot make
inferences for random unknowns, whereas Bayesians do not make inferences
for fixed unknowns. The h-likelihood aims to allow inferences for both
fixed and random unknowns and could cover both worlds (Figure 1.5).
The concept of the h-likelihood has received criticism since the first
publication of Lee and Nelder (1996). The criticism was partly motivated
by the fact that the theory in Lee and Nelder (1996) was not fully
developed. However, these question marks have been clarified in later
papers by Lee, Nelder and co-workers. One of the main concerns in the
1990s was the similarity with penalized quasi-likelihood (PQL) for GLMMs
of Breslow and Clayton (1993), which has large biases for binary data.
PQL estimation for GLMMs is implemented in, e.g., the R function glmmPQL
in the MASS package. Early h-likelihood methodology was criticized
because of non-ignorable biases in binary data. These biases can be
eliminated by improved approximations of the marginal likelihood through
higher-order Laplace approximations (Lee, Nelder, and Pawitan, 2017).
Meng (2009, 2010) established Bartlett-like identities for h-likelihood.
That is, the score for parameters and unobservables has zero expecta-
tion, and the variance of the score is the expected negative Hessian under
easily verifiable conditions. However, Meng noted difficulties in infer-
ences about unobservables: neither the consistency nor the asymptotic
normality for parameter estimation generally holds for unobservables.
Thus, Meng (2009, 2010) conjectured that an attempt to make proba-
bility statements about unobservables without using a prior would be
in vain. Paik et al. (2015) studied the summarizability of h-likelihood
estimators and Lee and Kim (2016) showed how to make probability
statements about unobservables in general without assuming a prior as
we shall see.
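In symbols, the Bartlett-like identities described above can be written as follows. This is a hedged restatement in notation chosen here for illustration, not Meng's exact formulation: $\ell_e(\theta, v) = \log f_\theta(y, v)$ denotes the extended log-likelihood, $\psi = (\theta, v)$ collects parameters and unobservables, and expectations are taken over the joint distribution of $(y, v)$.

```latex
\[
E\!\left[ \frac{\partial \ell_e}{\partial \psi} \right] = 0,
\qquad
\operatorname{var}\!\left( \frac{\partial \ell_e}{\partial \psi} \right)
= -\,E\!\left[ \frac{\partial^2 \ell_e}{\partial \psi \, \partial \psi^{T}} \right],
\qquad \psi = (\theta, v).
\]
```

For the components in $v$, the identities rely on the joint density and its derivative in $v$ vanishing at the boundary of the support of $v$, which is one example of the kind of condition that has to be verified.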
The h-likelihood approach is a genuine approach based on the extended
likelihood principle. This likelihood principle is a mathematical result, so
there should be no controversy about its validity. However, it does not
tell us how to use the extended likelihood for statistical inference.
Another important question, which has been asked and answered, is: To
which family of models can the joint maximization of the h-likelihood be
applied? This is a well-motivated question since there are numerous
examples where joint maximization of an extended likelihood containing
both fixed effects β and random effects v gives nonsense estimates (see
Lee and Nelder, 2009). Such examples use the extended likelihood for
joint maximization of both β and v. If the h-likelihood h, defined in
Chapter 2, is jointly maximized for estimating both β and v, such
nonsense estimates disappear. However, consistent estimates for the fixed
effects can only be guaranteed for a rather limited class of models,
including linear mixed models and Poisson HGLMs having gamma random
effects on a log scale. Thus, as long as the marginal likelihood is used to
estimate β and the h-likelihood h is maximized to estimate the random
effects, these examples do not give contradictory results.
Lee and Nelder have shown in a series of papers that the iterative
weighted least squares (IWLS) algorithm for GLMs can be extended to a
general class of models including HGLMs. It is computationally efficient
and therefore potentially very useful in statistical applications, allowing
analysis of more and more complex models (Figure 1.1). For linear mixed
models, there is nothing controversial about this algorithm because it can
be shown to give the BLUP. Neither is it controversial for GLMs with
random effects in general, because the adjusted profile h-likelihoods
(defined in Chapter 2) simply are approximations of marginal and
restricted likelihoods for estimating fixed effects and variance
components, and as such
1.3 R code
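# Load the epilepsy data of Thall and Vail (1990) from the dhglm package
library(dhglm)
data(epilepsy)
# Poisson GLM (1.1), ignoring the repeated measurements on each patient
model1 <- glm(y~B+T+A+B:T+V,family=poisson(link=log),
              data=epilepsy)
summary(model1)
# Poisson GLMM with a Gaussian random intercept for each patient
library(lme4)
model2 <- glmer(y~B+T+A+B:T+V+(1|patient),
                family=poisson(link=log),data=epilepsy)
summary(model2)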
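The IWLS algorithm discussed earlier in this chapter can be illustrated for a Poisson GLM with log link, the same model class as model1. The following is a minimal sketch under stated assumptions, not the code used inside glm or dhglm: the function name iwls_poisson and the simulated data are hypothetical, and s is the linearized working response from the list of notations.

```r
# IWLS for a Poisson GLM with log link.
# X is the model matrix (including the intercept column), y the counts.
iwls_poisson <- function(X, y, max_iter = 50, tol = 1e-10) {
  mu <- y + 0.1                  # starting values kept away from zero
  eta <- log(mu)
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    s <- eta + (y - mu) / mu     # linearized working response
    w <- mu                      # IWLS weights for Poisson with log link
    # Weighted least squares step: solve (X' W X) beta = X' W s
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * s)))
    eta <- drop(X %*% beta_new)
    mu <- exp(eta)
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

# Comparison with glm() on simulated data (assumed coefficients 0.5 and 0.3)
set.seed(123)
x <- rnorm(200)
y <- rpois(200, lambda = exp(0.5 + 0.3 * x))
X <- cbind(Intercept = 1, x = x)
beta_iwls <- iwls_poisson(X, y)
beta_glm <- coef(glm(y ~ x, family = poisson(link = log)))
print(cbind(iwls = beta_iwls, glm = unname(beta_glm)))
```

Each pass is an ordinary weighted regression of the working response on the covariates, which is why the same building block can be reused when random effects are added through an augmented model.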