we have xij = 1 for all i, the coefficient of this term being an intercept, often denoted by α]. [Sometimes, we take j = 0 for the intercept term and retain j = 1, 2, ..., p for the other p predictor variables.]
The Link Function connects the random and systematic components. Let µi = E(Yi), i = 1, 2, ..., N. The model links µi to ηi through the equation ηi = g(µi), where the link function g is a monotone differentiable function. That is, g(µi) = Σj βj xij, the sum running over j = 1, 2, ..., p. The link function that transforms the mean to the natural parameter is called the canonical link.
2.1.3 Binomial Logit Models for Binary Data
Binary variables follow the Bernoulli distribution. Let P(Y = 1) = π (the probability of success), so that E(Y) = π. The natural parameter is log[π / (1 - π)], which is called the logit of π. Thus, the link function is g(πi) = log[πi / (1 - πi)]. We formulate the link as
log[πi / (1 - πi)] = Σj βj xij, j = 1, 2, ..., p.
Therefore, for binary variables, the relevant model is the Binary Logit Model.
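As a hedged illustration (not part of the original notes), the sketch below fits a binary logit model of this form with the statsmodels library; the data are simulated purely to have something to fit, and all variable names are arbitrary.

```python
# Minimal sketch: fitting log[pi_i / (1 - pi_i)] = alpha + beta*x_i by maximum
# likelihood with statsmodels. Data are simulated for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)                       # one hypothetical predictor
eta = -0.5 + 1.2 * x                           # assumed true linear predictor
y = rng.binomial(1, 1 / (1 + np.exp(-eta)))    # Bernoulli responses

X = sm.add_constant(x)                         # adds the intercept column (x_i0 = 1)
fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()  # logit is the default (canonical) link
print(fit.params)                              # estimates of (alpha, beta)
```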
2.1.4 Loglinear Models for Poisson Count Data
Count data, under certain situations, obey the Poisson law. If µ = E(Y) is the mean of a Poisson variable, the natural parameter is log µ. Thus, the link function is g(µi) = log µi. We formulate the link as
log µi = Σj βj xij, j = 1, 2, ..., p.
Therefore, for Poisson variables, the relevant model is the Loglinear Model.
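Analogously, a loglinear model can be fitted as a Poisson GLM with log link. The snippet below is only a sketch with invented data; statsmodels is assumed to be available.

```python
# Minimal sketch: fitting log(mu_i) = alpha + beta*x_i as a Poisson GLM.
# Data are simulated for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(0, 2, size=100)
y = rng.poisson(np.exp(0.3 + 0.8 * x))         # counts with an assumed mean structure

X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()  # log is the default (canonical) link
print(fit.params, fit.deviance)
```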
2.1.5 Types of GLMs
A traditional way to analyze data is to transform the response variable Y so that it has an approximately normal distribution with constant variance across the subjects. When such a transformation is possible, ordinary least squares regression is applicable. In contrast, with GLMs, the choice of link function is separate from the choice of random component. A suitable link function need not stabilize the variance or produce normality, because the fitting process maximizes the likelihood for the chosen distribution, which is not restricted to the normal.
The different types of GLMs are given below:
Random Component   Link Function       Systematic Component   Model
Normal             Identity            Continuous             Linear Regression
Normal             Identity            Categorical            ANOVA
Normal             Identity            Mixed                  ANOCOVA
Binomial           Logit               Mixed                  Logistic Regression
Poisson            Log                 Mixed                  Loglinear
Multinomial        Generalized Logit   Mixed                  Multinomial Logit
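For orientation only, the sketch below shows how the family/link pairs in this table map onto statsmodels objects; the grouping labels are mine, not the notes', and the multinomial generalized-logit row is not fitted through sm.GLM at all.

```python
# Sketch: GLM types from the table expressed as statsmodels families.
# Each family object carries its default (canonical) link.
import statsmodels.api as sm

glm_types = {
    "Linear Regression / ANOVA / ANOCOVA": sm.families.Gaussian(),  # identity link
    "Logistic Regression": sm.families.Binomial(),                  # logit link
    "Loglinear": sm.families.Poisson(),                             # log link
}
for name, family in glm_types.items():
    print(f"{name}: {type(family).__name__} family, {type(family.link).__name__} link")

# The multinomial generalized-logit model is usually fitted with sm.MNLogit
# (a discrete-choice model) rather than through sm.GLM.
```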
2.1.6 Deviance
For a particular GLM, for observations y = (y1, y2, ..., yN), let L(µ, y) be the log-likelihood and let L(µ̂, y) denote the maximum of the log-likelihood for the model under consideration. The maximum achievable log-likelihood over all possible models is L(y, y), which occurs for the most general (saturated) model having a separate parameter for each observation, with µ̂ = y. Such a model is useless since it provides no reduction of the data. However, it serves as a baseline for comparison. We test H0: the model holds against H1: the saturated model holds. The deviance is defined as
D(y; µ̂) = -2[L(µ̂, y) - L(y, y)],
which asymptotically follows χ²(N - p) under the null hypothesis. Here p is the number of parameters specified by the model being tested. This is nothing but the LR test.
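A hedged numerical illustration: in statsmodels the fitted deviance and its residual degrees of freedom are available directly, so the comparison against χ²(N - p) can be carried out as below (simulated data; the library calls are tooling assumptions, not part of the notes).

```python
# Sketch: deviance D(y; mu_hat) compared with chi-square(N - p).
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = rng.poisson(np.exp(0.5 + 0.4 * x))          # illustrative Poisson responses

X = sm.add_constant(x)
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

D = fit.deviance          # -2 [ L(mu_hat, y) - L(y, y) ]
df = fit.df_resid         # N - p
print(D, df, stats.chi2.sf(D, df))   # approximate p-value for H0: model holds
```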
(Eg.) Consider a study where binomial counts at N fixed settings of the predictors are observed. Let Yi ~ B(ni, πi), i = 1, 2, ..., N. Suppose we wish to test homogeneity of the πi's, that is, H0: πi = α for all i = 1, 2, ..., N. [The number of parameters here is 1.] The saturated model makes no assumption about the πi's, letting them take any N values between 0 and 1. [The number of parameters here is N.] The deviance has dof = N - 1. It equals the G² (LR) statistic for testing independence in the N × 2 table that these samples form.
We note that H0 is the same as "independence between the 'settings' of the predictors and the outcomes (success/failure)". Under independence, the test statistic has an approximate chi-square distribution as the values of ni increase (whatever be N).
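The agreement between the deviance and G² can be checked numerically. The sketch below uses invented counts for N = 4 settings; the intercept-only binomial GLM plays the role of H0, and scipy's likelihood-ratio option gives G² for the N × 2 table.

```python
# Sketch: deviance of the intercept-only binomial model vs. G^2 for the N x 2 table.
# Counts are invented purely for illustration.
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2_contingency

successes = np.array([12, 18, 9, 15])
n = np.array([40, 50, 30, 45])
table = np.column_stack([successes, n - successes])   # the N x 2 table

# H0: pi_i = alpha for every setting i  (intercept-only binomial GLM)
fit0 = sm.GLM(table, np.ones((len(n), 1)), family=sm.families.Binomial()).fit()

# G^2 (likelihood-ratio) statistic for independence in the table
g2, p, dof, _ = chi2_contingency(table, lambda_="log-likelihood")

print(fit0.deviance, g2, dof)   # the two statistics coincide; dof = N - 1 = 3
```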
Poisson: Let Yi be Poisson with mean µi. Then,
f(yi) = exp{ yi log µi - µi - log yi! } = exp{ yi θi - exp(θi) - log yi! },
where θi = log µi. This has the form (2.2.1) with b(θi) = exp(θi), a(φ) = 1, c(yi, φ) = - log yi!.
The natural parameter is θi = log µi. We note that
E(Yi) = b'(θi) = exp(θi) = µi,   Var(Yi) = b''(θi) a(φ) = exp(θi) = µi.
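These two derivatives can be verified symbolically; the short sympy sketch below is only a check of the algebra, not part of the notes.

```python
# Symbolic check: for b(theta) = exp(theta), b'(theta) and b''(theta) both equal
# exp(theta) = mu, matching E(Y) and Var(Y) above (with a(phi) = 1).
import sympy as sp

theta = sp.symbols("theta")
b = sp.exp(theta)
print(sp.diff(b, theta), sp.diff(b, theta, 2))   # exp(theta), exp(theta)
```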
Binomial: Let Yi be the proportion of successes in ni trials with probability of success in each
trial being πi. Then, niYi ~ B(ni, πi). Let θi = log [ πi / ( 1 – πi) ] so πi = exp(θi) / [1+ exp(θi) ].
And
f(yi) = C(ni, ni yi) πi^(ni yi) (1 - πi)^(ni - ni yi)
      = exp{ [yi θi - log(1 + exp(θi))] / (1/ni) + log C(ni, ni yi) },
where C(ni, ni yi) denotes the binomial coefficient "ni choose ni yi".
This is in the form (2.2.1) with b(θi) = log[1 + exp(θi)], a(φ) = 1/ni, c(yi, φ) = log C(ni, ni yi).
The natural parameter is the logit, namely θi.
We note that E(Yi) = b'(θi) = exp(θi) / [1 + exp(θi)] = πi,
Var(Yi) = b''(θi) a(φ) = exp(θi) / { [1 + exp(θi)]² ni } = πi (1 - πi) / ni.
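As with the Poisson case, a brief symbolic check (a sketch, assuming sympy) confirms that b(θ) = log[1 + exp(θ)] with a(φ) = 1/n reproduces these moments.

```python
# Symbolic check: b'(theta) = pi and b''(theta)/n = pi*(1 - pi)/n.
import sympy as sp

theta, n = sp.symbols("theta n", positive=True)
b = sp.log(1 + sp.exp(theta))
pi = sp.exp(theta) / (1 + sp.exp(theta))

print(sp.simplify(sp.diff(b, theta) - pi))                        # 0
print(sp.simplify(sp.diff(b, theta, 2) / n - pi * (1 - pi) / n))  # 0
```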
Although β does not explicitly appear in these equations, it enters implicitly through µi, since µi = g⁻¹(Σj βj xij), the sum running over j = 1, 2, ..., p.
Different link functions yield different equations. Interestingly, the likelihood equations (2.2.6) depend on the distribution of Yi only through µi and Var(Yi). The variance itself is a function of the mean, say Var(Yi) = υ(µi); for example, υ(µi) = µi for the Poisson, υ(µi) = µi(1 - µi) for the Bernoulli, and υ(µi) = σ² (constant) for the normal.
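To make this concrete, the sketch below evaluates the standard GLM score equations, Σi (yi - µi) xij (∂µi/∂ηi) / Var(Yi) = 0 for each j, at the MLE of a Poisson log-link fit. It assumes this standard form is what the notes label (2.2.6), and uses simulated data.

```python
# Sketch: the GLM score equations involve Y_i only through mu_i and Var(Y_i).
# For a Poisson log-link model, d(mu_i)/d(eta_i) = mu_i and Var(Y_i) = mu_i.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(size=60)
y = rng.poisson(np.exp(0.2 + 0.5 * x))
X = sm.add_constant(x)

beta_hat = sm.GLM(y, X, family=sm.families.Poisson()).fit().params
mu = np.exp(X @ beta_hat)                    # mu_i = g^{-1}( sum_j beta_j x_ij )
dmu_deta = mu                                # log link
var_y = mu                                   # Poisson variance function
score = X.T @ ((y - mu) * dmu_deta / var_y)  # one equation per beta_j
print(score)                                 # ~ (0, 0) at the MLE
```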
The linear predictors can be written in matrix form as η = Xβ, where η = (η1, η2, ..., ηN)', β = (β1, β2, ..., βp)', and X is the N x p matrix called the Model Matrix. For most GLMs, the likelihood equations are nonlinear functions of β. Let β̂ denote the MLE of β. Likelihood-ratio based inference gives the Deviance.
Let W = diag(w1, ..., wN), where wi = (∂µi / ∂ηi)² / Var(Yi).
The Information Matrix is I = X'WX. The estimated asymptotic covariance matrix of the MLE is Cov(β̂) = (X'ŴX)⁻¹, where Ŵ is W evaluated at β̂.
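The covariance formula can be reproduced by hand from a fitted model. The sketch below does this for a binomial logit fit on simulated data, where wi reduces to π̂i(1 - π̂i); the comparison against statsmodels' own cov_params() is an assumed sanity check, not something the notes prescribe.

```python
# Sketch: Cov(beta_hat) = (X' W_hat X)^{-1} with w_i = (d mu_i/d eta_i)^2 / Var(Y_i).
# For the binomial logit (Bernoulli responses), w_i = pi_i * (1 - pi_i).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.normal(size=200)
X = sm.add_constant(x)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.9 * x))))

fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
pi_hat = fit.fittedvalues                      # fitted probabilities
W_hat = np.diag(pi_hat * (1 - pi_hat))         # W evaluated at beta_hat
cov_by_hand = np.linalg.inv(X.T @ W_hat @ X)   # (X' W_hat X)^{-1}

print(cov_by_hand)
print(fit.cov_params())                        # the two should agree closely
```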
Residuals for GLMs: The Pearson Residual for a GLM is defined as
ei = (yi - µ̂i) / [Var(Yi)]^(1/2).
For instance, for a Poisson GLM, Var(Yi) = µi, and the Pearson Residual is
ei = (yi - µ̂i) / √µ̂i.
For two-way contingency tables, the yi's are nothing but the cell counts nij. These residuals are expressed as
eij = (nij - n̂ij) / √n̂ij.
Then, it is easily seen that Σ eij² = X², the Pearson X² statistic.
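A final hedged sketch, using an invented two-way table: the Pearson residuals are computed from the independence-fitted counts, and their squares sum to the Pearson X² statistic reported by scipy.

```python
# Sketch: e_ij = (n_ij - n_hat_ij) / sqrt(n_hat_ij) and sum of e_ij^2 = X^2.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[25, 15],
                  [30, 10],
                  [20, 20]])          # hypothetical 3 x 2 table of counts

x2, p, dof, expected = chi2_contingency(table, correction=False)
e = (table - expected) / np.sqrt(expected)    # Pearson residuals e_ij
print(np.sum(e ** 2), x2)                     # identical values
```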