Professional Documents
Culture Documents
Notation
The following notation is used throughout this chapter unless otherwise stated:
A Generic categorical independent (explanatory) variable. Its categories are
indexed by an array of integers.
B Generic categorical dependent (response) variable. Its categories are
indexed by an array of integers.
r Number of categories of B.
c Number of categories of A.
p Number of nonredundant (nonaliased) parameters.
i Generic index for the category of B.
j Generic index for the categories of A.
k Generic index for the parameter.
nij Observed count in the ith response of B and the jth setting of A.
Nj Marginal total count at the jth setting of A. It is equal to
∑
r
nij .
i =1
N Total observed count. It is equal to
∑ ∑
c r
nij .
j =1 i =1
mij Expected count.
1
2 GENLOG Multinomial Loglinear and Logit Models
∑ ∑
c r
setting of A. 0 ≤ π ij ≤ 1 and π ij = 1 .
j =1 i =1
zij Cell structure value.
αj jth normalizing constant.
xijk An element in the ith row and the kth column of the design matrix for the j
setting.
The same notation is used for both loglinear and logit models so that the methods
are presented in a unified way. Conceptually, one can consider a loglinear model as
a special case of a logit model where the explanatory variable has only one level
(that is, c = 1).
Random Component
The random component describes the joint distribution of the counts.
3N ,π
j 1j ,K, π 8 distribution.
rj
∏π j nij
(1)
∏
r ij
j =1n ! ij i =1
i =1
where δ ab = 1 if a = b and δ ab = 0 if a ≠ b .
Systematic Component
The systematic component describes the linkage function between the expected
counts and the parameters. The expected counts are themselves functions of other
parameters. Explicitly, for i = 1,K, r and j = 1,K, c ,
%Kz eα j + vij
mij = &K0 ij if zij > 0
' if zij ≤ 0
where
p
vij = ∑x ijk β k
k =1
Normalizing Constants
N
= log
j
αj j = 1,K, c
∑ z e
r
(2)
vij
ij
i =1
Maximum-Likelihood Estimation
The multinomial log-likelihood is
c r
16 3
L β = L β 1 ,K, β p = constant + 8 ∑ ∑ n log3m 8ij ij (3)
j =1 i =1
Likelihood Equations
It can be shown that
c r
∑ ∑ 3n 8
∂L
= ij − mij x ijk for k = 1,K, p
∂β k
j =1 i =1
GENLOG Multinomial Loglinear and Logit Models 5
16 3 16 1 68 0 5
Let g β = g1 β ,K, g p β ′ be the p + 1 gradient vector with
16
gk β =
∂L
∂β k
4 9
t
The maximum-likelihood estimates β$ = β$ 1 ,K, β$ p are regarded as a solution to
the vector of likelihood equations:
16
g β =0 (4)
Hessian Matrix
The likelihood equations are nonlinear functions of β. Solving them for β$ requires
an iterative method. The Newton-Raphson method is used. It can be shown that
c r
∑ ∑ m 3x 83 8
∂ 2L
=− ij ijk − θ jk xijl − θ jl
∂β k ∂β t
j =1 i =1
where
∑m x
1
θ jk = ij ijk j = 1,K, c and k = 1,K, p (5)
Nj
i =1
16 16
Let H β be the p × p information matrix, where −H β is the Hessian matrix of
(3). The elements of H β are 16
16
hkl β = −
∂ 2L
∂β k ∂β l
k = 1, K, p and l = 1,K, p (6)
16
Note: H β is a symmetric positive-definite matrix. The asymptotic covariance
matrix of β$ is estimated by H −1 β . 16
6 GENLOG Multinomial Loglinear and Logit Models
Newton-Raphson Method
05
Let β s denote the sth approximation for the solution to (4). By the Newton-
Raphson method,
β 0 s +15 = β 0 s 5 + H −1 β 0s 5 g β 0s 5
4 94 9
Define q1β 6 = H1 β 6β + g1β 6 . Using (5) again, the kth element of q1β 6 is
c r
1 6 ∑ ∑η 3x
qk β = ij ijk − θ jk 8 (7)
j =1 i =1
where
η ij =
K%&m v + 3n − mij 8 if zij > 0 and mij > 0
K'0
ij ij ij
otherwise
Then
H β 0 s5 β 0 s +15 = q β 0 s5
4 9 4 9 (8)
05 0 5 0 5
Thus, given β s , the s + 1 th approximation β s+1 is found by solving the system
of equations in (8).
GENLOG Multinomial Loglinear and Logit Models 7
Initial Values
05
SPSS uses the β 0 , which corresponds to a saturated model as the initial value for
β. Then the initial estimates for the expected cell counts are
00 5
mij =
%&n ij +∆ if zij > 0
'0
(9)
if zij ≤ 0
where ∆ ≥ 0 is a constant.
Note: For saturated models, SPSS adds ∆ to nij if zij > 0 . This is done to avoid
numerical problems in case some observed counts are 0. We advise users to set ∆ to
0 whenever all observed counts (other than structural zeros) are positive.
θ0 5 = ∑ m0 5 x
r
0 1 0
jk ij ijk (10)
Nj
i =1
and
η 0ij 5 =
%Km0 5 log4m0 5 / z 9 + 4n 00 5 9 005
&K0
0 0
0 ij ij ij ij − mij if zij > 0 and mij > 0
(11)
' otherwise
Stopping Criteria
SPSS checks the following conditions for convergence:
1. max i, j mij 0 s +1 5 − m0s5 / m 0s5 < ε provided that mij 0s5 > 0
ij ij
2. max i, j m0 ij
s +1 5 − m 0s5 < ε
ij
∑ p
4 9 / p < ε
gk2 β$
3.
k =1
8 GENLOG Multinomial Loglinear and Logit Models
Algorithm
The iteration process uses the following steps:
1. Calculate mij
005 using (9), θ 005 using (10), and n005 using (11).
jk ij
2. Set s = 0 .
%
0s +15 = K& N j zij e v
0 5
s +1
/ ∑
r vij0 s +15
if zij > 0
ij
ztj e
K'0
m ij t =1
if zij ≤ 0
6. Check whether the stopping criteria are satisfied. If yes, stop iteration and
declare convergence. Otherwise continue.
7. Increase s by 1 and check whether the maximum iteration has been reached. If yes,
stop iteration and declare the process not converged. Otherwise repeat steps 3-7.
GENLOG Multinomial Loglinear and Logit Models 9
N
= log
j
α$ j j = 1,K, c
∑ z e r
i =1
ij
v$ij
where
p
v$ij = ∑x $
ijk β k
k =1
%K N z e /
∑
r
v$ij v$tj
=&
ztj e if zij > 0
m$ ij j ij
K'0
t =1
if zij ≤ 0
Goodness-of-Fit Statistics
The Pearson chi-square statistic is
c r
X2 = ∑∑ X ij
2
j =1 i =1
10 GENLOG Multinomial Loglinear and Logit Models
where
c r
G2 = 2 ∑∑G ij
2
j =1 i =1
where
%Kn 4log3n
ij ij / m$ ij 89 if zij > 0, nij > 0 and m$ ij > 0
Gij 2 =&
KSYSMIS if zij > 0, nij > 0 and m$ ij = 0
KK0 if zij > 0, nij = 0, and m$ ij ≥ 0;
' zij ≤ 0 or nij = m$ ij
Degrees of Freedom
The degrees of freedom for each statistic is defined as a = c r − 1 − p − E , where E 0 5
is the number of cells with zij ≤ 0 or m$ ij = 0 .
Significance Level
The significance level (or the p value) for the Pearson chi-square statistic is
4
Prob χ 2a > X 2 9 and that for the likelihood-ratio chi-square statistic is
Prob4 χ 2
a > G2 9 . In both cases, χ 2a is the central chi-square distribution with a
degrees of freedom.
GENLOG Multinomial Loglinear and Logit Models 11
05 0 5 05
where S A + S B| A = S B . Also define
∑
c
m$ ij
j =1
π$ i =
∑
c
Nj
j =1
m$ ij
π$ i| j =
Nj
Entropy
05 ∑ S 0 B5
r
S B = −N i
i =1
where
0 5 %&π0$ log1π 6
Si B = i i if 0 < π$ i ≤ 1
' if π$ i = 0
12 GENLOG Multinomial Loglinear and Logit Models
and
c r
1 6 ∑ N ∑ S 1 B | A6
S B| A = − j ij
j =1 i =1
where
1 6 %K&Kπ0$
Sij B | A = i| j 3 8
log π$ i| j if 0 < π$ i| j ≤ 1
' if π$ i| j = 0
Concentration
0 5 ∑ π$
r
S B = N 1− 2
i =1
i
c r
S1 B | A6 = ∑ N 1 − ∑ π$
2
j =1
j
i =1
i| j
Degrees of Freedom
Residuals
Goodness-of-fit statistics provide only broad summaries of how models fit data.
The pattern of lack of fit is revealed in cell-by-cell comparisons of observed and
fitted cell counts.
Simple Residuals
The simple residual of the (i,j)th cell is
rij =
%&n − m$
ij ij if zij > 0
'SYSMIS if zij ≤ 0
Standardized Residuals
The standardized residual for the (i,j)th cell is
%K3n − m$ 8 / 3
m$ ij 1 − m$ ij / N j 8 if zij > 0 and 0 < m$ ij < N j
rijS
K
= &0
ij ij
if zij > 0 and nij = m$ ij
KKSYSMIS
' otherwise
The standardized residuals are also known as Pearson residuals even though
∑ ∑ 4r 9
c r S 2
ij ≠ X 2 . Although the standardized residuals are asymptotically
j =1 i =1
normal, their asymptotic variances are less than 1.
Adjusted Residuals
The adjusted residual is the simple residual divided by its estimated standard
error. Its definition and applications first appeared in Haberman (1973) and re-
appeared on page 454 of Haberman (1979). This statistic for the (i,j)th cell is
14 GENLOG Multinomial Loglinear and Logit Models
%K3n − m$ 8 /
ij ij sij if zij > 0 and m$ ij > 0
rijA = &0 if zij > 0 and n ij = m$ ij
K'SYSMIS otherwise
where
m$ ij
p p
sij = m$ ij 1 − − m$ ij ∑ ∑4 94 9
xijk − θ$ jk xijl − θ$ jl h kl
Nj
k =1 l =1
49
h kl is the (k,l)th element of H −1 β$ . The adjusted residuals are asymptotically
standard normal.
Deviance Residuals
Pierce and Schafer (1986) and McCullagh and Nelder (1989) define the signed
square root of the individual contribution to the G 2 statistic as the deviance
residual. This statistic for the (i,j)th cell is
3
rijD = sign nij − m$ ij 8 dij
where
3 8
2 nij − m$ ij is added to it so that dij > 0 for all i and j. However, we still have
∑ j =1 ∑i=14rijD 9 = G 2 .
c r 2
GENLOG Multinomial Loglinear and Logit Models 15
Generalized Residual
∑ ∑
c r
Consider a linear combination of the cell counts dij nij , where dij are
j =1 i =1
real numbers.
The estimated expected value is
c r
∑ ∑ d m$ ij ij
j =1 i =1
c r
∑ ∑ d 3n ij ij − m$ ij 8
j =1 i =1
∑ ∑
c
j =1
r
i =1
3
dij nij − m$ ij 8
∑ 2
∑ ∑
c r r
j =1 i =1
dij2 m$ ij −
i =1
dij mij
/ Nj
The adjusted residual for this linear combination is, as given on page 420 of
Haberman (1979),
∑ ∑
c
j =1
r
i =1
3
dij nij − m$ ij 8
V
16 GENLOG Multinomial Loglinear and Logit Models
where
c r c r 2 p p kl
V= ∑∑ dij2 m$ ij −∑ ∑ dij m$ ij − ∑ ∑ fk fl h
1
j =1 i =1 j =1 i=1 k =1 l =1
Nj
c r
fk = ∑ ∑ d m$ 3 x ij ij ijk − θ ik 8
j =1 i =1
c r
∑d ij =0 j = 1,K, c
i =1
c r c r c r p
∑∑ 3 8 ∑∑
dij log m$ ij = 3 8 ∑∑∑d
dij log zij + $
ij x ijk β k (13)
j =1 i =1 j =1 i =1 j =1 i =1 k =1
GENLOG Multinomial Loglinear and Logit Models 17
c r p p
var ∑ ∑ dij log3m$ ij 8 = ∑ ∑ wk wl h kl (14)
j =1 i =1 k =1 l =1
where
c r
wk = ∑∑d ij x ijk k = 1,K, p
j =1 i =1
Wald Statistic
The null hypothesis is
c r
H0 : ∑∑d ij 3 8
log mij = 0
j =1 i =1
∑ ∑ d log3m$ 8
c r 2
W=
j =1 i =1
ij ij
∑ ∑ w wh
p p kl
k l
k =1 l =1
c r p p
∑∑ 3 8
dij log m$ ij ± zα / 2 ∑ ∑ wk wl h kl
j =1 i =1 k =1 l =1
where zα /2 is the upper α / 2 point of the standard normal distribution. The default
value of α is 0.05.
Aggregated Data
This section shows how data are aggregated for a multinomial distribution. The
following notation is used in this section:
%K
= &∑
* +
nijs if vij+ > 0
1≤ s ≤ vij
K'0
nij
if vij = 0 or vij+ = 0
where
+
=
%&nijs if nijs > 0 and zijs > 0
'0
nijs
if nijs ≤ 0 and zijs > 0
∑
*
and means summation over the range of s with the terms zijs > 0 .
1≤ s ≤ vij
The cell weight value is
%K∑ * +
nijs zijs / nij if nij > 0 and vij+ > 0
KK 1≤ s ≤ vij
= &∑
*
zijs / vij+ if nij = 0 and vij+ > 0
KK0
zij 1≤ s ≤ vij
if vij+ = 0
K'1 if vij = 0
If no variable is specified as the cell weight variable, then all cases have unit cell
weights by default.
20 GENLOG Multinomial Loglinear and Logit Models
%K∑ * +
nijs x ijs / nij if nij > 0 and vij > 0
KK 1≤ s ≤ vij
= &∑
*
xijs / vij+ if nij = 0 and vij+ > 0
KK
xij
1≤ s ≤ vij
if vij+ = 0 or vij = 0
K'0
The cell GRESID coefficient is
%K∑ * +
nijs cijs / nij if nij > 0 and vij > 0
KK 1≤ s ≤ vij
= &∑
*
cijs / vij+ if nij = 0 and vij+ > 0
KK
cij
1≤ s ≤ vij
if vij+ or vij = 0
K'0
There are no defaults for the GRESID coefficients.
The cell GLOR coefficient is
%K∑ * +
nijs eijs / nij if nij > 0 and vij > 0
KK 1≤ s ≤ vij
= &∑
*
eijs / vij+ if nij = 0 and vij+ > 0
KK
eij
1≤ s ≤ vij
if vij+ = 0 or vij = 0
K'0
There are no defaults for the GLOR coefficients.
GENLOG Multinomial Loglinear and Logit Models 21
References
Agresti, A. 1990. Categorical data analysis. New York: John Wiley & Sons, Inc.
McCullagh, P., and Nelder, J. A. 1989. Generalized linear models. 2nd ed. London:
Chapman and Hall.