
STAT 526 - Spring 2011 Olga Vitek

Homework 8 - Solution

Each part of each problem is worth 5 points.

1. Present each of the following distributions in the exponential family form. Identify the relevant components necessary for use in a GLM: (1) the canonical parameter θ, (2) the dispersion parameter φ, (3) a(·), b(·), c(·), (4) the variance function, (5) the canonical link, µ(θ), (6) the deviance and (7) the scaled deviance. Show your work.

\[
f(y) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\}
\]

(a) Normal distribution

Answer:
Let $y \sim N(\mu, \sigma^2)$. Since
\[
f(y) = \exp\left\{ \frac{y\mu - \mu^2/2}{\sigma^2} - \frac{y^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2) \right\}
\]
we have

(1) canonical parameter : θ = µ


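(2) dispersion parameter : $\phi = \sigma^2$
(3) $a(\phi) = \phi = \sigma^2$,
$b(\theta) = \theta^2/2$,
$c(y, \phi) = -\frac{y^2}{2\phi} - \frac{1}{2}\log(2\pi\phi)$
(4) variance function : $\mathrm{Var}(y) = b''(\theta)\,a(\phi) = 1 \cdot \phi$, $V(\mu) = b''(\theta) = 1$
(5) canonical link : $\mu(\theta) = b'(\theta) = \theta$, $g(\mu) = \mu$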
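(6) Deviance : With a data set $(y_i, x_i)$, $i = 1, 2, \ldots, n$, for the saturated model we estimate each $\tilde\theta_i$ by $y_i$, i.e. $\tilde\theta_i = y_i$. Also denote by $\hat\theta_i$ the estimates under the model of interest.
\begin{align*}
D &= 2\sum_{i=1}^{n} \left\{ [y_i\tilde\theta_i - y_i\hat\theta_i] - [b(\tilde\theta_i) - b(\hat\theta_i)] \right\}
   = 2\sum_{i=1}^{n} \left\{ [y_i\tilde\theta_i - y_i\hat\theta_i] - \left[\frac{\tilde\theta_i^2}{2} - \frac{\hat\theta_i^2}{2}\right] \right\} \\
  &= 2\sum_{i=1}^{n} \left\{ y_i^2 - y_i\hat\theta_i - \frac{y_i^2}{2} + \frac{\hat\theta_i^2}{2} \right\}
   = \sum_{i=1}^{n} (y_i - \hat\theta_i)^2 = \sum_{i=1}^{n} (y_i - \hat\mu_i)^2 = SSE
\end{align*}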

(7) scaled deviance :
\[
D/\phi = \frac{\sum_{i=1}^{n} (y_i - \hat\theta_i)^2}{\phi} = \frac{\sum_{i=1}^{n} (y_i - \hat\mu_i)^2}{\phi} = \frac{SSE}{\sigma^2}
\]
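The general deviance formula used in part (6) can be checked numerically. The sketch below is only an illustration: the function name `deviance`, the responses and the fitted means are made up for the example and are not part of the original solution. It evaluates D from b(θ) directly and compares it with the sum of squared errors.

```python
# Generic exponential-family deviance:
#   D = 2 * sum_i { y_i * (theta_sat_i - theta_hat_i) - [b(theta_sat_i) - b(theta_hat_i)] }
def deviance(y, theta_hat, theta_sat, b):
    return 2.0 * sum(
        yi * (ts - th) - (b(ts) - b(th))
        for yi, th, ts in zip(y, theta_hat, theta_sat)
    )

def b_normal(theta):
    # Cumulant function of the normal distribution: b(theta) = theta^2 / 2
    return theta ** 2 / 2.0

y = [1.2, 0.7, 3.1, 2.4]        # hypothetical responses
mu_hat = [1.0, 1.0, 2.5, 2.5]   # hypothetical fitted means (theta_hat = mu_hat)

# Saturated model: theta_sat_i = y_i
D = deviance(y, mu_hat, y, b_normal)
sse = sum((yi - mi) ** 2 for yi, mi in zip(y, mu_hat))
print(D, sse)
```

Passing a different `b` and the matching saturated estimates gives the deviance of the other distributions in this problem.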

(b) Binomial distribution

Answer:
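Let $y \sim \mathrm{Binomial}(n, \pi)$, and denote $\bar y = y/n$. Since
\begin{align*}
f(y) &= \binom{n}{y} \pi^{y} (1 - \pi)^{n-y} \\
     &= \exp\left\{ \frac{\bar y \log\frac{\pi}{1-\pi} - (-\log(1 - \pi))}{1/n} + \log\binom{n}{n\bar y} \right\}
\end{align*}
we have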

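(1) canonical parameter : $\theta = \log\frac{\pi}{1-\pi}$
(2) dispersion parameter : $\phi = 1$
(3) $a(\phi) = 1/n$,
$b(\theta) = -\log(1 - \pi) = \log(1 + e^{\theta})$,
$c(y, \phi) = \log\binom{n}{n\bar y}$
(4) variance function : $V(\mu) = b''(\theta) = \frac{e^{\theta}}{1+e^{\theta}} \cdot \frac{1}{1+e^{\theta}} = \pi(1 - \pi)$, where $\mu = \pi$.
(5) canonical link : $\mu(\theta) = b'(\theta) = \frac{e^{\theta}}{1+e^{\theta}}$, $g(\mu) = \log\frac{\pi}{1-\pi}$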

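(6) Deviance : With a data set $(y_i, n_i, x_i)$, $i = 1, 2, \ldots, n$, for the saturated model we estimate each $\tilde\theta_i$ as $\tilde\theta_i = \log\frac{\bar y_i}{1-\bar y_i}$. Also denote by $\hat\theta_i$ the estimates under the model of interest. Then
\begin{align*}
D &= 2\sum_{i=1}^{n} n_i \left\{ [\bar y_i\tilde\theta_i - \bar y_i\hat\theta_i] - [b(\tilde\theta_i) - b(\hat\theta_i)] \right\} \\
  &= 2\sum_{i=1}^{n} n_i \left\{ \bar y_i \log\frac{\bar y_i}{1-\bar y_i} - \bar y_i\hat\theta_i - \log\frac{1}{1-\bar y_i} + \log(1 + e^{\hat\theta_i}) \right\} \\
  &= 2\sum_{i=1}^{n} n_i \left\{ \bar y_i \log\frac{\bar y_i}{\hat\pi_i} - \bar y_i \log\frac{1-\bar y_i}{1-\hat\pi_i} + \log\frac{1-\bar y_i}{1-\hat\pi_i} \right\} \\
  &= 2\sum_{i=1}^{n} n_i \left\{ \bar y_i \log\frac{\bar y_i}{\hat\pi_i} + (1-\bar y_i) \log\frac{1-\bar y_i}{1-\hat\pi_i} \right\} \\
  &= 2\sum_{i=1}^{n} \left\{ y_i \log\frac{y_i}{n_i\hat\pi_i} + (n_i - y_i) \log\frac{n_i - y_i}{n_i - n_i\hat\pi_i} \right\}
\end{align*}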

(7) scaled deviance : Since φ = 1, the scaled deviance D/φ = D.
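As a sanity check on the final expression in (6), the following sketch compares it with twice the log-likelihood difference between the saturated and the fitted model. The counts, the fitted probabilities and the helper names are hypothetical, and the convention 0 · log 0 = 0 is assumed for observations with y_i = 0 or y_i = n_i.

```python
import math

def xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def binomial_deviance(y, n, pi_hat):
    # D = 2 * sum{ y log(y / (n pi)) + (n - y) log((n - y) / (n - n pi)) }
    total = 0.0
    for yi, ni, pi in zip(y, n, pi_hat):
        total += xlogy(yi, yi / (ni * pi))
        total += xlogy(ni - yi, (ni - yi) / (ni - ni * pi))
    return 2.0 * total

def binomial_loglik(y, n, p):
    return sum(
        math.log(math.comb(ni, yi)) + xlogy(yi, pi) + xlogy(ni - yi, 1.0 - pi)
        for yi, ni, pi in zip(y, n, p)
    )

y = [3, 0, 7, 10]                 # hypothetical success counts
n = [10, 5, 12, 10]               # hypothetical numbers of trials
pi_hat = [0.4, 0.1, 0.5, 0.9]     # hypothetical fitted probabilities
pi_sat = [yi / ni for yi, ni in zip(y, n)]   # saturated model: pi_i = y_i / n_i

D = binomial_deviance(y, n, pi_hat)
D_check = 2.0 * (binomial_loglik(y, n, pi_sat) - binomial_loglik(y, n, pi_hat))
print(D, D_check)
```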

(c) Poisson distribution

Answer:
Let $y \sim \mathrm{Poisson}(\lambda)$. Since
\[
f(y) = \frac{\lambda^{y} e^{-\lambda}}{y!} = \exp\left\{ y\log(\lambda) - \lambda - \log(y!) \right\}
\]
we have

(1) canonical parameter : θ = log λ


(2) dispersion parameter : φ = 1

(3) $a(\phi) = 1$,
$b(\theta) = e^{\theta} = \lambda$,
$c(y, \phi) = -\log(y!)$
(4) variance function : $V(\mu) = b''(\theta) = e^{\theta} = \lambda$, where $\mu = \lambda$.
(5) canonical link : $\mu(\theta) = b'(\theta) = e^{\theta}$, $g(\mu) = \log\mu$
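(6) Deviance : With a data set $(y_i, x_i)$, $i = 1, 2, \ldots, n$, for the saturated model we estimate each $\tilde\theta_i$ by $\tilde\theta_i = \log y_i$. Also denote by $\hat\theta_i$ the estimates under the model of interest. Then the deviance D is
\begin{align*}
D &= 2\sum_{i=1}^{n} \left\{ [y_i\tilde\theta_i - y_i\hat\theta_i] - [b(\tilde\theta_i) - b(\hat\theta_i)] \right\}
   = 2\sum_{i=1}^{n} \left\{ y_i\log y_i - y_i\log\hat\lambda_i - y_i + \hat\lambda_i \right\} \\
  &= 2\sum_{i=1}^{n} \left\{ y_i\log\frac{y_i}{\hat\lambda_i} - (y_i - \hat\lambda_i) \right\}
\end{align*}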

(7) scaled deviance : Since φ = 1, the scaled deviance D/φ = D.
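The Poisson deviance can be verified in the same way. The sketch below uses made-up counts and fitted means, assumes the convention 0 · log 0 = 0 for y_i = 0, and compares the closed form from (6) with twice the log-likelihood difference.

```python
import math

def poisson_deviance(y, lam_hat):
    # D = 2 * sum{ y log(y / lam) - (y - lam) }
    total = 0.0
    for yi, li in zip(y, lam_hat):
        log_term = 0.0 if yi == 0 else yi * math.log(yi / li)
        total += log_term - (yi - li)
    return 2.0 * total

def poisson_loglik(y, lam):
    # sum{ y log(lam) - lam - log(y!) }, with log(y!) = lgamma(y + 1)
    total = 0.0
    for yi, li in zip(y, lam):
        ylog = 0.0 if yi == 0 else yi * math.log(li)
        total += ylog - li - math.lgamma(yi + 1)
    return total

y = [0, 2, 5, 1]                  # hypothetical counts
lam_hat = [0.5, 1.5, 4.0, 2.0]    # hypothetical fitted means

D = poisson_deviance(y, lam_hat)
# Saturated model: lambda_i = y_i
D_check = 2.0 * (poisson_loglik(y, y) - poisson_loglik(y, lam_hat))
print(D, D_check)
```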

(d) Gamma distribution G(ν, µ/ν) with density function
\[
f(y) = \frac{1}{\Gamma(\nu)} \left(\frac{\nu}{\mu}\right)^{\nu} y^{\nu-1} e^{-y\nu/\mu}, \quad y > 0
\]

Answer:
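Since
\begin{align*}
f(y) &= \exp\left\{ -y\frac{\nu}{\mu} + (\nu - 1)\log(y) + \nu\log(\nu) - \nu\log(\mu) - \log(\Gamma(\nu)) \right\} \\
     &= \exp\left\{ \frac{-y\left(\frac{1}{\mu}\right) - \log(\mu)}{1/\nu} + (\nu - 1)\log(y) + \nu\log(\nu) - \log(\Gamma(\nu)) \right\}
\end{align*}
we have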

(1) canonical parameter : θ = −1/µ


(2) dispersion parameter : φ = 1/ν
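(3) $a(\phi) = \phi$,
$b(\theta) = \log(\mu) = -\log(-\theta)$,
$c(y, \phi) = \left(\frac{1}{\phi} - 1\right)\log(y) - \frac{\log(\phi)}{\phi} - \log(\Gamma(1/\phi))$
(4) variance function : $V(\mu) = b''(\theta) = 1/\theta^2 = \mu^2$
(5) canonical link : $\mu(\theta) = b'(\theta) = -1/\theta$, so $\theta = -1/\mu$; up to sign, the canonical link is the inverse link $g(\mu) = 1/\mu$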
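(6) Deviance : With a data set $(y_i, x_i)$, $i = 1, 2, \ldots, n$, for the saturated model we estimate each $\theta_i$ by $\tilde\theta_i = -1/y_i$. Also denote by $\hat\theta_i$ the estimates under the model of interest. Since $b(\theta) = -\log(-\theta)$, the deviance D is
\begin{align*}
D &= 2\sum_{i=1}^{n} \left\{ [y_i\tilde\theta_i - y_i\hat\theta_i] - [b(\tilde\theta_i) - b(\hat\theta_i)] \right\} \\
  &= 2\sum_{i=1}^{n} \left\{ [y_i\tilde\theta_i - y_i\hat\theta_i] - [-\log(-\tilde\theta_i) + \log(-\hat\theta_i)] \right\} \\
  &= 2\sum_{i=1}^{n} \left\{ y_i\left(-\frac{1}{y_i}\right) - y_i\left(-\frac{1}{\hat\mu_i}\right) - [\log(y_i) - \log(\hat\mu_i)] \right\}
   = 2\sum_{i=1}^{n} \left\{ \frac{y_i - \hat\mu_i}{\hat\mu_i} - \log\frac{y_i}{\hat\mu_i} \right\}
\end{align*}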

(7) scaled deviance :
\[
D/\phi = \frac{1}{\phi} \cdot 2\sum_{i=1}^{n} \left\{ \frac{y_i - \hat\mu_i}{\hat\mu_i} - \log\frac{y_i}{\hat\mu_i} \right\}
\]
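For the gamma case the scaled deviance D/φ should equal twice the log-likelihood difference between the saturated and the fitted model, with the same shape ν in both. A minimal numerical check, assuming hypothetical data, fitted means and ν = 2.5:

```python
import math

def gamma_deviance(y, mu_hat):
    # D = 2 * sum{ (y - mu) / mu - log(y / mu) }
    return 2.0 * sum((yi - mi) / mi - math.log(yi / mi) for yi, mi in zip(y, mu_hat))

def gamma_loglik(y, mu, nu):
    # log f(y) = nu log(nu / mu) + (nu - 1) log(y) - y nu / mu - log Gamma(nu)
    return sum(
        nu * math.log(nu / mi) + (nu - 1.0) * math.log(yi) - yi * nu / mi - math.lgamma(nu)
        for yi, mi in zip(y, mu)
    )

y = [0.8, 2.5, 1.1, 4.0]        # hypothetical positive responses
mu_hat = [1.0, 2.0, 1.5, 3.0]   # hypothetical fitted means
nu = 2.5                        # hypothetical shape parameter
phi = 1.0 / nu                  # dispersion parameter from part (2)

scaled_deviance = gamma_deviance(y, mu_hat) / phi
# Saturated model: mu_i = y_i
check = 2.0 * (gamma_loglik(y, y, nu) - gamma_loglik(y, mu_hat, nu))
print(scaled_deviance, check)
```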

(e) Inverse Gaussian distribution IG(µ, λ) with density function
\[
f(y) = \sqrt{\frac{\lambda}{2\pi y^3}} \exp\left\{ -\frac{\lambda(y - \mu)^2}{2\mu^2 y} \right\}, \quad y, \mu, \lambda > 0
\]

Answer:
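Since
\begin{align*}
f(y) &= \exp\left\{ -y\frac{\lambda}{2\mu^2} + \frac{\lambda}{\mu} - \frac{\lambda}{2y} + \log(\lambda)/2 - \frac{1}{2}\log(2\pi y^3) \right\} \\
     &= \exp\left\{ \frac{y\left(-\frac{1}{\mu^2}\right) + 2/\mu}{2/\lambda} - \frac{\lambda}{2y} + \frac{\log(\lambda) - \log(2\pi y^3)}{2} \right\}
\end{align*}
we have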

(1) canonical parameter : θ = −1/µ2


(2) dispersion parameter : φ = 2/λ
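(3) $a(\phi) = \phi = 2/\lambda$,
$b(\theta) = -\frac{2}{\mu} = -2\sqrt{-\theta}$,
$c(y, \phi) = -\frac{\log(\phi/2) + \log(2\pi y^3)}{2} - \frac{1}{\phi y}$
(4) variance function : $V(\mu) = b''(\theta) = \frac{1}{2}(-\theta)^{-3/2} = \mu^3/2$
(5) canonical link : $\mu(\theta) = b'(\theta) = (-\theta)^{-1/2}$, $g(\mu) = -1/\mu^2$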
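(6) Deviance : With a data set $(y_i, x_i)$, $i = 1, 2, \ldots, n$, for the saturated model we estimate each $\tilde\theta_i$ by $\tilde\theta_i = -1/y_i^2$. Also denote by $\hat\theta_i$ the estimates under the model of interest. Then the deviance D is
\begin{align*}
D &= 2\sum_{i=1}^{n} \left\{ [y_i\tilde\theta_i - y_i\hat\theta_i] - \left[-2\sqrt{-\tilde\theta_i} + 2\sqrt{-\hat\theta_i}\right] \right\} \\
  &= 2\sum_{i=1}^{n} \left\{ \left[y_i\left(-\frac{1}{y_i^2}\right) - y_i\left(-\frac{1}{\hat\mu_i^2}\right)\right] - \left[-2\frac{1}{y_i} + 2\frac{1}{\hat\mu_i}\right] \right\} \\
  &= 2\sum_{i=1}^{n} \frac{(y_i - \hat\mu_i)^2}{\hat\mu_i^2 y_i}
\end{align*}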

(7) scaled deviance :
\[
D/\phi = \lambda \sum_{i=1}^{n} \frac{(y_i - \hat\mu_i)^2}{\hat\mu_i^2 y_i}
\]
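With the parameterization used here (φ = 2/λ), the factor 2 in D cancels against φ, leaving λ times the sum. The sketch below checks this against twice the log-likelihood difference; the data, fitted means and λ = 4 are hypothetical values chosen for illustration.

```python
import math

def inv_gauss_deviance(y, mu_hat):
    # D = 2 * sum{ (y - mu)^2 / (mu^2 y) }, as derived in part (6)
    return 2.0 * sum((yi - mi) ** 2 / (mi ** 2 * yi) for yi, mi in zip(y, mu_hat))

def inv_gauss_loglik(y, mu, lam):
    # log f(y) = 0.5 log(lam / (2 pi y^3)) - lam (y - mu)^2 / (2 mu^2 y)
    return sum(
        0.5 * math.log(lam / (2.0 * math.pi * yi ** 3))
        - lam * (yi - mi) ** 2 / (2.0 * mi ** 2 * yi)
        for yi, mi in zip(y, mu)
    )

y = [0.9, 1.8, 3.2]        # hypothetical positive responses
mu_hat = [1.0, 2.0, 2.5]   # hypothetical fitted means
lam = 4.0                  # hypothetical lambda
phi = 2.0 / lam            # dispersion parameter from part (2)

scaled_deviance = inv_gauss_deviance(y, mu_hat) / phi
# Saturated model: mu_i = y_i
check = 2.0 * (inv_gauss_loglik(y, y, lam) - inv_gauss_loglik(y, mu_hat, lam))
print(scaled_deviance, check)
```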

2. [Methods qualifying exam, August 2010: use paper and pencil.] Data are generated from the exponential distribution with density f(y) = λ exp(−λy), where λ, y > 0. The distribution is a member of the exponential family, which takes the general form
\[
f(y \mid \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\}.
\]

(a) Identify the specific form of θ, φ, a(), b(), and c() for the exponential distribution.
Answer:
Notice the exponential distribution is a special case of gamma distribution.
f (y) = exp(−λy + log λ).
θ = −λ,
b(θ) = − log λ = − log (−θ),
φ = 1,
a(φ) = 1,
c(y, φ) = 0.

(b) What’s the canonical link and variance function for a generalized linear model (GLM) with a
response following the exponential distribution?
Answer:
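$b'(\theta) = \frac{\partial}{\partial\theta}[-\log(-\theta)] = -\frac{1}{\theta}$, so the expected value is $E\{y\} = -\frac{1}{\theta} = \frac{1}{\lambda} \overset{\text{notation}}{=} \mu$.
The canonical link expresses $\theta$ as a function of $\mu$, i.e. $\theta(\mu) = -\frac{1}{\mu}$.
The variance function is $b''(\theta) = \frac{\partial}{\partial\theta}\left[-\frac{1}{\theta}\right] = \frac{1}{\theta^2} \overset{\text{in terms of } \mu}{=} \mu^2$.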
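The two derivatives above can be checked numerically with central finite differences of b(θ) = −log(−θ). This is only an illustrative sketch; the rate λ = 2 and the step sizes are arbitrary choices.

```python
import math

def b(theta):
    # Cumulant function of the exponential distribution (theta < 0)
    return -math.log(-theta)

def first_derivative(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-4):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

lam = 2.0       # hypothetical rate
theta = -lam    # canonical parameter

mean = first_derivative(b, theta)       # approximates b'(theta) = 1 / lam
variance = second_derivative(b, theta)  # approximates b''(theta) = 1 / lam^2 = mean^2
print(mean, variance)
```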

(c) Identify a practical difficulty that may arise when using the canonical link in this instance.
Answer:
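Modeling the canonical link as a linear combination of predictors can result in a negative fitted mean: nothing constrains the linear predictor θ = x'β to be negative, whereas µ = −1/θ > 0 requires θ < 0.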
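A small illustration of this difficulty, using made-up coefficients (β0 = −1.0, β1 = 0.4 are assumptions for the example, not estimates from any data): once the linear predictor crosses zero, the implied mean µ = −1/θ becomes negative.

```python
# Hypothetical coefficients for the linear predictor theta = beta0 + beta1 * x
beta0, beta1 = -1.0, 0.4

def implied_mean(x):
    theta = beta0 + beta1 * x   # canonical parameter modeled linearly
    return -1.0 / theta         # mu = -1 / theta under the canonical link

xs = [0, 1, 2, 3, 4]
means = [implied_mean(x) for x in xs]

# Predictor values whose implied mean is not a valid (positive) exponential mean
invalid = [x for x, m in zip(xs, means) if m <= 0]
print(means)
print(invalid)
```

In practice a log link is a common way around this, since exp(x'β) is positive for any β.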

(d) When comparing nested models in this case, should an F or χ2 test be used? Explain.
Answer:
The dispersion parameter of the exponential distribution is equal to 1, therefore we can compare the models using the deviance test with the χ2 as the reference distribution. An F test is used when the dispersion has to be estimated.

(e) Express the deviance in terms of the responses $y_i$ and the fitted values $\hat\mu_i$.
Answer:
\[
\text{Deviance} = 2\sum_{i} \left\{ \frac{y_i - \hat\mu_i}{\hat\mu_i} - \log\frac{y_i}{\hat\mu_i} \right\}
\]

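3. [Methods qualifying exam, January 2008: use paper and pencil.] The Conway-Maxwell-Poisson distribution has the probability function
\[
P(Y = y) = \frac{\lambda^{y}}{(y!)^{\nu}} \frac{1}{Z(\lambda, \nu)}, \quad y = 0, 1, 2, \ldots
\]
where
\[
Z(\lambda, \nu) = \sum_{i=0}^{\infty} \frac{\lambda^{i}}{(i!)^{\nu}}
\]
(the sum starts at i = 0 so that the probabilities over y = 0, 1, 2, ... add up to 1).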

(a) Place this distribution in an exponential family form with respect to both parameters, and identify
all the relevant components.
Answer:

f (y) = exp{y ln λ − ν ln(y!) − ln Z(λ, ν)}


Therefore,

θ = ln λ
φ = 1
a(φ) = 1
b(θ) = ln Z(λ, ν) = ln Z(e^θ, ν)
c(y, φ) = −ν ln(y!)

(b) Explain why this distribution can be used to model overdispersion for count data.
Answer:
When ν = 1, this is the Poisson distribution, which is a special case of the Conway-Maxwell-Poisson distribution. The Poisson distribution has only one parameter, so its variance cannot be controlled separately from its mean. Here the additional parameter ν adjusts the variance relative to the mean: V(Y) > E(Y) when ν < 1, V(Y) = E(Y) when ν = 1, and V(Y) < E(Y) when ν > 1. Thus, with ν < 1, it can be used to model overdispersion.
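The relation between ν and the dispersion can be illustrated numerically by truncating the normalizing sum. The function name `cmp_moments`, the truncation at 200 terms and the parameter values below are assumptions made for this sketch.

```python
import math

def cmp_moments(lam, nu, terms=200):
    """Mean and variance of the Conway-Maxwell-Poisson distribution,
    computed from a truncated normalizing sum (i = 0, ..., terms - 1)."""
    # Unnormalized log-weights: i * log(lam) - nu * log(i!)
    logw = [i * math.log(lam) - nu * math.lgamma(i + 1) for i in range(terms)]
    m = max(logw)                              # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in logw]
    z = sum(w)
    mean = sum(i * wi for i, wi in enumerate(w)) / z
    second = sum(i * i * wi for i, wi in enumerate(w)) / z
    return mean, second - mean ** 2

mean_lo, var_lo = cmp_moments(2.0, 0.5)   # nu < 1
mean_p, var_p = cmp_moments(2.0, 1.0)     # nu = 1 (Poisson)
mean_hi, var_hi = cmp_moments(2.0, 2.0)   # nu > 1

print(var_lo / mean_lo, var_p / mean_p, var_hi / mean_hi)
```

A variance-to-mean ratio above 1 indicates overdispersion, and a ratio below 1 indicates underdispersion.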

4. [Methods qualifying exam, August 2007: use paper and pencil.] Consider a study intended to investigate
race discrimination in calling fouls by referees in National Basketball Association (NBA). n black
referees and n white referees were randomly selected (n > 0). For each referee, k foul calls were
randomly selected to count how many calls were given to black players and how many calls were given
to white players. Therefore, we have the dataset {(Yi , Xi ), i = 1, . . . , n}, where Yi is the number of
foul calls given to black players by ith referee, and Xi indicates whether the ith referee is black.
(a) Please construct a generalized linear model to serve the study goal:
i. Specify $\theta_i$, $\phi$, $b(\theta_i)$, $a$ and $c(y_i)$, which define the exponential family of distributions of $Y_i$
\[
P(Y_i = y_i \mid X_i) = \exp\left\{ \frac{y_i\theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right\};
\]

Answer:
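$Y_i$ follows a binomial distribution. Let $p_i = p_i(x_i)$ be the probability that a foul is called on a black player. Then, we have
\begin{align*}
P(Y_i = y_i \mid x_i) &= \binom{k}{y_i} p_i^{y_i} (1 - p_i)^{k - y_i} \\
  &= \exp\left\{ y_i \log\frac{p_i}{1 - p_i} + k\log(1 - p_i) + \log\binom{k}{y_i} \right\};
\end{align*}
\begin{align*}
\theta_i &= \log\frac{p_i}{1 - p_i} \\
\phi &= 1 \\
a(\phi) &= 1 \\
b(\theta_i) &= k\log(1 + e^{\theta_i}) \\
c(y_i) &= \log\binom{k}{y_i}
\end{align*}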

ii. State the canonical link function;


Answer:

\begin{align*}
\mu(\theta_i) &= b'(\theta_i) = k\,\frac{e^{\theta_i}}{1 + e^{\theta_i}} \\
g(\mu) &= (b')^{-1}(\mu) = \log\frac{\mu/k}{1 - \mu/k}
\end{align*}

iii. State the variance function.


Answer:

\[
V(\mu) = b''(\theta_i) = k\,\frac{e^{\theta_i}}{(1 + e^{\theta_i})^2} = \mu\left(1 - \frac{\mu}{k}\right)
\]

(b) Using the canonical link function, suppose we have θ̂i = 1.2 − 0.5Xi estimated from the data.
Please interpret the result.
Answer:
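exp(θ̂i) is the odds that a foul call is against a black player, and exp(−0.5) ≈ 0.6065 is the odds ratio comparing black referees (Xi = 1) with white referees (Xi = 0). The odds change by exp(−0.5) − 1 = −0.3935. That is, the odds that a foul called by a black referee is against a black player are 39.35% lower than the same odds for a white referee.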
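The arithmetic behind this interpretation can be reproduced directly from the fitted linear predictor θ̂i = 1.2 − 0.5 Xi given in the question. The function and variable names below are illustrative only.

```python
import math

def theta_hat(x):
    # Fitted canonical parameter (log-odds) from the question: 1.2 - 0.5 * X
    return 1.2 - 0.5 * x

odds_white_ref = math.exp(theta_hat(0))   # odds of a call against a black player, X = 0
odds_black_ref = math.exp(theta_hat(1))   # same odds, X = 1

odds_ratio = odds_black_ref / odds_white_ref   # equals exp(-0.5)
relative_change = odds_ratio - 1.0             # about -0.3935

# Corresponding per-call probabilities p = odds / (1 + odds)
p_white_ref = odds_white_ref / (1.0 + odds_white_ref)
p_black_ref = odds_black_ref / (1.0 + odds_black_ref)

print(round(odds_ratio, 4), round(relative_change, 4))
print(round(p_white_ref, 4), round(p_black_ref, 4))
```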

(c) Suppose k = 1, i.e., only one foul call by each referee was recorded. With this dataset, is it
possible for us to build up a model to predict the referee’s race for a foul call on a black player?
Please explain.
Answer:
It is possible, since we can summarize the data in a 2 × 2 table:

              Referee
Player    Black     White
Black     y1        y2
White     n − y1    n − y2

where y1 is the total number of fouls called on black players by black referees and y2 is the total number of fouls called on black players by white referees. We can use methods for 2 × 2 tables to analyze the data, e.g. the χ2 test.
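As a sketch of the χ2 test mentioned above, the following computes the Pearson statistic for a 2 × 2 table without continuity correction. The counts (n = 50, y1 = 20, y2 = 30) are invented purely for illustration. The p-value uses the fact that a χ2 variable with 1 degree of freedom is a squared standard normal, so P(χ2 > x) = erfc(sqrt(x / 2)).

```python
import math

def pearson_chi2_2x2(table):
    """Pearson chi-square statistic and p-value (1 df) for a 2 x 2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: rows = player (black, white), columns = referee (black, white)
n, y1, y2 = 50, 20, 30
table = [[y1, y2],
         [n - y1, n - y2]]

chi2, p_value = pearson_chi2_2x2(table)
print(chi2, p_value)
```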
