Identification of outliers, Leverage for GLMs is measured using the hat
6
Models for count data
This chapter deals with GLMs where the response is a count. Examples of
count data models in insurance are:
studies
¢ In mortal imber of deaths in terms of
insurance, we to explain the number of claims made
ividuals or groups of individuals in terms of explanatory
ber of
of the
color of the car, engine capacity, previous claims experience, and so on.
6.1 Poisson regression
Poisson regression, the mean
variables a via an appropriate link.
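As a small illustration of a Poisson mean tied to an explanatory variable through a link, the sketch below assumes a log link and a single explanatory variable. The coefficient values `b0` and `b1` and the function names are hypothetical, chosen only for illustration and not taken from the text.

```python
import math

def poisson_mean_log_link(x, b0, b1):
    """Mean of a Poisson response under a log link: ln(mu) = b0 + b1 * x."""
    return math.exp(b0 + b1 * x)

def poisson_pmf(k, mu):
    """P(Y = k) for a Poisson response with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

# Hypothetical coefficients (assumptions, not fitted values from the text).
b0, b1 = -1.0, 0.05

mu_30 = poisson_mean_log_link(30, b0, b1)
mu_31 = poisson_mean_log_link(31, b0, b1)

# Under the log link, a one-unit increase in x multiplies the mean by exp(b1).
ratio = mu_31 / mu_30
```

With a log link the effect of an explanatory variable is multiplicative on the mean, which is why `ratio` equals `exp(b1)` whatever starting value of `x` is used.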
jereases the expected va
Ale +1} -
Table 6.1. Poisson models, chil
Response variable Number of children
Response distribution Poisson
Link log
Deviance 165.0
Degrees of freedom 139
Parameter af a se ox
“Tnercept LSS 8090 0 the effect is an increase, since
3
log link yields
4.000 4 0.1132
Fig. 6.1. Number of children versus mother's age
adequate fits: 165.0 on 139 degrees
1.4.0n 139 degrees of freedom for the
her's age. These relationships
during the childbearing years.
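The fit figures quoted above (a deviance of 165.0 on 139 degrees of freedom) can be put in context with a short sketch of how a Poisson deviance is computed. The function name and the toy inputs used to exercise it are assumptions for illustration. Reading a deviance-to-degrees-of-freedom ratio near 1 as a sign of adequate fit is a common rule of thumb, not a formal test.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum( y*ln(y/mu) - (y - mu) ), with y*ln(y) = 0 at y = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Figures quoted in the text for the fitted model.
deviance, df = 165.0, 139

# Rule-of-thumb check: a ratio close to 1 suggests no gross lack of fit.
ratio = deviance / df
```

Here `ratio` comes out a little under 1.2, in line with the description of the fit as adequate.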
th 2 an indicator vari
Table 6.
lumber of children, Poisson m
tf sven an. 4p AB Variables a5 discussed below. Population sie nem
Source A dt? pvalue model as an offs
“Sha imo ea Fo siny
Tnkrmode! —ineepr 1984
on Age 165.0 1 294 <0.0001 are considered first,
Tae nk model necep oo
Age 141 230
tive binomial and Pois-
ss of 0.5 at
Fig. 6.4. Observed and fitted Swedish male death rates
6.2 Poisson overdispersion and negative binomial regression
Fig. 6.5. Fitted Swedish male death rates using negative binomial model
yields a much better fit: a deviance of 2838 on 5704 degrees of freedom. The
fitted regression coefficients are displayed in Figure 6.5; see code and output
on page 157.
+ Bp? + Band +o + Dptgd®
AIC as a selection criterion for p and q yie
And hence a model with 30 parameters. The AIC
53 898, compared to a value of 54 023 corresponding,
cients, Figure 6.5 and the bottom ps
fits are appropriat
bredicted end the more recent data is seen a5 1
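The AIC comparison quoted above (53 898 against 54 023) can be sketched as follows. The helper `aic` uses the standard definition, minus twice the log-likelihood plus twice the number of parameters; the variable names are assumptions, and the log-likelihoods behind the two quoted AIC values are not given in the text, so only the quoted figures are compared.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params

# AIC values quoted in the text; lower is preferred.
aic_selected = 53898
aic_alternative = 54023

# Positive difference means the selected model is preferred under AIC.
improvement = aic_alternative - aic_selected
```

Because the penalty term grows with the number of parameters, a richer polynomial model is only selected when its gain in log-likelihood outweighs the extra terms.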
SAS notes. Numerical problems often occur when there are a large number of
polynomial terms, as to avoid these numerical problems. These are
genmod, but are av
Dispersion
Variance parameter
@=1
on é=9102
wenn)
Poisson and quasi-likelihood (Poisson
| Noone oatunderlying the overdispers
Which are Of the two analyses are usual
lent. The only difference between the Poisson
variance) models is an
parameter estimates.
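A minimal sketch of how a quasi-likelihood analysis with Poisson variance is usually carried out in practice: the coefficient estimates are those of the Poisson fit, a dispersion parameter is estimated (here from the Pearson chi-square statistic, one common choice), and the Poisson standard errors are scaled by its square root. The function names, the toy observations, the toy fitted means and the toy standard errors are all made up for illustration and are not taken from the text.

```python
import math

def pearson_dispersion(y, mu, n_params):
    """Estimate the dispersion as Pearson chi-square / residual degrees of freedom."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)

def quasi_poisson_se(poisson_se, phi_hat):
    """Scale Poisson standard errors by sqrt(phi_hat); the estimates are unchanged."""
    return [se * math.sqrt(phi_hat) for se in poisson_se]

# Toy data (assumed): observed counts and means from some hypothetical Poisson fit.
y = [0, 3, 1, 8, 2, 10]
mu = [1.0, 2.0, 2.0, 5.0, 4.0, 10.0]

phi_hat = pearson_dispersion(y, mu, n_params=2)

# Hypothetical Poisson standard errors for two coefficients.
scaled_se = quasi_poisson_se([0.10, 0.02], phi_hat)
```

A value of `phi_hat` well above 1 signals overdispersion, and the scaled standard errors are correspondingly wider than the Poisson ones.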
6.4 Counts and frequencies
Counts are often converted to frequencies
“exposed
ion of age and other risk factors.
the proportion of deaths
is the total number in the sample who are
models for disease frequency
the number of people exposed to the risk
frequencies, they cannot be modeled using a
ch as the Poisson or ne
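One standard way to keep a count distribution while still modelling a frequency is to model the count with a log link and enter the log of the number exposed as an offset. The sketch below illustrates this with a single explanatory variable; the coefficients `b0` and `b1`, the exposure figures and the function name are hypothetical.

```python
import math

def expected_count(exposure, x, b0, b1):
    """Expected count with log link and offset: ln(mu) = ln(exposure) + b0 + b1 * x."""
    return math.exp(math.log(exposure) + b0 + b1 * x)

# Hypothetical coefficients (assumptions, not estimates from the text).
b0, b1 = -6.0, 0.08

# Two groups with the same risk profile (x = 50) but different numbers exposed.
count_small = expected_count(1000, 50, b0, b1)
count_large = expected_count(5000, 50, b0, b1)

# The implied frequency (count per person exposed) does not depend on exposure.
rate_small = count_small / 1000
rate_large = count_large / 5000
```

The offset has its coefficient fixed at 1, so the expected count scales in proportion to exposure while the regression coefficients describe the frequency.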
insurance data set, described on page 15.
6.2 nswdeaths2002 contains all-cause mortality
New South Wales, Austral
63
7
Categorical responses
Categorical variables take on one of a discrete number of categories. For example,
a person is either male or female
employment status and so on.
ul variable is where the
occurrence and non-occurrence of the event
Categorical variables
tegories are ordered, although
the case that 4 is twice as bad as 2
lis the response to
be explained in terms of other explanatory variables. This is dealt
current chapter.
7.1 Binary responses
Consider a binary response variat
indicates whether a person die
im is to explain y in
probability that y =