
THE CHINESE UNIVERSITY OF HONG KONG

Department of Statistics

STAT4006: Categorical Data Analysis


Problem Sheet 1

The deadline for this Problem Sheet is 5.30pm on Tuesday 6th October.
Please hand in your answers to Wang Chaojie in LSB Rm141 or into box 18.
No late submissions will be accepted. A late submission will receive a
mark of zero. Students may discuss set problems with others, but their final
submissions must be their own work.
Please answer the following problems.
1. In the following examples, identify the response variable and the explana-
tory variables.
(a) Attitude toward gun control (favor, oppose), Gender (female, male),
Mother’s education (high school, college).
(b) Heart disease (yes, no), Blood pressure, Cholesterol level.
(c) Race (white, nonwhite), Religion (Catholic, Jewish, Protestant), Vote
for president (Democrat, Republican, Other), Annual income.
(d) Marital status (married, single, divorced, widowed), Quality of life
(excellent, good, fair, poor).

2. The data in the following table is obtained from a multinomial
distribution.

Cell 1 2 3 4 5
Probability π1 π2 π3 π4 π5
Frequency 10 13 23 21 29

Table 1: Multinomial Data

(a) Test with α = 0.05 the null hypothesis H0 : π1 = 0.1, π2 = 0.1,
π3 = 0.25, π4 = 0.25 by using the Pearson chi-square test and the
likelihood ratio test.
(b) Derive the maximum likelihood estimates of πi , i = 1, . . . , 5 under
the null hypothesis H0 : π1 = π2 , π3 = π4 .
(c) Test with α = 0.05 the null hypothesis H0 : π1 = π2 , π3 = π4 by
using the Pearson chi-square test and the likelihood ratio test.
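
For a fully specified null such as the one in part (a), both statistics can be computed directly from the observed and expected counts. The following is a minimal sketch, not a model answer: it assumes π5 = 0.3 (inferred from the requirement that the probabilities sum to one), and the function name gof_statistics and the tabulated critical value 9.488 (chi-square, 4 degrees of freedom, upper 5% point) are choices made for illustration.

```python
import math

# Observed cell counts from Table 1.
observed = [10, 13, 23, 21, 29]

# Null probabilities from part (a); pi5 = 0.30 is an assumption inferred
# from the constraint that the five probabilities sum to one.
null_probs = [0.10, 0.10, 0.25, 0.25, 0.30]


def gof_statistics(obs, probs):
    """Return (Pearson X^2, likelihood-ratio G^2) for a fully specified null."""
    n = sum(obs)
    expected = [n * p for p in probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(obs, expected))
    # A cell with o == 0 contributes 0 to G^2, so it is skipped.
    g2 = 2 * sum(o * math.log(o / e) for o, e in zip(obs, expected) if o > 0)
    return x2, g2


x2, g2 = gof_statistics(observed, null_probs)
df = len(observed) - 1   # no parameters are estimated under this null
critical = 9.488         # tabulated 0.95 quantile of chi-square with 4 df

print(f"X^2 = {x2:.3f}, G^2 = {g2:.3f}, df = {df}")
print("Pearson:", "reject H0" if x2 > critical else "do not reject H0")
print("Likelihood ratio:", "reject H0" if g2 > critical else "do not reject H0")
```

The same function could be reused for part (c) once the expected counts are replaced by those implied by the constrained maximum likelihood estimates from part (b); the degrees of freedom then change because parameters are estimated.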

3. The following table gives a random sample of size 150 of the random
variable X. Do you think X follows the Poisson distribution? (α = 0.05).

Values of X 0 1 2 3 4 5 6 7 8 9
Frequency 5 11 18 26 29 25 15 10 7 4

Table 2: Poisson Data
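
One possible way to set up the goodness-of-fit calculation for Table 2 is sketched below. Several choices here are assumptions rather than part of the problem: the rate is estimated by the sample mean, the last cell is treated as "9 or more" so that the fitted probabilities sum to one, no cells are pooled (pooling cells with small expected counts is a common alternative), and the critical value 15.507 is the tabulated upper 5% point of chi-square with 8 degrees of freedom.

```python
import math

# Values and frequencies from Table 2.
values = list(range(10))
freq = [5, 11, 18, 26, 29, 25, 15, 10, 7, 4]

n = sum(freq)
# Assumption: estimate the Poisson rate by the sample mean.
lam = sum(v * f for v, f in zip(values, freq)) / n


def poisson_pmf(k, rate):
    """Poisson probability mass at k for the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


# Cells 0..8 use the pmf; the last cell is treated as P(X >= 9) (assumption).
probs = [poisson_pmf(k, lam) for k in range(9)]
probs.append(1 - sum(probs))
expected = [n * p for p in probs]

x2 = sum((o - e) ** 2 / e for o, e in zip(freq, expected))
df = len(freq) - 1 - 1   # cells minus one, minus one estimated parameter
critical = 15.507        # tabulated 0.95 quantile of chi-square with 8 df

print(f"lambda_hat = {lam:.2f}, X^2 = {x2:.3f}, df = {df}")
print("reject Poisson" if x2 > critical else "do not reject Poisson")
```

Pooling the first two cells (whose first expected count is below 5) would reduce the degrees of freedom by one and may change the statistic slightly.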

4. The following table is from a report on the relationship between aspirin
use and myocardial infarction (heart attacks) by the Physicians’ Health
Study Research Group at Harvard Medical School. Find the P-value for
testing that the incidence of heart attacks is independent of aspirin intake
using chi-square. Interpret your results.

Myocardial Infarction
Group Yes No
Placebo 158 10321
Aspirin 71 10410

Table 3: Heart Attack Data
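
A minimal sketch of the Pearson chi-square test of independence for the 2 x 2 heart attack table follows. It assumes no continuity correction is wanted, and it uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so the P-value can be written with math.erfc. The function name pearson_chi2_2x2 is hypothetical.

```python
import math

# Rows: Placebo, Aspirin. Columns: MI yes, MI no (Table 3).
table = [[158, 10321], [71, 10410]]


def pearson_chi2_2x2(t):
    """Pearson X^2 for independence in a 2x2 table (no continuity correction)."""
    row = [sum(r) for r in t]
    col = [t[0][j] + t[1][j] for j in range(2)]
    n = sum(row)
    x2 = 0.0
    for i in range(2):
        for j in range(2):
            e = row[i] * col[j] / n   # expected count under independence
            x2 += (t[i][j] - e) ** 2 / e
    return x2


x2 = pearson_chi2_2x2(table)
# For 1 df: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2 / 2))

print(f"X^2 = {x2:.2f}, P-value = {p_value:.2e}")
```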
5. An analysis of campus accident data was made to determine the distribu-
tion of numbers of fatal accidents for automobiles of two sizes. The data
for 16 accidents are given in the following table. Do the data indicate that
the frequency of fatal accidents is independent of the size of automobiles?

Size of auto
Small Large Total
Fatal 2 6 8
not Fatal 4 4 8
Total 6 10 16

Table 4: Campus Accident Data
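
With only 16 accidents the expected counts are small, so the chi-square approximation may be poor. The problem does not name a test; the sketch below assumes Fisher's exact test is acceptable and computes it from the hypergeometric distribution with both margins fixed. The two-sided P-value uses one common convention (summing all tables no more probable than the observed one), which is also an assumption.

```python
from math import comb

# Table 4: rows are Fatal / Not fatal, columns are Small / Large.
a, b = 2, 6
c, d = 4, 4

row1, row2 = a + b, c + d
col1 = a + c
n = row1 + row2


def weight(k):
    """Number of tables with k small-car fatal accidents, margins fixed."""
    return comb(row1, k) * comb(row2, col1 - k)


k_min = max(0, col1 - row2)
k_max = min(row1, col1)
total = comb(n, col1)

obs_w = weight(a)
# Two-sided: all tables at most as probable as the observed one.
two_sided = sum(
    weight(k) for k in range(k_min, k_max + 1) if weight(k) <= obs_w
) / total
# One-sided: tables with as few or fewer small-car fatal accidents.
one_sided = sum(weight(k) for k in range(k_min, a + 1)) / total

print(f"two-sided P = {two_sided:.4f}, one-sided P = {one_sided:.4f}")
```

Integer weights are compared rather than floating-point probabilities so that equally probable tables are treated as ties exactly.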

6. For adults who sailed on the Titanic on its fateful voyage, the odds ratio
between gender (female, male) and survival (yes, no) was 11.4 (for data,
see R. Dawson, J. Statist. Educ. 3, no. 3, 1995).
(a) What is wrong with the interpretation, “The probability of survival
for females was 11.4 times that for males”? Give the correct inter-
pretation. When would the quoted interpretation be approximately
correct?
(b) The odds of survival for females equaled 2.9. For each gender, find
the proportion who survived.
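
The conversion needed in part (b) rests on two identities: odds = p / (1 - p), so p = odds / (1 + odds), and the odds ratio is the female odds divided by the male odds. A small sketch, with variable names chosen here for illustration:

```python
# Quantities quoted in the problem.
odds_ratio = 11.4     # female odds of survival / male odds of survival
odds_female = 2.9

# The male odds follow from the definition of the odds ratio.
odds_male = odds_female / odds_ratio


def odds_to_prob(odds):
    """Invert odds = p / (1 - p)."""
    return odds / (1 + odds)


p_female = odds_to_prob(odds_female)
p_male = odds_to_prob(odds_male)

print(f"female: {p_female:.3f}, male: {p_male:.3f}")
print(f"ratio of probabilities: {p_female / p_male:.2f}")
```

Printing the ratio of the two probabilities alongside the odds ratio makes the distinction asked about in part (a) concrete.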

7. The following table is based on records of accidents in 1988 compiled by
the Department of Highway Safety and Motor Vehicles in Florida.

Injury
Safety Equipment in Use Fatal Non-fatal
None 1598 162526
Seat belt 502 412360

Table 5: Highway Safety Data

(a) Find and interpret the difference of proportions, relative risk, and
odds ratio. Why are the relative risk and odds ratio approximately
equal?
(b) Construct 95% confidence intervals for the difference of proportions
and the odds ratio, and interpret.
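
The three measures and their intervals can be sketched as follows. The intervals are assumed to be the usual large-sample Wald intervals (on the original scale for the difference of proportions, on the log scale for the odds ratio) with z = 1.96; other interval methods exist and the problem does not specify one.

```python
import math

# Table 5: (fatal, non-fatal) counts for each safety-equipment group.
none_fatal, none_nonfatal = 1598, 162526
belt_fatal, belt_nonfatal = 502, 412360

n1 = none_fatal + none_nonfatal
n2 = belt_fatal + belt_nonfatal
p1 = none_fatal / n1   # proportion fatal, no equipment
p2 = belt_fatal / n2   # proportion fatal, seat belt

diff = p1 - p2
relative_risk = p1 / p2
odds_ratio = (none_fatal * belt_nonfatal) / (none_nonfatal * belt_fatal)

z = 1.96  # standard normal quantile for a 95% interval

# Wald interval for the difference of proportions (assumed method).
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff_ci = (diff - z * se_diff, diff + z * se_diff)

# Wald interval for the log odds ratio, then exponentiated (assumed method).
se_log_or = math.sqrt(
    1 / none_fatal + 1 / none_nonfatal + 1 / belt_fatal + 1 / belt_nonfatal
)
log_or = math.log(odds_ratio)
or_ci = (math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or))

print(f"difference = {diff:.5f}, 95% CI = ({diff_ci[0]:.5f}, {diff_ci[1]:.5f})")
print(f"relative risk = {relative_risk:.3f}")
print(f"odds ratio = {odds_ratio:.3f}, 95% CI = ({or_ci[0]:.3f}, {or_ci[1]:.3f})")
```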

THE END
