
Austin Kinion

STA 138
SID: 998793649
Project 1
Introduction: A researcher wants to know whether there is a significant difference among
three therapies for curing patients of cocaine dependence (defined as not taking cocaine for at
least 6 months). She tests 500 patients and obtains the results shown in the table below. We
need to determine which of the 8 models described in class is the most parsimonious fit for the
data.

There are three variables in the table: Cure (C), Sex (S), and Therapy (T). Cure can take
the value Positive (i.e., the patient was cured) or Negative (i.e., the patient was not cured), Sex is
classified as Male or Female, and Therapy is any one of the three therapies used to treat the patient.
The table lists the number of patients that fall into each of the 2 x 2 x 3 = 12 different
combinations of the three variables. Just as for two-way contingency tables, the saturated
model provides a complete characterization of the data, equivalent to the information in the
table. What we are looking for is the smallest model that still fits the data adequately.
We will look at each of the 8 models described in class, one by one, to determine which is best.

Materials and methods: I used SAS for the analysis of the data. The SAS code is
provided on the last page of this report for ease of reading. I fit each of the 8
models in SAS and obtained the G², AIC, and BIC values for each of them. After looking at the
SAS output, I determined that Model 4 is the best-fitting model for this data, with
the lowest AIC (85.32), BIC (89.20), and G² (5.59) for the number of terms it uses. Model 7
had the best G², AIC, and BIC values overall, but it contains more terms than
Model 4 and was not significantly different from Model 4, so I chose Model 4.
To check whether Model 4 was significantly different from Model 7, I compared Model 7
(CS, ST, CT) with Model 4 (CS, CT). The difference between the chi-square statistics for the two
models was 7.86 - 1.11 = 6.75, which yields a p-value of .106 on 4 - 2 = 2 degrees of freedom.
This is not a significant difference, so I adopted Model 4 (CS, CT). This
indicates that the interaction between Sex (S) and Therapy (T) does not make a significant
contribution.
A table of all the important values from all models is shown below, with Model 4 and
Model 7 highlighted.

Model 4 indicates that there is an interaction between Cure and Sex as well as between
Cure and Therapy. This can be seen by looking at the odds ratios of the observed data below.

Odds ratios

There is an interaction between Cure and Therapy. This can be seen from the fact that
the odds of a cure for therapy 1 are 91/25 = 3.64, while those for therapy 2 are 79/45 = 1.76. The
odds ratio is 2.07 (i.e., the odds of a cure under therapy 1 are about twice those under therapy 2).
There is an interaction between Cure and Sex. This can be seen from the fact that the
odds of a cure for males are 221/38 = 5.82, while those for females are 136/105 = 1.30. The odds
ratio is 4.49 (i.e., the therapies seem to be much more effective for men than for women).
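The odds and odds ratios quoted above can be re-checked with a few lines of code. This is a minimal Python sketch (the analysis itself was done in SAS); the function names `odds` and `odds_ratio` are my own, and the only inputs are the cured / not-cured counts quoted in the text.

```python
def odds(cured, not_cured):
    """Odds of a cure: number cured divided by number not cured."""
    return cured / not_cured


def odds_ratio(cured_a, not_cured_a, cured_b, not_cured_b):
    """Odds of a cure in group A divided by the odds of a cure in group B."""
    return odds(cured_a, not_cured_a) / odds(cured_b, not_cured_b)


# Counts quoted in the report (cured, not cured).
therapy1 = (91, 25)
therapy2 = (79, 45)
males = (221, 38)
females = (136, 105)

# Therapy 1 versus therapy 2, and males versus females.
or_therapy = odds_ratio(*therapy1, *therapy2)
or_sex = odds_ratio(*males, *females)

print(f"odds therapy 1 = {odds(*therapy1):.2f}, odds therapy 2 = {odds(*therapy2):.2f}")
print(f"odds ratio (therapy 1 vs 2) = {or_therapy:.2f}")
print(f"odds males = {odds(*males):.2f}, odds females = {odds(*females):.2f}")
print(f"odds ratio (males vs females) = {or_sex:.2f}")
```

An odds ratio above 1 means the first group listed has the higher odds of a cure, so the direction of the comparison depends on which group is placed in the numerator.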
Finally, I computed the coefficients for model 4 in SAS as well and they are as follows:
\log(\mu_{ijk}) = \lambda + \lambda_i^{X} + \lambda_j^{Y} + \lambda_k^{Z} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}
= 3.9816 - 1.01 - 1.07 - 0.48 + 0.2845 + 1.502 + 0.3513 - 0.3779
= 3.1799,
\mu_{ijk} = 23.996
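The step from the coefficients to the fitted cell count is a sum followed by an exponential. The Python sketch below repeats that arithmetic using the coefficients exactly as printed above. Which coefficient belongs to which model term is not stated here, so the list is treated only as a set of numbers to add; the variable names are my own.

```python
import math

# Coefficients as printed in the report, in the order given.
# The mapping of each value to a specific model term is not stated in the text.
terms = [3.9816, -1.01, -1.07, -0.48, 0.2845, 1.502, 0.3513, -0.3779]

# Log of the fitted cell count is the sum of the terms;
# the fitted count itself is the exponential of that sum.
log_mu = sum(terms)
mu = math.exp(log_mu)

print(f"log(mu) = {log_mu:.4f}")
print(f"mu      = {mu:.3f}")
```

With the printed (rounded) coefficients this gives a log value of about 3.18 and a fitted count of about 24, close to the reported 3.1799 and 23.996. The small gap is presumably due to rounding in the printed coefficients.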

Conclusion and Results: I have determined that, of the 8 models described in class,
Model 4 is the most parsimonious fit for this data. This was determined after fitting each of
the models in SAS and comparing them on AIC, BIC, and G². This means that there is an interaction
between Cure and Therapy and also an interaction between Cure and Sex.
I was able to determine that Model 4 was not significantly different from Model 7 by
comparing the chi-square statistics for the two models. Their difference was 7.86 - 1.11 = 6.75,
which yields a p-value of .106 on 4 - 2 = 2 degrees of freedom. Since this is not a
significant difference, I adopted Model 4 as my most parsimonious fit. It is also
important to note that since Model 4 was the best fit, the interaction between
Sex and Therapy does not make a significant contribution.

The important SAS output for Model 4 is provided below:
