
Using backward selection, all variables involved in the study are first fitted to the model.

Table __ shows the summary of the full model.


Variable               Description       Coefficient  Standard error  P-value
Sex                    Male              -0.059       0.169           0.72
                       Female (R)
Age                                      0.070        0.007           <0.001
Hypertension           1                 0.515        0.186           0.005
                       0 (R)
Heart disease          1                 0.452        0.220           0.040
                       0 (R)
Ever married           Yes               -0.125       0.267           0.64
                       No (R)
Average glucose level                    0.005        0.001           <0.001
BMI                                      0.008        0.013           0.56
Smoking status         Formerly smoked   0.081        0.191           0.670
                       Smokes            0.422        0.216           0.051
                       Never smoked (R)

(R) denotes the reference category.
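
The text does not say how the p-values in this table were computed. Summary tables of this kind usually report Wald tests, so as a rough check under that assumption, the sketch below computes z = coefficient / standard error and a two-sided normal p-value using only the Python standard library. The function name `wald_p_value` is illustrative. Because the inputs are the rounded table entries, the results only approximately match the reported p-values.

```python
import math

def wald_p_value(coef: float, se: float) -> float:
    """Two-sided p-value for H0: coefficient = 0, assuming z = coef / se is standard normal."""
    z = coef / se
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Rounded (coefficient, standard error) pairs copied from the full-model table.
rows = {
    "Sex (male)": (-0.059, 0.169),
    "Hypertension": (0.515, 0.186),
    "Heart disease": (0.452, 0.220),
}

for name, (coef, se) in rows.items():
    print(f"{name:15s} z = {coef / se:6.2f}  p = {wald_p_value(coef, se):.3f}")
```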

We then test the significance of the variable sex, as it has the highest p-value in
the full model. Using a likelihood ratio test (LRT), the null hypothesis is that the
coefficient of sex is zero, that is, the reduced model excluding sex fits as well as
the full model. The results show that sex does not contribute significantly to the
model, G² (df=1) = 0.12, p-value = 0.73.
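
The LRT compares the maximized log-likelihoods of the two nested models: G² = 2(logL_full − logL_reduced), referred to a chi-square distribution whose df equals the number of parameters dropped. The following is a minimal sketch using only the Python standard library. The names `lrt_statistic` and `chi2_sf` are illustrative, and the closed-form tail probabilities are implemented only for df = 1 and df = 2, the two cases that occur in this analysis.

```python
import math

def lrt_statistic(loglik_full: float, loglik_reduced: float) -> float:
    """G^2 = 2 * (logL_full - logL_reduced) for nested models."""
    return 2.0 * (loglik_full - loglik_reduced)

def chi2_sf(g2: float, df: int) -> float:
    """Upper-tail chi-square probability P(X > g2); closed forms for df = 1 and 2 only."""
    if g2 < 0:
        raise ValueError("G^2 must be non-negative")
    if df == 1:
        # A chi-square(1) variable is a squared standard normal.
        return math.erfc(math.sqrt(g2 / 2.0))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return math.exp(-g2 / 2.0)
    raise NotImplementedError("this sketch covers df = 1 and df = 2 only")

# Reported statistic for dropping sex: G^2 = 0.12 on 1 degree of freedom.
p_sex = chi2_sf(0.12, df=1)
print(f"p-value = {p_sex:.2f}")
```

For the reported G² = 0.12 with df = 1 this evaluates to about 0.73, in line with the p-value given above.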

Table __ shows the summary of the fitted model after removing variable sex.

Variable               Description       Coefficient  Standard error  P-value
Age                                      0.070        0.007           <0.001
Hypertension           1                 0.514        0.186           0.006
                       0 (R)
Heart disease          1                 0.443        0.219           0.043
                       0 (R)
Ever married           Yes               -0.129       0.267           0.63
                       No (R)
Average glucose level                    0.005        0.001           <0.001
BMI                                      0.008        0.013           0.55
Smoking status         Formerly smoked   0.072        0.189           0.704
                       Smokes            0.414        0.215           0.054
                       Never smoked (R)

Another LRT was performed to test the contribution of the variable ever married,
which has a p-value of 0.63 in this model. The test shows only weak evidence against
the null hypothesis, G² (df=1) = 0.23, p-value = 0.63, so the reduced model is
selected.

Variable               Description       Coefficient  Standard error  P-value
Age                                      0.070        0.007           <0.001
Hypertension           1                 0.515        0.186           0.006
                       0 (R)
Heart disease          1                 0.447        0.219           0.041
                       0 (R)
Average glucose level                    0.005        0.001           <0.001
BMI                                      0.008        0.013           0.56
Smoking status         Formerly smoked   0.068        0.189           0.719
                       Smokes            0.411        0.214           0.055
                       Never smoked (R)

An LRT was then performed with the reduced model excluding the variable BMI, since it
has a p-value of 0.56. The test shows that BMI does not contribute significantly to
the model, G² (df=1) = 0.34, p-value = 0.56.

Variable               Description       Coefficient  Standard error  P-value
Age                                      0.069        0.006           <0.001
Hypertension           1                 0.527        0.185           0.004
                       0 (R)
Heart disease          1                 0.442        0.219           0.04
                       0 (R)
Average glucose level                    0.005        0.001           <0.001
Smoking status         Formerly smoked   0.070        0.189           0.71
                       Smokes            0.410        0.214           0.06
                       Never smoked (R)

Another LRT was performed with the reduced model excluding the variable smoking
status. The results show that smoking status does not contribute significantly to
the model, G² (df=2) = 3.60, p-value = 0.16. Table __ shows the summary of the
reduced model containing the four variables age, hypertension, heart disease, and
average glucose level.

Variable               Description       Coefficient  Standard error  P-value
Age                                      0.067        0.006           <0.001
Hypertension           1                 0.518        0.184           0.005
                       0 (R)
Heart disease          1                 0.488        0.217           0.02
                       0 (R)
Average glucose level                    0.005        0.001           <0.001
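
The elimination steps above can be sketched as a generic loop. This is an illustration, not the procedure as actually run: the function `backward_eliminate`, the callback `lrt_p_value` (assumed to refit the model without a given term and return the LRT p-value), and the 0.05 threshold are all assumptions, since the source names neither the software nor a significance level. The sketch also picks the candidate by LRT p-value, whereas the text picks it from the p-values in the summary table and then confirms with an LRT.

```python
from typing import Callable, List, Sequence, Tuple

def backward_eliminate(
    terms: Sequence[str],
    lrt_p_value: Callable[[List[str], str], float],
    alpha: float = 0.05,  # assumed threshold; the source does not state one
) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Repeatedly drop the term whose removal has the largest LRT p-value above alpha."""
    current = list(terms)
    removed: List[Tuple[str, float]] = []
    while current:
        # LRT p-value for the current model versus the model lacking each term.
        p_values = {term: lrt_p_value(current, term) for term in current}
        worst = max(p_values, key=p_values.get)
        if p_values[worst] <= alpha:
            break  # every remaining term contributes significantly
        current.remove(worst)
        removed.append((worst, p_values[worst]))
    return current, removed
```

In practice `lrt_p_value` would wrap whatever model-fitting routine is in use; the loop only needs the p-value for each candidate removal.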

All possible interaction effects for this model were investigated, but none
contributed significantly.

Automatic search procedures were also run; forward, backward, and stepwise selection
all produced the same model. Best subset selection produced a model containing age,
hypertension, average glucose level, and smoking status. Therefore, we have the
following competing models:

Model 1: stroke~age+hypertension+heart disease+average glucose level


Model 2: stroke~age+hypertension+average glucose level+smoking status

All possible interaction effects for these two models were investigated, but none were
significant. The AIC was used to choose the best model. From Table __, Model 1 has a
lower AIC (1144.5) than Model 2 (1146.8).

Competing models                                                        AIC
Model 1: stroke~age+hypertension+heart disease+average glucose level    1144.5
Model 2: stroke~age+hypertension+average glucose level+smoking status   1146.8
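
AIC is defined as 2k − 2 logL, where k is the number of estimated parameters and logL is the maximized log-likelihood; the model with the smaller value is preferred. The source reports only the AIC values, not the log-likelihoods, so the sketch below states the formula for reference and then compares the two reported values. The names `aic` and `reported_aic` are illustrative.

```python
def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 logL (smaller is better)."""
    return 2 * n_params - 2 * loglik

# AIC values as reported in the table of competing models.
reported_aic = {
    "Model 1": 1144.5,
    "Model 2": 1146.8,
}

best = min(reported_aic, key=reported_aic.get)
delta = reported_aic["Model 2"] - reported_aic["Model 1"]
print(f"Preferred: {best} (AIC difference = {delta:.1f})")
```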
