Group Members Name and ID. Number
PART I: Categorical Data Analysis
1. Use regular health checkup, BMI, age and exercise as predictors and describe your finding
(NB: Both crude and adjusted coefficients shall be reported and the variable BMI should be
included as a categorical variable).
A. Heart attack versus exercise
H0: There is no association between heart attack and exercise
HA: There is an association between heart attack and exercise
Response variable: heart attack
Independent variables: exercise
No exercise = 0 (reference category); Exercise regularly = 1
Table 1: SPSS output for logistic regression
Variables in the Equation
                                                                             95% C.I. for Exp(B)
                                   B       S.E.   Wald      df   Sig.   Exp(B)   Lower   Upper
Step 1(a)  exercises regularly(1)  -2.522  .210   143.833   1    .000   .080     .053    .121
           Constant                2.053   .172   141.909   1    .000   7.789
a. Variable(s) entered on step 1: exercises regularly.
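The Exp(B) column and its confidence limits in Table 1 follow directly from B and S.E. The sketch below recomputes them, assuming the usual Wald interval exp(B ± 1.96 × S.E.); the function name is illustrative and not part of the SPSS output.

```python
import math

# Values copied from Table 1 (exercises regularly vs. no exercise).
B = -2.522   # logistic regression coefficient (log-odds scale)
SE = 0.210   # standard error of B
Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumed Wald interval)

def odds_ratio_with_ci(b, se, z=Z_95):
    """Return (odds ratio, lower limit, upper limit) from a log-odds coefficient."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

odds_ratio, lower, upper = odds_ratio_with_ci(B, SE)
print(f"OR = {odds_ratio:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

To three decimals this agrees with the reported .080 (.053, .121); small differences can arise elsewhere because SPSS works with unrounded coefficients.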
Here the model chi-square is highly significant (chi-square = 182.581, df = 1, p < 0.001).
So the full model (with the independent variable) is significantly better than the null model
(without independent variables).
So our model with the independent variable is significantly better than the model without independent
variables (chi-square = 59.146, df = 1, p < 0.001).
D. Heart attack versus age
H0: There is no association between heart attack and age
HA: There is an association between heart attack and age
Outcome variable: heart attack; independent variable: age in years
Age is a statistically significant risk factor for heart attack (p < 0.001, COR = 1.293). The odds
of having a heart attack increase by 29.3% (COR = 1.293) for each one-year increase in age.
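Because the odds ratio is per year, the effect of a larger age difference is obtained by raising it to a power. The sketch below illustrates this, assuming the effect of age is linear on the log-odds scale (which is what entering age as a continuous predictor implies); the age gaps of 5 and 10 years are illustrative choices.

```python
# Crude odds ratio per one-year increase in age, as reported above.
COR_PER_YEAR = 1.293

def odds_ratio_for_increase(or_per_unit, units):
    """Odds ratio for an increase of `units`, assuming a log-linear effect."""
    return or_per_unit ** units

for years in (1, 5, 10):
    or_k = odds_ratio_for_increase(COR_PER_YEAR, years)
    print(f"{years:>2}-year increase in age: OR = {or_k:.3f}")
```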
df =6, p-value=0.824) suggesting that the model is a good fit to the data.
Table 10 shows that the chi-square is highly significant (chi-square = 43.646, df = 1, p < 0.001). So
the full model (with the independent variable) is significantly better than the null model
(without independent variables).
In the multivariable analysis, multiple logistic regression was performed using the forward
stepwise method in order to identify independent risk factors for the onset of heart attack.
Model 1 can be compared with model 2 and model 3 to evaluate the potential confounding
effect of the variables age and BMI_cat. The Exp(B) for regular exercise in model 1 is
0.080 (COR). After inclusion of the variable age it decreased to 0.054 (AOR). The crude model
(model 1) therefore yields an estimated odds ratio that is somewhat higher than the
corresponding estimate obtained when we adjust for age. Since AOR < COR, there is confounding
due to age, which model 2 controls for. When we compare model 1 with model 3, the OR of
regular exercise changes from 0.080 to 0.063 (AOR).
The width of the confidence interval for regular exercise is 0.068 in model 1, 0.053 in model 2
and 0.063 in model 3. Therefore, model 2 gives a more precise estimate of the odds ratio than
do models 1 and 3.
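One common way to put a number on the crude-versus-adjusted comparison is the change-in-estimate approach. The sketch below applies it to the odds ratios reported for regular exercise; the 10% cut-off is an assumed rule of thumb and is not taken from this report.

```python
# Odds ratios for regular exercise reported in the text.
COR = 0.080          # model 1 (crude)
AOR_MODEL_2 = 0.054  # adjusted for age
AOR_MODEL_3 = 0.063  # adjusted for age and BMI_cat

THRESHOLD = 10.0  # assumed rule-of-thumb cut-off (percent), not from the report

def percent_change(crude, adjusted):
    """Relative change of the adjusted estimate from the crude one, in percent."""
    return abs(crude - adjusted) / crude * 100.0

for label, aor in (("model 2", AOR_MODEL_2), ("model 3", AOR_MODEL_3)):
    change = percent_change(COR, aor)
    verdict = "suggests confounding" if change > THRESHOLD else "little confounding"
    print(f"{label}: {change:.1f}% change from the COR -> {verdict}")
```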
Table 12: SPSS output for logistic regression
Omnibus Tests of Model Coefficients
                 Chi-square   df   Sig.
Step 1   Step    182.581      1    .000
         Block   182.581      1    .000
         Model   182.581      1    .000
Step 2   Step    67.061       1    .000
         Block   249.642      2    .000
         Model   249.642      2    .000
Step 3   Step    24.995       1    .000
         Block   274.637      3    .000
         Model   274.637      3    .000
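The omnibus table can be read as a running total: each step chi-square (df = 1) is the improvement from adding one variable, and the model chi-square is their sum. The sketch below checks that arithmetic and computes each step's p-value using the closed form for a chi-square with 1 degree of freedom; the helper name is illustrative.

```python
import math

# Step chi-squares from the Omnibus Tests table (each step adds one term, df = 1).
STEP_CHI_SQUARES = [182.581, 67.061, 24.995]
REPORTED_MODEL_CHI_SQUARES = [182.581, 249.642, 274.637]

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

running_total = 0.0
model_chi_squares = []
for step, chi_sq in enumerate(STEP_CHI_SQUARES, start=1):
    running_total += chi_sq
    model_chi_squares.append(round(running_total, 3))
    print(f"Step {step}: step chi-square = {chi_sq}, p = {chi2_sf_df1(chi_sq):.2e}, "
          f"model chi-square = {running_total:.3f}")
```

All three step p-values are far below 0.001, which SPSS displays as .000.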
From table 13, the -2 log-likelihood (-2LL) decreases from model 1 to model 3, so the full
model fits significantly better than the null model. According to Nagelkerke's R2, model 3
explains 49% of the variation in the outcome. The chi-square is significant for all three
models. For these reasons, model 3 is the best model.
Logistic regression Equation for fitted model:
Looking first at the results for BMI_CAT(1), there is a significant effect (Wald = 19.696,
df = 1, p < 0.001). BMI_CAT(1) (BMI ≥ 30) contributes significantly to heart attack as
compared to BMI < 30. Participants with BMI ≥ 30 have about five and a half times the odds
of heart attack of those with BMI < 30, controlling for the other independent variables
(AOR = 5.495; 95% CI: 2.589, 11.662).
Regular exercise reduces the odds of heart attack by 93.7% (AOR = 0.063; 95% CI: 0.039,
0.102), controlling for the other variables. It has a significant effect on heart attack (p < 0.001).
The odds of having a heart attack increase by 48.6% for each additional year of age
(AOR = 1.486; CI: 1.339, 1.650) after adjusting for the other variables, and age is a
significant risk factor (p < 0.001).
Conclusion: Regular exercise (p < 0.001), BMI_cat (p < 0.001) and age (p < 0.001) are
statistically significant predictors of heart attack: BMI ≥ 30 and older age increase the
odds, while regular exercise reduces them.
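The percentage statements for model 3 come from the adjusted odds ratios via (AOR - 1) × 100. A minimal sketch of that conversion, using the three AORs reported above (the dictionary labels are illustrative):

```python
# Adjusted odds ratios from model 3, as reported above.
ADJUSTED_ODDS_RATIOS = {
    "regular exercise": 0.063,
    "BMI >= 30": 5.495,
    "age (per year)": 1.486,
}

def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio (negative = reduction)."""
    return (odds_ratio - 1.0) * 100.0

for name, aor in ADJUSTED_ODDS_RATIOS.items():
    print(f"{name}: AOR = {aor}, change in odds = {percent_change_in_odds(aor):+.1f}%")
```

These are changes in odds, not in risk; the two are close only when the outcome is rare.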
2. In place of exercise, use intensity of exercise as a predictor and describe the result of the
revised model.
Table 14: SPSS output for logistic regression
The adjusted odds ratio tells us that individuals with a BMI of 30 kg/m2 or more have 5.555
times the odds of developing heart attack compared with those below 30 kg/m2 (reference
category) (AOR = 5.555; 95% CI: 2.574, 11.985), controlling for other independent variables.
Individuals with moderate exercise intensity have 91.7% lower odds of heart attack
(AOR = 0.083; CI: 0.050, 0.137), and those with heavy exercise intensity have 97.1% lower
odds (AOR = 0.029; CI: 0.014, 0.057), compared with those who do not exercise. Intensity of
exercise has a statistically significant relationship with heart attack (p < 0.001).
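A three-level predictor such as exercise intensity enters the model as two indicator variables, each compared against the reference category, which is why two AORs are reported. The sketch below shows that coding; the level names and variable prefix are hypothetical and may differ from the labels in the SPSS data file.

```python
# Hypothetical level names for exercise intensity; "none" is the reference category.
LEVELS = ["none", "moderate", "heavy"]
REFERENCE = "none"

def dummy_code(value, levels=LEVELS, reference=REFERENCE):
    """Return one 0/1 indicator per non-reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {f"intensity_{lvl}": int(value == lvl) for lvl in levels if lvl != reference}

for person in LEVELS:
    print(person, dummy_code(person))
```

A participant in the reference category has zeros on both indicators, so each AOR is read relative to that group.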
3. Has the performance of the model improved? How did you make the judgment or arrive at
the conclusion?
The performance of the model has not improved. When we look at the confidence intervals of
both models (the model with regular exercise and the model with exercise intensity), they
neither widen nor narrow; they have almost the same width. In addition, the AORs of age and
BMI are almost the same in both models.
PART II: Survival Analysis
H0: Rural and urban under-five children have the same survival experience
HA: Rural and urban under-five children have different survival experiences
Plots of the KM curves for urban and rural are shown here on the same graph. The KM curve for
urban is consistently higher than the KM curve for rural. These curves indicate that urban
under-five children have better survival experience than rural under-five children. As the
number of months increases, the two curves appear to get farther apart, suggesting that the
beneficial effect of living in an urban rather than a rural area on under-five mortality
grows the longer the child stays in the urban area.
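For readers unfamiliar with how a KM curve is built, the sketch below implements the product-limit estimator in plain Python. The follow-up times and event indicators are made-up toy values for illustration only, not the survey data analysed here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival probability) at each death time.

    times  -- follow-up time for each child (e.g. months)
    events -- 1 if the child died at that time, 0 if censored
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all children who leave the risk set at the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve

# Toy data (hypothetical): five children, two of them censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.2f}")
```

Running the estimator separately for each group and plotting the two step functions gives the kind of comparison described above.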
Table 15: SPSS output for survival analysis
Overall Comparisons
                         Chi-Square   df   Sig.
Log Rank (Mantel-Cox)    1.253        1    .263
Test of equality of survival distributions for the different levels of type of place of residence.
The log–rank statistic is 1.253 and the corresponding P-value is 0.263 to three decimal places.
This P-value indicates that we fail to reject the null hypothesis being tested that there is no
overall difference between the two survival curves.
Conclusion: There is no statistically significant difference in survival experience between
rural and urban under-five children.
B. Survival analysis for survival time and sex of child
H0: Male and female under-five children have the same survival experience
HA: Male and female under-five children have different survival experiences
Table 16: SPSS output for survival analysis
Overall Comparisons
                         Chi-Square   df   Sig.
Log Rank (Mantel-Cox)    7.063        1    .008
Test of equality of survival distributions for the different levels of sex of child.
Table 17: SPSS output for survival analysis
Overall Comparisons
                         Chi-Square   df   Sig.
Log Rank (Mantel-Cox)    5.277        3    .153
Test of equality of survival distributions for the different levels of highest educational level.
Decision: Since the p-value is greater than 0.05, we fail to reject the null hypothesis.
Conclusion: Mother's educational level has no statistically significant effect on the
survival experience of under-five children.
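The Sig. values of the three log-rank tests can be recovered from their chi-square statistics and degrees of freedom. The sketch below uses closed-form upper-tail probabilities for df = 1 and df = 3 so that it needs only the standard library; the function and dictionary names are illustrative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or df = 3 (closed forms)."""
    tail = math.erfc(math.sqrt(x / 2.0))
    if df == 1:
        return tail
    if df == 3:
        return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")

# (factor, log-rank chi-square, df) as reported in the SPSS output tables.
LOG_RANK_RESULTS = [
    ("place of residence", 1.253, 1),
    ("sex of child", 7.063, 1),
    ("highest educational level", 5.277, 3),
]

p_values = {}
for factor, chi_sq, df in LOG_RANK_RESULTS:
    p_values[factor] = chi2_sf(chi_sq, df)
    decision = "reject H0" if p_values[factor] < 0.05 else "fail to reject H0"
    print(f"{factor}: chi-square = {chi_sq}, df = {df}, "
          f"p = {p_values[factor]:.4f} -> {decision}")
```

Only sex of child falls below the 0.05 level, in line with the decisions stated in the text.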
2. Fit cox regression model for the factors identified in the previous steps. [Please produce both
crude and adjusted effect measures and also check all the assumptions of the model].
A. COX regression for sex of child
Reference category = male (0)
Table 18: SPSS output for cox regression
Variables in the Equation
                                                           95.0% CI for Exp(B)
               B       SE     Wald    df   Sig.   Exp(B)   Lower   Upper
sex of child   -.285   .108   6.925   1    .008   .752     .608    .930
Females have a 24.8% (1 - 0.752) lower hazard of dying before the age of five than males
(crude HR = 0.752). The difference is statistically significant (p = 0.008; 95% CI: 0.608,
0.930). The proportional hazards assumption is met because the curves do not cross each other.
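The Wald and Sig. columns of Table 18 are linked: the Wald statistic is approximately (B / SE) squared and is referred to a chi-square distribution with 1 degree of freedom. The sketch below illustrates this with the rounded values from the table, so the recomputed Wald differs slightly from the one SPSS derives from unrounded estimates.

```python
import math

# Values copied from Table 18 (sex of child, reference category = male).
B = -0.285
SE = 0.108
REPORTED_WALD = 6.925

def wald_statistic(b, se):
    """Wald chi-square for a single coefficient."""
    return (b / se) ** 2

def wald_p_value(wald):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(wald / 2.0))

wald_from_rounded = wald_statistic(B, SE)
p_reported = wald_p_value(REPORTED_WALD)
print(f"Wald from rounded B and SE: {wald_from_rounded:.3f} (reported: {REPORTED_WALD})")
print(f"p-value for the reported Wald: {p_reported:.4f}")
```

The resulting p-value is close to the reported Sig. of .008 and well below 0.05.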
Under-five children living in rural areas have a 16.8% higher hazard of dying (crude
HR = 1.168; 95% CI: 0.888, 1.538) than under-five children living in urban areas. Place of
residence is not a significant risk factor for death among under-five children (p = 0.268).
The proportional hazards assumption is met because the curves of the two groups do not cross
each other (figure 1).
E. Cox regression for highest educational level
Reference category = no education
Mother's educational level has no significant effect on the survival experience of under-five
children: the table below shows that the overall p-value is greater than 0.05 and the
confidence interval of each category includes one. Overall, mother's educational status is
not a significant risk factor for the hazard of dying among under-five children.
As shown in figure 4 below, the assumption of the Cox PH model is not met: the hazard function
curves of no education and primary education touch each other, so the relative hazard
between groups is not constant over time.
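Figure 4: Cox regression curves for survival function of educational level
The hazard of dying for under-five children who are the first of a multiple birth (crude
HR = 2.664; 95% CI: 1.561, 4.547) is about 2.7 times the hazard for single births. Children
who are the second of a multiple birth (crude HR = 0.002) have an estimated hazard of dying
99.8% lower than single births; an estimate this extreme usually reflects very few deaths in
that category and should be interpreted with caution.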
Figure 5: SPSS output for cox regression analysis
The proportional hazards assumption is met because the curves do not cross each other.
The hazard of dying for under-five children from the richer wealth category (crude
HR = 0.926) is reduced by 36.5% compared with children from the poorest families. Under-five
children from middle-income families (crude HR = 0.926) have a 14.9% lower hazard of dying
than those from the poorest families. Wealth index is not a significant risk factor for the
hazard of dying among under-five children (p = 0.474).
The assumption of the Cox PH model is not met because the survival curves cross each other.
Figure 6: SPSS output for cox regression analysis
When we compare model 1 (step 1) with model 2 (step 2) to evaluate the potential confounding
effect of the variable “sex of child”, the HR for the “child is twin” variable is almost the
same in both models (crude HR = 2.664; AHR = 2.654). This means there is no confounding due
to the sex of the child: the crude model yields an estimated hazard ratio that is essentially
the same as that of the adjusted model.
The confidence intervals for child is twin in each model are shown in table 23 above. The
interval for first multiple has a width of 2.986 in model 1 and 2.976 in model 2; therefore,
model 2 gives a slightly more precise estimate of the hazard ratio than does model 1. The
fitted Cox regression model is model 2.
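The width comparison can be reproduced from the reported confidence limits. Because hazard ratios are estimated on the log scale, the ratio of the upper to the lower limit is sometimes used as an alternative measure of precision; the sketch below computes both, and the choice of that second measure is an assumption added here rather than part of the original analysis.

```python
# Hazard-ratio confidence limits for first multiple birth, as reported in the text.
CI_MODEL_1 = (1.561, 4.547)  # crude model
CI_MODEL_2 = (1.555, 4.531)  # adjusted for sex of child

def ci_width(ci):
    """Absolute width of the interval (upper minus lower)."""
    lower, upper = ci
    return upper - lower

def ci_ratio(ci):
    """Upper-to-lower limit ratio, a precision measure on the multiplicative scale."""
    lower, upper = ci
    return upper / lower

for label, ci in (("model 1", CI_MODEL_1), ("model 2", CI_MODEL_2)):
    print(f"{label}: width = {ci_width(ci):.3f}, upper/lower = {ci_ratio(ci):.3f}")
```

On the absolute scale model 2 is marginally narrower, while the upper-to-lower ratios of the two models are practically identical, so the gain in precision is very small.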
Children who are the first of a multiple birth (AHR = 2.654; 95% CI: 1.555, 4.531) have about
2.7 times the hazard of dying of single births. Female under-five children (AHR = 0.752;
95% CI: 0.608, 0.931)