
LABORATORY EXERCISE 9

General Linear Model: Logistic and Poisson Regression

Prepared by:

Fenny Hotimatul Hasanah 19/452970/PKU/18341

Maternal and Child Health and Reproductive Health (KIA-KESPRO) Concentration, Master of Public Health (S2 IKM)

FK-KMK UGM

2020
Homework 1
Critical appraisal of a published article:
1. Read the following article: Semba RD, de Pee S, Ricks MO, Sari M, Bloem MW.
Diarrhea and fever as risk factors for anemia among children under age five living in urban
slum areas of Indonesia. International Journal of Infectious Diseases 2008; 12: 62-70.
a. Please rearrange Tables 2 and 3 so that the reader can easily read the ORs and their
confidence intervals and compare the models. Please put the models in the columns! See
the example in the breastfeeding article in module 1, table 5.
Table 2. Univariate and multivariate logistic regression models of current diarrhea as a risk factor for
anemia

Variable                         Model 1               Model 2               Model 3
Current diarrhea                 1.33*** (1.21-1.46)   1.12* (1.02-1.23)     1.20** (1.07-1.35)
Child age category (months)
  6-11                           -                     5.02*** (4.56-5.52)   4.79*** (4.28-5.36)
  12-23                          -                     3.27*** (3.00-3.56)   3.23*** (2.91-3.57)
  24-35                          -                     2.07*** (1.90-2.25)   2.03*** (1.83-2.24)
  36-47                          -                     1.33*** (1.22-1.46)   1.32*** (1.17-1.45)
  48-59                          -                     1.00 (-)              1.00 (-)
Male gender                      -                     1.17*** (1.12-1.23)   1.17*** (1.11-1.24)
HAZ < -2                         -                     1.31*** (1.24-1.37)   1.38*** (1.30-1.46)
Mother's age (years)
  <24                            -                     1.21*** (1.13-1.29)   1.23*** (1.14-1.33)
  25-28                          -                     1.18*** (1.10-1.26)   1.17*** (1.08-1.26)
  29-32                          -                     1.06 (0.98-1.13)      1.07 (0.99-1.16)
  33+                            -                     1.00 (-)              1.00 (-)
Mother's education (years)
  0                              -                     1.65*** (1.46-1.87)   1.66*** (1.43-1.91)
  1-6                            -                     1.44*** (1.36-1.53)   1.38*** (1.29-1.48)
  7-9                            -                     1.23*** (1.15-1.31)   1.20*** (1.11-1.29)
  10+                            -                     1.00 (-)              1.00 (-)
Weekly per capita household
expenditure (US $)               -                     -                     0.99 (0.99-1.01)

Cells show the OR with its significance stars, followed by the CI in parentheses.
* p < 0.05, ** p < 0.01, *** p < 0.001
OR, odds ratio; CI, confidence interval; HAZ, height-for-age Z score
Table 3. Univariate and multivariate logistic regression models of current fever as a risk factor
for anemia

Variable                         Model 1               Model 2               Model 3
Current fever                    1.60*** (1.37-1.88)   1.52*** (1.28-1.79)   1.44*** (1.18-1.75)
Child age category (months)
  6-11                           -                     4.96*** (4.50-5.47)   4.82*** (4.29-5.41)
  12-23                          -                     3.23*** (2.96-3.52)   3.22*** (2.91-3.57)
  24-35                          -                     2.07*** (1.89-2.26)   2.03*** (1.89-2.25)
  36-47                          -                     1.32*** (1.15-1.45)   1.30*** (1.21-1.45)
  48-59                          -                     1.00 (-)              1.00 (-)
Male gender                      -                     1.18*** (1.12-1.23)   1.18*** (1.12-1.23)
HAZ < -2                         -                     1.30*** (1.24-1.37)   1.30*** (1.24-1.37)
Mother's age (years)
  <24                            -                     1.21*** (1.13-1.29)   1.24*** (1.14-1.34)
  25-28                          -                     1.17*** (1.09-1.25)   1.17*** (1.08-1.26)
  29-32                          -                     1.04 (0.98-1.12)      1.07 (0.99-1.15)
  33+                            -                     1.00 (-)              1.00 (-)
Mother's education (years)
  0                              -                     1.62*** (1.43-1.83)   1.63*** (1.41-1.89)
  1-6                            -                     1.44*** (1.36-1.52)   1.38*** (1.29-1.47)
  7-9                            -                     1.23*** (1.15-1.31)   1.20*** (1.11-1.29)
  10+                            -                     1.00 (-)              1.00 (-)
Weekly per capita household
expenditure (US $)               -                     -                     1.00 (0.99-1.01)

Cells show the OR with its significance stars, followed by the CI in parentheses.
* p < 0.05, ** p < 0.01, *** p < 0.001
OR, odds ratio; CI, confidence interval; HAZ, height-for-age Z score
b. Please rewrite models 1 to 3 from Tables 2 and 3 as regression coefficients with their SEs
on the original coefficient scale.
Answer: Coefficient = ln(OR)
Table 2. Univariate and multivariate logistic regression models of current diarrhea as a risk factor
for anemia (regression coefficients)

Variable                         Model 1    Model 2    Model 3
Current diarrhea                 0.29       0.11       0.18
Child age category (months)
  6-11                           -          1.61       1.57
  12-23                          -          1.18       1.17
  24-35                          -          0.73       0.71
  36-47                          -          0.29       0.28
  48-59                          -          0.00       0.00
Male gender                      -          0.16       0.16
HAZ < -2                         -          0.27       0.32
Mother's age (years)
  <24                            -          0.19       0.21
  25-28                          -          0.17       0.16
  29-32                          -          0.06       0.07
  33+                            -          0.00       0.00
Mother's education (years)
  0                              -          0.50       0.51
  1-6                            -          0.36       0.32
  7-9                            -          0.21       0.18
  10+                            -          0.00       0.00
Weekly per capita household
expenditure (US $)               -          -          -0.01
Table 3. Univariate and multivariate logistic regression models of current fever as a risk factor for
anemia (regression coefficients)

Variable                         Model 1    Model 2    Model 3
Current fever                    0.47       0.42       0.36
Child age category (months)
  6-11                           -          1.60       1.57
  12-23                          -          1.17       1.17
  24-35                          -          0.73       0.71
  36-47                          -          0.28       0.26
  48-59                          -          0.00       0.00
Male gender                      -          0.17       0.17
HAZ < -2                         -          0.26       0.26
Mother's age (years)
  <24                            -          0.19       0.22
  25-28                          -          0.16       0.16
  29-32                          -          0.04       0.07
  33+                            -          0.00       0.00
Mother's education (years)
  0                              -          0.48       0.49
  1-6                            -          0.36       0.32
  7-9                            -          0.21       0.18
  10+                            -          0.00       0.00
Weekly per capita household
expenditure (US $)               -          -          0.00
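
Part b also asks for the SE of each coefficient. A minimal sketch of how both quantities can be recovered from a published OR and its CI is given here. It assumes the article reports 95% Wald intervals, so that the CI limits sit 1.96 standard errors either side of ln(OR); the function name `or_to_coef_se` is illustrative only.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile (assumption: 95% Wald CIs)

def or_to_coef_se(odds_ratio, ci_low, ci_high):
    """Return (coefficient, SE) on the log-odds scale from an OR and its CI."""
    coef = math.log(odds_ratio)
    # The CI is symmetric on the log scale: width = 2 * Z95 * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return coef, se

# Current diarrhea, Model 1 in Table 2: OR 1.33 (1.21-1.46)
coef, se = or_to_coef_se(1.33, 1.21, 1.46)
print(f"coef = {coef:.2f}, SE = {se:.3f}")
```

Because the published ORs and limits are rounded to two decimals, the recovered SE is approximate.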
c. Can you add descriptive presentations of these findings in the form of graphs?

[Figure: "Grafik Koefisien +" (graph of the positive coefficients); Y axis from 0 to 6, X axis from 0 to 4]

Interpretation: Based on the graph above, the curve is sigmoid and lies in the upper right
quadrant, because each variable has a positive (+) coefficient.
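
One other way to present these findings graphically is a forest plot of the odds ratios. The sketch below draws a plain-text version on a log-scaled axis, using a few Model 3 rows transcribed from Table 2. The axis range (0.5 to 8), the width, and the function name `forest_line` are assumptions chosen for illustration.

```python
import math

# Model 3 odds ratios with CI limits, transcribed from Table 2 (a subset of rows).
ROWS = [
    ("Current diarrhea", 1.20, 1.07, 1.35),
    ("Age 6-11 months", 4.79, 4.28, 5.36),
    ("Male gender", 1.17, 1.11, 1.24),
    ("HAZ < -2", 1.38, 1.30, 1.46),
]

def forest_line(odds_ratio, low, high, xmin=0.5, xmax=8.0, width=40):
    """Draw one CI as dashes, 'o' at the OR and '|' at OR = 1 (log-scaled axis)."""
    def column(x):
        frac = (math.log(x) - math.log(xmin)) / (math.log(xmax) - math.log(xmin))
        return min(width - 1, max(0, round(frac * (width - 1))))
    cells = [" "] * width
    for c in range(column(low), column(high) + 1):
        cells[c] = "-"
    cells[column(1.0)] = "|"          # reference line: no association
    cells[column(odds_ratio)] = "o"   # point estimate
    return "".join(cells)

for label, odds_ratio, low, high in ROWS:
    print(f"{label:<18}{forest_line(odds_ratio, low, high)}")
```

Every interval drawn this way lies to the right of the reference line, which is the graphical counterpart of a positive coefficient.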

Homework 2

2. Framingham data
a. Analyze prevalent CHD (mi, ap, ci) for the first examination period.

. logit prevchd c.age i.sex c.sysbp c.bmi c.totchol c.cigpday c.diabp i.diabetes i.cursmoke if period==1,or nolog

Logistic regression                             Number of obs     =      4,332
                                                LR chi2(9)        =     207.13
                                                Prob > chi2       =     0.0000
Log likelihood = -673.20768                     Pseudo R2         =     0.1333

     prevchd   Odds Ratio   Std. Err.       z    P>|z|     [95% Conf. Interval]
         age     1.111609    .0123881     9.49   0.000     1.087592    1.136157
       2.sex     .3517604    .0615562    -5.97   0.000     .2496269    .4956814
       sysbp     1.010336    .0051426     2.02   0.043     1.000307    1.020466
         bmi     1.016262    .0198923     0.82   0.410     .9780119    1.056008
     totchol     1.001267    .0017592     0.72   0.471     .9978255    1.004721
     cigpday      .997228    .0104911    -0.26   0.792     .9768764    1.018004
       diabp     1.000129     .009532     0.01   0.989     .9816196    1.018987
  1.diabetes     1.394793    .4608936     1.01   0.314     .7298607    2.665506
  1.cursmoke     1.088615    .2715543     0.34   0.734     .6676402    1.775031
       _cons     .0000277    .0000266   -10.91   0.000     4.20e-06    .0001822
b. Which variables determine prevalent CHD?
Answer: Based on the analysis above, the factors significantly associated with prevalent CHD
are age, sex, and systolic blood pressure.
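
The z statistic and confidence limits in the output above can be reproduced from the Odds Ratio and Std. Err. columns. A minimal sketch, assuming the reported standard error is the delta-method SE of the odds ratio (SE(OR) = OR x SE(b)), which is how Stata reports it with the `or` option; `wald_from_or` is an illustrative name.

```python
import math

def wald_from_or(odds_ratio, se_or, z_crit=1.959964):
    """Recover z and the 95% CI from an odds ratio and its delta-method SE."""
    b = math.log(odds_ratio)        # coefficient on the log-odds scale
    se_b = se_or / odds_ratio       # SE(OR) = OR * SE(b)  =>  SE(b) = SE(OR) / OR
    z = b / se_b
    low = math.exp(b - z_crit * se_b)
    high = math.exp(b + z_crit * se_b)
    return z, low, high

# age row of the prevchd model
z, low, high = wald_from_or(1.111609, 0.0123881)
print(f"z = {z:.2f}, 95% CI = {low:.4f} to {high:.4f}")
```

A variable is significant at the 5% level exactly when this interval excludes 1, which holds for age, sex, and sysbp in the table.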

c. Do you think there are interactions among variables? Can you prove it?
Answer: An interaction term with a significant result (p-value < 0.05) indicates an
interaction between the variables. Applying this rule to the analysis below, we conclude
that there are no interactions among the predictors of prevchd: not between age and sex,
sex and systolic blood pressure, age and systolic blood pressure, age and diabetes, sex and
diabetes, or systolic blood pressure and diabetes, because all p-values are > 0.05.

. regress prevchd i.sex##c.age##c.sysbp##c.diabp##i.diabetes if period==1

      Source          SS          df        MS          Number of obs   =      4,434
                                                        F(31, 4402)     =       9.86
       Model     12.0462566       31    .388588924      Prob > F        =     0.0000
    Residual     173.465696    4,402     .03940611      R-squared       =     0.0649
                                                        Adj R-squared   =     0.0584
       Total     185.511953    4,433    .041847948      Root MSE        =     .19851

prevchd                                  Coef.   Std. Err.       t   P>|t|    [95% Conf. Interval]
2.sex                                 -.844592    1.309332   -0.65   0.519   -3.411542    1.722358
age                                  -.0090082    .0203318   -0.44   0.658   -.0488688    .0308525
sex#c.age
  2                                   .0251013     .024411    1.03   0.304   -.0227567    .0729592
sysbp                                -.0043948    .0084538   -0.52   0.603   -.0209686     .012179
sex#c.sysbp
  2                                    .009504     .010199    0.93   0.351   -.0104913    .0294993
c.age#c.sysbp                          .000148    .0001539    0.96   0.336   -.0001538    .0004497
sex#c.age#c.sysbp
  2                                  -.0002662    .0001859   -1.43   0.152   -.0006307    .0000983
diabp                                 .0001579    .0131357    0.01   0.990   -.0255947    .0259106
sex#c.diabp
  2                                   .0106626    .0157036    0.68   0.497   -.0201243    .0414496
c.age#c.diabp                         .0000732    .0002436    0.30   0.764   -.0004044    .0005508
sex#c.age#c.diabp
  2                                  -.0003219     .000293   -1.10   0.272   -.0008963    .0002525
c.sysbp#c.diabp                       .0000169    .0000941    0.18   0.858   -.0001677    .0002014
sex#c.sysbp#c.diabp
  2                                  -.0001029    .0001118   -0.92   0.358   -.0003222    .0001164
c.age#c.sysbp#c.diabp                -9.65e-07    1.71e-06   -0.56   0.572   -4.32e-06    2.39e-06
sex#c.age#c.sysbp#c.diabp
  2                                   2.96e-06    2.03e-06    1.45   0.146   -1.03e-06    6.94e-06
1.diabetes                           -5.130037    11.99061   -0.43   0.669   -28.63766    18.37759
sex#diabetes
  2 1                                 4.372821    13.08649    0.33   0.738   -21.28328    30.02892
diabetes#c.age
  1                                   .1031425    .2007089    0.51   0.607   -.2903479     .496633
sex#diabetes#c.age
  2 1                                -.0841152    .2215925   -0.38   0.704    -.518548    .3503176
diabetes#c.sysbp
  1                                   .0572033    .0856127    0.67   0.504   -.1106406    .2250473
sex#diabetes#c.sysbp
  2 1                                -.0752557    .0924562   -0.81   0.416   -.2565164    .1060049
diabetes#c.age#c.sysbp
  1                                  -.0011923    .0014441   -0.83   0.409   -.0040235    .0016389
sex#diabetes#c.age#c.sysbp
  2 1                                 .0015507    .0015719    0.99   0.324    -.001531    .0046325
diabetes#c.diabp
  1                                   .0336068    .1401954    0.24   0.811   -.2412466    .3084603
sex#diabetes#c.diabp
  2 1                                -.0171712    .1549915   -0.11   0.912   -.3210325      .28669
diabetes#c.age#c.diabp
  1                                  -.0006218    .0023319   -0.27   0.790   -.0051935      .00395
sex#diabetes#c.age#c.diabp
  2 1                                 .0001916    .0026118    0.07   0.942   -.0049289    .0053121
diabetes#c.sysbp#c.diabp
  1                                   -.000501    .0009572   -0.52   0.601   -.0023776    .0013755
sex#diabetes#c.sysbp#c.diabp
  2 1                                  .000659    .0010406    0.63   0.527    -.001381     .002699
diabetes#c.age#c.sysbp#c.diabp
  1                                   .0000102     .000016    0.63   0.526   -.0000213    .0000416
sex#diabetes#c.age#c.sysbp#c.diabp
  2 1                                -.0000129    .0000176   -0.73   0.463   -.0000474    .0000216
_cons                                 .1489998    1.099266    0.14   0.892   -2.006114    2.304113
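
Each interaction row above is judged by its p-value, which follows from the coefficient and its standard error. The sketch below reproduces one of them. It uses a normal approximation to the t distribution, an assumption that is harmless here because the model has 4,402 residual degrees of freedom; `two_sided_p` is an illustrative name.

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value for coef/se under a standard normal approximation."""
    z = coef / se
    return math.erfc(abs(z) / math.sqrt(2))

# sex#c.age interaction row: Coef. .0251013, Std. Err. .024411
p = two_sided_p(0.0251013, 0.024411)
print(f"p = {p:.3f}")
```

The result agrees with the P>|t| column for that row, and since it exceeds 0.05 the sex-by-age interaction is not significant.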


d. Do a similar analysis for prevalent stroke, but take its low prevalence into account.
. logit prevstrk c.age i.sex c.sysbp c.bmi c.totchol c.cigpday c.diabp i.diabetes i.cursmoke if period==1,or nolog

Logistic regression                             Number of obs     =      4,332
                                                LR chi2(9)        =      24.06
                                                Prob > chi2       =     0.0042
Log likelihood = -162.06018                     Pseudo R2         =     0.0691

    prevstrk   Odds Ratio   Std. Err.       z    P>|z|     [95% Conf. Interval]
         age     1.057898    .0275322     2.16   0.031     1.005289     1.11326
       2.sex     .7277221    .2964592    -0.78   0.435      .327494    1.617066
       sysbp     1.020824    .0112559     1.87   0.062     .9989996    1.043126
         bmi     1.012574    .0435693     0.29   0.772     .9306818    1.101673
     totchol      .997376    .0044198    -0.59   0.553     .9887509    1.006076
     cigpday     .9627874    .0325762    -1.12   0.262     .9010103      1.0288
       diabp     .9940021    .0213802    -0.28   0.780     .9529687    1.036802
  1.diabetes     1.404007    1.064497     0.45   0.654      .317688     6.20494
  1.cursmoke      1.37259    .8522989     0.51   0.610     .4064375    4.635405
       _cons     .0000563    .0001163    -4.74   0.000     9.83e-07     .003226

Interpretation: Based on the analysis, the only factor significantly associated with
prevstrk is age (p-value 0.031).
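
The header of each logit output links the log likelihood, the LR chi2, and the Pseudo R2. As a rough check, the sketch below recomputes the Pseudo R2 for both models, assuming it is McFadden's measure (1 minus the ratio of the fitted to the null log likelihood) and that LR chi2 = 2 x (fitted minus null log likelihood). The function names are illustrative.

```python
def null_loglik(loglik, lr_chi2):
    """Log likelihood of the intercept-only model implied by the LR statistic."""
    return loglik - lr_chi2 / 2

def mcfadden_r2(loglik, lr_chi2):
    """McFadden pseudo R-squared: 1 - ll(model) / ll(null)."""
    return 1 - loglik / null_loglik(loglik, lr_chi2)

r2_chd = mcfadden_r2(-673.20768, 207.13)    # prevchd model
r2_strk = mcfadden_r2(-162.06018, 24.06)    # prevstrk model
print(f"prevchd: {r2_chd:.4f}, prevstrk: {r2_strk:.4f}")
```

The stroke model explains noticeably less than the CHD model, in line with the outcome's low prevalence.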

. regress prevstrk i.sex##c.age##c.diabp if period==1

      Source          SS          df        MS          Number of obs   =      4,434
                                                        F(7, 4426)      =       5.12
       Model     .255140951        7    .036448707      Prob > F        =     0.0000
    Residual     31.5139163    4,426     .00712018      R-squared       =     0.0080
                                                        Adj R-squared   =     0.0065
       Total     31.7690573    4,433    .007166492      Root MSE        =     .08438

prevstrk                    Coef.   Std. Err.       t   P>|t|    [95% Conf. Interval]
2.sex                   -.0318247     .107844   -0.30   0.768    -.243253    .1796035
age                      -.002372    .0016055   -1.48   0.140   -.0055196    .0007757
sex#c.age
  2                      .0006105      .00208    0.29   0.769   -.0034673    .0046883
diabp                   -.0016033    .0010064   -1.59   0.111   -.0035764    .0003698
sex#c.diabp
  2                      .0004962    .0013108    0.38   0.705   -.0020735     .003066
c.age#c.diabp            .0000373    .0000193    1.93   0.054   -6.03e-07    .0000751
sex#c.age#c.diabp
  2                     -9.76e-06    .0000251   -0.39   0.697   -.0000589    .0000394
_cons                    .1040655    .0835262    1.25   0.213   -.0596876    .2678187


Interpretation: Based on the analysis, there is no interaction in the prevstrk model between
age and sex, between age and diabp, or between sex and diabp, because all p-values
are > 0.05.

e. Report IRR for your analysis


Incidence of death due to stroke
. ir death prevstrk timestrk

                      prevalent stroke [infarct,hem]
                        Exposed    Unexposed        Total
  death indicator           112         3415         3527
 days baseline-in        207620     8.89e+07     8.91e+07
   Incidence rate      .0005394     .0000384     .0000396

                     Point estimate    [95% Conf. Interval]
  Inc. rate diff.           .000501    .0004011    .0006009
  Inc. rate ratio          14.03754    11.52364    16.94679 (exact)
  Attr. frac. ex.          .9287624    .9132219    .9409918 (exact)
  Attr. frac. pop          .0294929

  (midp)   Pr(k>=112) = 0.0000 (exact)
  (midp) 2*Pr(k>=112) = 0.0000 (exact)

Interpretation: Based on the analysis, the death rate among participants with prevalent
stroke is about 14 times the rate among participants without stroke.
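
The point estimates in the `ir` output above follow directly from the counts and person-time. A minimal sketch of that arithmetic is shown here; the function name is illustrative. The unexposed person-time is printed only to three significant figures (8.89e+07), so the results match Stata's to about two decimals rather than exactly.

```python
def incidence_rate_ratio(cases_exp, time_exp, cases_unexp, time_unexp):
    """Rate difference, IRR, and attributable fractions from counts and person-time."""
    rate_exp = cases_exp / time_exp
    rate_unexp = cases_unexp / time_unexp
    irr = rate_exp / rate_unexp
    attr_frac_exposed = (irr - 1) / irr
    # Population fraction: exposed fraction times the share of cases that are exposed.
    attr_frac_pop = attr_frac_exposed * cases_exp / (cases_exp + cases_unexp)
    return {
        "rate_diff": rate_exp - rate_unexp,
        "irr": irr,
        "attr_frac_exposed": attr_frac_exposed,
        "attr_frac_pop": attr_frac_pop,
    }

# Deaths and person-days by prevalent stroke status, from the table above.
stroke = incidence_rate_ratio(112, 207620, 3415, 8.89e7)
for name, value in stroke.items():
    print(f"{name}: {value:.6f}")
```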

Incidence of death due to stroke, by sex

. ir death prevstrk timestrk,by( sex)

             sex         IRR     [95% Conf. Interval]    M-H Weight
               1    10.27459     7.548351    13.67937      4.660083 (exact)
               2    18.66784      14.3043    23.97443      3.420761 (exact)
           Crude    14.03754     11.52364    16.94679               (exact)
    M-H combined     13.8276     11.45584    16.69039

Test of homogeneity (M-H)    chi2(1) = 9.66    Pr>chi2 = 0.0019

Interpretation: Based on the analysis, the M-H combined IRR of death for stroke after
accounting for sex is 13.8, and the IRR is higher among women with stroke (18.67) than
among men with stroke (10.27).
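
The "M-H combined" row is a weighted average of the stratum-specific IRRs, using the M-H weights printed in the same table. The sketch below reproduces it for the sex-stratified output above; `mh_combined` is an illustrative name.

```python
def mh_combined(strata):
    """Mantel-Haenszel combined IRR as the weight-averaged stratum IRR.

    strata: list of (irr, mh_weight) pairs, one per stratum.
    """
    total_weight = sum(weight for _, weight in strata)
    return sum(irr * weight for irr, weight in strata) / total_weight

# (IRR, M-H weight) for sex = 1 and sex = 2, from the stratified ir output.
by_sex = [(10.27459, 4.660083), (18.66784, 3.420761)]
irr_mh = mh_combined(by_sex)
print(f"M-H combined IRR = {irr_mh:.4f}")
```

The same function applied to the other stratified tables in this section should return their "M-H combined" rows as well.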

Incidence of death due to stroke, by smoking status

. ir death prevstrk timestrk,by( cursmoke )

current cig smok         IRR     [95% Conf. Interval]    M-H Weight
               0    13.35044     10.53647    16.70305      5.972966 (exact)
               1    17.93529     12.22448    25.42563       1.78216 (exact)
           Crude    14.03754     11.52364    16.94679               (exact)
    M-H combined    14.40406     11.92737    17.39501

Test of homogeneity (M-H)    chi2(1) = 1.95    Pr>chi2 = 0.1626

Interpretation: Based on the analysis, the M-H combined IRR of death for stroke after
accounting for current smoking status is 14.4, about 0.4 points above the crude IRR
of 14.0.

Incidence of death due to stroke, by diabetes status

. ir death prevstrk timestrk,by( diabetes )

    diabetic y/n         IRR     [95% Conf. Interval]    M-H Weight
               0    15.28728     12.31617    18.77076      6.136758 (exact)
               1     5.21762     3.051379    8.385642      3.411636 (exact)
           Crude    14.03754     11.52364    16.94679               (exact)
    M-H combined     11.6894     9.679021    14.11734

Test of homogeneity (M-H)    chi2(1) = 17.64    Pr>chi2 = 0.0000

Interpretation: Based on the analysis, the M-H combined IRR of death for stroke after
accounting for diabetes status is 11.7, about 2.3 points below the crude IRR
of 14.0.
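
Each stratified output above ends with a test of homogeneity, which indicates whether the IRR differs across strata. As a sketch, the p-value for a chi-square statistic with 1 degree of freedom can be computed with the complementary error function; `chi2_1df_pvalue` is an illustrative name.

```python
import math

def chi2_1df_pvalue(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Homogeneity statistics reported for the stroke analyses above.
for stratifier, chi2 in [("sex", 9.66), ("cursmoke", 1.95), ("diabetes", 17.64)]:
    print(f"{stratifier}: chi2(1) = {chi2}, p = {chi2_1df_pvalue(chi2):.4f}")
```

A p-value below 0.05, as for sex and diabetes, suggests the stratum-specific IRRs differ, in which case reporting them separately is more informative than the single combined value.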
Incidence of death due to Coronary Heart Disease
. ir death prevchd timechd

                      prevalent chd [mi,ap,ci]
                        Exposed    Unexposed        Total
  death indicator           527         3000         3527
 days baseline-in        902127     8.06e+07     8.15e+07
   Incidence rate      .0005842     .0000372     .0000433

                     Point estimate    [95% Conf. Interval]
  Inc. rate diff.          .0005469    .0004971    .0005968
  Inc. rate ratio          15.69127    14.27681    17.21908 (exact)
  Attr. frac. ex.          .9362703    .9299563    .9419249 (exact)
  Attr. frac. pop          .1398964

  (midp)   Pr(k>=527) = 0.0000 (exact)
  (midp) 2*Pr(k>=527) = 0.0000 (exact)

Interpretation: Based on the analysis, the death rate among participants with prevalent
Coronary Heart Disease is about 15.7 times the rate among participants without
Coronary Heart Disease.

Incidence of death due to Coronary Heart Disease, by sex

. ir death prevchd timechd ,by( sex )

             sex         IRR     [95% Conf. Interval]    M-H Weight
               1    14.57937     12.95765    16.36835      24.04245 (exact)
               2    14.50063     12.30142    17.00287       11.6966 (exact)
           Crude    15.69127     14.27681    17.21908               (exact)
    M-H combined     14.5536      13.2583    15.97545

Test of homogeneity (M-H)    chi2(1) = 0.00    Pr>chi2 = 0.9568

Interpretation: Based on the analysis, the M-H combined IRR of death for Coronary Heart
Disease after accounting for sex is 14.6, and the IRR is nearly the same for men (14.58)
and women (14.50).
Incidence of death due to Coronary Heart Disease, by smoking status

. ir death prevchd timechd ,by( cursmoke )

current cig smok         IRR     [95% Conf. Interval]    M-H Weight
               0    14.78219     13.05944    16.68917      21.03192 (exact)
               1    17.84286     15.36806      20.632      11.78044 (exact)
           Crude    15.69127     14.27681    17.21908               (exact)
    M-H combined    15.88104     14.47459    17.42416

Test of homogeneity (M-H)    chi2(1) = 3.85    Pr>chi2 = 0.0498

Interpretation: Based on the analysis, the M-H combined IRR of death for Coronary Heart
Disease after accounting for current smoking status is 15.9. The IRR is nearly the same
before (crude 15.7) and after accounting for current smoking.

Incidence of death due to Coronary Heart Disease, by diabetes status

. ir death prevchd timechd ,by( diabetes )

    diabetic y/n         IRR     [95% Conf. Interval]    M-H Weight
               0    16.69051     15.06331    18.45858      26.28903 (exact)
               1    6.200528     4.778659    7.978753      12.82906 (exact)
           Crude    15.69127     14.27681    17.21908               (exact)
    M-H combined    13.25025     12.06209    14.55544

Test of homogeneity (M-H)    chi2(1) = 56.29    Pr>chi2 = 0.0000

Interpretation: Based on the analysis, the M-H combined IRR of death for Coronary Heart
Disease after accounting for diabetes status is 13.3, about 2.4 points below the crude
IRR of 15.7.
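
The interpretations above compare the crude and M-H combined IRRs in absolute points. Expressing the change as a percentage of the crude estimate can make the comparison easier to read. The sketch below does this for the CHD analyses; the function name is illustrative, and no particular cut-off for what counts as meaningful confounding is implied.

```python
def percent_change(crude, adjusted):
    """Relative change from the crude to the adjusted IRR, in percent of the crude."""
    return (adjusted - crude) / crude * 100

CRUDE_CHD = 15.69127
# M-H combined IRRs from the three stratified CHD outputs above.
adjusted = {"sex": 14.5536, "cursmoke": 15.88104, "diabetes": 13.25025}
changes = {name: percent_change(CRUDE_CHD, value) for name, value in adjusted.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```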
