Clinical Statistics
Choosing variables in logistic regression analysis: a common mistake
Nguyễn Văn Tuấn
Question: In an earlier article, you wrote that choosing the variables for a multivariable logistic regression model on the basis of univariate analyses is a mistake. Could you explain why?
A typical medical study measures many clinical factors in order to predict some event, such as death, fracture, or diabetes. In a study of mortality risk, for example, the investigator may collect information on age, height, weight, medical history, and lifestyle, or measure hormones and biochemical markers (referred to below simply as variables), and the question is which of these variables are associated with death. This is not a simple problem, because the answer usually has to rest on both the results of the statistical analysis and biological knowledge. A model can predict very accurately and still be useless if it has no clinical or biological meaning; conversely, a model that is clinically meaningful but violates the statistical assumptions is no more than a numbers game!
One of the difficulties, arguably the thorniest one, in multivariable studies is that the predictor variables are usually related to one another biologically. Height and weight are correlated, for example, and biochemical markers change with age. These correlations make model selection more complicated, especially when the study rests on a single sample.
The model selection problem
To make the problem concrete, I will take a simple example: a clinical study that aims to develop a model for predicting the risk of death (or, to put it more positively, the chance of survival) in intensive care (ICU) patients from clinical measurements collected at admission. The clinical endpoint is the proportion of patients who survive 30 days after discharge (to save words, call this variable $Y$). The variables collected at admission are age, weight, and about 8 other biochemical markers (denoted $x_1, x_2, x_3, \ldots, x_{10}$). To predict survival we have a great many possible models, such as:
$Y = b_0 + b_1 x_1 + e$
$Y = b_0 + b_1 x_1 + b_2 x_2 + e$
$Y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + e$
$Y = b_0 + b_1 x_1 + b_2 x_2 + b_6 x_6 + e$
and so on.

(Medical training program, YKHOA.NET Training, Nguyễn Văn Tuấn)
In these models, $b_0$, $b_1$, $b_2$, and so on are the parameters to be estimated for each variable, and $e$ is the random part of the model. In fact, the models above are still simple, because we have not yet considered interaction effects, nonlinear effects, and so on. It is no exaggeration to say that, with 10 variables, the number of possible models can run to hundreds of thousands, or even be endless. But among all these models, which one predicts most accurately and is also the simplest?
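As a rough illustration of how quickly the model space grows, the short R sketch below counts candidate models for 10 predictors. It counts only subsets of main effects and, as an upper bound, subsets that may also include two-way interaction terms. The variable names are illustrative and are not part of the original analysis.

```r
# Counting candidate models for p = 10 predictors (illustrative arithmetic only).
p <- 10

# Every subset of the 10 main effects, including the empty (intercept-only) model
n_main_effect_models <- 2^p

# Number of possible two-way interaction terms among 10 predictors
n_pairwise_terms <- choose(p, 2)

# Upper bound if any subset of main effects and two-way interactions were allowed
n_with_interactions <- 2^(p + n_pairwise_terms)

cat("Main-effect subsets:      ", n_main_effect_models, "\n")
cat("Two-way interaction terms:", n_pairwise_terms, "\n")
cat("Subsets with interactions:", format(n_with_interactions, scientific = TRUE), "\n")
```

Nonlinear terms and higher-order interactions would enlarge the count further, which is why an exhaustive search is rarely a realistic strategy.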
This question has consumed a great deal of effort from statisticians and mathematicians, and a great deal of ink, yet it remains unsettled. Many methods have been developed, but none is complete. Many statisticians and mathematicians have wanted to solve the problem, and some have proposed methods, but unfortunately these methods often turn out to be meaningless, awkward, and unusable when applied in a medical setting. I will not discuss in detail why the problem remains unsettled (I will return to that topic in another article); I only take this opportunity to discuss a common mistake in the search for a prognostic model.
A common mistake
Reading a scientific paper in a domestic medical journal some time ago, I saw the authors write: "Variables associated with death in the univariate analysis at a significance level of p < 0.05 will be entered into the multivariable logistic regression analysis." In other words, the authors analyzed the data in two stages. In stage 1, each variable is analyzed on its own and the statistically significant ones (p < 0.05) are retained. In stage 2, all variables that were significant in stage 1 are entered into one multivariable model.
This is a very subtle mistake and a fairly common one in the medical literature, not only in our country but also in Western countries. In this writer's experience, even professional statisticians get it wrong! The mistake is not deliberate; it comes from misunderstanding (or not fully understanding) how statistical models work.
The main problem with this two-stage approach is that when each variable is analyzed on its own (stage 1), the logistic regression model does not take the effects of the other variables into account at the same time. For example, if $x_1$ and $x_2$ are correlated, stage 1 may select both, yet in the multivariable model (stage 2) only $x_1$ may be statistically significant and $x_2$ not (or the reverse), because the information in one variable is already contained in the other.
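The correlated-predictor case can be sketched with simulated data. In the R example below, everything is hypothetical: the sample size, the coefficients, and the data are made up for illustration, and only `x1` enters the true model while `x2` is merely correlated with it. Stage 1 (one variable at a time) flags both variables, whereas the joint model attributes the effect to `x1`.

```r
# Simulated illustration (hypothetical data, not from the article).
set.seed(1)
n  <- 2000
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.5)                      # x2 is strongly correlated with x1
y  <- rbinom(n, size = 1, prob = plogis(-1 + x1))  # only x1 enters the true model

# Wald p-value for one term of a fitted glm
p_value <- function(fit, term) summary(fit)$coefficients[term, "Pr(>|z|)"]

# Stage 1: one variable at a time
fit_x1 <- glm(y ~ x1, family = binomial)
fit_x2 <- glm(y ~ x2, family = binomial)

# Stage 2: both variables in the same model
fit_both <- glm(y ~ x1 + x2, family = binomial)

p_stage1 <- c(x1 = p_value(fit_x1, "x1"), x2 = p_value(fit_x2, "x2"))
p_stage2 <- c(x1 = p_value(fit_both, "x1"), x2 = p_value(fit_both, "x2"))

cat("Correlation between x1 and x2:", round(cor(x1, x2), 2), "\n")
print(signif(rbind(stage1 = p_stage1, stage2 = p_stage2), 3))
```

In the joint model, the p-value for `x2` reflects only what `x2` adds beyond `x1`, which in this simulation is nothing by construction.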
Another problem, more subtle and more delicate, is the effect of an intermediate variable, which is very hard or impossible to control for in stage 1. (I will discuss the effect of intermediate variables in another article.) In this situation, two variables, say $x_1$ and $x_5$, may both truly affect $Y$, but the effect exists only when they occur together (synergy). When each is analyzed on its own we therefore fail to detect its effect, and the simple stage 1 analysis may discard both variables!
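The synergy scenario can also be sketched by simulation. The R example below uses made-up data in which the outcome depends only on the product of `x1` and `x5`, one simple way to represent an effect that exists only when the two variables act together. The effect size and sample size are assumptions chosen for illustration.

```r
# Simulated illustration (hypothetical data, not from the article).
set.seed(2)
n  <- 4000
x1 <- rnorm(n)
x5 <- rnorm(n)

# The outcome depends on x1 and x5 only through their product (pure synergy)
y <- rbinom(n, size = 1, prob = plogis(-1 + x1 * x5))

# Stage 1: each variable analyzed on its own
fit_x1 <- glm(y ~ x1, family = binomial)
fit_x5 <- glm(y ~ x5, family = binomial)

# Joint model with main effects and the interaction term
fit_joint <- glm(y ~ x1 * x5, family = binomial)

p_x1_alone    <- coef(summary(fit_x1))["x1", "Pr(>|z|)"]
p_x5_alone    <- coef(summary(fit_x5))["x5", "Pr(>|z|)"]
p_interaction <- coef(summary(fit_joint))["x1:x5", "Pr(>|z|)"]

cat("p for x1 alone:      ", signif(p_x1_alone, 3), "\n")
cat("p for x5 alone:      ", signif(p_x5_alone, 3), "\n")
cat("p for x1:x5 together:", signif(p_interaction, 3), "\n")
```

Because the true marginal slope of each variable is zero in this setup, a stage 1 screen would retain either variable only by chance, while the joint model picks up the interaction clearly.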
Example 1: Sex, exercise, and mortality. A (simulated) cross-sectional study set out to assess the association between sex and the risk of death from myocardial infarction. The investigators also collected information on each participant's exercise and physical activity habits. The results can be summarized as follows:
Table 1. Numbers of participants who died and who were alive, by sex and by exercise habit

Variable        Deaths    Alive    Odds ratio and P value
Sex
  Female           113     2000    OR = 1.21, p = 0.176
  Male              94     2000
Exercise
  No               164     2000    OR = 4.06, p = 0.0001
  Yes               43     2000
If we apply logistic regression to each variable separately in this study, we obtain:
The OR (odds ratio) for women is 1.21 with p = 0.176, which is not statistically significant.
The OR for those who do not exercise regularly is 4.06 with p = 0.0001, which is statistically significant.
So, on the basis of this analysis, we would carry only the exercise variable into the multivariable model. But this result may be wrong. Returning to the study data, let us tabulate deaths and survivors by both variables at once:
Table 2. Numbers of participants who died and who were alive, by exercise habit together with sex

Exercise and sex    Deaths    Alive    OR and P value
No exercise
  Female                80      800    OR = 1.43, p = 0.028
  Male                  84     1200
Exercise
  Female                33     1200    OR = 2.20, p = 0.026
  Male                  10      800
The results, shown in the last column of the table above, are very different from those in Table 1. Here we see that sex affects the risk of death in both groups, those who do not exercise and those who exercise regularly. Among those who do not exercise regularly, the OR of death for women is 1.43 with p = 0.028; among those who exercise regularly, the OR is 2.20 with p = 0.026.
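The two stratum-specific odds ratios can be checked by hand from Table 2. The working below assumes that the second count column holds the numbers alive in each cell, a reading under which the cross-product ratios reproduce the reported values:

```latex
\mathrm{OR}_{\text{no exercise}}
  = \frac{80/800}{84/1200}
  = \frac{80 \times 1200}{84 \times 800}
  \approx 1.43,
\qquad
\mathrm{OR}_{\text{exercise}}
  = \frac{33/1200}{10/800}
  = \frac{33 \times 800}{10 \times 1200}
  = 2.20
```

Both ratios exceed 1, so in each exercise group the odds of death are higher for women than for men.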
The correct analysis in this case is therefore to consider the effects of the two variables simultaneously in a multivariable model. The model can be written as follows:
$Y = b_0 + b_1 x_1 + b_2 x_2 + e$    [1]
Here, $Y$ is the log odds of death, $x_1$ is sex, $x_2$ is exercise, and $b_0$, $b_1$, and $b_2$ are the parameters to be estimated. The estimates from this model can be summarized as follows:
Variable         Logistic regression coefficient    OR and P value
Sex (female)     $b_1$ = 0.434                      OR = 1.54, p = 0.003
Exercise (no)    $b_2$ = 1.425                      OR = 4.16, p < 0.0001
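Model [1] can be fitted to the Table 2 counts with base R. The sketch below is one way to do it, not necessarily the author's own code: it assumes the second count column of Table 2 holds the numbers alive, and the data frame, factor coding, and helper `or_table()` are illustrative names introduced here.

```r
# Table 2 counts; the second count column is read as survivors (an assumption).
tab2 <- data.frame(
  sex      = factor(c("Female", "Male", "Female", "Male"), levels = c("Male", "Female")),
  exercise = factor(c("No", "No", "Yes", "Yes"), levels = c("Yes", "No")),
  deaths   = c(80, 84, 33, 10),
  alive    = c(800, 1200, 1200, 800)
)

# Helper (illustrative): odds ratios and Wald p-values, intercept dropped
or_table <- function(fit) {
  est <- coef(summary(fit))
  data.frame(OR = exp(est[-1, "Estimate"]),
             p  = est[-1, "Pr(>|z|)"],
             row.names = rownames(est)[-1])
}

# Univariate models: one variable at a time
uni_sex  <- glm(cbind(deaths, alive) ~ sex,      family = binomial, data = tab2)
uni_exer <- glm(cbind(deaths, alive) ~ exercise, family = binomial, data = tab2)

# Multivariable model [1]: both variables together
multi <- glm(cbind(deaths, alive) ~ sex + exercise, family = binomial, data = tab2)

print(or_table(uni_sex))
print(or_table(uni_exer))
print(or_table(multi))
```

Comparing the `sexFemale` row of the univariate and multivariable tables shows how the estimate for sex changes once exercise is in the model.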
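The multivariable analysis gives a very different picture from the univariate analysis in Table 1. The univariate analysis missed the effect of sex because the two variables are associated in these data: in Table 2 most women are in the exercise group and most men are not, so the protective effect of exercise masks the higher risk among women. We can now conclude that the effects of both variables (sex and exercise) are statistically significant, although the effect of exercise appears larger than that of sex.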
Some investigators argue that variables for the multivariable analysis can still be screened from the univariate results, provided the p-value threshold is raised to 0.15 (instead of 0.05). In other words, instead of retaining variables with p < 0.05 in stage 1, the criterion is relaxed to p < 0.15 in order to retain variables that the p < 0.05 criterion might miss. However, this approach is wrong as well! To demonstrate, I will use the example below.
Example 2: This uses the same topic as Example 1, but this time I have changed a few numbers to demonstrate the flaw just described:
Table 3. Numbers of participants who died and who were alive, by sex and by exercise habit

Variable        Deaths    Alive    Odds ratio and P value
Sex
  Female           107     1935    OR = 1.18, p = 0.267
  Male              91     1935
Exercise
  No               156     1935    OR = 3.71, p = 0.0001
  Yes               42     1935
In this study, if each variable is analyzed separately, the effect of sex is once again not statistically significant (p = 0.267). So, under the p < 0.15 criterion, we would have to drop sex from the multivariable analysis. However, the table below (Table 4) shows that when the effect of sex is analyzed within each exercise group, it is statistically significant.
Table 4. Numbers of participants who died and who were alive, by exercise habit together with sex

Exercise and sex    Deaths    Alive    OR and P value
No exercise
  Female                75      774    OR = 1.39, p = 0.048
  Male                  81     1161
Exercise
  Female                32     1161    OR = 2.13, p = 0.034
  Male                  10      774
Now let us fit model [1] (that is, estimate the effects of the two variables simultaneously in one multivariable model) to the data in Table 4. The results show that both variables are statistically significant:
Variable         Logistic regression coefficient    OR and P value
Sex (female)     $b_1$ = 0.4077                     OR = 1.50, p = 0.0064
Exercise (no)    $b_2$ = 1.3938                     OR = 4.03, p < 0.0001
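The failure of the relaxed p < 0.15 screen can be reproduced from the Table 4 counts. The R sketch below is illustrative rather than the author's code: it assumes the second count column holds the numbers alive, and the names `tab4`, `screen_p`, and `kept` are introduced here. It runs the univariate screen first and then fits the full model for comparison.

```r
# Table 4 counts; the second count column is read as survivors (an assumption).
tab4 <- data.frame(
  sex      = factor(c("Female", "Male", "Female", "Male"), levels = c("Male", "Female")),
  exercise = factor(c("No", "No", "Yes", "Yes"), levels = c("Yes", "No")),
  deaths   = c(75, 81, 32, 10),
  alive    = c(774, 1161, 1161, 774)
)

# Stage 1: univariate screen, keeping variables with Wald p < 0.15
candidates <- c("sex", "exercise")
screen_p <- sapply(candidates, function(v) {
  f   <- as.formula(paste("cbind(deaths, alive) ~", v))
  fit <- glm(f, family = binomial, data = tab4)
  coef(summary(fit))[2, "Pr(>|z|)"]   # row 2 is the single predictor
})
kept <- names(screen_p)[screen_p < 0.15]

# Full model with both variables, regardless of the screen
full   <- glm(cbind(deaths, alive) ~ sex + exercise, family = binomial, data = tab4)
full_p <- coef(summary(full))[-1, "Pr(>|z|)"]

cat("Univariate p-values:\n"); print(signif(screen_p, 3))
cat("Kept by the p < 0.15 screen:", kept, "\n")
cat("p-values in the full model:\n"); print(signif(full_p, 3))
cat("Odds ratios in the full model:\n"); print(round(exp(coef(full))[-1], 2))
```

The screen discards sex, yet sex is statistically significant once exercise is in the model, which is the point of Example 2.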
Summary
Building a multivariable logistic regression model is not a simple matter, especially when the predictor variables are correlated with one another. The examples above show that screening variables for a multivariable model on the basis of univariate analyses can lead to serious errors. Even raising the p-value threshold to 0.15 can still lead to errors.
Statistical software now provides several algorithms for selecting independent variables for a multivariable model, such as stepwise, backward, and forward selection. But even these algorithms, particularly stepwise and forward selection, have many flaws and produce false-positive results, that is, they select variables that have nothing to do with the dependent variable. Many people do not understand these algorithms and apply them uncritically, and as a result the medical literature contains many studies with wrong results.
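The false-positive behaviour of forward selection can be sketched with pure noise. In the R example below, all data are simulated and none of the 40 candidate predictors is related to the outcome, so any variable that forward selection picks is a false positive. Note that base R's `step()` selects by AIC rather than by p-values, so this is an analogue of the p-value-based procedures found in other packages, not an exact copy of them.

```r
# Simulated illustration (hypothetical data): 40 predictors, none related to y.
set.seed(3)
n <- 300
k <- 40
noise <- as.data.frame(matrix(rnorm(n * k), nrow = n))  # columns V1 ... V40
dat   <- data.frame(y = rbinom(n, size = 1, prob = 0.3), noise)

# Start from the intercept-only model and let forward selection add terms by AIC
null_fit <- glm(y ~ 1, family = binomial, data = dat)
upper    <- as.formula(paste("~", paste(names(noise), collapse = " + ")))
fwd      <- step(null_fit, scope = list(lower = ~ 1, upper = upper),
                 direction = "forward", trace = 0)

# Every selected term is a false positive by construction
selected <- attr(terms(fwd), "term.labels")
cat("Noise variables selected:", length(selected), "of", k, "\n")
if (length(selected) > 0) print(round(coef(summary(fwd)), 3))
```

Changing the seed or the number of noise variables changes how many are selected, but with many candidates it is common for at least a few to enter the model.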
Building a multivariable model is a science, but it is also an art. The science lies in the quantitative criteria and appropriate algorithms. The art lies in elements that are arguably subjective and that require the investigator to apply subject-matter knowledge to arrive at a clinically meaningful model. A multivariable model that satisfies only the scientific criteria is not yet a useful model. A model that is clinically meaningful but does not meet the scientific criteria cannot be a highly reliable model. Multivariable analysis, whether with a logistic model or with linear regression, is therefore a complex method that demands a good deal of time for thinking and computing. We cannot, and should not, let the computer do our thinking for us.
Technical note:
Below is the R code used for the estimates presented in this article.
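# Analysis of the data in Example 1: effect of sex
# Assumption: logistic.display() is taken from the epiDisplay package
# (the successor to epicalc); the source does not name the package.
library(epiDisplay)
sex <- c("Nu", "Nam")        # "Nu" = female, "Nam" = male; "Nam" sorts first, so it is the reference level
ntotal <- c(2000, 2000)      # group sizes
ndeaths <- c(113, 94)        # deaths in each group
pdeath <- ndeaths / ntotal   # proportion who died in each group
logistic <- glm(pdeath ~ sex, family = binomial, weights = ntotal)
logistic.display(logistic)   # odds ratio with confidence interval and p-value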