R : Linear Regression
呂冠臻
2020.10.22
Contents
Diagnosis of regression model
Recall
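• Model : $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $\varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$, $i = 1, \dots, n$
• Error : $\varepsilon_i = Y_i - \beta_0 - \beta_1 X_i$
• Residual : $e_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$
• Use $e_i$ to estimate $\varepsilon_i$, and hence $\sigma^2$ :
$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} e_i^2 = \text{MSE}, \qquad E[\hat{\sigma}^2] = \sigma^2$$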
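The estimator above can be checked numerically. The following is a minimal sketch on simulated data; the true values (intercept 1, slope 2, error SD 0.5) and all variable names are assumptions for illustration, not taken from the slides.

```r
# Simulated data; beta0 = 1, beta1 = 2, sigma = 0.5 are illustrative assumptions
set.seed(1)
n <- 100
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, mean = 0, sd = 0.5)

fit <- lm(y ~ x)

# Residuals e_i = Y_i - Yhat_i
e <- y - fitted(fit)

# Estimate of sigma^2: sum of squared residuals divided by n - 2
mse <- sum(e^2) / (n - 2)

# summary(fit)$sigma is the residual standard error, so its square is the MSE
sigma_hat_sq <- summary(fit)$sigma^2

print(c(by_hand = mse, from_summary = sigma_hat_sq))
```

The divisor is n − 2 because two parameters, the intercept and the slope, are estimated before the residuals are formed.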
Diagnosis of regression model
Assumptions
• Linearity : The relationship between X and the mean of Y is linear
• $\varepsilon \perp E[Y \mid X] = \hat{Y}$
• X-axis : $i$, $X_i$, or $\hat{Y}_i$
• Y-axis : $e_i$
• $\varepsilon \perp E[Y \mid X] = \hat{Y}$
• $\varepsilon \perp i$
• $\varepsilon \perp X$
• Normality of ε
https://condor.depaul.edu/sjost/it223/documents/regress.htm
Diagnosis of regression model
Scatter plot for residual
• Build a linear regression to examine whether variation in SGOT can be predicted or explained by SGPT, and run model diagnostics on it.
• $\varepsilon \perp E[Y \mid X] = \hat{Y}$
• $\varepsilon \perp i$
• $\varepsilon \perp X$
• Normality of ε
• Use ei to estimate εi
Diagnosis of regression model
Scatter plot for residual
• Present the diagnosis result
• 3 scatter plots : $(\hat{Y}_i, e_i)$, $(X_i, e_i)$, and $(i, e_i)$
• Q-Q plot : check normality of ε
• R code
• par(mfrow=c(n, m))
• par(mfcol=c(n, m))
https://bookdown.org/ndphillips/YaRrr/arranging-plots-with-parmfrow-and-layout.html
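The four diagnostic plots can be arranged on one page as in the sketch below. It uses simulated data instead of the liver dataset, and the variable names and true coefficients are assumptions for illustration. `par(mfrow = c(n, m))` fills the n × m grid row by row, while `par(mfcol = c(n, m))` fills it column by column.

```r
# Simulated data; the intercept 3 and slope 1.5 are illustrative assumptions
set.seed(2)
n <- 80
x <- runif(n, 0, 10)
y <- 3 + 1.5 * x + rnorm(n)

fit <- lm(y ~ x)
e <- resid(fit)        # residuals e_i
y_hat <- fitted(fit)   # fitted values Yhat_i

op <- par(mfrow = c(2, 2))  # 2 x 2 grid, filled row by row

# Residuals against fitted values
plot(y_hat, e, xlab = "Fitted value", ylab = "Residual", main = "(Yhat_i, e_i)")
abline(h = 0, lty = 2)

# Residuals against the predictor
plot(x, e, xlab = "X", ylab = "Residual", main = "(X_i, e_i)")
abline(h = 0, lty = 2)

# Residuals against observation index
plot(seq_len(n), e, xlab = "Index i", ylab = "Residual", main = "(i, e_i)")
abline(h = 0, lty = 2)

# Normal Q-Q plot of the residuals
qqnorm(e)
qqline(e)

par(op)  # restore the previous graphics settings
```

In each scatter plot, residuals scattered evenly around the dashed zero line with no trend or funnel shape are consistent with the assumptions; points close to the reference line in the Q-Q plot are consistent with normal errors.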
Diagnosis of regression model
Scatter plot for residual
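Your turn!
• Build a linear regression to examine whether variation in SGOT can be predicted or explained by SGPT. For this exercise, exclude the observations whose SGPT exceeds its third quartile (437 subjects remain), then run model diagnostics on the fitted model.
• Hint : Use summary() to get the 3rd quartile, then use which().
• Do not answer this question by eye alone.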
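One possible way to set up this exercise is sketched below. The data frame `liver` is a simulated stand-in with hypothetical values, not the real dataset, and its 583 rows are an assumption chosen so that 437 remain after filtering. Using the Shapiro-Wilk test as the formal normality check is also an assumption about what the exercise intends.

```r
# Hypothetical stand-in for the liver data (simulated, not the real values)
set.seed(3)
n <- 583
liver <- data.frame(SGPT = rexp(n, rate = 1 / 80))
liver$SGOT <- 20 + 0.9 * liver$SGPT + rnorm(n, sd = 15)

# Third quartile of SGPT; summary(liver$SGPT) reports the same quantity as "3rd Qu."
q3 <- quantile(liver$SGPT, 0.75, names = FALSE)

# Keep only the rows whose SGPT does not exceed the third quartile
keep <- which(liver$SGPT <= q3)
liver_sub <- liver[keep, ]
print(nrow(liver_sub))

# Refit the simple linear regression on the reduced data
fit <- lm(SGOT ~ SGPT, data = liver_sub)
e <- resid(fit)

# Shapiro-Wilk test on the residuals; H0: the residuals are normally distributed
sw <- shapiro.test(e)
print(sw$p.value)
```

A small p-value from `shapiro.test()` is evidence against normality of the errors, which gives a numerical complement to the Q-Q plot.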
Diagnosis of regression model
Results of exercise
Multiple linear regression model
Motivation
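• Data : $\{(X_i, Y_i)\}_{i=1}^{n}$
• Model : $Y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip} + \varepsilon_i$, where $\varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$, $i = 1, \dots, n$
• Random part (error) : $\varepsilon_i$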
Linear Regression
Example : Indian Liver Patient Data
• Examine whether variation in DB can be predicted or explained by Gender and SGOT, and test whether each predictor is statistically significant (α = 0.05).
$H_0 : \beta_{\text{SGOT}} = 0$
$H_A : \beta_{\text{SGOT}} \neq 0$
$H_0 : \beta_{\text{Gender}} = 0$
$H_A : \beta_{\text{Gender}} \neq 0$
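• The p-value is below 0.05, so the result is statistically significant and we reject the null hypothesis. After adjusting for SGOT, DB is on average 0.5247 units higher in males than in females, and this difference is statistically significant.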
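The fit and the two t-tests can be reproduced in outline as follows. This is a sketch on simulated data: the data frame `liver`, its column values, and the true coefficients are all hypothetical, so the numbers it prints will not match the 0.5247 reported above.

```r
# Hypothetical stand-in data; the true effects below are illustrative assumptions
set.seed(4)
n <- 200
liver <- data.frame(
  Gender = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  SGOT   = runif(n, 10, 200)
)
liver$DB <- 0.3 + 0.004 * liver$SGOT +
  0.5 * (liver$Gender == "Male") + rnorm(n, sd = 0.8)

# Multiple linear regression; "Female" is the reference level (alphabetical order)
fit <- lm(DB ~ SGOT + Gender, data = liver)

# Coefficient table: estimate, standard error, t value, p-value
coefs <- summary(fit)$coefficients
print(coefs)

# Test H0: beta_Gender = 0 at alpha = 0.05
p_gender <- coefs["GenderMale", "Pr(>|t|)"]
reject_h0 <- p_gender < 0.05
print(reject_h0)
```

The `GenderMale` estimate is the average difference in DB between males and females at the same SGOT value, which is what "after adjusting for SGOT" means in the interpretation above.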
Linear Regression
Interaction term
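• Examine whether variation in DB can be predicted or explained by Gender, SGOT, and their interaction, and test whether each effect is statistically significant (α = 0.05).
$$Y_i = \beta_0 + \beta_{\text{SGOT}} X_{\text{SGOT},i} + \beta_{\text{Sex}} X_{\text{Sex},i} + \beta_{\text{interact}} X_{\text{Sex},i} X_{\text{SGOT},i} + \varepsilon_i$$
• How do we interpret it?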
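One way to read the interaction model is through the SGOT slope in each group. When $X_{\text{Sex}} = 0$ the slope is $\beta_{\text{SGOT}}$; when $X_{\text{Sex}} = 1$ it is $\beta_{\text{SGOT}} + \beta_{\text{interact}}$. The sketch below illustrates this on simulated data. The coding Female = 0, Male = 1 follows R's default factor ordering and is an assumption, as are the data and the true coefficients.

```r
# Hypothetical data in which the SGOT slope differs by sex (assumed values)
set.seed(5)
n <- 300
d <- data.frame(
  Sex  = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  SGOT = runif(n, 10, 200)
)
is_male <- as.numeric(d$Sex == "Male")
d$DB <- 0.2 + 0.003 * d$SGOT + 0.1 * is_male +
  0.004 * d$SGOT * is_male + rnorm(n, sd = 0.5)

# SGOT * Sex expands to SGOT + Sex + SGOT:Sex
fit <- lm(DB ~ SGOT * Sex, data = d)
b <- coef(fit)

# Slope of SGOT within each group
slope_female <- b[["SGOT"]]                        # reference group (Sex = 0)
slope_male   <- b[["SGOT"]] + b[["SGOT:SexMale"]]  # Sex = 1

# Cross-check: fitting males alone gives the same SGOT slope
fit_male <- lm(DB ~ SGOT, data = d[d$Sex == "Male", ])
slope_male_sep <- coef(fit_male)[["SGOT"]]

print(c(female = slope_female, male = slope_male, male_separate = slope_male_sep))
```

Testing $H_0 : \beta_{\text{interact}} = 0$ therefore asks whether the effect of SGOT on DB is the same for both sexes.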
Linear Regression
Why do we need an interaction term?
[Figure: two panels, Female and Male]