Professional Documents
Culture Documents
Regression Analysis: Linear, Logistic and Mixed Models Linear, Logistic and Mixed Models
Regression Analysis: Linear, Logistic and Mixed Models Linear, Logistic and Mixed Models
Regression
g
analysis
y
Predict outcome (dependent variable) from
one or more independent variables
Implies causality
Used to explore relationships and assess
contributions
ib i
Models developed for explicit prediction
Familiar types
yp
Multiple regression
Logistic regression
Stepwise
S
regression
GLM: generalized linear models
GLMM: generalized linear mixed models
Dependent
p
and independent
p
variables
Numerical: continuous or ordinal
Fixed effects model
Y=b0 + b1x +
Or complex
Y=b0 + b1x1 + b2x2 ++ bijxi*xj +
Multivariate, interactions
Model Selection
Main effects
Interactions to specified level
Power
issues
Full model
Stepwise selection
GLM
Use
distributions
Based
on distribution of Y
Common
distribution
Binomial:
logit link
Logistic
g
regression
g
distribution
0 or 1, often coding for qualitative result
Infected
or not
Present or not
Logit link
Log(/1-
) = b0 + b1x1 + b2x2 +
Logistic
g
- interpretation
p
Probability of a 1
1 response given predictor
variables
Expressed as likelihood of an event
Odds ratios (or log odds ratios)
Odds
of a 1
1 response with one X over another
More common with levels of X rather than continuous
Example:
p logistic
g
stepwise
p
regression
g
Fixed effects
Example:
p Stepwise
p
logistic
g
regression
g
Logistic Model R2 = 0.34
Mosquito population
Mosquito mortality (baseline)
300
280
Yes
260
240
220
200
Epidemic?
180
No
160
70
80
90
100
110
120
10
100
1000
10000
130
140
GLM
Fixed
effects
Random effects
Independent
GLMM models
Depends on:
Distribution of
variables
Design of study
Variance structure
Bias in p
parameter estimation
Hypothesis testing: If coefficient is not significant, it is treated as 0
Simulation study:
Y=1 + 0.5*x +e
e~N(0,1)
n = 10
Regression model fit
Slope tested using
hypothesis testing
Slope = 0 accepted
Sl
d if
p<0.05
Distribution of slopes
from models fit
Distribution of slopes
from hypothesis testing
model selection
Stepwise
p
regression
g
of criteria used
Issues in stepwise
p
regression
g
Hypothesis testing
Involves
12 data
d
sets
Reanalyzed using
1: Stepwise F test
2: Stepwise
p
AIC
3: Stepwise BIC
4: All subset AIC
5: All subset BIC
6 Regression tree A
6:
7: Regression tree B
Models produced by IC
methods included more
variables
SF SIC SSIC
Model selection
How
Acknowledgements
g