Introduction to Biostatistics
Module 8: Multiple Regression Analysis
Describing the Relationship between Disease and Two or More Exposures
Multiple Regression Analyses
Learning Objectives
Explain association between disease/event
and two or more exposures.
Identify independent exposures for a disease.
Predict disease from the knowledge of
exposures.
Multiple Regression Analysis - Example
Consider data from a blood pressure (BP) study
Exposures: Age, BMI, DBP, CL, SES
Outcome: SBP
Note: with more than one exposure, the analysis is a multiple regression.
Multiple Regression: Notation
Multiple Regression:
Outcome = a + b1 × exposure_1 + b2 × exposure_2 + …
Example:
SBP = a + b1 × Age + b2 × BMI + b3 × CL + b4 × SES + …
Note: b1 is the effect of age on SBP; b2 is the effect of BMI on …
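The notation above can be made concrete with a small least-squares fit. This is a minimal sketch in pure Python, not the course's method: the (Age, BMI) values are made up, SBP is generated exactly from assumed coefficients (50, 0.5, 2.0) so the fit can be checked by eye, and the helper names `solve` and `fit_ols` are hypothetical.

```python
def solve(A, y):
    """Solve the linear system A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(rows, outcome):
    """Return [a, b1, b2, ...] minimising the squared error of outcome = a + b1*x1 + ..."""
    X = [[1.0] + list(r) for r in rows]  # the leading 1 is the intercept column
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * y for x, y in zip(X, outcome)) for i in range(p)]
    return solve(XtX, Xty)


# Made-up (Age, BMI) pairs; SBP is generated exactly as 50 + 0.5*Age + 2*BMI.
exposures = [(30, 22), (40, 27), (50, 24), (60, 31), (70, 26), (45, 29)]
sbp = [50 + 0.5 * age + 2.0 * bmi for age, bmi in exposures]

a, b_age, b_bmi = fit_ols(exposures, sbp)
print(round(a, 3), round(b_age, 3), round(b_bmi, 3))
```

Because the made-up outcome has no noise, the fitted a, b1 and b2 should match the generating values up to rounding. With real data they would be estimates with standard errors.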
Multiple Regression Analysis: SBP on AGE & DBP
Note: use a scatter diagram or box plot; the box plot can also be used for residuals.
Hypothesis:
Null: SBP is not related to DBP and AGE
Assumptions:
Check the normality of SBP. How?
Check linearity only for numerical covariates. How?
Check constant variability. How?
Normality for SBP
Note: SBP is normally distributed. To assess the effect of outliers (sensitivity),
use box-plot analysis and compare the results with and without the outliers.
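The with/without-outlier comparison in the note above could be sketched as follows. This assumes the usual box-plot whisker rule (values more than 1.5 × IQR beyond the quartiles), which the slide does not state, and uses made-up SBP readings; `boxplot_outliers` is a hypothetical helper name.

```python
import statistics


def boxplot_outliers(values):
    """Flag values beyond 1.5 * IQR from the quartiles (assumed box-plot whisker rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Made-up SBP readings (not the study data); 210 is a deliberately extreme value.
sbp = [112, 118, 121, 124, 126, 128, 130, 133, 135, 139, 142, 210]
outliers = boxplot_outliers(sbp)
kept = [v for v in sbp if v not in outliers]

print("outliers:", outliers)
print("mean with outliers:   ", round(statistics.mean(sbp), 1))
print("mean without outliers:", round(statistics.mean(kept), 1))
```

If the summary changes little once the flagged values are dropped, the analysis is not sensitive to them.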
Checking Linearity/Variability/Outliers
Note: this depends not only on the statistical values but also on the clinical
situation.
Multiple Linear Regression for SBP on AGE and DBP
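Note: 'b' coefficient for AGE (0.340): positive relationship. For every one-year
increase in age, SBP increases by 0.340, holding DBP constant.
** 'b' coefficient for DBP (1.180): for every one-unit increase in DBP, SBP
increases by 1.180, holding age constant. Both relationships are significant.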
Multiple Regression Analysis
Interpretation of beta coefficients
DBP: 1.18, …
Age: 0.34, …
Interpretation of 95% CI:
DBP: 95% CI (1.03, 1.33)
Age: 95% CI (0.19, 0.49)
Significance:
P-value: AGE (<0.001) & DBP (<0.001)
95% CI: both exclude zero
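The link between the 95% CI and significance can be checked with simple arithmetic. The sketch below uses the estimates and intervals reported above. The standard errors are not on the slide; they are recovered approximately from the CI width, assuming a normal-approximation multiplier of 1.96.

```python
# Estimates and 95% CIs as reported on the slide.
results = {
    "DBP": {"beta": 1.18, "ci": (1.03, 1.33)},
    "AGE": {"beta": 0.34, "ci": (0.19, 0.49)},
}

Z_95 = 1.96  # assumption: normal approximation for a 95% interval

for name, r in results.items():
    lower, upper = r["ci"]
    excludes_zero = lower > 0 or upper < 0
    se = (upper - lower) / (2 * Z_95)  # approximate standard error from the CI width
    z = r["beta"] / se                 # approximate test statistic
    r.update(excludes_zero=excludes_zero, se=se, z=z)
    print(f"{name}: CI excludes zero = {excludes_zero}, SE ~ {se:.3f}, beta/SE ~ {z:.1f}")
```

An interval that excludes zero corresponds to a ratio beta/SE larger than 1.96 in absolute value, which agrees with the small p-values reported.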
Model Adequacy
How good is the prediction model?
R-squared
Residual plot
Normal probability plot
Note: a patient has several measurements (there may be more than 100 variables),
and we want to know whether they are related to the outcome.
Stepwise Regression: Select Significant Variables into Regression
Note: there is a mechanism for this. If a variable's scatter plot does not look
normal, we remove the variable.
Outcome: SBP
Covariates: AGE, DBP, BMI, CL and SES
Note: follow the steps discussed before.
Notes:
1. Use public health experience: if a variable is clinically important, we retain it.
2. Plot scatter diagrams and run linear regressions to check potential variables.
** select the
Select Significant Variables in Regression
Note: the most recommended method is backward elimination.
Methods: Stepwise variable selection
Backward elimination method
Forward selection method
How do they work?
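One way to picture how backward elimination works is the sketch below. It is an illustration under stated assumptions, not the SPSS procedure: the data are simulated (true coefficients borrowed from the fitted model reported in this module, with an assumed noise SD of 8), p-values use a large-sample normal approximation instead of the t distribution, the cut-off 0.05 is assumed, and all function names are hypothetical.

```python
import math
import random


def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]


def ols_pvalues(data, outcome, names):
    """Fit outcome on the named covariates; return {name: approximate two-sided p-value}."""
    X = [[1.0] + [row[k] for k in names] for row in data]
    y = [row[outcome] for row in data]
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    inv = invert(XtX)
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = [sum(inv[i][j] * Xty[j] for j in range(p)) for i in range(p)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2 for x, yi in zip(X, y))
    sigma2 = rss / (len(y) - p)
    pvals = {}
    for j, name in enumerate(names, start=1):  # index 0 is the intercept
        t = beta[j] / math.sqrt(sigma2 * inv[j][j])
        pvals[name] = math.erfc(abs(t) / math.sqrt(2))  # normal approximation
    return pvals


def backward_elimination(data, outcome, names, alpha=0.05):
    """Drop the least significant covariate until every remaining p-value is below alpha."""
    kept = list(names)
    while kept:
        pvals = ols_pvalues(data, outcome, kept)
        worst = max(kept, key=pvals.get)
        if pvals[worst] < alpha:
            break
        kept.remove(worst)
    return kept


# Simulated data (not the study data): SBP depends on AGE and DBP only.
random.seed(1)
data = []
for _ in range(200):
    row = {"AGE": random.uniform(30, 80), "DBP": random.uniform(60, 100),
           "BMI": random.uniform(18, 35)}
    row["SBP"] = 10.7 + 0.34 * row["AGE"] + 1.18 * row["DBP"] + random.gauss(0, 8)
    data.append(row)

kept = backward_elimination(data, "SBP", ["AGE", "DBP", "BMI"])
print("variables kept:", kept)
```

Forward selection runs the same idea in the opposite direction: start with no covariates and add the most significant candidate at each step.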
Variable Selection: Backward Elimination
Note: in the first model, find the variable with the highest p-value and drop it,
so it is absent from the following model. Repeat until all remaining variables
are significant.
Model: SBP = 10.7 + 0.34 × AGE + 1.18 × DBP
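The fitted model can be used directly for prediction. A minimal sketch, using the coefficients reported above and a hypothetical patient (age 50, DBP 80); `predict_sbp` is an illustrative name.

```python
def predict_sbp(age, dbp):
    """Predicted SBP from the fitted model reported on the slide."""
    return 10.7 + 0.34 * age + 1.18 * dbp


# Hypothetical patient: 50 years old with a DBP of 80.
print(round(predict_sbp(50, 80), 1))

# One extra year of age, with DBP held fixed, changes the prediction by the AGE coefficient.
print(round(predict_sbp(51, 80) - predict_sbp(50, 80), 2))
```

The second line illustrates the interpretation of a coefficient: the change in predicted SBP per unit change in one covariate while the other is held constant.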
Model Adequacy
R-squared.
Residual plot.
Normal probability plot.
Model Adequacy – R-squared
Note: take the final model, whose significance has already been evaluated.
About 59% of the variation can be explained.
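R-squared is the share of the outcome's total variation that the model explains, 1 − RSS/TSS. A minimal sketch with made-up observed and predicted SBP values (not the study data):

```python
def r_squared(observed, predicted):
    """Share of the outcome's total variation explained by the model: 1 - RSS / TSS."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
    tss = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
    return 1 - rss / tss


# Made-up observed and model-predicted SBP values.
observed = [110, 120, 130, 140, 150]
predicted = [114, 118, 133, 136, 149]
print(round(r_squared(observed, predicted), 3))
```

Here RSS = 46 and TSS = 1000, so R-squared is 0.954. A value near 0.59, as in this module, means about 59% of the variation is explained.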
Model Adequacy: Residual Plot
Note: for outliers, remove them from the sample and rerun the regression. If the
results are not affected, keep them.
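Candidates for that with/without comparison can be found from the residuals. The sketch below uses the fitted model from this module on made-up records and flags residuals larger than 2 standard deviations. That threshold is an assumed rule of thumb, not one stated on the slide.

```python
import statistics


def residuals(rows):
    """Observed minus predicted SBP, using the fitted model from the slide."""
    return [sbp - (10.7 + 0.34 * age + 1.18 * dbp) for age, dbp, sbp in rows]


# Made-up (AGE, DBP, SBP) records, not the study data; the last SBP is deliberately extreme.
rows = [(35, 70, 106), (42, 75, 112), (50, 80, 124), (58, 85, 129),
        (63, 88, 137), (70, 90, 139), (45, 78, 165)]

res = residuals(rows)
sd = statistics.stdev(res)
flagged = [row for row, r in zip(rows, res) if abs(r) > 2 * sd]  # assumed 2-SD rule
print(flagged)
```

The flagged records are the ones to drop temporarily when rerunning the regression.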
Model Adequacy: Normal probability plot
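A normal probability plot pairs each sorted value with the value expected at that rank under a normal distribution; points close to a straight line suggest normality. A minimal sketch that computes the plotting coordinates for made-up residuals, assuming the plotting position (i − 0.5)/n, which is one common choice:

```python
from statistics import NormalDist, mean, stdev


def normal_probability_points(values):
    """Pair each sorted value with the normal quantile expected at its rank."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    # (i - 0.5) / n is one common choice of plotting position.
    return [(fitted.inv_cdf((i - 0.5) / n), v)
            for i, v in enumerate(sorted(values), start=1)]


# Made-up residuals (not the study data).
res = [-6.1, -3.4, -2.0, -0.8, 0.3, 1.1, 2.2, 3.9, 4.8]
points = normal_probability_points(res)
for expected, observed in points:
    print(f"{expected:7.2f} {observed:7.2f}")
```

Plotting the second column against the first gives the normal probability plot.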
Discussion & Conclusion
Discussion:
Patients' age (beta = 0.34, p < 0.001, 95% CI: 0.189, 0.490) and
diastolic blood pressure (beta = 1.18, p < 0.001, 95% CI: 1.032, 1.328)
are significantly related to systolic blood pressure.
About 59.5% of the total variation in SBP is explained by age and diastolic
blood pressure, thus the model's prediction performance is moderately high.
The residual plot shows an outlier, and the normal probability plot confirms
that the outcome follows a normal distribution.
Note: Further investigation is required to evaluate the effect of the outlier on
the analysis results.
Conclusion: Older age and higher diastolic blood pressure are related
to increased systolic blood pressure.
Note: list the names of the variables, then present the results of the simple and
multiple regressions (b coefficient, p-value, CI). In papers, give the direction
of the relationship.
Module Summary
Correlation Analysis
Linear Regression Analysis: Continuous outcome
Simple regression – only one covariate
Multiple regression – 2 or more covariates
Objectives
Assumptions
Hypotheses/method
Run regression using SPSS
Evaluate model adequacy
Discussion & Conclusion
Stepwise Variable Selection: Use Backward elimination method
Recommended Reading
Module 8 – Billah 2018
Reference Text Books