[Figure: scatterplot of Births (vertical axis, 0 to 30) against Storks (horizontal axis, 0 to 10)]
Equations for bivariate and multiple regression
Bivariate: Y = a + bX, where a = Y intercept
Multiple: Y = a + b1X1 + b2X2, where a = Y intercept
[Figure: diagram labeled X1 and b]
Suppose that you find that older mothers give birth to heavier
babies. That is, the slope of maternal age (X) on birthweight (Y)
is positive.
BW = a + b1 (Maternal age)
Where a = 2500 and b1 = 20
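As a quick check on what these coefficients mean, here is a minimal sketch that plugs ages into the equation above. The values a = 2500 and b1 = 20 come from the slide; the function name and the example ages are made up for illustration, and grams are assumed as the unit of birthweight.

```python
# Coefficients from the slide: a = 2500 (intercept), b1 = 20 (slope per year of age).
A = 2500.0
B1 = 20.0


def predicted_birthweight(maternal_age: float) -> float:
    """Return the predicted birthweight from BW = a + b1 * (maternal age)."""
    return A + B1 * maternal_age


if __name__ == "__main__":
    # Example ages are illustrative only.
    for age in (20, 25, 30):
        print(age, predicted_birthweight(age))
```

Each additional year of maternal age raises the predicted birthweight by b1, so the difference between any two consecutive ages is 20.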
[Figure: birthweight (BW) plotted against AGE; diagram labeled X1, X2, and b1]
b1 reflects the magnitude of the association
between X1 and Y, holding X2 constant.
In this case b1 changes – it is smaller than the
Seeing the patterns in an example
Suppose: older women are less likely to smoke during pregnancy, and
women who smoke less have healthier (heavier) babies.
Let’s now look at the association between AGE and BW, holding
constant # of cigarettes smoked.
[Figure: birthweight (BW) plotted against AGE; diagram labeled X1, X2, and b1]
Suppose: Mothers of higher SES are older when they have their babies.
Mothers of higher SES have heavier babies
(perhaps because they have better prenatal care).
Now let’s look at the association between AGE and BW, holding constant
maternal SES.
BW = a + b1 (AGE) + b2 (SES)
And the values of the coefficients might be:
a = 2200 and b1 = 0 and b2 = 200, or
[Figure: birthweight (BW) plotted against AGE]
“Holding SES constant, there is no association between birthweight and age,”
or
“For every additional year of age, birthweight increases 0 grams,
holding SES constant.”
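The pattern described here (an age slope that vanishes once SES is held constant) can be reproduced on a tiny made-up dataset. The sketch below is an illustration only: the data are invented so that birthweight depends on SES alone (a = 2200, b1 = 0, b2 = 200, as in the slide), and the helper names `solve` and `ols` are hypothetical, written in plain Python rather than taken from any statistics package.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients [a, b1, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented data: higher-SES mothers are older, and BW depends only on SES.
ses = [1, 1, 2, 2, 3, 3]
age = [20, 22, 24, 26, 28, 30]
bw = [2200 + 0 * a + 200 * s for a, s in zip(age, ses)]

a_biv, b_age_biv = ols([age], bw)                 # BW = a + b1 (AGE)
a_mult, b_age_mult, b_ses = ols([age, ses], bw)   # BW = a + b1 (AGE) + b2 (SES)

if __name__ == "__main__":
    print("bivariate slope for AGE:", b_age_biv)
    print("AGE slope holding SES constant:", b_age_mult)
    print("SES slope:", b_ses)
```

In the bivariate fit AGE picks up a positive slope because it travels with SES; once SES enters the model, the AGE slope falls to (numerically) zero.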
(3) Adding a third variable
doesn’t change the association at all
Seeing the patterns in Venn diagrams
[Figure: Venn diagram labeled X1, X2, and b1]
Suppose that, while male children are heavier than female children,
there is no association between maternal age and gender of the child.
Gender of the child has no effect on the association between maternal age
and infant birth weight. We say that the relationship between maternal age
and infant birth weight is direct, or not confounded by the gender of
the child.
Seeing the patterns in a regression context
[Figure: birthweight (BW) plotted against AGE]
http://www.math.yorku.ca/SCS/spida/lm/visreg.html
Multiple regression in Stata:
interpreting the output
Example: Predicting the number of hours worked per week,
based on education and drinking habits
N = 1725
Look under Number of obs
R2 = 0.0572
The proportion of variance in Y that is explained by both independent
variables together. In this case it is 5.72%.
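To make "proportion of variance explained" concrete, here is a minimal sketch of the usual calculation, R2 = 1 - (residual sum of squares / total sum of squares). The function name and the numbers are made up for illustration; they are not the Stata data from this example, and the fitted values are invented rather than produced by an actual regression.

```python
def r_squared(y, y_hat):
    """Return 1 - SS_residual / SS_total for observed y and fitted y_hat."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1.0 - ss_resid / ss_total


# Hypothetical observed and fitted hours worked (illustration only).
observed = [30.0, 40.0, 50.0]
fitted = [35.0, 40.0, 45.0]

if __name__ == "__main__":
    print(r_squared(observed, fitted))
```

An R2 of 0.0572, as in the output above, would mean the residual sum of squares is still about 94% of the total.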
Constant = 38.49222
The constant is the Y intercept (a.k.a. “a”).
“For every additional hour spent in the pub last night, there is an
average decrease of 6.5 hours worked per week, holding constant
years of education.”
Interpreting the regression output, continued
If “t” were not given, you could still calculate t by using the
formula:
t = Statistic / Std. Error
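The formula above is a single division, sketched here for concreteness. The coefficient of -6.5 echoes the pub-hours interpretation given earlier; the standard error of 1.3 is a hypothetical value chosen for illustration and does not come from the Stata output.

```python
def t_statistic(coefficient: float, std_error: float) -> float:
    """Return t = statistic / standard error."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return coefficient / std_error


# Coefficient from the interpretation above; the standard error is hypothetical.
pub_hours_coefficient = -6.5
assumed_std_error = 1.3

if __name__ == "__main__":
    print(t_statistic(pub_hours_coefficient, assumed_std_error))
```

The sign of t follows the sign of the coefficient; its size tells you how many standard errors the estimate lies from zero.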
Rules for X’s in multiple regression
JB
This week in Stat 1
No Lab this week
For next lecture
No Healey reading
Re-read Chapters 7, 8, 9 of Wagner Way
Complete the homework
Complete the challenge problems
Final assignment
You can now complete almost all analyses
Dig into the assignment deeply before you
attend next week’s lecture – I’ll take questions
then
Always: analyze, then stop and think
Practice problems