Regression
• X = percent receiving reduced-fee school lunch (RFM)
• Y = percent using helmets (HELM)
• n = 12 (outlier removed to study linear relation)
[Figure: scatterplot of HELM against X - % Receiving Reduced Fee School Lunch]
Regression Model (Equation)
(p. 15.2)
ŷ = a + bX (ŷ is read “y hat”)
where
ŷ represents the predicted average of Y at a given X
a represents the line's intercept
b represents the line's slope
How formulas determine best line
(p. 15.2)
• Distance of points from line = residuals (dotted)
• Minimizes sum of squared residuals
• Least squares regression line
[Figure: scatterplot with the fitted line; dotted segments mark the residuals]
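As a rough illustration of the least squares idea, the Python sketch below fits a line to a small made-up data set and checks that nudging the intercept or slope away from the fitted values only increases the sum of squared residuals. The x and y values are hypothetical (they are not the helmet data), and the function names are illustrative.

```python
# Hypothetical illustration data (NOT the helmet data from these slides).
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [42.0, 38.0, 30.0, 27.0, 19.0]

def least_squares(x, y):
    """Return (a, b) for the line y-hat = a + b*x that minimizes squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = ss_xy / ss_xx
    a = y_bar - b * x_bar
    return a, b

def sum_sq_residuals(a, b, x, y):
    """Sum of squared vertical distances between the points and the line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

a, b = least_squares(x, y)
best = sum_sq_residuals(a, b, x, y)

# Every perturbed candidate line should fit worse than the least squares line.
for da, db in [(1.0, 0.0), (-1.0, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    assert sum_sq_residuals(a + da, b + db, x, y) > best
```

The perturbations are arbitrary; any line other than the fitted one gives a larger sum, which is what "least squares" means.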
b = SS_XY / SS_XX = -4231.1333 / 7855.67 = -0.539
a = ȳ - b·x̄ = 30.8833 - (-0.539)(30.8333) = 47.49
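The slope and intercept arithmetic can be checked with a few lines of Python. This is only a sketch using the summary values quoted on the slide. It takes SS_XY as negative, which is what an intercept of 47.49 implies, and the variable names are illustrative.

```python
# Summary statistics as quoted on the slide.
# Assumption: SS_XY is negative, as the intercept arithmetic (a = 47.49) requires.
ss_xy = -4231.1333
ss_xx = 7855.67
x_bar = 30.8333
y_bar = 30.8833

b = ss_xy / ss_xx        # slope
a = y_bar - b * x_bar    # intercept

print(f"b = {b:.3f}")
print(f"a = {a:.2f}")
```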
SPSS output:
Alternative formula for slope
b = r · (sY / sX)
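One way to see that the two slope formulas agree is to compute both on the same data. The sketch below does that on a small made-up data set; the values are hypothetical (not from the slides), and r, sX and sY are computed from the usual sums of squares with n - 1 in the standard deviations.

```python
import math

# Hypothetical illustration data (NOT from the slides).
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [42.0, 38.0, 30.0, 27.0, 19.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Slope from the sums of squares and cross-products.
b_from_ss = ss_xy / ss_xx

# Slope from the correlation and the two sample standard deviations.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
s_x = math.sqrt(ss_xx / (n - 1))
s_y = math.sqrt(ss_yy / (n - 1))
b_from_r = r * s_y / s_x

assert math.isclose(b_from_ss, b_from_r)
```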
Interpretation of Slope (b)
(p. 15.3)
• b = expected change in Y per unit X
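To make the interpretation concrete, this sketch evaluates the fitted line at two X values one unit apart and confirms that the predictions differ by exactly b. It assumes the helmet-example intercept and slope (a = 47.49, b = -0.539, with the slope taken as negative, consistent with the intercept arithmetic). The function name predict is hypothetical.

```python
A = 47.49    # intercept from the helmet example
B = -0.539   # slope from the helmet example (assumed negative)

def predict(x):
    """Predicted average of Y at a given X: y-hat = a + b*X."""
    return A + B * x

# Two X values one unit apart: the predictions differ by the slope.
change = predict(31) - predict(30)
print(f"Predicted HELM at X = 30: {predict(30):.2f}")
print(f"Change per one-unit increase in X: {change:.3f}")
```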
• H0: β = 0
• tstat (formula 7) with df = n – 2
• Convert tstat to p value
tstat = b / seb = -0.539 / 0.1058 = -5.09
df = 12 – 2 = 10
p = 2 × area beyond |tstat| on t10
Use t table and StaTable
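The test steps can be sketched in Python with the standard library alone. Here numerical integration of the t density (Simpson's rule) stands in for the t table or StaTable; the function names are illustrative, and the slope is taken as negative, consistent with the intercept arithmetic.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=10_000):
    """Two-sided p value: 1 minus the area between -|t| and +|t| (Simpson's rule)."""
    upper = abs(t_stat)
    h = upper / steps              # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3   # area from 0 to |t|
    return 1 - 2 * central_half

b, se_b, n = -0.539, 0.1058, 12    # slope assumed negative
t_stat = b / se_b
df = n - 2
p = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, p = {p:.5f}")
```

The resulting p value falls below .001, so H0: β = 0 would be rejected at conventional significance levels.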
Regression ANOVA
(not in Reader & NR)
• Linearity
• Independence
• Normality
• Equal variance
[Figure: distribution of Y around the regression line; axes X and Y]
Validity Assumptions
(p. 15.6)
• Data = farr1852.sav
correlation r = -0.88
ŷ = 129.9 + (-1.33)x
p = .009
[Figure: scatterplot; X = Mean Elevation of the Ground above the Highwater Mark]
• Farr used these results to support miasma theory and refute contagion theory
– But data not valid (confounded by “polluted water source”)