Professional Documents
Culture Documents
e i = Y i − Yˆi
• The residual for observation i, ei, is the difference between its observed and predicted value
• Check the assumptions of regression by examining the residuals
– Examine for Linearity assumption
– Evaluate Independence assumption
– Evaluate Normal distribution assumption
– Examine for constant variance for all levels of X (homoscedasticity)
RESIDUAL OUTPUT
House Price Model Residual Plot
Square Feet
(X) Residuals
80
1 1400 -6.923162
60
2 1600 38.12329
40
3 1700 -5.853484
Residuals
20
4 1875 3.937162
1100 0
5 -19.99284
1000 2000 3000
1550 -20 0
6 -49.38832
2350 -40
7 48.79749
8 2450 -43.17929 -60
1425 Square Feet
9 64.33264
10 1700 -29.85348
Does not appear to violate any regression assumptions
Inferences About the Slope: t Test
• t test for a population slope
– Is there a linear relationship between X and Y?
• Null and alternative hypotheses
– H0: β1 = 0 (no linear relationship)
– H1: β1 ≠ 0 (linear relationship does exist)
• Test statistic
b1 −β1
t STAT =
Sb
where: 1
d.f. = 10- 2 = 8
α /2=.025 α/2=.025
Decision: Reject H0
b1
From Minitab output: S b1
Predictor Coef SE Coef T P
Constant 98.25 58.03 1.69 0.129
Square Feet 0.10977 0.03297 3.33 0.010
b1− β 0.10977 − 0
t STAT = 1 = = 3.32938
b1 Sb1 Sb 0.03297
1
Multiple Regression
The Multiple Regression Model
Idea: Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (Xi)
Yi = β 0 + β1X1i + β 2 X 2i + + β k X ki + ε i
Multiple Regression Equation
The coefficients of the multiple regression model are estimated using sample data
Yˆ = b + b X + b X + + b X
i 0 1 1i 2 2i k ki
Example: 2 Independent Variables
• A distributor of frozen dessert pies wants to evaluate factors thought
to influence demand
– Dependent variable: Pie sales (units per week)
– Independent variables: Price (in $)
Advertising ($100’s)
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333