Professional Documents
Culture Documents
Regression
Correlation vs. Regression
Y Y
X X
Y Y
X X
Types of Relationships
No relationship
X
Regression Models
Linear relationships Curvilinear relationships
Y Y
X X
Y Y
X X
What is
Regression Analysis
Population Random
Population Independent Error
Slope
Y intercept Variable term
Coefficient
Dependent
Variable
Yi β0 β1Xi ε i
Linear component Random Error
component
Simple Linear Regression Equation
(Prediction Line)
Estimated
(or predicted) Estimate of Estimate of the
Y value for the regression regression slope
observation i
intercept
Value of X for
Ŷi b0 b1Xi
observation i
Interpretation of the
Slope and the Intercept
350
300
250
200
150
100
50
0
0 500 1000 1500 2000 2500 3000
Square Feet
Simple Linear Regression Example:
Using Excel Data Analysis Function
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
350
Slope
300
250
= 0.10977
200
150
100
50
Intercept 0
= 98.248 0 500 1000 1500 2000 2500 3000
Square Feet
98.25 0.1098(2000)
317.85
The predicted price for a house with 2000
square feet is 317.85($1,000s) = $317,850
Inferences About the Slope:
t Test Example
H0: Square footage does not
impact house prices
From Excel output:
Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039
b1 Sb1
b1 β 1 0.10977 0
t STAT 3.32938
Sb 0.03297
1
Inferences About the Slope:
t Test Example
H0: β1 = 0
From Excel output: H1: β1 ≠ 0
Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039
Regression Statistics
Multiple R 0.76211
MSR 18934.9348
R Square 0.58082 FSTAT 11.0848
Adjusted R MSE 1708.1957
Square 0.52842
Standard Error 41.33032
With 1 and 8 degrees p-value for
Observations 10 of freedom the F-Test
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000
F Test for Significance (continued)
Yi β 0 β1 X1i β 2 X 2i β k X ki ε i
Multiple Regression Equation
The coefficients of the multiple regression model are
estimated using sample data
ˆ b b X b X b X
Yi 0 1 1i 2 2i k ki
In this chapter we will use Excel to obtain the regression
slope coefficients and other regression summary measures.
Example: 2 Independent Variables
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341 Sales 306.526 - 24.975(Price) 74.131(Advertising)
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Regression Statistics
SSR 29460.0
Multiple R 0.72213
r
2
.52148
R Square 0.52148 SST 56493.3
Adjusted R Square 0.44172
52.1% of the variation in pie sales
Standard Error 47.46341
is explained by the variation in
Observations 15
price and advertising
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Regression Statistics
Multiple R 0.72213
Test Statistic:
bj 0
t STAT (df = n – k – 1)
Sb
j
Are Individual Variables
Significant? Excel Output (continued)
Regression Statistics
Multiple R 0.72213
t Stat for Price is tSTAT = -2.306, with
R Square 0.52148 p-value .0398
Adjusted R Square 0.44172
Standard Error 47.46341 t Stat for Advertising is tSTAT = 2.855,
Observations 15 with p-value .0145
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333