Regression
$y = \beta_0 + \beta_1 x + \varepsilon$
where:
$\beta_0$ and $\beta_1$ are called parameters of the model,
$\varepsilon$ is a random variable called the error term.
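A minimal sketch of how the model generates data, assuming normally distributed errors and made-up parameter values (intercept 10, slope 5, error standard deviation 2); none of these numbers come from the slides, and `simulate` is a hypothetical helper name.

```python
import random

def simulate(beta0, beta1, xs, sigma, seed=0):
    """Draw y = beta0 + beta1 * x + error for each x.

    Assumption: the error term is Normal(0, sigma); the slides only say
    it is a random variable.
    """
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical parameter values, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = simulate(beta0=10.0, beta1=5.0, xs=xs, sigma=2.0)

for x, y in zip(xs, ys):
    # Each observed y scatters around the deterministic part beta0 + beta1 * x.
    print(f"x={x}  y={y:.2f}  beta0 + beta1*x = {10.0 + 5.0 * x:.1f}")
```

Setting `sigma=0.0` removes the error term, so every simulated y falls exactly on the line.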
Simple Linear Regression Equation
$E(y) = \beta_0 + \beta_1 x$
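[Figure: Positive linear relationship. Regression line with intercept $\beta_0$ and positive slope $\beta_1$; horizontal axis x, vertical axis E(y).]
[Figure: Negative linear relationship. Regression line with intercept $\beta_0$ and negative slope $\beta_1$; horizontal axis x, vertical axis E(y).]
[Figure: No relationship. Horizontal regression line (slope $\beta_1 = 0$); horizontal axis x, vertical axis E(y).]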
Types of Linear Relationships
[Figure: four scatter-plot panels, each with horizontal axis X and vertical axis Y.]
08/14/2020 1
Estimated Simple Linear Regression Equation
$\hat{y} = b_0 + b_1 x$
The sample statistics $b_0$ and $b_1$ provide estimates of the parameters $\beta_0$ and $\beta_1$.
Least Squares Method
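Least Squares Criterion (no need to memorise this formula)
$\min \sum (y_i - \hat{y}_i)^2$
where:
$y_i$ = observed value of the dependent variable for the ith observation,
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation.
It is called "least squares" because the line of best fit is the one that minimizes the sum of the squared errors.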
Least Squares Method
$b_0 = \bar{y} - b_1 \bar{x}$
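For reference, the intercept depends on the slope. The standard least-squares result for the pair, written in the same symbols, is:

```latex
b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
```

Here $\bar{x}$ and $\bar{y}$ are the sample means of the independent and dependent variables.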
Simple Linear Regression
Number of TV Ads (x)    Number of Cars Sold (y)
1                       14
3                       24
2                       18
1                       17
3                       27
$\sum x = 10$           $\sum y = 100$
$\bar{x} = 2$           $\bar{y} = 20$
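The least squares arithmetic for the TV ads data can be sketched as follows. The variable names are illustrative; the formulas are the standard least-squares slope and intercept.

```python
# TV ads (x) and cars sold (y) from the table above.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]

n = len(x)
x_bar = sum(x) / n  # sample mean of x
y_bar = sum(y) / n  # sample mean of y

# Sum of cross-products and sum of squared deviations of x.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sxy / sxx           # slope
b0 = y_bar - b1 * x_bar  # intercept

print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"b1 = {b1}, b0 = {b0}")
```

With these five observations the slope works out to 5 and the intercept to 10.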
Estimated Regression Equation
$\hat{y} = 10 + 5x$
[Figure: scatter diagram of Cars Sold (vertical axis) against TV Ads (horizontal axis) with the estimated regression line.]
Coefficient of Determination
• The coefficient of determination, also known as R squared ($r^2$), measures the goodness of fit of a regression.
• The higher the coefficient of determination, the larger the proportion of the variance in the dependent variable that is explained by the independent variable.
• The coefficient of determination is an overall measure of the usefulness of a regression.
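As a rough illustration, the coefficient of determination for the TV ads example can be computed from the usual sums of squares ($r^2 = SSR/SST$). The sketch assumes the estimated line with intercept 10 and slope 5 from the example; the names `sst`, `sse` and `ssr` are illustrative.

```python
# TV ads (x) and cars sold (y) from the example.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]

b0, b1 = 10.0, 5.0  # estimated regression equation from the example
y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]  # fitted values

sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
ssr = sst - sse                                        # regression sum of squares

r2 = ssr / sst
print(f"SST = {sst}, SSE = {sse}, SSR = {ssr}")
print(f"r^2 = {r2:.4f}")
```

About 88% of the variation in cars sold is explained by the number of TV ads in this small sample.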
Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$
The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
$r_{xy} = +\sqrt{0.8772} = +0.9366$
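A short check, under the assumption that the TV ads data is the sample in question: the correlation obtained from the sign of the slope and the square root of $r^2$ should agree with the Pearson correlation computed directly from the data. Variable names are illustrative.

```python
import math

# TV ads (x) and cars sold (y) from the example.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

b1 = sxy / sxx             # slope of the estimated regression equation
r2 = b1 ** 2 * sxx / syy   # coefficient of determination (SSR / SST)

# Route 1: sign of the slope times the square root of r^2.
r_from_r2 = math.copysign(math.sqrt(r2), b1)

# Route 2: Pearson correlation computed directly.
r_direct = sxy / math.sqrt(sxx * syy)

print(f"r_xy via r^2    = {r_from_r2:+.4f}")
print(f"r_xy via Pearson = {r_direct:+.4f}")
```

Both routes give the same value, which is why the sign of $b_1$ is all that is needed to recover the correlation from $r^2$ in simple linear regression.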
Assumptions About the Error Term $\varepsilon$
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test Statistic (no need to memorise this formula)
$t = \dfrac{b_1}{s_{b_1}}$ where $s_{b_1} = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$
Testing for Significance: t Test
Rejection Rule
Reject $H_0$ if $t \leq -t_{\alpha/2}$ or $t \geq t_{\alpha/2}$
where:
$t_{\alpha/2}$ is based on a t distribution
with n - 2 degrees of freedom
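A minimal sketch of the t test applied to the TV ads example. Two assumptions are not in the slides: a significance level of 0.05, and the critical value 3.182, which is the standard table value of $t_{0.025}$ for 3 degrees of freedom. Here $s$ is taken as the standard error of the estimate, $\sqrt{SSE/(n-2)}$.

```python
import math

# TV ads (x) and cars sold (y) from the example.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: s = sqrt(SSE / (n - 2)).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

s_b1 = s / math.sqrt(sxx)  # estimated standard deviation of b1
t_stat = b1 / s_b1

# Assumed: alpha = 0.05, so t_{0.025} with n - 2 = 3 degrees of freedom
# is 3.182 (value read from a t table, not computed here).
T_CRITICAL = 3.182
reject_h0 = abs(t_stat) >= T_CRITICAL

print(f"s = {s:.4f}, s_b1 = {s_b1:.4f}, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Under these assumptions the test statistic exceeds the critical value, so the slope is judged significantly different from zero.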
Testing for Significance: t Test
ANOVA
df SS MS F Significance F
Weekly sales ($1000s)    Number of customers
312                      1600
199                      1100
219                      1550
405                      2350
324                      2450
319                      1425
255                      1700
Regression Using Excel
Graphical Presentation
[Figure: scatter plot of Weekly sales ($1000s) (vertical axis) against Number of customers (horizontal axis) with the fitted regression line.]
Intercept = 98.248
Slope = 0.10977
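Using the intercept and slope reported on the chart, a prediction can be sketched as below. The function name `predict_weekly_sales` and the input of 2000 customers are illustrative choices, not values from the slides.

```python
# Intercept and slope as reported on the chart.
INTERCEPT = 98.248  # weekly sales ($1000s) when the number of customers is 0
SLOPE = 0.10977     # change in weekly sales ($1000s) per additional customer

def predict_weekly_sales(customers):
    """Return predicted weekly sales in $1000s for a given customer count."""
    return INTERCEPT + SLOPE * customers

# Hypothetical input: a store with 2000 customers.
print(f"{predict_weekly_sales(2000):.3f}")
```

Predictions are most trustworthy for customer counts inside the range shown on the chart (roughly 0 to 3000).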