Chapter 5
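Population regression model: $Y_i = \alpha + \beta X_i + \varepsilon_i$, where $\alpha$ is the y-intercept and $\beta$ is the slope.
Estimated regression line: $\hat{y}_i = a + b x_i$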
Interpretation of a
a is the conditional mean of y when x = 0.
Interpretation of b
b is the slope: the change in the mean of y per one-unit change in x.
Note that when b is positive, an increase in x will lead to an increase in y, and a
decrease in x will lead to a decrease in y. In other words, when b is positive, the
movements in x and y are in the same direction. Such a relationship between x and y
is called a positive linear relationship. The regression line in this case slopes upward
from left to right. On the other hand, if the value of b is negative, an increase in x will
lead to a decrease in y, and a decrease in x will cause an increase in y. The changes in
x and y in this case are in opposite directions. Such a relationship between x and y is
called a negative linear relationship. The regression line in this case slopes downward
from left to right.
We wish to use the sample data to estimate the population parameters: the slope β
and the intercept α.
[Figure: observed data values (y) and the estimated regression line]
$g(a,b) = \sum (y_i - \hat{y}_i)^2$
$e_i = y_i - \hat{y}_i$
So, in least squares estimation, we wish to minimize the sum of the squared
residuals (or error sum of squares SSE).
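As a rough illustration of what "minimizing the sum of squared residuals" means, the sketch below evaluates g(a, b) for two candidate lines. The function name `sse` is an assumption made for this example, and the data are the five (x, y) pairs used in the worked example later in this chapter.

```python
def sse(a, b, xs, ys):
    """Sum of squared residuals g(a, b) for the candidate line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Data from the chapter's first worked example.
xs = [0, 1, 2, 3, 4]
ys = [2, 3, 5, 4, 6]

# The least squares line found later in the chapter (a = 2.2, b = 0.9)
# should give a smaller SSE than a nearby, arbitrarily chosen line.
print(sse(2.2, 0.9, xs, ys))
print(sse(2.0, 1.0, xs, ys))
```

The second line (a = 2.0, b = 1.0) is an arbitrary comparison chosen only to show that a different line gives a larger error sum of squares.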
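To minimize
$g(a,b) = \sum (y_i - a - b x_i)^2$
we take the partial derivatives of g with respect to a and b, set them equal to zero, and solve.
$\frac{\partial g}{\partial a} = -2 \sum (y_i - a - b x_i) = 0$
$\frac{\partial g}{\partial b} = -2 \sum (y_i - a - b x_i) x_i = 0$
$a = \frac{\sum y_i - b \sum x_i}{n}$
or $a = \bar{y} - b\bar{x}$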
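The remaining algebra can be sketched as follows. Dividing each derivative equation by -2 and rearranging gives the two normal equations; substituting the expression for a from the first into the second isolates b. The subscripted notation is an assumption made here for clarity.

```latex
\begin{align*}
\sum y_i &= n a + b \sum x_i \\
\sum x_i y_i &= a \sum x_i + b \sum x_i^2 \\
\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}
  &= b \left( \sum x_i^2 - \frac{(\sum x_i)^2}{n} \right) \\
b &= \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}
          {\sum x_i^2 - \frac{(\sum x_i)^2}{n}}
\end{align*}
```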
x 0 1 2 3 4
y 2 3 5 4 6
x     y     xy     x²
0     2     0      0
1     3     3      1
2     5     10     4
3     4     12     9
4     6     24     16
Σx = 10   Σy = 20   Σxy = 49   Σx² = 30
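We now calculate a and b using the least squares regression formulas for
a and b.
$b_{y.x} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$
$b_{y.x} = \frac{49 - \frac{(10)(20)}{5}}{30 - \frac{(10)^2}{5}}$
$b_{y.x} = \frac{49 - 40}{30 - 20} = \frac{9}{10} = 0.9$
$a = \bar{y} - b\bar{x}$
$\bar{y} = \sum y / n$ and $\bar{x} = \sum x / n$
$\bar{y} = 20 / 5 = 4$ and $\bar{x} = 10 / 5 = 2$
a = 4 - (0.9 × 2)
a = 4 - 1.8 = 2.2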
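The hand calculation above can be checked with a short Python sketch. The function name `least_squares` is hypothetical; it simply applies the sum-based formulas for b and a to the five data points of this example.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: corrected sum of cross-products over corrected sum of squares of x.
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: mean of y minus slope times mean of x.
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = least_squares([0, 1, 2, 3, 4], [2, 3, 5, 4, 6])
print(round(a, 4), round(b, 4))  # 2.2 0.9
```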
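$\sigma_{y.x} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{N}}$
Where N is the population size.
$s_{y.x} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$
Alternate formula
$s_{y.x} = \sqrt{\frac{\sum y^2 - a \sum y - b \sum xy}{n - 2}}$
Where n is the sample size.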
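As a hedged sketch, the sample standard error of estimate can be computed in two ways: from the residuals directly, and from the sums Σy², Σy and Σxy. Both divide by n - 2. The function names are assumptions made for this example, and the data and coefficients (a = 2.2, b = 0.9) are taken from the first worked example in this chapter.

```python
import math


def std_error_from_residuals(a, b, xs, ys):
    """s_{y.x} from the residuals: sqrt(sum((y - y_hat)^2) / (n - 2))."""
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))


def std_error_from_sums(a, b, xs, ys):
    """s_{y.x} from sums: sqrt((sum(y^2) - a*sum(y) - b*sum(xy)) / (n - 2))."""
    sum_y2 = sum(y * y for y in ys)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (len(xs) - 2))


xs = [0, 1, 2, 3, 4]
ys = [2, 3, 5, 4, 6]
s1 = std_error_from_residuals(2.2, 0.9, xs, ys)
s2 = std_error_from_sums(2.2, 0.9, xs, ys)
print(round(s1, 4), round(s2, 4))
```

The two results agree only when a and b are the least squares estimates, because the sum-based form relies on the normal equations.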
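$r^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}$
$r^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}$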
For example, assume you have a set of four observed values for an
unnamed experiment. The table below shows the y values observed and
recorded for given values of x:
x     y
1     1
2     4
3     6
4     7
Solution:
x     y     xy     x²
1     1     1      1
2     4     8      4
3     6     18     9
4     7     28     16
Σx = 10   Σy = 18   Σxy = 55   Σx² = 30
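We now calculate a and b using the least squares regression formulas for
a and b.
$b_{y.x} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$
$b_{y.x} = \frac{55 - \frac{(10)(18)}{4}}{30 - \frac{(10)^2}{4}}$
$b_{y.x} = \frac{55 - 45}{30 - 25} = \frac{10}{5} = 2$
$a = \bar{y} - b\bar{x}$
$\bar{y} = \sum y / n$ and $\bar{x} = \sum x / n$
$\bar{y} = 18 / 4 = 4.5$ and $\bar{x} = 10 / 4 = 2.5$
a = 4.5 - (2 × 2.5)
a = 4.5 - 5 = -0.5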
The estimated least squares regression line of y on x is
$\hat{y} = -0.5 + 2x$
Given the estimated regression line $\hat{y} = -0.5 + 2x$, where $\hat{y}$ is the
predicted y value, the residual for each observation can be found.
The residual is equal to $(y - \hat{y})$, so for the first observation, the actual y value is 1
and the predicted value given by the equation is $\hat{y} = -0.5 + 2(1) = 1.5$.
The residual value is thus 1 - 1.5 = -0.5, a negative residual value.
For the second observation, where x is 2 and y is 4, the predicted y value
is $\hat{y} = -0.5 + 2(2) = 3.5$.
The residual value is thus 4 - 3.5 = 0.5, a positive residual value.
When the actual and predicted values are the same, the residual value is zero.
The same process gives the predicted values and residuals for the remaining two
observations: for x = 3, $\hat{y} = 5.5$ and the residual is 6 - 5.5 = 0.5; for
x = 4, $\hat{y} = 7.5$ and the residual is 7 - 7.5 = -0.5.
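The residual calculation for all four observations can be sketched in Python as follows. The variable names are chosen for this illustration only; the data and the fitted line (a = -0.5, b = 2) come from the example above.

```python
xs = [1, 2, 3, 4]
ys = [1, 4, 6, 7]
a, b = -0.5, 2.0

# Predicted value for each x, then residual = observed y - predicted y.
predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

for x, y, y_hat, e in zip(xs, ys, predicted, residuals):
    print(f"x={x}  y={y}  predicted={y_hat}  residual={e}")

# For a least squares line with an intercept, the residuals sum to zero.
print(sum(residuals))
```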
$s_{y.x} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} = \sqrt{\frac{1}{4 - 2}}$
$s_{y.x} = 0.70711$
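$r^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}$
$\sum (y - \bar{y})^2 = \sum y^2 - (\sum y)^2 / n = 102 - (18)^2 / 4$
$= 102 - 81 = 21$
$r^2 = 1 - \frac{1}{21} = 1 - 0.047619 = 0.952381$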
This means that about 95% of the variation in y is explained by the regression line.
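A minimal Python sketch of this coefficient of determination calculation is given below, assuming the fitted line $\hat{y} = -0.5 + 2x$ from the example. The variable names `sse` and `sst` (error and total sums of squares) are labels chosen for this illustration.

```python
xs = [1, 2, 3, 4]
ys = [1, 4, 6, 7]
a, b = -0.5, 2.0
n = len(ys)

# Unexplained variation: sum of squared residuals.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
# Total variation: sum(y^2) - (sum(y))^2 / n.
sst = sum(y * y for y in ys) - sum(ys) ** 2 / n

r_squared = 1 - sse / sst
print(sse, sst, round(r_squared, 6))  # 1.0 21.0 0.952381
```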
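$r_{y.x} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$
$r_{y.x} = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}}$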
X 4 8 12 16
Y 5 10 15 20
Solution:
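To find the linear correlation coefficient for these data, we first
construct a table of the required values.
X      Y      XY      X²      Y²
4      5      20      16      25
8      10     80      64      100
12     15     180     144     225
16     20     320     256     400
ΣX = 40   ΣY = 50   ΣXY = 600   ΣX² = 480   ΣY² = 750
$r_{y.x} = \frac{n \sum XY - \sum X \sum Y}{\sqrt{[n \sum X^2 - (\sum X)^2][n \sum Y^2 - (\sum Y)^2]}}$
$r_{y.x} = \frac{4(600) - (40)(50)}{\sqrt{[4(480) - (40)^2][4(750) - (50)^2]}}$
$r_{y.x} = \frac{2400 - 2000}{\sqrt{(320)(500)}}$
$r_{y.x} = \frac{400}{400}$
$r_{y.x} = 1$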
Hence there is perfect positive correlation between X and Y.
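This result can be checked with a short Python sketch of the sum-based correlation formula. The function name `correlation` is an assumption made for this example, and the data are the four (X, Y) pairs given above.

```python
import math


def correlation(xs, ys):
    """Linear correlation coefficient r computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = correlation([4, 8, 12, 16], [5, 10, 15, 20])
print(r)  # 1.0
```

Here every Y value is exactly 1.25 times the corresponding X value, which is why the points fall on a single upward-sloping line and r equals 1.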
a) Find the regression line with price as the dependent variable and age as the
independent variable.
b) Predict the price of a 7-year-old car of this model.
c) Estimate the price of an 18-year-old car of this model. Comment on this finding.
Q2: Calculate and analyze the correlation coefficient between the number of study
hours and the number of sleeping hours of different students.
Number of Study Hours       2    4    6    8    10
Number of Sleeping Hours    10   9    8    7    6
Q4: The following data give the experience (in years) and monthly salaries (in
hundreds of dollars) of nine randomly selected secretaries.
Experience        14   3    5    6    4    9    18   5    16
Monthly salary    62   29   37   43   35   60   67   32   60