Notes13 1student
1 The Simple Regression Model
In Chapter 5, we studied linear regression and calculated equations in the form ŷ = a + bx. Since this
equation was based on a sample, it is subject to sampling variability and will probably not match the true relationship exactly.
In the _________________________________________________, we will assume that
there is a true or population regression line ŷ = α + βx
the relationship between the observed variables is given by y = α + βx + e, where e ~ N(0, σ_e)
Note that this means that for any fixed value of x, y has a normal distribution with mean α + βx and
standard deviation σ_e. The actual y value for a particular value of x is the sum of two terms: the mean
value of y for that particular value of x and a random deviation from the mean: y = ŷ + e. If the error
term, e, were not included, all points would fall exactly on the true regression line.
The ŷ value produced by the model has 2 interpretations:
ŷ is the predicted y value for a particular x (our best guess for the actual value of y)
ŷ is the average y value for a particular x (sometimes this is expressed as μ_y = α + βx)
Each value of x has its own distribution of y values.
Thus, when we do inference for regression, we will need to check that the variability of the residuals is
the same for all x’s and that they are approximately normally distributed.
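The idea that each x has its own normal distribution of y values, all with the same spread, can be checked with a small simulation. The sketch below is only an illustration and is not part of the original notes: the parameter values, the function name simulate_y, and the seed are assumptions chosen for demonstration.

```python
import random
import statistics

def simulate_y(alpha, beta, sigma_e, x, n, rng):
    """Draw n values of y = alpha + beta*x + e, where e ~ N(0, sigma_e)."""
    return [alpha + beta * x + rng.gauss(0, sigma_e) for _ in range(n)]

# Hypothetical population parameters, chosen only for illustration.
ALPHA, BETA, SIGMA_E = 10.0, 2.0, 3.0
rng = random.Random(1)  # fixed seed so the run is repeatable

summary = {}
for x in (1, 5, 10):
    ys = simulate_y(ALPHA, BETA, SIGMA_E, x, 20_000, rng)
    summary[x] = (statistics.mean(ys), statistics.stdev(ys))
    print(f"x = {x:2d}: mean of y = {summary[x][0]:.2f}, SD of y = {summary[x][1]:.2f}")
```

In a run like this, the mean of y should shift with x while the SD stays close to the same value at every x, which is the condition the residual checks are meant to verify.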
For example, suppose we knew that α = 2, β = 5, and σ_e = 1.3.
If x = 3, then we would expect y to be normally distributed with
mean = ________________
SD = _______________
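The arithmetic for this example can be sketched in a few lines. This is an illustration under the model's assumptions, not part of the original notes: the variable names and the use of Python's statistics.NormalDist are choices made here, and the numbers are the intercept 2, slope 5, and error standard deviation 1.3 given in the example.

```python
from statistics import NormalDist

# Values from the example: intercept 2, slope 5, error SD 1.3.
alpha, beta, sigma_e = 2.0, 5.0, 1.3
x = 3

mean_y = alpha + beta * x   # mean of y at this x: 2 + 5(3)
sd_y = sigma_e              # under the model, the SD of y is the same at every x

y_dist = NormalDist(mu=mean_y, sigma=sd_y)
# Proportion of y values within 2 SDs of the mean (about 95% for any normal distribution).
within_2sd = y_dist.cdf(mean_y + 2 * sd_y) - y_dist.cdf(mean_y - 2 * sd_y)
print(mean_y, sd_y, round(within_2sd, 3))
```

Changing x moves the center of the distribution but leaves its spread alone, since only the mean depends on x.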
In the next section we will use inference methods to estimate the parameters of the regression model and
to test the validity of the linear model. The sample regression line ŷ = a + bx will provide point
estimates for these parameters.
Suppose we wanted to model a teacher’s salary (y) based on the number of years s/he has been teaching
(x). Using a random sample of 13 salaries (in 1000’s), the following model was calculated:
ŷ = 45.59 + 0.798x     r² = 0.587     s_e = 5.31
This gives an estimate of the true regression line.
Interpret the slope and constant term of this model:
Interpret the values of r² and s_e.
Interpret the value of ŷ when x = 3.
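A fitted line like this one can be used as a simple prediction function. The sketch below is an illustration rather than part of the original notes: it hard-codes the coefficients reported for the salary model, the function name predict_salary is hypothetical, and salaries are in thousands, as in the sample.

```python
def predict_salary(years, intercept=45.59, slope=0.798):
    """Predicted salary (in thousands) from the fitted line y-hat = 45.59 + 0.798x."""
    return intercept + slope * years

y_hat_3 = predict_salary(3)
# The difference in predicted salary for one extra year of teaching equals the slope.
one_year_change = predict_salary(4) - predict_salary(3)
print(f"Predicted salary at x = 3: {y_hat_3:.3f} thousand")
print(f"Change per additional year: {one_year_change:.3f} thousand")
```

Because the sample's x values are not listed here, the sketch does not check whether a given number of years falls inside the range of the data; predictions far outside that range should be treated with caution.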
HW #96: 13.1, 13.3, 13.6, 13.8 (find the regression line from the data using your calculators)