You are on page 1of 1

SIMPLE LINEAR REGRESSION

15.1 SIMPLE LINEAR REGRESSION It represents the percent of the data that is the closest to the
A simple linear regression attempts to model the line of best fit.
relationship between two variables by fitting a linear equation For example, if r = 0.922, then r 2= 0.850, this means that
to observed data. One variable is considered to be an 85% of the total variation in y can be explained by the linear
explanatory variable, and the other is considered to be a
relationship between x and y . The other 15% of the total
dependent variable.
 Dependent variable – the variable that is being variation in y remains unexplained. If the regression line
estimated or predicted. passes exactly through every point on the scatter plot, it would
 Independent variable – the variable that provides a basis be able to explain all of the variation. The further the line is
for estimation. It is the predictor variable. away from the points, the less it is able to explain.
The linear regression model postulates that EXERCISES:
Y = β0 + β 1 X +e 1. A study was made by a retail merchant to determine the
where: relation between weekly advertising expenditures and sales.
Y = dependent/response variable The following data were recorded:
X = independent/explanatory variable Advertisin Sales ($) Advertisin Sales ($)
g Costs ($) g Costs ($)
β 0 and β 1= regression coefficients
40 385 40 490
β 0= y -intercept of the regression line 20 400 20 420
β 1 = slope of the regression line 25 395 50 560
e = residual/random error 20 365 40 525
In general, the goal of linear regression is to find the line 30 475 25 480
50 440 50 510
that best predicts Y from X , that is, to find the line
a) Plot a scatter diagram.
Y =a+bX that best estimates the regression model b) Find the equation of the regression line to predict weekly
Y = β0 + β 1 X +e by determining a and b that best estimate sales from advertising expenditures.
β 0 and β 1. c) Compute the coefficient of determination.
**Note that linear regression assumes that the data are d) Estimate the weekly sales when advertising costs are $35.
linear and it finds the slope and intercept that make a straight 2. In the 1990’s, research efforts have focused on the problem
line best fit the data. of predicting a manufacturer’s market share using information
on the quality of its product. Suppose that the following data
15.2 METHOD OF LEAST SQUARES are available on market share, in percentage ( Y ), and product
The slope: quality, on scale of 0 to 100, determined by an objective
n ∑ xy−∑ x ∑ y evaluation procedure ( X ).
b= 2 X 27 39 73 66 33 43 47 55 60 68 70 75
n ∑ x 2− ( ∑ x )
Y 2 3 10 9 4 6 5 8 7 9 10 13
a) Draw the scatter diagram.
The y -intercept: b) Estimate the simple linear regression relationship between

a=
∑ y −b ∑ x market share and product quality rating. Graph the line.
c) Compute for the coefficient of determination. Interpret.
n n
d) Estimate the market share when the product quality is 95.
a= ý−b x́
3. The paired data below consist of the costs of advertising (in
The goal of linear regression is to adjust the values of thousands of dollars) and the number of products sold (in
slope and intercept to find the line that best predicts Y from X thousands).
. More precisely, the goal of regression is to minimize the sum Cost 9 2 3 4 2 5 9 10
of the squares of the vertical distances of the points from the Number 85 52 55 68 67 86 83 73
line. a) Plot a scatter diagram.
b) Find the equation of the regression line to predict weekly
15.3 THE COEFFICIENT OF DETERMINATION sales from advertising expenditures.
The coefficient of determination, r 2, is used to determine c) Compute the coefficient of determination. Interpret.
the proportion of the variance (fluctuation) of one variable that d) Estimate the number of products sold when advertising
is predictable from the other variable. It allows us to determine costs are $4500.
how certain one can be in making predictions from a certain
model/graph.
The coefficient of determination has values from 0 to +1,
and measures how well the regression line represents the data.
Page 1 of 1

You might also like