
Regression Analysis.
Linear Regression
PLAN

1. Least squares regression line, slope and intercept values.
2. The Standard Error of Estimate.
3. Construction and interpretation of a confidence interval and prediction interval for the dependent variable.

McGraw-Hill/Irwin Copyright © 2002 by The McGraw-Hill Companies, Inc. All rights reserved.

Regression Analysis

• In regression analysis we use the independent variable (X) to estimate the dependent variable (Y).


Regression Analysis
The regression equation: Y’ = a + bX, where:
• Y’ is the average predicted value of Y for any X.
• a is the Y-intercept. It is the estimated Y value when X = 0.
• b is the slope of the line, or the average change in Y’ for each change of one unit in X.
• The least squares principle is used to obtain a and b.


Regression Analysis
• The least squares principle is used to obtain a and b. The equations to determine a and b are:

$$ b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2} $$

$$ a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} $$
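A minimal sketch of these two formulas in Python, assuming the data arrive as two equal-length lists. The function name `least_squares` and the three-point data set are illustrative and not part of the original slides.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Y' = a + bX using the sums formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a = sum_y / n - b * sum_x / n              # intercept
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X.
a, b = least_squares([1, 2, 3], [3, 5, 7])
print(a, b)
```

Because the three points sit exactly on a line, the function should recover an intercept of 1 and a slope of 2, which makes it a convenient sanity check before using real data.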


EXAMPLE 1
• Dan Ireland, the student body president at Toledo State University, is concerned about the cost of textbooks to students. He believes there is a relationship between the number of pages in a text and the selling price of the book. To provide insight into the problem, he selects a sample of eight textbooks currently on sale in the bookstore. Draw a scatter diagram. Compute the correlation coefficient.


EXAMPLE 1 continued

Book                 Pages   Price ($)
Intro to History       500      84
Basic Algebra          700      75
Intro to Psych         800      99
Intro to Sociology     600      72
Bus. Mgmt              400      69
Intro to Biology       500      81
Fund. of Jazz          600      63
Princ. of Nursing      800      93
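The example asks for the correlation coefficient. A minimal sketch in Python, using the eight (pages, price) pairs from the table and the usual sums form of Pearson's r; the variable names are illustrative.

```python
import math

pages = [500, 700, 800, 600, 400, 500, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)
sum_x, sum_y = sum(pages), sum(prices)
sum_xy = sum(x * y for x, y in zip(pages, prices))
sum_x2 = sum(x * x for x in pages)
sum_y2 = sum(y * y for y in prices)

# Pearson's r written with the same sums that the slope formula uses.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(round(r, 3))  # 0.614
```

A value near 0.61 indicates a moderate positive association between page count and price in this sample.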

EXAMPLE 1 continued

Develop a regression equation for the information given in Example 1 that can be used to estimate the selling price based on the number of pages.

$$ b = \frac{8(397{,}200) - (4{,}900)(636)}{8(3{,}150{,}000) - (4{,}900)^2} = 0.05143 $$

$$ a = \frac{636}{8} - 0.05143\,\frac{4{,}900}{8} = 48.0 $$
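The sums that feed these two calculations can be checked directly from the eight books. The following is a small Python sketch with illustrative variable names; the data are the page counts and prices from Example 1.

```python
pages = [500, 700, 800, 600, 400, 500, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)                                        # 8
sum_x = sum(pages)                                    # 4,900
sum_y = sum(prices)                                   # 636
sum_xy = sum(x * y for x, y in zip(pages, prices))    # 397,200
sum_x2 = sum(x * x for x in pages)                    # 3,150,000

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

print(f"b = {b:.5f}, a = {a:.1f}")
```

Rounded to the precision used on the slide, this reproduces b = 0.05143 and a = 48.0.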


Example 1 continued
The regression equation is:

Y’ = 48.0 + 0.05143X

• The line crosses the Y-axis at $48. Taken literally, a book with no pages would cost $48.
• The slope of the line is 0.05143. Each additional page adds about a nickel to the price.
• The sign of the b value and the sign of r will always be the same.
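One way to see why the signs must agree is to write the slope in terms of r. The symbols s_x and s_y, the sample standard deviations of X and Y, are introduced here for this purpose and do not appear elsewhere in these slides.

$$ b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2} = r\,\frac{s_y}{s_x} $$

Since s_x and s_y are both positive, b is r multiplied by a positive number, so the two always carry the same sign.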

Example 1 continued

We can use the regression equation to estimate values of Y.

• The estimated selling price of an 800-page book is $89.14, found by

$$ Y' = 48.0 + 0.05143X = 48.0 + 0.05143(800) = 89.14 $$


2. The Standard Error of Estimate

• The standard error of estimate measures the scatter, or dispersion, of the observed values around the line of regression.
• The formulas that are used to compute the standard error:

$$ s_{y \cdot x} = \sqrt{\frac{\sum (Y - Y')^2}{n - 2}} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}} $$
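The two forms agree because of the conditions that define the least squares line. A brief sketch of the step, writing e = Y - Y' for a residual (the symbol e is introduced here for convenience):

$$ \sum (Y - Y')^2 = \sum e\,(Y - a - bX) = \sum eY - a\sum e - b\sum eX $$

Least squares chooses a and b so that \sum e = 0 and \sum eX = 0, which leaves

$$ \sum (Y - Y')^2 = \sum eY = \sum Y(Y - a - bX) = \sum Y^2 - a\sum Y - b\sum XY $$

The second form is convenient for hand calculation because it needs only the column sums, not each individual residual.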

Example 2
Find the standard error of estimate for the problem involving the number of pages in a book and the selling price from Example 1.

$$ s_{y \cdot x} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}} = \sqrt{\frac{51{,}606 - 48(636) - 0.05143(397{,}200)}{8 - 2}} = 10.408 $$

3. Assumptions Underlying Linear Regression
• For each value of X, there is a group of Y values, and these Y values are normally distributed.
• The means of these normal distributions of Y values all lie on the straight line of regression.
• The standard deviations of these normal distributions are equal.
• The Y values are statistically independent. This means that in the selection of a sample, the Y values chosen for a particular X value do not depend on the Y values for any other X values.
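These assumptions can be made concrete by simulating data that satisfy them. The sketch below is illustrative only: the population intercept, slope, and standard deviation are hypothetical values chosen to resemble Example 1, and the seed is fixed so the run is repeatable.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_A, TRUE_B, SIGMA = 48.0, 0.05, 10.0  # hypothetical population values

xs, ys = [], []
for x in (400, 500, 600, 700, 800):
    for _ in range(200):
        # Normal Y values at each X, mean on the line, the same spread
        # (SIGMA) everywhere, and every draw independent of the others.
        xs.append(x)
        ys.append(rng.gauss(TRUE_A + TRUE_B * x, SIGMA))

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n
print(round(a, 2), round(b, 4))
```

When the four assumptions hold, the fitted a and b should land close to the population values, and the closeness improves as the sample grows.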


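Confidence Interval
• The confidence interval for the mean value of Y for a given value of X is given by:

$$ Y' \pm t\,s_{y \cdot x}\sqrt{\frac{1}{n} + \frac{(X - \bar{X})^2}{\sum X^2 - \frac{(\sum X)^2}{n}}} $$

For X = 800 in Example 1:

$$ 89.14 \pm 2.447(10.408)\sqrt{\frac{1}{8} + \frac{(800 - 612.5)^2}{3{,}150{,}000 - \frac{(4{,}900)^2}{8}}} = 89.14 \pm 15.31 $$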
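The arithmetic behind the margin of 15.31 can be laid out step by step. Here 2.447 is the t value for n - 2 = 6 degrees of freedom at the 95 percent confidence level.

$$ \sum X^2 - \frac{(\sum X)^2}{n} = 3{,}150{,}000 - \frac{(4{,}900)^2}{8} = 3{,}150{,}000 - 3{,}001{,}250 = 148{,}750 $$

$$ \sqrt{\frac{1}{8} + \frac{(800 - 612.5)^2}{148{,}750}} = \sqrt{0.1250 + 0.2363} = \sqrt{0.3613} \approx 0.6011 $$

$$ 2.447 \times 10.408 \times 0.6011 \approx 15.31 $$

The interval for the mean price of all 800-page books therefore runs from about 89.14 - 15.31 = 73.83 to 89.14 + 15.31 = 104.45 dollars.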


Prediction Interval

• The prediction interval for an individual value of Y for a given value of X is given by:

$$ Y' \pm t\,s_{y \cdot x}\sqrt{1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum X^2 - \frac{(\sum X)^2}{n}}} $$

For X = 800 in Example 1:

$$ 89.14 \pm 2.447(10.408)\sqrt{1 + \frac{1}{8} + \frac{(800 - 612.5)^2}{3{,}150{,}000 - \frac{(4{,}900)^2}{8}}} = 89.14 \pm 29.72 $$
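The only difference from the confidence interval is the extra 1 under the square root, which accounts for the scatter of a single book around the mean. A short Python sketch comparing the two margins at X = 800 follows; the estimate, standard error, and t value are taken from the slides, and the variable names are illustrative.

```python
import math

pages = [500, 700, 800, 600, 400, 500, 600, 800]
n = len(pages)
x_bar = sum(pages) / n                                  # 612.5
sxx = sum(x * x for x in pages) - sum(pages) ** 2 / n   # 148,750

# Values from the slides: estimate at X = 800, standard error, t for 6 df.
y_hat, s_yx, t = 89.14, 10.408, 2.447
x0 = 800

term = 1 / n + (x0 - x_bar) ** 2 / sxx
ci_half = t * s_yx * math.sqrt(term)       # margin for the mean of Y at X = 800
pi_half = t * s_yx * math.sqrt(1 + term)   # margin for one individual Y at X = 800

print(f"confidence: {y_hat} ± {ci_half:.2f}")
print(f"prediction: {y_hat} ± {pi_half:.2f}")
```

The prediction margin is always the wider of the two, since it adds the variability of an individual observation to the uncertainty in the estimated mean.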


EXAMPLE 3
