This action might not be possible to undo. Are you sure you want to continue?

BooksAudiobooksComicsSheet Music### Categories

### Categories

### Categories

Editors' Picks Books

Hand-picked favorites from

our editors

our editors

Editors' Picks Audiobooks

Hand-picked favorites from

our editors

our editors

Editors' Picks Comics

Hand-picked favorites from

our editors

our editors

Editors' Picks Sheet Music

Hand-picked favorites from

our editors

our editors

Top Books

What's trending, bestsellers,

award-winners & more

award-winners & more

Top Audiobooks

What's trending, bestsellers,

award-winners & more

award-winners & more

Top Comics

What's trending, bestsellers,

award-winners & more

award-winners & more

Top Sheet Music

What's trending, bestsellers,

award-winners & more

award-winners & more

Welcome to Scribd! Start your free trial and access books, documents and more.Find out more

Chap 13-1

Statistics for Managers

Using Microsoft® Excel

5th Edition

Chapter 13

Simple Linear Regression

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-2

Learning Objectives

In this chapter, you learn:

To use regression analysis to predict the value of a

dependent variable based on an independent variable

The meaning of the regression coefficients b

0

and b

1

To evaluate the assumptions of regression analysis and

know what to do if the assumptions are violated

To make inferences about the slope and correlation

coefficient

To estimate mean values and predict individual values

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-3

Correlation vs. Regression

A scatter plot (or scatter diagram) can be

used to show the relationship between two

numerical variables

Correlation analysis is used to measure

strength of the association (linear

relationship) between two variables

Correlation is only concerned with strength of

the relationship

No causal effect is implied with correlation

Correlation was first presented in Chapter 3

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-4

Regression Analysis

Regression analysis is used to:

Predict the value of a dependent variable based

on the value of at least one independent variable

Explain the impact of changes in an independent

variable on the dependent variable

Dependent variable: the variable you wish to explain

Independent variable: the variable used to explain

the dependent variable

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-5

Simple Linear Regression

Model

Only one independent variable, X

Relationship between X and Y is described

by a linear function

Changes in Y are related to changes in X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-6

Types of Relationships

Y

X

Y

X

Y

Y

X

X

Linear relationships Curvilinear relationships

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-7

Types of Relationships

Y

X

Y

X

Y

Y

X

X

Strong relationships Weak relationships

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-8

Types of Relationships

Y

X

Y

X

No relationship

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-9

The Linear Regression Model

i i 1 0 i

ε X β β Y + + =

Linear component

The population regression model:

Population

Y intercept

Population

Slope

Coefficient

Random

Error

term

Dependent

Variable

Independent

Variable

Random Error

component

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-10

The Linear Regression Model

Random Error

for this X

i

value

Y

X

Observed Value

of Y for X

i

Predicted Value

of Y for X

i

i i 1 0 i

ε X β β Y + + =

X

i

Slope = β

1

Intercept = β

0

ε

i

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-11

Linear Regression Equation

i 1 0 i

X b b Y

ˆ

+ =

The simple linear regression equation provides an

estimate of the population regression line

Estimate of the

regression

intercept

Estimate of the

regression slope

Estimated (or

predicted) Y

value for

observation i

Value of X for

observation i

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-12

The Least Squares Method

b

0

and b

1

are obtained by finding the values of b

0

and b

1

that minimize the sum of the squared

differences between Y and :

2

i 1 0 i

2

i i

)) X b (b (Y min ) Y

ˆ

(Y min + ÷ = ÷

¯ ¯

Y

ˆ

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-13

Finding the Least Squares

Equation

The coefficients b

0

and b

1

, and other regression

results in this chapter, will be found using Excel

Formulas are shown in the text for

those who are interested

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-14

Interpretation of the

Intercept and the Slope

b

0

is the estimated mean value of Y when the

value of X is zero

b

1

is the estimated change in the mean value of

Y for every one-unit change in X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-15

Linear Regression Example

A real estate agent wishes to examine the

relationship between the selling price of a

home and its size (measured in square feet)

A random sample of 10 houses is selected

Dependent variable (Y) = house price in

$1000s

Independent variable (X) = square feet

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-16

Linear Regression Example

Data

House Price in $1000s

(Y)

Square Feet

(X)

245 1400

312 1600

279 1700

308 1875

199 1100

219 1550

405 2350

324 2450

319 1425

255 1700

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-17

Linear Regression Example

Scatterplot

0

50

100

150

200

250

300

350

400

450

0 500 1000 1500 2000 2500 3000

Square Feet

H

o

u

s

e

P

r

i

c

e

(

$

1

0

0

0

s

)

House price model: scatter plot

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-18

Linear Regression Example

Using Excel

Tools

--------

Data Analysis

--------

Regression

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-19

Linear Regression Example

Excel Output

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

The regression equation is:

feet) (square 0.10977 98.24833 price house + =

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-20

Linear Regression Example

Graphical Representation

0

50

100

150

200

250

300

350

400

450

0 500 1000 1500 2000 2500 3000

Square Feet

H

o

u

s

e

P

r

i

c

e

(

$

1

0

0

0

s

)

House price model: scatter plot and regression line

feet) (square 0.10977 98.24833 price house + =

Slope

= 0.10977

Intercept

= 98.248

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-21

Linear Regression Example

Interpretation of b

0

b

0

is the estimated mean value of Y when the value

of X is zero (if X = 0 is in the range of observed X

values)

Because the square footage of the house cannot be 0,

the Y intercept has no practical application.

feet) (square 0.10977 98.24833 price house + =

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-22

Linear Regression Example

Interpretation of b

1

b

1

measures the mean change in the average value of

Y as a result of a one-unit change in X

Here, b

1

= .10977 tells us that the mean value of a

house increases by .10977($1000) = $109.77, on

average, for each additional one square foot of size

feet) (square 0.10977 98.24833 price house + =

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-23

Linear Regression Example

Making Predictions

317.85

0) 0.1098(200 98.25

(sq.ft.) 0.1098 98.25 price house

=

+ =

+ =

Predict the price for a house with 2000 square feet:

The predicted price for a house with 2000 square feet

is 317.85($1,000s) = $317,850

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-24

Linear Regression Example

Making Predictions

0

50

100

150

200

250

300

350

400

450

0 500 1000 1500 2000 2500 3000

Square Feet

H

o

u

s

e

P

r

i

c

e

(

$

1

0

0

0

s

)

When using a regression model for prediction, only

predict within the relevant range of data

Relevant range for

interpolation

Do not try to

extrapolate beyond

the range of

observed X’s

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-25

Measures of Variation

Total variation is made up of two parts:

SSE SSR SST + =

Total Sum of

Squares

Regression Sum of

Squares

Error Sum of

Squares

¯

÷ =

2

i

) Y Y ( SST

¯

÷ =

2

i i

) Y

ˆ

Y ( SSE

¯

÷ =

2

i

) Y Y

ˆ

( SSR

where:

= Mean value of the dependent variable

Y

i

= Observed values of the dependent variable

i

= Predicted value of Y for the given X

i

value Y

ˆ

Y

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-26

Measures of Variation

SST = total sum of squares

Measures the variation of the Y

i

values around

their mean Y

SSR = regression sum of squares

Explained variation attributable to the

relationship between X and Y

SSE = error sum of squares

Variation attributable to factors other than the

relationship between X and Y

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-27

Measures of Variation

X

i

Y

X

Y

i

SST = ¯(Y

i

- Y)

2

SSE = ¯(Y

i

- Y

i

)

2

.

SSR = ¯(Y

i

- Y)

2

.

_

_

_

Y

.

Y

Y

_

Y

.

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-28

Coefficient of Determination, r

2

The coefficient of determination is the portion of

the total variation in the dependent variable that

is explained by variation in the independent

variable

The coefficient of determination is also called r-

squared and is denoted as r

2

1 r 0

2

s s

squares of sum total

squares of sum regression

SST

SSR

r

2

= =

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-29

Coefficient of Determination, r

2

r

2

= 1

Y

X

Y

X

r

2

= -1

r

2

= 1

Perfect linear relationship between

X and Y:

100% of the variation in Y is

explained by variation in X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-30

Coefficient of Determination, r

2

Y

X

Y

X

0 < r

2

< 1

Weaker linear relationships

between X and Y:

Some but not all of the variation

in Y is explained by variation in

X

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-31

Coefficient of Determination, r

2

r

2

= 0

No linear relationship between X

and Y:

The value of Y is not related to X.

(None of the variation in Y is

explained by variation in X)

Y

X

r

2

= 0

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-32

Linear Regression Example

Coefficient of Determination, r

2

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

58.08% of the variation in house

prices is explained by variation in

square feet

0.58082

32600.5000

18934.9348

SST

SSR

r

2

= = =

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-33

Standard Error of Estimate

The standard deviation of the variation of observations

around the regression line is estimated by

2

)

ˆ

(

2

1

2

÷

÷

=

÷

=

¯

=

n

Y Y

n

SSE

S

n

i

i i

YX

Where

SSE = error sum of squares

n = sample size

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-34

Linear Regression Example

Standard Error of Estimate

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

41.33032 S

YX

=

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-35

Comparing Standard Errors

Y Y

X X

YX

s small

YX

s large

S

YX

is a measure of the variation of observed

Y values from the regression line

The magnitude of S

YX

should always be judged relative to

the size of the Y values in the sample data

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-36

Assumptions of Regression

L.I.N.E

Linearity

The relationship between X and Y is linear

Independence of Errors

Error values are statistically independent

Normality of Error

Error values are normally distributed for any given

value of X

Equal Variance (also called homoscedasticity)

The probability distribution of the errors has constant

variance

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-37

Residual Analysis

The residual for observation i, e

i

, is the difference between its

observed and predicted value

Check the assumptions of regression by examining the residuals

Examine for Linearity assumption

Evaluate Independence assumption

Evaluate Normal distribution assumption

Examine Equal variance for all levels of X

Graphical Analysis of Residuals

Can plot residuals vs. X

i i i

Y Y e

ˆ

÷ =

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-38

Residual Analysis for Linearity

Not Linear Linear

x

r

e

s

i

d

u

a

l

s

x

Y

x

Y

x

r

e

s

i

d

u

a

l

s

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-39

Residual Analysis for

Independence

Not Independent Independent

X

X

r

e

s

i

d

u

a

l

s

r

e

s

i

d

u

a

l

s

X

r

e

s

i

d

u

a

l

s

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-40

Checking for Normality

Examine the Stem-and-Leaf Display of the

Residuals

Examine the Box-and-Whisker Plot of the

Residuals

Examine the Histogram of the Residuals

Construct a Normal Probability Plot of the

Residuals

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-41

Residual Analysis for

Equal Variance

Unequal variance Equal variance

x x

Y

x

x

Y

r

e

s

i

d

u

a

l

s

r

e

s

i

d

u

a

l

s

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-42

Linear Regression Example

Excel Residual Output

House Price Model Residual Plot

-60

-40

-20

0

20

40

60

80

0 1000 2000 3000

Square Feet

R

e

s

i

d

u

a

l

s

RESIDUAL OUTPUT

Predicted

House Price Residuals

1 251.92316 -6.923162

2 273.87671 38.12329

3 284.85348 -5.853484

4 304.06284 3.937162

5 218.99284 -19.99284

6 268.38832 -49.38832

7 356.20251 48.79749

8 367.17929 -43.17929

9 254.6674 64.33264

10 284.85348 -29.85348

Does not appear to violate

any regression assumptions

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-43

Measuring Autocorrelation:

The Durbin-Watson Statistic

Used when data are collected over time to

detect if autocorrelation is present

Autocorrelation exists if residuals in one

time period are related to residuals in another

period

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-44

Autocorrelation

Autocorrelation is correlation of the errors

(residuals) over time

Violates the regression assumption that residuals are

statistically independent

Time (t) Residual Plot

-15

-10

-5

0

5

10

15

0 2 4 6 8

Time (t)

R

e

s

i

d

u

a

l

s

Here, residuals suggest a

cyclic pattern, not

random

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-45

The Durbin-Watson Statistic

¯

¯

=

=

÷

÷

=

n

i

i

n

i

i i

e

e e

D

1

2

2

2

1

) (

The possible range is 0 ≤ D ≤ 4

D should be close to 2 if H

0

is true

D less than 2 may signal positive

autocorrelation, D greater than 2 may

signal negative autocorrelation

The Durbin-Watson statistic is used to test for autocorrelation

H

0

: residuals are not correlated

H

1

: autocorrelation is present

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-46

The Durbin-Watson Statistic

Calculate the Durbin-Watson test statistic = D

(The Durbin-Watson Statistic can be found using Excel)

Decision rule: reject H

0

if D < d

L

H

0

: positive autocorrelation does not exist

H

1

: positive autocorrelation is present

0 d

U

2 d

L

Reject H

0

Do not reject H

0

Find the values d

L

and d

U

from the Durbin-Watson table

(for sample size n and number of independent variables k)

Inconclusive

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-47

The Durbin-Watson Statistic

Example with n = 25:

Durbin-Watson Calculations

Sum of Squared

Difference of Residuals 3296.18

Sum of Squared Residuals 3279.98

Durbin-Watson Statistic 1.00494

y = 30.65 + 4.7038x

R

2

= 0.8976

0

20

40

60

80

100

120

140

160

0 5 10 15 20 25 30

Time

S

a

l

e

s

Excel output:

1.00494

3279.98

3296.18

e

) e (e

D

n

1 i

2

i

n

2 i

2

1 i i

= =

÷

=

¯

¯

=

=

÷

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-48

The Durbin-Watson Statistic

Here, n = 25 and there is k = 1 independent variable

Using the Durbin-Watson table, d

L

= 1.29 and d

U

= 1.45

D = 1.00494 < d

L

= 1.29, so reject H

0

and conclude that

significant positive autocorrelation exists

Therefore the linear model is not the appropriate model to

predict sales

Decision: reject H

0

since

D = 1.00494 < d

L

0 d

U

=1.45 2 d

L

=1.29

Reject H

0

Do not reject H

0

Inconclusive

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-49

Inferences About the Slope:

t Test

t test for a population slope

Is there a linear relationship between X and Y?

Null and alternative hypotheses

H

0

: β

1

= 0 (no linear relationship)

H

1

: β

1

≠ 0 (linear relationship does exist)

Test statistic

1

b

1 1

S

β b

t

÷

=

2 n d.f. ÷ =

where:

b

1

= regression slope

coefficient

β

1

= hypothesized slope

S

b1

= standard

error of the slope

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-50

Inferences About the Slope:

t Test Example

House Price in

$1000s

(y)

Square Feet

(x)

245 1400

312 1600

279 1700

308 1875

199 1100

219 1550

405 2350

324 2450

319 1425

255 1700

(sq.ft.) 0.1098 98.25 price house + =

Estimated Regression Equation:

The slope of this model is 0.1098

Is there a relationship between the

square footage of the house and its

sales price?

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-51

Inferences About the Slope:

t Test Example

H

0

: β

1

= 0

H

1

: β

1

≠ 0

From Excel output:

Coefficients Standard Error t Stat P-value

Intercept 98.24833 58.03348 1.69296 0.12892

Square Feet 0.10977 0.03297 3.32938 0.01039

1

b

S

t

b

1

32938 . 3

03297 . 0

0 10977 . 0

S

β b

t

1

b

1 1

=

÷

=

÷

=

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-52

Inferences About the Slope:

t Test Example

Test Statistic: t = 3.329

There is sufficient evidence

that square footage affects

house price

Decision: Reject H

0

Reject H

0

Reject H

0

o/2=.025

-t

α/2

Do not reject H

0

0

t

α/2

o/2=.025

-2.3060 2.3060 3.329

d.f. = 10- 2 = 8

H

0

: β

1

= 0

H

1

: β

1

≠ 0

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-53

Inferences About the Slope:

t Test Example

H

0

: β

1

= 0

H

1

: β

1

≠ 0

From Excel output:

Coefficients Standard Error t Stat P-value

Intercept 98.24833 58.03348 1.69296 0.12892

Square Feet 0.10977 0.03297 3.32938 0.01039

P-Value

There is sufficient evidence that

square footage affects house price.

Decision: Reject H

0

, since p-value < α

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-54

F-Test for Significance

F Test statistic:

where

MSE

MSR

F =

1 k n

SSE

MSE

k

SSR

MSR

÷ ÷

=

=

where F follows an F distribution with k numerator degrees of freedom

and (n - k - 1) denominator degrees of freedom

(k = the number of independent variables in the regression model)

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-55

F-Test for Significance

Excel Output

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

11.0848

1708.1957

18934.9348

MSE

MSR

F = = =

With 1 and 8 degrees

of freedom

P-value for

the F-Test

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-56

F-Test for Significance

H

0

: β

1

= 0

H

1

: β

1

≠ 0

o = .05

df

1

= 1 df

2

= 8

Test Statistic:

Decision:

Conclusion:

Reject H

0

at o = 0.05

There is sufficient evidence that

house size affects selling price 0

o = .05

F

.05

= 5.32

Reject H

0

Do not

reject H

0

11.08

MSE

MSR

F = =

Critical Value:

F

o

= 5.32

F

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-57

Confidence Interval Estimate

for the Slope

Confidence Interval Estimate of the Slope:

Excel Printout for House Prices:

At the 95% level of confidence, the confidence interval

for the slope is (0.0337, 0.1858)

1

b 2 n 1

S t b

÷

±

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

d.f. = n - 2

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-58

Confidence Interval Estimate

for the Slope

Since the units of the house price variable is $1000s, you

are 95% confident that the mean change in sales price is

between $33.74 and $185.80 per square foot of house size

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

This 95% confidence interval does not include 0.

Conclusion: There is a significant relationship between house price and

square feet at the .05 level of significance

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-59

t Test for a Correlation Coefficient

Hypotheses

H

0

: ρ = 0 (no correlation between X and Y)

H

1

: ρ ≠ 0 (correlation exists)

Test statistic

(with n – 2 degrees of freedom)

2 n

r 1

ρ - r

t

2

÷

÷

=

0 b if r r

0 b if r r

where

1

2

1

2

< ÷ =

> + =

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-60

t Test for a Correlation Coefficient

Is there evidence of a linear relationship between

square feet and house price at the .05 level of

significance?

H

0

: ρ = 0 (No correlation)

H

1

: ρ ≠ 0 (correlation exists)

o =.05 , df = 10 - 2 = 8

3.329

2 10

.762 1

0 .762

2 n

r 1

ρ r

t

2 2

=

÷

÷

÷

=

÷

÷

÷

=

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-61

t Test for a Correlation Coefficient

Reject H

0

Reject H

0

o/2=.025

-t

α/2

Do not reject H

0

0

t

α/2

o/2=.025

-2.3060

2.3060

3.329

d.f. = 10- 2 = 8

Conclusion:

There is evidence

of a linear

association at the

5% level of

significance

Decision:

Reject H

0

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-62

Estimating Mean Values and

Predicting Individual Values

X

Y = b

0

+b

1

X

i

.

Confidence

Interval for

the mean of

Y, given X

i

Prediction Interval for an

individual Y, given X

i

Goal: Form intervals around Y to express

uncertainty about the value of Y for a given X

i

Y

X

i

Y

.

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-63

Confidence Interval for

the Average Y, Given X

Confidence interval estimate for the mean value of

Y given a particular X

i

Size of interval varies according to

distance away from mean, X

i YX 2 n

X X | Y

h S t Y

ˆ

: μ for interval Confidence

i

÷

=

±

¯

÷

÷

+ =

÷

+ =

2

i

2

i

2

i

i

) X (X

) X (X

n

1 ) X (X

n

1

h

SSX

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-64

Prediction Interval for

an Individual Y, Given X

Prediction interval estimate for an individual value

of Y given a particular X

i

This extra term adds to the interval width to reflect the

added uncertainty for an individual case

i

h t Y + ±

÷

=

1 S

ˆ

: Y for interval Prediction

YX 2 n

X X

i

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-65

Estimation of Mean Values:

Example

Find the 95% confidence interval for the mean price of 2,000

square-foot houses

Predicted Price Y

i

= 317.85 ($1,000s)

.

Confidence Interval Estimate for μ

Y|X=X

37.12 317.85

) X (X

) X (X

n

1

S t Y

ˆ

2

i

2

i

YX 2 - n

± =

÷

÷

+ ±

¯

The confidence interval endpoints are 280.66 and 354.90,

or from $280,660 to $354,900

i

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-66

Estimation of Individual

Values: Example

Find the 95% prediction interval for an individual house with

2,000 square feet

Predicted Price Y

i

= 317.85 ($1,000s)

.

Prediction Interval Estimate for Y

X=X

102.28 317.85

) X (X

) X (X

n

1

1 S t Y

ˆ

2

i

2

i

YX 1 - n

± =

÷

÷

+ + ±

¯

The prediction interval endpoints are 215.50 and 420.07, or

from $215,500 to $420,070

i

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-67

Pitfalls of Regression Analysis

Lacking an awareness of the assumptions

underlying least-squares regression

Not knowing how to evaluate the assumptions

Not knowing the alternatives to least-squares

regression if a particular assumption is violated

Using a regression model without knowledge of

the subject matter

Extrapolating outside the relevant range

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-68

Strategies for Avoiding

the Pitfalls of Regression

Start with a scatter plot of X on Y to observe

possible relationship

Perform residual analysis to check the

assumptions

Plot the residuals vs. X to check for violations

of assumptions such as equal variance

Use a histogram, stem-and-leaf display, box-

and-whisker plot, or normal probability plot of

the residuals to uncover possible non-normality

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-69

Strategies for Avoiding

the Pitfalls of Regression

If there is violation of any assumption, use

alternative methods or models

If there is no evidence of assumption

violation, then test for the significance of the

regression coefficients and construct

confidence intervals and prediction intervals

Avoid making predictions or forecasts

outside the relevant range

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-70

Chapter Summary

Introduced types of regression models

Reviewed assumptions of regression and

correlation

Discussed determining the simple linear

regression equation

Described measures of variation

Discussed residual analysis

Addressed measuring autocorrelation

In this chapter, we have

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 13-71

Chapter Summary

Described inference about the slope

Discussed correlation -- measuring the

strength of the association

Addressed estimation of mean values and

prediction of individual values

Discussed possible pitfalls in regression and

recommended strategies to avoid them

In this chapter, we have

Materi ini merupakan bahan ajar sebagai pelengkap e-materi mata kuliah statistika bisnis.
Levine, D. M., Stephan, D. F., Krehbiel, T. C. & Berenson, M. L. (2008). Statistics for Managers Using Mic...

Materi ini merupakan bahan ajar sebagai pelengkap e-materi mata kuliah statistika bisnis.

Levine, D. M., Stephan, D. F., Krehbiel, T. C. & Berenson, M. L. (2008). Statistics for Managers Using Microsoft Excel. Pearson.

Levine, D. M., Stephan, D. F., Krehbiel, T. C. & Berenson, M. L. (2008). Statistics for Managers Using Microsoft Excel. Pearson.

Are you sure?

This action might not be possible to undo. Are you sure you want to continue?

We've moved you to where you read on your other device.

Get the full title to continue

Get the full title to continue listening from where you left off, or restart the preview.

scribd