
Least Squares Method

Coefficient of Determination

Model Assumptions

Testing for Significance


or duplicated, or posted to a publicly accessible website, in whole or in part.

Simple Linear Regression

Managerial decisions are often based on the relationship between two or more variables.

Regression analysis can be used to develop an equation showing how the variables are related.

The variable being predicted is called the dependent variable and is denoted by y.

The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by x.


Simple Linear Regression

Simple linear regression involves one independent variable and one dependent variable.

The relationship between the two variables is approximated by a straight line.

Regression analysis involving two or more independent variables is called multiple regression.


Simple Linear Regression Model

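The equation that describes how y is related to x and an error term is called the regression model.

The simple linear regression model is:

$y = \beta_0 + \beta_1 x + \varepsilon$

where:

$\beta_0$ and $\beta_1$ are called parameters of the model,

$\varepsilon$ is a random variable called the error term.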


Simple Linear Regression Equation

$E(y) = \beta_0 + \beta_1 x$

• $\beta_0$ is the y intercept of the regression line.

• $\beta_1$ is the slope of the regression line.

• $E(y)$ is the expected value of y for a given x value.


Simple Linear Regression Equation

[Figure: E(y) plotted against x. The regression line has intercept $\beta_0$, and the slope $\beta_1$ is positive.]


Simple Linear Regression Equation

[Figure: E(y) plotted against x. The regression line has intercept $\beta_0$, and the slope $\beta_1$ is negative.]


Simple Linear Regression Equation

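No Relationship

[Figure: E(y) plotted against x. The regression line is horizontal at the intercept $\beta_0$; the slope $\beta_1$ is 0.]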


Estimated Simple Linear Regression Equation

$\hat{y} = b_0 + b_1 x$

• $b_0$ is the y intercept of the line.

• $b_1$ is the slope of the line.

• $\hat{y}$ is the estimated value of y for a given x value.


Estimation Process

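Regression model: $y = \beta_0 + \beta_1 x + \varepsilon$

Regression equation: $E(y) = \beta_0 + \beta_1 x$

Unknown parameters: $\beta_0$, $\beta_1$

Sample data: $(x_1, y_1), \ldots, (x_n, y_n)$

Estimated regression equation: $\hat{y} = b_0 + b_1 x$

Sample statistics: $b_0$, $b_1$

$b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.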
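The estimation process can be illustrated with a small simulation. The sketch below is only an illustration: the parameter values (intercept 10, slope 5, error standard deviation 2), the sample size, the range of x, and the function names `simulate` and `fit_line` are all assumptions chosen for the example, not values from the slides. It draws sample data from the regression model and then computes the sample statistics with the least squares formulas from the Least Squares Method slides.

```python
import random


def fit_line(xs, ys):
    """Return (b0, b1) computed with the least squares formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def simulate(beta0, beta1, sigma, n, seed=0):
    """Draw n observations from y = beta0 + beta1 * x + error (hypothetical setup)."""
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 3.0) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


# Assumed "true" parameters; in practice these are unknown.
xs, ys = simulate(beta0=10.0, beta1=5.0, sigma=2.0, n=1000)
b0, b1 = fit_line(xs, ys)
print(f"b0 = {b0:.3f} (estimates beta0 = 10), b1 = {b1:.3f} (estimates beta1 = 5)")
```

With a large sample the estimates should land close to the assumed parameters, though never exactly on them, which is the point of distinguishing the sample statistics from the parameters.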


Least Squares Method

$\min \sum (y_i - \hat{y}_i)^2$

where:

$y_i$ = observed value of the dependent variable for the ith observation

$\hat{y}_i$ = estimated value of the dependent variable for the ith observation


Least Squares Method

$b_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$

where:

$x_i$ = value of independent variable for ith observation

$y_i$ = value of dependent variable for ith observation

$\bar{x}$ = mean value for independent variable

$\bar{y}$ = mean value for dependent variable

Slide

12

12

or duplicated, or posted to a publicly accessible website, in whole or in part.

Least Squares Method

$b_0 = \bar{y} - b_1 \bar{x}$
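The slope and intercept formulas translate directly into code. The following is a minimal sketch in plain Python; the function name `least_squares` and the three-point data set are made up for illustration.

```python
def least_squares(xs, ys):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n  # mean of the independent variable
    y_bar = sum(ys) / n  # mean of the dependent variable
    # Numerator and denominator of the slope formula.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Illustrative points that lie exactly on the line y = 1 + 2x.
b0, b1 = least_squares([0, 1, 2], [1, 3, 5])
print(b0, b1)
```

Because the illustrative points fall exactly on a line, the fit recovers that line's intercept and slope. Note that the slope is undefined when every x value is the same, since the denominator is then zero.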


Simple Linear Regression

Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.


Simple Linear Regression

Number of TV Ads (x)    Number of Cars Sold (y)
         1                       14
         3                       24
         2                       18
         1                       17
         3                       27

$\sum x = 10$    $\sum y = 100$

$\bar{x} = 2$    $\bar{y} = 20$


Estimated Regression Equation

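Slope for the Estimated Regression Equation

$b_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \dfrac{20}{4} = 5$

y-Intercept for the Estimated Regression Equation

$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$

Estimated Regression Equation

$\hat{y} = 10 + 5x$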
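The Reed Auto calculation can be checked in a few lines. This is a sketch in plain Python; the variable names and the `predict` helper are illustrative choices, while the data are the five observations from the table.

```python
ads = [1, 3, 2, 1, 3]        # number of TV ads (x)
cars = [14, 24, 18, 17, 27]  # number of cars sold (y)

n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n

# Numerator and denominator of the slope formula.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
sxx = sum((x - x_bar) ** 2 for x in ads)

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar


def predict(x):
    """Estimated number of cars sold for x TV ads."""
    return b0 + b1 * x


print(f"sxy = {sxy}, sxx = {sxx}, b1 = {b1}, b0 = {b0}")
print(f"estimated cars sold with 3 ads: {predict(3)}")
```

The sample only covers 1 to 3 ads, so estimates for x values outside that range are extrapolations and should be treated with caution.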


Coefficient of Determination

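Relationship among SST, SSR, and SSE:

$\mathrm{SST} = \mathrm{SSR} + \mathrm{SSE}$

$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

where:

SST = total sum of squares

SSR = sum of squares due to regression

SSE = sum of squares due to error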
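As an illustration, the three sums of squares can be computed for the Reed Auto data. The sketch below assumes the estimated regression equation from the example (intercept 10, slope 5); the variable names are illustrative.

```python
ads = [1, 3, 2, 1, 3]        # number of TV ads (x)
cars = [14, 24, 18, 17, 27]  # number of cars sold (y)
b0, b1 = 10.0, 5.0           # estimates from the Reed Auto example

y_bar = sum(cars) / len(cars)
fitted = [b0 + b1 * x for x in ads]  # estimated values for each observation

sst = sum((y - y_bar) ** 2 for y in cars)               # total sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)             # due to regression
sse = sum((y - f) ** 2 for y, f in zip(cars, fitted))   # due to error

print(f"SST = {sst}, SSR = {ssr}, SSE = {sse}")
print("SST == SSR + SSE:", sst == ssr + sse)
```

The decomposition holds exactly for a least squares fit with an intercept; with other coefficients the cross term does not vanish and the identity generally fails.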


Coefficient of Determination

$r^2 = \mathrm{SSR}/\mathrm{SST}$

where:

SSR = sum of squares due to regression

SST = total sum of squares


Coefficient of Determination

The regression relationship is very strong; 87.72% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
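The 87.72% figure follows from the definition of the coefficient of determination. The values SSR = 100 and SST = 114 used here are not stated in the text as extracted; they are computed from the five Reed Auto observations and the estimated regression equation with intercept 10 and slope 5.

```latex
r^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = \frac{100}{114} \approx .8772
```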


Sample Correlation Coefficient

$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

where:

$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$


Sample Correlation Coefficient

$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".

$r_{xy} = +\sqrt{.8772}$

$r_{xy} = +.9366$
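A short check of this value, as a sketch in plain Python: it computes the correlation once through the coefficient of determination and the sign of the slope, and once directly from the deviations about the means (the usual Pearson formula, which is not shown on the slides). Variable names are illustrative.

```python
import math

ads = [1, 3, 2, 1, 3]        # number of TV ads (x)
cars = [14, 24, 18, 17, 27]  # number of cars sold (y)

n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
sxx = sum((x - x_bar) ** 2 for x in ads)
syy = sum((y - y_bar) ** 2 for y in cars)  # equals SST

b1 = sxy / sxx
ssr = b1 ** 2 * sxx  # sum of squares due to regression for a least squares fit
r_squared = ssr / syy

# Route 1: sign of the slope times the square root of r^2.
r_xy = math.copysign(math.sqrt(r_squared), b1)

# Route 2: Pearson formula from the deviations (for comparison only).
r_direct = sxy / math.sqrt(sxx * syy)

print(f"r^2 = {r_squared:.4f}, r_xy = {r_xy:.4f}, direct = {r_direct:.4f}")
```

Both routes give the same value in simple linear regression; the sign must be taken from the slope because the square root alone loses it.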


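Assumptions About the Error Term $\varepsilon$

1. The error $\varepsilon$ is a random variable with mean of zero.

2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.

3. The values of $\varepsilon$ are independent.

4. The error $\varepsilon$ is a normally distributed random variable.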


Testing for Significance

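To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.

Two tests are commonly used: t Test and F Test

Both the t test and F test require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.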


Testing for Significance

An Estimate of $\sigma^2$

The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.

$s^2 = \mathrm{MSE} = \mathrm{SSE}/(n - 2)$

where:

$\mathrm{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$


Testing for Significance

An Estimate of $\sigma$

• To estimate $\sigma$, we take the square root of $s^2$.

• The resulting s is called the standard error of the estimate.

$s = \sqrt{\mathrm{MSE}} = \sqrt{\dfrac{\mathrm{SSE}}{n - 2}}$
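For the Reed Auto data these estimates can be computed as follows. This is a sketch in plain Python that assumes the estimated regression equation from the example (intercept 10, slope 5); the variable names are illustrative.

```python
import math

ads = [1, 3, 2, 1, 3]        # number of TV ads (x)
cars = [14, 24, 18, 17, 27]  # number of cars sold (y)
b0, b1 = 10.0, 5.0           # estimates from the Reed Auto example

n = len(ads)
# Sum of squared residuals about the estimated regression line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))

mse = sse / (n - 2)  # two degrees of freedom are used up by b0 and b1
s = math.sqrt(mse)   # standard error of the estimate

print(f"SSE = {sse}, MSE = {mse:.4f}, s = {s:.4f}")
```

The divisor is n - 2 rather than n because two parameters are estimated from the sample, so at least three observations are needed.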


Testing for Significance: t Test

Hypotheses

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

Test Statistic

$t = \dfrac{b_1}{s_{b_1}}$ where $s_{b_1} = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$


Testing for Significance: t Test

Rejection Rule

Reject $H_0$ if p-value $< \alpha$

or $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$

where:

$t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom


Testing for Significance: t Test

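1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$

2. Specify the level of significance. $\alpha = .05$

3. Select the test statistic. $t = b_1 / s_{b_1}$

4. State the rejection rule. Reject $H_0$ if p-value $< .05$ or $|t| > 3.182$ (with 3 degrees of freedom)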


Testing for Significance: t Test

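5. Compute the value of the test statistic. $t = \dfrac{b_1}{s_{b_1}} = \dfrac{5}{1.08} = 4.63$

6. Determine whether to reject $H_0$. t = 4.541 provides an area of .01 in the upper tail. Hence, the p-value is less than .02. (Also, t = 4.63 > 3.182.) We can reject $H_0$.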
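The t test for the Reed Auto data can be reproduced with the standard library. This is a sketch under stated assumptions: the critical value 3.182 is taken from the slides rather than computed, and the helper `t_cdf_3df` uses the closed form of the t distribution's cumulative distribution function that applies only to 3 degrees of freedom, so it does not generalize to other sample sizes. Names are illustrative.

```python
import math

ads = [1, 3, 2, 1, 3]        # number of TV ads (x)
cars = [14, 24, 18, 17, 27]  # number of cars sold (y)
b0, b1 = 10.0, 5.0           # estimates from the Reed Auto example

n = len(ads)
x_bar = sum(ads) / n
sxx = sum((x - x_bar) ** 2 for x in ads)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))

s = math.sqrt(sse / (n - 2))  # standard error of the estimate
s_b1 = s / math.sqrt(sxx)     # estimated standard deviation of b1
t_stat = b1 / s_b1


def t_cdf_3df(t):
    """CDF of the t distribution with exactly 3 degrees of freedom."""
    u = t / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi


p_value = 2.0 * (1.0 - t_cdf_3df(abs(t_stat)))  # two-tailed
t_crit = 3.182  # t value for upper-tail area .025 with 3 d.f., from the slides
reject_h0 = abs(t_stat) > t_crit

print(f"s_b1 = {s_b1:.2f}, t = {t_stat:.2f}, p-value = {p_value:.4f}")
print("reject H0:", reject_h0)
```

For other degrees of freedom a statistics library or a t table would be needed for the critical value and the p-value.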


Testing for Significance: F Test

Hypotheses

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

Test Statistic

$F = \mathrm{MSR}/\mathrm{MSE}$


Testing for Significance: F Test

Rejection Rule

Reject $H_0$ if p-value $< \alpha$

or $F > F_\alpha$

where:

$F_\alpha$ is based on an F distribution with 1 degree of freedom in the numerator and n - 2 degrees of freedom in the denominator


Testing for Significance: F Test

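1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$

2. Specify the level of significance. $\alpha = .05$

3. Select the test statistic. $F = \mathrm{MSR}/\mathrm{MSE}$

4. State the rejection rule. Reject $H_0$ if p-value $< .05$ or $F > 10.13$ (with 1 d.f. in numerator and 3 d.f. in denominator)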


Testing for Significance: F Test

5. Compute the value of the test statistic. $F = \mathrm{MSR}/\mathrm{MSE} = 100/4.667 = 21.43$

6. Determine whether to reject $H_0$. F = 17.44 provides an area of .025 in the upper tail. Thus, the p-value corresponding to F = 21.43 is less than .025. Hence, we reject $H_0$.

The statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold.
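The F statistic for the Reed Auto data can be verified as follows. This is a sketch in plain Python that assumes the estimated regression equation from the example (intercept 10, slope 5) and takes the critical value 10.13 from the slides rather than computing it; the variable names are illustrative.

```python
ads = [1, 3, 2, 1, 3]        # number of TV ads (x)
cars = [14, 24, 18, 17, 27]  # number of cars sold (y)
b0, b1 = 10.0, 5.0           # estimates from the Reed Auto example

n = len(ads)
y_bar = sum(cars) / n
fitted = [b0 + b1 * x for x in ads]

ssr = sum((f - y_bar) ** 2 for f in fitted)             # due to regression
sse = sum((y - f) ** 2 for y, f in zip(cars, fitted))   # due to error

msr = ssr / 1        # one independent variable, so 1 numerator d.f.
mse = sse / (n - 2)  # n - 2 denominator d.f.
f_stat = msr / mse

f_crit = 10.13  # F value for upper-tail area .05 with 1 and 3 d.f., from the slides
reject_h0 = f_stat > f_crit

print(f"MSR = {msr}, MSE = {mse:.3f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
```

With a single independent variable, the F statistic equals the square of the t statistic (4.63 squared is about 21.43), so the two tests lead to the same conclusion.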


