© All Rights Reserved

10 views

© All Rights Reserved

- Additive Solution
- Correlation and Regression Analysis
- Regression With SAS
- Chapter1_2010
- A PREDICTIVE MODEL FOR OZONE UPLIFTING IN OBSTRUCTION PRONE ENVIRONMENT
- The Household, Income and Labour Dynamics in Australia Survey
- Knowledge management and its importance in improving the performance of small and medium-size enterprises
- Curve Fitting
- Op Mgmt Ch8-HW
- salesforecasting-120421012853-phpapp02
- HILDA-SR-2017.pdf
- The Critical Factors of Corporate Social Responsibility Csr That Contribute Towards Consumer Behavior in the Ready Made Garments Rmg Industry of Bangladesh
- Regression in Data Mining
- 04 - Reg.pptx
- ISOM CH 14
- jurnal
- Upload
- IJETR012001
- hw 3 number 3
- RELTIQUES_V1N1

You are on page 1of 44

Chapter 12

Simple Regression

Statistics for

Business and Economics

6

th

Edition

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-2

Introduction to

Regression Analysis

Regression analysis is used to:

Predict the value of a dependent variable based on

the value of at least one independent variable

Explain the impact of changes in an independent

variable on the dependent variable

Dependent variable: the variable we wish to explain

(also called the endogenous variable)

Independent variable: the variable used to explain

the dependent variable

(also called the exogenous variable)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-3

Linear Regression Model

The relationship between X and Y is

described by a linear function

Changes in Y are assumed to be caused by

changes in X

Linear regression population equation model

Where |

0

and |

1

are the population model

coefficients and c is a random error term.

i i 1 0 i

x Y + + =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-4

i i 1 0 i

X Y + + =

Linear component

Simple Linear Regression

Model

The population regression model:

Population

Y intercept

Population

Slope

Coefficient

Random

Error

term

Dependent

Variable

Independent

Variable

Random Error

component

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-5

(continued)

Random Error

for this X

i

value

Y

X

Observed Value

of Y for X

i

Predicted Value

of Y for X

i

i i 1 0 i

X Y + + =

X

i

Slope =

1

Intercept =

0

i

Simple Linear Regression

Model

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-6

i 1 0 i

x b b y + =

estimate of the population regression line

Simple Linear Regression

Equation

Estimate of

the regression

intercept

Estimate of the

regression slope

Estimated

(or predicted)

y value for

observation i

Value of x for

observation i

The individual random error terms e

i

have a mean of zero

) )

(

i 1 0 i i i i

x b (b - y y - y e + = =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-7

Least Squares Estimators

b

0

and b

1

are obtained by finding the values

of b

0

and b

1

that minimize the sum of the

squared differences between y and :

2

i 1 0 i

2

i i

2

i

)] x b (b [y min

) y (y min

e min SSE min

+ =

=

=

coefficient estimators b

0

and b

1

that minimize SSE

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-8

The slope coefficient estimator is

And the constant or y-intercept is

The regression line always goes through the mean x, y

X

Y

xy

n

1 i

2

i

n

1 i

i i

1

s

s

r

) x (x

) y )(y x (x

b =

=

=

x b y b

1 0

=

xx

Least Squares Estimators

(continued)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-9

Finding the Least Squares

Equation

The coefficients b

0

and b

1

, and other

regression results in this chapter, will be

found using a computer

Hand calculations are tedious

Statistical routines are built into Excel

Other statistical analysis software can be used

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-10

Linear Regression Model

Assumptions

The true relationship form is linear (Y is a linear function

of X, plus random error)

The error terms,

i

are independent of the x values

The error terms are random variables with mean 0 and

constant variance,

2

(the constant variance property is called homoscedasticity)

The random error terms,

i

, are not correlated with one

another, so that

n) , 1, (i for ] E[ and 0 ] E[

2

2

i i

= = =

j i all for 0 ] E[

j i

= =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-11

b

0

is the estimated average value of y

when the value of x is zero (if x = 0 is

in the range of observed x values)

b

1

is the estimated change in the

average value of y as a result of a

one-unit change in x

Interpretation of the

Slope and the Intercept

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-12

Simple Linear Regression

Example

A real estate agent wishes to examine the

relationship between the selling price of a home

and its size (measured in square feet)

A random sample of 10 houses is selected

Dependent variable (Y) = house price in $1000s

Independent variable (X) = square feet

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-13

Sample Data for House Price

Model

House Price in $1000s

(Y)

Square Feet

(X)

245 1400

312 1600

279 1700

308 1875

199 1100

219 1550

405 2350

324 2450

319 1425

255 1700

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-14

0

50

100

150

200

250

300

350

400

450

0 500 1000 1500 2000 2500 3000

Square Feet

H

o

u

s

e

P

r

i

c

e

(

$

1

0

0

0

s

)

Graphical Presentation

House price model: scatter plot

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-15

Regression Using Excel

Tools / Data Analysis / Regression

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-16

Excel Output

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

The regression equation is:

feet) (square 0.10977 98.24833 price house + =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-17

0

50

100

150

200

250

300

350

400

450

0 500 1000 1500 2000 2500 3000

Square Feet

H

o

u

s

e

P

r

i

c

e

(

$

1

0

0

0

s

)

Graphical Presentation

House price model: scatter plot and

regression line

feet) (square 0.10977 98.24833 price house + =

Slope

= 0.10977

Intercept

= 98.248

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-18

Interpretation of the

Intercept, b

0

b

0

is the estimated average value of Y when the

value of X is zero (if X = 0 is in the range of

observed X values)

Here, no houses had 0 square feet, so b

0

= 98.24833

just indicates that, for houses within the range of

sizes observed, $98,248.33 is the portion of the

house price not explained by square feet

feet) (square 0.10977 98.24833 price house + =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-19

Interpretation of the

Slope Coefficient, b

1

b

1

measures the estimated change in the

average value of Y as a result of a one-

unit change in X

Here, b

1

= .10977 tells us that the average value of a

house increases by .10977($1000) = $109.77, on

average, for each additional one square foot of size

feet) (square 0.10977 98.24833 price house + =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-20

Measures of Variation

Total variation is made up of two parts:

SSE SSR SST + =

Total Sum of

Squares

Regression Sum

of Squares

Error Sum of

Squares

=

2

i

) y (y SST

=

2

i i

) y (y SSE

=

2

i

) y y ( SSR

where:

= Average value of the dependent variable

y

i

= Observed values of the dependent variable

i

= Predicted value of y for the given x

i

value y

y

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-21

SST = total sum of squares

Measures the variation of the y

i

values around their

mean, y

SSR = regression sum of squares

Explained variation attributable to the linear

relationship between x and y

SSE = error sum of squares

Variation attributable to factors other than the linear

relationship between x and y

(continued)

Measures of Variation

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-22

(continued)

x

i

y

X

y

i

SST = (y

i

- y)

2

SSE = (y

i

- y

i

)

2

.

SSR = (y

i

- y)

2

.

_

_

_

y

.

Y

y

_

y

.

Measures of Variation

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-23

The coefficient of determination is the portion

of the total variation in the dependent variable

that is explained by variation in the

independent variable

The coefficient of determination is also called

R-squared and is denoted as R

2

Coefficient of Determination, R

2

1 R 0

2

s s

note:

squares of sum total

squares of sum regression

SST

SSR

R

2

= =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-24

r

2

= 1

Examples of Approximate

r

2

Values

Y

X

Y

X

r

2

= 1

r

2

= 1

Perfect linear relationship

between X and Y:

100% of the variation in Y is

explained by variation in X

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-25

Examples of Approximate

r

2

Values

Y

X

Y

X

0 < r

2

< 1

Weaker linear relationships

between X and Y:

Some but not all of the

variation in Y is explained

by variation in X

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-26

Examples of Approximate

r

2

Values

r

2

= 0

No linear relationship

between X and Y:

The value of Y does not

depend on X. (None of the

variation in Y is explained

by variation in X)

Y

X

r

2

= 0

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-27

Excel Output

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

58.08% of the variation in

house prices is explained by

variation in square feet

0.58082

32600.5000

18934.9348

SST

SSR

R

2

= = =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-28

Correlation and R-square

The coefficient of determination, R

2

, for a

simple regression is equal to the simple

correlation squared

2

xy

2

r R =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-29

Estimation of Model

Error Variance

An estimator for the variance of the population model

error is

Division by n 2 instead of n 1 is because the simple regression

model uses two estimated parameters, b

0

and b

1

, instead of one

is called the standard error of the estimate

2 n

SSE

2 n

e

s

n

1 i

2

i

2

e

2

= =

=

2

e e

s s =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-30

Excel Output

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

41.33032 s

e

=

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-31

Comparing Standard Errors

Y Y

X X

e

s small

e

s large

s

e

is a measure of the variation of observed y

values from the regression line

The magnitude of s

e

should always be judged relative to the size

of the y values in the sample data

i.e., s

e

= $41.33K is

moderately small relative to house prices in

the $200 - $300K range

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-32

Inferences About the

Regression Model

The variance of the regression slope coefficient

(b

1

) is estimated by

2

x

2

e

2

i

2

e

2

1)s (n

s

) x (x

s

s

1

b

where:

= Estimate of the standard error of the least squares slope

= Standard error of the estimate

1

b

s

2 n

SSE

s

e

=

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-33

Excel Output

Regression Statistics

Multiple R 0.76211

R Square 0.58082

Adjusted R Square 0.52842

Standard Error 41.33032

Observations 10

ANOVA

df SS MS F Significance F

Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957

Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

0.03297 s

1

b

=

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-34

Comparing Standard Errors of

the Slope

Y

X

Y

X

1

b

S small

1

b

S large

is a measure of the variation in the slope of regression

lines from different possible samples

1

b

S

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-35

Inference about the Slope:

t Test

t test for a population slope

Is there a linear relationship between X and Y?

Null and alternative hypotheses

H

0

:

1

= 0 (no linear relationship)

H

1

:

1

= 0 (linear relationship does exist)

Test statistic

1

b

1 1

s

b

t

=

2 n d.f. =

where:

b

1

= regression slope

coefficient

1

= hypothesized slope

s

b1

= standard

error of the slope

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-36

House Price

in $1000s

(y)

Square Feet

(x)

245 1400

312 1600

279 1700

308 1875

199 1100

219 1550

405 2350

324 2450

319 1425

255 1700

(sq.ft.) 0.1098 98.25 price house + =

Estimated Regression Equation:

The slope of this model is 0.1098

Does square footage of the house

affect its sales price?

Inference about the Slope:

t Test

(continued)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-37

Inferences about the Slope:

t Test Example

H

0

:

1

= 0

H

1

:

1

= 0

From Excel output:

Coefficients Standard Error t Stat P-value

Intercept 98.24833 58.03348 1.69296 0.12892

Square Feet 0.10977 0.03297 3.32938 0.01039

1

b

s

t

b

1

3.32938

0.03297

0 0.10977

s

b

t

1

b

1 1

=

=

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-38

Inferences about the Slope:

t Test Example

H

0

:

1

= 0

H

1

:

1

= 0

Test Statistic: t = 3.329

There is sufficient evidence

that square footage affects

house price

From Excel output:

Reject H

0

Coefficients Standard Error t Stat P-value

Intercept 98.24833 58.03348 1.69296 0.12892

Square Feet 0.10977 0.03297 3.32938 0.01039

1

b

s

t b

1

Decision:

Conclusion:

Reject H

0

Reject H

0

o/2=.025

-t

n-2,/2

Do not reject H

0

0

o/2=.025

-2.3060 2.3060

3.329

d.f. = 10-2 = 8

t

8,.025

= 2.3060

(continued)

t

n-2,/2

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-39

Inferences about the Slope:

t Test Example

H

0

:

1

= 0

H

1

:

1

= 0

P-value = 0.01039

There is sufficient evidence

that square footage affects

house price

From Excel output:

Reject H

0

Coefficients Standard Error t Stat P-value

Intercept 98.24833 58.03348 1.69296 0.12892

Square Feet 0.10977 0.03297 3.32938 0.01039

P-value

Decision: P-value < so

Conclusion:

(continued)

This is a two-tail test, so

the p-value is

P(t > 3.329)+P(t < -3.329)

= 0.01039

(for 8 d.f.)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-40

Confidence Interval Estimate

for the Slope

Confidence Interval Estimate of the Slope:

Excel Printout for House Prices:

At 95% level of confidence, the confidence interval for

the slope is (0.0337, 0.1858)

1 1

b /2 2, n 1 1 b /2 2, n 1

s t b s t b

+ < <

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

d.f. = n - 2

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-41

Since the units of the house price variable is

$1000s, we are 95% confident that the average

impact on sales price is between $33.70 and

$185.80 per square foot of house size

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%

Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386

Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

This 95% confidence interval does not include 0.

Conclusion: There is a significant relationship between

house price and square feet at the .05 level of significance

Confidence Interval Estimate

for the Slope

(continued)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-42

Prediction

The regression equation can be used to

predict a value for y, given a particular x

For a specified value, x

n+1

, the predicted

value is

1 n 1 0 1 n

x b b y

+ +

+ =

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-43

317.85

0) 0.1098(200 98.25

(sq.ft.) 0.1098 98.25 price house

=

+ =

+ =

Predict the price for a house

with 2000 square feet:

The predicted price for a house with 2000

square feet is 317.85($1,000s) = $317,850

Predictions Using

Regression Analysis

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 12-44

0

50

100

150

200

250

300

350

400

450

0 500 1000 1500 2000 2500 3000

Square Feet

H

o

u

s

e

P

r

i

c

e

(

$

1

0

0

0

s

)

Relevant Data Range

When using a regression model for prediction,

only predict within the relevant range of data

Relevant data range

Risky to try to

extrapolate far

beyond the range

of observed Xs

- Additive SolutionUploaded bySwastik Mohapatra
- Correlation and Regression AnalysisUploaded byNagarjuna Appu
- Regression With SASUploaded byAkshay Mathur
- Chapter1_2010Uploaded byjudgerohan
- A PREDICTIVE MODEL FOR OZONE UPLIFTING IN OBSTRUCTION PRONE ENVIRONMENTUploaded byIAEME Publication
- The Household, Income and Labour Dynamics in Australia SurveyUploaded byErin Hill
- Knowledge management and its importance in improving the performance of small and medium-size enterprisesUploaded byThe Ijbmt
- Curve FittingUploaded byPurushothaman
- Op Mgmt Ch8-HWUploaded byAbhishek Mishra
- salesforecasting-120421012853-phpapp02Uploaded byDharmang Patel
- HILDA-SR-2017.pdfUploaded byABC News Online
- The Critical Factors of Corporate Social Responsibility Csr That Contribute Towards Consumer Behavior in the Ready Made Garments Rmg Industry of BangladeshUploaded byIJSTR Research Publication
- Regression in Data MiningUploaded bypoonam
- 04 - Reg.pptxUploaded bygopi
- ISOM CH 14Uploaded bytotally_fatal7848_56
- jurnalUploaded byNurul Fitri Rasyid
- UploadUploaded byDan Lu
- IJETR012001Uploaded byerpublication
- hw 3 number 3Uploaded byYiyi Zhao
- RELTIQUES_V1N1Uploaded bygustavo orozco
- 5.OptimizationUploaded byHenry Lopez
- Spss Pak HajiUploaded byDhenok
- 330_Lecture8_2014Uploaded byAnonymous gUySMcpSq
- metode cantitativeUploaded byHorvath Zsofia
- praktikum ekomettUploaded byRetno Jati Sahari
- ERES_2014_Artigo _impact o f socioeconomic_com Cristovao e Leo_Template.pdfUploaded byfpontual
- Table 2 PepoUploaded byConstantino Hevia
- 57Uploaded byChipo Chip
- CAPITAL STRUCURE AND ITS IMPACT ON FINANCIAL PERFORMANCE OF INDIAN STEEL INDUSTRY.pdfUploaded byRAVI RAUSHAN JHA
- Linear RegressionUploaded byHan Myo

- SAT workbookUploaded byLisa Crawford
- Math Review SatUploaded byLisa Crawford
- ExamView - 9 FunctionsUploaded byLisa Crawford
- Sample QuestionsUploaded byLisa Crawford
- Practice Data Sufficiency Questions 1Uploaded byabhijithk2
- Quantitative Economical Methods for MBA 13Uploaded byLisa Crawford
- Exam 2008Uploaded byLisa Crawford
- Nova ShortUploaded byLisa Crawford
- SL Planner 12aUploaded byLisa Crawford
- Section 6Uploaded byLisa Crawford

- Parlimentary Form of GovtUploaded byAjeet Krishnamurthy
- VP0509Uploaded byAnonymous 9eadjPSJNg
- fs 5Uploaded byShairaManauis
- primary mathUploaded byapi-33610044
- Call for IPSF Co-Ordinator and Sub-Committee Positions 2012-13Uploaded byInternational Pharmaceutical Students' Federation (IPSF)
- Written ReportUploaded byCamille Lucelo
- Debunking the Dominance MythUploaded byFontes Conde de Almourol
- Physics - Practical.pdfUploaded byOmar Khan
- A Corpus Based Study of the Errors Committed by PakistaniUploaded byAlexander Decker
- Management Skills for Everyday Life 3rd Edition by Paula Caproni – Test BankUploaded byshaista.aziz
- m500_2_5_ssdUploaded byMark Reinhardt
- Syllabus for Ali RkeinUploaded byAli Nasser El Deen
- Moses Hershberge Essay2Uploaded byMoses Hershberger
- Biodiesel Transesterification Kinetics Monitored by PH MeasurementUploaded byCláudia Silva
- Performance MgtUploaded bySoujanya Bhoopal
- Ch 5 Quiz FinanceUploaded byValeriaRios
- True Hermaphrodite: A Case ReportUploaded byMuhammad Bilal Mirza
- Week 1_The Role and Importance of ResearchUploaded bySue Ahmad
- Agents charged with fraud.pdfUploaded byDebbie Hilder
- Sage Ing23LRUploaded byOkanagan Institute
- The Antioxidant MythUploaded bycygnus_11
- 07 Chapter 1Uploaded byPrashu Prashanth
- E5 Syllabus TelecomUploaded byMANISH KUMAR
- 190312 - Legal Opinion on Joint Instrument and Unilateral Declaration Co.. 2Uploaded byZerohedge
- Chapter 21 - Fuels and Crude OilUploaded byJERVINLIM
- Yap v. TañadaUploaded bybrandon dawis
- Asthma.pdfUploaded byChong KA
- Introduction of Public Policy Analysis MBS First YearUploaded byStore Sansar
- Child - CYRM Manual.pdfUploaded bykgbkarl
- BIOGASUploaded byLely Mardiyanti, S.Pd.