BUSINESS

STATISTICS

by

AMIR D. ACZEL

&

JAYAVEL SOUNDERPANDIAN

7th edition.

University

Chapter 10

Simple Linear Regression and Correlation

McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.


• Using Statistics

• The Simple Linear Regression Model

• Estimation: The Method of Least Squares

• Error Variance and the Standard Errors of Regression Estimators

• Correlation

• Hypothesis Tests about the Regression Relationship

• How Good is the Regression?

• Analysis of Variance Table and an F Test of the Regression Model

• Residual Analysis and Checking for Model Inadequacies

• Use of the Regression Model for Prediction

• The Solver Method for Regression

LEARNING OBJECTIVES

After studying this chapter, you should be able to:

• Determine whether a regression experiment would be useful in a given instance

• Formulate a regression model

• Compute a regression equation

• Compute the covariance and the correlation coefficient of two random variables

• Compute confidence intervals for regression coefficients

• Compute a prediction interval for the dependent variable

• Test hypotheses about regression coefficients

• Conduct an ANOVA experiment using regression results

• Analyze residuals to check whether the assumptions about the regression model are valid

• Solve regression problems using spreadsheet templates

• Use the LINEST function to carry out a regression

Using Statistics

• Regression is a statistical technique for modeling the relationship between variables.

• In simple linear regression, we model the relationship between two variables.

• One of the variables, denoted by Y, is called the dependent variable and the other, denoted by X, is called the independent variable.

• The model we will use to depict the relationship between X and Y will be a straight-line relationship.

• A graphical sketch of the pairs (X, Y) is called a scatter plot.

This scatterplot locates pairs of observations of advertising expenditures on the x-axis and sales on the y-axis.

[Figure: Scatterplot of Advertising Expenditures (X) and Sales (Y); x-axis: Advertising, y-axis: Sales]

Larger (smaller) values of sales tend to be associated with larger (smaller) values of advertising.

The scatter of points tends to be distributed around a positively sloped straight line.

The pairs of values of advertising expenditures and sales are not located exactly on a straight line.

The scatter plot reveals a more or less strong tendency rather than a precise linear relationship.

The line represents the nature of the relationship on average.


Model Building

The inexact nature of the relationship between advertising and sales suggests that a statistical model might be useful in analyzing the relationship.

A statistical model separates the systematic component of a relationship from the random component.

Statistical model = Systematic component + Random errors

In ANOVA, the systematic component is the variation of means between samples or treatments (SSTR) and the random component is the unexplained variation (SSE).

In regression, the systematic component is the overall linear relationship, and the random component is the variation around the line.

The Simple Linear Regression Model

The population simple linear regression model:

$$ Y = \beta_0 + \beta_1 X + \varepsilon $$

Here $\beta_0 + \beta_1 X$ is the nonrandom or systematic component and $\varepsilon$ is the random component,

where

Y is the dependent variable, the variable we wish to explain or predict

X is the independent variable, also called the predictor variable

$\varepsilon$ is the error term, the only random component in the model, and thus the only source of randomness in Y.

$\beta_0$ is the intercept of the systematic component.

$\beta_1$ is the slope of the systematic component.

Regression Model

The simple linear regression model gives an exact linear relationship between the expected or average value of Y, the dependent variable, and X, the independent or predictor variable:

$$ E[Y_i] = \beta_0 + \beta_1 X_i $$

Actual observed values of Y differ from the expected value by an unexplained or random error:

$$ Y_i = E[Y_i] + \varepsilon_i = \beta_0 + \beta_1 X_i + \varepsilon_i $$

[Figure: regression plot of the line $E[Y] = \beta_0 + \beta_1 X$, marking the intercept $\beta_0$, the slope $\beta_1$, and the error $\varepsilon_i$ as the vertical distance between $Y_i$ and $E[Y_i]$ at $X_i$]

Assumptions of the Simple Linear Regression Model

• The relationship between X and Y is a straight-line relationship.

• The values of the independent variable X are assumed fixed (not random); the only randomness in the values of Y comes from the error term $\varepsilon_i$.

• The errors $\varepsilon_i$ are normally distributed with mean 0 and variance $\sigma^2$. The errors are uncorrelated (not related) in successive observations. That is:

$$ \varepsilon \sim N(0, \sigma^2) $$

[Figure: identical normal distributions of errors, all centered on the regression line $E[Y] = \beta_0 + \beta_1 X$]
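The assumptions above can be illustrated with a small simulation. This is only a sketch: the parameter values (`beta0`, `beta1`, `sigma`) and the X values are made up for illustration and do not come from the chapter. X is held fixed and the only randomness in Y comes from the normal error term.

```python
import random


def simulate_observations(xs, beta0, beta1, sigma, seed=0):
    """Draw Y_i = beta0 + beta1 * X_i + eps_i with eps_i ~ N(0, sigma^2).

    The X values are treated as fixed; each error is drawn independently.
    """
    rng = random.Random(seed)
    ys = []
    for x in xs:
        eps = rng.gauss(0.0, sigma)  # independent normal error
        ys.append(beta0 + beta1 * x + eps)
    return ys


# Hypothetical values chosen only for illustration.
xs = [10, 20, 30, 40, 50]
ys = simulate_observations(xs, beta0=5.0, beta1=2.0, sigma=3.0)
for x, y in zip(xs, ys):
    print(x, round(y, 2), "E[Y] =", 5.0 + 2.0 * x)
```

Setting `sigma=0.0` removes the random component, so every simulated Y falls exactly on the line $E[Y] = \beta_0 + \beta_1 X$.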

Estimation: The Method of Least Squares

Estimation of a simple linear regression relationship involves finding estimated or predicted values of the intercept and slope of the linear regression line.

The estimated regression equation:

$$ Y = b_0 + b_1 X + e $$

where $b_0$ estimates the intercept of the population regression line, $\beta_0$; $b_1$ estimates the slope of the population regression line, $\beta_1$; and e stands for the observed errors, the residuals from fitting the estimated regression line $b_0 + b_1 X$ to a set of n points.

The estimated regression line:

$$ \hat{Y} = b_0 + b_1 X $$

where $\hat{Y}$ (Y-hat) is the value of Y lying on the fitted regression line for a given value of X.

[Figure: fitting a regression line. Panels: the data; three errors from a fitted line; three errors from the least squares regression line; errors from the least squares regression line are minimized]

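Errors in Regression

$$ e_i = Y_i - \hat{Y}_i $$

where $Y_i$ is the observed data point, $\hat{Y} = b_0 + b_1 X$ is the fitted regression line, and $\hat{Y}_i$ is the predicted value of Y for $X_i$.

[Figure: the error $e_i$ shown as the vertical distance between the observed point $Y_i$ and the fitted value $\hat{Y}_i$ at $X_i$]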

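The sum of squared errors in regression is:

$$ SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$

The least squares regression line is that which minimizes the SSE with respect to the estimates $b_0$ and $b_1$.

The normal equations:

$$ \sum_{i=1}^{n} y_i = n b_0 + b_1 \sum_{i=1}^{n} x_i $$

$$ \sum_{i=1}^{n} x_i y_i = b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2 $$

[Figure: SSE as a function of $b_0$ and $b_1$; at the least squares values of $b_0$ and $b_1$, SSE is minimized with respect to $b_0$ and $b_1$]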
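The two normal equations form a 2×2 linear system in $b_0$ and $b_1$. The sketch below solves that system directly with Cramer's rule; the function name and the tiny data set are hypothetical and chosen so the answer is easy to check by hand.

```python
def solve_normal_equations(x, y):
    """Solve the normal equations
         sum(y)  = n*b0      + b1*sum(x)
         sum(xy) = b0*sum(x) + b1*sum(x^2)
    for (b0, b1) using Cramer's rule.
    """
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))

    det = n * sum_x2 - sum_x * sum_x  # zero only if all x values are equal
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


# Illustrative data lying exactly on y = 1 + 2x.
b0, b1 = solve_normal_equations([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```

Because the illustrative points lie exactly on a line, the solution recovers that line's intercept and slope.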

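Sums of Squares, Cross Products, and Least Squares Estimators

Sums of Squares and Cross Products:

$$ SS_X = \sum (x - \bar{x})^2 = \sum x^2 - \frac{\left(\sum x\right)^2}{n} $$

$$ SS_Y = \sum (y - \bar{y})^2 = \sum y^2 - \frac{\left(\sum y\right)^2}{n} $$

$$ SS_{XY} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n} $$

Least squares regression estimators:

$$ b_1 = \frac{SS_{XY}}{SS_X} $$

$$ b_0 = \bar{y} - b_1 \bar{x} $$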
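Each sum of squares has a definitional form (deviations from the mean) and a computational shortcut (raw sums). The sketch below computes both and shows that they agree. The function names are hypothetical, and the data are just the first five (miles, dollars) pairs of Example 10-1, used for illustration only, so the results are not the example's results.

```python
def sums_of_squares(x, y):
    """Definitional forms: sums of deviations from the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    ss_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return ss_x, ss_y, ss_xy


def sums_of_squares_shortcut(x, y):
    """Computational forms: raw sums minus a correction term."""
    n = len(x)
    ss_x = sum(a * a for a in x) - sum(x) ** 2 / n
    ss_y = sum(b * b for b in y) - sum(y) ** 2 / n
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return ss_x, ss_y, ss_xy


miles = [1211, 1345, 1422, 1687, 1849]    # subset, illustration only
dollars = [1802, 2405, 2005, 2511, 2332]  # subset, illustration only

definitional = sums_of_squares(miles, dollars)
shortcut = sums_of_squares_shortcut(miles, dollars)
b1 = definitional[2] / definitional[0]                      # SS_XY / SS_X
b0 = sum(dollars) / len(dollars) - b1 * sum(miles) / len(miles)
print(definitional, shortcut, b0, b1)
```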

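Example 10-1

| Miles (x) | Dollars (y) | x² | xy |
|---|---|---|---|
| 1211 | 1802 | 1466521 | 2182222 |
| 1345 | 2405 | 1809025 | 3234725 |
| 1422 | 2005 | 2022084 | 2851110 |
| 1687 | 2511 | 2845969 | 4236057 |
| 1849 | 2332 | 3418801 | 4311868 |
| 2026 | 2305 | 4104676 | 4669930 |
| 2133 | 3016 | 4549689 | 6433128 |
| 2253 | 3385 | 5076009 | 7626405 |
| 2400 | 3090 | 5760000 | 7416000 |
| 2468 | 3694 | 6091024 | 9116792 |
| 2699 | 3371 | 7284601 | 9098329 |
| 2806 | 3998 | 7873636 | 11218388 |
| 3082 | 3555 | 9498724 | 10956510 |
| 3209 | 4692 | 10297681 | 15056628 |
| 3466 | 4244 | 12013156 | 14709704 |
| 3643 | 5298 | 13271449 | 19300614 |
| 3852 | 4801 | 14837904 | 18493452 |
| 4033 | 5147 | 16265089 | 20757852 |
| 4267 | 5738 | 18207288 | 24484046 |
| 4498 | 6420 | 20232004 | 28877160 |
| 4533 | 6059 | 20548088 | 27465448 |
| 4804 | 6426 | 23078416 | 30870504 |
| 5090 | 6321 | 25908100 | 32173890 |
| 5233 | 7026 | 27384288 | 36767056 |
| 5439 | 6964 | 29582720 | 37877196 |
| Total: 79,448 | 106,605 | 293,426,946 | 390,185,014 |

$$ SS_X = \sum x^2 - \frac{\left(\sum x\right)^2}{n} = 293{,}426{,}946 - \frac{(79{,}448)^2}{25} = 40{,}947{,}557.84 $$

$$ SS_{XY} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n} = 390{,}185{,}014 - \frac{(79{,}448)(106{,}605)}{25} = 51{,}402{,}852.4 $$

$$ b_1 = \frac{SS_{XY}}{SS_X} = \frac{51{,}402{,}852.4}{40{,}947{,}557.84} = 1.255333776 \approx 1.26 $$

$$ b_0 = \bar{y} - b_1 \bar{x} = \frac{106{,}605}{25} - (1.255333776)\left(\frac{79{,}448}{25}\right) = 274.85 $$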
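The arithmetic of Example 10-1 can be retraced from the column totals alone. The sketch below assumes only the four totals and n = 25 reported in the example; the variable names are illustrative.

```python
# Column totals reported for Example 10-1 (n = 25 observations).
n = 25
sum_x = 79448         # total miles
sum_y = 106605        # total dollars
sum_x2 = 293426946    # total of x squared
sum_xy = 390185014    # total of x times y

# Computational formulas for the sums of squares and cross products.
ss_x = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

# Least squares estimators.
b1 = ss_xy / ss_x
b0 = sum_y / n - b1 * sum_x / n

print(round(ss_x, 2), round(ss_xy, 1), round(b1, 5), round(b0, 2))
```

The printed values can be compared with the figures shown in the example (SS_X, SS_XY, the slope of about 1.26, and the intercept of about 274.85).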

[Figures: spreadsheet template used to carry out a simple regression, shown on four slides]

... between the residuals and the X-values (miles).

... would indicate that the normality assumption for the errors has not been violated.

[Figure: total variance and error variance. One panel shows what you see when looking at the total variation of Y; the other shows what you see when looking along the regression line at the error variance of Y]

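Error Variance and the Standard Errors of Regression Estimators

Degrees of freedom in regression:

$$ df = n - 2 $$

(n total observations less one degree of freedom for each parameter estimated, $b_0$ and $b_1$)

Square and sum all regression errors to find SSE:

$$ SSE = \sum (Y - \hat{Y})^2 = SS_Y - \frac{(SS_{XY})^2}{SS_X} = SS_Y - b_1 SS_{XY} $$

An unbiased estimator of $\sigma^2$, denoted by $s^2$:

$$ MSE = \frac{SSE}{n - 2} $$

Example 10-1:

$$ SSE = SS_Y - b_1 SS_{XY} = 66855898 - (1.255333776)(51402852.4) = 2328161.2 $$

$$ MSE = \frac{SSE}{n - 2} = \frac{2328161.2}{23} = 101224.4 $$

$$ s = \sqrt{MSE} = \sqrt{101224.4} = 318.158 $$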
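The error-variance calculation can be retraced with a few lines of code. This is a sketch that takes the sums of squares quoted for Example 10-1 as given inputs; the variable names are illustrative.

```python
import math

# Sums of squares quoted for Example 10-1.
n = 25
ss_x = 40947557.84
ss_y = 66855898.0
ss_xy = 51402852.4

b1 = ss_xy / ss_x            # least squares slope
sse = ss_y - b1 * ss_xy      # sum of squared errors
mse = sse / (n - 2)          # unbiased estimator of the error variance
s = math.sqrt(mse)           # standard error of estimate

print(round(sse, 1), round(mse, 1), round(s, 3))
```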

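Standard Errors of Regression Estimators

The standard error of $b_0$ (intercept):

$$ s(b_0) = \frac{s\sqrt{\sum x^2}}{\sqrt{n\,SS_X}} \qquad \text{where } s = \sqrt{MSE} $$

The standard error of $b_1$ (slope):

$$ s(b_1) = \frac{s}{\sqrt{SS_X}} $$

Example 10-1:

$$ s(b_0) = \frac{318.158\sqrt{293426946}}{\sqrt{(25)(40947557.84)}} = 170.338 $$

$$ s(b_1) = \frac{318.158}{\sqrt{40947557.84}} = 0.04972 $$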
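As a check on the two standard-error formulas, the sketch below plugs in the Example 10-1 quantities (s, the total of x squared, n, and SS_X). The variable names are illustrative.

```python
import math

# Quantities quoted for Example 10-1.
n = 25
s = 318.158            # standard error of estimate, sqrt(MSE)
sum_x2 = 293426946     # total of x squared
ss_x = 40947557.84     # sum of squares for x

# Standard error of the intercept: s * sqrt(sum x^2) / sqrt(n * SS_X)
se_b0 = s * math.sqrt(sum_x2) / math.sqrt(n * ss_x)

# Standard error of the slope: s / sqrt(SS_X)
se_b1 = s / math.sqrt(ss_x)

print(round(se_b0, 3), round(se_b1, 5))
```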

Confidence Intervals for the Regression Parameters

A $(1 - \alpha)100\%$ confidence interval for $\beta_0$:

$$ b_0 \pm t_{\alpha/2,\,(n-2)}\, s(b_0) $$

A $(1 - \alpha)100\%$ confidence interval for $\beta_1$:

$$ b_1 \pm t_{\alpha/2,\,(n-2)}\, s(b_1) $$

Example 10-1, 95% confidence intervals:

$$ b_0 \pm t_{0.025,\,(25-2)}\, s(b_0) = 274.85 \pm (2.069)(170.338) = 274.85 \pm 352.43 = [-77.58,\ 627.28] $$

$$ b_1 \pm t_{0.025,\,(25-2)}\, s(b_1) = 1.25533 \pm (2.069)(0.04972) = 1.25533 \pm 0.10287 = [1.15246,\ 1.35820] $$

[Figure: confidence interval for the regression slope at 95%, drawn around the least-squares point estimate $b_1 = 1.25533$ (height = slope, length = 1)]
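Both intervals follow the same pattern: estimate plus or minus a critical t value times a standard error. The sketch below uses a small hypothetical helper, `confidence_interval`, and takes the critical value 2.069 from the example rather than computing it.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


T_CRIT = 2.069  # t(0.025, 23), as quoted in Example 10-1

lo0, hi0 = confidence_interval(274.85, 170.338, T_CRIT)   # intercept
lo1, hi1 = confidence_interval(1.25533, 0.04972, T_CRIT)  # slope

print(round(lo0, 2), round(hi0, 2))
print(round(lo1, 5), round(hi1, 5))
```

The interval for the intercept contains zero, while the interval for the slope does not.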

[Figure: template output used to obtain confidence intervals for $\beta_0$ and $\beta_1$]

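10-5 Correlation

The correlation between two random variables, X and Y, is a measure of the degree of linear association between the two variables.

The population correlation, denoted by $\rho$, can take on any value from -1 to 1.

$\rho = -1$ indicates a perfect negative linear relationship

$-1 < \rho < 0$ indicates a negative linear relationship

$\rho = 0$ indicates no linear relationship

$0 < \rho < 1$ indicates a positive linear relationship

$\rho = 1$ indicates a perfect positive linear relationship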

Illustrations of Correlation

[Figure: six scatterplots illustrating $\rho = -1$, $\rho = 0$, $\rho = 1$, $\rho = -0.8$, $\rho = 0$, and $\rho = 0.8$]

The covariance of two random variables X and Y:

$$ Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] $$

where $\mu_X$ and $\mu_Y$ are the population means of X and Y respectively.

The population correlation coefficient:

$$ \rho = \frac{Cov(X, Y)}{\sigma_X \sigma_Y} $$

The sample correlation coefficient:

$$ r = \frac{SS_{XY}}{\sqrt{SS_X\, SS_Y}} $$

Example 10-1:

$$ r = \frac{51402852.4}{\sqrt{(40947557.84)(66855898)}} = \frac{51402852.4}{52321943.29} = 0.9824 $$
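The sample correlation for Example 10-1 can be recomputed from the three sums of squares. A minimal sketch, with illustrative variable names:

```python
import math

# Sums of squares quoted for Example 10-1.
ss_x = 40947557.84
ss_y = 66855898.0
ss_xy = 51402852.4

# Sample correlation coefficient: r = SS_XY / sqrt(SS_X * SS_Y)
denominator = math.sqrt(ss_x * ss_y)
r = ss_xy / denominator

print(round(denominator, 2), round(r, 4))
```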

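Hypothesis Test for the Correlation Coefficient

$H_0$: $\rho = 0$ (No linear relationship)

$H_1$: $\rho \neq 0$ (Some linear relationship)

Test statistic:

$$ t_{(n-2)} = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} $$

Example 10-1:

$$ t_{(n-2)} = \frac{0.9824}{\sqrt{\dfrac{1 - 0.9651}{25 - 2}}} = \frac{0.9824}{0.0389} = 25.25 $$

$$ t_{0.005} = 2.807 < 25.25 $$

$H_0$ is rejected at the 1% level.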
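The test statistic is sensitive to how r is rounded, so the sketch below recomputes r from the Example 10-1 sums of squares before forming t. The helper name `correlation_t_statistic` is hypothetical, and the critical value 2.807 is taken from the example rather than computed.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic with n - 2 degrees of freedom for H0: rho = 0."""
    return r / math.sqrt((1.0 - r * r) / (n - 2))


n = 25
r = 51402852.4 / math.sqrt(40947557.84 * 66855898.0)  # unrounded r
t_stat = correlation_t_statistic(r, n)

T_CRIT = 2.807  # t(0.005, 23), as quoted in the example
reject_h0 = abs(t_stat) > T_CRIT

print(round(t_stat, 2), reject_h0)
```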

Hypothesis Tests about the Regression Relationship

[Figure: three situations with no linear relationship between X and Y: constant Y, unsystematic variation, and a nonlinear relationship]

A hypothesis test for the existence of a linear relationship between X and Y:

$$ H_0: \beta_1 = 0 $$

$$ H_1: \beta_1 \neq 0 $$

Test statistic for the existence of a linear relationship between X and Y:

$$ t_{(n-2)} = \frac{b_1}{s(b_1)} $$

where $b_1$ is the least-squares estimate of the regression slope and $s(b_1)$ is the standard error of $b_1$.

When the null hypothesis is true, the statistic has a t distribution with n - 2 degrees of freedom.

Hypothesis Tests for the Regression Slope

Example 10-1:

$$ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 $$

$$ t_{(n-2)} = \frac{b_1}{s(b_1)} = \frac{1.25533}{0.04972} = 25.25 $$

$$ t_{(0.005,\,23)} = 2.807 < 25.25 $$

$H_0$ is rejected at the 1% level and we may conclude that there is a relationship between charges and miles traveled.

Example 10-4:

$$ H_0: \beta_1 = 1 \qquad H_1: \beta_1 \neq 1 $$

$$ t_{(n-2)} = \frac{b_1 - 1}{s(b_1)} = \frac{1.24 - 1}{0.21} = 1.14 $$

$$ t_{(0.05,\,58)} = 1.671 > 1.14 $$

$H_0$ is not rejected at the 10% level. We may not conclude that the beta coefficient is different from 1.
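Both examples use the same statistic, differing only in the hypothesized slope. The sketch below wraps it in a hypothetical helper, `slope_t_statistic`; the estimates, standard errors, and critical values are the ones quoted in the two examples.

```python
def slope_t_statistic(b1, se_b1, beta1_null=0.0):
    """t statistic for H0: beta1 = beta1_null, with n - 2 degrees of freedom."""
    return (b1 - beta1_null) / se_b1


# Example 10-1: H0: beta1 = 0, two-tailed test at the 1% level.
t_10_1 = slope_t_statistic(1.25533, 0.04972)
reject_10_1 = abs(t_10_1) > 2.807   # t(0.005, 23)

# Example 10-4: H0: beta1 = 1, two-tailed test at the 10% level.
t_10_4 = slope_t_statistic(1.24, 0.21, beta1_null=1.0)
reject_10_4 = abs(t_10_4) > 1.671   # t(0.05, 58)

print(round(t_10_1, 2), reject_10_1)
print(round(t_10_4, 2), reject_10_4)
```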

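How Good is the Regression?

The coefficient of determination, $r^2$, is a descriptive measure of the strength of the regression relationship, a measure of how well the regression line fits the data.

$$ (y - \bar{y}) = (y - \hat{y}) + (\hat{y} - \bar{y}) $$

Total Deviation = Unexplained Deviation (Error) + Explained Deviation (Regression)

$$ \sum (y - \bar{y})^2 = \sum (y - \hat{y})^2 + \sum (\hat{y} - \bar{y})^2 $$

$$ SST = SSE + SSR $$

$$ r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} $$

$r^2$ is the percentage of total variation explained by the regression.

[Figure: a data point's total deviation from $\bar{y}$ split into the unexplained deviation from the regression line and the explained deviation of the regression line from $\bar{y}$]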

[Figure: three scatterplots showing how SST divides into SSE and SSR for $r^2 = 0$, $r^2 = 0.50$, and $r^2 = 0.90$]

Example 10-1:

$$ r^2 = \frac{SSR}{SST} = \frac{64527736.8}{66855898} = 0.96518 $$

[Figure: plot of Dollars (y-axis) against Miles (x-axis)]

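Analysis of Variance Table and an F Test of the Regression Model

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F Ratio |
|---|---|---|---|---|
| Regression | SSR | (1) | MSR | MSR/MSE |
| Error | SSE | (n-2) | MSE | |
| Total | SST | (n-1) | MST | |

Example 10-1

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F Ratio | p Value |
|---|---|---|---|---|---|
| Regression | 64527736.8 | 1 | 64527736.8 | 637.47 | 0.000 |
| Error | 2328161.2 | 23 | 101224.4 | | |
| Total | 66855898.0 | 24 | | | |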
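The entries of the Example 10-1 ANOVA table can be rebuilt from SST, SSE, and n. The sketch below is illustrative only: it does not compute the p value, and the variable names are assumptions.

```python
# Inputs taken from the Example 10-1 ANOVA table.
n = 25
sst = 66855898.0   # total sum of squares
sse = 2328161.2    # error sum of squares

ssr = sst - sse            # regression sum of squares
df_regression = 1
df_error = n - 2
df_total = n - 1

msr = ssr / df_regression  # mean square regression
mse = sse / df_error       # mean square error
f_ratio = msr / mse        # F statistic with (1, n - 2) degrees of freedom
r_squared = ssr / sst      # coefficient of determination

print(round(ssr, 1), round(mse, 1), round(f_ratio, 2), round(r_squared, 5))
```

In simple linear regression the F ratio is the square of the slope's t statistic, which is why 637.47 is close to 25.25 squared.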

[Figure: template output showing the analysis of variance and an F test of the regression model]

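Residual Analysis and Checking for Model Inadequacies

[Figure: four residual plots, with residuals plotted around 0 against x or $\hat{y}$, or against time]

• Residuals appear random: no indication of model inadequacy.

• Residual variance increases when x changes.

• Residuals exhibit a linear trend with time.

• A curved residual pattern indicates an underlying nonlinear relationship.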
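Residual plots start from the residuals themselves. The sketch below fits a least squares line and computes $e_i = y_i - \hat{y}_i$; the helper names are hypothetical, and the data are only the first five (miles, dollars) pairs of Example 10-1, used for illustration. It also checks two properties that follow from the normal equations: the residuals sum to zero and are uncorrelated with x.

```python
def fit_line(x, y):
    """Least squares intercept and slope from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_x
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """Observed minus fitted values, one per observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


miles = [1211, 1345, 1422, 1687, 1849]    # subset, illustration only
dollars = [1802, 2405, 2005, 2511, 2332]  # subset, illustration only

b0, b1 = fit_line(miles, dollars)
e = residuals(miles, dollars, b0, b1)

print([round(v, 1) for v in e])
print("sum of residuals:", round(sum(e), 6))
print("sum of x * e:", round(sum(xi * ei for xi, ei in zip(miles, e)), 6))
```

Plotting `e` against `miles` (or against the fitted values) gives the kind of residual plot described above.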

[Figures: normal probability plots of residuals, shown on four slides; surviving panel titles: Flatter than Normal, Positively Skewed, Negatively Skewed]

Use of the Regression Model for Prediction

• Point Prediction

A single-valued estimate of Y for a given value of X, obtained by inserting the value of X in the estimated regression equation.

• Prediction Interval

For a value of Y given a value of X, the interval reflects:

  Variation in regression line estimate

  Variation of points around regression line

For an average value of Y given a value of X, the interval reflects:

  Variation in regression line estimate

[Figure: errors in predicting E[Y|X], showing uncertainty about the slope of the regression line and uncertainty about the intercept of the regression line]

• The prediction band for E[Y|X] is narrowest at the mean value of X.

• The prediction band widens as the distance from the mean of X increases.

• Predictions become very unreliable when we extrapolate beyond the range of the sample itself.
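The widening of the band can be seen numerically. The sketch below evaluates the half-width $t\,s\sqrt{1/n + (x - \bar{x})^2 / SS_X}$ of the band for E[Y|X] at several X values, using quantities quoted for Example 10-1 ($\bar{x} = 3{,}177.92$, $SS_X = 40{,}947{,}557.84$, s = 318.158, t = 2.069). The function name and the chosen X values are illustrative.

```python
import math

# Quantities quoted for Example 10-1.
N = 25
X_BAR = 3177.92
SS_X = 40947557.84
S = 318.158
T_CRIT = 2.069  # t(0.025, 23)


def mean_band_half_width(x0):
    """Half-width of the 95% band for E[Y|X] at X = x0."""
    return T_CRIT * S * math.sqrt(1.0 / N + (x0 - X_BAR) ** 2 / SS_X)


for x0 in (X_BAR, 4000, 5000, 6000, 8000):
    print(x0, round(mean_band_half_width(x0), 1))
```

The half-width is smallest at the mean of X and grows with distance from it; 6,000 and 8,000 lie beyond the largest observed X (5,439), where the band is widest.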

3) Variation around the regression line

[Figure: regression line with the prediction band for E[Y|X]]

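A $(1 - \alpha)100\%$ prediction interval for Y:

$$ \hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}} $$

Example 10-1 (X = 4,000): the term under the square root is

$$ 1 + \frac{1}{25} + \frac{(4{,}000 - 3{,}177.92)^2}{40{,}947{,}557.84} $$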
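The interval for an individual Y at X = 4,000 can be sketched as follows. The inputs are the rounded Example 10-1 values quoted elsewhere in the chapter ($b_0 = 274.85$, $b_1 = 1.25533$, s = 318.158, t = 2.069), so the last digits depend on that rounding; the function name is hypothetical.

```python
import math


def prediction_interval_for_y(x0, b0, b1, s, t_crit, n, x_bar, ss_x):
    """Interval for a single new Y at X = x0:
    y_hat -/+ t * s * sqrt(1 + 1/n + (x0 - x_bar)^2 / SS_X)."""
    y_hat = b0 + b1 * x0
    half_width = t_crit * s * math.sqrt(1.0 + 1.0 / n + (x0 - x_bar) ** 2 / ss_x)
    return y_hat, half_width


y_hat, half_width = prediction_interval_for_y(
    x0=4000, b0=274.85, b1=1.25533, s=318.158, t_crit=2.069,
    n=25, x_bar=3177.92, ss_x=40947557.84,
)
print(round(y_hat, 2), round(half_width, 2))
print(round(y_hat - half_width, 2), round(y_hat + half_width, 2))
```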

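A $(1 - \alpha)100\%$ prediction interval for E[Y|X]:

$$ \hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}} $$

Example 10-1 (X = 4,000): the term under the square root is

$$ \frac{1}{25} + \frac{(4{,}000 - 3{,}177.92)^2}{40{,}947{,}557.84} $$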
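For the average value of Y at X = 4,000 the leading 1 under the square root is dropped, which makes the interval much narrower than the one for an individual Y. A sketch using the same rounded Example 10-1 inputs ($b_0 = 274.85$, $b_1 = 1.25533$, s = 318.158, t = 2.069); the function name is hypothetical.

```python
import math


def interval_for_mean_y(x0, b0, b1, s, t_crit, n, x_bar, ss_x):
    """Interval for E[Y|X = x0]:
    y_hat -/+ t * s * sqrt(1/n + (x0 - x_bar)^2 / SS_X)."""
    y_hat = b0 + b1 * x0
    half_width = t_crit * s * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / ss_x)
    return y_hat, half_width


y_hat_mean, half_width_mean = interval_for_mean_y(
    x0=4000, b0=274.85, b1=1.25533, s=318.158, t_crit=2.069,
    n=25, x_bar=3177.92, ss_x=40947557.84,
)
print(round(y_hat_mean, 2), round(half_width_mean, 2))
print(round(y_hat_mean - half_width_mean, 2), round(y_hat_mean + half_width_mean, 2))
```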


The Solver Method for Regression

The Solver macro available in Excel can also be used to conduct a simple linear regression. See the text for instructions.

[Figure: fitted line plot of Y against X, Y = -0.8465 + 1.352 X; S = 0.184266, R-Sq = 95.2%, R-Sq(adj) = 94.8%]
