
Econometrics

Lecture 6

Multiple Regression


Chapter Goals

After completing this chapter, you should be able to:

• Apply multiple regression analysis to business decision-making situations

• Analyze and interpret the computer output for a multiple regression model

• Perform a hypothesis test for all regression coefficients or for a subset of coefficients

• Fit and interpret nonlinear regression models

• Incorporate qualitative variables into the regression model by using dummy variables

• Discuss model specification and analyze residuals

The Multiple Regression Model

Multiple Regression Model with K Independent Variables:

y = β0 + β1x1 + β2x2 + … + βKxK + ε

Chapter 13

Multiple Regression Equation

The coefficients of the multiple regression model are estimated using sample data

Multiple regression equation with K independent variables:

ŷ = b0 + b1x1 + b2x2 + … + bKxK

In this chapter we will always use a computer to obtain the regression slope coefficients and other regression summary measures.
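Since the slides rely on Excel for these computations, here is a minimal sketch of the same least-squares estimation in Python; the small data set is invented for illustration and is not the chapter's pie-sales data:

```python
# A minimal sketch of least-squares estimation, standing in for the
# Excel output used throughout this chapter. The data here are
# invented for illustration.
import numpy as np

# Design matrix: a column of 1s (intercept) plus two regressors x1, x2
X = np.array([[1.0, 5.5, 3.3],
              [1.0, 7.5, 3.3],
              [1.0, 8.0, 3.0],
              [1.0, 8.0, 4.5],
              [1.0, 6.8, 3.0],
              [1.0, 7.5, 4.0]])
y = np.array([350.0, 460.0, 350.0, 430.0, 350.0, 380.0])

# b = (X'X)^{-1} X'y, computed stably via least squares
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # b[0] = intercept; b[1], b[2] = estimated slope coefficients
```

The same call generalizes to any number of regressors: each additional independent variable is simply one more column in X.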

Multiple Regression Equation

(continued)

Two variable model: ŷ = b0 + b1x1 + b2x2

[Figure omitted: fitted regression plane in (x1, x2, y) space]

Standard Multiple Regression Assumptions

• The values xji and the error terms εi are independent

• The error terms are random variables with mean 0 and a constant variance, σ². (The constant variance property is called homoscedasticity.)

E[εi] = 0 and E[εi²] = σ² for i = 1, …, n

Standard Multiple Regression Assumptions

(continued)

• The random error terms, εi, are not correlated with one another, so that

E[εi εj] = 0 for all i ≠ j

• It is not possible to find a set of numbers, c0, c1, …, cK, such that

c0 + c1x1i + c2x2i + … + cKxKi = 0

(This is the property of no linear relation among the xj's)

Example:

2 Independent Variables

• A distributor of frozen dessert pies wants to evaluate factors thought to influence demand

• Dependent variable: Pie sales (units per week)

• Independent variables: Price (in $) and Advertising (in $100s)

• Data are collected for 15 weeks

Pie Sales Example

[Table omitted: 15 weeks of data — Week, Pie Sales (units), Price ($), Advertising ($100s)]

Multiple regression equation:

Sales = b0 + b1(Price) + b2(Advertising)


Estimating a Multiple Linear Regression Equation

• Excel will be used to generate the coefficients and measures of goodness of fit for multiple regression

• Excel:

• Tools I Data Analysis ... I Regression

• PHStat:

• PHStat I Regression I Multiple Regression ...

Multiple Regression Output

Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

ANOVA         df    SS          MS          F         Significance F
Regression     2    29460.027   14730.013   6.53861   0.01201
Residual      12    27033.306    2252.776
Total         14    56493.333

              Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept     306.52619      114.25389         2.68285   0.01993    57.58835   555.46404
Price         -24.97509       10.83213        -2.30565   0.03979   -48.57626    -1.37392
Advertising    74.13096       25.96732         2.85478   0.01449    17.55303   130.70888

The Multiple Regression Equation

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

where:
Sales is the number of pies per week
Price is in $
Advertising is in $100s

b1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising

b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price


Coefficient of Determination, R2

• Reports the proportion of total variation in y explained by all x variables taken together

R² = SSR/SST = (regression sum of squares)/(total sum of squares)

• This is the ratio of the explained variability to total sample variability
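This ratio can be computed directly from the sums of squares reported in the chapter's ANOVA output; a quick sketch:

```python
# Sketch: R^2 = SSR/SST computed from the sums of squares reported
# in the slide's ANOVA table.
SSR = 29460.027          # regression sum of squares
SSE = 27033.306          # residual (error) sum of squares
SST = SSR + SSE          # total sum of squares = 56493.333

r_squared = SSR / SST
print(round(r_squared, 5))  # ≈ 0.52148, matching the Excel output
```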


Coefficient of Determination, R²

(continued)

R² = SSR/SST = 29460.0/56493.3 = 0.52148

52.1% of the variation in pie sales is explained by the variation in price and advertising

Estimation of Error Variance

• Consider the population regression model

yi = β0 + β1x1i + β2x2i + … + βKxKi + εi

• The unbiased estimate of the variance of the errors is

s_e² = Σ ei² / (n - K - 1) = SSE / (n - K - 1)

where ei = yi - ŷi

• The square root of the variance, s_e, is called the standard error of the estimate
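Using the SSE, n, and K from the chapter's pie-sales output, the estimate can be sketched as:

```python
import math

# Sketch: the standard error of the estimate from the slide's values.
SSE = 27033.306          # residual sum of squares
n, K = 15, 2             # sample size, number of independent variables

s2 = SSE / (n - K - 1)   # unbiased estimate of the error variance
s_e = math.sqrt(s2)      # standard error of the estimate
print(round(s_e, 3))     # ≈ 47.463
```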

Standard Error, s_e

s_e = 47.463

The magnitude of this value can be compared to the average y value

Adjusted Coefficient of Determination, R̄²

• R2 never decreases when a new X variable is added to the model, even if the new variable is not an important predictor variable

• This can be a disadvantage when comparing models

• What is the net effect of adding a new variable?

• We lose a degree of freedom when a new X variable is added

• Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?

Adjusted Coefficient of Determination, R̄²

(continued)

• Used to correct for the fact that adding non-relevant independent variables will still reduce the error sum of squares

R̄² = 1 - [SSE/(n - K - 1)] / [SST/(n - 1)]

(where n = sample size, K = number of independent variables)

• Adjusted R2 provides a better comparison between multiple regression models with different numbers of independent variables

• Penalizes excessive use of unimportant independent variables

• Smaller than R2
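A short sketch of the adjusted-R² computation, using the chapter's pie-sales ANOVA values (SSE, SST, n = 15, K = 2):

```python
# Sketch: adjusted R^2 from the ANOVA quantities, penalizing
# the loss of degrees of freedom from extra regressors.
SSE, SST = 27033.306, 56493.333
n, K = 15, 2

adj_r2 = 1 - (SSE / (n - K - 1)) / (SST / (n - 1))
print(round(adj_r2, 5))  # ≈ 0.44172, smaller than R^2 = 0.52148
```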

Adjusted Coefficient of Determination, R̄²

(continued)

R̄² = 0.44172

44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables

Coefficient of Multiple Correlation

• The coefficient of multiple correlation is the correlation between the predicted value and the observed value of the dependent variable

• Is the square root of the multiple coefficient of determination

• Used as another measure of the strength of the linear relationship between the dependent variable and the independent variables

• Comparable to the correlation between Y and X in simple regression

Evaluating Individual Regression Coefficients

• Use t-tests for individual coefficients

• Shows if a specific independent variable is conditionally important

• Hypotheses:

• H0: βj = 0 (no linear relationship)

• H1: βj ≠ 0 (linear relationship does exist between xj and y)


(df = n - K - 1)

Evaluating Individual Regression Coefficients

(continued)

H0: βj = 0 (no linear relationship)

H1: βj ≠ 0 (linear relationship does exist between xj and y)

Test Statistic:

t = bj / s(bj),  with df = n - K - 1
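With the coefficients and standard errors from the chapter's regression output, the test statistics can be sketched as:

```python
# Sketch: t statistics for Price and Advertising, from the
# coefficients and standard errors in the regression output.
b_price, se_price = -24.97509, 10.83213
b_adv, se_adv = 74.13096, 25.96732

t_price = b_price / se_price   # ≈ -2.306
t_adv = b_adv / se_adv         # ≈ 2.855

t_crit = 2.1788                # critical value for df = 12, alpha = .05
print(abs(t_price) > t_crit, abs(t_adv) > t_crit)  # both exceed it: reject H0
```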

Evaluating Individual Regression Coefficients

(continued)

t-value for Price: t = -2.306, with p-value .0398

t-value for Advertising: t = 2.855, with p-value .0145

Example: Evaluating Individual Regression Coefficients

From the Excel output, with df = 12 and α = .05, the critical values are ±2.1788

Decision: Reject H0 for each variable

Conclusion: There is evidence that both Price and Advertising affect pie sales at α = .05


Confidence Interval Estimate for the Slope

Confidence interval limits for the population slope βj:

bj ± t(n-K-1, α/2) · s(bj),  where t has (n - K - 1) d.f.

Example: Form a 95% confidence interval for the effect of changes in price (x1) on pie sales:

-24.975 ± (2.1788)(10.832)

So the interval is -48.576 < β1 < -1.374
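The interval arithmetic can be sketched with the slide's values:

```python
# Sketch: 95% CI for the Price slope, b1 ± t * s(b1).
b1, s_b1 = -24.975, 10.832
t_crit = 2.1788              # t critical value, df = 12, alpha/2 = .025

lower = b1 - t_crit * s_b1
upper = b1 + t_crit * s_b1
print(round(lower, 3), round(upper, 3))  # ≈ -48.576  -1.374
```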

Confidence Interval Estimate for the Slope

(continued)

Confidence interval for the population slope β1:

              Coefficients   Standard Error   Lower 95%   Upper 95%
Intercept     306.52619      114.25389         57.58835   555.46404
Price         -24.97509       10.83213        -48.57626    -1.37392
Advertising    74.13096       25.96732         17.55303   130.70888

Example: Excel output also reports these interval endpoints:

Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each $1 increase in the selling price
13-9

Test on All Coefficients

• F-Test for Overall Significance of the Model: shows if there is a linear relationship between all of the X variables considered together and Y

• Use F test statistic

• Hypotheses:

H0: β1 = β2 = … = βK = 0 (no linear relationship)

H1: at least one βj ≠ 0 (at least one independent variable affects Y)

F-Test for Overall Significance

• Test statistic:

F = MSR / s_e² = [SSR/K] / [SSE/(n - K - 1)]

where F has K (numerator) and (n - K - 1) (denominator) degrees of freedom

• The decision rule is: Reject H0 if F > F(K, n-K-1, α)
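The statistic can be sketched from the chapter's ANOVA sums of squares:

```python
# Sketch: the overall F statistic from the ANOVA sums of squares.
SSR, SSE = 29460.027, 27033.306
n, K = 15, 2

MSR = SSR / K                  # mean square regression
MSE = SSE / (n - K - 1)        # mean square error
F = MSR / MSE
print(round(F, 4))             # ≈ 6.5386, with K = 2 and n - K - 1 = 12 d.f.
```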

F-Test for Overall Significance

(continued)

From the regression output: F = MSR/MSE = 14730.0/2252.8 = 6.53861, with 2 and 12 degrees of freedom; the P-value for the F-test is 0.01201

F-Test for Overall Significance

(continued)

Test Statistic: F = MSR/MSE = 6.5386

Decision: Since the F test statistic is in the rejection region, reject H0

Conclusion: There is evidence that at least one independent variable affects Y
13-10


Tests on a Subset of Regression Coefficients

• Consider a multiple regression model involving variables xj and zj, and the null hypothesis that the z variable coefficients are all zero:

H0: α1 = α2 = … = αr = 0

H1: at least one αj ≠ 0  (j = 1, …, r)

Prediction

• Given a population regression model

yi = β0 + β1x1i + β2x2i + … + βKxKi + εi  (i = 1, 2, …, n)

• then, given a new observation of a data point

(x1,n+1, x2,n+1, …, xK,n+1)

the best linear unbiased forecast of y(n+1) is

ŷ(n+1) = b0 + b1·x1,n+1 + b2·x2,n+1 + … + bK·xK,n+1

It is risky to forecast for new X values outside the range of the data used to estimate the model coefficients, because we do not have data to support that the linear model extends beyond the observed range

Using The Equation to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)
      = 306.526 - 24.975(5.50) + 74.131(3.5)
      = 428.62

Note that Advertising is in $100s, so $350 means that x2 = 3.5

Predicted sales: 428.62 pies
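The forecast arithmetic, as a quick check:

```python
# Sketch: the point forecast for Price = $5.50 and Advertising = $350.
b0, b1, b2 = 306.526, -24.975, 74.131
price = 5.50
advertising = 3.5            # $350, measured in $100s

sales_hat = b0 + b1 * price + b2 * advertising
print(round(sales_hat, 2))   # ≈ 428.62 pies
```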


Residuals in Multiple Regression

Two variable model

[Figure omitted: a residual shown as the vertical distance from a sample observation to the fitted regression plane over (x1, x2)]

Nonlinear Regression Models

• The relationship between the dependent variable and an independent variable may not be linear

• Can review the scatter diagram to check for non-linear relationships

• Example: Quadratic model

y = β0 + β1x1 + β2x1² + ε

• The second independent variable is the square of the first variable

Quadratic Regression Model

Model form:

yi = β0 + β1x1i + β2x1i² + εi

• where:

β0 = Y intercept
β1 = regression coefficient for linear effect of x1 on Y
β2 = regression coefficient for quadratic effect on Y
εi = random error in Y for observation i
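A quadratic model is still linear in its coefficients, so it can be fit by ordinary least squares with x² added as a regressor; a sketch on synthetic, noise-free data generated from known coefficients:

```python
# Sketch: fitting the quadratic model by adding x^2 as a second
# regressor. Data are synthetic, with known true coefficients.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = 2.0 + 1.5 * x + 0.5 * x**2        # beta0 = 2.0, beta1 = 1.5, beta2 = 0.5

# Design matrix: intercept, linear term, quadratic term
X = np.column_stack([np.ones_like(x), x, x**2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(b, 3))  # recovers the true coefficients [2.0, 1.5, 0.5]
```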


Linear vs. Nonlinear Fit

[Figures omitted: scatter plots comparing a linear fit with a nonlinear fit]

Quadratic Regression Model

yi = β0 + β1x1i + β2x1i² + εi

Quadratic models may be considered when the scatter diagram takes on one of the following shapes:

[Figures omitted: four characteristic quadratic curve shapes, labeled by the signs of β1 and β2]

Example: Quadratic Model

[Table omitted: sample data on Filter Time and Purity]

• Purity increases as filter time increases:

[Scatter plot omitted: Purity vs. Time]

Example: Quadratic Model

(continued)

Simple regression results:

ŷ = -11.283 + 5.985 Time

The t statistic, F statistic, and R² are all high, but the residuals are not random:

[Excel output and residual plot omitted: residuals vs. Time show a curved pattern]

Quadratic regression results:

ŷ = 1.539 + 1.565 Time + 0.245 (Time)²

The quadratic term is significant and improves the model: R² is higher and s_e is lower, and the residuals are now random

[Excel output and residual plot omitted: residuals vs. Time show no pattern]

The Log Transformation

The Multiplicative Model:

• Original multiplicative model: Yi = β0 · X1i^β1 · εi

• Transformed multiplicative model: log Yi = log β0 + β1 log X1i + log εi


Interpretation of coefficients

For the multiplicative model:

log Yi = log β0 + β1 log X1i + log εi

• When both dependent and independent variables are logged:

• The coefficient of the independent variable Xk can be interpreted as: a 1 percent change in Xk leads to an estimated bk percent change in the average value of Y

• bk is the elasticity of Y with respect to a change in Xk
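The elasticity interpretation can be illustrated on synthetic data generated from an exact multiplicative relationship (Y = 3·X^0.8, so the true elasticity is 0.8):

```python
# Sketch: a log-log regression recovering the elasticity b1.
# Synthetic data generated from Y = 3 * X^0.8 exactly.
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = 3.0 * x**0.8

# Regress log Y on a constant and log X
X = np.column_stack([np.ones_like(x), np.log(x)])
b, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print(np.round(b[1], 3))  # the slope on log X is the elasticity, 0.8
```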

Multiple Regression Assumptions

Errors (residuals) from the regression model:

Assumptions:

• The errors are normally distributed

• Errors have a constant variance

• The model errors are independent

Chapter Summary

• Developed the multiple regression model

• Tested the significance of the multiple regression model

• Discussed the adjusted coefficient of determination, R̄²

• Tested individual regression coefficients

• Tested portions of the regression model

• Used quadratic terms and log transformations in regression models

• Used dummy variables

• Evaluated interaction effects

• Discussed using residual plots to check model assumptions
