$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$

Slope $= \beta_1$

Impact of $X_1$ on Y is independent of the quantity of $X_2$.

Elasticity $= \beta_1 \cdot [X_1 / Y]$
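As a rough illustration of the linear form, the sketch below evaluates the slope and the elasticity of Y with respect to X1 at a few points. The coefficient values and the helper names `predict` and `elasticity_x1` are hypothetical, chosen only to show that the slope stays fixed while the elasticity changes with the levels of the variables.

```python
# Hypothetical coefficients for Y = b0 + b1*X1 + b2*X2 (illustrative values only).
b0, b1, b2 = 10.0, 2.0, 3.0


def predict(x1, x2):
    """Predicted Y from the linear functional form."""
    return b0 + b1 * x1 + b2 * x2


def elasticity_x1(x1, x2):
    """Elasticity of Y with respect to X1 in the linear form: b1 * X1 / Y."""
    return b1 * x1 / predict(x1, x2)


points = [(1.0, 1.0), (5.0, 1.0), (5.0, 10.0)]
for x1, x2 in points:
    # The slope is b1 at every point; only the elasticity moves.
    print(f"X1={x1}, X2={x2}: slope={b1}, elasticity={elasticity_x1(x1, x2):.3f}")
```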

Double-Log Functional Form

What if you wished to estimate the following

model?

$Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2}$

To make this linear in the parameters:

$\ln Y = \beta_0 + \beta_1 \ln X_1 + \beta_2 \ln X_2 + \varepsilon$

Slope $= \beta_1 = \Delta \ln Y / \Delta \ln X_1 = [\Delta Y / Y] / [\Delta X_1 / X_1]$

What is this? The elasticity, which is constant

across the sample.

What is the slope in a double-log functional form?

Slope $= \beta_1 \cdot (Y / X_1) = [\Delta Y / Y] / [\Delta X_1 / X_1] \cdot (Y / X_1) = \Delta Y / \Delta X_1$

Impact of $X_1$ on Y depends upon the quantity of $X_2$.

In other words, the slope of $X_1$ varies across the sample.
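A minimal sketch of the double-log form, assuming NumPy is available. The parameter values are hypothetical and the data are generated without an error term, so OLS on the logs recovers the exponents essentially exactly. Note that in this sketch the intercept of the log regression equals the log of the multiplicative constant. The last lines compute the slope of X1 at each observation to show that it is not constant.

```python
import numpy as np

# Hypothetical parameters for Y = b0 * X1**b1 * X2**b2 (illustrative values only).
b0, b1, b2 = 2.0, 0.5, 1.5

rng = np.random.default_rng(0)
x1 = rng.uniform(1.0, 10.0, size=50)
x2 = rng.uniform(1.0, 10.0, size=50)
y = b0 * x1**b1 * x2**b2  # noise-free data

# Regress lnY on a constant, lnX1 and lnX2.
X = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
intercept, elasticity_x1, elasticity_x2 = coef

# The slope dY/dX1 = b1 * Y / X1 changes from observation to observation.
slope_x1 = elasticity_x1 * y / x1

print("estimated elasticities:", elasticity_x1, elasticity_x2)
print("slope of X1 ranges from", slope_x1.min(), "to", slope_x1.max())
```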

Why would this be a realistic property?

Other Functional Forms

Semi-log functional form

Polynomial Form

Inverse Form

Know the equation and meaning of $\beta_1$ for each of these forms.

More specifically, know the calculation of slope

and elasticity for each functional form.
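One way to practice these calculations is to write each form as a function and check the analytic slope against a numerical derivative. The sketch below assumes single-regressor versions of each form (two semi-log variants, a quadratic polynomial, and an inverse form) with hypothetical coefficients; the names `FORMS`, `elasticity` and `numeric_slope` are illustrative only.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
B0, B1, B2 = 1.0, 0.5, 0.1

# Each entry maps a form to (Y as a function of X, analytic slope dY/dX).
FORMS = {
    "semi-log, Y on lnX": (lambda x: B0 + B1 * math.log(x),
                           lambda x: B1 / x),
    "semi-log, lnY on X": (lambda x: math.exp(B0 + B1 * x),
                           lambda x: B1 * math.exp(B0 + B1 * x)),
    "polynomial (quadratic)": (lambda x: B0 + B1 * x + B2 * x**2,
                               lambda x: B1 + 2 * B2 * x),
    "inverse": (lambda x: B0 + B1 / x,
                lambda x: -B1 / x**2),
}


def elasticity(name, x):
    """Elasticity = slope * X / Y for the named functional form."""
    f, slope = FORMS[name]
    return slope(x) * x / f(x)


def numeric_slope(f, x, h=1e-6):
    """Central-difference approximation of dY/dX, used as a check."""
    return (f(x + h) - f(x - h)) / (2 * h)


for name, (f, slope) in FORMS.items():
    for x in (2.0, 4.0):
        print(f"{name:24s} X={x}: slope={slope(x):8.4f} "
              f"(numeric {numeric_slope(f, x):8.4f}), "
              f"elasticity={elasticity(name, x):7.4f}")
```

In every form except the linear one, the slope, the elasticity, or both depend on where in the sample they are evaluated.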

Problems with Incorrect Functional Form

You cannot compare $R^2$ between two functional forms whose dependent variables differ (for example, Y versus lnY).

Why? TSS will be different.
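A small sketch of why the total sum of squares changes when the dependent variable is logged. The sample values are hypothetical, and `tss` is an illustrative helper rather than a library function.

```python
import math

# Hypothetical sample of the dependent variable (illustrative values only).
y = [12.0, 15.0, 20.0, 28.0, 40.0, 65.0]


def tss(values):
    """Total sum of squares: squared deviations from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


tss_levels = tss(y)
tss_logs = tss([math.log(v) for v in y])
print(f"TSS of Y:   {tss_levels:.2f}")
print(f"TSS of lnY: {tss_logs:.4f}")
```

Because each model's R-squared is measured against a different TSS, the two values are not on the same scale.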

An incorrect functional form may work within

sample but have large forecast errors outside of

sample.

Violation of Classical Assumption I: The

regression model is linear in the coefficients, is

correctly specified, and has an additive error

term.

Testing for Functional Form

The Quasi-$R^2$

Box-Cox Test

The MacKinnon, White, Davidson Test

(MWD)

Quasi-$R^2$

1. Estimate a logged model and create a set of

LnY^ (predicted logged dependent variable).

2. Transform LnY^ by taking the anti-log. In Excel, EXP is the function needed.

3. Calculate a new RSS by comparing the actual Y values with the results of step 2.

4. Calculate the quasi-$R^2$ as 1 - RSS/TSS, using the RSS from step 3 and the TSS of the unlogged Y.
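The four steps can be sketched in Python as follows, assuming NumPy and a double-log model with a single regressor. The function name `quasi_r2` and the data are hypothetical; this is one plausible reading of the procedure, not a definitive implementation.

```python
import numpy as np


def quasi_r2(y, x):
    """Quasi-R^2 for a double-log model with one regressor (a sketch)."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    # Step 1: estimate the logged model and get the predicted lnY.
    X = np.column_stack([np.ones_like(x), np.log(x)])
    coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    ln_y_hat = X @ coef
    # Step 2: take the anti-log of the predictions.
    y_hat = np.exp(ln_y_hat)
    # Step 3: new RSS, measured in the units of Y rather than lnY.
    rss = np.sum((y - y_hat) ** 2)
    # Step 4: quasi-R^2 uses the TSS of the unlogged Y.
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss


# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
print(quasi_r2(y, x))
```

The resulting value is on the same scale as the R-squared of a model estimated with unlogged Y, which is what makes the comparison meaningful.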

The Box-Cox Test

Calculate the geometric mean of the dependent

variable in the model.

This can easily be calculated in Excel (the GEOMEAN function).

Create a new dependent variable equal to $Y_i$ / Geometric Mean of Y.

Re-estimate both forms of the model, with your

new dependent variable. Compare the Residual

Sum of Squares. Lowest value is the preferred

functional form.
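A minimal sketch of this comparison, assuming NumPy, a single regressor, and that the two forms being compared are the linear and double-log models. In the double-log case the sketch uses the log of the rescaled dependent variable. The helper names `rss_ols` and `box_cox_compare` and the data are hypothetical.

```python
import numpy as np


def rss_ols(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def box_cox_compare(y, x):
    """Return (RSS linear, RSS double-log) after scaling Y by its geometric mean."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    geo_mean = np.exp(np.mean(np.log(y)))  # geometric mean of Y
    y_star = y / geo_mean                  # new dependent variable
    const = np.ones_like(x)
    rss_linear = rss_ols(np.column_stack([const, x]), y_star)
    rss_log = rss_ols(np.column_stack([const, np.log(x)]), np.log(y_star))
    return rss_linear, rss_log


# Hypothetical data generated from a power law, so the log form should win.
x = np.linspace(1.0, 10.0, 20)
y = 3.0 * x**1.8
rss_linear, rss_log = box_cox_compare(y, x)
print("preferred form:", "double-log" if rss_log < rss_linear else "linear")
```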

MWD Test

1. Estimate the linear model and obtain the predicted Y values (call this Yf^).

2. Estimate the double-logged model and obtain the predicted lnY values (call this lnf^).

3. Create $Z_1$ = ln(Yf^) - lnf^

4. Regress Y on Xs and $Z_1$. Reject $H_0$ (Y is a linear function of independent variables) if $Z_1$ is statistically significant by the usual t-test.

5. Create $Z_2$ = antilog of lnf^ - Yf^

6. Regress log of Y on log of Xs and $Z_2$. Reject $H_A$ (double-logged model is best) if $Z_2$ is statistically significant by the usual t-test.
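The six steps can be sketched as follows, assuming NumPy and a single regressor. The helpers `ols` and `mwd_t_stats` are hypothetical names, and the data are simulated from a double-log model with a small multiplicative error. The sketch also assumes every fitted value from the linear model is positive, since step 3 takes its log.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients and t-statistics for y regressed on the columns of X."""
    n, k = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, coef / se


def mwd_t_stats(y, x):
    """Return the t-statistics on Z1 and Z2 for a one-regressor model (a sketch)."""
    const = np.ones_like(x)
    X_lin = np.column_stack([const, x])
    X_log = np.column_stack([const, np.log(x)])
    # Steps 1 and 2: fitted values from the linear and double-log models.
    y_f = X_lin @ ols(X_lin, y)[0]
    ln_f = X_log @ ols(X_log, np.log(y))[0]
    # Steps 3 and 4: add Z1 to the linear model (requires y_f > 0).
    z1 = np.log(y_f) - ln_f
    t_z1 = ols(np.column_stack([X_lin, z1]), y)[1][-1]
    # Steps 5 and 6: add Z2 to the double-log model.
    z2 = np.exp(ln_f) - y_f
    t_z2 = ols(np.column_stack([X_log, z2]), np.log(y))[1][-1]
    return t_z1, t_z2


# Hypothetical data from a double-log model with small multiplicative noise.
rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 40)
y = 5.0 * np.sqrt(x) * np.exp(rng.normal(0.0, 0.01, size=x.size))
t_z1, t_z2 = mwd_t_stats(y, x)
print(f"t on Z1 = {t_z1:.2f}, t on Z2 = {t_z2:.2f}")
```

With data like these, a large t-statistic on Z1 would lead to rejecting the linear form.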

INTERCEPT DUMMIES

What if you thought season of the year

impacted your sales?

Your demand function would include three dummies (why three?) to test the impact of seasons.

This type of dummy variable is called an

intercept dummy, since it changes the constant

term but not the slopes of the other

independent variables.
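A minimal sketch of seasonal intercept dummies, assuming NumPy. The price series, season effects and coefficient values are hypothetical, and sales are built without an error term so the estimates can be checked against the values used to generate them. Winter is treated as the base season; a fourth dummy would be perfectly collinear with the constant term.

```python
import numpy as np

# Hypothetical quarterly data: a season label and a price for each observation.
seasons = np.array(["winter", "spring", "summer", "fall"] * 3)
price = np.array([9.0, 8.5, 8.0, 9.5, 9.2, 8.7, 8.1, 9.4, 9.1, 8.4, 7.9, 9.6])

# Build sales from known season effects so the fit can be checked.
season_effect = {"winter": 0.0, "spring": 5.0, "summer": 12.0, "fall": 3.0}
sales = 100.0 - 4.0 * price + np.array([season_effect[s] for s in seasons])

# Three dummies for four seasons; winter is the omitted base category.
dummies = [(seasons == s).astype(float) for s in ("spring", "summer", "fall")]
X = np.column_stack([np.ones_like(price), price, *dummies])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
intercept, price_slope, spring, summer, fall = coef

# Each dummy shifts the intercept; the price slope is common to all seasons.
print("base intercept:", intercept, "price slope:", price_slope)
print("intercept shifts:", spring, summer, fall)
```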

SLOPE DUMMIES

Interaction Term: an independent variable in a regression that is the product of two or more independent variables.

This can be used to see if a qualitative

condition, which we would analyze with a

dummy, impacts the slope of another

independent variable.
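A short sketch of a slope dummy, assuming NumPy. The variable names and coefficient values are hypothetical, and the data are generated without an error term so the estimated coefficients match the ones used to build Y.

```python
import numpy as np

# Hypothetical data: D is a dummy for a qualitative condition, X is another regressor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 1.5, 2.5, 3.5, 4.5, 5.5])
d = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0])

# Generating model: the slope on X is 2 when D = 0 and 2 + 1.5 when D = 1.
y = 10.0 + 2.0 * x + 4.0 * d + 1.5 * (d * x)

# Columns: constant, X, intercept dummy D, interaction term D*X (the slope dummy).
X = np.column_stack([np.ones_like(x), x, d, d * x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b_x, b_d, b_dx = coef

print("slope when D = 0:", b_x)
print("slope when D = 1:", b_x + b_dx)
```

The coefficient on the interaction term measures how much the slope of X differs when the condition holds.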

CRITERIA FOR CHOOSING A

SPECIFICATION

1. Occam's razor, or the principle of parsimony: the model should be kept as simple as possible.

2. Goodness of fit

3. Theoretical consistency

4. Predictive power: Within sample vs.

Out of sample

IF YOU LEAVE OUT AN

IMPORTANT VARIABLE A

BIAS EXISTS UNLESS

The true coefficient of the omitted variable is zero.

Or, there is zero correlation between the omitted variable(s) and the independent variables in the model.

If these conditions don't hold, omitted variables will bias the coefficients in our model.
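The two conditions can be seen in a small simulation, sketched here with NumPy. All names and parameter values are hypothetical. The sketch relies on an in-sample identity: the slope from the regression that omits X2 equals the slope from the full regression plus the estimated coefficient on X2 times the slope from regressing X2 on X1. If either factor in that product is zero, the two slopes coincide.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(42)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)  # X2 is correlated with X1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

const = np.ones(n)
full = ols(np.column_stack([const, x1, x2]), y)  # correct specification
short = ols(np.column_stack([const, x1]), y)     # X2 omitted
aux = ols(np.column_stack([const, x1]), x2)      # X2 regressed on X1

# Short slope = full slope on X1 + (coefficient on X2) * (auxiliary slope).
implied = full[1] + full[2] * aux[1]
print("slope on X1, full model: ", full[1])
print("slope on X1, X2 omitted: ", short[1])
print("implied by bias formula: ", implied)
```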

WHAT TO DO?

Add the missing variable.

What if you do not know which variable is missing? In other words, what if you suspect something is left out, thus producing strange results, but you do not know what?

IRRELEVANT

VARIABLES

Including an irrelevant variable will:

Increase the standard errors of the estimated coefficients, thus reducing t-stats. (Think back to how standard errors are calculated.)

Reduce adjusted $R^2$.

It does not introduce bias in the estimated coefficients, but does impact our interpretation of what we found.
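A sketch of the standard-error effect, assuming NumPy. The variance of an OLS coefficient is the error variance times the matching diagonal element of the inverse of X'X, so the sketch compares that element for X1 with and without an irrelevant regressor Z that is correlated with X1. The names and values are hypothetical, and Y is not needed because the factor depends only on the regressors.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
x1 = rng.normal(size=n)
z = 0.9 * x1 + 0.3 * rng.normal(size=n)  # irrelevant, but correlated with X1
const = np.ones(n)


def variance_factor(X, col):
    """Diagonal element of (X'X)^-1; Var(b_col) is sigma^2 times this factor."""
    return np.linalg.inv(X.T @ X)[col, col]


without_z = variance_factor(np.column_stack([const, x1]), 1)
with_z = variance_factor(np.column_stack([const, x1, z]), 1)

print("variance factor for X1 without Z:", without_z)
print("variance factor for X1 with Z:   ", with_z)
print("ratio:", with_z / without_z)
```

The factor can never fall when a regressor is added, and it rises more the more closely the added variable tracks X1.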

FOUR IMPORTANT

SPECIFICATION CRITERIA

Theory: Is the variable's place in the equation unambiguous and theoretically sound?

t-Test: Is the variable's estimated coefficient significant in the expected direction?

Adjusted $R^2$: Does the overall fit of the equation improve when the variable is added to the equation?

Bias: Do other variables' coefficients change significantly when the variable is added to the equation?

SPECIFICATION SEARCHES:

OTHER ISSUES

Good idea to rely on theory rather than statistical fit.

Good idea to minimize the number of equations

estimated.

Bad idea to do sequential searches or estimate an undisclosed number of regressions before settling on a final choice.

Sensitivity Analysis: Are your results robust to alternative specifications? If not, maybe you're not finding what you think you are finding.

