Non Linear Models

EC114 Introduction to Quantitative Economics

13. Non-Linear Models
Marcus Chambers
Department of Economics
University of Essex
31 January/01 February 2011
Outline
1. Transformations of Variables
2. Double Logarithm Functions
3. Other Transformations
Reference: R. L. Thomas, Using Statistics in Economics, McGraw-Hill, 2005, sections 10.1-10.4.
Transformations of Variables
So far we have assumed that the relationship between our two variables, X and Y, was linear.
In this case the population regression equation is of the form
$$E(Y) = \alpha + \beta X$$
rather than, for example,
$$E(Y) = \alpha + \beta X^2.$$
As a consequence the sample regression equation we fitted to a scatter of points was always a straight line:
$$\hat{Y} = a + bX$$
rather than, for example,
$$\hat{Y} = a + bX^2.$$
However, many economic relationships are non-linear to a lesser or greater extent, and we need to be able to take this into account.
A very simple way of fitting a curve to a scatter of points is to apply the so-called transformation technique.
Consider the equation:
$$Y^* = \alpha + \beta X^*,$$
where $X^* = X^*(X)$ and $Y^* = Y^*(Y)$ are simple functions (or transformations) of the variables X and Y, respectively.
The choice of transformation depends on the nature of the
data under consideration and can partly be determined by
a scatter diagram plotting Y against X.
Note that the equation $Y^* = \alpha + \beta X^*$ is linear in the transformed variables $Y^*$ and $X^*$ (but not in Y and X).
Hence we can apply regression techniques to the following population regression equation:
$$E(Y^*) = \alpha + \beta X^*.$$
The corresponding sample regression equation is therefore
$$\hat{Y}^* = a + bX^*,$$
where a and b are the OLS estimators of $\alpha$ and $\beta$.

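The formulae determining a and b (and also $R^2$) remain unchanged but are now expressed in terms of $Y^*$ and $X^*$ rather than Y and X.
The relevant formulae are therefore:
$$b = \frac{\sum x_i^* y_i^*}{\sum x_i^{*2}}; \quad a = \bar{Y}^* - b\bar{X}^*; \quad R^2 = \frac{b^2 \sum x_i^{*2}}{\sum y_i^{*2}}.$$
In the above:
$$x_i^* = X_i^* - \bar{X}^*, \quad \bar{X}^* = \frac{\sum X_i^*}{n}, \quad y_i^* = Y_i^* - \bar{Y}^*, \quad \bar{Y}^* = \frac{\sum Y_i^*}{n}.$$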
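The formulae for computing sums involving deviations from means remain valid for the transformed variables:
$$\sum x_i^{*2} = \sum (X_i^* - \bar{X}^*)^2 = \sum X_i^{*2} - \frac{(\sum X_i^*)^2}{n},$$
$$\sum x_i^* y_i^* = \sum (X_i^* - \bar{X}^*)(Y_i^* - \bar{Y}^*) = \sum X_i^* Y_i^* - \frac{\sum X_i^* \sum Y_i^*}{n}.$$
In view of this another common expression for b is:
$$b = \frac{\sum X_i^* Y_i^* - \frac{\sum X_i^* \sum Y_i^*}{n}}{\sum X_i^{*2} - \frac{(\sum X_i^*)^2}{n}}.$$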
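The "sums" versions of the OLS formulae translate almost line for line into code. The following is a minimal sketch in plain Python; the function name `ols_fit` and its return layout are illustrative assumptions rather than part of the course material. It expects observations that have already been transformed and returns a, b and $R^2$.

```python
def ols_fit(xs, ys):
    """Two-variable OLS via the 'sums' formulae (hypothetical helper).

    xs, ys: equal-length sequences of (already transformed) observations.
    Returns the tuple (a, b, r2).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    sum_x = sum(xs)
    sum_y = sum(ys)
    # Sums of squared deviations and cross-products, computed from raw
    # sums rather than by subtracting the means observation by observation.
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum(y * y for y in ys) - sum_y ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b = sxy / sxx
    a = sum_y / n - b * sum_x / n
    r2 = b ** 2 * sxx / syy
    return a, b, r2


# Points lying exactly on Y = 1 + 2X should give a = 1, b = 2, R^2 = 1.
a, b, r2 = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
```

The same function can be reused for any of the transformations: only the inputs change, for example `ols_fit([math.log(x) for x in X], [math.log(y) for y in Y])` for a double-log model.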
Example. The demand for carrots Y (in kilograms) and the
price of carrots X (in pence) are observed in a supermarket
over a period of 30 weeks.
Suppose we wish to estimate the price elasticity of the
demand for carrots and to predict the demand for carrots
when the price is 90p per kg.
The latter task is an out of sample prediction of the type
covered previously.
The sample data are represented in the following Table
(which is Table 10.1 of Thomas):
Demand for, and price of, carrots (Y in kg, X in pence)
 i   Y   X  |   i   Y   X
 1  99  25  |  16  44  70
 2  83  25  |  17  35  71
 3  68  27  |  18  35  66
 4  80  28  |  19  38  61
 5  69  30  |  20  52  58
 6  55  32  |  21  40  56
 7  66  35  |  22  41  52
 8  58  38  |  23  37  48
 9  59  41  |  24  45  51
10  44  43  |  25  36  46
11  55  54  |  26  56  45
12  38  52  |  27  45  40
13  42  60  |  28  48  39
14  43  64  |  29  44  39
15  36  70  |  30  56  41

The scatter diagram suggests it would be unwise to fit a straight line to the data.
Some sort of curve would provide a better fit.
However, suppose we run a linear regression of Y on X; we obtain
$$\hat{Y} = 92.9 - 0.881X, \quad R^2 = 0.61.$$
The regression line slopes downwards (the slope is $-0.881$) and 61% of the variation in Y can be attributed to X.
The out-of-sample prediction, obtained when X = 90, is
$$\hat{Y} = 92.9 - 0.881(90) = 13.6.$$

The scatter diagram suggests that the demand for carrots is likely to be considerably higher than 13.6 kg when the price is 90p.
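The linear regression and its out-of-sample prediction can be reproduced directly from the carrot data. The sketch below is illustrative only: it types in the 30 observations from the table and applies the deviation-form OLS formulae in plain Python, with variable names chosen for this example.

```python
# Table 10.1 of Thomas: weekly demand Y (kg) and price X (pence).
Y = [99, 83, 68, 80, 69, 55, 66, 58, 59, 44, 55, 38, 42, 43, 36,
     44, 35, 35, 38, 52, 40, 41, 37, 45, 36, 56, 45, 48, 44, 56]
X = [25, 25, 27, 28, 30, 32, 35, 38, 41, 43, 54, 52, 60, 64, 70,
     70, 71, 66, 61, 58, 56, 52, 48, 51, 46, 45, 40, 39, 39, 41]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Deviation-form OLS estimates of the slope and intercept.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
sxx = sum((x - x_bar) ** 2 for x in X)
b = sxy / sxx
a = y_bar - b * x_bar

# Out-of-sample prediction at a price of 90p per kg.
y_hat_90 = a + b * 90

print(f"a = {a:.1f}, b = {b:.3f}, prediction at X = 90: {y_hat_90:.1f}")
```

Rounded to the precision used on the slide, the estimates agree with the reported intercept of 92.9, slope of -0.881 and prediction of 13.6 kg.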
Turning to the price elasticity, this is given by
$$\eta = \frac{dY}{dX}\,\frac{X}{Y};$$
(note that Thomas inserts a minus in front of the right-hand side).
We estimate dY/dX by b and calculate the elasticity evaluated at the sample means of X and Y, $\bar{X} = 46.9$ and $\bar{Y} = 51.6$.
The estimated elasticity is thus
$$\hat{\eta} = (-0.881)(46.9/51.6) = -0.801,$$
meaning that a 1% rise in price leads to a fall of 0.8% in the demand for carrots (so the demand for carrots is price inelastic).
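As a quick arithmetic check, the elasticity at the sample means can be computed from the rounded values reported above. The helper `linear_elasticity` is a hypothetical addition for illustration: it shows that, under a linear model, the elasticity changes as we move along the fitted line.

```python
# Reported values from the linear regression (rounded as on the slide).
b = -0.881      # estimated slope dY/dX
x_bar = 46.9    # sample mean price (pence)
y_bar = 51.6    # sample mean demand (kg)

# Point elasticity evaluated at the sample means.
eta_at_means = b * x_bar / y_bar


def linear_elasticity(x, a=92.9, b=-0.881):
    """Elasticity b * X / Y-hat along the fitted line (hypothetical helper)."""
    y_hat = a + b * x
    return b * x / y_hat


print(round(eta_at_means, 3))
```

Evaluating `linear_elasticity` at a low and a high price gives noticeably different values, which is one reason a single elasticity figure from a linear fit should be read with care.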
Neither the out-of-sample prediction nor the elasticity estimate is likely to be very accurate, since the linear model is a poor representation of the data.
It is likely that better estimates can be obtained from a non-linear model that provides a better fit to the data.
There are many relationships that give rise to curves of
varying degrees of non-linearity, and we will consider some
important examples.
Double Logarithm Functions
Consider the nonlinear function:
$$Y = AX^{\beta}, \quad A > 0, \quad A \text{ and } \beta \text{ constant}.$$
The diagram on the next slide presents the possible shapes for this function for different values of the parameter $\beta$, these being:
(a) $\beta > 1$;
(b) $0 < \beta < 1$; and
(c) $\beta < 0$.
Notice how the value of $\beta$ affects the shape of the curve.
It would appear that the function in part (c) might be a suitable candidate for our data.
If we are going to use a function of the form
$$Y = AX^{\beta}$$
then how do we estimate A and $\beta$? Is it possible to use OLS?
Obviously OLS cannot be applied directly to this function because it is not linear in the variables.
However, the function has the convenient property that it can be made linear by taking (natural) logarithms.
We shall need the following rules:
$$\ln(uv) = \ln(u) + \ln(v) \quad \text{and} \quad \ln(p^q) = q\ln(p),$$
which we apply with $u = A$, $v = X^{\beta}$, $p = X$ and $q = \beta$.
Taking logarithms therefore yields
$$\ln(Y) = \ln(A) + \beta\ln(X),$$
which can be written in the form
$$\ln(Y) = \alpha + \beta\ln(X),$$
where $\alpha = \ln(A)$.
Hence ln(Y) is a linear function of ln(X) or
$$Y^* = \alpha + \beta X^*,$$
where $Y^* = \ln(Y)$ and $X^* = \ln(X)$.
Can we use this equation as a basis for estimating $\alpha$ and $\beta$ by OLS?
As it stands, the equation $Y^* = \alpha + \beta X^*$ is deterministic, i.e. non-random, but we can use it to define the population regression equation in the form
$$E(Y^*) = \alpha + \beta X^*.$$
Furthermore, introducing a random disturbance $\epsilon$, $Y^*$ satisfies
$$Y^* = \alpha + \beta X^* + \epsilon.$$
We can apply OLS to this equation and estimate $\alpha$ and $\beta$ by a and b, giving the sample regression equation
$$\hat{Y}^* = a + bX^*.$$
The usual calculations for computing the OLS estimates are therefore performed on the transformed variables $Y^* = \ln(Y)$ and $X^* = \ln(X)$.
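To see the transformation technique at work, it can help to try it on data where the answer is known. The sketch below is an illustration under stated assumptions: the values `A_true = 500` and `beta_true = -0.5` are made up, and the data contain no random disturbance, so OLS on the logged variables should recover the parameters essentially exactly.

```python
import math

# Illustrative parameters (assumptions, not estimates from the carrot data).
A_true, beta_true = 500.0, -0.5
X = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
Y = [A_true * x ** beta_true for x in X]   # exact power law, no disturbance

# Transform both variables, then run ordinary two-variable OLS.
X_star = [math.log(x) for x in X]
Y_star = [math.log(y) for y in Y]

n = len(X_star)
xs_bar = sum(X_star) / n
ys_bar = sum(Y_star) / n
b = (sum((x - xs_bar) * (y - ys_bar) for x, y in zip(X_star, Y_star))
     / sum((x - xs_bar) ** 2 for x in X_star))
a = ys_bar - b * xs_bar

A_hat = math.exp(a)   # alpha = ln(A), so A is recovered as exp(a)
```

With real data the disturbance term means b and `A_hat` are only estimates, but the mechanics are identical.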
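Recall that
$$a = \bar{Y}^* - b\bar{X}^* \quad \text{and} \quad b = \frac{\sum x_i^* y_i^*}{\sum x_i^{*2}}.$$
Table 10.2 of Thomas shows that
$$\sum X_i^* = 114.059, \quad \sum X_i^{*2} = 436.524, \quad \sum X_i^* Y_i^* = 443.056, \quad \sum Y_i^* = 117.096, \quad \sum Y_i^{*2} = 459.289.$$
The sample means are therefore
$$\bar{X}^* = \frac{114.059}{30} = 3.802 \quad \text{and} \quad \bar{Y}^* = \frac{117.096}{30} = 3.903.$$
Furthermore,
$$\sum x_i^{*2} = \sum X_i^{*2} - \frac{(\sum X_i^*)^2}{n} = 436.524 - \frac{(114.059)^2}{30} = 2.875;$$
$$\sum x_i^* y_i^* = \sum X_i^* Y_i^* - \frac{\sum X_i^* \sum Y_i^*}{n} = 443.056 - \frac{114.059 \times 117.096}{30} = -2.140.$$
We find that
$$b = \frac{-2.140}{2.875} = -0.744 \quad \text{and} \quad a = 3.903 - (-0.744)(3.802) = 6.73.$$
The sample regression equation is therefore
$$\hat{Y}^* = 6.73 - 0.744X^*$$
or, substituting for $Y^*$ and $X^*$,
$$\widehat{\ln(Y)} = 6.73 - 0.744\ln(X).$$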
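The arithmetic behind the double-log estimates can be retraced from the five sums quoted from Table 10.2 of Thomas. This is a sketch of the calculation only; the variable names are chosen for this example.

```python
n = 30
sum_x = 114.059     # sum of X* = ln(X)
sum_x2 = 436.524    # sum of X*^2
sum_xy = 443.056    # sum of X* Y*
sum_y = 117.096     # sum of Y* = ln(Y)

x_bar = sum_x / n
y_bar = sum_y / n

sxx = sum_x2 - sum_x ** 2 / n          # sum of squared deviations of X*
sxy = sum_xy - sum_x * sum_y / n       # sum of cross-products of deviations

b = sxy / sxx
a = y_bar - b * x_bar

print(f"sxx = {sxx:.3f}, sxy = {sxy:.3f}, b = {b:.3f}, a = {a:.2f}")
```

Run as written, the cross-product term evaluates to about -2.139 from these rounded sums, marginally different from the -2.140 quoted; the estimates of a and b agree with the slide to the precision shown.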
If we want to convert these estimates back into the original non-linear relationship between X and Y we can take the exponential function of each side of the equation.
The following rules are useful:
$$e^{u+v} = e^u e^v \quad \text{and} \quad e^{p\ln(q)} = q^p,$$
which we apply with $u = 6.73$, $v = -0.744\ln(X)$, $p = -0.744$ and $q = X$.
We therefore obtain
$$\hat{Y} = 837.1X^{-0.744}.$$
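The back-transformation can be checked numerically, and the fitted curve can then be used for the out-of-sample prediction at 90p that motivated the example. The sketch below uses the rounded estimates a = 6.73 and b = -0.744; the function name `predict_demand` is a hypothetical choice for illustration.

```python
import math

a, b = 6.73, -0.744          # estimates from the log-log regression

A_hat = math.exp(a)          # back-transform the intercept: A = e^a


def predict_demand(price):
    """Fitted demand (kg) from Y = A * X^b; hypothetical helper name."""
    return A_hat * price ** b


# Equivalent route: predict ln(Y) first, then exponentiate.
ln_y_90 = a + b * math.log(90)
assert math.isclose(predict_demand(90), math.exp(ln_y_90))

print(round(A_hat, 1), round(predict_demand(90), 1))
```

Under these rounded estimates the predicted demand at 90p comes out at roughly 29 kg, well above the 13.6 kg implied by the straight line. Note that exponentiating a prediction of ln(Y) is the simple approach taken here; it is not the only way to form a prediction of Y from a log model.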
This equation is plotted on the next slide in terms of the
untransformed variables X and Y:

The curve appears to fit the data much better than any straight line could.
The next diagram plots the sample linear regression line in terms of $X^*$ and $Y^*$:

The straight line appears to fit the transformed data much better than any curve could.
The coefficient of determination is equal to
$$R^2 = \frac{b^2 \sum x_i^{*2}}{\sum y_i^{*2}} = \frac{(-0.744)^2 \times 2.875}{2.239} = 0.71,$$
meaning that 71% of the variation in $Y^* = \ln(Y)$ can be attributed to $X^* = \ln(X)$.
Note that the $R^2$ is in terms of the transformed variables.
It cannot, therefore, be compared with the $R^2$ of 0.61 for the regression in the untransformed variables.
This is because one measures the percentage of variation in $Y^*$ attributable to $X^*$ while the other measures the percentage of variation in Y attributable to X, which are different quantities.
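The coefficient of determination for the logged variables can also be retraced from the sums in Table 10.2 of Thomas. The sketch below is illustrative; the name `r2_log` is chosen here as a reminder that the figure refers to variation in ln(Y), not in Y.

```python
n = 30
sum_x, sum_x2 = 114.059, 436.524    # sums of X* and X*^2
sum_y, sum_y2 = 117.096, 459.289    # sums of Y* and Y*^2
sum_xy = 443.056                    # sum of X* Y*

sxx = sum_x2 - sum_x ** 2 / n       # variation in X* = ln(X)
syy = sum_y2 - sum_y ** 2 / n       # variation in Y* = ln(Y)
sxy = sum_xy - sum_x * sum_y / n

b = sxy / sxx
r2_log = b ** 2 * sxx / syy         # share of variation in ln(Y) explained

print(f"syy = {syy:.3f}, R^2 = {r2_log:.2f}")
```

Small differences in the third decimal place of the intermediate sums reflect rounding in the tabulated figures; the resulting $R^2$ rounds to 0.71.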
As for the price elasticity, recall that
$$\eta = \frac{dY}{dX}\,\frac{X}{Y}.$$
Differentiating $Y = AX^{\beta}$ with respect to X gives
$$\frac{dY}{dX} = \beta AX^{\beta-1}$$
and so
$$\eta = \beta AX^{\beta-1}\,\frac{X}{AX^{\beta}} = \beta.$$
Hence our estimate of the price elasticity of demand for carrots is given by $b = -0.744$, which is our estimate of $\beta$.
Note that the elasticity does not depend on the value of X or Y; hence the elasticity is the same for all values of X and Y.
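The constant-elasticity property can be confirmed numerically. The sketch below approximates dY/dX by a central difference on the fitted curve and evaluates the elasticity at three prices; the helper `elasticity` and the step size `h` are assumptions made for this illustration.

```python
A, beta = 837.1, -0.744      # fitted double-log curve Y = A * X^beta


def demand(x):
    return A * x ** beta


def elasticity(f, x, h=1e-6):
    """Numerical (dY/dX)(X/Y) by central difference; hypothetical helper."""
    dy_dx = (f(x + h) - f(x - h)) / (2 * h)
    return dy_dx * x / f(x)


# The elasticity should equal beta at every price.
etas = [elasticity(demand, x) for x in (25, 50, 90)]
```

Each entry of `etas` matches `beta` up to numerical error, in contrast to the linear model, where the elasticity depends on where along the line it is evaluated.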
Other Transformations
A number of other types of transformation are also
commonly employed in regression analysis.
We shall take a look at the properties of three such
transformations:
(a) semi-logarithmic;
(b) exponential; and
(c) reciprocal.
The semi-logarithmic function has the form
$$Y = \alpha + \beta\ln(X)$$
and is depicted for $\beta > 0$ and $\beta < 0$ on the next slide:

The function is linear in Y and ln(X) and so in this case $Y^* = Y$ and $X^* = \ln(X)$.
The elasticity of Y with respect to X can be shown to be
$$\eta = \frac{\beta}{Y}.$$
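The semi-logarithmic elasticity follows in two steps from the definition of the elasticity used earlier. Using the same symbols, the derivation can be written as:

```latex
\begin{align*}
\frac{dY}{dX} &= \frac{d}{dX}\bigl[\alpha + \beta\ln(X)\bigr] = \frac{\beta}{X}, \\
\eta &= \frac{dY}{dX}\,\frac{X}{Y} = \frac{\beta}{X}\,\frac{X}{Y} = \frac{\beta}{Y}.
\end{align*}
```

Unlike the double-log case, this elasticity is not constant: it varies with the level of Y.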
The exponential function is of the form
$$Y = Ae^{\beta X}, \quad A > 0.$$
It can be made linear by taking (natural) logarithms of both sides:
$$\ln(Y) = \ln(A) + \beta X$$
and so $Y^* = \ln(Y)$ and $X^* = X$.
The elasticity is
$$\eta = \beta X.$$
The shape of the function for $\beta > 0$ and $\beta < 0$ is as follows:

Note that the intercept on the Y-axis is equal to A and the slope is positive if $\beta > 0$ and negative if $\beta < 0$.
Finally, the reciprocal function is given by
$$Y = \alpha + \beta\,\frac{1}{X}$$
and is linear in Y and 1/X, so that $Y^* = Y$ and $X^* = 1/X$.
The elasticity is
$$\eta = -\frac{\beta}{XY}.$$
The shape of the function for $\beta > 0$ and $\beta < 0$ is as follows:

Note that the slope is negative if $\beta > 0$ and positive if $\beta < 0$, and the function tends towards $\alpha$ as X increases.
All four transformations we have considered are based on the linear regression
$$Y^* = \alpha + \beta X^*,$$
with the estimated version being
$$\hat{Y}^* = a + bX^*.$$
They can be summarised as follows:
           Double-log   Semi-log    Exponential   Reciprocal
$Y^*$      ln(Y)        Y           ln(Y)         Y
$X^*$      ln(X)        ln(X)       X             1/X
$\eta$     $\beta$      $\beta/Y$   $\beta X$     $-\beta/(XY)$
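The elasticity row of the summary table can be checked against a numerical derivative for each functional form. The sketch below is an illustration under assumed parameter values (`alpha = 2`, `beta = 0.5`, `A = 3`, evaluated at `x0 = 4`); none of these come from the carrot example.

```python
import math

alpha, beta, A = 2.0, 0.5, 3.0   # arbitrary illustrative parameters

# Each entry: (function Y = f(X), closed-form elasticity from the table).
models = {
    "double-log":  (lambda x: A * x ** beta,
                    lambda x, y: beta),
    "semi-log":    (lambda x: alpha + beta * math.log(x),
                    lambda x, y: beta / y),
    "exponential": (lambda x: A * math.exp(beta * x),
                    lambda x, y: beta * x),
    "reciprocal":  (lambda x: alpha + beta / x,
                    lambda x, y: -beta / (x * y)),
}


def numerical_elasticity(f, x, h=1e-6):
    """Central-difference estimate of (dY/dX)(X/Y)."""
    return (f(x + h) - f(x - h)) / (2 * h) * x / f(x)


x0 = 4.0
errors = {}
for name, (f, eta_formula) in models.items():
    # Gap between the numerical elasticity and the tabulated formula.
    errors[name] = abs(numerical_elasticity(f, x0) - eta_formula(x0, f(x0)))
```

All four entries of `errors` are negligibly small, which is consistent with the formulae in the table. Only the double-log form gives an elasticity that does not depend on X or Y.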
Summary
Transformations
Double-log, semi-log, exponential and reciprocal.
Next week:
Properties of estimators and the Classical two-variable
regression model
