
# THE UNIVERSITY OF NEW SOUTH WALES

SCHOOL OF ECONOMICS

ECON 2206

INTRODUCTORY ECONOMETRICS
FINAL EXAMINATION

SESSION 1, 2007

1. TIME ALLOWED - 2 Hours.
2. TOTAL NUMBER OF QUESTIONS - 6.
3. ANSWER ALL QUESTIONS.
4. ALL QUESTIONS ARE OF EQUAL VALUE (The marks awarded to each part of a question are indicated - the total marks for this exam is 60).
5. CANDIDATES MAY BRING THEIR OWN CALCULATORS TO THE EXAM.
6. STATISTICAL TABLES ARE PROVIDED AT THE END OF THE EXAM PAPER.
7. ALL ANSWERS MUST BE WRITTEN IN PEN. PENCILS MAY BE USED ONLY FOR DRAWING, SKETCHING OR GRAPHICAL WORK.

## ANSWER ALL SIX QUESTIONS

REMINDER: When performing statistical tests, always state the null and alternative hypotheses, the test statistic and its distribution under the null hypothesis, the level of significance
and the conclusion of the test.

## Question 1. (10 Marks).

(i) Suppose that the correct population regression model is:
(1.1)

Xl,

(1.2)

In what circumstance will the OLS estimator for model (1.2):

(a) provide an unbiased estimate of the true population parameter $\beta_1$? (2 marks)
(b) provide an estimate of $\beta_1$ that has positive (or upward) bias? (2 marks)

(ii) Outline the advantages of using larger samples of data in regression analysis. (2 marks)

(iii) A model used to analyse the effect of house characteristics on the sale price was:

$$\log(\mathit{price}) = \beta_0 + \beta_1\,\mathit{area} + \beta_2\,\mathit{bdrms} + \beta_3\,\mathit{area} \times \mathit{bdrms} + u$$

where price is the house price, area is the floor area of the house (measured in square metres), and bdrms
is the number of bedrooms. What is the partial effect on log(price) of increasing area by 1 square metre?
(2 marks).

(iv) What is the meaning of the term "contemporaneous exogeneity" as used in the context of time
series data? What is the difference between contemporaneous exogeneity and "strict exogeneity" as used in
multiple regression models for time series data? (2 marks)

## Question 2. (10 Marks in total)

The following regression model explains monthly wages as a function of years of education (educ), years
of labour market experience (exper) and current job tenure (tenure):

$$\log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{educ} + \beta_2\,\mathit{exper} + \beta_3\,\mathit{tenure} + u \qquad (2.1)$$

With a random sample of data the following output was obtained using SHAZAM:

Welcome to SHAZAM - Version 10.0

|_sample 1 722
|_read wage educ exper tenure
4 VARIABLES AND 722 OBSERVATIONS STARTING AT OBS 1
|_genr lnwage=log(wage)
|_* Model estimates
|_ols lnwage educ exper tenure
REQUIRED MEMORY IS PAR= 81 CURRENT PAR= 2000
OLS ESTIMATION
722 OBSERVATIONS DEPENDENT VARIABLE= LNWAGE
...NOTE..SAMPLE RANGE SET TO: 1, 722
R-SQUARE = 0.1551 R-SQUARE ADJUSTED = 0.1524
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.19493
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.44151
SUM OF SQUARED ERRORS-SSE = 139.96
MEAN OF DEPENDENT VARIABLE = 6.7790
LOG OF THE LIKELIHOOD FUNCTION = -438.839

| VARIABLE NAME | ESTIMATED COEFFICIENT | STANDARD ERROR | T-RATIO (718 DF) | P-VALUE | PARTIAL CORR. | STANDARDIZED COEFFICIENT | ELASTICITY AT MEANS |
|---|---|---|---|---|---|---|---|
| EDUC | 0.74864E-01 | 0.6512E-02 | 11.50 | 0.000 | 0.353 | 0.3905 | 0.1487 |
| EXPER | 0.15328E-01 | 0.3370E-02 | 4.549 | 0.000 | 0.147 | 0.1592 | 0.0261 |
| TENURE | 0.13375E-01 | 0.2587E-02 | 5.170 | 0.000 | 0.167 | 0.1612 | 0.0143 |
| CONSTANT | 5.4967 | 0.1105 | 49.73 | 0.000 | 0.852 | 0.0000 | 0.8108 |

(i) What is the interpretation of the coefficient on education, $\beta_1$? (2 marks).

(ii) Calculate the exact percentage effect of another year of education on the predicted wage level. (2 marks).

(iii) Test the null hypothesis that all the slope parameters in the model are jointly equal to zero using a 1
percent significance level. What do you conclude? (3 marks).
Note: The F-test statistic based on $R^2$ is given by the formula:

$$F = \frac{(R^2_{ur} - R^2_{r})/q}{(1 - R^2_{ur})/(n - k - 1)}$$

where q is the number of restrictions, and ur and r stand for unrestricted and restricted models, respectively.

(iv) We are interested in constructing a confidence interval for the (conditional) predicted log(wage) when
educ = 13, exper = 11 and tenure = 7. To obtain the standard error for the prediction we need to estimate
a transformed model that is equivalent to (2.1). Derive the transformed model which will give a direct
estimate of the prediction and the standard error of the prediction. (3 marks).
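
The F-statistic formula given in the note to part (iii) can be evaluated mechanically. The following is a minimal Python sketch and not a model answer; the function name `f_statistic_from_r2` and its argument names are illustrative assumptions, and the inputs are the R-SQUARE value and number of observations reported in the SHAZAM output. It assumes that, under the null hypothesis that all slopes are zero, the restricted model contains only a constant, so its $R^2$ is 0.

```python
def f_statistic_from_r2(r2_ur: float, r2_r: float, n: int, k: int, q: int) -> float:
    """F statistic for q restrictions, computed from R-squared values.

    r2_ur: R-squared of the unrestricted model
    r2_r:  R-squared of the restricted model
    n:     number of observations
    k:     number of slope parameters in the unrestricted model
    q:     number of restrictions being tested
    """
    numerator = (r2_ur - r2_r) / q
    denominator = (1.0 - r2_ur) / (n - k - 1)
    return numerator / denominator


# Inputs read from the SHAZAM output: R-SQUARE = 0.1551, 722 observations,
# three regressors (educ, exper, tenure).
# ASSUMPTION: the restricted model has only a constant, so r2_r = 0 and q = k = 3.
f_stat = f_statistic_from_r2(r2_ur=0.1551, r2_r=0.0, n=722, k=3, q=3)
print(f"F = {f_stat:.2f} with (3, {722 - 3 - 1}) degrees of freedom")
```

The resulting statistic would then be compared with the 1% critical value for 3 numerator degrees of freedom taken from the F table at the end of the paper.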

## Question 3. (10 Marks in total)

We are interested in analysing the effect of different house characteristics on the market price of a house
in Sydney, and consider the following regression model:

$$\log(\mathit{price}) = \beta_0 + \beta_1 \log(\mathit{lotsize}) + \beta_2 \log(\mathit{sqrmtr}) + \beta_3 \log(\mathit{bdrms}) + u \qquad (3.1)$$

where price is the sale price (measured in \$1000), lotsize is land area (square metres), sqrmtr is the floor
area of the house (also measured in square metres), and bdrms is the number of bedrooms. Based on a
sample of data from 2005 house sales in Sydney, the following regression estimates were obtained:

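$$\widehat{\log(\mathit{price})} = \underset{(0.3945)}{0.5481} + \underset{(0.0823)}{0.7013}\,\log(\mathit{sqrmtr}) + \underset{(0.0353)}{0.1745}\,\log(\mathit{lotsize}) + \underset{(0.0932)}{0.0363}\,\log(\mathit{bdrms})$$

$$n = 108, \quad R^2 = 0.551, \quad \bar{R}^2 = 0.538$$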

(i) Construct a 90% confidence interval for $\beta_3$ (the coefficient on log(bdrms)). Is zero within the confidence
interval? (3 marks).

(ii) Given the estimation results, would you conclude that this is a good econometric model? Explain.
(3 marks).

(iii) We are concerned that the model in (3.1) may be misspecified. An alternative model specification where
all the variables are in level form (rather than in log form) is:

$$\mathit{price} = \beta_0 + \beta_1\,\mathit{lotsize} + \beta_2\,\mathit{sqrmtr} + \beta_3\,\mathit{bdrms} + u \qquad (3.2)$$

Outline a procedure for testing whether model (3.1) or model (3.2) is a better specification. What are the
limitations (if any) of the test? Explain. (4 marks)
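
The confidence interval asked for in part (i) has the usual form, estimate plus or minus a critical value times the standard error. The following is a minimal Python sketch under stated assumptions, not a model answer: the function name `confidence_interval` is illustrative, the standard error 0.0932 is assumed to belong to log(bdrms) (it is the last standard error reported under the estimated equation), and the critical value 1.66 is an assumed approximation for 104 degrees of freedom, lying between the Table 1 entries for 90 and 120 degrees of freedom.

```python
def confidence_interval(estimate: float, std_error: float, critical_value: float) -> tuple[float, float]:
    """Return (lower, upper) = estimate -/+ critical_value * std_error."""
    margin = critical_value * std_error
    return estimate - margin, estimate + margin


# Coefficient on log(bdrms) from the estimated version of (3.1).
# ASSUMPTION: 0.0932 is the standard error attached to this coefficient.
# ASSUMPTION: 1.66 approximates the two-tailed 10% critical value for
# n - k - 1 = 108 - 3 - 1 = 104 degrees of freedom
# (Table 1 lists 1.662 for 90 df and 1.658 for 120 df).
lower, upper = confidence_interval(estimate=0.0363, std_error=0.0932, critical_value=1.66)

print(f"90% CI: [{lower:.4f}, {upper:.4f}]")
print("zero inside interval:", lower <= 0.0 <= upper)
```

Keeping the critical value as an explicit argument makes it easy to substitute a different table entry or confidence level.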

## Question 4. (10 Marks in total).

In a recent study an economist examined the factors explaining whether a firm was taken over by another
firm during a given year. The dependent variable in the analysis was Takeover - which is a binary variable
equal to 1 if it was taken over (and 0 otherwise). The explanatory variables were profit which is the firm's
average profit rate over the previous five years, mktval which is the market value of the firm (in \$100m), and
debtearn which is the debt-to-earnings ratio. The table below presents coefficient estimates (and standard
errors) based on a sample of 177 firms in 2004.
Table 4.1. Estimation Results for Takeover Models
Dependent Variable: Takeover

| Variables | Coefficient (standard error) |
|---|---|
| profit | 0.251 (0.068) |
| mktval | -0.930 (0.287) |
| debtearn | -0.364 (0.249) |
| constant | -19.21 (4.839) |
| Observations (n) | 177 |
| $R^2$ | 0.233 |

Note: The usual OLS standard errors are in () beside the coefficient estimates.

(i) What is the interpretation of the coefficient on profit? (2 marks)

(ii) What is the predicted probability of Takeover for a firm with the following characteristics: profit = 0.05,
mktval = 1.5 and debtearn = 6? Briefly explain whether the result is sensible. (2 marks)

(iii) We know the Linear Probability Model must contain "heteroskedasticity". What is heteroskedasticity
and what are the consequences of heteroskedasticity for:
(a) estimation, and
(b) inference with the standard OLS procedures?
(2 marks)

(iv) Given that we know the model contains heteroskedasticity, what advice would you give an economist
wishing to analyse the determinants of Takeover with regression methods? (4 marks)

## Question 5. (10 Marks in total).

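The following regression model was proposed to analyse the effect of the minimum wage on employment:

$$\log(\mathit{emprte}_t) = \beta_0 + \beta_1 \log(\mathit{minwg}_t) + \beta_2 \log(\mathit{minwg}_{t-1}) + \beta_3 \log(\mathit{GNP}_t) + u_t \qquad (5.1)$$

where $\mathit{emprte}_t$ is the employment rate, $\mathit{minwg}_t$ is the minimum wage and $\mathit{GNP}_t$ is GNP (a proxy for labour
demand) in year t.

(i) What is the interpretation of the coefficient $\beta_1$? (2 marks).

(ii) Is this a "static" or "dynamic" model? What is the purpose of including the lagged term $\mathit{minwg}_{t-1}$?
Briefly explain. (2 marks).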

Using annual data from 1950-1987, the following regression model estimates were obtained:

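$$\widehat{\log(\mathit{emprte}_t)} = \underset{(0.77)}{-7.05} - \underset{(0.031)}{0.072}\,\log(\mathit{minwg}_t) - \underset{(0.015)}{0.061}\,\log(\mathit{minwg}_{t-1}) - \underset{(0.089)}{0.012}\,\log(\mathit{GNP}_t) \qquad (5.2)$$

$$n = 38, \quad R^2 = 0.661, \quad \bar{R}^2 = 0.641$$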

(iii) Test the null hypothesis that the lagged term $\mathit{minwg}_{t-1}$ is insignificant using a 10 percent significance
level and the one-sided alternative that the coefficient is negative ($H_0: \beta_2 = 0$, $H_1: \beta_2 < 0$). (2 marks).

(iv) There is not enough information in the results presented in (5.2) to construct a confidence interval for
the Long Run Propensity (LRP). Rewrite the model in (5.1) into a form which will give you a direct estimate
of the LRP (and the standard error on the LRP). What parameter in this transformed model corresponds
to the LRP? (2 marks).

(v) I am concerned that the model in (5.2) may suffer from the "spurious regression" problem. What is the
spurious regression problem and what simple adjustment to the model would help reduce the possibility of
this problem? (2 marks).

## Question 6. (10 Marks in total).

We are interested in analysing the effect of locating a water desalination plant on local property prices.
Desalination plants are large, industrial sites which can generate a lot of noise pollution and reduce amenities
in the local area. The South Australian government built a desalination plant in the Adelaide area of South
Beach in 1998. Discussion about building a desalination plant in South Beach began after 1994, and the
plant was built and began operating in 1998. We have data on the prices of houses sold in South Beach in
1994 (the "before" period) and another sample on houses sold in 2002 (the "after" period). The hypothesis
we wish to test is that the price of houses located near the site of the desalination plant would fall below the
price of more distant houses.
The data for each year includes the dummy variable nearplant which is equal to one if the house is
located within 3 kilometres of the desalination plant. The variable hprice denotes the real house price
(scaled by \$10,000). The following simple regression model was estimated using only the year 2002 sample
of data:
$$\widehat{\mathit{hprice}} = \underset{(0.618)}{21.311} - \underset{(0.992)}{6.198}\,\mathit{nearplant} \qquad (6.1)$$

$$n = 353, \quad R^2 = 0.212$$

Using the 1994 sample, the following regression results were obtained:

$$\widehat{\mathit{hprice}} = \underset{(0.538)}{16.527} - \underset{(0.615)}{3.679}\,\mathit{nearplant} \qquad (6.2)$$

$$n = 182, \quad R^2 = 0.172$$

(i) What is the interpretation of the coefficient on the intercept term in model (6.2) (that is, what does the
value 16.527 represent) ? What is the interpretation of the coefficient on nearplant in model (6.2) ?
(2 marks)

(ii) Can you infer from the estimates in (6.1), based on the year 2002 data, that the location of the plant
caused the price of houses located nearby to fall by an average of \$61,980 ? Explain. (2 marks)

(iii) An alternative approach is to pool the data for both years and estimate the following model:

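$$\widehat{\mathit{hprice}} = \underset{(0.793)}{16.527} + \underset{(0.9471)}{4.7840}\,\mathit{year2} - \underset{(0.876)}{3.679}\,\mathit{nearplant} - \underset{(1.128)}{2.519}\,\mathit{year2} \cdot \mathit{nearplant} \qquad (6.3)$$

$$n = 535, \quad R^2 = 0.202$$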

where year2 is a dummy variable equal to one if the observation is for the year 2002 (and is equal to zero if
the observation is for the year 1994).
What is the estimated effect of the plant on neighbouring house prices based on the "difference-in-difference"
estimator? Is the effect significantly different from 0 at the 5% significance level? (use the one-sided
alternative hypothesis that the coefficient is negative). (3 marks)

(iv) What, if any, would be the advantages of collecting and using panel data to evaluate the effect of the
location of the desalination plant on local property prices? Explain. (3 marks).
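
The link between the separate-year regressions (6.1) and (6.2) and the interaction term in the pooled model (6.3) can be checked numerically. The following is a minimal Python sketch, not a model answer; the function name `did_from_group_means` and the variable names are illustrative assumptions. It also assumes that the standard error 1.128 belongs to the year2·nearplant interaction, which is how the extracted layout of (6.3) reads.

```python
def did_from_group_means(treat_before: float, treat_after: float,
                         control_before: float, control_after: float) -> float:
    """Difference-in-differences: change in the treated-minus-control gap."""
    return (treat_after - control_after) - (treat_before - control_before)


# Average hprice (in units of $10,000) implied by regressions (6.2) and (6.1).
far_1994 = 16.527
near_1994 = 16.527 - 3.679
far_2002 = 21.311
near_2002 = 21.311 - 6.198

did = did_from_group_means(near_1994, near_2002, far_1994, far_2002)

# ASSUMPTION: 1.128 is the standard error of the interaction coefficient in (6.3).
se_interaction = 1.128
t_stat = did / se_interaction

print(f"difference-in-differences estimate: {did:.3f}")
print(f"t statistic: {t_stat:.3f}")
```

The computed difference reproduces the coefficient on the interaction term in (6.3); the t statistic would then be compared with a one-sided 5% critical value from Table 1.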

## Table 1. Critical Values of the t Distribution

Significance Level (1-Tailed / 2-Tailed)

| Degrees of Freedom | 0.10 / 0.20 | 0.05 / 0.10 | 0.025 / 0.05 | 0.01 / 0.02 | 0.005 / 0.01 |
|---|---|---|---|---|---|
| 1 | 3.078 | 6.314 | 12.706 | 31.821 | 63.656 |
| 2 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 |
| 3 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 |
| 4 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 |
| 5 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 |
| 6 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 |
| 7 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
| 8 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
| 9 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
| 10 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |
| 11 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 |
| 12 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 |
| 13 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 |
| 14 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 |
| 15 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 |
| 16 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 |
| 17 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 |
| 18 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 |
| 19 | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 |
| 20 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 |
| 21 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 |
| 22 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 |
| 23 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 |
| 24 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 |
| 25 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 |
| 26 | 1.315 | 1.706 | 2.056 | 2.479 | 2.779 |
| 27 | 1.314 | 1.703 | 2.052 | 2.473 | 2.771 |
| 28 | 1.313 | 1.701 | 2.048 | 2.467 | 2.763 |
| 29 | 1.311 | 1.699 | 2.045 | 2.462 | 2.756 |
| 30 | 1.310 | 1.697 | 2.042 | 2.457 | 2.750 |
| 40 | 1.303 | 1.684 | 2.021 | 2.423 | 2.704 |
| 60 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 |
| 90 | 1.291 | 1.662 | 1.987 | 2.368 | 2.632 |
| 120 | 1.289 | 1.658 | 1.980 | 2.358 | 2.617 |
| ∞ | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 |

## 1% Critical Values of the F Distribution

Columns: Numerator Degrees of Freedom. Rows: Denominator Degrees of Freedom.

| Denominator Degrees of Freedom | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 10.04 | 7.56 | 6.55 | 5.99 | 5.64 | 5.39 | 5.20 | 5.06 | 4.94 | 4.85 |
| 11 | 9.65 | 7.21 | 6.22 | 5.67 | 5.32 | 5.07 | 4.89 | 4.74 | 4.63 | 4.54 |
| 12 | 9.33 | 6.93 | 5.95 | 5.41 | 5.06 | 4.82 | 4.64 | 4.50 | 4.39 | 4.30 |
| 13 | 9.07 | 6.70 | 5.74 | 5.21 | 4.86 | 4.62 | 4.44 | 4.30 | 4.19 | 4.10 |
| 14 | 8.86 | 6.51 | 5.56 | 5.04 | 4.69 | 4.46 | 4.28 | 4.14 | 4.03 | 3.94 |
| 15 | 8.68 | 6.36 | 5.42 | 4.89 | 4.56 | 4.32 | 4.14 | 4.00 | 3.89 | 3.80 |
| 16 | 8.53 | 6.23 | 5.29 | 4.77 | 4.44 | 4.20 | 4.03 | 3.89 | 3.78 | 3.69 |
| 17 | 8.40 | 6.11 | 5.19 | 4.67 | 4.34 | 4.10 | 3.93 | 3.79 | 3.68 | 3.59 |
| 18 | 8.29 | 6.01 | 5.09 | 4.58 | 4.25 | 4.01 | 3.84 | 3.71 | 3.60 | 3.51 |
| 19 | 8.18 | 5.93 | 5.01 | 4.50 | 4.17 | 3.94 | 3.77 | 3.63 | 3.52 | 3.43 |
| 20 | 8.10 | 5.85 | 4.94 | 4.43 | 4.10 | 3.87 | 3.70 | 3.56 | 3.46 | 3.37 |
| 21 | 8.02 | 5.78 | 4.87 | 4.37 | 4.04 | 3.81 | 3.64 | 3.51 | 3.40 | 3.31 |
| 22 | 7.95 | 5.72 | 4.82 | 4.31 | 3.99 | 3.76 | 3.59 | 3.45 | 3.35 | 3.26 |
| 23 | 7.88 | 5.66 | 4.76 | 4.26 | 3.94 | 3.71 | 3.54 | 3.41 | 3.30 | 3.21 |
| 24 | 7.82 | 5.61 | 4.72 | 4.22 | 3.90 | 3.67 | 3.50 | 3.36 | 3.26 | 3.17 |
| 25 | 7.77 | 5.57 | 4.68 | 4.18 | 3.85 | 3.63 | 3.46 | 3.32 | 3.22 | 3.13 |
| 26 | 7.72 | 5.53 | 4.64 | 4.14 | 3.82 | 3.59 | 3.42 | 3.29 | 3.18 | 3.09 |
| 27 | 7.68 | 5.49 | 4.60 | 4.11 | 3.78 | 3.56 | 3.39 | 3.26 | 3.15 | 3.06 |
| 28 | 7.64 | 5.45 | 4.57 | 4.07 | 3.75 | 3.53 | 3.36 | 3.23 | 3.12 | 3.03 |
| 29 | 7.60 | 5.42 | 4.54 | 4.04 | 3.73 | 3.50 | 3.33 | 3.20 | 3.09 | 3.00 |
| 30 | 7.56 | 5.39 | 4.51 | 4.02 | 3.70 | 3.47 | 3.30 | 3.17 | 3.07 | 2.98 |
| 40 | 7.31 | 5.18 | 4.31 | 3.83 | 3.51 | 3.29 | 3.12 | 2.99 | 2.89 | 2.80 |
| 60 | 7.08 | 4.98 | 4.13 | 3.65 | 3.34 | 3.12 | 2.95 | 2.82 | 2.72 | 2.63 |
| 90 | 6.93 | 4.85 | 4.01 | 3.53 | 3.23 | 3.01 | 2.84 | 2.72 | 2.61 | 2.52 |
| 120 | 6.85 | 4.79 | 3.95 | 3.48 | 3.17 | 2.96 | 2.79 | 2.66 | 2.56 | 2.47 |
| ∞ | 6.63 | 4.61 | 3.78 | 3.32 | 3.02 | 2.80 | 2.64 | 2.51 | 2.41 | 2.32 |

Example: The 1% critical value for numerator df=3 and denominator df=60 is 4.13.