
Chapter 1: Introduction

1.1. What is Econometrics?

Econometrics uses sample data on observable variables to learn about economic
relationships -- the functional relationships among economic variables.

Econometrics consists mainly of:

• estimating economic relationships from sample data

• testing hypotheses about how economic variables are related, in particular about:

  • the existence of relationships between variables

  • the direction of the relationship between one economic variable -- the
    dependent or outcome variable -- and its hypothesized observable
    determinants

  • the magnitude of the relationships between a dependent variable and the
    independent variables that are thought to determine it.

Sample data consist of observations on randomly selected members of populations of
economic agents (individual persons, households or families, firms) or other units of
observation (industries, provinces or states, countries).

Example 1

We wish to investigate empirically the determinants of households' food expenditures, in
particular the relationship between households' food expenditures and households'
incomes.

Sample data consist of a random sample of 38 households from the population of all
households. For each household in the random sample, we have observations on three
observable variables:

foodexp = annual food expenditure of household, thousands of dollars per year
income = annual income of household, thousands of dollars per year
hhsize = household size, number of persons in household

. list foodexp income hhsize

       foodexp    income   hhsize
1. 15.998 62.476 1
2. 16.652 82.304 5
3. 21.741 74.679 3
4. 7.431 39.151 3
5. 10.481 64.724 5
6. 13.548 36.786 3
7. 23.256 83.052 4
8. 17.976 86.935 1
9. 14.161 88.233 2
10. 8.825 38.695 2
11. 14.184 73.831 7
12. 19.604 77.122 3
13. 13.728 45.519 2
14. 21.141 82.251 2
15. 17.446 59.862 3
16. 9.629 26.563 3
17. 14.005 61.818 2
18. 9.16 29.682 1
19. 18.831 50.825 5
20. 7.641 71.062 4
21. 13.882 41.99 4
22. 9.67 37.324 3
23. 21.604 86.352 5
24. 10.866 45.506 2
25. 28.98 69.929 6
26. 10.882 61.041 2
27. 18.561 82.469 1
28. 11.629 44.208 2
29. 18.067 49.467 5
30. 14.539 25.905 5
31. 19.192 79.178 5
32. 25.918 75.811 3
33. 28.833 82.718 6
34. 15.869 48.311 4
35. 14.91 42.494 5
36. 9.55 40.573 4
37. 23.066 44.872 6
38. 14.751 27.167 7
. describe
Contains data from foodexp.dta
obs: 38
vars: 3 7 Sep 2000 23:30
size: 608 (99.9% of memory free)
-------------------------------------------------------------------------------
1. foodexp float %9.0g food expenditure, thousands $
per yr
2. income float %9.0g household income, thousands $
per yr
3. hhsize float %9.0g household size, persons per hh
. summarize

 Variable |     Obs        Mean   Std. Dev.       Min        Max
----------+-----------------------------------------------------
  foodexp |      38    15.95282    5.624341      7.431      28.98
   income |      38    58.44434    19.93156     25.905     88.233
   hhsize |      38    3.578947    1.702646          1          7

Question: What relationship generated these sample data? What is the data
generating process?

Answer: We postulate that each population value of foodexp, denoted foodexp_i,
is generated by a relationship of the form:

$$\text{foodexp}_i = f(\text{income}_i, \text{hhsize}_i, u_i) \qquad \text{(the population regression equation)}$$

where

foodexp_i = the dependent or outcome variable we are trying to explain
= the annual food expenditure of household i (thousands of $ per year)

income_i = an independent or explanatory variable that we think might explain
the dependent variable foodexp_i
= the annual income of household i (thousands of $ per year)

hhsize_i = a second independent or explanatory variable that we think might
explain the dependent variable foodexp_i
= household size, measured by the number of persons in the household

f(income_i, hhsize_i) = a population regression function representing the
systematic relationship of foodexp_i to the independent or explanatory
variables income_i and hhsize_i

u_i = an unobservable random error term representing all unknown and
unmeasured variables that determine the individual population values of
foodexp_i
Question: What mathematical form does the population regression function
f(income_i, hhsize_i) take?

Answer: We hypothesize that the population regression function -- or PRF -- is a
linear function:

$$f(\text{income}_i, \text{hhsize}_i) = \beta_0 + \beta_1\,\text{income}_i + \beta_2\,\text{hhsize}_i$$

Implication: The population regression equation -- the PRE -- is therefore

$$\text{foodexp}_i = f(\text{income}_i, \text{hhsize}_i) + u_i = \beta_0 + \beta_1\,\text{income}_i + \beta_2\,\text{hhsize}_i + u_i$$

• Observable Variables:

foodexp_i = the value of the dependent variable foodexp for the i-th household
income_i = the value of the independent variable income for the i-th household
hhsize_i = the value of the independent variable hhsize for the i-th household

• Unobservable Variable:

u_i = the value of the random error term for the i-th household in the population

• Unknown Parameters: the regression coefficients β0, β1 and β2

β0 = the intercept coefficient
β1 = the slope coefficient on income_i
β2 = the slope coefficient on hhsize_i

The population values of the regression coefficients β0, β1 and β2 are
unknown.
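To make the data generating process concrete, here is a minimal simulation sketch in
Python (assuming numpy is available). The coefficient values, the error standard
deviation, and the variable ranges are illustrative assumptions, not quantities taken
from the food expenditure data.

import numpy as np

rng = np.random.default_rng(42)
n = 38                                   # same sample size as Example 1
b0, b1, b2 = 4.0, 0.15, 0.8              # hypothetical values of beta_0, beta_1, beta_2
income = rng.uniform(25, 90, size=n)     # thousands of $ per year
hhsize = rng.integers(1, 8, size=n)      # persons per household (1 to 7)
u = rng.normal(0.0, 3.0, size=n)         # unobservable random error term u_i
# The PRE: foodexp_i = beta_0 + beta_1*income_i + beta_2*hhsize_i + u_i
foodexp = b0 + b1 * income + b2 * hhsize + u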

The Four Elements of Econometrics

Data

Collecting and coding the sample data, the raw material of econometrics.

Most economic data are observational, or non-experimental, data (as distinct from
experimental data generated under controlled experimental conditions).

Specification

Specification of the econometric model that we think (hope) generated the sample
data -- that is, specification of the data generating process (or DGP).

An econometric model consists of two components:

1. An economic model: specifies the dependent or outcome variable to be explained
and the independent or explanatory variables that we think are related to the
dependent variable of interest.
• Often suggested or derived from economic theory.
• Sometimes obtained from informal intuition and observation.

2. A statistical model: specifies the statistical elements of the relationship under
investigation, in particular the statistical properties of the random variables in
the relationship.

Estimation

Consists of using the assembled sample data on the observable variables in the
model to compute estimates of the numerical values of all the unknown
parameters in the model.

Inference

Consists of using the parameter estimates computed from sample data to test
hypotheses about the numerical values of the unknown population parameters
that describe the behaviour of the population from which the sample was selected.
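As an illustration of the estimation step (a sketch under assumed values, not the
chapter's actual empirical analysis), the following Python code simulates one sample
from a PRE like the one in Example 1 and computes least-squares estimates of the
unknown coefficients:

import numpy as np

rng = np.random.default_rng(0)
n = 38
b0, b1, b2 = 4.0, 0.15, 0.8                        # assumed "true" population values
income = rng.uniform(25, 90, size=n)
hhsize = rng.integers(1, 8, size=n).astype(float)
foodexp = b0 + b1 * income + b2 * hhsize + rng.normal(0.0, 3.0, size=n)

X = np.column_stack([np.ones(n), income, hhsize])  # regressors, including intercept
bhat, *_ = np.linalg.lstsq(X, foodexp, rcond=None) # least-squares estimates
print(bhat)                                        # close to (4.0, 0.15, 0.8)

Inference about the βj would additionally require the sampling distribution of these
estimates, which is where the statistical model comes in.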

Scientific Method

The collection of principles and processes necessary for scientific investigation,
including:

1. rules for concept formation

2. rules for conducting observations and experiments

3. rules for validating hypotheses by observations or experiments

Econometrics is that branch of economics -- the dismal science -- which is concerned
with items 2 and 3 in the above list.

Regression analysis has two fundamental tasks:

1. Estimation: computing from sample data reliable estimates of the numerical
values of the regression coefficients βj (j = 0, 1, ..., K), and hence of the
population regression function.

2. Inference: using sample estimates of the regression coefficients βj (j = 0, 1, ...,
K) to test hypotheses about the population values of the unknown regression
coefficients -- i.e., to infer from sample estimates the true population values of
the regression coefficients within specified margins of statistical error.

Estimator

Assume that we have a sample (x1, x2, ..., xn) from a given population. All parameters of
the population are known except one parameter, θ. We want to determine the unknown
parameter θ from the given observations. In other words, we want to find a number, or a
range of numbers, from the observations that can be taken as the value of θ.

An estimator is a method of estimation.

An estimate is the result of applying an estimator to a sample.

Point estimation, as the name suggests, is the estimation of a population parameter
by a single number.

The problem of statistics is not to find estimates but to find estimators. An estimator is
not rejected because it gives one bad result for one sample. It is rejected when it gives
bad results in the long run, i.e. when it gives bad results for many, many samples. An
estimator is accepted or rejected depending on its sampling properties: it is judged by
the properties of the distribution of the estimates it gives rise to.

Properties of estimator

Since an estimator gives rise to an estimate that depends on the sample points
(x1, x2, ..., xn), an estimate is a function of the sample points. The sample points are
random variables, therefore the estimate is also a random variable and has a probability
distribution. We want an estimator to have several desirable properties, such as:

1. Consistency
2. Unbiasedness
3. Minimum variance

In general it is not possible for an estimator to have all of these properties.

Note that an estimator is a sample statistic, i.e. a function of the sample elements.

Properties of estimator: Consistency

For many estimators the variance of the sampling distribution decreases as the
sample size increases. We would like an estimator to stay as close as possible to the
parameter it estimates as the sample size increases.

Suppose we want to estimate θ and t_n is an estimator. If t_n tends to θ in probability as
n increases, then the estimator is called consistent. That is, for any given ε > 0 and η > 0
there is an integer n0 such that for all sample sizes n > n0 the following condition is
satisfied:

$$P(|t_n - \theta| < \varepsilon) > 1 - \eta$$

The property of consistency is a limiting property. It does not require any particular
behaviour of the estimator for a finite sample size.

If there is one consistent estimator, then you can construct infinitely many others. For
example, if t_n is consistent, then (n/(n−1))·t_n is also consistent.

Example: (1/n)Σxi and (1/(n−1))Σxi are both consistent estimators for the population mean.
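A quick simulation sketch (all values illustrative) shows consistency at work: as n
grows, both estimators concentrate around the population mean.

import numpy as np

rng = np.random.default_rng(1)
mu = 5.0                             # the population mean being estimated
for n in (10, 1_000, 100_000):
    x = rng.normal(mu, 2.0, size=n)
    t_n = x.mean()                   # (1/n) * sum(x_i)
    alt = n / (n - 1) * t_n          # equals (1/(n-1)) * sum(x_i)
    print(n, t_n, alt)               # both approach mu = 5.0 as n grows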

Properties of estimator: Unbiasedness

If an estimator t_n estimates θ, then the difference between them, (t_n − θ), is called the
estimation error. The bias of the estimator is defined as the expectation value of this
difference:

$$B(t_n) = E(t_n - \theta) = E(t_n) - \theta$$

If the bias is equal to zero, then the estimator is called unbiased. For example, the
sample mean is an unbiased estimator of the population mean:

$$E(\bar{x}-\mu)=E\left(\frac{1}{n}\sum_{i=1}^{n}x_i-\mu\right)=\frac{1}{n}\sum_{i=1}^{n}E(x_i)-\mu=\mu-\mu=0$$
Here we used the fact that expectation and summation can be interchanged (remember that
expectation is integration for continuous random variables and summation for discrete
random variables) and that the expectation of each sample point is equal to the population
mean μ.

Knowledge of the population distribution was not necessary for this derivation of the
unbiasedness of the sample mean. The result holds for samples taken from a population
with any distribution for which the first moment exists.
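The following sketch approximates E(x̄) by averaging the sample mean over many repeated
samples; a non-normal population is used deliberately, since the result does not depend
on the population distribution (all parameter values are illustrative).

import numpy as np

rng = np.random.default_rng(2)
mu, n, reps = 3.0, 20, 100_000
samples = rng.exponential(mu, size=(reps, n))  # skewed population with mean mu
xbars = samples.mean(axis=1)                   # one sample mean per replication
print(xbars.mean())                            # approximately mu = 3.0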

Example of biased estimator: Sample variance

Given a sample of size n from a population with unknown mean (μ) and variance (σ²), we
estimate the mean as we already know and, intuitively, estimate the variance as:

$$t_n=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2=\frac{1}{n}\sum_{i=1}^{n}x_i^2-\bar{x}^2$$

What is the bias of this estimator? We could derive the distribution of t_n and then use
it to find its expectation value: if the population has a normal distribution, t_n is a
multiple of a χ² random variable with n − 1 degrees of freedom. Let us instead use a
direct approach:

$$E(t_n)=\frac{1}{n}\sum_{i=1}^{n}E(x_i^2)-E\Big(\Big(\frac{1}{n}\sum_{i=1}^{n}x_i\Big)^{2}\Big)=E(x^2)-\frac{1}{n^2}E\Big(\sum_{i,j=1}^{n}x_i x_j\Big)=E(x^2)-\frac{1}{n^2}\Big(E\Big(\sum_{i=1}^{n}x_i^2\Big)+E\Big(\sum_{i\neq j}x_i x_j\Big)\Big)$$

$$=E(x^2)-\frac{1}{n^2}\big(n\,E(x^2)+n(n-1)\,E(x)^2\big)=\frac{n-1}{n}E(x^2)-\frac{n-1}{n}E(x)^2=\frac{n-1}{n}\big(E(x^2)-E(x)^2\big)=\frac{n-1}{n}\,\sigma^2$$
The estimator t_n is therefore not an unbiased estimator of the population variance.
That is why, when the mean and variance are both unknown, the following formula is
used for the sample variance:

$$s^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$$
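A simulation sketch of the bias (illustrative parameters): over many repeated samples
the 1/n estimator averages about ((n−1)/n)σ², as derived above, while the 1/(n−1)
estimator averages about σ².

import numpy as np

rng = np.random.default_rng(3)
sigma2, n, reps = 4.0, 10, 200_000
x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
t_n = x.var(axis=1, ddof=0)   # (1/n)     * sum (x_i - xbar)^2, biased
s2 = x.var(axis=1, ddof=1)    # (1/(n-1)) * sum (x_i - xbar)^2, unbiased
print(t_n.mean())             # about (n-1)/n * sigma2 = 3.6
print(s2.mean())              # about sigma2 = 4.0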

Properties of estimator: Mean square error and bias

The expectation value of the square of the difference between an estimator and the
expectation of the estimator is called its variance:

$$V(t_n)=E\big(t_n-E(t_n)\big)^2$$

Exercise: What is the variance of the sample mean?

As we noted, if the estimator for θ is t_n, then the difference between them is the error
of the estimation. The expectation value of this error is the bias. The expectation value
of the square of this error is called the mean square error (m.s.e.):

$$M(t_n)=E(t_n-\theta)^2$$

It can be expressed in terms of the bias and the variance of the estimator (the cross
term vanishes because E(t_n − E(t_n)) = 0 and E(t_n) − θ is a constant):

$$M(t_n)=E(t_n-\theta)^2=E\big(t_n-E(t_n)+E(t_n)-\theta\big)^2=E\big(t_n-E(t_n)\big)^2+\big(E(t_n)-\theta\big)^2=V(t_n)+B^2(t_n)$$

The m.s.e. is equal to the variance of the estimator plus the square of its bias. If the
bias is 0, then the m.s.e. is equal to the variance. In estimation there is usually a
trade-off between unbiasedness and minimum variance. In an ideal world we would like a
minimum variance unbiased estimator; that is not always possible.
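A numeric sketch checking the decomposition M = V + B² for the biased 1/n variance
estimator (parameters are illustrative):

import numpy as np

rng = np.random.default_rng(4)
sigma2, n, reps = 4.0, 10, 200_000
t_n = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n)).var(axis=1, ddof=0)
mse = ((t_n - sigma2) ** 2).mean()   # M = E(t_n - theta)^2
var = t_n.var()                      # V(t_n)
bias = t_n.mean() - sigma2           # B(t_n)
print(mse, var + bias ** 2)          # the two agree up to simulation noise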

10
Intuitive estimators: plug-in

One intuitive estimator is the plug-in estimator. It has a purely intuitive basis. If the
parameter we want to estimate can be expressed as θ = t(F), then the estimator is taken
as θ̂ = t(F̂), where F is the population distribution and F̂ is its sample equivalent.

Example: the population mean is calculated as

$$\mu=\int x f(x)\,dx$$

Since the sample is drawn from the population with density f(x), the sample mean is the
plug-in estimator of the population mean.

Exercise: What is the plug-in estimator for the population variance? What is the plug-in
estimator for the covariance? Hint: the population variance and covariance are calculated
as

$$\sigma^2=\int (x-\mu)^2 f(x)\,dx \qquad\text{and}\qquad \operatorname{cov}(X,Y)=\iint (x-\mu_x)(y-\mu_y)\,f(x,y)\,dx\,dy$$

Replace the integration with summation and divide by the number of elements in the
sample. Since the sample was drawn from the population with the given distribution, it is
not necessary to weight by f(x).
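A sketch of the plug-in idea for this exercise (the data are simulated purely for
illustration): replace each population integral by the corresponding sample average.

import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(2.0, 1.5, size=500)
y = 0.5 * x + rng.normal(0.0, 1.0, size=500)

mean_hat = x.mean()                                  # plug-in for the mean
var_hat = ((x - x.mean()) ** 2).mean()               # plug-in for the variance (1/n)
cov_hat = ((x - x.mean()) * (y - y.mean())).mean()   # plug-in for the covariance
print(mean_hat, var_hat, cov_hat)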

Least-squares estimator

Another well-known and popular estimator is the least-squares estimator. If we have a
sample and we believe (because of some prior knowledge) that all parameters of interest
enter through the mean value of the population, then the least-squares method estimates
them by minimising the sum of squared differences between the observations and the mean
value:

$$\sum_{i=1}^{n} w_i\,\big(x_i-\mu(\theta)\big)^2 \;\to\; \min$$

Exercise: Verify that if the only unknown parameter is the mean of the population and all
w_i are equal to each other, then the least-squares estimator results in the sample mean.
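A numerical sketch of this exercise (illustrative data): minimising the sum of squared
deviations over a grid of candidate values for the mean picks out the sample mean.

import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(10.0, 2.0, size=50)

grid = np.linspace(x.min(), x.max(), 100_001)            # candidate values of mu
ssq = ((x[:, None] - grid[None, :]) ** 2).sum(axis=0)    # sum of squared deviations
print(grid[ssq.argmin()], x.mean())                      # agree to grid precision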

Interval estimation

A point estimate of the parameter is not sufficient by itself. It is necessary to analyse
how confident we can be about that particular estimate. One way of doing this is to
define confidence intervals. Having estimated θ, we want to know whether the "true"
parameter is close to our estimate. In other words, we want to find an interval that
satisfies the following relation:

$$P(G_L \le \theta \le G_U) \ge 1-\alpha$$

That is, the probability that the "true" parameter θ is in the interval (G_L, G_U) is at
least 1 − α. The actual realisation of this interval, (g_L, g_U), is called a 100(1 − α)%
confidence interval, the limits of the interval are called the lower and upper confidence
limits, and 1 − α is called the confidence level.

Example: If the population variance (σ²) is known and we estimate the population mean,
then

$$Z=\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \;\sim\; N(0,1)$$

From the tables of the standard normal distribution we find that the probability that Z
is greater than 1 equals 0.1587, and the probability that Z is less than −1 is again
0.1587.

Now we can find a confidence interval for the population mean. Since

$$P(-1\le Z\le 1)=1-P(Z>1)-P(Z<-1)=1-2\times 0.1587=0.6826$$

then for μ we can write

$$P\Big(-1\le\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}\le 1\Big)=P\big(\bar{x}-\sigma/\sqrt{n}\;\le\;\mu\;\le\;\bar{x}+\sigma/\sqrt{n}\big)=0.6826$$

The confidence level that the "true" value is within 1 standard error (the standard
deviation of the sampling distribution) of the sample mean is 0.6826. The probability
that the "true" value is within 2 standard errors of the sample mean is 0.9545.
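A simulation sketch (illustrative parameters) confirming the coverage of these
known-variance intervals empirically:

import numpy as np

rng = np.random.default_rng(7)
mu, sigma, n, reps = 0.0, 1.0, 25, 100_000
xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
se = sigma / np.sqrt(n)                      # known-variance standard error
print((np.abs(xbar - mu) < 1 * se).mean())   # about 0.6826
print((np.abs(xbar - mu) < 2 * se).mean())   # about 0.9545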

What we did here was to find the sampling distribution and use it to define confidence
intervals. Here we used a two-sided, symmetric interval. Intervals don't have to be
two-sided or symmetric, and under some circumstances non-symmetric intervals might be
better. For example, it might be better to diagnose a patient as needing a particular
treatment than not: if the doctor makes an error and does not treat the patient, the
patient might die; but if the doctor makes a mistake and starts to treat the patient, the
treatment can be stopped and the mistake corrected at some later time.

Above we considered the case where the population variance is known in advance. That is
rarely the case in real life. When both the population mean and variance are unknown, we
can still find confidence intervals. In this case we calculate the sample mean and sample
variance and then consider the distribution of the statistic:

$$Z=\frac{\bar{x}-\mu}{s/\sqrt{n}}$$

Here s² is the sample variance.

Since it is the ratio of a standard normal random variable to the square root of a χ²
random variable with n − 1 degrees of freedom divided by its degrees of freedom, Z has
Student's t distribution with n − 1 degrees of freedom. In this case we can use the table
of the t distribution to find confidence levels.

It is not surprising that when we do not know the population variance, confidence
intervals for the same confidence level become larger. That is the price we pay for what
we do not know.

If the number of degrees of freedom becomes large, then the t distribution is well
approximated by the normal distribution. For n > 100 we can use the normal distribution
to find confidence levels and intervals.
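A minimal sketch of the unknown-variance interval (assuming scipy is available for the t
quantile; all numerical values are illustrative):

import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.normal(5.0, 2.0, size=15)        # one sample; mu and sigma both unknown
n, xbar, s = len(x), x.mean(), x.std(ddof=1)
tcrit = stats.t.ppf(0.975, df=n - 1)     # 95% two-sided critical value, n-1 d.f.
lo = xbar - tcrit * s / np.sqrt(n)
hi = xbar + tcrit * s / np.sqrt(n)
print(lo, hi)                            # 95% confidence interval for mu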
