
Answer 1.

Literally interpreted, econometrics means "economic measurement." Although
measurement is an important part of econometrics, the scope of econometrics is much
broader, as can be seen from the following quotations:
Econometrics, the result of a certain outlook on the role of economics, consists of the
application of mathematical statistics to economic data to lend empirical support to the
models constructed by mathematical economics and to obtain numerical results.
Econometrics may be defined as the quantitative analysis of actual economic
phenomena based on the concurrent development of theory and observation, related by
appropriate methods of inference.
Econometrics may be defined as the social science in which the tools of economic
theory, mathematics, and statistical inference are applied to the analysis of economic
phenomena.
Econometrics is concerned with the empirical determination of economic laws.
Although there are several schools of thought on econometric methodology, we
present here the traditional or classical methodology, which still dominates empirical
research in economics and other social and behavioral sciences.
Econometrics is the application of mathematics, statistical methods, and computer
science to economic data and is described as the branch of economics that aims to
give empirical content to economic relations. More precisely, it is "the quantitative
analysis of actual economic phenomena based on the concurrent development of
theory and observation, related by appropriate methods of inference."
Broadly speaking, traditional econometric methodology proceeds along the following lines:
1. Statement of theory or hypothesis.
2. Specification of the mathematical model of the theory.
3. Specification of the statistical, or econometric, model.
4. Obtaining the data.
5. Estimation of the parameters of the econometric model.
6. Hypothesis testing.
7. Forecasting or prediction.
8. Using the model for control or policy purposes.
Answer 2.
In regression analysis we are concerned with what is known as the statistical, not
functional or deterministic, dependence among variables, such as those of classical
physics. In statistical relationships among variables we essentially deal with random or
stochastic variables, that is, variables that have probability distributions. In functional
or deterministic dependency, on the other hand, we also deal with variables, but these
variables are not random or stochastic. The dependence of crop yield on temperature,
rainfall, sunshine, and fertilizer, for example, is statistical in nature in the sense that
the explanatory variables, although certainly important, will not enable the agronomist
to predict crop yield exactly because of errors involved in measuring these variables as
well as a host of other factors (variables) that collectively affect the yield but may be
difficult to identify individually. Thus, there is bound to be some intrinsic or random
variability in the dependent-variable crop yield that cannot be fully explained no
matter how many explanatory variables we consider.
In deterministic phenomena, on the other hand, we deal with relationships of the type,
say, exhibited by Newton's law of gravity, which states: Every particle in the universe
attracts every other particle with a force directly proportional to the product of their
masses and inversely proportional to the square of the distance between them.
Symbolically, $F = k(m_1 m_2 / r^2)$, where $F$ = force, $m_1$ and $m_2$ are the masses of the two
particles, $r$ = distance, and $k$ = constant of proportionality. Another example is Ohm's
law, which states: For metallic conductors over a limited range of temperature the
current $C$ is proportional to the voltage $V$; that is, $C = (1/k)V$, where $1/k$ is the
constant of proportionality. Other examples of such deterministic relationships are
Boyle's gas law, Kirchhoff's law of electricity, and Newton's law of motion. In this
text we are not concerned with such deterministic relationships. Of course, if there are
errors of measurement, say, in the $k$ of Newton's law of gravity, the otherwise
deterministic relationship becomes a statistical relationship. In this situation, force can
be predicted only approximately from the given value of $k$ (and $m_1$, $m_2$, and $r$), which
contains errors. The variable $F$ in this case becomes a random variable.
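The contrast between the two kinds of dependence can be sketched in a few lines of Python. This is only an illustration: the function names, the coefficients in the crop-yield equation, and the noise level are all hypothetical choices, not values from the text.

```python
import random


def ohm_current(voltage, k=2.0):
    """Deterministic relationship C = (1/k) V: same input, same output."""
    return voltage / k


def crop_yield(rainfall, fertilizer, rng, noise_sd=1.0):
    """Statistical relationship: a systematic part plus a random disturbance.

    The coefficients 10.0, 0.5 and 0.8 are made up for illustration.
    """
    systematic = 10.0 + 0.5 * rainfall + 0.8 * fertilizer
    return systematic + rng.gauss(0.0, noise_sd)


rng = random.Random(0)  # seeded so the run is reproducible

# Three "measurements" at identical inputs.
currents = [ohm_current(12.0) for _ in range(3)]
yields = [crop_yield(30.0, 5.0, rng) for _ in range(3)]

print(currents)  # identical values every time
print(yields)    # values scatter around the systematic part
```

The deterministic function returns exactly the same current for the same voltage, whereas the yield differs from draw to draw even though rainfall and fertilizer are held fixed.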

Answer 3.
Endogenous variable: A factor in a causal model or causal system whose value is
determined by the states of other variables in the system; contrasted with
an exogenous variable. Related but non-equivalent distinctions are those between
dependent and independent variables and between explanandum and explanans. A
factor can be classified as endogenous or exogenous only relative to a specification of
a model representing the causal relationships producing the outcome y among a set of
causal factors X = (x1, x2, ..., xk), with y = M(X). A variable xj is said to be endogenous within
the causal model M if its value is determined or influenced by one or more of the
independent variables X (excluding itself). A purely endogenous variable is a factor
that is entirely determined by the states of other variables in the system. (If a factor is
purely endogenous, then in theory we could replace the occurrence of this factor with
the functional form representing the composition of xj as a function of X.) In real
causal systems, however, there can be a range of endogeneity. Some factors are
causally influenced by factors within the system but also by factors not included in the
model. So a given factor may be partially endogenous and partially exogenous, that is,
partially but not wholly determined by the values of other variables in the model.
Consider a simple causal system: farming. The outcome we are interested in
explaining (the dependent variable or the explanandum) is crop output. Many factors
(independent variables, explanans) influence crop output: labor, farmer skill,
availability of seed varieties, availability of credit, climate, weather, soil quality and
type, irrigation, pests, temperature, pesticides and fertilizers, animal practices, and
availability of traction. These variables are all causally relevant to crop yield, in a
specifiable sense: if we alter the levels of these variables over a series of tests, the
level of crop yield will vary as well (up or down). These factors have real causal
influence on crop yield, and it is a reasonable scientific problem to attempt to assess
the nature and weight of the various factors. We can also notice, however, that there
are causal relations among some but not all of these factors. For example, the level of
pest infestation is influenced by rainfall and fertilizer (positively) and pesticide, labor,
and skill (negatively). So pest infestation is partially endogenous within this system
and partially exogenous, in that it is also influenced by factors that are external to this
system (average temperature, presence of pest vectors, decline of predators, etc.).
The concept of endogeneity is particularly relevant in the context of time series
analysis of causal processes. It is common for some factors within a causal system to
be dependent for their value in period n on the values of other factors in the causal
system in period n-1. Suppose that the level of pest infestation is independent of all
other factors within a given period, but is influenced by the level of rainfall and
fertilizer in the preceding period. In this instance it would be correct to say that
infestation is exogenous within the period, but endogenous over time.

Exogenous variable (see also endogenous variable): A factor in a causal model or
causal system whose value is independent from the states of other variables in the
system; a factor whose value is determined by factors or variables outside the causal
system under study. For example, rainfall is exogenous to the causal system
constituting the process of farming and crop output. There are causal factors that
determine the level of rainfall (so rainfall is endogenous to a weather model), but
these factors are not themselves part of the causal model we use to explain the level of
crop output. As with endogenous variables, the status of the variable is relative to the
specification of a particular model and causal relations among the independent
variables. An exogenous variable is by definition one whose value is wholly causally
independent from other variables in the system. So the category of exogenous
variable is contrasted to those of purely endogenous and partially endogenous
variables. A variable can be made endogenous by incorporating additional factors and
causal relations into the model. There are causal and statistical interpretations of
exogeneity. The causal interpretation is primary, and defines exogeneity in terms of
the factor's causal independence from the other variables included in the model. The
statistical or econometric concept emphasizes non-correlation between the exogenous
variable and the other independent variables included in the model. If xj is exogenous
to a matrix of independent variables X (excluding xj), then if we perform a
regression of xj against X (excluding xj), we should expect coefficients of 0 for each
variable in X (excluding xj). Normal regression models assume that all the
independent variables are exogenous.
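The statistical reading of exogeneity can be illustrated with simulated data. The sketch below is hypothetical throughout: the variable names follow the farming example, the data-generating process (independent standard normal draws, and a pest variable that depends on rainfall with an assumed coefficient of 0.6) is invented, and a single-regressor least-squares slope stands in for the full regression on X.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on a single regressor x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


rng = random.Random(42)
n = 20000

rainfall = [rng.gauss(0.0, 1.0) for _ in range(n)]    # exogenous by construction
fertilizer = [rng.gauss(0.0, 1.0) for _ in range(n)]  # drawn independently of rainfall
# Pest infestation is (partially) endogenous: it depends on rainfall.
pests = [0.6 * r + rng.gauss(0.0, 1.0) for r in rainfall]

slope_exogenous = ols_slope(fertilizer, rainfall)  # rainfall regressed on fertilizer
slope_endogenous = ols_slope(rainfall, pests)      # pests regressed on rainfall

print(round(slope_exogenous, 3), round(slope_endogenous, 3))
```

In this simulation the coefficient from regressing rainfall on fertilizer is close to zero, as the statistical definition leads us to expect, while the coefficient for the pest variable is close to the assumed 0.6.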
Answer 4
Methodology of Economic Research
It involves the following process:
1. Theoretical observation, hypothesis, or law
For example, the law of demand or the law of supply.
2. Identification of the dependent variable (regressand) and the independent variables
(explanatory variables or regressors).
In the law of demand the regressand is Qd (quantity demanded) and the regressors are P (own
price), Pr (price of related goods), Y (consumer's income) and T (tastes and preferences).
Thus Qd = f(P, Pr, Y, T)

(Note that tastes and preferences are abstract variables incapable of quantitative
measurement. For the sake of simplicity, we ignore them here or assume them to be
unchanged. Otherwise, we would have to use dummy variables for them.)
3. Specification of the model
This involves formulating the mathematical relationship among the variables, including the
signs (negative or positive) and values of the coefficients of the variables. The signs depend on
the nature of the variation of Qd with each of the regressors: the sign is + if the variation is direct
and - if it is inverse. We can express the hypothesis as
Qd = a + bP + cPr + dY + u
where a, b, c, d are coefficients whose values and signs we determine through collection
of data and u is a random variable representing the error term.
4. Estimation of the model
This is done through collection of statistical data for regressands and regressors.
5. Choice of appropriate econometric technique
The techniques are classified into two groups:
a. Group A: Single-equation techniques. If the model is a single equation, as in step 3 above,
the econometric techniques employed are:
   Classical or ordinary least squares (CLS or OLS)
   Indirect least squares, or the reduced-form technique
   Two-stage least squares
   Limited-information maximum likelihood method
b. Group B: Simultaneous-equation techniques. They comprise:
   Three-stage least squares method
   Full-information maximum likelihood method
A simultaneous system of equations can be understood from the following equations:
Qd = b0 + b1P + u (Law of demand)
Qs = a0 + a1P + v (Law of supply)
Qd = Qs (equilibrium market)
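To see why such a system is "simultaneous", one can solve the three equations for the equilibrium price. Setting Qd = Qs gives P = (b0 - a0 + u - v) / (a1 - b1), so the price depends on both disturbances. The Python sketch below evaluates this solution; the function name and all numerical coefficient values are hypothetical and chosen only for illustration.

```python
def equilibrium(b0, b1, a0, a1, u=0.0, v=0.0):
    """Solve Qd = b0 + b1*P + u, Qs = a0 + a1*P + v, Qd = Qs for (P, Q)."""
    if a1 == b1:
        raise ValueError("demand and supply have the same slope: no unique equilibrium")
    price = (b0 - a0 + u - v) / (a1 - b1)
    quantity = b0 + b1 * price + u
    return price, quantity


# Hypothetical coefficients: downward-sloping demand, upward-sloping supply.
p0, q0 = equilibrium(b0=100.0, b1=-2.0, a0=10.0, a1=1.0)         # no shocks
p1, q1 = equilibrium(b0=100.0, b1=-2.0, a0=10.0, a1=1.0, u=6.0)  # demand shock u = 6

print(p0, q0)
print(p1, q1)
```

A demand disturbance u changes the equilibrium price as well as the quantity, which means P is correlated with u in the demand equation. This is the usual reason given for preferring techniques designed for such systems over applying OLS to one equation in isolation.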
Whatever the form of the equation, the choice of technique in each group depends on the
properties of the estimates of the coefficients. These properties are:
1. Unbiasedness
2. Consistency
3. Efficiency
4. Sufficiency
The technique that possesses most of the above properties is considered the best.

Answer 5 and Answer 16

When we want to study the properties of the obtained estimators, it is convenient to
distinguish between two categories of properties: i) the small (or finite) sample
properties, which are valid whatever the sample size, and ii) the asymptotic properties,
which are associated with large samples, i.e., when $n$ tends to $\infty$.
Finite Sample Properties of the OLS and ML Estimates of $\beta$
Given that, as we obtained in the previous section, the OLS and ML estimates of $\beta$
lead to the same result, the following properties refer to both. In order to derive these
properties, and on the basis of the classical assumptions, the vector of estimated
coefficients can be written in the following alternative form:
$$\hat{\beta} = (X'X)^{-1}X'y = \beta + (X'X)^{-1}X'u$$
Unbiasedness. According to the concept of unbiasedness, the vector $\hat{\beta}$ is an
unbiased estimator vector of $\beta$. The unbiasedness property of the estimators
means that, if we have many samples for the random variable and we calculate
the estimated value corresponding to each sample, the average of these
estimated values approaches the unknown parameter. Nevertheless, we usually
have only one sample (i.e., one realization of the random variable), so we
cannot assure anything about the distance between $\hat{\beta}$ and $\beta$. This fact leads us to
employ the concept of variance, or the variance-covariance matrix if we have a
vector of estimates. This concept measures the average distance between the
estimated value obtained from the only sample we have and its expected value.
From the previous argument we can deduce that, although the unbiasedness
property is not sufficient in itself, it is the minimum requirement to be satisfied
by an estimator.
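The "many samples" interpretation of unbiasedness can be mimicked with a small Monte Carlo experiment. The sketch below is an illustration under assumed conditions: a two-variable model with hypothetical true coefficients 1.0 and 2.0, fixed regressor values 1 to 10, and standard normal disturbances. None of these values come from the text.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on a single regressor x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


TRUE_INTERCEPT, TRUE_SLOPE = 1.0, 2.0         # hypothetical population values
x_fixed = [float(i) for i in range(1, 11)]    # regressor held fixed across samples
rng = random.Random(123)


def one_sample_slope():
    """Draw one sample of y for the fixed x and return its OLS slope."""
    y = [TRUE_INTERCEPT + TRUE_SLOPE * xi + rng.gauss(0.0, 1.0) for xi in x_fixed]
    return ols_slope(x_fixed, y)


slopes = [one_sample_slope() for _ in range(5000)]
mean_slope = sum(slopes) / len(slopes)
spread = (sum((s - mean_slope) ** 2 for s in slopes) / (len(slopes) - 1)) ** 0.5

print(round(mean_slope, 3), round(spread, 3))
```

The average of the 5000 slope estimates lies very close to the assumed true slope, while the spread shows how far any single-sample estimate may fall from it. That spread is exactly what the variance is meant to capture.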
Efficiency. An estimator is efficient if it is the minimum variance unbiased
estimator. The Cramer-Rao inequality provides verification of efficiency, since it
establishes the lower bound for the variance-covariance matrix of any unbiased
estimator.
A property which is less strict than efficiency is the so-called best linear unbiased
estimator (BLUE) property, which also uses the variance of the estimators.
BLUE. A vector of estimators is BLUE if it is the minimum variance linear
unbiased estimator. To show this property, we use the Gauss-Markov Theorem.
In the MLRM framework, this theorem provides a general expression for the
variance-covariance matrix of a linear unbiased vector of estimators. Then, the
comparison of this matrix with the corresponding matrix of $\hat{\beta}$ allows us to
conclude that $\hat{\beta}$ (or $\tilde{\beta}$) is BLUE.

Finite Sample Properties of the OLS and ML Estimates of $\sigma^2$

Linearity. According to (2.79), the OLS and ML estimators of $\sigma^2$ are expressed as
$$\hat{\sigma}^2 = \frac{\hat{u}'\hat{u}}{n-k}, \qquad \tilde{\sigma}^2 = \frac{\hat{u}'\hat{u}}{n},$$
so both are nonlinear with respect to $y$, given that their numerators are
quadratic forms of $y$.
Nevertheless, given that $\tilde{\sigma}^2$ is biased, this estimator cannot be efficient, so we
focus on the study of such a property for $\hat{\sigma}^2$. With respect to the BLUE
property, neither $\hat{\sigma}^2$ nor $\tilde{\sigma}^2$ is linear, so they cannot be BLUE.
Efficiency. The comparison of the variance of $\hat{\sigma}^2$ (expression (2.88)) with
the corresponding element of the matrix $[I_n(\theta)]^{-1}$ (expression (2.63)) allows us to deduce
that this estimator does not attain the Cramer-Rao lower bound, given that
$\frac{2\sigma^4}{n-k} \neq \frac{2\sigma^4}{n}$. Nevertheless, as Schmidt (1976) shows, there is no
unbiased estimator of $\sigma^2$ with a smaller variance, so it can be said that
$\hat{\sigma}^2$ is an efficient estimator.

Asymptotic Properties of the OLS and ML Estimators of $\beta$

We now consider the following desirable asymptotic properties: asymptotic
unbiasedness, consistency and asymptotic efficiency.
Asymptotic unbiasedness. There are two alternative definitions of this concept.
The first states that an estimator $\hat{\theta}_n$ is asymptotically unbiased if, as $n$ increases,
the sequence of its first moments converges to the parameter $\theta$. It can be
expressed as:
$$\lim_{n\to\infty} E(\hat{\theta}_n) = \theta \quad\Longleftrightarrow\quad \lim_{n\to\infty}\left[E(\hat{\theta}_n) - \theta\right] = 0 \qquad (2.96)$$
Note that the second part of (2.96) also means that the possible bias of $\hat{\theta}_n$
disappears as $n$ increases, so we can deduce that an unbiased estimator is also
an asymptotically unbiased estimator.
Consistency. An estimator $\hat{\theta}_n$ is said to be consistent if it converges in
probability to the unknown parameter, that is to say:
$$\operatorname{plim}_{n\to\infty} \hat{\theta}_n = \theta, \quad\text{i.e.,}\quad \lim_{n\to\infty} P\left(|\hat{\theta}_n - \theta| < \varepsilon\right) = 1 \ \text{ for every } \varepsilon > 0$$
This means that a consistent estimator satisfies the convergence in probability to a constant,
with the unknown parameter being such a constant.
Asymptotic efficiency. A sufficient condition for a consistent asymptotically
normal estimator vector to be asymptotically efficient is that its asymptotic
variance-covariance matrix equals the asymptotic Cramer-Rao lower bound
(see Theil (1971)), which can be expressed as:
$$\frac{1}{n}\,(I_\infty)^{-1} = \frac{1}{n}\left[\lim_{n\to\infty} \frac{I_n(\theta)}{n}\right]^{-1}$$
where $I_\infty$ denotes the so-called asymptotic information matrix, while $I_n(\theta)$ is the
previously described sample information matrix (or simply, information matrix).
Asymptotic Properties of the OLS and ML Estimators of $\sigma^2$
Asymptotic unbiasedness. The OLS estimator of $\sigma^2$ satisfies the finite sample
unbiasedness property, according to result (2.86), so we deduce that it is
asymptotically unbiased.
With respect to the ML estimator of $\sigma^2$, which does not satisfy the finite sample
unbiasedness (result (2.87)), we must calculate its asymptotic expectation. On
the basis of the first definition of asymptotic unbiasedness, presented in (2.96),
we have:
$$\lim_{n\to\infty} E(\tilde{\sigma}^2) = \lim_{n\to\infty} \frac{n-k}{n}\,\sigma^2 = \sigma^2$$
so we conclude that $\tilde{\sigma}^2$ is asymptotically unbiased.

Consistency. In order to show that $\hat{\sigma}^2$ and $\tilde{\sigma}^2$ are consistent, and given that both
are asymptotically unbiased, the only sufficient condition that we have to prove
is that the limit of their variances is null.

Asymptotic efficiency. On the basis of the asymptotic Cramer-Rao lower bound,
we conclude that both $\hat{\sigma}^2$ and $\tilde{\sigma}^2$ are asymptotically efficient estimators of $\sigma^2$,
so their asymptotic variances equal the asymptotic Cramer-Rao lower bound.
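The vanishing bias of the ML variance estimator can be made concrete with a few numbers. The sketch below assumes the standard result that, under the classical assumptions, the expected value of the ML estimator of the disturbance variance is (n - k)/n times the true variance, where k is taken to be the number of estimated coefficients. The function name and the numerical values are hypothetical.

```python
def expected_ml_variance(sigma2, n, k):
    """Expected value of the ML variance estimator, (n - k) / n * sigma2.

    Assumes the classical linear model with k estimated coefficients.
    """
    if n <= k:
        raise ValueError("need more observations than coefficients")
    return (n - k) / n * sigma2


sigma2, k = 4.0, 3  # hypothetical true variance and number of coefficients

# Bias = expected value minus the true variance, for growing sample sizes.
biases = {n: expected_ml_variance(sigma2, n, k) - sigma2 for n in (10, 100, 10000)}

for n, bias in biases.items():
    print(n, round(bias, 4))
```

The bias is negative for every finite n but shrinks toward zero as n grows, which is what asymptotic unbiasedness asserts.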
An estimator, say the OLS estimator $\hat{\beta}_2$, is said to be a best linear unbiased
estimator (BLUE) of $\beta_2$ if the following hold:
1. It is linear, that is, a linear function of a random variable, such as the
dependent variable Y in the regression model.
2. It is unbiased, that is, its average or expected value, $E(\hat{\beta}_2)$, is equal to the
true value, $\beta_2$.
3. It has minimum variance in the class of all such linear unbiased estimators;
an unbiased estimator with the least variance is known as an efficient estimator.
Answer 6
The method of ordinary least squares is attributed to Carl Friedrich Gauss, a German
mathematician. Under certain assumptions (discussed in Section 3.2), the method of
least squares has some very attractive statistical properties that have made it one of the
most powerful and popular methods of regression analysis. To understand this method,
we first explain the least-squares principle.
Recall the two-variable PRF:
$Y_i = \beta_1 + \beta_2 X_i + u_i$ (2.4.2)

We estimate it from the SRF:

$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$ (2.6.2)
$\quad = \hat{Y}_i + \hat{u}_i$ (2.6.3)
where $\hat{Y}_i$ is the estimated (conditional mean) value of $Y_i$.
But how is the SRF itself determined? To see this, let us proceed as follows. First,
express (2.6.3) as
$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$ (3.1.1)
which shows that the $\hat{u}_i$ (the residuals) are simply the differences between the
actual and estimated Y values.
Now given n pairs of observations on Y and X, we would like to determine the SRF in
such a manner that it is as close as possible to the actual Y. To this end, we may adopt
the following criterion: Choose the SRF in such a way that the sum of the residuals
$\sum \hat{u}_i = \sum (Y_i - \hat{Y}_i)$ is as small as possible. Although intuitively appealing, this is not a very
good criterion, as can be seen in the hypothetical scattergram shown in Figure 3.1. If
we adopt the criterion of minimizing $\sum \hat{u}_i$, Figure 3.1 shows that the residuals $\hat{u}_2$ and
$\hat{u}_3$ as well as the residuals $\hat{u}_1$ and $\hat{u}_4$ receive the same weight in the sum
$(\hat{u}_1 + \hat{u}_2 + \hat{u}_3 + \hat{u}_4)$, although the first two residuals are much closer to the SRF than the latter
two. In other words, all the residuals receive equal importance no matter how close or
how widely scattered the individual observations are from the SRF. A consequence of
this is that it is quite possible that the algebraic sum of the $\hat{u}_i$ is small (even zero)
although the $\hat{u}_i$ are widely scattered about the SRF.
To see this, let $\hat{u}_1$, $\hat{u}_2$, $\hat{u}_3$, and $\hat{u}_4$ in Figure 3.1 assume the values of 10, −2, +2, and
−10, respectively. The algebraic sum of these residuals is zero although $\hat{u}_1$ and $\hat{u}_4$
are scattered more widely around the SRF than $\hat{u}_2$ and $\hat{u}_3$. We can avoid this
problem if we adopt the least-squares criterion, which states that the SRF can be fixed
in such a way that
$\sum \hat{u}_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$ (3.1.2)
is as small as possible, where $\hat{u}_i^2$ are the squared residuals. By squaring $\hat{u}_i$, this method gives
more weight to residuals such as $\hat{u}_1$ and $\hat{u}_4$ in Figure 3.1 than the residuals $\hat{u}_2$ and
$\hat{u}_3$. As noted previously, under the minimum $\sum \hat{u}_i$ criterion, the sum can be small even
though the $\hat{u}_i$ are widely spread about the SRF. But this is not possible under the
least-squares procedure, for the larger the $\hat{u}_i$ (in absolute value), the larger the $\sum \hat{u}_i^2$.
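The difference between the two criteria is easy to verify numerically with the four residual values used in the text (10, −2, +2, −10). The second, tightly clustered set of residuals in the sketch below is a hypothetical comparison case added for illustration.

```python
# Residuals from the text's example: two far from the line, two close to it.
residuals = [10.0, -2.0, 2.0, -10.0]

# Hypothetical comparison: four residuals that all lie close to the line.
tight_residuals = [1.0, -1.0, 1.0, -1.0]

algebraic_sum = sum(residuals)
sum_of_squares = sum(r ** 2 for r in residuals)

tight_algebraic_sum = sum(tight_residuals)
tight_sum_of_squares = sum(r ** 2 for r in tight_residuals)

print(algebraic_sum, sum_of_squares)              # 0.0 208.0
print(tight_algebraic_sum, tight_sum_of_squares)  # 0.0 4.0
```

Both sets have an algebraic sum of zero, so the first criterion cannot tell them apart. The sum of squares, by contrast, is far larger for the widely scattered residuals.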
A further justification for the least-squares method lies in the fact that the estimators
obtained by it have some very desirable statistical properties, as we shall see shortly. It
is obvious from (3.1.2) that
$\sum \hat{u}_i^2 = f(\hat{\beta}_1, \hat{\beta}_2)$ (3.1.3)
that is, the sum of the squared
residuals is some function of the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$. For any given set of data,
choosing different values for $\hat{\beta}_1$ and $\hat{\beta}_2$ will give different $\hat{u}$'s and hence different
values of $\sum \hat{u}_i^2$. To see this clearly, consider the hypothetical data on Y and X given in
the first two columns of Table 3.1.

Let $\hat{\beta}_1 = 1.572$ and $\hat{\beta}_2 = 1.357$ (let us not worry right now about how we got these
values; say, it is just a guess). Using these $\hat{\beta}$ values and the X values given in
column (2) of Table 3.1, we can easily compute the estimated $Y_i$ given in column (3)
of the table as $\hat{Y}_{1i}$ (the subscript 1 is to denote the first experiment). Now let us
conduct another experiment, but this time using the values of $\hat{\beta}_1 = 3$ and $\hat{\beta}_2 = 1$.
The estimated values of $Y_i$ from this experiment are given as $\hat{Y}_{2i}$ in column (6) of
Table 3.1. Since the $\hat{\beta}$ values in the two experiments are different, we get different
values for the estimated residuals, as shown in the table; $\hat{u}_{1i}$ are the residuals from the
first experiment and $\hat{u}_{2i}$ from the second experiment. The squares of these residuals
are given in columns (5) and (8). Obviously, as expected from (3.1.3), these residual
sums of squares are different since they are based on different sets of $\hat{\beta}$ values.
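The two "experiments" can be reproduced with a short Python sketch. Table 3.1 itself is not reproduced in this text, so the four (X, Y) pairs below are an assumption: they are chosen so that the two residual sums of squares come out as the values quoted in the next paragraph (12.214 and 14). The function name is likewise hypothetical.

```python
# ASSUMED data: Table 3.1 is not available here; these four observations are
# chosen to be consistent with the residual sums of squares quoted in the text.
X = [1.0, 4.0, 5.0, 6.0]
Y = [4.0, 5.0, 7.0, 12.0]


def residual_sum_of_squares(b1, b2, x, y):
    """Sum of squared residuals (Y_i - b1 - b2 * X_i)^2 for a trial (b1, b2)."""
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))


rss_first = residual_sum_of_squares(1.572, 1.357, X, Y)  # first experiment
rss_second = residual_sum_of_squares(3.0, 1.0, X, Y)     # second experiment

print(round(rss_first, 3), round(rss_second, 3))
```

Any other trial pair of coefficient values can be passed to the same function, which is exactly the trial-and-error search that the least-squares formulas make unnecessary.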
Now which sets of $\hat{\beta}$ values should we choose? Since the $\hat{\beta}$ values of the first
experiment give us a lower $\sum \hat{u}_i^2$ (= 12.214) than that obtained from the $\hat{\beta}$ values
of the second experiment (= 14), we might say that the $\hat{\beta}$'s of the first experiment are
the "best" values. But how do we know? For, if we had infinite time and infinite
patience, we could have conducted many more such experiments, choosing different
sets of $\hat{\beta}$'s each time and comparing the resulting $\sum \hat{u}_i^2$ and then choosing that set
of $\hat{\beta}$ values that gives us the least possible value of $\sum \hat{u}_i^2$, assuming of course that
we have considered all the conceivable values of $\beta_1$ and $\beta_2$. But since time, and
certainly patience, are generally in short supply, we need to consider some shortcuts to
this trial-and-error process. Fortunately, the method of least squares provides us such a
shortcut. The principle or the method of least squares chooses $\hat{\beta}_1$ and $\hat{\beta}_2$ in such a
manner that, for a given sample or set of data, $\sum \hat{u}_i^2$ is as small as possible. In other
words, for a given sample, the method of least squares
provides us with unique estimates of $\beta_1$ and $\beta_2$ that give the smallest possible value of
$\sum \hat{u}_i^2$. How is this accomplished? This is a straightforward exercise in differential
calculus. As shown in Appendix 3A, Section 3A.1, the process of differentiation yields
the following equations for estimating $\beta_1$ and $\beta_2$:
$\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i$ (3.1.4)
$\sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2$ (3.1.5)
where n is the sample size. These simultaneous equations are known as the normal equations.
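Solving the normal equations simultaneously, we obtain
$$\hat{\beta}_2 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum x_i y_i}{\sum x_i^2} \qquad (3.1.6)$$
where $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y and where we define $x_i = (X_i - \bar{X})$
and $y_i = (Y_i - \bar{Y})$. Henceforth we adopt the convention of letting the lowercase letters
denote deviations from mean values.
$$\hat{\beta}_1 = \frac{\sum X_i^2 \sum Y_i - \sum X_i \sum X_i Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \bar{Y} - \hat{\beta}_2 \bar{X} \qquad (3.1.7)$$
The last step in (3.1.7) can be obtained directly from (3.1.4) by simple algebraic manipulations.
Incidentally, note that, by making use of simple algebraic identities, formula (3.1.6)
for estimating $\beta_2$ can be alternatively expressed as
$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{\sum x_i Y_i}{\sum X_i^2 - n\bar{X}^2} = \frac{\sum X_i y_i}{\sum X_i^2 - n\bar{X}^2}$$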
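The least-squares formulas translate directly into code. The sketch below computes the slope from the deviation form and the intercept from the sample means, then cross-checks the slope against the raw-sums form that comes out of the normal equations. The function names are hypothetical, and the four observations are assumed illustrative data, not values taken from a table in this text.

```python
def ols_fit(x, y):
    """Return (b1, b2): intercept and slope of the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sum_xx = sum((xi - x_bar) ** 2 for xi in x)
    if sum_xx == 0:
        raise ValueError("X values must not all be the same")
    b2 = sum_xy / sum_xx      # slope: sum(x_i * y_i) / sum(x_i^2) in deviations
    b1 = y_bar - b2 * x_bar   # intercept: Y-bar minus slope times X-bar
    return b1, b2


def ols_slope_raw(x, y):
    """Slope from raw sums, as obtained by solving the normal equations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)


# Assumed illustrative data (four observations).
X = [1.0, 4.0, 5.0, 6.0]
Y = [4.0, 5.0, 7.0, 12.0]

b1, b2 = ols_fit(X, Y)
print(round(b1, 3), round(b2, 3))
```

The deviation form is usually preferred in hand calculation because the intermediate sums are smaller; both forms give the same slope.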

The estimators obtained previously are known as the least-squares estimators, for
they are derived from the least-squares principle. Note the following numerical
properties of estimators obtained by the method of OLS: Numerical properties are
those that hold as a consequence of the use of ordinary least squares, regardless of
how the data were generated. Shortly, we will also consider the statistical properties
of OLS estimators, that is, properties that hold only under certain assumptions about
the way the data were generated. (See the classical linear regression model in
Section 3.2.)
I. The OLS estimators are expressed solely in terms of the observable (i.e., sample)
quantities (i.e., X and Y). Therefore, they can be easily computed.
II. They are point estimators; that is, given the sample, each estimator will provide
only a single (point) value of the relevant population parameter. (In Chapter 5 we will
consider the so-called interval estimators, which provide a range of possible values
for the unknown population parameters.)
III. Once the OLS estimates are obtained from the sample data, the sample regression
line (Figure 3.1) can be easily obtained. The regression line thus obtained has the
following properties:
1. It passes through the sample means of Y and X. This fact is obvious from (3.1.7), for
the latter can be written as $\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{X}$, which is shown diagrammatically in
Figure 3.2.

2. The mean value of the estimated $Y = \hat{Y}_i$ is equal to the mean value of
the actual Y, for
$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$
$\quad = (\bar{Y} - \hat{\beta}_2 \bar{X}) + \hat{\beta}_2 X_i$
$\quad = \bar{Y} + \hat{\beta}_2 (X_i - \bar{X})$
Summing both sides of this last equality over the sample values and dividing through
by the sample size n gives
$\bar{\hat{Y}} = \bar{Y}$
where use is made of the fact that $\sum (X_i - \bar{X}) = 0$. (Why?)
3. The mean value of the residuals $\hat{u}_i$ is zero. From Appendix 3A,
Section 3A.1, the first equation is
$-2\sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) = 0$
But since $\hat{u}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$, the preceding equation reduces to
$-2\sum \hat{u}_i = 0$, whence $\bar{\hat{u}} = 0$.
As a result of the preceding property, the sample regression
$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$
can be expressed in an alternative form where both Y and X are expressed as
deviations from their mean values.
4. The residuals $\hat{u}_i$ are uncorrelated with $X_i$; that is, $\sum \hat{u}_i X_i = 0$.
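Because these properties are numerical, they can be checked on any data set. The sketch below fits the least-squares line to four assumed observations (illustrative values, not taken from a table in this text) and evaluates each of the four properties. The variable names are hypothetical.

```python
# Assumed illustrative data.
X = [1.0, 4.0, 5.0, 6.0]
Y = [4.0, 5.0, 7.0, 12.0]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Least-squares slope and intercept (deviation form).
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
b1 = y_bar - b2 * x_bar

fitted = [b1 + b2 * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

line_at_x_bar = b1 + b2 * x_bar                              # property 1: equals y_bar
mean_fitted = sum(fitted) / n                                # property 2: equals y_bar
sum_residuals = sum(residuals)                               # property 3: equals 0
sum_residuals_x = sum(r * x for r, x in zip(residuals, X))   # property 4: equals 0

print(line_at_x_bar, mean_fitted, sum_residuals, sum_residuals_x)
```

Up to floating-point rounding, the line passes through the point of means, the fitted values average to the mean of Y, and the residuals sum to zero both on their own and when weighted by X.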
Answer 7
Assumption 1: Linear regression model. The regression model is linear in the
parameters, as shown in (2.4.2):
$Y_i = \beta_1 + \beta_2 X_i + u_i$ (2.4.2)
Assumption 2: X values are fixed in repeated sampling. Values taken by the
regressor X are considered fixed in repeated samples. More technically, X is assumed
to be nonstochastic.
Assumption 3: Zero mean value of disturbance ui. Given the value of X, the mean,
or expected, value of the random disturbance term ui is zero. Technically, the
conditional mean value of ui is zero. Symbolically, we have
E(ui |Xi) = 0
Assumption 4: Homoscedasticity or equal variance of $u_i$. Given the value of X, the
variance of $u_i$ is the same for all observations. That is, the conditional variances of $u_i$
are identical.
Symbolically, we have
$\operatorname{var}(u_i \mid X_i) = E[u_i - E(u_i \mid X_i)]^2$
$\quad = E(u_i^2 \mid X_i)$ because of Assumption 3
$\quad = \sigma^2$
where var stands for variance.
Assumption 5: No autocorrelation between the disturbances. Given any two X
values, $X_i$ and $X_j$ ($i \neq j$), the correlation between any two $u_i$ and $u_j$ ($i \neq j$) is zero.
$\operatorname{cov}(u_i, u_j \mid X_i, X_j) = E\{[u_i - E(u_i)] \mid X_i\}\{[u_j - E(u_j)] \mid X_j\}$
$\quad = E(u_i \mid X_i)(u_j \mid X_j)$ (why?)
$\quad = 0$
where i and j are two different observations and where cov means covariance.
Assumption 6: Zero covariance between $u_i$ and $X_i$, or $E(u_i X_i) = 0$. Formally,
$\operatorname{cov}(u_i, X_i) = E[u_i - E(u_i)][X_i - E(X_i)]$
$\quad = E[u_i (X_i - E(X_i))]$ since $E(u_i) = 0$
$\quad = E(u_i X_i) - E(X_i)E(u_i)$ since $E(X_i)$ is nonstochastic (3.2.6)
$\quad = E(u_i X_i)$ since $E(u_i) = 0$
$\quad = 0$ by assumption
Assumption 7: The number of observations n must be greater than the number of
parameters to be estimated. Alternatively, the number of observations n must be
greater than the number of explanatory variables.
Assumption 8: Variability in X values. The X values in a given sample must not all
be the same. Technically, var (X) must be a finite positive number.
Assumption 9: The regression model is correctly specified. Alternatively, there is
no specification bias or error in the model used in empirical analysis.
Assumption 10: There is no perfect multicollinearity. That is, there are no perfect
relationships among the explanatory variables.
Notable among the proponents of the irrelevance-of-assumptions thesis is Milton Friedman. To him,
unreality of assumptions is a positive advantage: "to be important . . . a hypothesis
must be descriptively false in its assumptions." One may not subscribe to this
viewpoint fully, but recall that in any scientific study we make certain assumptions
because they facilitate the development of the subject matter in gradual steps, not
because they are necessarily realistic in the sense that they replicate reality exactly. As
one author notes, ". . . if simplicity is a desirable criterion of good theory, all good theories
idealize and oversimplify outrageously."
Answer 8
The first two Gauss-Markov conditions state that the disturbance terms u1, u2, ..., un
in the n observations potentially come from probability distributions that have 0 mean
and the same variance. Their actual values in the sample will sometimes be positive,
sometimes negative, sometimes relatively far from 0, sometimes relatively close, but
there will be no a priori reason to anticipate a particularly erratic value in any given
observation. To put it another way, the probability of u reaching a given positive (or
negative) value will be the same in all observations. This condition is known as
homoscedasticity, which means "same dispersion".
Now consider a model $y = \alpha + \beta x + u$ in which the variance of the potential
distribution of the disturbance term increases as
x increases. This does not mean that the disturbance term will necessarily have a
particularly large (positive or negative) value in an observation where x is large, but it
does mean that the a priori probability of having an erratic value will be relatively
high. This is an example of heteroscedasticity, which means "differing dispersion".
If heteroscedasticity is present, the OLS estimators are inefficient because you could,
at least in principle, find other estimators that have smaller variances and are still
unbiased.
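2nd Part:
Answer 17
1. Macro Economics: It suffers from the small-sample problem.
   Financial Econometrics: It does not suffer from the small-sample problem.
2. Macro Economics: The available data are inaccurate because of frequent revisions,
   necessitated by estimated data differing from actual data.
   Financial Econometrics: This inaccuracy does not exist in financial data.
3. Macro Economics: Available data have low frequency.
   Financial Econometrics: They have high frequency.
4. Macro Economics: Data are assumed to follow a normal distribution.
   Financial Econometrics: Data are not assumed to follow a normal distribution.
5. Macro Economics: Seasonality of data is not prominent.
   Financial Econometrics: Seasonality of data is prominent.
Answer 18
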
1. Time Series Data

A time series is a set of observations on the values that a variable takes at different
times. Such data may be collected at regular time intervals, such as daily (e.g., stock
prices, weather reports), weekly (e.g., money supply figures), monthly [e.g., the
unemployment rate, the Consumer Price Index (CPI)], quarterly (e.g., GDP), annually
(e.g., government budgets), quinquennially, that is, every 5 years (e.g., the census of
manufactures), or decennially (e.g., the census of population). Sometimes data are
available both quarterly and annually, as in the case of the data on GDP and
consumer expenditure. With the advent of high-speed computers, data can now be
collected over an extremely short interval of time, such as the data on stock prices,
which can be obtained literally continuously (the so-called real-time quote). Although
time series data are used heavily in econometric studies, they present special problems
for econometricians. As we will show in chapters on time series econometrics later on,
most empirical work based on time series data assumes that the underlying time series
is stationary. Although it is too early to introduce the precise technical meaning of
stationarity at this juncture, loosely speaking a time series is stationary if its mean and
variance do not vary systematically over time.
2. Cross-Section Data
Cross-section data are data on one or more variables collected at the same point in
time, such as the census of population conducted by the Census Bureau every 10 years
(the latest being in year 2000), the surveys of consumer expenditures conducted by the
University of Michigan, and, of course, the opinion polls by Gallup and umpteen other
organizations. Just as time series data create their own special problems (because of
the stationarity issue), cross-sectional data too have their own problems, specifically
the problem of heterogeneity.
3. Panel Data
This is a special type of pooled data in which the same cross-sectional unit (say, a
family or a firm) is surveyed over time. For example, the U.S. Department of
Commerce carries out a census of housing at periodic intervals. At each periodic
survey the same household (or the people living at the same address) is interviewed to
find out if there has been any change in the housing and financial conditions of that
household since the last survey.
Answer 23
1. Autocorrelation
In statistics, the autocorrelation of a random process describes the correlation between values of the
process at different times, as a function of the two times or of the time lag. Let X be some repeatable
process, and i be some point in time after the start of that process. (i may be an integer for a discrete-time
process or a real number for a continuous-time process.) Then $X_i$ is the value (or realization)
produced by a given run of the process at time i. Suppose that the process is further known to have
defined values for mean $\mu_i$ and variance $\sigma_i^2$ for all times i. Then the definition of the autocorrelation
between times s and t is
$$R(s,t) = \frac{E[(X_t - \mu_t)(X_s - \mu_s)]}{\sigma_t \sigma_s}$$
where "E" is the expected value operator. Note that this expression is not well-defined for all time
series or processes, because the variance may be zero (for a constant process) or infinite. If the
function R is well-defined, its value must lie in the range [−1, 1], with 1 indicating perfect correlation
and −1 indicating perfect anti-correlation.
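In practice the expectation is replaced by a sample average. The sketch below computes a common sample estimate of the autocorrelation at a given lag. It assumes the series has a constant mean and variance over time, so that one overall mean and one overall sum of squares can be used, which is a simplification of the general definition. The function name and the test series are hypothetical.

```python
def sample_autocorrelation(series, lag):
    """Sample autocorrelation at a non-negative lag.

    Assumes a constant mean and variance over time (a simplifying assumption).
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denominator = sum((v - mean) ** 2 for v in series)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return numerator / denominator


# Hypothetical series that flips sign every period: +1, -1, +1, -1, ...
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]

r0 = sample_autocorrelation(alternating, 0)
r1 = sample_autocorrelation(alternating, 1)
r2 = sample_autocorrelation(alternating, 2)

print(r0, r1, r2)
```

For the alternating series the lag-1 value is close to −1 (near-perfect anti-correlation) and the lag-2 value is close to +1, while a constant series triggers the error that mirrors the "not well-defined" case in the text.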
2. Autoregression

In an autoregression model, we forecast the variable of interest using a linear combination
of past values of the variable. The term autoregression indicates that it is a regression of the
variable against itself.
Thus an autoregressive model of order p can be written as
$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + e_t,$$
where c is a constant and $e_t$ is white noise. This is like a multiple regression but
with lagged values of $y_t$ as predictors. We refer to this as an AR(p) model.
Autoregressive models are remarkably flexible at handling a wide range of different time
series patterns. The two series in Figure 8.5 show series from an AR(1) model and an AR(2)
model. Changing the parameters $\phi_1, \dots, \phi_p$ results in different time series patterns. The
variance of the error term $e_t$ will only change the scale of the series, not the patterns.
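An AR(p) series can be generated by applying the recursion one period at a time. The sketch below is a minimal illustration: the function name, its parameters, the choice of Gaussian white noise, and the numerical coefficient values are assumptions. By default the series is started from p zeros.

```python
import random


def simulate_ar(c, phis, n, noise_sd=1.0, seed=0, initial=None):
    """Simulate n values of y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + e_t.

    `initial` holds the p starting values, oldest first (defaults to zeros).
    """
    p = len(phis)
    rng = random.Random(seed)
    history = list(initial) if initial is not None else [0.0] * p
    if len(history) != p:
        raise ValueError("need exactly p initial values")
    for _ in range(n):
        e_t = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        # history[-1] is y_{t-1}, history[-2] is y_{t-2}, and so on.
        y_t = c + sum(phi * history[-(j + 1)] for j, phi in enumerate(phis)) + e_t
        history.append(y_t)
    return history[p:]


# With the noise switched off the recursion is deterministic and easy to check.
noiseless_ar1 = simulate_ar(c=0.0, phis=[0.5], n=3, noise_sd=0.0, initial=[1.0])
noiseless_ar2 = simulate_ar(c=8.0, phis=[1.3, -0.7], n=2, noise_sd=0.0, initial=[1.0, 2.0])

# A noisy AR(2) series with hypothetical coefficients.
ar2_series = simulate_ar(c=8.0, phis=[1.3, -0.7], n=200, seed=1)

print(noiseless_ar1)
print(len(ar2_series))
```

Changing the entries of `phis` changes the pattern of the simulated series, while changing `noise_sd` changes only its scale.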
3. Partial Autocorrelation

For a given stochastic process one is often interested in the connection between two
random variables of a process at different points in time. One way to measure a linear
relationship is with the ACF, i.e., the correlation between these two variables. Another
way to measure the connection between $X_t$ and $X_{t+k}$ is to filter out of $X_t$ and $X_{t+k}$
the linear influence of the random variables that lie in between, $X_{t+1}, \ldots, X_{t+k-1}$, and
then calculate the correlation of the transformed random variables. This is called
the partial autocorrelation.
The partial autocorrelation of $k$-th order is defined as