CH - 7 - Econometrics UG

MTU,CANR, Department of Agri Economics
7: Limited Dependent Variables Model (Binary Choice Models) and Time Series Analysis
7.1. Binary Choice Models
Economists are often interested in the factors behind the decision-making of individuals or
enterprises.
Examples are:
-- Why do some people participate in market while others do not?
-- Why do some people go to technology adoption while others do not?
- Why do some people go to college while others do not?
- Why do some women enter the labor force while others do not?
- Why do some people buy houses while others rent?
- Why do some people migrate while others stay put?
The models that have been developed are known as binary choice or qualitative response models
with the outcome, which we will denote Y, being assigned a value of 1 if the event occurs and 0
otherwise. Models with more than two possible outcomes have been developed, but we will
restrict our attention to binary choice. The linear probability model apart, binary choice models
are fitted using maximum likelihood estimation. We start our study of qualitative response
models by first considering the binary response regression model. There are three approaches to
developing a probability model for a binary response variable:
1. The linear probability model (LPM)
2. The logit model
3. The probit model
7.1.1. The Linear Probability Model
The simplest binary choice model is the linear probability model where, as the name implies, the
probability of the event occurring, p, is assumed to be a linear function of a set of explanatory
variable(s): Because of its comparative simplicity, and because it can be estimated by OLS, we
will first consider the LPM.
To fix ideas, consider the following regression model:
Yi = β1 + β2Xi + ui……………………………………..7.1
Where X = family income and Y = 1 if the family owns a house and 0 if it does not own a house.
Model (7.1) looks like a typical linear regression model but because the regressand is binary, or
dichotomous, it is called a linear probability model (LPM). This is because the conditional
Econometrics, Chapter-7: limited dependent variables model and Time Series Analysis Page 1
expectation of Yi given Xi , E(Yi | Xi ), can be interpreted as the conditional probability that the
event will occur given Xi , that is, Pr (Yi = 1 | Xi). Thus, in our example, E(Yi | Xi) gives the
probability of a family owning a house and whose income is the given amount Xi .
The justification of the name LPM for models like (7.1) can be seen as follows: Assuming
E(ui) = 0, as usual (to obtain unbiased estimators), we obtain
E(Yi | Xi) = β1 + β2Xi……………………….7.2
Now, if Pi = probability that Yi = 1 (that is, the event occurs), and (1 − Pi) = probability that Yi =
0 (that is, that the event does not occur), the variable Yi has the following (probability)
distribution.
Yi Probability
0 1 − Pi
1 Pi
Total 1
That is, Yi follows the Bernoulli probability distribution. Now, by the definition of
mathematical expectation, we obtain:
E(Yi) = 0(1 − Pi) + 1(Pi) = Pi……………………………………7.3
Comparing (7.2) with (7.3), we can equate
E(Yi | Xi) = β1 + β2Xi = Pi………………………………………..7.4
that is, the conditional expectation of the model (7.1) can, in fact, be interpreted as the
conditional probability of Yi . In general, the expectation of a Bernoulli random variable is the
probability that the random variable equals 1. In passing note that if there are n independent
trials, each with a probability p of success and probability (1 − p) of failure, and X of these trials
represent the number of successes, then X is said to follow the binomial distribution. The mean
of the binomial distribution is np and its variance is np(1 − p). The term success is defined in the
context of the problem.
Since the probability Pi must lie between 0 and 1, we have the restriction
0 ≤ E(Yi | Xi) ≤ 1………………….7.5
that is, the conditional expectation (or conditional probability) must lie between 0 and 1. From
the preceding discussion it would seem that OLS can be easily extended to binary dependent
variable regression models. So, perhaps there is nothing new here. Unfortunately, this is not the
case, for the LPM poses several problems, which are as follows:
Non-Normality of the Disturbances ui
Although OLS does not require the disturbances (ui) to be normally distributed, we assumed
them to be so distributed for the purpose of statistical inference. But the assumption of normality
for ui is not tenable for the LPMs because, like Yi, the disturbances ui also take only two values;
that is, they also follow the Bernoulli distribution. This can be seen clearly if we write
(7.1) as
ui = Yi − β1 − β2Xi……………………………………………………7.6
The probability distribution of ui is
ui Probability
When Yi =1 1−β1−β2X Pi (7.7)
i
When Yi = 0 −β1 − (1 − Pi )
β2Xi Obviously, ui cannot be assumed to be normally
distributed; they follow the Bernoulli
distribution.
But the nonfulfillment of the normality assumption may not be so critical as it appears because
we know that the OLS point estimates still remain unbiased (recall that, if the objective is point
estimation, the normality assumption is not necessary). Besides, as the sample size increases
indefinitely, statistical theory shows that the OLS estimators tend to be normally distributed
generally. As a result, in large samples the statistical inference of the LPM will follow the usual
OLS procedure under the normality assumption.
Heteroscedastic Variances of the Disturbances
Even if E(ui) = 0 and cov (ui , uj ) = 0 for i _= j (i.e., no serial correlation), it can no longer be
maintained that in the LPM the disturbances are homoscedastic. This is, however, not surprising.
As statistical theory shows, for a Bernoulli distribution the theoretical mean and variance are,
respectively, p and p(1 − p), where p is the probability of success (i.e., something happening),
showing that the variance is a function of the mean. Hence the error variance is heteroscedastic.
For the distribution of the error term given in (7.7), applying the definition of variance,
var (ui) = Pi(1 − Pi)
That is, the variance of the error term in the LPM is heteroscedastic. Since Pi = E(Yi | Xi) = β1 +
β2Xi , the variance of ui ultimately depends on the values of X and hence is not homoscedastic.
We already know that, in the presence of heteroscedasticity, the OLS estimators, although
unbiased, are not efficient; that is, they do not have minimum variance. But the problem of
heteroscedasticity, like the problem of non-normality, is not insurmountable.
Questionable Value of R2 as a Measure of Goodness of Fit
The conventionally computed R2 is of limited value in the dichotomous response models. In most
practical applications the R2 ranges between 0.2 to 0.6. R2 in such models will be high, say, in
excess of 0.8 only when the actual scatter is very closely clustered around points A and B (Figure
7.1), for in that case it is easy to fix the straight line by joining the two points A and B. In this
case the predicted Yi will be very close to either 0 or 1.
For these reasons “use of the coefficient of determination as a summary statistic should be
avoided in models with qualitative dependent variable.
Figure 7.1 Linear probability models.
LPM: A NUMERICAL EXAMPLE

To illustrate some of the points made about the LPM in the preceding section, we present a
numerical example. Table 7.1 gives invented data on home ownership Y (1 = owns a house, 0 =
does not own a house) and family income X (thousands of dollars) for 40 families. From these
data the LPM estimated by OLS was as follows:
¿
Y = −0.9457 + 0.1021X i
(0.1228) (0.0082) R2 = 0.8048 (7.10)

t = (−7.6984) (12.515)
Table 7.1 data on ownership of house
First, let us interpret this regression. The intercept of −0.9457 gives the “probability’’ that a
family with zero income will own a house. Since this value is negative, and since probability
cannot be negative, we treat this value as zero, which is sensible in the present instance. The
slope value of 0.1021 means that for a unit change in income (here $1000), on the average the
probability of owning a house increases by 0.1021 or about 10 percent. Of course, given a
particular level of income, we can estimate the actual probability of owning a house from (7.10).
Thus, for X = 12 ($12,000), the estimated probability of owning a house is
¿
(Y | X = 12) = −0.9457 + 12(0.1021) = 0.2795
That is, the probability that a family with an income of $12,000 will own a house is about 28
¿
percent. Table 7.1 Shows the estimated probabilities, Y , for the various income levels listed in
the table. The most noticeable feature of this table is that six estimated values are negative and
six values are in excess of 1, demonstrating clearly the point made earlier that, although E(Yi | X)
¿
is positive and less than 1, their estimators, Y , need not be necessarily positive or less than 1.
This is one reason that the LPM is not the recommended model when the dependent variable is
dichotomous. Even if the estimated Yi were all positive and less than 1, the LPM still suffers
from the problem of heteroscedasticity, which can be seen readily from (7.8). As a consequence,
we cannot trust the estimated standard errors reported in (7.10). (Why?) But we can use the
weighted least-squares (WLS) to obtain more efficient estimates of the standard errors.
ALTERNATIVES TO LPM
As we have seen, the LPM is plagued by several problems, such as (1) nonnormality of ui , (2)
heteroscedasticity of ui , (3) possibility of ˆYi lying outside the 0–1 range, and (4) the generally
lower R2 values. But these problems are surmountable. For example, we can use WLS to resolve
the heteroscedasticity problem or increase the sample size to minimize the non-normality
problem. By resorting to restricted least-squares or mathematical programming techniques we
can even make the estimated probabilities lie in the 0–1 interval. But even then the fundamental
problem with the LPM is that it is not logically a very attractive model because it assumes that
Pi = E(Y = 1 | X) increases linearly with X, that is, the marginal or incremental effect of X
remains constant throughout. Thus, in our home ownership example we found that as X increases
by a unit ($1000), the probability of owning a house increases by the same constant amount of
0.10. This is so whether the income level is $8000, $10,000, $18,000, or $22,000. This seems
patently unrealistic. In reality one would expect that Pi is nonlinearly related to Xi:
At very low income a family will not own a house but at a sufficiently high level of income, say,
X*, it most likely will own a house. Any increase in income beyond X* will have little effect on
the probability of owning a house. Thus, at both ends of the income distribution, the probability
of owning a house will be virtually unaffected by a small increase in X. Therefore, what we need
is a (probability) model that has these two features: (1) As Xi increases, Pi = E(Y = 1 | X)
increases but never steps outside the 0–1 interval, and (2) the relationship between Pi and Xi is
nonlinear, that is, “one which approaches zero at slower and slower rates as Xi gets small and
approaches one at slower and slower rates as Xi gets very large.’ Geometrically, the model we
want would look something like Figure 15.2. Notice in this model that the probability lies
between 0 and 1 and that it varies nonlinearly with X. The reader will realize that the sigmoid, or
S-shaped, curve in the figure very much resembles the cumulative distribution function (CDF)
of a random variable. Therefore, one can easily use the CDF to model regressions where the
response variable is dichotomous, taking 0–1 values. The practical question now is, which CDF?
For although all CDFs are S shaped, for each random variable there is a unique CDF. For
historical as well as practical reasons, the CDFs commonly chosen to represent the 0–1 response
Figure 7.2. Cumulative distribution function

models are (1) the logistic and (2) the normal, the former giving rise to the logit model and the
latter to the probit (or normit) model. Although a detailed discussion of the logit and probit
models is beyond the scope of this book, we will indicate somewhat informally how one
estimates such models and how one interprets them.
7.1.2. THE LOGIT MODEL
We will continue with our home ownership example to explain the basic ideas underlying the
logit model. Recall that in explaining home ownership in relation to income, the LPM was
Pi = E(Y = 1 | Xi) = β1 + β2Xi (7.11)
Where X is income and Y = 1 means the family owns a house. But now consider the following
representation of home ownership:
(7.12)
For ease of exposition,
P= 1
−Zi
e
i
1+ (7.13)
Where Zi = β1 + β2Xi
Equation (7.13) represents what is known as the (cumulative) logistic distribution function.
It is easy to verify that as Zi ranges from −∞ to +∞, Pi ranges between 0 and 1 and that Pi is
nonlinearly related to Zi (i.e., Xi), thus satisfying the two requirements considered earlier. But it
seems that in satisfying these requirements, we have created an estimation problem because Pi is
nonlinear not only in X but also in the β’s as can be seen clearly from (7.12). This means that we
cannot use the familiar OLS procedure to estimate the parameters. But this problem is more
apparent than real because (7.12) can be linearized, which can be shown as follows.
If Pi, the probability of owning a house, is given by (7.13), then (1 − Pi ), the probability of not
owning a house, is
−Zi
1−P =1− e 1
i 1− P = − Zi −Zi
1+ e 1+ e
i
⇒ (7.14)
Therefore, we can write
1
− Zi
P = 1+ e
i
= e
Zi
−Zi
1− P e i
− Zi
1+ e (7.15)
Now Pi/(1 − Pi) is simply the odds ratio in favor of owning a house—the ratio of the probability
that a family will own a house to the probability that it will not own a house. Thus, if Pi = 0.8, it
means that odds are 4 to 1 in favor of the family owning a house. Now if we take the natural log
of (7.15), we obtain a very interesting result, namely,
(7.16)
that is, L, the log of the odds ratio, is not only linear in X, but also (from the estimation
viewpoint) linear in the parameters.18 L is called the logit, and hence the name logit model for
models like (7.16). Notice these features of the logit model.
1. As P goes from 0 to 1 (i.e., as Z varies from −∞ to +∞), the logit L goes from −∞ to +∞. That
is, although the probabilities (of necessity) lie between 0 and 1, the logits are not so bounded.
2. Although we have included only a single X variable, or regressor, in the preceding model, one
can add as many regressors as may be dictated by the underlying theory.
3. If L, the logit, is positive, it means that when the value of the regressor(s) increases, the odds
that the regressand equals 1 (meaning some event of interest happens) increases. If L is negative,
the odds that the regressand equals 1 decreases as the value of X increases. To put it differently,
the logit becomes negative and increasingly large in magnitude as the odds ratio decreases from
1 to 0 and becomes increasingly large and positive as the odds ratio increases from 1 to infinity.
5. More formally, the interpretation of the logit model given in (7.16) is as follows: β2, the slope,
measures the change in L for a unit change in X, that is, it tells how the log-odds in favor of
owning a house change as income changes by a unit, say, $1000. The intercept β1 is the value of
the logodds in favor of owning a house if income is zero. Like most interpretations of intercepts,
this interpretation may not have any physical meaning.
6. Given a certain level of income, say, X*, if we actually want to estimate not the odds in favor
of owning a house but the probability of owning a house itself, this can be done directly from
(7.13) once the estimates of β1 + β2 are available. This, however, raises the most important
question: How do we estimate β1 and β2 in the first place? The answer is given in the next
section.
7. Whereas the LPM assumes that Pi is linearly related to Xi, the logit model assumes that the log
of the odds ratio is linearly related to Xi.
7.1.3. THE PROBIT MODEL

The estimating model that emerges from the normal CDF is popularly known as the probit
model, although sometimes it is also known as the normit model. In principle one could
substitute the normal CDF in place of the logistic CDF.
−Z 2
1
e
t
p(Y =1)=∫−∞
2
∂ Z=Φ (Z ) ¿
√ 2∏ ¿ (7.17)
Where,
Z   0  1 X 1   2 X 2  ...   k X k
Since P represents the probability that an event will occur, here the probability of owning a
house, it is measured by the area of the standard normal curve.
The Marginal Effect of a Unit Change in the Value of a Regressor in the Various
Regression Models
In the linear regression model, the slope coefficient measures the change in the average value of
the regressand for a unit change in the value of a regressor, with all other variables held constant.
In the LPM, the slope coefficient measures directly the change in the probability of an event
occurring as the result of a unit change in the value of a regressor, with the effect of all other
variables held constant.
In the logit model the slope coefficient of a variable gives the change in the log of the odds
associated with a unit change in that variable, again holding all other variables constant. But as
noted previously, for the logit model the rate of change in the probability of an event happening
is given by βj Pi(1 − Pi ), where βj is the (partial regression) coefficient of the jth regressor. But
in evaluating Pi , all the variables included in the analysis are involved. In the probit model, as
we saw earlier, the rate of change in the probability is somewhat complicated and is given by βj f
(Zi ), where f (Zi) is the density function of the standard normal variable and Zi = β1 + β2X2i + ·
· · +βkXki , that is, the regression model used in the analysis.
Thus, in both the logit and probit models all the regressors are involved in computing the
changes in probability, whereas in the LPM only the jth regressor is involved. This difference
may be one reason for the early popularity of the LPM model.
7.1.4. LOGIT AND PROBIT MODELS

In most applications the models are quite similar, the main difference being that the logistic
distribution has slightly fatter tails, which can be seen from Figure 7.3. That is to say, the
conditional probability Pi approaches zero or one at a slower rate in logit than in probit.
Therefore, there is no compelling reason to choose one over the other. In practice many
researchers choose the logit model because of its comparative mathematical simplicity.
Generally logit and probit analysis yield similar marginal effects. However, the tails of the logit
and probit distributions are different and they can give different results if the sample is
unbalanced, with most of the outcomes similar and only a small minority different.
7.1.5. THE TOBIT MODEL
An extension of the probit model is the tobit model originally developed by James Tobin, the
Nobel laureate economist. To explain this model, we continue with our home ownership
example. In the probit model our concern was with estimating the probability of owning a house
as a function of some socioeconomic variables. In the tobit model our interest is in finding out
the amount of money a person or family spends on a house in relation to socioeconomic
variables. Now we face a dilemma here: If a consumer does not purchase a house, obviously we
have no data on housing expenditure for such consumers; we have such data only on consumers
who actually purchase a house. Thus consumers are divided into two groups, one consisting of,
say, n1 consumers about whom we have information on the regressors (say, income, mortgage
interest rate, number of people in the family, etc.) as well as the regressand (amount of
expenditure on housing) and another consisting of n2 consumers about whom we have
information only on the regressors but not on the regressand. A sample in which information on
the regressand is available only for some observations is known as a censored sample.35
Therefore, the tobit model is also known as a censored regression model. Some authors call such
models limited dependent variable regression models because of the restriction put on the
values taken by the regressand. Statistically, we can express the tobit model as
Yi = β1 + β2Xi + ui if RHS > 0
= 0 otherwise
where RHS = right-hand side. Note: Additional X variables can be easily added to the model.
7.2. Time Series Analysis
A time-series is a set of observations on a quantitative or discrete variable collected over time.
Often, independent variables are not available to build a regression model of a time series
variable. In time series analysis, we analyze the past behavior of a variable in order to predict its
future behavior.
Discrete time series may arise in two ways:
 By sampling a continuous time series, and
 By accumulating a variable over a period of time.
Stationary data is a time series variable exhibiting no significant upward or downward trend over
time. Nonstationary data is a time series variable exhibiting a significant upward or downward
trend over time. Seasonal data is a time series variable exhibiting a repeating patterns at regular
intervals over time.
Long term trend: A time series may be stationary or exhibit trend over time. Long term trend is
typically modeled as a linear, quadratic or exponential function. Reasons for trends include:
 Population growth
 Greater demand for products and services
 Greater supply of products and services.
 Technology: impacts on efficiency, supply, and demand
 Innovation (impacts efficiency as well as supply and demand).
Seasonal variation: When a repetitive pattern is observed over some time horizon, the series is
said to have seasonal behavior. Seasonal effects are usually associated with calendar or climatic
changes. Reasons for seasonal influences include:
 Weather:
 Both outdoor and indoor activities can impact demand because of the number of
people involved
 Supplies of products and services may depend on the weather.
 Events, Holidays (often impact supply and demand)

 Seasonal variation is frequently tied to yearly cycles.
Cyclical variation is an upturn or downturn not tied to seasonal variation. It is similar to seasonal
variations except that there is likely not a relationship to the time of the year. Examples of
cyclical influences include:
 Inflation/deflation (energy costs, wages and salaries, and government spending)
 Stock market prices (bull markets, bear markets)
 Consequences of unique events (severe weather, law suits
Random effects are the unexplained variations which we usually treat as randomness. This is the
equivalent of the error term in the analysis of variance model and the regression model. These
are short-term effects, usually. We treat them as independent from one time period to the next.
The length of the duration of these effects would then be shorter than one time period, that is,
one month for monthly data, one year for annual data.
The four components of time series come together to form a time series model. There are two
popular time series models in this case: additive model and multiplicative.
7.2.1. STOCHASTIC PROCESSES (Stationary stochastic and non-stationary stochastic
process)
A random or stochastic process is a collection of random variables ordered in time. If we let Y
denote a random variable, and if it is continuous, we denote it as Y(t), but if it is discrete, we
denoted it as Yt. Since most economic data are collected at discrete points in time, for our
purpose we will use the notation Yt rather than Y(t). If we let Y represent GDP, for our data we
have Y1, Y2, Y3, . . . , Y86, Y87, Y88, where the subscript 1 denotes the first observation and the
subscript 88 denotes the last observation. Keep in mind that each of these Y’s is a random
variable.
A. Stationary stochastic Time Series
A time series is said to be stationary if its statistical properties do no depend on time. A time
series may be stationary with respect to one characteristic, while not stationary with respect to
another. A time series is said to be weakly stationary if it’s mean, variance and autocovariance
do not grow over time. For example, there can be a time series where its mean does not depend
on time while its variance depends on time. As an example, white noise is stationary. A time
series is said to be a white noise process if each value in the time series have zero-mean, constant
conditional variance and is uncorrelated with all other realizations. Another example is the
autoregressive moving average model, which is discrete-time stationary process with continuous
sample space. Many business time series are far from stationary. Nevertheless, stationary time
series can provide us with meaningful sample statistics such as means, variances, and
correlations with other variables.
Also many statistical forecasting methods are based on the assumption that the time series are
approximately stationary. Mathematical transformations like deflation, seasonal adjustment, and
de-trending can be applied to try to stationarize the time series.
B. Nonstationary Stochastic Processes
Although our interest in stationary time series, one often encounters nonstationary time series,
the classic example being the random walk model (RWM). It is often said that asset prices,
such as stock prices or exchange rates, follow a random walk; that is, they are nonstationary. We
distinguish two types of random walks: (1) random walk without drift (i.e., no constant or
intercept term) and (2) random walk with drift (i.e., a constant term is present). Random Walk
without Drift. Suppose ut is a white noise error term with mean 0 and variance σ2. Then the
series Yt is said to be a random walk if
Yt = Yt−1 + ut
In the random walk model, as (Yt = Yt−1 + ut) shows, the value of Y at time t is equal to its value
at time (t − 1) plus a random shock; thus it is an AR(1) model. We can think of (Yt = Yt−1 + ut )
as a regression of Y at time t on its value lagged one period.
It is easy to show that, while Yt is nonstationary, its first difference is stationary. In other words,
the first differences of a random walk time series are stationary.
from (Yt = Yt−1 + ut) we can write
Y1 = Y0 + u1
Y2 = Y1 + u2 = Y0 + u1 + u2
Y3 = Y2 + u3 = Y0 + u1 + u2 + u3
In general, if the process started at some time 0 with a value of Y0, we have
Y =Y
t 0+
∑ ut
E (Y )= E (Y
t 0+ ∑ ut =Y 0 ) why?
In like fashion, it can be shown that

var (Yt) = tσ2
Interestingly, if you write (Yt = Yt−1 + ut) as

(Yt − Yt−1) = ∆Yt = u
As the preceding expression shows, the mean of Y is equal to its initial, or starting, value, which
is constant, but as t increases, its variance increases indefinitely, thus violating a condition of
stationarity. the RWM without drift is a nonstationary stochastic process. In practice Y0 is often
set at zero, in which case E(Yt) = 0.
Interestingly, if you write (Yt = Yt−1 + ut) as

(Yt − Yt−1) = ∆Yt = ut
It is easy to show that, while Yt is nonstationary, its first difference is stationary.
In other words, the first differences of a random walk time series are stationary.
Random Walk with Drift. Let us modify (Yt = Yt−1 + ut) as follows:
Yt = δ + Yt−1 + ut
where δ is known as the drift parameter. The name drift comes from the fact that if we write
the preceding equation as
Yt − Yt−1 = _Yt = δ + ut
it shows that Yt drifts upward or downward, depending on δ being positive or negative. Note that
model (Yt = δ + Yt−1 + ut),) is also an AR(1) model.
Following the procedure discussed for random walk without drift, it can be shown that for the
random walk with drift model (Yt = δ + Yt−1 + ut),
E(Yt) = Y0 + t · δ
var (Yt) = tσ2
As you can see, for RWM with drift the mean as well as the variance increases over time, again
violating the conditions of (weak) stationarity. In short, RWM, with or without drift, is a
nonstationary stochastic process.
TESTS OF STATIONARITY
By now the reader probably has a good idea about the nature of stationary stochastic processes
and their importance. In practice we face two important questions: (1) How do we find out if a
given time series is stationary? (2) If we find that a given time series is not stationary, is there a
way that it can be made stationary? The three different types stationary tests are:
 Graphical Analysis
 Autocorrelation Function (ACF) and Correlogram
 THE Unit Root Test
UNIT ROOT STOCHASTIC PROCESS
Yt = ρYt−1 + ut − 1 ≤ ρ ≤ 1
This model resembles the Markov first-order autoregressive model that we discussed in the
chapter on autocorrelation. If ρ = 1, becomes a RWM (without drift). If ρ is in fact 1, we face
what is known as the unit root problem, that is, a situation of nonstationarity; we already know
that in this case the variance of Yt is not stationary. The name unit root is due to the fact that ρ =
1.11 Thus the terms nonstationarity, random walk, and unit root can be treated as synonymous.
If, however, |ρ| ≤ 1, that is if the absolute value of ρ is less than one, then it can be shown that the
time series Yt is stationary in the sense we have defined it.
In practice, then, it is important to find out if a time series possesses a unit root.
The most common types of stationary tests that has become widely popular over the past several
years is the unit root test.
TRANSFORMING NONSTATIONARY TIME SERIES
Now that we know the problems associated with nonstationary time series, the practical question
is what to do. To avoid the spurious regression problem that may arise from regressing a
nonstationary time series on one or more nonstationary time series, we have to transform
nonstationary time series to make them stationary. The transformation method depends on
whether the time series are difference stationary (DSP) or trend stationary (TSP). We consider
each of these methods in turn.
Difference-Stationary Processes
If a time series has a unit root, the first differences of such time series are stationary. Therefore,
the solution here is to take the first differences of the time series.
Trend-Stationary Process
A TSP is stationary around the trend line. Hence, the simplest way to make such a time series
stationary is to regress it on time and the residuals from this regression will then be stationary.
APPROACHES TO ECONOMIC FORECASTING
Broadly speaking, there are five approaches to economic forecasting based on time series data:
(1) exponential smoothing methods, (2) single-equation regression models, (3) simultaneous-
equation regression models, (4) autoregressive integrated moving average models (ARIMA), and
(5) vector autoregression.

CH - 7 - Econometrics UG

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

CH - 7 - Econometrics UG

Uploaded by

Copyright:

Available Formats

MTU,CANR, Department of Agri Economics

Figure 7.1 Linear probability models.

LPM: A NUMERICAL EXAMPLE

(0.1228) (0.0082) R2 = 0.8048 (7.10)

Table 7.1 data on ownership of house

Figure 7.2. Cumulative distribution function

Therefore, we can write

7.1.3. THE PROBIT MODEL

7.1.4. LOGIT AND PROBIT MODELS

 Events, Holidays (often impact supply and demand)

In like fashion, it can be shown that

Interestingly, if you write (Yt = Yt−1 + ut) as

Interestingly, if you write (Yt = Yt−1 + ut) as

var (Yt) = tσ2

You might also like