
Basic Univariate and Multivariate Analysis

Joint MSC Program


Yom Institute of Economic Development
Zerayehu Sime Eshete (PhD)
Presentation Outline:
1. Statistics and Econometrics
2. Why Is Econometrics a Separate Discipline?
3. Univariate Statistical Analysis
4. Methodology of Econometrics
• Statement of theory or hypothesis (Model Specification).
• Specification of the mathematical model of the theory
• Specification of the statistical, or econometric, model
• Collecting the data
• Estimation of the parameters of the econometric model
• Diagnostic Tests (Post-Estimation Tests)
• Hypothesis testing
• Forecasting or prediction
• Using the model for control or policy purposes.
5. Qualitative Explanatory Variables /Dummy Variables
STATISTICS AND ECONOMETRICS
Statistics:

1. It is the science of learning from data, and of measuring, controlling, and communicating uncertainty; it thereby provides the navigation essential for guiding the course of scientific and societal advances.
2. It is the science of the collection, presentation, analysis, and reasonable interpretation of data, and of making inferences about and predicting relations among variables.
A Taxonomy of Statistics
Econometrics:
• Economists are frequently interested in relationships between different quantities, for example between income and consumption.

• The most important job of econometrics is to quantify these relationships on the basis of available data using statistical techniques, and to interpret, use or exploit the resulting outcomes appropriately.

• Consequently, econometrics is the interaction of economic theory, observed data and statistical methods. It is the interaction of these three that makes econometrics interesting, challenging and, perhaps, difficult.

• Traditionally, econometrics has focused upon aggregate economic relationships. Since the 1970s, econometric methods have increasingly been employed in micro-economic models describing individual, household or firm behaviour, stimulated by the development of appropriate econometric models and estimators (Verbeek 2004).
1. Beyond Theory: Economic theory makes statements or hypotheses that are mostly qualitative in nature (e.g., the law of demand); the law does not provide any numerical measure of the relationship. That is the job of the econometrician.

2. Beyond Mathematical Economics: The main concern of mathematical economics is to express economic theory in mathematical form without regard to measurability or empirical verification of the theory. Econometrics is mainly interested in the empirical verification of economic theory.

3. Beyond Economic Statistics: Economic statistics is mainly concerned with collecting, processing, and presenting economic data in the form of charts and tables (descriptive analysis). It does not go any further; the one who goes further is the econometrician.
Univariate Statistical Analysis:
Descriptive Analysis
Frequency Distribution, Measures of Central Tendency, Measures of
Dispersion, and Shape of Frequency Distribution
Frequency Distribution:
Distribution (of a variable): tells us what values the variable takes and how often it takes these values. A distribution may be unimodal (having a single peak), bimodal (having two distinct peaks), or symmetric (the left and right halves are mirror images).

Frequency Distribution: Consider a data set of 26 children of ages 1-6 years. The frequency distribution of the variable 'age' can then be tabulated as absolute and relative frequencies (the PDF) and cumulative frequencies (the CDF).
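A minimal sketch of such a tabulation in Python, with hypothetical ages since the original table is not reproduced here:

from collections import Counter

# Hypothetical ages of 26 children (the original table is not reproduced here)
ages = [1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6]

n = len(ages)
counts = Counter(ages)        # absolute frequency of each age
cumulative = 0
print("age  freq  rel.freq  cum.freq")
for age in sorted(counts):
    cumulative += counts[age]
    print(f"{age:>3}  {counts[age]:>4}  {counts[age] / n:>8.3f}  {cumulative / n:>8.3f}")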
Graphical Presentation

Figure 1: Bar chart of subjects in treatment groups (x-axis: treatment group 1-3; y-axis: number of subjects, 0-30).
Pie Chart: Lists the categories and presents the percent or
count of individuals who fall in each category.
Measures of Descriptive Statistics
Descriptive statistics: are methods for organizing and summarizing data.

For example, tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data.

A descriptive value for a population is called a parameter, and a descriptive value for a sample is called a statistic.

• Descriptive statistics provide important information about variables. They are used to describe the basic features of the data in a study.

• Descriptive statistics are typically distinguished from inferential statistics. With descriptive statistics you are simply describing what is, or what the data show. With inferential statistics, you are trying to reach conclusions that extend beyond the immediate data alone.
• Three Measures:
1. Measures of Central Tendency: mean, median, and mode measure the central tendency of a variable.
2. Measures of Dispersion (Variability): include variance, standard deviation and range.
3. Shape of Distribution: skewness and kurtosis.
Mean:
For a data set, the mean is the sum of the values divided by the number of values. The mean of a set of numbers $x_1, x_2, \ldots, x_n$ is typically denoted by $\bar{x}$, pronounced "x bar":

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

This mean is a type of arithmetic mean. The mean describes the central location of the data; the arithmetic mean is the "standard" average, often simply called the "mean".
Median:

It is the middle value of the distribution when all items are arranged in either ascending or descending order of value:

$$\text{Med} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value}$$
Mode:

It is the value that occurs most frequently in the data set.
Measures of Dispersion: measure the amount of scatter in a dataset.

Variance:

The variance is used as a measure of how far a set of numbers is spread out from the mean. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value). In particular, the variance is one of the moments of a distribution:

$$\mathrm{Var}(x) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}$$
Standard deviation:

It is a widely used measure of variability or diversity in statistics and probability theory. It shows how much variation or "dispersion" there is from the average (mean, or expected value). A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data are spread out over a large range of values. For a population, the standard deviation of x is given by:

$$SD(x) = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}}$$

A useful property of the standard deviation is that, unlike the variance, it is expressed in the same units as the data.

(Figure: two distributions, one with small variance/small SD and one with large variance/large SD.)
Coefficient of variation (CV):

In probability theory and statistics, the coefficient of variation (CV) is a normalized measure of dispersion of a probability distribution. It is also known as unitized risk or the variation coefficient. The coefficient of variation is defined as the ratio of the standard deviation to the mean:

$$CV = \frac{SD}{\text{Mean}}$$
Covariance:

Covariance between X and Y refers to a measure of how much two variables change together.
Covariance indicates how two variables are related. A positive covariance means the variables are positively related, while a negative covariance means the variables are inversely related. The formula for calculating the covariance of sample data is:

$$\mathrm{Cov}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n}$$
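A minimal sketch of these dispersion measures on hypothetical data (the formulas above use the population convention, dividing by n, i.e., ddof=0 in NumPy):

import numpy as np

# Hypothetical paired observations
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

var_x = np.var(x, ddof=0)               # population variance: sum((x - xbar)^2) / n
sd_x = np.std(x, ddof=0)                # population standard deviation
cv_x = sd_x / np.mean(x)                # coefficient of variation: SD / mean
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))   # population covariance

print(f"Var(x) = {var_x:.3f}, SD(x) = {sd_x:.3f}, CV(x) = {cv_x:.3f}, Cov(x, y) = {cov_xy:.3f}")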
Shape of Frequency Distribution

Skewness:

It refers to the symmetry or asymmetry of the distribution.

Kurtosis:

It refers to the peakedness of the distribution.
Skewness:
It is a measure of the asymmetry of the probability distribution of a real-valued random variable. The skewness value can be positive or negative, or even undefined.
Qualitatively, a negative skew indicates that the tail on the left side of the probability density function is longer than the right side, and the bulk of the values (possibly including the median) lie to the right of the mean.
A positive skew indicates that the tail on the right side is longer than the left side, and the bulk of the values lie to the left of the mean. A zero value indicates that the values are relatively evenly distributed on both sides of the mean, typically but not necessarily implying a symmetric distribution.
The coefficient of skewness is a measure of the degree of symmetry in the variable distribution.
Kurtosis:

It is a measure of the "peakedness" of the probability distribution of a real-valued random variable, although some sources insist that heavy tails, and not peakedness, is what is really being measured by kurtosis. Higher kurtosis means more of the variance is the result of infrequent extreme deviations, as opposed to frequent modestly sized deviations.

The coefficient of kurtosis is a measure of the degree of peakedness/flatness in the variable distribution.
Standardization
• Example: Dave gets a 50 on his Statistics midterm and a 50 on his Calculus midterm. Did he do equally well on these two exams? Big question: how can we compare a person's score on different variables? How we evaluate Dave's performance depends on how much variability there is in the exam scores.

• Standard (Z) Scores: In short, we would like to be able to express a person's score with respect to both (a) the mean of the group and (b) the variability of the scores:

$$Z_i = \frac{X_i - M}{SD}$$

where $X_i - M$ is how far the person is from the mean and SD is the variability. A z-score is thus "how far a person is from the mean, in the metric of standard deviation units."
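A minimal sketch of z-scores with hypothetical exam scores for Dave's two classes:

import numpy as np

# Hypothetical exam scores; Dave scored 50 in both classes
stats_scores = np.array([50, 40, 45, 55, 60, 35, 50, 65])
calc_scores = np.array([50, 55, 60, 58, 62, 57, 65, 53])

def z_score(x, scores):
    """How far x is from the group mean, in standard deviation units."""
    return (x - scores.mean()) / scores.std()

print(f"Dave's z-score in Statistics: {z_score(50, stats_scores):+.2f}")
print(f"Dave's z-score in Calculus:   {z_score(50, calc_scores):+.2f}")
# The same raw score of 50 is average in one class but well below the mean in the other.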
Properties of Standard Scores
1. The mean of a set of z-scores is always zero.
2. The SD of a set of standardized scores is always 1.
3. The distribution of a set of standardized scores has the same shape as the unstandardized scores.
The area under a normal curve

(Figure: the standard normal density over scores from -4 to 4; roughly 34% of the area lies within one SD on each side of the mean, 14% between one and two SDs on each side, and about 2% beyond two SDs on each side.)
CORRELATION BETWEEN VARIABLES
In statistics, dependence refers to any statistical relationship between two random variables or two sets of data. Correlation tells you the degree to which the variables tend to move together.

The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient, or "Pearson's correlation." It is obtained by dividing the covariance of the two variables by the product of their standard deviations.

The Pearson correlation is defined only if both of the standard deviations are finite and both of them are nonzero. The correlation coefficient is symmetric: corr(X, Y) = corr(Y, X).

The Pearson correlation is +1 in the case of a perfect positive (increasing) linear relationship, -1 in the case of a perfect decreasing (negative) linear relationship, and some value between -1 and 1 in all other cases, indicating the degree of linear dependence between the variables. If the variables are independent, Pearson's correlation coefficient is 0 (although the converse does not hold).

The sample correlation coefficient is written:

$$r(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
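A minimal sketch computing Pearson's r by the formula above and checking it against NumPy's built-in (hypothetical data):

import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Covariance divided by the product of the standard deviations
dx, dy = x - x.mean(), y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx**2).sum() * (dy**2).sum())

r_builtin = np.corrcoef(x, y)[0, 1]   # off-diagonal entry of the 2x2 correlation matrix
print(f"r (manual)   = {r_manual:.4f}")
print(f"r (corrcoef) = {r_builtin:.4f}")   # the two agree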
ECONOMETRIC ANALYSIS: Robust Regression Analysis

Regression analysis is a statistical tool for the investigation of relationships between variables. Usually, we seek to ascertain the causal effect of one variable upon another.

Regression analysis estimates the conditional expectation of the dependent variable given the independent variables, that is, the average value of the dependent variable when the independent variables are held fixed:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$

In all cases, the estimation target is a function of the independent variables called the regression function. Regression analysis is widely used for prediction and forecasting.
• Broadly speaking, traditional econometric methodology proceeds along the following lines:

1. Statement of theory or hypothesis (Model Specification)
2. Specification of the mathematical model of the theory
3. Specification of the statistical, or econometric, model
4. Collecting the data
5. Estimation of the parameters of the econometric model
6. Hypothesis testing
7. Diagnostic tests (post-estimation tests)
8. Forecasting or prediction
9. Using the model for control or policy purposes

• To illustrate the preceding steps, let us consider the well-known Keynesian theory of consumption.
1. Statement of Theory or Hypothesis (Model Specification)
• Choosing among Competing Models: When a governmental agency collects economic data, such as that shown in Table I.1, it does not necessarily have any economic theory in mind.

• How then does one know that the data really support the Keynesian theory of consumption? Is it because the Keynesian consumption function (i.e., the regression line) shown in Figure I.3 is extremely close to the actual data points?

• Is it possible that another consumption model (theory) might fit the data equally well? For example, Milton Friedman has developed a model of consumption called the permanent income hypothesis.

• Robert Hall has also developed a model of consumption, called the life-cycle permanent income hypothesis. Could one or both of these models also fit the data in Table I.1?

• In short, the question facing a researcher in practice is how to choose among competing hypotheses or models of a given phenomenon, such as the consumption-income relationship.

• Let us use the Keynesian model for the time being. Keynes states that, on average, consumers increase their consumption as their income increases, but not by as much as the increase in their income (MPC < 1).
2. Specification of the Mathematical Model of Consumption (single-equation model)

Y = β1 + β2X, 0 < β2 < 1 (I.3.1)

Y = consumption expenditure (dependent variable)
X = income (independent, or explanatory, variable)
β1 = the intercept
β2 = the slope coefficient

• The slope coefficient β2 measures the MPC.
• Geometrically, β1 and β2 are the intercept and slope of the consumption line.
3. Specification of the Econometric Model of Consumption
• The relationships between economic variables are generally inexact. In addition to income, other variables affect consumption expenditure. For example, the size of the family, the ages of the members of the family, family religion, etc., are likely to exert some influence on consumption.

• To allow for the inexact relationships between economic variables, (I.3.1) is modified as follows:

Y = β1 + β2X + u (I.3.2)

• where u, known as the disturbance, or error, term, is a random (stochastic) variable that has well-defined probabilistic properties. The disturbance term u may well represent all those factors that affect consumption but are not taken into account explicitly.

• N.B.: The dependent variable (Y) is also called the response, explained, predictand, endogenous, or outcome variable. Independent variables (X) are also called explanatory variables, regressors, exogenous variables, or predictors. And coefficients are called statistics in a sample, and parameters in a population.

• (I.3.2) is an example of a linear regression model, i.e., it hypothesizes that Y is linearly related to X, but that the relationship between the two is not exact; it is subject to individual variation. The econometric model of (I.3.2) can be depicted as shown in Figure I.2.
4. Obtaining Data
• To obtain the numerical values of β1 and β2, we need data. Look at Table I.1, which relates to personal consumption expenditure (PCE) and gross domestic product (GDP). The data are in "real" terms.
5. Estimation of the Econometric Model
The objective is to minimize the error terms, so we apply the Ordinary Least Squares (OLS) method to find the optimal values of the coefficients.

The least squares method minimizes the sum of squared errors (deviations of individual data points from the regression line). Such a and b are called least squares estimators (estimators of the parameters α and β).

The process of obtaining parameter estimators (e.g., a and b) is called estimation: "Regress Y on X."

The least squares method is the estimation method of ordinary least squares (OLS).
The regression line is a straight line that describes the dependence of the average value of one variable on the other:

$$Y_i = \alpha + \beta X_i + \varepsilon_i$$

where $Y_i$ is the dependent (response) variable, $\alpha$ the Y-intercept, $\beta$ the slope coefficient, $X_i$ the independent (explanatory) variable, and $\varepsilon_i$ the random error; the regression line itself is $\alpha + \beta X_i$.
Ordinary Least Squares Method
$$E(Y) = \hat{Y} = a + bX$$
$$\varepsilon = Y - \hat{Y} = Y - (a + bX) = Y - a - bX$$
$$\varepsilon^2 = (Y - \hat{Y})^2 = (Y - a - bX)^2$$
$$\min \sum \varepsilon^2 = \min \sum (Y - a - bX)^2$$

How do we get the coefficients a and b that minimize the sum of squared errors? Compute a and b so that the partial derivatives with respect to a and b are equal to zero:

$$\frac{\partial \sum \varepsilon^2}{\partial a} = \frac{\partial \sum (Y - a - bX)^2}{\partial a} = 2na - 2\sum Y + 2b\sum X = 0$$

$$na - \sum Y + b\sum X = 0 \;\;\Rightarrow\;\; a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} = \bar{Y} - b\bar{X}$$

Take the partial derivative with respect to b and plug in the a you obtained:

$$\frac{\partial \sum \varepsilon^2}{\partial b} = \frac{\partial \sum (Y - a - bX)^2}{\partial b} = 2b\sum X^2 - 2\sum XY + 2a\sum X = 0$$

$$b\sum X^2 - \sum XY + \left(\frac{\sum Y}{n} - b\,\frac{\sum X}{n}\right)\sum X = 0$$

$$b\left(\sum X^2 - \frac{(\sum X)^2}{n}\right) = \sum XY - \frac{\sum X \sum Y}{n}$$

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$$
The least squares method is an algebraic solution that minimizes the sum of squared errors (the variance component of the error):

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{SP_{xy}}{SS_x}$$

$$a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} = \bar{Y} - b\bar{X}$$
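A minimal sketch of these closed-form OLS formulas on hypothetical data:

import numpy as np

# Hypothetical income (X) and consumption (Y) observations
X = np.array([100.0, 120.0, 140.0, 160.0, 180.0, 200.0])
Y = np.array([90.0, 105.0, 118.0, 130.0, 145.0, 158.0])

n = len(X)
# Slope: b = (n*sum(XY) - sum(X)*sum(Y)) / (n*sum(X^2) - (sum(X))^2)
b = (n * (X * Y).sum() - X.sum() * Y.sum()) / (n * (X**2).sum() - X.sum()**2)
# Intercept: a = Ybar - b*Xbar
a = Y.mean() - b * X.mean()
print(f"Y-hat = {a:.3f} + {b:.4f} X")

# Equivalent deviation form: b = SP_xy / SS_x
b_dev = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean())**2).sum()
assert np.isclose(b, b_dev)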
Properties of OLS estimators: The outcome of the least squares method is the OLS parameter estimators a and b.
• OLS estimators are linear.
• OLS estimators are unbiased (on average equal to the true parameters).
• OLS estimators are efficient (smallest variance).
• Gauss-Markov Theorem: Among linear unbiased estimators, the least squares (OLS) estimator has minimum variance; it is BLUE (the best linear unbiased estimator).

In order to estimate the coefficients, we first need the assumptions of the classical linear regression model:
– Linear in parameters
– Linear relationship between Y and the Xs
– Constant slopes (coefficients of the Xs)
– Xs are fixed; Y is conditional on the Xs
– X is exogenous and the error is not related to the Xs
– Constant variance of the errors (homoscedasticity)
– No autocorrelation in the error terms
Therefore, the estimation of the econometric model of our example is as follows:

• Regression analysis is the main tool used to obtain the estimates. Using this technique and the data given in Table I.1, we obtain the following estimates of β1 and β2, namely, -184.08 and 0.7064. Thus, the estimated consumption function is:

$$\hat{Y} = -184.08 + 0.7064X \quad \text{(I.3.3)}$$
$$\text{se} \qquad (24.372) \quad (0.025)$$

• The estimated regression line is shown in Figure I.3. The regression line fits the data quite well. The slope coefficient (i.e., the MPC) is about 0.70: an increase in real income of 1 dollar led, on average, to an increase of about 70 cents in real consumption.

• N.B.: If the variables are converted into logarithms, the coefficients become elasticity measures.
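A minimal sketch of this estimation with statsmodels; the GDP/PCE figures below are hypothetical stand-ins, since Table I.1 is not reproduced here, so the output will not match the estimates in (I.3.3):

import numpy as np
import statsmodels.api as sm

# Hypothetical GDP (X) and PCE (Y) series, in "real" billions
gdp = np.array([5100.0, 5300.0, 5500.0, 5700.0, 5900.0, 6100.0, 6300.0, 6500.0])
pce = np.array([3400.0, 3550.0, 3700.0, 3820.0, 3980.0, 4120.0, 4260.0, 4410.0])

X = sm.add_constant(gdp)          # adds the intercept column (beta_1)
model = sm.OLS(pce, X).fit()      # estimates beta_1 and beta_2 by OLS

print(model.params)               # [beta_1 hat, beta_2 hat]; the slope is the MPC
print(model.bse)                  # standard errors of the estimates
print(model.summary())            # full regression output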
Y i = 0 + 1X 1i + 2X 2i + i
Y (O b s e rv e d Y )

R esponse 0 i
P la n e
X2

X1 ( X 1 i, X 2 i)
 Y |X =  0 +  1 X 1 i +  2 X 2 i
R² and Goodness-of-Fit
Goodness-of-fit measures evaluate how well a regression model fits the data. The smaller the RSS, the better the model fits:

$$TSS = ESS + RSS, \qquad R^2 = \frac{ESS}{TSS}$$

$$\underbrace{\sum_t (Y_t - \bar{Y})^2}_{TSS} = \underbrace{\sum_t (\hat{Y}_t - \bar{Y})^2}_{ESS} + \underbrace{\sum_t (Y_t - \hat{Y}_t)^2}_{RSS}$$
• R² (the Coefficient of Determination) is ESS/TSS (also written SSM/SST); it measures how much of the overall variance of Y the model explains.
• Coefficient of Determination: the proportion of the total variation or dispersion in the dependent variable that is explained by the variation in the explanatory variables in the regression. A large R² means the model fits the data well.
• In simple regression (with only one X), R² is the squared Karl Pearson correlation coefficient: r² = 0.8967² = 0.80.
• If a regression model includes many regressors, R² is not equal to r².
• The addition of any regressor always increases R², regardless of the relevance of the regressor.

• The adjusted R² imposes a penalty for adding regressors:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k}$$
Analysis of Variance and the F Statistic

$$F = \frac{\text{Explained Variation}/(k-1)}{\text{Unexplained Variation}/(n-k)} = \frac{R^2/(k-1)}{(1 - R^2)/(n-k)}$$
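A minimal sketch verifying these identities against statsmodels output, on simulated (hypothetical) data with two regressors:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=50)

X = sm.add_constant(np.column_stack([x1, x2]))
res = sm.OLS(y, X).fit()

n, k = X.shape                      # k counts the intercept as one of the coefficients
r2 = res.rsquared
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
f_from_r2 = (r2 / (k - 1)) / ((1 - r2) / (n - k))

print(f"adjusted R^2 = {adj_r2:.4f} (statsmodels: {res.rsquared_adj:.4f})")
print(f"F from R^2   = {f_from_r2:.2f} (statsmodels: {res.fvalue:.2f})")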
Statistical Tests: Inferential Statistics and Hypothesis Testing
Inferential statistics:

They are methods for using sample data to make general conclusions (inferences) about populations.
Because a sample is typically only a part of the whole population, sample data provide only limited information about the population. As a result, sample statistics are generally imperfect representatives of the corresponding population parameters. The discrepancy between a sample statistic and its population parameter is called sampling error.

Hypothesis testing or significance testing:

It is a method for testing a claim or hypothesis about a parameter in a population, using data measured in a sample.
In this method, we test some hypothesis by determining the likelihood that a sample statistic could have been selected, if the hypothesis regarding the population parameter were true.
The goal of hypothesis testing is to determine the likelihood that a claim about a population parameter, such as the mean, is true. The method can be summarized in five steps:

1. State the Hypotheses: identify the hypothesis or claim that should be tested.
2. Calculate the Test Statistic.
3. Select the Tabulated Value: look up the critical value in the appropriate distribution table.
4. Compare the Calculated and Tabulated Values.
5. Apply the Decision Rules.
A) Hypothesis Testing:

A statistical test provides a mechanism for making quantitative decisions about a process or processes. The intent is to determine whether there is enough evidence to "reject" a conjecture or hypothesis about the process.
The H0 conjecture is called the null hypothesis. Not rejecting may be a good result if we want to continue to act as if we "believe" the null hypothesis is true. Or it may be a disappointing result, possibly indicating we may not yet have enough data to "prove" something by rejecting the null hypothesis.

H0: Null hypothesis, indicating that the current belief is true
H1: Alternative hypothesis, indicating your belief

Null and alternative hypotheses can be two-sided or one-sided, i.e., two-tailed or one-tailed.
Hypothesis testing for individual coefficients:

$$H_0: \beta_i = 0 \qquad H_1: \beta_i \neq 0$$

Hypothesis testing for joint coefficients (overall significance of the goodness of fit):

$$H_0: \beta_1 = \beta_2 = 0 \qquad H_1: \text{not both } \beta_1 \text{ and } \beta_2 \text{ are zero}$$
B) Compute the Test Statistic:

1) Statistical test for an individual coefficient

In theory, the t-statistic of any one variable may be used to test the hypothesis that the true value of the coefficient is zero (which is to say, the variable should not be included in the model). In testing the null hypothesis that the population coefficient equals a specified value, one uses the statistic below, with degrees of freedom = (n - k):

$$t = \frac{\hat{\beta}}{se(\hat{\beta})}$$

where the standard error of the slope estimate is:

$$se(\hat{\beta}) = s_{\hat{b}} = \sqrt{\frac{\sum_t (Y_t - \hat{Y}_t)^2}{(n-k)\sum_t (X_t - \bar{X})^2}} = \sqrt{\frac{\sum_t e_t^2}{(n-k)\sum_t (X_t - \bar{X})^2}}$$
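A minimal sketch of the individual t-test, using the slope and standard error from (I.3.3) and a hypothetical sample size:

from scipy import stats

# Slope and standard error from (I.3.3); n is a hypothetical sample size
beta_hat, se_beta = 0.7064, 0.025
n, k = 15, 2

t_stat = beta_hat / se_beta                       # tests H0: beta = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k)   # two-tailed p-value

print(f"t = {t_stat:.2f}, p = {p_value:.4g}")     # compare |t| with the tabulated t(n-k)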
2) Statistical test for the joint level of significance

The F-ratio provides a test of the significance of all the independent variables (other than the constant term) taken together. The F-ratio is the ratio of the explained variance per degree of freedom used to the unexplained variance per degree of freedom unused, i.e.:

$$F = \frac{ESS/(k-1)}{RSS/(n-k)}$$

where k is the number of coefficients and n is the number of observations.
• That is, to find out whether the estimates obtained in Eq. (I.3.3) are in accord with the expectations of the theory that is being tested. Keynes expected the MPC to be positive but less than 1. In our example we found the MPC to be about 0.70. But before we accept this finding as a confirmation of Keynesian consumption theory, we must enquire whether this estimate is sufficiently below unity. In other words, is 0.70 statistically less than 1? If it is, it may support Keynes' theory.

• Such confirmation or refutation of economic theories on the basis of sample evidence is based on a branch of statistical theory known as statistical inference (hypothesis testing).

• This goes along with statistical inference from sample to population. For the individual significance of the MPC, for example:

$$H_0: \beta_2 = 0, \quad H_1: \beta_2 \neq 0, \qquad t = \frac{0.7064 - 0}{se(\hat{\beta}_2)} = 28.56$$
C) Decision Rules:

1. If $|t_{cal}| > t_{tab}$, or if the p-value < α, reject H0 and accept H1.

Then the coefficient is statistically significant, and the associated variable is a policy variable. If not, it is statistically insignificant and cannot be a policy variable.

2. If $F_{cal} > F_{tab}$, or if the p-value < α, reject H0 and accept H1.

Then all explanatory variables are jointly statistically significant, meaning the model is a good fit. If not, the model is not good.
The probability of obtaining a sample mean, given that the value stated in the null hypothesis is true, is stated by the p-value. The p-value is a probability: it varies between 0 and 1 and can never be negative.

We state the criterion, the probability of obtaining a sample mean at which point we will decide to reject the value stated in the null hypothesis; this is typically set at 5% in behavioral research.

To make a decision, we compare the p-value to the criterion, the level of significance. A p-value is the probability of obtaining a sample outcome, given that the value stated in the null hypothesis is true. The p-value for obtaining a sample outcome is compared to the level of significance. Significance, or statistical significance, describes a decision made concerning a value stated in the null hypothesis. When the null hypothesis is rejected, we reach significance. When the null hypothesis is retained, we fail to reach significance.
7. Forecasting or Prediction

• To illustrate, suppose we want to predict the mean consumption expenditure for 1997. The GDP value for 1997 was 7269.8 billion dollars, so the predicted consumption would be:

$$\hat{Y}_{1997} = -184.0779 + 0.7064(7269.8) = 4951.3 \quad \text{(I.3.4)}$$

• The actual value of the consumption expenditure reported in 1997 was 4913.5 billion dollars. The estimated model (I.3.3) thus over-predicted the actual consumption expenditure by about 37.82 billion dollars, a forecast error of about 0.77 percent of the actual consumption expenditure for 1997.

• Forecasts may be made within sample or out of sample.
8. Diagnostic Tests (Post-Estimation Tests)

• The results of the model MUST satisfy the assumptions of the linear regression model and the properties of the coefficients. Otherwise, we should not use the results!
• Test for Normality
• Test for Multicollinearity
• Test for Autocorrelation
• Test for Homoskedasticity
Test for Normality:

The Jarque-Bera test is a goodness-of-fit test of whether sample data have skewness and kurtosis matching a normal distribution. The test statistic JB is defined as:

$$JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$$

where n is the number of observations (or degrees of freedom in general), S is the sample skewness, and K is the sample kurtosis.

a) Hypothesis Testing

H0: error terms are normally distributed
H1: the null hypothesis is not true

b) Decision Rule

If $JB_{cal} > JB_{tab}$, or if the p-value < α, reject H0 and accept H1.
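A minimal sketch of this test on regression residuals, using SciPy's implementation and simulated (hypothetical) residuals:

import numpy as np
from scipy import stats

# Hypothetical residuals from a fitted regression
rng = np.random.default_rng(1)
resid = rng.normal(size=200)

jb_stat, jb_pvalue = stats.jarque_bera(resid)
print(f"JB = {jb_stat:.3f}, p = {jb_pvalue:.3f}")
# A p-value below alpha (say 0.05) rejects H0 of normally distributed errors.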
Test for Multicollinearity

Multicollinearity is a linear relationship between two or more explanatory variables. One of its features is that the standard errors of the affected coefficients tend to be large. In that case, the test of the hypothesis that the coefficient is equal to zero leads to a failure to reject the null hypothesis.

Steps: run an OLS regression of one explanatory variable on all the other explanatory variables, and calculate the VIF:

$$VIF_i = \frac{1}{1 - R_i^2} = \frac{1}{\text{tolerance}}$$

A high VIF indicates high multicollinearity. As a rule of thumb, if the VIF is less than 10, multicollinearity is not a problem.
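A minimal sketch of the VIF computation with statsmodels, on simulated (hypothetical) regressors where x2 is deliberately built to be nearly collinear with x1:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = 2 * x1 + rng.normal(scale=0.1, size=100)   # nearly collinear with x1
x3 = rng.normal(size=100)

exog = sm.add_constant(np.column_stack([x1, x2, x3]))
for i, name in enumerate(["const", "x1", "x2", "x3"]):
    print(f"VIF({name}) = {variance_inflation_factor(exog, i):.2f}")
# VIFs for x1 and x2 come out far above 10, flagging multicollinearity.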
TEST FOR HETEROSKEDASTICITY

In statistics, a sequence of random variables is heteroskedastic if the random variables have different variances. Heteroskedasticity does not cause ordinary least squares coefficient estimates to be biased, although it can cause ordinary least squares estimates of the variance (and, thus, the standard errors) of the coefficients to be biased, possibly above or below the true or population variance.

Heteroskedasticity can be tested with the Breusch-Pagan test or the Goldfeld-Quandt test:

H0: constant variance (homoskedasticity)
H1: the null hypothesis is not true

If $\chi^2_{cal} > \chi^2_{tab}$, or if the p-value < α, reject H0 and accept H1.
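A minimal sketch of the Breusch-Pagan test with statsmodels, on a simulated (hypothetical) model whose error variance grows with x:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(3)
x = rng.uniform(1, 10, size=200)
y = 2 + 0.5 * x + rng.normal(scale=0.3 * x)   # error SD proportional to x

exog = sm.add_constant(x)
res = sm.OLS(y, exog).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, exog)
print(f"Breusch-Pagan LM = {lm_stat:.2f}, p = {lm_pvalue:.4f}")
# A small p-value rejects H0 of constant variance.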
TEST FOR AUTOCORRELATION

In statistics, the autocorrelation of a random process describes the correlation between values of the process at different points in time, as a function of the two times or of the time difference. For the model:

$$y_t = \beta_0 + \beta_1 x_t + e_t$$

the existence of autocorrelation can be detected by estimating:

$$e_t = \rho\, e_{t-1} + v_t$$

Having the above regression estimate, Durbin and Watson proposed the following statistic to detect the existence of autocorrelation:

$$d = \frac{\sum_{i=2}^{n}(e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} \approx 2(1 - \hat{\rho})$$

Having this, the hypothesis test is:

H0: ρ = 0 (d = 2), no autocorrelation
H1: ρ ≠ 0 (d ≠ 2), there is autocorrelation

Upper and lower critical values, dU and dL, have been tabulated for different values of k, and there are three possibilities: reject, do not reject, or inconclusive.

If d < dL, reject H0: ρ = 0.
If d > dU, do not reject H0: ρ = 0.
If dL ≤ d ≤ dU, the test is inconclusive.
Example: regressing Y on X in a simple regression with sample size 20. After the regression you have:

$$d = \frac{\sum_{i=2}^{n}(e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = 1.08$$

If we choose a 5% level of significance, the critical values corresponding to n = 20 and one regressor are dL = 1.20 and dU = 1.41.

Therefore, since d = 1.08 < dL = 1.20, we reject H0 and conclude that the errors are positively autocorrelated.
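A minimal sketch of the Durbin-Watson statistic with statsmodels, on a simulated (hypothetical) series with AR(1) errors:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(4)
n = 20
x = np.arange(n, dtype=float)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + rng.normal()   # rho = 0.7: positive autocorrelation
y = 1.0 + 0.5 * x + e

res = sm.OLS(y, sm.add_constant(x)).fit()
print(f"Durbin-Watson d = {durbin_watson(res.resid):.2f}")
# d well below 2 points to positive autocorrelation (compare with dL, dU).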
To avoid some of the pitfalls of the Durbin-Watson d test of autocorrelation, the Breusch-Godfrey test has been developed. It addresses these issues in the sense that it allows for non-stochastic regressors, such as the lagged values of the regressand; higher-order autoregressive schemes, such as AR(1), AR(2), etc.; and simple or higher-order moving averages of white-noise error terms.

H0: no serial correlation
H1: the null hypothesis is not true

If $\chi^2_{cal} > \chi^2_{tab}$, or if the p-value < α, reject H0 and accept H1.
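A minimal sketch of the Breusch-Godfrey test with statsmodels (simulated, hypothetical data; nlags sets the autoregressive order tested):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(5)
n = 100
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + e

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=2)   # up to AR(2)
print(f"Breusch-Godfrey LM = {lm_stat:.2f}, p = {lm_pvalue:.4f}")
# A small p-value rejects H0 of no serial correlation.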
9. Use of the Model for Control or Policy Purposes

• Suppose we have the estimated consumption function given in (I.3.3). Suppose further the government believes that consumer expenditure of about 4900 will keep the unemployment rate at its current level of about 4.2%. What level of income will guarantee the target amount of consumption expenditure?

• If the regression results given in (I.3.3) seem reasonable, simple arithmetic will show that:

4900 = -184.0779 + 0.7064X (I.3.6)

• which gives X = 7197, approximately. That is, an income level of about 7197 (billion) dollars, given an MPC of about 0.70, will produce an expenditure of about 4900 billion dollars. As these calculations suggest, an estimated model may be used for control, or policy, purposes. By an appropriate fiscal and monetary policy mix, the government can manipulate the control variable X to produce the desired level of the target variable Y.
• Figure: summarizes the anatomy of classical econometric modeling.
Introducing Qualitative/Categorical/Discrete Explanatory Variables

Regression Models with Dummy Variables

Dummy variables:
They are discrete variables taking a value of '0' or '1'. They are often called 'on'/'off' variables, being 'on' when they equal 1.

Dummy variables can be used as explanatory variables for qualitative, discrete, or categorical data.
*Qualitative dummy variables: e.g., age, sex, race, health.
*Seasonal dummy variables: these depend on the nature of the data, so quarterly data require three dummy variables, etc.
*Dummy variables that represent a change in policy:
–Intercept dummy variables, which pick up a change in the intercept of the regression
–Slope dummy variables, which pick up a change in the slope of the regression
If y is a teacher's salary and
Di = 1 if a non-smoker
Di = 0 if a smoker,
we can model this in the following way:

$$y_i = \alpha + \beta D_i + u_i$$

Keys:

This produces an average salary for a smoker of E(y | Di = 0) = α.
The average salary of a non-smoker will be E(y | Di = 1) = α + β.
A positive β suggests that non-smokers receive a higher salary than smokers.

Equally, we could use the dummy variable in a model with other explanatory variables. In addition to the dummy variable we could also add years of experience (x), to give:

$$y_i = \alpha + \beta D_i + \gamma x_i + u_i$$
(Figure: salary y plotted against experience x; the non-smoker line, with intercept α + β, lies parallel to and above the smoker line, with intercept α.)
Two Ways of Specifying a Model with Dummies:
1) A model with a constant term:
• Drop one dummy category and treat it as the reference category. This protects the model from multicollinearity.
• The constant term coefficient is the mean value of the reference category.
• The coefficients of the dummy variables measure marginal differences from the reference category.
• Example: a model with 4 season dummy variables, examining the impacts of seasonality on wage income:

$$Y = \beta_0 + \beta_1 d_2 + \beta_2 d_3 + \beta_3 d_4 + \varepsilon$$

Exercise 1: seasonality is represented by dummy variables and agricultural wage income is captured by Y.

$$Y = 800 - 200 d_2 + 400 d_3 + 100 d_4 + \varepsilon$$

1. The mean wage of season one (the reference season, S1) is 800 Birr.
2. The wage in season two is 200 Birr less than the S1 wage.
3. The wage in season three is 400 Birr higher than the S1 wage.
4. The wage in season four is 100 Birr higher than the S1 wage.
2) A model without a constant term:
• Drop the constant term. This protects the model from multicollinearity.
• No season dummy variable is dropped as a reference category.
• The coefficients of the dummy variables measure mean values, not marginal differences.
• Example: a model with 4 season dummy variables, examining the impacts of seasonality on wage income.
Exercise 2: seasonality is represented by dummy variables and agricultural wage income is captured by Y. This can be derived directly from the first model (the reference mean plus each marginal difference):

$$Y = 800 d_1 + 600 d_2 + 1200 d_3 + 900 d_4 + \varepsilon$$

1. The mean wage in season one is 800 Birr.
2. The mean wage in season two is 600 Birr.
3. The mean wage in season three is 1200 Birr.
4. The mean wage in season four is 900 Birr.
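A minimal sketch of both specifications, using simulated (hypothetical) wage data with the season means from Exercise 2:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical wage data with season means matching the exercise above
rng = np.random.default_rng(6)
seasons = np.repeat(["S1", "S2", "S3", "S4"], 25)
means = {"S1": 800, "S2": 600, "S3": 1200, "S4": 900}
wage = np.array([means[s] for s in seasons]) + rng.normal(scale=50, size=100)
df = pd.DataFrame({"wage": wage, "season": seasons})

# With a constant: S1 is the reference; dummy coefficients are marginal differences
with_const = smf.ols("wage ~ C(season)", data=df).fit()
print(with_const.params)

# Without a constant: each dummy coefficient is that season's mean wage
no_const = smf.ols("wage ~ C(season) - 1", data=df).fit()
print(no_const.params)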
Interactive Dummies
Dummy variables are simply variables that have been coded either 0 or 1 to indicate that an observation falls into a certain category. They are also sometimes called indicator variables.
Interactive terms capture the possibility that the effect of one independent variable might vary with the level of another independent variable. For example, the effect of a drug on your blood pressure may depend on your age.

OBS  PRESSURE  AGE  DRUG  AGE*DRUG
  1        85   30     0         0
  2        95   40     1        40
  3        90   40     1        40
  4        75   20     0         0
  5       100   60     1        60
  6        90   40     0         0
  7        90   50     0         0
  8        90   30     1        30
  9       100   60     1        60
 10        85   30     1        30
Suppose that when we run a regression, we get the following result.
A) Again we set D = 0 for the control group and D = 1 for those taking the drug:

Y = 70 + 5(Drug) + 0.44(Age) + 0.21(Drug*Age)

B) We obtain two separate equations for the two groups:

Set D = 0: Y = 70 + 0.44(Age)
Set D = 1: Y = 75 + 0.65(Age)

(Figure: blood pressure against age; the drug group follows Y = 75 + 0.65(Age) and lies above the control group's Y = 70 + 0.44(Age), with a steeper slope.)
– Note that for those taking the drug, not only does the intercept increase (that is, the average level of blood pressure), but so does the slope.
– Interpretation of an interactive term: the effect of one independent variable (DRUG) depends on the level of another independent variable (AGE).
– The results here suggest that for people not taking the drug, each additional year adds 0.44 units to blood pressure.
– For people taking the drug, each additional year increases blood pressure by 0.65 units.
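A minimal sketch of fitting such an interaction model to the ten observations in the table above (the fitted coefficients need not match the illustrative numbers, which were supposed for exposition):

import pandas as pd
import statsmodels.formula.api as smf

# The ten observations from the table above
df = pd.DataFrame({
    "pressure": [85, 95, 90, 75, 100, 90, 90, 90, 100, 85],
    "age":      [30, 40, 40, 20,  60, 40, 50, 30,  60, 30],
    "drug":     [ 0,  1,  1,  0,   1,  0,  0,  1,   1,  1],
})

# "drug * age" expands to drug + age + drug:age (the interaction term)
res = smf.ols("pressure ~ drug * age", data=df).fit()
print(res.params)
# The drug:age coefficient measures how the age slope differs for the drug group.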

Do not fall into the dummy variable trap!

This happens when you enter both values of a dummy variable, together with a constant, in the same regression. The dummy columns are then linearly dependent, and one will drop out.
Thank You
