Uploaded by Amitabha Sinha

ASSOCIATE PROFESSOR

DEPARTMENT OF A & A ECONOMICS

TRIPURA UNIVERSITY

**I. UNDERSTANDING THE ASSUMPTIONS OF CLRM**

In applying the least squares principle, we are in fact estimating the average of Y. We postulate that Y is decomposable into ‘a’, bX and ‘u’, where ‘a’ + bX is its average for a given X. The so-called disturbance term ‘u’ is nothing but the random component of Y in the population we are theorizing about. The estimators of the slope and intercept inherit their properties from Y, and Y inherits its properties from ‘u’. What we assume about ‘u’ therefore determines the sampling distribution of the estimators. Why should assumptions about the population determine the properties of estimators based on observed samples of Y? The answer is that we take the sample to be representative of the population, and so we appeal to the principle of analogy. To make this point clearer, we first write down the population regression function (PRF), which is never directly observable, and the sample regression function (SRF), which is based on the observed Y and X.

$$\text{PRF:}\quad Y_i = \beta_0 + \beta_1 X_i + u_i \qquad (5)$$

$$\text{SRF:}\quad Y_i = \beta_0^* + \beta_1^* X_i + e_i \qquad (6)$$

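Now, the OLS formula for β1* is

$$\beta_1^* = \frac{\sum x_i Y_i}{\sum x_i^2} = \frac{\sum x_i(\beta_0 + \beta_1 X_i + u_i)}{\sum x_i^2}$$

where $x_i = X_i - \bar{X}$ is the deviation of $X_i$ from its sample mean. The first equality is the OLS formula; the second holds by analogy, with Y taken from the PRF.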

**The important point is that in deriving the properties of the estimators, we treat the sample Y as if it were the population Y, which we never observe.** The sampling distribution of the estimators therefore depends on the distribution of the disturbance term u.

As a result, the assumptions of the CLRM about u become crucial in determining the sampling distributions of the estimators. Moreover, the stronger the assumptions, the more attractive the properties of the estimators turn out to be. Unfortunately, to the dismay of the researcher, the stronger the assumptions, the less realistic they may be. Hence the researcher is caught on the horns of a dilemma. To make the estimators attractive, that is, unbiased, of low variance, linear in Y, testable by means of confidence intervals and so on, she must impose stricter assumptions on ‘u’. But these assumptions may take her away from the reality of the available data set, so she would like to keep the assumptions as weak as possible. ‘Occam’s Razor’ summarizes this dilemma.

In the context of EM, the regression model fulfills the

following three conditions:

(1) The expected value of ui given Xi is zero. Symbolically,

$$E(u_i \mid X_i) = 0$$

**Why does this condition hold in the case of randomized experiments?** The answer is not far to seek. X denotes the treatment level, and each subject is assigned to a treatment level at random. For a given level of Xi, positive and negative values of u are therefore equally likely and balance out on average. Consequently, the sum of the products pi·ui is zero, where pi is the probability of ui. In short, E(ui|Xi) = 0.

(2) The covariance of Xi and ui is zero. Symbolically,

$$\mathrm{Cov}(u_i, X_i) = 0$$

This is a direct consequence of randomization. Because the subjects are assigned to the different groups at random, the correlation between X and u is zero by construction.
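The two conditions above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the three treatment levels, the normal distribution of u, and the sample size are hypothetical choices made for illustration, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n = 100_000
# Hypothetical treatment levels, assigned to subjects purely at random.
levels = np.array([0.0, 1.0, 2.0])
X = rng.choice(levels, size=n)

# Hypothetical disturbance, drawn independently of the assignment.
u = rng.normal(loc=0.0, scale=1.0, size=n)

# Sample analogue of E(u | X = x) for each treatment level.
cond_means = {float(x): float(u[X == x].mean()) for x in levels}

# Sample analogue of Cov(u, X).
cov_uX = float(np.cov(u, X)[0, 1])

print("mean of u within each treatment level:", cond_means)
print("sample covariance of u and X:", cov_uX)
```

With random assignment, the within-level averages of u and the sample covariance should both be close to zero, and they get closer as n grows.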

There are certain sources of confusion here. The least squares principle leads to two normal equations. The first is Σei = 0 and the second is ΣXiei = 0. These are the mathematical first-order conditions for minimizing the sum of squared errors. They are restrictions imposed on the sample data by the least squares principle. The two assumptions mentioned above, on the other hand, are assumptions about a potentially unobservable population.
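The distinction can be checked numerically. In the sketch below, the PRF coefficients (1.0 and 0.5), the sample size and the distributions are hypothetical values chosen for illustration. The residuals e satisfy the two normal equations exactly in every sample, whereas the disturbances u, which we can see only because we simulated them, do not.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 50
X = rng.uniform(0.0, 10.0, size=n)
u = rng.normal(0.0, 2.0, size=n)   # population disturbance (unobservable in practice)
Y = 1.0 + 0.5 * X + u              # hypothetical PRF

# OLS slope and intercept in deviation-from-mean form.
x = X - X.mean()
b1 = (x * Y).sum() / (x ** 2).sum()
b0 = Y.mean() - b1 * X.mean()

# Sample residuals from the SRF.
e = Y - b0 - b1 * X

sum_e = e.sum()          # first normal equation: zero up to rounding
sum_Xe = (X * e).sum()   # second normal equation: zero up to rounding
sum_u = u.sum()          # not forced to zero in any given sample

print(sum_e, sum_Xe, sum_u)
```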

Why, then, do we expect sample estimators like OLS (ordinary least squares) to show desirable properties such as unbiasedness when the disturbance term of the PRF fulfills conditions like the two mentioned above? It is because of the basic assumption that the sample is representative of the population. Otherwise, the least squares principle cannot help at all. Imagine a hunter with a gun that shoots straight and hits the target most of the time, and whose stray bullets do not stray very far. He is asked to kill a man-eating tiger, but he goes to a jungle full of deadly serpents with no tiger in it. Will his gun help? Similarly, OLS cannot help if the sample comes from the ‘wrong’ population.

Here the researcher must be cautious indeed. What will help the researcher in selecting a representative sample?

First, a clear definition of the population is needed. Suppose the population consists of married females in the age group 18 to 35 years with at least one living child. It will be less difficult to collect a sample in this case than if we simply say that all females constitute our population. Of course, much depends on the objectives of the study.

Second, the method of data collection should not introduce errors. For example, females may understate their age. An appropriate data collection strategy has to be adopted, such as consulting the school leaving certificate or municipality certificate. Sometimes participatory learning methods like focus groups may help.

Third, the sample size should be sufficiently large, say, more than 30.

The issues of random sampling can be addressed once the problems mentioned above have been dealt with.

Let us assume that both requirements mentioned above, namely representativeness and randomization, are met. What will be the consequences now for the OLS estimators if the two assumptions of the CLRM are met?

We generally introduce the concept of the ‘bias’ of a sample-based estimator to analyze the consequences. If we have β1* for the sample and its population counterpart β1, then bias is defined as follows:

$$\text{Bias} = E(\beta_1^*) - \beta_1$$

Here, E denotes the expected value. If the bias is zero, then the regression model gives a proper estimate of the effect of X on Y, like EM.

Will the OLS estimators be unbiased if the two assumptions mentioned above are met? The answer is NO. Unbiasedness requires that the average of the sample estimator over repeated samples equal the population parameter. Now we note that

$$\beta_{1,OLS}^* = \beta_1 + \frac{\sum (X_i - \bar{X})\,u_i}{\sum (X_i - \bar{X})^2} \qquad (7)$$
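One way to reach expression (7) is sketched below. It starts from the OLS slope formula with Y replaced by the PRF, and writes $x_i = X_i - \bar{X}$ for the deviation of $X_i$ from its sample mean (a notational assumption made here for compactness).

```latex
\begin{align*}
\beta_1^* &= \frac{\sum x_i Y_i}{\sum x_i^2}
           = \frac{\sum x_i(\beta_0 + \beta_1 X_i + u_i)}{\sum x_i^2} \\
          &= \beta_0\,\frac{\sum x_i}{\sum x_i^2}
           + \beta_1\,\frac{\sum x_i X_i}{\sum x_i^2}
           + \frac{\sum x_i u_i}{\sum x_i^2} \\
          &= \beta_1 + \frac{\sum (X_i - \bar{X})\,u_i}{\sum (X_i - \bar{X})^2}
\end{align*}
```

The last step uses two facts: $\sum x_i = 0$, which removes the $\beta_0$ term, and $\sum x_i X_i = \sum x_i (x_i + \bar{X}) = \sum x_i^2$, which makes the coefficient on $\beta_1$ equal to one.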

This expression can be easily derived by writing the regression equation in deviation-from-mean form and using the result that the deviations of X from its mean sum to zero. If we take expectations of both sides, the numerator of the second term on the right-hand side involves E(ui). Now, from the Law of Iterated Expectations, we have the result

$$E(u_i) = E\{E(u_i \mid X_i, X_1, X_2, \ldots, X_N)\}$$

For E(ui) to be zero, which makes the estimator unbiased, we require that (a) E(ui|Xi) does not depend on the other Xs, such as X1, X2, X3, etc., and (b) E(ui|Xi) = 0 for each level of X. This leads us to the third important assumption of the CLRM.

(3) The disturbance terms are independently and identically distributed (i.i.d.). Conditions (a) and (b) imply this. Randomization, of course, ensures that this condition is fulfilled, through random assignment to the experimental groups.
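The bias defined earlier can be approximated by simulation when all three conditions hold. The sketch below is illustrative only: the population parameters (1.0 and 0.5), the four treatment levels, the normal i.i.d. disturbances and the number of replications are hypothetical choices. It draws many samples, computes β1* in each, and compares the average of the estimates with β1.

```python
import numpy as np

rng = np.random.default_rng(2)

beta0, beta1 = 1.0, 0.5   # hypothetical population parameters
n, reps = 40, 5000        # sample size and number of repeated samples

estimates = np.empty(reps)
for r in range(reps):
    # Random assignment to hypothetical treatment levels.
    X = rng.choice([0.0, 1.0, 2.0, 3.0], size=n)
    # i.i.d. disturbances, independent of the assignment.
    u = rng.normal(0.0, 1.0, size=n)
    Y = beta0 + beta1 * X + u
    # OLS slope in deviation-from-mean form.
    x = X - X.mean()
    estimates[r] = (x * Y).sum() / (x ** 2).sum()

# Simulated counterpart of Bias = E(beta1*) - beta1.
bias = estimates.mean() - beta1
print("average estimate:", estimates.mean(), "simulated bias:", bias)
```

Individual estimates scatter around β1, but their average over repeated samples should lie very close to it, which is what a zero bias means.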

**II. EXTENSIONS**

A few concluding remarks before we wind up this lecture:

(1) So far we have assumed a single regressor, but we can have more than one explanatory variable. We may then have to ask: what is the effect of X1 on Y, holding the other variables constant? Multiple regression is perfectly capable of answering such questions. In fact, it is designed to do precisely such work. The coefficients of the Xs measure these partial impacts. Do we need an additional assumption now? Yes, one assumption is required: the different explanatory variables must not really be the same variable, in the sense that one can be derived from the other mathematically as a linear transformation. Technically, this is called the absence of perfect or deterministic multicollinearity. When perfect multicollinearity is present, partial impacts are not defined at all. Nonlinear relationships among the regressors pose no problem, however. Nor does imperfect or stochastic multicollinearity.
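The three cases in remark (1) can be told apart by the rank of the regressor matrix. The sketch below uses hypothetical regressors built for illustration. When X2 is an exact linear transformation of X1, the matrix with columns (1, X1, X2) loses a rank, so the separate coefficients cannot be determined. A nonlinear function of X1, or a noisy linear one, keeps full rank.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 30
X1 = rng.normal(size=n)

# Three hypothetical second regressors.
X2_perfect = 2.0 * X1 + 3.0                               # exact linear transformation of X1
X2_nonlinear = X1 ** 2                                    # nonlinear function of X1
X2_imperfect = 2.0 * X1 + rng.normal(scale=0.1, size=n)   # linear in X1 plus noise

def design_rank(X2):
    """Rank of the regressor matrix with columns: constant, X1, X2."""
    Z = np.column_stack([np.ones(n), X1, X2])
    return int(np.linalg.matrix_rank(Z))

rank_perfect = design_rank(X2_perfect)
rank_nonlinear = design_rank(X2_nonlinear)
rank_imperfect = design_rank(X2_imperfect)

print(rank_perfect, rank_nonlinear, rank_imperfect)
```

A rank below the number of columns (three here) signals perfect multicollinearity.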

(2) What about the variance of the OLS estimators? If the disturbance terms have the same (homo) variation (scedasticity) at each level of treatment, then OLS uses all the relevant information. It will have the minimum variance among all linear and unbiased estimators. There are two reasons for this. First, the OLS weights ki are perfectly correlated with the explanatory variable (Σkixi = 1). Second, the estimators are based on the principle of minimizing the sum of squared residuals, the sample counterpart of the variance of the disturbance term. This homoscedasticity assumption is required to establish the Gauss-Markov Theorem.
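The OLS weights mentioned in remark (2) can be computed directly. The sketch below assumes the standard definition ki = xi / Σxi², with xi the deviation of Xi from its mean; the data-generating values are hypothetical. It checks that Σki = 0 and Σkixi = 1, and that the slope is the weighted sum Σki·Yi, which is what "linear in Y" means.

```python
import numpy as np

rng = np.random.default_rng(4)

n = 25
X = rng.uniform(0.0, 10.0, size=n)
Y = 1.0 + 0.5 * X + rng.normal(0.0, 1.0, size=n)  # hypothetical homoscedastic PRF

x = X - X.mean()          # deviations from the mean
k = x / (x ** 2).sum()    # OLS weights (assumed definition: k_i = x_i / sum of x_i^2)

sum_k = k.sum()           # should be zero up to rounding
sum_kx = (k * x).sum()    # should be one up to rounding

# The slope as a linear function of Y, compared with a library fit.
b1_weights = (k * Y).sum()
b1_polyfit = np.polyfit(X, Y, 1)[0]   # slope from a degree-1 least squares fit

print(sum_k, sum_kx, b1_weights, b1_polyfit)
```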

(3) Time series econometrics requires further extensions.
