
1. For the simple regression model yi = α + βxi + εi,

Under OLS estimation of this simple linear regression there is a constant term α, so the first column of X must be a column of ones; it is this column of ones that guarantees, through the normal equations, that the OLS residuals sum to zero.

Let

X = [ 1  x1 ]
    [ 1  x2 ]
    [ ⋮   ⋮ ]
    [ 1  xn ]

A. Show that the least squares normal equations imply ∑ei = 0 and ∑xiei = 0.

Answer: Rewrite the normal equations as follows:

X′Y − X′Xβ̂ = 0

X′(Y − Xβ̂) = 0

X′e = 0

For every column xk of X, this implies xk′e = 0. Since the first column of X is a column of 1s, the first normal equation gives

x1′e = i′e = ∑ei = 0,

so the least squares residuals sum to zero, and the second normal equation gives

∑xiei = 0.

B. Show that the solution for the constant term is α = ȳ − βx̄.

Using the first normal equation, the least squares residuals sum to zero:

∑ei = 0,   where ei = yi − α − βxi

∑ei = ∑(yi − α − βxi) = 0

∑yi = nα + β∑xi

(1/n)∑yi = α + β(1/n)∑xi

ȳ = α + βx̄  ⇒  α = ȳ − βx̄

C. Show that the solution for b is b = ∑(xi − x̄)(yi − ȳ) / ∑(xi − x̄)².


We know that ∑ei = 0 and ∑xiei = 0. Since ∑x̄ei = x̄∑ei = 0, these together give

∑(xi − x̄)ei = 0.

Substituting ei = yi − α − bxi,

∑(xi − x̄)(yi − α − bxi) = 0.

From question B above we have α = ȳ − bx̄; substituting this for α,

∑(xi − x̄)(yi − ȳ − b(xi − x̄)) = 0

∑(xi − x̄)(yi − ȳ) = b∑(xi − x̄)²

Therefore the solution for b is

b = ∑(xi − x̄)(yi − ȳ) / ∑(xi − x̄)².
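As a quick numerical check of parts A–C (an added sketch, not part of the original solution), the closed-form formulas can be evaluated with numpy; the simulated data and coefficient values below are arbitrary assumptions used only for illustration:

# Check that the closed-form alpha and b satisfy both normal equations.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 0.5 * x + rng.normal(size=50)    # hypothetical data-generating values

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()                # alpha = y-bar - b * x-bar
e = y - a - b * x                          # least squares residuals

print(np.isclose(e.sum(), 0.0))            # first normal equation: sum(e) = 0
print(np.isclose((x * e).sum(), 0.0))      # second normal equation: sum(x*e) = 0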

D. Prove that these two values uniquely minimize the sum of squares by showing that the diagonal elements of the second derivatives matrix of the sum of squares with respect to the parameters are both positive and that the determinant is 4n∑(xi − x̄)², which is positive unless all values of x are the same.

Solution: The OLS estimator minimizes the sum of squared residuals

ε̂′ε̂ = ∑(yi − xi′β̂)² = (y − Xβ̂)′(y − Xβ̂)

ε̂′ε̂ = y′y − 2β̂′X′y + β̂′X′Xβ̂

We find the value of β̂ that minimizes ε̂′ε̂ by differentiating with respect to β̂ and setting the result equal to zero:

∂ε̂′ε̂/∂β̂ = −2X′y + 2X′Xβ̂ = 0,  which equals  −2X′(y − Xβ̂) = −2X′e.

The second order derivative matrix is

∂²e′e/∂β̂∂β̂′ = 2X′X.

We need to show that this matrix is positive definite. From the X matrix stated at the beginning,

X′X = [ n      ∑xi  ]        and        2X′X = [ 2n      2∑xi  ]
      [ ∑xi    ∑xi² ]                          [ 2∑xi    2∑xi² ]

The diagonal elements are 2n and 2∑xi², which are clearly both positive.

The determinant is

(2n)(2∑xi²) − 4(∑xi)² = 4n∑xi² − 4(nx̄)² = 4n[∑xi² − nx̄²] = 4n∑(xi − x̄)²,

which is positive unless all values of x are the same. This is the result that we want.
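A small numerical sketch of part D (an added illustration, assuming numpy is available): for any hypothetical set of x values that are not all equal, the Hessian 2X′X has positive diagonal elements and a positive determinant equal to 4n∑(xi − x̄)².

# Verify the diagonal elements and the determinant formula for 2*X'X.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])           # hypothetical regressor values
X = np.column_stack([np.ones_like(x), x])    # design matrix with a column of ones
H = 2 * X.T @ X                              # second-derivative matrix

print(np.diag(H))                            # 2n and 2*sum(x_i^2), both positive
print(np.linalg.det(H))                      # positive since the x values differ
print(np.isclose(np.linalg.det(H), 4 * len(x) * np.sum((x - x.mean()) ** 2)))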

2. Suppose that b is the least squares coefficient vector in the regression of y on X and that c is any other K×1 vector. Prove that the difference in the two sums of squared residuals is

(y − Xc)′(y − Xc) − (y − Xb)′(y − Xb) = (c − b)′X′X(c − b).

Prove that this difference is positive.


Solution :

Let us express c as b + (c - b). Then, the sum of squared residuals based on c is


(y - Xc)′(y - Xc)

= [y - X(b + (c - b))] ′[y - X(b + (c - b))]

= [(y - Xb) + X(c - b)] ′[(y - Xb) + X(c - b)]


= (y - Xb) ′(y - Xb) + (c - b) ′X′X(c - b) + 2(c - b) ′X′(y - Xb).
But the third term is zero, because 2(c - b)′X′(y - Xb) = 2(c - b)′X′e = 0 by the least squares normal equations. Hence,
(y - Xc)′(y - Xc) = e′e + (c - b)′X′X(c - b)
or (y - Xc)′(y - Xc) - e′e = (c - b)′X′X(c - b).
The right hand side can be written as Z′Z where Z = X(c - b); a sum of squares is nonnegative, and it is strictly positive whenever c ≠ b and X has full column rank.
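The algebra above can also be checked numerically; the following sketch uses arbitrary simulated data (an assumption added purely for illustration):

# Confirm SSR(c) - SSR(b) = (c-b)'X'X(c-b) >= 0 on hypothetical data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=40)

b = np.linalg.solve(X.T @ X, X.T @ y)        # OLS coefficient vector
c = b + rng.normal(size=3)                   # any other K x 1 vector
e = y - X @ b

ssr_c = (y - X @ c) @ (y - X @ c)
ssr_b = e @ e
diff = ssr_c - ssr_b
print(np.isclose(diff, (c - b) @ X.T @ X @ (c - b)))   # equality from the proof
print(diff > 0)                                        # positive: X has full rank, c != b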

3. Explain what effect the following problems have on the properties of the least squares estimates of the coefficients and their standard errors. How would you detect whether each problem was present?

(a) Heteroscedasticity. (b) Serial correlation. (d) Non-normality.

(e) Non-linearity. (f) Exact multicollinearity.

The classical ordinary least squares (OLS) estimators of the population parameters in a regression model will be unbiased, efficient, have minimum mean square error (MSE) and be consistent, if the following assumptions hold true:

1. The model is correctly specified, e.g., all relevant explanatory variables are included in the
regression.

2. The error terms are normally distributed.

3. The error terms have constant variance.

4. The error terms are independent of each other.

If the above assumptions are "violated", the classical OLS regression formulas may not be unbiased, efficient, have minimum mean square error (MSE), or be consistent.

(a) Heteroscedasticity

Heteroscedasticity occurs when the error term has non-constant variance. In this case, we can
think of the disturbance for each observation as being drawn from a different distribution with a
different variance. Stated equivalently, the variance of the observed value of the dependent
variable around the regression line is non-constant. We can think of each observed value of the
dependent variable as being drawn from a different conditional probability distribution with a
different conditional variance. A general linear regression model with the assumption of
heteroscedasticity can be expressed as follows:

Yi = β1 + β2Xi2 + … + βkXik + εi

Var(εi) = E(εi²) = σi²   for i = 1, 2, …, n

Consequences of heteroscedasticity on OLS

If the error term has non-constant variance, then the consequences of using the OLS estimator to
obtain estimates of the population parameters are:

1. The OLS estimator is still unbiased.

2. The OLS estimator is inefficient; that is, it is not BLUE.

3. The estimated variances and covariances of the OLS estimates are biased and inconsistent.

4. Hypothesis tests are not valid.

5. The OLS estimator is still asymptotically normally distributed.

Detection of heteroscedasticity
Heteroscedasticity can be detected informally by estimating the equation with OLS and plotting the residuals against the fitted values or against each explanatory variable, and formally with tests such as the Breusch-Pagan and White tests.
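A hedged sketch of how such a test could be run, assuming the statsmodels package is available; the data-generating process below is an assumption chosen only to produce heteroscedastic errors:

# Breusch-Pagan test on an OLS fit with error variance that grows with x.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=200)
y = 1.0 + 0.5 * x + rng.normal(scale=x, size=200)   # error spread increases with x

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print(lm_stat, lm_pvalue)   # a small p-value points to heteroscedasticity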
(b) Serial correlation.
Autocorrelation occurs when the errors are correlated. In this case, we can think of the disturbances for different observations as being drawn from distributions that are not independent of one another.
Consequences of Autocorrelation on OLS and Standard error
The consequences are the same as heteroscedasticity. That is:
1. The OLS estimator is still unbiased.
2. The OLS estimator is inefficient; that is, it is not BLUE.
3. The estimated variances and covariances of the OLS estimates are biased and
inconsistent. If there is positive autocorrelation, and if the value of a right-hand side
variable grows over time, then the estimate of the standard error of the coefficient
estimate of this variable will be too low and hence the t-statistic too high.
4. Hypothesis tests are not valid.
Detection of autocorrelation.
There are several ways to use the sample data to detect the existence of autocorrelation.
Plot the residuals
One way to detect autocorrelation is to estimate the equation using OLS, and then plot the
residuals against time.
The Durbin-Watson d test: The most often used test for first-order autocorrelation is the
Durbin-Watson d test. It is important to note that this test can only be used to test for
first-order autocorrelation, it cannot be used to test for higher-order autocorrelation. Also,
this test cannot be used if the lagged value of the dependent variable is included as a
right-hand side variable.
The Breusch-Godfrey Lagrange Multiplier Test: The Breusch-Godfrey test is a general test of autocorrelation. It can be used to test for first-order autocorrelation or higher-order autocorrelation. This test is a specific type of Lagrange multiplier test. Economists usually test for positive autocorrelation because negative serial correlation is highly unusual when using economic data. For a test against second-order autocorrelation, the null and alternative hypotheses are

H0: ρ1 = ρ2 = 0   vs   H1: at least one ρ is not zero.
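The following sketch (an added illustration, assuming statsmodels is available) computes the Durbin-Watson statistic and the Breusch-Godfrey test on simulated data whose errors follow an assumed AR(1) process:

# Autocorrelation diagnostics on an OLS fit with AR(1) errors.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                  # errors follow u_t = 0.7*u_{t-1} + v_t
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(res.resid))        # well below 2 under positive autocorrelation
lm, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)
print(lm, lm_pval)                     # small p-value rejects "no autocorrelation"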

(f) Exact multicollinearity: One of the assumptions of the CLR model is that there are no exact linear relationships between the independent variables and that there are at least as many observations as independent variables (so that the regressor matrix has full rank). If either of these is violated it is impossible to compute the OLS estimates and the estimating procedure simply breaks down.

In estimation the number of observations should be greater than the number of parameters to be estimated. The difference between the sample size and the number of parameters (the degrees of freedom) should be as large as possible.

If multicollinearity is perfect, the regression coefficients of the X variables are indeterminate and their standard errors infinite.

Consequences of multicollinearity
1. Although still BLUE, the OLS estimators have large variances, making precise estimation difficult. They remain BLUE because near collinearity does not violate the classical assumptions.

2. The confidence intervals tend to be much wider, leading to acceptance of the null hypothesis.

3. The t ratios may tend to be insignificant while the overall coefficient of determination is high.

4. The OLS estimators and their standard errors could be sensitive to small changes in
the data.

Detection of multicollinearity.

The presence of multicollinearity is detected by

1. A relatively high R² and a significant F-statistic with few significant t-statistics.

2. Wrong signs of the regression coefficients

3. Examination of partial correlation coefficients among the independent variables.

4. Use subsidiary or auxiliary regressions. This involves regressing each independent variable on the remaining independent variables and using an F-test to determine the significance of R².

5. Using the VIF (variance inflation factor), as in the sketch below.
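As an added illustration of points 4 and 5 (assuming statsmodels is available), the sketch below computes variance inflation factors for hypothetical, nearly collinear regressors:

# VIF diagnostics: each VIF is 1/(1 - R^2) from the auxiliary regression of
# that regressor on the others.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
x2 = 2 * x1 + rng.normal(scale=0.1, size=100)   # nearly collinear with x1
x3 = rng.normal(size=100)

X = sm.add_constant(np.column_stack([x1, x2, x3]))
for j in range(1, X.shape[1]):                  # skip the constant column
    print(f"VIF for regressor {j}:", variance_inflation_factor(X, j))
# VIF values far above 10 for x1 and x2 flag severe near multicollinearity.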

(d) Nonnormality

If the error terms are not normally distributed, inferences about the regression
coefficients (using t-tests) and the overall equation (using the F-test) will become
unreliable. However, as long as the sample sizes are large (namely the sample size
minus the number of estimated coefficients is greater than or equal to 30) and the
error terms are not extremely different from a normal distribution, such tests are
likely to be robust. Whether the error terms are normally distributed can be assessed
by using methods like the normal probability plot. As a formal check for non-normal errors, one can estimate the skewness and kurtosis of the residuals; these values can be obtained from the descriptive statistics.
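A short added sketch of these checks (assuming statsmodels is available), computing the residual skewness, kurtosis, and the Jarque-Bera statistic on simulated data with assumed heavy-tailed errors:

# Normality diagnostics on OLS residuals.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(5)
x = rng.normal(size=150)
y = 1.0 + 0.5 * x + rng.standard_t(df=3, size=150)   # heavy-tailed errors

res = sm.OLS(y, sm.add_constant(x)).fit()
jb_stat, jb_pvalue, skew, kurt = jarque_bera(res.resid)
print(skew, kurt)          # values far from 0 and 3 suggest non-normal errors
print(jb_stat, jb_pvalue)  # a small p-value rejects normality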

4. Suppose you face the demand curve Q = α + βp + ε. In the past, you have set the following prices and sold the accompanying quantities (the price-quantity data are reproduced in the Q and P columns of the table below).

Suppose that your marginal cost is 10. Based on the least squares regression, compute a 95% confidence interval for the expected value of the profit maximizing output.

Solution

Since the mean value of the error term is zero, E(ε) = 0, the expected value of the demand function is E(Q) = q = α + βP, or, from the inverse demand function, E(P) = −α/β + (1/β)q = p. Total revenue is then TR = pq = q(−α/β + (1/β)q) = (−α/β)q + (1/β)q². Differentiating total revenue with respect to q for this linear demand curve, marginal revenue is MR = d(pq)/dq = −α/β + (2/β)q. The profit maximizing output is that at which marginal revenue equals marginal cost, or 10. Equating MR to 10 and solving for q produces

−α/β + (2/β)q = 10  ⇒  q_eqm = α/2 + 5β,

the equilibrium output, so we require a confidence interval for this combination of the parameters.

S.No.     Q       P      (Q−q̄)    (P−p̄)   (Q−q̄)(P−p̄)   (P−p̄)²

 1       3.00    18     −8.467    6.933     −58.70      48.067
 2       3.00    16     −8.467    4.933     −41.76      24.334
 3       7.00    17     −4.467    5.933     −26.502     35.20
 4       6       12     −5.467    0.933      −5.10       0.87
 5      10       15     −1.467    3.933      −5.769     15.468
 6      15       15      3.533    3.933      13.89      15.468
 7      16        4      4.533   −7.067     −32.03      49.94
 8      13       13      1.533    1.933       2.963      3.736
 9       9       11     −2.467   −0.067       0.26       0.00448
10      15        6      3.533   −5.067     −17.90      25.674
11       9        8     −2.467   −3.067       7.566      9.40
12      15       10      3.533   −1.067      −3.76       1.138
13      12        7      0.533   −4.067      −2.16      16.54
14      18        7      6.533   −4.067     −26.56      16.54
15      21        7      9.533   −4.067     −38.77      16.54

q̄ = 11.467    p̄ = 11.067    ∑(Q−q̄)(P−p̄) = −233.662    ∑(P−p̄)² = 278.93

β̂ = ∑(Q − q̄)(P − p̄) / ∑(P − p̄)² = −233.662 / 278.93 = −0.841

α̂ = q̄ − β̂p̄ = 11.467 − (−0.841)(11.067) = 20.769

The expected value of the profit maximizing output (the estimated equilibrium q) is

q_eqm = α̂/2 + 5β̂ = 20.769/2 − 5(0.841) = 6.182

Now ε̂ = Q − α̂ − β̂P and ∑ε̂² = 204.6142, so

σ̂² = ∑ε̂² / (N − 2) = 204.6142 / (15 − 2) = 15.74

var(α̂) = σ̂² [1/N + p̄² / ∑(P − p̄)²] = 15.74 × [1/15 + (11.067)² / 278.93] = 7.96

var(β̂) = σ̂² / ∑(P − p̄)² = 15.74 / 278.93 = 0.056436

cov(α̂, β̂) = −σ̂² p̄ / ∑(P − p̄)² = −15.74 × 11.067 / 278.93 = −0.6246

The estimated covariance matrix of the coefficients is therefore

[  7.96      −0.6246  ]
[ −0.6246     0.056436 ]
The estimate of the variance of q̂_eqm is (1/4)(7.96) + 25(0.056436) + 5(−0.6246) = 0.278415, so the estimated standard error is 0.5276.

The degrees of freedom for these data are N − 2 = 15 − 2 = 13, so the 95% critical value for a t distribution with 13 degrees of freedom is 2.161.

Therefore the 95% confidence interval for the profit maximizing output is

q_eqm ± t(α/2, n−2) SE(q_eqm) = (6.1816 − 2.161×0.5276, 6.1816 + 2.161×0.5276) = (5.041, 7.322).
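The computation above can be reproduced numerically; the sketch below (an added illustration using numpy and scipy) re-derives the point estimate and the 95% interval from the tabulated Q and P data:

# OLS on the demand data, the profit-maximizing output alpha/2 + 5*beta,
# and its 95% confidence interval via the variance of the linear combination.
import numpy as np
from scipy import stats

Q = np.array([3, 3, 7, 6, 10, 15, 16, 13, 9, 15, 9, 15, 12, 18, 21], dtype=float)
P = np.array([18, 16, 17, 12, 15, 15, 4, 13, 11, 6, 8, 10, 7, 7, 7], dtype=float)
n = len(Q)

beta = np.sum((Q - Q.mean()) * (P - P.mean())) / np.sum((P - P.mean()) ** 2)
alpha = Q.mean() - beta * P.mean()
e = Q - alpha - beta * P
s2 = (e @ e) / (n - 2)

var_alpha = s2 * (1 / n + P.mean() ** 2 / np.sum((P - P.mean()) ** 2))
var_beta = s2 / np.sum((P - P.mean()) ** 2)
cov_ab = -s2 * P.mean() / np.sum((P - P.mean()) ** 2)

q_star = alpha / 2 + 5 * beta                               # profit-maximizing output
se_q = np.sqrt(0.25 * var_alpha + 25 * var_beta + 5 * cov_ab)
t_crit = stats.t.ppf(0.975, df=n - 2)
print(q_star - t_crit * se_q, q_star + t_crit * se_q)       # approximately (5.0, 7.3)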

5. For the classical normal regression model y = Xβ + ε with no constant term and K regressors, assuming that the true value of β is zero, what is the exact expected value of

F[K, N − K] = (R²/K) / [(1 − R²)/(N − K)] ?

Solution: The F ratio is computed as

F = [b′X′Xb / K] / [e′e / (n − K)]

First consider the denominator. With M = I − X(X′X)⁻¹X′,

e = Y − Xb = Y − X(X′X)⁻¹X′Y = [I − X(X′X)⁻¹X′]Y = MY = M(Xβ + ε) = MXβ + Mε = Mε,

since MX = 0, so

e′e = ε′M′Mε = ε′Mε.

Next substitute for b in the numerator:

b = (X′X)⁻¹X′Y = (X′X)⁻¹X′(Xβ + ε) = β + (X′X)⁻¹X′ε = (X′X)⁻¹X′ε,

because the true β is zero, so

b′X′Xb = ε′X(X′X)⁻¹X′X(X′X)⁻¹X′ε = ε′X(X′X)⁻¹X′ε.

Then

F = [ε′X(X′X)⁻¹X′ε / K] / [ε′Mε / (n − K)].

Under normality, the numerator and denominator quadratic forms are independent chi-squared variables divided by their degrees of freedom, so F has an exact F(K, n − K) distribution, and its exact expected value is E[F] = (n − K)/(n − K − 2), provided n − K > 2.
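A small Monte Carlo sketch (an added illustration with an assumed simulation setup) is consistent with this answer: with the true β equal to zero, the sample mean of the F statistic is close to (n − K)/(n − K − 2).

# Simulate the F statistic under beta = 0 and compare its mean with (n-K)/(n-K-2).
import numpy as np

rng = np.random.default_rng(6)
n, K, reps = 40, 3, 20000
F_vals = np.empty(reps)

for r in range(reps):
    X = rng.normal(size=(n, K))            # no constant term, K regressors
    y = rng.normal(size=n)                 # true beta is zero, so y = epsilon
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    F_vals[r] = (b @ X.T @ X @ b / K) / (e @ e / (n - K))

print(F_vals.mean(), (n - K) / (n - K - 2))   # both close to 1.06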

7. Consider the simultaneous equations model:

y1t = β12 y2t + γ11 x1t + ε1t

y2t = β21 y1t + γ22 x2t + γ23 x3t + ε2t

where y1t and y2t are endogenous, x1t, x2t and x3t are exogenous variables, and ε1t and ε2t are NID(0, Σ) random disturbances.
a. Discuss the identifiability of each equation of the system in terms of the order and rank conditions for identification.
b. Explain why the ordinary least squares estimator of (β12, γ11) is inconsistent.
c. What are the two-stage least squares estimators of the coefficients in the two equations? Describe the procedure step by step.
Solution: (a) Order condition for the identification of equation j

In a model of M simultaneous equations, for equation j to be identified the number of exogenous variables excluded from equation j must be at least as large as the number of endogenous variables included as regressors in equation j, that is

K*j ≥ Mj

If K*j = Mj the equation is exactly identified.

If K*j > Mj the equation is overidentified.

If K*j < Mj the equation is unidentified.

Here K*j is the number of exogenous variables excluded from equation j and Mj is the number of endogenous variables included as explanatory variables in equation j.

 From our two simultaneous equations, the first equation is overidentified: there are two excluded exogenous variables (K*1 = 2, namely x2t and x3t) and one included endogenous regressor (M1 = 1, namely y2t), so K*1 > M1, 2 > 1. Therefore the equation is identified; we call it overidentified.
 For the second equation, K*2 = M2 = 1 (x1t excluded, y1t included), so the equation is exactly identified.

In short: in this model y1t and y2t are endogenous and x1t, x2t and x3t are exogenous (predetermined). The first equation excludes two exogenous variables, x2t and x3t, and hence is overidentified by the order condition. The second equation excludes exactly one exogenous variable, x1t, and hence by the order condition it is exactly identified. Therefore the model as a whole is identified. The order condition is only a necessary, not a sufficient, condition for identification.

RANK CONDITION OF IDENTIFICATION:

In a model containing M equations in M endogenous variables, an equation is identified if and only if at least one nonzero determinant of order (M − 1) × (M − 1) can be constructed from the coefficients of the variables (both endogenous and predetermined) excluded from that particular equation but included in the other equations of the model. Stated another way, it requires

rank[Π*j] = M − 1
Consider the system of simultaneous equations written in structural form:

y1t − β12 y2t − γ11 x1t = ε1t .......................... (1)

y2t − β21 y1t − γ22 x2t − γ23 x3t = ε2t ............... (2)

By the order condition both equations are identified. Let us check the rank condition. Consider the first equation, which excludes the variables x2t and x3t (this is represented by the zeros in the first row of the coefficient table below). For this equation to be identified, we must be able to obtain at least one nonzero determinant of order 1 × 1 from the coefficients of the variables excluded from this equation but included in the other equation. To obtain the determinant we first form the relevant matrix of coefficients of the variables x2t and x3t included in the second equation.

Coefficients of the variables

Equation      y1t       y2t       x1t       x2t       x3t
   1           1       −β12      −γ11        0         0
   2         −β21        1         0       −γ22      −γ23

The coefficients of x2t and x3t in the second equation give the matrix [−γ22  −γ23], which supplies a nonzero 1 × 1 determinant provided γ22 or γ23 is nonzero, so the first equation also satisfies the rank condition. Similarly, the coefficient of the excluded variable x1t in the first equation is −γ11 ≠ 0, so the second equation satisfies the rank condition as well.
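As an added, hedged illustration of the two-stage least squares procedure asked about in part (c), the sketch below simulates data from the model with hypothetical coefficient values and then applies the two stages to the first equation; all numbers here are assumptions, not part of the original answer:

# Two-stage least squares for y1t = beta12*y2t + gamma11*x1t + e1t.
import numpy as np

rng = np.random.default_rng(7)
n = 500
x1, x2, x3 = rng.normal(size=(3, n))
e1, e2 = rng.normal(size=(2, n))

# Hypothetical structural coefficients used only to generate data.
beta12, gamma11 = 0.5, 1.0
beta21, gamma22, gamma23 = 0.4, 0.8, -0.6
# Solve the two structural equations for the reduced form of (y1, y2).
det = 1 - beta12 * beta21
y1 = (gamma11 * x1 + beta12 * (gamma22 * x2 + gamma23 * x3) + e1 + beta12 * e2) / det
y2 = (beta21 * gamma11 * x1 + gamma22 * x2 + gamma23 * x3 + e2 + beta21 * e1) / det

# Stage 1: regress the endogenous regressor y2 on all exogenous variables.
Z = np.column_stack([x1, x2, x3])
y2_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ y2)

# Stage 2: regress y1 on the fitted y2_hat and the included exogenous x1.
W = np.column_stack([y2_hat, x1])
b_2sls = np.linalg.solve(W.T @ W, W.T @ y1)
print(b_2sls)   # should be close to (beta12, gamma11) = (0.5, 1.0)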


8. Discuss the differences and common properties of the OLS, GMM, MLE, SUR, FIML, 2SLS and 3SLS estimators of the regression model.

Full Information Maximum Likelihood (FIML) & 3SLS

 FIML has the same asymptotic variance-covariance matrix as 3SLS.

 The 2SLS and 3SLS estimators do not require distributional assumptions, while FIML requires a distributional assumption for the disturbances.
