Q1. Comment briefly on the meaning of each of the following.
A. Estimated coefficient: (Multivariate models should be tested for multicollinearity; in severe cases, multicollinearity results in inflated standard errors for the coefficients, which renders the predicted values of the model unreliable.)
The estimated coefficient is the slope, or the coefficient of the explanatory variable(s), in the estimated (predicted) regression model. It shows by how much (in units or in percent) the dependent variable changes as the explanatory variable(s) change by one unit or by some percent.
The relationship between the dependent variable (Y) and the independent variable (X) is summarized by the estimated coefficient. Example: Yi = α + βXi + Ui holds for the population of values of X and Y. Since these population values are unknown, we do not know the exact numerical values of α and β.
To obtain numerical values for α and β, we take sample observations on Y and X. Substituting these values into the population regression, we obtain the sample regression, which gives estimates of α and β, denoted α̂ and β̂ respectively. The sample regression line is then given by Ŷi = α̂ + β̂Xi.
The true relationship between the variables (describing the population) is given by Yi = α + βXi + Ui.
B. Standard error: The standard error is the square root of the variance; it is needed to apply tests of significance, to measure the size of the error committed, and to determine the degree of confidence with which the estimates lie in a given interval.
In multiple regression analysis, the standard error of the regression is the estimate of the standard deviation of the population error, obtained as the square root of the sum of squared residuals divided by the degrees of freedom. More generally, a standard error is an estimate of the standard deviation of an estimator.
E.g., the standard error of β̂j is an estimate of the standard deviation of its sampling distribution.
C. T-statistic: The statistic used to test a single hypothesis about the parameters in an econometric model. The t-test is applicable if the sample size is less than 30 and the population distribution is normal; it is also important for testing the significance of the parameters.
In statistics, the t-statistic is the ratio of the departure of the estimated value of a parameter from its hypothesized value to its standard error. It is used in hypothesis testing via Student's t-test. The term "t-statistic" is abbreviated from "hypothesis test statistic", while "Student" was the pen name of William Sealy Gosset, who introduced the t-statistic and t-test in 1908 while working for the Guinness brewery in Dublin, Ireland. By default, statistical packages report the t-statistic with β0 = 0 (these t-statistics are used to test the significance of the corresponding regressor). However, when the t-statistic is needed to test a hypothesis of the form H0: β = β0, a non-zero β0 may be used.
If β̂ is the ordinary least squares estimator in the classical linear regression model (that is, with normally distributed and homoscedastic error terms), and if the true value of the parameter β equals β0, then the sampling distribution of the t-statistic is the Student's t-distribution with (n − k) degrees of freedom, where n is the number of observations and k is the number of regressors (including the intercept).
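The definition above can be sketched in a few lines; the numbers here are made up purely for illustration:

```python
def t_statistic(beta_hat, beta0, se):
    """Departure of the estimate from its hypothesized value, per standard error."""
    return (beta_hat - beta0) / se

# Default package report: test of significance, H0: beta = 0
t_default = t_statistic(2.5, 0.0, 1.2)   # about 2.083

# Testing a non-zero hypothesized value, H0: beta = 1
t_custom = t_statistic(2.5, 1.0, 1.2)    # 1.25
```

Comparing each ratio with the critical value from the t table with (n − k) df completes the test.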
D. R² (coefficient of determination): R² = ESS/TSS = Σŷi²/Σyi² (in deviation form), the proportion of the sample variation in the dependent variable explained by the regression. If the regression line passed through every point on the scatter plot, it would be able to explain all of the variation. The further the line is away from the points, the less it is able to explain.
E. Sum of squared residuals (SSR): In multiple regression analysis, the sum of the squared OLS residuals across all observations. SSR measures the sample variation in the ûi.
To make matters worse, the residual sum of squares is often called the "error sum of squares." This is especially unfortunate because the errors and the residuals are different quantities. Thus, we will always say the residual sum of squares or the sum of squared residuals, and we prefer the abbreviation SSR because it is more common in econometric packages. Mathematically, we can write:
SSR = TSS − ESS
Σûi² = Σyi² − β̂Σxiyi (simple regression model, in deviation form; an analogous expression holds in multiple regression).
F. Standard error of the regression (SER): In multiple regression analysis, the estimate of the standard deviation of the population error, obtained as the square root of the sum of squared residuals divided by the degrees of freedom. The standard error of the regression, or standard error of the estimate, is the positive square root of the estimated variance σ̂². If σ̂² is plugged into the variance formulas, we have unbiased estimators of Var(β̂1) and Var(β̂0). Later on, we will need estimators of the standard deviations of β̂1 and β̂0, and this requires estimating σ. The natural estimator of σ is σ̂, called the standard error of the regression (SER). (Other names for σ̂ are the standard error of the estimate and the root mean squared error, but we will not use these.) Although σ̂ is not an unbiased estimator of σ, it is a consistent estimator of σ, and it serves our purposes well.
The estimate σ̂ is interesting because it is an estimate of the standard deviation of the unobservables affecting y; equivalently, it estimates the standard deviation in y after the effect of x has been taken out. Most regression packages report the value of σ̂ along with the R-squared, intercept, slope, and other OLS statistics (under one of the several names listed above). For now, our primary interest is in using σ̂ to estimate the standard deviations of β̂0 and β̂1; for the slope, the natural estimator is se(β̂1) = σ̂/√SSTx, called the standard error of β̂1.
Note that se(β̂1) is viewed as a random variable when we think of running OLS over different samples of y; this is true because σ̂ varies with different samples. For a given sample, se(β̂1) is a number, just as β̂1 is simply a number when we compute it from the given data. Similarly, se(β̂0) is obtained from sd(β̂0) by replacing σ with σ̂. The standard error of any estimate gives us an idea of how precise the estimator is. Standard errors play a central role throughout; we will use them to construct test statistics and confidence intervals for every econometric procedure we cover.
E. Best linear unbiased estimator (BLUE): Among all linear unbiased estimators, the one with the smallest variance. OLS is BLUE, conditional on the sample values of the explanatory variables, under the Gauss-Markov assumptions. For our purposes, the term "linear" in the linear regression model refers to linearity in the regression coefficients, the β's, and not linearity in the Y and X variables.
Best / least variance: An estimator is best when it has the smallest variance compared with any other estimator obtained by other econometric methods.
Unbiasedness: An estimator is said to be unbiased if its expected value equals the true population parameter. One of the powerful findings about the sample mean (which is also the least squares estimator) is that it is the best of all possible estimators that are both linear and unbiased. The fact that Ȳ is the best linear unbiased estimator (BLUE) accounts for its wide use. In this context, "best" means the estimator with the smallest variance of all linear and unbiased estimators. It is better to have an estimator with a smaller variance than one with a larger variance; it increases the chances of getting an estimate close to the true population mean.
R² = (TSS − RSS)/TSS = 1 − RSS/TSS = 1 − Σei²/Σyi²
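The identity above can be checked numerically; this minimal sketch uses a small made-up sample of observed and fitted values:

```python
# Made-up observed values and fitted values from some regression.
y = [2.0, 3.0, 5.0, 4.0, 6.0]
y_hat = [2.5, 3.0, 4.5, 4.5, 5.5]

y_bar = sum(y) / len(y)
tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual sum of squares
r_squared = 1 - rss / tss                              # 1 - RSS/TSS
```

Here TSS = 10, RSS = 1, so R² = 0.9: the fitted line leaves only 10% of the sample variation unexplained.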
The value of r is such that -1 < r < +1. The + and – signs are used for
positive linear correlations and negative linear correlations,
respectively. Positive correlation: If x and y have a strong positive
linear correlation, r is close to +1. An r value of exactly +1 indicates a
perfect positive fit. Positive values indicate a relationship
between x and y variables such that as values for x increase, values
for y also increase. Negative correlation: If x and y have a strong
negative linear correlation, r is close to -1. An r value of exactly -1
indicates a perfect negative fit. Negative values indicate a relationship
between x and y such that as values for x increase,
values for y decrease. No correlation: If there is no linear correlation
or a weak linear correlation, r is close to 0. A value near zero means
that there is a random, nonlinear relationship between the two variables
Note that r is a dimensionless quantity; that is, it does not depend on
the units employed. A perfect correlation of ± 1 occurs only when the
data points all lie exactly on a straight line. If r = +1, the slope of this
line is positive. If r = -1, the slope of this line is negative. A
correlation greater than 0.8 is generally described as strong, whereas a
correlation less than 0.5 is generally described as weak. These values
can vary based upon the "type" of data being examined. A study
utilizing scientific data may require a stronger correlation than a study
using social science data.
C. Test for stability: A test of the hypothesis that the regression coefficients remain constant over the sample; econometrics packages make available several standard diagnostics for it. Stability tests are needed for testing, amongst other things, the validity of purchasing power parity and the consistency of the wage distribution over time. It is argued that stationarity tests are more appropriate than unit root tests in this situation, since the null hypothesis is usually that the series are stable. An implicit assumption in all regression models is that their coefficients remain constant across all observations. When they do not, and this occurs regularly with time series data in particular, the problem of structural change is encountered. After presenting a simulation example of a typical structural break in a regression, methods are introduced to test for such breaks, whether at a known point in time or when the breakpoint is unknown. An approach to modelling changing parameters using dummy variables is introduced, and a detailed example of a shifting regression relationship between inflation and interest rates, brought about by policy regime changes, is presented.
Example:
1. Y = α + βx + u is linear in both the parameters and the variables, so it satisfies the assumption.
2. lnY = α + β ln x + u is linear only in the parameters. Since the classical assumptions concern linearity in the parameters, this model also satisfies the assumption.
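Because the second model is linear in the parameters, it can be fitted by ordinary least squares after taking logs. A minimal sketch with made-up data generated from Y = 2·x^0.5, so that OLS on the logs should recover α = ln 2 and β = 0.5:

```python
import math

# Made-up data: Y = 2 * x^0.5 exactly.
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [2.0 * x ** 0.5 for x in xs]

# Transform to logs, then apply the usual OLS slope/intercept formulas.
lx = [math.log(x) for x in xs]
ly = [math.log(y) for y in ys]
n = len(xs)
mx, my = sum(lx) / n, sum(ly) / n
beta = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)
alpha = my - beta * mx
```

The estimated β is the elasticity of Y with respect to x, which is what makes the double-log form popular in demand studies.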
Q3. Explain the following tests for homoscedasticity
[illustrate each of these tests with data]
If the calculated value is greater than the tabulated (critical) value, we reject the null hypothesis of homoscedasticity.
Q5. State with reasons whether the following statements are TRUE,
FALSE or UNCERTAIN. Be precise
A. The t-test discussed in simple linear regression requires that the sampling distributions of the estimators β̂1 and β̂2 follow the normal distribution. TRUE. The t-test is based on variables with a normal distribution. Since the estimators β̂1 and β̂2 are linear combinations of the error term ui, which is assumed to be normally distributed under the CLRM, these estimators are also normally distributed.
B. Even though the disturbance term in the CLRM is not normally distributed, the OLS
estimators are still unbiased.
TRUE. So long as E(ui) = 0, the OLS estimators are unbiased. No probabilistic assumptions are required to establish unbiasedness.
C. If there is no intercept in the regression model, the estimated ui (= ûi) will not sum to zero. TRUE. The differences between the two sets of formulas should be obvious: in the model with the intercept term absent, we use raw sums of squares and cross products, but in the intercept-present model, we use mean-adjusted sums of squares and cross products. Second, the df for computing σ̂² is (n − 1) in the first case and (n − 2) in the second case. (Why?) Although the interceptless, or zero-intercept, model may be appropriate on occasion, there are some features of this model that need to be noted. First, Σûi, which is always zero for the model with the intercept term (the conventional model), need not be zero when that term is absent. In short, Σûi need not be zero for the regression through the origin.
D. The p-value and the size of test statistic mean the same thing
TRUE. The p value is the smallest level of significance at which the null hypothesis can be
rejected. The terms level of significance and size of the test are synonymous
E. In a regression model that contains the intercept, the sum of the residuals is always zero.
TRUE. When the intercept term is included, the first OLS normal equation implies that Σûi = 0; this property does not hold for the regression through the origin.
F. If a null hypothesis is not rejected, it is true. FALSE. All we can say is that the data at
hand does not permit us to reject the null hypothesis.
There is no general test of heteroscedasticity that is free of any assumption about which variable the error term is correlated with. TRUE. Since the ui are not directly observable, some assumption about the nature of heteroscedasticity is inevitable.
E. If the regression model is mis-specified (e.g., an important variable is omitted), the OLS residuals will show a distinct pattern. TRUE. Besides heteroscedasticity, such a pattern may result from autocorrelation or model specification errors.
Q7. Explain the meaning of each of the following
A. Seasonal dummy variables: Sometimes we work with seasonally unadjusted data, and it is useful to know that simple methods are available for dealing with seasonality in regression models: we can include a set of seasonal dummy variables to account for seasonality in the dependent variable, the independent variables, or both.
B. Dependent dummy variables: A common specification in applied work has the dependent variable appearing in logarithmic form, with one or more dummy variables appearing as independent variables.
E.g.: log(price)^ = −1.35 + 0.168 log(lotsize) + 0.707 log(sqrft) + 0.027 bdrms + 0.054 colonial
se = (0.65) (0.038) (0.093) (0.029) (0.045); n = 88, R² = 0.649
The example shows that when log(Y) is the dependent variable in a model, the coefficient on a dummy variable, multiplied by 100, is interpreted as the percentage difference in Y, holding all other factors constant.
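The "multiply by 100" reading is an approximation; the exact percentage effect of a dummy in a log(Y) model is 100·(exp(b) − 1). A quick sketch, using an illustrative dummy coefficient b = 0.054:

```python
import math

# Illustrative dummy-variable coefficient in a log(Y) regression.
b = 0.054
approx_pct = 100 * b                  # approximate effect: 5.4%
exact_pct = 100 * (math.exp(b) - 1)   # exact effect: about 5.55%
```

For small coefficients the two agree closely; the gap widens as |b| grows, which is why the exact formula is preferred for large dummy coefficients.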
C. Linear probability model (LPM): The multiple linear regression model with a binary dependent variable, so called because the response probability is linear in the parameters βj. In the LPM, βj measures the change in the probability of success when Xj changes, holding other factors fixed:
ΔP(Y = 1 | X) = βj ΔXj
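A minimal LPM sketch, with made-up data: ordinary least squares is run on a binary outcome, and the slope is read as the change in P(Y = 1) per unit change in x.

```python
# Made-up sample: binary outcome ys observed at regressor values xs.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

# Simple OLS slope/intercept formulas.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b0 = my - b1 * mx

p_at_4 = b0 + b1 * 4    # fitted P(Y = 1 | x = 4)
```

Note that here b0 is negative, illustrating a known weakness of the LPM: fitted probabilities are not constrained to lie in [0, 1], which motivates the logit and probit models below.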
D. Linear discriminant functions: The problem of finding a linear discriminant function will be formulated as a problem of minimizing a criterion function. The obvious criterion function for classification purposes is the sample risk, or training error: the average loss incurred in classifying the set of training samples. It is difficult to derive the minimum-risk linear discriminant, and for that reason it is useful to investigate several related criterion functions that are analytically more tractable. Much of our attention will be devoted to studying the convergence properties and computational complexities of various gradient descent procedures for minimizing criterion functions. The similarities between many of the procedures sometimes make it difficult to keep the differences between them clear.
A discriminant function that is a linear combination of the components of x can be written as
g(x) = wᵀx + w0,
where w is the weight vector and w0 the bias or threshold weight. Linear discriminant functions are going to be studied for the two-category case, the multi-category case, and the general case (Figure 9.1). For the general case there will be c such discriminant functions, one for each of c categories.
E. The logit model: G is the cumulative distribution function of a standard logistic random variable. In the logit model, G is the logistic function: G(z) = exp(z)/[1 + exp(z)] = Λ(z), which is between 0 and 1 for all real numbers z.
F. Probit model: In the probit model, G is the standard normal cumulative distribution function (CDF), which is expressed as an integral: G(z) = ∫₋∞ᶻ φ(v) dv, where φ(v) is the standard normal density.
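Both link functions can be sketched with the standard library alone (the normal CDF follows from math.erf); this is just a comparison of the two G functions, not a full estimation routine:

```python
import math

def logit_G(z):
    """Logistic CDF: G(z) = exp(z) / (1 + exp(z)), strictly between 0 and 1."""
    return math.exp(z) / (1.0 + math.exp(z))

def probit_G(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Both functions equal 0.5 at z = 0 and are increasing; the logistic CDF has slightly fatter tails, which is the main practical difference between logit and probit fits.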
PRACTICAL ILLUSTRATION:
1. In the following table you are given the ranks of 10 students in midterm and final
examination in econometrics. Compute spearman’s coefficient of rank correlation and
interpret it
STUDENT
Rank A B C D E F G H I J
Midterm 1 3 7 10 9 5 4 8 2 6
Final 3 2 8 7 9 6 5 10 1 4
Solution
rs = 1 − 6ΣDi² / [n(n² − 1)]
where rs = coefficient of rank correlation, D = the difference between paired ranks, and n = the number of pairs.
Midterm  Final    D    D²
   1       3     −2     4
   3       2      1     1
   7       8     −1     1
  10       7      3     9
   9       9      0     0
   5       6     −1     1
   4       5     −1     1
   8      10     −2     4
   2       1      1     1
   6       4      2     4
                Total  26

⇒ rs = 1 − 6ΣDi²/[n(n² − 1)] = 1 − 6(26)/[10(99)] = 0.842
A rank correlation of 0.842 indicates a strong positive association between the midterm and final examination ranks.
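The hand calculation above can be verified directly from the two rank vectors:

```python
# Ranks of the 10 students from the table above.
midterm = [1, 3, 7, 10, 9, 5, 4, 8, 2, 6]
final   = [3, 2, 8, 7, 9, 6, 5, 10, 1, 4]

n = len(midterm)
d_squared = sum((m - f) ** 2 for m, f in zip(midterm, final))  # sum of D_i^2
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))                   # Spearman's formula
```

This reproduces ΣDi² = 26 and rs ≈ 0.842.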
Ŷi = 0.2033 + 0.6560 Xi
se = (0.0976) (0.1961)
r² = 0.397, RSS = 0.0544, ESS = 0.0358
Where Y= labor force participation rate (LFPR) of women in 1972 and X=LFPR of women in
1968. The regression results were obtained from a sample of 19 cities of a certain country
B. Test the hypothesis H0: β2 ≥ 1. Which test do you use, and why? What are the underlying assumptions of the test you use?
Use the one-tail t test:
t = (0.6560 − 1)/0.1961 = −1.754
For 17 df, the one-tailed critical t value at α = 5% is 1.740. Since the estimated t value is significant at this level (|−1.754| > 1.740), we reject the hypothesis that the true slope coefficient is 1 or greater.
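The arithmetic of this one-tail test can be laid out step by step using the reported estimates:

```python
# Reported slope estimate and standard error; hypothesized value under H0.
beta_hat, beta0, se = 0.6560, 1.0, 0.1961

t = (beta_hat - beta0) / se     # about -1.754
t_crit = 1.740                  # one-tailed 5% critical value, 17 df (from the t table)

reject = t < -t_crit            # True: reject H0 at the 5% level
```

Because the alternative is one-sided (slope below 1), the decision compares t with −1.740 rather than with a two-tailed cutoff.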
C. Suppose that the LFPR in 1968 was 0.58. On the basis of the regression results given above, what is the mean LFPR in 1972? Establish a 95% confidence interval for the mean prediction.
The mean LFPR is 0.2033 + 0.6560(0.58) = 0.5838. To establish a 95% confidence interval for this forecast value, use the formula 0.5838 ± 2.11 (se of the mean forecast value), where 2.11 is the 5% critical t value for 17 df. To get the standard error of the forecast value, use Eq. (5.10.2). But note that since the authors do not give the mean value of the LFPR of women in 1968, we cannot compute this standard error.
D. How would you test the hypothesis that the error term in the population regression is
normally distributed? Show the necessary calculation.
Without the actual data, we will not be able to answer this question because we need the values
of the residuals to plot them and obtain the Normal Probability Plot or to compute the value of
the Jarque-Bera test.
3. From the data for 46 states in the US for 1992, Baltagi obtained the following regression
results.
log Ĉ = 4.30 − 1.34 log P + 0.17 log Y
se = (0.91) (0.32) (0.20); R̄² = 0.27
Where C= cigarette consumption, packs per year
P= real price per pack
Y= real disposable income per capita
ii. If not, what might be the reasons for it? The income elasticity, although positive, is not statistically different from zero, as the t value under the zero null hypothesis is less than 1.
Since in this example R̄² = 0.27, n = 46, and k = 3, by substitution into R² = 1 − (1 − R̄²)(n − k)/(n − 1), the reader can verify that R² = 0.3026, approximately.
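The substitution is quick to check; this sketch assumes, as the stated answer implies, that the reported 0.27 is the adjusted value:

```python
# Recovering R^2 from adjusted R^2 (assumption: the reported 0.27 is adjusted).
r2_bar, n, k = 0.27, 46, 3
r2 = 1 - (1 - r2_bar) * (n - k) / (n - 1)   # about 0.3024
```

The same identity rearranged, R̄² = 1 − (1 − R²)(n − 1)/(n − k), converts in the other direction.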
Q4. From a sample of 209 items, Wooldridge obtained the following regression results.
log(salary)^ = 4.32 + 0.280 log(sales) + 0.0174 roe + 0.00024 ros
se = (0.32) (0.035) (0.0041) (0.00054); R² = 0.283
Where: salary=salary of CEO
Sales=annual firm sales
Roe=return on equity in percent
Ros=return on firm’s stock
A. Interpret the preceding regression, taking into account any prior expectations that you may have about the signs of the various coefficients.
A priori, salary and each of the explanatory variables are expected to be positively related, which they are. The partial coefficient of 0.280 means that, ceteris paribus, a 1 percent increase in sales raises CEO salary by about 0.28 percent. The coefficient 0.0174 means that, ceteris paribus, if the rate of return on equity goes up by 1 percentage point (note: not by 1 percent), then CEO salary goes up by about 1.74%. Similarly, ceteris paribus, if the return on the firm's stock goes up by 1 percentage point, CEO salary goes up by about 0.024%.
B. Which of the coefficients are individually statistically significant at the 5 percent level?
Under the individual, or separate, null hypothesis that each true population coefficient is zero,
you can obtain the t values by simply dividing each estimated coefficient by its standard error.
These t values for the four coefficients shown in the model are, respectively, 13.5, 8, 4.25, and
0.44. Since the sample is large enough, using the two-t rule of thumb, you can see that the first
three coefficients are individually highly statistically significant, whereas the last one is
insignificant.
i. F = [R²/(k − 1)] / [(1 − R²)/(n − k)] = (0.283/3)/(0.717/205) ≈ 27.02
ii. And why? Because under the null hypothesis, this F has the F distribution with 3 and 205 df in the numerator and denominator, respectively. The p value of obtaining such an F value is extremely small, leading to rejection of the null hypothesis.
Q5. In studying the demand for farm tractors in the US for the periods 1921-
1941 & 1948-1957, Griliches obtained the following results.
log Ŷi = constant − 0.519 log X2i − 4.933 log X3i; R² = 0.793
se = (0.231) (0.477)
Where: Y = value of stock of tractors on farms as of January 1, in 1935-1938,
X2 = index of prices paid for tractors divided by an index of prices received for all crops at time t−1, and X3 = interest rate prevailing in year t−1. The estimated standard errors are given in parentheses.
A. Interpret the preceding regression.
The logs of the real price index and the interest rate in the previous year explain about 79 percent of the variation in the log of the stock of tractors, a form of capital. Since this is a double-log model, the slope coefficients are (partial) elasticities. Both elasticities have the a priori expected signs.
B. Are the estimated slope coefficients
I. Individually statistically significant?
Each partial slope coefficient is individually significant at the 5% level
C. Use the analysis of variance technique to test the significance of the overall regression.
[Hint: use the R² variant of the ANOVA technique]
F = [R²/(k − 1)] / [(1 − R²)/(n − k)] = (0.793/2)/(0.207/28) = 53.63
With n = 31 and k = 3, the reader can verify that this F value is highly significant.
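The R² variant of the ANOVA F test reduces to one line of arithmetic:

```python
# F test of overall significance from R^2 (Griliches tractor regression figures).
r2, n, k = 0.793, 31, 3
F = (r2 / (k - 1)) / ((1 - r2) / (n - k))   # = (0.793/2) / (0.207/28), about 53.63
```

With 2 and 28 df, an F this large lies far beyond any conventional critical value, confirming the overall significance of the regression.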
D. How would you compute the interest-rate elasticity of demand for farm tractors?
Same answer as in A: since this is a double-log model, the coefficient of log X3i, −4.933, is itself the interest-rate elasticity.
E. How would you test the significance of the estimated R²?
Testing the significance of R² is equivalent to testing the overall significance of the regression: with R² = 0.793, n = 31, and k = 3, the F value computed in C is highly significant.
Q6. A researcher regressed child mortality (CM) on per capita GNP (PGNP) and female literacy rate (FLR). The same researcher later extended this model by including the total fertility rate (TFR). The regression results for both models are reproduced below.
(1) ĈMi = 263.6416 − 0.0056 PGNPi − 2.2316 FLRi
se = (11.5932) (0.0019) (0.2099); R² = 0.7077
(2) ĈMi = 168.3067 − 0.0055 PGNPi − 1.7680 FLRi + 12.8686 TFRi
se = (32.8916) (0.0018) (0.2480) (?)
A. How would you interpret the coefficient of TFR?
i. A priori, would you expect a positive or a negative relationship between CM and TFR?
A priori, one would expect a positive relationship between CM and TFR.
ii. Justify your answer. The larger the number of children born to a woman, the greater is the likelihood of increased mortality due to health and other reasons.
B. Have the coefficient values of PGNP and FLR:
i. Changed between the two equations?
The coefficients of PGNP are not very different, but that of FLR looks different. To see whether the difference is real, we can use the t test.
ii. If so, what may be the reason(s) for such a change?
Suppose we use Eq. (1) and hypothesize that the true coefficient of FLR is −1.7680. We can then use the t test as follows:
t = [−2.2316 − (−1.7680)]/0.2099 = −0.4636/0.2099 ≈ −2.2086
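The t ratio here follows the same recipe as any single-coefficient test (the −2.2316 and −1.7680 are the two FLR coefficients reported above, with 0.2099 the Eq. (1) standard error):

```python
# t test of whether the Eq.(1) coefficient equals the Eq.(2) value -1.7680.
b_hat, b0, se = -2.2316, -1.7680, 0.2099
t = (b_hat - b0) / se   # about -2.209
```

At conventional significance levels, |t| ≈ 2.21 suggests the two coefficient values do differ, consistent with an omitted-variable (TFR) explanation for the change.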
Since both the differential intercept and slope coefficients are highly significant, the levels as well as the growth rates of population in the two periods are different.
ii. If they are different, what are the growth rates for 1972-1977 and 1978-1992?
The growth rate for the period before 1978 is 1.5% and that after 1978 it is 2.6% (= 1.5% +
1.1%).
Thank you!