
JIMMA UNIVERSITY

COLLEGE OF BUSINESS AND ECONOMICS


DEPARTMENT OF ECONOMICS
M.SC IN ECONOMICS (INDUSTRIAL ECONOMICS)
REGULAR PROGRAM

INDIVIDUAL ASSIGNMENT: ECONOMETRICS

Submitted to: Dr. Berhanu G.


Name: Misganu Diriba Gelan        ID: RM2968/12-0

Submitted date: Mar. , 2020
Academic year: 2012/2020
Jimma, Ethiopia

Q1. Comment briefly on the meaning of each of the following.
A. Estimated coefficient: the estimated coefficient is the slope (or intercept) of the explanatory variable(s) in the estimated or predicted regression model. It shows by how much the dependent variable changes (in units or in percent, depending on the functional form) as the explanatory variable changes by one unit or by some percent; the relationship between the dependent variable (Y) and the independent variable (X) is thus summarized by the estimated coefficient.
Example: Yi = α + βXi + Ui holds for the population values of X and Y. Since these population values are unknown, we do not know the exact numerical values of α and β. To obtain numerical values for α and β we take sample observations on Y and X. Fitting the regression to these sample values gives the estimates α̂ and β̂, so the sample regression line is Ŷi = α̂ + β̂Xi, while the true relationship between the variables (the one that holds in the population) is Yi = α + βXi + Ui.
(Note: with several explanatory variables the model should also be tested for multicollinearity; in severe cases multicollinearity inflates the standard errors of the coefficients, which renders the predicted values of the model unreliable.)
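A minimal sketch (assuming Python with numpy and statsmodels, on simulated data) of how the estimated coefficients α̂ and β̂ are obtained from a sample:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 100
    x = rng.normal(5, 2, n)                 # explanatory variable
    u = rng.normal(0, 1, n)                 # unobserved disturbance
    y = 2.0 + 0.5 * x + u                   # population model: alpha = 2, beta = 0.5

    X = sm.add_constant(x)                  # adds the intercept column
    res = sm.OLS(y, X).fit()
    print(res.params)                       # [alpha_hat, beta_hat], close to (2, 0.5)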
B. Standard error: the standard error is the square root of the estimated variance of an estimator. It is needed to apply tests of significance, to measure the size of the sampling error committed, and to determine the degree of confidence with which the estimates can be taken to lie near the true values.
In multiple regression analysis, the standard error of the regression is the estimate of the standard deviation of the population error, obtained as the square root of the sum of squared residuals divided by the degrees of freedom; more generally, a standard error is an estimate of the standard deviation of an estimator.
E.g. the standard error of β̂j is an estimate of the standard deviation of β̂j in its sampling distribution.

C. t-statistic: the statistic used to test a single hypothesis about the parameters in an econometric model. The t-test is applicable when the sample size is small (less than about 30) and the population is normally distributed; it is also the standard test of the significance of individual parameters.
In statistics, the t-statistic is the ratio of the departure of the estimated value of a parameter from its hypothesized value to its standard error. It is used in hypothesis testing via Student's t-test. The term "t-statistic" is abbreviated from "hypothesis test statistic", while "Student" was the pen name of William Sealy Gosset, who introduced the t-statistic and t-test in 1908 while working for the Guinness brewery in Dublin, Ireland. By default, statistical packages report the t-statistic with β0 = 0 (these t-statistics are used to test the significance of the corresponding regressor). However, when the t-statistic is needed to test a hypothesis of the form H0: β = β0, a non-zero β0 may be used.
If β̂ is an ordinary least squares estimator in the classical linear regression model (that is, with normally distributed and homoscedastic error terms), and if the true value of the parameter β equals β0, then the sampling distribution of the t-statistic is Student's t-distribution with (n − k) degrees of freedom, where n is the number of observations and k is the number of regressors (including the intercept).
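A short sketch (simulated data; statsmodels assumed) showing that the reported t-statistic is just the coefficient divided by its standard error, and how to test a non-zero null H0: β = β0:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = 1.0 + 0.8 * x + rng.normal(size=200)
    res = sm.OLS(y, sm.add_constant(x)).fit()

    beta_hat, se = res.params[1], res.bse[1]
    print(beta_hat / se, res.tvalues[1])      # identical: default test of H0: beta = 0
    print((beta_hat - 0.8) / se)              # t-statistic for the non-zero null H0: beta = 0.8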

D. R-squared: in a multiple regression model, the proportion of the total sample variation in the dependent variable that is explained by the independent variables. In deviation form, for two regressors:

R² = ESS/TSS = (β̂1 Σyi x1i + β̂2 Σyi x2i) / Σyi²

It is the ratio of the explained variation to the total variation. It satisfies 0 ≤ R² ≤ 1 and denotes the strength of the linear association between x and y.
The coefficient of determination represents the percent of the data that is closest to the line of best fit. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation).
It is a measure of how well the regression line represents the data. If the regression line passed exactly through every point on the scatter plot, it would explain all of the variation. The further the line is from the points, the less it is able to explain.
E. Sum of squared residuals (SSR): in multiple regression analysis, the sum of the squared OLS residuals across all observations. SSR measures the sample variation in the ûi.
To make matters worse, the residual sum of squares is often called the "error sum of squares." This is especially unfortunate because the errors and the residuals are different quantities. We will therefore always say "residual sum of squares" or "sum of squared residuals," and we prefer the abbreviation SSR because it is the more common one in econometric packages. Mathematically:

RSS = TSS − ESS
Σûi² = Σyi² − β̂ Σxi yi                      (simple regression model, deviation form)
Σûi² = Σyi² − β̂1 Σx1i yi − β̂2 Σx2i yi       (multiple regression model, deviation form)
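A small sketch (simulated data; statsmodels assumed) verifying the decomposition TSS = ESS + RSS and the definition R² = ESS/TSS from sections D and E:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    x = rng.normal(size=150)
    y = 3.0 + 1.5 * x + rng.normal(size=150)
    res = sm.OLS(y, sm.add_constant(x)).fit()

    tss = np.sum((y - y.mean()) ** 2)                   # total sum of squares
    ess = np.sum((res.fittedvalues - y.mean()) ** 2)    # explained sum of squares
    rss = np.sum(res.resid ** 2)                        # sum of squared residuals
    print(np.isclose(tss, ess + rss))                   # True: TSS = ESS + RSS
    print(ess / tss, res.rsquared)                      # both equal R-squared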
F. Standard error of the regression (SER): in multiple regression analysis, the estimate of the standard deviation of the population error, obtained as the square root of the sum of squared residuals divided by the degrees of freedom. The standard error of the regression (or standard error of estimate) is the positive square root of the estimated variance σ̂². If σ̂² is plugged into the variance formulas, we obtain unbiased estimators of Var(β̂1) and Var(β̂0). Later on we need estimators of the standard deviations of β̂1 and β̂0, and this requires estimating σ. The natural estimator of σ is σ̂ = √(SSR/(n − 2)) in the simple regression case, and it is called the standard error of the regression (SER). (Other names for σ̂ are the standard error of the estimate and the root mean squared error, but we will not use these.) Although σ̂ is not an unbiased estimator of σ, it is a consistent estimator of σ, and it serves our purposes well.
The estimate σ̂ is interesting because it estimates the standard deviation of the unobservables affecting y; equivalently, it estimates the standard deviation in y after the effect of x has been taken out. Most regression packages report σ̂ along with the R-squared, intercept, slope, and other OLS statistics (under one of the several names listed above). For now, our primary interest is in using σ̂ to estimate the standard deviations of the β̂j. Since sd(β̂1) = σ/√SSTx, where SSTx = Σ(xi − x̄)², the natural estimator obtained by replacing σ with σ̂ is se(β̂1) = σ̂/√SSTx, called the standard error of β̂1.
Note that se(β̂1) is viewed as a random variable when we think of running OLS over different samples of y; this is true because σ̂ varies with different samples. For a given sample, se(β̂1) is a number, just as β̂1 is simply a number once we compute it from the given data. Similarly, se(β̂0) is obtained from sd(β̂0) by replacing σ with σ̂. The standard error of an estimate gives us an idea of how precise the estimator is. Standard errors play a central role throughout econometrics; we use them to construct test statistics and confidence intervals for every econometric procedure we cover.
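A minimal check (simulated data; statsmodels assumed) that the SER and se(β̂1) formulas above match what the package reports:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    x = rng.normal(size=80)
    y = 1.0 + 2.0 * x + rng.normal(size=80)
    res = sm.OLS(y, sm.add_constant(x)).fit()

    ser = np.sqrt(res.ssr / res.df_resid)       # sigma_hat = sqrt(SSR / (n - 2))
    sst_x = np.sum((x - x.mean()) ** 2)
    se_b1 = ser / np.sqrt(sst_x)                # se(beta1_hat) = sigma_hat / sqrt(SST_x)
    print(ser, np.sqrt(res.mse_resid))          # equal
    print(se_b1, res.bse[1])                    # equal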
G. Best linear unbiased estimator (BLUE): among all linear unbiased estimators, the one with the smallest variance. OLS is BLUE, conditional on the sample values of the explanatory variables, under the Gauss-Markov assumptions. For our purposes the term "linear" in the linear regression model refers to linearity in the regression coefficients, the β's, and not linearity in the Y and X variables.
Best / least variance: an estimator is best when it has the smallest variance compared with any other estimator obtained from other econometric methods.
Unbiasedness: an estimator is said to be unbiased if its expected value equals the true population parameter. One of the powerful findings about the sample mean (which is also the least squares estimator) is that it is the best of all possible estimators that are both linear and unbiased. The fact that Ȳ is the best linear unbiased estimator (BLUE) accounts for its wide use. In this context "best" means the estimator with the smallest variance of all linear and unbiased estimators. It is better to have an estimator with a smaller variance than one with a larger variance; it increases the chances of getting an estimate close to the true population mean.
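A quick Monte Carlo sketch (simulated data; numpy and statsmodels assumed) illustrating unbiasedness: across many samples the OLS slope estimates average out to the true β:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    true_beta, estimates = 0.7, []
    for _ in range(2000):                        # 2000 independent samples
        x = rng.normal(size=50)
        y = 1.0 + true_beta * x + rng.normal(size=50)
        estimates.append(sm.OLS(y, sm.add_constant(x)).fit().params[1])

    print(np.mean(estimates))                    # close to 0.7: E(beta_hat) = beta
    print(np.var(estimates))                     # sampling variance of the estimator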

Q2. Define the following terms


A. R²: the amount of variability in Y that is accounted for (explained) by the X variables. If there is a perfect linear relationship between Y and the independent variables, R² will equal 1; if there is no relationship, R² will equal zero.
R² = ESS/TSS, that is, the ratio of the explained variability to the total variability. It can also be recovered from the overall F statistic when F, n and k are known:

R² = (F·k) / ((n − k − 1) + F·k)

R² = (TSS − RSS)/TSS = 1 − RSS/TSS = 1 − Σei²/Σyi²

Unlike regular R², adjusted R² can actually get smaller as additional variables are added to the model. One of the claimed benefits of adjusted R² is that it "punishes" you for including extraneous and irrelevant variables in the model. As n gets bigger, the difference between R² and adjusted R² gets smaller and smaller. Adjusted R² wasn't primarily designed to punish the mindless inclusion of extraneous variables (although it has that effect); it was meant to correct for the inherent upward bias in R²:

Adjusted R² = 1 − (1 − R²)·(n − 1)/(n − k − 1)
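A sketch (simulated data; statsmodels assumed) verifying the adjusted-R² formula and the F-to-R² identity above:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n, k = 120, 3                                  # k slope regressors plus an intercept
    X = rng.normal(size=(n, k))
    y = 1.0 + X @ np.array([0.5, -0.3, 0.0]) + rng.normal(size=n)
    res = sm.OLS(y, sm.add_constant(X)).fit()

    r2, f = res.rsquared, res.fvalue
    print(1 - (1 - r2) * (n - 1) / (n - k - 1), res.rsquared_adj)   # adjusted R2
    print((f * k) / ((n - k - 1) + f * k), r2)                      # recover R2 from F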

B. r: a measure of the degree of association between two variables in the simple regression model. In the simple regression context, r² is a more meaningful measure than r, for the former tells us the proportion of variation in the dependent variable explained by the explanatory variable(s) and therefore provides an overall measure of the extent to which the variation in one variable determines the variation in the other; the latter does not have such value. Moreover, the interpretation of r (= R) in a multiple regression model is of dubious value. In passing, note that the r² defined previously can also be computed as the squared coefficient of correlation between the actual Yi and the estimated Ŷi. It answers the question: how well does your regression equation truly represent your set of data? One way to answer is to examine the correlation coefficient and the coefficient of determination. The quantity r, called the linear correlation coefficient, measures the strength and the direction of a linear relationship between two variables. The linear correlation coefficient is sometimes referred to as the Pearson product-moment correlation coefficient in honor of its developer, Karl Pearson. In deviation form,

r = Σxi yi / √(Σxi² · Σyi²),   so that   r² = Σŷi² / Σyi²

The value of r is such that −1 ≤ r ≤ +1. The + and − signs are used for positive and negative linear correlations, respectively. Positive correlation: if x and y have a strong positive linear correlation, r is close to +1; an r value of exactly +1 indicates a perfect positive fit, and positive values indicate a relationship between x and y such that as values for x increase, values for y also increase. Negative correlation: if x and y have a strong negative linear correlation, r is close to −1; an r value of exactly −1 indicates a perfect negative fit, and negative values indicate a relationship such that as values for x increase, values for y decrease. No correlation: if there is no linear correlation or a weak linear correlation, r is close to 0; a value near zero means that any relationship between the two variables is random or nonlinear.
Note that r is a dimensionless quantity; it does not depend on the units employed. A perfect correlation of ±1 occurs only when the data points all lie exactly on a straight line: if r = +1 the slope of this line is positive, and if r = −1 the slope is negative. A correlation greater than 0.8 is generally described as strong, whereas a correlation less than 0.5 is generally described as weak. These values can vary based upon the "type" of data being examined: a study using scientific data may require a stronger correlation than a study using social science data.
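A small sketch (numpy assumed) computing r from the deviation-form formula above and checking it against np.corrcoef:

    import numpy as np

    rng = np.random.default_rng(6)
    x = rng.normal(size=60)
    y = 0.9 * x + rng.normal(scale=0.5, size=60)

    xd, yd = x - x.mean(), y - y.mean()                  # deviation form
    r = np.sum(xd * yd) / np.sqrt(np.sum(xd**2) * np.sum(yd**2))
    print(r, np.corrcoef(x, y)[0, 1])                    # identical
    print(r**2)                                          # coefficient of determination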
C. Test for stability: a test of the hypothesis that the regression coefficients are stable over the sample; econometrics packages such as EViews make several standard diagnostics of this kind available. Stability tests are needed for testing, amongst other things, the validity of purchasing power parity and the consistency of the wage distribution over time. It is argued that stationarity tests are more appropriate than unit root tests in this situation, since the null hypothesis is usually that the series are stable. An implicit assumption in all regression models is that their coefficients remain constant across all observations. When they do not, and this occurs regularly with time series data in particular, the problem of structural change is encountered. After presenting a simulation example of a typical structural break in a regression, methods are introduced to test for such breaks, whether at a known point in time or when the breakpoint is unknown. An approach to modelling changing parameters using dummy variables is introduced, and a detailed example of a shifting regression relationship between inflation and interest rates brought about by policy regime changes is presented.
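A sketch of a Chow-type stability test at a known breakpoint (simulated data; statsmodels and scipy assumed), comparing the pooled fit with separate fits on the two subperiods:

    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(7)
    n, m = 100, 50                               # breakpoint after observation m
    x = rng.normal(size=n)
    y = np.where(np.arange(n) < m, 1 + 0.5 * x, 1 + 1.5 * x) + rng.normal(size=n)
    X = sm.add_constant(x)

    rss_pool = sm.OLS(y, X).fit().ssr
    rss1 = sm.OLS(y[:m], X[:m]).fit().ssr
    rss2 = sm.OLS(y[m:], X[m:]).fit().ssr

    k = X.shape[1]                               # parameters per regime
    F = ((rss_pool - rss1 - rss2) / k) / ((rss1 + rss2) / (n - 2 * k))
    print(F, stats.f.sf(F, k, n - 2 * k))        # large F / small p-value: break detected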

D. Degrees of freedom (df): in multiple regression analysis, the number of observations minus the number of estimated parameters. Equivalently, it refers to the number of scores in a distribution that are free to vary without changing the mean of the distribution. df also matters for power: the more observations, the greater the power of a test.

df = n − (k + 1), or df = n − k − 1

It is the number of "free" or unconstrained data points used in calculating a sample statistic or test statistic.
E. Linear functions of parameters: the classical linear regression model assumes that the model is linear in the parameters, regardless of whether the explanatory and dependent variables enter linearly or not. This is because if the model is non-linear in the parameters they are difficult to estimate: their values are unknown and all we are given are the data on the dependent and independent variables.
Example:
1. Y = α + βx + u is linear in both the parameters and the variables, so it satisfies the assumption.
2. lnY = α + β ln x + u is linear only in the parameters. Since the classical assumption concerns the parameters, the model still satisfies the assumption.

F. Nested and non-nested hypotheses:
Nested models: one model (the restricted model) is a special case of the other model (the unrestricted model).
E.g. log(salary) = β0 + β1 years + β2 gamesyr + β3 bavg + β4 hrunsyr + β5 rbisyr + u,
log(salary) = β0 + β1 years + β2 gamesyr + u.
Non-nested: neither equation is a special case of the other.
E.g. log(salary) = β0 + β1 years + β2 gamesyr + β3 bavg + β4 hrunsyr + u versus a model with a different set of regressors.
If we consider the following models:
Model A
Model B
we say model B is nested in model A because it is a special case of model A. If we estimate model A and test the hypothesis that the additional coefficients are zero, and do not reject it on the basis of the test, model A reduces to model B. If we consider:
Model C
Model D
these are non-nested models, because they do not contain the same variables. (An F test of the exclusion restrictions, as sketched below, chooses between nested models.)
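A sketch (simulated data; statsmodels assumed) of the F test for nested models, using compare_f_test to test whether the unrestricted model's extra coefficient is zero:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n = 200
    x1, x2, x3 = rng.normal(size=(3, n))
    y = 1.0 + 0.6 * x1 + 0.4 * x2 + rng.normal(size=n)   # x3 is irrelevant

    unrestricted = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2, x3]))).fit()
    restricted = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

    f_stat, p_value, df_diff = unrestricted.compare_f_test(restricted)
    print(f_stat, p_value)        # large p-value: cannot reject that beta3 = 0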

G. Analysis of variance (ANOVA): it tests the differences among different groups of data for homogeneity; that is, it compares the means of two or more parametric samples. The ANOVA F test has two degrees of freedom (see the sketch below):
1. Within groups: total number sampled minus the number of groups
2. Between groups: number of groups minus one
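A short illustration (scipy assumed, simulated samples) of a one-way ANOVA comparing three group means:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(9)
    g1 = rng.normal(10.0, 2.0, 30)
    g2 = rng.normal(10.5, 2.0, 30)
    g3 = rng.normal(12.0, 2.0, 30)           # shifted mean

    f_stat, p_value = stats.f_oneway(g1, g2, g3)
    print(f_stat, p_value)                   # df: (3 - 1) between, (90 - 3) within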

Q3. Explain the following tests for homoscedasticity
[illustrate each of these tests with data]

A. Ramsey's RESET test: RESET relies on a trick similar to the special form of the White test. Instead of adding functions of the X's directly, we add and test functions of ŷ. So we estimate
y = β0 + β1x1 + … + βkxk + δ1ŷ² + δ2ŷ³ + error
and test H0: δ1 = δ2 = 0 with an F test.
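A sketch using statsmodels' linear_reset function (available in recent statsmodels versions) on simulated data:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import linear_reset

    rng = np.random.default_rng(10)
    x = rng.normal(size=150)
    y = 1.0 + 0.5 * x + rng.normal(size=150)
    res = sm.OLS(y, sm.add_constant(x)).fit()

    # adds powers of the fitted values (y_hat^2, y_hat^3) and F-tests them
    print(linear_reset(res, power=3, use_f=True))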
B. The Goldfeld-Quandt test: observations are ordered by the magnitude of the independent variable thought to be related to the variance of the disturbances, and two equal-sized groups are formed after a certain number of central observations (d) are omitted. The first group corresponds to low values of the independent variable and the second group to high values. Separate regressions are run for each of the two groups and the ratio of their sums of squared residuals is formed. Assuming the error variances are normally distributed, this statistic has an F distribution, with the group degrees of freedom in both the numerator and the denominator.
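A sketch with statsmodels' het_goldfeldquandt (simulated heteroskedastic data):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_goldfeldquandt

    rng = np.random.default_rng(11)
    x = np.sort(rng.uniform(1, 10, 200))
    y = 2.0 + 0.5 * x + rng.normal(size=200) * x     # error variance grows with x

    X = sm.add_constant(x)
    f_stat, p_value, ordering = het_goldfeldquandt(y, X, drop=0.2)  # drop 20% central obs
    print(f_stat, p_value)            # small p-value: reject homoscedasticity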
C. Glejser test: similar to the Park test, but regresses the absolute value of the estimated residuals on the explanatory variable.
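A manual sketch of the Glejser idea (simulated data; statsmodels assumed): regress |û| on the regressor and t-test its slope:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(12)
    x = rng.uniform(1, 10, 200)
    y = 2.0 + 0.5 * x + rng.normal(size=200) * x     # heteroskedastic errors

    resid = sm.OLS(y, sm.add_constant(x)).fit().resid
    aux = sm.OLS(np.abs(resid), sm.add_constant(x)).fit()
    print(aux.tvalues[1], aux.pvalues[1])   # significant slope: heteroskedasticity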

D. Breusch-Pagan test: does not require ordering of the observations but requires the assumption of normality, and it requires specifying the relationship between the true error variance and the independent variables. The Breusch-Pagan test will detect any linear form of heteroscedasticity.
To conduct the test: calculate the least squares residuals from the original regression equation; estimate the error variance; run the auxiliary regression of the (scaled) squared residuals on the suspected variables; after obtaining the regression sum of squares, compare the test statistic with the critical value, and if the calculated value is greater than the tabulated (critical) value, reject the null hypothesis of homoscedasticity.
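A sketch with statsmodels' het_breuschpagan (simulated data); it returns the LM statistic and its p-value along with an F variant:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_breuschpagan

    rng = np.random.default_rng(13)
    x = rng.uniform(1, 10, 200)
    y = 2.0 + 0.5 * x + rng.normal(size=200) * x     # variance rises with x

    X = sm.add_constant(x)
    res = sm.OLS(y, X).fit()
    lm, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, X)
    print(lm, lm_pval)                # small p-value: reject homoscedasticity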

Q4. Explain the following


A. The Durbin-Watson test: a statistic used to test for first-order serial correlation in the errors of a time series regression model under the CLM assumptions; it is the classical test for autoregressive (AR(1)) serial correlation. The Durbin-Watson (DW) statistic is based on the OLS residuals:

DW = Σ(t=2..n) (ût − ût−1)² / Σ(t=1..n) ût²

The DW test involves calculating this statistic from the residuals of the OLS regression procedure. The DW statistic lies between 0 and 4; a value near 2 indicates no first-order serial correlation.
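A sketch (simulated AR(1) errors; statsmodels assumed) computing the DW statistic:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(14)
    n = 200
    x = rng.normal(size=n)
    u = np.zeros(n)
    for t in range(1, n):                     # AR(1) errors: u_t = 0.7 u_{t-1} + e_t
        u[t] = 0.7 * u[t - 1] + rng.normal()
    y = 1.0 + 0.5 * x + u

    res = sm.OLS(y, sm.add_constant(x)).fit()
    print(durbin_watson(res.resid))           # well below 2: positive serial correlation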
B. Estimation with quasi-first differences: each variable is quasi-differenced as ỹt = yt − ρ̂yt−1 (and likewise for the regressors), where ρ̂ is an estimate of the AR(1) parameter of the errors; OLS applied to the quasi-differenced data (feasible GLS) removes the serial correlation. (By contrast, in a panel data setting, pooled OLS applied to full first differences of the data across time gives the first-difference estimator.)
C. The Cochrane-Orcutt (CO) procedure: omits the first observation and uses the estimate ρ̂ obtained from the regression of ût on ût−1 for all t = 2, …, n; the model is then re-estimated on the quasi-differenced data, iterating until ρ̂ converges.
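A one-pass Cochrane-Orcutt sketch (simulated data; statsmodels assumed); a full implementation would iterate these steps until ρ̂ converges:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(15)
    n = 200
    x = rng.normal(size=n)
    u = np.zeros(n)
    for t in range(1, n):
        u[t] = 0.6 * u[t - 1] + rng.normal()
    y = 1.0 + 0.5 * x + u
    X = sm.add_constant(x)

    resid = sm.OLS(y, X).fit().resid
    rho = sm.OLS(resid[1:], resid[:-1]).fit().params[0]   # regress u_t on u_{t-1}

    y_star = y[1:] - rho * y[:-1]                         # quasi-differenced data,
    X_star = X[1:] - rho * X[:-1]                         # first observation dropped
    print(rho, sm.OLS(y_star, X_star).fit().params)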
D. Durbin's h test: a modified version of the DW test, needed when the regression contains a lagged dependent variable; for that reason it is called Durbin's h statistic:

h = (1 − DW/2) · √( T / (1 − T·Var(b1)) )

where DW is the standard DW statistic, T is the number of observations, and Var(b1) is the square of the standard error of the estimated parameter of the lagged dependent variable. The test statistic has been shown to be standard normally distributed under the null hypothesis of no autocorrelation, which means that the test value should be compared with a critical value from the standard normal table.
Models that include lagged dependent variables are affected by autocorrelation even more severely than the standard model: when the error term is serially correlated in a dynamic model, the estimated parameters are biased and inconsistent. It is therefore very important to correct for the problem before using the estimates for anything.
Example: assume that we have estimated the parameters of a dynamic model and obtained results with standard errors in parentheses. We use quarterly data over a 30-year period, which means that T = 120; since our model includes a lagged variable, we lose one observation in the estimation. Using the information from the regression results, we may form Durbin's h statistic. Using a one-sided test at the 5 percent significance level, the critical value is 1.645; since the test value is much larger than the critical value, we must conclude that our error terms are serially correlated.
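A small sketch of the h computation, with illustrative (hypothetical) inputs for DW, T and Var(b1):

    import numpy as np

    def durbin_h(dw, T, var_b1):
        """Durbin's h; valid only when T * var_b1 < 1."""
        rho_hat = 1 - dw / 2                    # implied AR(1) estimate from DW
        return rho_hat * np.sqrt(T / (1 - T * var_b1))

    # hypothetical values: DW = 1.5, T = 120 quarters, se(b1) = 0.04
    print(durbin_h(1.5, 120, 0.04**2))          # compare with the N(0,1) critical value 1.645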

Q5. State with reasons whether the following statements are TRUE,
FALSE or UNCERTAIN. Be precise
A. The t-test discussed in simple linear regression requires that the sampling distributions of the estimators β̂1 and β̂2 follow the normal distribution. TRUE. The t test is based on variables with a normal distribution. Since the estimators of α and β are linear combinations of the error ui, which is assumed to be normally distributed under the CLRM, these estimators are also normally distributed.

B. Even though the disturbance term in the CLRM is not normally distributed, the OLS
estimators are still unbiased.
TRUE. So long as E(ui) = 0, the OLS estimators are unbiased. No probabilistic assumptions are required to establish unbiasedness.

C. If there is no intercept in the regression model, the estimated ui (= ûi) will not sum to zero. TRUE. The differences between the two sets of formulas should be obvious: in the model with the intercept term absent, we use raw sums of squares and cross products, but in the intercept-present model we use adjusted (from the mean) sums of squares and cross products. Second, the df for computing σ̂² is (n − 1) in the first case and (n − 2) in the second case. (Why?) Although the intercept-less or zero-intercept model may be appropriate on occasion, there are some features of this model that need to be noted. First, Σûi, which is always zero for the model with the intercept term (the conventional model), need not be zero when that term is absent. In short, Σûi need not be zero for regression through the origin.

D. The p-value and the size of a test statistic mean the same thing.
TRUE. The p value is the smallest level of significance at which the null hypothesis can be rejected, and the terms "level of significance" and "size of the test" are synonymous.

E. In a regression model that contains the intercept, the sum of the residuals is always zero.
TRUE. When the intercept term is present (the conventional model), the first OLS normal equation forces Σûi = 0, so the residuals always sum to zero.

F. If a null hypothesis is not rejected, it is true. FALSE. All we can say is that the data at hand do not permit us to reject the null hypothesis; failing to reject is not the same as accepting.

Q6. State with brief reason whether the following statements are TRUE, FALSE or UNCERTAIN.
A. In the presence of heteroskedasticity OLS estimators are biased and inefficient.
FALSE. The estimators are unbiased but inefficient.
B. If heteroskedasticity is present, the conventional t and F tests are invalid.
TRUE. Under heteroskedasticity the usual OLS standard errors are no longer valid, so the t and F statistics built from them do not follow their assumed distributions.
C. In the presence of heteroskedasticity the usual OLS method always overestimates the standard errors of estimators.
FALSE. Typically, but not always, the variance will be overestimated.
D. If residuals estimated from an OLS regression exhibit a systematic pattern, it means heteroskedasticity is present in the data.
FALSE. Besides heteroscedasticity, such a pattern may result from autocorrelation, model specification errors, etc.

E. There is no general test of heteroskedasticity that is free of any assumption about which variable the error term is correlated with.
TRUE. Since the true errors ui are not directly observable, some assumption about the nature of the heteroscedasticity is inevitable.
F. If the regression model is mis-specified (e.g. an important variable is omitted), the OLS residuals will show a distinct pattern.
TRUE. Besides heteroscedasticity, such a pattern may result from autocorrelation and model specification errors.
Q7. Explain the meaning of each of the following
A. Seasonal dummy variables: sometimes we work with seasonally unadjusted data, and it is useful to know that simple methods are available for dealing with seasonality in regression models: we can include a set of seasonal dummy variables to account for seasonality in the dependent variable, the independent variables, or both.
B. Dependent dummy variables: a common specification in applied work has the dependent variable appearing in logarithmic form, with one or more dummy variables appearing as independent variables.
E.g. (a house-price equation, standard errors in parentheses):

log(p̂rice) = −1.35 + 0.168 log(lotsize) + 0.707 log(sqrft) + 0.027 bdrms + 0.054 colonial
               (0.65)  (0.038)             (0.093)            (0.029)       (0.045)
n = 88, R² = 0.649

The example shows that when log(Y) is the dependent variable, the coefficient on a dummy variable, multiplied by 100, is interpreted as the percentage difference in Y holding all other factors constant (here, a colonial-style house sells for about 5.4% more, ceteris paribus).
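A sketch (simulated data; statsmodels assumed) of a log(Y) regression with a dummy regressor, showing the 100·coefficient percentage interpretation:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(16)
    n = 300
    size = rng.uniform(50, 250, n)                    # a continuous regressor
    colonial = rng.integers(0, 2, n)                  # dummy regressor
    log_price = 4.0 + 0.7 * np.log(size) + 0.054 * colonial + rng.normal(0, 0.1, n)

    X = sm.add_constant(np.column_stack([np.log(size), colonial]))
    res = sm.OLS(log_price, X).fit()
    print(res.params[2] * 100)     # ~5.4: percent price difference when colonial = 1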
C. Linear probability model (LPM): the multiple linear regression model with a binary dependent variable, so called because the response probability is linear in the parameters βj. In the LPM, βj measures the change in the probability of success when xj changes, holding the other factors fixed:

ΔP(Y = 1 | X) = βj Δxj
D. Linear discriminant functions: the problem of finding a linear discriminant function can be formulated as a problem of minimizing a criterion function. The obvious criterion function for classification purposes is the sample risk, or training error: the average loss incurred in classifying the set of training samples. It is difficult to derive the minimum-risk linear discriminant, and for that reason it is suitable to investigate several related criterion functions that are analytically more tractable. Much attention is devoted to studying the convergence properties and computational complexities of various gradient descent procedures for minimizing criterion functions; the similarities between many of the procedures sometimes make it difficult to keep the differences between them clear.
A discriminant function that is a linear combination of the components of x can be written as

g(x) = wᵀx + w0

where w is the weight vector and w0 the bias or threshold weight. Linear discriminant functions can be studied for the two-category case, the multi-category case, and the general case (Figure 9.1). For the general case there will be c such discriminant functions, one for each of c categories.

E. The logit model: G is the cumulative distribution function of a standard logistic random variable. In the logit model, G is the logistic function:

G(z) = exp(z) / [1 + exp(z)] = Λ(z),

which lies between 0 and 1 for all real numbers z.
F. Probit model: in the probit model, G is the standard normal cumulative distribution function (CDF), which is expressed as an integral:

G(z) = Φ(z) ≡ ∫(−∞ to z) φ(v) dv,

where φ(z) is the standard normal density, φ(z) = (2π)^(−1/2) exp(−z²/2).
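A sketch (simulated binary data; statsmodels assumed) fitting the LPM, logit, and probit models from C, E, and F on the same data:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(17)
    n = 500
    x = rng.normal(size=n)
    p = 1 / (1 + np.exp(-(0.3 + 1.2 * x)))        # true logistic response probability
    y = rng.binomial(1, p)
    X = sm.add_constant(x)

    lpm = sm.OLS(y, X).fit()                       # linear probability model
    logit = sm.Logit(y, X).fit(disp=0)             # G = logistic CDF
    probit = sm.Probit(y, X).fit(disp=0)           # G = standard normal CDF
    print(lpm.params, logit.params, probit.params)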
G. The Tobit model: quite convenient when we want to estimate features of the distribution of y given x1, …, xk other than the conditional expectation. The Tobit model expresses the observed response, y, in terms of an underlying latent variable:

y* = β0 + xβ + u,   u | x ~ Normal(0, σ²)
y = max(0, y*)

H. Truncated regression model: arises when we exclude, on the basis of y, a subset of the population in our sampling scheme. In other words, we do not have a random sample from the underlying population, but we know the rule that was used to include units in the sample; this rule is determined by whether y is above or below a certain threshold.

PRACTICAL ILLUSTRATION:
1. In the following table you are given the ranks of 10 students in the midterm and final examinations in econometrics. Compute Spearman's coefficient of rank correlation and interpret it.

STUDENT:        A   B   C   D   E   F   G   H   I   J
Midterm rank:   1   3   7   10  9   5   4   8   2   6
Final rank:     3   2   8   7   9   6   5   10  1   4

Solution:

rs = 1 − 6ΣDi² / (n(n² − 1))

where rs = coefficient of rank correlation, D = the difference between paired ranks, and n = the number of pairs.

Midterm (R1)   Final (R2)   D = R1 − R2   D²
1              3            −2            4
3              2            1             1
7              8            −1            1
10             7            3             9
9              9            0             0
5              6            −1            1
4              5            −1            1
8              10           −2            4
2              1            1             1
6              4            2             4
                             Total        26

⇒ rs = 1 − 6ΣDi² / (n(n² − 1)) = 1 − 6(26) / (10 · 99) = 1 − 156/990 = 0.842

Interpretation: there is a strong positive rank correlation (rs = 0.842) between the midterm and final ranks: students who ranked high in the midterm tended to rank high in the final.
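The same computation (scipy assumed), using the ranks from the table:

    from scipy import stats

    midterm = [1, 3, 7, 10, 9, 5, 4, 8, 2, 6]
    final = [3, 2, 8, 7, 9, 6, 5, 10, 1, 4]

    rho, p_value = stats.spearmanr(midterm, final)
    print(rho, p_value)        # rho = 0.842, matching the hand computation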


2. Consider the following regression output:

Ŷi = 0.2033 + 0.6560 Xi
se = (0.0976) (0.1961)
r² = 0.397, RSS = 0.0544, ESS = 0.0358

Where Y = labor force participation rate (LFPR) of women in 1972 and X = LFPR of women in 1968. The regression results were obtained from a sample of 19 cities of a certain country.

A. How do you interpret the regression?
There is a positive association between the LFPR in 1972 and the LFPR in 1968, which is not surprising given that since World War II there has been a steady increase in the LFPR of women.

B. Test the hypothesis H0: β2 ≥ 1 (against H1: β2 < 1). Which test do you use? And why? What are the underlying assumptions of the test you use?
Use the one-tail t test (which assumes normally distributed errors under the CLRM):

t = (0.6560 − 1) / 0.1961 = −1.7542

For 17 df, the one-tailed critical t value at α = 5% is 1.740. Since the estimated t value is significant at this level, we can reject the hypothesis that the true slope coefficient is 1 or greater.
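A quick check of this one-tailed test (scipy assumed), using the reported estimate and standard error:

    from scipy import stats

    beta_hat, se, beta0, df = 0.6560, 0.1961, 1.0, 17
    t = (beta_hat - beta0) / se
    t_crit = stats.t.ppf(0.05, df)            # lower-tail 5% critical value
    print(t, t_crit, t < t_crit)              # -1.754 < -1.740: reject H0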

C. Suppose that the LFPR in 1968 was 0.58. On the basis of the regression results given above, what is the mean LFPR in 1972? Establish a 95% confidence interval for the mean prediction.
The mean LFPR is 0.2033 + 0.6560(0.58) = 0.5838. To establish a 95% confidence interval for this forecast value, use the formula 0.5838 ± 2.11·(se of the mean forecast value), where 2.11 is the 5% critical t value for 17 df. To get the standard error of the forecast value, use Eq. (5.10.2). But note that since the authors do not give the mean value of the LFPR of women in 1968, we cannot compute this standard error.
D. How would you test the hypothesis that the error term in the population regression is normally distributed? Show the necessary calculation.
Without the actual data we cannot answer this question, because we need the values of the residuals to plot a normal probability plot or to compute the Jarque-Bera test statistic.

3. From the data for 46 states in the US for 1992, Baltagi obtained the following regression results:

log Ĉ = 4.30 − 1.34 log P + 0.17 log Y
se = (0.91) (0.32) (0.20), adjusted R² = 0.27

Where C = cigarette consumption, packs per year
P = real price per pack
Y = real disposable income per capita

A. What is the elasticity of demand for cigarettes with respect to price?
I. Is it statistically significant?
The price elasticity is −1.34. It is significantly different from zero: the t value under the null hypothesis that the true elasticity coefficient is zero is

t = −1.34 / 0.32 = −4.19

The p value of obtaining such a t value is extremely low, so for 1992 the demand for cigarettes is price-elastic and significantly so.
II. If so, is it statistically different from 1 (in absolute value)?
The elasticity coefficient is not statistically different from one: under the null hypothesis that the true elasticity is −1, the t value is

t = (−1.34 − (−1)) / 0.32 = −1.0625

This t value is not statistically significant.
B. What is the income elasticity of demand for cigarettes?
i. Is it statistically significant? The income elasticity (0.17), although positive, is not statistically different from zero, as the t value under the zero null hypothesis (0.17/0.20 = 0.85) is less than 1.
ii. If not, what might be the reasons for it? The coefficient is estimated very imprecisely: its standard error is larger than the estimate itself, so with only 46 observations the sample may simply be too noisy to pin down the income effect.

C. How would you retrieve R² from the adjusted R² given above?
Using adjusted R² = 1 − (1 − R²)(n − 1)/(n − k), we have R² = 1 − (1 − adjusted R²)(n − k)/(n − 1). Since adjusted R² = 0.27, n = 46, and k = 3, by substitution the reader can verify that R² = 0.3026, approximately.

Q4. From a sample of 209 observations, Wooldridge obtained the following regression results:

log(ŝalary) = 4.32 + 0.280 log(sales) + 0.0174 roe + 0.00024 ros
se = (0.32) (0.035) (0.0041) (0.00054), R² = 0.283
Where: salary=salary of CEO
Sales=annual firm sales
Roe=return on equity in percent
Ros=return on firm’s stock
A. Interpret the preceding regression taking into account any prior expectations that you may have about the signs of the various coefficients.
A priori, salary and each of the explanatory variables are expected to be positively related, which they are. The coefficient 0.280 means that, ceteris paribus, a 1 percent increase in sales raises CEO salary by about 0.28 percent (an elasticity). The coefficient 0.0174 means that, ceteris paribus, if the rate of return on equity goes up by 1 percentage point (note: not by 1 percent), then CEO salary goes up by about 1.74%. Similarly, ceteris paribus, if the return on the firm's stock goes up by 1 percentage point, CEO salary goes up by about 0.024%.

B. Which of the coefficients are individually statistically significant at the 5 percent level?
Under the individual (separate) null hypothesis that each true population coefficient is zero, the t values are obtained by simply dividing each estimated coefficient by its standard error. The t values for the four coefficients in the model are, respectively, 13.5, 8, 4.25, and 0.44. Since the sample is large enough, using the two-t rule of thumb, the first three coefficients are individually highly statistically significant, whereas the last one is insignificant.

C. What is the overall significance of the regression?
i. Which test do you use?
To test the overall significance, that is, that all the slope coefficients are zero, we use the F test:

F = [R²/(k − 1)] / [(1 − R²)/(n − k)] = (0.283/3) / (0.717/205) ≈ 27.02

ii. And why? Because, under the null hypothesis, this F statistic has the F distribution with 3 and 205 df in the numerator and denominator, respectively. The p value of obtaining such an F value is extremely small, leading to rejection of the null hypothesis.
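A tiny helper (scipy assumed) computing the overall F statistic from R², as used above:

    from scipy import stats

    def overall_f(r2, n, k):
        """F test of overall significance; k = number of parameters incl. intercept."""
        f = (r2 / (k - 1)) / ((1 - r2) / (n - k))
        return f, stats.f.sf(f, k - 1, n - k)    # statistic and p-value

    print(overall_f(0.283, 209, 4))              # ~27.0, p-value near zero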

D. Can you interpret the coefficients of roe and ros as elasticity coefficients?
No. i. Why not? Because the dependent variable is in logarithmic form while roe and ros enter linearly, the coefficients of these variables are semi-elasticities, not elasticities.
ii. What do they measure? The (proportionate) growth rate in the dependent variable for an absolute (unit) change in the regressor.

Q5. In studying the demand for farm tractors in the US for the periods 1921-1941 and 1948-1957, Griliches obtained the following results:

log Ŷt = constant − 0.519 log X2t − 4.933 log X3t,   R² = 0.793
se =               (0.231)         (0.477)

Where: Y = value of stock of tractors on farms as of January 1, in 1935-1938 dollars,
X2 = index of prices paid for tractors divided by an index of prices received for all crops at time t−1,
X3 = interest rate prevailing in year t−1. The estimated standard errors are given in parentheses.

A. Interpret the preceding regression.
The logs of the real price index and the interest rate in the previous year explain about 79 percent of the variation in the log of the stock of tractors, a form of capital. Since this is a double-log model, the slope coefficients are (partial) elasticities: −0.519 is the price elasticity and −4.933 the interest-rate elasticity of demand. Both elasticities have the a priori expected (negative) signs.
B. Are the estimated slope coefficients
I. Individually statistically significant?
Each partial slope coefficient is individually statistically significant at the 5% level (t = −0.519/0.231 = −2.25 and t = −4.933/0.477 = −10.34).
II. Are they significantly different from unity?
Each is also significantly different from (minus) unity.

C. Use the analysis of variance technique to test the significance of the overall regression. [Hint: use the R² variant of the ANOVA technique]

F = [R²/(k − 1)] / [(1 − R²)/(n − k)] = (0.793/2) / (0.207/28) = 53.63

With n = 31 and k = 3, the reader can verify that this F value is highly significant.

D. How would you compute the interest-rate elasticity of demand for farm tractors?
It is the coefficient on log X3 in the double-log model, −4.933, as interpreted in (A).

E. How would you test the significance of the estimated R²?
Testing the significance of R² is equivalent to testing the overall significance of the regression: the F test in (C) shows that R² = 0.793 is statistically significant with n = 31 and k = 3.
Q6. A researcher regressed child mortality (CM) on per capita GNP (PGNP) and female literacy rate (FLR). The same researcher later extended this model by including the total fertility rate (TFR). The two sets of regression results are:

(1) CM̂i = 263.6416 − 0.0056 PGNPi − 2.2316 FLRi
    se = (11.5932) (0.0019) (0.2099),  R² = 0.7077

(2) CM̂i = 168.3067 − 0.0055 PGNPi − 1.7680 FLRi + 12.8686 TFRi
    se = (32.8916) (0.0018) (0.2480) (?)
A. How would you interpret the coefficient of TFR?
i. A priori, would you expect a positive or a negative relationship between CM and TFR?
A priori, one would expect a positive relationship between CM and TFR.
ii. Justify your answer. The larger the number of children born to a woman, the greater the likelihood of increased child mortality, due to health and other reasons; the coefficient 12.8686 says that, other things equal, one more child per woman is associated with an increase in child mortality of about 12.9 units.
B. Have the coefficient values of PGNP and FLR:
i. Changed between the two equations?
The coefficients of PGNP are not very different, but those of FLR look different. To see if the difference is real, we can use the t test.
ii. If so, what may be the reason(s) for such a change?
Omitting TFR from model (1) biases the FLR coefficient if FLR and TFR are correlated. Suppose we use Eq. (1) and hypothesize that the true coefficient of FLR is −1.7680 (its value in model (2)). Then

t = (−2.2316 − (−1.7680)) / 0.2099 = −0.4636 / 0.2099 = −2.2086

iii. Is the observed difference statistically significant? Yes.
iv. Which test do you use and why? The t test, as above; because this t value exceeds 2 in absolute terms, we can reject the hypothesis that the true coefficient is −1.7680.
C. How would you choose between model (1) and model (2)?
i. Which statistical test would you use to answer this question?
We can treat model (1) as the restricted version of model (2). Hence we can use the R² version of the F test, since the dependent variables in the two models are the same.
ii. Show the necessary calculation.
With R² = 0.7474 for model (2), R² = 0.7077 for model (1), n = 64 and k = 4, the resulting F statistic is:

F = [(0.7474 − 0.7077)/1] / [(1 − 0.7474)/(64 − 4)] = 0.0397 / 0.0042 = 9.4523

Under the standard assumptions, this has the F distribution with 1 and 60 df in the numerator and denominator, respectively. The 1% critical F for these dfs is 7.08. Since the computed F exceeds this critical value, we can reject the restricted model (1) and conclude that the TFR variable belongs in the model.

D. We have not given the standard error of the coefficient of TFR.
i. Can you find it out? Yes, using F = t² [recall the relationship between the t and F distributions]. Taking the (positive) square root of the F value given in (C): t = √9.4523 = 3.0744, approximately.
ii. Therefore, under the null hypothesis that the true coefficient of TFR in model (2) is zero, t = 12.8686/se, so se(TFR) = 12.8686/3.0744 = 4.1857, approximately.
Q7. To study the rate of growth of population in city X over the period 1970-1992, Mukherjee et al. estimated the following models:

Model I:  ln(p̂op)t = 4.73 + 0.024t
          t = (781.25) (54.71)

Model II: ln(p̂op)t = 4.77 + 0.015t − 0.075Dt + 0.011(Dt·t)
          t = (2477.92) (34.01) (−17.01) (25.54)

Where: pop = population in millions, t = trend variable, Dt = 1 for observations beginning in 1978 and 0 before 1978, and ln = natural logarithm.
A. In model I, what is the rate of growth of city X's population over the sample period?
About 2.4% per year: in a log-lin model the slope coefficient on the trend variable measures the (instantaneous) rate of growth.

C. Are the population growth rates statistically different pre- and post-1978?
i. How do you know?
Since both the differential intercept and the differential slope coefficients are highly significant (t = −17.01 and 25.54), the levels as well as the growth rates of population in the two periods are different.
ii. If they are different, what are the growth rates for 1972-1977 and 1978-1992?
The growth rate for the period before 1978 is 1.5%, and after 1978 it is 2.6% (= 1.5% + 1.1%).
Thank you!
