Econometrics Assignment... 2

PHARMA COLLEGE
DEPARTMENT OF ACCOUNTING AND FINANCE

MSC IN ACCOUNTING AND FINANCE
ECONOMETRICS ASSIGNMENT
1. The following results have been obtained from a sample of 11 observations on the
values of sales (Y) of a firm and the corresponding prices (X).
$\bar{X} = 519.18$
$\bar{Y} = 217.82$
$\sum X_i^2 = 3{,}134{,}543$
$\sum X_i Y_i = 1{,}296{,}836$
$\sum Y_i^2 = 539{,}512$
i) Estimate the regression line of sales on price and interpret the results
ii) What is the part of the variation in sales which is not explained by the regression
line?
iii) Estimate the price elasticity of sales.
Answer
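i) The regression line of sales on price can be represented by the equation $Y = a + bX$, where
$$b = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}, \qquad a = \bar{Y} - b\bar{X}, \qquad N = 11.$$
Since $\sum X = N\bar{X}$ and $\sum Y = N\bar{Y}$, putting in the values gives
$$b = \frac{(11)(1{,}296{,}836) - (11)(519.18)(11)(217.82)}{(11)(3{,}134{,}543) - (11)(519.18)(11)(519.18)}$$
Dividing the numerator and the denominator by 11,
$$b = \frac{52{,}870.3364}{169{,}516.4036} = 0.3119 \approx 0.31$$
$$a = 217.82 - (0.3119)(519.18) = 55.89$$
Therefore, the regression line of sales on price is $\hat{Y} = 55.89 + 0.31X$.
Interpretation: the slope is positive, so in this sample a one-unit increase in price is associated with an increase of about 0.31 units in sales. The intercept, 55.89, is the fitted value of sales when the price is zero.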
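The slope and intercept can be checked numerically from the summary statistics alone. The following is a minimal sketch, assuming only the figures given in the question; the variable names are illustrative.

```python
# Summary statistics given in Question 1 (n = 11 observations).
n = 11
mean_x, mean_y = 519.18, 217.82
sum_x2 = 3_134_543
sum_xy = 1_296_836

# Convert the raw sums to sums in deviation form.
s_xy = sum_xy - n * mean_x * mean_y   # sum of (X - Xbar)(Y - Ybar)
s_xx = sum_x2 - n * mean_x ** 2       # sum of (X - Xbar)^2

b = s_xy / s_xx            # slope of the regression of sales on price
a = mean_y - b * mean_x    # intercept

print(f"b = {b:.4f}, a = {a:.2f}")
```

Working in deviation form avoids carrying the factor of 11 through both the numerator and the denominator.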
ii) The share of the variation in sales that is explained by the regression is measured by $R^2$, the square of the correlation coefficient between X and Y:
$$R^2 = \frac{\left[N\sum XY - (\sum X)(\sum Y)\right]^2}{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}$$
In deviation form (dividing each bracket by 11),
$$R^2 = \frac{(52{,}870.3364)^2}{(169{,}516.4036)(17{,}610.9236)} = 0.936$$
The correlation coefficient itself is $R = \sqrt{0.936} = 0.968$.
The part of the variation in sales that is not explained by the regression line is therefore $1 - R^2 = 1 - 0.936 = 0.064$; that is, about 6.4% of the variation in sales is not explained by price.
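The distinction between the correlation coefficient and its square matters here, so it may help to compute both. This is a small sketch using only the summary statistics from the question; the names are illustrative.

```python
# Summary statistics given in Question 1.
n = 11
mean_x, mean_y = 519.18, 217.82
sum_x2, sum_y2, sum_xy = 3_134_543, 539_512, 1_296_836

# Sums of squares and cross products in deviation form.
s_xx = sum_x2 - n * mean_x ** 2
s_yy = sum_y2 - n * mean_y ** 2
s_xy = sum_xy - n * mean_x * mean_y

r = s_xy / (s_xx * s_yy) ** 0.5   # correlation coefficient
r_squared = r ** 2                # share of variation explained
unexplained = 1 - r_squared       # share of variation not explained

print(f"r = {r:.3f}, R^2 = {r_squared:.3f}, unexplained = {unexplained:.3f}")
```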
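iii) The price elasticity of sales can be calculated as
$$e_p = \frac{dY}{dX}\cdot\frac{X}{Y} = b\,\frac{X}{Y} = 0.31\,\frac{X}{Y}$$
Evaluated at the sample means, $e_p = 0.3119 \times \dfrac{519.18}{217.82} \approx 0.74$.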
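For a linear model the elasticity changes from point to point, so a numerical value needs a point of evaluation. The sketch below evaluates it at the sample means; the function name `elasticity` is an illustrative choice, not something defined in the assignment.

```python
def elasticity(slope, x, y):
    """Point elasticity of y with respect to x for a linear model y = a + b*x."""
    return slope * x / y

# Slope from part (i): ratio of the deviation-form sums.
b = 52870.3364 / 169516.4036

# Evaluate at the sample means of price and sales.
e_mean = elasticity(b, 519.18, 217.82)
print(f"elasticity at the means = {e_mean:.3f}")
```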
2. The following data refers to the price of a good ‘P’ and the quantity of the good supplied, ‘S’.
P    2    7    5    1    4    8    2    8
S   15   41   32    9   28   43   17   40
a. Estimate the linear regression line $E(S) = \alpha + \beta P$
b. Estimate the standard errors of $\hat{\alpha}$ and $\hat{\beta}$

c. Test the hypothesis that price influences supply
Answer
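a) The linear regression line is estimated from the sums of squares and cross products in deviation form. From the table, $\sum S_i = 225$ and $\sum P_i = 37$, so
$$\bar{S} = \frac{\sum S_i}{n} = \frac{225}{8} = 28.125, \qquad \bar{P} = \frac{\sum P_i}{n} = \frac{37}{8} = 4.625$$
$$S_{SS} = \sum_{i=1}^{8} (S_i - \bar{S})^2 = 1204.875 \approx 1205$$
$$S_{PP} = \sum_{i=1}^{8} (P_i - \bar{P})^2 = 55.875 \approx 55.9$$
$$S_{SP} = \sum_{i=1}^{8} (S_i - \bar{S})(P_i - \bar{P}) = 255.375 \approx 255.4$$
The least-squares estimates are
$$\hat{\beta} = \frac{S_{SP}}{S_{PP}} = \frac{255.375}{55.875} = 4.5705, \qquad \hat{\alpha} = \bar{S} - \hat{\beta}\bar{P} = 28.125 - (4.5705)(4.625) = 6.9866$$
Note that dividing $S_{SP}$ by $\sqrt{S_{SS}}\sqrt{S_{PP}}$ gives the correlation coefficient, $r = 255.375/\sqrt{(1204.875)(55.875)} = 0.984$, not the slope.
Therefore, the estimated regression line is $\hat{S} = 6.9866 + 4.5705P$.
b) The standard errors (SE) of $\hat{\alpha}$ and $\hat{\beta}$ are calculated as follows:
$$SE(\hat{\alpha}) = \hat{\sigma}\sqrt{\frac{1}{n} + \frac{\bar{P}^2}{S_{PP}}}, \qquad SE(\hat{\beta}) = \frac{\hat{\sigma}}{\sqrt{S_{PP}}}$$
where, using $\hat{\beta}^2 S_{PP} = S_{SP}^2 / S_{PP} = 1167.184$,
$$\hat{\sigma}^2 = \frac{SSE}{n-2} = \frac{1}{n-2}\left[S_{SS} - \hat{\beta}^2 S_{PP}\right] = \frac{1}{8-2}\left[1204.875 - 1167.184\right] = \frac{37.691}{6} = 6.2819$$
$$\hat{\sigma} = \sqrt{6.2819} = 2.5064$$
$$SE(\hat{\alpha}) = 2.5064\sqrt{\frac{1}{8} + \frac{(4.625)^2}{55.875}} = 2.5064 \times 0.7126 = 1.7861$$
$$SE(\hat{\beta}) = \frac{2.5064}{\sqrt{55.875}} = 0.3353$$
c) Testing the hypothesis that price influences supply:
$H_0: \beta = 0$ versus $H_1: \beta \neq 0$ at the 5% significance level
$$t = \frac{\hat{\beta} - 0}{SE(\hat{\beta})} = \frac{4.5705}{0.3353} = 13.63$$
The tabulated value is $t_{0.025,\,6} = 2.4469$.
Since $|t| = 13.63 > 2.4469$, we reject $H_0$: at the 5% significance level, price has a statistically significant, positive effect on supply.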
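Because the raw data are given, every quantity in this answer can be recomputed directly from the eight (P, S) pairs. The following is a minimal sketch in plain Python; the critical value 2.4469 is copied from the t table rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Data from Question 2.
P = [2, 7, 5, 1, 4, 8, 2, 8]
S = [15, 41, 32, 9, 28, 43, 17, 40]
n = len(P)

p_bar = sum(P) / n
s_bar = sum(S) / n

# Sums of squares and cross products in deviation form.
s_pp = sum((p - p_bar) ** 2 for p in P)
s_ss = sum((s - s_bar) ** 2 for s in S)
s_sp = sum((s - s_bar) * (p - p_bar) for s, p in zip(S, P))

# OLS estimates of the slope and intercept.
beta = s_sp / s_pp
alpha = s_bar - beta * p_bar

# Residual variance and standard errors.
rss = s_ss - beta ** 2 * s_pp
sigma = sqrt(rss / (n - 2))
se_beta = sigma / sqrt(s_pp)
se_alpha = sigma * sqrt(1 / n + p_bar ** 2 / s_pp)

# t test of H0: beta = 0 against a two-sided alternative.
t_stat = beta / se_beta
T_CRIT = 2.4469  # tabulated t(0.025, 6); taken from a t table, not computed here
reject_h0 = abs(t_stat) > T_CRIT

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
print(f"SE(alpha) = {se_alpha:.4f}, SE(beta) = {se_beta:.4f}, t = {t_stat:.2f}")
```

A quick plausibility check is to plug a data point into the fitted line: at P = 1 the observed supply is 9, so a fitted value far from 9 would signal an arithmetic slip.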
3. Suppose that a researcher estimates a consumption function and obtains the following results:
$\hat{C} = 15 + 0.81\,Y_d$, n = 19, $R^2 = 0.99$
t-ratios: (3.1) for the intercept and (18.7) for the coefficient on $Y_d$
where C = consumption, Yd = disposable income, and the numbers in parentheses are the t-ratios.
a. Test the significance of Yd statistically using the t-ratios
b. Determine the estimated standard deviations of the parameter estimates
Answer
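For a fitted regression model $\hat{y} = \hat{\beta}_1 + \hat{\beta}_2 X$, with Y as the response and X as the predictor variable, the test statistic for testing the significance of X is given by
$$T = \frac{\hat{\beta}_2 - \beta_2^0}{\sqrt{\hat{\sigma}^2 / S_{XX}}} \;\overset{H_0}{\sim}\; t_{n-2}$$
where
$$\hat{\sigma}^2 = \frac{RSS}{n-2}, \qquad S_{XX} = \sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,$$
and $\beta_2^0$ is the hypothesized value of $\beta_2$; here it is 0.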
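a) In this case, Y = C, X = Yd, n = 19 and $R^2 = 0.99$.
The t-ratio for $\hat{\beta}_1$ is 3.1 and the t-ratio for $\hat{\beta}_2$ is 18.7.
We test the null hypothesis $H_0: \beta_2 = 0$ against the alternative hypothesis $H_1: \beta_2 \neq 0$.
The test statistic for this test is $t_{\beta_2}$ = t-ratio for $\hat{\beta}_2$ = 18.7 (because under $H_0$, $\beta_2^0 = 0$).
The two-sided p-value for this test can be computed from the t-distribution with n − 2 = 19 − 2 = 17 degrees of freedom using the R code
2*pt(18.7, 17, lower.tail = FALSE)
which gives a p-value that is zero to several decimal places, so we reject the null hypothesis. Equivalently, 18.7 far exceeds the 5% critical value $t_{0.025,\,17} = 2.110$.
Therefore, it can be concluded that Yd is statistically significant.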
b) The t-ratio is the coefficient estimate divided by its standard error, and the standard error is the estimated standard deviation of the estimate. Hence
t-ratio($\hat{\beta}_1$) = 3.1 = $\hat{\beta}_1$/se($\hat{\beta}_1$) = 15/se($\hat{\beta}_1$)
se($\hat{\beta}_1$) = 15/3.1 = 4.839
t-ratio($\hat{\beta}_2$) = 18.7 = $\hat{\beta}_2$/se($\hat{\beta}_2$) = 0.81/se($\hat{\beta}_2$)
se($\hat{\beta}_2$) = 0.81/18.7 = 0.043
Therefore, the estimated standard deviations of the parameter estimates are 4.839 and 0.043,
respectively.
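The same back-calculation can be sketched in a few lines. The dictionary keys and the hard-coded critical value 2.110 (the two-sided 5% point of the t distribution with 17 degrees of freedom, read from a table) are assumptions made for illustration.

```python
# Reported estimates and t-ratios from Question 3.
coef = {"const": 15.0, "Yd": 0.81}
t_ratio = {"const": 3.1, "Yd": 18.7}

# Under H0: beta = 0 the t-ratio is coefficient / standard error,
# so the standard error is coefficient / t-ratio.
se = {name: coef[name] / t_ratio[name] for name in coef}

# Two-sided test at the 5% level with n - 2 = 17 degrees of freedom.
T_CRIT = 2.110  # assumed from a t table, not computed here
significant = {name: abs(t_ratio[name]) > T_CRIT for name in coef}

for name in coef:
    print(f"{name}: se = {se[name]:.3f}, significant = {significant[name]}")
```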
4. Discuss the nature, causes, consequences and remedies of each of the following problems we
might encounter in regression analysis.
a) Multicollinearity
b) Heteroscedasticity
c) Autocorrelation
Answer
a) Multicollinearity
Multicollinearity is the occurrence of high intercorrelations among two or more independent
variables in a multiple regression model. Its causes can be data-based or structural. Data-based
multicollinearity can arise from insufficient data, from the incorrect use of dummy variables,
from including a variable that is a combination of two other variables in the model, or from
including two identical or almost identical variables. Structural multicollinearity is created by
the model specification itself, for example when a regressor and its square both appear in the
model. The consequences are inflated standard errors and less precise coefficient estimates,
which lowers the power of tests on individual coefficients. Remedies include removing some of
the highly correlated independent variables, combining them (for example by adding them
together), and using a method designed for highly correlated variables, such as principal
components analysis or partial least squares regression.
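The inflation of standard errors is usually summarized by the variance inflation factor (VIF). The sketch below covers only the special case of two regressors, where the auxiliary R-squared equals the squared correlation between them; the data and the helper names `pearson_r` and `vif_two_regressors` are hypothetical.

```python
def pearson_r(u, v):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    s_uv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    s_uu = sum((a - mu) ** 2 for a in u)
    s_vv = sum((b - mv) ** 2 for b in v)
    return s_uv / (s_uu * s_vv) ** 0.5

def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2); with two regressors R^2 is the squared correlation."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical data: x2 is almost exactly twice x1, x3 is unrelated to x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
x3 = [5, 1, 4, 2, 6, 3]

print(f"VIF(x1, x2) = {vif_two_regressors(x1, x2):.1f}")
print(f"VIF(x1, x3) = {vif_two_regressors(x1, x3):.3f}")
```

With more than two regressors the auxiliary R-squared has to come from regressing each variable on all the others, which this sketch does not attempt.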
b) Heteroscedasticity
Heteroscedasticity is a situation in which the variance of the residuals is not constant over the
range of measured values, so the residuals show an unequal scatter. One cause is a dataset with
a wide range of values, which may include outliers. Another cause is the omission of relevant
variables from the model. Under heteroscedasticity the OLS estimators remain linear and
unbiased, but they are no longer best (minimum variance), so they are not BLUE. In addition,
the usual standard errors are biased, so hypothesis tests of the estimated coefficients using the
t-test and F-test become invalid. The remedies for heteroscedasticity include data
transformations (square root, exponential, logarithmic, absolute value and inverse
transformations).
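As a small illustration of why a transformation can help, consider values whose spread is proportional to their level. The sketch below uses made-up numbers (two groups scattered by ±10% around 10 and 1000) and compares the variance ratio before and after taking logarithms.

```python
from math import log
from statistics import pvariance

# Hypothetical observations: the spread is proportional to the level (+/- 10%).
low = [9.0, 10.0, 11.0]
high = [900.0, 1000.0, 1100.0]

# Ratio of group variances on the raw scale and on the log scale.
ratio_raw = pvariance(high) / pvariance(low)
ratio_log = pvariance([log(v) for v in high]) / pvariance([log(v) for v in low])

print(f"variance ratio, raw scale: {ratio_raw:.1f}")
print(f"variance ratio, log scale: {ratio_log:.3f}")
```

The logarithm turns proportional scatter into roughly constant scatter, which is the situation in which this particular remedy is appropriate; other variance patterns may call for a different transformation.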
c) Autocorrelation
Autocorrelation is the correlation of a variable with itself across successive time intervals: it
measures the relationship between a variable's current value and its past values. It is a
correlation coefficient, but instead of relating two different variables it relates two values of
the same variable. In a regression, it means that the disturbances are not pairwise independent
but pairwise autocorrelated. Autocorrelation is most likely to occur in time series data. It can
be caused by seasonal shocks that affect a variable differently in different periods, by model
misspecification, and by data smoothing or manipulation. With autocorrelated errors the OLS
coefficient estimates remain linear and unbiased but are no longer best, so they are not BLUE.
With positive autocorrelation the usual formulas also tend to underestimate the variances of the
estimates, which distorts hypothesis testing: t-statistics tend to be inflated and the coefficient of
determination overstated. When autocorrelated error terms are found, one of the first remedial
measures should be to check whether a key predictor variable has been omitted. If adding such
a predictor does not reduce or eliminate the autocorrelation of the error terms, certain
transformations of the variables can be performed.
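The idea of correlating a series with its own past values can be sketched with a lag-1 sample autocorrelation. The two residual series below are hypothetical and the function name is illustrative.

```python
def lag1_autocorrelation(e):
    """Sample lag-1 autocorrelation of a series, e.g. regression residuals."""
    n = len(e)
    mean = sum(e) / n
    # Covariance between each value and the value one period earlier.
    num = sum((e[t] - mean) * (e[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in e)
    return num / den

# Hypothetical residuals: one series drifts slowly, the other flips sign each period.
persistent = [1.0, 1.2, 0.9, 0.5, -0.2, -0.8, -1.1, -0.9, -0.4, 0.8]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(f"persistent:  {lag1_autocorrelation(persistent):.3f}")
print(f"alternating: {lag1_autocorrelation(alternating):.3f}")
```

A value well above zero points to positive autocorrelation and a value well below zero to negative autocorrelation; a formal test would still be needed before drawing conclusions.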
5. Use the data file wage to work on using STATA and answer the following questions:
a) Examine the data

b) Carry out remedial measure(s) if there is any problem with data
c) Regress HRS on RATE, ERSP, ERNO, NEIN, AGE and DEP
d) Conduct model specification tests using linktest and ovtest commands of STATA, and
interpret the result
e) Perform multicollinearity test
f) Perform heteroscedasticity test
g) Comment on the explanatory power and adequacy of the model
h) Interpret the regression coefficients
Answer
a) The data has 10 variables and 39 observations.
b) Remedial measures are not needed because the given data had no problems of
multicollinearity or heteroscedasticity.
c) The results of the regression analysis are presented as follows:
Source   |         SS    df          MS      Number of obs =      36
---------+------------------------------     F(6, 29)      =   19.79
Model    | 115137.522     6  19189.5869      Prob > F      =  0.0000
Residual | 28119.2285    29  969.628568      R-squared     =  0.8037
---------+------------------------------     Adj R-squared =  0.7631
Total    |  143256.75    35     4093.05      Root MSE      =  31.139

HRS      |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
RATE     | -26.85105  24.04727   -1.12   0.273   -76.03323   22.33114
ERSP     |  .0285354  .0357251    0.80   0.431   -.0445307   .1016015
ERNO     | -.2780987   .091545   -3.04   0.005   -.4653293  -.0908681
NEIN     |  .5846803  .0892013    6.55   0.000    .4022431   .7671175
AGE      | -5.239095   2.40644   -2.18   0.038   -10.16082  -.3173734
DEP      |  27.15917  13.75215    1.97   0.058   -.9671344   55.28547
_cons    |  2210.755  104.2362   21.21   0.000    1997.568   2423.942
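The summary statistics in the header of the output all follow from the sums of squares and degrees of freedom in the ANOVA part of the table. A minimal sketch of those relationships, with illustrative variable names:

```python
# Sums of squares and degrees of freedom from the regression of HRS.
ss_model, df_model = 115137.522, 6
ss_resid, df_resid = 28119.2285, 29

ss_total = ss_model + ss_resid
n = df_model + df_resid + 1          # number of observations

# F statistic: ratio of the model and residual mean squares.
f_stat = (ss_model / df_model) / (ss_resid / df_resid)

# R-squared and its degrees-of-freedom adjusted version.
r2 = ss_model / ss_total
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid

# Root MSE: square root of the residual mean square.
root_mse = (ss_resid / df_resid) ** 0.5

print(f"n = {n}, F = {f_stat:.2f}, R2 = {r2:.4f}, adj R2 = {adj_r2:.4f}, Root MSE = {root_mse:.3f}")
```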
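d) 1. Linktest
. linktest

Source   |         SS    df          MS      Number of obs =      36
                                             F(2, 33)      =   70.14
Model    | 115975.366     2  57987.6832      Prob > F      =  0.0000
Total    |  143256.75    35     4093.05      Root MSE      =  28.753

HRS      |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
_hat     |  6.920836  5.882094    1.18   0.248   -5.046374   18.88805
_hatsq   | -.0013877  .0013785   -1.01   0.321   -.0041922   .0014168
_cons    | -6311.115  6271.789   -1.01   0.322   -19071.17   6448.936
The variable of interest in the link test is _hatsq. Its coefficient is not statistically
significant (t = -1.01, p = 0.321), so the link test gives no evidence of specification error and
indicates that the model is properly specified.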
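2. Ovtest
Ramsey RESET test using powers of the fitted values of HRS
Ho: model has no omitted variables
F(3, 26) = 3.00
Prob > F = 0.0488
Because the p-value (0.0488) is just below 0.05, the null hypothesis of no omitted variables is
rejected at the 5% level, although not at the 1% level. Unlike the link test, the RESET test
therefore gives some borderline evidence that the model may be misspecified.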
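e) Multicollinearity test result
Variable |   VIF      1/VIF
---------+-------------------
NEIN     |  5.49   0.182100
RATE     |  4.56   0.219380
AGE      |  3.93   0.254487
ERSP     |  2.95   0.338889
DEP      |  2.91   0.343477
ERNO     |  2.65   0.377146
---------+-------------------
Mean VIF |  3.75
All VIF values are below 10, the usual rule-of-thumb threshold, so there is no evidence of
serious multicollinearity.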
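The two numeric columns of the VIF table are reciprocals of each other, and the mean VIF is a simple average. A short sketch that rebuilds the VIF column from the reported tolerances (1/VIF):

```python
# Tolerances (1/VIF) as reported in the table.
tolerance = {
    "NEIN": 0.182100,
    "RATE": 0.219380,
    "AGE": 0.254487,
    "ERSP": 0.338889,
    "DEP": 0.343477,
    "ERNO": 0.377146,
}

# VIF is the reciprocal of the tolerance.
vif = {name: 1 / tol for name, tol in tolerance.items()}
mean_vif = sum(vif.values()) / len(vif)

# Rule of thumb used here: a VIF of 10 or more signals serious multicollinearity.
any_serious = any(v >= 10 for v in vif.values())

for name, value in vif.items():
    print(f"{name:5s} {value:5.2f}")
print(f"Mean VIF = {mean_vif:.2f}, any VIF >= 10: {any_serious}")
```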
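f) Heteroscedasticity test result
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
Ho: Constant variance
Variables: fitted values of HRS
chi2(1) = 2.16
Prob > chi2 = 0.1421
Because the p-value (0.1421) exceeds 0.05, the null hypothesis of constant variance is not
rejected, so there is no evidence of heteroscedasticity.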
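The reported p-value can be reproduced approximately from the test statistic. For one degree of freedom the chi-square upper-tail probability reduces to a complementary error function, which the Python standard library provides; the function name below is illustrative, and the small gap from 0.1421 arises because the statistic is printed rounded to 2.16.

```python
from math import erfc, sqrt

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # If Z is standard normal, P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return erfc(sqrt(x / 2))

p_value = chi2_sf_1df(2.16)
reject_constant_variance = p_value < 0.05

print(f"p-value = {p_value:.4f}, reject Ho: {reject_constant_variance}")
```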
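g) The model is statistically significant overall (F(6, 29) = 19.79, p < 0.001), and the
adjusted R-squared value shows that about 76.31% of the variation in the dependent variable
(HRS) is explained by the given independent variables.
h) The results of the regression analysis indicate that, at the 5% level, the factors that
significantly affect the average hours worked during the year (HRS) are the average yearly
earnings of other family members in $ (ERNO), average yearly non-earned income (NEIN),
and the average age of the respondent (AGE). The average number of dependents (DEP) is
significant only at the 10% level (p = 0.058), while RATE and ERSP are not statistically
significant. Average family asset holdings (bank account, etc.) is not a regressor in this
model. Holding the other regressors constant, a one-dollar increase in ERNO is associated
with 0.28 fewer hours worked, a one-dollar increase in NEIN with 0.58 more hours, one
additional year of age with 5.24 fewer hours, and one additional dependent with 27.16 more
hours.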
6. Use the data file EARNINGS and, using STATA for analysis, carry out the following tasks.
a. Perform a regression of EARNINGS on S where EARNINGS represents Current hourly
earnings in $ and S represents education (highest grade completed) in number of years of
schooling of the respondent. Interpret the regression results
b. Comment on the value of $R^2$
c. Perform a test on the coefficients of regression. Explain the implications of the result of
the test. Calculate a 95% confidence interval for the slope coefficient
d. Perform an F test for the goodness of fit and comment on the result
e. Regress S on ASVABC and SM where ASVABC is a composite measure of numerical and
verbal ability of the respondent and SM is the years of schooling of the respondent’s
mother. Repeat the regression using SF, the years of schooling of the father, instead of
SM, and again including both as regressors. Do your regression results support the view
that if you educate a male, you educate an individual, while if you educate a female, you
educate a nation?
f. Regress EARNINGS on S and EXP (total out-of-school work experience in years),
interpret the results and perform t tests
Answer
a. The regression of EARNINGS on S, where EARNINGS is current hourly earnings in $ and S
is education (highest grade completed) in years of schooling of the respondent, and its
interpretation are presented as follows.
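Source   |         SS    df          MS      Number of obs =     540
                                             F(1, 538)     =  112.15
Model    | 19321.5589     1  19321.5589      Prob > F      =  0.0000
Total    | 112010.231   539  207.811189      Root MSE      =  13.126

EARNINGS |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
S        |  2.455321  .2318512   10.59   0.000    1.999876   2.910765
_cons    | -13.93347  3.219851   -4.33   0.000   -20.25849  -7.608444
The regression results show that years of schooling (highest grade completed) have a positive
and statistically significant effect on current hourly earnings: each additional year of schooling
is associated with an increase of about $2.46 in hourly earnings (t = 10.59, p < 0.001).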
b. The value of $R^2$ is the proportion of the variation in current hourly earnings that is
explained by years of schooling. The value is small: $R^2$ = Model SS / Total SS =
19,321.56 / 112,010.23 ≈ 0.1725, so years of schooling explain only about 17.2% of the
variation in current hourly earnings.
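c. The slope coefficient ($\hat{\beta}$ = 2.46, t = 10.59, p < 0.001) is significantly different
from zero, so the null hypothesis that schooling has no effect on earnings is rejected. A
one-year increase in schooling is associated with a $2.46 increase in current hourly earnings.
The 95% confidence interval for the slope coefficient is [2.00, 2.91], which does not include
zero. The intercept (-13.93, t = -4.33, p < 0.001) is also significantly different from zero.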
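The t-statistic and the 95% confidence interval for the slope follow directly from the coefficient and its standard error. In the sketch below the critical value 1.9644 (the two-sided 5% point of the t distribution with 538 degrees of freedom) is an assumed table value rather than something computed in the code.

```python
# Slope estimate and standard error from the regression of EARNINGS on S.
coef, se = 2.455321, 0.2318512

T_CRIT = 1.9644  # approximate t(0.025, 538); assumed from tables

t_stat = coef / se
ci_low = coef - T_CRIT * se
ci_high = coef + T_CRIT * se

print(f"t = {t_stat:.2f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

An interval that excludes zero leads to the same conclusion as the two-sided t test at the 5% level.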
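d. The F-test (F(1, 538) = 112.15, p < 0.001) rejects the null hypothesis that the slope
coefficient is zero, so the model has significant explanatory power. With a single regressor, F
equals the square of the t-statistic on S (10.59² ≈ 112.15).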
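In a regression with one explanatory variable the F statistic and the squared t-statistic coincide, which can be confirmed from the figures in the output. A minimal sketch with illustrative variable names:

```python
# ANOVA figures from the regression of EARNINGS on S.
ss_model, ss_total = 19321.5589, 112010.231
df_model, df_resid = 1, 538

# F statistic from the model and residual mean squares.
ss_resid = ss_total - ss_model
f_stat = (ss_model / df_model) / (ss_resid / df_resid)

# R-squared from the same sums of squares.
r2 = ss_model / ss_total

# t-statistic on S from its coefficient and standard error.
t_stat = 2.455321 / 0.2318512

print(f"F = {f_stat:.2f}, t^2 = {t_stat ** 2:.2f}, R2 = {r2:.4f}")
```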
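e. The regression results are presented as follows:

Regression of S on ASVABC and SM:
Source   |         SS    df          MS      Number of obs =     540
                                             F(2, 537)     =  147.36
Model    | 1135.67473     2  567.837363      Prob > F      =  0.0000
Total    | 3204.98333   539  5.94616574      Root MSE      =   1.963

S        |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
ASVABC   |  .1328069  .0097389   13.64   0.000    .1136758    .151938
SM       |  .1235071  .0330837    3.73   0.000    .0585178   .1884963
_cons    |  5.420733  .4930224   10.99   0.000    4.452244   6.389222

Regression of S on ASVABC and SF:
Source   |         SS    df          MS      Number of obs =     540
                                             F(2, 537)     =  155.49
Model    | 1175.37867     2  587.689333      Prob > F      =  0.0000
Total    | 3204.98333   539  5.94616574      Root MSE      =  1.9441

S        |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
ASVABC   |  .1285797  .0095914   13.41   0.000    .1097385   .1474209
SF       |  .1289751  .0259437    4.97   0.000    .0780115   .1799387
_cons    |  5.541335  .4692887   11.81   0.000    4.619468   6.463202

Regression of S on ASVABC, SM and SF:
Source   |         SS    df          MS      Number of obs =     540
                                             F(3, 536)     =  104.30
Model    | 1181.36981     3  393.789935      Prob > F      =  0.0000
Total    | 3204.98333   539  5.94616574      Root MSE      =   1.943

S        |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
ASVABC   |  .1257087  .0098533   12.76   0.000    .1063528   .1450646
SM       |  .0492424  .0390901    1.26   0.208    -.027546   .1260309
SF       |  .1076825  .0309522    3.48   0.001      .04688   .1684851
_cons    |  5.370631  .4882155   11.00   0.000     4.41158   6.329681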
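In the separate regressions the coefficients on SM (0.124) and SF (0.129) are similar in size
and both are statistically significant. When both are included, the coefficient on SF (0.108,
p = 0.001) remains significant while the coefficient on SM (0.049, p = 0.208) does not. The
results therefore do not support the view that if you educate a male, you educate an individual,
while if you educate a female, you educate a nation.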
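f. The regression of EARNINGS on S and EXP (total out-of-school work experience in years) is
presented and interpreted as follows:

Source   |         SS    df          MS      Number of obs =     540
                                             F(2, 537)     =   67.54
Model    | 22513.6473     2  11256.8237      Prob > F      =  0.0000
Total    | 112010.231   539  207.811189      Root MSE      =   12.91

EARNINGS |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
S        |  2.678125  .2336497   11.46   0.000    2.219146   3.137105
EXP      |  .5624326  .1285136    4.38   0.000    .3099816   .8148837
_cons    | -26.48501   4.27251   -6.20   0.000   -34.87789  -18.09213
The regression results show that both years of schooling and total out-of-school work
experience have positive and statistically significant effects on current hourly earnings. Holding
experience constant, an additional year of schooling is associated with an increase of about
$2.68 in hourly earnings (t = 11.46, p < 0.001). Holding schooling constant, an additional year
of work experience is associated with an increase of about $0.56 (t = 4.38, p < 0.001). Both
t tests reject the null hypothesis of a zero coefficient at the 1% level.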
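One way to read the fitted equation is to plug in values for schooling and experience. The sketch below uses the reported coefficients; the example inputs (12 years of schooling, 10 years of experience) and the function name are hypothetical choices for illustration.

```python
def predict_earnings(s_years, exp_years):
    """Fitted hourly earnings ($) from the regression of EARNINGS on S and EXP."""
    return -26.48501 + 2.678125 * s_years + 0.5624326 * exp_years

# Hypothetical respondent: 12 years of schooling, 10 years of work experience.
example = predict_earnings(12, 10)

# Effect of one more year of schooling with experience held constant.
schooling_effect = predict_earnings(13, 10) - predict_earnings(12, 10)

print(f"predicted earnings = {example:.2f}, effect of one more year of S = {schooling_effect:.2f}")
```

The difference between the two predictions is the coefficient on S, which is what "holding experience constant" means in practice.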