
Problem Set #4

Nathaniel Higgins

nhiggins@jhu.edu

Assignment

Read 4.1-4.3. Hand in answers to

C4.1(i)

C4.5(i) and (ii)

C4.7(i)-(iv)

C4.8(i)-(v)

C4.9(i) and (iii)

C4.10(i)-(vi)

C4.1

The following model can be used to study whether campaign expenditures affect election outcomes:

$$\text{voteA} = \beta_0 + \beta_1 \log(\text{expendA}) + \beta_2 \log(\text{expendB}) + \beta_3\, \text{prtystrA} + u,$$

where voteA is the percentage of the vote received by Candidate A, expendA and expendB are campaign expenditures by Candidates A and B, and prtystrA is a measure of party strength for Candidate A (the percentage of the most recent presidential vote that went to A's party).

i

What is the interpretation of $\beta_1$?

When expenditure by Candidate A's campaign increases by 1%, the percentage of the vote that Candidate A receives is predicted to increase by $\beta_1/100$ percentage points, holding expendB and prtystrA fixed.
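The division by 100 comes from the level-log form of the model. As a short worked step (a sketch using only the symbols defined above, with $\Delta$ denoting a small change and the other regressors held fixed), recall that $100 \cdot \Delta \log(x) \approx \%\Delta x$:

```latex
\Delta \text{voteA}
  = \beta_1 \, \Delta \log(\text{expendA})
  \approx \beta_1 \cdot \frac{\%\Delta \text{expendA}}{100}
  = \frac{\beta_1}{100} \cdot \%\Delta \text{expendA}
```

Setting $\%\Delta \text{expendA} = 1$ gives the $\beta_1/100$ percentage point change quoted in the answer.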


C4.5

Use the data in MLB1.RAW for this exercise.

i

Use the model estimated in equation (4.31) and drop the variable rbisyr. What happens to the statistical significance of hrunsyr? What about the size of the coefficient on hrunsyr?

The model estimated in equation (4.31) yields:

N = 353, R-squared = 0.6278.

When we drop the variable rbisyr we get:

N = 353, R-squared = 0.6254.

hrunsyr goes from insignificant (t-stat of 0.90) to significant at better than the 1% level (t-stat of 4.96). The coefficient roughly triples in size.

ii

Add the variables runsyr (runs per year), fldperc (fielding percentage), and sbasesyr (stolen bases per year) to the model from part (i). Which of these factors are individually significant?

Only runsyr is individually statistically significant at conventional levels (at the 5% level). It is also the most economically significant variable (it has the largest coefficient by an order of magnitude).

C4.7

Refer to the example used in Section 4.4. You will use the data set TWOYEAR.RAW.

i

The variable phsrank is the person's high school percentile. (A higher number is better. For example, 90 means you are ranked better than 90 percent of your graduating class.) Find the smallest, largest, and average phsrank in the sample.

This is a piece of cake with the sum (summarize) command in Stata. The smallest value of phsrank is 0, the largest is 99, and the average (mean) value of phsrank is 56.16. The average high school rank of individuals in the data set is above 50% (interesting: if this is a random sample of high school graduates, would you expect a value like this?).


ii

Add phsrank to equation (4.26) and report the OLS estimates in the usual form. Is phsrank statistically significant? How much is 10 percentage points of high school rank worth in terms of wage?

When I add phsrank to equation (4.26) and run the regression I get the following results:

N = 6,763, R-squared = 0.22.

phsrank is not statistically significant at conventional (5%) levels: the t-statistic is 1.27, which is less than the 1.96 it would need to be in order to reject the null hypothesis that $\beta_{phsrank} = 0$ against the two-sided alternative hypothesis that $\beta_{phsrank} \neq 0$ at the 0.05 level of significance. You can also see this from the p-value, which is greater than 0.05.

Ten percentage points of high school rank is worth an increase of 10 × 0.0003032 = 0.003032 in log(wage). That is, since high school rank is already expressed in percentage terms, we only need to multiply the coefficient by 10 to see how much log(wage) increases when we increase phsrank by 10. We could leave it there, but it's not very useful to interpret things in terms of log(wage) units. We can easily interpret this change of 0.003032 log(wage) units in terms of percentage increase in wage instead (since wage is logged on the left-hand side of the regression equation). To do this, we multiply by 100. Therefore, a 10 percentage point increase in phsrank increases wage by approximately 0.3032 percent. Not that much! High school really didn't matter (which is good for me, because I spent most of high school figuring out creative ways to get in trouble).
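The arithmetic in this answer can be checked with a few lines of Python. This is only an illustrative sketch: the coefficient 0.0003032 is the value quoted above, the variable names are made up for the example, and the "exact" figure uses the standard log-level conversion $100(e^{\Delta} - 1)$ rather than anything reported in the original solution.

```python
import math

# Coefficient on phsrank quoted in the solution above.
beta_phsrank = 0.0003032
# A 10 percentage point increase in high school rank.
delta_rank = 10

# Implied change in log(wage).
delta_log_wage = beta_phsrank * delta_rank

# Approximate percentage change in wage: 100 * change in log(wage).
approx_pct = 100 * delta_log_wage

# Exact percentage change implied by the log specification.
exact_pct = 100 * (math.exp(delta_log_wage) - 1)

print(f"change in log(wage): {delta_log_wage:.6f}")
print(f"approximate % change in wage: {approx_pct:.4f}")
print(f"exact % change in wage: {exact_pct:.4f}")
```

For a change this small the approximate and exact figures agree to about three decimal places, which is why multiplying by 100 is good enough here.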

iii

Does adding phsrank to (4.26) substantively change the conclusions on the returns to two- and four-year colleges? Explain.

Adding phsrank to (4.26) (i.e. to a regression model that includes junior college credits, total college credits, and work experience) does not seem to make much difference. The total variation explained by all the variables in each model is very similar, the magnitudes of the coefficients (i.e. their absolute values) are very similar between the regressions, and the standard errors (and hence the t-statistics) are essentially unchanged.

iv

The data set contains a variable called id. Explain why if you add id to equation (4.17) or (4.26) you expect it to be statistically insignificant. What is the two-sided p-value?

By using the describe command in Stata I can see that the id variable is nothing but an ID number, which should have absolutely nothing to do with a person's wage. Therefore, if I add it to any regression model explaining wage, I should hope like hell that it doesn't correlate very highly with wage. And it doesn't (p-value of 0.587).


C4.8

The data set 401KSUBS.RAW contains information on net financial wealth (nettfa), age of the survey respondent (age), annual family income (inc), family size (fsize), and participation in certain pension plans for people in the United States. The wealth and income variables are both recorded in thousands of dollars. For this question, use only the data for single-person households (so fsize=1).

i

How many single-person households are there in the data set?

I used three commands to determine that out of 9,275 responses in the data set, 2,017 of them came from individuals with a family size of 1. The only necessary command was: sum if fsize==1. I then see that there were 2,017 observations in the data set with fsize==1. I also wondered if any of the observations of fsize were missing (which would indicate the possibility that there were more single-person households that I could not observe). To find this out, I typed describe to find out that there were 9,275 total observations in the data set, then typed sum fsize to determine that there were 9,275 observations of the variable fsize, i.e. there were no missing values. Just some bonus knowledge for you.

ii

Use OLS to estimate the model

$$\text{nettfa} = \beta_0 + \beta_1\, \text{inc} + \beta_2\, \text{age} + u,$$

and report the results using the usual format. Be sure to use only the single-person households in the sample. Interpret the slope coefficients. Are there any surprises in the slope estimates?

To run this regression I used the command reg nettfa inc age if fsize == 1. When I did so I obtained the results:

N = 2,017, R-squared = 0.12.

When annual income increases by $1,000, we predict that net financial assets will increase by about $800 (which makes some sense: we would be surprised if income increased net assets by more than the income increase). When age increases by one year, net financial assets are predicted to increase by about $840. We expect individuals to be accumulating wealth as they age.

iii

Does the intercept from the regression in part (ii) have an interesting meaning? Explain.

The intercept is the predicted net financial assets of a zero-year-old with zero income. 'Nuff said.


iv

Find the p-value for the test $H_0: \beta_2 = 1$ against $H_A: \beta_2 < 1$. Do you reject $H_0$ at the 1% significance level?

The p-value of a test is the probability, under the null, of getting a test statistic at least as extreme as the one you obtained. So in this case, we want to know: if the null is true (if $\beta_2 = 1$), what is the probability of getting a t-stat as large (in absolute value) as the t-stat we observe? First, we have to calculate the t-stat. The t-stat under the null is:

$$\frac{0.8426563 - 1}{0.0920169} = -1.7099435.$$

Because the alternative is $\beta_2 < 1$, we want the probability of getting a t-stat smaller than -1.7099435, which by the symmetry of the t-distribution equals the probability of getting a t-stat bigger than 1.7099435. If we were testing a two-sided hypothesis (i.e. if the alternative hypothesis were $H_A: \beta_2 \neq 1$ instead of $H_A: \beta_2 < 1$) then we would double the value we are about to obtain. We want to find:

$$P(T > 1.7099435),$$

where T is a random draw from the t-distribution with 2,014 degrees of freedom. Let's find this exact value using Stata:

scalar pval = ttail(2014,1.7099435)

When I do this I obtain: pval = .04371517. The p-value of 0.04 tells me that we would not reject the null hypothesis at the 1% significance level (we would reject the null at the 5% significance level, but not at the 1% level).

Note that if you did not have Stata (or your Stata does not have the ttail command) you could come close to this value simply by comparing the absolute value of the t-statistic we obtained (1.7099435) to the critical values in Table G.2.
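For readers without Stata, the same calculation can be sketched in plain Python. This is not the method used in the solution: it replaces Stata's ttail with a numerical integration of the t density (Simpson's rule), and the helper names t_pdf and t_upper_tail are hypothetical names chosen for this example. The estimate, standard error, and degrees of freedom are the values quoted above.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t), integrating the density on [0, |t|] with Simpson's rule.

    steps must be even.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3      # P(0 < T < |t|)
    upper = 0.5 - area        # P(T > |t|)
    return upper if t >= 0 else 1.0 - upper


# Values quoted in the solution above.
beta_hat = 0.8426563   # estimated coefficient on age
se = 0.0920169         # its standard error
df = 2014              # degrees of freedom used in the ttail call

# t-statistic for H0: beta_2 = 1.
t_stat = (beta_hat - 1) / se

# One-sided p-value for HA: beta_2 < 1, i.e. P(T < t_stat) = P(T > |t_stat|).
p_one_sided = t_upper_tail(abs(t_stat), df)

print(f"t-statistic: {t_stat:.7f}")
print(f"one-sided p-value: {p_one_sided:.6f}")
print(f"reject at 5%: {p_one_sided < 0.05}, reject at 1%: {p_one_sided < 0.01}")
```

The integration is exact enough for this purpose; a statistics library would normally be the more convenient choice, but the standard library keeps the sketch self-contained.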

C4.9

Use the data in DISCRIM.RAW to answer this question.

i

Use OLS to estimate the model

$$\log(\text{psoda}) = \beta_0 + \beta_1\, \text{prpblck} + \beta_2 \log(\text{income}) + \beta_3\, \text{prppov} + u,$$

and report the results in the usual form. Is $\beta_1$ statistically different from zero at the 5% level against a two-sided alternative? What about at the 1% level?

When I run the regression in Stata I obtain:

N = 401, R-squared = 0.09.

$\beta_1$ is statistically different from zero at the 5% level, but not at the 1% level (p-value of 0.018, which is less than 0.05, but greater than 0.01).


iii

To the regression in part (i), add the variable log(hseval). Interpret its coefficient and report the two-sided p-value for $H_0: \beta_{\log(hseval)} = 0$.

When I add the variable log(hseval) to the regression above, I get a coefficient of 0.1213056 and a p-value of 0.000 (to three decimal places). I thus reject the null hypothesis that the true coefficient on log(hseval) is zero at the 1% significance level. The coefficient tells me that when log(hseval) increases by one unit, log(psoda) is predicted to increase by about 0.12 units. To change this result from units of logged variables to units of the variables themselves, I don't need to do anything to the coefficient (since both the independent variable hseval and the dependent variable psoda are logged). Therefore, I can say that when median house value in a zip code increases by 1%, the price of a medium soda in that same zip code is predicted to increase by about 0.12%.
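The reason no rescaling is needed can be written out in one line. As a sketch (holding the other regressors fixed, and using the approximation $100 \cdot \Delta \log(x) \approx \%\Delta x$ on both sides), the coefficient in a log-log model is an elasticity:

```latex
\Delta \log(\text{psoda}) = \hat{\beta}_{\log(hseval)} \, \Delta \log(\text{hseval})
\;\Longrightarrow\;
\%\Delta \text{psoda} \approx \hat{\beta}_{\log(hseval)} \cdot \%\Delta \text{hseval}
= 0.1213056 \times 1\% \approx 0.12\%
```

The factors of 100 on the two sides cancel, which is why the coefficient can be read directly as a percent-for-percent effect.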

C4.10

Use the data in ELEM94_95 to answer this question. The findings can be compared with those in Table 4.1. The dependent variable lavgsal is the log of average teacher salary and bs is the ratio of average benefits to average salary (by school).

i

Run the simple regression of lavgsal on bs. Is the estimated slope statistically different from zero? Is it statistically different from -1?

When I run the simple regression I get the following results:

N = 1,848, R-squared = 0.02.

The t-statistic for the null hypothesis that $\beta_{bs} = 0$ is -5.31 (p-value 0.00); we reject the null hypothesis that $\beta_{bs} = 0$ against the two-sided alternative at better than the 1% level. By inspecting the confidence interval, we can see that -1 is inside the 95% confidence interval. This leads us to conclude that we cannot reject the null hypothesis that $\beta_{bs} = -1$ at the 5% significance level.
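The confidence-interval argument can be reproduced roughly from the rounded numbers in this solution. The sketch below is an approximation under stated assumptions: it uses the slope of -0.80 reported for this regression in part (ii), backs out the standard error from the t-statistic of -5.31, and uses the large-sample critical value 1.96. The regression output itself is not reproduced here, so the figures are only as accurate as that rounding, and the variable names are illustrative.

```python
# Rounded values quoted in this solution (assumptions for this sketch):
coef_bs = -0.80   # slope on bs in the simple regression
t_zero = -5.31    # t-statistic for H0: beta_bs = 0

# Implied standard error (approximate, because both inputs are rounded).
se_bs = coef_bs / t_zero

# t-statistic for H0: beta_bs = -1.
t_minus_one = (coef_bs - (-1.0)) / se_bs

# Approximate 95% confidence interval with the large-sample critical value.
crit = 1.96
ci_low = coef_bs - crit * se_bs
ci_high = coef_bs + crit * se_bs

reject_minus_one = abs(t_minus_one) > crit

print(f"implied standard error: {se_bs:.4f}")
print(f"t-statistic for H0: beta_bs = -1: {t_minus_one:.4f}")
print(f"approximate 95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"reject H0: beta_bs = -1 at 5%: {reject_minus_one}")
```

Checking whether -1 lies inside the interval and comparing the t-statistic with 1.96 are the same test, so the two always agree.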

ii

Add the variables lenrol and lstaff to the regression from part (i). What happens to the coefficient on bs? How does the situation compare with that in Table 4.1?

When I add lenrol and lstaff to the regression from part (i), I get:

N = 1,848, R-squared = 0.48.

The coefficient on bs has gone from -0.80 to -0.61 (i.e. it has decreased in magnitude by roughly a quarter) while the t-statistic remains largely unchanged. This is exactly what we see in columns (1) and (2) of Table 4.1.

iii

How come the standard error on the bs coefficient is smaller in part (ii) than in part (i)?

The standard error of the coefficient is smaller because the unexplained variation (the variation of u) is substantially smaller. The addition of two more variables has explained substantially more of the variation in lavgsal than the previous model (compare the R-squared values of the two models to see this). Of course, adding two variables to a model has the chance to cause problems with multicollinearity. If the two variables that were included are highly correlated with bs, this could have the effect of increasing the standard error of $\hat{\beta}_{bs}$. The effect of reducing the unexplained variation in the model outweighs the collinearity effect, in this case. This makes sense when you observe that the correlation between bs and lenrol is 0.02 and the correlation between bs and lstaff is 0.04 (both relatively small).
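The two competing effects show up directly in the usual OLS standard error formula. The notation here is an assumption for this sketch, since the solution does not write the formula out: $\hat{\sigma}$ is the standard error of the regression, $SST_{bs}$ is the total sample variation in bs, and $R_{bs}^{2}$ is the R-squared from regressing bs on the other explanatory variables.

```latex
\operatorname{se}(\hat{\beta}_{bs})
  = \frac{\hat{\sigma}}{\sqrt{SST_{bs}\,\bigl(1 - R_{bs}^{2}\bigr)}}
```

Adding lenrol and lstaff lowers $\hat{\sigma}$ (less unexplained variation), which shrinks the standard error, while any correlation with bs raises $R_{bs}^{2}$, which inflates it. With correlations as small as those reported above, $R_{bs}^{2}$ stays close to zero and the first effect dominates.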

iv

How come the coefficient on lstaff is negative? Is it large in magnitude?

The coefficient on lstaff suggests that the relationship between number of staff and average teacher salary is negative, controlling for enrollment size and ratio of benefits to salary. This suggests that when more staff are added to a school, each teacher is paid less, on average. The magnitude of the coefficient is relatively large: when the number of staff increases by 10%, the average salary decreases by about 7%.

v

Now add the variable lunch to the regression. Holding other factors fixed, are teachers being compensated for teaching students from disadvantaged backgrounds? Explain.

When I run this new regression I obtain:

N = 1,848, R-squared = 0.49.

Teachers are not being compensated for teaching students from disadvantaged backgrounds. The presence of students from disadvantaged backgrounds is indicated by a higher proportion of students who qualify for a lunch subsidy. When the proportion increases by 0.10, the average salary is predicted to decline by about 0.7%.

vi

Overall, is the pattern of results that you find with ELEM94_95 consistent with the pattern in Table 4.1?

Yes. The magnitude of the effects declines as more variables are added, although the signs and levels of significance remain the same. Even though this exercise uses a different data set, the results of the model appear robust. This is good! The last column of Table 4.1 is not comparable to the regressions we have run in the exercise (elementary school students cannot drop out of school).
