Econ 306 Homework3_ezra


Homework 3

100 points

README: The following two problems will require a lot of calculations in STATA. It will

generate many pages of output. Here is how you should organize it. The first pages should

contain your answers to all the questions, along with showing any key algebraic equations

or explanations you need to use along the way. After that, include a printout of the output

from the regressions you executed in support of your answers. Highlight any numbers in this

output that you used in the first section. (To save paper, you may print this section double-sided

and/or with 2-up format.) Last, include a copy of the DO file that contains the commands you

asked STATA to execute. Be sure you organize these in a way that will be clear to the reader.

With this assignment you will find a STATA data file called HW3.Housing.dta. For

reference, the variables in this file are:

sqft = Total square feet of living area

beds = Number of bedrooms

baths = Number of full bathrooms

age = age of house in years

stories = number of stories of the house

vacant = If yes, this variable=1. If no, this variable=0.

Open this dataset within STATA. Before you begin answering the following, it's not a

bad idea to ask STATA to summarize the data using the command summarize. You

should also start a log file to store your results.
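
A minimal sketch of the setup steps described above. It assumes HW3.Housing.dta sits in the current working directory; the log file name hw3_housing.log is a hypothetical choice, not something the assignment specifies.

```stata
* Open the housing data (assumes the file is in the working directory)
use "HW3.Housing.dta", clear

* Start a plain-text log; hw3_housing.log is a hypothetical file name
log using "hw3_housing.log", text replace

* Summary statistics for every variable in the file
summarize
```

Running `log close` at the end of the session closes the log file.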

a. Run the following regression:

$Price = \beta_0 + \beta_1 \cdot beds$

Diego Garzon - dig5269


b. Hypothesize the sign of the bias, if any, resulting from excluding sqft from the

regression. Explain your reasoning.

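Excluding sqft from the regression, I hypothesize that the bias in the beds coefficient will be

positive. Square footage raises price, and houses with more bedrooms tend to have more square

footage, so the beds coefficient picks up part of the effect of sqft.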

c. Use the data to verify (or not) your claim from b). Break down the bias into the

component pieces as we did in class

The data verify that the number of bedrooms is related to the price. With sqft excluded, the

constant is 11025.06 and the beds coefficient is 32104.46.
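
The breakdown asked for in part c can be sketched as follows. This assumes the "long" regression is price on beds and sqft (the in-class derivation is not reproduced in this write-up), and the scalar names are illustrative. In OLS the short-regression coefficient on beds equals the long-regression coefficient plus the sqft coefficient times the slope from regressing sqft on beds, so the last line should display zero up to rounding.

```stata
* Short regression: sqft omitted
regress price beds
scalar b_short = _b[beds]

* Long regression: sqft included
regress price beds sqft
scalar b_long = _b[beds]
scalar b_sqft = _b[sqft]

* Auxiliary regression: how sqft moves with beds
regress sqft beds
scalar delta = _b[beds]

* Bias = (effect of sqft on price) x (slope of sqft on beds)
display "bias in beds coefficient = " b_sqft*delta
display "short - (long + bias)    = " b_short - (b_long + b_sqft*delta)
```

A positive `b_sqft` together with a positive `delta` gives a positive bias, which is the sign hypothesized in part b.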

d. You will see from c. that the effect of beds is negative, once we control for square

footage. Does this make sense?

Yes. If we hold square footage fixed and add bedrooms, the same living area is divided into

more, and therefore smaller, rooms. Who would want that?

$Price = \beta_0 + \beta_1 \cdot beds + \beta_2 \cdot sqft + \beta_3 \cdot baths + \beta_4 \cdot age + \beta_5 \cdot stories$

-------------+------------------------------           F(  5,   874) =  431.77
    Residual |  7.0655e+11   874   808414232           R-squared     =  0.7118
-------------+------------------------------           Adj R-squared =  0.7102
       Total |  2.4518e+12   879  2.7893e+09           Root MSE      =   28433

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        beds |   -17474.4   1983.609    -8.81   0.000    -21367.59    -13581.2
        sqft |   98.13947   2.992471    32.80   0.000     92.26621    104.0127
       baths |  -1332.919   2688.311    -0.50   0.620    -6609.219    3943.381
         age |   -299.074    53.9454    -5.54   0.000    -404.9516   -193.1963
     stories |  -6708.066   3246.125    -2.07   0.039    -13079.18   -336.9543
       _cons |   28193.49   5344.806     5.27   0.000     17703.34    38683.64
------------------------------------------------------------------------------

f. At a level of $\alpha = .05$, for which, if any, values of i would you reject the null

hypothesis that $\beta_i = 0$?

We reject the null hypothesis for every coefficient except baths, whose p-value (.620) is

greater than .05.
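
As a rough check on part f, the two-sided 5% critical value for the t statistic can be computed from the residual degrees of freedom and compared with each |t| in the table. A sketch, assuming the part e regression is rerun first:

```stata
regress price beds sqft baths age stories

* Two-sided 5% critical value using the residual degrees of freedom
display "critical t = " invttail(e(df_r), 0.025)

* Example: t statistic and two-sided p-value for baths
display "t(baths) = " _b[baths]/_se[baths]
display "p(baths) = " 2*ttail(e(df_r), abs(_b[baths]/_se[baths]))
```

Any coefficient whose |t| exceeds the critical value is rejected at the 5% level, which is the same decision as comparing its p-value with .05.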

stories=3? (Note: these are the figures for my house here in State College, but

the data is not, so the price here is pretty meaningless as a predictor of my own home

value.)

From data in Q1e

= 107,622.32

According to this model, how much will my house change in value five years from

today?


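Holding everything else fixed, five more years of age change the predicted price by 5 × (-299.074) = -1,495.37, giving a predicted value of roughly 106,126.90.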
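
The five-year change depends only on the age coefficient, since nothing else about the house changes. A sketch, assuming the part e regression is the active estimation result:

```stata
regress price beds sqft baths age stories

* Change in predicted price when age rises by 5 years, with a confidence interval
lincom 5*age

* Same point estimate by hand
display 5*_b[age]
```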

gen price_thous=price/1000

and then use this in place of price in the regression command for part e

. gen price_thous=price/1000

. reg price_thous beds sqft baths age stories

-------------+------------------------------           F(  5,   874) =  431.77
       Model |  1745246.47     5  349049.293           Prob > F      =  0.0000
    Residual |  706554.038   874  808.414231           R-squared     =  0.7118
-------------+------------------------------           Adj R-squared =  0.7102
       Total |   2451800.5   879   2789.3066           Root MSE      =  28.433

------------------------------------------------------------------------------
 price_thous |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        beds |   -17.4744   1.983609    -8.81   0.000    -21.36759    -13.5812
        sqft |   .0981395   .0029925    32.80   0.000     .0922662    .1040127
       baths |  -1.332919   2.688311    -0.50   0.620    -6.609219    3.943381
         age |   -.299074   .0539454    -5.54   0.000    -.4049516   -.1931963
     stories |  -6.708066   3.246125    -2.07   0.039    -13.07918   -.3369542
       _cons |   28.19349   5.344806     5.27   0.000     17.70334    38.68364
------------------------------------------------------------------------------

i. Compare the coefficients, standard error, and t-statistics for the independent variables.

Briefly interpret the difference between this model and the version from part e.

The coefficients are unchanged in real terms; only the units differ. Every coefficient and

standard error is divided by 1000 (the decimal point moves three places), so the t-statistics,

p-values, and R-squared are unchanged.
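
One way to confirm the comparison in part i is to store a coefficient and its t statistic from each regression and compare them directly. The scalar names below are illustrative.

```stata
* Original units (dollars)
regress price beds sqft baths age stories
scalar b_dollars = _b[beds]
scalar t_dollars = _b[beds]/_se[beds]

* Rescaled dependent variable (thousands of dollars);
* capture skips the generate if price_thous already exists
capture generate price_thous = price/1000
regress price_thous beds sqft baths age stories

* The coefficient shrinks by a factor of 1000; the t statistic does not move
display "ratio of coefficients = " b_dollars/_b[beds]
display "difference in t stats = " t_dollars - _b[beds]/_se[beds]
```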


j. Create a new age variable by converting age from years to days (365 days in a year).

Rerun the regression from e with the new age variable in place of the original age.

gen age_days=age*365


Only the age variable has changed. The coefficient has been divided by 365 days, as

well as the standard error. Nothing else has been affected.
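
A sketch of part j. The `lincom` line rescales the per-day coefficient back to a per-year effect, which should match the age coefficient from part e.

```stata
* Age in days (capture skips the generate if age_days already exists)
capture generate age_days = age*365

regress price beds sqft baths age_days stories

* Per-year effect implied by the per-day coefficient
lincom 365*age_days
```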


2. (48 Points Total, 9 parts worth 5 points each, add on 3 points for free.)

For the following problem, use the STATA dataset called crime.dta. This data set was

compiled by Christopher Cornwell and William Trumbull to study factors that influence

crime rates. The data set contains observations for 90 counties in North Carolina for

1981. The definitions of the variables are given in the data set:

According to the economic model of crime rates, lower crime rates are associated with

better labor markets (higher wages), more police presence and tougher sentences, and

lower population density. We will use this data set to examine these hypotheses. Use a

significance level of $\alpha = .05$ for all hypothesis tests. All of the following regressions

will utilize the following subset of variables from this dataset.

crmrte=crime rate

prbarr=probability of arrest

prbconv=probability of conviction

prbpris=probability of a prison sentence

avgsen=average sentence in days

polpc=number of police per capita

density=population density

pctmin=percent minority

taxpc=tax revenue per capita

wmfg=average weekly wage in manufacturing

wcon=average weekly wage in construction

wtuc=average weekly wage in transportation,utilities,and communications

wtrd=average weekly wage in wholesale and retail trade

wfir=average weekly wage in finance,insurance,and real estate

wser=average weekly wage in services

wfed=average weekly wage in federal government

wsta=average weekly wage in state government

wloc=average weekly wage in local government

a. Run a regression of crmrte on the variables listed above. Call this Model 1.

//model 1

. reg crmrte prbarr prbconv prbpris avgsen polpc density pctmin taxpc wmfg wcon wtuc wtrd wfir wser wfed wsta wloc

Probability of arrest, population density, and percent minority are all statistically

significant, because their p-values all fall below our alpha of .05.

Our F statistic is the restricted SSR minus the unrestricted SSR, divided by the number of restrictions, all over the unrestricted SSR divided by the number of observations minus the number of independent variables in the unrestricted regression minus one:

$F = \frac{(SSR_r - SSR_u)/q}{SSR_u/(n - k - 1)}$

Our model gives us F(17, 72). This means our SSE (explained) df is 17 and our SSR (residual) df is 72 = 90 - 17 - 1. It tests the joint hypothesis that the coefficients on all of our variables are zero. The p-value of 0.000 that is generated tells us that we reject this null hypothesis at the .05 level.
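
Stata keeps the pieces of the overall F test after `regress`, so the reported F(17, 72) and its p-value can be reproduced from the stored results. A sketch intended for a do-file (the `///` continuation is do-file syntax); the variable list follows the command in the write-up.

```stata
* Model 1 (pctmin is typed as in the write-up; later output labels it pctmin80)
regress crmrte prbarr prbconv prbpris avgsen polpc density pctmin ///
    taxpc wmfg wcon wtuc wtrd wfir wser wfed wsta wloc

* Overall F statistic, its degrees of freedom, and its p-value
display "F(" e(df_m) ", " e(df_r) ") = " e(F)
display "p-value = " Ftail(e(df_m), e(df_r), e(F))

* Residual degrees of freedom equal n - k - 1
display e(N) - e(df_m) - 1
```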

d. Test the hypothesis that the coefficients on wsta and wloc are equal to each other.

Use the t-test method described in the lectures. What transformation do you need to

do here? Be specific.

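We use a t test on a transformed regression. Write the difference between the two coefficients as theta = (coefficient on wsta) - (coefficient on wloc), so the null hypothesis is theta = 0. We generate a new variable wsta+wloc, labeled wstawloc, and rerun the Model 1 regression with wstawloc in place of one of the two variables (say wloc). The coefficient on the variable that stays in (wsta) is then theta, and its t statistic tests the hypothesis. We find a t value of -.25 and a p-value of .804, so we do not reject with an alpha of .05.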
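
The transformation in part d can be sketched as below. It assumes wloc is the variable replaced by the sum, since the write-up does not say which one was kept. The `lincom` line is an alternative cross-check, not the method the question asks for. The local macro name is illustrative and the block is meant to run from a do-file.

```stata
* Regressors other than wsta and wloc (pctmin typed as in the write-up)
local others prbarr prbconv prbpris avgsen polpc density pctmin taxpc ///
    wmfg wcon wtuc wtrd wfir wser wfed

capture generate wstawloc = wsta + wloc

* wstawloc replaces wloc: the coefficient on wsta now estimates
* (coef on wsta) - (coef on wloc) from Model 1, and its t tests equality
regress crmrte `others' wsta wstawloc

* Cross-check: test the same difference directly after Model 1
regress crmrte `others' wsta wloc
lincom wsta - wloc
```

The t statistic on wsta in the first regression and the t statistic reported by `lincom` should agree.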

e. Test the hypothesis that the coefficients on wfed, wsta and wloc are all equal to

each other. Do this by writing down the formula for the relevant F-statistic.

Calculate it (by running the appropriate restricted regression) and test the hypothesis.

Report these results. This restricted version of the regression will be called Model 2.


For this, we build a restricted regression with the same kind of transformation we used in d. We generate a variable labeled Qe=(wsta+wfed+wloc) and run the Model 1 regression of crmrte with Qe in place of wsta, wfed and wloc. This imposes 2 restrictions, so our F statistic is

$F = \frac{(.005698845 - .005661278)/2}{.005661278/72} = .238887688$

We fail to reject: F = .239 is far below the 5% critical value of F(2, 72), which is roughly 3.1.

Restricted:

      Source |       SS       df       MS              Number of obs =      90
-------------+------------------------------           F( 15,    74) =   17.34
       Model |  .020033929    15  .001335595           Prob > F      =  0.0000
    Residual |  .005698845    74  .000077011           R-squared     =  0.7785
-------------+------------------------------           Adj R-squared =  0.7336
       Total |  .025732774    89  .000289132           Root MSE      =  .00878

Unrestricted is Model 1
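
The restricted and unrestricted regressions for part e, and the F statistic built from their residual sums of squares, can be sketched as follows. Scalar and macro names are illustrative, and the `test` line is a built-in cross-check rather than the hand calculation the question asks for.

```stata
* Regressors shared by both models (pctmin typed as in the write-up)
local others prbarr prbconv prbpris avgsen polpc density pctmin taxpc ///
    wmfg wcon wtuc wtrd wfir wser

* Unrestricted: Model 1
regress crmrte `others' wfed wsta wloc
scalar ssr_u = e(rss)
scalar df_u  = e(df_r)

* Built-in test of the same two restrictions
test (wfed = wsta) (wsta = wloc)

* Restricted: one common coefficient on the three government wages
capture generate Qe = wsta + wfed + wloc
regress crmrte `others' Qe
scalar ssr_r = e(rss)

* F statistic with q = 2 restrictions, and its p-value
scalar Fstat = ((ssr_r - ssr_u)/2) / (ssr_u/df_u)
display "F = " Fstat "   p-value = " Ftail(2, df_u, Fstat)
```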

f. Return to Model 1. Now test the hypothesis that all 9 of the wage variables have a

coefficient of zero. Do this by writing down the formula for the relevant F-statistic.

Calculate it (by running the appropriate restricted regression) and test the hypothesis.

Report these results.

Same process as the one above, but all wage variables are replaced by Qf. Using the 72 residual degrees of freedom of the unrestricted Model 1:

$F = \frac{(.005938334 - .005661278)/8}{.005661278/72} \approx .4404$

Therefore we fail to reject H0.

Unrestricted is Model 1
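
As the question is worded (all nine wage coefficients equal to zero), the restricted regression would drop the nine wage variables entirely, giving q = 9 restrictions, with the denominator taken from the unrestricted Model 1. A sketch under that reading; macro and scalar names are illustrative.

```stata
* pctmin typed as in the write-up
local nonwage prbarr prbconv prbpris avgsen polpc density pctmin taxpc
local wages   wmfg wcon wtuc wtrd wfir wser wfed wsta wloc

* Unrestricted: Model 1
regress crmrte `nonwage' `wages'
scalar ssr_u = e(rss)
scalar df_u  = e(df_r)

* Built-in joint test that all nine wage coefficients are zero
test `wages'

* Restricted: wage variables dropped
regress crmrte `nonwage'
scalar ssr_r = e(rss)

* F statistic with q = 9 restrictions, and its p-value
scalar Fstat = ((ssr_r - ssr_u)/9) / (ssr_u/df_u)
display "F = " Fstat "   p-value = " Ftail(9, df_u, Fstat)
```

The hand-computed `Fstat` should match the F reported by `test`.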

the crime, the probability of conviction is prbconv. If the person is convicted, the

probability of prison is prbpris. Assuming all these probabilities are independent.

What is the formula for calculating the probability that someone who commits a

crime will a) get arrested AND b) get convicted AND c) get a prison sentence? That

is, how would you calculate the probability of this intersection of statistically

independent events? [Note: the probabilities produced by the researchers are

derived from the arrest data, and thus may not follow the usual rules of probability.

In particular, some probabilities are greater than one. Don't worry about that here.]

Call this variable prjail_ifcrime and create it in STATA.

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
prjail_ifc~e |        90    .2990574    .1268487   .0588235         .7
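
Under the independence assumption stated in the question, the probability of the intersection is the product of the three probabilities. A minimal sketch:

```stata
* P(arrest AND conviction AND prison), assuming independence
generate prjail_ifcrime = prbarr*prbconv*prbpris

* Should produce a summary table like the one shown above
summarize prjail_ifcrime
```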

h. Given this result prjail_ifcrime, how would you use the variable avgsen to calculate

the expected time in jail if committing a crime? Call this variable jailtime_ifcrime and

create it in STATA.
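
One natural reading of part h is an expected value: the probability of going to prison times the average sentence. The sketch below assumes that reading (the original output for this part is not available) and that prjail_ifcrime already exists.

```stata
* Expected time served = P(prison | crime) x average sentence
generate jailtime_ifcrime = prjail_ifcrime*avgsen

summarize jailtime_ifcrime
```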


i. Return to regression Model 1. Replace the variables prbarr, prbconv, prbpris and

avgsen with your new variable jailtime_ifcrime. This is Model 4. Write a paragraph

in which you discuss how Model 4 compares with Model 1.


. reg crmrte jailtime_ifcrime polpc density pctmin taxpc wmfg wcon wtuc wtrd wfir wser wfed wsta wloc

      Source |       SS       df       MS              Number of obs =      90
-------------+------------------------------           F( 14,    75) =   14.31
       Model |  .018723082    14  .001337363           Prob > F      =  0.0000
    Residual |  .007009691    75  .000093463           R-squared     =  0.7276
-------------+------------------------------           Adj R-squared =  0.6767
       Total |  .025732774    89  .000289132           Root MSE      =  .00967

----------------------------------------------------------------------------------
          crmrte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
jailtime_ifcrime |  -.0001843    .000319    -0.58   0.565    -.0008197     .000451
           polpc |  -1.135305   .7543213    -1.51   0.137    -2.637991    .3673797
         density |   .0081923   .0010578     7.74   0.000      .006085    .0102996
        pctmin80 |   .0002044   .0000719     2.84   0.006     .0000611    .0003476
           taxpc |   .0002442     .00017     1.44   0.155    -.0000944    .0005828
            wmfg |   .0000139   .0000238     0.58   0.561    -.0000335    .0000614
            wcon |   .0000297   .0000481     0.62   0.539    -.0000662    .0001256
            wtuc |  -.0000255   .0000233    -1.09   0.278     -.000072     .000021
            wtrd |  -.0000212    .000066    -0.32   0.749    -.0001527    .0001103
            wfir |   .0000352    .000057     0.62   0.538    -.0000782    .0001487
            wser |   2.04e-06    .000055     0.04   0.970    -.0001075    .0001116
            wfed |   .0000485   .0000305     1.59   0.116    -.0000123    .0001093
            wsta |    .000016   .0000424     0.38   0.707    -.0000685    .0001006
            wloc |  -7.90e-06   .0000932    -0.08   0.933    -.0001935    .0001777
           _cons |  -.0116069   .0185137    -0.63   0.533    -.0484881    .0252743

Comparing Model 4 with Model 1, we find several differences. The coefficients on police per capita and tax revenue per capita have increased, as has the coefficient on the average wage in manufacturing. The coefficients on the rest of the wage variables appear to have fallen, while most of their t values have increased. The new variable jailtime_ifcrime is not statistically significant (t = -0.58, p = 0.565). More surprising, our R2 has decreased measurably, to 0.7276, so I would say that Model 4 is not as good at predicting as Model 1.
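
A side-by-side comparison makes the differences between Model 1 and Model 4 easier to read. The sketch below stores both sets of estimates and tabulates them with N, R-squared, and adjusted R-squared. The names m1 and m4 are illustrative, and the block assumes jailtime_ifcrime has already been created.

```stata
* Regressors common to both models (pctmin typed as in the write-up)
local common polpc density pctmin taxpc ///
    wmfg wcon wtuc wtrd wfir wser wfed wsta wloc

* Model 1
regress crmrte prbarr prbconv prbpris avgsen `common'
estimates store m1

* Model 4: the four deterrence variables replaced by jailtime_ifcrime
regress crmrte jailtime_ifcrime `common'
estimates store m4

* Coefficients side by side, with fit statistics at the bottom
estimates table m1 m4, stats(N r2 r2_a)
```

Adjusted R-squared is the fairer comparison here, because Model 4 uses three fewer regressors than Model 1.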
