Attribution Non-Commercial (BY-NC)


1. Getting Started with Stata

The point of this discussion section is to get you started using the statistical software package Stata. Starting from the Excel dataset cps98.csv (available on the course website), load the data into Stata. This file contains data on average hourly earnings, education, gender and age for a sample of workers in 1998. A quick way to do this is to save the Excel file and then load it into Stata using the insheet command. You might find it easier to use the Import command under the File menu.

Use the sum command to summarize the variables in the dataset. What are the mean and standard deviation of average hourly earnings in the sample? How many male workers took the survey? What are the average, minimum and maximum age of the respondents?

Use the tab age command to look at the distribution of age in the sample. What is the mode of this distribution? What is the median (approximately)?

Use the hist ahe command to plot the histogram of average hourly earnings. What can you say about the shape of its distribution?

Use the sum ahe if female == 1 command to calculate average earnings for females. What are average earnings for males? Do you find the difference economically significant?

Use the sum if ahe > 40 command to see who the top earners are: their age, gender and education level. How about those with hourly earnings less than 3?

Use the gen ahe2 = ahe*ahe command to generate a new variable ahe2 equal to hourly wages squared. Use the scatter ahe2 ahe, title(Ahe2) command to graph the relationship between ahe and ahe2. Now try adding the options xlabel(0(2)50) ylabel(0(1000)3000) after the comma to see how to change the axes in your graph.

Use the pwcorr command to calculate the sample correlation between average hourly earnings and age. Does it come out with the expected sign? Plot the relationship with the scatter command. Does this correlation change for top and bottom earners?

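2. We know that, by definition,

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad s_X^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2, \qquad s_{XY} = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right).$$

a. Prove that $\sum_{i=1}^{n}\left(X_i - \bar{X}\right) = 0$.

b. Prove that $\sum_{i=1}^{n}\left(X_i - \bar{X}\right)Y_i = (n-1)\,s_{XY}$.

c. Prove that $\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(X_i - \bar{Y} + 5\right) = (n-1)\,s_X^2$.

d. Prove that $\sum_{i=1}^{n}\sum_{j=1}^{k}\left(1 + X_i - \bar{X}\right)\left(Y_j - \bar{Y}\right)^2 = n(k-1)\,s_Y^2$, where $\bar{Y}$ and $s_Y^2$ are computed from $Y_1, \dots, Y_k$.

Solution:

a.

$$\sum_{i=1}^{n}\left(X_i - \bar{X}\right) = \sum_{i=1}^{n} X_i - \sum_{i=1}^{n}\bar{X} = \sum_{i=1}^{n} X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0.$$

b. We prove one preliminary result:

$$\begin{aligned}
\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)
&= \sum_{i=1}^{n}\left(X_i Y_i - X_i\bar{Y} - \bar{X}Y_i + \bar{X}\bar{Y}\right) \\
&= \sum_{i=1}^{n} X_i Y_i - \bar{Y}\sum_{i=1}^{n} X_i - \bar{X}\sum_{i=1}^{n} Y_i + \sum_{i=1}^{n}\bar{X}\bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - \bar{X}\sum_{i=1}^{n} Y_i \\
&= \sum_{i=1}^{n}\left(X_i - \bar{X}\right)Y_i .
\end{aligned}$$

Therefore, replacing $Y_i - \bar{Y}$ with $Y_i$ leaves the sum unchanged, and the result follows from the definition of $s_{XY}$:

$$\sum_{i=1}^{n}\left(X_i - \bar{X}\right)Y_i = \sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right) = (n-1)\,s_{XY}.$$

c. The second and third sums vanish by part a, and the first is part b with $Y_i = X_i$:

$$\begin{aligned}
\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(X_i - \bar{Y} + 5\right)
&= \sum_{i=1}^{n}\left(X_i - \bar{X}\right)X_i - \bar{Y}\sum_{i=1}^{n}\left(X_i - \bar{X}\right) + 5\sum_{i=1}^{n}\left(X_i - \bar{X}\right) \\
&= \sum_{i=1}^{n}\left(X_i - \bar{X}\right)X_i - 0 + 0 \\
&= (n-1)\,s_{XX} = (n-1)\,s_X^2 .
\end{aligned}$$

d.

$$\begin{aligned}
\sum_{i=1}^{n}\sum_{j=1}^{k}\left(1 + X_i - \bar{X}\right)\left(Y_j - \bar{Y}\right)^2
&= \sum_{i=1}^{n}\sum_{j=1}^{k}\left(Y_j - \bar{Y}\right)^2 + \sum_{i=1}^{n}\left(X_i - \bar{X}\right)\sum_{j=1}^{k}\left(Y_j - \bar{Y}\right)^2 \\
&= n\sum_{j=1}^{k}\left(Y_j - \bar{Y}\right)^2 + 0 \\
&= n(k-1)\,s_Y^2 .
\end{aligned}$$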
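The four identities in Problem 2 can be checked numerically. The following is only a sanity check under assumptions, not part of the original solution: the data values and the names `X`, `Y`, `W` are made up for illustration, with `W` playing the role of $Y_1, \dots, Y_k$ in part d.

```python
# Numerical check of identities (a)-(d); all data below are hypothetical.
from statistics import mean, variance

X = [2.0, 4.0, 4.5, 7.0, 9.5]   # n = 5 illustrative values
Y = [1.0, 3.0, 2.0, 6.0, 8.0]   # paired with X, used in (b) and (c)
W = [0.5, 1.5, 4.0, 6.0]        # k = 4 values standing in for Y_j in (d)

n, k = len(X), len(W)
x_bar, y_bar, w_bar = mean(X), mean(Y), mean(W)

# Sample covariance and variances with the 1/(n-1) convention.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / (n - 1)
s_x2 = variance(X)   # statistics.variance divides by n - 1
s_w2 = variance(W)

# Left-hand sides of the four identities.
lhs_a = sum(x - x_bar for x in X)
lhs_b = sum((x - x_bar) * y for x, y in zip(X, Y))
lhs_c = sum((x - x_bar) * (x - y_bar + 5) for x in X)
lhs_d = sum((1 + x - x_bar) * (w - w_bar) ** 2 for x in X for w in W)

# Each pair should agree up to floating-point rounding.
print("a:", lhs_a, "vs", 0.0)
print("b:", lhs_b, "vs", (n - 1) * s_xy)
print("c:", lhs_c, "vs", (n - 1) * s_x2)
print("d:", lhs_d, "vs", n * (k - 1) * s_w2)
```

A check like this does not replace the proofs, but it is a quick way to catch a sign or index slip when reconstructing the algebra.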

3. Stock & Watson 3.5 (note: part a is a little tricky)

A survey of 1055 registered voters is conducted, and the voters are asked to choose between candidate A and candidate B. Let $p$ denote the fraction of voters in the population who prefer candidate A, and let $\hat{p}$ denote the fraction of voters in the sample who prefer candidate A.

a. You are interested in the competing hypotheses $H_0: p = 0.5$ vs. $H_1: p \neq 0.5$. Suppose you decide to reject $H_0$ if $|\hat{p} - 0.5| > 0.02$.
   i. What is the size of this test?
   ii. Compute the power of this test if $p = 0.53$.

b. In the survey, $\hat{p} = 0.54$.
   i. Test $H_0: p = 0.5$ vs. $H_1: p \neq 0.5$ using a 5% significance level.
   ii. Test $H_0: p = 0.5$ vs. $H_1: p > 0.5$ using a 5% significance level.
   iii. Construct a 95% confidence interval for $p$.
   iv. Construct a 99% confidence interval for $p$.
   v. Construct a 50% confidence interval for $p$.

c. Suppose that the survey is carried out 20 times, using independently selected voters in each survey. For each of these 20 surveys, a 95% confidence interval for $p$ is constructed.
   i. What is the probability that the true value of $p$ is contained in all 20 of these confidence intervals?
   ii. How many of these confidence intervals do you expect to contain the true value of $p$?

d. In survey jargon, the "margin of error" is $1.96 \times SE(\hat{p})$; that is, it is $\frac{1}{2}$ times the length of the 95% confidence interval. Suppose you wanted to design a survey that had a margin of error of at most 1%. That is, you wanted $\Pr(|\hat{p} - p| > 0.01) \le 0.05$. How large should $n$ be if the survey uses simple random sampling?


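Solution:

a. i. The size is given by $\Pr(|\hat{p} - 0.5| > 0.02)$, where the probability is computed assuming that $p = 0.5$.

$$\begin{aligned}
\Pr(|\hat{p} - 0.5| > 0.02)
&= 1 - \Pr(-0.02 \le \hat{p} - 0.5 \le 0.02) \\
&= 1 - \Pr\left(\frac{-0.02}{\sqrt{(0.5 \times 0.5)/1055}} \le \frac{\hat{p} - 0.5}{\sqrt{(0.5 \times 0.5)/1055}} \le \frac{0.02}{\sqrt{(0.5 \times 0.5)/1055}}\right) \\
&= 1 - \Pr\left(-1.30 \le \frac{\hat{p} - 0.5}{\sqrt{(0.5 \times 0.5)/1055}} \le 1.30\right) \\
&= 0.19,
\end{aligned}$$

where the final equality uses the central limit theorem approximation (and the normal tables).

ii. The power is given by $\Pr(|\hat{p} - 0.5| > 0.02)$, where the probability is computed assuming that $p = 0.53$.

$$\begin{aligned}
\Pr(|\hat{p} - 0.5| > 0.02)
&= 1 - \Pr(-0.02 \le \hat{p} - 0.5 \le 0.02) \\
&= 1 - \Pr\left(\frac{-0.02}{\sqrt{(0.53 \times 0.47)/1055}} \le \frac{\hat{p} - 0.5}{\sqrt{(0.53 \times 0.47)/1055}} \le \frac{0.02}{\sqrt{(0.53 \times 0.47)/1055}}\right) \\
&= 1 - \Pr\left(\frac{-0.05}{\sqrt{(0.53 \times 0.47)/1055}} \le \frac{\hat{p} - 0.53}{\sqrt{(0.53 \times 0.47)/1055}} \le \frac{-0.01}{\sqrt{(0.53 \times 0.47)/1055}}\right) \\
&= 1 - \Pr\left(-3.25 \le \frac{\hat{p} - 0.53}{\sqrt{(0.53 \times 0.47)/1055}} \le -0.65\right) \\
&= 0.74,
\end{aligned}$$

where the final equality uses the central limit theorem approximation (and the normal tables).

b. i. $t = \dfrac{0.54 - 0.5}{\sqrt{(0.54 \times 0.46)/1055}} = 2.61$ and $\Pr(|t| > 2.61) = 0.009$, so the null is rejected at the 5% level.

ii. $\Pr(t > 2.61) = 0.0045$, so the null is rejected at the 5% level.

iii. $0.54 \pm 1.96\sqrt{(0.54 \times 0.46)/1055} = 0.54 \pm 0.03$, or 0.51 to 0.57.

iv. $0.54 \pm 2.58\sqrt{(0.54 \times 0.46)/1055} = 0.54 \pm 0.04$, or 0.50 to 0.58.

v. $0.54 \pm 0.67\sqrt{(0.54 \times 0.46)/1055} = 0.54 \pm 0.01$, or 0.53 to 0.55.

c. i. The probability is 0.95 in any single survey and there are 20 independent surveys, so the probability is $0.95^{20} = 0.36$.

ii. 95% of the 20 confidence intervals, or 19.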


d. The relevant equation is $1.96 \times SE(\hat{p}) < 0.01$, or $1.96 \times \sqrt{p(1-p)/n} < 0.01$. Thus $n$ must be chosen so that

$$n > \frac{1.96^2\, p(1-p)}{0.01^2},$$

so that the answer depends on the value of $p$. Note that the largest value that $p(1-p)$ can take on is 0.25 (that is, $p = 0.5$ makes $p(1-p)$ as large as possible). Thus if

$$n > \frac{1.96^2 \times 0.25}{0.01^2} = 9604,$$

then the margin of error is less than 0.01 for all values of $p$.
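The numbers in the Problem 3 solution can be reproduced without normal tables. The sketch below is illustrative only: it uses Python's standard-library `statistics.NormalDist` in place of the tables, and the helper names `se` and `conf_int` are assumptions introduced here, not part of the original solution. Because it uses exact normal quantiles rather than the rounded values 2.58 and 0.67, results may differ from the text in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution
n = 1055

def se(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return sqrt(p * (1 - p) / n)

# a.i  Size: Pr(|p_hat - 0.5| > 0.02) when p = 0.5.
size = 2 * (1 - Z.cdf(0.02 / se(0.5, n)))

# a.ii Power: same rejection region, but p_hat is centered at p = 0.53.
s = se(0.53, n)
power = 1 - (Z.cdf((0.52 - 0.53) / s) - Z.cdf((0.48 - 0.53) / s))

# b.i-ii  t-statistic and p-values for p_hat = 0.54.
p_hat = 0.54
se_hat = se(p_hat, n)
t_stat = (p_hat - 0.5) / se_hat
p_two_sided = 2 * (1 - Z.cdf(abs(t_stat)))
p_one_sided = 1 - Z.cdf(t_stat)

# b.iii-v  Confidence interval at a given level (hypothetical helper).
def conf_int(level):
    z = Z.inv_cdf(0.5 + level / 2)
    return (p_hat - z * se_hat, p_hat + z * se_hat)

# c. Coverage across 20 independent 95% intervals.
prob_all_20 = 0.95 ** 20
expected_covering = 0.95 * 20

# d. Smallest-n bound at the worst case p(1 - p) = 0.25,
#    using the rounded critical value 1.96 as in the text.
n_min = 1.96 ** 2 * 0.25 / 0.01 ** 2

print(f"size = {size:.2f}, power = {power:.2f}")
print(f"t = {t_stat:.2f}, two-sided p = {p_two_sided:.3f}, one-sided p = {p_one_sided:.4f}")
for level in (0.95, 0.99, 0.50):
    lo, hi = conf_int(level)
    print(f"{level:.0%} CI: {lo:.2f} to {hi:.2f}")
print(f"Pr(all 20 cover) = {prob_all_20:.2f}, expected number covering = {expected_covering:.0f}")
print(f"n must exceed about {n_min:.0f}")
```

Rounded to the precision used in the solution, the output should agree with the values reported above for parts a through d.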
