
Simple Random Sampling

Professor Ron Fricker
Naval Postgraduate School
Monterey, California

Reading Assignment:
Scheaffer, Mendenhall, Ott, & Gerow
Chapter 4

2/1/13

Goals for this Lecture

•  Define simple random sampling (SRS) and discuss how to draw one
•  Horvitz-Thompson estimation and SRS
–  The finite population correction (fpc)
•  Defining estimators for means, totals, and proportions
•  Sample size calculations


Definition

•  Simple random sampling (SRS) occurs when every sample of size n (from a population of size N) has an equal chance of being selected
–  This is not how we will actually draw such a sample, just how it’s defined
•  Note it is not defined as each element having an equal chance of being selected
–  That can occur with more complex designs, particularly stratified designs
•  An example…

Example

•  Consider a population consisting of 90 men and 10 women, so N=100, where we want to sample n=10 individuals
–  With SRS, we can get samples of all men or all women
•  We could also draw a stratified sample, where via SRS we sample nine men and (separately) via SRS one woman
–  Here each person has probability 1/10 of being sampled, but not all groups of 10 can be sampled
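
The counts behind this example can be checked directly. The following is a minimal sketch using only the numbers on the slide (90 men, 10 women, n=10); the variable names are illustrative and not part of the lecture.

```python
from math import comb

N_MEN, N_WOMEN, n = 90, 10, 10
N = N_MEN + N_WOMEN

# Under SRS every one of the C(N, n) possible samples is equally likely.
total_samples = comb(N, n)
p_all_men = comb(N_MEN, n) / total_samples
p_all_women = comb(N_WOMEN, n) / total_samples  # exactly one such sample exists

# Inclusion probability of one given person under SRS:
# the share of samples that contain that person, which equals n / N.
srs_inclusion = comb(N - 1, n - 1) / total_samples

# Stratified design from the slide: 9 of the 90 men and 1 of the 10 women,
# each stratum drawn separately by SRS.
strat_inclusion_man = 9 / N_MEN
strat_inclusion_woman = 1 / N_WOMEN

# Only samples with exactly 9 men and 1 woman can occur under this design.
strat_samples = comb(N_MEN, 9) * comb(N_WOMEN, 1)

print(f"P(all men | SRS)   = {p_all_men:.4f}")
print(f"P(all women | SRS) = {p_all_women:.3e}")
print(f"Inclusion prob., SRS: {srs_inclusion:.2f}")
print(f"Inclusion prob., stratified: man {strat_inclusion_man:.2f}, woman {strat_inclusion_woman:.2f}")
print(f"Possible samples: SRS {total_samples}, stratified {strat_samples}")
```

Both designs give every person the same 1/10 inclusion probability, but the stratified design can produce only a subset of the samples that SRS can.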


How to Draw a SRS

•  Easiest way:
–  Assign every element in the sampling frame a uniformly distributed random number (say between 0 and 1)
–  Sort the list according to the random numbers (either ascending or descending, doesn’t matter)
–  Then take the first n elements
•  Don’t try to actually generate all possible combinations of n elements out of N…
•  Chapter 4 describes other manual ways to do this using tables of random numbers
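
The sort-by-random-number procedure can be sketched in a few lines of Python. This is only an illustration: the function name `draw_srs`, the made-up frame of 100 unit labels, and the fixed seed are assumptions, not part of the lecture.

```python
import random

def draw_srs(frame, n, seed=None):
    """Draw a simple random sample of size n by sorting on random keys."""
    if n > len(frame):
        raise ValueError("sample size n cannot exceed the frame size N")
    rng = random.Random(seed)
    # Step 1: attach a Uniform(0, 1) random number to every frame element.
    keyed = [(rng.random(), element) for element in frame]
    # Step 2: sort by the random number (ascending here; descending works too).
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: keep the first n elements.
    return [element for _, element in keyed[:n]]

# Hypothetical sampling frame of N=100 labelled units.
frame = [f"unit_{i:03d}" for i in range(1, 101)]
sample = draw_srs(frame, n=10, seed=42)
print(sample)
```

Python's standard library also provides `random.sample(frame, n)`, which draws without replacement directly; the explicit version above simply mirrors the steps on the slide.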

Example

[Figure: two panels labeled "UNSORTED" and "SORTED"]


Note the Difference

•  So, notice that giving every element in the population an equal chance of selection like this results in a SRS
•  Which is probably why SRS is often mistakenly defined this way
•  But remember that other non-SRS methods can also result in every element having an equal chance of being selected
–  For example, stratified sampling when the sample size drawn from each stratum is proportional to the stratum's size


Horvitz-Thompson Under SRS

•  Under SRS, each sampling unit has probability n/N of being selected
•  Estimating µ with Horvitz-Thompson estimator, we have

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{n}\frac{1}{\pi_i}y_i = \frac{1}{N}\sum_{i=1}^{n}\frac{1}{n/N}y_i = \frac{1}{N}\sum_{i=1}^{n}\frac{N}{n}y_i = \frac{1}{n}\sum_{i=1}^{n}y_i = \bar{y}$$

–  Same as Stats 101!
•  If population is infinite, standard error of $\bar{y}$ is estimated the same way too: $\hat{\sigma}_{\bar{y}} = s/\sqrt{n}$
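
A small numerical check of this reduction, as a sketch: the helper `horvitz_thompson_mean` and the five data values are hypothetical, chosen only to show that equal inclusion probabilities n/N make the Horvitz-Thompson estimate equal the ordinary sample mean.

```python
from statistics import mean

def horvitz_thompson_mean(y, pi, N):
    """Horvitz-Thompson estimate of the population mean: (1/N) * sum(y_i / pi_i)."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    return sum(y_i / pi_i for y_i, pi_i in zip(y, pi)) / N

# Hypothetical SRS of n=5 observations from a population of N=50.
N = 50
y = [12.0, 15.5, 9.0, 11.0, 14.5]
n = len(y)

# Under SRS every unit has inclusion probability n/N.
pi = [n / N] * n

ht_estimate = horvitz_thompson_mean(y, pi, N)
print(ht_estimate, mean(y))  # the two values agree up to floating-point rounding
```

With unequal inclusion probabilities the same function still applies, but the estimate no longer reduces to the sample mean.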


But What If Population Is Finite?

•  It can be shown (see Appendix A of SMO&G) that for finite populations,

$$E(S^2) = \frac{N}{N-1}\sigma^2$$

•  So, an unbiased estimate for the variance of the sample mean is:

$$\widehat{\mathrm{Var}}(\bar{y}) = \left(\frac{N-n}{N}\right)\frac{s^2}{n}$$

•  And thus the estimated standard error is:

$$\mathrm{s.e.}(\bar{y}) = \sqrt{1-\frac{n}{N}} \times \frac{s}{\sqrt{n}}$$

–  The factor involving $1 - n/N$ is the “finite population correction” or fpc
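
For a very small population, both statements can be verified exactly by enumerating every possible sample. The sketch below assumes a made-up population of six values and n=3, and takes σ² to be the population variance with divisor N, which is the convention implied by the E(S²) identity.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

# Hypothetical finite population (values are illustrative only).
population = [3.0, 7.0, 8.0, 12.0, 15.0, 21.0]
N, n = len(population), 3

sigma2 = pvariance(population)  # population variance, divisor N
S2 = variance(population)       # same data, divisor N - 1

# Under SRS all C(N, n) samples are equally likely, so averaging over
# the full enumeration gives exact expectations.
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

expected_s2 = mean([variance(s) for s in samples])  # E(s^2)
exact_var_of_mean = pvariance(sample_means)         # Var(y-bar)

# Variance of the sample mean with the fpc, evaluated at the true S^2.
fpc_formula = (1 - n / N) * S2 / n

print(f"E(s^2) = {expected_s2:.4f}   N/(N-1)*sigma^2 = {N / (N - 1) * sigma2:.4f}")
print(f"Var(y-bar) = {exact_var_of_mean:.4f}   (1 - n/N) S^2/n = {fpc_formula:.4f}")
```

Dropping the factor (1 - n/N) from `fpc_formula` would overstate the variance of the sample mean for this population.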

Finite Population Correction

•  Note that failure to use the finite population correction (fpc) results in standard errors that are too large
–  Confidence intervals will be (erroneously) too big
–  Hypothesis tests will be (erroneously) less powerful
•  For a survey with sample size less than 5 percent of population, can ignore the fpc
–  It will have negligible effect
•  If sample size larger than 5 percent, use fpc to get more precise results (a good thing!)

Example: Margin of Error Estimates

•  For various sample sizes, margins of error for an infinite-sized population and one with N=300
–  Binary question
–  Conservative p=0.5 assumption
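
A comparison of this kind can be reproduced with a short script. This is a sketch under stated assumptions: the function name `margin_of_error`, the particular sample sizes, and the 1.96 multiplier (a 95% interval) are choices made here, not values read off the original figure.

```python
from math import sqrt

def margin_of_error(n, p=0.5, N=None, z=1.96):
    """Margin of error for a proportion; applies the fpc when N is given."""
    se = sqrt(p * (1 - p) / n)
    if N is not None:
        se *= sqrt(1 - n / N)  # finite population correction
    return z * se

# Illustrative sample sizes (assumed, not taken from the slide).
print("   n   infinite   N=300")
for n in (25, 50, 100, 150, 200, 250, 300):
    print(f"{n:4d}   {margin_of_error(n):8.3f}   {margin_of_error(n, N=300):5.3f}")
```

The gap between the two columns widens as n approaches N, and the margin of error reaches zero when the whole population of 300 is surveyed.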


Another Example

•  Survey asks a binary yes/no question
–  Estimate the proportion of respondents who say “yes” with a confidence interval (N=300 and n=200)
–  If 100 of the 200 say “yes,” population point estimate is 50% ($\hat{p} = 0.5$)
•  Calculating the 95% confidence interval:
–  Incorrect interval without fpc: (43%, 57%)

$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.5 \pm 1.96\sqrt{\frac{0.25}{200}} = 0.5 \pm 0.07$$

–  Correct interval with fpc: (46%, 54%)

$$\hat{p} \pm 1.96\sqrt{\left(1-\frac{n}{N}\right)\left(\frac{\hat{p}(1-\hat{p})}{n}\right)} = 0.5 \pm 1.96\sqrt{\left(\frac{1}{3}\right)\left(\frac{0.25}{200}\right)} = 0.5 \pm 0.04$$

Where Does the FPC Come From?

•  In an infinite population, if we sample two observations then $\mathrm{Cov}(Y_i, Y_j) = 0$
–  Doesn’t really matter whether we sample with replacement or not
•  For a finite population, when we sample without replacement,

$$\mathrm{Cov}(Y_i, Y_j) = -\frac{1}{N-1}\sigma^2$$

•  Picking one observation affects the rest, so there is correlation!
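
To see how this covariance produces the fpc, expand the variance of the mean of n draws made without replacement. This is a standard derivation sketched here for reference, not a reproduction of the slides; σ² is the population variance with divisor N.

```latex
\begin{align*}
\mathrm{Var}(\bar{y})
  &= \frac{1}{n^2}\left[\sum_{i=1}^{n}\mathrm{Var}(Y_i)
     + \sum_{i \neq j}\mathrm{Cov}(Y_i, Y_j)\right] \\
  &= \frac{1}{n^2}\left[n\sigma^2 - n(n-1)\frac{\sigma^2}{N-1}\right] \\
  &= \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)
\end{align*}
```

Because $E(S^2) = \frac{N}{N-1}\sigma^2$, replacing $\sigma^2$ with its unbiased estimate $\frac{N-1}{N}s^2$ gives $\left(1-\frac{n}{N}\right)\frac{s^2}{n}$, the estimated variance of the sample mean with the fpc.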


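Mean Estimation Summary

•  Estimator for the mean:

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n}y_i$$

•  Variance of $\bar{y}$:

$$\widehat{\mathrm{Var}}(\bar{y}) = \left(1-\frac{n}{N}\right)\frac{s^2}{n}$$

•  Bound on the error of estimation (margin of error):

$$2\sqrt{\widehat{\mathrm{Var}}(\bar{y})} = 2\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2}{n}}$$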


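Estimating Totals

•  Estimator for the total:

$$\hat{\tau} = N \times \bar{y} = \frac{N}{n}\sum_{i=1}^{n}y_i$$

•  Variance of $\hat{\tau}$:

$$\widehat{\mathrm{Var}}(\hat{\tau}) = \widehat{\mathrm{Var}}(N\bar{y}) = N^2\left(1-\frac{n}{N}\right)\frac{s^2}{n}$$

•  Bound on the error of estimation (margin of error):

$$2\sqrt{\widehat{\mathrm{Var}}(\hat{\tau})} = 2N\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2}{n}}$$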


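Estimating Proportions

•  Estimator for the proportion:

$$\hat{p} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n}y_i$$

•  Variance of $\hat{p}$:

$$\widehat{\mathrm{Var}}(\hat{p}) = \left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n}$$

•  Bound on the error of estimation (margin of error):

$$2\sqrt{\widehat{\mathrm{Var}}(\hat{p})} = 2\sqrt{\left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n}}$$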
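
The three sets of summary formulas (mean, total, proportion) translate directly into code. The sketch below is illustrative: the function names and the small data sets are assumptions, and each function returns the point estimate, the estimated variance with the fpc, and the bound 2·sqrt(variance).

```python
from math import sqrt
from statistics import mean, variance

def srs_mean(y, N):
    """Estimate, estimated variance, and error bound for the population mean."""
    n = len(y)
    var = (1 - n / N) * variance(y) / n  # variance(y) is s^2 (divisor n - 1)
    return mean(y), var, 2 * sqrt(var)

def srs_total(y, N):
    """Estimate, estimated variance, and error bound for the population total."""
    est, var, _ = srs_mean(y, N)
    var_total = N**2 * var
    return N * est, var_total, 2 * sqrt(var_total)

def srs_proportion(y, N):
    """Same quantities for a proportion; y holds 0/1 responses."""
    n = len(y)
    p_hat = mean(y)
    var = (1 - n / N) * p_hat * (1 - p_hat) / n
    return p_hat, var, 2 * sqrt(var)

# Hypothetical measurements: n=5 sampled from N=20.
y = [4.0, 6.0, 5.0, 9.0, 6.0]
print(srs_mean(y, N=20))
print(srs_total(y, N=20))

# Hypothetical binary responses: 100 "yes" out of n=200, with N=300.
answers = [1] * 100 + [0] * 100
print(srs_proportion(answers, N=300))
```

The bound for the total is N times the bound for the mean, as the formulas imply.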


Sample Size Calculations (w/ fpc) for Estimating Means

•  Typically, we want to determine a sample size to achieve a particular margin of error B
•  So, solving the following for n

$$2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}} = B$$

gives

$$n = \frac{N\sigma^2}{B^2(N-1)/4 + \sigma^2}$$

•  This is the number of respondents required
–  Will need to inflate to account for nonrespondents

Sample Size Calculations (w/ fpc) for Estimating Totals

•  Proceed as before, but use the expression for the margin of error for totals
•  That is, solve the following for n

$$2N\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}} = B$$

gives

$$n = \frac{N\sigma^2}{B^2(N-1)/(4N^2) + \sigma^2}$$

•  Again, don’t forget to inflate this to account for the nonresponse rate

Sample Size Calculations (w/ fpc) for Estimating Proportions

•  Again proceed as before, but use the expression for proportions
•  That is, solve the following for n (using the same $(N-n)/(N-1)$ factor as for means and totals, with $\sigma^2 = p(1-p)$)

$$2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{p(1-p)}{n}} = B$$

gives

$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)}$$

•  And again, don’t forget to inflate this to account for the nonresponse rate
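
The three sample size formulas and the nonresponse adjustment can be collected into a few helper functions. This is a minimal sketch; the function names are hypothetical, and rounding up with `ceil` is an assumption about how a fractional sample size should be handled.

```python
from math import ceil

def n_for_mean(N, sigma2, B):
    """Respondents needed to estimate a mean within margin of error B."""
    return N * sigma2 / (B**2 * (N - 1) / 4 + sigma2)

def n_for_total(N, sigma2, B):
    """Respondents needed to estimate a total within margin of error B."""
    return N * sigma2 / (B**2 * (N - 1) / (4 * N**2) + sigma2)

def n_for_proportion(N, B, p=0.5):
    """Respondents needed for a proportion; p=0.5 is the conservative choice."""
    return n_for_mean(N, p * (1 - p), B)

def inflate_for_nonresponse(n_respondents, response_rate):
    """Number to sample so that about n_respondents actually respond."""
    return ceil(n_respondents / response_rate)

# Example: N=300, 3% margin of error, p=0.5, 80% response rate.
n_needed = n_for_proportion(N=300, B=0.03)
print(round(n_needed, 1))                              # about 236.4
print(inflate_for_nonresponse(ceil(n_needed), 0.8))    # 297
```

A margin of error B on the total corresponds to B/N on the mean, so `n_for_total(N, sigma2, N * B)` and `n_for_mean(N, sigma2, B)` return the same value.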

Power Calculations Example

•  Back to survey with N=300, where we guess that p=50% (most conservative assumption)
•  What sample size do we need to achieve a margin of error of 3%?

$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} = \frac{300 \times 0.5(1-0.5)}{0.03^2(300-1)/4 + 0.5(1-0.5)} = 236.4$$

•  So, need responses from 237 out of the 300
–  If 80% response rate, must sample 237/0.8=297!

For Our Project

•  Same assumptions:
–  Binary question
–  p=0.5
•  If we’re going to survey ~900 people out of 1500, might as well do them all?
–  Plus, 1500 gives some insurance if response rate < 0.7


Doing the Calculations Directly

•  First, we need this many respondents for a 3% margin of error:

$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} = \frac{1500 \times 0.5(1-0.5)}{0.03^2(1500-1)/4 + 0.5(1-0.5)} = 638.5$$

•  Then, accounting for nonresponse:

$$638.5 / 0.7 = 912.1$$



Sample Size Calculations (w/out fpc) for Estimating Proportions

•  Similar to what we were doing, but margin of error expression does not include fpc
–  Choose B, the margin of error
–  Then, $B = 2\sqrt{\hat{p}(1-\hat{p})/n}$
–  Algebra gives required sample size:

$$n = \frac{4\hat{p}(1-\hat{p})}{B^2}$$

•  Can simplify further:
–  Estimate p using worst case: ½
–  Then, $n = 1/B^2$


Example

•  National poll of likely voters for candidate “X”
–  Desire 3% margin of error
•  Then $n = 1/B^2 = 1/0.03^2 = 1{,}111.1$
•  If expect a 70% response rate, then sample 1,111.1/0.7=1,587.3 or 1,588 likely voters
•  Compare to fpc-based calculation:

$$n = \frac{300{,}000{,}000 \times 0.5(1-0.5)}{0.03^2(300{,}000{,}000-1)/4 + 0.5(1-0.5)} = 1{,}111.1$$


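How Does That Work?

$$
\begin{aligned}
n &= \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} \\
  &= \frac{4\left(\frac{N}{N-1}\right)p(1-p)}{B^2 + 4\,\frac{p(1-p)}{N-1}} \\
  &= \frac{\left(\frac{N}{N-1}\right)}{B^2 + \frac{1}{N-1}} \quad (\text{for } p = 1/2) \\
  &\approx \frac{1}{B^2} \quad \text{for large } N
\end{aligned}
$$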
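
The convergence can also be seen numerically. In the sketch below the function names and the grid of population sizes are illustrative assumptions; B=0.03 and p=0.5 follow the examples in this lecture.

```python
def n_with_fpc(N, B, p=0.5):
    """Required respondents using the fpc-based formula."""
    return N * p * (1 - p) / (B**2 * (N - 1) / 4 + p * (1 - p))

def n_without_fpc(B, p=0.5):
    """Required respondents ignoring the fpc; equals 1/B^2 when p = 1/2."""
    return 4 * p * (1 - p) / B**2

B = 0.03
print(f"without fpc: {n_without_fpc(B):.1f}")
# Illustrative population sizes, from small to national scale.
for N in (300, 1_500, 10_000, 100_000, 1_000_000, 300_000_000):
    print(f"N = {N:>11,}   n = {n_with_fpc(N, B):8.1f}")
```

For small N the fpc-based sample size is well below 1/B², and the two agree to one decimal place by the time N reaches hundreds of millions.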

Take-Aways

•  With SRS and sample size less than 5% of population, proceed using “Stats 101” methods
–  Means, totals, proportions
–  Can use standard statistical software
•  With SRS, if n > 0.05N, then be sure to use finite population correction
–  Reported results more precise (and correct)
–  Either need to use special software or manually adjust the reported standard errors


What We Have Covered

•  Defined simple random sampling (SRS) and discussed how to draw one
•  Discussed Horvitz-Thompson estimation and SRS
–  Defined the finite population correction (fpc)
•  Defined estimators for means, totals, and proportions, including their standard errors
•  Discussed sample size calculations
