Lecture 6-3 - Simple Random Sampling

Simple Random Sampling
Professor Ron Fricker

Naval Postgraduate School
Monterey, California
Reading Assignment:
Scheaffer, Mendenhall, Ott, & Gerow, Chapter 4
2/1/13

Goals for this Lecture
•  Define simple random sampling (SRS) and discuss how to draw one
•  Horvitz-Thompson estimation and SRS
–  The finite population correction (fpc)
•  Defining estimators for means, totals, and proportions
•  Sample size calculations

Definition
•  Simple random sampling (SRS) occurs when every sample of size n (from a population of size N) has an equal chance of being selected
–  This is not how we will actually draw such a sample, just how it’s defined
•  Note it is not defined as each element having an equal chance of being selected
–  That can occur with more complex designs, particularly stratified designs
•  An example…

Example
•  Consider a population consisting of 90 men and 10 women, so N=100, where we want to sample n=10 individuals
–  With SRS, we can get samples of all men or all women
•  We could also draw a stratified sample, where via SRS we sample nine men and (separately) via SRS one woman
–  Here each person has probability 1/10 of being sampled, but not all groups of 10 can be sampled
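To make the distinction concrete, here is a minimal Python sketch that counts the possible samples under each design and checks that every person's inclusion probability is 1/10 either way. The variable names are illustrative and not from the lecture.

```python
from fractions import Fraction
from math import comb

N_men, N_women, n = 90, 10, 10
N = N_men + N_women

# SRS: every subset of size n is a possible sample, all equally likely.
srs_samples = comb(N, n)
all_men_samples = comb(N_men, n)      # possible under SRS
all_women_samples = comb(N_women, n)  # exactly one such sample

# Stratified: 9 men by SRS and, separately, 1 woman by SRS.
strat_samples = comb(N_men, 9) * comb(N_women, 1)

# Inclusion probability of any one person under each design.
p_srs = Fraction(n, N)
p_man = Fraction(9, N_men)
p_woman = Fraction(1, N_women)

print(srs_samples, strat_samples)
print(p_srs, p_man, p_woman)  # all three equal 1/10
```

The stratified design allows far fewer samples than SRS, and none of them is all men or all women, even though each individual's chance of selection is unchanged.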

How to Draw a SRS
•  Easiest way:
–  Assign every element in the sampling frame a uniformly distributed random number (say between 0 and 1)
–  Sort the list according to the random numbers
    •  Either ascending or descending, doesn’t matter
–  Then take the first n elements
•  Don’t try to actually generate all possible combinations of n elements out of N…
•  Chapter 4 describes other manual ways to do this using tables of random numbers
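The "easiest way" above can be sketched in a few lines of Python. This is one possible implementation, not the lecture's own code; the function name `draw_srs`, the `seed` argument, and the example frame are assumptions made for illustration.

```python
import random

def draw_srs(frame, n, seed=None):
    """Return n elements of `frame` chosen by sorting on random keys."""
    if not 0 <= n <= len(frame):
        raise ValueError("n must be between 0 and the frame size")
    rng = random.Random(seed)
    # Pair every element with a Uniform(0, 1) random number.
    keyed = [(rng.random(), element) for element in frame]
    # Sort by the random number only; ascending or descending both work.
    keyed.sort(key=lambda pair: pair[0])
    # Keep the first n elements of the randomly ordered list.
    return [element for _, element in keyed[:n]]

frame = [f"unit_{i:03d}" for i in range(1, 101)]  # hypothetical frame, N=100
sample = draw_srs(frame, n=10, seed=1)
print(sample)
```

Python's standard `random.sample(frame, n)` also draws without replacement and could be used instead; the version above simply mirrors the sort-based steps.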

Example
[Figure: example list shown UNSORTED and SORTED]

Note the Difference
•  So, notice that giving every element in the population an equal chance of selection like this results in a SRS
•  Which is probably why SRS is often mistakenly defined this way
•  But remember that other non-SRS methods can also result in every element having an equal chance of being selected
–  For example, stratified sampling when each stratum’s sample size is proportional to the stratum’s size

Horvitz-Thompson Under SRS
•  Under SRS, each sampling unit has probability n/N of being selected
•  Estimating µ with the Horvitz-Thompson estimator, we have
$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{n}\frac{y_i}{\pi_i} = \frac{1}{N}\sum_{i=1}^{n}\frac{y_i}{n/N} = \frac{1}{N}\sum_{i=1}^{n}\frac{N}{n}\,y_i = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}$$
–  Same as Stats 101!
•  If the population is infinite, the standard error of $\bar{y}$ is estimated the same way too: $\hat{\sigma}_{\bar{y}} = s/\sqrt{n}$
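As a quick numerical check of the identity above, this small Python sketch computes the Horvitz-Thompson estimate with every inclusion probability set to n/N and compares it with the plain sample mean. The data values and population size are hypothetical.

```python
def ht_mean(sample, inclusion_probs, N):
    """Horvitz-Thompson estimate of the population mean."""
    return sum(y / pi for y, pi in zip(sample, inclusion_probs)) / N

N = 50                                   # hypothetical population size
sample = [12.0, 15.0, 9.0, 14.0, 10.0]   # hypothetical observed values
n = len(sample)
pis = [n / N] * n                        # under SRS every unit has pi_i = n/N

mu_hat = ht_mean(sample, pis, N)
y_bar = sum(sample) / n
print(mu_hat, y_bar)
```

With unequal inclusion probabilities the same function would weight each observation differently, which is where the general estimator departs from the sample mean.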

But What If Population Is Finite?
•  It can be shown (see Appendix A of SMO&G) that for finite populations,
$$E(S^2) = \frac{N}{N-1}\,\sigma^2$$
•  So, an unbiased estimate for the variance of the sample mean is:
$$\widehat{\mathrm{Var}}(\bar{Y}) = \left(\frac{N-n}{N}\right)\frac{s^2}{n}$$
•  And thus the estimated standard error is:
$$\mathrm{s.e.}(\bar{Y}) = \sqrt{1-\frac{n}{N}} \times \frac{s}{\sqrt{n}}$$
–  The leading factor is the “finite population correction” or fpc
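The step from the expectation of S² to the variance estimate can be filled in as follows. This is a sketch that assumes the standard result for SRS without replacement, Var(Ȳ) = (σ²/n)(N − n)/(N − 1), which the slide does not state explicitly.

```latex
\begin{aligned}
\mathrm{Var}(\bar{Y}) &= \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)
  \quad \text{(assumed: SRS without replacement)} \\
E(S^2) = \frac{N}{N-1}\,\sigma^2
  \;&\Longrightarrow\; \frac{N-1}{N}\,s^2 \text{ is unbiased for } \sigma^2 \\
\widehat{\mathrm{Var}}(\bar{Y})
  &= \frac{1}{n}\left(\frac{N-1}{N}\,s^2\right)\left(\frac{N-n}{N-1}\right)
   = \left(\frac{N-n}{N}\right)\frac{s^2}{n}
\end{aligned}
```

The (N − 1) factors cancel, leaving the (N − n)/N = 1 − n/N correction in the estimate.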

Finite Population Correction
•  Note that failure to use the finite population correction (fpc) results in standard errors that are too large
–  Confidence intervals will be (erroneously) too big
–  Hypothesis tests will be (erroneously) less powerful
•  For a survey with sample size less than 5 percent of population, can ignore the fpc
–  It will have negligible effect
•  If sample size larger than 5 percent, use fpc to get more precise results (a good thing!)

Example: Margin of Error Estimates
•  For various sample sizes, margins of error for an infinite-sized population and one with N=300
–  Binary question
–  Conservative p=0.5 assumption
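A comparison of this kind can be regenerated with a short Python sketch. The sample sizes listed and the 1.96 multiplier are assumptions; the original figure's exact values are not recoverable from the text.

```python
from math import sqrt

def margin_of_error(n, p=0.5, N=None, z=1.96):
    """Margin of error for a proportion; N=None means an infinite population."""
    variance = p * (1 - p) / n
    if N is not None:
        variance *= 1 - n / N   # finite population correction
    return z * sqrt(variance)

for n in (25, 50, 100, 150, 200, 250, 300):   # illustrative sample sizes
    print(f"n={n:3d}  infinite: {margin_of_error(n):.3f}  "
          f"N=300: {margin_of_error(n, N=300):.3f}")
```

As n approaches N=300 the corrected margin of error shrinks to zero, since a full census leaves no sampling error.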

Another Example
•  Survey asks a binary yes/no question
–  Estimate the proportion of respondents who say “yes” with a confidence interval (N=300 and n=200)
–  If 100 of the 200 say “yes,” population point estimate is 50% ($\hat{p} = 0.5$)
•  Calculating the 95% confidence interval:
–  Incorrect interval without fpc: (43%, 57%)
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.5 \pm 1.96\sqrt{\frac{0.25}{200}} = 0.5 \pm 0.07$$
–  Correct interval with fpc: (46%, 54%)
$$\hat{p} \pm 1.96\sqrt{\left(1-\frac{n}{N}\right)\left(\frac{\hat{p}(1-\hat{p})}{n}\right)} = 0.5 \pm 1.96\sqrt{\left(\frac{1}{3}\right)\left(\frac{0.25}{200}\right)} = 0.5 \pm 0.04$$
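The two intervals can be reproduced with a small Python sketch. The function name `proportion_ci` and its signature are illustrative choices, not part of the lecture.

```python
from math import sqrt

def proportion_ci(successes, n, N=None, z=1.96):
    """Confidence interval for a proportion, with the fpc when N is given."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    if N is not None:
        se *= sqrt(1 - n / N)   # finite population correction
    return p_hat - z * se, p_hat + z * se

lo_no, hi_no = proportion_ci(100, 200)
lo_fpc, hi_fpc = proportion_ci(100, 200, N=300)
print(f"without fpc: ({lo_no:.0%}, {hi_no:.0%})")
print(f"with fpc:    ({lo_fpc:.0%}, {hi_fpc:.0%})")
```

Here two thirds of the population was sampled, so the correction removes a large share of the interval's width.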

Where Does the FPC Come From?
•  In an infinite population, if we sample two observations then $\mathrm{Cov}(Y_i, Y_j) = 0$
–  Doesn’t really matter whether we sample with replacement or not
•  For a finite population, when we sample without replacement,
$$\mathrm{Cov}(Y_i, Y_j) = -\frac{1}{N-1}\,\sigma^2$$
•  Picking one observation affects the rest, so there is correlation!
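The covariance result can be verified exactly on a tiny population by enumerating every ordered pair of draws without replacement. The five population values below are hypothetical, and σ² is taken as the population variance with divisor N, which is an assumption about the convention used here.

```python
from itertools import permutations

population = [3.0, 7.0, 8.0, 12.0, 20.0]   # hypothetical values, N=5
N = len(population)
mu = sum(population) / N
sigma2 = sum((y - mu) ** 2 for y in population) / N   # divisor N (assumed)

# All ordered pairs (first draw, second draw) without replacement
# are equally likely, so the covariance is a plain average.
pairs = list(permutations(population, 2))
cov = sum((a - mu) * (b - mu) for a, b in pairs) / len(pairs)

print(cov, -sigma2 / (N - 1))
```

The negative sign reflects that drawing an above-average value leaves a slightly below-average remainder for the next draw.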

Mean Estimation Summary
•  Estimator for the mean:
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
•  Variance of $\bar{y}$:
$$\widehat{\mathrm{Var}}(\bar{y}) = \left(1-\frac{n}{N}\right)\frac{s^2}{n}$$
•  Bound on the error of estimation (margin of error):
$$2\sqrt{\widehat{\mathrm{Var}}(\bar{y})} = 2\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2}{n}}$$

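Estimating Totals
•  Estimator for the total:
$$\hat{\tau} = N \times \bar{y} = \frac{N}{n}\sum_{i=1}^{n} y_i$$
•  Variance of $\hat{\tau}$:
$$\widehat{\mathrm{Var}}(\hat{\tau}) = \widehat{\mathrm{Var}}(N\bar{y}) = N^2\left(1-\frac{n}{N}\right)\frac{s^2}{n}$$
•  Bound on the error of estimation (margin of error):
$$2\sqrt{\widehat{\mathrm{Var}}(\hat{\tau})} = 2N\sqrt{\left(1-\frac{n}{N}\right)\frac{s^2}{n}}$$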
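The mean and total estimators, with their margins of error, might be coded along these lines. This is a sketch under the slide's formulas; the function name, the returned dictionary keys, and the example data are assumptions.

```python
from math import sqrt
from statistics import mean, variance

def srs_mean_and_total(sample, N):
    """Estimates and margins of error (2 standard errors) under SRS."""
    n = len(sample)
    y_bar = mean(sample)
    s2 = variance(sample)                 # sample variance, divisor n - 1
    var_mean = (1 - n / N) * s2 / n       # estimated Var(y-bar) with fpc
    bound_mean = 2 * sqrt(var_mean)
    # The total scales the mean, and its bound, by N.
    return {
        "mean": y_bar, "mean_bound": bound_mean,
        "total": N * y_bar, "total_bound": N * bound_mean,
    }

sample = [4.0, 6.0, 5.0, 9.0, 6.0]   # hypothetical data, n=5
result = srs_mean_and_total(sample, N=20)
print(result)
```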

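Estimating Proportions
•  Estimator for the proportion:
$$\hat{p} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
•  Variance of $\hat{p}$:
$$\widehat{\mathrm{Var}}(\hat{p}) = \left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n}$$
•  Bound on the error of estimation (margin of error):
$$2\sqrt{\widehat{\mathrm{Var}}(\hat{p})} = 2\sqrt{\left(1-\frac{n}{N}\right)\frac{\hat{p}(1-\hat{p})}{n}}$$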
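Since a proportion is the mean of 0/1 indicators, the estimator is easy to sketch in Python. The example reuses the earlier survey numbers (N=300, n=200, 100 "yes"); the function name is illustrative.

```python
from math import sqrt

def srs_proportion(responses, N):
    """Estimate a proportion from 0/1 responses, with a 2-s.e. bound."""
    n = len(responses)
    p_hat = sum(responses) / n            # mean of the 0/1 indicators
    var_p = (1 - n / N) * p_hat * (1 - p_hat) / n
    return p_hat, 2 * sqrt(var_p)

responses = [1] * 100 + [0] * 100        # 100 "yes" out of n=200
p_hat, bound = srs_proportion(responses, N=300)
print(f"p_hat={p_hat:.2f}, bound={bound:.3f}")
```

The bound uses the multiplier 2 from the summary formula rather than 1.96, so it is marginally wider than the 95% interval computed earlier.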

Sample Size Calculations (w/ fpc)
for Estimating Means
•  Typically, we want to determine a sample size to achieve a particular margin of error B
•  So, solving the following for n
$$2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}} = B$$
gives
$$n = \frac{N\sigma^2}{B^2(N-1)/4 + \sigma^2}$$
•  This is the number of respondents required
–  Will need to inflate to account for nonrespondents
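A minimal Python sketch of this calculation follows, including the inflation for nonresponse. The planning values (N=1000, a guessed σ² of 100, B=2, an 80% response rate) are hypothetical, as are the function names.

```python
from math import ceil

def n_for_mean(N, sigma2, B):
    """Respondents needed to estimate a mean within B (with fpc)."""
    return N * sigma2 / (B**2 * (N - 1) / 4 + sigma2)

def inflate_for_nonresponse(n_required, response_rate):
    """Number to sample so that about n_required respond."""
    return ceil(n_required / response_rate)

# Hypothetical planning values: N=1000, guessed sigma^2=100, B=2.
n_req = n_for_mean(N=1000, sigma2=100.0, B=2.0)
print(ceil(n_req), inflate_for_nonresponse(ceil(n_req), 0.8))
```

In practice σ² is unknown at the planning stage and has to be guessed from a pilot study or prior data, so the result is only as good as that guess.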

Sample Size Calculations (w/ fpc)
for Estimating Totals
•  Proceed as before, but use the expression for the margin of error for totals
•  That is, solving the following for n
$$2N\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}} = B$$
gives
$$n = \frac{N\sigma^2}{B^2(N-1)/(4N^2) + \sigma^2}$$
•  Again, don’t forget to inflate this to account for the nonresponse rate
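The totals formula can be sketched the same way. One useful sanity check, shown here with hypothetical planning values, is that a bound of B on the total corresponds to a bound of B/N on the mean, so both formulas should return the same n.

```python
def n_for_total(N, sigma2, B):
    """Respondents needed to estimate a total within B (with fpc)."""
    return N * sigma2 / (B**2 * (N - 1) / (4 * N**2) + sigma2)

# Hypothetical planning values: N=1000, guessed sigma^2=100, B=2000
# (a bound of 2000 on the total is a bound of 2 on the mean).
n_total = n_for_total(N=1000, sigma2=100.0, B=2000.0)
n_mean = 1000 * 100.0 / (2.0**2 * 999 / 4 + 100.0)   # means formula, B=2
print(round(n_total, 2), round(n_mean, 2))
```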

Sample Size Calculations (w/ fpc)
for Estimating Proportions
•  Again proceed as before, but use the expression for proportions
•  That is, solving the following for n (with the fpc written as (N − n)/(N − 1), as for means)
$$2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{p(1-p)}{n}} = B$$
gives
$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)}$$
•  And again, don’t forget to inflate this to account for the nonresponse rate

Power Calculations Example
•  Back to survey with N=300, where we guess that p=50% (most conservative assumption)
•  What sample size do we need to achieve a margin of error of 3%?
$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} = \frac{300 \times 0.5(1-0.5)}{0.03^2(300-1)/4 + 0.5(1-0.5)} = 236.4$$
•  So, need responses from 237 out of the 300
–  If 80% response rate, must sample 237/0.8=297!
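The arithmetic above can be checked with a few lines of Python. The function name `n_for_proportion` is an illustrative choice; the inputs are the ones stated on the slide.

```python
from math import ceil

def n_for_proportion(N, p, B):
    """Respondents needed to estimate a proportion within B (with fpc)."""
    pq = p * (1 - p)
    return N * pq / (B**2 * (N - 1) / 4 + pq)

n_req = n_for_proportion(N=300, p=0.5, B=0.03)
respondents = ceil(n_req)                 # round up to whole respondents
to_sample = ceil(respondents / 0.8)       # inflate for an 80% response rate
print(f"{n_req:.1f} {respondents} {to_sample}")
```

Rounding up at each step is deliberate: rounding down would leave the achieved margin of error slightly above the 3% target.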

For Our Project
•  Same assumptions:
–  Binary question
–  p=0.5
•  If we’re going to survey ~900 people out of 1500, might as well do them all?
–  Plus, 1500 gives some insurance if response rate < 0.7

Doing the Calculations Directly
•  First, we need this many respondents for a 3% margin of error:
$$n = \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} = \frac{1500 \times 0.5(1-0.5)}{0.03^2(1500-1)/4 + 0.5(1-0.5)} = 638.5$$
•  Then, accounting for nonresponse:
638.5 / 0.7 = 912.1

Sample Size Calculations (w/out fpc)
for Estimating Proportions
•  Similar to what we were doing, but margin of error expression does not include fpc
–  Choose B, the margin of error
–  Then, $B = 2\sqrt{\hat{p}(1-\hat{p})/n}$
–  Algebra gives required sample size:
$$n = \frac{4\hat{p}(1-\hat{p})}{B^2}$$
•  Can simplify further:
–  Estimate p using worst case: ½
–  Then, $n = 1/B^2$

Example
•  National poll of likely voters for candidate “X”
–  Desire 3% margin of error
•  Then $n = 1/B^2 = 1/0.03^2 = 1{,}111.1$
•  If expect a 70% response rate, then sample 1,111.1/0.7=1,587.3 or 1,588 likely voters
•  Compare to fpc-based calculation:
$$n = \frac{300{,}000{,}000 \times 0.5(1-0.5)}{0.03^2(300{,}000{,}000-1)/4 + 0.5(1-0.5)} = 1{,}111.1$$
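To see how quickly the two formulas converge, this Python sketch evaluates both for a 3% margin of error. The population sizes other than 300, 1,500 and 300,000,000 are illustrative additions, and the function names are assumptions.

```python
def n_no_fpc(B, p=0.5):
    """Sample size ignoring the fpc: 4 p (1 - p) / B^2."""
    return 4 * p * (1 - p) / B**2

def n_with_fpc(N, B, p=0.5):
    """Sample size with the fpc, using the proportion formula."""
    pq = p * (1 - p)
    return N * pq / (B**2 * (N - 1) / 4 + pq)

B = 0.03
print(f"no fpc: {n_no_fpc(B):.1f}")
for N in (300, 1_500, 100_000, 300_000_000):
    print(f"N={N:>11,}: {n_with_fpc(N, B):.1f}")
```

For small populations the fpc-based sample size is far below 1/B², while for a national population the two agree to one decimal place.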

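How Does That Work?
$$\begin{aligned}
n &= \frac{Np(1-p)}{B^2(N-1)/4 + p(1-p)} \\
  &= \frac{4\left(\frac{N}{N-1}\right)p(1-p)}{B^2 + 4\,\frac{p(1-p)}{N-1}} \\
  &= \frac{\left(\frac{N}{N-1}\right)}{B^2 + \frac{1}{N-1}} \quad (\text{for } p = 1/2) \\
  &\approx \frac{1}{B^2} \quad \text{for large } N
\end{aligned}$$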

Take-Aways!
•  With SRS and sample size less than 5% of population, proceed using “Stats 101” methods
–  Means, totals, proportions
–  Can use standard statistical software
•  With SRS, if n > 0.05N, then be sure to use finite population correction
–  Reported results more precise (and correct)
–  Either need to use special software or manually adjust the reported standard errors

What We Have Covered
•  Defined simple random sampling (SRS) and discussed how to draw one
•  Discussed Horvitz-Thompson estimation and SRS
–  Defined the finite population correction (fpc)
•  Defined estimators for means, totals, and proportions, including their standard errors
•  Discussed sample size calculations
