Stratified Sampling Explained
Stratified Sampling Explained
Stratified Sampling
An important objective in any estimation problem is to obtain an estimator of a population parameter
which can take care of the salient features of the population. If the population is homogeneous with
respect to the characteristic under study, then the method of simple random sampling will yield a
homogeneous sample, and in turn, the sample mean will serve as a good estimator of the population
mean. Thus, if the population is homogeneous with respect to the characteristic under study, then the
sample drawn through simple random sampling is expected to provide a representative sample.
Moreover, the variance of the sample mean not only depends on the sample size and sampling fraction
but also on the population variance. In order to increase the precision of an estimator, we need to use a
sampling scheme which can reduce the heterogeneity in the population. If the population is
heterogeneous with respect to the characteristic under study, then one such sampling procedure is a
stratified sampling.
Example: In order to find the average height of the students in a school of class 1 to class 12, the
height varies a lot as the students in class 1 are of age around 6 years, and students in class 10 are of age
around 16 years. So one can divide all the students into different subpopulations or strata such as
Students of class 1, 2 and 3: Stratum 1
Students of class 4, 5 and 6: Stratum 2
Students of class 7, 8 and 9: Stratum 3
Students of class 10, 11 and 12: Stratum 4
Now draw the samples by SRS from each of the strata 1, 2, 3 and 4. All the drawn samples combined
together will constitute the final stratified sample for further analysis.
Population (N units)
Strata are constructed such that they are non-overlapping and homogeneous with respect to the
characteristic under study such that N N.
k
i1
i
In cluster sampling, the clusters are constructed such that they are
within heterogeneous and
among homogeneous.
[Note: We discuss the cluster sampling later.]
interest is not to have k different estimators of the parameters, but the ultimate goal is to have a single
estimator. In this case, an important issue is how to combine the different sample information together
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 3
into one estimator, which is good enough to provide information about the parameter.
We now consider the estimation of population mean and population variance from a stratified sample.
1 N i
Yi
Ni j 1
yij : population mean of ith stratum
ni
1
yi y ij : sample mean from ith stratum
n j 1
1
i
N Y w Y : population mean where
k i i i i N
Ni .
Y wi
N
i1 i1
y
n i1 ni yi
as a possible estimator of Y .
E( y )
n n E( y )
i1
i i
1 k
n n Y
i1
i i
Y
and y turns out to be a biased estimator of Y . Based on this, one can modify y so as to obtain an
unbiased estimator of Y . Consider the stratum mean which is defined as the weighted arithmetic mean
of strata sample means with strata sizes as weights given by
1 k
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 5
yst Ny.
N i1
i i
E( yst ) 1 k N E( y )
N i1
i i
k
NY
1 i1 i i
N
Y
Thus is an unbiased estimator of Y .
yst
Variance of
yst
k k ni
Var( ) w2 Var( y )
y
ww Cov( y , y ).
st i i i j i
i1 i ( j )1 j 1
Since all the samples have been drawn independently from each of the strata by SRSWOR so
Cov( yi , y j ) 0,i j
N ni 2
Var( y ) i S
i i
Ni ni
where
1 N
Y i )2.
S2
i
(Y i
ij
Ni 1 j1
Thus
k 2
Ni ni 2
Var( yst ) Si
wi
i1 Ni ni
n S
2
2
k
i w
.
i i
1
i1 Ni ni
2
Observe that Var( yst ) is small when S is small. This observation suggests how to construct the strata.
i
If S is small for all i = 1,2,...,k, then Var( y ) will also be small. That is why it was mentioned earlier
2
i st
that the strata are to be constructed such that they are within homogeneous, i.e., S 2 is small and among
i
heterogeneous.
For example, the units in geographical proximity will tend to be more closer. The consumption pattern
in the households will be similar within a lower income group housing society and within a higher
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 7
income group housing society, whereas they will differ a lot between the two housing societies based on
income.
and N i ni
V‸ar( y ) s2
i
Ni ni i
so V‸ar(
k
2
y st
) i1 w
i i
2 Ni ni 2
k
w s
i
N n
i
i1
i i
Note: If SRSWR is used instead of SRSWOR for drawing the samples from each stratum, then in this
case
yst k wi yi
i1
E( yst ) Y k
N 1 2
2 i
2 k
2
wi yst )
Var( i
Si wi
i1 Nn ni
i i i1
2 2
k
w s
V‸ar( ) ni i
y s
i1 i
N
2
where ( i y )2.
1 y
i ni j 1 ij i
Note: The sample size cannot be determined by minimizing both the cost and variability
simultaneously. The cost function is directly proportional to the sample size, whereas variability is
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 11
inversely proportional to the sample size.
Based on different ideas, some allocation procedures are as follows:
2. Proportional allocation
For fixed k, select ni
such that it is proportional to stratum size Ni , i.e.,
ni Ni
or ni Ni
n
i1
k
i
or n N
n
.
N
Thus n
n N.
i i
N
Such allocation arises from considerations like operational convenience.
*
where is the constant of proportionality.
k k
n i
*
NS
i i
i1 i1
k
or n
*
NS
i 1
i
or
*
n
.
Ni Si
k
i 1
k
This allocation arises when the Var yst is minimized subject to the constraint (prespecified).
ni i1
There are some limitations to the optimum allocation. The knowledge of Si (i 1, 2,..., k) is needed to
know ni . If there are more than one characteristics, then they may lead to conflicting allocation.
where
C : total cost
C0 : overhead cost, e.g., setting up the office, training people etc
To find ni
under this cost function, consider the Lagrangian function with a Lagrangian
multiplier as
Var( y ) 2 (C C )
2 1 1 2
k st k
2
wi n Si Ci ni
N
i1 i i i1
2 2
k
wi Si k k
w2iS 2i
Cn
2
i1 i i
i1 i1 Ni
ni wS 2
k C n terms independent of n .
i i i i i
ni
i1
wi Si
ni Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Sampling
Page 14
Ci ni for all i
or ni wi Si
1 .
Ci
So Cn k
i i
C*
0
i1
k wi
or C*
Si 0
i1
Ci Ci
Ci wi Si
k
or i1
.
C0 *
1 wi
Substituting in the expression for ni , the optimum ni is obtained as
C i
Si
* wS C*
ni i i k 0
.
Ci Ci wi Si
i1
The required sample size to estimate Y such that the variance is minimum for the given cost C C* is
0
k
*
n i1 i
Ci
i1 ni
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 16
k 2 2
i1
Ni S w
or wS
2
V
2
i i
0
i1 wi Si i
i1 Ni
i1
wS
wSCk
i i i
n~i i 1
w2S 2 .
i i
Ci
V0 i i
k
i1 Ni
So the required sample size to estimate Y such that cost C is minimum for a
prespecified variance V0 is n
k n~ .
i
i1
Sample size under proportional allocation for fixed cost and for fixed variance
(i) If cost C C0 is fixed then C0 C n
k
i1
i i .
C0
or n .
w C k
i i
i1
Thus ni Co wi .
wiCi
The required sample size to estimate Y k
in this case is n ni .
i1
wS
2 2
i
or n i1
w Sk 2 2
V0 i i
N i1 i
k
wS
2 2
i1
or ni wi i
2 2
.
w S k
V0 i i
N i1 i
ni n Ni
N
and
Var( y)
k
Ni ni 2 2
st
wi Si
i1 N n
i i
n Ni 2
N N
Var )( y
k i
nN i Si
2
prop st
i1 N N N
i
N
i
Nn k NS2
Nn
i i
i1 N
Nn
Nn k i1
i 4 | Stratified Sampling | Shalabh, IIT Kanpur
Sampling Theory| Chapter
Page 19
w S 2.
i1
N S i i
V
(y ) k
1 1
w2 S 2
opt st
n N
i i
i 1 i i
k
w2 S 2 k
w2 S 2
i i
i i
i1 ni i1 Ni
k
k Ni Si k w2 S 2
w S i 1 i i
2 2
i1 i i nN S
i i i1 Ni
1
k N i Sk i k w
i i
i2 i2
. NS
i1 n N i1 i1 Ni
2
1 k
NS 2 1 k
2
1 k
wi S i 2k
i i
2
N n
wS wS 2
.
n N i i
N
i i
Consider
k Ni
(N 1)S 2
Y )2
(Y i1 j 1
i
Ni 2
k
i1 j i1 j 1
1
k k
(N
i1 1)S 2i
i N (Y
i1
i
Y )2
N i1 N i i1 N i
Since w (Y Y )
i1
i
2
0,
prop st opt st i i
Nn i1 i i i i
N i 1
n
1 k
k i1
2
wS 2 wi Si
n i
i1 i1
1 2
1 k wi S i S
2
ni1 n
w here
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 23
1 k )2
n i1
wi (Si S
k
S wi Si
i1
In stratified sampling,
2 Ni ni
k
2
Var( yst ) Ni ni Si .
wi
i1
Ni
ni i1
k
w2 s2 1 k
N
i i 2
ni w
i1 i1
The second term in this expression represents the reduction due to finite population correction.
assuming y is normally distributed and is well determined so that t can be read from normal
s
t V‸ar( y )
st
distribution tables. If only few degrees of freedom are provided by each stratum, then t values are
obtained from the table of student’s t-distribution.
The distribution of V‸ar ( y )is generally complex. An approximate method of assigning an effective
st
k 2
ii 2
For example, if n1
then take the revised ni ' s as
N1, n~1 N1
and
(n N1 )wi Si i 2, 3,..., k
;n~
i k
w S
i2
i i
provided n~i
for all i = 2,3,…,k.
Ni
Suppose in revised allocation, we find that n~2 N2 then the revised allocation would be
n~1 N1
n~2 N 2
(n N1 N 2 )wi Si
n~ ; i 3, 4,..., k.
wi Si
i
i3
provided n~i
for all i 3, 4,..., k.
Ni
n* N
where * denotes the summation over the strata in which n~ and n* is the revised total sample
N
i i
Pi Ai
N :Proportion of units in C in the i stratum
th
ai
p : Proportion of units in C in the sample from the ith stratum
i
ni
An estimate of population proportion based on the stratified sampling is
pst
k
Ni pi
N .
i1
Her Ni
e S2 Pi Qi
i Ni 1
where Qi 1 Pi .
Also Var( ) Ni ni
y w2 S 2 .
k
st i i
i1 Ni ni
1 k N
(N
2
n ) PQ
So Var( pst )
2 N i
N i1 i
ni i .
i1 i i
Var( ) PiQi
p w2 .
st
k
i
i1 ni
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 28
If the proportional allocation is used for ni
, then the variance of pst is
Nn 1 k
N 2PQ
Var (p )
prop st
i i i
N Nn i1 Ni 1
Nn k
Nn i1 wi PiQi
The best choice of ni such that it minimizes the variance for fixed total sample size is
n N Ni PiQi Ni 1
i i
Ni PiQi
Thus n n
Ni PiQi
i k .
Ni PiQi
i1
PiQi
nN
iC
i
ni k
.
PQ
N
i i
i
Ci
i1
V‸ar( y ) st
.
s
This gives an idea about the gain in efficiency due to stratification.
Nn 2
Since VarSRS ( y ) S , so there is a need to express in terms of S 2 . How to estimate S based
2
S2
Nn i
on a stratified sample?
(Yij Yi ) (Yi Y )
i1 j 1
k Ni k
(Y Y )2 N (Y Y )2
ij i i
i1 j 1 i1
k k
(N 1)S 2
i1 N (Y Y )2 i1
kk (N 1)S N
2
wY2 2
Y
i
i i
i i1
i1
In order to estimate S 2 , we need to estimates of S i2 , Y 2 and Y 2. We consider their estimation one by
one.
E(s2 ) S 2
i
So Sˆ s 2 .
2
i
An unbiased estimate of Y 2 is
ˆ
Y 2 y 2 V‸ar( y )
i
Ni ni 2
i
y 2
i
s.
i i
Ni ni
E( y ) Y 2 2
s
Y 2 E( y 2 ) Var( y )
st st
(N 1)S 2
(N
k1)S N
2
w Y 2Y 2
ˆ2 1 k
i i i i
i1
N
i1
k
ˆ 2
k
ˆ2 ˆ 2
as ( Ni w
S 1)S
Ni NY 1 Y
i1 i i
1 k
2 N k 2 Ni ni 2
k Ni ni 2 2
Ni 1 si N 1 wi yi 2 wi si
N n1 si yst
n N
i1 i1 i i i1 i i
1 k
2 N k k
Ni ni 2
2
yst ) Ni 1 si wi ( yi wi (1 wi ) si .
N 1 i 1 N 1 i1 n i1 N
i i
Thus
N n ˆ2
V‸ar SRS ( y ) S
Nn
Nn N ni 2
k N ( N n)
2 w ( y y )2 k w (1 w ) i s
( i 1)s nN (N 1)
N (N 1)n
k
i
i
i1
i1
i i st
i1
i i
Nn
i i
and k
‸ ni Ni 2 2
Var( yst ) wi si .
Ni ni
i1
V‸ar( y ) st
,
s
the gain in efficiency due to stratification can be obtained.
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 32
If any other particular allocation is used, then substituting the appropriate ni under that allocation, such
The subsamples need not necessarily be independent. The assumption of independent subsamples helps
in obtaining an unbiased estimate of the variance of the composite estimator. This is even helpful if the
sample design is complicated and the expression for variance of the composite estimator is complex.
Then
E(ˆ) E(t )
and
‸ ˆ ‸
g
1 2
Var( ) Var(t )
g(g 1)
(t
j j t) .
Note that
‸
1 g 2 2
(t g(t )
g(g) 1) j
E Var(t ) E
1 g
g(g 1)
Var(t j ) g Var(t )
1
(g 2 g)Var(t ) Var(t )
g(g 1)
If the distribution of each estimator tj is symmetric about , then the confidence interval of can be
obtained by
g 1
1
P Min(t1 , t2 ,..., t g ) Max(t1 , t2 ,..., t g ) 1 .
2
Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur
Page 34
Implementation of interpenetrating subsamples in stratified sampling
ˆ
Le ij (tot
)
be an unbiased estimator of the total of jth stratum based on the ith subsample ,
i = 1,2,...,L; j = 1,2,...,k.
ˆ J ˆ
Yj) (tot 1 Yij (tot )
L i1
‸ˆ
j (tot ) 1 L ˆ ˆ 2
Var ) L(L 1) ( ij) (tot j) (tot ) .
i1
V‸ar(Yˆ ) k V‸ar(Yˆ )
tot j (tot )
j 1
1 L
L(L 1) i1 (ˆij (tot ˆj (tot )2.
k
) j )
1
Note: This topic is to be read after the next module on ratio method of estimation. Since it is related to
the stratification, so it is given here.
In post-stratification,
draw a sample by simple random sampling from the population and carry out the survey.
After the completion of the survey, stratify the sampling units to increase the precision of the
estimates.
Assume that the stratum size Ni is fairly accurately known. Let
m n.
i1
i
Note that mi is a random variable (and that is why we are not using the symbol ni as earlier).
Assume n is large enough or the stratification is such that the probability that some mi 0 is negligibly
small. In case, mi 0 for some strata, two or more strata can be combined to make the sample size non-
zero before evaluating the final estimates.
ypost 1 k
N i1
N y i i .
Now
k
1
E( y ) E
N E( y m , m ,..., m )
1 k
N i
1k i1
post
E
NY
N i1 i i
Y
E wi
m Si Var(Y )
N
i1 i i
k
1 1
w E
2 2
y yj
Y Y j
Rˆ j 1 R j 1
n
, N
.
x X
x
j 1
j X
j 1
j
We know that
E(Rˆ ) R
N RS X SXY
2
n . .
Nn X 2
0 otherwise
and
y j 1for all j = 1,2,...,N.
y n
j
Rˆ j 1 n
x n
j n
i
j 1
Y N
j
j 1 N
R X N
j
i
N
j 1
Thus
N
1 N (N n)(N Ni ) 1
1E
N nN n2 N 2 (N 1) N
ni i i i i
N 1
(N n)N 1 .
n(N 1)N Nn n
i i
Replacing mi in place of ni , we obtain
1 N
1E (N n)N 1 1
m N n(N 1)N N n n
i i i i
Now substitute this in the expression of Var( ypost ) as
1 1 2
Var( y ) k 2
S
w
posti E m N i
i1
i i
k
N n N N 1
w i S i (N 1)n . N 1 nN n
2 2
i1 i i
N n k w2S 2 1 1 1 1
n(N 1) i i w nw n
i1
i i
k N n 1
2
n (N 1)
2
wi Si n 1
i1 w
i
Nn k
n2 (N 1) i1 (nw i1 w )S i
2
k Nn
Nn (1 w )S 2.
wS
k
2
i i i i
n(N 1) i1 n2 (N i1
1)
Assuming N 1 N.
V(y Nn n n N (1 w )S 2
n
wS
2
)
post
Nn i1
i
n2 i1
i i
N
(1 w )S .
2
yst ) Nn2 i1
i
Nn k Nn 2 k
Nn2 (1 w )S S (k 1) w 1)
2
k 1 N n w 2
n Nn S
k 1
Var( y ).
st
n
n
The increase in the variance over Var (y is small if the average sample size n per stratum is
)
prop st
2
reasonably large.
Thus a post-stratification with a large sample produces an estimator which is almost as precise as an
estimator in the stratified sampling with proportional allocation.