
Understanding Systematic Sampling Techniques

This document discusses systematic sampling, which is a simplified form of probability sampling. It provides the basic idea of how systematic sampling works by selecting units at regular intervals from a population. The document also discusses the advantages and disadvantages of systematic sampling, as well as various applications of it such as in opinion polls, quality control, auditing, market research, and health studies.

Uploaded by

Abdul mannan

Systematic Sampling

Basic Idea of Systematic Sampling:


Suppose a sample of $n$ units is to be selected from a population of $N$ units, numbered from 1 to $N$ in some order. Let $N = nk$, where $k$ is an integer called the sampling interval. To select a sample of $n$ units, choose a unit at random from the first $k$ units and take every $k$-th unit thereafter. Thus, if the unit selected at random from the first $k$ units is numbered $r$, the sample consists of the units numbered $r,\ r + k,\ r + 2k,\ \ldots,\ r + (n-1)k$.
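The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming $N$ is an exact multiple of $n$; the function name `linear_systematic_sample` and its `rng` parameter are hypothetical and not part of the original notes.

```python
import random

def linear_systematic_sample(N, n, rng=random):
    """Return the 1-based unit numbers r, r+k, ..., r+(n-1)k (assumes N = n*k)."""
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N is a positive multiple of n")
    k = N // n                 # sampling interval
    r = rng.randint(1, k)      # random start among the first k units
    return [r + j * k for j in range(n)]

# Example: N = 20 units, n = 4, so k = 5 and the sample is r, r+5, r+10, r+15.
print(linear_systematic_sample(20, 4, random.Random(0)))
```

Once the random start `r` is drawn, every other unit in the sample is determined, which is why only $k$ distinct samples are possible.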

Linear Systematic Sampling:


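$$
\begin{array}{c|l|c|l}
\text{Sample no.} & \text{Sample observations} & \text{Probability} & \text{Sample mean} \\
\hline
1 & y_1,\ y_{k+1},\ y_{2k+1},\ \ldots,\ y_{(n-1)k+1} & \dfrac{1}{k} & \bar{y}_1 = \dfrac{1}{n}\sum_{j=0}^{n-1} y_{jk+1} \\
2 & y_2,\ y_{k+2},\ y_{2k+2},\ \ldots,\ y_{(n-1)k+2} & \dfrac{1}{k} & \bar{y}_2 = \dfrac{1}{n}\sum_{j=0}^{n-1} y_{jk+2} \\
\vdots & \vdots & \vdots & \vdots \\
i & y_i,\ y_{k+i},\ y_{2k+i},\ \ldots,\ y_{(n-1)k+i} & \dfrac{1}{k} & \bar{y}_i = \dfrac{1}{n}\sum_{j=0}^{n-1} y_{jk+i} \\
\vdots & \vdots & \vdots & \vdots \\
k & y_k,\ y_{2k},\ y_{3k},\ \ldots,\ y_{nk} & \dfrac{1}{k} & \bar{y}_k = \dfrac{1}{n}\sum_{j=0}^{n-1} y_{(j+1)k}
\end{array}
$$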

Systematic Sampling:
Systematic sampling is a simpler and more versatile form of probability sampling design. If a population consists of the units $y_1, y_2, \ldots, y_N$ arranged in some fixed order, and the $i$-th possible sample from the population is defined to be the subset of units $y_i,\ y_{k+i},\ \ldots,\ y_{(n-1)k+i}$, $i = 1, 2, \ldots, k$, then the subset so selected constitutes the $i$-th systematic sample of size $n$.
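The mean of the $i$-th systematic sample is obtained as follows:

$$\bar{y}_i = \frac{1}{n}\sum_{j=0}^{n-1} y_{jk+i}$$

The mean of the $k$ possible sample means, $E(\bar{y}_{sy})$, is

$$
\begin{aligned}
E(\bar{y}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k} \bar{y}_i \\
&= \frac{1}{k}\sum_{i=1}^{k} \frac{1}{n}\sum_{j=0}^{n-1} y_{jk+i} \\
&= \frac{1}{nk}\sum_{i=1}^{k}\sum_{j=0}^{n-1} y_{jk+i} = \bar{Y}
\end{aligned}
$$

This shows that when $N = nk$, $\bar{y}_{sy}$ is an unbiased estimator of $\bar{Y}$.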
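The unbiasedness result can be checked numerically by listing all $k$ possible samples. The sketch below uses a small made-up population and exact fractions; the helper names `systematic_samples` and `mean` are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

def systematic_samples(y, n):
    """All k possible systematic samples of size n from the ordered list y (assumes N = n*k)."""
    N = len(y)
    if N % n != 0:
        raise ValueError("this sketch assumes N = n*k")
    k = N // n
    # 0-based start i picks y_{i+1}, y_{k+i+1}, ... in the notes' 1-based numbering
    return [y[i::k] for i in range(k)]

def mean(values):
    return Fraction(sum(values), len(values))

y = [3, 8, 1, 9, 4, 7, 2, 6, 5, 10, 12, 11]    # made-up population, N = 12
samples = systematic_samples(y, n=4)            # k = 3 possible samples
sample_means = [mean(s) for s in samples]
expected = sum(sample_means) / len(samples)     # each sample has probability 1/k
print(sample_means, expected, mean(y))
```

For this population the three sample means are 6, 15/2 and 6, and their average equals the population mean 13/2.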

Advantages of Systematic Sampling:


• Operational Convenience: Systematic sampling is easy to draw and often easier to execute in the field.
• Field Control: From the point of view of the control over fieldwork, systematic sampling offers great
advantages over simple random sampling.
• Less Non-Sampling Error: Because systematic sampling is simpler to perform, it is less subject to interviewer error than either simple random sampling or stratified sampling, even when the survey personnel are less efficient.
• Reduced Cost: The simpler selection and fieldwork can greatly reduce the cost of the survey.
• Greater Efficiency: Systematic sampling often provides greater precision than simple random sampling or stratified sampling, except when the list is randomly ordered.

Disadvantages of Systematic Sampling:


• Effect of Periodicity: When there is hidden periodicity in the population, or a cyclical or periodic movement in the data, the systematic sample is less representative than a simple random sample.
• Effect of Trend: When there is a monotonic trend in the population elements, a systematic sample performs less well; for a linear trend it is less precise than a stratified random sample, though still more precise than a simple random sample.
• Effect of Ordering: A major limitation of systematic sampling is that unless some assumption is made about
the ordering of the list, the variability among the values of the sampled elements does not provide a basis for
estimating the variability of the sampling distribution.
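The effect of periodicity can be illustrated with a small made-up population whose values repeat every four units. The sketch below, with the hypothetical helper `var_of_systematic_mean`, computes the variance of the systematic sample mean over the $k$ equally likely samples for two choices of the sampling interval.

```python
from fractions import Fraction

def var_of_systematic_mean(y, k):
    """Variance of the systematic sample mean over the k equally likely samples."""
    if len(y) % k != 0:
        raise ValueError("this sketch assumes N = n*k")
    n = len(y) // k
    means = [Fraction(sum(y[i::k]), n) for i in range(k)]
    grand = sum(means) / k
    return sum((m - grand) ** 2 for m in means) / k

y = [10, 20, 30, 40] * 6                  # made-up population with period 4, N = 24
print(var_of_systematic_mean(y, k=4))     # interval equal to the period
print(var_of_systematic_mean(y, k=3))     # interval out of step with the period
```

With $k = 4$ every sample contains a single repeated value, so the variance is 125, the full population variance; with $k = 3$ every sample covers the cycle evenly and the variance is 0.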

Applications of Systematic Sampling:


• In Gallup Poll: Systematic sampling is a common sampling design in many opinion surveys.
• In Quality Control: Systematic sampling is used extensively in manufacturing industries for the statistical quality control of their products.
• In Auditing: In auditing accounts to check compliance, the most natural way to take a sample from the list of accounts is systematic sampling.
• In Market Research: Systematic sampling has wide applications in market research.
• In Crop Estimation: Systematic sampling has been used extensively in crop yield estimation.
• In Health Studies: In recent years, UNICEF and WHO have used systematic sampling extensively worldwide in surveys of child nutrition and iodine deficiency disorders to assess the prevalence of these conditions.

Sample Mean and Its Variance:


Let $y_{ij}$ denote the $j$-th member of the $i$-th systematic sample, so that $j = 1, 2, \ldots, n$ and $i = 1, 2, \ldots, k$. The mean of the $i$-th sample is denoted by $\bar{y}_{i.}$ and is defined as

$$\bar{y}_{i.} = \frac{1}{n}\sum_{j=1}^{n} y_{ij}$$

The population mean is

$$\bar{Y} = \frac{1}{nk}\sum_{i=1}^{k}\sum_{j=1}^{n} y_{ij}$$

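The variance of the systematic sample mean is given by

$$
\begin{aligned}
\operatorname{var}(\bar{y}_{sy}) &= E\left(\bar{y}_{i.} - \bar{Y}\right)^2 = \frac{1}{k}\sum_{i=1}^{k}\left(\bar{y}_{i.} - \bar{Y}\right)^2 \\
&= \frac{1}{n^2 k}\sum_{i=1}^{k}\left(n\bar{y}_{i.} - n\bar{Y}\right)^2 \\
&= \frac{1}{n^2 k}\sum_{i=1}^{k}\left(n\cdot\frac{1}{n}\sum_{j=1}^{n} y_{ij} - n\cdot\frac{1}{nk}\sum_{i'=1}^{k}\sum_{j=1}^{n} y_{i'j}\right)^2 \\
&= \frac{1}{n^2 k}\sum_{i=1}^{k}\left(y_{i.} - \frac{\sum_{i'=1}^{k} y_{i'.}}{k}\right)^2 \\
&= \frac{1}{n^2 k}\left[\sum_{i=1}^{k} y_{i.}^2 - \frac{\left(\sum_{i=1}^{k} y_{i.}\right)^2}{k}\right]
\end{aligned}
$$

where $y_{i.} = \sum_{j=1}^{n} y_{ij}$ is the total of the $i$-th sample.

For the estimated total $\hat{Y}_{sy} = N\bar{y}_{sy}$, the variance is $N^2 \operatorname{var}(\bar{y}_{sy})$; since $N^2/(n^2 k) = N/n$ when $N = nk$,

$$\operatorname{var}(\hat{Y}_{sy}) = \frac{N}{n}\left[\sum_{i=1}^{k} y_{i.}^2 - \frac{\left(\sum_{i=1}^{k} y_{i.}\right)^2}{k}\right]$$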

Theorem 1: In systematic sampling with interval $k$, the sample mean $\bar{y}_{sy}$ is an unbiased estimator of the population mean $\bar{Y}$.

Proof:
In systematic sampling, the whole sample becomes fixed as soon as the first unit is selected. Since the first unit is chosen at random from the first $k$ units, each of the $k$ possible samples has the same probability $\frac{1}{k}$ of being selected. By definition, the mean of the $i$-th systematic sample is

$$\bar{y}_{i.} = \frac{1}{n}\sum_{j=1}^{n} y_{ij}$$

Hence

$$
\begin{aligned}
E(\bar{y}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k}\bar{y}_{i.} \\
&= \frac{1}{k}\sum_{i=1}^{k}\frac{1}{n}\sum_{j=1}^{n} y_{ij} \\
&= \frac{1}{nk}\sum_{i=1}^{k}\sum_{j=1}^{n} y_{ij} = \bar{Y}
\end{aligned}
$$

So the sample mean $\bar{y}_{sy}$ is an unbiased estimator of the population mean $\bar{Y}$.

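Theorem 2: The variance of the mean of a systematic sample is

$$\operatorname{var}(\bar{y}_{sy}) = \frac{N-1}{N}S^2 - \frac{k(n-1)}{N}S^2_{w(sy)}$$

where

$$S^2_{w(sy)} = \frac{1}{k(n-1)}\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{i.}\right)^2 \qquad (1)$$

is the variance among units that lie within the same systematic sample.

Proof:
By definition, the variance of $\bar{y}_{sy}$ is

$$\operatorname{var}(\bar{y}_{sy}) = E\left(\bar{y}_{i.} - \bar{Y}\right)^2 = \frac{1}{k}\sum_{i=1}^{k}\left(\bar{y}_{i.} - \bar{Y}\right)^2 \qquad (2)$$

Now consider the usual way of partitioning the total sum of squares:

$$
\begin{aligned}
(N-1)S^2 &= \sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{Y}\right)^2 \\
&= n\sum_{i=1}^{k}\left(\bar{y}_{i.} - \bar{Y}\right)^2 + \sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{i.}\right)^2 \\
&= nk\operatorname{var}(\bar{y}_{sy}) + k(n-1)S^2_{w(sy)}
\end{aligned}
$$

Dividing by $nk = N$ and rearranging,

$$\operatorname{var}(\bar{y}_{sy}) = \frac{N-1}{N}S^2 - \frac{k(n-1)}{N}S^2_{w(sy)}$$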

Theorem 3: The mean of a systematic sample is more precise than the mean of a simple random sample if and only if

$$S^2_{w(sy)} > S^2$$

Proof:
If $\bar{y}$ is the mean of a simple random sample of size $n$, then

$$\operatorname{var}(\bar{y}) = \frac{N-n}{N}\cdot\frac{S^2}{n}$$

while the variance of $\bar{y}_{sy}$ is

$$\operatorname{var}(\bar{y}_{sy}) = \frac{N-1}{N}S^2 - \frac{k(n-1)}{N}S^2_{w(sy)}$$

Thus $\operatorname{var}(\bar{y}_{sy}) < \operatorname{var}(\bar{y})$ if and only if

$$
\begin{aligned}
& \frac{N-1}{N}S^2 - \frac{k(n-1)}{N}S^2_{w(sy)} < \frac{N-n}{N}\cdot\frac{S^2}{n} \\
\Rightarrow\ & \frac{k(n-1)}{N}S^2_{w(sy)} > \frac{N-1}{N}S^2 - \frac{N-n}{N}\cdot\frac{S^2}{n} \\
\Rightarrow\ & k(n-1)S^2_{w(sy)} > \left[N - 1 - \frac{N-n}{n}\right]S^2 \\
\Rightarrow\ & k(n-1)S^2_{w(sy)} > \left[kn - 1 - \frac{kn-n}{n}\right]S^2 \\
\Rightarrow\ & k(n-1)S^2_{w(sy)} > k(n-1)S^2 \\
\Rightarrow\ & S^2_{w(sy)} > S^2
\end{aligned}
$$

which states that systematic sampling is more precise than simple random sampling if the variance within the systematic samples is larger than the population variance as a whole.
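Theorems 2 and 3 can be checked numerically on small populations. The sketch below is illustrative only: the function name `variance_terms` and both populations are made up. It computes the variance of the systematic mean directly, recomputes it from the Theorem 2 decomposition, and compares it with the simple random sampling variance.

```python
from fractions import Fraction

def variance_terms(y, n):
    """Exact variance terms for an ordered population y (assumes N = n*k, n >= 2)."""
    N = len(y)
    if n < 2 or N % n != 0:
        raise ValueError("this sketch assumes N = n*k with n >= 2")
    k = N // n
    Y_bar = Fraction(sum(y), N)
    samples = [y[i::k] for i in range(k)]
    means = [Fraction(sum(s), n) for s in samples]
    var_sy = sum((m - Y_bar) ** 2 for m in means) / k          # direct definition
    S2 = sum((v - Y_bar) ** 2 for v in y) / (N - 1)            # population variance
    S2_w = sum((v - m) ** 2 for s, m in zip(samples, means) for v in s) / (k * (n - 1))
    theorem2 = Fraction(N - 1, N) * S2 - Fraction(k * (n - 1), N) * S2_w
    var_srs = Fraction(N - n, N) * S2 / n                      # simple random sampling
    return {"var_sy": var_sy, "theorem2": theorem2, "var_srs": var_srs,
            "S2": S2, "S2_w": S2_w}

trend = list(range(1, 13))                              # each sample spans the whole range
clustered = [1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12]     # each sample is homogeneous
for name, pop in (("trend", trend), ("clustered", clustered)):
    print(name, variance_terms(pop, n=4))
```

In the first ordering $S^2_{w(sy)} = 15 > S^2 = 13$ and the systematic mean has the smaller variance (2/3 against 13/6); in the second $S^2_{w(sy)} = 5/3 < 13$ and the systematic mean is worse (32/3 against 13/6).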
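Theorem 4: An alternative form of the variance of $\bar{y}_{sy}$ is

$$\operatorname{var}(\bar{y}_{sy}) = \frac{S^2}{n}\cdot\frac{N-1}{N}\left[1 + (n-1)\rho_w\right] = \frac{\sigma^2}{n}\left[1 + (n-1)\rho_w\right]$$

where $\sigma^2 = \frac{N-1}{N}S^2$ and $\rho_w$ is the intra-class correlation coefficient between pairs of units that are in the same systematic sample. Its value depends on the arrangement of the units in the population. It is defined as

$$\rho_w = \frac{E\left(y_{ij} - \bar{Y}\right)\left(y_{iu} - \bar{Y}\right)}{E\left(y_{ij} - \bar{Y}\right)^2} = \frac{2}{(n-1)(N-1)S^2}\sum_{i=1}^{k}\sum_{j<u}\left(y_{ij} - \bar{Y}\right)\left(y_{iu} - \bar{Y}\right) \qquad (1)$$

Proof:
By definition,

$$\operatorname{var}(\bar{y}_{sy}) = \frac{1}{k}\sum_{i=1}^{k}\left(\bar{y}_{i.} - \bar{Y}\right)^2$$

Therefore

$$
\begin{aligned}
n^2 k\operatorname{var}(\bar{y}_{sy}) &= n^2\sum_{i=1}^{k}\left(\bar{y}_{i.} - \bar{Y}\right)^2 = \sum_{i=1}^{k}\left(n\bar{y}_{i.} - n\bar{Y}\right)^2 \\
&= \sum_{i=1}^{k}\left(\sum_{j=1}^{n} y_{ij} - n\bar{Y}\right)^2 \\
&= \sum_{i=1}^{k}\left[\left(y_{i1} - \bar{Y}\right) + \left(y_{i2} - \bar{Y}\right) + \cdots + \left(y_{in} - \bar{Y}\right)\right]^2 \\
&= \sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{Y}\right)^2 + 2\sum_{i=1}^{k}\sum_{j<u}\left(y_{ij} - \bar{Y}\right)\left(y_{iu} - \bar{Y}\right) \\
&= (N-1)S^2 + (n-1)(N-1)S^2\rho_w \qquad \text{using (1)} \\
&= (N-1)S^2\left[1 + (n-1)\rho_w\right]
\end{aligned}
$$

Dividing by $n^2 k = nN$,

$$\operatorname{var}(\bar{y}_{sy}) = \frac{S^2}{n}\cdot\frac{N-1}{N}\left[1 + (n-1)\rho_w\right] = \frac{\sigma^2}{n}\left[1 + (n-1)\rho_w\right]$$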

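Theorem 5: Show that the limits of $\rho_w$ are

$$-\frac{1}{n-1} \le \rho_w \le 1$$

Proof:
We know that

$$\operatorname{var}(\bar{y}_{sy}) = \frac{N-1}{N}S^2 - \frac{n-1}{n}S^2_{w(sy)} = \sigma^2 - \sigma^2_{w(sy)} \qquad (1)$$

where $\sigma^2 = \frac{N-1}{N}S^2$ and $\sigma^2_{w(sy)} = \frac{n-1}{n}S^2_{w(sy)}$, and

$$\operatorname{var}(\bar{y}_{sy}) = \frac{\sigma^2}{n}\left[1 + (n-1)\rho_w\right] \qquad (2)$$

Comparing equations (1) and (2), it follows that

$$
\begin{aligned}
& \frac{\sigma^2}{n}\left[1 + (n-1)\rho_w\right] = \sigma^2 - \sigma^2_{w(sy)} \\
\Rightarrow\ & 1 + (n-1)\rho_w = n - \frac{n\sigma^2_{w(sy)}}{\sigma^2} \\
\Rightarrow\ & (n-1)\rho_w = n - 1 - \frac{n\sigma^2_{w(sy)}}{\sigma^2} \\
\Rightarrow\ & \rho_w = 1 - \frac{n}{n-1}\cdot\frac{\sigma^2_{w(sy)}}{\sigma^2} \qquad (3)
\end{aligned}
$$

Since $0 \le \dfrac{\sigma^2_{w(sy)}}{\sigma^2} \le 1$, it follows from equation (3) that

$$-\frac{1}{n-1} \le \rho_w \le 1$$

Note:
We have $V(\bar{y}_{sys}) \ge 0$, so from (2) we get $\rho_w \ge -\dfrac{1}{n-1}$. Thus the minimum value of $\rho_w$ is $-\dfrac{1}{n-1}$, and in this case $V(\bar{y}_{sys}) = 0$.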
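The intra-class correlation form of the variance and the bounds on $\rho_w$ can be verified on small examples. The following sketch is an illustration under the assumption $N = nk$ with a non-constant population; the function name `rho_w_check` and the test populations are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def rho_w_check(y, n):
    """Return (rho_w, variance by definition, variance via the rho_w formula)."""
    N = len(y)
    if n < 2 or N % n != 0:
        raise ValueError("this sketch assumes N = n*k with n >= 2")
    k = N // n
    Y_bar = Fraction(sum(y), N)
    samples = [y[i::k] for i in range(k)]
    S2 = sum((v - Y_bar) ** 2 for v in y) / (N - 1)    # must be non-zero
    # sum over samples of products of deviations for all pairs j < u
    cross = sum((a - Y_bar) * (b - Y_bar)
                for s in samples for a, b in combinations(s, 2))
    rho_w = 2 * cross / ((n - 1) * (N - 1) * S2)
    direct = sum((Fraction(sum(s), n) - Y_bar) ** 2 for s in samples) / k
    via_rho = (S2 / n) * Fraction(N - 1, N) * (1 + (n - 1) * rho_w)
    return rho_w, direct, via_rho

print(rho_w_check(list(range(1, 13)), n=4))        # linear trend, k = 3
print(rho_w_check([10, 20, 30, 40] * 6, n=8))      # periodic population, k = 3
```

For the periodic population every sample mean equals the population mean, so the variance is 0 and $\rho_w$ attains its lower limit $-1/(n-1) = -1/7$.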

Systematic Sampling Vs SRSWOR:


The relative efficiency of the estimate of the population mean in systematic sampling over SRSWOR is given by the expression

$$E = \frac{V(\bar{y}_n)}{V(\bar{y}_{sys})} = \frac{\dfrac{N-n}{Nn}S^2}{\dfrac{nk-1}{nk}\cdot\dfrac{S^2}{n}\left[1 + (n-1)\rho\right]} = \frac{n(k-1)}{(nk-1)\left[1 + (n-1)\rho\right]} \qquad (1)$$

where $\rho$ is the intra-class correlation coefficient between units of the same systematic sample. Obviously this depends on the value of $\rho$.

$$
\begin{aligned}
& E > 1 \\
\Rightarrow\ & \frac{n(k-1)}{(nk-1)\left[1 + (n-1)\rho\right]} > 1 \\
\Rightarrow\ & nk - n > nk - 1 + (n-1)(nk-1)\rho \\
\Rightarrow\ & -1 > (nk-1)\rho \\
\Rightarrow\ & \rho < -\frac{1}{nk-1}
\end{aligned}
$$

Thus systematic sampling would be more efficient as compared with SRSWOR if

$$\rho < -\frac{1}{nk-1}$$

On the other hand, SRSWOR would be superior to systematic sampling if

$$\rho > -\frac{1}{nk-1}$$

However, if $\rho$ assumes the minimum possible value, $\rho = -\dfrac{1}{n-1}$, then $V(\bar{y}_{sys}) = 0$ and consequently $E = \infty$. Thus in this case the reduction in $V(\bar{y}_{sys})$ over SRSWOR will be 100%.

If $\rho$ assumes the maximum value, i.e., if $\rho = 1$, then from (1) we get

$$E = \frac{k-1}{nk-1}$$
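The behaviour of the relative efficiency at the three special values of $\rho$ can be tabulated with a short function. This is a sketch only; `relative_efficiency` is a hypothetical name, and returning `float("inf")` for the zero-variance case is a convention chosen here.

```python
from fractions import Fraction

def relative_efficiency(n, k, rho):
    """E = n(k-1) / ((nk-1) * (1 + (n-1)*rho)), the efficiency of systematic sampling over SRSWOR."""
    rho = Fraction(rho)
    denom = (n * k - 1) * (1 + (n - 1) * rho)
    if denom == 0:
        return float("inf")        # rho = -1/(n-1): V(sys) = 0
    return Fraction(n * (k - 1)) / denom

n, k = 4, 3
print(relative_efficiency(n, k, Fraction(-1, n * k - 1)))   # break-even value of rho
print(relative_efficiency(n, k, 1))                         # maximum rho
print(relative_efficiency(n, k, Fraction(-1, n - 1)))       # minimum rho
```

With $n = 4$ and $k = 3$ the three calls give $E = 1$ at $\rho = -1/11$, $E = 2/11$ at $\rho = 1$, and an infinite efficiency at $\rho = -1/3$.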

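Theorem: Show that

$$V(\bar{y}_{sys}) = \frac{k-1}{nk}S^2_{wst}\left[1 + (n-1)\rho_{wst}\right]$$

where $\rho_{wst}$ is the intraclass correlation within the strata, measured in deviations from the stratum means between pairs of units in the same systematic sample. Here the $j$-th stratum consists of the $k$ units $y_{1j}, y_{2j}, \ldots, y_{kj}$, and $\bar{y}_{.j} = \frac{1}{k}\sum_{i=1}^{k} y_{ij}$ is its mean.

Proof:
We know

$$S^2_{wst} = \frac{1}{n(k-1)}\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{.j}\right)^2 \qquad (a)$$

and

$$\rho_{wst} = \frac{\sum_{i=1}^{k}\sum_{j \ne j'}\left(y_{ij} - \bar{y}_{.j}\right)\left(y_{ij'} - \bar{y}_{.j'}\right)}{n(n-1)(k-1)S^2_{wst}} \qquad (b)$$

We have,

$$
\begin{aligned}
V(\bar{y}_{sys}) &= \frac{1}{k}\sum_{i=1}^{k}\left(\bar{y}_{i.} - \bar{y}_{..}\right)^2 \\
&= \frac{1}{k}\sum_{i=1}^{k}\left[\frac{1}{n}\sum_{j=1}^{n} y_{ij} - \frac{1}{n}\sum_{j=1}^{n}\bar{y}_{.j}\right]^2 \\
&= \frac{1}{n^2 k}\sum_{i=1}^{k}\left[\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{.j}\right)\right]^2 \\
&= \frac{1}{n^2 k}\left[\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{.j}\right)^2 + \sum_{i=1}^{k}\sum_{j \ne j'}\left(y_{ij} - \bar{y}_{.j}\right)\left(y_{ij'} - \bar{y}_{.j'}\right)\right] \\
&= \frac{1}{n^2 k}\left[n(k-1)S^2_{wst} + n(n-1)(k-1)\rho_{wst}S^2_{wst}\right] \qquad \text{from (a) and (b)} \\
&= \frac{k-1}{nk}S^2_{wst}\left[1 + (n-1)\rho_{wst}\right] \qquad \text{(Proved)}
\end{aligned}
$$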
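
Systematic Sampling Vs Stratified Random Sampling:

We know,

$$V(\bar{y}_{st}) = \sum_{j=1}^{n}\left(\frac{1}{n_j} - \frac{1}{N_j}\right)p_j^2 S_j^2$$

But $N_j = k$ and $n_j = 1$ $(j = 1, 2, \ldots, n)$, and $p_j = \dfrac{N_j}{N} = \dfrac{k}{nk} = \dfrac{1}{n}$. Therefore

$$
\begin{aligned}
V(\bar{y}_{st}) &= \sum_{j=1}^{n}\left(1 - \frac{1}{k}\right)\frac{1}{n^2}S_j^2 \\
&= \frac{k-1}{n^2 k}\sum_{j=1}^{n} S_j^2 \\
&= \frac{k-1}{n^2 k}\sum_{j=1}^{n}\left[\frac{1}{k-1}\sum_{i=1}^{k}\left(y_{ij} - \bar{y}_{.j}\right)^2\right] \qquad \left[\text{since } S_j^2 = \frac{1}{k-1}\sum_{i=1}^{k}\left(y_{ij} - \bar{y}_{.j}\right)^2\right] \\
&= \frac{1}{n^2 k}\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{.j}\right)^2 \\
&= \frac{k-1}{nk}S^2_{wst} \qquad \left[\text{since } S^2_{wst} = \frac{1}{n(k-1)}\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij} - \bar{y}_{.j}\right)^2\right]
\end{aligned}
$$

Now comparing $V(\bar{y}_{sys}) = \dfrac{k-1}{nk}S^2_{wst}\left[1 + (n-1)\rho_{wst}\right]$ and $V(\bar{y}_{st}) = \dfrac{k-1}{nk}S^2_{wst}$, we get

$$E = \frac{V(\bar{y}_{st})}{V(\bar{y}_{sys})} = \frac{\dfrac{k-1}{nk}S^2_{wst}}{\dfrac{k-1}{nk}S^2_{wst}\left[1 + (n-1)\rho_{wst}\right]} = \frac{1}{1 + (n-1)\rho_{wst}}$$

Thus we see that the relative efficiency of systematic sampling over stratified random sampling depends upon the value of $\rho_{wst}$: systematic sampling is the more efficient of the two when $\rho_{wst} < 0$ and the less efficient when $\rho_{wst} > 0$.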
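The comparison with stratified random sampling (one unit drawn from each of $n$ strata of $k$ consecutive units) can be checked numerically. The sketch below is illustrative: `stratified_view` is a hypothetical name, and it assumes $N = nk$ with $n, k \ge 2$ and some variation within the strata.

```python
from fractions import Fraction

def stratified_view(y, n):
    """Compute rho_wst, V(st) and V(sys), treating each block of k consecutive units as a stratum."""
    N = len(y)
    if n < 2 or N % n != 0 or N // n < 2:
        raise ValueError("this sketch assumes N = n*k with n, k >= 2")
    k = N // n
    strata = [y[j * k:(j + 1) * k] for j in range(n)]
    stratum_means = [Fraction(sum(s), k) for s in strata]
    # d[i][j] = y_ij - ybar_.j : deviation of the i-th unit of stratum j from its stratum mean
    d = [[strata[j][i] - stratum_means[j] for j in range(n)] for i in range(k)]
    S2_wst = sum(x * x for row in d for x in row) / (n * (k - 1))
    cross = sum(row[j] * row[u] for row in d
                for j in range(n) for u in range(n) if j != u)
    rho_wst = cross / (n * (n - 1) * (k - 1) * S2_wst)
    V_st = Fraction(k - 1, n * k) * S2_wst
    V_sys_formula = V_st * (1 + (n - 1) * rho_wst)
    Y_bar = Fraction(sum(y), N)
    V_sys_direct = sum((Fraction(sum(y[i::k]), n) - Y_bar) ** 2 for i in range(k)) / k
    return {"rho_wst": rho_wst, "V_st": V_st,
            "V_sys_formula": V_sys_formula, "V_sys_direct": V_sys_direct}

print(stratified_view(list(range(1, 13)), n=4))                      # linear trend
print(stratified_view([3, 8, 1, 9, 4, 7, 2, 6, 5, 10, 12, 11], n=4)) # made-up order
```

For the linear trend the deviations from the stratum means are identical in every stratum, so $\rho_{wst} = 1$ and $E = 1/(1 + 3) = 1/4$: stratified sampling has variance 1/6 against 2/3 for systematic sampling.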

Example: If the population consists of a linear trend, then prove that

$$V(\bar{y}_{st}) \le V(\bar{y}_{sys}) \le V(\bar{y}_n)_R$$

Solution:
Let us suppose that the population has the linear trend given by the model $Y_i = i$ $(i = 1, 2, \ldots, N)$. Then

$$\sum_{i=1}^{N} Y_i = \sum_{i=1}^{N} i = \frac{N(N+1)}{2} \quad \text{and} \quad \sum_{i=1}^{N} Y_i^2 = \sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6}$$

$$\bar{Y}_N = \frac{1}{N}\sum_{i=1}^{N} Y_i = \frac{N+1}{2}$$

and

$$
\begin{aligned}
S^2 &= \frac{1}{N-1}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2 \\
&= \frac{1}{N-1}\left[\sum_{i=1}^{N} Y_i^2 - N\bar{Y}_N^2\right] \\
&= \frac{1}{N-1}\left[\frac{N(N+1)(2N+1)}{6} - \frac{N(N+1)^2}{4}\right] \\
&= \frac{N(N+1)}{12}
\end{aligned}
$$

Therefore

$$
\begin{aligned}
V(\bar{y}_n)_R &= \left(\frac{1}{n} - \frac{1}{N}\right)S^2 = \frac{N-n}{Nn}S^2 \\
&= \frac{n(k-1)}{n^2 k}\cdot\frac{nk(nk+1)}{12} \qquad [\text{since } N = nk] \\
&= \frac{(k-1)(nk+1)}{12} \qquad (1)
\end{aligned}
$$

We know,

$$V(\bar{y}_{st}) = \frac{k-1}{n^2 k}\sum_{j=1}^{n} S_j^2$$

We have $S^2 = \dfrac{N(N+1)}{12}$ for a population of $N$ units. Since the $j$-th stratum consists of $k$ units, we have $S_j^2 = \dfrac{k(k+1)}{12}$, and therefore

$$V(\bar{y}_{st}) = \frac{k-1}{n^2 k}\cdot\frac{nk(k+1)}{12} = \frac{k^2-1}{12n} \qquad (2)$$
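For finding out $V(\bar{y}_{sys})$, we have

$$
\begin{aligned}
\bar{y}_{i.} &= \text{mean of the values of the } i\text{-th sample} \\
&= \frac{1}{n}\sum_{j=1}^{n} y_{ij} \\
&= \frac{1}{n}\left[i + (i+k) + (i+2k) + \cdots + \{i + (n-1)k\}\right] \\
&= \frac{1}{n}\left[ni + \{1 + 2 + \cdots + (n-1)\}k\right] \\
&= i + \frac{(n-1)k}{2} \qquad (3)
\end{aligned}
$$

Also

$$\bar{y}_{..} = \bar{Y}_N = \frac{N+1}{2} = \frac{nk+1}{2} \qquad (4)$$

so that

$$\bar{y}_{i.} - \bar{y}_{..} = i - \frac{k+1}{2}$$

Therefore

$$
\begin{aligned}
V(\bar{y}_{sys}) &= \frac{1}{k}\sum_{i=1}^{k}\left(\bar{y}_{i.} - \bar{y}_{..}\right)^2 \\
&= \frac{1}{k}\sum_{i=1}^{k}\left(i - \frac{k+1}{2}\right)^2 \\
&= \frac{1}{k}\sum_{i=1}^{k}\left[i^2 + \left(\frac{k+1}{2}\right)^2 - 2i\cdot\frac{k+1}{2}\right] \\
&= \frac{1}{k}\sum_{i=1}^{k} i^2 + \left(\frac{k+1}{2}\right)^2 - \frac{k+1}{k}\sum_{i=1}^{k} i \\
&= \frac{(k+1)(2k+1)}{6} + \frac{(k+1)^2}{4} - \frac{(k+1)^2}{2} = \frac{k^2-1}{12} \qquad (5)
\end{aligned}
$$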
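From (1), (2) and (5), after dividing each variance by $\dfrac{k-1}{12}$, we get

$$V(\bar{y}_{st}) : V(\bar{y}_{sys}) : V(\bar{y}_n)_R \ ::\ \frac{k+1}{n} : (k+1) : (nk+1) \ \approx\ \frac{1}{n} : 1 : n \quad \text{(approx., for large } k\text{)}$$

$$\Rightarrow V(\bar{y}_{st}) \le V(\bar{y}_{sys}) \le V(\bar{y}_n)_R$$

Thus if the population is suspected of having a linear trend, then stratified random sampling is the most effective (with systematic sampling as the next best) in eliminating the effect of the linear trend.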
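The three closed-form variances for the linear-trend population $Y_i = i$ can be confirmed by brute-force computation. The sketch below is a check under the assumption $N = nk$ with $k \ge 2$; the function name `linear_trend_variances` is illustrative.

```python
from fractions import Fraction

def linear_trend_variances(n, k):
    """Return (V_st, V_sys, V_ran) for the population 1, 2, ..., N with N = n*k."""
    N = n * k
    y = list(range(1, N + 1))
    Y_bar = Fraction(N + 1, 2)
    S2 = sum((v - Y_bar) ** 2 for v in y) / (N - 1)
    V_ran = Fraction(N - n, N * n) * S2                     # SRSWOR
    strata = [y[j * k:(j + 1) * k] for j in range(n)]       # n strata of k consecutive units
    S2_j = [sum((v - Fraction(sum(s), k)) ** 2 for v in s) / (k - 1) for s in strata]
    V_st = Fraction(k - 1, n * n * k) * sum(S2_j)           # one unit per stratum
    V_sys = sum((Fraction(sum(y[i::k]), n) - Y_bar) ** 2 for i in range(k)) / k
    return V_st, V_sys, V_ran

for n, k in [(4, 3), (5, 10), (10, 20)]:
    V_st, V_sys, V_ran = linear_trend_variances(n, k)
    print(n, k, V_st, V_sys, V_ran)
```

For $n = 4$, $k = 3$ this gives 1/6, 2/3 and 13/6, matching $(k^2-1)/(12n)$, $(k^2-1)/12$ and $(k-1)(nk+1)/12$ respectively.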
