
Chapter 2

Simple Random Sampling

Simple random sampling (SRS) is a method of selecting a sample of n sampling units out of a population of N sampling units such that every sampling unit has an equal chance of being chosen.

A sample can be drawn in two possible ways:

• The sampling units are chosen without replacement, in the sense that units, once chosen, are not placed back into the population.
• The sampling units are chosen with replacement, in the sense that the chosen units are placed back into the population.

1. Simple random sampling without replacement (SRSWOR):

SRSWOR is a method of selecting n units out of the N units one by one such that, at any stage of the selection, each of the remaining units has the same chance of being selected. As shown later, the (unconditional) probability that a given unit is selected at any particular draw is 1/N.

2. Simple random sampling with replacement (SRSWR):

SRSWR is a method of selecting n units out of the N units one by one such that, at each stage of the selection, each of the N units has an equal chance of being selected, i.e., 1/N.

Procedure of selection of a random sample:

The selection of a random sample involves the following steps:
1. Identify the N units in the population with the numbers 1 to N.
2. Choose any random number arbitrarily in the random number table and start reading numbers.
3. Choose the sampling unit whose serial number corresponds to the random number drawn from the table of random numbers.
4. In the case of SRSWR, all the random numbers are accepted even if repeated more than once. In the case of SRSWOR, if any random number is repeated, it is ignored and more numbers are drawn.

Such a process can also be implemented through programming, using the discrete uniform distribution. Any number between 1 and N can be generated from this distribution, and the corresponding unit can be selected into the sample by associating an index with each sampling unit. Many statistical software packages, such as R and SAS, have built-in functions for drawing a sample using SRSWOR or SRSWR.
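As a concrete illustration, here is a minimal sketch in base R (one of the packages named above); the population size N = 50, sample size n = 5 and the seed are arbitrary choices for the example, not values prescribed by these notes.

```r
# Draw a sample of n units from a population whose units are
# identified by the labels 1 to N.
N <- 50
n <- 5

set.seed(1)                               # for reproducibility
srswor <- sample(1:N, n, replace = FALSE) # SRSWOR: no unit can repeat
srswr  <- sample(1:N, n, replace = TRUE)  # SRSWR: units may repeat

srswor
srswr
```

The same `sample()` call implements both schemes; only the `replace` argument changes.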

Notations:
The following notations will be used in these notes:

$N$: number of sampling units in the population (population size)

$n$: number of sampling units in the sample (sample size)

$Y$: the characteristic under consideration

$Y_i$: value of the characteristic for the $i$-th unit of the population

$\bar{y} = \dfrac{1}{n}\sum\limits_{i=1}^{n} y_i$: sample mean

$\bar{Y} = \dfrac{1}{N}\sum\limits_{i=1}^{N} Y_i$: population mean

$S^2 = \dfrac{1}{N-1}\sum\limits_{i=1}^{N}(Y_i-\bar{Y})^2 = \dfrac{1}{N-1}\left(\sum\limits_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right)$

$\sigma^2 = \dfrac{1}{N}\sum\limits_{i=1}^{N}(Y_i-\bar{Y})^2 = \dfrac{1}{N}\left(\sum\limits_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right)$

$s^2 = \dfrac{1}{n-1}\sum\limits_{i=1}^{n}(y_i-\bar{y})^2 = \dfrac{1}{n-1}\left(\sum\limits_{i=1}^{n} y_i^2 - n\bar{y}^2\right)$

Probability of drawing a sample:


1. SRSWOR:
If n units are selected by SRSWOR, the total number of possible samples is $\binom{N}{n}$. So the probability of selecting any one of these samples is $\dfrac{1}{\binom{N}{n}}$.

Note that a unit can be selected at any one of the n draws. Let $u_i$ be the $i$-th unit selected in the sample. This unit can be selected in the sample either at the first draw, the second draw, …, or the $n$-th draw. Let $P_j(i)$ denote the probability of selection of $u_i$ at the $j$-th draw, $j = 1, 2, \ldots, n$. Then

$P(i) = P_1(i) + P_2(i) + \cdots + P_n(i) = \underbrace{\frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N}}_{n \text{ times}} = \frac{n}{N}.$

Now if $u_1, u_2, \ldots, u_n$ are the n units selected in the sample, then the probability of their selection is

$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n).$

Note that when the second unit is to be selected, there are (n − 1) units left to be selected in the sample from a population of (N − 1) units. Similarly, when the third unit is to be selected, there are (n − 2) units left to be selected in the sample from a population of (N − 2) units, and so on.
If $P(u_1) = \dfrac{n}{N}$, then

$P(u_2) = \frac{n-1}{N-1}, \ldots, P(u_n) = \frac{1}{N-n+1}.$

Thus

$P(u_1, u_2, \ldots, u_n) = \frac{n}{N}\cdot\frac{n-1}{N-1}\cdot\frac{n-2}{N-2}\cdots\frac{1}{N-n+1} = \frac{1}{\binom{N}{n}}.$
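This result can be checked empirically. The following sketch (with an arbitrary toy setting N = 5, n = 2 and 10^5 replications) tabulates the relative frequency of each unordered SRSWOR sample, which should be close to $1/\binom{5}{2} = 1/10$ for all ten samples.

```r
# Empirical check that every unordered SRSWOR sample of size n
# has the same probability 1/choose(N, n).
N <- 5; n <- 2
set.seed(2)
reps <- 100000
draws <- replicate(reps,
                   paste(sort(sample(1:N, n)), collapse = "-"))
round(table(draws) / reps, 3)  # each frequency should be near 0.1
1 / choose(N, n)               # theoretical value
```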

Alternative approach:
The probability of drawing a sample in SRSWOR can alternatively be found as follows:

Let $u_{i(k)}$ denote the $i$-th unit drawn at the $k$-th draw. Note that the $i$-th unit can be any unit out of the N units. Then $s_o = (u_{i(1)}, u_{i(2)}, \ldots, u_{i(n)})$ is an ordered sample in which the order of the units in which they are drawn, i.e., $u_{i(1)}$ drawn at the first draw, $u_{i(2)}$ drawn at the second draw, and so on, is also considered. The probability of selection of such an ordered sample is

$P(s_o) = P(u_{i(1)})\,P(u_{i(2)} \mid u_{i(1)})\,P(u_{i(3)} \mid u_{i(1)} u_{i(2)})\cdots P(u_{i(n)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(n-1)}).$

Here $P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)})$ is the probability of drawing $u_{i(k)}$ at the $k$-th draw given that $u_{i(1)}, u_{i(2)}, \ldots, u_{i(k-1)}$ have already been drawn in the first (k − 1) draws.

Such a probability is obtained as

$P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)}) = \frac{1}{N-k+1}.$

So

$P(s_o) = \prod_{k=1}^{n} \frac{1}{N-k+1} = \frac{(N-n)!}{N!}.$

The number of ways in which a sample of size n can be drawn is $n!$.

Probability of drawing a sample in a given order $= \dfrac{(N-n)!}{N!}$.

So the probability of drawing a sample in which the order of the units in which they are drawn is irrelevant is

$n! \cdot \frac{(N-n)!}{N!} = \frac{1}{\binom{N}{n}}.$

2. SRSWR
When n units are selected with SRSWR, the total number of possible samples is $N^n$, so the probability of drawing any particular sample is $\dfrac{1}{N^n}$.

Alternatively, let $u_i$ be the $i$-th unit selected in the sample. This unit can be selected in the sample either at the first draw, the second draw, …, or the $n$-th draw. At any stage, there are always N units in the population in the case of SRSWR, so the probability of selection of $u_i$ at any stage is 1/N for all $i = 1, 2, \ldots, n$. Then the probability of selection of the n units $u_1, u_2, \ldots, u_n$ in the sample is

$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n) = \frac{1}{N}\cdot\frac{1}{N}\cdots\frac{1}{N} = \frac{1}{N^n}.$

Probability of drawing a unit

1. SRSWOR
Let $A_r$ denote the event that a particular unit $u_j$ is not selected at the $r$-th draw. The probability of selecting, say, the $j$-th unit at the $k$-th draw is

$P(\text{selection of } u_j \text{ at the } k\text{-th draw}) = P(A_1 \cap A_2 \cap \cdots \cap A_{k-1} \cap \bar{A}_k)$
$= P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 A_2)\cdots P(A_{k-1} \mid A_1 A_2 \cdots A_{k-2})\,P(\bar{A}_k \mid A_1 A_2 \cdots A_{k-1})$
$= \left(1-\frac{1}{N}\right)\left(1-\frac{1}{N-1}\right)\left(1-\frac{1}{N-2}\right)\cdots\left(1-\frac{1}{N-k+2}\right)\cdot\frac{1}{N-k+1}$
$= \frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdots\frac{N-k+1}{N-k+2}\cdot\frac{1}{N-k+1}$
$= \frac{1}{N}.$

2. SRSWR

$P[\text{selection of } u_j \text{ at the } k\text{-th draw}] = \frac{1}{N}.$
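The SRSWOR result above can also be verified by simulation; the sketch below uses arbitrary values N = 10, j = 3, k = 4 (under SRSWR the check is immediate, since every draw is uniform over all N units).

```r
# Empirical check that a fixed unit u_j appears at the k-th draw
# of an SRSWOR sequence with probability 1/N.
N <- 10; j <- 3; k <- 4
set.seed(3)
reps <- 100000
hits <- replicate(reps, sample(1:N, k)[k] == j)
mean(hits)  # should be close to 1/N = 0.1
```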

Estimation of population mean and population variance

One of the main objectives after the selection of a sample is to learn about the tendency of the data to cluster around a central value and the scatter of the data around it. Among the various indicators of central tendency and dispersion, the popular choices are the arithmetic mean and the variance. So the population mean and the population variability are generally measured by the arithmetic mean (or weighted arithmetic mean) and the variance, respectively. There are various popular estimators for estimating the population mean and the population variance. Among them, the sample arithmetic mean and the sample variance are more popular than other estimators. One reason to use these estimators is that they possess nice statistical properties. Moreover, they are also obtained through well-established statistical estimation procedures, such as maximum likelihood estimation, least squares estimation and the method of moments, under several standard statistical distributions. One may also consider other indicators, like the median, mode, geometric mean and harmonic mean, for measuring the central tendency, and the mean deviation, absolute deviation, Pitman nearness etc. for measuring the dispersion. The properties of such estimators can be studied by numerical procedures like bootstrapping.
1. Estimation of population mean

Let us consider the sample arithmetic mean $\bar{y} = \dfrac{1}{n}\sum\limits_{i=1}^{n} y_i$ as an estimator of the population mean $\bar{Y} = \dfrac{1}{N}\sum\limits_{i=1}^{N} Y_i$ and verify that $\bar{y}$ is an unbiased estimator of $\bar{Y}$ under the two cases.

SRSWOR

Let $t_i = \sum_{j=1}^{n} y_j$ be the total of the $i$-th sample, $i = 1, 2, \ldots, \binom{N}{n}$. Then

$E(\bar{y}) = \frac{1}{n}\,E\left(\sum_{i=1}^{n} y_i\right) = \frac{1}{n}\cdot\frac{1}{\binom{N}{n}}\sum_{i=1}^{\binom{N}{n}} t_i = \frac{1}{n}\cdot\frac{1}{\binom{N}{n}}\sum_{i=1}^{\binom{N}{n}}\left(\sum_{j=1}^{n} y_j\right).$

When n units are sampled out of N units without replacement, each unit of the population can occur with the other units selected out of the remaining (N − 1) units of the population, and each unit occurs in $\binom{N-1}{n-1}$ of the $\binom{N}{n}$ possible samples. So

$\sum_{i=1}^{\binom{N}{n}}\left(\sum_{j=1}^{n} y_j\right) = \binom{N-1}{n-1}\sum_{i=1}^{N} Y_i.$

Now

$E(\bar{y}) = \frac{1}{n}\cdot\frac{n!(N-n)!}{N!}\cdot\frac{(N-1)!}{(n-1)!(N-n)!}\sum_{i=1}^{N} Y_i = \frac{1}{N}\sum_{i=1}^{N} Y_i = \bar{Y}.$

Thus $\bar{y}$ is an unbiased estimator of $\bar{Y}$.


Alternatively, the following approach can also be adopted to show the unbiasedness property:

$E(\bar{y}) = \frac{1}{n}\sum_{j=1}^{n} E(y_j) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{N} Y_i\,P_j(i) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{N} Y_i\cdot\frac{1}{N} = \frac{1}{n}\sum_{j=1}^{n}\bar{Y} = \bar{Y},$

where $P_j(i)$ denotes the probability of selection of the $i$-th unit at the $j$-th stage.

SRSWR

$E(\bar{y}) = \frac{1}{n}\,E\left(\sum_{i=1}^{n} y_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(y_i) = \frac{1}{n}\sum_{i=1}^{n}\left(Y_1 P_1 + Y_2 P_2 + \cdots + Y_N P_N\right) = \frac{1}{n}\sum_{i=1}^{n}\bar{Y} = \bar{Y},$

where $P_i = \dfrac{1}{N}$ for all $i = 1, 2, \ldots, N$ is the probability of selection of a unit. Thus $\bar{y}$ is an unbiased estimator of the population mean under SRSWR also.
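Both unbiasedness results can be checked on a toy population (the six values below are arbitrary illustrations). Under SRSWOR the expectation can be computed exactly by enumerating all $\binom{6}{3} = 20$ samples; under SRSWR it is approximated by simulation.

```r
# Check E(ybar) = Ybar on a small population.
Y <- c(12, 7, 9, 15, 4, 11)
N <- length(Y); n <- 3

# SRSWOR: exact, by averaging ybar over all equally likely samples
all_means <- combn(N, n, FUN = function(idx) mean(Y[idx]))
mean(all_means)  # equals mean(Y)
mean(Y)

# SRSWR: approximate, by simulation
set.seed(4)
mean(replicate(50000, mean(sample(Y, n, replace = TRUE))))
```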

Variance of the estimate

Assume that each observation has variance $\sigma^2$. Then

$V(\bar{y}) = E(\bar{y}-\bar{Y})^2$
$= E\left[\frac{1}{n}\sum_{i=1}^{n}(y_i-\bar{Y})\right]^2$
$= \frac{1}{n^2}\,E\left[\sum_{i=1}^{n}(y_i-\bar{Y})^2 + \mathop{\sum\sum}_{i \neq j}^{n}(y_i-\bar{Y})(y_j-\bar{Y})\right]$
$= \frac{1}{n^2}\sum_{i=1}^{n} E(y_i-\bar{Y})^2 + \frac{1}{n^2}\mathop{\sum\sum}_{i \neq j}^{n} E(y_i-\bar{Y})(y_j-\bar{Y})$
$= \frac{\sigma^2}{n} + \frac{K}{n^2} = \frac{N-1}{Nn}S^2 + \frac{K}{n^2},$

where $K = \mathop{\sum\sum}\limits_{i \neq j}^{n} E(y_i-\bar{Y})(y_j-\bar{Y})$, assuming that each observation has variance $\sigma^2 = \dfrac{N-1}{N}S^2$. Now we find K under the setups of SRSWOR and SRSWR.

SRSWOR

$K = \mathop{\sum\sum}_{i \neq j}^{n} E(y_i-\bar{Y})(y_j-\bar{Y}).$

Consider

$E(y_i-\bar{Y})(y_j-\bar{Y}) = \frac{1}{N(N-1)}\mathop{\sum\sum}_{k \neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}).$

Since

$\left[\sum_{k=1}^{N}(Y_k-\bar{Y})\right]^2 = \sum_{k=1}^{N}(Y_k-\bar{Y})^2 + \mathop{\sum\sum}_{k \neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}),$

$0 = (N-1)S^2 + \mathop{\sum\sum}_{k \neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}),$

we have

$\frac{1}{N(N-1)}\mathop{\sum\sum}_{k \neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}) = \frac{1}{N(N-1)}\left[-(N-1)S^2\right] = -\frac{S^2}{N}.$

Thus $K = -\dfrac{n(n-1)}{N}S^2$, and substituting this value of K, the variance of $\bar{y}$ under SRSWOR is

$V(\bar{y}_{WOR}) = \frac{N-1}{Nn}S^2 - \frac{1}{n^2}\cdot\frac{n(n-1)}{N}S^2 = \frac{N-n}{Nn}S^2.$

SRSWR

$K = \mathop{\sum\sum}_{i \neq j}^{n} E(y_i-\bar{Y})(y_j-\bar{Y}) = \mathop{\sum\sum}_{i \neq j}^{n} E(y_i-\bar{Y})\,E(y_j-\bar{Y}) = 0$

because the $i$-th and $j$-th draws ($i \neq j$) are independent.

Thus the variance of $\bar{y}$ under SRSWR is

$V(\bar{y}_{WR}) = \frac{N-1}{Nn}S^2.$
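Both variance formulas can be verified on the same toy population: the SRSWOR variance exactly, by enumerating all samples, and the SRSWR variance approximately, by simulation (values and seed arbitrary).

```r
# Check V(ybar) under SRSWOR and SRSWR.
Y <- c(12, 7, 9, 15, 4, 11)
N <- length(Y); n <- 3
S2 <- var(Y)                      # S^2 (divisor N - 1, as in the notes)

# SRSWOR: exact variance over all choose(N, n) samples
all_means <- combn(N, n, FUN = function(idx) mean(Y[idx]))
mean((all_means - mean(Y))^2)     # empirical V(ybar_WOR)
(N - n) / (N * n) * S2            # formula (N - n) S^2 / (N n)

# SRSWR: simulated variance of ybar
set.seed(5)
wr_means <- replicate(50000, mean(sample(Y, n, replace = TRUE)))
var(wr_means)                     # approximates (N - 1) S^2 / (N n)
(N - 1) / (N * n) * S2
```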
It is to be noted that if N is infinite (large enough), then

$V(\bar{y}) = \frac{S^2}{n}$

in both the cases of SRSWOR and SRSWR. So the factor $\dfrac{N-n}{N}$ is responsible for the change in the variance of $\bar{y}$ when the sample is drawn from a finite population rather than an infinite population. This is why $\dfrac{N-n}{N}$ is called the finite population correction (fpc). It may be noted that $\dfrac{N-n}{N} = 1 - \dfrac{n}{N}$, so $\dfrac{N-n}{N}$ is close to 1 if the ratio of the sample size to the population size, $\dfrac{n}{N}$, is very small or negligible. The term $\dfrac{n}{N}$ is called the sampling fraction. In practice, the fpc can be ignored whenever $\dfrac{n}{N} \le 5\%$, and for many purposes even if it is as high as 10%. Ignoring the fpc will result in overestimation of the variance of $\bar{y}$.
Efficiency of $\bar{y}$ under SRSWOR over SRSWR

$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}S^2$

$V(\bar{y}_{WR}) = \frac{N-1}{Nn}S^2 = \frac{N-n}{Nn}S^2 + \frac{n-1}{Nn}S^2 = V(\bar{y}_{WOR}) + \text{a positive quantity}.$

Thus $V(\bar{y}_{WR}) \ge V(\bar{y}_{WOR})$, and so SRSWOR is more efficient than SRSWR.

Estimation of variance from a sample

Since the expressions for the variances of the sample mean involve $S^2$, which is based on population values, these expressions cannot be used in real-life applications. In order to estimate the variance of $\bar{y}$ on the basis of a sample, an estimator of $S^2$ (or, equivalently, $\sigma^2$) is needed. Consider $s^2$ as an estimator of $S^2$ (or $\sigma^2$), and we investigate its biasedness in the cases of SRSWOR and SRSWR.

Consider

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left[(y_i-\bar{Y}) - (\bar{y}-\bar{Y})\right]^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n}(y_i-\bar{Y})^2 - n(\bar{y}-\bar{Y})^2\right].$

Then

$E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{n} E(y_i-\bar{Y})^2 - n\,E(\bar{y}-\bar{Y})^2\right] = \frac{1}{n-1}\left[\sum_{i=1}^{n}\mathrm{Var}(y_i) - n\,\mathrm{Var}(\bar{y})\right] = \frac{n}{n-1}\left[\sigma^2 - \mathrm{Var}(\bar{y})\right].$

In the case of SRSWOR,

$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}S^2$

and so

$E(s^2) = \frac{n}{n-1}\left[\sigma^2 - \frac{N-n}{Nn}S^2\right] = \frac{n}{n-1}\left[\frac{N-1}{N}S^2 - \frac{N-n}{Nn}S^2\right] = S^2.$

In the case of SRSWR,

$V(\bar{y}_{WR}) = \frac{N-1}{Nn}S^2$

and so

$E(s^2) = \frac{n}{n-1}\left[\sigma^2 - \frac{N-1}{Nn}S^2\right] = \frac{n}{n-1}\left[\frac{N-1}{N}S^2 - \frac{N-1}{Nn}S^2\right] = \frac{N-1}{N}S^2 = \sigma^2.$

Hence

$E(s^2) = \begin{cases} S^2 & \text{in SRSWOR} \\ \sigma^2 & \text{in SRSWR.} \end{cases}$

An unbiased estimate of $\mathrm{Var}(\bar{y})$ is

$\widehat{\mathrm{Var}}(\bar{y}_{WOR}) = \dfrac{N-n}{Nn}\,s^2$ in the case of SRSWOR, and

$\widehat{\mathrm{Var}}(\bar{y}_{WR}) = \dfrac{N-1}{Nn}\cdot\dfrac{N}{N-1}\,s^2 = \dfrac{s^2}{n}$ in the case of SRSWR.
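These expectations can be checked on the toy population used earlier: exactly under SRSWOR (by enumeration) and approximately under SRSWR (by simulation); the values are arbitrary illustrations.

```r
# Check E(s^2) = S^2 under SRSWOR and E(s^2) = sigma^2 under SRSWR.
Y <- c(12, 7, 9, 15, 4, 11)
N <- length(Y); n <- 3
S2 <- var(Y)
sigma2 <- (N - 1) / N * S2

mean(combn(N, n, FUN = function(idx) var(Y[idx])))  # exact E(s^2), SRSWOR
S2

set.seed(6)
mean(replicate(50000, var(sample(Y, n, replace = TRUE))))  # approx, SRSWR
sigma2
```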

Standard errors
The standard error of $\bar{y}$ is defined as $\sqrt{\mathrm{Var}(\bar{y})}$.

In order to estimate the standard error, one simple option is to consider the square root of the estimate of the variance of the sample mean:

• under SRSWOR, a possible estimator is $\hat{\sigma}(\bar{y}) = \sqrt{\dfrac{N-n}{Nn}}\, s$;
• under SRSWR, a possible estimator is $\hat{\sigma}(\bar{y}) = \dfrac{s}{\sqrt{n}}$.

It is to be noted that these estimators do not possess the same properties as $\sqrt{\mathrm{Var}(\bar{y})}$: if $\hat{\theta}$ is an unbiased estimator of $\theta$, then $\sqrt{\hat{\theta}}$ is not necessarily an unbiased estimator of $\sqrt{\theta}$. In fact, $\hat{\sigma}(\bar{y})$ is a negatively biased estimator under SRSWOR.

The approximate expressions for the large-N case are as follows (Reference: P.V. Sukhatme, B.V. Sukhatme, S. Sukhatme and C. Asok, Sampling Theory of Surveys with Applications, Iowa State University Press and Indian Society of Agricultural Statistics, India, 1984):

Consider $s$ as an estimator of $S$, and let

$s^2 = S^2(1+\epsilon)$ with $E(\epsilon) = 0$, $E(\epsilon^2) = \dfrac{\mathrm{Var}(s^2)}{S^4}$.

Write

$s = S(1+\epsilon)^{1/2} = S\left(1 + \frac{\epsilon}{2} - \frac{\epsilon^2}{8} + \cdots\right),$

assuming $|\epsilon|$ will be small; as n becomes large, the probability of such an event approaches one. Neglecting the powers of $\epsilon$ higher than two and taking expectation, we have

$E(s) = S\left[1 - \frac{\mathrm{Var}(s^2)}{8S^4}\right],$

where, for large N,

$\mathrm{Var}(s^2) = \frac{2S^4}{n-1}\left[1 + \frac{n-1}{2n}(\beta_2 - 3)\right],$

$\mu_j = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^j,$

$\beta_2 = \frac{\mu_4}{S^4}: \text{coefficient of kurtosis}.$

Thus

$E(s) = S\left[1 - \frac{1}{4(n-1)} - \frac{\beta_2-3}{8n}\right]$

$\mathrm{Var}(s) = E(s^2) - [E(s)]^2 = S^2 - S^2\left[1 - \frac{\mathrm{Var}(s^2)}{8S^4}\right]^2 \approx \frac{\mathrm{Var}(s^2)}{4S^2} = \frac{S^2}{2(n-1)}\left[1 + \frac{n-1}{2n}(\beta_2 - 3)\right].$

2 1  

Note that for a normal distribution, 2  3 and we obtain


S2
Var s( )  .
2n1
Both Var s( ) and Var s( 2) are inflated due to nonnormality to the same extent, by the inflation factor

  n1 
1 2n 2 3
 
and this does not depend on the coefficient of skewness.

This is an important result to be kept in mind while determining the sample size in which it is assumed

that S2 is known. If inflation factor is ignored and the population is non-normal, then the reliability on

s2 may be misleading.

Alternative approach:
The results for the unbiasedness property and the variance of the sample mean can also be proved in an
alternative way as follows:

(i) SRSWOR
With the $i$-th unit of the population, we associate a random variable $a_i$ defined as follows:

$a_i = \begin{cases} 1, & \text{if the } i\text{-th unit occurs in the sample} \\ 0, & \text{if the } i\text{-th unit does not occur in the sample,} \end{cases} \qquad i = 1, 2, \ldots, N.$

Then

$E(a_i) = 1 \times \text{(probability that the } i\text{-th unit is included in the sample)} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$

$E(a_i^2) = 1 \times \text{(probability that the } i\text{-th unit is included in the sample)} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$

$E(a_i a_j) = 1 \times \text{(probability that the } i\text{-th and } j\text{-th units are included in the sample)} = \frac{n(n-1)}{N(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$

From these results, we can obtain

$\mathrm{Var}(a_i) = E(a_i^2) - [E(a_i)]^2 = \frac{n(N-n)}{N^2}, \quad i = 1, 2, \ldots, N,$

$\mathrm{Cov}(a_i, a_j) = E(a_i a_j) - E(a_i)E(a_j) = -\frac{n(N-n)}{N^2(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$

We can rewrite the sample mean as

$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i y_i.$

Then

$E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N} E(a_i)\, y_i = \bar{Y}$

and

$\mathrm{Var}(\bar{y}) = \frac{1}{n^2}\,\mathrm{Var}\left(\sum_{i=1}^{N} a_i y_i\right) = \frac{1}{n^2}\left[\sum_{i=1}^{N}\mathrm{Var}(a_i)\, y_i^2 + \mathop{\sum\sum}_{i \neq j}^{N}\mathrm{Cov}(a_i, a_j)\, y_i y_j\right].$

Substituting the values of $\mathrm{Var}(a_i)$ and $\mathrm{Cov}(a_i, a_j)$ in the expression of $\mathrm{Var}(\bar{y})$ and simplifying, we get

$\mathrm{Var}(\bar{y}) = \frac{N-n}{Nn}S^2.$

To show that $E(s^2) = S^2$, consider

$s^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right) = \frac{1}{n-1}\left(\sum_{i=1}^{N} a_i y_i^2 - n\bar{y}^2\right).$

Hence, taking expectation, we get

$E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{N} E(a_i)\, y_i^2 - n\left\{\mathrm{Var}(\bar{y}) + \bar{Y}^2\right\}\right].$

Substituting the values of $E(a_i)$ and $\mathrm{Var}(\bar{y})$ in this expression and simplifying, we get $E(s^2) = S^2$.

(ii) SRSWR
Let a random variable $a_i$ associated with the $i$-th unit of the population denote the number of times the $i$-th unit occurs in the sample, $i = 1, 2, \ldots, N$. So $a_i$ assumes the values $0, 1, 2, \ldots, n$. The joint distribution of $a_1, a_2, \ldots, a_N$ is the multinomial distribution given by

$P(a_1, a_2, \ldots, a_N) = \frac{n!}{\prod_{i=1}^{N} a_i!}\cdot\frac{1}{N^n},$

where $\sum_{i=1}^{N} a_i = n$. For this multinomial distribution, we have

$E(a_i) = \frac{n}{N},$

$\mathrm{Var}(a_i) = \frac{n(N-1)}{N^2}, \quad i = 1, 2, \ldots, N,$

$\mathrm{Cov}(a_i, a_j) = -\frac{n}{N^2}, \quad i \neq j = 1, 2, \ldots, N.$

We rewrite the sample mean as

$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i y_i.$

Hence, taking the expectation of $\bar{y}$ and substituting the value of $E(a_i) = n/N$, we obtain $E(\bar{y}) = \bar{Y}$.

Further,

$\mathrm{Var}(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N}\mathrm{Var}(a_i)\, y_i^2 + \mathop{\sum\sum}_{i \neq j}^{N}\mathrm{Cov}(a_i, a_j)\, y_i y_j\right].$

Substituting the values of $\mathrm{Var}(a_i) = n(N-1)/N^2$ and $\mathrm{Cov}(a_i, a_j) = -n/N^2$ and simplifying, we get

$\mathrm{Var}(\bar{y}) = \frac{N-1}{Nn}S^2.$

To prove that $E(s^2) = \sigma^2 = \dfrac{N-1}{N}S^2$ in SRSWR, consider

$(n-1)s^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 = \sum_{i=1}^{N} a_i y_i^2 - n\bar{y}^2,$

$(n-1)E(s^2) = \sum_{i=1}^{N} E(a_i)\, y_i^2 - n\left[\mathrm{Var}(\bar{y}) + \bar{Y}^2\right] = \frac{n}{N}\sum_{i=1}^{N} y_i^2 - n\left[\frac{N-1}{Nn}S^2 + \bar{Y}^2\right] = \frac{(n-1)(N-1)}{N}S^2,$

so

$E(s^2) = \frac{N-1}{N}S^2 = \sigma^2.$

Estimator of the population total:

Sometimes it is also of interest to estimate the population total, e.g., total household income, total expenditure, etc. Let

$Y_T = \sum_{i=1}^{N} Y_i = N\bar{Y}$

denote the population total, which can be estimated by

$\hat{Y}_T = N\hat{\bar{Y}} = N\bar{y}.$

Obviously,

$E(\hat{Y}_T) = N\,E(\bar{y}) = N\bar{Y} = Y_T$

and

$\mathrm{Var}(\hat{Y}_T) = N^2\,\mathrm{Var}(\bar{y}) = \begin{cases} N^2\,\dfrac{N-n}{Nn}S^2 = \dfrac{N(N-n)}{n}S^2 & \text{for SRSWOR} \\[2ex] N^2\,\dfrac{N-1}{Nn}S^2 = \dfrac{N(N-1)}{n}S^2 & \text{for SRSWR,} \end{cases}$

and the estimates of the variance of $\hat{Y}_T$ are

$\widehat{\mathrm{Var}}(\hat{Y}_T) = \begin{cases} \dfrac{N(N-n)}{n}\,s^2 & \text{for SRSWOR} \\[2ex] \dfrac{N^2}{n}\,s^2 & \text{for SRSWR.} \end{cases}$
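For illustration, the sketch below estimates the total of the toy population from one SRSWOR sample and attaches the variance estimate $N(N-n)s^2/n$ derived above (population values and seed arbitrary).

```r
# Estimate the population total from a single SRSWOR sample.
Y <- c(12, 7, 9, 15, 4, 11)
N <- length(Y); n <- 3
set.seed(7)
y <- Y[sample(1:N, n)]                   # observed sample values

Yhat_T    <- N * mean(y)                 # estimator N * ybar
var_hat_T <- N * (N - n) / n * var(y)    # estimated Var(Yhat_T), SRSWOR
Yhat_T; var_hat_T
sum(Y)                                   # true total, for comparison
```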

Confidence limits for the population mean

Now we construct the $100(1-\alpha)\%$ confidence interval for the population mean. Assume that the population is normally distributed $N(\mu, \sigma^2)$ with mean $\mu$ and variance $\sigma^2$. Then $\dfrac{\bar{y}-\bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}}$ follows $N(0,1)$ when $\sigma^2$ is known. If $\sigma^2$ is unknown and is estimated from the sample, then $\dfrac{\bar{y}-\bar{Y}}{\sqrt{\widehat{\mathrm{Var}}(\bar{y})}}$ follows a t-distribution with $(n-1)$ degrees of freedom. When $\sigma^2$ is known, the $100(1-\alpha)\%$ confidence interval is given by

$P\left(-Z_{\alpha/2} \le \frac{\bar{y}-\bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}} \le Z_{\alpha/2}\right) = 1-\alpha$

or

$P\left(\bar{y} - Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})} \le \bar{Y} \le \bar{y} + Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})}\right) = 1-\alpha,$

and the confidence limits are

$\left[\bar{y} - Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})},\ \bar{y} + Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})}\right],$

where $Z_{\alpha/2}$ denotes the upper $\alpha/2\%$ point of the $N(0,1)$ distribution. Similarly, when $\sigma^2$ is unknown, the $100(1-\alpha)\%$ confidence interval is

$P\left(-t_{\alpha/2} \le \frac{\bar{y}-\bar{Y}}{\sqrt{\widehat{\mathrm{Var}}(\bar{y})}} \le t_{\alpha/2}\right) = 1-\alpha$

or

$P\left(\bar{y} - t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})} \le \bar{Y} \le \bar{y} + t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})}\right) = 1-\alpha,$

and the confidence limits are

$\left[\bar{y} - t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})},\ \bar{y} + t_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar{y})}\right],$

where $t_{\alpha/2}$ denotes the upper $\alpha/2\%$ point of the t-distribution with $(n-1)$ degrees of freedom.
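A minimal sketch of the unknown-$\sigma^2$ case under SRSWOR follows, using the toy population again; $\alpha = 0.05$ is an arbitrary choice. With n as small as 3 the interval is wide, but the construction is the point.

```r
# 100(1 - alpha)% confidence limits for Ybar from an SRSWOR sample.
Y <- c(12, 7, 9, 15, 4, 11)
N <- length(Y); n <- 3; alpha <- 0.05
set.seed(8)
y <- Y[sample(1:N, n)]

var_hat <- (N - n) / (N * n) * var(y)    # estimated Var(ybar), with fpc
t_val <- qt(1 - alpha / 2, df = n - 1)   # upper alpha/2 point of t(n-1)
c(mean(y) - t_val * sqrt(var_hat),
  mean(y) + t_val * sqrt(var_hat))
```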

Determination of sample size


The size of the sample is needed before the survey starts and goes into operation. One point to be kept in mind is that when the sample size increases, the variance of the estimators decreases, but the cost of the survey increases, and vice versa. So there has to be a balance between the two aspects. The sample size can be determined on the basis of prescribed values of the standard error of the sample mean, the error of estimation, the width of the confidence interval, the coefficient of variation of the sample mean, the relative error of the sample mean, or the total cost, among several others.

An important constraint in determining the sample size is that information regarding the population standard deviation S should be known for these criteria. The reason and need for this will become clear when we derive the sample size in the next section. A question arises: how can one have information about S beforehand? One possible solution is to conduct a pilot survey, collect a preliminary sample of small size, estimate S, and use it as the known value of S. Alternatively, such information can also be collected from past data, past experience, the long association of the experimenter with the experiment, prior information, etc.

Now we find the sample size under different criteria, assuming that the samples have been drawn using SRSWOR. The case of SRSWR can be derived similarly.

1. Pre-specified variance
The sample size is to be determined such that the variance of $\bar{y}$ should not exceed a given value, say V. In this case, find n such that

$\mathrm{Var}(\bar{y}) \le V$

or $\dfrac{N-n}{Nn}S^2 \le V$

or $\dfrac{1}{n} - \dfrac{1}{N} \le \dfrac{V}{S^2}$

or $\dfrac{1}{n} - \dfrac{1}{N} \le \dfrac{1}{n_e}$

or $n \ge \dfrac{n_e}{1 + \dfrac{n_e}{N}},$

where $n_e = \dfrac{S^2}{V}$.

It may be noted here that $n_e$ can be known only when $S^2$ is known. This compels us to assume that S should be known. The same reasoning will also be seen in other cases.

The smallest sample size needed in this case is

$n_{\text{smallest}} = \frac{n_e}{1 + \dfrac{n_e}{N}}.$

If N is large, then the required n is $n \ge n_e$, and $n_{\text{smallest}} = n_e$.
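A short sketch of this criterion (N, S² and V are illustrative values assumed known, e.g. from a pilot survey as discussed above; rounding up to a whole unit is a practical choice, not part of the derivation):

```r
# Criterion 1: smallest n with Var(ybar) <= V under SRSWOR.
N <- 1000; S2 <- 25; V <- 1
n_e <- S2 / V
ceiling(n_e / (1 + n_e / N))  # here 25 / 1.025, rounded up to 25
```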

2. Pre-specified estimation error

It may be possible to have some prior knowledge of the population mean $\bar{Y}$, and it may be required that the sample mean $\bar{y}$ should not differ from it by more than a specified amount of absolute estimation error e, which is a small quantity. Such a requirement can be satisfied by associating a probability $(1-\alpha)$ with it and can be expressed as

$P\left(\left|\bar{y} - \bar{Y}\right| \le e\right) = 1-\alpha.$

Since $\bar{y}$ follows $N\left(\bar{Y},\, \dfrac{N-n}{Nn}S^2\right)$, assuming a normal distribution for the population, we can write

$P\left(\frac{\left|\bar{y}-\bar{Y}\right|}{\sqrt{\mathrm{Var}(\bar{y})}} \le \frac{e}{\sqrt{\mathrm{Var}(\bar{y})}}\right) = 1-\alpha,$

which implies that

$\frac{e}{\sqrt{\mathrm{Var}(\bar{y})}} = Z_{\alpha/2}$

or $Z_{\alpha/2}^2\,\mathrm{Var}(\bar{y}) = e^2$

or $Z_{\alpha/2}^2\,\dfrac{N-n}{Nn}S^2 = e^2$

or

$n = \frac{\left(\dfrac{Z_{\alpha/2}\,S}{e}\right)^2}{1 + \dfrac{1}{N}\left(\dfrac{Z_{\alpha/2}\,S}{e}\right)^2},$

which is the required sample size. If N is large, then

$n = \left(\frac{Z_{\alpha/2}\,S}{e}\right)^2.$
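A sketch with illustrative values (N = 1000, S = 5, e = 1, α = 0.05):

```r
# Criterion 2: n so that |ybar - Ybar| <= e with probability 1 - alpha.
N <- 1000; S <- 5; e <- 1; alpha <- 0.05
z <- qnorm(1 - alpha / 2)     # Z_{alpha/2}, about 1.96
n0 <- (z * S / e)^2           # large-N sample size
ceiling(n0 / (1 + n0 / N))    # finite-population sample size
```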

 
 

3. Pre-specified width of the confidence interval

If the requirement is that the width of the confidence interval of $\bar{y}$ with confidence coefficient $(1-\alpha)$ should not exceed a pre-specified amount W, then the sample size n is determined such that

$2Z_{\alpha/2}\sqrt{\mathrm{Var}(\bar{y})} \le W,$

assuming $\sigma^2$ is known and the population is normally distributed. This can be expressed as

$2Z_{\alpha/2}\sqrt{\frac{N-n}{Nn}}\,S \le W$

or $4Z_{\alpha/2}^2\left(\dfrac{1}{n} - \dfrac{1}{N}\right)S^2 \le W^2$

or $\dfrac{1}{n} \le \dfrac{1}{N} + \dfrac{W^2}{4Z_{\alpha/2}^2 S^2}$

or

$n \ge \frac{\dfrac{4Z_{\alpha/2}^2 S^2}{W^2}}{1 + \dfrac{1}{N}\cdot\dfrac{4Z_{\alpha/2}^2 S^2}{W^2}}.$

The minimum sample size required is

$n_{\text{smallest}} = \frac{\dfrac{4Z_{\alpha/2}^2 S^2}{W^2}}{1 + \dfrac{1}{N}\cdot\dfrac{4Z_{\alpha/2}^2 S^2}{W^2}}.$

If N is large, then

$n \ge \frac{4Z_{\alpha/2}^2 S^2}{W^2}$

and the minimum sample size needed is

$n_{\text{smallest}} = \frac{4Z_{\alpha/2}^2 S^2}{W^2}.$

4. Pre-specified coefficient of variation

The coefficient of variation (CV) is defined as the ratio of the standard error (or standard deviation) to the mean. The knowledge of the coefficient of variation has played an important role in sampling theory, as this information helps in deriving efficient estimators.

If it is desired that the coefficient of variation of $\bar{y}$ should not exceed a given or pre-specified value, say $C_0$, then the required sample size n is determined such that

$CV(\bar{y}) \le C_0$

or $\dfrac{\sqrt{\mathrm{Var}(\bar{y})}}{\bar{Y}} \le C_0$

or $\dfrac{\dfrac{N-n}{Nn}S^2}{\bar{Y}^2} \le C_0^2$

or $\dfrac{1}{n} - \dfrac{1}{N} \le \dfrac{C_0^2}{C^2}$

or

$n \ge \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{1}{N}\cdot\dfrac{C^2}{C_0^2}},$

which is the required sample size, where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation.

The smallest sample size needed in this case is

$n_{\text{smallest}} = \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{1}{N}\cdot\dfrac{C^2}{C_0^2}}.$

If N is large, then

$n \ge \frac{C^2}{C_0^2}$ and $n_{\text{smallest}} = \frac{C^2}{C_0^2}.$
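A sketch with illustrative values (N = 1000, population CV C = 0.8, target C0 = 0.1):

```r
# Criterion 4: n so that CV(ybar) <= C0, where C = S / Ybar.
N <- 1000; C <- 0.8; C0 <- 0.1
ratio <- C^2 / C0^2
ceiling(ratio / (1 + ratio / N))  # 64 / 1.064, rounded up to 61
```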

5. Pre-specified relative error
When $\bar{y}$ is used for estimating the population mean $\bar{Y}$, the relative estimation error is defined as $\dfrac{\bar{y}-\bar{Y}}{\bar{Y}}$. If it is required that such relative estimation error should not exceed a pre-specified value R with probability $(1-\alpha)$, then such a requirement can be satisfied by expressing it as

$P\left(\frac{\left|\bar{y}-\bar{Y}\right|}{\sqrt{\mathrm{Var}(\bar{y})}} \le \frac{R\bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}}\right) = 1-\alpha.$

Assuming the population to be normally distributed, $\bar{y}$ follows $N\left(\bar{Y},\, \dfrac{N-n}{Nn}S^2\right)$.

So it can be written that

$\frac{R\bar{Y}}{\sqrt{\mathrm{Var}(\bar{y})}} = Z_{\alpha/2}$

or $Z_{\alpha/2}^2\,\dfrac{N-n}{Nn}S^2 = R^2\bar{Y}^2$

or $\dfrac{1}{n} - \dfrac{1}{N} = \dfrac{R^2}{C^2 Z_{\alpha/2}^2}$

or

$n = \frac{\left(\dfrac{Z_{\alpha/2}\,C}{R}\right)^2}{1 + \dfrac{1}{N}\left(\dfrac{Z_{\alpha/2}\,C}{R}\right)^2},$

where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation and should be known.

If N is large, then

$n = \left(\frac{Z_{\alpha/2}\,C}{R}\right)^2.$

6. Pre-specified cost
Let an amount of money C be designated for the sample survey to collect n observations, $C_0$ be the overhead cost and $C_1$ be the cost of collecting one unit in the sample. Then the total cost C can be expressed as

$C = C_0 + nC_1$

or

$n = \frac{C - C_0}{C_1},$

which is the required sample size.
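A one-line sketch with illustrative amounts (total budget C = 5000, overhead C0 = 800, per-unit cost C1 = 30); rounding down keeps the survey within budget:

```r
# Criterion 6: sample size affordable under a fixed total cost.
C <- 5000; C0 <- 800; C1 <- 30
floor((C - C0) / C1)  # 140 observations
```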
