
Sample size determination:

In the planning of a sample survey, a stage is always reached at which a decision must be
made about the size of the sample. The decision is important. Too large a sample implies a waste
of resources, and too small a sample diminishes the utility of the results. The decision cannot
always be made satisfactorily; often we do not possess enough information to be sure that our
choice of sample sizes is the best one. Sampling theory provides a framework for determination
of sample size.
Consider a market researcher who is preparing to study customer behaviour on an island.
Among other things, he wishes to estimate the percentage of customers who use brand A.
Cooperation has been secured so that it is feasible to take a simple random sample. How large
should the sample be?
This question cannot be discussed without first receiving an answer to another question.
How accurately does the market researcher wish to know the percentage of customers who use
brand A? In reply he states that he will be content if the percentage is correct within ±5%, in the
sense that, if the sample shows that 43% of customers use brand A, the percentage for the whole
island is sure to lie between 38 and 48.
We know that we cannot absolutely guarantee accuracy within 5% except by measuring
everyone. However large n is taken, there is a chance of a very unlucky sample that is in error by
more than the desired 5%. The market researcher is willing to take a 1 in 20 chance of getting an
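unlucky sample. To make a rough estimate of n, we assume that the population is large, so that
the fpc can be ignored, and that the sample percentage p is normally distributed.
In technical terms, p is to lie in the range (P ± 5%), except for a 1 in 20 chance. Since p is
assumed normally distributed about P, it will lie in the range (P ± 2σ_p), apart from a 1 in 20
chance. Furthermore, working with proportions and writing Q = 1 − P,

$$\sigma_p = \sqrt{\frac{PQ}{n}}$$

Hence, we may put

$$2\sqrt{\frac{PQ}{n}} = 0.05 \quad\text{or}\quad n = \frac{4PQ}{(0.05)^2}$$
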
At this stage we should have some idea of the likely value of P. From the historical records of
the company, P lies within the range 30 to 60%.
Over this range PQ is largest at P = 0.50, so the safest choice is n = 4(0.5)(0.5)/(0.05)² = 400.
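
The effect of P on the rough sample size n = 4PQ/d² can be checked numerically. The following is a minimal sketch, assuming proportions on a 0 to 1 scale, a margin d = 0.05 and t = 2; the function name `rough_sample_size` is illustrative and not part of the original notes.

```python
def rough_sample_size(P, d=0.05, t=2.0):
    """First approximation n = t^2 * P * Q / d^2, with the fpc ignored."""
    Q = 1.0 - P
    return t**2 * P * Q / d**2


# Scan the range of P suggested by the historical records (30% to 60%).
for P in (0.30, 0.40, 0.50, 0.60):
    print(f"P = {P:.2f}  ->  n = {rough_sample_size(P):.0f}")
```

The largest value occurs at P = 0.50, which is why that value is used as the conservative choice.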
Whether the fpc is required depends on the number of people on the island. If the population
exceeds 8000, the sampling fraction is less than 5% and no adjustment for the fpc is called for.
Otherwise we readjust the sample size according to the population size.
The chosen value of n must be appraised to see whether it is consistent with the resources
available to take the sample. This demands an estimation of the cost, labour, time, and materials
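required to obtain the proposed size of sample. If the required n exceeds what the resources
allow, we reduce the sample size by relaxing the permissible error. Let d be the permissible
error, N the population size, and t the normal deviate corresponding to the accepted risk α.
The formula for n in sampling for proportions is

$$n = \frac{t^2 PQ / d^2}{1 + \dfrac{1}{N}\left(\dfrac{t^2 PQ}{d^2} - 1\right)}$$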

For practical use, an advance estimate p of P is substituted in this formula. If N is large, a first
approximation is

$$n_0 = \frac{t^2 pq}{d^2}, \qquad q = 1 - p$$

Consider the brand-preference situation with d = 0.05, p = 0.5, α = 0.05, and t = 2 (the
normal area tables give 1.96).
Thus

$$n_0 = \frac{(4)(0.5)(0.5)}{(0.05)^2} = 400$$

Let us assume that there are only 3200 customers in the region. The fpc is needed, and with
N = 3200 and n_0 = 400 we have

$$n = \frac{n_0}{1 + (n_0 - 1)/N} = \frac{400}{1 + 399/3200} \approx 356$$
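
The two-step calculation (first approximation, then fpc adjustment) can be sketched as follows. This is a minimal illustration assuming the adjustment n = n_0 / (1 + (n_0 − 1)/N); the function names are illustrative.

```python
import math


def first_approximation(p, d, t):
    """n0 = t^2 * p * q / d^2, appropriate when N is large."""
    return t**2 * p * (1.0 - p) / d**2


def fpc_adjusted(n0, N):
    """Adjust n0 for a finite population of size N."""
    return n0 / (1.0 + (n0 - 1.0) / N)


n0 = first_approximation(p=0.5, d=0.05, t=2.0)
n = fpc_adjusted(n0, N=3200)

# Round the final answer up so the precision target is still met.
print(round(n0), math.ceil(n))
```

With N = 3200 the adjusted size is noticeably smaller than n_0 because the sampling fraction n_0/N is 12.5%.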


Sometimes, particularly when estimating the total number NP of units in class C, we wish to
control the relative error r instead of the absolute error in Np; for example, we may wish to
estimate NP with an error not exceeding 10%. That is, we want

$$\Pr\left(\frac{|Np - NP|}{NP} \ge r\right) = \Pr\left(|p - P| \ge rP\right) = \alpha$$

For this specification, we substitute rP or rp for d in the formulas. We get

$$n_0 = \frac{t^2 PQ}{r^2 P^2} = \frac{t^2}{r^2}\,\frac{Q}{P}$$
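
Substituting d = rP into the first approximation gives n_0 = (t²/r²)(Q/P), which grows quickly as P becomes small. The sketch below illustrates this; the values of P are hypothetical and chosen only for illustration, while r = 0.10 follows the 10% figure in the text.

```python
def n0_relative_error(P, r, t=2.0):
    """First approximation when the relative error r is controlled (d = r * P)."""
    Q = 1.0 - P
    return (t**2 / r**2) * (Q / P)


# Hypothetical values of P; a rare class needs a much larger sample.
for P in (0.05, 0.20, 0.50):
    print(f"P = {P:.2f}  ->  n0 = {n0_relative_error(P, r=0.10):.0f}")
```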


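In the case of estimating the population mean $\bar{Y}$ or the population total to within a relative
error r, we use the formula

$$n = \frac{\left(\dfrac{tS}{r\bar{Y}}\right)^2}{1 + \dfrac{1}{N}\left(\dfrac{tS}{r\bar{Y}}\right)^2}$$

In this formula, the population characteristic on which n depends is its coefficient of variation
$S/\bar{Y}$. This is often more stable and easier to guess in advance than S itself.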
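As a first approximation we take

$$n_0 = \left(\frac{tS}{r\bar{Y}}\right)^2 = \frac{1}{C}\left(\frac{S}{\bar{Y}}\right)^2$$

substituting an advance estimate of $S/\bar{Y}$. The quantity $C = (r/t)^2$ is the desired (cv)² of the
sample estimate.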
If $n_0/N$ is appreciable we compute n as

$$n = \frac{n_0}{1 + n_0/N}$$


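If instead of the relative error r we wish to control the absolute error d in $\bar{y}$, we take

$$n_0 = \frac{t^2 S^2}{d^2} = \frac{S^2}{V}$$

where $V = (d/t)^2$ is the desired variance of the sample mean.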



Consider another example. In nurseries that produce young trees for sale it is advisable to
estimate, in late winter or early spring, how many healthy young trees are likely to be on hand,
since this determines policy toward the solicitation and acceptance of orders. The data that
follow were obtained from a bed of silver maple seedlings 1 ft wide and 430 ft long. The
sampling unit was 1 ft of the length of the bed, so that N = 430. From the earlier records it was
found that $\bar{Y} = 19$ and $S^2 = 85.6$. With simple random sampling, how many units must be taken
to estimate $\bar{Y}$ within 10%, apart from a chance of 1 in 20?

We have

$$n_0 = \frac{t^2 S^2}{r^2 \bar{Y}^2} = \frac{(4)(85.6)}{(0.1)^2 (19)^2} \approx 95$$

Since $n_0/N$ is not negligible, we take

$$n = \frac{95}{1 + 95/430} \approx 78$$
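
The calculation for a mean with a controlled relative error can be sketched as follows. This is only an illustration: the advance estimates `Y_BAR = 19` and `S2 = 85.6` are assumptions of this sketch, and the function name is hypothetical. It uses n_0 = (t·cv/r)² followed by n = n_0 / (1 + n_0/N).

```python
import math


def n_for_mean_relative(cv, r, t, N=None):
    """Sample size to estimate a mean within relative error r.

    cv is the advance estimate of the coefficient of variation S / Y-bar.
    If N is given, the finite-population adjustment n0 / (1 + n0 / N) is applied.
    """
    n0 = (t * cv / r) ** 2
    if N is None:
        return n0
    return n0 / (1.0 + n0 / N)


Y_BAR = 19.0  # assumed advance estimate of the mean per 1-ft unit
S2 = 85.6     # assumed advance estimate of the variance
cv = math.sqrt(S2) / Y_BAR

n0 = n_for_mean_relative(cv, r=0.10, t=2.0)
n = n_for_mean_relative(cv, r=0.10, t=2.0, N=430)
print(round(n0), math.ceil(n))
```

Only the ratio S / Y-bar enters the formula, which is why an advance guess of the coefficient of variation is sufficient.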



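Sample size in stratified random sample with Proportional allocation: Let n be the total size
of the sample, $W_h = N_h/N$ the weight of stratum h, and $s_h^2$ the variance within stratum h. Then

$$n_0 = \frac{\sum_h W_h s_h^2}{V}$$

If $n_0/N$ is not negligible then

$$n = \frac{n_0}{1 + n_0/N} = \frac{\sum_h W_h s_h^2}{V + \dfrac{1}{N}\sum_h W_h s_h^2}$$
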
We obtain the value of $s_h$ from a pilot survey or past records. Here $V = d^2/t^2$, where d is the
permissible error and t is obtained from the area of the normal or t-distribution at the chosen
level of significance.
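
A minimal sketch of this calculation for a stratified sample with proportional allocation is given below. The stratum weights, standard deviations, d, t and N are hypothetical values chosen only to show the arithmetic, and the function name is illustrative.

```python
import math


def stratified_n_proportional(W_h, s_h, d, t, N):
    """Return (n0, n) for estimating a mean under proportional allocation.

    V = (d / t)^2 is the desired variance, n0 = sum(W_h * s_h^2) / V,
    and n = n0 / (1 + n0 / N) applies the finite-population adjustment.
    """
    V = (d / t) ** 2
    n0 = sum(w * s**2 for w, s in zip(W_h, s_h)) / V
    n = n0 / (1.0 + n0 / N)
    return n0, n


# Hypothetical inputs: three strata, permissible error d = 1, t = 2, N = 1000.
n0, n = stratified_n_proportional(
    W_h=[0.5, 0.3, 0.2], s_h=[4.0, 6.0, 10.0], d=1.0, t=2.0, N=1000
)
print(round(n0, 1), math.ceil(n))
```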
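In the case of estimating proportions we obtain the sample size as follows. Note that here $n_1$
denotes the first approximation and $n_0$ the size after the finite-population adjustment.

$$n_0 = \frac{n_1}{1 + \dfrac{n_1}{N}}, \quad\text{where}\quad n_1 = \frac{\sum_h W_h p_h q_h}{V}$$

for proportional allocation, and

$$n_0 = \frac{n_1}{1 + \dfrac{1}{NV}\sum_h W_h p_h q_h}, \quad\text{where}\quad n_1 = \frac{\left(\sum_h W_h \sqrt{p_h q_h}\right)^2}{V}$$

for optimal allocation.
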



In a market research survey estimating the proportion of customers preferring brand A, the
values of $p_h$ and $N_h$ are obtained for four strata. Assuming that the estimated population
proportion should not differ from the true value by more than 10% (d = 0.10), with 95%
confidence, obtain the required sample size for proportional allocation and optimal allocation.

Stratum (h)    p_h      N_h
1              0.318    108
2              0.205    228
3              0.412    235
4              0.158     80

Here V = (0.10 / 1.96)² = 2.60308 × 10⁻³ and N = 651.
For proportional allocation, $n_1 = 76$ and $n_0 = 68$.
For optimal allocation, $n_1 \approx 75$ (74.8 before rounding) and $n_0 = 67$.
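
The figures for the four strata can be reproduced with the short sketch below. It uses the tabulated p_h and N_h with d = 0.10 and t = 1.96; the variable names are illustrative, and final sizes are rounded up.

```python
import math

P_H = [0.318, 0.205, 0.412, 0.158]  # stratum proportions from the table
N_H = [108, 228, 235, 80]           # stratum sizes from the table
d, t = 0.10, 1.96

N = sum(N_H)
W = [Nh / N for Nh in N_H]          # stratum weights W_h = N_h / N
V = (d / t) ** 2                    # desired variance

sum_wpq = sum(w * p * (1 - p) for w, p in zip(W, P_H))
sum_w_sqrt_pq = sum(w * math.sqrt(p * (1 - p)) for w, p in zip(W, P_H))

# Proportional allocation: first approximation, then adjusted size.
n1_prop = sum_wpq / V
n0_prop = n1_prop / (1 + n1_prop / N)

# Optimal allocation: first approximation, then adjusted size.
n1_opt = sum_w_sqrt_pq ** 2 / V
n0_opt = n1_opt / (1 + sum_wpq / (N * V))

print(f"V = {V:.6e}")
print("proportional:", math.ceil(n1_prop), math.ceil(n0_prop))
print("optimal:     ", math.ceil(n1_opt), math.ceil(n0_opt))
```

The gain from optimal allocation is small here because the within-stratum standard deviations do not differ much across the four strata.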
If the strata sizes are different, proportional allocation could be used to maintain a steady
sampling fraction throughout the population. The total sample size, n, should be allocated to
the strata proportionally to their sizes:

$$\frac{n_h}{n} = \frac{N_h}{N}$$

This implies

$$n_h = n\,\frac{N_h}{N} = nW_h$$
Optimum allocation: Optimum allocation takes into consideration both the sizes of
the strata and the variability inside the strata. In order to obtain the minimum sampling variance
the total sample size should be allocated to the strata proportionally to their sizes and also to the
standard deviation of their values, i.e. to the square root of the variances.
$$n_h = \text{constant} \times N_h s_h$$

This implies

$$n_h = n\,\frac{N_h s_h}{\sum_h N_h s_h}$$

where n is the total sample size, $n_h$ is the sample size in stratum h, $N_h$ is the size
of stratum h and $s_h$ is the square root of the variance in stratum h.
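
The two allocation rules can be compared with a short sketch. It reuses the four strata from the brand A example and assumes, for illustration only, that the within-stratum standard deviation of a proportion is s_h = sqrt(p_h q_h); the function names and the totals n = 68 and n = 67 passed in are illustrative choices.

```python
import math


def proportional_allocation(n, N_h):
    """n_h = n * N_h / N."""
    N = sum(N_h)
    return [n * Nh / N for Nh in N_h]


def optimum_allocation(n, N_h, s_h):
    """n_h = n * N_h * s_h / sum(N_h * s_h)."""
    products = [Nh * sh for Nh, sh in zip(N_h, s_h)]
    total = sum(products)
    return [n * x / total for x in products]


N_H = [108, 228, 235, 80]
P_H = [0.318, 0.205, 0.412, 0.158]
S_H = [math.sqrt(p * (1 - p)) for p in P_H]  # assumed s_h for a proportion

prop = proportional_allocation(68, N_H)
opt = optimum_allocation(67, N_H, S_H)

for h, (a, b) in enumerate(zip(prop, opt), start=1):
    print(f"stratum {h}: proportional {a:.1f}, optimum {b:.1f}")
```

The allocations are fractional and would be rounded to whole units in practice. When all s_h are equal, the optimum rule reduces to proportional allocation.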
Optimum allocation with variable cost: In some sampling situations, the cost of sampling in
terms of time or money is composed of a fixed part and of a variable part depending on
the stratum.
The sampling cost function is thus of the form:

$$C = c_0 + \sum_h c_h n_h$$

where C is the total cost of the sampling, $c_0$ is an overhead cost and $c_h$ is the cost per sampling
unit in stratum h, which may vary from stratum to stratum.
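
A linear cost function of this kind, an overhead plus a per-unit cost in each stratum, can be evaluated for any proposed allocation. The sketch below uses hypothetical costs and stratum sample sizes purely to show the arithmetic.

```python
def total_cost(c0, c_h, n_h):
    """Total cost C = c0 + sum over strata of c_h * n_h."""
    return c0 + sum(c * n for c, n in zip(c_h, n_h))


# Hypothetical figures: overhead of 100, two strata with unit costs 2 and 3.
print(total_cost(c0=100.0, c_h=[2.0, 3.0], n_h=[10, 20]))
```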
