Sampling Theory
3.3 TYPES OF SAMPLING
There are many different ways of selecting a sample. We describe below some of the important types of sampling: (1) Simple random sampling, (2) Purposive sampling, (3) Stratified sampling, (4) Systematic sampling, (5) Multi-stage sampling.
(1) Simple Random Sampling
Simple Random Sampling (also called Random Sampling) is the process of selection of a group of units in such a manner that every unit of the population has an 'equal chance' of being included in the sample. The group of units thus obtained is called a Simple Random Sample (or Random Sample only). In practice, the members of the sample are drawn one by one.
There are two ways of drawing a simple random sample:
(a) Simple Random Sampling With Replacement (SRSWR): Simple random sampling is said to be "with replacement" when the sample members are drawn from the population one by one, and after each drawing, the selected population unit is noted and then returned to the population before the next one is drawn. This means that at each stage of the sampling process all the population units (including those obtained in earlier drawings) are considered for selection with the same probability.
(b) Simple Random Sampling Without Replacement (SRSWOR): Here the selected unit is not returned to the population before the next drawing, so that no unit of the population can occur more than once in the sample.
The total number of possible samples
(i) in SRSWR is N^n;
(ii) in SRSWOR is NPn = N(N − 1)(N − 2) ... (N − n + 1), if the order of drawing is considered.
In SRSWOR, the number of cases favourable to each distinct 'combination' of members (ignoring the order of drawing) is the same, viz. n!; in SRSWR it is not the same. For this reason, we generally consider
(i) in SRSWR, N^n possible samples with distinct 'permutations';
(ii) in SRSWOR, NCn = NPn/n! possible samples with distinct 'combinations';
each sample having the same probability of selection. At each drawing, every unit of the population has the same chance of being selected, viz.
P(X_k = x_i) = 1/N, for all i = 1, 2, ..., N and k = 1, 2, ..., n.
Simple random sampling is the simplest and most important among the various sampling methods, and may be used whether the population is finite or infinite.
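These counts may be verified directly by enumeration. The following Python sketch (the population values are invented, chosen only for illustration) lists all possible samples for N = 4, n = 2:

```python
import math
from itertools import combinations, product

# Invented population of N = 4 units; samples of size n = 2.
population = [1, 2, 3, 4]
N, n = len(population), 2

# SRSWR: ordered draws with replacement -> N^n possible samples.
srswr = list(product(population, repeat=n))
assert len(srswr) == N ** n                 # 4^2 = 16

# SRSWOR, ignoring the order of drawing: distinct combinations -> NCn samples.
srswor = list(combinations(population, n))
assert len(srswor) == math.comb(N, n)       # 4C2 = 6

# Each unordered SRSWOR sample corresponds to n! ordered drawings.
assert math.perm(N, n) == math.comb(N, n) * math.factorial(n)
```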
(2) Purposive Sampling
A sample which is selected on the basis of the individual judgement of the sampler is called a Purposive Sample. There is no special technique for selecting a purposive sample; the sampler picks out a typical or representative sample according to his own judgement. It all depends on the personal factor, and chance is not allowed to
play at all. Consequently, there is much scope for bias, and the degree of accuracy of the estimates is not known. Purposive sampling may be useful when the sample is small; but as the sample size increases, the estimates become unreliable due to accumulation of bias. The advantage of purposive sampling is that whereas a random sample may vary widely from the average, a purposive sample will not.
(3) Stratified Sampling
In Stratified Sampling, the population is subdivided into several parts, called strata
and then a sub-sample is chosen from each of them. All the sub-samples combined
together give the Stratified Sample. If the selection from strata is done by random
sampling, the method is known as Stratified Random Sampling. The subdivision of
the population into strata is done by purposive method, but the selection of sub-
samples from within the strata depends purely on chance. Stratified random sampling
may therefore be viewed as a mixture of both purposive and random sampling, and
combines the advantages of both.
Stratified sampling is generally used when the population is heterogeneous, but can be subdivided into strata within each of which the heterogeneity is not so prominent. Some prior knowledge is, therefore, necessary for subdivision into strata, called stratification. If a proper stratification can be made such that the strata differ
from one another as much as possible, but there is much homogeneity within each of
them, then a stratified sample will yield better estimates than a random sample of the
same size. This is so, because in a stratified sample the different sections of the
population are suitably represented through the sub-samples, while in random sampling
some of these sections may be over-represented or under-represented or may even be
omitted.
Usually the same fraction of members from each stratum is included in the sample, i.e., sub-sample sizes are made proportional to sub-population sizes. The principal purposes of stratification are: (i) to increase the precision of the overall estimates; (ii) to ensure that all sections of the population are adequately represented; (iii) to avoid the difficulties of handling a large population; and (iv) to avoid the heterogeneity of the population.
Uses: (i) Administrative convenience may dictate the use of stratified sampling. For example, in conducting a sample survey over West Bengal, the different districts may be taken as strata, so that the district authorities can supervise the survey and collect data more efficiently from their own regions. (ii) Sometimes, different parts of the population involve different sampling problems, and special measures are necessary for dealing with these cases. Stratified sampling is extremely useful in such cases.
Advantages: (i) If data of a given precision are required for certain parts of the population, it is advisable to treat each part as a population in its own right. (ii) Stratification also brings about a gain in precision of the estimates obtained.
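The idea of proportional allocation described above may be sketched as follows (the strata names and sizes are invented for illustration):

```python
import random

random.seed(42)

# Invented strata with sub-population sizes 800, 150 and 50.
strata = {
    "District A": [f"A{i}" for i in range(800)],
    "District B": [f"B{i}" for i in range(150)],
    "District C": [f"C{i}" for i in range(50)],
}
fraction = 0.10   # the same fraction is taken from every stratum

stratified_sample = {
    name: random.sample(units, round(len(units) * fraction))  # SRSWOR within stratum
    for name, units in strata.items()
}

sizes = {name: len(s) for name, s in stratified_sample.items()}
print(sizes)   # sub-sample sizes 80, 15 and 5, proportional to 800, 150, 50
```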
(4) Systematic Sampling
Systematic Sampling involves the selection of sample units at equal intervals, after all the units in the population have been arranged in some order. If the population size is finite, the units may be serially numbered and arranged. From the first k of these, a single unit is chosen at random. This unit and every k-th unit thereafter constitute a Systematic Sample. In order to obtain a systematic sample of 500 villages out of
40,000 in West Bengal, i.e. one out of 80 on an average, all the villages have to be numbered serially. From the first 80 of these a village is selected at random, suppose the one with serial number 27. Then the villages with serial numbers 27, 107, 187, 267, 347, ... constitute the systematic sample.
If the characteristic under study is independent of the order of arrangement of the units, then a systematic sample is practically equivalent to a random sample. The actual selection of the sample is easier and quicker. Systematic sampling is suitable when the units are described on serially numbered cards, e.g. workers listed on cards. Then the sample can be drawn easily by looking at the serial numbers. The sample may be biased if there are periodic features associated with the sampling interval.
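The selection procedure for the village example above may be sketched as follows (the random starting serial is, of course, not fixed in advance):

```python
import random

random.seed(11)

# 40,000 serially numbered villages; a systematic sample of 500 villages,
# i.e. one out of every k = 80 on an average.
N, n = 40_000, 500
k = N // n                              # sampling interval: 80

start = random.randrange(1, k + 1)      # random serial number from the first k
sample = list(range(start, N + 1, k))   # start, start + k, start + 2k, ...

# If the random start happened to be 27, the sample would consist of the
# villages with serial numbers 27, 107, 187, 267, 347, ...
assert len(sample) == n
```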
(5) Multi-stage Sampling
Multi-stage Sampling refers to a sampling procedure which is carried out in several stages. The population is first divided into large groups, called first-stage units. These first-stage units are again divided into smaller units, the second-stage units; the second-stage units into third-stage units, and so on, until we reach the ultimate units. Initially some of the first-stage units are selected, from each of which some second-stage units are chosen; and the process is carried on from stage to stage until the selection of ultimate units. For example, in order to introduce a scheme on an experimental basis in the villages, we may have to select a few villages from the whole of the State. We may apply three-stage sampling: Sub-divisions may be used as the first-stage units, Anchals forming the second-stage units, and Villages as the ultimate units.
Multi-stage sampling enables existing divisions and subdivisions of the population to be used as units at various stages, and permits the field work to be concentrated, although a large area is covered. Another advantage is that the subdivision into second-stage units need be carried out for only those first-stage units which are included in the sample. It therefore helps in surveys of underdeveloped areas, where no sampling frame is sufficiently detailed and accurate for subdivision of the natural units into reasonably small sampling units. Usually, considerable saving in cost is achieved through multi-stage sampling. However, this method is in general less accurate than any other sampling method using the same number of ultimate units by some single-stage process.
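A two-stage version of this procedure may be sketched as follows (the frame of sub-divisions and villages is invented for illustration):

```python
import random

random.seed(7)

# Invented two-stage frame: 10 first-stage units (sub-divisions), each
# containing 20 second-stage units (villages).
frame = {f"Subdivision-{i}": [f"S{i}-V{j}" for j in range(1, 21)]
         for i in range(1, 11)}

# Stage 1: select 3 first-stage units at random (SRSWOR).
first_stage = random.sample(sorted(frame), 3)

# Stage 2: within the selected first-stage units only, select 5 villages each.
sample = {fsu: random.sample(frame[fsu], 5) for fsu in first_stage}

# The village lists of the other 7 sub-divisions are never needed -- the
# advantage noted in the text.
assert len(sample) == 3 and all(len(v) == 5 for v in sample.values())
```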
3.4 METHOD OF DRAWING A RANDOM SAMPLE
Example 3.6 What are random numbers? [I.C.W.A., June '74; W.B.H.S. '79; C.U., M.Com. '77, '80; M.B.A. '78]
Solution Random numbers (also called Random sampling numbers) refer to ...
The values of parameters are unknown; they are usually estimated by the corresponding statistics. A parameter is a constant, but a statistic varies from sample to sample; this variation is called sampling fluctuation. The probability distribution of a statistic is called its 'sampling distribution', and the standard deviation of the sampling distribution is called the 'standard error' of the statistic. However, since a parameter is a constant, it has neither a sampling distribution nor a standard error.

The following notations will be used to distinguish between statistic and parameter:

                      Parameter (from all     Statistic (from
                      Population Values)      Sample Values)
Mean                  μ                       x̄
Standard Deviation    σ                       s
Proportion            P                       p
rth Raw Moment        μ′r                     m′r
Let us consider a random sample x₁, x₂, ..., xₙ of size n drawn from a population containing N units. Let us further suppose that we are interested in the sampling distribution of the statistic x̄ (i.e., sample mean), where
x̄ = (x₁ + x₂ + ... + xₙ)/n.
If the population size N is finite, there is a finite number (say K) of possible ways of drawing n units in the sample out of a total of N units in the population. For each of these K samples we calculate the value of x̄ (see Tables 3.3 and 3.5; here N = 4, n = 2, and K = N^n = 16 for SRSWR and K = NCn = 6 for SRSWOR). Although the K samples are distinct, the sample means may not be all different; but each of these occurs with equal probability. Thus, we can construct a table showing the set of possible values of the statistic x̄ and also the probability that x̄ will take each of these values. The probability distribution of the statistic x̄ will be called the 'sampling distribution' of sample mean (Tables 3.4 and 3.6). The above method is quite general, and the sampling distribution of any other statistic, say, median, or standard deviation of the sample, may be obtained.
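The construction of the sampling distribution for N = 4, n = 2 may be carried out by enumeration, as sketched below (the population values are invented, standing in for Tables 3.3 and 3.5):

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

# Invented population of N = 4 units; samples of size n = 2.
population = [1, 2, 3, 4]
n = 2

def sampling_distribution(samples):
    """Tabulate P(xbar = value), each of the K samples being equally likely."""
    K = len(samples)
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {mean: Fraction(c, K) for mean, c in sorted(counts.items())}

srswr = sampling_distribution(list(product(population, repeat=n)))    # K = 16
srswor = sampling_distribution(list(combinations(population, n)))     # K = 6

# Under both schemes, the mean of the sampling distribution of xbar equals
# the population mean.
pop_mean = Fraction(sum(population), len(population))
assert sum(m * p for m, p in srswr.items()) == pop_mean
assert sum(m * p for m, p in srswor.items()) == pop_mean
```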
If, however, the number (N) of units in the population is large, the number (K) of possible distinct samples being even larger, the above method of finding the sampling distribution cannot be applied. In this case, the values of x̄ obtained from a large number of samples may be arranged in the form of a relative frequency distribution. The limiting form of this relative frequency distribution, when the number of samples considered becomes infinitely large, will be called the sampling distribution of the statistic. When the population is specified by a theoretical distribution (e.g., binomial, or normal), the sampling distribution can be theoretically obtained. The knowledge of sampling distribution is necessary in finding 'confidence limits' for parameters and in 'testing statistical hypotheses'.
... if the population standard deviations are equal, say σ₁ = σ₂.
3. S.E. of the difference of sample proportions (p₁ − p₂) = √(P₁Q₁/n₁ + P₂Q₂/n₂)
4. For a random sample from a normal population with s.d. σ:
S.E. of sample s.d. (s) = σ/√(2n)   (3.7.5)
S.E. of sample variance (s²) = σ²√(2/n)   (3.7.6)
Formulae (3.7.5) and (3.7.6) are however approximate, and are used only when the sample size n is large (say, greater than 50).
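Formula (3.7.5) may be checked by simulation, as sketched below (a normal population with σ = 2 is assumed; the agreement is approximate):

```python
import math
import random
import statistics

random.seed(0)

# Assumed population: N(0, sigma^2) with sigma = 2; samples of size n = 100.
sigma, n, reps = 2.0, 100, 20_000

sds = []
for _ in range(reps):
    x = [random.gauss(0.0, sigma) for _ in range(n)]
    sds.append(statistics.pstdev(x))        # sample s.d. (divisor n)

observed_se = statistics.pstdev(sds)        # spread of the sample s.d. values
approx_se = sigma / math.sqrt(2 * n)        # formula (3.7.5): sigma/sqrt(2n)

# For large n the simulated and theoretical values agree closely.
assert abs(observed_se - approx_se) / approx_se < 0.05
```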
Example 3.12 Discuss the concept of 'Standard Error' of a statistic. What does the standard error of a statistic measure? [C.U. B.A. (Econ) '8; C.A., Nov. '7; C.U. M.Com. '71, '77; I.C.W.A., June '73, '75, '76, '78; Dec. '81]
Solution Let x₁, x₂, ..., xₙ be a random sample of size n drawn from a specified population. On the basis of this sample, let us calculate the value of a certain statistic, say, sample mean x̄. We repeat the process of drawing a random sample of the fixed size n a large number of times, and calculate the value of the statistic (here, sample mean) for each sample. The relative frequency distribution of these sample means, when the number of samples considered is infinitely large, is called the sampling distribution of sample mean.
The sampling distribution of any statistic will have its own mean, standard deviation, moments, etc. The standard deviation calculated from the sampling distribution of a statistic is called its 'Standard Error'. The standard error gives a measure of dispersion of the concerned statistic. It depends on the sample size n, and goes on diminishing as the sample size increases. It is used to set up 'confidence limits' for population parameters and in 'tests of significance'. Thus, the standard errors of sample mean (x̄) and sample proportion (p) are used to find confidence limits for the population mean (μ) and the population proportion (P) respectively.
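The diminishing of the standard error with increasing sample size may be illustrated by simulation (a normal population with μ = 50, σ = 10 is assumed, so that the S.E. of x̄ is σ/√n):

```python
import math
import random
import statistics

random.seed(1)

# Assumed population: N(mu, sigma^2) with mu = 50, sigma = 10.
mu, sigma, reps = 50.0, 10.0, 5_000

def se_of_mean(n):
    """Empirical standard error of xbar for samples of size n."""
    means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
             for _ in range(reps)]
    return statistics.pstdev(means)

# The standard error diminishes as n increases, closely following
# sigma/sqrt(n) = 5, 2.5, 1.25 for n = 4, 16, 64.
for n in (4, 16, 64):
    print(n, round(se_of_mean(n), 2), round(sigma / math.sqrt(n), 2))
```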
Example 3.13 State the formulae for standard error of sample mean and sample proportion. [M.B.A. '77; I.C.W.A., June ...]
Consequently,
P{|t − θ| < P.E.} = 1/2.
Thus, there is a 50 : 50 chance that the statistic t differs from θ by more than or less than P.E. In this sense, P.E. measures the sampling fluctuation, like S.E.
In random samples of size n from a normal population with mean μ and s.d. σ, it can be shown that very nearly x̄ follows the normal distribution with
Mean = μ, S.D. = σ/√n   (3.8.1)
The variable z = (x̄ − μ)/(σ/√n) is called a Standard Normal Variate. The probability distribution of z is called the Standard Normal Distribution, and is defined by the p.d.f.
p(z) = (1/√(2π)) e^(−z²/2),  (−∞ < z < +∞)   (3.8.2)
Characteristics:
1. The standard normal distribution is a special case of normal distribution with
the mean =0, and s.d. = 1.
2. It has no parameters (unlike the normal distribution, which has two parameters μ and σ).
3. The central moments are μ₂ = 1, μ₃ = 0, μ₄ = 3.
Also, β₁ = 0, β₂ = 3; Skewness (γ₁) = 0, Kurtosis (γ₂) = 0.
4. The standard normal curve is symmetrical about the mean 0, and the two tails of the curve extend to infinity on either side of the mean. The points of inflexion are at z = ±1.
2. The χ²-curve is positively skew, starting from 0 and extending to infinity on the right [Fig. 3.1(B)].
3. If x and y are independent chi-square variates with d.f. n₁ and n₂ respectively, then their sum (x + y) also follows the chi-square distribution with d.f. (n₁ + n₂).
4. When the d.f. n is large, √(2χ²) − √(2n − 1) approximately follows the standard normal distribution.
Theorem II
If z₁, z₂, ..., zₙ are n independent standard normal variates, then z₁² + z₂² + ... + zₙ² follows the chi-square distribution with n degrees of freedom.
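Theorem II may be illustrated by simulation, using the known facts that a chi-square variate with n d.f. has mean n and variance 2n (figures approximate):

```python
import random
import statistics

random.seed(2)

# Sum of squares of n independent standard normal variates ~ chi-square(n),
# whose mean is n and variance 2n.
n, reps = 5, 40_000
chi2 = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(reps)]

mean = statistics.fmean(chi2)
var = statistics.pvariance(chi2)
assert abs(mean - n) < 0.1              # close to n = 5
assert abs(var - 2 * n) < 0.5           # close to 2n = 10
```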
Theorem III
If x₁, x₂, ..., xₙ is a random sample from a normal population with mean μ and s.d. σ, then
Σ (xᵢ − μ)²/σ²   (3.8.11)
follows the chi-square distribution with n degrees of freedom; and
Σ (xᵢ − x̄)²/σ²
follows the chi-square distribution with (n − 1) degrees of freedom (x̄ = mean of the sample).
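The degrees of freedom in Theorem III may be checked by simulation, using the fact that the mean of a chi-square variate equals its degrees of freedom (the population parameters are invented for illustration):

```python
import random
import statistics

random.seed(3)

# Invented population parameters for illustration.
mu, sigma, n, reps = 10.0, 2.0, 8, 30_000

about_mu, about_xbar = [], []
for _ in range(reps):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(x)
    about_mu.append(sum((xi - mu) ** 2 for xi in x) / sigma ** 2)     # n d.f.
    about_xbar.append(sum((xi - xbar) ** 2 for xi in x) / sigma ** 2) # n-1 d.f.

# The mean of a chi-square variate equals its degrees of freedom.
assert abs(statistics.fmean(about_mu) - n) < 0.15
assert abs(statistics.fmean(about_xbar) - (n - 1)) < 0.15
```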
(C) Student's t Distribution
A random variable is said to follow Student's t distribution, or simply t distribution, if its p.d.f. is of the form
f(t) = K (1 + t²/n)^(−(n+1)/2),  (−∞ < t < +∞)   (3.8.13)
where K is a constant. The parameter n (positive integer) is called the number of degrees of freedom (d.f.). This distribution was discovered by W.S. Gosset, who wrote under the pen-name "Student", and hence it is called Student's distribution. A variable which follows Student's distribution is denoted by the symbol t.
The percentage points of the t distribution with n degrees of freedom are denoted by t_p;n, or briefly t_p if the d.f. n is understood. Since the t-curve is symmetrical about zero [Fig. 3.1(C)], t_(1−p) = −t_p; for example, t_.95 = −t_.05.
Characteristics:
1. Mean = 0, S.D. = √(n/(n − 2))  (n > 2).
2. The t-curve is symmetrical about 0, extending from −∞ to +∞ (like the standard normal curve). It has zero skewness and positive kurtosis (leptokurtic), i.e. β₁ = 0, β₂ > 3.
3. When the d.f. n is large, the t distribution can be approximated by the standard normal distribution.
Theorem IV
If z and y are independent random variables, where z follows the standard normal distribution and y follows the chi-square distribution with n degrees of freedom, then
t = z / √(y/n)
follows the t distribution with n degrees of freedom.
In particular, if x₁, x₂, ..., xₙ is a random sample from a normal population with mean μ and s.d. σ, then z = (x̄ − μ)√n/σ follows the standard normal distribution, and y = nS²/σ² follows the chi-square distribution with (n − 1) d.f., where S² = Σ(xᵢ − x̄)²/n is the sample variance; also, the variables z and y are independently distributed.
Theorem V
If a random sample of size n is drawn from a normal population with mean μ and s.d. σ, then
t = (x̄ − μ) / (S/√(n − 1))
follows the t distribution with (n − 1) degrees of freedom, where x̄ and S denote the mean and s.d. of the sample.
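Theorem V may be illustrated by simulation (the population parameters are invented; the s.d. of the t distribution, √(ν/(ν − 2)) with ν degrees of freedom, is taken from the Characteristics above):

```python
import math
import random
import statistics

random.seed(4)

# Invented population parameters; samples of size n = 6, so d.f. = 5.
mu, sigma, n, reps = 0.0, 3.0, 6, 40_000

tvals = []
for _ in range(reps):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(x)
    S = statistics.pstdev(x)                        # s.d. with divisor n
    tvals.append((xbar - mu) / (S / math.sqrt(n - 1)))

df = n - 1
# The t values have mean 0 and s.d. close to sqrt(df/(df - 2)).
assert abs(statistics.fmean(tvals)) < 0.05
assert abs(statistics.pstdev(tvals) - math.sqrt(df / (df - 2))) < 0.1
```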
Theorem VI
If two independent random samples of sizes n₁ and n₂ are drawn from two normal populations with means μ₁ and μ₂ respectively and a common s.d. σ, then
t = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / [S √(1/n₁ + 1/n₂)]   (3.8.16)
follows the t distribution with (n₁ + n₂ − 2) degrees of freedom, where x̄₁, x̄₂ denote the means and S₁, S₂ the s.d.s of the samples, and
S² = (n₁S₁² + n₂S₂²) / (n₁ + n₂ − 2)   (3.8.17)
(D) Snedecor's F Distribution
A random variable is said to follow the F distribution with degrees of freedom (n₁, n₂) if its p.d.f. is of the form
f(F) = K F^((n₁/2) − 1) (1 + n₁F/n₂)^(−(n₁+n₂)/2),  (0 < F < ∞)
where K is a constant. The percentage points of the F distribution with d.f. (n₁, n₂) are denoted by F_p;(n₁, n₂), or briefly F_p if the d.f. are understood. The lower percentage points are given by
F_(1−p);(n₁, n₂) = 1 / F_p;(n₂, n₁)   (3.8.19)
i.e., the lower percentage point is the reciprocal of the upper percentage point with the order of d.f. reversed.
Characteristics:
1. Mean = n₂/(n₂ − 2); Mode = n₂(n₁ − 2) / (n₁(n₂ + 2));
S.D. = (n₂/(n₂ − 2)) √(2(n₁ + n₂ − 2) / (n₁(n₂ − 4)));
provided they exist and are positive.
2. The F-curve is positively skew, and starting from 0 extends to infinity [Fig. 3.1(D)].
Theorem VII
If y₁ and y₂ are independent chi-square variates with degrees of freedom n₁ and n₂ respectively, then
F = (y₁/n₁) / (y₂/n₂)   (3.8.20)
follows the F distribution with degrees of freedom (n₁, n₂).
In view of the results given in Theorem III, and as a consequence of the above theorem, we have
Theorem VIII
If x₁, x₂, ..., xₙ₁ and y₁, y₂, ..., yₙ₂ are independent random samples of sizes n₁ and n₂ respectively from two normal populations with means μ₁, μ₂ and common standard deviation σ, then