Principles of Sampling
Dr. Kazi Saleh Ahmed
President of FREPD
To make inference about a population from
information contained in the population or in
the sample.
Inference = Estimation or Test of Hypothesis
A population is a collection of all elements about which we wish to make inference.
Examples:
All participants in the class
All shops in the new market
All voters in the city
All business enterprises in a country
Sampling units are non-overlapping collections of elements from the population that cover the entire population.
Households consisting of members
Villages consisting of households
Sampling frame is a list of sampling units or elements
List of all Participants
List of all households
List of villages etc
A sample is a collection of sampling units drawn from a frame.
Any part of the population is called a sample.
The part may be drawn randomly (probability) or purposely (non-probability).
Sample size n = 1, 2, ..., N-1, where N = population size.
Parameters and Statistics
Numerical descriptive measures of the population are called parameters. Thus, a parameter is a function of the observations in the population.
Population mean: $\bar{Y} = \frac{1}{N}(Y_1 + Y_2 + \dots + Y_N)$
A statistic is a descriptive measure of the sample observations.
Sample mean: $\bar{y} = \frac{1}{n}(y_1 + y_2 + \dots + y_n)$
Parameters
Population total: $Y, X, \dots, Z$
Population mean: $\bar{Y}, \bar{X}, \dots, \bar{Z}$
Population proportion: $P_1, P_2, \dots, P_K$
Population ratio: $Y/X$ or $X/Y$
Population variance: $\sigma_y^2, \sigma_x^2, \dots$ or $S_y^2, S_x^2, \dots$
Statistics
Sample total: $y, x, \dots, z$
Sample mean: $\bar{y}, \bar{x}, \dots, \bar{z}$
Sample proportion: $p_1, p_2, \dots, p_K$
Sample ratio: $y/x$ or $x/y$
Sample variance: $s_y^2, s_x^2, \dots$
Parameters and Estimates of Parameters
If $\theta$ is a parameter, $\theta = f(Y_1, Y_2, \dots, Y_N)$.
And $\hat{\theta}$ is the estimate of $\theta$, $\hat{\theta} = f(y_1, y_2, \dots, y_n)$.
$\hat{\theta}$ is a statistic used to estimate $\theta$ (known as an estimator) or to test hypotheses regarding $\theta$ (known as a test statistic).
Population: A, B, C, D, E
All possible samples of size 1: A, B, C, D, E
All possible samples of size 2: AB, AC, AD, AE, BC, BD, BE, CD, CE, DE
All possible samples of size 3: ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE
All possible samples of size 4: ABCD, ABCE, ABDE, ACDE, BCDE
Complete count (census): ABCDE
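Let $Y_1 = 1, Y_2 = 2, Y_3 = 3, Y_4 = 4, Y_5 = 5$
Population total = 15
Population mean = 3
Sample mean for n = 2: $\bar{y} = \frac{y_i + y_j}{2}$
Estimate of population total: $\hat{Y} = N\bar{y}$, where
$\bar{y}_1 = 1.5,\ \bar{y}_2 = 2.0,\ \bar{y}_3 = 2.5,\ \bar{y}_4 = 3.0,$
$\bar{y}_5 = 2.5,\ \bar{y}_6 = 3.0,\ \bar{y}_7 = 3.5,$
$\bar{y}_8 = 3.5,\ \bar{y}_9 = 4.0,\ \bar{y}_{10} = 4.5$
Surprisingly, mean of means $= \frac{1}{10}(1.5 + \dots + 4.5) = 3$, the population mean.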
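The enumeration above can be checked with a short Python sketch. It lists every sample of size 2 from the five-unit population, computes each sample mean and the corresponding estimate of the population total, and compares the mean of the sample means with the population mean. The variable names are illustrative only and do not come from the slides.

```python
from itertools import combinations

# Population from the example: units A..E with values 1..5.
population = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
N = len(population)
n = 2

# All possible samples of size n (without replacement, order ignored).
samples = list(combinations(population, n))

# Sample mean and estimated population total (N * sample mean) per sample.
sample_means = [sum(population[u] for u in s) / n for s in samples]
estimated_totals = [N * m for m in sample_means]

mean_of_means = sum(sample_means) / len(sample_means)
population_mean = sum(population.values()) / N

for s, m, t in zip(samples, sample_means, estimated_totals):
    print("".join(s), m, t)
print("mean of means:", mean_of_means, "population mean:", population_mean)
```

The last line prints 3.0 for both quantities, which is the unbiasedness of the sample mean seen on a population small enough to enumerate.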
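If N is the population size with measures $Y_1, Y_2, \dots, Y_N$, then for sample size n we can draw $\binom{N}{n}$ samples.
N = 3, n = 2; A, B, C: AB, AC, BC
$\bar{Y} = \frac{1}{N}(Y_1 + \dots + Y_N)$, the population mean
$\bar{y}_1, \bar{y}_2, \dots, \bar{y}_{\binom{N}{n}}$ are estimates of $\bar{Y}$.
Some $\bar{y} > \bar{Y}$, some $\bar{y} < \bar{Y}$ and some $\bar{y} = \bar{Y}$.
We can easily find $\Delta$ such that
$\Pr\{|\bar{y} - \bar{Y}| \le \Delta\} = .95$
or $\Pr\{|\bar{y} - \bar{Y}| > \Delta\} = .05$
If $\Delta = 2\sigma_{\bar{y}}$, then $\alpha = .05$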
Concept of Sampling
A sample is a subgroup of the population. Sampling is the process of selecting n elements out of N in order to estimate the population parameter $\theta$, based on the sample estimator t.
Advantage and Disadvantage of Sampling
Advantage: Saves time, cost and management effort.
Disadvantage: You get only an estimate, which carries an error (the sampling error).
The sampling theory helps determine sampling error.
An Example
A = Abul, B = Babul, C = Chandan, D = Dhar
Age: 18, 19, 20, 21
Population mean $\theta = \frac{18 + 19 + 20 + 21}{4} = \frac{78}{4} = 19.5$
Sample means t (n = 2): 18.5, 19.0, 19.5, 19.5, 20.0, 20.5
Difference between t and $\theta$:
t: 18.5, 19.0, 19.5, 19.5, 20.0, 20.5
$\theta$: 19.5, 19.5, 19.5, 19.5, 19.5, 19.5
$t - \theta$: -1.0, -0.5, 0, 0, 0.5, 1.0
Property 1: $\sum (t - \theta) = 0$, i.e. $\bar{t} = 117/6 = 19.5$ yrs $= \theta$
Property 2: $|t - \theta|$ is minimum for higher sample sizes.
Thus the greater the sample size, the more accurate the estimator of the population mean.
Property 3: The sampling error increases with $\sigma^2$.
Property 4: $\sum (t - \theta)^2 < \sum (t - a)^2$, where $a \ne \theta$.
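These properties can be checked numerically on the four ages of the example (18, 19, 20, 21). The sketch below is only an illustration: it enumerates all samples of sizes 1, 2 and 3, records the largest absolute error of the sample mean for each size, and compares the sum of squared deviations about the population mean with the sum about another value. The comparison value 19.0 is an arbitrary choice made for this illustration.

```python
from itertools import combinations

ages = [18, 19, 20, 21]            # ages of the four people in the example
theta = sum(ages) / len(ages)      # population mean, 19.5

def sample_means(values, n):
    """Means of all possible samples of size n (without replacement)."""
    return [sum(s) / n for s in combinations(values, n)]

# Property 1: deviations of the sample means from theta sum to zero (n = 2).
t = sample_means(ages, 2)
sum_of_deviations = sum(m - theta for m in t)

# Property 2: the largest absolute error shrinks as the sample size grows.
worst_error = {n: max(abs(m - theta) for m in sample_means(ages, n))
               for n in (1, 2, 3)}

# Property 4: squared deviations are smallest about theta.
a = 19.0                           # arbitrary value different from theta
ss_about_theta = sum((m - theta) ** 2 for m in t)
ss_about_a = sum((m - a) ** 2 for m in t)

print(sum_of_deviations)           # 0.0
print(worst_error)                 # {1: 1.5, 2: 1.0, 3: 0.5}
print(ss_about_theta, ss_about_a)  # 2.5 4.0
```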
Types of Sampling
Random/Probability: Simple random, stratified,
Cluster, systematic, multistage.
Non-Probability: Quota, Judgment, accidental,
snowball, expert sampling.
Random Sampling
Every element has an equal chance of being included in the sample.
Methods of drawing:
(i) Fishbowl draw
(ii) Computer program
(iii) Random numbers
Stratified Sampling
If a population is heterogeneous, mainly due to, for example, rural and urban or male and female differences, then we form rural and urban strata, or male and female strata, so that each stratum is homogeneous.
Apply SRS to each stratum.
Clusters are heterogeneous in a large population. Out of N clusters, n are selected with equal or unequal probability.
Quota: Divide the population into groups and collect data from a pre-determined number available to participate in the process.
Accidental
You start collecting data from people as you meet them and stop as soon as the pre-determined number is reached. In accidental sampling no quota is used.
Judgment
The primary consideration is the researcher's judgment as to who would be the best resource persons to provide the information.
Expert Sampling
If in judgment sampling the judgment comes from expert persons, then it is expert sampling.
Snowball
A list of sensitive people is not available, and it is difficult to find them. First select some, collect information and ask them to identify more. The process continues until the saturation point is reached.
Variance, cost and sample size
For an accurate estimate you need a big sample.
$n = f(s^2, c)$
A large $s^2$ calls for a large n, but if you increase n, cost increases.
Cost is one constraint.
Drawing a Simple Random Sample using Random Numbers
a) Prepare a sampling frame.
b) Identify the number of elements, N, in the frame.
c) If N has three digits, then choose 3 columns randomly from the random number table. The number read may be ≤ N or > N. If it is ≤ N, select the individual having that number in the list.
d) If the number is > N, then divide the number by N; the remainder corresponds to your selected number.
Demonstrate with N = 20, n = 3, and a random number table.
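Steps (c) and (d) can be sketched in Python for N = 20 and n = 3. The table entries below are hypothetical two-digit numbers standing in for a real random number table. The slides do not say how to handle a zero remainder or a repeated number, so mapping a zero remainder to unit N and skipping repeats are both assumptions of this sketch.

```python
def select_units(random_numbers, N, n):
    """Pick n distinct unit numbers in 1..N from a stream of table numbers."""
    selected = []
    for r in random_numbers:
        if r < 1:
            continue              # assumption: an entry of 0 is skipped
        # Step (c): use the number directly when it does not exceed N.
        # Step (d): otherwise use the remainder after dividing by N.
        unit = r if r <= N else r % N
        if unit == 0:
            unit = N              # assumption: zero remainder means unit N
        if unit not in selected:  # assumption: repeats are skipped
            selected.append(unit)
        if len(selected) == n:
            break
    return selected

# Hypothetical two-digit entries read from two columns of a table.
table_numbers = [57, 12, 83, 40, 12, 9]
print(select_units(table_numbers, N=20, n=3))  # [17, 12, 3]
```

The remainder rule gives every unit exactly the same chance only when the range of table numbers is a multiple of N. Otherwise, numbers beyond the largest such multiple would need to be discarded to keep the chances equal.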
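In SRS: for N, n, $\sigma^2$, $\bar{Y}$, $\bar{y}$
$E(\bar{y}) = \bar{Y}$; $\bar{y}$ is unbiased for $\bar{Y}$
$V(\bar{y}) = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$
$v(\bar{y}) = \frac{s^2}{n} \cdot \frac{N-n}{N}$, where $s^2 = \frac{1}{n-1}\sum (y_i - \bar{y})^2$
Bound on the error of estimation
Upper: $\bar{y} + 2\sigma(\bar{y})$
Lower: $\bar{y} - 2\sigma(\bar{y})$
$P(\text{lower} < \bar{Y} < \text{upper}) = .95$
$n = \frac{n_0}{1 + n_0/N}$
e: margin of error, $|\bar{y} - \bar{Y}| = e$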
We can first find $n_0$ and then n ($n_0 = 42$ for s = 10).
When $S^2$ is not known, we use $S_1^2$, where $S_1^2$ and $S^2$ are close and $S_1^2$ is known.
Otherwise use $\sigma^2$, where $\sigma = \frac{\text{range}}{4}$.
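The SRS variance results can be verified by enumeration on the small population Y = 1, 2, 3, 4, 5 with n = 2. The sketch below assumes the usual forms $V(\bar{y}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$ (with $\sigma^2$ computed with divisor N) and $v(\bar{y}) = \frac{s^2}{n}\cdot\frac{N-n}{N}$. It checks that the first equals the actual variance of the ten sample means and that the second averages to the same value over all samples.

```python
from itertools import combinations

Y = [1, 2, 3, 4, 5]
N, n = len(Y), 2

def mean(xs):
    return sum(xs) / len(xs)

def var(xs, divisor):
    """Sum of squared deviations about the mean, divided by `divisor`."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / divisor

samples = list(combinations(Y, n))
ybars = [mean(s) for s in samples]

# Actual variance of the sample mean over all possible samples.
var_of_means = var(ybars, len(ybars))

# Formula value, with sigma^2 the population variance (divisor N).
sigma2 = var(Y, N)
formula_var = sigma2 / n * (N - n) / (N - 1)

# Sample-based estimate v(ybar) for every sample, then its average.
estimates = [var(s, n - 1) / n * (N - n) / N for s in samples]
mean_estimate = mean(estimates)

# Bound on the error of estimation: two standard errors.
bound = 2 * formula_var ** 0.5

print(var_of_means, formula_var, mean_estimate, bound)
```

All three variance figures come out at 0.75 (up to floating-point rounding), so the bound is about 1.73.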
Estimation for P
P is the parameter: proportion of an attribute (Q = 1 - P)
p is the sample estimate (q = 1 - p)
$E(p) = P$; p is unbiased
$V(p) = \frac{PQ}{n} \cdot \frac{N-n}{N-1}$
$v(p) = \frac{pq}{n-1} \cdot \frac{N-n}{N}$
$n_0 = \frac{pq\,z^2}{e^2}$
p = .5, e = .05 or .01
$n_0 \approx 9600$ for e = .01
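The sample size figure can be reproduced with a few lines of Python. This is a sketch under one assumption: z = 1.96 (the 95% normal value), which the slide does not state explicitly but which gives roughly the quoted 9600 for p = .5 and e = .01. The function names are illustrative.

```python
def n0_for_proportion(p, e, z=1.96):
    """Initial sample size n0 = z^2 * p * q / e^2 (z = 1.96 is an assumption)."""
    q = 1 - p
    return z ** 2 * p * q / e ** 2

def adjust_for_population(n0, N):
    """Finite-population adjustment n = n0 / (1 + n0 / N)."""
    return n0 / (1 + n0 / N)

for e in (0.05, 0.01):
    print(e, round(n0_for_proportion(0.5, e)))
```

With p = .5 this gives about 384 for e = .05 and about 9604 for e = .01. Using p = .5 is the conservative choice because pq is largest there.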
Stratified Random Sampling
A population of N units is divided into k mutually exclusive homogeneous groups. The number of units in the ith group is $N_i$, with $\sum N_i = N$.
The sample size for the ith group is $n_i$, and the sample mean is $\bar{y}_i$. $\bar{y}_i$ is unbiased for $\bar{Y}_i$: $E(\bar{y}_i) = \bar{Y}_i$.
$n = \sum n_i$
$\bar{y}_{st} = \frac{1}{N}\sum N_i \bar{y}_i$
$v(\bar{y}_{st}) = \frac{1}{N^2}\sum N_i^2 \cdot \frac{N_i - n_i}{N_i} \cdot \frac{s_i^2}{n_i}$
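Let us consider 2 strata, one for boys and one for girls.
$N_1 = 6$, $N_2 = 4$, $N = N_1 + N_2 = 10$
$Y_{1j}$: 25, 24, 25, 26, 25, 25; $Y_1 = 150$, $\bar{Y}_1 = 25$
$Y_{2j}$: 29, 30, 30, 31; $Y_2 = 120$, $\bar{Y}_2 = 30$, $\bar{Y} = 27$
$n_1 = 3$, $n_2 = 2$
$y_{1j}$: 24, 26, 25; $y_1 = 75$, $\bar{y}_1 = 25$
$y_{2j}$: 29, 31; $y_2 = 60$, $\bar{y}_2 = 30$
$\bar{y}_{st} = \frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N} = \frac{6 \times 25 + 4 \times 30}{10} = 27$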
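The boys and girls example can be worked through in Python. The sketch computes the stratified mean from the two stratum samples and, as an added step, the usual variance estimate $v(\bar{y}_{st}) = \frac{1}{N^2}\sum N_i^2 \cdot \frac{N_i - n_i}{N_i} \cdot \frac{s_i^2}{n_i}$. The data are those of the example; the dictionary layout and names are assumptions made for illustration.

```python
# Stratum sizes and the sampled values from the example.
strata = {
    "boys":  {"N": 6, "sample": [24, 26, 25]},
    "girls": {"N": 4, "sample": [29, 31]},
}

def mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance s^2 with divisor n - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

N = sum(s["N"] for s in strata.values())

# Stratified mean: stratum sample means weighted by stratum size.
y_st = sum(s["N"] * mean(s["sample"]) for s in strata.values()) / N

# Estimated variance of the stratified mean.
v_st = sum(
    s["N"] ** 2
    * (s["N"] - len(s["sample"])) / s["N"]
    * sample_variance(s["sample"]) / len(s["sample"])
    for s in strata.values()
) / N ** 2

print(y_st)  # 27.0
print(v_st)
```

Here the stratum sample variances are 1 and 2, so the variance estimate works out to (6 + 8)/100 = 0.14.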
Allocation of Sample Size n to Strata
a. Proportional allocation
$n_i = n \cdot \frac{N_i}{N}$
$n_1 = 5 \cdot \frac{6}{10} = 3$
$n_2 = 5 \cdot \frac{4}{10} = 2$
b. Equal allocation: regardless of $N_1$ and $N_2$,
$n_1 = n_2 = \frac{n}{2}$
c. Neyman allocation:
$n_i = n \cdot \frac{N_i \sigma_i}{\sum N_i \sigma_i}$
Putting in $\sigma_1$ and $\sigma_2$, we get $n_1$ and $n_2$.
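The three allocation rules can be compared side by side. The sketch below uses n = 5 and the two strata of the example (N1 = 6, N2 = 4). For Neyman allocation it computes the stratum standard deviations from the full stratum values given in the example, using divisor $N_i$, which is an assumption about how $\sigma_i$ is defined. In practice the results are rounded to whole units.

```python
def proportional(n, sizes):
    """n_i = n * N_i / N."""
    total = sum(sizes)
    return [n * Ni / total for Ni in sizes]

def equal(n, sizes):
    """n_i = n / k, regardless of stratum size."""
    return [n / len(sizes)] * len(sizes)

def neyman(n, sizes, sigmas):
    """n_i = n * N_i * sigma_i / sum(N_i * sigma_i)."""
    weights = [Ni * si for Ni, si in zip(sizes, sigmas)]
    total = sum(weights)
    return [n * w / total for w in weights]

def pop_sd(values):
    """Standard deviation with divisor N (assumed definition of sigma_i)."""
    m = sum(values) / len(values)
    return (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5

boys = [25, 24, 25, 26, 25, 25]
girls = [29, 30, 30, 31]
sizes = [len(boys), len(girls)]
sigmas = [pop_sd(boys), pop_sd(girls)]

print(proportional(5, sizes))   # [3.0, 2.0]
print(equal(5, sizes))          # [2.5, 2.5]
print(neyman(5, sizes, sigmas))
```

Neyman allocation shifts a little of the sample toward the girls' stratum because its values are more variable.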
Systematic Sampling
Let the N units be arranged in sequence $Y_1, Y_2, \dots, Y_N$. The sample size is n, so that nk = N.
$Y_1,\ Y_{k+1},\ \dots,\ Y_{(n-1)k+1}$
$Y_2,\ Y_{k+2},\ \dots,\ Y_{(n-1)k+2}$
...
$Y_k,\ Y_{2k},\ \dots,\ Y_{nk}$
Draw a random number and select one unit from the first k units. Let the selected unit be 2.
Then the sample is: $Y_2,\ Y_{2+k},\ Y_{2+2k},\ \dots,\ Y_{2+(n-1)k}$
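Example.
Unit:  1  2  3  4  5  6  7  8  9 10 11 12
Value: 10 11 11 12 13 14 15 15 16 16 17 18
Let n = 4, k = 3, nk = 12.
10 11 11
12 13 14
15 15 16
16 17 18
The first number drawn is 2.
The sample is: 11, 13, 15, 17.
$\bar{y}_{sy}$ = sample mean $= \frac{56}{4} = 14$
Three means: $\bar{y}_1 = \frac{53}{4} = 13.25$, $\bar{y}_2 = \frac{56}{4} = 14$, $\bar{y}_3 = \frac{59}{4} = 14.75$
$\sigma^2_{\bar{y}_{sy}} = \frac{1}{k}\sum (\bar{y}_i - \bar{Y})^2 = \frac{1}{3}\sum (\bar{y}_i - 14)^2$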
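The systematic sampling example can be reproduced in Python. The sketch below builds all k = 3 possible systematic samples from the 12 listed values, computes their means, and evaluates the variance of the systematic sample mean as the average squared deviation of those means from the population mean. The function and variable names are illustrative.

```python
values = [10, 11, 11, 12, 13, 14, 15, 15, 16, 16, 17, 18]
n, k = 4, 3                      # n * k = N = 12

def systematic_sample(values, start, k):
    """Every k-th value, beginning at the 1-based position `start`."""
    return values[start - 1::k]

# One possible sample for each random start 1..k.
samples = {start: systematic_sample(values, start, k)
           for start in range(1, k + 1)}
means = {start: sum(s) / len(s) for start, s in samples.items()}

pop_mean = sum(values) / len(values)

# Variance of the systematic sample mean over the k equally likely starts.
var_sy = sum((m - pop_mean) ** 2 for m in means.values()) / k

print(samples[2])   # [11, 13, 15, 17]
print(means)        # {1: 13.25, 2: 14.0, 3: 14.75}
print(var_sy)       # 0.375
```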
Cluster Sampling
Sometimes the sampling frame is not available, and preparing a new frame is costly. We then resort to cluster sampling, which gives more information per unit cost than SRS, stratified or systematic sampling.
A cluster sample is a simple random sample in which each sampling unit is a collection of elements, i.e. a cluster.
A household is an element and household income is the variable. A village is a cluster of households. All households of the union are our survey population.
Let the total number of households in the union be NM, where
N = total number of villages, M = number of households in a village.
$Y_1$ = total income of all households in village 1
$Y_2$ = total income of all households in village 2
...
$Y_N$ = total income of all households in village N
Let us draw a sample of n villages, and let the total incomes of the n villages be $y_1, y_2, \dots, y_n$. Here $y_i = \sum_{j=1}^{M} y_{ij}$ and $Y_i = \sum_{j=1}^{M} Y_{ij}$.
$\bar{y} = \frac{1}{nM}\sum_{i=1}^{n} y_i$ estimates $\bar{Y} = \frac{1}{NM}\sum_{i=1}^{N} Y_i$
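A minimal sketch of the cluster estimate follows, assuming every village has the same number of households M and that the mean income per household is estimated by $\bar{y} = \frac{1}{nM}\sum y_i$. All figures below (N, M and the village totals) are hypothetical values invented for illustration; they do not come from the slides.

```python
# Hypothetical setting: N villages in the union, M households per village.
N, M = 5, 4

# Hypothetical total household income for each sampled village (y_i).
sampled_village_totals = [120, 100, 140]
n = len(sampled_village_totals)

# Estimated mean income per household: total over sampled villages / (n * M).
y_bar = sum(sampled_village_totals) / (n * M)

# Estimated total income of all N * M households in the union.
estimated_total = N * M * y_bar

print(y_bar)            # 30.0
print(estimated_total)  # 600.0
```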