
Sampling

Prof Amira Gamal


MUST University
Objectives of presentation

• Definition of sampling
• Why do we use samples?
• Concept of representativeness
• Main methods of sampling
• Sampling error
• Sample size calculation
Definition of sampling

Procedure by which some members of a given population are selected as
representatives of the entire population
Definition of sampling terms

Population:
Collection of units sharing a common characteristic
Finite population: all units can be counted, e.g., students in a school
Infinite population: counting all units is not feasible, e.g., the RBCs
of an individual
Sample:
A subset of a population obtained to investigate
properties of the parent population
Definition of sampling terms
• Target population:
• Population upon which the results of the study will be
generalized
• Sampling population:
• Population from which the sample was taken
Sampling unit
Population unit used for sampling
• Subject under observation on which information is collected
• Example: Children <5 years, hospital discharges, health
events…
Definition of sampling terms

Sampling frame
• Any list of all the sampling units in the population
• List of households, health care units…
Sampling scheme
• Method of selecting sampling units from sampling
frame
• Randomly, convenience sample…
Determination of sample population

• Feasibility:
• Reachable sampling population e.g., hospital based
population

• Definition of sampling unit:


• Inclusion and exclusion criteria
Why do we use samples?

Get information from large populations


• At minimal cost
• At maximum speed
• At increased accuracy
• Using enhanced tools
[Figure: Sampling, precision versus cost]
Types of samples

• Non-probability samples:
• Generalization from study results is not possible since
representative-ness of the sample cannot be assumed

• Probability samples
Non probability samples
• Quotas
• Sample reflects population structure, e.g., 60 males and 60
females are interviewed; within each gender, 20 individuals are aged
<20 years, 20 are aged >60 years and 20 are in between, so each
age group within each gender has a quota of 20
• Time/resources constraints
• Convenience samples (purposive units)
• Biased: investigator selects a convenient sample e.g., assess
opinion of patients about service, the investigator decides to
interview all patients coming to his office today
Probability of being chosen: unknown
Probability samples
• Random sampling
• Each subject has a known probability of being chosen
• Reduces possibility of selection bias
Methods used in probability samples

• Simple random sampling


• Systematic sampling
• Stratified sampling
• Multistage sampling
• Cluster sampling
Simple random sampling
• Principle
– Equal chance of drawing each unit
– Basic sampling method
– Needs complete sampling frame
– Needs sample size
• Procedure
– Number all units
– Randomly draw units, lottery method
– Table of random numbers
Simple random sampling

• Advantages
– Simple
– Sampling error easily measured

• Disadvantages
– Need complete list of units
– Does not always achieve best representative-ness
– Units may be scattered
Simple random sampling
Example: evaluate the prevalence of tooth decay among
the 1200 children attending a school

• List of children attending the school


• Children numbered from 1 to 1200
• Sample size = 100 children
• Random sampling of 100 numbers between 1 and 1200

How to randomly select?


Simple random sampling
Table of random numbers

57172 42088 70098 11333 26902 29959 43909 49607


33883 87680 28923 15659 09839 45817 89405 70743
77950 67344 10609 87119 15859 74577 42791 75889
11607 11596 01796 24498 17009 67119 00614 49529
56149 55678 38169 47228 49931 94303 67448 31286
80719 65101 77729 83949 83358 75230 56624 27549
93809 19505 82000 79068 45552 86776 48980 56684
40950 86216 48161 17646 24164 35513 94057 51834
12182 59744 65695 83710 41125 14291 74773 66391
13382 48076 73151 48724 35670 38453 63154 58116
38629 94576 48859 75654 17152 66516 78796 73099
60728 32063 12431 23898 23683 10853 04038 75246
01881 99056 46747 08846 01331 88163 74462 14551
23094 29831 95387 23917 07421 97869 88092 72201
15243 21100 48125 05243 16181 39641 36970 99522
53501 58431 68149 25405 23463 49168 02048 31522
07698 24181 01161 01527 17046 31460 91507 16050
22921 25930 79579 43488 13211 71120 91715 49881
68127 00501 37484 99278 28751 80855 02035 10910
55309 10713 36439 65660 72554 77021 46279 22705
92034 90892 69853 06175 61221 76825 18239 47687
50612 84077 41387 54107 09190 74305 68196 75634
81415 98504 32168 17822 49946 37545 47201 85224
38461 44528 30953 08633 08049 68698 08759 45611
07556 24587 88753 71626 64864 54986 38964 83534
60557 50031 75829 05622 30237 77795 41870 26300
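As an alternative to reading a printed table of random numbers, the same draw can be made with a software random number generator. The sketch below is only an illustration for the school example (1200 children, sample of 100); the function name `simple_random_sample` and the `seed` parameter are assumptions introduced here, not part of the original slides.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size.

    Every unit has the same chance of selection (sampling without
    replacement). The seed is optional and only makes the draw repeatable.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# School example: children numbered 1 to 1200, sample size 100
selected_children = simple_random_sample(1200, 100, seed=42)
print(len(selected_children))   # 100 distinct child numbers
print(selected_children[:5])    # first few selected numbers
```

The children whose list numbers appear in `selected_children` would then be examined for tooth decay.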
Systematic random sampling
• Basic method, needs:
• Complete sampling frame list
• Numbering of sampling units
• Sample size
• Method:
• Determine the sampling interval k (every kth unit is selected), as
in the next example
Systematic sampling

• N = 1200, and n = 60
⇒ sampling interval = 1200/60 = 20 (sampling fraction = 60/1200 = 1/20)
• List persons from 1 to 1200
• Randomly select a number between 1 and 20 (e.g., 8)
⇒ 1st person selected = the 8th on the list
⇒ 2nd person = 8 + 20 = the 28th
etc.
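The procedure above can be sketched in a few lines of code. This is a minimal illustration using the numbers from the example (N = 1200, n = 60, random start 8); the function name `systematic_sample` and its parameters are assumptions made for this sketch.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit from a list numbered 1..population_size.

    k is the sampling interval N // n. If no start is given, a random
    start between 1 and k is drawn.
    """
    interval = population_size // sample_size
    if start is None:
        start = random.Random(seed).randint(1, interval)
    return [start + i * interval for i in range(sample_size)]

# Example from the slide: N = 1200, n = 60, random start = 8
selected = systematic_sample(1200, 60, start=8)
print(selected[:3])   # [8, 28, 48]
print(selected[-1])   # 1188
```

Only the starting point is random; every later unit is fixed by the interval, which is why the list should not have a periodic pattern that coincides with the interval.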
Systematic sampling
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

31 32 33 34 35 36 37 38 39 40 41 42 43 44 45

46 47 48 49 50 51 52 53 54 55 ……..
[Figure: Example of systematic sampling]
Stratified random sampling

• Suitable to ensure representation of certain subcategories of the
population in the sample
Stratified random sampling
• To ensure representation of certain population subcategories or
strata suspected to affect the research question.
Example:
• In the study of the relationship between hypercholesterolemia and
ischemic heart disease (IHD), gender and smoking are important
factors that might affect this association.
Stratified random sampling
• Hence, we need to study it in various subcategories or strata of
gender and smoking:
• Male smokers
• Female smokers
• Male non-smokers
• Female non-smokers
• It is a process of dividing the population into mutually exclusive
strata, and sampling from these various strata.
• Better representativeness
• Lower sampling error
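A minimal sketch of stratified random sampling for the four gender-by-smoking strata is shown below. The stratum sizes (400, 100, 300, 200) and the 10% sampling fraction are hypothetical values chosen only for illustration, and proportional allocation is just one possible way to divide the sample among strata; the function name `stratified_sample` is likewise an assumption.

```python
import random

def stratified_sample(strata, sampling_fraction, seed=None):
    """Draw a simple random sample separately within each stratum.

    strata maps a stratum name to the list of its units. The same
    sampling fraction is applied in every stratum (proportional
    allocation).
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        k = round(len(units) * sampling_fraction)
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical stratum sizes, for illustration only
sizes = {
    "Male smokers": 400,
    "Female smokers": 100,
    "Male non-smokers": 300,
    "Female non-smokers": 200,
}
strata = {name: [f"{name} #{i}" for i in range(1, n + 1)]
          for name, n in sizes.items()}

sample = stratified_sample(strata, sampling_fraction=0.1, seed=7)
for name, units in sample.items():
    print(name, len(units))
```

Because each stratum is sampled separately, every subcategory is guaranteed to appear in the sample, which a single simple random draw from the whole population does not guarantee.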
Multiple stage random sampling

Principle
• = consecutive samplings
• example :
sampling unit = household

• 1st stage : drawing areas or blocks


• 2nd stage : drawing buildings, houses
• 3rd stage : drawing households
Multiple stage random sampling

• Mostly used in surveys


• Needs: sampling frame of the first population and of
subsequently selected sampling units
• Sample size
• Method:
• Determine different stages of samples
• Required sample size in each stage
• Use random or systematic random methods
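The three-stage household example (areas, then buildings, then households) can be sketched as nested random draws. The sampling frame below (10 areas, 5 buildings per area, 8 households per building) and the sample sizes at each stage are hypothetical, and the function name `multistage_sample` is an assumption made for this illustration.

```python
import random

def multistage_sample(areas, n_areas, n_buildings, n_households, seed=None):
    """Three consecutive random draws: areas, buildings, households.

    areas maps area name -> {building name -> list of households}.
    A frame of buildings and households is needed only for the units
    selected at the previous stage.
    """
    rng = random.Random(seed)
    households = []
    for area in rng.sample(sorted(areas), n_areas):              # 1st stage
        buildings = areas[area]
        for building in rng.sample(sorted(buildings), n_buildings):  # 2nd stage
            households.extend(
                rng.sample(buildings[building], n_households))   # 3rd stage
    return households

# Hypothetical frame: 10 areas x 5 buildings x 8 households
areas = {
    f"area{a}": {
        f"area{a}-bldg{b}": [f"area{a}-bldg{b}-hh{h}" for h in range(1, 9)]
        for b in range(1, 6)
    }
    for a in range(1, 11)
}

sample = multistage_sample(areas, n_areas=3, n_buildings=2,
                           n_households=4, seed=1)
print(len(sample))   # 3 * 2 * 4 = 24 households
```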
Multiple stage random sampling
Country → sampling unit: province
Provinces → sampling unit: city
Cities → sampling unit: district
Districts → sampling unit: household
Households → sampling unit: person
Persons
[Figure: Multistage sampling]
Cluster sampling
• Example:
• In a study of the prevalence of schistosomiasis in a village, the
sampling units were households.
• It was decided to select households by simple random
sampling.
• The village had 10,000 households. The 300 households required for
the sample were found to be scattered over an area of 600 km².
Time and resources would not allow the investigator to undertake
this tedious field work.
• What would he do?
Cluster sampling
• The investigator noticed that:
• The village consisted of 100 small areas, each containing about
80 to 120 households.
• All the small areas were quite similar regarding socio-
demographic characteristics and factors related to the
disease.
• Each small area contained various categories of age,
gender, occupation, social class, education, etc.
Cluster sampling
• Each small area was considered as a cluster of
households
• A sampling frame of the 100 small areas was prepared
• A simple random sample of "n" clusters was selected
• All the households of the selected clusters were included
in the sample
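The cluster procedure above can be sketched as follows. The cluster sizes are generated at random between 80 and 120 households purely as stand-in data, and n = 3 clusters is an assumed value (about 300 households at roughly 100 per cluster); the slides leave "n" unspecified. The function name `cluster_sample` is also an assumption.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include all of their units.

    clusters maps a cluster id to the list of households it contains.
    Returns the chosen cluster ids and every household inside them.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    households = [hh for c in chosen for hh in clusters[c]]
    return chosen, households

# Stand-in sampling frame: 100 small areas with 80 to 120 households each
frame_rng = random.Random(0)
clusters = {
    area: [f"area{area}-hh{i}"
           for i in range(1, frame_rng.randint(80, 120) + 1)]
    for area in range(1, 101)
}

# Assumed n = 3 clusters, giving roughly 300 households
chosen, households = cluster_sample(clusters, n_clusters=3, seed=1)
print(chosen, len(households))
```

Only a list of the 100 areas is needed as a sampling frame; households are listed only inside the selected areas, which keeps the field work geographically compact.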
[Figure: Example of cluster sampling, with the area divided into Sections 1 to 5]
Selecting a sampling method

• Population to be studied
• Size/geographical distribution
• Heterogeneity with respect to variable
• Level of precision required
• Resources available
• Importance of having a precise estimate of the
sampling error
Sample size formula in descriptive survey

Simple random / systematic sampling
$$n = \frac{z^2 \, p \, q}{d^2} = \frac{1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 544$$

Cluster sampling
$$n = g \cdot \frac{z^2 \, p \, q}{d^2} = \frac{2 \times 1.96^2 \times 0.15 \times 0.85}{0.03^2} \approx 1088$$

z: alpha risk expressed as a z-score (1.96 for a two-sided alpha of 0.05)
p: expected prevalence
q: 1 - p
d: absolute precision
g: design effect
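The two calculations above can be checked with a short function. This is a sketch using the symbols defined above (z, p, q, d, g); the function name `sample_size` and its default arguments are assumptions made for illustration.

```python
def sample_size(p, d, z=1.96, design_effect=1.0):
    """Sample size for estimating a proportion in a descriptive survey.

    p: expected prevalence, d: absolute precision,
    z: z-score for the chosen alpha risk,
    design_effect: g (1 for simple random or systematic sampling).
    """
    q = 1 - p
    return design_effect * z**2 * p * q / d**2

n_srs = sample_size(p=0.15, d=0.03)
n_cluster = sample_size(p=0.15, d=0.03, design_effect=2)

# Truncated to whole numbers, as in the worked figures above
print(int(n_srs), int(n_cluster))   # 544 1088
```

The unrounded values are about 544.2 and 1088.5; the figures above drop the fractional part, whereas rounding up would give 545 and 1089.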
Conclusions

• Probability samples are the best


• Beware of …
• refusals
• absentees
• “do not know”
