Probability Sampling

Probability Sampling: Simple, Systematic, Cluster, Stratified, Multistage
Mehdi Osooli
Knowledge Hub on HIV surveillance, Kerman University of Medical Sciences, Iran
In: Sampling Methods for Population at Increased Risk of HIV

The main objectives
1. Giving an introduction to basics of probability sampling
2. Making participants familiar with practical aspects of each probability sampling method
3. Comparing pros and cons of different probability sampling methods

Concept and basics of probability sampling methods 
One of the most important issues in research is selecting an appropriate sample. Among sampling methods, probability samples are particularly important because most statistical tests assume this type of sampling. Representativeness and generalizability are best achieved with probability samples from a population, although low feasibility or high cost sometimes prevents their use and pushes us toward non-probability sampling methods. In probability sampling every unit of the population has a known, non-zero chance of being selected. We usually want to estimate some parameters of a population from a sample. Because we do not observe the whole population, these estimates usually carry some error. Fortunately, in probability sampling we can tell how trustworthy our estimates are, that is, how close they are likely to be to the population parameter, by computing the standard errors of the estimates. This is not easily possible with non-probability sampling methods.
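As a rough illustration of the last point, the following minimal sketch computes a proportion, its standard error and an approximate 95% interval for a simple random sample. The function name and the numbers (60 positive answers among 300 sampled units from a population of 2,000) are assumptions made up for the illustration, and the finite population correction is a standard refinement that the text above does not spell out.

```python
import math

def proportion_with_se(successes, n, population_size=None):
    """Estimate a proportion and its standard error from a simple random sample.

    If population_size is given, the finite population correction (1 - n/N)
    is applied, which suits sampling without replacement from a known list.
    """
    p = successes / n
    variance = p * (1 - p) / (n - 1)
    if population_size is not None:
        variance *= 1 - n / population_size
    return p, math.sqrt(variance)

# Hypothetical numbers, for illustration only
p, se = proportion_with_se(successes=60, n=300, population_size=2000)
low, high = p - 1.96 * se, p + 1.96 * se
print(f"estimate = {p:.3f}, SE = {se:.4f}, approx. 95% CI = ({low:.3f}, {high:.3f})")
```

A smaller standard error means the estimate is likely to lie closer to the population value; no comparable calculation is available for a non-probability sample.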

Types of probability sampling methods

Simple Random Sampling

What is it?
Simple random sampling means randomly selecting some units from a known and well-defined population. In this method the sampling frame must be known and all units must have the same chance of being selected.

How is it done? (Example)
In simple random sampling, n units are selected randomly from a population of N, and the chance of being selected is equal for all units. Different methods and tools can be used to create random numbers for sample selection. Standard random number tables and software able to generate random numbers, such as OpenEpi or Stata, are available.

Example: You have been asked to perform a KAP survey in a prison. The list of all 2,000 prisoners has been given to you. You think that a sample of 300 would be satisfactory for your work. If you want to choose 300 of them randomly for interview, you can use a random number generator to generate 300 numbers between 1 and 2,000. Most of the time you will get some repeated numbers, which should be replaced by new numbers.

Uses
Simple random sampling is a good benchmark for comparing the precision of different sampling methods and is also useful for teaching the general rules of probability sampling.

Criticisms
Although simple random sampling is possible when the population is not very big, other methods of random sampling are often preferable because they give more precise estimates of population parameters. In big populations and wide geographical sampling areas it is not easy to obtain a list of all units and select randomly from it.

Systematic Random Sampling

What is it?
In systematic random sampling we use the order of the population list, or the position of units in the population, to choose the sample.

How is it done? (Example)
First we need the list of the population. According to the total sample size needed, we define a value "k", the interval used to step through the population units and select units. If we want to select 5 units from a population of 50, we can define k=6 and draw a random number between 1 and 6. Suppose the random number is 3. Since k is 6, the second, third and fourth units will be 9, 15 and 21 respectively.

Example: We want to estimate the prevalence of HIV infection among volunteer blood donors in Tehran, 2009. The list of blood donors was available in computer software, ordered by the date of their referral. We decided to select a sample of 2,000 from 76,000. k was defined as 38 and a number between 1 and 38 was chosen. The number chosen was 12; 38 was then added to it, so the second person was record number 50, and the next units were chosen by adding 38 each time to the previously selected record. Repeated units, identified by the participants' names, were excluded and replaced by new units using the same method.

Uses
Systematic random sampling is very easy and less time-consuming. Its precision is often higher than that of simple random sampling.

Criticisms
The chance of selecting a non-representative sample can be high with this method, especially when there is a correlation between the position of a unit in the population list and the characteristic of the unit that is to be observed.
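The two selection procedures in the prison and blood donor examples can be sketched as follows. This is only a minimal illustration using Python's standard random module, not the tool used by the authors; the function names and the seed are assumptions. Drawing without replacement plays the role of "replacing repeated numbers by new numbers".

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct record numbers from 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no number is repeated
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

def systematic_sample(population_size, k, start=None, seed=None):
    """Select every k-th record, beginning at a random start between 1 and k."""
    if start is None:
        start = random.Random(seed).randint(1, k)
    return list(range(start, population_size + 1, k))

# Prison example: 300 prisoners out of a list of 2,000
prisoners = simple_random_sample(2000, 300, seed=1)

# Blood donor example: k = 38 with random start 12 over 76,000 records
donors = systematic_sample(76000, k=38, start=12)

print(len(prisoners), prisoners[:5])
print(len(donors), donors[:4])
```

With k = 38 and a start of 12, the list of 76,000 records yields the record numbers 12, 50, 88 and so on.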

Figure: A schematic example of systematic sampling with k=10. The level of shading indicates the frequency of risky behaviors. Here it is possible that, using "k", you jump over some specific units and end up selecting units that differ from the rest of the population.

Stratified Random Sampling

What is it?
In some situations the population can be divided into sub-populations whose members share some characteristics. In this case it is more reasonable to take random samples from these subdivisions. The subdivisions are called "strata". The strata should be non-overlapping, internally homogeneous and heterogeneous with respect to each other.

How is it done? (Example)
Stratified sampling is done in two major steps. First we define the population strata, and second we select a sample from each stratum. The population size of each stratum should be known.

Example: You want to estimate the prevalence of STI among female sex workers (FSWs) in a capital city. You have found from the formative assessment that there are three types of FSWs in the city: street based, hotel based and brothel based. They differ somewhat from each other in the percentage of high-risk behavior and in the medical consultation and health care services available to them. As all the FSWs in the city have registered in the health department, you have access to the list of all the FSWs. In this case you have three strata of FSWs and, to obtain a precise estimate of STI prevalence in the target population, you should select a sample from each stratum.

Figure: The three strata of FSWs: hotel-based, brothel-based and street-based.

Uses
Four uses can be proposed for stratified random sampling:
1. When you want to achieve a certain precision, or obtain information, for specific subdivisions of the population, stratified sampling is very useful.
2. Stratified sampling is more helpful in studies covering multiple administrative areas. In this case each area is a stratum and you allocate your sample to each stratum separately.
3. In some cases the sampling problems differ between study fields; here again, dividing the population into subdivisions enables you to define specific methods and criteria for work in each division.
4. By using stratified sampling the overall precision of the estimates will be higher. Since the variability within strata is very small, smaller samples give satisfactory precision within each stratum, and combining these estimates gives precise overall estimates too. Using this method you will need a smaller sample size for higher precision.

Criticisms
The assumption of little variation and similarity within strata is very important for benefiting from this method.

Cluster Random Sampling

What is it?
In sampling from big populations, the most common method is cluster sampling. In this method we divide the population into subdivisions called clusters. Clusters are representative sub-samples of the population; this means that the distribution of population units within each cluster is heterogeneous, although in the real world that is not easily achievable.

How is it done? (Example)

Clusters are parts of the population with almost the same elements or units as the population, but on a smaller scale. Cluster sampling can be done in just one step or in multiple steps. In one-step cluster sampling, after selecting a few clusters, we include all units of the selected clusters in our sample, while in multistage cluster sampling we randomly choose just some units within the selected clusters.

Example: In assessing the satisfaction of HIV-positive patients with hospital-based health care services in the city of Kerman, you can treat each hospital in the city as a cluster and allocate a random sample from each hospital to reach your desired sample size. Tehran, for instance, has around 120 hospitals: in the one-step method all patients from each selected hospital are included, while in multistage cluster sampling we first select some hospitals and then select a number of patients within each selected hospital (cluster).

Uses
In sampling from wide geographical areas it is possible to define neighboring regions as a cluster. In household surveys each family can also be considered a cluster.

Criticisms
The precision of a cluster sample is lower than that of a stratified sample, and cluster sampling needs bigger sample sizes to reach the same precision.

Multistage Sampling

What is it?
In this method you carry out several sampling steps and use different sampling methods to reach your desired sample. Big national surveys are usually done using multistage sampling methods. It is possible to use probability and non-probability methods together, but keep in mind that non-probability samples are not representative of the population.

How is it done? (Example)
First we assess the study population and its characteristics to see which sampling methods fit. Then, according to the population's structure, we define our sampling frame and take our sample.

Uses
The benefits of each sampling type are achieved using the multistage method.

Criticisms
The multistage method needs careful design. Inference from the sample and the final analysis are somewhat complicated and must be based on, and adapted to, the sampling procedure and the different methods that were used.
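The difference between one-step and multistage (here two-stage) cluster sampling in the hospital example can be sketched as below. The frame of 120 hospitals with 40 patient IDs each, the function name and the sample sizes are all hypothetical values chosen for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=None):
    """Select clusters at random, then take all units of each selected
    cluster (one-step) or a random subset of its units (two-stage)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None or units_per_cluster >= len(units):
            sample[name] = list(units)  # one-step: whole cluster
        else:
            sample[name] = rng.sample(units, units_per_cluster)  # second stage
    return sample

# Hypothetical frame: 120 hospitals, each listing 40 patient IDs
hospitals = {
    f"hospital_{h:03d}": [f"h{h:03d}_p{i:02d}" for i in range(40)]
    for h in range(1, 121)
}

one_stage = cluster_sample(hospitals, n_clusters=10, seed=7)
two_stage = cluster_sample(hospitals, n_clusters=10, units_per_cluster=15, seed=7)
print(sum(len(v) for v in one_stage.values()), "patients in the one-step sample")
print(sum(len(v) for v in two_stage.values()), "patients in the two-stage sample")
```

Only the selected hospitals need a patient list, which is the practical appeal of cluster sampling over wide geographical areas.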

Summary

Table 1 provides a brief summary of various conventional sampling techniques, including their advantages and disadvantages.

Table 1. Summary of conventional sampling techniques.

Sampling: Simple random
Steps:
1. Create a list of the target population
2. Select people randomly from sample frame using random number table or lottery draw
Advantages:
1. Concept is easy to understand and analyse
Disadvantages:
1. Requires sample frame of entire target population
2. Logistically difficult if sample geographically dispersed
3. Using random number/lottery time-consuming

Sampling: Systematic
Steps:
1. Construct sample frame
2. Calculate sampling interval (SI)
3. Select random start between 1 and SI and select that person
4. Add SI to random start and select person, etc.
Advantages:
1. Random numbers or lottery not required
2. Easy to analyse
Disadvantages:
1. Requires sample frame of entire target population
2. Logistically difficult if sample geographically dispersed

Sampling: Stratified
Steps:
1. Define the strata and construct sample frame for each stratum
2. Take a simple/systematic sample from each stratum
3. Calculate indicator estimates for each stratum and for population
Advantages:
1. Produces unbiased estimates of indicators for the strata
2. Can increase precision of indicator estimates
Disadvantages:
1. Requires sample frame of entire survey population
2. Logistically difficult if sample geographically dispersed
3. Requires sample large enough to make precise estimates for each stratum
4. Population estimates require weighting
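Table 1 notes that, for stratified sampling, indicators are calculated for each stratum and for the population, and that population estimates require weighting. A minimal sketch of that weighting is given below. The stratum sizes, sample sizes and case counts are invented for illustration, and the function name is an assumption; each stratum is weighted by its share of the total population.

```python
def stratified_proportion(strata):
    """Combine per-stratum proportions into a weighted population estimate.

    strata maps a stratum name to a tuple of
    (stratum population size, sample size, positive cases in the sample).
    """
    total = sum(size for size, _, _ in strata.values())
    per_stratum = {}
    estimate = 0.0
    variance = 0.0
    for name, (size, n, positives) in strata.items():
        p = positives / n
        weight = size / total  # stratum share of the population
        per_stratum[name] = p
        estimate += weight * p
        # variance of a stratum proportion with finite population correction
        variance += weight ** 2 * (1 - n / size) * p * (1 - p) / (n - 1)
    return per_stratum, estimate, variance ** 0.5

# Hypothetical data: equal samples of 80 from strata of unequal size
fsw = {
    "street-based": (600, 80, 24),
    "hotel-based": (300, 80, 12),
    "brothel-based": (100, 80, 20),
}
per_stratum, estimate, se = stratified_proportion(fsw)
unweighted = sum(c for _, _, c in fsw.values()) / sum(n for _, n, _ in fsw.values())
print(per_stratum)
print(f"weighted = {estimate:.3f} (SE {se:.4f}), unweighted = {unweighted:.3f}")
```

Because the same number of people was sampled from strata of very different sizes, the unweighted pooled proportion differs from the weighted estimate; the weighted one is the appropriate population estimate.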

Table 1. Summary of conventional sampling techniques, continued.

Sampling: Cluster: Probability proportional to size (PPS)
Steps:
1. Construct sample frame of clusters
2. Calculate SI, select random start between 1 and SI
3. Select cluster whose cumulative size contains the random start
4. Add SI to random start and select cluster
5. Sample equal numbers of people from selected clusters
Advantages:
1. Only need sample frame of clusters and individuals in selected clusters
2. Sample concentrated in geographical areas
Disadvantages:
1. Decreases precision of estimates, thus, requires larger sample size
2. Size of clusters required prior to sampling

Sampling: Cluster: Equal probability, fixed cluster size
Steps:
1. Construct sample frame of clusters
2. Select clusters using simple/systematic sampling
3. Sample equal numbers of people from selected clusters
Advantages:
1. Only need sample frame of clusters and individuals in selected clusters
2. Sample concentrated in geographical areas
3. Don't need cluster sizes prior to sampling
Disadvantages:
1. Decreases precision of estimates, thus, requires larger sample size
2. Weighted analysis required for unbiased estimates
3. Size of clusters required for weighted analysis

Sampling: Cluster: Equal probability, proportional cluster size
Steps:
1. Construct sample frame of clusters
2. Select cluster using simple/systematic sampling
3. Sample equal proportions of people per cluster
Advantages:
1. Only need sample frame of clusters and individuals in selected clusters
2. Sample concentrated in geographical areas
Disadvantages:
1. Decreases precision of estimates, thus, requires larger sample size
2. Size of clusters required for proportional sampling
3. Sample size, thus, precision of estimates unpredictable
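The PPS steps listed in Table 1 (sampling interval, random start, cumulative sizes) can be sketched as follows. The cluster names and sizes are hypothetical, the function name is an assumption, and using an integer sampling interval is a simplification made for this illustration.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_select(cluster_sizes, n_clusters, start=None, seed=None):
    """Systematic selection of clusters with probability proportional to size."""
    names = list(cluster_sizes)
    # running total of cluster sizes, in list order
    cumulative = list(accumulate(cluster_sizes[name] for name in names))
    interval = cumulative[-1] // n_clusters  # sampling interval (SI)
    if start is None:
        start = random.Random(seed).randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("random start must lie between 1 and SI")
    selected = []
    for i in range(n_clusters):
        point = start + i * interval
        # first cluster whose cumulative size reaches the selection point;
        # a cluster larger than SI can be selected more than once
        selected.append(names[bisect_left(cumulative, point)])
    return selected

# Hypothetical clusters; cumulative sizes are 120, 200, 500, 550, 700, 800
sizes = {"A": 120, "B": 80, "C": 300, "D": 50, "E": 150, "F": 100}
print(pps_select(sizes, n_clusters=4, start=110))  # ['A', 'C', 'D', 'F']
```

Here SI = 800 / 4 = 200, so the selection points are 110, 310, 510 and 710, which fall in clusters A, C, D and F. Sampling an equal number of people from each selected cluster then completes step 5.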

Question(s) to be discussed
Looking at Table 1, answer the following questions:
a. What are the advantages of using a systematic sampling method?
b. What are the steps to take when using the "Cluster: Equal probability, fixed cluster size" sampling method?
c. What is the disadvantage of using the stratified sampling method when it comes to making population estimates?