
Sampling

• Sampling Terminology
• What is Sampling?
• Sampling Techniques
Sampling Terminology
• Population: set of all individuals (or objects) having some common
characteristic (e.g. people living in Italy)
• Sampling Frame (in Italian: “lista”): subset of the population
from which the sample is actually drawn (e.g. white pages)
• Sample: set of people who actually contribute data to the study (e.g. every
1000th person in the white pages who answers the phone and
responds)
• Representativeness: how similar is the sample to the population
with regard to the construct of interest?

What is sampling?
Sampling is the process of selecting units (e.g. people, organizations, objects)
from a population of interest so that by studying the sample we may fairly
generalize our results back to the population from which they were chosen.
The sampling techniques

PROBABILITY SAMPLING
• Each member of the population has a specific probability of being chosen
• Only possible when the complete sampling frame is available
• Free from the bias of subjective judgments
• Inference

NON-PROBABILITY SAMPLING
• Each sample unit is chosen arbitrarily
• The only choice when an exhaustive population list is NOT available
• No inference
Sampling Techniques
Probability sampling: each member of population has a specific probability of
being chosen
• Random: everyone in population has an equal chance of being selected
• Systematic: every k-th unit is selected (e.g. every 10th student ID number)
• Stratified: population divided into strata, then random sampling from
within each stratum (e.g., an equal number of males/females are selected)
• Cluster sample (campionamento a grappoli): identify clusters of individuals
& sample from these (e.g. 1 person per household)
• Multi-stage cluster sampling (e.g. 1 person per selected household per
selected suburb)
Non-probability sampling: units are chosen arbitrarily, so the sample may not be representative of the population
• Quota Sampling - e.g. 50% psychology students - 30% economics students
- 20% law students
• Convenience Sampling - take them where you find them e.g. at shopping
mall
• Snowball Sampling - ask each respondent if they know someone else
suitable for survey e.g. studying drug-users.
Probability vs. non-probability sampling
Probability sampling: each unit in the sampling frame is associated with a given
probability of being included in the sample, which means that the
probability of each potential sample is known
Non-probability sampling: extraction of sample units is not based on
probability rules
• Non-probability sampling does not allow one to accompany sample
estimates with evaluations of their precision and accuracy
• Still, non-probability sampling is a common practice in marketing research,
especially quota sampling.
• It is not necessarily biasing or uninformative
• In some circumstances – for example when there is no sampling frame – it
may be the only viable solution
• Key limits: techniques for statistical inference cannot be used to generalize
sample results to the population; the sample is not representative; estimates
may be biased
• Pros: quick and cheap techniques
Convenience sampling
only convenient elements enter the sample
Example: interview people in the street or at the shopping mall

Judgmental sampling
selection based on the judgement of the researcher
Example: In a study where a researcher wants to know what it takes to graduate
summa cum laude in college, the only people who can give the researcher first-hand
advice are the individuals who graduated summa cum laude. With such a specific
and limited pool of individuals who can be considered as subjects, the
researcher must use judgmental sampling.

Snowball sampling:
A first small sample is selected randomly
Respondents are asked to identify others who belong to the population of
interest. The referrals will have demographic and psychographic
characteristics similar to those of the referrers
Useful for rare (e.g. rare diseases) or hidden (e.g. drug users) populations
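
The referral process described above can be sketched in Python. This is only an illustration under assumptions that are not in the slides: the function name `snowball_sample`, the `referrals` dictionary and the people in it are hypothetical, and recruitment simply stops once `max_size` respondents are reached.

```python
import random
from collections import deque

def snowball_sample(referrals, n_seeds, max_size, seed=None):
    """referrals: dict person -> list of people that person would refer."""
    rng = random.Random(seed)
    seeds = rng.sample(list(referrals), n_seeds)   # small random first wave
    sample, seen = list(seeds), set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        for referred in referrals.get(person, []):
            if referred not in seen and len(sample) < max_size:
                seen.add(referred)
                sample.append(referred)
                queue.append(referred)   # referred people are asked in turn
    return sample

# Hypothetical referral network.
referrals = {"ann": ["bob", "cat"], "bob": ["dan"], "cat": [], "dan": ["ann"]}
sample = snowball_sample(referrals, n_seeds=1, max_size=3, seed=0)
```

Because every later wave depends on who the first respondents know, the selection probabilities are unknown, which is why this remains a non-probability technique even with a random first wave.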
Quota sampling:

Define control categories (quotas) for the population elements,
such as sex, age… Apply restricted judgmental sampling so that the
quotas in the sample are the same as those in the population.

Example: Divide employees by type of educational degree.
Employees who have a physical science degree are 1 out of 4.
Sample: 10,000 people
Quota sample: 100 people
25% of the quota sample (25 people) have a physical science degree.
The selection process continues until all quotas are filled.
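
The quota-filling process in the example can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name `quota_sample`, the employee records and the "other" category are hypothetical, and units are simply taken in arrival order (a non-random choice) until each quota is full.

```python
from collections import Counter

def quota_sample(stream, quotas, key):
    """Accept units in arrival order until every quota is filled.

    stream : iterable of units (hypothetical respondents, in arrival order)
    quotas : dict mapping category -> number of units wanted
    key    : function returning the category of a unit
    """
    counts = Counter()
    sample = []
    for unit in stream:
        category = key(unit)
        # Skip units outside the control categories or whose quota is full.
        if counts[category] >= quotas.get(category, 0):
            continue
        sample.append(unit)
        counts[category] += 1
        if all(counts[c] >= q for c, q in quotas.items()):
            break  # all quotas filled: stop selecting
    return sample

# Hypothetical data: 1 employee in 4 holds a physical science degree.
employees = [{"id": i, "degree": "physical science" if i % 4 == 0 else "other"}
             for i in range(10_000)]
quotas = {"physical science": 25, "other": 75}   # quota sample of 100
sample = quota_sample(employees, quotas, key=lambda e: e["degree"])
```

The sample reproduces the 25% share by construction, but who ends up inside each quota still depends on the order in which people are encountered.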
Probability sampling
Simple random sampling
• Each element of the population has a known and equal probability
of selection
• Every element is selected independently from other elements
• The probability of selecting a given sample of n elements is
computable (known)
• Pros: statistical inference is possible; it is easily understood
• Cons: representative samples are large and expensive; standard errors are
larger than in other probabilistic sampling techniques; sometimes it is
difficult to execute a really random sampling
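
As a rough illustration of simple random sampling, the Python sketch below draws a sample and computes the probability of one specific sample of n elements. The helper names and the frame of 100 units are hypothetical, and the sketch assumes drawing without replacement, which the slides do not specify.

```python
import math
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def prob_of_given_sample(N, n):
    """Probability of one specific (unordered) sample of n elements out of N."""
    return 1 / math.comb(N, n)

population = list(range(1, 101))      # hypothetical frame of N = 100 units
sample = simple_random_sample(population, 10, seed=42)
inclusion_prob = 10 / 100             # each element enters with probability n/N
```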
Systematic sampling
• A list of N elements in the population is compiled and ordered according to a
specified variable
• Unrelated to the target variable (similar to SRS)
• Related to the target variable (increased representativeness)
• A sample size n is chosen
• A systematic step of k=N/n is set
• A random number s between 1 and k is extracted and represents the first element
to be included
• Then the other elements selected are s+k, s+2k, s+3k…
• Pros: cheaper and easier than SRS; more representative if the order is
related to the interest variable (monotone)
• Cons: less representative (biased) if the order is cyclical

[Figure: list ordered from cheapest houses to most expensive houses]
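
The selection rule can be sketched in Python as follows. The function name and the list of 1,000 units are hypothetical; the sketch assumes the step is the integer part of N/n and that the random start s is drawn between 1 and k, so that all n selected positions s, s+k, s+2k… stay within the list.

```python
import random

def systematic_sample(ordered_units, n, seed=None):
    """Select every k-th unit after a random start s, with k = N // n."""
    N = len(ordered_units)
    k = N // n                      # sampling step (integer part of N/n)
    rng = random.Random(seed)
    s = rng.randint(1, k)           # random start in 1..k (1-based position)
    # Positions s, s+k, s+2k, ... converted to 0-based indices.
    return [ordered_units[s - 1 + j * k] for j in range(n)]

# Hypothetical ordered list, e.g. houses ranked from cheapest to most expensive.
houses = list(range(1, 1001))       # N = 1000
sample = systematic_sample(houses, n=50, seed=7)   # k = 20
```

With a monotone ordering such as price rank, the selected units spread evenly over the whole range, which is the source of the gain in representativeness.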


Stratified sampling
• The population is partitioned into strata through control variables (stratification
variables), closely related to the target variable, so that there is homogeneity
within each stratum and heterogeneity between strata
• A simple random sample is drawn within each stratum of the population
• Proportionate sampling – size of the sample from each stratum is
proportional to the relative size of the stratum in the total population
• Disproportionate sampling: size is also proportional to the variability of the
target variable in each stratum
• Pros: gains in precision; includes all relevant sub-populations even if small
• Cons: stratification variables may not be easily identifiable; stratification
can be expensive
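
The two allocation rules can be sketched in Python. The function names and the two strata below are hypothetical, and the disproportionate rule is written under the assumption that stratum sample sizes are proportional to stratum size times the standard deviation of the target variable (often called Neyman allocation, a name the slides do not use). Rounding means the allocated sizes may not sum exactly to n.

```python
import random

def proportionate_stratified_sample(strata, n, seed=None):
    """strata: dict name -> list of units. Returns dict name -> sampled units."""
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / N)   # allocation proportional to stratum size
        sample[name] = rng.sample(units, n_h)   # SRS within the stratum
    return sample

def disproportionate_allocation(sizes, std_devs, n):
    """Stratum sample sizes proportional to size x variability (assumed rule)."""
    weights = {h: sizes[h] * std_devs[h] for h in sizes}
    total = sum(weights.values())
    return {h: round(n * w / total) for h, w in weights.items()}

# Hypothetical strata: 600 and 400 units.
strata = {"a": list(range(600)), "b": list(range(600, 1000))}
sample = proportionate_stratified_sample(strata, n=50, seed=3)
alloc = disproportionate_allocation({"a": 600, "b": 400}, {"a": 1.0, "b": 1.5}, n=50)
```

In this example the smaller stratum "b" receives as many units as "a" under the disproportionate rule because its target variable is more variable.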
Applying post-stratification (PS)
• Typical obstacle to stratified sampling: unavailability of a sampling frame for each
of the strata
• Post stratification is carried out by extracting a Simple Random Sample (SRS) of
size n and then classifying units into strata. Instead of the usual SRS mean, a PS
estimator is computed by weighting the means of the sub-groups by the size of
each sub-group. The procedure is identical to the one of stratified sampling and
the only difference is that the allocation into strata is made ex post.
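
A minimal Python sketch of the PS estimator follows. It assumes that "size of each sub-group" means the known size of each stratum in the population (N_h), and that every stratum appears at least once in the SRS; the function name and the data are hypothetical.

```python
from collections import defaultdict

def post_stratified_mean(values, labels, pop_sizes):
    """Weight the sample mean of each stratum by its population share N_h / N.

    values    : observed values from a simple random sample
    labels    : stratum of each sampled unit, assigned after sampling
    pop_sizes : dict stratum -> population size N_h (assumed known)
    """
    groups = defaultdict(list)
    for y, h in zip(values, labels):
        groups[h].append(y)
    N = sum(pop_sizes.values())
    return sum(pop_sizes[h] / N * (sum(ys) / len(ys)) for h, ys in groups.items())

# Hypothetical SRS of 5 units, classified into strata "a" and "b" ex post.
values = [10, 12, 20, 22, 24]
labels = ["a", "a", "b", "b", "b"]
ps_mean = post_stratified_mean(values, labels, {"a": 500, "b": 500})
srs_mean = sum(values) / len(values)
```

Here the plain SRS mean is 17.6, while the PS estimate is 16.5, because stratum "b" is over-represented in the sample relative to its assumed 50% population share.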
Cluster sampling
• The population is partitioned into clusters
• Elements within each cluster should be as heterogeneous as possible with
respect to the variable of interest (e.g. area sampling)
1. A random sample of clusters is extracted through SRS (or with probability
proportional to the cluster size)
2a. All the elements of each selected cluster are included (one-stage
cluster sampling)
2b. A probabilistic sample is extracted from each selected cluster (two-stage
cluster sampling)

• Pros: reduced costs; higher feasibility
• Cons: less precision; inference can be difficult
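
One-stage and two-stage cluster sampling can be sketched in Python as below. The function name, the `n_within` parameter and the classroom data are hypothetical, and the sketch assumes clusters are drawn by plain SRS rather than with probability proportional to size.

```python
import random

def cluster_sample(clusters, m, n_within=None, seed=None):
    """clusters: dict cluster_id -> list of elements.

    m        : number of clusters drawn by SRS
    n_within : None -> one-stage (take every element of each drawn cluster);
               int  -> two-stage (SRS of n_within elements per drawn cluster)
    """
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), m)
    sample = []
    for cid in chosen:
        elements = clusters[cid]
        if n_within is None:
            sample.extend(elements)
        else:
            sample.extend(rng.sample(elements, min(n_within, len(elements))))
    return sample

# Hypothetical population: 10 classrooms of 30 students each.
classrooms = {c: [f"c{c}_s{i}" for i in range(30)] for c in range(10)}
one_stage = cluster_sample(classrooms, m=3, seed=1)
two_stage = cluster_sample(classrooms, m=3, n_within=5, seed=1)
```

Only the drawn clusters need to be listed and visited, which is where the cost savings come from.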
Complex sampling designs
• Combination of different sampling methods to increase efficiency or reduce
costs
• Two-stage sampling: two different sampling units, where the second-stage
sampling units are a sub-set of the first-stage ones.
• Typically in household surveys a sample of cities or municipalities is extracted
in the first stage, while in the second stage the actual sample of households is
extracted out of the first-stage units.
• Any probability design can be applied within each stage.
• For example, municipalities can be stratified according to their populations in
the first stage to ensure that the sample will include small and rural towns as
well as large cities, while in the second stage one could apply area sampling,
a particular type of cluster sampling where:
1) each sampled municipality is subdivided into blocks on a map through
geographical coordinates
2) blocks are extracted through simple random sampling
3) all households in a block are interviewed.
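
The stratified first stage followed by area sampling can be sketched in Python. Everything named here is hypothetical: the function `two_stage_area_sample`, the record layout for municipalities, and the "small"/"large" strata. The sketch assumes SRS of municipalities within each stratum and SRS of blocks within each sampled municipality.

```python
import random

def two_stage_area_sample(municipalities, n_per_stratum, blocks_per_muni, seed=None):
    """municipalities: list of dicts with keys 'name', 'stratum', 'blocks',
    where 'blocks' is a list of blocks and each block is a list of households."""
    rng = random.Random(seed)
    by_stratum = {}
    for muni in municipalities:
        by_stratum.setdefault(muni["stratum"], []).append(muni)
    households = []
    for stratum, munis in by_stratum.items():
        # Stage 1: SRS of municipalities within each population-size stratum.
        for muni in rng.sample(munis, min(n_per_stratum, len(munis))):
            # Stage 2: SRS of blocks, then every household in each drawn block.
            n_blocks = min(blocks_per_muni, len(muni["blocks"]))
            for block in rng.sample(muni["blocks"], n_blocks):
                households.extend(block)
    return households

# Hypothetical frame: 6 municipalities, 4 blocks each, 5 households per block.
municipalities = [
    {"name": f"m{i}", "stratum": "small" if i < 3 else "large",
     "blocks": [[(i, b, h) for h in range(5)] for b in range(4)]}
    for i in range(6)
]
households = two_stage_area_sample(municipalities, n_per_stratum=2,
                                   blocks_per_muni=2, seed=1)
```

Stratifying the first stage guarantees that both small and large municipalities appear in the sample, while interviewing whole blocks keeps fieldwork geographically concentrated.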
Representativeness of sample depends on:
• adequacy of sampling frame
• selection strategy
• adequacy of sample size
• response rate – both the % & representativeness of people in sample
who actually complete survey

Note:
It is better to have a small, good sample than a large, poor sample.
Learning Check
We want to administer a questionnaire to the students in a Department.

• If we approach people during daylight hours outside the library or the
classrooms and pick every k-th person who walks past or comes across
us (we do not just pick whoever we like), which sampling technique are we
using?
• If we randomly choose three out of the ten classrooms and administer the
questionnaire to all students in the selected classrooms, which sampling
technique are we using?
• If we ask the student registration office for a list of all enrolled students and
randomly select n students from this list, which sampling technique are we
using?
• If we randomly select from each classroom a number of students proportional
to the share of enrolled students in that classroom, which sampling
technique are we using?
• If we administer the questionnaire to our best friends, which sampling
technique are we using?
Learning Check answers
• Systematic
• Cluster
• Simple Random
• Stratified
• Judgmental or Convenience
