
Sampling and its types

• A population is the collection of elements
that share some common characteristic. The
number of elements in the population is the
population size.
• A sample is a subset of the population. The
process of selecting a sample is known as
sampling. The number of elements in the sample
is the sample size.
• Sampling is the process of selecting a subset of
observations from the population to make
inferences about population parameters such
as the mean, standard deviation, etc.
• Measures such as the mean and standard deviation
calculated using the entire population are
called population parameters. They are denoted
by the symbols μ (mean) and σ (standard
deviation).
• Since calculating population parameters is
usually impractical, we depend on samples to
estimate them. A population parameter
estimated from a sample is called a sample
statistic, or simply a statistic. The sample mean
is denoted by x̄ (x bar) and the sample standard
deviation by s or S. Since inferences about the
population are made using sample statistics,
they play an important role in hypothesis testing.
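The distinction between population parameters and sample statistics can be sketched in Python. This is a minimal illustration using only the standard library; the population of 1,000 values, the seed, and the sample size of 100 are made-up assumptions, not data from these slides.

```python
import random
import statistics

# Hypothetical population: 1,000 made-up measurements (an assumption for illustration).
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(1000)]

# Population parameters use every element (pstdev divides by N).
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Sample statistics use only a subset (stdev divides by n - 1).
sample = rng.sample(population, 100)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

print(f"population: mu={mu:.2f}, sigma={sigma:.2f}")
print(f"sample:     x_bar={x_bar:.2f}, s={s:.2f}")
```

The sample values x_bar and s should land close to mu and sigma without being identical, which is why they are treated as estimates.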
Sampling is a method that allows us to get information about
the population based on the statistics from a subset of the
population (sample), without having to investigate every
individual.
Probability Sampling
Non- Probability Sampling
The difference between the above two is
whether the sample selection is based on
randomization or not. With randomization,
every element gets a known chance (an equal
chance, in the simplest case) of being picked
as part of the sample for study.

Probability Sampling
This sampling technique uses randomization to make sure that
every element of the population gets an equal chance to be
part of the selected sample.
Probability Sampling: In probability sampling, every element of the population
has an equal chance of being selected. Probability sampling gives us the best
chance to create a sample that is truly representative of the population.
Non-Probability Sampling: In non-probability sampling, not all elements have
an equal chance of being selected. Consequently, there is a significant risk of
ending up with a non-representative sample that does not produce
generalizable results.
Simple Random Sampling: Every element has an equal
chance of being selected as part of the sample. It is
used when we don’t have any kind of prior information
about the target population.
• Random sampling is usually carried out
without replacement, that is, an observation
that is selected for the sample is removed
from the population.
• With replacement means that an observation
selected for inclusion in the sample can be
selected again, since it is put back into (not
removed from) the population for subsequent
selection.
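The two variants can be sketched with Python's standard `random` module. The population of 20 numbered elements and the fixed seed are assumptions made only to keep the illustration small and repeatable.

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
population = list(range(1, 21))  # hypothetical population of 20 numbered elements

# Without replacement: each element can appear at most once in the sample.
without_replacement = rng.sample(population, k=5)

# With replacement: a selected element goes back and may be drawn again.
with_replacement = rng.choices(population, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

`random.sample` never repeats an element, while `random.choices` may return the same element more than once.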
Systematic Sampling
In this type of sampling, the first individual is selected
randomly and the others are selected using a fixed ‘sampling
interval’.
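A minimal sketch of systematic sampling in Python follows. The function name `systematic_sample` and the rule of taking the interval as population size divided by sample size (integer division) are assumptions chosen for illustration.

```python
import random

def systematic_sample(population, sample_size, rng):
    """Pick a random start, then take every `interval`-th element."""
    interval = len(population) // sample_size  # fixed sampling interval
    start = rng.randrange(interval)            # random first element
    return [population[start + i * interval] for i in range(sample_size)]

rng = random.Random(1)
population = list(range(100))  # hypothetical ordered population
sample = systematic_sample(population, 10, rng)
print(sample)
```

Only the starting point is random; every later element is determined by the interval.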
Stratified Sampling
This technique divides the elements of the
population into small subgroups (strata) based
on similarity, in such a way that the elements
within a group are homogeneous while the
groups themselves are heterogeneous with
respect to one another. The elements are then
randomly selected from each of these strata.
We need to have prior information about the
population to create the subgroups.

Example: the amount of time spent by
male and female users in sending
messages in a day. The strata are
male and female users.
In this type of sampling, we divide the
population into subgroups (called strata) based
on different traits like gender, category, etc.,
and then select the sample(s) from these
subgroups.
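Stratified sampling can be sketched as follows, using the male/female strata from the example. The user records, the 10% sampling fraction, and the helper name `stratified_sample` are hypothetical; drawing the same fraction from every stratum is one common choice, not the only one.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, rng):
    """Group records into strata by `key`, then draw randomly within each stratum."""
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one element per stratum
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(7)
# Hypothetical users: 30 female and 60 male.
users = [{"id": i, "gender": "male" if i % 3 else "female"} for i in range(90)]
sample = stratified_sample(users, key=lambda u: u["gender"], fraction=0.1, rng=rng)
print(sorted(u["gender"] for u in sample))
```

Each stratum contributes in proportion to its size, so both groups are guaranteed to be represented.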
Cluster Sampling
Our entire population is divided into clusters or
sections and then the clusters are randomly
selected. All the elements of the cluster are
used for sampling. Clusters are identified using
details such as age, sex, location etc.

Cluster sampling can be done in the following
ways:
• Single Stage Cluster Sampling
Entire clusters are selected randomly for
sampling.
• Two Stage Cluster Sampling
Here we first randomly select clusters and then,
from those selected clusters, we randomly
select elements for sampling.
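The two variants of cluster sampling can be sketched in Python. The city clusters, their sizes, and the function names `single_stage` and `two_stage` are assumptions made for illustration.

```python
import random

def single_stage(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every element in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]

def two_stage(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then randomly pick elements inside each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element
            for name in chosen
            for element in rng.sample(clusters[name], n_per_cluster)]

rng = random.Random(5)
# Hypothetical population: 5 cities (clusters) with 10 people each.
clusters = {f"city_{c}": [f"city_{c}_person_{i}" for i in range(10)] for c in "ABCDE"}

one = single_stage(clusters, 2, rng)
two = two_stage(clusters, 2, 3, rng)
print(len(one), len(two))
```

Single-stage sampling keeps all 10 people from each chosen city, while two-stage sampling keeps only 3 per chosen city.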
Systematic Clustering
Here the selection of elements is systematic
and not random, except for the first element.
Elements of the sample are chosen at regular
intervals from the population. All the elements
are first put together in a sequence, in which
each element has an equal chance of being
selected.
Multi-Stage Sampling
It is a combination of two or more of the
methods described above.
The population is divided into multiple clusters,
and these clusters are further divided and
grouped into various subgroups (strata) based
on similarity. One or more clusters can be
randomly selected from each stratum. This
process continues until the clusters can’t be
divided any further. For example, a country can
be divided into states, cities, and urban and rural
areas, and all the areas with similar
characteristics can be merged together to form
a stratum.
• Bootstrap Aggregating (Bagging)
• Bagging is sampling with replacement as used in
machine learning algorithms, especially the
random forest algorithm. In bagging, several
samples (with replacement) are generated from
the population and an analytical model is
developed using each sample. The size of each
sample and the number of samples are
determined by factors such as the population
size and the target accuracy of the model.
• Bagging is used in ensemble methods (in
which several models are developed and the
final prediction is usually based on majority
voting).
• (An ensemble is a machine learning concept in which multiple
models are trained and their predictions combined; in bagging,
the models are trained using the same learning algorithm.)
• For example, in a random forest several hundred
samples are generated from the population
and a classification tree is built using each
sample. The final classification of a new
observation is then usually based on majority
voting across the trees.
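Bagging can be sketched with the standard library alone. To stay short, this illustration replaces the classification trees of a random forest with a toy one-threshold model (`fit_stump`); the data, the 25 bootstrap samples, and all function names are hypothetical assumptions.

```python
import random
from collections import Counter
from statistics import mean

def bootstrap_sample(data, rng):
    """Draw len(data) observations with replacement."""
    return rng.choices(data, k=len(data))

def fit_stump(sample):
    """Toy model: threshold halfway between the two class means."""
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:  # degenerate sample: predict the only class seen
        only = 0 if zeros else 1
        return lambda x: only
    threshold = (mean(zeros) + mean(ones)) / 2
    return lambda x: int(x > threshold)

def bagging_predict(models, x):
    """Final prediction by majority vote across all models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

rng = random.Random(3)
# Hypothetical labelled data: small values are class 0, large values are class 1.
data = [(x, 0) for x in range(0, 10)] + [(x, 1) for x in range(20, 30)]
models = [fit_stump(bootstrap_sample(data, rng)) for _ in range(25)]

print(bagging_predict(models, 2), bagging_predict(models, 27))
```

Each model sees a slightly different bootstrap sample, and the vote combines them into one prediction.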
Types of Non-Probability Sampling
Convenience Sampling
This is perhaps the easiest method of sampling
because individuals are selected based on their
availability and willingness to take part.
Quota Sampling
In this type of sampling, we choose items
based on predetermined characteristics of the
population. Suppose we have to select
individuals whose number is a multiple of
four for our sample.
Then the individuals numbered 4, 8, 12, 16,
and 20 are already reserved for our sample.
In quota sampling, the chosen sample might not be
the best representation of the characteristics of the
population that weren’t considered.
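The numbered example can be reproduced in a few lines of Python, followed by a sketch of a more typical quota rule. The helper `quota_sample`, the even/odd categories, and the quota sizes are hypothetical, introduced only for illustration.

```python
def quota_sample(people, category_of, quotas):
    """Take people in the order encountered until each category's quota is full."""
    remaining = dict(quotas)
    chosen = []
    for person in people:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            chosen.append(person)
            remaining[category] -= 1
    return chosen

# The example from the text: individuals numbered 1 to 20, multiples of four.
numbered = list(range(1, 21))
multiples_of_four = [n for n in numbered if n % 4 == 0]
print(multiples_of_four)  # [4, 8, 12, 16, 20]

# Hypothetical quota: 2 even-numbered and 3 odd-numbered individuals.
picked = quota_sample(numbered,
                      lambda n: "even" if n % 2 == 0 else "odd",
                      {"even": 2, "odd": 3})
print(picked)  # [1, 2, 3, 4, 5]
```

No randomization is involved: whoever is encountered first fills the quota, which is why unconsidered characteristics may be poorly represented.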
Judgment Sampling
It is also known as selective sampling. It
depends on the judgment of the experts when
choosing whom to ask to participate.

Snowball Sampling
Existing people are asked to nominate
further people known to them so that the
sample increases in size like a rolling
snowball. This method of sampling is effective
when a sampling frame is difficult to identify.
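Snowball sampling can be sketched as a walk over a referral network. The names, the `referrals` dictionary, and the function `snowball_sample` are made up for illustration.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from seed participants and follow their nominations, breadth first."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for nominee in referrals.get(person, []):
            if nominee not in seen:  # do not recruit anyone twice
                seen.add(nominee)
                queue.append(nominee)
    return sample

# Hypothetical referral network: who nominates whom.
referrals = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["ana", "eli"],
    "dee": [],
}
print(snowball_sample(["ana"], referrals, 4))  # ['ana', 'ben', 'cy', 'dee']
```

The sample grows only through existing participants, so no complete sampling frame is needed up front.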

Voluntary sampling: the data is collected from people who volunteer
for such data collection. Example: customer feedback on Amazon or
Tripadvisor.
Steps in sampling
Consider the results of election opinion polls. Were these results obtained by considering
the views of all 900 million voters of the country, or a fraction of these voters? Let us see
how it was done.
• Step 1
• The first stage in the sampling process is to
clearly define the target population.
• So, to carry out opinion polls, polling agencies consider
only the people who are above 18 years of age and are
eligible to vote in the population.

• Step 2
• Sampling Frame – It is a list of items or people forming a
population from which the sample is taken.
• So, the sampling frame would be the list of all the people
whose names appear on the voter list of a constituency.
• The sampling frame defines the source (or method/procedure)
used for identifying the elements of the target population.
For example, to study attrition among IT professionals, sources
such as LinkedIn, Naukri, and Monster can be used.
• Step 3
• The third step is to choose the sampling
technique. Generally, probability sampling
methods are used because every vote has equal
value and any person can be included in the
sample irrespective of their caste, community,
or religion. Different samples are taken from
different regions all over the country.
• Step 4
• Sample Size – It is the number of individuals or
items to be taken in a sample that would be
enough to make inferences about the population
with the desired level of accuracy and precision.
• Other things being equal, the larger the sample size,
the more accurate our inference about the population
would be.
• For the polls, agencies try to get as many people
as possible of diverse backgrounds to be included
in the sample as it would help in predicting the
number of seats a political party can win.
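The effect of sample size on precision can be illustrated with the usual margin-of-error formula for a proportion, z * sqrt(p(1 - p) / n). This sketch assumes simple random sampling, the normal approximation, and a 95% confidence level (z = 1.96); the function name and the sample sizes are chosen only for illustration.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for an estimated proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 is the worst case (largest margin) for a given n.
for n in (100, 1_000, 10_000):
    print(n, round(margin_of_error(0.5, n), 4))
```

Because the margin shrinks with the square root of n, a sample 100 times larger is needed to make the estimate 10 times more precise.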
• Step 5
• Once the target population, sampling frame, sampling
technique, and sample size have been established, the
next step is to collect data from the sample.
• In opinion polls, agencies generally put questions to
the people, such as which political party they are going
to vote for or whether the previous party has done any work.
• Based on the answers, agencies try to interpret who
the people of a constituency are going to vote for and
approximately how many seats a political party is going
to win.
