
PINES CITY COLLEGES

COLLEGE OF DENTISTRY
First Semester, AY 2022-2023

I. COURSE NUMBER: BIOSTAT 301

II. COURSE DESCRIPTIVE TITLE: BIOSTATISTICS and EPIDEMIOLOGY

Modular Learning Guide #2

Topic: Data Collection: Sampling Techniques


Expected Time of Completion: 3 hours

A. LEARNING OUTCOMES

At the end of this module, you should be able to:

1. Classify data according to source
2. Determine and compute a valid sample size
3. Enumerate the different sampling techniques
4. List the different data collection methods

B. LEARNING CONTENT

Overview
In your previous subject, the basic principles and methods used in statistics were
discussed. The different types of data and the different scales of measurement were
also tackled. In this module, we will discuss the classification of data according to
source, sample size determination, sampling methods, and data collection methods.

Concepts
Before we discuss sampling, let us first discuss the classification of data according to
source.

Classification of Data According to Source

Data can be classified into two types according to source:

1. Primary Data – data gathered from original sources through direct or firsthand
experience.
Primary sources are:
a. First-person accounts (results of interviews, questionnaires, observations,
etc.)
b. Autobiographies
c. Diaries

2. Secondary Data – information taken from published or unpublished
data that were previously gathered by other individuals or agencies.
Secondary sources are:
a. Books, encyclopedias, dictionaries
b. Articles published in journals, magazines, newspapers, and other
publications
c. Unpublished theses and dissertations
d. Monographs, manuscripts

Advantages of Primary Data Over Secondary Data
1. Primary data frequently give detailed definitions of terms and accurate statistical
units used in the survey.
2. Primary data lend more relevance to the researcher's study because of the researcher's direct
participation in the project.
3. Primary data are more reliable because of their first-hand nature.

Advantages of Secondary Data


1. Secondary data are more convenient to use because they are already
condensed and organized.
2. Analysis and interpretation are done more easily.
3. Libraries make secondary data more easily accessible.

SAMPLING TECHNIQUES

Now that we know the two types of data according to source, let us discuss
sampling concepts. Let us start by reviewing the definition of population and sample.

Population is the entire set of objects, people, or things under consideration.

Sample is a subset of the population that is available for analysis. It should be
representative of the population.

Sampling is the process of determining an adequate number of informants/respondents
that is representative, valid, and reliable in providing information or data. It is also defined
as the procedure by which a sample is selected.

Advantages of sampling

1. Low cost of sampling

If data were collected for the entire population, the cost would be quite
high. A sample is a small proportion of a population, so the cost of collecting data
is reduced if data are collected from a sample.

2. Enables the completion of a study within a reasonable period of time

Sampling takes less time than a census (a complete enumeration of the population).

3. Enables the investigation of a large population

Some populations are so large that their characteristics cannot be measured completely.
Before the measurement could be completed, the population would have
changed. Sampling makes it possible to arrive at generalizations
by studying the variables within a relatively small proportion of the population.

4. Makes data more relevant and accurate
Sampling permits a high degree of accuracy because the area of operations is limited. Moreover,
careful execution of field work is possible. As a result, the findings of sampling studies
are usually sufficiently accurate.

5. Easy organization of data

Organizational problems involved in sampling are few. Since the sample is
small, vast facilities are not required. Sampling is therefore economical in terms of
resources.

6. Avoids consuming all the sources of data

If sources of data are limited, sampling avoids consuming all of them.

7. Better rapport with respondents


An effective research study requires a good rapport between the researcher and
the respondents. When the population of the study is large, the problem of rapport
arises. But manageable samples permit the researcher to establish adequate
rapport with the respondents.

Disadvantages of sampling
The reliability of the sample depends upon the appropriateness of the sampling
method used. The purpose of sampling theory is to make sampling more efficient.
But the real difficulties lie in selection, estimation and administration of samples.

1. Chances of bias
A serious limitation of sampling is that it can involve biased selection
and thereby lead to erroneous conclusions. Bias arises when the
method used to select the sample is faulty.

2. Difficulties in selecting a truly representative sample

Selecting a truly representative sample is difficult when the phenomena under
study are complex.

3. Inadequate knowledge of the subject

Sampling requires adequate knowledge of sampling techniques.
It involves statistical analysis and calculation of probable error. When the
researcher lacks specialized knowledge in sampling, he or she may commit serious
mistakes. Consequently, the results of the study will be misleading.

4. Changeability of units
When the units of the population are not homogeneous, the sampling technique
will not be scientific. In sampling, even though the number of cases is small, it is not
always easy to stick to the selected cases. The units of the sample may be widely
dispersed.

5. Some of the cases in the sample may not cooperate with the researcher and some
may be inaccessible.
Because of these problems, not all the selected cases can be taken up, and the selected
cases may have to be replaced by other cases. Such substitution of units can
compromise the results of the study.

6. Sampling may not be possible

Deriving a representative sample is difficult when the universe is too small or too
heterogeneous. In this case, a census study is the only alternative. Moreover, in
studies requiring a very high standard of accuracy, the sampling method may be
unsuitable. There will be chances of error even if samples are drawn most
carefully.

General Types of sampling

There are two types of sampling methods:

1. Probability sampling involves random selection, allowing you to make strong
statistical inferences about the whole group. Probability sampling means that
every member of the population has a known, nonzero chance of being included in the sample.

2. Non-probability sampling involves non-random selection based on convenience
or other criteria, allowing you to easily collect data.

PROBABILITY SAMPLING

There are four main types of probability sampling:

1. Simple random sampling

In a simple random sample, every member of the population has an equal
chance of being selected. It is also known as the lottery or fishbowl technique. This
method has high external validity: the sample tends to represent the characteristics of the larger
population.

To conduct this type of sampling, you can use tools like random number
generators or other techniques that are based entirely on chance. To use this
method, there are some prerequisites:
➢ You have a complete list of every member of the population.
➢ You can contact or access each member of the population if they are
selected.
➢ You have the time and resources to collect data from the necessary sample
size.

Example: You want to select a simple random sample of 100 employees of
Company X. You assign a number to every employee in the company database
from 1 to 1000, and use a random number generator to select 100 numbers.
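
The Company X example can be sketched in a few lines of Python. This is only an illustration: the list of ID numbers stands in for the company database, and the fixed seed is an assumption made so that the result can be reproduced.

```python
import random

# Hypothetical sampling frame: employee ID numbers 1 to 1000, as in the example.
population_ids = list(range(1, 1001))

# The fixed seed is an assumption for reproducibility; it is not part of the method.
rng = random.Random(2022)

# random.sample draws without replacement, so no employee is selected twice
# and every employee has the same chance of being selected.
srs_sample = rng.sample(population_ids, k=100)

print(len(srs_sample))          # number of employees selected
print(sorted(srs_sample)[:10])  # a few of the selected ID numbers
```

In practice, the selected numbers would then be matched back to the names in the complete list of employees, which is why that list is a prerequisite.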

2. Systematic sampling
Systematic sampling is a probability sampling method in which researchers
select members of the population at a regular interval, k, determined in advance.

k = N/n, where N = population size and n = sample size

If the population order is random or random-like (e.g., alphabetical), then this method
will give you a representative sample that can be used to draw conclusions about the
population.

Example: A sample of 10 is to be taken from a company that has 100 employees.
Systematic random sampling is to be used. Who will be included in the sample?

Step 1. Determine k.

Given: N = 100 and n = 10

k = N/n = 100/10 = 10

Step 2. List down the names of all employees of the company in alphabetical
order.

Step 3. From the first 10 numbers, you randomly select a starting point: e.g.
number 6.

Step 4. From number 6 onwards, every 10th person on the list is selected (6, 16,
26, 36, 46, 56, 66, 76, 86, 96), and you end up with a sample of 10 people.
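
The four steps can also be written as a short Python sketch. The function name systematic_sample and its parameters are hypothetical, and the sketch assumes that N is a multiple of n so that the interval k is a whole number.

```python
import random

def systematic_sample(frame, n, start=None, rng=None):
    """Select every k-th unit from an ordered list, where k = N // n."""
    N = len(frame)
    k = N // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, k)  # random start among the first k positions
    # Positions are 1-based to match the worked example.
    positions = list(range(start, N + 1, k))[:n]
    return [frame[p - 1] for p in positions]

# Positions 1 to 100 on the alphabetical list of employees.
employees = list(range(1, 101))

# Starting point 6, as in Step 3 of the example.
print(systematic_sample(employees, n=10, start=6))
# [6, 16, 26, 36, 46, 56, 66, 76, 86, 96]
```

If start is left out, the function picks the starting point at random from the first k positions, which is what Step 3 asks for.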

When using systematic sampling with a population list, it’s essential to consider
the order in which your population is listed to ensure that your sample is valid. If your
population is in ascending or descending order, using systematic sampling should still
give you a fairly representative sample, as it will include participants from both the
bottom and top ends of the population. You should not use systematic sampling if your
population is ordered cyclically or periodically, as your resulting sample cannot be
guaranteed to be representative.

Example: If you are sampling from a list of individuals ordered by age, systematic
sampling will result in a sample drawn from the entire age spectrum. If
you instead used simple random sampling, it is possible (although unlikely)
that you would end up with only younger or older individuals.
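
The warning about cyclically ordered lists can be illustrated with a small sketch. The data here are invented for illustration: 100 daily records listed by date, so the day of the week repeats every 7 positions. The helper every_kth is a hypothetical name.

```python
# Hypothetical list: 100 daily records ordered by date, starting on a Monday,
# so the day of the week repeats every 7 positions.
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
records = [days[i % 7] for i in range(100)]

def every_kth(frame, k, start):
    """Return every k-th unit beginning at 1-based position `start`."""
    return frame[start - 1::k]

# Interval equal to the cycle length: every selected record falls on the same weekday.
print(set(every_kth(records, k=7, start=3)))       # {'Wed'}

# Interval that does not line up with the cycle: all seven weekdays appear.
print(len(set(every_kth(records, k=10, start=3))))  # 7
```

When the interval matches the cycle in the list, the sample sees only one part of the cycle, which is why systematic sampling should be avoided for such lists.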

3. Stratified sampling
Stratified sampling involves dividing the population into subpopulations that
may differ in important ways. It allows you to draw more precise conclusions by
ensuring that every subgroup is properly represented in the sample. To draw a stratified
sample, divide the population into homogeneous subpopulations called strata (the
plural of stratum) based on specific characteristics (e.g., race, gender, location,
etc.). Every member of the population should be in exactly one stratum. Each
stratum is then sampled using another probability sampling method, such as cluster
or simple random sampling, allowing researchers to estimate statistical measures for
each subpopulation. Researchers rely on stratified sampling when a population’s
characteristics are diverse and they want to ensure that every characteristic is
properly represented in the sample.

Example: The company has 800 female employees and 200 male employees. You
want to ensure that the sample reflects the gender balance of the company, so you
sort the population into two strata based on gender. Then you use random sampling
on each group, selecting 80 women and 20 men, which gives you a representative
sample of 100 people.
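
The 80 women and 20 men in this example come from proportional allocation: each stratum contributes n × (stratum size / N) units, so 100 × 800/1000 = 80 and 100 × 200/1000 = 20. A minimal Python sketch follows; the function name, the employee labels, and the seed are all hypothetical, and the rounding step is an assumption that works here because the allocations are whole numbers.

```python
import random

def proportional_stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    N = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n_h = round(n * len(members) / N)  # proportional allocation for this stratum
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical employee lists for the two strata in the example.
strata = {
    "female": [f"F{i}" for i in range(1, 801)],
    "male": [f"M{i}" for i in range(1, 201)],
}

chosen = proportional_stratified_sample(strata, n=100, rng=random.Random(1))
print({name: len(members) for name, members in chosen.items()})
# {'female': 80, 'male': 20}
```

When the allocations are not whole numbers, simple rounding may give a total slightly different from n, so the stratum sizes would need to be adjusted by hand.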

4. Cluster sampling
In cluster sampling, researchers divide a population into smaller groups known
as clusters. They then randomly select among these clusters to form a sample.
Cluster sampling is a method of probability sampling that is often used to study large
populations, particularly those that are widely geographically dispersed.
Researchers usually use pre-existing units such as schools or cities as their clusters.

The simplest form of cluster sampling is single-stage cluster sampling. It involves 4 key
steps.
Step 1: Define your population
As with other forms of sampling, you must first begin by clearly defining
the population you wish to study.

Step 2: Divide your population into clusters

This is the most important part of the process. The quality of your
clusters and how well they represent the larger population determine
the validity of your results. Ideally, your clusters should meet the
following criteria:

➢ Each cluster’s population should be as diverse as possible. You
want every potential characteristic of the entire population to be
represented in each cluster.
➢ Each cluster should have a similar distribution of characteristics as
the distribution of the population as a whole.
➢ Taken together, the clusters should cover the entire population.
➢ There should not be any overlap between clusters (i.e., the same
people or units do not appear in more than one cluster).

Because clusters are usually naturally occurring groups, such as
schools, cities, or households, they are often more homogeneous than the
population as a whole. You should be aware of this when performing your
study, as it might affect its validity.

Step 3: Randomly select clusters to use as your sample


If each cluster is itself a mini-representation of the larger population,
randomly selecting and sampling from the clusters allows you to imitate
simple random sampling, which in turn supports the validity of your results.
Conversely, if the clusters are not representative, then random sampling will
allow you to gather data on a diverse array of clusters, which should still
provide you with an overview of the population as a whole.

Step 4: Collect data from the sample


You then conduct your study and collect data from every unit in the selected
clusters.

Research example: You are interested in the average reading level of all the seventh-
graders in your city. It would be very difficult to obtain a list of all seventh-graders and
collect data from a random sample spread across the city. However, you can easily obtain
a list of all schools and collect data from a subset of these. You thus decide to use the
cluster sampling method.

Step 1: Define your population. In your reading-level study, your population is all
the seventh-graders in your city.

Step 2: You cluster the seventh-graders by the school they attend. To cover the
whole population, you need to include every school in the city. There is no
overlap because each student attends only one school.

Step 3: You assign a number to each school and use a random number generator to
select a random sample.

Step 4: You test the reading levels of every seventh-grader in the schools that were
randomly selected for your sample.
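
Steps 3 and 4 of this research example can be sketched in Python as follows. The number of schools, the number selected, the student labels, and the seed are all invented for illustration; only the procedure (randomly select whole schools, then include every seventh-grader in them) comes from the example.

```python
import random

rng = random.Random(7)  # fixed seed is an assumption, used only for reproducibility

# Hypothetical frame: 20 schools, each with its own list of seventh-graders.
schools = {
    f"School {s}": [f"S{s}-student{i}" for i in range(1, rng.randint(30, 60) + 1)]
    for s in range(1, 21)
}

# Step 3: randomly select clusters (schools), not individual students.
selected_schools = rng.sample(sorted(schools), k=5)

# Step 4: include every seventh-grader in each selected school.
cluster_sample = [
    student for school in selected_schools for student in schools[school]
]

print(selected_schools)
print(len(cluster_sample))
```

Note that only a list of schools is needed to draw the sample; a list of all seventh-graders in the city is never required.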

Multi-stage cluster sampling


In multi-stage clustering, rather than collect data from every single unit in the
selected clusters, you randomly select individual units from within the cluster to use
as your sample. You can then collect data from each of these individual units – this is
known as double-stage sampling.

You can also continue this procedure, taking progressively smaller and smaller
random samples, which is usually called multi-stage sampling. You should use this
method when it is not feasible or it is too expensive to test the entire cluster.

Example: Multistage sampling. Instead of collecting data from every seventh-grader
in the selected schools, you narrow down your sample in two additional
stages:
➢ From each school, you randomly select a sample of seventh-grade
classes.
➢ From within those classes, you randomly select a sample of
students.
The resulting sample is much smaller and therefore easier to collect data from.
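
The three stages in this example (schools, then classes, then students) can be sketched in Python. The frame below, the function name multistage_sample, and the numbers chosen at each stage are hypothetical and serve only to show the nesting of the random selections.

```python
import random

# Hypothetical frame: school -> class -> list of students
# (20 schools, 4 seventh-grade classes each, 30 students per class).
frame = {
    f"School {s}": {
        f"Class {c}": [f"S{s}-C{c}-student{i}" for i in range(1, 31)]
        for c in range(1, 5)
    }
    for s in range(1, 21)
}

def multistage_sample(frame, n_schools, n_classes, n_students, rng):
    """Randomly select schools, then classes within them, then students within those."""
    sample = []
    for school in rng.sample(sorted(frame), n_schools):          # stage 1: schools
        classes = frame[school]
        for cls in rng.sample(sorted(classes), n_classes):       # stage 2: classes
            sample.extend(rng.sample(classes[cls], n_students))  # stage 3: students
    return sample

students = multistage_sample(
    frame, n_schools=5, n_classes=2, n_students=10, rng=random.Random(11)
)
print(len(students))  # 5 schools x 2 classes x 10 students = 100
```

Compared with testing every seventh-grader in 5 schools (600 students in this made-up frame), the multistage sample needs only 100.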

NON-PROBABILITY SAMPLING
In non-probability sampling, individuals are selected based on non-random
criteria, and not every individual has a chance of being included. This type of sample is
easier and cheaper to access, but it has a higher risk of sampling bias. That means the
inferences you can make about the population are weaker than with probability samples,
and your conclusions may be more limited. If you use a non-probability sample, you should
still aim to make it as representative of the population as possible.

1. Convenience sampling/ Accidental sampling
A convenience sample, also known as an accidental sample, simply includes the
individuals who happen to be most accessible to the researcher. This is an easy and
inexpensive way to gather initial data, but there is no way to tell if the sample is
representative of the population, so it can’t produce generalizable results.

Example: You are researching opinions about student support services in your
university, so after each of your classes, you ask your fellow students to complete
a survey on the topic. This is a convenient way to gather data, but as you only
surveyed students taking the same classes as you at the same level, the sample is
not representative of all the students at your university.

2. Voluntary response sampling


Similar to a convenience sample, a voluntary response sample is mainly
based on ease of access. Instead of the researcher choosing participants and
directly contacting them, people volunteer themselves (e.g. by responding to a
public online survey). Voluntary response samples are always at least somewhat
biased, as some people will inherently be more likely to volunteer than others.

Example: You send out the survey to all students at your university and a lot of
students decide to complete it. This can certainly give you some insight into the
topic, but the people who responded are more likely to be those who have strong
opinions about the student support services, so you can’t be sure that their opinions
are representative of all students.

3. Purposive sampling
This type of sampling, also known as judgement sampling, involves the
researcher using their expertise to select a sample that is most useful to the purposes
of the research.

Example: You want to know more about the opinions and experiences of disabled
students at your university, so you purposefully select a number of students with
different support needs in order to gather a varied range of data on their
experiences with student services.

4. Snowball sampling/ Referral Sampling


If the population is hard to access, snowball sampling can be used to recruit
participants via other participants. The number of people you have access to
“snowballs” as you get in contact with more people.

Example: You are researching experiences of homelessness in your city. Since there is
no list of all homeless people in the city, probability sampling isn’t possible. You meet
one person who agrees to participate in the research, and she puts you in contact
with other homeless people that she knows in the area.
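
The way a snowball sample grows through referrals can be sketched as follows. The referral network, the function name snowball_sample, and the size limit are hypothetical; the sketch assumes that each recruited participant names the people they know and that recruitment proceeds wave by wave.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Recruit participants wave by wave through referrals (breadth-first)."""
    recruited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        recruited.append(person)
        # Each participant refers the people they know who are not yet contacted.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return recruited

# Hypothetical referral network: A is the first participant you meet.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "E": ["F"]}

print(snowball_sample(referrals, seeds=["A"], max_size=5))
# ['A', 'B', 'C', 'D', 'E']
```

Because everyone in the sample is reached through the first participant's contacts, people outside that network have no chance of being included, which is why this is a non-probability method.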

5. Quota sampling
This is one of the most common forms of non-probability sampling. Sampling is
done until a specific number of units (quotas) for various sub-populations have been
selected. Since there are no rules as to how these quotas are to be filled, quota
sampling is really a means for satisfying sample size objectives for certain sub-
populations. The quotas may be based on population proportions.

Example: If there are 100 men and 100 women in a population and a sample of 20
is to be drawn to participate in a cola taste challenge, you may want to divide
the sample evenly between the sexes: 10 men and 10 women. Quota sampling can
be considered preferable to other forms of non-probability sampling (e.g.,
judgement sampling) because it forces the inclusion of members of different sub-
populations.
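
The cola taste challenge example can be sketched in Python. The stream of arrivals is invented, and the function name quota_sample is hypothetical. The sketch assumes people are accepted in the order they turn up, which reflects the point that there are no rules for how the quotas are filled.

```python
def quota_sample(arrivals, quotas):
    """Accept people in the order they turn up until each group's quota is full."""
    counts = {group: 0 for group in quotas}
    accepted = []
    for person, group in arrivals:
        if group in counts and counts[group] < quotas[group]:
            accepted.append(person)
            counts[group] += 1
        if counts == quotas:  # stop once every quota has been met
            break
    return accepted, counts

# Hypothetical stream of passers-by: every third person is a woman.
arrivals = [(f"P{i}", "man" if i % 3 else "woman") for i in range(1, 101)]

accepted, counts = quota_sample(arrivals, {"man": 10, "woman": 10})
print(counts)  # {'man': 10, 'woman': 10}
```

Nothing in this procedure is random, so although the quotas guarantee 10 men and 10 women, the people accepted are simply the first ones to arrive.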

C. LEARNING ACTIVITY
Accomplish the following:
The medical clinic wants to know the acceptability of the COVID-19 vaccine
among the students of Pines City Colleges. Before the school decides to avail of the
vaccine, it wants to estimate how many students would want to be
vaccinated, so it has decided to conduct a survey first. The school has 1,200 students
in years 1–4, with 300 students in each year. Assume the sample size is 30.

1. How would you choose them? Explain your answer.

2. A medical clinic staff member got the names of all 1,200 students in the school and put
them in a hat. Then she pulled out 300 names. What do you think of this sampling
method? Explain your answer.

3. The school clinic set up a booth in front of the bookstore. Anyone who wanted
to stop and fill out a survey could. The staff member stopped collecting surveys when she got
300 students to complete them. What do you think of this method? Explain your
answer.

References:
Brase, C.H. (1987). Understandable Statistics. USA: D.C. Heath and Company.
Downie, N.M. (1983). Basic Statistical Methods. (Third edition). USA: Harper and Row
Publishers.
Febre, F. A. (1994). Introduction to Statistics. Philippines: Phoenix Publishing House,
Inc.
Lingren, B.W. (1981). Elementary Statistics. USA: Macmillan Publishing Co., Inc.
Pagoso, C.M. (1985). Introductory Statistics. Philippines: Rex Printing Co., Inc.
Zorilla, R.S. (2009). Statistics, Basic Concepts and Applications. Philippines: Mutya
Publishing House, Inc.
https://www.biostat.washington.edu/about/biostatistics
https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch13/nonprob/5214898-eng.htm
https://www.youtube.com/watch?v=YgXfTnyCOJQ&feature=emb_logo
https://www.statisticshowto.com/probability-and-statistics/sampling-in-statistics

Prepared by: Rowena Tolentino-Acacio, Instructor

Noted by: Dr. Joseph Charles Herrero, Head
