
LECTURE 8

Introduction to Sampling Theory


STAT 3001 - Experimental Design and Sampling Theory

The University of the West Indies, St Augustine


Lecturer: Devika Bhagwandin

February 28th 2023

Suppose that the NWRHA wants to implement new medical services but
first needs to ascertain the population of children of primary school age
who have been immunised against childhood infectious diseases.
Conducting such an exercise involves:
1. The engagement of a large number of field officers in an attempt to
collect information from parents of these children or from
administrative records such as school immunisation cards for every
child of appropriate age residing in the NWRHA coverage.
2. Such a large staff requires a large budget to cover
employment costs as well as travel expenses (economic constraints).
3. Inevitably the conduct of such an exercise even with a large staff
would take a long time (time constraints).
In light of such challenges, statistical officers, companies and researchers
utilise a process of sampling.
Sampling is defined as the process of selecting a subset (or sample) of
individuals to observe or measure from a larger population, in order to
estimate characteristics of the entire population.

Social scientists conduct surveys to collect a sample, whereas physical
scientists perform experiments.
Why use sample surveys?
➢ We want to make inferences about a set of measurements in a
population. The medium of inference is the sample.
➢ Measuring items in a sample is less time-consuming and less costly
than measuring all items in the population.
Sample surveys are used by:
➣ governments in determining allocation of funds to cities
➣ businesses to forecast sales, to manage personnel and to establish
future site locations
➣ urban and regional planners to plan land use, transportation
networks and energy consumption
➣ social scientists to study economic conditions, racial balance and
other aspects of the quality of life

Technical Terms:
▶ An element is an object on which a measurement is taken.
▶ A population is a collection of elements about which we wish to
make an inference.
▶ Sampling units are non-overlapping collections of elements from the
population that cover the entire population.
▶ A frame or sampling frame is a list of sampling units.
▶ A sample is a collection of sampling units drawn from a frame or
frames.
▶ A census is a complete count of the population. Observations of all
sampling units are collected in a census.

Consider the following: In a certain community an opinion poll was
conducted to determine public sentiment toward a bond issue in an
upcoming election. The objective of the survey was to estimate the
proportion of voters in the community who favored the bond issue.

From this example we have the following:
Element: a registered voter in the community
Population: the collection of registered voters in the community
Sampling units: This may be a registered voter (where each sampling
unit has one element) or households (collection of
elements). Note that the sampling units must be defined so that no
voter in the population can belong to more than one sampling unit,
i.e. the units are non-overlapping.
Frame: If we choose a registered voter as the sampling unit then the
frame is a list of all registered voters. If we choose households
as the sampling unit then a telephone directory or a list of
household heads can serve as the frame.
Sample: The sample would contain the voters that were contacted to
determine their preference on the issue.
We would then use the information obtained from these voters to make
an inference about voter preference throughout the community.
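
The relationship between elements, sampling units and the frame in this example can be sketched in Python. This is only an illustration: the voter names, household labels and the helper units_are_non_overlapping are invented here and are not part of the original example.

```python
# Hypothetical registered voters (elements), grouped into households
# (sampling units). All names and labels are made up for illustration.
households = {
    "H1": ["Ann", "Brian"],
    "H2": ["Carla"],
    "H3": ["Dev", "Esha", "Farid"],
}

# The frame is the list of sampling units (here, household labels).
frame = sorted(households)

# The population is the collection of all elements (registered voters).
population = [voter for members in households.values() for voter in members]


def units_are_non_overlapping(units):
    """Return True if no element belongs to more than one sampling unit."""
    seen = set()
    for members in units.values():
        for element in members:
            if element in seen:
                return False
            seen.add(element)
    return True


print(frame)
print(len(population))
print(units_are_non_overlapping(households))
```

If a voter were listed under two households, the check would return False, signalling that the sampling units need to be redefined before a sample is drawn.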

Elements of the sampling problem
The objective of sample surveys is to make inferences about a population
from information contained in a sample selected from that population.
The inference usually takes the form of estimating a population mean
(for example, the mean income per household) or proportion (for example,
the proportion of voters favouring a certain issue, as before).

How to select a sample: Design of the sample survey


Each item or observation taken from the population contains a certain
amount of information about the parameters of interest. Because
information costs money, an experimenter must determine how much
information is needed; too little information prevents the experimenter
from making good estimates/inferences, while too much of it results in a
waste of money.
The quantity of information obtained depends on:
1. the number of items sampled (or sample size)
2. the amount of variation in the data
The variation can be controlled to some extent by the method of selecting
the sample, called the design of the sample survey.
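
To see how both factors affect the information in a sample, consider a minimal sketch that assumes a simple random sample from a large population, where the standard error of a sample mean is approximately the population standard deviation divided by the square root of the sample size. The function name and the numbers used are hypothetical and chosen only for illustration.

```python
import math


def standard_error_of_mean(std_dev, n):
    """Approximate standard error of a sample mean: sigma / sqrt(n).

    Assumes simple random sampling from a large population.
    """
    return std_dev / math.sqrt(n)


# Hypothetical standard deviations and sample sizes.
for std_dev in (5.0, 10.0):
    for n in (25, 100, 400):
        print(std_dev, n, standard_error_of_mean(std_dev, n))
```

A smaller standard error means more information about the parameter. Under this assumption, quadrupling the sample size halves the standard error, and so does halving the variation in the data.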

Sample surveys can be classified into two broad classes: probability
sampling and non-probability sampling.
A probability sample has the characteristic that every element in the
population has a known, non-zero probability of being included in the
sample.
Note that because every element has a known chance of being selected,
unbiased estimates of population parameters (such as population means,
totals and proportions) can be constructed from the sample data.
Examples of probability sampling are:
1. Simple random sampling
2. Stratified random sampling
3. Systematic sampling
4. Cluster sampling
We will discuss these methods in detail in the upcoming lectures.
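
As a preview, the idea of a known, non-zero inclusion probability can be sketched for the simplest case, simple random sampling without replacement. The frame of 1000 voters, the sample size of 50 and the function names are assumptions made for illustration only.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct sampling units; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


def inclusion_probability(frame_size, n):
    """Under simple random sampling without replacement, each unit
    is included in the sample with probability n / N."""
    return n / frame_size


# Hypothetical frame of 1000 registered voters.
frame = [f"voter_{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, n=50, seed=2023)

print(sample[:5])
print(inclusion_probability(len(frame), 50))
```

Because the selection is left to a random number generator, every voter on the frame has the same known chance of being chosen, which is what allows unbiased estimates to be constructed.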

A non-probability sample is one in which the probability of each
member of the population being selected is unknown.

Since probabilities are unknown in non-probability sampling, there is no
firm method of evaluating either the reliability or the validity of resulting
estimates.
Examples of non-probability sampling:
1. Accidental, haphazard or convenience sampling
2. Purposive or judgemental sampling
3. Quota Sampling

1. In convenience sampling, subjects/items are selected based only on
the fact that they are easy, inexpensive or convenient to sample.
− Interviews conducted by tv news programs to get a quick reading of
public opinion
− Interview the first n people to enter a mall
− Use of college students in psychological research
Problem: There is no evidence that the sample represents the population
to which the results are being generalized.

2. In a judgement sample you get the opinions of pre-selected experts
in the subject matter i.e. sampling with a purpose in mind.
− Targeting a specific group, for example in market research
Problem: Likely to overweight some subgroups in the population.

3. In quota sampling you try to get a comparable percentage of
respondents within each of the subgroups in the original population.
− Suppose a school’s student body contains 60% men and 40% women.
For a sample of 100, sample until we get 60 men and 40 women, i.e. the
proportion of subgroups in the sample is the same as in the population.
Problem: At first it may seem that quota sampling is representative of
the population; however, it often gives poor results. This is because the
final selection of the respondent is left up to the subjective judgement of
the interviewer rather than being determined objectively.

Quota sampling versus probability sampling:


In quota sampling, interviewers tend to be given general instructions, e.g.
“find 2 men and 3 women, where 4 are over 25 and 1 is under 25 years.”

In probability sampling, interviewers are given names or addresses already
selected by a randomization device, without human subjectivity.
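
The contrast between the two approaches can be sketched as follows. This is a simplified illustration, not a description of any real survey procedure: the respondents, their order of arrival, the quotas and the function names are all hypothetical, and the interviewer's subjective choice is modelled simply as taking whoever is encountered first.

```python
import random


def quota_sample(arrivals, quotas):
    """Accept respondents in the order encountered until every quota is filled.

    arrivals: list of (person, group) pairs in the order the interviewer meets them.
    quotas:   dict mapping group -> number of respondents wanted.
    """
    counts = {group: 0 for group in quotas}
    chosen = []
    for person, group in arrivals:
        if group in quotas and counts[group] < quotas[group]:
            chosen.append(person)
            counts[group] += 1
        if counts == quotas:
            break
    return chosen


def probability_sample(frame, n, seed=None):
    """Names are selected in advance by a randomization device."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical respondents: M = man, F = woman.
arrivals = [("p1", "M"), ("p2", "M"), ("p3", "F"), ("p4", "M"),
            ("p5", "F"), ("p6", "F"), ("p7", "M")]

print(quota_sample(arrivals, {"M": 3, "F": 2}))
print(probability_sample([person for person, _ in arrivals], 5, seed=1))
```

In the quota version, who ends up in the sample depends entirely on who the interviewer happens to meet or choose first, so the selection probabilities are unknown. In the probability version they are fixed by the randomization.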

Sources of error in Surveys


We group errors in sample surveys into two major classes: errors of
non-observation, where the sampled elements make up only part of the
target population, and errors of observation, where the recorded data
deviate from the truth.
Errors of non-observation can be attributed to sampling, coverage or
nonresponse.
❖ Sampling error is the deviation between an estimate from an ideal
sample and the true population value. It arises simply because we
are using a sample and not a census. Sampling error can be reduced
by good survey design and an appropriate choice of sample size.
❖ In almost all surveys, the sampling frame does not match up
perfectly with the target population leading to errors of coverage.
For example, with respect to telephone surveys, telephone directories
are inadequate because of unlisted numbers.

❖ Nonresponse is a particularly difficult and important problem in
surveys that attempt to collect information directly from people
through some sort of interview. Nonresponse may arise in one of
three ways: the inability to contact the sample element (person or
household), the inability of the respondent to come up with an
answer to the question of interest, or direct refusal to answer.
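
Two of the errors of non-observation above, sampling error and coverage error, can be illustrated with a small simulation. Every number here is hypothetical: a population of 10,000 households in which 1 means "favours the issue", with the unlisted households assumed to hold different opinions from the listed ones.

```python
import random

# Hypothetical population: 1 = favours the issue, 0 = does not.
listed = [1] * 4000 + [0] * 4000     # households in the telephone directory
unlisted = [1] * 1600 + [0] * 400    # households missing from the directory
population = listed + unlisted


def proportion(values):
    """Proportion of ones in a list of 0/1 values."""
    return sum(values) / len(values)


true_value = proportion(population)


def sampling_error(frame, n, seed):
    """Estimate from a simple random sample of size n, minus the
    proportion in the frame the sample was drawn from."""
    rng = random.Random(seed)
    return proportion(rng.sample(frame, n)) - proportion(frame)


# Sampling error varies from sample to sample and is zero for a census.
for n in (25, 400, len(population)):
    print(n, round(sampling_error(population, n, seed=8), 4))

# Coverage error remains even if every listed household is measured.
coverage_error = proportion(listed) - true_value
print(round(coverage_error, 2))
```

Increasing the sample size tends to shrink the sampling error, but it does nothing for the coverage error, which comes from the mismatch between the frame and the target population.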

Errors of observation
Once a person (or object) is in place and ready to be ‘measured’, there
are still more errors that can creep into the survey. These are errors of
observation and can be attributed to the interviewer (data collector),
respondent, instrument or method of data collection.
❖ Interviewer - the interviewer's manner (friendly interviews are more
successful than forceful ones), stress placed on certain words etc.
can influence the response
❖ Respondent - error can be introduced based on the ability to
understand the question, ability to answer accurately etc.
❖ Measurement instrument - errors can be made in the definition of
the survey question

❖ Method of collection - interviews, questionnaires etc. have their own
sources of error

Exercise: Read sections 2.5 and 2.6 in Scheaffer 7th edition
(pages 29 and 37)
