
ECON 1005 Lectures

Sampling
(based largely on PS Mann Appendix 1)

1
Recall the 5 steps in a Statistical Investigation
• Question
• Sampling procedure
• Collection of data
• Make generalizations
• Decide on reliability

2
Why Sample?
• Why not take a census?

• Impossibility of conducting a census due to
time, cost, lack of access to the entire
population, etc.

3
Sample Objectives
• To be as representative as possible of the
underlying population

• Why is this important?

• Because we are going to generalise from the
sample to the population; we are going to use
the information from the sample and assume
that it is true of the entire population
4
Sampling Approaches
We will discuss in this topic the following
sampling approaches:
• Random sampling
– Simple random sampling
– Stratified random sampling
– Cluster sampling

• Non-Random sampling
– Quota sampling

5
Population versus Sample
• Recall the difference between a Population
and a Sample of a Population.

• Population = all elements – individuals, items
or objects – whose characteristics are being
studied

• Sample = a portion of the population selected
for study
6
Census versus Sample Survey
• How do we collect information? Via a survey.

• Census = a survey that includes every element
of the target population

• Sample survey = a survey that collects
information from a portion of the population

7
Example / Activity
• I am interested in the ages of first year students at
UWI. I ask the ages of my ECON 1005 class.
– What is the population?
– What is the sample?
– Do you see any problems with this sample?

• Representative Sample = a sample that represents the
characteristics of the population as closely as possible.

• Why would we want a representative sample?

• There are different sampling techniques that can aid us
in obtaining a representative sample.
8
Sampling Frame
• The Sampling Frame is an exhaustive list of the
elements of a population from which the
sample is to be drawn.

• Examples
– The population census data
– Telephone directory
– Students registration database at the UWI

9
Activity
• Name the populations that
correspond to the following sampling
frames:
–The population census data
–Telephone directory
–Students registration database at
the UWI
–VAT database
10
Activity
• Explain how you will develop the
sampling frame for each of the following
populations:
– Level I Courses offered by the Faculty of Social
Sciences
– Citizens of CARICOM countries
– Level I students of the Faculty of Social Sciences

11
Sampling frame vs Sample – Situation 1
Suppose that we want to measure the proportion of cricket fans across
the Caribbean Region who support the West Indies cricket team

• The population for the purpose of this measurement (otherwise
called a statistical investigation) is the set of all cricket fans across
the islands of the Caribbean.

Suppose instead, I opted to take the measurement only among West
Indian cricket fans attending a recent match at the Queen’s Park Oval.

• In this case, I would have measured the support for the Windies
team across a subset of the population i.e. a sample of the
population.

12
With Replacement or Without Replacement?
• A sample may be selected with replacement or without
replacement.
• In sampling with replacement, each time we select an element
from the population, we put it back in the population before we
select the next element of the sample
• Therefore, the population contains the same number of items
each time a selection is made, and we can also select the same
item more than once
• Sampling without replacement occurs when the selected
element is not replaced in the population
• Therefore, each time we make a selection, the size of the
population is reduced by one element. We cannot select the
same item more than once.
• Most of the time, samples taken in statistics are drawn without replacement.
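
The two selection schemes above can be sketched in a few lines of Python. This is only an illustration: the population of 20 numbered items, the sample size of 8 and the seed are assumptions made for the example, not part of the lecture.

```python
import random

rng = random.Random(1005)          # fixed seed so the illustration is repeatable
population = list(range(1, 21))    # hypothetical population of 20 numbered items

# With replacement: every draw is made from the full population,
# so the same item can turn up more than once.
with_replacement = rng.choices(population, k=8)

# Without replacement: a selected item is not put back,
# so every item in the sample is distinct.
without_replacement = rng.sample(population, k=8)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

A sample drawn with replacement may or may not contain repeats on any particular run; a sample drawn without replacement never does.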
13
Census vs Sample Survey

What criteria should be used in deciding whether to
undertake a census or a sample survey?
• Time
• Money
• Problems with the sampling frame
• Level of accuracy required

14
Challenge in Sampling
• Consider the measurement of support for the
Windies team.
• While we measure the support only among a subset
of the population, for the reasons discussed above,
we want the resulting proportion of supporters to be
almost the same as the one we would have obtained
from a census.
• We can only achieve such a result if the sample is
representative of the population.
• We might say that this is the challenge of
sampling.

15
Activity
• Create a statistical investigation around Level I
of the Faculty of Social Sciences

• Define the related population

• Name & describe a sample drawn from that
population

16
Random vs Non-random Samples
• A sample may be random or nonrandom.
• A random sample is a sample that has been
drawn so that each element of the population
has a measurable chance of being selected for
the sample.
• If the selection procedure leaves some elements
of the population with no measurable chance of
being included, the sample is nonrandom.
• It is important to note that the randomness is in
the procedure and any corruption of that
procedure is likely to “corrupt” the randomness,
thereby introducing selection bias.

17
Activity
• Choosing a random sample is similar to drawing names
from a hat.

• Imagine that I want to draw a random sample
of 10 students from a Statistics class in
which 300 students are registered.

• If I write the names of all 300 students on identical
pieces of paper, put these names in a hat, shake it
properly and draw 10 pieces of paper from the hat, one at
a time without replacement, then I would have fulfilled
all the requirements of a random sample.

• Why will the sample be random?
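
The hat draw can be mimicked on a computer. The sketch below assumes a made-up class list of 300 labels standing in for the names on the pieces of paper; the labels and the seed are hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed, only so the run is repeatable

# Hypothetical class list: 300 labels stand in for the names on the slips.
class_list = [f"student_{i:03d}" for i in range(1, 301)]

# random.sample draws without replacement, and every possible group of
# 10 students is equally likely, as with a well-shaken hat.
hat_sample = rng.sample(class_list, k=10)
print(hat_sample)
```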


18
Activity

• Is sampling with replacement random? Or is it
non-random?

• Is sampling without replacement random? Or
is it non-random?

19
Sampling Frames and Random Sampling
• No matter what random sampling design is selected,
the actual implementation of the exercise and the
explicit choice of the individual members of the sample
cannot be carried out without a sampling frame.

• Sampling frames may themselves be responsible for
many of the inaccuracies of the resultant sample.

• This happens, for instance, if the frame is incomplete
(which immediately excludes some members of the
population from the sample), if the frame is inaccurate,
or if the frame is out of date.

20
Sampling Frames and Random Sampling

Notwithstanding these limitations, a sampling frame is an
immediate requirement for the selection of a sample, and
no actual frame will possess all of the characteristics of an
ideal sampling frame.

What an investigator must consider at this point,
therefore, is all the frames that are available and their
limitations, and in this context select the frame that will
enable the most complete, accurate and convenient
sample of the population under study.

21
Situation 2
Suppose, for example, that I want to determine the
average age of the statistics class of 300 students by
using a random sample of 10 students drawn from
the class.

The random nature of the selection procedure does
not necessarily ensure a truly representative sample
since, for instance, it is quite within the realm of
possibility to draw the ten youngest members of the
class. Using the average age of these 10 as the average
age of the entire class would then be misleading.
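
A small simulation illustrates the point. The ages below are invented for the purpose (270 students aged 18 to 21 and 30 older students); only the class size of 300 and the sample size of 10 come from the situation described above.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical ages for a class of 300 (an assumption, not real data).
ages = ([rng.choice([18, 19, 20, 21]) for _ in range(270)]
        + [rng.randint(25, 45) for _ in range(30)])
class_mean = statistics.mean(ages)

# Draw many simple random samples of 10 and record each sample mean.
sample_means = [statistics.mean(rng.sample(ages, 10)) for _ in range(1000)]

# The unlucky extreme mentioned above: the ten youngest students.
youngest_ten_mean = statistics.mean(sorted(ages)[:10])

print(f"class mean age:         {class_mean:.2f}")
print(f"smallest sample mean:   {min(sample_means):.2f}")
print(f"largest sample mean:    {max(sample_means):.2f}")
print(f"mean of the 10 youngest: {youngest_ten_mean:.2f}")
```

Every one of these samples is random, yet the sample means scatter around the class mean, and some land well away from it.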

22
Simple Random Sampling
What is it?
• A simple random sample is a sample that has
been drawn so that each element of the
population has an equal chance of being
selected for the sample.

23
Activity
• What if I were to pick a sample from this class at this
moment, by “randomly” choosing students?
– I can walk along the aisle, choosing people with my hands
– I can stay at the podium, choosing people with my eyes

• Will either of these be a simple random sample?

• Graham Kalton (1983) - “Choosing a simple random sample is
similar to drawing the winning numbers in the Lottery.”

• Describe the methods used to draw the winning numbers for
the Lottery and a Raffle.

• How do these methods fulfill the requirements of a simple
random sample?
24
Simple Random Sampling - Its Advantages

• Simple random samples enjoy the principal
advantage that they eliminate selection bias.

• Furthermore, the elimination of selection bias
in simple random sampling has to do with the
mode of selection and has nothing to do with
the representativeness of the resultant
sample.

25
Simple Random Sample - Its Limitations
• Simple random samples can be very costly, for instance
when the population being sampled is distributed across a
wide (geographical) area.

• Taking a simple random sample does not automatically
ensure that the results obtained are reliable. It is
possible that a sample selected by a simple random
method will be a bad or imprecise indicator of the
population from which it was taken, so leading to
inaccurate estimates.

• In particular, when a population is stratified, simple random
sampling can result in a sample that is made up of
elements from just one or two strata, in which case the
sample will not be representative of the population.

26
Stratified Random Sampling
• Suppose I was interested in the average age of a student in ECON 1005
– What is my population?
– Suppose I take this class as my sample, and find the average age
– Will the sample be representative of the population?
– What has gone wrong?

• The stratified random sampling design involves the division of the
population into various categories or strata (singular: stratum) using
what is known as a stratification factor.

• The stratification factor must be chosen so that the strata are mutually
exclusive and cover the whole population, i.e. each member of the
population must be assigned to exactly one stratum.

• A simple random sample is then drawn from each stratum to ensure
adequate representation of all strata in the sample.

• The collection of simple random samples constitutes the stratified
random sample.

27
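
The stratified design described above can be sketched as follows. The function name `stratified_sample`, the enrolment labels used as the stratification factor, the class of 300 and the sample sizes per stratum are all hypothetical choices made for the illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sizes, rng):
    """Draw a simple random sample of sizes[s] elements from each stratum s."""
    strata = defaultdict(list)
    for element in population:
        # Each element is placed in exactly one stratum.
        strata[stratum_of(element)].append(element)
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, sizes[name]))
    return sample

rng = random.Random(3)

# Hypothetical class: (student_id, enrolment type), 200 of one type, 100 of the other.
students = ([(i, "full-time") for i in range(200)]
            + [(i, "evening") for i in range(200, 300)])

# Assumed allocation: 10% of each stratum (proportional to stratum size).
sizes = {"full-time": 20, "evening": 10}

strat = stratified_sample(students, lambda s: s[1], sizes, rng)
print(len(strat), "students selected")
```

Because a separate draw is made inside every stratum, no stratum can be left out of the sample.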
Stratified Random Sampling
• Stratified Random Sampling is based on the principle
that a sample must be truly representative of the
population as a whole if reliable inferences from the
sample, about the population are to be made.

• Back to the example – I am interested in the average
age of an ECON 1005 student. How should I
proceed?
• Suppose I was interested now in the average height
of an ECON 1005 student. How should I proceed?
Should I proceed the same way, or should I change
something?
28
Activity
Stratified Random Sampling
Give some examples of a stratification factor for
each of the following populations:
• The ECON1005 Class
• The residents of the capital city in your
country
• Licensed Motor Vehicles in your country
• West Indies Cricket Fans across the Caribbean

29
Stratified Random Sampling – Its Advantages
• The stratified random sampling design makes use of knowledge of the
population to increase the precision of the results obtained from the
sample.

• The chance of any individual being drawn is still measurable and, within
each stratum, all possible samples of equal size still have the same chance
of selection. So, although the choice of the stratification factor is almost
entirely dependent on human judgment, the procedure remains random,
since the mode of selection within each stratum is clearly random.

• This method can also significantly reduce the cost of sampling,
because it can achieve accurate results from smaller samples.

• One advantage of stratification is that, besides facilitating the acquisition of
information about the entire population, we can also make inferences
within each stratum or compare strata.

30
Stratified Random Sampling - Its Limitations

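• One important consequence of stratifying the
population is that each element of the
population no longer necessarily has an equal
chance of being drawn: the chance depends on
the size of the element's stratum and on how
many elements are sampled from it.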

31
Best Results from Stratified Random Sampling
• Stratified random sampling produces its best
results when the variation within each
stratum/group is small compared to the variation
between strata/groups.
• When the between-group variation is small, so that
the strata resemble one another, it will provide results
nearly identical to those of simple random sampling.
• Go back to the examples of age of an ECON 1005
student, and height of an ECON 1005 student,
and the choice of the stratification factors.
Describe the within-group and between-group
variations.

32
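
The effect of within-group and between-group variation on precision can be checked with a rough simulation. Everything numerical here is an assumption made for the illustration: two equal-sized strata whose means differ a lot (20 versus 40) while values inside each stratum vary only a little.

```python
import random
import statistics

rng = random.Random(11)

# Hypothetical population: small within-stratum variation,
# large between-stratum variation.
stratum_a = [rng.gauss(20, 1) for _ in range(500)]
stratum_b = [rng.gauss(40, 1) for _ in range(500)]
population = stratum_a + stratum_b

def srs_mean(n):
    """Mean of a simple random sample of size n from the whole population."""
    return statistics.mean(rng.sample(population, n))

def stratified_mean(n_per_stratum):
    """Mean of a stratified sample; strata are equal in size, so pooling is fair."""
    draws = (rng.sample(stratum_a, n_per_stratum)
             + rng.sample(stratum_b, n_per_stratum))
    return statistics.mean(draws)

# Repeat each design 500 times with the same total sample size (20).
srs_spread = statistics.stdev([srs_mean(20) for _ in range(500)])
strat_spread = statistics.stdev([stratified_mean(10) for _ in range(500)])

print(f"spread of simple random estimates: {srs_spread:.3f}")
print(f"spread of stratified estimates:    {strat_spread:.3f}")
```

With strata this different from each other, the stratified estimates cluster much more tightly around the population mean than the simple random ones.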
Stratified vs Simple Random Sampling
• Return to the examples above; in the extreme
case, simple random sampling could have
resulted in an all-male or all-female selection, or an
all FT or EU selection, giving, in either case, a
misleading result.

• Since the point of stratifying the population is to
select some members from each stratum, a
sample consisting entirely of members from one
stratum (which is theoretically possible when
stratification is not employed) cannot be chosen.

33
Cluster or Multistage Sampling

What is it?
• The cluster sampling design involves the
division of the population into groups called
clusters
• A simple random sampling process is then
applied to select a sample of clusters.
• All members of the selected cluster(s)
constitute the cluster sample.

34
Cluster Sampling - Example

Suppose we wish to establish student attitude to the University’s
reform of its Policy on Consultations.

It may be convenient to divide the University student population
into Faculties which will then constitute the clusters.

We use simple random sampling to select a sample of 2 faculties.

Each student in the 2 selected faculties forms part of the cluster
sample and therefore must be surveyed.
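
The faculty example can be sketched in code. The faculty names and student IDs below are invented; only the design (randomly select 2 faculties, then survey every student in them) follows the example.

```python
import random

rng = random.Random(5)

# Hypothetical clusters: faculty name -> student IDs in that faculty.
faculties = {
    "Social Sciences": ["SS1", "SS2", "SS3", "SS4"],
    "Engineering": ["EN1", "EN2", "EN3"],
    "Humanities": ["HU1", "HU2"],
    "Science": ["SC1", "SC2", "SC3"],
    "Medicine": ["MD1", "MD2"],
}

# Step 1: simple random sample of 2 clusters (faculties).
chosen = rng.sample(sorted(faculties), k=2)

# Step 2: every student in a chosen faculty is surveyed.
cluster_sample = [student for f in chosen for student in faculties[f]]

print("faculties chosen:", chosen)
print("students surveyed:", cluster_sample)
```

Note that the size of the final sample is not fixed in advance: it depends on which faculties happen to be chosen.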
35
Cluster Sampling

• Cluster sampling may seem to resemble stratified
random sampling in that both sample designs involve a
grouping of the members of the population. The
similarity stops there!
• When we stratify, every stratum is sampled, whereas
when we cluster, only a random selection of the
clusters is sampled, with the selected cluster(s)
constituting the sample.
• In addition, when we stratify, we use a simple random
sample of each stratum, whereas when we cluster, we
select each member of each of the selected clusters.

36
Cluster Sampling - Its Advantages

• Cluster sampling is particularly useful when it
is difficult or costly to develop a sampling
frame.
• It is also useful when the population elements
are widely dispersed geographically.
• The selection process remains a random one,
since we select among the clusters in much
the same way that we select among individual
population members in random sampling.

37
Cluster Sampling - Its Limitations

• Under both the simple random and stratified
random sampling designs, one member at a
time (of the population or stratum) is
selected. The selection of any one member is
independent of the selection of another.

• Cluster sampling does not share this
characteristic: once a cluster is selected, all of
its members enter the sample together.
38
Cluster Sampling – Best Results
• There are many ways of forming the clusters
themselves, but one general rule always
applies: for maximum precision, clusters
should be formed so that the variation
within each cluster is large relative to the variation
between clusters.
• In other words, the clusters should resemble one
another, while the members within each cluster
are as heterogeneous as possible.
• The reason is that each selected cluster then acts
like a miniature version of the population, so the
sample is more likely to be representative
of the population as a whole.

39
Just so you know…
• In your reading you may also encounter
systematic sampling, probability proportional
to size sampling, panel surveys.

40
Moving on…
• We have looked at the relevant random
sampling techniques. Do you recall why we
started discussing them? Remember the 5
stages?
• Now we consider non-random sampling
techniques.
• Quota sampling
• Convenience sampling
• Judgement sampling

41
Quota Sampling
What is it?
• As its name implies, this method involves the
identification of a quota(s) that the selected
sample is required to fulfill.
• This quota is based on the diversity of the
population, and so represents an attempt to
ensure that the population is truly represented by
the resultant sample.
• Once the quota has been established, the actual
selection of the members of the sample is left up
to the discretion of the enumerators/interviewers
(the people who conduct the survey).

42
Quota Sampling - Example
• We may wish to draw a sample of 500 investors in
Credit Unions in your country but 200 must be men;
175 must be women; and 125 must be persons under
the age of twenty-five.

• We can appoint 3 enumerators/interviewers; one is
assigned the task of sampling 200 male investors in
Credit Unions, another is assigned the task of sampling
175 female investors in Credit Unions, and the third is
assigned the task of sampling 125 investors in the age
group ‘under 25’ in Credit Unions. Each interviewer
selects his/her quota of investors utilising a first come
first served approach.
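
A first come, first served quota fill can be sketched as below. The quotas are those of the example; the stream of arriving investors is simulated, and the rule for placing a person in a group (anyone under 25 counts toward the ‘under 25’ quota, everyone else is classified by sex) is an assumption added so that each person falls in exactly one group.

```python
import random

rng = random.Random(9)

quotas = {"men": 200, "women": 175, "under 25": 125}  # from the example
filled = {group: [] for group in quotas}

def category(person):
    # Assumption: under-25s go to the 'under 25' quota regardless of sex;
    # everyone else is classified by sex.
    sex, age = person
    if age < 25:
        return "under 25"
    return "men" if sex == "M" else "women"

# Hypothetical stream of investors reaching the interviewers, in order of arrival.
person_id = 0
while any(len(filled[g]) < quotas[g] for g in quotas):
    person = (rng.choice("MF"), rng.randint(18, 70))
    group = category(person)
    if len(filled[group]) < quotas[group]:   # first come, first served
        filled[group].append((person_id, *person))
    person_id += 1

print({g: len(members) for g, members in filled.items()})
print("people approached:", person_id)
```

Who ends up in the sample depends entirely on who arrives first, which is why no chance of selection can be attached to any individual investor.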

43
Quota Sampling - Its advantages

• One major advantage of the quota sampling
design is that it is relatively cheap to carry
out.

• It also does not require the existence of a
sampling frame and is useful in situations
where no such frames exist.

44
Quota Sampling - Its Limitations

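• All the sampling designs discussed before this
one share, in some degree or another, a
random feature.
• However, this sample design is non-random: because
the interviewers choose whom to include, the chance
of selecting any given member cannot be measured
and selection bias can enter.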

45
Convenience and Judgment Sampling

• Convenience sampling is just that! Sampling
on a street corner or asking for volunteers.
While it gives information, be wary of making
generalizations!
• Judgment sampling is one where a sample is
chosen by an expert, who deems that sample
to be representative.

46
Errors in Sampling

Two major types of error can arise when a
sample of observations is taken from a
population:

• Sampling Error

• Non-Sampling Error

47
Sampling Error

Sampling Error refers to differences between the
sample and the population that exist only
because of the observations that happened to
be selected for the sample. This type of error is
a consequence of the chance factor involved in
the selection of the elements of the sample
from the population. As a statistical investigator/
researcher, you have little control over this type of
error.
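
Sampling error can be seen directly in a simulation. The population of 10,000 ages below is invented, as are the sample sizes; the helper name `mean_abs_error` is hypothetical. The sketch measures how far sample means fall from the population mean when the sampling procedure itself is carried out correctly.

```python
import random
import statistics

rng = random.Random(21)

# Hypothetical population of 10,000 ages (an assumption for illustration).
population = [rng.randint(17, 60) for _ in range(10_000)]
mu = statistics.mean(population)

def mean_abs_error(n, reps=300):
    """Average |sample mean - population mean| over repeated samples of size n."""
    errors = [abs(statistics.mean(rng.sample(population, n)) - mu)
              for _ in range(reps)]
    return statistics.mean(errors)

err_small = mean_abs_error(10)    # small samples
err_large = mean_abs_error(250)   # larger samples

print(f"population mean: {mu:.2f}")
print(f"typical error with n = 10:  {err_small:.2f}")
print(f"typical error with n = 250: {err_large:.2f}")
```

The error never disappears, since it comes from chance alone, but on average it is smaller for the larger samples.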

48
Non-sampling Error
• Non-sampling Error is due to mistakes made by the
researcher/ interviewers in the acquisition of the data
or due to the improper selection of the sample. This
type of error is within the control of the
researcher/interviewer.

• Examples of non-sampling error are:


• Poor design of the experiment
• Errors in the sampling frame used in the experiment
• Poorly phrased questions
• Poorly administered questions
• Incorrect recording of responses
• Non-response error
• Data Entry errors
• Data Coding Errors
• Selection bias
49
Question:
In attempting to derive a sample that is representative
of the population, a researcher is about to make a
choice between stratified random sampling and
cluster sampling.
Which of the two sampling techniques will require
greater care in the process of subdividing the
population?
(i) stratified random sampling
(ii) cluster sampling
Question
Which of the sampling approaches below is not
random in nature?
(a) Simple random sampling

(b) Systematic sampling

(c) Quota sampling

(d) Stratified random sampling

(e) Cluster sampling


Question

As part of a household survey comprising 325 households
numbered 001 to 325 on a map, the interviewer was
instructed to select a sample of 36 households by utilizing
a skipping interval of every 9 houses and a ‘start’ house at
the 5th house. What kind of sampling technique was
employed?
(a) Simple random sampling
(b) Systematic sampling
(c) Quota sampling
(d) Stratified random sampling
(e) Cluster sampling
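
The arithmetic of the selection rule in this question can be checked with a short sketch; the variable names are arbitrary, and the numbers (325 households, start at house 5, interval of 9) are taken from the question.

```python
# Households are numbered 1 to 325; start at house 5, then take every 9th house.
start, interval, total = 5, 9, 325
selected = list(range(start, total + 1, interval))

print(len(selected))                # → 36
print(selected[:4], selected[-1])   # → [5, 14, 23, 32] 320
```

The last house selected is 5 + 9 × 35 = 320, so the rule yields exactly the 36 households required.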
Question
Sampling that is performed in such a way
that each item of a population has a
measurable chance of being selected is
called
(a) Simple random sampling
(b) Non-random sampling
(c) Quota sampling
(d) Random sampling
(e) Cluster sampling
Question
Sampling that is performed in such a way
that each item of a population has an
equal chance of being selected is called
(a) Simple random sampling
(b) Non-random sampling
(c) Quota sampling
(d) Random sampling
(e) Cluster sampling
Stages of a Statistical Investigation

• We have looked at the first and second
stages.
• The third stage now involves the
Collection of data.
• There are several approaches/options…

60
Methods of Data Collection
• Direct Observation
• Experiments
• Surveys
– Personal Interviews
– Telephone Interviews
– Self Administered Survey
– Mail Survey
– Internet Survey

61
Instruments
• Questionnaire
- Keep it short and simple; avoid open-ended
questions; leave options for “Other”; should be
anonymous
• Observation schedules
• Experiments
- control and treatment groups
• Interviews
- formal or informal; predetermined list of
questions
Research is being conducted to determine the influence of exercise on cholesterol
level.

Questionnaire
1. Name:…………………………………………….
2. Sex: ………………………………………………
3. Do you exercise?…………………………………
4. What type of exercises do you do?……….……………..
5. Do you exercise at least 3 times per week for 30 minutes or
more?……………………………..
6. What is your total cholesterol level?………………………

Comment on the defects of the questionnaire &
suggest how these defects can be corrected.
Questionnaire
1. Name: ……………… (Not required: the questionnaire should be anonymous.)
2. Sex: ……………… (Give options.)
3. Do you exercise? ……………… (Give options.)
4. What type of exercises do you do? ……………… (Give options.)
5. Do you exercise at least 3 times per week for 30 minutes or
more? ……………… (Separate into two questions.)
6. What is your total cholesterol level? ……………… (Many persons do not
know their cholesterol level.)
We have the data: what now?
• Once we have collected our data, we
assemble a team to begin entering the data
into our statistical software.
• We may have to code the data for it to be
relevant to the software.
• From the software, we can now begin to
derive descriptive statistics and display them
in the various forms discussed in the first
lecture.
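
Coding simply means replacing each recorded answer with a number that the software can work with. A minimal sketch, assuming a made-up codebook and three made-up responses to two of the questionnaire items:

```python
from collections import Counter

# Hypothetical codebook: question -> {recorded answer: numeric code}.
codebook = {
    "sex": {"Male": 1, "Female": 2},
    "exercises": {"No": 0, "Yes": 1},
}

# Hypothetical raw responses, as they might be keyed in from paper forms.
responses = [
    {"sex": "Female", "exercises": "Yes"},
    {"sex": "Male", "exercises": "No"},
    {"sex": "Female", "exercises": "No"},
]

# Replace every answer with its code.
coded = [{q: codebook[q][answer] for q, answer in r.items()} for r in responses]

# A first descriptive statistic: how many respondents exercise (code 1) or not (code 0).
exercise_counts = Counter(r["exercises"] for r in coded)

print(coded)
print(exercise_counts)
```

Keeping the codebook alongside the data matters, since the numbers mean nothing without it.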
65
Next Steps
• Each Lecture Outline will be posted onto the website on the
weekend, after the weekly lectures are complete.

• These are Lecture Outlines only, which means you have to use
these broad topics, read the relevant chapter in the Mann and
do a complete review.

• The Mann chapters all contain some excellent review
questions that you should attempt.

• The tutorial sheets will ask some questions on material that
was not explicitly covered in the lecture but is within the
relevant chapter of Mann.

Read your textbook!


66
Thank You!

67
