
Chapter 5

Data Collection and Sampling
NO SOUND FOR THIS CHAPTER
Please read PPT and make notes from
textbook material. It is a “definitions” and “concepts” chapter.
There are no calculations.
5.1 Methods of Collecting Data
• The reliability and accuracy of the data affect the
validity of the results of a statistical analysis.
• The reliability and accuracy of the data depend
on the method of collection.
• Three of the most popular sources of statistical
data are:
– Published data
– Observational studies
– Experimental studies
Published Data
– This is often a preferred source of data due to low
cost and convenience.
– Published data is found as printed material, tapes,
disks, and on the Internet.
For example:
Data published by Statistics Canada
– Data published by the organization that has collected
it is called PRIMARY DATA.

– Data published by an organization different than the
organization that has collected it is called
SECONDARY DATA.
For example:
• The Statistical Abstracts of Canada compiles data
from primary sources.
Observational and experimental studies

• When published data is unavailable, one needs
to conduct a study to generate the data.
– An observational study is one in which measurements
representing a variable of interest are observed and
recorded, without controlling any factor that might
influence their values.
– An experimental study is one in which measurements
representing a variable of interest are observed and
recorded, while controlling factors that might influence
their values.
Surveys

• Surveys solicit information from people.


• Surveys can be made by means of
– personal interview
– telephone interview
– self-administered questionnaire
Surveys

A good questionnaire must be well designed:


• Keep the questionnaire as short as possible.
• Ask short, simple, and clearly worded questions.
• Start with demographic questions to help
respondents get started comfortably.
• Use dichotomous and multiple choice questions.
• Use open-ended questions cautiously.
• Avoid using leading questions.
• Pretest a questionnaire on a small number of people.
• Think about the way you intend to use the
collected data when preparing the questionnaire.
5.2 Sampling
• Motivation for conducting a sampling procedure:
– Costs.
– Population size.
– The possible destructive nature of the sampling process.
• The sampled population and the target population
should be similar to one another.
5.3 Sampling Plans
• We introduce three different sampling plans
– Simple random sampling
– Stratified random sampling
– Cluster sampling
Simple Random Sampling
• In simple random sampling, all samples of the
same size are equally likely to be chosen.
• To conduct random sampling…
– assign a number to each element of the chosen
population (or use already given numbers),
– randomly select the sample numbers (members). Use a
random numbers table, or a software package.
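
The two steps above can be sketched with a software package, here Python's standard library. This is only an illustration: the population of 500 numbered members, the sample size of 10, the seed, and the function name `simple_random_sample` are all assumptions made for the example. `random.sample` draws without replacement, so every subset of the requested size is equally likely.

```python
import random

def simple_random_sample(population_ids, n, seed=None):
    """Draw n distinct members so that every size-n subset is equally likely."""
    rng = random.Random(seed)             # seeded generator makes the draw repeatable
    return rng.sample(population_ids, n)  # sampling without replacement

# Step 1: assign a number to each element (hypothetical population of 500).
population_ids = list(range(1, 501))

# Step 2: randomly select the sample numbers.
sample_ids = simple_random_sample(population_ids, n=10, seed=42)
print(sorted(sample_ids))
```

Fixing the seed is only a convenience for checking the result; in a real study the seed would not be chosen to obtain a particular sample.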
Stratified Random Sampling
• This sampling procedure separates the population
into mutually exclusive sets (strata), and then draws
simple random samples from each stratum.

Age
• under 20
• 20-30
• 31-40
• 41-50
Occupation
• professional
• clerical
• blue-collar
Sex
• Male
• Female
Stratified Random Sampling

• With this procedure we can acquire information about
– the whole population
– each stratum
– the relationships among strata.
Stratified Random Sampling

• There are several ways to build the stratified
sample. For example, keep the proportion of each
stratum in the sample equal to its proportion in the
population.
A sample of size 1,000 is to be drawn.
Stratum  Income            Population proportion  Stratum size in sample
1        under $15,000     25%                    250
2        $15,000-$29,999   40%                    400
3        $30,000-$50,000   30%                    300
4        over $50,000      5%                     50
Total                                             1,000
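
The proportional allocation in the table can be reproduced with a short calculation. The sketch below is illustrative only: the function name `proportional_allocation` and the use of whole-number percentages are assumptions made for this example, not part of the course material.

```python
def proportional_allocation(percentages, sample_size):
    """Split sample_size across strata in proportion to each stratum's share.

    percentages maps a stratum label to its whole-number percentage
    of the population.
    """
    if sum(percentages.values()) != 100:
        raise ValueError("stratum percentages must add up to 100")
    # Integer arithmetic avoids floating-point rounding. It is exact here
    # because every stratum's share of 1,000 is a whole number.
    return {label: sample_size * pct // 100 for label, pct in percentages.items()}

# Income strata and population proportions taken from the table above.
income_strata = {
    "under $15,000": 25,
    "$15,000-$29,999": 40,
    "$30,000-$50,000": 30,
    "over $50,000": 5,
}

allocation = proportional_allocation(income_strata, 1_000)
print(allocation)
```

For sample sizes where the shares are not whole numbers, the floor division would leave a few observations unallocated, and a rule for assigning the remainder would have to be chosen. Once the stratum sizes are known, a simple random sample of that size is drawn within each stratum.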
Cluster Sampling
• Cluster sampling is a simple random sample of groups or
clusters of elements.
• This procedure is useful when
– it is difficult and costly to develop a complete list of the
population members (making it difficult to develop a simple
random sampling procedure).
– the population members are widely dispersed geographically.
• Cluster sampling may increase sampling error, because of
probable similarities among cluster members.
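
A minimal sketch of cluster sampling follows, assuming a one-stage design in which every member of each selected cluster is included (the slides do not say which variant is meant). The population of 20 city blocks with 5 households each, the labels, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """Select k clusters by simple random sampling and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)  # random sample of cluster labels
    members = [m for label in chosen for m in clusters[label]]
    return chosen, members

# Hypothetical population: 20 city blocks, each with 5 households.
clusters = {
    f"block-{b:02d}": [f"block-{b:02d}/house-{h}" for h in range(1, 6)]
    for b in range(1, 21)
}

chosen_blocks, households = cluster_sample(clusters, k=4, seed=7)
print(chosen_blocks)
print(len(households), "households in the sample")
```

Only a list of blocks is needed to draw the sample, not a list of every household, which is why the method suits populations that are hard to enumerate.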
5.4 Sampling and Non-sampling errors

• Two major types of errors can arise when a
sampling procedure is performed.
• Sampling Error
– Sampling error refers to differences between the
sample and the population, because of the specific
observations that happen to be selected.
– Sampling error is expected to occur when making a
statement about the population based on the sample
taken.
Sampling Errors

[Figure: population income distribution, marking μ (population mean) and x̄ (sample mean); the distance between them is the sampling error.]
The sample mean falls where it does only because
certain randomly selected observations
were included in the sample.
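
Sampling error can be illustrated with a small simulation. This is a sketch under stated assumptions: the population of 10,000 incomes is made up (drawn from a lognormal distribution purely to give a right-skewed shape), and the sample size of 100 and the name `sampling_error` are chosen only for the example.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 incomes (made-up, right-skewed values).
population = [rng.lognormvariate(10.5, 0.6) for _ in range(10_000)]
mu = statistics.mean(population)  # population mean

def sampling_error(n):
    """Sample mean of a simple random sample of size n, minus the population mean."""
    sample = rng.sample(population, n)
    return statistics.mean(sample) - mu

# Five samples of the same size give five different errors, purely because
# different observations happened to be selected each time.
errors = [sampling_error(100) for _ in range(5)]
for e in errors:
    print(f"sample mean - population mean = {e:,.0f}")
```

No mistake is made anywhere in this process; the differences arise only from which observations were drawn.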
Non-sampling Errors

• Non-sampling errors occur due to mistakes
made in the process of data acquisition.
• Increasing the sample size will not reduce this type
of error.
• There are three types of non-sampling errors:
– Errors in data acquisition,
– Non-response errors,
– Selection bias.
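
The claim that a larger sample does not reduce non-sampling error can be sketched for one of the three types, selection bias. Everything in this example is assumed for illustration: the made-up population of 10,000 incomes, the mechanism by which the lowest 30% of incomes can never be selected, and the name `error_of_sample_mean`.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of 10,000 incomes (made-up values).
population = [rng.lognormvariate(10.5, 0.6) for _ in range(10_000)]
mu = statistics.mean(population)

# Assumed selection bias: the lowest 30% of incomes can never be selected.
frame = sorted(population)[3_000:]

def error_of_sample_mean(source, n):
    """Mean of n members drawn at random from source, minus the population mean."""
    return statistics.mean(rng.sample(source, n)) - mu

for n in (400, 1_600, 6_400):
    fair = error_of_sample_mean(population, n)
    biased = error_of_sample_mean(frame, n)
    print(f"n={n:>5}  whole population: {fair:>9.0f}  biased selection: {biased:>9.0f}")
```

As n grows, the error from sampling the whole population tends toward zero, while the error under biased selection settles near the gap between the mean of the selectable members and the population mean.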
Sample Question 1
Which method of data collection is involved when
a researcher counts and records the number of students
wearing backpacks on campus in a given day?
a. An experiment.
b. A survey.
c. Direct observation.
d. None of these choices.
Sample Questions
Remember these are just sample questions
and you should be doing as many questions
from the textbook as possible.
Sample Question 2
Which of the following statements is true regarding
the design of a good survey?
a. The questions should be kept as short as possible.
b. A mixture of dichotomous, multiple-choice, and open-ended
questions may be used.
c. Leading questions must be avoided.
d. All of these choices are true.
Sample Question 3
Which of the following must be avoided
in designing a survey?
a. Dichotomous questions.
b. Leading questions.
c. Demographic questions.
d. All of these choices are true.
Sample Question 4
Which of the following data collection
methods is not observational?
a. A personal interview.
b. A telephone interview.
c. A self-administered questionnaire.
d. An experiment.
Solution to Sample Questions
• 1: C
• 2: D
• 3: B
• 4: D
