
Sampling

Dr. Mary Wolfinbarger


Marketing Research
Sample vs. Census
■ Census -- every population member
included
■ With sampling, researcher infers population
characteristics from a sample
Why sample?
■ Saves money
■ Saves time
■ A sample can be more accurate; it has fewer
“nonsampling” errors than a census
Sampling terms
■ Population (or Universe): a complete listing of a set
of elements having a given characteristic(s) of
interest
An example of population definition:
■ Americans
■ Registered Voters
■ Voters
■ Swing voters

(Which is more relevant to politicians?)


Sampling terms
Element: Unit about which information is
sought
Most common units in marketing:
■ individuals
■ households
■ Sudman and Blair suggest a conceptual
sample: sales dollars or potential sales
dollars
Sampling terms
Sample Frame: A list of population members
■ May get a complete listing of population,
but often population and sample frame are
different
■ Example: “American recreational tennis
players?”
■ Differences between the sample frame and
population: “sample frame error”
Sampling terms
■ Parameter: The actual characteristic of the
population, the true value of which can only
be known by taking an error-free census
■ Statistic: The estimate of a characteristic
obtained from the sample
Sampling terms
Non-response error: error created when
chosen sample members do not
participate
Non-response creates two problems:
■ Need a larger initial sample size to allow
for non-response
■ More seriously, non-respondents may differ
from respondents (“questionnaire freaks?”)
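The first problem is plain arithmetic: divide the number of completed surveys you need by the response rate you expect. A minimal sketch; the function name and the figures (400 completes, 25% response) are hypothetical and not from the lecture.

```python
import math

def initial_sample_size(completes_needed, expected_response_rate):
    """How many people to contact so that, at the expected response rate,
    about `completes_needed` surveys come back."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("response rate must be in (0, 1]")
    # Round up: contacting a fraction of a person is not possible.
    return math.ceil(completes_needed / expected_response_rate)

# Hypothetical figures: 400 completed surveys wanted, 25% expected to respond.
contacts = initial_sample_size(400, 0.25)
print(contacts)
```

Note that a larger initial sample only fixes the count. It does nothing about the second problem, that non-respondents may differ from respondents.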
Sample Types
Two broad categories:
■ Probability: each population element has a
known, non-zero chance of being included
in the sample
■ Non-probability: cannot mathematically
estimate the probability of a population
element being included in the sample
Sample Types
■ Statistician’s opinion: all N-P samples are
worthless because you cannot estimate the
degree to which your results are
generalizable
■ So, why are N-P samples ever used?
Non-probability Samples
■ Convenience
■ Judgment
■ Quota
Convenience Samples
■ “Accidental samples” -- those in the sample
happen to be where the data is being collected
■ One major form in marketing: “Mall
Intercept”
■ What do statisticians think? “Rarely do
samples selected on a convenience sample
basis, regardless of size, prove representative,
and are not recommended for descriptive or
causal research.”
Convenience Samples
■ I agree, but….
Minimizing drawbacks of convenience samples:
■ compare sample characteristics and findings to
those collected on a census/random sample basis
■ speculate intelligently about bias, and how it is
likely to have affected results
Convenience Samples
■ When possible, collect the sample where your
population is likely to be (retailers collecting
in-store surveys)
■ Cultivate diversity in the sample (e.g. mall
intercept using multiple locations)
■ May be better at understanding relationships
between variables than at making descriptive
estimates
Judgment Samples
■ Also called purposive sampling
■ Sample elements are hand picked because it
is felt that they are representative of some
population of interest
■ Typically a small sample (maybe as small as
10) in which the researcher tries to represent
all groups or segments from the population
Judgment Samples
■ Snowball design: a special form of
judgment sample
■ Appropriate for small specialized
populations
■ Each respondent is asked to identify one or
more other population members
Judgment Samples
■ Drawbacks?
■ Those with more ties to sample members
are selected
■ Similar people are more likely to be named
Quota Sampling
■ Attempt to be representative by selecting
sample elements in proportion to their
known incidence in the population
Quota Sampling
Example: Surveying undergraduate students
about campus food services
■ Step 1: Identify attributes the researcher
believes are important, e.g. sex and class
level
■ Step 2: Look at incidence of sex and class
level in population
Quota Sampling
Class Level          Sex
Freshmen    3200     Males    4500
Sophomores  2600     Females  5500
Juniors     2200
Seniors     2000

If I sample 100, how many of each type do I select?
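One way to work out the quotas is to give each group the same share of the sample that it has in the population. A short sketch using the counts from the table; the function name `quota_counts` is made up for illustration.

```python
# Population counts from the slide (each attribute totals 10,000 students).
class_level = {"Freshmen": 3200, "Sophomores": 2600, "Juniors": 2200, "Seniors": 2000}
sex = {"Males": 4500, "Females": 5500}

def quota_counts(population_counts, sample_size):
    """Allocate the sample in proportion to each group's population share.
    With other figures, rounding could make the quotas miss the total by one or two."""
    total = sum(population_counts.values())
    return {group: round(sample_size * count / total)
            for group, count in population_counts.items()}

class_quota = quota_counts(class_level, 100)
sex_quota = quota_counts(sex, 100)
print(class_quota)  # {'Freshmen': 32, 'Sophomores': 26, 'Juniors': 22, 'Seniors': 20}
print(sex_quota)    # {'Males': 45, 'Females': 55}
```

These are separate quotas for each attribute. The table does not say how sex breaks down within each class level, so combined cells (e.g. freshman males) cannot be computed from it.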


Quota Sampling

■ Don’t be fooled -- relies on personal,
subjective selection of quota attributes
■ The sample can still be non-representative
with respect to some other characteristic
(e.g. in this example, perhaps race)
■ I plead guilty -- I have sinned -- and will do
so again -- …….so shoot me………….
Probability Sampling
■ Does not guarantee representativeness, but
does allow for the assessment of sampling
error
■ Sampling error: error that occurs because a
sample rather than a census is used
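Sampling error can be seen in a small simulation: draw many random samples from the same population and watch the sample mean move around the true value. This is a sketch on a made-up population (10,000 ratings on a 1-7 scale); none of the numbers come from the lecture.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 ratings on a 1-7 scale.
population = [rng.randint(1, 7) for _ in range(10_000)]
parameter = statistics.mean(population)  # true value, known only from a census

# Draw 200 simple random samples of 100 and record each sample mean (a statistic).
sample_means = [statistics.mean(rng.sample(population, 100)) for _ in range(200)]

# How much the sample means vary from sample to sample.
spread = statistics.stdev(sample_means)
print(parameter, statistics.mean(sample_means), spread)
```

Each sample mean differs a little from the population mean even though nothing went wrong in data collection. With a probability sample, the size of that variation can be estimated.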
Simple Random Sampling (SRS)
■ Each sample element has a known, non-zero,
equal chance of being selected
■ Example: Lottery numbers
■ Or, put everyone’s name in a hat
■ Major polling firms use random digit dialing to
approximate random samples
■ Or, use a random numbers table (actually
pseudo-random I’m told)
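The "names in a hat" idea can be sketched with Python's standard `random` module, which is also pseudo-random. The sample frame here is a hypothetical list of 10,000 member labels.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)   # pseudo-random generator, reproducible with a seed
    return rng.sample(frame, n)  # sampling without replacement

# Hypothetical sample frame of 10,000 population members.
frame = [f"member_{i}" for i in range(1, 10_001)]
sample = simple_random_sample(frame, 1000, seed=42)
print(sample[:5])
```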
Systematic Sampling
■ Systematically spreads sample through a list
of population members
■ Example: If a population contains 10,000
people and you need a sample of 1,000, select
every 10th name on the list
■ In nearly all practical examples, the
procedure results in a sample equivalent to
SRS
Systematic Sampling
■ Only exception: when there are
“regularities” in the list
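A minimal sketch of the every-k-th-name procedure. The random starting point is an assumption added here (a common convention, not stated on the slide), and the frame is a hypothetical list of 10,000 positions.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element of the list, where k = len(frame) // n."""
    k = len(frame) // n
    if k < 1:
        raise ValueError("sample size exceeds the length of the list")
    # Assumption: start at a random position within the first interval.
    start = random.Random(seed).randrange(k)
    return frame[start::k][:n]

frame = list(range(10_000))  # hypothetical list of 10,000 people
sample = systematic_sample(frame, 1000, seed=7)
print(sample[:3])
```

If the list has a regular cycle that lines up with k, every selected element falls at the same point in the cycle. That is the case where the result stops being equivalent to SRS.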
Systematic Sampling
Another application of systematic sampling:
■ select a number of millimeters or inches
down a page or column that will be selected
(it’s easier than counting!)
Stratified Sampling
Information about subgroups in the sample
frame is used to improve the efficiency of
the sample plan
Stratified Sampling
Three major reasons to use:
■ Some subgroups are more homogeneous than
others, so fewer observations are needed for those
groups to obtain the same level of precision
■ Group comparison is the purpose of the study
(disproportionate stratified sampling)
■ Some elements are more important in determining
outcome of research interest than are others
How is this different from quota
sampling?
■ Within strata, selection of sample elements
is random, not first available
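The difference from quota sampling shows up in code as a random draw inside each stratum. A sketch under stated assumptions: it reuses the class-level counts from the quota example as strata with a proportionate allocation, and the student labels are hypothetical.

```python
import random

def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of allocation[name] elements within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}

# Strata sizes taken from the quota example; member labels are made up.
sizes = {"Freshmen": 3200, "Sophomores": 2600, "Juniors": 2200, "Seniors": 2000}
strata = {name: [f"{name}_{i}" for i in range(size)] for name, size in sizes.items()}

# Proportionate allocation for a total sample of 100.
allocation = {"Freshmen": 32, "Sophomores": 26, "Juniors": 22, "Seniors": 20}
sample = stratified_sample(strata, allocation, seed=3)
print({name: len(chosen) for name, chosen in sample.items()})
```

For group comparisons, the allocation could be disproportionate (e.g. 25 per class level). Only the `allocation` dictionary changes.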
Bad Uses of Stratification
■ To satisfy people who doubt that random
sampling will be representative
■ To correct for MAJOR problems with
survey cooperation
Poststratification is OK
■ Is done after sampling
■ Corrects for MINOR differences between
sample and population produced by non-
cooperation
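A common way to poststratify is to weight each respondent by the ratio of the group's population share to its sample share. This is a sketch of that approach, not necessarily the procedure the lecture has in mind. The population shares reuse the 45%/55% split from the quota example; the achieved sample of 40 males and 60 females is hypothetical.

```python
def poststratification_weights(population_shares, sample_counts):
    """Weight per respondent = population share / sample share of the group."""
    n = sum(sample_counts.values())
    return {group: population_shares[group] / (sample_counts[group] / n)
            for group in sample_counts}

population_shares = {"Males": 0.45, "Females": 0.55}
sample_counts = {"Males": 40, "Females": 60}  # hypothetical achieved sample

weights = poststratification_weights(population_shares, sample_counts)
print(weights)  # males weighted up slightly, females weighted down slightly
```

When the differences are minor, the weights stay close to 1. Weights far from 1 signal the MAJOR cooperation problems that weighting should not be used to paper over.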
Area (or Cluster) Sampling
■ Elements are geographically grouped into
relatively homogenous clusters (e.g. a city is
divided into 40 areas)
■ From these areas, 10 are randomly selected
■ From these larger areas, blocks within areas
will be randomly selected
■ Within each block, attempt to survey each
household
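The stages above can be sketched as nested random draws. The city layout (40 areas, 8 blocks per area, 5 households per block) and the choice of 2 blocks per area are hypothetical; only the 40 areas and the 10 selected areas come from the slide.

```python
import random

def area_sample(areas, n_areas, blocks_per_area, seed=None):
    """areas maps area -> {block -> [households]}.
    Randomly pick areas, then blocks within them, then take every household."""
    rng = random.Random(seed)
    chosen_areas = rng.sample(sorted(areas), n_areas)
    households = []
    for area in chosen_areas:
        blocks = areas[area]
        n_blocks = min(blocks_per_area, len(blocks))
        for block in rng.sample(sorted(blocks), n_blocks):
            households.extend(blocks[block])  # attempt to survey each household
    return chosen_areas, households

# Hypothetical city: 40 areas, 8 blocks per area, 5 households per block.
city = {f"area_{a}": {f"block_{a}_{b}": [f"hh_{a}_{b}_{h}" for h in range(5)]
                      for b in range(8)}
        for a in range(40)}

chosen, households = area_sample(city, n_areas=10, blocks_per_area=2, seed=1)
print(len(chosen), len(households))
```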
Area (or Cluster) Sampling
■ Especially useful for door-to-door personal surveys
(significantly reduces costs)
■ However, clustering increases sampling errors
(people who live close together tend to be more
similar)
■ Statistical formulas suggest that, in marketing
research, 20-25 clusters with 20-25 observations
per site are appropriate
Determining Sample Size
Ad Hoc Methods (non-statistical)
■ Rules of thumb: Collect sample size large
enough so that when divided into groups,
each group will have a minimum sample of
100 or so (Sudman)
■ Budget constraints: calculate the cost of
interview and data analysis per respondent.
Divide total budget by this amount to get
maximum sample size.
Ad Hoc Methods (non-statistical)
■ Comparable studies: Find similar studies
that were successful and obtained
sufficiently reliable results
Most general formula
■ Total sampling error =
desired confidence level (Z) * standard deviation
of sample (SD) / square root of sample size (√N)
■ Sampling error: the standard deviation of the
distribution of sample means
■ Sampling error is expressed in absolute units
(the units of the measurement scale), not as a
percentage: it is the amount by which your
measurement may differ from the true value
Re-arranging Algebraically
N = Z^2 σ^2 / (sampling error)^2

Where N = sample size
Z = z score from normal curve table (1.96 for a
confidence interval of 95%)
σ = standard deviation (obtained from previous
survey or estimated, e.g. 95% of responses fall
between 3 and 5, so 1 SD = .5)
Example:
■ If allowable sampling error = .20 (on a 7 point
scale), SD = 1.34, and a 95% confidence level
(significance level of .05) is being used,
■ N = 1.96^2 * 1.34^2 / .20^2
■ N ≈ 172
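The arithmetic in this example can be checked in a few lines. The function name `required_sample_size` is made up for illustration; the inputs are the ones given in the example.

```python
import math

def required_sample_size(z, sd, error):
    """N = Z^2 * SD^2 / (sampling error)^2"""
    return (z * sd / error) ** 2

n_exact = required_sample_size(z=1.96, sd=1.34, error=0.20)
print(round(n_exact, 2))   # about 172.45
print(round(n_exact))      # 172, the figure on the slide
print(math.ceil(n_exact))  # 173, if you round up to be sure of meeting the target
```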
What this formula suggests
■ If the sample is more varied, a larger sample is
required
■ If more precision is required, a larger sample is
necessary
■ If a higher confidence level is desired, a larger
sample is necessary
■ The increase required to achieve ever more
precision and confidence increases at an
increasing rate!
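Because the allowable error is squared in the denominator, halving it quadruples the sample size. A short sketch using the inputs from the example; the z score of 2.576 for 99% confidence is a standard table value that does not appear in the slides.

```python
def required_sample_size(z, sd, error):
    """N = Z^2 * SD^2 / (sampling error)^2"""
    return (z * sd / error) ** 2

base = required_sample_size(1.96, 1.34, 0.20)          # 95% confidence, error .20
half_error = required_sample_size(1.96, 1.34, 0.10)    # same confidence, error .10
higher_conf = required_sample_size(2.576, 1.34, 0.20)  # 99% confidence, error .20

print(half_error / base)   # halving the error multiplies N by about 4
print(higher_conf / base)  # moving from 95% to 99% multiplies N by about 1.7
```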
