
Lecture 2

Sampling Techniques

For use in fall semester 2015


Lecture notes were originally designed by Nigel Halpern. This
lecture set may be modified during the semester.
Last modified: 4-8-2015

SCM300 Survey Design


Lecture Aim & Objectives

Aim
• To investigate issues relating to sampling techniques
for survey research
Objectives
• What is a sample?
• How should the sample be obtained?
– Sampling considerations
– Sampling techniques
– Sources of error & degrees of confidence
• How large should the sample be?



What is Sampling?

• Method for selecting people or things from which you
plan to obtain data
• Closely associated with quantitative methods
– e.g. surveys or experiments
• Sometimes associated with qualitative methods
– e.g. content analysis & ethnography
• Used because it’s rarely feasible or effective to
include every person or item in a survey or study



Not Feasible or Effective…..

• Travel patterns of UK adults


• Need to survey 50mn+ people!
– The UK government conducts a
Census of Population every 10
years but this costs tens of
millions of pounds
• Even a survey of annual cruise
passengers visiting Molde would
be costly & time consuming
• Sampling provides a feasible &
effective solution



What is a Sample?

“A sample is a portion or sub-set of a larger group called
a population” (Fink, 2003; p33)

Note: sampling isn’t necessary when you survey the entire
population!



What is a Population?

• It can consist of human & non-human phenomena


– Organisations, businesses, geographical areas, households,
individuals
• Examples:
– Hotels in Møre og Romsdal (population of hotels)
– Beaches in Australia (population of beaches)
– People in Norway (population of Norway)
– Households in Molde (population of households)
– Visitors to a resort (population of visitors)
– Users of a ferry service (population of users)
– Students at HiMolde (population of students)



Aims of Sampling

• Provide a small & more manageable portion or sub-set
of the population
• Represent the population & be free from bias
– Results should be similar if the survey were conducted
on another sample from the same population
– i.e. results are repeatable & reliable



The Need for Reliable Representation



Extracting a Sample

Two main sources


• From a sampling frame
– A list of all known cases in a population from which a sample
can be drawn
• Sampled at source
– Points in time/space where a potential population is
available



Typical Sampling Frames

• Electoral register – individuals over 18


• Telephone directories – households
• Royal Mail – households
• Market research companies – households / postcodes / census areas
• Businesses – customers
• Organisations / clubs / trade associations – members
• Magazines / newsletters – subscribers
• Local authorities / CCI – households / employers
• Business / trade directories – businesses
• Yellow pages – clubs / organisations / businesses
• Tourism offices – reservations / visitors
• Hotels/accommodation – registration records / reservations



Sampling Frames

• Only available where there is a finite population


– i.e. where the population can be clearly defined
• Potential problems
– List not up to date / only updated periodically
• Lags in registration & deregistration
– Clusters of individuals create complexities
• e.g. making sure you survey the correct individual in a
sampling frame of households
– Some cost money to access or are confidential



Sampling at Source

• The population is not clearly defined when
sampling at source
– e.g. shopping streets, visitor attractions, transport terminals,
museums, sporting events, etc.
• Problems
– The population is fairly vague (‘hanging around’)
– Individuals present are not listed in any form which would
constitute a sampling frame
– Sampling is more challenging



Sampling Considerations

Two key Q’s to address in any sample survey


1. How should the sample be obtained?
a. Who or what should be sampled (eligibility criteria)?
b. Who do you survey (profiles & individuals in clusters)?
c. When should sampling take place (timing & timescale)?
d. Where should the survey be administered (location)?
e. What sampling technique do you use (probability
versus non-probability)?
2. How large should the sample be?



How Should the Sample be Obtained?

a. Who or what should be sampled?


– This means defining the eligibility criteria
b. Who do you survey?
– Households, visitor attractions, shopping streets, etc will
normally have people in clusters as opposed to individuals
– Ensure that the survey is completed by the correct
individual



How Should the Sample be Obtained?

c. When should the sampling take place?


– Time of year, month, day, time
– Duration of the sampling process
– Useful to
• Have some prior knowledge of the phenomena to be sampled
as results may be biased by particular times of day or year or
weekly, monthly & seasonal variations
• Spread the sampling over different times, days, months, etc to
reduce potential for bias



How Should the Sample be Obtained?

d. Where should the survey be administered?


– This could be determined by the definition of the population
• e.g. surveys sent to postal addresses
– On-site surveys should consider location of interviewers
• e.g. recreation areas or tourist attractions tend to have natural
or pre-defined entry & exit points
– If using multiple interviewers, strict instructions must be given
on where to stand



How Should the Sample be Obtained?

e. What sampling technique should be used?

Two main options

Probability Techniques
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
5. Multi-stage sampling

Non-Probability Techniques
1. Haphazard sampling
2. Purposive sampling
a. Judgement sampling
b. Quota sampling
c. Snowball sampling
d. Expert choice sampling



Sampling Techniques

• Choice of technique is dependent on 2 Q’s


– Is the population known/clearly defined?
– Can the population be listed as a sampling frame?

• Yes to either Q
– Allows for Probability Techniques (used with sampling frames)
• No or uncertainty
– Sampling is complex & based on Non-Probability Techniques
(used when sampling at source)



Probability Sampling Techniques

1. Simple random sampling


• Each unit has an equal chance of selection
– e.g. lottery draw, names pulled from a list
– Probability of selection is:
• (sample size/total population)*100
• e.g. (100/1,000)*100 = 10% (a 1 in 10 chance)
• Should really use a table of random numbers
– e.g. see http://stattrek.com/Tables/Random.aspx
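
The selection chance and the draw itself can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 1,000 numbered units and the fixed seed are assumptions made for the example, and Python's pseudo-random generator stands in for a printed table of random numbers.

```python
import random

# Hypothetical sampling frame: 1,000 numbered units (invented for this sketch).
population = list(range(1, 1001))
sample_size = 100

# Chance of any one unit being selected, as on the slide:
# (sample size / total population) * 100
selection_chance = sample_size * 100 / len(population)

# random.sample draws without replacement, so no unit is picked twice.
rng = random.Random(2015)  # fixed seed only so the draw can be repeated
sample = rng.sample(population, sample_size)

print(f"Chance of selection: {selection_chance:.0f}%")  # Chance of selection: 10%
print(len(sample), "units drawn, all different:", len(set(sample)) == sample_size)
```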



Table of Random Numbers
Create a sample of 10 from a population of Norway’s top 30 football clubs

01. Ham-Kam           11. Start        21. Åsane
02. Bodø Glimt        12. Sogndal      22. Hødd
03. Hereford United   13. Vålerenga    23. Lørenskog
04. Brann             14. Viking       24. Strømsgodset
05. Bryne             15. Aalesund     25. Fredrikstad
06. Lillestrøm        16. Haugesund    26. Mjøndalen
07. Lyn               17. Rosenborg    27. Ranheim
08. Molde             18. Hønefoss     28. Tromsdalen
09. Odd Grenland      19. Tromsø       29. Moss
10. Stabæk            20. Sandefjord   30. Træff

1 7 2 5 8 9 4 0 4 6 3 8 7 0 3 3 2 1 2 7 4 3 7 9
7 1 3 5 5 3 2 2 8 1 5 3 7 9 9 6 6 0 1 7 3 5 4 9
3 1 4 9 2 4 0 9 3 5 4 2 1 9 2 1 9 3 3 6 2 5 2 7
0 3 7 8 3 1 0 6 9 1 4 6 4 2 0 4 7 6 5 3 8 6 4 2
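
One way to use the digits above is sketched here. The reading rule is an assumption, since the slide does not state one: read the digits in pairs from left to right, row by row, keep any pair between 01 and 30, and skip pairs that are out of range or already chosen. The function name `draw_from_table` is made up for this example.

```python
# Random digits copied from the table above, one string per row.
rows = [
    "1 7 2 5 8 9 4 0 4 6 3 8 7 0 3 3 2 1 2 7 4 3 7 9",
    "7 1 3 5 5 3 2 2 8 1 5 3 7 9 9 6 6 0 1 7 3 5 4 9",
    "3 1 4 9 2 4 0 9 3 5 4 2 1 9 2 1 9 3 3 6 2 5 2 7",
    "0 3 7 8 3 1 0 6 9 1 4 6 4 2 0 4 7 6 5 3 8 6 4 2",
]
digits = "".join(rows).replace(" ", "")


def draw_from_table(digits, population_size, sample_size):
    """Read two-digit numbers and keep new ones from 1 to population_size."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


picks = draw_from_table(digits, population_size=30, sample_size=10)
print(picks)  # [17, 25, 21, 27, 22, 24, 9, 19, 3, 6]
```

Each number is then matched to the club with that number in the list. Reading in pairs only works for populations of up to 99 units; a larger frame would need three or more digits per number.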
Your turn…..
Create a sample of 10 from a population of England’s top 30 football clubs

01. Chelsea 11. Bolton 21. West Ham


02. Wigan Athletic 12. Hereford United 22. Millwall
03. Aston Villa 13. Cheltenham 23. Tottenham
04. Manchester City 14. Liverpool 24. Birmingham
05. Reading 15. Fulham 25. Brighton
06. Carlisle 16. Sunderland 26. Blackburn
07. Luton Town 17. Middlesbrough 27. Nottingham Forest
08. Portsmouth 18. Arsenal 28. Newcastle
09. Leicester City 19. Swindon Town 29. Crewe
10. Derby County 20. Everton 30. Manchester United

7 2 5 8 9 4 0 4 6 3 8 7 0 3 3 2 1 2 7 4 3 7 9 2
2 3 5 5 3 2 2 8 1 5 3 7 9 9 6 6 0 1 7 3 5 4 9 7
6 4 9 2 4 0 9 3 5 4 2 1 9 2 1 9 3 3 6 2 5 2 7 3
3 7 8 3 1 0 6 9 1 4 6 4 2 0 4 7 6 5 3 8 6 4 2 2
Simple Random Sampling

• Quick, cheap n’ easy…


• Each unit has an equal chance of selection…
• Need to list units of the population
– Difficult to do with a large sampling frame…



Probability Sampling Techniques

2. Systematic random sampling


• Pull one unit from a list at regular intervals
– e.g. every nth name from a membership list
• Commonly used by production companies to survey
product quality



Procedure for Systematic Random
Sampling



Example (using a small sampling
frame) of 30 students

• Sample 10 from a population of 30


• 30/10=3, select a number between 1 & 3 to start from (e.g. 2), then
select every 3rd number

1. Andy Anderson 11. Jai Jones 21. Sarah Smith


2. Anita Ashley 12. Keith Kent 22. Simon South
3. Ben Ball 13. Lorna Law 23. Tony Tapp
4. Carol Crow 14. Larry Love 24. Tom Trade
5. David Dent 15. Mike Matthews 25. Ursula Unger
6. Eddie East 16. Nigel North 26. Veronica Vallis
7. Flora Field 17. Oscar Oliver 27. Vic Vaxley
8. Gaynor Green 18. Paul Plumber 28. Wayne West
9. Harold Harvey 19. Peter Parson 29. Yen Yeah
10. Ineka Ince 20. Richard Reed 30. Zac Zachid
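
The example above can be sketched in Python as follows. The function name `systematic_sample` is made up for this illustration, and the frame is represented simply by the list positions 1 to 30 rather than the student names.

```python
def systematic_sample(frame, sample_size, start):
    """Pick every k-th unit from the frame, where k = len(frame) // sample_size.

    start is the 1-based position of the first unit and must lie
    between 1 and k, as in the slide's example.
    """
    interval = len(frame) // sample_size
    if not 1 <= start <= interval:
        raise ValueError(f"start must be between 1 and {interval}")
    return frame[start - 1::interval][:sample_size]


positions = list(range(1, 31))  # positions 1-30 in the student list
print(systematic_sample(positions, sample_size=10, start=2))
# [2, 5, 8, 11, 14, 17, 20, 23, 26, 29]
```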



Your turn…..
Sample 6 from the list of 30, starting at 3

1. Rafael Nadal 11. Steffen Iversen 21. Marco Van Basten


2. Kurt Asle Arvesen 12. Alex Zülle 22. John Arne Riise
3. Thierry Henry 13. Niki Lauda 23. John Tavares
4. Steffi Graf 14. Steffen Kjærgaard 24. Fernando Torres
5. John Carew 15. Michael Schumacher 25. Boris Becker
6. Bjørn Dæhlie 16. Guus Hiddink  26. Bernard Hinault
7. Hermann Maier 17. Jacques Villeneuve 27. Emanuel Pogatetz
8. Roger Federer 18. Katarina Witt 28. Martina Hingis
9. Andy Murray 19. David Beckham 29. Arantxa S-Vicario
10. Thor Hushovd 20. Renate Götschl 30. Lewis Hamilton



Probability Sampling Techniques

3. Stratified random sampling


• Simple/systematic could miss particular groups when
using a small population
– e.g. mature students
• Prior knowledge may suggest that inclusion of a
group(s) is necessary
– e.g. mature students perform better than others
• Stratified random sampling samples according to
groups (strata)



Procedure for Stratified Random Sampling



Example
Survey a Sample of 400 Households in a County

[Figure: pie chart "Households in the county", split across Districts 1-4 in shares of 40%, 25%, 25% and 10%; 100 households are sampled from each district]

Randomly select an equal amount from each of the 4 districts in the county
(e.g. 100 from each for a sample of 400)
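
A minimal sketch of this equal-allocation approach is given below. The district sizes and household IDs are invented for the illustration (the slide gives only percentage shares), and the seed is fixed just so the draw can be repeated.

```python
import random

rng = random.Random(1)  # fixed seed so the example is repeatable

# Hypothetical numbers of households per district (assumed, not from the slide).
district_sizes = {
    "District 1": 1600,
    "District 2": 400,
    "District 3": 1000,
    "District 4": 1000,
}

# One sampling frame (list of household IDs) per stratum.
frames = {
    district: [f"{district} / household {i}" for i in range(1, size + 1)]
    for district, size in district_sizes.items()
}

# Stratified random sampling: a separate simple random sample in each stratum.
per_stratum = 100
sample = {
    district: rng.sample(households, per_stratum)
    for district, households in frames.items()
}

total = sum(len(units) for units in sample.values())
print(total)  # 400
```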



Problem Associated with Multiple Variables

• The sample is representative of a single variable but not
of others
– e.g. representative of the 4 districts in the county but not
necessarily of age of residents
• Where multiple variables are required, the benefits of
stratified random sampling diminish in favour of
simple/systematic random sampling
• This problem is less likely when creating a large sample



Problem Associated with Time & Cost

• Stratified divides into groups, then selects units using
random sampling
• Random sampling may produce a sample that is
geographically dispersed
– Especially problematic for face-to-face surveys
• e.g. the 100 units selected for the household survey in districts 1-4
may come from different parts of each district and interviewers
may need to travel vast distances between each unit to conduct
their surveys
• Clustering can overcome this problem



Probability Sampling Techniques

4. Cluster sampling
• Draw from mutually exclusive sub-groups
– e.g. the 100 units selected for the household survey in districts 1-4
will be selected in clusters instead of randomly



Example: Stratified versus Cluster

[Figure: two copies of the pie chart "Households in the county", split across Districts 1-4 in shares of 40%, 25%, 25% and 10%]

• Stratified takes an equal amount from each
(e.g. 100 from each for a sample of 400)
• Cluster takes a proportionate amount from each & in clusters
(e.g. 16 clusters of 10 from district 1, 4 clusters of 10 from
district 2, 10 clusters of 10 from each of districts 3 & 4, for a
sample of 400)
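
The proportionate cluster allocation can be checked with a short sketch. The district shares used here (40%, 10%, 25%, 25%) are inferred from the cluster counts quoted in the example rather than read from the chart, so treat them as an assumption.

```python
# Share of the county's households in each district, in percent.
# Assumed mapping, inferred from the cluster counts in the example.
district_share_pct = {
    "District 1": 40,
    "District 2": 10,
    "District 3": 25,
    "District 4": 25,
}

sample_size = 400   # households to be surveyed in total
cluster_size = 10   # households per cluster

# Proportionate allocation: each district's share of the sample,
# expressed as a number of clusters.
clusters = {
    district: sample_size * pct // 100 // cluster_size
    for district, pct in district_share_pct.items()
}

print(clusters)
# {'District 1': 16, 'District 2': 4, 'District 3': 10, 'District 4': 10}
print(sum(clusters.values()) * cluster_size)  # 400
```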



The Problem with Cluster Sampling

• Whilst cluster sampling provides huge time & cost
savings, it is likely to have a much greater potential
for sampling error
– i.e. certain parts of each district will be excluded



Probability Sampling Techniques

5. Multi-stage sampling
• Experts increasingly use a combination of probability
sampling techniques
– e.g. sample attitudes to tourists in Norway’s towns
• Draw up a sampling frame of towns in Norway
• Randomly (simple, systematic or stratified) select an appropriate
number of towns
• Randomly select an appropriate number of electoral wards
(geographical units from which politicians are elected) from each
town
• Randomly select an appropriate number of voters from the
electoral register of each ward
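
The three stages above can be sketched as nested random draws. Everything in the frame below (20 towns, 5 wards per town, 50 voters per ward) is invented for the illustration, and simple random sampling is used at every stage, which is only one of the options the slide allows.

```python
import random

rng = random.Random(7)  # fixed seed so the example is repeatable

# Hypothetical frame: towns -> wards -> voters (all names invented).
frame = {
    f"Town {t}": {
        f"Ward {t}.{w}": [f"Voter {t}.{w}.{v}" for v in range(1, 51)]
        for w in range(1, 6)
    }
    for t in range(1, 21)
}


def multi_stage_sample(frame, n_towns, n_wards, n_voters):
    """Randomly select towns, then wards in each town, then voters in each ward."""
    selected = []
    for town in rng.sample(sorted(frame), n_towns):
        for ward in rng.sample(sorted(frame[town]), n_wards):
            selected.extend(rng.sample(frame[town][ward], n_voters))
    return selected


voters = multi_stage_sample(frame, n_towns=5, n_wards=2, n_voters=10)
print(len(voters))  # 100
```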



Non-Probability Sampling Techniques

1. Haphazard sampling (accidental, convenience
or availability)
– Samples drawn at the convenience of the interviewer
• e.g. people on a street that are available & willing to
participate
– This technique should still be systematic
• e.g. stop 1 in every 10 passers-by
• Don’t just stop those that you fancy!



Non-Probability Sampling Techniques

2. Purposive sampling
a. Judgement: samples are believed to possess the
necessary attributes
• e.g. mature students for a survey on mature students
b. Quota: selection according to pre-specified quotas
• e.g. select 75 out of 100 units aged 21-25 on the
presumption that 75% of mature students will be 21-25 and 25%
will be 26+
• The problem is that you need to decide which specific
characteristics to set quotas for (age, gender, income?)



Non-Probability Sampling Techniques

c. Snowball: one sampling unit refers another, who
refers another, etc.
• e.g. expats refer other expats for a survey on expats
• Not particularly representative but useful when the
population is hard to find or access (e.g. the homeless)
d. Expert choice: asks experts to choose typical units
• i.e. representative individuals or cities
• Often referred to as a ‘panel of experts’
• This helps elicit views of persons with specific expertise
• Also means they help to validate & ‘defend’ any results



Probability versus Non-probability
Sampling Techniques

• In probability sampling
– Representation is determined by the fact that every unit has an
equal chance of being selected, based on probability theory
• In non-probability sampling
– There is an assumption that there is an even distribution of
characteristics within the population
– BUT, the population may or may not be represented and it will
be hard to know which is true



Why Might the Following Approaches
to Sampling be Biased?

1. I want to survey golf club members’ attitudes to the quality of the
greens and survey a sample of the top 25 players at the club

2. I want to survey people in Molde to find out what they think
about my cafe so I survey every 10th customer in the cafe.
Surveys are conducted every Monday morning

3. I survey 2,500 bus passengers in Ålesund, over a series of
times, days and months, to ask what they think about the
availability of bus services in Ålesund



Sources of Error

• Non-sampling errors (i.e. from survey design or
delivery)
– Non-observation errors: failing to obtain data from certain
segments of the population due to non-response or exclusion
– Observation errors: inaccurate information obtained from the
samples or errors in data processing, analysis or reporting

Characteristic   Population   Sample (% pop)   Responses (% sample)
18-21 years      500          250 (50%)        179 (72%)
22-25 years      300          150 (50%)        96 (64%)
26+ years        200          100 (50%)        10 (10%)
Total            1,000        500 (50%)        285 (57%)
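
The non-response problem in this table can be made explicit with a short calculation. The sketch below uses only the figures from the table; the comparison of each group's share of the population with its share of the responses is an added illustration, not something shown on the slide.

```python
# Figures copied from the table above: (population, sampled, responded).
groups = {
    "18-21 years": (500, 250, 179),
    "22-25 years": (300, 150, 96),
    "26+ years": (200, 100, 10),
}

total_population = sum(pop for pop, _, _ in groups.values())
total_responses = sum(resp for _, _, resp in groups.values())

summary = {}
for name, (population, sampled, responded) in groups.items():
    summary[name] = {
        # Responses as a percentage of the units sampled in this group.
        "response_rate": round(responded / sampled * 100),
        # The group's weight in the population versus in the returned data.
        "share_of_population": round(population / total_population * 100, 1),
        "share_of_responses": round(responded / total_responses * 100, 1),
    }

for name, row in summary.items():
    print(name, row)
```

On these figures the 26+ group makes up 20% of the population but only about 3.5% of the responses, which is the kind of non-observation error the slide describes.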



Sources of Error

• Sampling error (i.e. from sampling)


– Where the sample drawn may not provide the same
estimates of certain characteristics as other same-size
samples from the population



Example of Sampling Error

• Age of Squash club members (n=40):

24, 21, 23, 16, 17, 56, 60, 64, 58, 57, 60, 47, 42, 41, 40, 22, 35, 38,
40, 41, 49, 19, 19, 20, 35, 27, 28, 29, 30, 71, 66, 21, 23, 26, 27, 30,
31, 45, 55

• Overall average is 37.5 years (population parameter)


• Average for 5 separate samples of 10 members
– 35.7, 39.5, 23.1, 51.3, 30.3 (estimates)
• Accuracy (AKA standard error) of sample means can be
calculated for probability samples
Standard Error

• Accuracy is often quoted in studies

“56% of customers were more than
satisfied with service quality; this
estimate is subject to a 2% error either
way”

• The 2% error is called the standard error


• Measures statistical accuracy of the sample
• Standard error decreases as sample size increases
– Zero error when the sample is the population



Calculating the Standard Error

• Standard error = sdev / (√n)


– sdev: standard deviation of the sample
– n: sample size

Example
– Random sample of 50 customers have a mean
age of 23.4 and a standard deviation of 9.7
– Standard error = 9.7 / (√ 50) = 1.4
– Therefore, population mean is likely to be 23.4 +/-
1.4 (i.e. range between 22.0-24.8 years)
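
The calculation in this example can be reproduced with a few lines of Python. The function name `standard_error` is made up for the sketch; the inputs are the figures from the example.

```python
import math


def standard_error(sdev, n):
    """Standard error of a sample mean: sdev / sqrt(n)."""
    return sdev / math.sqrt(n)


# Figures from the example: 50 customers, standard deviation of 9.7.
se = standard_error(sdev=9.7, n=50)
print(round(se, 1))  # 1.4
```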



Degrees of Confidence

• Standard error doesn’t say how likely it is (i.e. how
confident we can be) that the estimated range is
correct
• We use principles of standard deviation to determine
the level of confidence in our estimated range



Standard Deviation

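[Figure: normal distribution curve with the horizontal axis marked from -3sd to +3sd around the mean, and bands labelled 68%, 95% and 99%. Callout: 95% of responses fall within 2 sdev’s of the mean]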



Degrees of Confidence

• 2 sdev’s means we can be 95% confident (i.e. correct 95 times out
of 100) that the sample mean will lie within 2 standard errors of the
population mean

• Calculating 95% confidence for the earlier example
– Where we said that the population mean is likely to be 23.4 +/-1.4 (i.e.
range between 22.0-24.8 years)
– 23.4 +/- 2.8 (standard error of 1.4 x 2) provides a range of 20.6 to 26.2

• Therefore, we can be 95% confident that the population mean is
between 20.6 and 26.2 years
• Do the same for the 99% level of confidence…



Acceptable Level of Confidence?

• 68% of all sample means would fall within a range of
+/- 1 standard error of the population mean
– This means that we would be 68% confident that the
population mean is between 22.0 & 24.8 years
• The 68% level of confidence means there is a 32%
chance of being incorrect
• 95% is normally used as the acceptable level of
confidence for statistical analysis
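
The 68% and 95% ranges quoted for the age example can be reproduced as follows. This is a sketch that follows the slides' convention of using a rounded standard error of 1.4 and multipliers of 1 and 2; the function name `confidence_range` is made up for the illustration.

```python
def confidence_range(mean, standard_error, z):
    """Range of mean +/- z standard errors, rounded to one decimal place."""
    margin = z * standard_error
    return round(mean - margin, 1), round(mean + margin, 1)


mean_age, se = 23.4, 1.4  # sample mean and rounded standard error from the example

print(confidence_range(mean_age, se, z=1))  # (22.0, 24.8)  68% level
print(confidence_range(mean_age, se, z=2))  # (20.6, 26.2)  95% level
```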



How Large Should the Sample be?

• Sample size is NOT relative to population size!


• Sample size is absolute
– e.g. provided sampling procedures have been followed, a
sample size of 1,000 is equally valid for a population of British
adults (50mn), London residents (7mn) or Molde residents
(24,000)
• Sample size is determined by
– The availability of resources
– The purpose of data you intend to collect
– The required level of accuracy in the results
– The required level of confidence



Resources & Purpose

• Availability of resources is self-explanatory


• The purpose of data you intend to collect
– Smaller OK for descriptive info. on attitudes
– Larger required for explanations for attitudes
• e.g. to investigate satisfaction according to gender, you need
sufficient numbers of each gender and each level of satisfaction
in order to capture the variation within the population – 5 in
each would result in a minimum sample size of 60 (see next
slide)



Sample Size & Explanations for Attitudes

                    Male   Female   Total
Very Satisfied      5      5        10
Satisfied           5      5        10
Neither             5      5        10
Dissatisfied        5      5        10
Very Dissatisfied   5      5        10
Total               30     30       60



Optimum Size for Probability Samples

• Estimating proportions method is one of many
methods used by researchers
• Assumes
– No info. on standard error from previous studies
– Size of population is known
– Simple or systematic random sampling
– Sample will be used to estimate proportions
• e.g. the percentage of customers that are satisfied
• e.g. the percentage of students that like to play squash
• e.g. the percentage of voters for a particular party



Optimum Size for Probability Samples

• Sample size is determined by

n = z² p(1-p) / H²

• Where
– n = sample size needed to achieve the level of reliability
– p = the population proportion (e.g. % satisfied customers)
– H = desired level of accuracy
– z = number of standard errors corresponding to the desired level of
confidence (z = 2.0 for 95%)



Optimum Size for Probability Samples

Example: sampling levels of customer satisfaction

1. Want to estimate % satisfied customers within +/-2%
– H = 0.02 (2 / 100)
2. Estimate what proportion of the population are satisfied (50% is
normal unless a pilot or previous study suggests otherwise)
– p = 0.5 (50 / 100)
3. Select the desired level of confidence
– z = 2 (z is 2 at the 95% level)
4. Calculate sample size
n = 2² x 0.5(1-0.5) / 0.02²
n = 10,000 x 0.25
n = 2,500

Now select a sample of 2,500 from the sampling frame using simple
or systematic random sampling
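
The estimating proportions calculation, n = z² p(1-p) / H², can be wrapped in a small function. This is a minimal sketch: the function name `sample_size` and the choice to round up to a whole unit are assumptions made for the example.

```python
import math


def sample_size(p, H, z=2.0):
    """Sample size from the estimating proportions method: z^2 * p(1-p) / H^2.

    p: expected population proportion (0.5 if unknown)
    H: desired level of accuracy, as a proportion (0.02 for +/-2%)
    z: multiplier for the level of confidence (2.0 for 95%, as in the slides)
    """
    n = z ** 2 * p * (1 - p) / H ** 2
    # Round away floating-point noise before rounding up to a whole unit.
    return math.ceil(round(n, 6))


print(sample_size(p=0.5, H=0.02))  # 2500
```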



Optimum Sample Sizes at the 95% Level

Margin of error (+/- %) by finding from the survey

Sample size   50/50%   40/60%   30/70%   20/80%   10/90%
50            14.0     13.7     12.8     11.2     8.4
100           9.8      9.7      9.0      7.9      5.9
250           6.2      6.1      5.7      5.0      3.7
500           4.4      4.3      4.0      3.5      2.6
1,000         3.1      3.0      2.8      2.5      1.9
2,500         2.0      1.9      1.8      1.6      1.2
5,000         1.4      1.4      1.3      1.1      0.8
10,000        1.0      1.0      0.9      0.8      0.6
20,000        0.7      0.7      0.6      0.6      0.4
40,000        0.5      0.5      0.4      0.4      0.3

Note: could reduce sample size by reducing level of accuracy
(e.g. 4.4% for just 500!)
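
The trade-off between sample size and accuracy can also be explored by turning the sample size formula around to give the accuracy for a chosen sample size, H = z √(p(1-p)/n). The sketch below assumes this rearrangement and a made-up function name `margin_of_error`. With the rounded z = 2.0 it gives figures close to, but not always identical with, a printed table; using a more precise z of about 1.96 gives 4.4 for a sample of 500.

```python
import math


def margin_of_error(n, p, z=2.0):
    """Accuracy (+/- percentage points) for sample size n and proportion p."""
    return z * math.sqrt(p * (1 - p) / n) * 100


# A 50/50 finding with the rounded z = 2.0 used in the lecture.
print(round(margin_of_error(2500, 0.5), 1))  # 2.0

# The same finding for a sample of 500, using z = 1.96.
print(round(margin_of_error(500, 0.5, z=1.96), 1))  # 4.4
```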



Effect of Changing the Level of Confidence

Sample size (50/50%) 99% (z=2.6) 95% (z=2.0) 90% (z=1.6)


50 18.4 14.0 11.8
100 13.0 9.8 8.3
250 8.2 6.2 5.2
500 5.8 4.4 3.7
1,000 4.1 3.1 2.6
2,500 2.6 2.0 1.6
5,000 1.8 1.4 1.2
10,000 1.3 1.0 0.8
20,000 0.9 0.7 0.6
40,000 0.6 0.5 0.4



Your turn.....

Sampling whether students like to play squash

Using the ‘estimating proportions method’, estimate
the optimum sample size for a survey on whether
students like to play squash.

1. The desired level of accuracy is 5%
2. The same survey from last year found that 20% like to play
3. The desired level of confidence is 95%



Result.....

Example: sampling whether students like to play squash

1. Want to estimate % students that like to play within +/-5%
– H = 0.05 (5 / 100)
2. Estimate what proportion of the population like to play (the same
survey from last year found that 20% like to play)
– p = 0.2 (20 / 100)
3. Select the desired level of confidence
– z = 2 (z is 2 at the 95% level)
4. Calculate sample size
n = 2² x 0.2(1-0.2) / 0.05²
n = 1,600 x 0.16
n = 256

Now select a sample of 256 from the sampling frame using simple
or systematic random sampling






SUGGESTED APPENDIX

Statistical Note on Sample Size & Confidence Intervals


This survey has a sample size of 500. All samples are subject to a margin
of statistical error. The margins of error, or ‘confidence intervals’, for this
survey are as follows:

Finding from the survey   95% confidence interval
50/50%                    +/-4.4%
40/60%                    +/-4.3%
30/70%                    +/-4.0%
20/80%                    +/-3.5%
10/90%                    +/-2.6%
5/95%                     +/-1.9%
This means, for example, that if 20% of the sample are found to have a
particular characteristic, there is an estimated 95% chance that the true
population percentage lies in the range 20 +/- 3.5, i.e. between 16.5 and
23.5%. These margins of error have been taken into account in the
analysis in this report.
Source: Veal (1997; p215)
Dodgy Opinion Polls…..?
“The Centre Party (Senterpartiet) is also on the rise at 9 percent, a
gain of 2.2 since June” (Tidens Krav, 20/08/07)

The opinion poll for August was produced by Sentio Research Norge for
Tidens Krav, Romsdals Budstikke, Sunnmørsposten and NRK. 500
people in Møre og Romsdal were interviewed on 13 and 14 August.
Optimum Sample Sizes at the 95% Level

Sample size   50/50%   40/60%   30/70%   20/80%   10/90%
50            14.0     13.7     12.8     11.2     8.4
100           9.8      9.7      9.0      7.9      5.9
250           6.2      6.1      5.7      5.0      3.7
500           4.4      4.3      4.0      3.5      2.6
1,000         3.1      3.0      2.8      2.5      1.9
2,500         2.0      1.9      1.8      1.6      1.2
5,000         1.4      1.4      1.3      1.1      0.8
10,000        1.0      1.0      0.9      0.8      0.6
20,000        0.7      0.7      0.6      0.6      0.4
40,000        0.5      0.5      0.4      0.4      0.3

Note: a 2.2% change is within the margin of error for a sample of 500
and can therefore be ‘down to chance’



Optimum Size for Non-Probability Samples

• Optimum sample sizes can’t be determined for
non-probability samples
– Can use optimum probability samples but levels of accuracy
& confidence are relatively meaningless
• The equation is based on probabilities
• Size is simply based on pragmatic considerations
– i.e. resources & purpose of data



The Effect of Non-Response on
Sample Size

• Previous studies may suggest that you can expect a
certain response rate – take this into account
– e.g. if you need a sample of 200 and expect a response rate
of 40%, you should consider sampling 500
– e.g. if you’re interested in opinions about a particular event
and only 30% of your sample attended the event, sample
size should be increased
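
The adjustment can be sketched as a simple division of the number of usable responses needed by the expected response rate. The function name `units_to_sample` and the optional `eligible_share` parameter (the share of the sample that qualifies, such as those who attended the event) are assumptions made for this illustration.

```python
import math


def units_to_sample(responses_needed, response_rate, eligible_share=1.0):
    """Units to approach so that enough usable responses come back.

    response_rate and eligible_share are proportions between 0 and 1.
    """
    n = responses_needed / (response_rate * eligible_share)
    # Round away floating-point noise before rounding up to a whole unit.
    return math.ceil(round(n, 6))


# The slide's example: 200 responses needed at a 40% response rate.
print(units_to_sample(200, response_rate=0.40))  # 500
```

The same function covers the event example: with full response but only 30% of the sample eligible, `units_to_sample(200, 1.0, eligible_share=0.30)` gives 667.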



Summary

• A sample is a small & manageable portion or sub-set
of a population
– Commonly associated with quantitative methods
– Applies to human & non-human phenomena
– Extracted from a sampling frame or at source
• 2 main sampling techniques
– Probability & non-probability sampling
• 2 main types of error
– Non-sampling & sampling errors



Summary

• Levels of accuracy & confidence


– Standard error measures accuracy in sample estimates
– Confidence determines likelihood that the estimate is correct
• Sample size is absolute
– Based on resources available & purpose of data
– Also based on desired accuracy & confidence (probability
sampling)



Recommended Reading

• Chapters 1 & 2 in Fink, A. (2003). The Survey
Handbook. 2nd Ed. London: Sage.



“Thank you for your attention”

Questions.…….

