
SURVEY WORKSHOP

LECTURE 1: TYPES OF ERROR IN SURVEYS (CHAP 2)

A. ERRORS ASSOCIATED WITH WHO ANSWERS

- Anytime a sample is drawn from a population, there is a chance that the sample differs from the
population in some characteristics

=> goal of survey methodology = to minimize the random differences between sample and population

➔ Sampling errors = random errors


=> the possible errors that stem from the fact that data are collected from a sample rather than from
every member of the population
➔ Bias = systematic errors that cause a consistent difference between the sample and the target
population

➔ 3 steps in the data collection process that may cause bias or sampling errors

1. Error in choosing sample frame


- There are people in the target population who have no chance of being selected for the sample
- If those people are consistently different from the people who can be selected,
=> then the results will be biased (the sketch after this list illustrates the difference between this kind of
bias and random sampling error).

2. Error in the process of choosing who is in the sample (sampling)


E.g. volunteers => people who do not volunteer have a different profile of interests than those who do

3. Failure to collect data from everyone who is selected to be in the sample
E.g. respondents who are unable, unwilling, or unavailable to do the survey.
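A minimal simulation sketch (all numbers hypothetical) of the two kinds of error above: random samples from the full population scatter around the true mean (sampling error), while samples drawn from a frame that leaves out a distinctive group miss the true mean in the same direction every time (bias).

```python
import random

random.seed(1)

# Hypothetical population (illustrative numbers only): 9,000 people living in
# regular households and 1,000 people living in institutions whose scores are
# systematically lower.
household = [random.gauss(50, 10) for _ in range(9000)]
institution = [random.gauss(20, 10) for _ in range(1000)]
population = household + institution

def mean(values):
    return sum(values) / len(values)

true_mean = mean(population)

# Sampling error: random samples drawn from the FULL population scatter
# randomly around the true mean.
full_frame_means = [mean(random.sample(population, 200)) for _ in range(5)]

# Bias: random samples drawn from an INCOMPLETE frame (households only) are
# consistently too high, no matter how often we repeat the draw.
household_only_means = [mean(random.sample(household, 200)) for _ in range(5)]

print(f"true mean              : {true_mean:.1f}")
print("full-frame sample means:", [round(m, 1) for m in full_frame_means])
print("household-only means   :", [round(m, 1) for m in household_only_means])
```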

RECAP

SAMPLING

A. Sample frame

➔ 3 standards of a sample frame


1. Comprehensiveness = the extent to which the sample frame completely covers the population
2. Probability of selection = whether the probability that a person is selected can be calculated
3. Efficiency = the rate at which members of the target group can be found among the units in the frame

Comprehensiveness
E.g. household-based opinion surveys tend to exclude people in prisons, psychiatric institutions, nursing homes, etc.
=> even though regular households account for the majority of the population, the groups excluded have
distinctive characteristics and may be consistently different from those who can be chosen.

Probability of selection
It may not be possible to know this probability in advance for everyone in the frame, but it is essential to be
able to calculate it for those who have been selected (see the sketch below).

Efficiency
In some cases, sampling frames already include units that are not part of the target population.

NOTE: Because the ability to generalize from a sample is limited by the sample frame, researchers must
report who did and did not have a chance to be selected for the sample.
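A minimal sketch (hypothetical numbers) of what "calculating the probability of selection" looks like in practice: units that appear in the frame more than once (e.g. a household with several phone lines) have a higher chance of being drawn, and the inverse of each respondent's probability is used as a weight so that they do not count more than they should.

```python
# Minimal sketch, hypothetical numbers: under an equal-probability design the
# chance of selection is n / N; a unit listed several times in the frame has a
# roughly proportionally higher chance, and the inverse of that chance is its
# design weight.
base_probability = 100 / 8500          # n / N = 1/85 for a unit listed once

for listings in (1, 2, 3):
    p = listings * base_probability    # approximate probability of selection
    weight = 1 / p                     # frame units this respondent stands for
    print(f"{listings} listing(s): p = {p:.4f}, design weight = {weight:.1f}")
```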

1. Simple random sampling


= everyone has an equal chance of being selected

2. Systematic samples
= everyone has the same calculable chance of being selected; a starting point is designated by choosing a
random number that falls within the sampling interval.
E.g. there are 8,500 people on the list and n = 100 => sampling interval = 8,500/100 = 85 => 1 out of every
85 people on the list is selected (selection probability 100/8,500 = 1/85); see the sketch after this list.

3. Stratified samples
= the process of producing a sample that is most likely to mirror the population
- The relevant characteristics of the respondents are known in the sample frame
- The sample is representative by design
4. Quota sample
- Like a stratified sample, but the characteristics are not known beforehand (quotas are filled as respondents
are found)
- NOT a random sample

5. Cluster sample
- Respondents are selected per cluster
E.g. every student from a selected tutorial group
6. Multistage sample
- Almost like a cluster sample, but units are sampled again within each selected cluster
E.g. cluster: every student from a selected tutorial group
Multistage: some students from each selected tutorial group
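A minimal sketch of how the list-based designs above could be drawn in practice, reusing the hypothetical 8,500-person list from the systematic-sampling example; the strata and their sizes are made up for illustration.

```python
import random

random.seed(42)

# Hypothetical frame: 8,500 people, each belonging to one of two made-up strata.
frame = [{"id": i, "stratum": "urban" if i % 3 else "rural"} for i in range(8500)]
n = 100

# 1. Simple random sample: every person has an equal chance of selection.
srs = random.sample(frame, n)

# 2. Systematic sample: interval = N / n = 85; pick a random start within the
#    first interval, then take every 85th person on the list.
interval = len(frame) // n              # 8500 / 100 = 85
start = random.randrange(interval)      # random start point in [0, 84]
systematic = frame[start::interval][:n]

# 3. Stratified sample: sample each stratum in proportion to its size, so the
#    sample mirrors the population on that characteristic by design.
stratified = []
for stratum in ("urban", "rural"):
    members = [p for p in frame if p["stratum"] == stratum]
    share = round(n * len(members) / len(frame))
    stratified.extend(random.sample(members, share))

print(f"simple random: {len(srs)}, systematic: {len(systematic)}, stratified: {len(stratified)}")
```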

Why sample?
- We cannot survey every member of the target population
=> a representative sample provides information about the population
- The question is: do we choose a random sample or a non-random sample?

Why random?

Reason no. 1: statistical significance tests assume a random sample


- Does a random sample guarantee a representative sample? => not necessarily.
You may get a representative sample if you are lucky

Representative sample
- A representative sample mirrors certain characteristics of the population
- There are 3 kinds of representativeness
1. Representative by chance => random
2. Representative by design => sample design (systematic, stratified etc.)
3. Non-representative sample
DRAWING A SAMPLE USING 2 OR MORE SURVEY MODES / WHICH MODE?

Survey mode

➔ ads & disads of interviewers


➔ no interviewer: mail vs internet. NOTE: mail response are often bias
➔ phone vs f2f interview

➔ mixing modes? What happens

- Use sufficient sample size


E.g. N = 100 => mixing an internet survey and f2f interviews => it is easy to find 100 respondents for the
internet survey but time-consuming and costly to conduct f2f interviews with 100 people.

Mode effects

➔ interviewer effects
- Commonly happens when there are private or sensitive questions, e.g. about sex, crime, etc.
E.g. question about sex: have you ever had sex while drunk? => with a young interviewer, respondents are
more likely to be honest than with an older interviewer
- Interaction between interviewers and respondents
- Social desirability
E.g. question: have you ever bullied anyone? => most people are likely to say no => social desirability

➔ primacy and recency


- Primacy: visual presentation (e.g. self-administered questionnaires) => respondents tend to pick the first answer
- Recency: aural presentation (e.g. the interviewer reads the options) => respondents tend to pick the last answer

SAMPLE SIZE: how big should a sample be?


- The required sample size is not a fixed proportion of the population size
- The sample size depends on the precision you need => a smaller confidence interval (CI) requires a larger sample
- If you want to compare subgroups of the population (t-test, ANOVA, etc.), each subgroup needs enough cases
- Distribution of characteristics in the population => the smaller the difference you want to detect, the larger
the sample needed (see the sketch below)
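A minimal sketch of the precision point above, using the standard sample-size formula for a proportion at roughly 95% confidence; the target margins of error are illustrative.

```python
from math import ceil

# Required sample size for estimating a proportion with a given margin of error
# at ~95% confidence: n = z^2 * p * (1 - p) / e^2, with the worst case p = 0.5.
def required_n(margin_of_error: float, p: float = 0.5, z: float = 1.96) -> int:
    return ceil(z**2 * p * (1 - p) / margin_of_error**2)

for e in (0.05, 0.03, 0.01):            # a tighter CI demands a larger sample
    print(f"margin of error ±{e:.0%}: n ≈ {required_n(e)}")

# Note that the population size does not appear in the formula; a finite-population
# correction only matters when the sample is a large share of the population.
```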

Coverage
Example: probability sampling through landline telephone numbers
=> now that most people use cell phones, a landline-only frame may not cover all units in the target
population.

NOTE: for each person interviewed via cell phone, the probability of selection must be calculated.
The ability to calculate sampling errors is one of the principal strengths of the survey method => researchers
using complex sample designs should ensure that sampling errors are calculated and reported appropriately
(see the sketch below).
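A minimal sketch of the sampling-error calculation the note refers to, under the simplifying assumption of a simple random sample; the counts are illustrative.

```python
from math import sqrt

# Minimal sketch, illustrative numbers: the sampling error (standard error and
# 95% margin) for a proportion estimated from a simple random sample.
n = 1000            # completed interviews
p_hat = 0.42        # observed share answering "yes"

standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin_95 = 1.96 * standard_error

print(f"estimate: {p_hat:.2f} ± {margin_95:.3f} (95% confidence interval)")
# Complex designs (clustering, unequal selection probabilities) inflate this
# simple-random-sample error, so the design must be taken into account when
# sampling errors are reported.
```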

LECTURE 2: ANSWERS, QUESTIONS, STRUCTURE


➔ Common errors in surveys
- Question wording
- Memory problems, on the interviewee's side
- Not understanding the question, on the interviewee's side
- Delivering the question incorrectly, on the interviewer's side
- The way the answer is recorded, on the interviewer's side
- Information-processing errors during data entry or coding
➔ rules for creating questions
- Remember the research question (RQ)
- Decide clearly what you want to find out
- Be considerate of the respondent
- Put yourself in the respondent's shoes:
- How would you answer the question?
- What would you answer to the question?
- Is the question vague or hard to understand?

➔ should you offer a "don't know" (DK) option?

- Midpoints work the same way => if you offer a DK option or a midpoint, people will choose it
=> why? It takes less effort than thinking carefully about the question

- If you want people to answer every question: leave out "don't know"

- If you want people to answer only if they can: include "don't know"

SOCIAL DESIRABILITY
➔ Open-ended questions

Structure
- Every interviewee must receive the questions in the SAME ORDER.
- Go from general to specific; related questions should be put near each other
- Earlier questions may affect the salience of later ones.
- The first questions should be directly about the topic
- Private, sensitive, or embarrassing questions should be put at the end of the survey

➔ introduction
- Always introduce a new set of questions
- In Qualtrics, you can create blocks for this

LECTURE 3: RESPONSE RATE, NON-RESPONSE

Non-response
- Mail surveys are often biased
- Availability is a more important source of non-response for telephone and in-person surveys than for mail
surveys
- Although there tends to be a demographic difference between respondents and non-respondents for
interviewer-administered surveys => the effects of non-response for random-digit dialling and telephone
surveys are less clear
➔ Item non-response = questions not answered

➔ Unit non-response = respondents do not take part in the survey at all

Problems from unit non-response

There are two possibilities for unit non-response

1. Random non-response = the reasons respondents do not respond are random
2. Non-random non-response = the non-response is related to the respondents' characteristics => this
possibility is problematic (see the sketch below)
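A minimal simulation sketch (hypothetical numbers) of why possibility 2 is the problematic one: when refusal is unrelated to the answer the estimate stays on target, but when people with a particular answer refuse more often the estimate shifts in a consistent direction.

```python
import random

random.seed(7)

# Hypothetical selected sample: 1,000 people, 40% of whom would answer "yes".
answers = ["yes"] * 400 + ["no"] * 600

def observed_yes_share(response_prob):
    """Share of 'yes' among those who actually respond, given a response
    probability for each answer group."""
    responded = [a for a in answers if random.random() < response_prob[a]]
    return sum(a == "yes" for a in responded) / len(responded)

# 1. Random non-response: both groups respond at the same rate => estimate stays near 0.40.
print(f"random non-response    : {observed_yes_share({'yes': 0.6, 'no': 0.6}):.2f}")

# 2. Non-random non-response: 'yes' people respond far less often => estimate is pulled down.
print(f"non-random non-response: {observed_yes_share({'yes': 0.3, 'no': 0.8}):.2f}")
```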

How to reduce non-response?


- Train interviewers
- Extended fieldwork period => longer survey time
=> risk: history effect => threat to validity (basically, people change over time => consistent differences)
- Contact respondents again
- Incentives
=> a gift for the respondent, BEFORE the survey/interview, WITHOUT conditions
E.g. a monetary gift, saying "before you start the survey, this is a little gift for you, thank you for participating"
=> respondents feel the need to reciprocate => a basic persuasion principle
DON'T: a monetary gift, saying "if you do the survey properly, I'll give you the money" => respondents will only
do the survey for the gift.

Response rate
Response rates often increase when:
- There is greater investment of money, time, and effort => an identifiable sponsor / visible investment
- The presentation of the survey is good (especially with mail and internet survey modes) => a well-designed
instrument
- The task is made easier for respondents
- The respondent is given a gift => financial incentives
- There is repeated contact until we get an answer, if possible (a small bookkeeping sketch follows)
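A minimal sketch of the bookkeeping behind a response rate, using made-up counts: the simplest definition is completed interviews divided by eligible sampled units.

```python
# Minimal sketch, illustrative counts: the simplest response-rate bookkeeping is
# completed interviews divided by eligible sampled units (ineligible units are
# dropped from the denominator).
completed = 640
refusals = 210
non_contacts = 130
ineligible = 20     # e.g. sampled numbers that turned out not to belong to the target population

eligible = completed + refusals + non_contacts
response_rate = completed / eligible

print(f"response rate: {response_rate:.1%}")   # 640 / 980 ≈ 65.3%
```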

You might also like