
Master in Tourism Management and Planning

Quantitative Methods in Tourism Research

V. Types of errors

Content

1. Inference in surveys

2. Sampling errors
2.1 Coverage error
2.2 Sampling error
2.3 Non-response error
2.4 Adjustment error

3. Non-sampling (measurement) errors


3.1 Validity
3.2 Measurement error
3.3 Processing error
1. Inference in surveys

Inference: drawing conclusions about something unobserved from something observed.

Inference I: We use an answer to a question from a respondent to draw inferences about the characteristics of that person.

Inference II: We use statistics computed on the respondents to draw inferences about the characteristics of the larger population.
1. Inference in surveys

Associated errors:

[Diagram: total survey error framework, with two parallel inference chains]

Measurement side (NON-SAMPLING / MEASUREMENT ERRORS):
Construct (µi) → validity → Measurement (Yi) → measurement error → Response (yi) → processing error → Edited response (yip)

Representation side (SAMPLING ERRORS):
Target population (Y) → coverage error → Sampling frame (YC) → sampling error → Sample (yC) → nonresponse error → Respondents (yr) → adjustment error → Postsurvey adjustments (yrw)

The two chains come together in the survey statistic (yprw).

In both cases, errors can be:
• RANDOM
• SYSTEMATIC (bias)

2. Sampling errors

[Diagram: the total survey error framework repeated, here framing the representation-side errors covered in this section: coverage, sampling, nonresponse, and adjustment.]

2. Sampling errors

2.1 Coverage error: discrepancy between the target population and the sampling frame.

[Diagram: overlapping target population and frame population; the intersection is the covered population, target units outside the frame are undercoverage, and frame units outside the target population are ineligible units.]

Example: in the United States there is no up-to-date list of residents that can be used as a sampling frame of people. Sample surveys targeting all US residents therefore often use sampling frames of telephone numbers. However, people with lower incomes and in remote rural areas are less likely to have telephones in their homes, while young urban residents tend to have only mobile phones.
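The telephone-frame example can be illustrated with a small simulation (all numbers hypothetical; Python standard library only): when the group excluded from the frame differs on the survey variable, the frame mean is systematically off.

```python
import random
import statistics

random.seed(42)

# Hypothetical population: income relates both to phone ownership
# and to the survey variable (say, trips taken per year).
population = []
for _ in range(10_000):
    low_income = random.random() < 0.3                     # 30% low-income
    has_phone = random.random() < (0.6 if low_income else 0.95)
    trips = random.gauss(1.0 if low_income else 3.0, 0.5)
    population.append((has_phone, trips))

true_mean = statistics.mean(t for _, t in population)
frame_mean = statistics.mean(t for phone, t in population if phone)

# The telephone frame under-covers low-income people, so the frame
# mean overstates the population mean: a systematic coverage error.
print(f"target population mean: {true_mean:.2f}")
print(f"telephone frame mean:   {frame_mean:.2f}")
```

Note that this bias does not shrink as the sample grows: it is a property of the frame, not of the sample size.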
2. Sampling errors

2.1 Coverage error

Examples of frame issues

– Telephone directories
• People without telephones, or cell-phone-only users
• Several telephone numbers belong to one person
– Customers, employees, members of an organization:
• Up to date?
• Duplicates, temporary absences
• Addresses that describe a role rather than an individual (e.g. secretary)
• Freelancers included?
– Frames for web surveys
• For certain populations (e.g. UdG students) – all access and known e-mail
addresses
• For general population:
– Always coverage problems (does everyone have internet access? are all addresses known?)
– Panels: lists of e-mail addresses (+incentives)
2. Sampling errors

2.1 Coverage error

Coverage bias? People who use cell phones exclusively may not differ significantly in vote choice, but they might differ substantially in attitudes toward technology.

How to cope with coverage errors?

• Improve the quality of the sampling frame
• Combine different lists/procedures
2. Sampling errors

2.2 Sampling error: gap between the sampling frame and the sample.

• Not all people in the sampling frame are measured (this error is deliberately introduced).

– Random sampling error: always present (except in a census).

– Sampling bias: systematic error arising when some members of the sampling frame are given no chance of selection.

Example: I’m interested in measuring your level of satisfaction with my lectures.

a) I select a sample randomly from a list of all registered students.
b) I select the students in class today (this might not be representative if the missing students are those who are bored by my lectures and do not attend class).

• How to cope with it?

– Survey the whole population (expensive, time consuming…)
– Use precise sampling methods (probabilistic sampling)
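The lecture-satisfaction example can be sketched numerically (hypothetical numbers): a probabilistic sample from the full register carries only random error, while the in-class sample excludes bored students and stays biased.

```python
import random
import statistics

random.seed(1)

# Hypothetical class of 200 students; "bored" students report lower
# satisfaction and tend not to attend lectures.
students = []
for _ in range(200):
    bored = random.random() < 0.4
    satisfaction = random.gauss(4.0, 1.0) if bored else random.gauss(7.0, 1.0)
    attends = random.random() < (0.3 if bored else 0.9)
    students.append((attends, satisfaction))

true_mean = statistics.mean(s for _, s in students)

# (a) Probabilistic sample from the register: random error only.
register_sample = random.sample(students, 50)
mean_a = statistics.mean(s for _, s in register_sample)

# (b) "Students in class today": bored students had little chance of
# selection, which introduces a systematic sampling bias.
in_class = [s for attends, s in students if attends]
mean_b = statistics.mean(random.sample(in_class, 50))

print(f"true mean {true_mean:.2f}, register sample {mean_a:.2f}, "
      f"in-class sample {mean_b:.2f}")
```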
2. Sampling errors

2.3 Nonresponse error: gap between the sample and the respondents.

• Unit nonresponse: not all sample members respond.

• Item nonresponse: information is missing for a particular question.


2. Sampling errors

2.3 Nonresponse error

Causes:

Impossibility to contact:
• Depends on the type of data collection method

Refusal:
• The social environment
• Subject characteristics
• Interviewer characteristics
• Questionnaire design:
• Inadequate comprehension
• Too much effort (recall, open questions)
• Sensitive topics
• “Opportunity cost”
• No interest in / motivation for the topic
2. Sampling errors

2.3 Nonresponse error

How to cope with it?

Impossibility to contact:
• Experience
• Sponsors
• Incentives
• Persistence

Refusal:
• Anonymity/confidentiality
• Self-completion (for sensitive topics)
• Interviewer present (to clarify)
• Clear instructions/questions
• Review questionnaire design


2. Sampling errors

2.3 Nonresponse error


[Diagram: factors influencing contact and participation, leading from an initial to a final decision]

Contactability: length of data collection period, number and timing of calls, interviewer workload, interviewer observations.

Initial decision: incentives, pre-notification, burden, sponsorship, respondent rule, respondent/interviewer match, interviewer behavior.

After an initial refusal: persuasion letters, mode switch, interviewer switch, 2-phase sampling, post-survey adjustment → final decision.
2. Sampling errors

2.3 Nonresponse error

(!) Nonresponse bias exists when the causes of the non-response are linked to the
survey statistics measured.

Example (I): A company is conducting a study in Girona on people’s attitudes toward their occupations. Interviewers call a sample of subjects each day between 6:00 pm and 9:00 pm. Those who work during the evening hours (e.g. healthcare employees) will be unable to take part in the study, and they may have different views regarding their jobs.

Example (II): Consider a survey measuring tax-payment compliance. Citizens who do not properly follow tax laws will be the most uncomfortable filling out this survey and the most likely to refuse. This biases the data toward a more law-abiding net sample than the original sample.
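Example (II) can be made concrete with a small simulation (hypothetical rates): when the probability of responding is linked to the survey variable, the respondent mean is biased.

```python
import random
import statistics

random.seed(7)

# Hypothetical tax-compliance survey: non-compliant citizens are more
# likely to refuse, so response propensity is linked to the survey
# variable (1 = compliant, 0 = not compliant).
sample = []
for _ in range(5_000):
    compliant = random.random() < 0.8
    p_respond = 0.7 if compliant else 0.3
    responds = random.random() < p_respond
    sample.append((responds, 1.0 if compliant else 0.0))

full_mean = statistics.mean(y for _, y in sample)
resp_mean = statistics.mean(y for r, y in sample if r)

# Respondents over-represent the law-abiding, so the estimated
# compliance rate is biased upward.
print(f"full-sample compliance rate: {full_mean:.3f}")
print(f"respondent compliance rate:  {resp_mean:.3f}")
```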
2. Sampling errors

2.4 Adjustment error

Post-survey adjustments are efforts to improve the sample estimate in the face of
coverage, sampling, and nonresponse error.

• Imputation: process of replacing missing data with substituted values.

• Weighting: give greater weight to sample cases that are underrepresented in the final dataset.

If adjustments are not carried out properly, they can themselves become an error source: they can increase rather than reduce error.
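A minimal sketch of weighting (hypothetical numbers): suppose the population is 50% men and 50% women, but the respondent set is 30 men and 70 women. Each case is weighted by (population share) / (respondent share), so each group contributes in its correct proportion.

```python
import statistics

# Hypothetical respondents as (group, y) pairs: 30 men answering 6.0
# and 70 women answering 8.0 on some survey variable.
respondents = [("m", 6.0)] * 30 + [("f", 8.0)] * 70

pop_share = {"m": 0.5, "f": 0.5}        # known population shares
counts = {"m": 30, "f": 70}             # respondent counts per group
n = len(respondents)

# Weight = population share / respondent share, per group.
weights = {g: pop_share[g] / (counts[g] / n) for g in pop_share}

unweighted = statistics.mean(y for _, y in respondents)
weighted = sum(weights[g] * y for g, y in respondents) / sum(
    weights[g] for g, _ in respondents
)

print(f"unweighted mean: {unweighted:.2f}")  # tilted toward women
print(f"weighted mean:   {weighted:.2f}")    # matches population shares
```

The weighted mean here is 7.00 instead of the unweighted 7.40; the adjustment only helps, of course, if the population shares used are themselves correct.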
3. Non-sampling (measurement) errors

[Diagram: the total survey error framework repeated, here framing the measurement-side errors covered in this section: validity, measurement error, and processing error.]
3. Non-sampling (measurement) errors

3.1 Invalidity: gap between the construct and the measure.

• Construct Validity: the extent to which the measure (one or more questions)
reflects the true value of the construct of interest for each individual.

• Invalidity refers to the presence of systematic errors.


Yi = µi + ei
where ei is the error term: the deviation of the measure Yi from the true value µi.

Example:
• Construct: price sensitivity with respect to ecologically grown foods
• We ask: “Would you be willing to pay 10% more for an ecologically grown apple?”
• Those with a tendency to give favourable answers will systematically overstate µi
• Those wishing to present themselves as ecologists or health-conscious will systematically overstate µi
• Those who do not like apples at all will systematically understate µi
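The distinction between random and systematic error in Yi = µi + ei can be sketched with hypothetical numbers: random errors average out over respondents, while a systematic overstatement (e.g. from social desirability) biases the mean no matter how large the sample.

```python
import random
import statistics

random.seed(3)

mu = 5.0  # hypothetical true value of the construct, same for everyone

# Random error: ei has mean 0, so it cancels out across respondents.
random_y = [mu + random.gauss(0.0, 1.0) for _ in range(10_000)]

# Systematic error: a constant +0.8 overstatement shifts every answer,
# so the mean itself is biased.
biased_y = [mu + 0.8 + random.gauss(0.0, 1.0) for _ in range(10_000)]

print(f"random-error mean:     {statistics.mean(random_y):.2f}")
print(f"systematic-error mean: {statistics.mean(biased_y):.2f}")
```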
CFA: Convergent and Discriminant Validity

[Path diagram: two correlated latent factors (correlation φ12), each measured by three items with factor loadings λ and error terms e1–e6]

Satisfaction with the establishment (loadings λ11, λ21, λ31; errors e1–e3):
• My expectations of the establishment have been met at all times
• I have always felt satisfied with the establishment
• The level of satisfaction attained was high compared to that of other similar establishments

Satisfaction with the product (loadings λ42, λ52, λ62; errors e4–e6):
• I am satisfied with the tiles acquired
• My expectations of the tiles purchased have been fulfilled
• Compared to other tiles that I have seen, the degree of satisfaction is high
3. Non-sampling (measurement) errors

3.2 Measurement error: gap between the ideal measurement and the response obtained.

Sources of error:

• Questionnaire design (e.g. complexity, wording, ambiguity, type of questions)

• Social desirability

• Role of the interviewer – interviewer bias:


• Age- and gender- related attitudes
• Experience (does not necessarily mean quality)
• Questions about sensitive behaviour (social presence)

(*) Interviewers can also cause errors in other design stages:


• Deviation from sampling plan/collecting data from an incorrect unit
• Within a unit, collecting data from an incorrect respondent
• Failure to develop rapport/cooperate with respondents
• Failure to help respondents to complete the questionnaire
• Incorrectly recording answers
• Incorrectly editing answers prior to transmitting the edited data
3. Non-sampling (measurement) errors

3.3 Processing error: gap between the variable used in estimation and the response provided by the respondent.

Some examples:
– Data entry errors.
– Misinterpretation of an answer.
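Simple edit rules catch many data-entry errors before estimation. A minimal sketch (hypothetical keyed-in values): flag ages that fall outside a plausible range.

```python
# Hypothetical ages keyed in from paper questionnaires;
# 334 (mistyped 34?) and -1 are data-entry errors.
raw_ages = [34, 27, 334, 51, -1, 62, 19]

def flag_age(a, low=16, high=99):
    """Return True when a keyed age falls outside the plausible range."""
    return not (low <= a <= high)

errors = [a for a in raw_ages if flag_age(a)]
clean = [a for a in raw_ages if not flag_age(a)]

print(f"flagged for re-keying: {errors}")   # [334, -1]
print(f"retained values:       {clean}")
```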
Exercises

Groves (2004). Page 64, ex. 4.

For each of the following design decisions, identify which error sources might be
affected.

a) The decision to include or exclude institutionalized persons (e.g., residing in nursing homes) from the sampling frame in a survey of the prevalence of physical disabilities in the United States.

b) The decision to use self-administration of a mailed questionnaire for a survey of elderly subjects regarding their housing situation.

c) The decision to use repeated calls persuading reluctant respondents in a survey of customer satisfaction.

d) The decision to reduce costs of interviewing by using existing office personnel, thereby increasing the sample size of the survey.

e) The decision to increase the number of questions about assets and income in a survey of income dynamics, resulting in a lengthening of the interview.
Exercises

Groves (2004). Page 63, ex. 1.

• A recent newspaper article reported that "sales of handheld digital devices (e.g.,
ebooks, PDAs) are up by nearly 10% in the last quarter, while sales of laptops and
desktop PCs have remained stagnant."

• This report was based on the results of an on-line survey in which 9.8% of the more
than 126,000 respondents said that they had "purchased a handheld digital device
between January 1 and March 30 of this year."

• E-mails soliciting participation in this survey were sent to individuals using an e-mail
address frame from the five largest commercial Internet service providers (ISPs) in
the United States.

• Data collection took place over a 6-week period beginning May 1, 2002. The overall response rate achieved in this survey was 53%.

• Assume that the authors of this study wanted to infer something about the expected
purchases of US adults (18 years old +).
Exercises

a) What is the target population?

b) How might the design of this survey affect the following sources of error: coverage error, nonresponse error, and measurement error?

c) Without changing the duration or the mode of this survey (i.e., computer assisted,
self-administration), what could be done to reduce the errors you outlined in (b)?

d) To lower the cost of this survey in the future, researchers are considering cutting the sample in half, using an e-mail address frame from only the two largest ISPs. What effect (if any) will these changes have on sampling error and coverage error?
