Data Collection

Tools and Data
Processing
Questionnaire Design Process
For a questionnaire to fulfill a researcher’s purposes, the
questions must meet the basic criteria of relevance and
accuracy. To achieve these ends, a researcher who is
systematically planning a questionnaire’s design will be
required to make several decisions:
1. What should be asked?
2. How should questions be phrased?
3. In what sequence should the questions be arranged?
4. What questionnaire layout will best serve the research
objectives?
5. How should the questionnaire be pretested? Does the
questionnaire need to be revised?
What Should Be Asked?
Certain decisions made during the early stages of the
research process will influence the questionnaire design.
At the same time, the latter stages of the research
process will also have an important impact on
questionnaire wording and measurement. For example,
when designing the questionnaire, the researcher should
consider the types of statistical analysis that will be
conducted.
Questionnaire Relevancy
A questionnaire is relevant to the extent that all
information collected addresses a research question that
will help the decision maker address the current business
problem. In a study where two samples of the same
group of businesses received either a one-page or a
three-page questionnaire, the response rate was nearly
twice as high for the one-page survey.
Conversely, many researchers, after conducting surveys,
find that they omitted some important questions.
Therefore, when planning the questionnaire design,
researchers must think about possible omissions. Is
information on the relevant demographic and
psychographic variables being collected? Would certain
questions help clarify the answers to other questions?
Will the results of the study provide the answer to the
manager’s problem?
Questionnaire Accuracy
Accuracy means that the information is reliable and
valid. Experienced researchers generally believe
that questionnaires should use simple, understandable,
unbiased, unambiguous, and nonirritating words.
Obtaining accurate answers from respondents depends
strongly on the researcher’s ability to design a
questionnaire that will facilitate recall and motivate
respondents to cooperate.
Respondents tend to be more cooperative when the
subject of the research interests them. When questions
are not lengthy, difficult to answer, or ego threatening,
there is a higher probability of obtaining unbiased
answers.
Wording Questions
There are many ways to phrase questions, and many
standard question formats have been developed in
previous research studies.
Open-Ended Response versus
Fixed-Alternative Questions
Open-ended response questions pose some problem or
topic and ask respondents to answer in their own
words. Open-ended response questions are free-answer
questions.
•What names of local banks can you think of?
•What comes to mind when you look at this
advertisement?
•In what way, if any, could this product be changed or
improved? I’d like you to tell me anything you can think
of, no matter how minor it seems.
•How would you describe your supervisor’s management
style?
•Please tell us how our stores can better serve your
needs.
In contrast, fixed-alternative questions, sometimes called
closed-ended questions, give respondents specific
limited-alternative responses and ask them to choose
the one closest to their own viewpoints. For example:
Did you use any commercial feed or supplement for
livestock or poultry in 2010?
*Yes *No
Would you say that the labor quality in Japan is higher, about
the same, or not as good as it was 10 years ago?
•Higher
•About the same
•Not as good
Do you think the Renewable Energy Partnership Program has
affected your business?
•Yes, for the better
•Not especially
•Yes, for the worse
Using Open-ended Response Questions
Open-ended response questions are most beneficial
when the researcher is conducting exploratory research,
especially when the range of responses is not yet known.
Respondents are free to answer with whatever is
foremost in their minds. Such questions can be used to
learn which words and phrases people spontaneously
give to the free-response question.
Such responses will reflect the flavor of the language that
people use in talking about the issue and thus may
provide guidance in the wording of questions and
responses for follow-up surveys. Also, open-ended
response questions are valuable at the beginning of an
interview. They are good first questions because they
allow respondents to warm up to the questioning
process.
The cost of administering open-ended response
questions is substantially higher than that of
administering fixed-alternative questions because the job
of editing, coding, and analyzing the data is quite
extensive. As each respondent’s answer is somewhat
unique, there is some difficulty in categorizing and
summarizing the answers. Another potential
disadvantage of the open-ended response question is
the possibility that interviewer bias will influence the
answer.
Using Fixed-alternative Questions
In contrast, fixed-alternative questions require less
interviewer skill, take less time, and are easier for the
respondent to answer. This is because answers to closed
questions are classified into standardized groupings prior
to data collection. Standardizing alternative responses to
a question provides comparability of answers, which
facilitates coding, tabulating, and ultimately interpreting
the data.
Types of Fixed-Alternative Questions
Simple-dichotomy (Dichotomous) Question
A fixed-alternative question that requires the respondent
to choose one of two alternatives. The answer can be a
simple “yes” or “no” or a choice between “this” and
“that.”
Did you have any overnight travel for work-related
activities last month?
Yes No
Determinant-choice Question
A fixed-alternative question that requires the respondent
to choose one response from among multiple
alternatives.
Please give us some information about your flight. In
which section of the aircraft did you sit?
*First class
*Business class
*Coach class
Frequency-determination Question
A fixed-alternative question that asks for an answer
about general frequency of occurrence.
How frequently do you watch Star Plus?
•Every day
• 5–6 times a week
• 2–4 times a week
• Once a week
• Less than once a week
• Never
Checklist Question
A fixed-alternative question that allows the respondent
to provide multiple answers to a single question by
checking off items.
Please check which, if any, of the following sources of
information about investments you regularly use.
• Personal advice of your broker(s)
• Brokerage newsletters
• Brokerage research reports
• Investment advisory service(s)
• Conversations with other investors
• Web page(s)
• None of these
• Other (please specify)
__________

Totally Exhaustive
A category exists for every respondent among the
fixed-alternative categories.
Alternatives should be totally exhaustive, meaning that
all the response options are covered and that every
respondent has an alternative to check. The alternatives
should also be mutually exclusive, meaning there should
be no overlap among categories and only one dimension
of an issue should be related to each alternative.
The following listing of income groups illustrates
common errors:
$10,000–$30,000
$30,000–$50,000
$50,000–$70,000
$70,000–$90,000
$90,000–$110,000
Over $110,000
The following response categories address the totally
exhaustive and mutually exclusive issues.
Less than $10,000
$10,000–$29,999
$30,000–$49,999
$50,000–$69,999
$70,000–$89,999
$90,000–$109,999
$110,000 or more
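Both requirements can be checked mechanically before a questionnaire goes to the field. The following is a minimal sketch, not part of the original material: the function name `find_problems` and the representation of each category as a `(low, high)` pair of whole dollars are assumptions made for illustration, and the top category of the revised list is assumed to start at exactly $110,000 so that no income falls between categories.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (low, high) in whole dollars, inclusive.
# None marks an open end ("Less than ..." or "... or more").
Bracket = Tuple[Optional[int], Optional[int]]


def find_problems(brackets: List[Bracket]) -> List[str]:
    """List overlaps and gaps in brackets that are sorted from low to high."""
    problems = []
    if brackets[0][0] is not None:
        problems.append(f"no category below {brackets[0][0]}")
    if brackets[-1][1] is not None:
        problems.append(f"no category above {brackets[-1][1]}")
    # Compare each upper bound with the lower bound of the next category.
    for (_, high), (low, _) in zip(brackets, brackets[1:]):
        if low <= high:
            problems.append(f"overlap at {low}")
        elif low > high + 1:
            problems.append(f"gap between {high} and {low}")
    return problems


# The flawed list: shared boundaries, and nothing for incomes under $10,000.
flawed = [(10_000, 30_000), (30_000, 50_000), (50_000, 70_000),
          (70_000, 90_000), (90_000, 110_000), (110_001, None)]

# The revised list, with the top category assumed to start at $110,000.
revised = [(None, 9_999), (10_000, 29_999), (30_000, 49_999),
           (50_000, 69_999), (70_000, 89_999), (90_000, 109_999),
           (110_000, None)]

print(find_problems(flawed))
print(find_problems(revised))
```

An empty result means every whole-dollar income falls into exactly one category. A top category that begins at $110,001 would be reported as a gap after $109,999.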
Phrasing Questions for Self-administered,
Telephone, and Personal Interview Surveys
The means of data collection (telephone interview,
personal interview, or self-administered questionnaire) will
influence the question format and question phrasing. In
general, questions for telephone surveys in particular, as well
as for Internet and mail surveys, must be less complex than
those used in personal interviews. Questionnaires for
telephone and personal interviews should be written in a
conversational style. Telephone surveys use easy-to-understand
response categories.
Guidelines for Constructing Questions
Developing good business research questionnaires is a
combination of art and science. Few hard-and-fast rules
exist in guiding the development of a questionnaire.
Fortunately, research experience has yielded some
guidelines that help prevent the most common
mistakes.
1-Avoid Complexity: Use Simple, Conversational
Language
Words used in questionnaires should be readily
understandable to all respondents. Remember, not all
people have the vocabulary of a college graduate. The
vocabulary used in the following question from an
attitude survey on social problems probably would
confuse many respondents:
When effluents from a paper mill can be drunk and
exhaust from factory smokestacks can be breathed, then
humankind will have done a good job in saving the
environment. . . . Don’t you agree that what we want is
zero toxicity: no effluents?
Besides being too long and confusing, this question is
leading. Survey questions should be short and to the
point, like this:
The stock market is too risky to invest in these days.
2- Avoid Leading and Loaded Questions
Leading and loaded questions are a major source of bias
in question wording. A leading question suggests or
implies certain answers. A study of the dry cleaning
industry asked this question:
Many people are using dry cleaning less because of improved
wash-and-wear clothes. How do you feel wash-and-wear clothes
have affected your use of dry cleaning facilities in the past 4
years?
* Use less * No change *Use more
A loaded question suggests a socially desirable answer or
is emotionally charged. Consider the following question
from a survey about media influence on politics:
What most influences your vote in major elections?
•My own informed opinion
•Major media outlets such as CNN
•Newspaper endorsements
•Popular celebrity opinions
•Candidate’s physical attractiveness
•Family or friends
•Video advertising (television or Web video)
•Other
Certain answers to questions are more socially desirable
than others. For example, a truthful answer to the
following classification question might be painful:
Where did you rank academically in your high school
graduating class?
Top quarter
2nd quarter
3rd quarter
4th quarter
When taking personality or psychographic tests,
respondents frequently can interpret which answers are
most socially acceptable even if those answers do not
portray their true feelings.
I feel capable of handling myself in most social situations.
* Agree * Disagree

I fear my actions will cause others to have low opinions of me.
* Agree * Disagree
3-Avoid Ambiguity: Be as Specific as Possible
Items on questionnaires often are ambiguous because
they are too general. Consider such indefinite words as
often, occasionally, regularly, frequently, many, good, and
poor. Each of these words has many different meanings.
What media do you rely on most?
Television
Radio
Internet
Newspapers
This question is ambiguous because it does not provide
information about the context. “Rely on most” for
what—news, sports, entertainment? When—while
getting dressed in the morning, driving to work, at home
in the evening? Knowing the specific circumstance can
affect the choice made.
4-Avoid Double-Barreled Items
A double-barreled question may induce bias because it covers two
issues at once. Making the mistake of asking two
questions rather than one is easy—for example, “Do you
feel our hospital emergency room waiting area is clean
and comfortable?”
Did your plant use any commercial feed or supplement
for livestock or poultry in 2010?
* Yes * No
5-Avoid Making Assumptions
The following question assumes that respondents already
regard the dividends as outstanding:
Should General Electric continue to pay its outstanding
quarterly dividends?
*Yes *No
Another frequent mistake is assuming that the
respondent had previously thought about an issue.
Research that induces people to express attitudes on
subjects they do not ordinarily think about is rather
meaningless.
6-Avoid Burdensome Questions That May
Tax the Respondent’s Memory
A simple fact of human life is that people forget.
Researchers writing questions about past behavior or
events should recognize that certain questions may
make serious demands on the respondent’s memory.
Writing questions about prior events requires a
conscientious attempt to minimize the problems
associated with forgetting.
The Best Question Sequence
The order of questions, or the question sequence, may
serve several functions for the researcher. If the opening
questions are interesting, simple to comprehend, and
easy to answer, respondents’ cooperation and
involvement can be maintained throughout the
questionnaire.
In their attempt to “warm up” respondents toward the
questionnaire, student researchers frequently ask
demographic or classification questions at the beginning
of the survey. This generally is not advisable, because
asking for personal information such as income level or
education may embarrass or threaten respondents.
Asking these questions at the end of the questionnaire
usually is better, after rapport has been established
between respondent and interviewer.
Order bias can result from a particular answer’s
position in a set of answers or from the sequencing of
questions. In political elections in which candidates lack
high visibility, such as elections for county
commissioners and judges, the first name listed on the
ballot often receives the highest percentage of votes.
For this reason, many election boards print several
ballots so that each candidate’s name appears in every
possible position on the ballot.
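The same rotation idea applies to the answer alternatives of a survey question. The sketch below is only an illustration under stated assumptions: `rotated_orders` is a hypothetical helper, and simple rotation places each alternative in every position exactly once without producing every possible ordering.

```python
from collections import deque
from typing import List, Sequence


def rotated_orders(options: Sequence[str]) -> List[List[str]]:
    """Return one ordering per option; each option appears once in each position."""
    ring = deque(options)
    orders = []
    for _ in range(len(options)):
        orders.append(list(ring))
        ring.rotate(-1)  # move the first option to the end
    return orders


# Hypothetical candidate names; version i of the ballot goes to every i-th voter.
versions = rotated_orders(["Adams", "Baker", "Clark"])
for number, order in enumerate(versions, start=1):
    print(number, order)
```

Distributing the versions in equal numbers spreads any first-position advantage evenly across the alternatives.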
Funnel Technique: Asking general questions before
specific questions in order to obtain unbiased responses.
Generally, researchers should ask general questions
before specific questions. This allows the researcher to
understand the respondent’s frame of reference before
asking more specific questions about the level of the
respondent’s information and the intensity of his or her
opinions.
Filter Question: A question that screens out
respondents who are not qualified to answer a second
question. Including a filter question minimizes the chance
of asking questions that are inapplicable. For example,
asking a human resource manager
“How would you rate the third party administrator (TPA)
of your employee health plan?” may elicit a response
even though the organization does not utilize a TPA. The
respondent may wish to please the interviewer with an
answer.
A filter question such as “Does your organization use a
third party administrator (TPA) for your employee health
plan?” followed by “If you answered Yes to the previous
question, how would you rate your TPA on . . . ?” would
screen out the people who are not qualified to answer.
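In a computer-administered survey, this screening becomes skip logic. The following sketch is illustrative only: `administer` and `scripted` are hypothetical names, and the rating prompt is placeholder wording because the rest of that question is elided above.

```python
from typing import Callable, Dict, Optional

FILTER_Q = ("Does your organization use a third party administrator (TPA) "
            "for your employee health plan?")
# Placeholder wording: the full rating question is not given in the text.
RATING_Q = "How would you rate your TPA?"


def administer(ask: Callable[[str], str]) -> Dict[str, Optional[str]]:
    """Ask the filter question first; ask the rating only if it applies."""
    answers: Dict[str, Optional[str]] = {"uses_tpa": ask(FILTER_Q)}
    if answers["uses_tpa"].strip().lower() == "yes":
        answers["tpa_rating"] = ask(RATING_Q)
    else:
        answers["tpa_rating"] = None  # not applicable: respondent screened out
    return answers


def scripted(replies):
    """Stand-in for a live respondent: returns the next canned reply."""
    remaining = iter(replies)
    return lambda prompt: next(remaining)


print(administer(scripted(["No"])))
print(administer(scripted(["Yes", "4"])))
```

Recording the skipped item as `None` rather than as a rating keeps unqualified respondents out of the later tabulation.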
Pivot Question: A filter question used to determine
which version of a second question will be asked.
“Is your total family income over or under $50,000?” IF
UNDER, ASK, “Is it over or under $25,000?” IF OVER, ASK,
“Is it over or under $75,000?”

-Under $25,000
-$25,001–$50,000
-$50,001–$75,000
-Over $75,000
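The pivot logic amounts to a two-step branch. The sketch below assumes the interviewer records each reply as a Boolean; `income_category` is a hypothetical helper, and the labels follow the four categories listed above.

```python
def income_category(over_50k: bool, over_follow_up_cutoff: bool) -> str:
    """Map the two pivot answers to one of the four income categories."""
    if over_50k:
        # Follow-up asked: "Is it over or under $75,000?"
        return "Over $75,000" if over_follow_up_cutoff else "$50,001-$75,000"
    # Follow-up asked: "Is it over or under $25,000?"
    return "$25,001-$50,000" if over_follow_up_cutoff else "Under $25,000"


print(income_category(over_50k=False, over_follow_up_cutoff=True))
```

Each respondent hears only two short questions, yet the answers still sort into four categories.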
The Best Layout
Good layout and physical attractiveness are crucial in
mail, Internet, and other self-administered
questionnaires. For different reasons, a good layout in
questionnaires designed for personal and telephone
interviews is also important.
-Traditional Questionnaires
Consider a page from a telephone questionnaire. The layout is
neat and organized, and the instructions for the
interviewer (all boldface capital letters) are easy to
follow. The responses “It depends,” “Refused,” and
“Don’t Know” are enclosed in a box to indicate that
these answers are acceptable but that responses from the
five-point scale are preferred.
Questionnaires should be designed to appear as short as
possible. Sometimes it is advisable to use a booklet form
of questionnaire rather than stapling a large number of
pages together. In situations in which it is necessary to
conserve space on the questionnaire or to facilitate data
entry or tabulation of the data, a multiple-grid layout
may be used. The multiple-grid question presents
several similar questions and corresponding response
alternatives arranged in a grid format.
Airlines often offer special fare promotions, but they may require
connecting flights. On a vacation trip, how often would you take a
connecting flight instead of a nonstop flight if you could save $100
a ticket, but the connecting flight was longer?

                                       Never  Rarely  Sometimes  Often  Always
Complete trip is one hour longer?       [ ]    [ ]       [ ]      [ ]     [ ]
Complete trip is two hours longer?      [ ]    [ ]       [ ]      [ ]     [ ]
Complete trip is three hours longer?    [ ]    [ ]       [ ]      [ ]     [ ]
Collecting Primary Data
Making Initial Contact and
Securing the Interview
Personal Interviews
Personal interviewers may carry a letter of identification
or an ID card to indicate that the study is a bona fide
research project and not a sales pitch. Interviewers are
trained to make appropriate opening remarks that will
convince the respondent that his or her cooperation is
important, as in this example:
Good afternoon, my name is _____________, and I’m
with [insert name of firm], an international research
company. We are conducting a survey concerning
_____________. I would like to get a few of your ideas. It
will take [insert accurate time estimate] minutes.
Telephone Interviews
Giving the interviewer’s name personalizes the call. The
name of the research agency is used to imply that the
caller is trustworthy. The respondent must be given an
accurate estimate of the time it will take to participate in
the interview. If someone is told that only three minutes
will be required for participation, and the interview
proceeds to five minutes or more, the respondent will
tend to quit before completing the interview.
For the initial contact in a telephone interview, the
introduction might be something like this:

Good evening, my name is ________________. I am not
trying to sell anything. I’m calling from [insert name of
firm] in Mason, Ohio. We are seeking your opinions on
some important matters and it will only take [insert
accurate time estimate] minutes of your time.
Processing of
Research Data
Stages of Data Analysis
Practically all researchers will be very anxious to begin
data analysis once the field work is complete. Now, the
raw data can be transformed into intelligence. However,
raw data may not be in a form that lends itself well to
analysis.
Raw Data
The unedited responses from a respondent exactly as
indicated by that respondent.
Raw data will often also contain errors both in the form
of respondent errors and non-respondent errors.
Whereas a respondent error is a mistake made by the
respondent, a non-respondent error is a mistake made by
an interviewer or by a person responsible for creating an
electronic data file representing the responses.
Data integrity refers to the notion that the data file
actually contains the information that the researcher
promised the decision maker he or she would obtain.
Additionally, data integrity extends to the fact that the
data have been edited and properly coded so that they
are useful to the decision maker.
Editing
Fieldwork often produces data containing mistakes. For
example, consider the following simple questionnaire
item and response:
How long have you lived at your current address? - 48

The researcher had intended the response to be in years.
Perhaps the respondent has indicated the number of
months rather than years he or she has lived at this
address?
Sometimes, responses may be contradictory. What if the
same respondent above gives this response?

What is your age? 32 years

This answer contradicts the earlier response. If the
respondent is 32 years of age, then how could he or she
have lived at the same address for 48 years?
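A cross-item check of this kind is easy to automate once responses are in a data file. The sketch below is an assumption-laden illustration: the field names `age` and `years_at_address` and the function `flag_record` are hypothetical. It flags the record for an editor and does not change the answer.

```python
from typing import Dict, List


def flag_record(record: Dict[str, float]) -> List[str]:
    """Return editing flags for one respondent; an empty list means no issue found."""
    issues = []
    if record["years_at_address"] > record["age"]:
        issues.append(
            "years_at_address exceeds age; the answer may have been given in months"
        )
    return issues


respondent = {"age": 32, "years_at_address": 48}
print(flag_record(respondent))
```

Whether 48 should be read as 48 months (4 years) remains an editor's decision, ideally one guided by a predetermined rule.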
Editing is the process of checking the completeness,
consistency, and legibility of data and making the data
ready for coding and transfer to storage. So, the editor’s
task is to check for errors and omissions on
questionnaires or other data collection forms. When the
editor discovers a problem, he or she adjusts the data to
make them more complete, consistent, or readable.
Field Editing
Field supervisors often are responsible for conducting
preliminary field editing on the same day as the
interview. Field editing is used to:
1. Identify technical omissions such as a blank page on an
interview form
2. Check legibility of handwriting for open-ended
responses
3. Clarify responses that are logically or conceptually
inconsistent.
In-House Editing
In-house editing rigorously investigates the results of
data collection. The research supplier or research
department normally has a centralized office staff
perform the editing and coding function.
For example, Arbitron measures radio audiences by having
respondents record their listening behavior—time, station, and
place—in diaries. After the diaries are returned by mail, in-
house editors perform usability edits in which they check that
the postmark is after the last day of the survey week, verify the
legibility of station call letters, look for completeness of entries
on each day of the week, and perform other editing activities.
Illustrating Inconsistency - Fact Or Fiction?
Consider a situation in which a telephone interviewer has
been instructed to interview only registered voters in a
state that requires voters to be at least 18 years old. If the
editor’s review of a questionnaire indicates that the
respondent was only 17 years old, the editor’s task is to
correct this mistake by deleting this response because
this respondent should never have been considered as a
sampling unit.
The editor also should check for consistency within the data
collection framework.
In which of the following cities have you shopped for clothing
during the last year?
• San Francisco
• Sacramento
• San José
• Los Angeles
• Other _________
Suppose a respondent checks Sacramento and San Francisco
in response to the first question. If the same respondent lists a store that
has a location only in Los Angeles in the second question, an
error is indicated. Either the respondent failed to list Los
Angeles in the first question or listed an erroneous store in the
second question. These answers are obviously inconsistent.
Taking Action When Response Is Obviously An Error
Responses should be logically consistent, but the
researcher should not jump to the conclusion that a
change should be made at the first sight of an
inconsistency. In all but the most obvious situations, a
change should only be made when multiple pieces of
evidence exist that some response is in error and when
the likely true response is obvious.
The editor may check other responses to make sure that
the screening question was answered accurately. For
instance, if the respondent left the question about home
value unanswered, then the editor will be confident that
the person truly does not own a home. In cases like this,
the editor should adjust these answers by considering all
answers to the irrelevant questions as “no response” or
“not applicable.”
Editing for Completeness
In some cases the respondent may have answered only
the second portion of a two-part question. The following
question creates a situation in which an in-house editor
may have to adjust answers for completeness:
Does your organization have more than one computer
network server?
# Yes # No
If yes, how many? ____
If the respondent checked neither yes nor no but indicated
three computer installations, the editor should change the first
response to a “Yes” as long as other information doesn’t
indicate otherwise.
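This completeness edit can be written down as an explicit decision rule. The following is a minimal sketch, assuming the first part is stored as `"yes"`, `"no"`, or `None` when left blank; the function name `edit_server_item` is hypothetical, and the check for contradicting information elsewhere on the form is left to the editor.

```python
from typing import Optional


def edit_server_item(has_multiple: Optional[str], how_many: Optional[int]) -> Optional[str]:
    """Fill in a blank first part only when the count clearly implies 'yes'."""
    if has_multiple is None and how_many is not None and how_many > 1:
        return "yes"
    return has_multiple  # leave answered or undecidable items unchanged


print(edit_server_item(None, 3))
```

The rule never overwrites a response the respondent actually gave, which limits the editor's influence on the data.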
Item Non-response
The technical term for an unanswered question on an
otherwise complete questionnaire resulting in missing
data.
Plug Value
An answer that an editor “plugs in” to replace blanks or
missing values so as to permit data analysis; choice of
value is based on a predetermined decision rule.
The decision rule may be to plug in an average or neutral
value in each instance of missing data. Several choices are
available:
1. Leave the response blank. Because the question is so
important, the risk of creating error by plugging a value is
too great.
2. Plug in alternate choices for missing data (“yes” the first
time, “no” the second time, “yes” the third time, and so
forth).
3. Randomly select an answer. The editor may flip a coin
with heads for “yes” and tails for “no.”
4. The editor can impute a missing value based on the
respondent’s choices to other questions. Many different
techniques exist for imputing data. Some involve complex
statistical estimation approaches that use the available
information to forecast a best guess for the missing
response.
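Three of these decision rules can be sketched in a few lines. The code below is illustrative only: the function names are hypothetical, missing answers are assumed to be stored as `None`, and `plug_mean` shows only the simplest average-value rule, not the statistical imputation methods mentioned in item 4.

```python
import random
from itertools import cycle
from typing import List, Optional, Sequence


def plug_alternating(values: Sequence[Optional[str]],
                     choices: Sequence[str] = ("yes", "no")) -> List[str]:
    """Rule 2: fill blanks with alternating choices (yes, no, yes, ...)."""
    fill = cycle(choices)
    return [next(fill) if v is None else v for v in values]


def plug_random(values: Sequence[Optional[str]],
                choices: Sequence[str] = ("yes", "no"),
                rng: Optional[random.Random] = None) -> List[str]:
    """Rule 3: fill each blank with a randomly selected choice (a coin flip)."""
    rng = rng or random.Random()
    return [rng.choice(choices) if v is None else v for v in values]


def plug_mean(values: Sequence[Optional[float]]) -> List[float]:
    """Average-value rule: fill blanks with the mean of the observed answers."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


answers = ["yes", None, "no", None, None]
print(plug_alternating(answers))
print(plug_mean([1, None, 5]))
```

Rule 1, leaving the response blank, needs no code: the `None` simply stays in the data file and the case is excluded from analyses of that item.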
Editing Questions Answered Out of Order
Another task an editor may face is rearranging the answers
given to open-ended questions such as may occur in a
focus group interview. The respondent may have provided
the answer to a subsequent question in his or her
comments to an earlier open-ended question. Because the
respondent already had clearly identified the answer, the
interviewer may not have asked the subsequent question.
Facilitating the Coding Process
While all of the previously described editing activities will
help coders, several editing procedures are designed
specifically to simplify the coding process. For example,
the editor should check written responses for any stray
marks. Respondents are often asked to circle responses.
Sometimes, a respondent may accidentally draw a circle
that overlaps two numbers. For example, the circle may
include both 3 and 4. The editor may be able to decide
which number is the most accurate response and indicate
that on the form.
Editing And Tabulating “Don’t Know” Answers
In many situations, respondents answer “don’t know.” A
legitimate “don’t know” response is the same as “no
opinion.” However, there may be reasons for this response
other than the legitimate “don’t know.” A reluctant “don’t
know” is given when the respondent simply does not want
to answer a question. For example, asking an individual
who is not the head of the household about family income
may elicit a “don’t know” answer meaning, “This is
personal, and I really do not want to answer the question.”
Pitfalls of Editing
Subjectivity can enter into the editing process. Data
editors should be intelligent, experienced, and objective. A
systematic procedure for assessing the questionnaires
should be developed by the research analyst so that the
editor has clearly defined decision rules to follow. Any
inferences such as imputing missing values should be done
in a manner that limits the chance for the data editor’s
subjectivity to influence the response.
Coding
Editing may be differentiated from coding, which is the
assignment of numerical scores or classifying symbols to
previously edited data. Careful editing makes the coding
job easier. Codes are meant to represent the meaning in
the data.
Assigning numerical symbols permits the transfer of data
from questionnaires or interview forms to a computer.
Codes often, but not always, are numerical symbols.
However, they are more broadly defined as rules for
interpreting, classifying, and recording data. In qualitative
research, numbers are seldom used for codes.
Coding Qualitative Responses
Unstructured Qualitative Responses (Long Interviews)
In qualitative research, the codes are usually words or
phrases that represent themes. A qualitative researcher
applies a code to a text that describes in detail a
respondent’s reactions.
Structured Qualitative Responses
Qualitative responses to structured questions such as
“yes” or “no” can be stored in a data file with letters such
as “Y” or “N.” Alternatively, they can be represented with
numbers, one each to represent the respective category.
So, the number 1 can be used to represent “yes” and 2
can be used to represent “no.”
Coding Qualitative Responses
Structured Qualitative Responses
The researcher may consider adopting dummy coding for
dichotomous responses like yes or no. Dummy coding
assigns a 0 to one category and a 1 to the other. Dummy
coding provides the researcher with more flexibility in
how structured, qualitative responses are analyzed
statistically.
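The two coding schemes for a yes/no question can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the response list and variable names are hypothetical, and which category receives the 1 is an assumption. One practical consequence of dummy coding is that the mean of the 0/1 column equals the proportion of "yes" answers.

```python
# Hypothetical raw answers to a yes/no question.
responses = ["Y", "N", "Y", "Y", "N"]

# Scheme 1: arbitrary numeric category codes (1 = yes, 2 = no).
category_codes = {"Y": 1, "N": 2}
coded = [category_codes[r] for r in responses]

# Scheme 2: dummy coding (1 = yes, 0 = no); the assignment is an assumption.
dummy_codes = {"Y": 1, "N": 0}
dummy = [dummy_codes[r] for r in responses]

# With dummy coding, the mean is the share of "yes" answers.
share_yes = sum(dummy) / len(dummy)
print(coded, dummy, share_yes)
```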
Data File Terminology
• Field: A collection of characters that represents a single
type of data, usually a variable.
• String Characters: Computer terminology to represent
formatting a variable using a series of alphabetic
characters that may form a word.
• Record: A collection of related fields that represents the
responses from one sampling unit.
• Data File: The way a data set is stored electronically in
spreadsheet-like form in which the rows represent
sampling units and the columns represent variables.
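The terms above can be illustrated with a small Python sketch. The file contents, column names, and values are hypothetical; the sketch only shows how records (rows) and fields (columns) map onto a data file, and that a string field is stored as text.

```python
import csv
import io

# Hypothetical data file: each row is a record (one sampling unit),
# each column is a field (one variable).
raw = """resp_id,sex,shops_at_store,occupation
1,F,1,teacher
2,M,0,salesperson
"""

records = list(csv.DictReader(io.StringIO(raw)))

first = records[0]
fields = list(first.keys())

# The csv module reads every field as a string; numeric fields
# must be converted explicitly before analysis.
shops = int(first["shops_at_store"])
print(len(records), fields, first["occupation"], shops)
```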
Precoding Fixed-Alternative Questions
84
Processing of Research Data
Coding Open-Ended Questions
Surveys that are largely structured will sometimes contain
some semi-structured open-ended questions. These
questions may be exploratory or they may be potential
follow-ups to structured questions. The purpose of
coding such questions is to reduce the large number of
individual responses to a few general categories of
answers that can be assigned numerical codes.
Coding Open-Ended Questions
A consumer survey about frozen food also asked why a
new microwaveable product would not be purchased:
• We don’t buy frozen food very often.
• I like to prepare fresh food.
• Frozen foods are not as tasty as fresh foods.
• I don’t like that freezer taste.
All of these answers could be categorized under “dislike
frozen foods” and assigned the code 1. Code construction
in these situations reflects the judgment of the
researcher.
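A rough Python sketch of this kind of category coding is shown below. The keyword list, the catch-all code 9, and the fifth answer are hypothetical additions for illustration; in practice the rules come from the researcher's judgment, and simple keyword matching is only a stand-in for it.

```python
# Hypothetical codebook: code 1 = "dislike frozen foods".
# Keyword matching stands in for the coder's judgment.
KEYWORDS_FOR_CODE_1 = ("frozen", "fresh", "freezer")
OTHER = 9  # assumed catch-all code for answers that fit no category


def code_response(text):
    """Return the numeric code for one open-ended answer."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in KEYWORDS_FOR_CODE_1):
        return 1
    return OTHER


answers = [
    "We don't buy frozen food very often.",
    "I like to prepare fresh food.",
    "Frozen foods are not as tasty as fresh foods.",
    "I don't like that freezer taste.",
    "It costs too much.",  # hypothetical answer outside the category
]
codes = [code_response(a) for a in answers]
print(codes)
```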
Coding Open-Ended Questions
Test tabulation is the tallying of a small sample of the
total number of replies to a particular question. The
purpose is to preliminarily identify the stability and
distribution of answers that will determine a coding
scheme.
Editing and Coding Combined
Frequently the person coding the questionnaire performs
certain editing functions, such as translating an
occupational title provided by the respondent into a code
for socioeconomic status. A question that asks for a
description of the job or business often is used to ensure
that there will be no problem in classifying the responses.
For example, respondents who indicate “salesperson” as their
occupation might write their job description as “selling shoes in a
shoe store” or “selling IBM supercomputers to the defense
department.” Generally, coders are instructed to perform this type
of editing function, seeking the help of a tabulation supervisor if
questions arise.
Tabulation
Tabulation refers to the orderly arrangement of data in a
table or other summary format. When this tabulation
process is done by hand, the term tallying is used.
Counting the different ways respondents answered a
question and arranging them in a simple tabular form
yields a frequency table. The actual number of responses
to each category is a variable’s frequency distribution. A
simple tabulation of this type is sometimes called a
marginal tabulation.
Tabulation
Simple tabulation tells the researcher how frequently
each response occurs. This starting point for analysis
requires the researcher to count responses or
observations for each category or code assigned to a
variable. A frequency table showing where consumers
generally purchase chocolate can be computed easily.
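A simple tabulation of this kind might be sketched in Python as follows. The purchase locations and counts are invented for illustration only; they are not survey results.

```python
from collections import Counter

# Hypothetical coded answers to "Where do you generally purchase chocolate?"
codes = ["grocery", "grocery", "convenience", "online", "grocery", "convenience"]

# Count how often each category occurs (the frequency distribution).
frequency = Counter(codes)
total = sum(frequency.values())

# Print a frequency table with counts and percentages.
for category, count in frequency.most_common():
    print(f"{category:<12}{count:>3}{100 * count / total:>7.1f}%")
```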
Cross-Tabulation
Cross-tabulation is the appropriate technique for
addressing research questions involving relationships
among multiple less-than interval variables. We can think
of a cross-tabulation as a combined frequency table.
Cross-tabs allow the inspection and comparison of
differences among groups based on nominal or ordinal
categories. One key to interpreting a cross-tabulation
table is comparing the observed table values with
hypothetical values that would result from pure chance.
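The comparison with chance values can be sketched as follows. The values expected by pure chance are usually taken to be the counts implied by the row and column totals alone, that is, (row total × column total) / grand total for each cell. The observed counts below are hypothetical.

```python
# Hypothetical observed counts: rows = sex (M, F), columns = answer (yes, no).
observed = [[30, 20],
            [10, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell if the two variables were unrelated.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for obs_row, exp_row in zip(observed, expected):
    print(obs_row, exp_row)
```

Large gaps between the observed and expected counts suggest that the two variables are related.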
Contingency Tables
A contingency table is a data matrix that displays the
frequency of some combination of possible responses
to multiple variables. Two-way contingency tables,
meaning they involve two less-than interval variables,
are used most often. A three-way contingency table
involves three less-than interval variables. Beyond three
variables, contingency tables become difficult to analyze
and explain easily. For all practical purposes, a
contingency table is the same as a cross-tabulation.
Contingency Tables
Two variables are depicted in the contingency table
shown in panel A:
• Row Variable: Biological Sex _____M _____F
• Column Variable: “Do you shop at Target? YES or NO”
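A two-way table with these row and column variables could be built from raw records roughly as follows. The six records are hypothetical and serve only to show the mechanics of counting each combination.

```python
from collections import Counter

# Hypothetical records: (sex, answer to "Do you shop at Target?").
records = [("M", "YES"), ("M", "NO"), ("F", "YES"),
           ("F", "YES"), ("M", "NO"), ("F", "NO")]

# Count each (row level, column level) combination.
cell_counts = Counter(records)

rows, cols = ["M", "F"], ["YES", "NO"]
table = [[cell_counts[(r, c)] for c in cols] for r in rows]

for label, row in zip(rows, table):
    print(label, row)
```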
Contingency Tables
A two-way contingency table like the one shown in panel
A is referred to as a 2 × 2 table because it has two rows
and two columns. Each variable has two levels. A two-
way contingency table displaying two variables, one (the
row variable) with three levels and the other with four
levels, would be referred to as a 3 × 4 table. Any cross-
tabulation table may be classified according to the
number of rows by the number of columns (R by C).
Percentage Cross-Tabulations
When data from a survey are cross-tabulated,
percentages help the researcher understand the nature
of the relationship by making relative comparisons
simpler. The total number of respondents or
observations may be used as a statistical base for
computing the percentage in each cell.
Percentage Cross-Tabulations
When the objective of the research is to identify a
relationship between answers to two questions (or two
variables), one of the questions is commonly chosen to
be the source of the base for determining percentages.
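The effect of the chosen base can be sketched in Python. The counts are hypothetical, and using the row variable as the base is an assumption made for illustration; the same cell yields a different percentage depending on whether the base is the row total or the total number of respondents.

```python
# Hypothetical counts: rows = sex, columns = answer to a yes/no question.
observed = {"M": {"YES": 30, "NO": 20},
            "F": {"YES": 10, "NO": 40}}

# Row percentages: each row total is the base.
row_pct = {}
for sex, counts in observed.items():
    base = sum(counts.values())
    row_pct[sex] = {ans: 100 * k / base for ans, k in counts.items()}

# Total percentages: the grand total is the base.
grand_total = sum(sum(counts.values()) for counts in observed.values())
total_pct = {sex: {ans: 100 * k / grand_total for ans, k in counts.items()}
             for sex, counts in observed.items()}

print(row_pct["M"]["YES"], total_pct["M"]["YES"])
```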