A DISSERTATION SUBMITTED TO
THE FACULTY OF THE DIVISION OF THE SOCIAL SCIENCES
IN CANDIDACY FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
DEPARTMENT OF SOCIOLOGY
BY
CLIFFORD ALEXANDER YOUNG
CHICAGO, ILLINOIS
DECEMBER 2001
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
UMI Number: 3029551
TABLE OF CONTENTS
LIST OF FIGURES ........................................................... iv
LIST OF TABLES ............................................................ v
ACKNOWLEDGEMENTS .......................................................... vi
ABSTRACT .................................................................. vii
6.0 An Analysis of the Age and Civic Participation Effects ................ 99
6.1 Discussion of the Measures ............................................ 103
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER ONE
INTRODUCTION OF THESIS
Quantitative sociology over the last half century has made increasing use of high-quality
surveys. There is no better indicator of the central role of the sample survey
in quantitative sociology than the General Social Survey, which to date has trained
several generations of sociologists; won tenure for scores of young professors; and
One way in which sociological concepts can be used to improve data quality
is by explaining why certain respondents are more likely to answer survey questions
more thorough understanding of item missing data can help guide analysts in how to
deal with it in analysis. In pursuit of this, I attempt to answer four questions in this
thesis:
(1) Are respondents who are more likely to answer survey questions
different from those who are less likely?
(4) Can general principles be derived, so that missing data will not have to
be dealt with on a case by case basis?
1.1.1 Definitions
SAS, and STATA, the columns of the data matrix represent variables (both discrete
and continuous) and the rows represent the unit of analysis (e.g., individuals,
companies, households, etc.). Every data analyst has at one time or another had to
confront the problem of missing data. Missing data refers to when either some or all
of the values in the data matrix are not observed for a given respondent (Little and
Rubin 1987).
There are two forms of missing data: unit and item nonresponse. Unit
nonresponse refers to when “...units in the selected sample and eligible for the
unusable" (Madow and Olkin 1983). Item nonresponse refers to when “[eligible
units in a selected sample provide some, but not all, of the required information or
the information for some items is unusable” (Madow and Olkin 1983).
Item nonresponse can be further broken down into two sub-dimensions: (1)
process item nonresponse and (2) interview item nonresponse. By process item
nonresponse, I refer to item missing data resulting from problems with pre-survey
with post-processing including editing, coding, and data capture (see Smith 1993
arises from problems with the survey production process and, therefore, can be
(CAI) has made it especially simple for survey researchers to minimize process item
from the social, psychological, and cognitive dynamics of the interview. Many
including question form, order, and context; interviewer behavior; and mode of
nonresponse. Therefore, from this point on, when referring to item nonresponse, I
data from their analysis. By ignoring missing data however, analysts must confront
two potential problems: (1) increased sampling error and (2) bias. Let us further
First, added sampling error refers to larger standard errors (wider confidence
intervals) and, hence, less precise estimates. In brief, by excluding cases with
missing data, we reduce the effective sample size of our analysis, which, in turn,
Note the standard error is inversely related to the square root of the sample
size (see equation 1 below). Thus, as the effective sample size decreases, the
standard error increases.
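This relationship can be made concrete with a minimal sketch (Python; not part of the original thesis, and the standard deviation and sample sizes are hypothetical), showing how listwise deletion inflates the standard error of the mean via equation 1:

```python
import math

def standard_error(s, n):
    """Standard error of the mean, s / sqrt(n) (equation 1)."""
    return s / math.sqrt(n)

# Hypothetical survey: standard deviation of 15 on some scale.
se_full = standard_error(15, 1500)     # full sample of 1,500 cases
se_reduced = standard_error(15, 900)   # after listwise deletion drops 600 cases

print(round(se_full, 3))     # 0.387
print(round(se_reduced, 3))  # 0.5 -- a wider confidence interval
```

Cutting the effective sample size from 1,500 to 900 raises the standard error by roughly 29 percent, with no change in the underlying data.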
Second, bias refers to a systematic deviation between the sample mean and
the population mean (see equation 2a below) where the deviation is constant across
infinite replications of a survey (Groves 1989). Simply put, this means that, if an
analyst wants to determine the average US household income and decides to conduct
the study via the internet, he/she will overestimate average income independently of
sample size, given that Americans with internet access in their homes are more
the size of the item nonresponse subgroup and (2) the difference between the mean
for the subgroup giving a substantive response and the mean for the item
Bias(y) = Wnr (ysr − ynr)                                               (2b)

where
    Wnr = proportion of the item nonresponse subgroup
    ysr = mean for the subgroup giving a substantive response
    ynr = mean for the item nonresponse subgroup
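As a numerical check on equation 2b (a Python sketch; the income figures are hypothetical, not from the thesis):

```python
def item_nonresponse_bias(w_nr, mean_sr, mean_nr):
    """Bias of the sample mean under item nonresponse (equation 2b):
    Bias(y) = Wnr * (ysr - ynr)."""
    return w_nr * (mean_sr - mean_nr)

# Suppose 20% of respondents withhold their income, and those who withhold
# it earn $15,000 less on average than those who report it.
bias = item_nonresponse_bias(w_nr=0.20, mean_sr=55_000, mean_nr=40_000)
print(bias)  # 3000.0 -- the respondent-only mean overstates income by $3,000
```

Either a larger nonresponse subgroup (Wnr) or a larger gap between the two subgroup means drives the bias up, which is exactly the two-factor decomposition the text describes.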
Equation 2b suggests that, when the item nonresponse subgroup is (1) large
relative to the subgroup giving a substantive response and/or (2) quite different from
the subgroup giving a substantive response, the sample mean will deviate
substantially from the population mean. In such cases, analysts are likely to draw
designing questions that minimize item nonresponse and/or (2) by replacing the item
missing data with imputed values. I will briefly discuss how survey researchers use
reducing social barriers (e.g., privacy concerns) and/or (2) by reducing cognitive
barriers (e.g., the use of difficult words) (Young 1999b; 1999c; 1999d). The three
respondents for their income within a given range, or interval, (e.g., $25,000-35,000)
rather than ask for their exact income. This strategy has been shown to reduce
missing data (e.g., Sudman and Bradburn 1982). The methods literature, in turn,
speculates that respondents are more likely to give a substantive response on closed-
1982).
such as the GSS, will typically not offer an explicit DK category. Instead, the
volunteers it (Davis and Smith 1996). In support of this strategy, extensive research
shows that the demands of the question confine respondent behavior (e.g., Sudman
and Bradburn 1973; Schuman and Presser 1981; Bishop et al. 1980; Krosnick and
Fabrigar 1997). Respondents, therefore, will not typically provide answers that are
not offered as explicit options. This same research, however, is not conclusive on
Third, on pre-election polls which ask respondents their intention to vote for
a given candidate, survey organizations use secret ballots to minimize the rate of
1 While the name, secret ballot, suggests a highly sophisticated technique, the method
is nothing more than a self-administered questionnaire.
undecideds (Perry 1979; Traugott and Tucker 1984). The secret ballot method is
concern that their choice of candidate will be socially censured by the interviewer
minimize item nonresponse, this effort has been directed most often in an ad hoc
manner, varying considerably from case to case. The lack of any general guiding
principles is most evident when analyzing the primary questionnaire design primers
in the field (e.g., Dillman 1978; Dillman 2000; Sudman and Bradburn 1982; Sudman
et al. 1996). None of them has a chapter (or even a portion of a chapter) devoted
solely to item nonresponse. This suggests that the survey methods literature is in
data. Item nonresponse has been dealt with using statistical imputation models that
first predict missing data values based upon a respondent’s known characteristics
and then replace the missing data with predicted values. A wide variety of data
imputation models are presently used ranging from those that simply replace missing
values on a given variable with the grand mean of that variable (grand mean
imputation) to highly sophisticated multivariate imputation models that account for
both within and between imputation variance (multiple iterative imputation).2
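Grand-mean imputation, the simplest of the models just mentioned, can be sketched in a few lines (Python; the data are hypothetical). It preserves the variable's mean but shrinks its variance, which is one reason the more sophisticated multivariate models were developed:

```python
def grand_mean_impute(values):
    """Replace missing values (None) with the grand mean of the observed values."""
    observed = [v for v in values if v is not None]
    grand_mean = sum(observed) / len(observed)
    return [grand_mean if v is None else v for v in values]

incomes = [30_000, None, 50_000, 40_000, None]
print(grand_mean_impute(incomes))
# [30000, 40000.0, 50000, 40000, 40000.0] -- both gaps filled with the mean
```

Every missing case receives the identical value, so the imputed variable's spread is artificially compressed relative to the true distribution.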
different assumptions must be made which, in turn, affect the appropriateness of the
(Rubin 1976; Little and Rubin 1987). The least restrictive assumption in the data
In the case of MCAR (see equation 3 above), the probability that we observe
V). The missing data mechanism (m) is proportionately distributed across levels of
2 For a further discussion of the data imputation literature see Andersen et al. 1983;
Little and Rubin 1987; Lessler and Kalsbeek 1992. Survey statisticians have put enormous effort
into the development of statistically valid and reliable imputation models over the last twenty
years (1979-1999). Indeed, over this period, the number of articles on data imputation appearing
per year in JASA (Journal of the American Statistical Association)—the premier statistics journal—
has increased fivefold (my own research). Research into data imputation has also become the
“hot” topic, replacing the more traditional areas in statistics related to survey methods, such as
survey sampling (Groves 1996).
MCAR is the implicit assumption made when analysts listwise delete
nonrespondents tend to be different from respondents (e.g., Ferber 1966; Francis and
Using the income example again, this means that respondents are
In the still more restrictive case of multivariate MAR, the probability that we
and W (see equation 4b above). This means that respondents who provide an
income answer are systematically different from those who do not—more educated
and older for instance. However, when controlling for education and age,
adjustments for item missing data make this more restrictive multivariate MAR
assumption.
characteristic U. Simply put, the probability that respondents report income depends
on how much they make—the higher the income, the less likely to provide a
response.
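The missing-data mechanisms discussed above can be contrasted in a small simulation (a Python sketch; the income model, the education covariate, and the missingness probabilities are illustrative assumptions, not figures from the thesis). Under MCAR the observed mean stays close to the true mean; under the income-dependent (nonignorable) mechanism the observed mean understates it, because high earners are disproportionately missing:

```python
import random

random.seed(0)

def mean(xs):
    return sum(xs) / len(xs)

def simulate(n=20_000):
    """Simulate incomes, then delete values under three mechanisms:
    MCAR - missingness unrelated to anything observed or unobserved;
    MAR  - missingness depends only on an observed covariate (education);
    NMAR - missingness depends on income itself (the nonignorable case)."""
    pop, mcar, mar, nmar = [], [], [], []
    for _ in range(n):
        education = random.choice([10, 12, 16])   # years, observed covariate
        income = 20_000 + 3_000 * education + random.gauss(0, 5_000)
        pop.append(income)
        if random.random() >= 0.20:                    # flat 20% chance missing
            mcar.append(income)
        if random.random() >= 0.02 * (20 - education):  # more educated -> more likely to answer
            mar.append(income)
        if random.random() >= min(0.9, income / 150_000):  # higher income -> more likely missing
            nmar.append(income)
    return mean(pop), mean(mcar), mean(mar), mean(nmar)

true_mean, mcar_mean, mar_mean, nmar_mean = simulate()
print(round(true_mean), round(mcar_mean), round(nmar_mean))
```

Listwise deletion is harmless only in the MCAR column; in the MAR column the gap can be repaired by conditioning on education, while in the NMAR column no observed variable fully accounts for the missingness.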
consideration both the observed characteristic education and the unobserved
nonrespondents.
for the proper estimation of imputed values. Data imputation, however, is typically
has not examined in any depth the underlying social and cognitive processes
Perhaps the lack of effort on the proper specification of missing data models
cognitive processes of the survey interview, falling under the natural purview of
sociology and cognitive psychology not statistics. Does this kind of research
produce results?
The short answer is yes it does. Indeed, research into the specific processes
that produce missing data has been shown to improve imputation models.
Quantitative sociology has been extremely slow in incorporating data imputation into
its repertoire of methods. Several factors might explain the present situation of data imputation
in sociology. First, only recently has user-friendly imputation software become available either
as stand-alone packages (e.g., SOLAS) or as integrated options to standard statistical packages
(e.g., Missing Data Analysis in SPSS). Second, few advanced courses in quantitative methods
offer even an overview of data imputation techniques. When courses do examine these issues,
they only review a very small subset of techniques from the econometrics literature that deal with
missing data on the dependent variable—commonly referred to as sample selection models (e.g.,
Berk 1983; Stolzenberg and Relles 1997; Heckman 1976; 1979).
Mathiowetz (1998), for instance, demonstrates that including respondents’
item missing data on an ad hoc, case-by-case basis. No general guiding principles
presently exist about how to effectively confront the problem of item missing data.
First, I choose to study the DK responses on attitude items because they have
received considerable attention from both the behavioral and social sciences (e.g.,
cognitive psychology, political science, and sociology). This literature on DK can
provide theoretical as well as empirical insights into the more general issues of item
missing data.
effects has extensively examined the phenomenon (e.g., Sudman and Bradburn
1974; Schuman and Presser 1981; Bradburn 1983; Tourangeau and Rasinski 1988;
Sudman, Bradburn, and Schwarz 1996). This literature has a well-established socio-
cognitive framework, which may provide insights into the phenomenon of item
nonresponse.
studies can examine whether the conclusions drawn from my analysis of DK hold in
for data analysts who lack clear guidelines about how to treat DK responses in
analysis. Indeed, all analysts have asked at one time or another—should I delete DK
1.2.2 Criticism
respondent’s true underlying value and, therefore, does not qualify as item
nonresponse. As defined on page three (3) of this thesis, item nonresponse includes
those responses that are unusable in substantive analysis. I contend that in the great
majority of cases DK responses are unusable in analysis because, in practice, it is
choose “off-scale” response categories, like don’t know, for a variety of reasons
and Coombs 1977; Smith 1984; Feick 1989; Krosnick and Fabrigar 1997;
Considering this qualification, I will examine the issue in more detail in the
1.3 Organization of Thesis
Excluding this introduction, I divide the thesis into 7 separate chapters.
for the remainder of the thesis. In chapter 5, I analyze and attempt to explain one
two predictors (civic participation and age) are correlated with DK; and, in
chapter 7, I determine which multivariate model best fits the data. Finally, in
chapter 8, I conclude the thesis by discussing the general results and how they
CHAPTER TWO
The research on DK can be broadly broken down into three strains. The first
question wording) (Coombs and Coombs 1977; Smith 1984; Feick 1989; Krosnick
that questionnaire designers have substantial control over the rate of DK responses,
instance, this research shows that placing an explicit DK option in the response scale
will significantly increase the rate of DK responses (e.g., Schuman and Presser
1981). This literature has also examined whether maximizing (or minimizing) DK
responses improves data quality (e.g., Schuman and Presser 1981; Krosnick and
The third strain examines individual correlates of DK. This research finds
from the characteristics of those who give substantive responses. Quickly summing
up this literature, respondents who provide a DK answer, on average, are less
educated, older, female, black, less politically active, and less knowledgeable (e.g.,
Gergen and Back 1966; Glenn 1969; Sudman and Bradburn 1974; Converse 1977;
In this chapter, I examine these three primary strains of research found in the
literature on DK responses. To simplify this task, I break the chapter down into four
sections. In section 2.1, I examine what the DK response means to both researchers
and respondents. In section 2.2, I detail the research on question level correlates of
DK responses. In section 2.3, I discuss the link between DK responses and data
Finally, in section 2.5, I examine how the literature conceptualizes DK. Specifically,
are DKs substantive responses (i.e., true underlying value)? Are they missing data
response types?
Both researchers and respondents mean many things when saying DK. In
answer two questions: (1) what do researchers mean when they say DK? and (2)
things to the respondent. Extensive research shows that respondents say DK for a
variety of reasons (e.g., Coombs and Coombs 1977; Smith 1984; Feick 1989;
Krosnick and Fabrigar 1997; O’Muircheartaigh et al. 1999).1 These reasons include:
1 Research also shows that different DK subgroups often have very different demographic
and attitudinal profiles. Several studies demonstrate, for instance, that respondents with ambivalent or
neutral attitudes who choose a DK option (1) are more educated and knowledgeable; (2) have higher
cognitive abilities; (3) are younger; and (4) are more likely to be male than other respondents who choose DK
(Coombs and Coombs 1977; Faulkenberry and Mason 1978; Smith 1984; O’Muircheartaigh et al. 1999).
(4) not understanding the question because of poor wording (item
ambiguity) and;
Furthermore, this same research indicates that, even on the same question,
sets of survey items from 4 different studies, which decompose the DK category.
composition. Specifically:
Table 1. Decomposition of the DK Category

Study                          Data Source                                 Item Set                      % Ambivalent/Neutral
[illegible]                    Survey of adults 20-40 years of age, 1972   [illegible]                   75.5
Faulkenberry and Mason (1978)  Study of American Adults, 1975              1 item on wind energy         45.2
                               (interviewer coded)
Smith (1984)                   SRC American Election Study, 1956           15 political attitude items   65.5
Young (1999c)                  GSS 1975-1998                               6 abortion items              73.4
ambivalent/neutral attitude (note I took the average of the 7 item sets). Simply put,
further research is needed to confirm this conclusion. Second, Table 1 also indicates
that there exists considerable variation among subsets of items. Indeed, on average,
74.5 percent ([75.5 + 73.4]/2) of the respondents who answered DK to one of the
two abortion scales had an ambivalent/neutral attitude, while only 46.5 percent
might explain why abortion questions produce higher ambivalent attitude rates than
political questions?
First, abortion is a topic that most individuals probably have considered at
one time or another. Second, abortion is the kind of issue that taps underlying,
perhaps even immutable, beliefs and social roles. Respondents, therefore, may never
have thought much about the issue of abortion but still may use well-defined social
roles and beliefs to impute substantive answers. Respondents may arrive at answers
whether (or not) the respondent has well-defined social roles. On questions like
abortion, respondents who say I don’t know probably mean I am neither for nor
this:
respondents may not have well-defined belief structures, which closely correspond
to the question topic (e.g., questions concerning specific policies, like NAFTA).
Indeed, for many respondents, the survey interview may be the first contact they
have had with specific socio-political subjects. On such topics, we should expect
who answer DK many times do have positive or negative leanings toward the given
issue (Schuman and Presser 1981; Gilljam and Granberg 1993). So why, then, do
to attitude questions are not always direct reflections of underlying beliefs and social
roles (Sudman et al. 1996; Krosnick 1991; Tourangeau et al. 2000). Specifically,
mental file marked politics; open it up; and select the answer. Instead, a variety of
respondent cognitive ability) affect the cognitive processing of survey questions and,
in turn, responses to them (Sudman et al. 1996; Krosnick 1991; Tourangeau et al.
2000; Schuman and Presser 1981). Considering this, respondents might arrive at a
Of course, even in the case of the NAFTA question, some respondents may
have well-defined beliefs and social roles, which they may use to arrive at an
Summing up the above discussion, respondents mean many things when they
• Editing Socially Undesirable Answers: I really don’t like
homosexuals... But I shouldn’t say this openly... it is not
politically correct... I Don’t Know
The important point here is that respondents who say DK do not only mean I
have no idea. Indeed, analysis above has shown that, on average, over half of the
question characteristics can affect the rate of DK for a given question (e.g., Schuman
and Presser 1981). For instance, DK rates are higher when questions include a DK
option in the response scale than when they do not. In this section, I attempt to
(1) Question Content (More Specialized Knowledge, Less
Specialized Knowledge)
(2) Question Concept (Well-Defined, Poorly Defined)
(3) Number of Response Options (More, Less)
(4) Middle Option (Included, Not Included)
(5) DK Option (Included, Not Included)
(6) Probing of DK Response (Probe DK, Do Not Probe DK)
(7) DK Option (Included in Question Stem, Included in Response Scale)
(8) Wording of DK Option in the Question Stem (More Restrictive,
Less Restrictive)
(9) Wording of DK Option in the Response Scale (More Restrictive,
Less Restrictive)
Research shows that DK rates are higher for questions which address topics
that require very specialized knowledge and/or are very distant from a respondent’s
everyday life than for questions that require general knowledge and/or are very
1999d). Specifically, topics, such as politics, foreign policy, and economics, which
wording produce higher rates of DK than questions with clear concepts and wording
satisfaction study of Brazilian banks, Young (2000) found that respondents were 5
times more likely to ask the interviewer to clarify question wording on items with
high rates of DK (25% or more) than on items with low rates of DK (3% or less).
These empirical findings, however, are nothing new. Indeed, one of the key
The questionnaire design literature, in turn, gives ample treatment on how to best
DK rates are higher on questions with polar response scales such as yes/no,
reflect respondent opinion. This research, however, has never been replicated.
Research shows that DK rates are higher on questions which possess true
mid-points but do not offer them as a response option than on questions which offer
mid-points (O’Muircheartaigh et al. 1999). What might explain this mid-point
effect?
One answer may be that respondents use “the next best answer strategy”—
where if the first choice is not offered, respondents opt for the next best answer. The
Interviewer (Question): “Do you agree or disagree with the
following statement? Bush will be a better President than
Clinton.”
Interviewer: OK... Thanks
The above scenario finds empirical support in two different strains in the
in the last section) shows that many DK responses represent ambivalent attitudes,
interviewing techniques has established probes, such as one cited above, to guide
respondents who complain that their answer is not offered in the response scale
2.2.1.5 Response Scale: DK Option
The literature argues that the demands of the question confine respondent
behavior (e.g., Sudman and Bradburn 1974; Schuman and Presser 1981; Bishop et
al. 1980; Krosnick and Fabrigar 1997). In other words, the written question itself
communicates to the respondent which answers are legitimate and which are not
legitimate. Potentially legitimate responses are those included in the response scale.
Thus, in cases where the DK option is included in the response scale, DK responses
are lower than when they do not probe DK-like responses (Sanchez and Morchio
the DK option in the response scale. Instead, interviewers must record a DK answer
only after the respondent expresses a DK-like response. How is this done?
many of the large commercial and academic survey institutes specify very similar
(Fowler and Mangione 1990). First, training manuals teach interviewers that the
DK response may represent many things: (1) ignorance; (2) a pause in thought; (3)
Does this sort of probing induce respondents to offer opinions when they
really do not have an opinion? We really do not know at this point. Some research
questions (Sanchez and Morchio 1992). However, this research is not conclusive in
respondents who express a DK-like response actually have a stable leaning if asked
similar questions repeatedly (Gilljam and Granberg 1993). These findings
Research shows that respondents are more likely to choose the DK option
when it is included in the stem of the question than when it is offered as an option in
the response scale (Schuman and Presser 1981; Bishop et al. 1980). The research
is not definitive as to why this is the case. Some studies suggest that the DK filter
encourages respondents who have no attitude to select a DK option (Bishop et al.
1980). Other research indicates that such filters signal to respondents that the task
will be difficult, thus discouraging them from exerting the effort to come up with a
et al. 1983). For instance, “Do you have an opinion on this issue or not?” produces
fewer DK responses than “Have you been interested enough in this issue to favor
one side over the other?”, which produces slightly fewer DKs than either “Have you
thought much about this issue?” or “Have you already heard or seen enough about it
Krosnick and Fabrigar (1997) argue that “...the three latter filters make it
easier for respondents to admit that they have not considered the topic...and
therefore have no opinion on [the issue]” (p. 154). Put another way, the latter three
filters are less restrictive, making it easier for the respondent to say DK, while the
first filter is more restrictive, making it more difficult for the respondent to answer
DK.
not) a respondent will choose a DK response? For instance, are respondents more
likely to choose the DK option if the label is Don’t Know than if the label is Can’t
Choose? The short answer is probably yes but we really do not know conclusively
corresponding research, however, has examined whether respondents are more likely
to choose one specific type of DK labels over another (e.g., Don’t Know, Not Sure,
Can’t Choose, etc.). Even so, the ISSP (International Social Survey Programme)
uses a Can’t Choose option on its surveys, instead of Don’t Know, because ISSP
researchers believe that Can’t Choose is more restrictive, making it less likely for
2.2.1.10 Summary Remarks
can affect the rate of DK on any given question. How, then, should the researcher
Question A                                    Question B
Question Content (Subjective Well-Being)      Question Content (Subjective Well-Being)
Question Concept (Clear)                      Question Concept (Clear)
Many Response Options                         Many Response Options
Mid-Point Included                            Mid-Point Included
No Probe                                      No Probe
DK Option in Response Scale                   DK Option in Question Stem
More Restrictive DK Option                    More Restrictive DK Option
In the above case, the questionnaire designer knows that question type A will
produce lower DK rates than question type B, because DK rates are higher for
questions with the DK option in the question stem than the response scale.
However, not all cases are this simple. Indeed, the research on the association
between question characteristics and DK rates is far from complete, not having
simulations become problematic when more than a few factors are varied. The
Question A                                    Question B
Question Content (Subjective Well-Being)      Question Content (Subjective Well-Being)
Question Concept (Clear)                      Question Concept (Clear)
Few Response Options                          Many Response Options
Mid-Point Not Included                        Mid-Point Included
No Probe                                      No Probe
DK Option in Response Scale                   DK Option in Question Stem
More Restrictive DK Option                    More Restrictive DK Option
In the above case, the questionnaire designer cannot determine if question
type A will produce lower DK rates than question type B or vice-versa. Further
Before going into the specifics of the research on data quality, it is first
important to discuss the two main schools of thought on this issue—one which
argues that DKs should be maximized and the other which argues that they should be
due to extremely low information levels. Converse calls such uninformed opinions,
non-attitudes.
answers randomly at the flip of a coin. However, later research suggested that such
selecting the positive end of the response scale (Smith 1981; Schuman and Presser
1981; Taylor 1983; Brody 1986). Whatever the position on respondent response
holders have no attitude position in relationship to the topics covered on surveys;
minimize the cognitive burden of survey questions. Krosnick calls such sub-optimal
answering—satisficing.
actually exclude many respondents with real attitudes, undermining, rather than
• the research has found differences in inter-item correlations
among attitude questions with and without a DK (Schuman and
Presser 1981).
• the research has found that the inclusion of the middle category
improves reliability and validity (O’Muircheartaigh et al. 1999).
general rule, middle options should be included on questions which have a true mid
Indeed, the changes found in the univariate and multivariate distributions of attitude
items do not point toward which method (the inclusion or exclusion of the DK
option) produces more valid results (Schuman and Presser 1981; Bishop et al. 1980,
On pure sample size grounds, the inclusion of the DK option decreases effective
sample size, increasing the standard error of estimates. With issues of sample size in
DK actually have attitudes but may not express them for a variety of reasons,
response categories reflecting the respondent’s attitude (Gilljam and Granberg 1993;
Sudman et al. 1996; Krosnick 1991; Tourangeau et al. 2000; Schuman and Presser
1981).
While the research is mixed, the evidence points towards minimizing DK
1966; Francis and Busch 1975; Krosnick and Milburn 1990; 1999b; 1999c). In this
yet rarely defined. It, indeed, means different things to different researchers.
Here cognitive sophistication means the combination of three factors: (1)
knowledge, (2) information exposure, and (3) cognitive ability. The more
cognitively sophisticated are those individuals who are more knowledgeable about
survey topics; are more exposed to such topics; and are more able (cognitive ability)
to think through topics found on surveys. While these three factors are probably
distinct sub-dimensions, they are often grouped together under the umbrella of
cognitive sophistication because they are highly correlated (Young 1998c; 1999c).
cognitive sophistication. First, the cognitive sophistication effect may result from
demonstrate that the well-informed are less likely to answer DK (Converse 1964,
1970; Converse 1977; Faulkenberry and Mason 1978; Francis and Busch 1975;
Second, the cognitive sophistication effect may result from varying levels of
verbal ability. Research suggests that respondents with weaker verbal skills are
activities may explain variations in the rate of DK. This research shows that
respondents more involved in civic activities are less likely to answer DK (Francis
and Busch 1975; Faulkenberry and Mason 1978; Rapoport 1985; Young 1999c).
While far from conclusive, there are two possible reasons for the association
between civic participation and DK. First, civic participation may be a proxy for a
(Krosnick 1991; Young 1999c). The methods literature has called respondents with
a high propensity for such behavior—“good respondents”. Second, people who are
more likely to participate in civic activities are also more likely to be exposed to
issues found on surveys, such as politics and current events (e.g., Francis and Busch
2.4.3 Gender
DK may also result from differences in gender. Many studies indicate that
women are more likely to give a DK answer than men (Francis and Busch 1975;
Rapoport 1982, 1985; Smith 1984; Sudman and Bradburn 1974; Young 1999c). The
literature offers two possible explanations for the gender effect. First, gender
differences may result from women being less knowledgeable about the subject
matters covered in surveys than men (Francis and Busch 1975; Rapoport 1982,
1985). An alternative explanation suggests that females have been socialized not to
2.4.4 Age
studies have shown that older individuals are more likely to express DK-like
responses than younger people (Gergen and Back 1966; Glenn 1969; Young
1999c; 2000a). Three possible explanations for the age effect are cited in the
literature.
(Young 1999c; 2000a). Older individuals are more likely to give a DK response
because they are less cognitively able to deal with the topics on surveys than
(social senescence) (Gergen and Back 1966; Young 1999c). As articulated in the
progressively withdraw physically and mentally from the social world, feeling
less bound by societal norms. Socially disengaged individuals being less likely
the context of the survey interview, one potential form of abnormal behavior is
Third, the relationship between age and DK may result from generational
merely assume that the age effect results from changes in the life cycle. However,
differences may also result because younger generations are more educated and
better-informed, decreasing their propensity to answer DK. What does the research
Almost no research has been conducted on the age effect. One study does
indicate, though, that older respondents are more likely to say DK because they are
growing older and not because they are from earlier generations (Krosnick and
Milburn 1990). More research is needed before any definitive conclusion may be
drawn.
2.4.5 Education
Many studies show that the less educated are more likely to say DK than the
more educated (Gergen and Back 1966; Ferber 1966; Glenn 1969; Sudman and
Bradburn 1974; Francis and Busch 1975; Converse 1977; Faulkenberry and Mason
1978; Bishop et al. 1980; Smith 1981, 1984; Narayan and Krosnick 1996; Krosnick
and Fabrigar 1997; Young 1999c). There are four different explanations for the
education effect.
First, research suggests that the less educated are less likely to be cognitively
sophisticated (Schuman and Presser 1981; Krosnick and Milburn 1990; Young
1999a, 1999b). Second, the more educated are also more likely to participate in
civic activities (Young 1999b; 1999c). Third, the less educated are more likely to
say DK because they are older (Gergen and Back 1966; Ferber 1966; Krosnick and
Milburn 1990; Young 1999b; 1999c). Fourth, the less educated are more likely to
say DK because they are more likely to be female (Rapoport 1982, 1985; Krosnick
2.4.6 Race
Research has also shown that blacks are more likely to say DK than
non-blacks (Francis and Busch 1975; Rapoport 1982, 1985; Krosnick and Milburn 1990;
Young 1999b). There exist three possible explanations for the race effect. First,
blacks have lower levels of education than non-blacks (Francis and Busch 1975;
Krosnick and Milburn 1990). Second, blacks are less involved in activities related
to the topics found on surveys than non-blacks (Krosnick and Milburn; Young
1999b, 1999c). Third, blacks are less knowledgeable about the topics found on
surveys than non-blacks (Rapoport 1982, 1985; Krosnick and Milburn 1990; Young
1999b).
prestige, subjective health, and work status. The research finds that:
• Those respondents with higher occupational prestige levels are
more likely to say DK than those respondents with lower
occupational prestige levels (Ferber 1966; Francis and Busch
1975). However, after controlling for other characteristics (age,
education, and cognitive sophistication), occupational prestige no
longer has an independent effect on DK (Young 1998c; 1999b;
1999c).
So, what does the above discussion suggest? Broadly defined, individual
level correlates of DK can be broken down into two general categories: (1)
and (3) mental ability. Social factors include: (1) respondent motivation and (2)
question is both yes and no. DK responses theoretically can be both true values as
may have an attitude on a given subject, they may choose the DK option, to avoid
answering the question; or because a response option that more closely corresponds
to their own opinion is not offered. Research suggests that at least half of the DK
responses are ambivalent attitudes, hence missing data (section 1 in this chapter).
However, the empirical evidence indicates that there does not exist such a
clear distinction between non-attitudes (true values) and attitudes (missing data).
Indeed, recent research suggests that many “non-attitude holders” have stable
leanings towards issues if asked the same (or similar) questions repeated times on a
survey (Schuman and Presser 1981; Gilljam and Granberg 1993). So why do
indicates that attitude formation does not follow the traditional file-drawer model,
where respondents first are administered the question; after which, they search for
the relevant mental file to see if they have an opinion on the subject; and, finally,
they respond (Schuman and Presser 1981; Sudman et al. 1996; Tourangeau et al.
2000). Instead, recent research shows that intervening factors, such as respondent
Furthermore, our discussion of the literature has also shown that the rate of
evidence, without a doubt, further blurs the line between true value and missing data.
Based upon the evidence presented in this chapter, I contend that, in most
cases, DK responses are missing data because they are unusable in practice. I base
(2) The empirical evidence shows that a majority of DK
responses are ambivalent attitudes, a form of missing
data.
responses may represent true underlying values. Indeed, analysts must evaluate
CHAPTER THREE
I break the chapter down into four sections. In section 3.1, I describe the
data source. In section 3.2, I examine the specifics of DK correlates to be used in the
analysis. In section 3.3, I discuss the methods that will be used to calculate sampling
variance. Finally, in section 3.4, I review the technical aspects of the regression
3.1 Data
In this thesis, I will be analyzing data from the 1987 General Social Survey
(GSS). I choose the 1987 round because it is the only GSS study that includes
suggesting that restricting analysis to the 1987 GSS will not seriously limit
generalizability.
English speaking population, 18 years of age and older. I exclude the black
oversample, which leaves a total sample size of 1466 respondents for analysis (see
interviewers code DK only after first probing the respondent once for a substantive
answer.1 The GSS/NORC, in turn, tries to avoid excessive probing in order to
because I want to examine the relationship between DK and the correlates
that the scale will minimize the effect of any one question type or topic.
I use two basic decision rules to select items for the scale. First, the question
must be a subjective attitude item. The DK scale, then, does not include any
Second, the question must have been administered to all the respondents.
The DK scale, therefore, does not include the approximately 60 attitude questions
1. See page 21 of “Basic Interviewing Techniques” in NORC’s Field Interviewer Reference
Material for further discussion of interviewer protocols concerning DKs.
2. Cronbach’s Alpha is a commonly used indicator of scale quality. An alpha of .65 is generally
considered acceptable for sociological or political scales, while for test instruments a much higher alpha is
required. Cronbach’s alpha squared corresponds to the percent of variance that the scale items explain in
the underlying construct being measured (Nunnally and Bernstein 1994).
from the International Social Survey Program (ISSP) module because about 10% of
those interviewed on the GSS did not respond to the ISSP supplement. I, then,
Figure 1 below indicates that the distribution of DK responses on the 1987 GSS is skewed
[Figure 1: Distribution of DK responses, 1987 GSS. Vertical axis: percent of respondents (0% to 50%); horizontal axis: number of DK responses (0 to 40).]
3.2 DK Correlates
sophistication and social participation), age, education, race, work status, health
than the group that answered the verbal battery. Refusers answered DK 9.2 times,
on average, compared to 2.7 times for non-refusers. Refusers also had, on average,
9.3 years of education and 60.1 years of age compared to 12.7 years of education and
demographic variables (age, education, sex, and race) and DK differ significantly
This analysis suggests that the exclusion of the 88 cases would bias means,
correlations, and partial correlations. I, therefore, impute the missing values using a
regression-based approach (see Little and Rubin 1987 for a further discussion of the
3. An NA code is given when “...the respondent does not give an answer, when the written
information is contradictory or too vague, and when the coder needs to supply a code in order to
resolve a tricky skip pattern” (Davis and Smith 1996, p. 1030).
method). In brief, I estimate a simple model where I regressed verbal ability
nonlinear terms. I, then, take these estimated parameters and calculate predicted
WORDSUM values for the 88 refusers, adding a random residual to each predicted
because it does not take into consideration between imputation variation. The
times, using one of several multiple imputation techniques. These procedures, while
theoretically justified, are not practically useful because they require the analysis and
nonresponse biases.
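The single-imputation strategy described above (predict the missing score from a regression, then add a randomly drawn residual) can be sketched as follows. The data and variable names are hypothetical, and the model is reduced to one predictor for brevity; the dissertation's actual model includes more terms:

```python
import random

def impute_with_residual(x_obs, y_obs, x_missing, rng):
    """Single regression-based imputation: fit OLS of y on x over observed
    cases, predict y where it is missing, and add a randomly drawn observed
    residual so the imputed values retain realistic variability."""
    n = len(x_obs)
    mx = sum(x_obs) / n
    my = sum(y_obs) / n
    sxx = sum((x - mx) ** 2 for x in x_obs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_obs, y_obs))
    b = sxy / sxx                      # slope
    a = my - b * mx                    # intercept
    residuals = [y - (a + b * x) for x, y in zip(x_obs, y_obs)]
    return [a + b * x + rng.choice(residuals) for x in x_missing]

rng = random.Random(0)
educ = [8, 10, 12, 12, 14, 16, 16, 18]   # years of education (observed cases)
words = [4, 5, 6, 5, 7, 8, 9, 9]         # hypothetical WORDSUM-like scores
imputed = impute_with_residual(educ, words, [11, 15], rng)
```

Plain predicted values would understate the variance of the imputed scores; drawing a residual restores it, though (as the text notes) a single imputation still ignores between-imputation variation.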
existing 16-item social and political participation scale (MEMNUM) and a 7-item
education, I use two different variables. For the first education variable, I use a five-
categories include: (1) less than a high school degree; (2) a high school degree; (3) a
junior college degree; (4) a college degree; and (5) a graduate degree. Using a logit
transformation, I re-calibrate this variable in order to take into account that the
distance between education levels is uneven (see Master and Wright (1982) for
variables). For the second education variable, I create five dummy indicators which
represent the highest degree obtained by the respondent (Dl=less than a high school
degree; D2=a high school degree; D3=a junior college degree; D4=a college degree
gender, I create dummy variables: (1) Race (white=1; 0=nonwhite); (2) Work
Status (retired=1; 0=nonretired); (3) Health status (poor health=1; 0=other); and (4)
Prestige (top ten percent of occupational prestige=1; less than top ten percent=0); (5)
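The coding scheme above can be sketched in a few lines. The respondent record and field names are illustrative stand-ins, not the GSS variable names:

```python
# Degree categories follow the five levels listed in the text.
DEGREES = ["lt_high_school", "high_school", "junior_college",
           "college", "graduate"]

def degree_dummies(degree):
    """Return the five indicators D1..D5 for a respondent's highest degree."""
    return {f"D{i + 1}": int(degree == cat) for i, cat in enumerate(DEGREES)}

def binary_indicators(respondent):
    """Race, work status, health, and prestige coded 1/0 as in the text."""
    return {
        "white": int(respondent["race"] == "white"),
        "retired": int(respondent["work_status"] == "retired"),
        "poor_health": int(respondent["health"] == "poor"),
        # top ten percent of occupational prestige = 1
        "high_prestige": int(respondent["prestige_pct"] >= 90),
    }

r = {"race": "white", "work_status": "retired", "health": "good",
     "prestige_pct": 95}
codes = {**degree_dummies("college"), **binary_indicators(r)}
```

Exactly one of D1 through D5 is set for any respondent, which is what lets the dummy specification represent uneven distances between degree levels.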
3.3 Methods
are intended to decrease costs. However, because respondents are selected within
analysts underestimate standard errors, which, in turn, can lead to incorrect statistical
inferences.
To calculate proper variances and standard errors, I first create a cluster (or
PSU) variable which assigns a value for each of the 84 PSUs (primary sampling
units). This variable adjusts for the within PSU correlation. I also create a strata
variable, which matches each PSU with another geographically proximate PSU.
Using the PSU and Strata variables, I calculate DEFF (Design Effect) to
adjust standard errors. Specifically, I multiply the square root of DEFF (commonly
referred to as DEFT) by the standard error of the estimate (means, proportions, and
sample (in the case of the GSS, multi-stage stratified cluster sample) for a given
variable over the variance of simple random sample (SRS) for that given variable.
Cluster samples typically have larger variances than simple random samples and,
thus, larger standard errors (Kish 1965). But why are variances larger in cluster
samples?
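The DEFT adjustment just described can be sketched in a few lines. The expression DEFF = 1 + ROH(b − 1) used here is the standard Kish formula relating the design effect to the intraclass correlation (ROH) and the number of interviews per sampling point (b), both discussed below; the numeric values are illustrative, not the GSS figures:

```python
import math

def deff(roh, b):
    """Kish design effect for a clustered sample: DEFF = 1 + ROH * (b - 1).
    roh: intraclass correlation; b: interviews per sampling point."""
    return 1 + roh * (b - 1)

def adjusted_se(srs_se, roh, b):
    """Multiply the simple-random-sample standard error by DEFT,
    the square root of DEFF, as described in the text."""
    return math.sqrt(deff(roh, b)) * srs_se

# Illustrative values: with roh = .05 and 20 interviews per PSU,
# DEFF is 1.95, so standard errors grow by roughly 40 percent.
d = deff(0.05, 20)
se = adjusted_se(0.010, 0.05, 20)
```

Note that with one interview per sampling point (b = 1) the formula collapses to DEFF = 1, the simple-random-sampling case.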
population, where any given draw is not correlated with any preceding or future
draw (Cochran 1977). In cluster sampling, draws are not independent because
respondents within a given geographic area are highly similar (e.g., race, income,
ethnic background). Simply put, due to the high degree of similarity within
Considering the above discussion, DEFF can be expressed as the function of
the interclass correlation (ROH) and the number of interviews conducted in the
given sampling point (b) (see equation 8b above). ROH varies considerably from
to race) has a relatively large ROH, given that blacks and whites are highly
segregated in the United States (e.g., Smith et al. 1993). Both DEFF and ROH have
For any given characteristic though, ROH can be treated as fixed. This
decreasing the number of interviews done per sampling point (see equation 8b
above). Thinking along these lines, DEFF for simple random sampling is merely a
Table 3 below includes: the DEFF and DEFT for each of the variables cited
Table 3: Standard Errors, DEFFs, and DEFTs of DK Correlates

Variable                   Standard Error (Unadjusted)   Standard Error (Adjusted)   DEFF    DEFT
Age                        0.461                         0.647                       1.970   1.404
Degree                     0.108                         0.124                       1.330   1.153
Civic Participation        0.081                         0.087                       1.142   1.069
Cognitive Sophistication   0.231                         0.268                       1.347   1.161
Female                     0.629                         0.563                       0.802   0.896
White                      0.010                         0.021                       4.709   2.170
Poor Health                0.006                         0.007                       1.358   1.165
High Prestige              0.008                         0.007                       0.791   0.889
Retired                    0.009                         0.011                       1.540   1.241
DK Scale                   0.138                         0.211                       2.349   1.533
I use the statistical package Stata to estimate regression models and standard errors.
For complex samples, Stata uses Taylor series approximation to estimate variances.
Table 3 above shows that the size of DEFF varies considerably from
large DEFFs. Specifically, race has the highest DEFF at 4.71, while age and DK
have DEFFs of almost 2. These higher than average DEFFs suggest that these
variables will have larger sampling variances than would be expected using simple
random sampling. For instance, the sampling variance for race is almost 5 times
larger than the sampling variance for a simple random sample. Civic participation
(1.14), gender (.802), and degree (1.33)—all have relatively low DEFFs, suggesting
3.4 Model
In each of the four empirical chapters (4, 5, 6, and 7), I employ OLS
autocorrelation), I adjust the OLS model in two ways. First, I transform the
dependent variable using the square root function to account for non-normal error
eliminate it. To interpret results, I re-transform all estimates into their original
transform the dependent variable. I also use the Aitken transformation which
weights the variance-covariance matrix by the inverse of the estimated standard error
(Greene 1997).
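The square-root transform and back-transform can be illustrated with a minimal sketch: a one-predictor OLS fit on the square root of a hypothetical DK count, with fitted values squared to return estimates to the original scale. The data are invented; the dissertation's actual models include many predictors and the weighting step, which is omitted here:

```python
def ols_fit(x, y):
    """Simple one-predictor OLS; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical DK counts regressed on age; the square-root transform
# tames the skewed, count-like dependent variable, as in the text.
age = [20, 30, 40, 50, 60, 70]
dk = [1, 1, 2, 4, 5, 8]
sqrt_dk = [d ** 0.5 for d in dk]
a, b = ols_fit(age, sqrt_dk)

def predicted_dk(age_value):
    """Back-transform: square the fitted value to return to the DK scale."""
    return (a + b * age_value) ** 2
```

Squaring the fitted values is what the text means by re-transforming estimates into their original metric for interpretation.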
CHAPTER FOUR
literature. Respondents who are more likely to answer DK are systematically different
from those who are less likely to say DK (Ferber 1966; Francis and Busch 1975)—
I organize the chapter into 5 sections. In section 4.1, I quickly review the
literature discussed at length in chapter 2. In section 4.2, I detail the models that I
will test. In section 4.3, I analyze bivariate correlations, while, in section 4.4,
random phenomenon (Gergen and Back 1966; Glenn 1969; Sudman and Bradburn
1973; Francis and Busch 1975; Converse 1977; Faulkenberry and Mason 1978;
Bishop et al. 1980; Smith 1981, 1984; Narayan and Krosnick 1996; Krosnick and
(1) Respondents who are more educated are less likely to answer DK
than respondents who are less educated.
(6) Respondents with high levels of prestige are more likely to say
DK than those with lower levels of prestige.
4.2 Methods
In this chapter, I am most interested in determining the best fitting model in
order to establish a baseline for the rest of the study. To do this, I first estimate a
(9)
(1) civic participation; (2) cognitive sophistication; (3) age; (4) gender; (5) race; (6)
subjective health; (7) prestige; (8) work status; and (9) education. Here I am making
(10)
effects: (1) civic participation * cognitive sophistication; (2) age * gender; (3) race *
civic participation; and (4) age * civic participation. I choose the interaction terms
based upon (1) a review of the literature and (2) an empirical analysis of inter-item
correlations.
To test for the best fitting model, I use the difference in R-square test for
(11)
F = [(Rb² − Ra²) / (kb − ka)] / [(1 − Rb²) / (n − kb − 1)]
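Equation 11 can be computed directly, as in the sketch below. The R-square values, predictor counts, and sample size here are hypothetical, not the fitted values reported later in the chapter:

```python
def r2_change_f(r2_a, r2_b, k_a, k_b, n):
    """F statistic for the difference in R-square between nested models
    (equation 11): restricted model a has k_a predictors, fuller model b
    has k_b predictors, with n observations."""
    numerator = (r2_b - r2_a) / (k_b - k_a)
    denominator = (1 - r2_b) / (n - k_b - 1)
    return numerator / denominator

# Illustrative only: adding 4 terms to a 9-predictor model, n = 1000,
# raising R-square from .10 to .12.
f = r2_change_f(0.10, 0.12, 9, 13, 1000)
```

The resulting F is compared against the F distribution with (kb − ka) and (n − kb − 1) degrees of freedom; if the R-square gain is zero, F is zero and the fuller model adds nothing.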
review (note all bivariate coefficients (r) are standardized Pearson correlation
The general conclusion based upon the results in Table 4 suggests that
respondents who are more likely to answer DK are systematically different from
those who are less likely to answer DK. The bivariate correlations, however, can be
further divided into three groups according to the strength of the relationship.
an average correlation (r) of .235. These correlations account for about 6 percent of
the variation in DK (.235 × .235 ≈ .06). These three variables should be robust
work status, and race with an average correlation (r) of .148 and, on average,
accounting for 2 percent of the variance in DK. Group 3 consists of sex, subjective
health status, and occupational prestige with an average correlation of .056. These
TABLE 4: Bivariate Correlations Between Don't Know and Other Selected Variables

                 DK      AGE     CIVIC   COG     EDUC    FEMALE  HEALTH  HOSTILE  COMP    PRESTIGE  RETIRED  WHITE
DK               1.000
AGE              0.232   1.000
CIVIC           -0.212   0.005   1.000
COGNITIVE       -0.261   0.070   0.462   1.000
EDUCATION       -0.177  -0.226   0.388   0.543   1.000
FEMALE           0.086   0.047  -0.076  -0.056  -0.076   1.000
POOR HEALTH      0.033   0.178  -0.061  -0.120  -0.130   0.014   1.000
HOSTILE          0.182   0.050  -0.085  -0.074  -0.045   0.026   0.038   1.000
POOR COMPRE      0.354   0.179  -0.276  -0.417  -0.301   0.019   0.109   0.216    1.000
HIGH PRESTIGE   -0.049  -0.038   0.225   0.269   0.428   0.037  -0.025  -0.005   -0.099   1.000
RETIRED          0.146   0.587  -0.031  -0.008  -0.174  -0.084   0.141   0.058    0.146  -0.051    1.000
WHITE           -0.122   0.081   0.119   0.272   0.116  -0.032  -0.044  -0.035   -0.167   0.080    0.011    1.000

* All coefficients in bold are significant at the .05 level.
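The coefficients in Table 4 are Pearson product-moment correlations, which can be computed as below. The age and DK values are invented for illustration; squaring r gives the share of variance explained, as in the surrounding discussion:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical age and DK-count values for six respondents.
age = [22, 35, 41, 58, 63, 70]
dk = [1, 2, 1, 4, 3, 6]
r = pearson_r(age, dk)
variance_explained = r * r
```

A variable correlated with itself returns exactly 1, a useful sanity check on the implementation.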
DK. Controlling for other respondent level correlates, the correlates in Group 3
among the DK correlates. This suggests that any conclusion based upon the above
may result from a confounding third factor. For instance, even though the
correlation between age and DK is significant and strong (r = .232), a portion of the
bivariate relationship may result from education given the strong correlation between
age and education (r = -.226). To account for these confounding effects, I adjust
To determine the model that best explains variation in DK, I test three
separate regression models (see table 5 below). Model 1 includes all potential
significant correlates found in Model 1. Model 2 does not fit the data significantly
better than Model 1 (F = .378; p > .001); the change in fit from Model 1 to Model 2
is not significant. In such cases, the best fitting model is the one with the least
data significantly better than Model 2 (F = 3.1; p < .001). What respondent level
prestige, subjective health, and work status do not explain variation in DK (see
table 5 below). Note all betas (b) presented in table 5 are unstandardized with the
(3) Respondents who are retired are not significantly more likely to
say DK than respondents who are not retired, controlling for
other respondent level characteristics (b = .059; p > .001).
These results are not surprising for two reasons. First, our bivariate analysis
showed that respondents with high occupational prestige levels were not
occupational prestige.
health and retirement effects (Young 1998c; 1999b; 1999c). Simply put,
Table 5 (continued)

Cog Sophist*Civic Part         .0020
                               [.006]
Age*Gender                     .0606
                               [.021]
Civic Participation*White      -.0050
                               [.030]
Civic Participation*Age        -0.002
                               [.001]

                       Model 1    Model 2    Model 3
Constant               —          1.47       0.523
                       [.341]     [.222]     [.178]
Sample size (n=)       1446       1446       1446
Adjusted R Square      0.1446     0.1437     0.1588

* All coefficients in black italics are significant at the .1 level (two-tailed test)
respondents who are retired and rate themselves as having poor subjective
respondents are more likely to rate themselves as having poor subjective health (r =
.178; p < .05) and to be retired (r = .587; p < .05). Specifically, a respondent’s age
Model 1 in Table 5 also shows that education does not have an independent
effect on DK, after controlling for other respondent level characteristics. This
nonsignificant effect is an unexpected finding for two reasons. First, our bivariate
literature shows that education is not only a robust predictor of DK but also a
predictor of other forms of survey error, such as coverage bias, unit nonresponse,
and measurement error (e.g., Young 1999a, 2000c; Smith 1988; Smith 1981; Groves
and Couper 1998; Narayan and Krosnick 1996). Given the importance of this
finding, I will more closely examine the relationship between education and DK in
the next chapter (Chapter 5). What initial clues, though, might be gleaned from our
present analysis?
A cursory analysis of the bivariate correlations in Table 4 suggests that five
correlates may possibly explain the association between education and DK.
Specifically:
(1) Age: Older respondents are more likely to have lower levels of
education than younger respondents (r = .070; p < .05).
(4) Sex: Female respondents are less educated than male respondents
(r = -.076; p < .05).
possible that the education effect results from a combination of all five factors,
cognitive sophistication is the strongest candidate for two reasons. First, research
makes sense that the actual measure would explain the proxy. Second, the bivariate
Model 2 in Table 5 also shows that five of the 9 respondent level correlates
characteristics. First, older respondents are more likely to answer DK than younger
respondents (b = .018; p < .05). Seventy-five-year-old respondents, for instance, are
2.3 times more likely to answer DK than 20 year-old respondents (7.5 vs. 3.3
DKs).1
less likely to say DK than those with lower levels (=1) of cognitive sophistication (b
= -.103; p < .05). Respondents with high levels of cognitive sophistication are 9.6
times less likely to answer DK than respondents with low levels of cognitive
Third, respondents who participate more in civic activities (= 10) are less
activities is 2.2 times less likely to say DK than one who does not participate
Fourth, female respondents are more likely to say DK than male respondents
(b = .160; p < .05). Female respondents answer DK, on average, 1.23 times more on
Finally, white respondents are less likely to answer DK than non-whites
average, 1.59 times on the survey, compared to 2.16 for non-white respondents.
None of these results are surprising—all having been cited in the literature
review. More elusive, however, is explaining why each of the above characteristics
is correlated with DK—explaining why will be one of the central challenges of this
thesis.
reflect reality, it is essential to test for interaction effects. Yes, older respondents
are more likely to say DK than younger respondents. But so what? Everyone knows
that the real world is more complex than a simple bivariate relationship would
suggest.
sophistication less likely to answer DK than older respondents with lower levels of
cognitive sophistication? Or, are older female respondents more likely to say DK
than younger female respondents? To test for possible interactions, I draw upon
both the methods literature as well as empirical findings. What does the methods
and not interaction terms (Gergen and Back 1966; Glenn 1969; Sudman and
Bradburn 1974; Francis and Busch 1975). Several studies, however, have found
that:
Furthermore, an analysis of the standardized bivariate correlations in Table 4
suggests seven possible candidates for a three-way interaction. They include:
include the interaction between (1) civic participation and cognitive sophistication (r
= .462); (2) race and civic participation (r = .116); and (3) race and cognitive
I tested all possible interactions. However, only four interactions were found
Respondents who are both more cognitively sophisticated and who participate more
frequently in civic activities are more likely to answer DK than those who are
participates frequently in civic activities is 2.14 times more likely to answer DK than
his counterpart who does not participate frequently in civic activities (.337 vs. .157
The initial hypothesis would, of course, be the opposite: respondents who are
the least likely to say DK. What, then, might be going on?
Two possible explanations may account for the interaction. First, given
high levels of competence, such respondents may be less likely to employ face-
express DK than older men, while younger women are about as likely to answer DK
respondents are 1.48 times more likely to answer DK than male respondents their
same age (4.8 vs. 3.3 DKs), younger women are only 1.13 times more likely to say
DK than younger men (1.15 vs. .91 DKs). These results confirm previous findings,
that the gender gap is decreasing. However, the reasons are less clear: the
participate frequently in civic activities are slightly less likely to express DK than
not participate frequently in civic activities do so at about the same rate as their non
participate frequently in civic activities are 1.2 times more likely to answer DK than
white respondents who participate frequently in civic activities (.32 vs. .26 DKs),
while whites who are less likely to participate in civic activities are only .89 times
as likely as their non-white counterparts. Once again, these results confirm
The significant interaction term suggests that a portion of the race effect is
mediated through civic participation. In brief, non-whites, in part, are less likely to
Unfortunately, we cannot determine from this analysis whether the effect results
from non-whites being less likely to be exposed to information and/or from being
participate more frequently in civic activities are less likely to answer DK than older
respondents who are less likely to participate in civic activities (b = -.002; p.<.05).
are 1.4 times more likely to answer DK than 65 year-old respondents who
participate infrequently in civic activities (2.79 vs. 1.94 DKs). In closing, these
findings show that civic participation mediates a portion of the age effect. However,
like the other interaction effects, we know much less about why the significant
correlations exist.
Here it is also important to note that three of the main effects (gender: b = -
.113; p.>.05; race: b = -.080; p.>.05; and civic participation: b = .046; p.>.05) are
the interaction terms explain away the main effect for gender, race, and civic
The main conclusion here is that gender, race, and civic participation do not
have direct effects on DK, instead being mediated by other respondent level
characteristics. Specifically, whites are less likely to answer DK because they are
more likely to participate in civic activities; female respondents are more likely to
say DK because they are older; and finally, respondents who participate more in civic
activities are less likely to answer DK because they are more likely to be white.
4.5 Conclusion
well as uncovered new findings. Like past research, our analysis has shown that
are systematically different from respondents who are less likely to answer DK.
Some of our results, however, were unexpected. First and foremost, we were
able to explain away the education effect, after controlling for other respondent level
predictor of all forms of survey error, including DK (e.g., Young 1999a; Smith 1988;
Smith 1981; Groves 1989; Groves and Couper 1998; Narayan and Krosnick 1996).
One possible explanation is that this is one of the few studies that has
concept should explain away the proxy. I test this cognitive sophistication
Second, our analysis also shows that the gender, race, and civic participation
Specifically, such respondents (1) may be more willing to express ignorance and/or
(2) may be more likely to critique poorly formulated questions (e.g., vagueness of
the question content, poor wording, discordance between question stem and response
Finally, our analysis in this chapter made it very apparent that relatively little
is known about why respondent level characteristics are correlated with DK. Both
age and civic participation are correlated with DK, but so what? What do these
correlations mean? Put another way, quantitative analysis, while broad, is not very
deep. One of the primary objectives of this thesis is to uncover the why behind the
correlations.
CHAPTER FIVE
In the last chapter (chapter 4), we were able to explain away the relationship
between education and DK. This finding is quite important because education is not
changes to American society, especially since the rapid expansion of higher
education after the Second World War. Extensive research on a variety of topics has
(4) More educated individuals are more likely to express
opinions (Converse 1964, 1970; Krosnick and Milburn
1990).
predictor of survey error. The survey methods literature, for instance, has
The survey research literature typically treats education as a proxy for more
proximate characteristics. For instance, the research on question wording, order, and
level of knowledge about the survey topic (e.g., Schuman and Presser 1981;
Krosnick 1991). Similarly, the survey literature on unit nonresponse uses education
Couper 1998).
secondary data which forces the researcher to make use of the measures found on the
study. Most methods research, thus, does not specifically address what
mechanisms explain the association between education and survey error. As a result,
combination of factors?
To explain why we were able to explain away the education effect in chapter
question about the implication of the research in this chapter on our understanding of
General Question: what might the results in this chapter imply about the
education effect in relationship to opinionation (likelihood to answer with a
substantive response) and other sociological phenomena?
level of DK and (2) that this negative relationship persists even when controlling for
both respondent and question level characteristics (e.g., Ferber 1966; Converse
1977). We find one exception to this general tendency. On fictitious and obscure
Presser 1981; Bishop et al. 1980). Smith (1981) argues that this reversal in the
relationship results from the less educated’s greater issue confusion and greater need
negative, the strength of the relationship increases with the difficulty of the question
(Smith 1981).1 Smith (1981) explains that the strength of the DK/education
1. Difficult questions are those that are less salient to the respondent and require more specific
knowledge (e.g., questions on specific government policies such as NAFTA). Conversely, less difficult
questions are those which are more salient to the respondent and require less personal knowledge (e.g.,
questions on general values or personal evaluations such as happiness).
2. Note later research suggests that the nonattitude subgroup can be further subdivided
between those respondents who lack attitudes (nonattitudes) and those who choose the DK option
to avoid the cognitive demands of the survey question (satisficers) (Krosnick 1991; Young 1999d).
with education, while ambivalent attitudes do not (Faulkenberry and Mason 1978;
These findings are important because they suggest that one must be careful
when making generalizations about the association between education and DK, since
The literature on DK, however, does not provide much insight into the
specific functional form of the relationship between DK and education. For those
practical outcome of this under-theorizing is that these studies implicitly assume that
several reasons. First, many of the studies use crude two category measures for
education. In such cases, the relationship by default is linear. And second, the vast
indicated, the strength (and presumably the functional form) of the association varies
with question difficulty. However, some indirect evidence in the survey literature
does suggest that the relationship might be non-linear. Narayan and Krosnick (1996)
find that response effects, including the propensity to give a DK response under
varying question conditions, occur disproportionately among those with less than a
high school degree while the magnitude of these effects declines at an increasing rate
ability, interest, and willingness to participate in the survey interview (e.g., Converse
1964, 1970; Converse 1977; Smith 1981). However, none of this research attempts
DK. Based on a review of the survey literature, it seems that four possible
sophistication, (2) civic participation, (3) gender, and (4) age. I go into further detail
below.
1991; Schuman and Presser 1981). There are two possible explanations. First, the
cognitive sophistication effect may result from differing levels of knowledge:
research suggests (1) that the more educated are more informed/knowledgeable
about the issues asked on surveys (Hyman et al. 1975; Smith 1981; Nie et al. 1997)
and (2) that the well-
informed are less likely to answer DK (Converse 1964, 1970; Converse 1977;
Faulkenberry and Mason 1978; Francis and Busch 1975; Rapoport 1982, 1985;
Smith 1981).
Second, the cognitive sophistication effect may also result from varying
levels of verbal ability. Considerable research suggests (1) that the less educated
are more likely to have lower verbal abilities and (2) that respondents with weaker
verbal skills are more likely to have difficulties understanding survey questions and,
in turn, are more likely to answer DK (Krosnick and Alwin 1987; Krosnick 1991;
Other research suggests that the association between DK and education may
result from participation in civic activities. This research shows (1) that the more
educated are more likely to participate in civic activities (e.g., Nie et al. 1997) and
(2) that those more involved in civic activities are less likely to answer DK (Francis
and Busch 1975; Faulkenberry and Mason 1977; Rapoport 1985; Young 1999a;
1999c).
5.1.2.3 Gender
of the respondent. Several studies indicate (1) that men, on average, are more
educated than women (e.g., Rapoport 1982, 1985; Young 1999b, 1999c) and (2) that
women are more likely to give a DK answer than men (Francis and Busch 1975;
Rapoport 1982, 1985; Smith 1984; Sudman and Bradburn 1974; Young 1999b;
1999c).
5.1.2.4 Age
results from age differences. A number of studies have shown (1) that older
individuals are less educated than younger individuals and (2) that older individuals,
on average, are more likely to express DK-like responses than younger people
such as occupational prestige, race, subjective health, and size of city (e.g., Ferber
1966; Francis and Busch 1975). In analysis of both cross-temporal as well as cross-
national data, Young (1999b; 1999c) found that none of these factors, independently
of cognitive sophistication, civic participation, sex, and age, helped explain the
DK/education relationship.
5.2 Methods
To test for what factors explain the association between DK and education, I
relationship. Based upon research in the last chapter, the strongest possible
(1) high school; (2) junior college; (3) college; and (4) graduate school with “less
(13)
(14)
sophistication (as well as civic participation, gender, and age). Instead, I want to
between DK and education. Put into statistical parlance, I am interested in the effect
(15)
(16)
(17)
DK = β0 + D1(High School) + D2(Junior College) + D3(College) +
D4(Graduate) + β1(Cognitive Sophistication) + β2(Civic
Participation) + β3(Gender) + β4(Age) + εi
(3) what factors might explain the association between
education and DK?
Table 6. Unstandardized OLS Estimates (Dependent Variable: Square Root of Number of DKs)

                           Model 1   Model 2   Model 3   Model 4   Model 5
Cognitive Sophistication      **     -0.101       …         …      -0.072
                                     [.031]    [.031]    [.031]      …
Civic Participation           **       **      -0.055    -0.053    -0.056
                                               [.011]    [.011]    [.011]
Gender (Female=1)             **       **        **       0.211     0.177
                                                         [.069]    [.062]
Age (in years)                **       **        **         **      0.018
                                                                   [.002]
Constant                     2.27     1.51      1.48      1.49      1.22
                            [.121]   [.118]    [.107]      …       [.108]
Sample size (n)              1457     1451      1447      1447      1447
R-squared                      …        …         …         …       .140

* All coefficients in bold italics are significant at the .05 level (two-tailed test)
† Dependent variable transformed using the square root
‡ Excluded category for Education is “Less than a High School Degree”
§ Standard errors, in brackets under the coefficients, adjusted for the complex design of the sample (clustering and stratification)
Table 6 above includes the unstandardized OLS estimates that I use to create
dependent variable (note Figure 1 in chapter 3), I transformed the dependent variable
using the square root function. In order to easily interpret the results, I have re-
transformed all estimates into their original metric (number of DKs) using the
quadratic function.
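The estimation strategy just described (education entered as dummy variables, a square-root-transformed DK count, and re-transformation by squaring) can be sketched as follows. All data, variable names, and coefficients here are synthetic illustrations, not the GSS estimates reported in Table 6.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical respondent data: education category (0 = less than high
# school ... 4 = graduate), cognitive sophistication, civic participation,
# gender, and age.
educ = rng.integers(0, 5, n)
cog = rng.normal(10, 2, n)
civic = rng.integers(0, 10, n)
female = rng.integers(0, 2, n)
age = rng.integers(18, 90, n)

# Hypothetical DK counts: more DKs among the less educated and the older.
dk = rng.poisson(np.clip(6 - educ - 0.2 * cog + 0.03 * age, 0.1, None))

# Dummy-code education with "less than high school" (0) as the excluded
# category, mirroring the model equation in the text.
dummies = np.column_stack([(educ == k).astype(float) for k in range(1, 5)])

# Square-root transform the dependent variable to reduce skew.
y = np.sqrt(dk)

# Design matrix: intercept, four education dummies, and the covariates.
X = np.column_stack([np.ones(n), dummies, cog, civic, female, age])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Re-transform a fitted value back to the original metric (number of DKs)
# by squaring, i.e., applying the quadratic function described above.
x_hs = np.array([1, 1, 0, 0, 0, cog.mean(), civic.mean(), 0, age.mean()])
predicted_dks = (x_hs @ beta) ** 2
print(round(float(predicted_dks), 2))
```

The back-transformation by squaring recovers interpretable DK counts but, as with any nonlinear re-transformation, predictions apply to fitted values rather than to each coefficient in isolation.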
[Figure: Mean number of DKs by level of education (less than high school degree through graduate degree)]
are negatively related. The more educated, on average, are less likely to express
DK-like answers than the less educated (overall relationship significant; p.=.000).
Specifically, while respondents with less than a high school degree are
I assess the overall significance of the DK/education relationship by testing the joint
hypothesis that the high school degree, junior college degree, college degree, and graduate degree
coefficients are jointly significant. I performed all such tests in STATA using the “test of linear
hypothesis” subcommand. Note I also correct all p-values using Bonferroni adjustments in order
to account for multiple comparisons.
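The Bonferroni correction mentioned in this note is simple to reproduce: each raw p-value is multiplied by the number of comparisons and capped at 1. A minimal sketch with hypothetical p-values:

```python
def bonferroni(p_values):
    """Bonferroni-adjust p-values: multiply each by the number of
    comparisons, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values from four pairwise education contrasts.
raw = [0.001, 0.012, 0.030, 0.200]
adjusted = bonferroni(raw)
print(adjusted)  # [0.004, 0.048, 0.12, 0.8]
```

The correction is conservative: it controls the family-wise error rate at the cost of statistical power when many comparisons are made.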
approximately 2.0 times more likely to give a DK response than those with a high
school degree, respondents with a high school degree are only 1.3 times more likely
to answer DK than those with a college degree. Put another way, the gains from
education occur overwhelmingly between those with less than a
high school degree and a high school degree with diminishing returns at higher
levels of education (some college and up). The above results lead to a natural
question. Why are individuals with lower levels of education more likely to give a
One reason may be that respondents with lower levels of education also have
For all significance tests of the difference of two means, I combine the College and
Graduate School degree categories because the difference between the two groups is not
statistically significant.
[Figure: Mean number of DKs by level of education, controlling for cognitive sophistication]
lower and higher levels of education. Specifically, while respondents with less than
a high school degree are still more likely (1.7 times) to give a DK response than
respondents with a high school degree, respondents with a high school degree are
not more likely (1.09 times) to say DK
than those that went to college or graduate school (difference not statistically
significant; p>.05).
Put another way, a respondent’s level of cognitive sophistication explains the
difference in DK rates between respondents with a high school degree and those
respondents who participate more frequently in civic activities are less likely to
shows that:
(1) respondents that participate more frequently in civic activities are less
likely to give a DK-like answer. The bivariate relationship is relatively
strong (r = -.212) and statistically significant and;
participation does not seem to account for much of the relationship between
[Figure: Mean number of DKs by level of education, controlling for cognitive sophistication and civic participation]
Indeed, even after controlling for both cognitive sophistication and civic
participation, the basic form of the relationship does not change and the overall
of education (less than a high school degree) are still more likely to give a DK-like
response than respondents with higher (college and graduate degree) and medium
men and are, therefore, more likely to express a DK-like answer. The following
bivariate analysis does not seem to support this gender hypothesis. Indeed, Table 4
indicates that:
(1) Women are more likely to answer DK than men. The bivariate
relationship, however, is weak (r = .086) and not statistically significant
and;
(2) Women are less educated than men. The bivariate correlation, however,
is weak (r = -.084) but statistically significant.
[Figure 5: Mean number of DKs by level of education, controlling for cognitive sophistication, civic participation, and gender]
In further support of our bivariate analysis, Figure 5 above suggests that even
after controlling for gender, along with cognitive sophistication and civic
participation, the basic form and direction of the relationship does not change
(p.=.004). Respondents with low levels of education (less than a high school degree)
are still more likely to give a DK-like response than respondents with higher (college
and graduate degree) and medium levels (high school degree). Might age account
Bivariate analysis suggests that yes, age may be a good explanatory
(2) Older individuals are less educated than younger individuals. The
bivariate correlation is both strong (r = -.177) and statistically
significant.
[Figure 6: Mean number of DKs by level of education, controlling for cognitive sophistication, civic participation, gender, and age]
In support of the above conclusions, Figure 6 above suggests that yes, age
explains the remaining variation between education and DK. Indeed, once we take
into account age, along with cognitive sophistication, civic participation, and
gender, the overall nonlinear relationship between education and DK disappears.
In brief, respondents with low levels of education (less than a high school
degree) are no longer more likely to give a DK-like response than respondents with
higher (college and graduate degree) and medium levels (high school and junior
college degree). Specifically, age accounts for the difference in mean levels of DK
between respondents with less than a high school degree and a high school degree
suggesting that age, not education, is the primary factor contributing to differences in
DK rates among the medium and less educated. In addition, although not
education, with the more educated more likely to give a DK response than the less
educated. What can we conclude from the above analysis?
5.4 Conclusion
At the beginning of this chapter, we asked two specific questions and one
general question:
Note I ran a series of regression models and found the results to be robust
irrespective of the order in which I entered cognitive sophistication, civic participation, gender,
and age.
5.4.1 Specific Questions
Taken as a whole, five main findings came out of the analysis. First, the
DK. Third, two factors explain away the relationship between education and DK:
explains the difference in mean DKs between respondents with high (college and
graduate degree) and moderate (high school degree) levels of education. Age, in
turn, explains the difference in mean DKs between respondents with moderate and
low levels of education (less than a high school degree). Finally, while cognitive
sophistication and age explained away the overall relationship, those with a junior
college degree were consistently less likely to answer DK than respondents with
higher and lower levels of education. What might explain this result?
Two possible explanations exist for why junior college respondents are less
likely to answer DK: (1) such respondents may know just enough to think that they
know everything and/or (2) they may be too embarrassed to express DK because
characteristics explain the education effect. However, several pending questions are
left unanswered. Why exactly is age correlated with DK? Similarly, thinking back
to chapter 4, why is civic participation related, though indirectly, to DK? In short,
The research presented here in this chapter also suggests new lines of
What factors account for the education effect? Do these factors vary
and take a new look at the variable. Indeed, the results demonstrate that much
CHAPTER SIX
sociologist. Such tools have allowed the analyst to uncover important effects,
shedding light on underlying sociological processes. However, even with all the
that quantitative data analysis is often done on secondary data sources (such as the
sociologist, therefore, is often left using proxy measures that only indirectly tap the
This study, like most dissertations done in the quantitative social sciences, is
plagued by the same limitations. Yes, I have found several interesting correlations,
respondents who are younger and participate more in civic activities are less likely to
provide a DK response. Furthermore, while accounting for the relationship between
education and DK, we could not adequately explain why. Why are age and civic
In the last chapter (Chapter 5), I was able to unravel part of the meaning
two factors: age and cognitive sophistication. But, in so doing, I answered a riddle
Might there be a broader meaning that accounts for the correlations? The
short answer is yes. Both age and civic participation have been used as proxy
motivation (Groves and Couper 1998; Krosnick 1991; Young 1998; Young 1999b;
1999c). Indeed, the methods research has treated nonresponse as a specific case of
outcomes (Kaase and Marsch 1979; Nagel 1987; Verba and Nie 1972). One indirect
same literature shows that those individuals who are more likely to participate are
also more likely to be informed about the society in which they are members (Verba
and Nie 1972). So what exactly does the survey methods literature have to say about
The methods literature links survey participation to the degree to which a
respondent feels bound or connected to the society (Glenn 1969; Mathiowetz et al.
1991; Couper et al. 1997; Groves and Couper 1998). Socially isolated and alienated
respondents are less likely to participate in social activities, like surveys. Groves
and Couper (1998) note that survey researchers have long felt that respondent
social isolation, research shows that survey non-responders are also more likely to
have lower levels of political efficacy; are less likely to trust government; are more
likely to feel alienated from society; and are less likely to have confidence in societal
institutions, such as religion and the press (Southwell 1985; Weatherford 1991;
The concepts of civic duty and social isolation have also been used to
1999b; 1999c). This literature argues that respondents with high levels of social
isolation and alienation are more likely to answer DK because they do not feel
Specifically, this line of research speculates that age and civic participation are
6.0.2 Solutions
There exist two possible solutions to our problem. First, the GSS includes
two questions at the end of the interview that ask the interviewer to evaluate
the respondent. The first question (COMPREND) asks the interviewer to
rate the respondent’s overall comprehension of the questions on the study. The
second question (COOP) asks the interviewer to evaluate the respondent’s attitude
These measures will be used to separate out the possible social and cognitive
aspects of the age and civic participation effects. Specifically, I use COOP to
determine to what extent age and civic participation tap respondent motivation,
while I use COMPREND to determine the degree to which these same correlates
capture respondent comprehension.
respondent behavior and attitudes during the interview, as opposed to weak proxy
respondent to participate in the survey prior to the interview or the degree to which
In short, all analysis in this chapter must be qualified due to the problems
DK. Indeed, are respondents who are evaluated as being less cooperative more
likely to answer DK because they are really less motivated? Or, do interviewers
classify respondents as being less cooperative because such respondents are more
isolation and alienation. I use these items to determine whether age and civic
questions... Good, Fair, or Poor?” Why does the GSS include interviewer
know is that COOP and COMPREND (or similar questions) began to appear in
NORC studies in the 1960s and 1970s. Some speculate that they appeared as a
order to give the analyst the option of excluding those respondents with poor
comprehension ratings. We know of no similar uses for COOP. Despite the fuzzy
as correlates for survey error. This research, conducted internally by GSS staff, has
found that both respondent comprehension and cooperation are strongly correlated
with survey error, most notably with non-response (missingness). Specifically, these
(2) respondents who are less likely to cooperate (COOP) and less
likely to understand survey questions (COMPREND) are more
likely to be item non-responders on factorial vignettes (Smith
1986).
(3) respondents who are less likely to cooperate are less likely to
have a household phone (telephone coverage bias) (Smith
1987b).
(4) respondents who are less likely to cooperate and less likely to
comprehend are more likely to be item non-responders on
household income questions (Smith 1991).
(5) respondents who are less likely to cooperate and less likely to
understand survey questions are correspondingly less likely to
respond to re-interviews (Re-interview nonresponse) (Smith
1992a).
(6) respondents who are less cooperative and who are less likely to
understand survey questions are more likely to be item non-
responders on sexual behavior questions (Smith 1992b).
(7) respondents who are less likely to cooperate and less likely to
understand survey questions are more likely to choose the
extreme ends of response scales (Smith 1992c).
So what does this all mean? In brief, this research treats COMPREND as an
COMPREND and COOP have been used as measures of respondent motivation and
COMPREND/COOP and DK
In this chapter, I use two measures of social isolation and alienation: (1) a
13-item confidence in leaders of institutions scale (CONFID); and (2) a 3-item
summated anomie scale. I choose these two measures because research shows that
they tap distinct dimensions of alienation and social isolation (Smith 1997). First,
since 1973, the GSS has included a 13-item battery concerning confidence in the
CONSCI, CONLEGIS, CONARMY). The question stem for the confidence battery
reads:
executive branch of the federal government, organized labor, the press, medicine,
TV, the US Supreme Court, the scientific community, the Congress, the military, and
banks and financial institutions. Research shows that, in general, the level of
confidence in all institutions has declined since 1973 (Young 1998a, 1998b; Citrin
and Muste 1999). Furthermore, analyzing the 13-item summated scale, Young
(1999a, 1999b) found that (1) the less educated; (2) elderly respondents; and (3)
Second, the GSS includes a 3-item anomie scale that measures the level of
questions reads:
Research, in turn, shows that alienation has increased over time (Reef and
Knoke 1999). Furthermore, this same research indicates that: (1) older respondents;
(2) the less educated; and (3) non-whites are more likely to feel alienated.
properties of age and civic participation. What do they measure? Why are they
Our preliminary hypothesis is that age and civic participation tap respondent
who do not participate in civic activities probably are more likely to say DK because
they either have greater problems understanding the questions or they are less likely
composition of the age and civic participation effects, I use three validating
indicators:
I organize the following analysis of age and civic participation into four
distinct sections. First, I analyze the bivariate relationship between DK and the
bivariate relationship between age/civic participation and DK. Here I examine the
Third, I analyze the bivariate correlations among civic participation, age, and
interested in assessing two types of validity: (1) convergent validity and (2)
highly correlated.
comprehension.
civic participation.
(18)
of respondent motivation and comprehension and then use the beta weights
(standardized betas) to determine the importance of each indicator (see equation 18
(19)
motivation does not cause age. Instead, I am using multiple regression to estimate
partial correlations.
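The use of standardized (beta) weights to compare the relative importance of indicators can be sketched as follows: z-score the outcome and each predictor, then fit OLS. The data and variable names below are hypothetical, not the GSS measures.

```python
import numpy as np

def standardized_betas(X, y):
    """OLS on z-scored outcome and predictors; the resulting slope
    coefficients are standardized (beta) weights, comparable across
    predictors measured on different scales."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    design = np.column_stack([np.ones(len(yz)), Xz])
    coef, *_ = np.linalg.lstsq(design, yz, rcond=None)
    return coef[1:]  # drop the intercept (zero after z-scoring)

# Hypothetical indicators: age, civic participation, and a motivation score
# constructed so that civic participation matters more than age.
rng = np.random.default_rng(1)
age = rng.normal(45, 15, 300)
civic = rng.normal(5, 2, 300)
motivation = 0.4 * civic - 0.01 * age + rng.normal(0, 1, 300)

betas = standardized_betas(np.column_stack([age, civic]), motivation)
print(betas)
```

Because each beta is the slope for a one-standard-deviation change in its predictor, the weights can be ranked directly, which is the sense in which multiple regression yields partial-correlation-style comparisons here.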
Tables 7a and 7b below show that most respondents are both cooperative and
rated as friendly (76%) or cooperative (19%), while only 5 percent were rated as
This analysis indicates that most respondents perform well during interviews
with only a small subgroup seen as problematic. Are less cooperative respondents
also less likely to understand survey questions? In other words, is there a respondent
The answer to the above question is yes. First, the bivariate correlation
p.<.05) (see table 5). Second, a look at a two-way table of COOP and COMPREND
(see table 8 below) shows that the correlation is statistically significant (χ² = 186.68;
p = .000).
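A Pearson chi-square test of independence like the one reported here can be reproduced as follows; the cooperation-by-comprehension counts below are hypothetical stand-ins, not the actual GSS table.

```python
import numpy as np

def chi_square_independence(table):
    """Pearson chi-square test of independence for a two-way table.
    Returns the chi-square statistic and the degrees of freedom."""
    table = np.asarray(table, dtype=float)
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    # Expected counts under independence: (row total * column total) / N.
    expected = row @ col / table.sum()
    chi2 = ((table - expected) ** 2 / expected).sum()
    dof = (table.shape[0] - 1) * (table.shape[1] - 1)
    return chi2, dof

# Hypothetical counts: rows = cooperation (friendly/cooperative,
# restless/hostile); columns = comprehension (good, fair/poor).
observed = [[1130, 180], [60, 80]]
chi2, dof = chi_square_independence(observed)
print(round(chi2, 2), dof)
```

For a 2x2 table the test has one degree of freedom, so a statistic above 3.84 is significant at the .05 level.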
[Table 8: Respondent cooperation by respondent comprehension (Good vs. Fair/Poor)]
about 78 percent of respondents were rated as having both good comprehension and
being either friendly or cooperative during the interview, with only 22 percent of
Are respondents who do not cooperate during the interview and/or who do not
Yes, table 4 in chapter 4 shows that both COOP (r = .182; p.<.05) and
respondents who are less likely to cooperate and/or who are less likely to
[Figure 7: Mean number of DKs by respondent cooperation (Restless/Hostile, Cooperative, Friendly)]
those rated as cooperative answered DK 4.6 times; and those rated as restless or
Furthermore, figure 8 below indicates that respondents with good
comprehension answered DK only 2.1 times on the survey, while those rated as fair
answered 5.6 times and those rated as poor answered 11.8 times.
[Figure 8: Mean DK responses by respondent comprehension (Good, Fair, Poor)]
COMPREND and DK is stronger. Taken as a whole, the analysis in this section has
shown that:
(2) Respondents who are more likely to cooperate are also
more likely to understand survey questions. However,
this interaction is weak.
Table 9 below presents the descriptive statistics for our two indicators of
respondent alienation and social isolation. The 13-item CONFID scale varies from a
low of 3 to a high of 39 with an average score of 26.5. CONFID also possesses good
institutions factor. The second factor explains 14 percent of the variance and seems
military and the federal government are positively correlated and confidence in the
supreme court, the scientific community, the press, and TV are negatively correlated
with the factor. Finally, the third factor explains only 7 percent of the variance and
labor is positively correlated with the factor and confidence in major companies is
negatively correlated.
Second, table 9 above shows that the 3-item anomie scale ranges from a low
levels of anomie) with an average score of 4.25. Unlike the CONFID scale, the
anomie index is a weak measure with a Cronbach's alpha of only .389. Factor
analysis, however, indicates that each of the 3 items load strongly on one factor
.136; p < .05).
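Cronbach's alpha, used above to assess scale reliability, can be computed directly from an item-score matrix as alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The scores below are invented for illustration and are not the GSS anomie items.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the total score
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Illustrative 3-item responses (values are assumptions, not GSS data)
scores = np.array([
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 3],
    [1, 2, 1],
    [3, 2, 3],
])
print(round(cronbach_alpha(scores), 3))
```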
Even given the weak relationships found above, both of the correlations are
with lower levels of confidence and higher levels of anomie are more likely to
answer DK.
Table 10 above also suggests that the two scales tap distinct dimensions.
Before analyzing the possible reasons for the age and civic participation
effects found in the last two chapters, let us first re-examine the relationship between
age/civic participation and DK. What is the functional form of the relationships?
Are they linear or non-linear? Were our initial assumptions about the relationships
valid?
that both age and civic participation were continuous and linearly related to DK.
Figure 9 below shows the relationship between age and DK. I re-code age into approximately equal groups with 10-year intervals except for the youngest
(18 to 24 years of age) and oldest age categories (65 years of age or more).
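The re-coding just described is a standard binning operation. A minimal sketch with pandas, where the bin edges follow the text but the ages themselves are made up:

```python
import pandas as pd

# Illustrative ages; bin edges follow the text: 18-24, then 10-year
# intervals, with 65+ as the open-ended top category.
ages = pd.Series([19, 23, 30, 41, 52, 58, 67, 80])
bins = [18, 25, 35, 45, 55, 65, 120]
labels = ["18-24", "25-34", "35-44", "45-54", "55-64", "65+"]

# right=False makes each bin closed on the left: [18, 25), [25, 35), ...
age_group = pd.cut(ages, bins=bins, labels=labels, right=False)
print(age_group.tolist())
```

The same `labels` list, collapsed to a 65+ indicator, yields the binary re-specification discussed later in the chapter.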
Figure 9 indicates that age is not linearly related to DK. Indeed, there exists
only a slight increasing trend in DK from respondents who are 18-24 years of age
(2.40 DKs) to those who are 55-64 years of age (3.07 DKs)—an increase of only .67 DKs. In contrast, respondents who are 65 years of age or older are much more likely to answer DK: those 65 years of age or older answered DK 5.70 times, on average, versus 2.5 times for those respondents 64 years of age or younger—a 230 percent increase in DK. What do these findings suggest?
[Figure 9: Mean DK responses by years of age (18-24 through 65+)]
First, these results indicate that any model (1) should not treat age as a
continuous variable and (2) should not specify age as being linearly related to DK.
Instead, age should either be treated as a binary variable (65 plus versus 64 or less) or
as a three category variable with breaks between 35-44 and 45-54 and between 55-
64 and 65 plus.
Second, the large increase in the rate of DK for those respondents who
are 65 years of age or older lends support to the argument that the association
between age and DK results from aging and not generational change (cohort
effects). Indeed, if the age effect were actually the result of gradual cohort change
explanation does not exclude the possibility that the large increase in DK for older
respondents (65 years of age and older) results from a particularly strong period
effect (e.g., World War II), making this older generation qualitatively different from
assuming that the age effect results from aging, we still do not know anything about
Figure 10 below shows the association between civic participation and DK.
Here, due to the small number of cases, I re-code civic participation at the higher
levels into 2 categories: (1) 6 to 8 civic activities and (2) 9 plus civic activities.
What do we find?
Figure 10 indicates that the association between civic activities and DK more
closely resembles a linear relationship than the correlation between age and DK.
Indeed, at lower levels of civic participation (0-2 civic activities), the association is
actually linear with those respondents who do not participate in civic activities
answering DK 4.9 times; those participating in 1 civic activity answering 4.0 times;
[Figure 10: Mean DK responses by number of civic activities (0 through 9 plus)]
DK = 2.0) are only 1.5 times more likely to answer DK than those respondents who
who do not participate in civic activities (average DK = 4.9) are 3.3 times more likely
These results suggest that the association between civic participation and
Now that we know more about the bivariate relationship between age/civic
participation and DK, what may explain these relationships? What, in other
In this section, I have one simple objective: to determine the relative weight
alienation are correlated, though weakly, with age. Older respondents are more
likely to feel alienated; more likely to have lower levels of confidence in institutions;
Specifically, table 11 below shows that older respondents are more likely to
these findings?
abilities acquired in the past and/or accumulated over a long period of time (e.g.,
the percent of variance explained in four stages. First, I regressed age on the
Table 12: Age and the relative weight of respondent motivation and comprehension

Variable     Beta Weight*   % Explained Variance
COMPREND**   0.11           51
COOP†        0.00           0
Cognitive    -0.04          6
Anomie       0.09           33
CONFID       -0.05          10
Civic        0.00           0
Total                       100

* Standardized beta; estimated using logistic regression
** COMPREND: good = 1; fair and poor = 0
† COOP: friendly and cooperative = 1; impatient and hostile = 0
Note: the above variables explain 10 percent of the total variance in age
Second, I squared the beta weights (beta weight * beta weight) taken from
the results of the multiple regression. Third, I summed the squared beta weights.
Fourth, I divided the squared beta weights by the sum of the squared beta weights.
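The four-step procedure can be written out directly using the betas from table 12. One caveat: because the printed betas are rounded to two decimals, the recomputed shares can differ by a percentage point or so from the table's own column.

```python
# Squared-beta importance decomposition: square each standardized beta,
# sum the squares, then express each squared beta as a share of the sum.
# Beta values are taken from table 12.
betas = {
    "COMPREND": 0.11,
    "COOP": 0.00,
    "Cognitive": -0.04,
    "Anomie": 0.09,
    "CONFID": -0.05,
    "Civic": 0.00,
}

squared = {k: b * b for k, b in betas.items()}
total = sum(squared.values())
shares = {k: round(100 * s / total) for k, s in squared.items()}
print(shares)
```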
The results in table 12 above confirm our bivariate analysis in table 11—respondent motivation (COOP, CONFID and Anomie) accounts for 43 percent of the variance (0% + 10% + 33%) with anomie explaining most of the variance (33%). What
percent of the variance in age, though they are correlated in opposite directions.
(beta weight = -.04). COMPREND, in turn, is negatively correlated with age with older respondents being less likely to comprehend survey questions (beta weight = -.11). On balance, COMPREND is the dominant effect, given its larger beta weight
(51%-6% = 45%).
Once again, the results above indicate that distinct differences exist between
cognitive ability acquired over the life-course (vocabulary; knowledge) and survey
the one hand, older individuals have acquired knowledge and vocabulary over time
which make them, on average, more cognitively sophisticated. On the other hand,
they seem to be less cognitively agile in demanding situations, such as those required
by surveys.
The above results confirm our initial observation—age taps both respondent
motivation and comprehension. Older respondents are more likely to answer DK (1)
because they are more likely to feel alienated and (2) because they are less
Table 11 above suggests that, like age, civic participation may also be a function of
alienation, with the exception of CONFID, are correlated with civic participation.
Specifically, we find:
Table 13: Civic participation and the relative weight of respondent motivation and comprehension
explains almost all of the variance in civic participation (94%) with cognitive
These results run counter to the methods literature which assumes that civic
1999b; 1999c). Instead, it appears that respondents who participate more in civic
activities are less likely to say DK because they are more cognitively sophisticated
(higher verbal ability; more knowledgeable; and more exposed to information) and
This research, however, sheds no light on exactly why this is the case. Is civic
6.4 Conclusion
I asked two questions at the beginning of this chapter: (1) why are older
respondents more likely to say DK than younger respondents?; and (2) why are
respondents who participate less in civic activities more likely to answer DK than
respondents who participate more in civic activities? So what did the research in this
Our analysis of age ended in three important findings. First, age does not have a linear relationship with DK: response only increases among the oldest of respondents (65 years of age or more).
motivation. Specifically, older respondents (65 years or more) are less likely to
understand survey questions as well as more likely to feel alienated. Third, while
the beginning of this chapter, serious problems remain concerning the causal
respondents who are more cognitively sophisticated are more prone to participate in
civic activities. In either case, these results run counter to the methods literature
We are still left with two pending questions. First, does re-specifying age as
a binary variable (65+ = 1) improve the explanatory power of the model? And
model?
CHAPTER SEVEN
explained away the education effect and (2) we validated the age and civic
questions. First, does the re-specification of the age variable improve the
explanatory power of our final model (model 3) in chapter 4? Given our analysis
Third, does the inclusion of COMPREND and COOP affect the relationship
between other correlates and DK? I have no a priori hypotheses concerning how
7.1 Models
To answer the above questions, I test two additional models against the final
and race) and 4 interaction effects [(1) civic participation * cognitive sophistication;
(2) age * gender; (3) race * civic participation; and (4) age * civic participation].
This model treats age (age in years) as being linearly related to DK (see equation 19
below)
(19)
I first test the baseline model against a second model (see equation 20 below)
which specifies age as a dummy variable (65 years of age or more = 1).
(20)
participation, cognitive sophistication, age, gender, and race) and four interaction
effects (note bold parentheses). Note this model specifies age as a binary variable.
Finally, I test whether interviewer evaluation of respondent comprehension and
above.
(21)
(civic participation, cognitive sophistication, age (binary), gender, and race); four
To test for the best fitting model, I use the difference in R-square test for
(22)
includes the re-specified age variable. Here age has been re-coded into a binary
dummy variable where 1 = 65 years of age or older and 0 = 64 years of age or younger.
performance (COMPREND and COOP). Of these three models, which one best fits
the data?
Equation 22 above is designed to test nested models (e.g., a model with one
independent variable age versus a second model with two independent variables age
and education). Models 11 and 12, however, are not nested models—the only
difference being a re-specified age variable (age in years versus age (1=65+)).
Non-nested models cannot be directly tested using the above equation because, in such cases, the equation has no mathematical solution.1 How then
1998). One simple rule is to assume that the model with a larger adjusted R-square is the best fitting model. Using this decision rule, model 12 best fits the data
strategy is to assume that the difference in the number of parameters is 1 (kb - ka = 1). Using this method, model 12 again is the best fitting model (F = 11.7; p < .001).
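The difference-in-R-square test referred to above (equation 22) can be sketched as follows. This is a minimal sketch: the R-square values, predictor counts, and sample size below are illustrative assumptions, not the dissertation's actual figures.

```python
from scipy import stats

def r2_difference_f(r2_a: float, r2_b: float, k_a: int, k_b: int, n: int):
    """F test for the R-square difference between nested models:
    model b contains model a's k_a predictors plus k_b - k_a more."""
    df1 = k_b - k_a
    df2 = n - k_b - 1
    f = ((r2_b - r2_a) / df1) / ((1 - r2_b) / df2)
    p = stats.f.sf(f, df1, df2)  # upper-tail probability of the F statistic
    return f, p

# Illustrative values only (assumed, not the reported models)
f, p = r2_difference_f(r2_a=0.166, r2_b=0.199, k_a=10, k_b=12, n=2000)
print(f"F = {f:.1f}, p = {p:.4f}")
```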
1998). The J-test is a three-step procedure. First, I regress the dependent variable
(DK) on the re-coded age variable and the other explanatory variables. Second, I
age measured in years; and the predicted value (YAhat) estimated in step 2. If YAhat
is statistically significant, this suggests that the re-coded age variable fits the data
significant, this suggests that age measured in years fits the data significantly better
What do we find? The J-test demonstrates that Model 12, once again, significantly improves the fit of the data over Model 13 (t-value = 2.151; p = .032).
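The three-step procedure described above can be illustrated on simulated data. This sketch follows the Davidson-MacKinnon J-test logic: fit the rival model, carry its fitted values (YAhat in the text) into the preferred model, and inspect the t statistic on that term. The data are synthetic, generated so that the binary (65+) specification is the true model by construction; all values and variable names are assumptions for illustration.

```python
import numpy as np

def ols(y, X):
    """OLS fit returning coefficients and their t-statistics."""
    X = np.column_stack([np.ones(len(y)), X])      # prepend an intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = len(y) - X.shape[1]
    sigma2 = resid @ resid / df
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se

rng = np.random.default_rng(0)
n = 2000
age_years = rng.integers(18, 90, n).astype(float)
age_65plus = (age_years >= 65).astype(float)
# Simulated DK counts generated from the binary (65+) specification
dk = 1.0 + 2.5 * age_65plus + rng.normal(0.0, 1.0, n)

# Step 1: fit the rival model (age in years) and keep its fitted values
beta_a, _ = ols(dk, age_years.reshape(-1, 1))
y_hat_a = beta_a[0] + beta_a[1] * age_years

# Step 2: add the rival model's fitted values to the binary-age model
beta_j, t_j = ols(dk, np.column_stack([age_65plus, y_hat_a]))

# Step 3: an insignificant t on the fitted-values term means the
# age-in-years specification adds nothing beyond the binary one
print(f"t on rival fitted values = {t_j[2]:.2f}")
```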
The above analysis suggests that the re-coded age variable fits the data better
than the age variable in years. Supporting our earlier bivariate analysis, these
findings underscore the initial hypothesis that age only becomes important in
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
135
significantly and substantially improve model fit (F = 59.1; p < .001). Indeed, model
14 increases the adjusted R-square from .1656 to .1986 (model 13 versus model 14).
COOP) taps underlying constructs which are different from those captured by
measure actual behavior during the survey interview, while respondent level
both the re-specified age variable and the measures of respondent performance
multivariate model. So what about individual level correlates? How does the
other factors?
COOP; (4) cognitive sophistication; and (5) gender. Model 13 in Table 14 below
Age (65 plus = 1): controlling for other factors, age has a statistically significant effect on DK: respondents who are 65 years of age or older are 3.5 times more likely to answer DK
than younger respondents (3.25 DKs versus .920 DKs). Furthermore, the joint
inclusion of COMPREND and COOP accounts for about 30% of the variance in age
p < .05 and COOP: b = .697; p < .05). Put simply, respondents who are rated as
having poor or fair comprehension of survey questions are 2.67 times more likely to
answer DK than those who are rated as having good comprehension (2.46 versus
.919 DKs). Similarly, respondents who are rated as hostile or restless during the
interview are approximately 3.0 times more likely to say DK than those who are
low levels of cognitive sophistication (low level = 1) are 9.5 times more likely to answer DK than respondents who have high levels (high level = 10) of cognitive
Gender (female = 1): controlling for other factors, gender has a statistically significant effect on DK: female respondents are 1.4 times more likely to say DK than male respondents (1.29 versus
.912 DKs). So what does this mean? Why did gender become significant?
One possible explanation is the re-specification of the age variable. Indeed, gender becomes significant only after age is re-specified (model 11 versus model 12: t-value = -.621 versus 2.37), while the interaction effect (age * gender) no longer remains significant in model 12. This
finding is important because the literature has always treated age as a continuous
variable, linearly related to DK. This suggests that the significant interaction effect
7.4 Conclusion
So what did we find in the above analysis? First, the re-specified age variable (65+) fits the data better than the age-in-years variable. This result lends further support to the argument that age captures life-cycle and not cohort differences. Future research must examine this question in more depth, given that
Second, both COMPREND and COOP significantly improve model fit. This
result is important because it shows that such measures tap qualitatively different
concepts than respondent level characteristics. But why is this the case?
capturing the underlying social and cognitive dynamics of the interview, while
to perfect them. For instance, COMPREND uses a 3-point response scale, when a 5
respondents are more likely to say DK. Future studies should examine other
CHAPTER EIGHT
CONCLUSION OF THESIS
answer this question, it makes most sense to re-trace our first steps in chapter 1.
framework as well as specific guidelines for the treatment of item missing data and
(2) to understand why certain survey respondents are more likely to answer DK than
(1) Are respondents who are more likely to answer survey questions
different from those who are less likely?
(4) Can general principles be derived, so that missing data will not have to
be dealt with on a case by case basis?
is to give survey researchers a rough idea about how they should think about item
missing data—both at the design stage as well as during post-survey data correction
(imputation) stage.
In section 8.2, I discuss the general findings with particular attention to the
question of why certain survey respondents are more likely to say DK than other
both theoretically and practically. In this section, I also discuss the practical
implications of my research.
section 8.2 relate to the conceptual framework presented in section 8.3. To end
framework of the survey interview. I, then, discuss how this framework can be used
Research shows that responses to survey questions—including DK
responses—are a function of both the social and cognitive dynamics of the survey
interview (Sudman et al. 1996; Krosnick and Fabrigar 1997). On the one hand, the
survey interview has been shown to be a social encounter between the respondent
and the interviewer that is governed by certain norms and social rules. On the other
hand, the interview requires cognitive effort from the respondent who must first
understand the question, then retrieve the relevant information, and finally integrate
the information in order to answer the question. The social norms of the interview,
mental ability and respondent comprehension during the interview and (2) social
factors, such as respondent motivation and adherence to social norms (Young 1999b;
factors:
interviewer administering the question to the respondent and the respondent, in turn,
providing an answer. Task difficulty can vary depending on its characteristics (e.g.,
question wording, format, content). Considering this, the simple response model
presented above with two main effects expands to a four variable model with two
DK = f(Social, Cognitive, Social × Task, Cognitive × Task)    (24)
processes, cognitive processes, and the characteristics of the survey task. Let us further define what we mean by each of these concepts: (1) social; (2) cognitive; and (3) task.
social system (e.g., Sudman and Bradburn 1974; Bradburn 1983; Sudman et al.
1996) as well as from the literature on satisficing which places central importance
(Krosnick 1991; Krosnick and Fabrigar 1997; Krosnick 1999). So how does the
In this social system, there are two participants, or social roles—that of the
and answer questions. The social roles of both the respondent and the interviewer, in
turn, afford certain rights and prescribe certain obligations. Let me briefly describe
given question; and to be treated with respect. Respondents also have certain
best of their ability. The interviewer has certain rights, including the right to guide
the interview within the given constraints set by the researcher as well as the right to
limit the respondent’s comments to subjects relevant to the survey. The
These .rights and obligations, in turn, dictate specific social norms which
govern the behavior of both the interviewer and the respondent. By understanding
this social system, researchers can design surveys (1) that do not violate the social norms of the system, (2) that emphasize both the rights and obligations of each participant, and (3) that emphasize those social norms which motivate the respondent
In the case of the interviewer for instance, survey designers want to limit
interview protocols are used to strictly define how an interviewer should behave.
Protocols, in other words, are a mechanism used by the researcher to clearly define
an interviewer’s obligations.
survey designers should stress those social norms which motivate respondents to
give reliable and valid answers (e.g., good respondent norm and norm of
truthfulness).
of the survey. The General Social Survey (GSS), for instance, sends an introductory letter to selected households prior to the survey interview stressing the importance of
the survey: “...the results of this research will be released quickly to officials in
participate in the survey, researchers also use this technique, and others like it, to
1991).
The survey researcher, however, must keep in mind that all individuals have
multiple roles at any given time (e.g., the role of the father, son, professional) and
obligations, and social norms. Some of these roles and norms can actually facilitate
a respondent in providing reliable and valid answers, while others can actually
hinder it. Good citizen norms, for instance, can be used to motivate respondents to
while face-saving norms and norms of politeness might actually cause respondents
to alter (or edit) their answers to make them more socially acceptable.
Put another way, on the one hand, there are norms that motivate respondents
to provide answers that more closely correspond to their true value. On the other
hand, there are norms that motivate respondents to give answers which do not
researchers should attempt to maximize the role of social norms which lead to more
reliable and valid data and to minimize the role of social norms which do not
failed to address one important issue. What about a lack of norms, or anomie?
How does a survey researcher manipulate the norms of a survey interview if one of
the actors does not hold the same values as the larger social group?
alienated from society, more likely to be social isolates, and less likely to possess the
they relate to survey response, differ from those of the larger social group.
and retrieve the relevant information to answer a question. I take this concept from
the cognitive psychology literature on context effects which hypothesizes that
question response is a four stage cognitive process where respondents must (1)
understand the question, (2) retrieve the information from memory, (3) form a
judgement from the retrieved information, and (4) format the answer to the response
category (Tourangeau and Rasinski 1988). The methods literature argues that
respondents who are more cognitively sophisticated are less likely to have problems
Krosnick 1999).
ability, general political competence, and exposure to media. (Schuman and Presser
The research in this thesis suggests that two distinct dimensions of cognitive
ability actually exist: (1) acquired cognitive ability and (2) cognitive agility. On the
process information during the interview and comprehend survey questions. In other
words, cognitive sophistication is not a direct measure of survey performance.
Instead, the concept seems to be more closely related to acquired cognitive ability
needs to more fully examine the dual concepts of cognitive ability and survey
performance.
ability, researchers should design surveys so that even the least cognitively able can
understand the questions. Such strategies may include the simple wording of
questions and the use of pictorial devices such as show cards (Sudman and Bradburn 1982; Sudman et al. 1996). In addition, even among the more cognitively able,
certain topics can be cognitively demanding, such as complex public policy issues
like NAFTA.
design surveys that facilitate cognitive processing. For instance, in the case of
The central task of the survey interview is for the respondent to answer
survey questions. The given characteristics of the task (e.g., question wording,
format), in turn, can make the response process more or less difficult for the survey
respondent.
characteristics cannot be varied, survey researchers can control the normative and
manipulate three task variables to vary normative and cognitive task difficulty: (1)
The following two examples illustrate this point. First, questions concerning
retrieve from memory. To reduce the cognitive barriers associated with such
This strategy decomposes the cognitive task for respondents, making it easier
to recall the relevant information from memory (Sudman and Bradburn 1982; Sudman et al. 1996). To further reduce the cognitive demands, researchers might
request that respondents access personal records when in doubt—this can either be
could also assure respondents, in the introduction of the questionnaire, that their
When designing a survey, researchers must take into consideration both the
social (normative motivation) and cognitive (cognitive ability) aspects of the survey
interview as well as the normative and cognitive difficulties associated with the task.
To illustrate this point, let us slightly alter equation 24 using the terminology
DK = f(Normative Motivation, Cognitive Ability, Normative Motivation × Task, Cognitive Ability × Task)    (25)
Note the literature has typically treated social desirability as a psychological trait (see
DeMaio 1984). However, some research suggests that such behavior is normatively determined
(Stockings 1979).
In place of social factors, equation 25 uses normative motivation, and, in
place of cognitive factors, it employs cognitive ability. So what does the above
model suggest?
motivation; cognitive ability and the difficulty of the task (e.g., question wording).
Task difficulty, however, does not have a direct effect on DK. Instead, both
normative motivation and cognitive ability mediate the effect of task on DK.
Specifically, the normative motivation by task interaction effect can be treated as the
normative barriers associated with a given task, while the cognitive ability by task
interaction effect can be treated as the cognitive barriers associated with a given
task. Simply put, these two interaction effects suggest that respondents who are
either less normatively motivated or less cognitively able will be more likely to
In support of this model, the literature shows that respondents with lower
levels of cognitive ability and normative motivation are more likely to answer DK
when the task is more difficult (Schuman and Presser 1981; Narayan and Krosnick
different respondent and task level characteristics are related. Such knowledge is
essential for a fully functional model of DK response. Even with these limitations
Survey researchers can use the above model in both questionnaire design and
data imputation. In the case of questionnaire design, the model stresses that survey
responses. For instance, researchers can minimize DK responses by employing
strategies that reduce normative and cognitive barriers such as excluding the DK
option; using simple words and concepts; including an introduction emphasizing the
importance of the survey to the national public debate; using intermittent interviewer
In the case of post-hoc data fixes (imputation), the two interaction terms in
equation 25 above drop out because data imputation is usually concerned with
missing data on individual questions. Here the most important point is that DK
responses (and item missing data more generally) are a function of the social and
Huge gaps still remain with the conceptual framework presented above,
warranting further research. This model does establish general guidelines about how
to reduce DK response and finds strong theoretical and empirical support in the
framework used to explain response effects. For statisticians imputing missing data,
the above model provides a general blueprint concerning the variables that should be
included in the data imputation model. In section 8.2, I discuss in greater depth
Finally, it is important to note that the above model is probably relevant for
other forms of missing data. However, future research still must test the validity of
Both the research cited in this thesis as well as the research presented in the
previous chapters show that respondents who are more likely to answer DK are
systematically different than respondents who are less likely. Specifically, like past
studies, our research here has demonstrated that, on average, the less educated,
female, less knowledgeable, black, older, those less active in civic activities and the
less cognitively sophisticated are more likely to answer DK. In short, DK responses
about potential bias in point estimates. But so what? What do these correlations
(2) age; (3) cognitive sophistication; (4) civic participation; (5) COOP; and (6)
Education: the research here shows that education is negatively and non-
linearly related to DK and that the education effect can be completely explained
away by two factors: (1) age and (2) cognitive sophistication. These findings support
the use of education as a proxy for both cognitive ability and normative motivation.
However, it is important to note that no other research has explained away the
education effect, suggesting
that future confirmation is needed before any definitive conclusions can be made.
So what are some practical lessons we can learn from these findings?
First, DK responses occur more often among the less educated than the more
educated (Sudman and Bradburn 1981; Sudman et al. 1996; Krosnick 1991). One simple rule
than a high school degree—questionnaire designers should write at the same level
the above analysis, individuals with less than a high school degree represented 24
percent of the sampled population but were responsible for approximately 40 percent
of all DKs. Questionnaire designers should be keenly aware that a rather small
(2) Any and all devices should be employed in the design of the
questionnaire to ease the cognitive difficulties of the survey interview,
always keeping in mind that a large portion of survey error is probably
disproportionately located among a small subgroup. Such devices could
include hand cards, clear transitions between sections, and breaking up
long blocks of questions.
(3) Pre-testing questionnaires should take into consideration that a
disproportionate level of DKs (and probably survey error in general)
occurs among a small group of respondents. Specifically, questionnaire
designers might want to disproportionately sample (oversample) low-
education respondents in order to detect specific problems with the
instrument. At the pre-test stage, oversampling of low-education groups
may also be desirable when pre-testing instruments using focus groups
and cognitive interviews. Pre-testing might also be further improved by
stratifying the pretest sample according to age and cognitive ability, as
these two variables actually account for the education effect.
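The oversampling strategy described in point (3) can be sketched as follows. This is an illustrative fragment rather than part of the thesis's analysis: the education categories, the weight of three, and the composition of the frame are all hypothetical, with the less-than-high-school share set to the 24 percent figure reported earlier.

```python
import random

# Hypothetical pretest frame: (education level, respondent id) pairs,
# with 24 percent of the frame having less than a high school degree.
frame = (
    [("<HS", i) for i in range(24)]
    + [("HS", i) for i in range(24, 54)]
    + [(">HS", i) for i in range(54, 100)]
)

# Illustrative oversampling weights: draw low-education respondents
# at three times the rate of everyone else.
weights = {"<HS": 3.0, "HS": 1.0, ">HS": 1.0}

def oversample(frame, weights, n, seed=0):
    """Draw a pretest sample that overrepresents the weighted strata."""
    rng = random.Random(seed)
    w = [weights[edu] for edu, _ in frame]
    return rng.choices(frame, weights=w, k=n)

sample = oversample(frame, weights, n=200)
share_low = sum(1 for edu, _ in sample if edu == "<HS") / len(sample)
# With a weight of 3, the expected <HS share of the pretest sample is
# 72/148, roughly double its 24 percent share of the frame.
print(f"low-education share of pretest sample: {share_low:.0%}")
```

The same weighting logic extends to stratifying the pretest by age or by a cognitive-ability screener, the two variables that the analysis here suggests actually drive the education effect.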
Third, the research in this thesis also shows that other correlates of cognitive
ability, such as verbal ability, knowledge, and information exposure do a better job
models? The short answer is yes. One initial suggestion is that other measures of
Age: the research has demonstrated that the age effect is, in part, a function
however, suggests that older respondents are not less cognitively sophisticated but
instead are more likely to have problems understanding survey questions. In other
Furthermore, the research in this thesis has shown that age only has an effect
younger do not differ greatly in their probability of providing a DK response.
Instead, the age effect is most prominent among respondents 65 years of age or older
who are much more likely to answer DK than their younger counterparts. This result
suggests that the age effect most probably results from life-cycle differences, such as
cognitive and social senescence, and not from cohort differences. However, this
conclusion must be seriously qualified considering that we only analyzed one point
in time. Future research needs to examine the relative weight of life cycle and
These results also suggest solutions and raise new questions. First, as
younger respondents). Second, the age effect, in part, results from lower levels of
values from society and/or are less likely to adhere to the societal norms. So how do
The short answer is that we really do not know. Future research needs to
instance, research shows that older respondents are more likely to use stories and life
experiences when answering questions (Schwarz et al. 1999). Perhaps the initiation
Third, the results here suggest that methodologists should focus on the
specification of the age variable. Indeed, most of the methods research has treated
using a direct measure of cognitive sophistication, we were able to explain away the
education effect.
survey performance are not necessarily the same thing. Yes, respondents’ general
both theoretically and empirically justified. My principal reason, though, for the
interpretation of the results. I sacrificed depth for efficiency. Future research should
break out the measures to determine their relative weight in explaining DK.
that when imputation is predicted to play an important role in a study—direct rather
activities are less likely to say DK because they are more cognitively sophisticated
and not because they are more normatively motivated. This finding runs counter to
the methods literature, which has assumed that civic participation is a proxy variable
for respondent motivation.
predictors of DK. This finding is important for three reasons. First, such measures
are easily administered and, therefore, should always be included when researchers
are considering data imputation. Second, the results suggest that methods
causality. Do respondent comprehension and motivation influence DK? Or, does
the number of DK responses that a given respondent provides influence the
characteristics tap normative motivation and which tap cognitive ability? Where do
A quick glance at the above table indicates that many more proxy measures
for two reasons. First, cognitive ability may simply be the more important predictor
of DK. Second, survey methodologists have not devoted the same energies to the
development of measures of normative motivation. Considering these deficiencies,
targeted at capturing adherence to social norms associated with the survey interview.
Indeed, until such measures are developed, the socio-cognitive framework presented
Before closing out this study, we are left with one pending question: should
we really impute missing values on attitudinal items? Does data imputation have a
I would say yes and no. On questions with high levels of DK (20 percent or
more), especially those that require specific knowledge (e.g., NAFTA), I would
not recommend imputing DK responses, for two reasons. First, the proportion of DK
responses typically is too large: the simple rule of thumb for imputation is that as
the percentage of missing cases reaches 25 or 30 percent, imputed data become less
reliable and valid (Little and Rubin 1987). Second, a significant portion of the DK
category most definitely says something substantive about what the public both
being (e.g., happiness), however, data imputation may be the correct strategy for two
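The 25 to 30 percent rule of thumb cited from Little and Rubin (1987) can be expressed as a simple screening check. This is an illustrative sketch only; the function name and the choice of 25 percent as the exact cutoff are assumptions, not part of the thesis.

```python
# Screen an item for imputation against the rough missing-data
# ceiling discussed in the text (the 0.25 cutoff is illustrative).
def imputation_advisable(n_missing: int, n_total: int,
                         ceiling: float = 0.25) -> bool:
    """Return True when the item's missing share falls below the ceiling."""
    return (n_missing / n_total) < ceiling

# An item with 20 percent DK falls under the ceiling; an item with
# 30 percent DK does not.
print(imputation_advisable(200, 1000))   # True
print(imputation_advisable(300, 1000))   # False
```

In practice, of course, the substantive character of the DK category matters as much as its size, as the surrounding discussion makes clear.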
underlying true value when it comes to their subjective well-being. All respondents,
items is not a common practice in the social sciences today (for an interesting
exception, see Gelman et al. 1998). In the end, the decision to impute on
BIBLIOGRAPHY
Anderson, A.B., Basilevsky, A., and Hum, D.P. (1983) "Missing Data," in Rossi, P.
et al. (eds), Handbook of Survey Research. Academic Press, Inc.: New York.
Bishop, G.F., Oldendick, R.W., Tuchfarber, A.J., and Bennett, S.E. (1980) "Pseudo-
Opinions on Public Affairs," Public Opinion Quarterly, 44, pp.198-209.
Bishop, G.F., Oldendick, R.W., and Tuchfarber, A.J. (1983) "Effects of Filter Questions
in Public Opinion Surveys,” Public Opinion Quarterly, 47, pp.528-546.
Bohrnstedt, G.W., and Knoke, D. (1994) Statistics for Social Data Analysis. F.E.
Peacock Publishers, Inc.: New York.
Bradburn, N., Sudman, S., and Associates (1979) Improving Interview Methods and
Questionnaire Design. Jossey-Bass: San Francisco.
Brody, C.J. (1986) "Things Are Rarely Black and White: Admitting Gray into the
Converse Model of Attitude Stability," American Journal of Sociology, 92,
pp.657-677.
Campbell, A., Converse, P.E., Miller, W.E., and Stokes, D.E. (1960) The American
Voter. University of Chicago Press.
Campbell, D.T., and Fiske, D.W. (1959) "Convergent and Discriminant Validation
by the Multitrait-Multimethod Matrix," Psychological Bulletin, 56,
pp.81-105.
Ceci, S.J. (1992) "How Much Does Schooling Influence General Intelligence and Its
Cognitive Components? A Reassessment of the Evidence,” Developmental
Psychology, 27, pp.703-722.
Cochran, W.G., (1977) Sampling Techniques. John Wiley & Sons: New York.
Converse, P.E., (1964) “The Nature of Belief Systems in Mass Publics,” in D.E.
Apter (ed.), Ideology and Discontent, New York: Free Press, pp.206-266.
Coombs, C.H., and Coombs, L.C. (1977) "'Don't Know': Item Ambiguity or
Respondent Uncertainty," Public Opinion Quarterly, 40, pp.497-514.
Couper, M.P., Singer, E., and Kulka, R.A. (1997) "Participation in the Decennial
Census: Politics, Privacy, and Pressures," American Politics
Quarterly, 26, pp.59-80.
Davis, James A. (1980) "Conservative Weather in a Liberalizing Climate:
Change in Selected NORC General Social Survey Items, 1972-1978,"
Social Forces, 58, pp.1129-1156.
Davis, James A. (1992) "Changeable Weather in a Cooling Climate Atop the
Liberal Plateau: Conversion and Replacement in 42 Items, 1972-1989,"
Public Opinion Quarterly, 56, pp.261-306.
Davis, J.A., and Smith, T.W. (1996) General Social Surveys, 1972-1996:
Cumulative Codebook. The Roper Center for Public Opinion Research.
Dillman, D.A. (1978) Mail and Telephone Surveys: The Total Design Method. John
Wiley & Sons, Inc.: New York.
Dillman, D.A. (2000) Mail and Internet Surveys: The Tailored Design Method. John
Wiley & Sons, Inc.: New York.
Feick, L.F. (1989) "Latent Class Analysis of Survey Questions that Include Don't
Know Responses," Public Opinion Quarterly, 53, pp.525-547.
Francis, J. and Busch L., (1975) “What We Don’t Know about ‘I Don’t Know’,”
Public Opinion Quarterly, 39, pp.207-218.
Gelman, A., King, G., and Liu, C. (1998) "Not Asked and Not Answered: Multiple
Imputation for Multiple Surveys," Journal of the American Statistical
Association, Vol. 93, No. 443, pp.846-857.
Gergen, K.J., and Back, K.W., (1966) “Communication in the Interview and the
Disengaged Respondent,” Public Opinion Quarterly, 30, pp. 17-33.
Gilljam, M., and Granberg, D., (1993) “Should We Take Don’t Know for An
Answer,” Public Opinion Quarterly, 57, pp.348-357.
Greene, W.H., (1997) Econometric Analysis. Prentice Hall: Upper Saddle River,
New Jersey.
Groves, R.M. (1989) Survey Errors and Survey Costs. John Wiley & Sons, Inc.:
New York.
Hippler, H.J., and Schwarz, N. (1989) "No Opinion Filters: A Cognitive Perspective,"
International Journal of Public Opinion Research, 1:1, pp.77-87.
Hyman, H.H., Wright, C.R., and Reed, J.S. (1975) The Enduring Effects of
Education. University of Chicago Press: Chicago.
Kish, L. (1965) Survey Sampling. John Wiley & Sons, Inc.: New York.
Kohut, A. (1981) "The 1980 Presidential Polls: A Review of Disparate Methods and
Results," Proceedings of the Section on Survey Research Methods,
American Statistical Association, pp.41-46.
Krosnick, J.A., (1991) “Response Strategies for Coping with the Cognitive Demands
of Attitude Measures in Surveys,” Applied Cognitive Psychology, 5, pp.213-
236.
Krosnick, J.A. and Alwin, D.F., (1987) “Satisficing: A Strategy for Dealing with the
Demands of Survey Questions,” GSS Methodological Report 46. March,
1987.
Krosnick, J.A., and Fabrigar, L.R. (1997) "Designing Rating Scales for Effective
Measurement in Surveys," in Lyberg et al. (eds), Survey Measurement and
Process Quality. John Wiley & Sons, Inc.: New York, pp.141-164.
de Leeuw, Edith, and Collins, Martin (1997) "Data Collection Methods and Survey
Quality: An Overview," in Lyberg et al. (eds), Survey Measurement
and Process Quality. John Wiley & Sons, Inc.: New York.
Little, R.J.A., and Rubin, D.B. (1987) Statistical Analysis with Missing Data. John
Wiley & Sons, Inc.: New York.
Madow, W.G., and Olkin, I. (eds) (1983) Incomplete Data in Sample Surveys, Vol. 3:
Proceedings of the Symposium. Academic Press: New York.
Mathiowetz, N.A., DeMaio, T.J., and Martin, E. (1991) "Political Alienation, Voter
Registration, and the US 1990 Census," paper presented at the annual
conference of the American Association for Public Opinion Research,
Phoenix, AZ.
Mayer, William G. (1992) The Changing American Mind: How and Why
American Public Opinion Changed Between 1960 and 1988. University
of Michigan Press: Ann Arbor.
McCutcheon, A.L., and Alwin, D.F. (1987) Latent Class Analysis. Sage Publications:
London.
Narayan, S., and Krosnick, J.A. (1996) "Education Moderates Some Response
Effects in Attitude Measurement," Public Opinion Quarterly, 60, pp.86-
96.
Nie, N.H., Junn, J., and Stehlik-Barry, K. (1996) Education and Democratic Citizenship
in America. The University of Chicago Press: Chicago and London.
Rapoport, R.B. (1982) "Sex Differences in Attitude Expression: A Generational
Explanation," Public Opinion Quarterly, 46, pp.86-96.
Rubin, D.B., (1976) “Inference and Missing Data,” Biometrika, 63, pp.581-592.
Sanchez, M.E., and Morchio, G. (1992) "Probing 'Don't Know' Answers: Effects on
Survey Estimates and Variable Relationships," Public Opinion Quarterly,
56, pp.454-474.
Schuman, H., and Presser, S. (1981) Questions & Answers in Attitude Surveys:
Experiments on Question Form, Wording, and Context. Sage Publications,
Inc.: Thousand Oaks, California.
Schuman, Howard, Steeh, C., Bobo, L., and Krysan, M. (1986) Racial Attitudes
in America: Trends and Interpretations. Harvard University Press.
Schwarz, N., Park, D., Knauper, B., and Sudman, S. (1999) Cognition, Aging, and
Self-Reports. Academic Press: New York.
Schuman, H. and Scott, J., (1989) “Response Effects Over Time: Two Experiments”
in Sociological Methods & Research 17:4, pp.398-408.
Southwell, P.L. (1985) "Alienation and Nonvoting in the United States: A Refined
Operationalization," Western Political Quarterly, 38, pp.663-674.
Stolzenberg, R.M., and Relles, D.A. (1997) "Tools for Intuition About Sample
Selection Bias and Its Correction," American Sociological Review, 62,
pp.494-507.
Sudman, S., Bradburn, N.M., and Schwarz, N. (1996) Thinking About Answers.
Jossey-Bass: San Francisco.
Sudman, S., and Bradburn, N.M. (1982) Asking Questions. Jossey-Bass: San
Francisco.
Tourangeau, R., Rips, L., and Rasinski, K.A. (2000) The Psychology of Survey
Response. Cambridge University Press: New York.
Verba, S., and Nie, N.H. (1972) Participation in America. Harper & Row:
New York.
Wright, B.D., and Masters, G.N. (1982) Rating Scale Analysis. MESA Press:
Chicago.
Young, C.A. (1998b) "Sex, Crimes, and Economic Downturns: Why Americans
Have Become Less Confident in Their Political and Non-political Institutions,"
Unpublished Master's Thesis, University of Chicago.
Young, C.A. (1999a) "Mean Square Error: A Framework for Classifying Survey
Error," paper presented to the research staff of DATA-UFF at the
Universidade Federal Fluminense, Niterói, Brazil. March 9, 1999.
Young, C.A. (1999b) "An Analysis of Can't Choose Responses on the 1993
International Social Survey Program," paper presented at the annual meeting
of the International Social Survey Program, Madrid, Spain. April 26, 1999.
Young, C.A. (1999c) "What We Now Know about 'I Don't Know': An Analysis of
the Relationship between 'Don't Know' and Education," paper presented at
the 54th annual meeting of the American Association for Public Opinion
Research, St. Pete Beach, Florida. May 15, 1999.
Young, C.A. (2000) "O que queremos dizer quando falamos sobre qualidade em
pesquisa?: Definição e Classificação de Conceitos" ["What Do We Mean When
We Talk about Quality in Survey Research?: Definition and Classification of
Concepts"], presented at the Ford Foundation Series on Survey Methodology
at DataUff, Universidade Federal Fluminense. November 2000, Rio de Janeiro.
Young, C.A. (2001) "In Search of Country and House Effects: An International
Comparison of Data Quality," Unpublished paper, March 2001. São Paulo.
Young, C.A., Andrade, F.C., and Moura, C.B. (2001) “Non-Response in Sample
Surveys: A Clarification of Concepts and Our Empirical Experience in
the United States and Brazil,” paper presented at the annual Pesquisa
Social Brasileira (PESB) seminar, April. Rio de Janeiro.