Young - Why Surveys

THE UNIVERSITY OF CHICAGO
EXPLAINING WHY SURVEY RESPONDENTS ANSWER T DON’T KNOW’:

AN ANALYSIS OF MISSING DATA AND DATA QUALITY ON THE
GENERAL SOCIAL SURVEY
A DISSERTATION SUBMITTED TO
THE FACULTY OF THE DIVISION OF THE SOCIAL SCIENCES
IN THE CANDIDACY FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
DEPARTMENT OF SOCIOLOGY
BY
CLIFFORD ALEXANDER YOUNG
CHICAGO, ILLINOIS
DECEMBER 2001
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
UMI Number: 3029551
UMI Microform 3029551

Copyright 2002 by Bell & Howell Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
Bell & Howell Information and Learning Company

300 North Zeeb Road
P.O. Box 1346
Ann Arbor, Ml 48106-1346
TABLE OF CONTENTS
LIST OF FIGURES.................................................................................... iv
LIST OF TABLES.................................................................................... v
ACKNOWLEDGEMENTS....................... ............................................... vi
ABSTRACT.......................... vii
1.0 Introduction of Thesis........................................................................... 1

1.1 The Problem.................................................................................... 3
1.2 Subject of Study............................................................................... 13
1.3 Organization of Thesis......................................................................15
2.0 Discussion of the Literature..................................................................... 16
2.1 Defining What We Mean By Don’t Know...........................................17
2.2 Question Level Correlates and Data Quality ................................25
2.3 DK Option and Issues of Quality. ................................................... 34
2.4 Individual Level Correlates........................... 37
2.5 DK and Missing Data ...... 44
3.0 Data and Methods.....................................................................................47
3.1 Data............................ ..................... ................................................. 45
3.2 DK Correlates........................ ............................................................50
3.3 Methods.................................................................................. ...........53
3.4 Model................................................................................. ................56
4.0 Analysis of the Baseline Model................................................................58
4.1 Literature Review............................................................................... 59
4.2 Methods..............................................................................................60
4.3 Bivariate Analysis of Demographic and Behavioral Correlates 61
4.4 Multivariate Analysis of Demographic and Behavioral Correlates 64
4.5 Conclusion..........................................................................................76
5.0 Analysis of the Association between DK and Education.......................... 78
5.1 Review of the Literature..................................................................... 81
5.2 Methods........................................................................................... 85
5.3 Examination of the Relationship between Education and DK.............. 87
6.0 An Analysis of the Age and Civic Participation Effects.......................... 99
6.1 D iscussion o f the M easures............................................................................ 103
6.2 Methods and Data............................................................... ............. 107

6.3 Bivariate Analysis of Age and Civic Participation............................... 117
6.4 Validation and Decomposition of Ageand Civic Participation...............121
6.5 Conclusion......................................................................................... 128
7.0 Final Multivariate Models.........................................................................130
7.1 Models......................................................... 131
7.2 Testing Model Fit......................................................... 133
7.3 Analysis of DK Predictors.............. 136
7.4 Conclusion................................. 139
8.0 Conclusion of Thesis....................... ................................................... 141
8.1 Conceptual Framework ................................................ 142
8.2 Analysis of Respondent Level Correlates.............................. ........ . 154
8.3 Wrapping Things Up....................................... 160
BIBLIOGRAPHY.............................................. ........................................... 163
IV
LIST OF FIGURES
Figure 1: Distribution of “Don’t Know” Responses from Attitude Items on

the 1987 General Social Survey.................................................. 49
Figure 2: Relationship between DK and Education....................................89
Figure 3: Relationship between DK and Education (Controlling for Cognitive
Sophistication)............................................................................91
Figure 4: Relationship between Education and DK(Controlling for Cognitive
Sophistication and Civic Participation)........................................93
Figure 5: Relationship between Education and DK (Controlling for Cognitive
Sophistication, Civic Participation, and Gender)........................ 94
Figure 6: Relationship between Education and DK (Controlling for Cognitive
Sophistication, Civic Participation, Gender, and Age)................ 95
Figure 7: Relationship between DK and Respondent Cooperation
(COOP) ............................ ..................... 113
Figure 8: Relationship between DK and Respondent Comprehension
(COMPREND)........................... 114
Figure 9: DK Rate by Age of the Respondent........................................ 118
Figure 10: DK Rate by Number of Civic Activities.................................. 120
V
LIST OF TABLES
Table 1: Decomposing the DK Category: Ambivalent Attitudes vs. Other DK-

Like Responses.......................................................................... 20
Table 2: Descriptive Statistics: Variable Names, Means, Standard Deviations,
and Ranges................................................................................. 52
Table 3: Standard Error, DEFF, and DEFT of DK Correlates....................55
Table 4: Bivariate Correlations between Don’t Know and Other Selected
Variables................................................................................. 63
Table 5: Summated DK Index regressed on DK Correlates..................... 66
Table 6: Unstandardized OLS Estimates for Models with Education Main
Effects controlling for DK Predictors...........................................88
Table 7a: Frequency Distribution for COOP..............................................I l l
Table 7b: Frequency Distribution for COMPREND....................... Ill
Table 8: Two-Way Crosstab of COMPREND and COOP.........................112
Table 9: Descriptive Statistics, Indicators of Respondent Motivation and
Comprehension............................................................................ 115
Table 10: Pearson’s Bivariate Correlation (r) of Indicators of Respondent
Alienation..................................... 117
Table 11: Pearson’s Bivariate Correlations of Age, Civic Participation and
Indicators of Respondent Motivation and Comprehension.......... 122
Table 12: Age and the Relative Weight of Respondent Motivation and
Comprehension.......................................................................... 124
Table 13: Civic Participation and the Relative Weight of Respondent
Motivation and Comprehension............................................... 127
Table 14: Summated DK Index regressed on COMPREND, COOP,
Controlling for Other Predictors................................................ 137
Table 15: Respondent Variables by Normative Motivation and Cognitive
Ability.......................................................................................160
vi
ACKNOWLEDGEMENTS
I write my acknowledgements from Sao Paulo, Brazil—a huge, unforgiving

metropolis—a world away from the serene confines of Hyde Park and the University
of Chicago. This comparison gives me pause, making me reflect upon just how
much personally and intellectually I have changed over the course of my graduate
studies.
The changes in my personal life are most easily put into quantitative terms.
During this period, I married; had two children; moved to a new country; lived in
three different cities; and left my place of employment three times. Without a doubt,
these changes have influenced the contents of my thesis. How? And in what way? I
really do not have space here to waste.
The changes in my intellectual life, while less quantifiable, are just as
profound. Indeed, I am not the same person intellectually who started his graduate
studies six years ago being more rational, analytical, synthetic, and systematic in
thinking about problems. I also had the opportunity at the University of Chicago to
interact with intelligent and dynamic professors and fellow students from the most
varied of backgrounds. Such interaction influenced how I look at the world.
Here I want to thank the two central figures of my graduate training—James
Davis, my thesis advisor, and Tom Smith, my former boss at the National Opinion
Research Center. Without these two individuals, I would be a different person
today.
James Davis is one of the great educators and institution builders in the
quantitative social sciences, having both founded the General Social Survey (GSS)
as well as taught several generations of social scientists about the logic of causal
order, the equivalent of the law student’s Socratic method, or the MBA student’s
case study method. In my first year of graduate studies, Jim’s course on Soc
Methods taught me how to think about causal order as well as introduced me to
quantitative methods and survey research as a professional field.
Tom Smith, in turn, taught me the importance of good measures and high
quality surveys as a graduate research assistant on the GSS. In short, good science
requires good measures. The GSS is also where I decided to become a survey
methodologist and to write a methods dissertation. To Tom I say, I never rest in my
quest to kill “the measurement error vampire”.
Last but not least, I want to thank my family for their steadfast support
throughout my graduate experience. Only those who have both worked and, at the
same time, have written a doctoral thesis can attest to the brutal toll it takes on one’s
family. Cristina, my wife, deserves special praise for her support of me. Thank
You. I also want to acknowledge my two beautiful sons—Lucas, now 3, and
Thomas, now 1—they are an endless source of inspiration. If someday they read
this passage, I want them to know how much they mean to me.
To all those who I have not mentioned, I thank you.
ABSTRACT
Sociologists are important consumers of sample surveys. Modem

quantitative sociology over the last half-century has made increasing use of high-
quality surveys. Surveys, however, do not need to be m echanisms solely for testing
sociological concepts. Indeed, such concepts can also provide important insights
into how best to maximize survey quality.
Analyzing the 1987 General Social Survey, this thesis has two primary
objectives: (1) to establish a general framework for the treatment of Don’t Know
Responses (DK) and (2) to understand why certain survey respondents are more
likely to answer DK than other respondents. So what do the results of this study
suggest?
First, this study shows that DK responses are a function of two overarching
respondent level variables: (1) normative motivation and (2) cognitive ability.
Specifically, respondents who feel socially obliged to provide a substantive response
will be less likely to answer DK, while those who are more cognitively sophisticated
will also be less likely to say DK.
This model can help researchers both at the questionnaire design stage as
well as during the post-survey data imputation stage. For instance, to minimize DK
rates, researchers may want to normatively motivate respondents by stressing the
importance of the survey. Similarly, statisticians should include questions on
surveys that are highly correlated with item nonresponse in order to increase the
efficiency of imputing missing data.
Second, confirming past research, respondents who answer DK are
systematically different from respondents who do not. Specifically, the less
educated, the less cognitively sophisticated, the elderly, black, female, and the less
likely to participate in civic activities are all more likely to answer DK.
Two additional findings should be noted here. First, unlike past methods
research, this study finds that, in combination, two factors completely explain away
the association between DK and education: (1) cognitive sophistication and (2) age.
Second, this research strongly suggests that the age effect results from
changes in the life-cycle and not cohort differences. Specifically, there are few
differences in DK among respondents between 18 and 64 years of age, while large
differences for those 65 years of age and older.
1
CHAPTER ONE
INTRODUCTION OF THESIS
1.0 Introduction of Thesis

Sociologists are important consumers of sample surveys. Modem
quantitative sociology over the last half century has made increasing use of high-
quality surveys. There is no better indicator of the central role of the sample survey
in quantitative sociology than the General Social Survey, which to date has trained
several generations of sociologists; won tenure for scores of young professors; and
provided the means to develop and perfect the measurement of sociological
phenomena. Surveys, however, do not need to be mechanisms solely for testing
sociological concepts. Indeed, sociological concepts can also provide important
insights into how best to maximize survey quality.
One way in which sociological concepts can be used to improve data quality
is by explaining why certain respondents are more likely to answer survey questions
than others. By understanding “why”, survey methodologists will be able to develop
better strategies to maximize survey response and survey quality. In addition, a
more thorough understanding of item missing data can help guide analysts in how to
deal with it in analysis. In pursuit of this, I attempt to answer four questions in this
thesis:
2
(1) Are respondents who are more likely to answer survey questions
different from those who are less likely?
(2) Along what dimensions (demographic, behavioral, and/or attitudinal) are

respondents who are more likely to answer survey questions different
from those who are less likely?
(3) What social and/or psychological processes explain why some

respondents are more likely to answer survey questions than others?
(4) Can general principles be derived, so that missing data will not have to
be dealt with on a case by case basis?
3
1.1 The Problem
1.1.1 Definitions
Quantitative sociologists use statistical techniques which have been designed
to examine rectangular data matrices. In standard statistical packages such as SPSS,
SAS, and STATA, the columns of the data matrix represents variables (both discrete
and continuous) and the rows represent the unit of analysis (e.g., individuals,
companies, households, etc.). Every data analyst has at one time or another had to
confront the problem of missing data. Missing data refers to when either some or all
of the values in the data matrix are not observed for a given respondent (Little and
Rubin 1987).
There are two forms of missing data: unit m d item nonresponse. Unit
nonresponse refers to when “...units in the selected sample and eligible for the
survey do not provide the requested information, or the provided information is
unusable" (Madow and Olkin 1983). Item nonresponse refers to when “[eligible
units in a selected sample provide some, but not all, of the required information or
the information for some items is unusable” (Madow and Olkin 1983).
Item nonresponse can be further broken down into two sub-dimensions: (1)
process item nonresponse and (2) interview item nonresponse. By processes item
nonresponse, I refer to item missing data resulting from problems with pre-survey
questionnaire formatting (e.g., improper skip patterns in questionnaires) as well as
with post-processing including editing, coding, and data capture (see Smith 1993
for examples of questionnaire formatting problems). Process item nonresponse
4
arises from problems with the survey production process and, therefore, can be
minimized with redundant systems and checks. Computer assisted interviewing
(CAI) has made it especially simple for survey researchers to minimize process item
nonresponse through computerized range, consistency, and m issing value checks
(Lyberg & Kasprzyk 1997; Weeks 1992).
By interview item nonresponse, I am referring to item missing data resulting
from the social, psychological, and cognitive dynamics of the interview. Many
factors, in addition to individual level characteristics, result in missing data,
including question form, order, and context; interviewer behavior; and mode of
questionnaire administration. In this thesis, I will be analyzing interview
nonresponse. Therefore, from this point on, when referring to item nonresponse, I
mean interview item nonresponse.
1.1.2 Item Missing Data and Its Effect on Substantive

Research
Quantitative sociologists analyzing survey data typically exclude missing
data from their analysis. By ignoring missing data however, analysts must confront
two potential problems: (1) increased sampling error and (2) bias. Let us further
define our terms.
First, added sampling error refers to larger standard errors (wider confidence
intervals) and, hence, less precise estimates. In brief, by excluding cases with
missing data, we reduce the effective sample size of our analysis, which, in turn,
increases the standard error.
Note the standard error is inversely related to the square root of the sample
size (see equation 1 below). Thus, as the effective sample size decreases, the
standard error increases.
Standard Error (y) = Vvar(y) / n (1)

y = sample mean
var (y) = sampling variance o f the sample mean
n = sample size
Second, bias refers to a systematic deviation between the sample mean and
the population mean (see equation 2a below) where the deviation is constant across
infinite replications of a survey (Groves 1989). Simply put, this means that, if an
analyst wants to determine the average US household income and decides to conduct
the study via the internet, he/she will overestimate average income independently o f
sample size, given that Americans with internet access in their homes are more
affluent than those without.
Bias (y) = E(|Y-y|) (2a)

Y = Population mean
y = Sample mean
E = In Expectation
(across infinite replications of a survey)
Bias, as it relates to item nonresponse, is a function of two components: (1)
the size of the item nonresponse subgroup and (2) the difference between the mean
for the subgroup giving a substantive response and the mean for the item
nonresponse subgroup (see equation 2b below).
6
Bias (y) = Wnr (|ysr-ynr|) (2b)
Wnr = Size o f Nonresponse Subgroup
ysr = Mean for the subgroup giving
a substantive response
ynr = Mean for the nonresponse
subgroup
Equation 2b suggests that, when the item nonresponse subgroup is (1) large
relative to the subgroup giving a substantive response and/or (2) quite different from
the subgroup giving a substantive response, the sample mean will deviate
substantially from the population mean. In such cases, analysts are likely to draw
incorrect inferences about the population of interest. Considering the
aforementioned limitations of excluding item missing data, how might analysts
effectively minimize error associated with item nonresponse?
1.1.3 Dealing with Item Nonresponse

Survey researchers can reduce problems of missing data in two ways: (1) by
designing questions that minimize item nonresponse and/or (2) by replacing the item
missing data with imputed values. I will briefly discuss how survey researchers use
these two methods to minimize problems of missing data.
1.1 3 .1 Questionnaire Design
Questionnaire designers have used a variety of strategies to minimize item
nonresponse through questionnaire design. The strategy, in general, has been to
sufficiently motivate the respondent to provide a substantive response (1) by
reducing social barriers (e.g., privacy concerns) and/or (2) by reducing cognitive
7
barriers (e.g., the use of difficult words) (Young 1999b; 1999c; 1999d). The three
following examples will illustrate this point.
First, on income questions, questionnaire designers will typically ask
respondents for their income within a given range, or interval, (e.g., $25,000-35,000)
rather than ask for their exact income. This strategy has been shown to reduce
missing data (e.g, Sudman and Bradbum 1982). The methods literature, in turn,
speculates that respondents are more likely to give a substantive response on closed-
ended income questions because they ensure respondents a greater degree of
confidentiality (Bradbum, Sudman, and Associates 1979; Sudman and Bradbum
1982).
Second, in order to minimize don’t know (DK) responses, some surveys,
such as the GSS, will typically not offer an explicit DK category. Instead, the
interviewer is instructed to record a DK response only after the respondent
volunteers it (Davis and Smith 1996). In support of this strategy, extensive research
shows that the demands of the question confine respondent behavior (e.g., Sudman
and Bradbum 1973; Schuman and Presser 1981; Bishop et. al. 1980; Krosnick and
Fabrigar 1997 ). Respondents, therefore, will not typically provide answers that are
not offered as explicit options. This same research, however, is not conclusive on
whether minimizing DK responses improves or worsens data quality.
Third, on pre-election polls which ask respondents their intention to vote for
a given candidate, survey organizations use secret ballots to minimize the rate of
1 While the name, secret ballot, suggests a highly sophisticated technique, the method
is nothing more than a self-administered questionnaire.
undecideds (Perry 1979; Traugott and Tucker 1984). The secret ballot method is
designed to provide respondents with a greater degree of privacy, eliminating the
concern that their choice of candidate will be socially censured by the interviewer
(Perry 1979; Kohut 1981).
While survey methodologists have employed a variety of methods to
minimize item nonresponse, this effort has been directed most often in an ad hoc
manner, varying considerably from case to case. The lack of any general guiding
principles is most evident when analyzing the primary questionnaire design primers
in the field (e.g., Dillman 1978; Dillman 2000; Sudman and Bradbum 1982; Sudman
et al. 1996). None of them has a chapter (or even a portion of a chapter) devoted
solely to item nonresponse. This suggests that the survey methods literature is in
need of a conceptual framework to help guide methodologists in how best to
minimize the problem of item missing data.
1.1.3.2 Data Imputation
Missing data will occur even on the best-designed surveys. Survey
organizations, therefore, allocate considerable resources to correct for item missing
data. Item nonresponse has been dealt with using statistical imputation models that
first predict missing data values based upon a respondent’s known characteristics
and then replace the missing data with predicted values. A wide-variety of data
imputation models are presently used ranging from those that simply replace missing
values on a given variable with the grand mean of that variable (grand mean
9
imputation) to highly sophisticated multivariate imputation models that account for
2
both within and between imputation variance (multiple iterative imputation).
Depending on the nature of the missing data (missing data mechanism),
different assumptions must be made which, in turn, affect the appropriateness of the
imputation model employed. Let us discuss in detail the different assumptions
(Rubin 1976; Rubin and Little 1987). The least restrictive assumption in the data
imputation literature is when the missing data is considered to be Missing
Completely at Random (MCAR).
P r (Ri = 1 [Ui, V i) =■m (constant) (3)
where U = Characteristic with some missing values (income)

:V = Characteristic with no missing values (education)
R - Response to U (1 if observed; 0 if not observed)
i = Respondent i
m = Type of missing data mechanism
In the case of MCAR {see equation 3 above), the probability that we observe
income (characteristic U) for respondent i is independent of education (characteristic
V). The missing data mechanism (m) is proportionately distributed across levels of
education. MCAR assumes that nonrespondents are not systematically different
from respondents in respect to education.
For a further discussion of the data imputation literature see Andersen et. al. 1983;
Little and Rubin 1987; Lesser and Kalsbeek 1992. Survey statisticians have put enormous effort
into the development of statistically valid and reliable imputation models over the last twenty
years (1979-1999). Indeed, over this period, the number o f articles on data imputation appearing
per year in IAS A (Journal o f American Statistical Association)—the premier statistics journal—
has increased fivefold (my own research). Research into data imputation has also become the
“hot” topic, replacing the more traditional areas in statistics related to survey methods, such as
survey sampling (Groves 1996).
10
MCAR is the implicit assumption made when analysts listwise delete
missing data cases from analysis. However, as considerable research shows,
nonrespondents tend to be different from respondents (e.g., Ferber 1966; Francis and
Busch 1975), suggesting that MCAR is not a realistic assumption.
A more restrictive and realistic assumption is typically made where missing
data is assumed to be Missing At Random (MAR). In the bivariate case of MAR,
the probability that we observe characteristic U for respondent i is conditional upon
characteristic V (see equation 4a below).
Pr (R i = 1|Ui, Vi) = m (Vi) (4a)
Using the income example again, this means that respondents are
systematically different from nonrespondents—more educated for instance.
However, when controlling for education, nonrespondents are not systematically
different from respondents.
Pr (Ri = 1JUi, V i, W i) = m (V i, W i) (4b)
where W = Characteristic with all values observed (age)
In the still more restrictive case of multivariate MAR, the probability that we
observe characteristic U for respondent i is conditional jointly upon characteristics V
and W (see equation 4b above). This means that respondents who provide an
11
income answer are systematically different from those who do not—more educated
and older for instance. However, when controlling for education and age,
nonrespondents and respondents are not systematically different. Most statistical
adjustments for item missing data make this more restrictive multivariate MAR
assumption.
Pr (Ri = 1|Ux, Vx) = m (U i, VO (5)
MAR can not always be assumed however. We refer to such cases as
nonignorable missing data mechanisms (see equation 5 above). Specifically, the
probability that we observe characteristic U for respondent i depends upon
characteristic U. Simply put, the probability that respondents report income depends
on how much they make—the higher the income, the less likely to provide a
response.
Pr (Ri = 1 |Ui, V i) = m (V i, Zi) (6 )
where Z is an unobserved characteristic (verbal ability)
Put into econometric parlance, nonignorability can be thought of as
unobserved heterogeneity. Specifically, the probability that we observe
characteristic U for respondent i is conditional upon observed characteristic V and
unobserved characteristic Z (see equation 6 above). In other words, the observed
characteristic education does not by itself explain the difference between
respondents and nonrespondents. In this case, we would have to take into
12
consideration both the observed characteristic education and the unobserved
characteristic verbal ability to explain the difference between respondents and
nonrespondents.
The correct specification of variables in data imputation models is essential
for the proper estimation of imputed values. Data imputation, however, is typically
done in an ad-hoc manner with little attention to questions of unobserved
heterogeneity. Underscoring this point, the statistical literature on data imputation
has not examined in any depth the underlying social and cognitive processes
producing the missing data. ......
Perhaps the lack of effort on the proper specification of missing data models
can be explained by disciplinary boundaries. Indeed, the propensity for an
individual to respond to a given question is a function of the underlying social and
cognitive processes of the survey interview, falling under the natural purview of
sociology and cognitive psychology not statistics. Does this kind of research
produce results?
The short answer is yes it does. Indeed, research into the specific processes
that produce missing data has been shown to improve imputation models.
Quantitative sociology has been extremely slow in incorporating data imputation into
its repertoire o f methods. Several factors might explain the present situation of data imputation
in sociology. First, only recently have user-friendly imputation software become available either
as stand-alone packages (e.g., SOLAS) or as integrated options to standard statistical packages
(e.g., Missing Data Analysis in SPSS). Second, few advanced courses in quantitative methods
offer even an overview o f data imputation techniques. When courses do examine these issues,
they only review a very small subset o f techniques from the econometrics literature that deal with
missing data on the dependent variable—commonly referred to as sample selection models (e.g.,
Berk 1983; Stolzenberg and Relies 1997; Heckman 1976; 1979).
13
Matfaoweitz (1998), for instance, demonstrates that including respondents’
expressions of uncertainty can improve the validity of imputed data.
1.1.4 Solutions to the Problem
The literature on questionnaire design as well as data imputation has treated
item missing data in an ad hoc, case by case basis. No general guiding principles
presently exist about how to effectively confront the problem of item missing data.
A thorough examination of item nonresponse, therefore, would provide
practical insights into the problem of item nonresponse, including:
• The specification of what factors—respondent, question, and/or

interview—should be manipulated to minimize item nonresponse.
• The specification of covariates for data imputation models, eliminating

biases associated with unobserved heterogeneity.
• The specification of working principles to guide methodologists in their

treatment item missing data.
1.2 Subject of Study
1.2.1 Decision to Restrict the Scope of the Study
In this thesis, I will confine my analysis of item nonresponse to Don’t Know
(DK) responses on attitude items. I choose to restrict my study to DK responses for
four (4) reasons.
First, I choose to study the DK responses on attitude items because they have
received considerable attention from both the behavioral and social sciences (e.g.,
14
cognitive psychology, political science, and sociology). This literature on DK can
provide theoretical as well as empirical insights into the more general issues of item
missing data.
Second, I choose to study DK because the literature on response and context
effects has extensively examined the phenomenon (e.g., Sudman and Bradbum
1974; Schuman and Presser 1981; Bradbum 1983; Tourangeau and Rasinski 1988;
Sudman, Bradbum, and Schwarz 1996). This literature has a well-established socio-
cognitive framework, which may provide insights into the phenomenon of item
nonresponse.
Third, by concentrating on DK responses, I will gain an in depth
understanding of the phenomenon of item nonresponse as it relates to DK. Future
studies can examine whether the conclusions drawn from my analysis of DK hold in
the case of other forms of item nonresponse.
Fourth, I choose to study DK responses because they present many problems
for data analysts who lack clear guidelines about how to treat DK responses in
analysis. Indeed, all analysts have asked at one time or another—should I delete DK
responses from my analysis?
1.2.2 Criticism
One potential criticism of my thesis is that a DK response might represent a
respondent’s true underlying value and, therefore, does not qualify as item
nonresponse. As defined on page three (3) of this thesis, item nonresponse includes
those responses that are unusable in substantive analysis. I contend that in the great
15
majority of cases DK responses are unusable in analysis because, in practice, it is
difficult, if not impossible, to determine if they represent a respondent’s true
underlying value or are missing data.
Supporting this conclusion, extensive research shows that respondents
choose “off-scale” response categories, like don’t know, for a variety of reasons
making clear interpretation of such categories difficult if not impossible (Coombs
and Coombs 1977; Smith 1984; Feick 1989; Krosnick and Fabrigar 1997;
O’Murcheartaigh et. al. 1999).
In some cases however, DK responses may represent true underlying values.
Considering this qualification, I will examine the issue in more detail in the
following chapter (chapter 2).
13 Organization of Thesis
Excluding this introduction, I divide the thesis into 7 separate chapters.
In chapter 2 ,1 review the relevant literature on DK. In chapter 3 ,1 discuss the
data, variables and methods employed in the thesis. In chapter 4 ,1 analyze
bivariate correlations and estimate a baseline model which I use as a reference
for the remainder of the thesis. In chapter 5 ,1 analyze and attempt to explain one
important predictor of DK—education. In chapter 6 ,1 attempt to understand why
two predictors (civic participation and age) are correlated with DK; and, in
chapter 7 ,1 determine which multivariate model best fits the data. Finally, in
chapter 8 ,1 conclude the thesis by discussing the general results and how they
might be applied in practice.
16
CHAPTER TWO
DISCUSSION OF THE LITERATURE
2.0 Discussion of tie Literature
The research on DK can be broadly broken down into three strains. The first
strain examines the DK category itself, attempting to determine why respondents
provide DK responses. This research finds that respondents provide DK responses
for a variety of different reasons (e.g., ignorance, social desirability, ambiguity of
question wording) (Coombs and Coombs 1977; Smith 1984; Feick 1989; Krosnick
and Fabrigar 1997; O’Murcheartaigh'et al. 1999).
The second strain examines question level correlates of DK response, finding
that questionnaire designers have substantial control over the rate of DK responses,
through the specific manipulation of the characteristics of survey questions. For
instance, this research shows that placing an explicit DK option in the response scale
will significantly increase the rate of DK responses (e.g., Schuman and Presser
1981). This literature has also examined whether maximizing (or minimizing) DK
responses improves data quality (e.g., Schuman and Presser 1981; Krosnick and
Fabrigar 1997; O’Murcheartaigh et. al. 1999).
The third strain examines individual correlates of DK. This research finds
that the characteristics of respondents who answer DK are systematically different
from the characteristics of those who give substantive responses. Quickly summing
up this literature, respondents who provide a DK answer, on average, are less
educated, older, female, black, less politically active, and less knowledgeable (e.g.,
Gergen and Back 1966; Glenn 1969; Sudman and Bradbum 1974; Converse 1977;
Ferber 1966; Francis and Busch 1975; Rapoport 1982,1985).
In this chapter, I examine these three primary strains of research found in the
literature on DK responses. To simplify this task, I break the chapter down into four
sections. In section 2.1,1 examine what the DK response means to both researchers
and respondents. In section 2.2,1 detail the research on question level correlates of
DK responses. In section 2.3,1 discuss the link between DK responses and data
quality. In section 2.4,1 explore individual level correlates of DK responses.
Finally, in section 2.5,1 examine how the literature conceptualizes DK. Specifically,
are DKs substantive responses (i.e., true underlying value)? Are they missing data
(i.e., a censored substantive response)? Or are they a combination of these two
response types?
2.1 Defining What We Mean by Don’t Know
Both researchers and respondents mean many things when saying DK. In
this section, my objective is to do some conceptual housecleaning. I attempt to
answer two questions: (1) what do researchers mean when they say DK? and (2)
what do respondents mean when they say DK?
18
2.1.1 What do researchers mean when they say DK?

A review of the literature on DK responses indicates that when researchers
refer to DK, they mean one of two things:
(1) Responses that correspond to a number of off-scale response

categories, such as can’t choose, not sure, not enough
information to form an opinion, no opinion, and question mark
(?) and/or;
(2) Interviewer coded DK-like responses.
In brief, DK is a catch-all concept, including both off-scale response labels and
respondent declarations (Young 1999d).
2.1.2 What do respondents mean when they say 4I Don’t

Know’ or express-DK-like responses?
The short answer to the above question is—“I Don’t Know” can mean many
things to the respondent. Extensive research shows that respondents say DK for a
variety of reasons (e.g., Coombs and Coombs 1977; Smith 1984; Feick 1989;
Krosnick and Fabrigar 1997; O’Murcheartaigh et. al. 1999).1 These reasons include:
(1) not having an attitude toward the given issue (nonattitude);
(2) not wanting to give a socially undesirable answer;
(3) having a neutral or an ambivalent attitude towards an issue but

answering DK because no mid-point is offered;
Research also shows that different DK subgroups often have very different demographic
and attitudinal profiles. Several studies demonstrate, for instance, that respondents with ambivalent or
neutral attitudes who choose a DK option—(1) are more educated and knowledgeable; (2) have higher
cognitive abilities; (3) are younger; (4) are more male—than other respondents that choose DK
(Coombs and Coombs 1977; Falkenberry and Mason 1978; Smith 1984; O’Murcheartaigh et. al. 1999).
19
(4) not understanding the question because of poor wording (item
ambiguity) and;
(5) not wanting to expend the mental effort needed to give a

substantive answer (satisficing).
Furthermore, this same research indicates that, even on the same question,
respondents answer DK for a variety of reasons. Table 1 below includes 7 separate
sets of survey items from 4 different studies, which decompose the DK category.
The studies in Table 1 differ considerably in how they decompose the DK
category. However, they all show that the DK category is heterogeneous in
composition. Specifically:
• In a study of Taiwanese women on issues of family planning,

Coombs and Coombs (1977) used a psychometric technique
called proximity analysis which decomposes scale dependent
DK responses (ambivalent attitudes) from DK responses which
are not scale dependent (other DR-like responses). Analyzing a
7-item abortion scale, they found that 75.5 percent of DK
responses are ambivalent attitudes.
• In a national study of the: adult population concerning issues

related to energy, Faulkenbery and Mason (197S) used
interviewer coding of DK responses to distinguish between
ambivalent attitudes and no opinion. They examined only one
item on wind energy, finding that 45.2 percent of the DK
responses represented ambivalent attitudes.
20
;W t e l ! Dteoniposiitg f e DK Categor?: Aabifalent AMtafla w . Qft«r DK-like Respoiiises

• ! ; ' R -: i j V .,3 . i -.u ? -> jr / ti .'m ' s ;
.. _ „ i ...........
' I'l >■ .1!* 't "“! 1' ' ‘ •a I. " r"." ■ ' 1 ' >‘e 1 ' - \ n . i l I— x r ~ .......,5.5 245
20 40 years af age, 1972
Farfkeakuy and Mason (1978) Study of American Adults, 1 Rem on Wind Energy 45.2 54R
1975 Iiitariswer Coded
-......... - - ...................... ... -...-..................— ...........—
Smith (1984) SRC American Election Study 15 Political Attitude Items m 655
1956
S ea* (1984) SRC Americas Etectiim Stwly 8 Political Attitude Items m m

1960
Yonag (1999c) OSS 1975-1998 11 Confidence Items 52.7 m

Itmitg (1999c) CSS 1973-1998 12 Government Spending ................59.7................. ........... 50.3................_
Items
I
8
Young (1999c) GSS1975-1998 6 Abortion Items 73.4
21
• Using two SRC studies on political attitudes (1956 and 1960),

Smith (1984) analyzed interviewer coded DK responses. He
found that, on average, 34.2 percent of respondents on the 1956
study and 40.8 percent on the 1960 study, who answered DK,
were ambivalent in respect to the topics covered.
• Using the same techniques employed by Coombs and Coombs

(1977), Young (1999d) analyzed three batteries of items from the
GSS. He found that 52.7 percent of respondents who gave a DK
response on the 11 item confidence scale were ambivalent; 59.7
percent of respondents who gave a DK response on the 12 item
government spending scale were ambivalent; and 73.4 percent of
respondents who gave a DK response on the 6 item abortion
scale were ambivalent.
What do these results suggest?
First, 54.7 percent of respondents who provided a DK answer had an
ambivalent/neutral attitude (note I took the average of the 7 item sets). Simply put,
it appears that the majority of DK responses seem to be ambivalent attitudes, though
further research is needed to confirm this conclusion. Second, Table 1 also indicates
that there exists considerable variation among subsets of items. Indeed, on average,
74.5 percent ([75.5 + 73.4J/2) of the respondents who answered DK to one of the
two abortion scales had an ambivalent/neutral attitude, while only 46.5 percent
([45.2 + 34.2+40.8+52.7 +59.7J/5 ) of respondents who provided a DK response to
one of the 5 batteries of political items had an ambivalent/neutral attitude. What
might explain why abortion questions produce higher ambivalent attitude rates than
political questions?
22
First, abortion is a topic that most individuals probably have considered at
one time or another. Second, abortion is the kind of issue that taps underlying,
perhaps even immutable, beliefs and social roles. Respondents, therefore, may never
have thought much about the issue of abortion but still may use well-defined social
roles and beliefs to impute substantive answers. Respondents may arrive at answers
something like this:
• Case 1 (well-defined social role): Hummm lama

Catholic. I, therefore, am against all forms of abortion.
• Case 2 (well-defined social role): Hummm....I am a

feminist. I, therefore, should be in favor of all forms of
abortion, i
in the above cases, issue confusion, lack of knowledge, and problems of
respondent motivation become less important in determining DK responses than
whether (or not) the respondent has well-defined social roles. On questions like
abortion, respondents who say I don’t know probably mean la m neitherfo r nor
against abortion. Such respondents might arrive at a DK response something like
this:
• Case 3 (conflicting social roles): Hummm... .1 am a

feminist and a catholic. I know that I am against
abortion when it is used for birth control, but, I am in
favor of abortion when pregnancy threatens the life of
die woman. If the woman were to be raped... .1 really
don’t know... p am ambivalent].
What about other subject matters such as politics? In these cases,
respondents may not have well-defined belief structures, which closely correspond
23
to the question topic (e.g., questions concerning specific policies, like NAFTA).
Indeed, for many respondents, the survey interview may be the first contact they
have had with specific socio-political subjects. On such topics, we should expect
that the proportion of mu-am bivalent DK responses to be larger relative to
ambivalent attitudes, as a result of increased issue confusion; lack of knowledge; and
low respondent motivation. In such cases, respondents might arrive at a DK
response something like this:
• Case 4 (non-attitude; lack of knowledge): Hummm... .What is

NAFTA? I really have no idea...I don’t know
Here it is important to stress that non-ambivalent DK responses do not
always represent non-attitudes or no opinions. Research shows that respondents
who answer DK many times do have positive or negative leanings toward the given
issue (Schuman and Presser 1981; Gilljam and Granberg 1993). So why, then, do
respondents provide DK-like answers?
The literature on attitude formation offers a possible explanation. Responses
to attitude questions are not always direct reflections of underlying beliefs and social
roles (Sudman et. al. 1996; Krosnick 1991; Tourangeau et. a{. 2000). Specifically,
when asked questions on politics, respondents do not simply go to the relevant
mental file marked politics; open it up; and select the answer. Instead, a variety of
intervening factors (e.g., question characteristics; respondent motivation; and
respondent cognitive ability) affect the cognitive processing of survey questions and,
in turn, responses to them (Sudman et. al. 1996; Krosnick 1991; Tourangeau et. al.
24
2000; Schuman and Presser 1981). Considering this, respondents might arrive at a
DK response something like:
• Case 5 (lack of respondent motivation): Hummm... .NAFTA.

It has something to do with trade.. .1 still have to pick up the kids
and make dinner. If I have to think this much for all the
questions I will never get this thing over I don’t really know.
• Case 6 (DK option placed in the response stem):

Hummm... .NAFTA. It has something to do with trade... am I for
or against it? or don’t I have an opinion?...I really don’t know
anything about NAFTA. ..I have no opinion.
Of course, even in the case of the NAFTA question, some respondents may
have well-defined beliefs and social roles, which they may use to arrive at an
answer. Such respondents might impute an answer something like this: •
• Case 7 (well-defined social role): Hummm....NAFTA. I don’t

know much about it.. .has something to do with free trade. I am
a union member... .free trade could take away high-paying jobs
from the US. I am against free trade... I, therefore, am against
NAFTA.
Summing up the above discussion, respondents mean many things when they
say DK or express DK-like responses. Some examples include:
® Nonattitudes: I Don’t Know—I really do not know anything

about the issue;
• Ambivalent Attitudes: I Don’t Know. ...I am neither for nor

against the issue;
25
• Editing Socially Undesirable Answers: I really don’t like
homosexuals.. .But I shouldn’t say this openly—-It is not
politically correct...I Don’t Know
• Satisficing: I really don’t have time for this survey..I don’t

want to think too much to answer this question..,I Don’t Know
The important point here is that respondents who say DK do not only mean I
have no idea. Indeed, analysis above has shown that, on average, over half of the
DK responses are actually ambivalent attitudes.
2.2 Question Level Correlates and Data Quality
Extensive research has shown that the inclusion or exclusion of certain
question characteristics can affect the rate of DK for a given question (e.g., Schuman
and Presser 1981). For instance, DK rates'are higher when questions include a DK
option in the response scale than when,they .do not. In this section, I attempt to
answer one question:
(1) What question characteristics affect DK rates?
2.2.1 Question Characteristics
I examine 9 question characteristics that the research has shown to
influence DK rates. These characteristics include:
26
(1) Question Content (More Specialized Knowledge, Less
Specialized Knowledge)
(2) Question Concept (Well-Defined, Poorly Defined)
(3) Number of Response Options (More, Less)
(4) Middle Option (Included, Not Included)
(5) DK Option (Included, Not Included)
(6) Probing of DK response (Probe DK, Do Not DK)
(7) DK Option (Included Question Stem, Included Response Scale)
(8) Wording of DK Option in the Question Stem (More Restrictive,
Less Restrictive)
(9) Wording of DK Option in the Response Scale (More Restrictive,
Less Restrictive)
2.2X1 Question Content
Research shows that DK rates are higher for questions which address topics
that require very specialized knowledge and/or are very distant from a respondent’s
everyday life than for questions that require general knowledge and/or are very'
proximate to a respondent’s everyday life (Converse 1977; Smith 1981; Young
1999d). Specifically, topics, such as politics, foreign policy, and economics, which
require specialized knowledge, produce higher rates of DK than topics, such as
morality, quality of life, and subjective well-being.
2.2.1.2 Question Concept
Research shows that questions with difficult or unclear concepts and
wording produce higher rates of DK than questions with clear concepts and wording
(Coombs and Coombs 1977; Young 2000). In an analysis of DK rates on a
satisfaction study of Brazilian banks, Young (2000) found that respondents were 5
27
times more likely to ask the interviewer to clarify question wording on items with
high rates of DK (25% or more) than on items with low rates of DK (3% or less).
These empirical findings, however, are nothing new. Indeed, one of the key
objectives at the questionnaire design stage is to identify poorly written questions.
The questionnaire design literature, in turn, gives ample treatment on how to best
pre-test questions (e.g., Sudman et. al. 1996).
22.13 Response Scale: Number of Response Options
DK rates are higher on questions with polar response scales such as yes/no,
agree/disagree, and favor/oppose than on questions with more response options
between the two ends of the response continuum (Converse 1977).
'.Theliterature speculates that more finely-grained response scales better
reflect respondent opinion. This research, however, has never been replicated.
2 2 .1.4 Response Scale: Middle Option
Research shows that DK rates are higher on questions which possess true
mid-points hut do no t offer them as a response option than on questions which offer
mid-points (O’Murcheartaigh et. al. 1999). What might explain this mid-point
effect?
One answer may be that respondents use “the next best answer strategy”—
where if the first choice is not offered, respondents opt for the next best answer. The
logic is something like this:
28
Interviewer (Question): “Do you agree or disagree with the
Mowing statement?” Bush will be a better President than
Clinton
Respondent (Answer): Bush will be just as bad as Clinton.. .1

really don’t agree or disagree with this statement....The question
really does not include my answer.
Interviewer (Probe): I know that questions can be frustrating

and sometimes your exact answer is not included. We, though,
have to keep the questions the same across respondents, so that
we can compare the answers. “Do you agree or disagree with
the following statement?” Bush will be a better President than
Clinton.
Respondent (Answer): [I answered don’t know on that NAFTA

question at the beginning]... I really don’t know
Interviewer: OR...Thanks
The above scenario finds empirical support in two different strains in the
methods literature. First, research on the composition of the DK category (presented
in the last section) shows that many DK responses represent ambivalent attitudes,
presumably because no mid-point was offered. Second, the literature on
interviewing techniques has established probes, such as one cited above, to guide
respondents who complain that their answer is not offered in the response scale
(Fowler and Mangione 1993). Specifically, in order to standardize responses across
respondents, this research instructs the interviewer to stress the importance of
responses that correspond to options offered in the response scale, discouraging
responses that fall outside the pre-defined response options.
29
2.2X5 Response Scale: DK Option
DK rates are higher on questions that include DK options than on questions
where no DK category is offered (Schuman and Presser 1981; Converse and
Schuman 1984). Why is this the case?
The literature argues that the demands of the question confine respondent
behavior (e.g., Sudman and Bradbum 1974; Schuman and Presser 1981; Bishop et.
al. 1980; Krosnick and Fabrigar 1997). In other words, the written question itself
communicates to the respondent which answers are legitimate and which are not
legitimate. Potentially legitimate responses are those included in the response scale.
Thus, in cases where the DK option is included in the response scale, DK responses
are considered legitimate.
2,2X6 interviewer Protocols: Probing DK-like answers
Research shows that, when interviewers probe DK-like responses, DK rates
are lower than when they do not probe DK-like responses (Sanchez and Morchio
1992). When and why would interviewers probe DK responses?
In order to minimize DK responses, many survey institutes do not include
the DK option in the response scale. Instead, interviewer must record a DK answer
only after the respondent expresses a DK-like response. How is this done?
Although methodologies vary from research company to research company,
many of the large commercial and academic survey institutes specify very similar
procedures to confront DK-like answers in their interviewer training manuals
30
(Fowler and Mangione 1990). First, training manuals teach interviewers that the
DK response may represent many things: (1) ignorance; (2) a pause in thought; (3)
an ambivalent attitude. Second, because DK may mean many things, training
manuals direct interviewers not to interpret DK answers. Instead, they instruct
interviewers to use nondirective probes before recording a DK response. Most DK
probes follow a logic something like this:
Interviewer: “Are you for or against the US trade policy

towards Cuba?
Respondent used as a pause): Humm.....I Don’t

Know....[give.me a second]
Interviewer (Nondirective Probe): “Please remember there

are.no right or wrong answers. ...we are only interested in
your best guess.... .{repeat;question]. Are you for or against
./ .; the/US trade policy towards Cuba?
Does this sort of probing induce respondents to offer opinions when they
really do not have an opinion? We really do not know at this point. Some research
suggests that excessive probing may lead to increased guesswork on knowledge
questions (Sanchez and Morchio 1992). However, this research is not conclusive in
respect to attitude questions. Furthermore, other research shows that many
respondents who express a DK-like response actually have a stable leaning if asked
similar questions repeated times (Giljam and Granberg 1993). These findings
suggest that persistent probing may improve data quality.
31
2.2.1.7 Response Seale: DK Option—Question Stem or

Response Scale
Research shows that respondents are more likely to choose the DK option
when it is included in the stem o f the question than when it is offered as an option in
the response scale (Schuman and Presser 1981; Bishop et. al. 1980). The research
is not definitive as to why this is the case. Some studies suggest that the DK filter
encourages respondents who have no attitude to select a DK option (Bishop et. al.
1980). Other research indicates that such filters signal to respondents that the task
will be difficult, thus discouraging them from exerting the effort to come up with a
substantive answer (Hpplet and'Schwartz 1989);
2X 1J Wording of tic MC Option in the Stem of the

Question
Research demonstrates that DK rates are higher on questions with less
restrictive DK options than on questions with more restrictive DK options (Bishop
et. al. 1983). For instance, “Do you have an opinion on this issue or not?”, produces
fewer DK responses than “Have you been interested enough in this issue to favor
one side over the other?”, which produces slightly less DK’s than either “Have you
thought much about this issue?” or “Have you already heard or seen enough about it
to have an opinion?” (Bishop et al. 1983; Krosnick and Fabrigar 1997).
32
Krosnick and Fabrigar (1997) argue that “...the three latter filters make it
easier for respondents to admit that they have not considered the topic...and
therefore have no opinion on [the issue]” (p. 154). Put another way, the latter three
filters are less restrictive, making it easier for the respondent to say DK, while the
first filter is more restrictive, making it more difficult for the respondent to answer
DK.
2.2.L9 Wording of the DK Option in the Response Scale
Do differences in the wording of value labels have an effect on whether (or
not) a respondent will choose a DK response? For instance, are respondents more
likely to choose the DK option if the label is Don’t Know than if the label is Can’t
Choose? The short answer is probably yes but we really do not know conclusively
at this point. .So what do we know?
Research demonstrates that the -wording of the DK filter located in the
response stem does influence the respondent’s likelihood of responding DK. No
corresponding research, however, has examined whether respondents are more likely
to choose one specific type of DK labels over another (e.g., Don’t Know, Not Sure,
Can’t Choose, etc.). Even so, the ISSP (International Social Survey Programme)
uses a Can’t Choose option on its surveys, instead of Don’t Know, because ISSP
researchers believe that Can’t Choose is more restrictive, making it less likely for
respondent to select the category (Young 2001).
33
2.2.1.10 Summary Remarks
Extensive research shows that the manipulation of question characteristics
can affect the rate of DK on any given question. How, then, should the researcher
use this information to make decisions on questionnaire design? In simple cases,
researchers can treat the question characteristic variables as simple linear
combinations. The following example illustrates this point:
QiM io jA Question's
Question Content (Subjective Well-Being) Question Content (Subjective Wei-Being)
Question Concept (Clear) Question Concept (Clear)
Many Response Options Many Response Options
Mid-Point Included Mid-Point Included
No Probe No Probe
DK Option in Response Scale ... DK Option tit Question Stem
More Restrictive DK Option More Restrictive DK Option
.In the above case, the questionnaire designer knows that question type A will
produce lower DK rates than question'type B, because DK rates are higher for
questions with the DK option in the question stem than the response scale.
However, not all cases are this simple. Indeed, the research on the association
between question characteristics and DK rates is far from complete, not having
examined in any depth interactions between question characteristics. Therefore,
simulations become problematic when more than a few factors are varied. The
following examples illustrates this point:
Question A Q g M P l.1
Question Content (Subjective Well-Being) Question Content (Subjective Well-Being)
Question Concept (Clear) Question Concept (Clear)
Few Response Options Many Response Options
Mid-Point Nat Included Mid-Point Indudei
No Probe No Probe
DK Option in Response Scale DK Option in Question Stem
More Restrictive DK Option More Restrictive DK Option
34
In the above case, the questionnaire designer can not determine if question
type A will produce lower DK rates than question type B or vice-versa. Further
research must examine interactions between question characteristics.
2 3 DK Option and Issues of Quality
Now that we know what question characteristics can maximize or minimize
DK rates, should we maximize DK responses, or minimize DK responses? Which
strategy maximizes data quality?
Before going into the specifics of the research on data quality, it is first
important to discuss the two main schools of thought on this issue—one which
argues that DKs should be maximized andfhe other which argues that they should be
minimized. Perspective 1: Converse. (1964, 1970) argues that DKs should
maximized in order to maximize data quality. Converse’s work showed that a
substantial portion of the population is unable to form adequate opinions on issues,
due to extremely low information levels. Converse calls such uninformed opinions,
non-attitudes.
Converse originally hypothesized that nonattitode holders arrived at their
answers randomly at the flip of a coin. However, later research suggested that such
respondents use more sophisticated response strategies to arrive at answers, such as
selecting the positive end of the response scale (Smith 1981; Schuman and Presser
1981; Taylor 1983; Brody 1986). Whatever the position on respondent response
strategy, Converse and colleagues are unanimous in stressing that nonattitude
35
holders have no attitude position in relationship to the topics covered on surveys;
thus, they should excluded.
Perspective 2: Krosnick (Krosnick 1991; Krosnick and Fabrigar 1997;
Krosnick 1999) argues that DK should be minimized if data quality is to be
maximized. Specifically, Kronick’s research suggests that certain respondents—
specifically those with low levels of cognitive sophistication—answer DK in order to
minimize the cognitive burden of survey questions. Krosnick calls such sub-optimal
answering—satisficing.
Krosnick suggests that researchers should exclude explicit DK options in
order to make satisficing as difficult as possible. Indeed, contrary to Converse’s
nonattitude perspective, Krosnick’s satisficing perspective predicts that DK filters
actually exclude many respondents with real attitudes, undermining, rather than
maximizing, data quality. So what does the research suggest?
The research is far from conclusive on this point. Specifically:
• the research has found no differences in validity and reliability

between attitude questions with and without a DK filter
(McClendon and Alwin 1993)
• the research has found no differences in correlations among

demographic and attitude questions with and without a DK
option (Schuman and Presser 1981; McClendon and Alwin 1993)
• the research has found differences in univariate distributions

among attitude questions with and without a DK option.
However, differences were not found on all questions (Schuman
and Presser 1981; Bishop et. al. 1980, 1983).
36
• the research has found differences in inter-item correlations
among attitude questions with and without a DK (Schuman and
Presser 1981).
• the research has found that the inclusion of the middle category
improves reliability and validity (O’Murcheartaigh et. al. 1999).
So what conclusions can we draw from the above findings? First, as a
general rule, middle options should be included on questions which have a true mid
point (e.g., agree/disagree).
Second, the evidence against the exclusion of the DK category is weak.
Indeed, the changes found in the univariate and multivariate distributions of attitude
items do not point toward which method (the inclusion or exclusion of the DK
option) produces more valid results (Schuman and Presser 1981; Bishop et. al. 1980,
1983). Instead, ,the research only suggests thatfrifferences exist.;
Conversely, the evidence^br the exclusion of the DK category is strong.
On pure sample size grounds, the inclusion of the DK option decreases effective
sample size, increasing the standard error of estimates. With issues of sample size in
mind, many survey companies do not offer a DK option.
Furthermore, empirical evidence suggests that many respondents who answer
DK actually have attitudes but may not express them for a variety of reasons,
including lack of motivation; unrestricted interviewer probing; and the lack of
response categories reflecting the respondent’s attitude (Gilljam and Granberg 1993;
Sudman et. al. 1996; Krosnick 1991; Tourangeau et. al. 2000; Schuman and Presser
1981).
While the research is mixed, the evidence points towards minimizing DK
responses. Of course, such a conclusion must be treated on a case by case basis.
For instance, it may be advisable on a question concerning a very specific, obscure
policy to filter out respondents that do not have an opinion.
2.4 Individual Level Correlates
Extensive research shows that respondents who answer DK are
systematically different from respondents who provide substantive answers (Ferber
1966; Francis and Busch 1975; Krosnick and Milbum 1990; 1999b; 1999c). In this
section, I examine individual level correlates of DK. The correlates include:
(1) Cognitive Sophistication

(2) Civic Participation
(3} Age
(4) Gender
(5) Education
(6) Race
(7) Occupational Prestige
(8) Health
(10) Work Status
2.4.1 Cognitive Sophistication
The existing literature on the correlates of DK responses suggests that DK
may be a function of differential levels of cognitive sophistication (e.g., Krosnick
1991; Schuman and Presser 1981). What is cognitive sophistication?
Cognitive sophistication is a concept widely-cited in the methods literature
yet rarely defined. It, indeed, means different things to different researchers.
38
Here cognitive sophistication means the combination of three factors: (1)
knowledge, (2) information exposure, and (2) cognitive ability. The more
cognitively sophisticated are those individuals who are more knowledgeable about
survey topics; are more exposed to such topics; and are more able (cognitive ability)
to think through topics found on surveys. While these three factors are probably
distinct sub-dimensions, they are often grouped together under the umbrella of
cognitive sophistication because they are highly correlated (Young 1998c; 1999c).
There are two possible explanations for why DK may be a function of
cognitive sophistication. First, the cognitive sophistication effect may result from
differential levels of knowledge and exposure to information. A number of studies
demonstrate that the well-informed are less likely to answer DK (Converse 1964,
1970; Converse 1977; Faulkenberry and Mason 1978; Francis and Busch 1975;
Rapoport 1982,1985; Smith 1981).
Second, the cognitive sophistication effect may result from varying levels of
verbal ability. Research suggests that respondents with weaker verbal skills are
more likely to have difficulties understanding survey questions, resulting in higher
DK rates (Krosnick and Alwin 1987; Krosnick 1991; Young 1999a).
2.4.2 Civic Participation
Other research suggests that a respondent’s degree of participation in civic
activities may explain variations in the rate of DK. This research shows that
39
respondents more involved in civic activities are less likely to answer DK (Francis
and Busch 1975; Faulkenberry and Mason 1978; Rapoport 1985; Young 1999c).
While far from conclusive, there are two possible reasons for the association
between civic participation and DK. First, civic participation may be a proxy for a
respondent’s propensity to feel the social obligation to answer survey questions
(Krosnick 1991; Young 1999c). The methods literature has called respondents with
a high propensity for such behavior—“good respondents”. Second, people who are
more likely to participate in civic activities are also more likely to be exposed to
issues found on surveys, such as politics and current events (e.g., Francis and Busch
1975; Faulkenberry and Mason 1978).
2.4.3 Gender
DK may also result from differences in gender. Many studies indicate that
women are more likely to give a DK answer than men (Francis and Busch 1975;
Rapoport 1982,1985; Smith 1984; Sudman and Bradbum 1974; Young 1999c). The
literature offers two possible explanations for the gender effect. First, gender
differences may result from women being less knowledgeable about the subject
matters covered in surveys than men (Francis and Busch 1975; Rapoport 1982,
1985). An alternative explanation suggests that females have been socialized not to
express opinions, resulting in higher DK rates (Rapoport 1982,1985).
40
2.4.4 Age
Other research indicates that DK may be a function of age. A number of
studies have shown that older individuals are more likely to express DK-like
responses than younger people (Gergen and Back 1966; Glenn 1969; Young
1999c; 2000a). Three possible explanations for the age effect are cited in the
literature.
First, higher DK rates among older respondents may be a function of
cognitive deterioration brought on by (cognitive) senescence at older ages
(Young 1999c; 2000a). Older individuals are more likely to give a DK response
because they are less cognitively able to deal with the topics on surveys than
younger individuals (e.g., inability to easily retrieve information from memory;
to manipulate difficult concepts, etc.).
Second, age differences in DK rates may also result from older
respondents being more likely to answer DK because of social disengagement
(social senescence) (Gergen and Back 1966; Young 1999c). As articulated in the
methods literature, social disengagement is a process by which older individuals
progressively withdraw physically and mentally from the social world, feeling
less bound by societal norms. Socially disengaged individuals being less likely
to adhere to social norms are more likely to exhibit non-normative behavior. In
41
the context of the survey interview, one potential form of abnormal behavior is
providing a DK response when a substantive answer is expected.
Third, the relationship between age and DK may result from generational
differences in education and information exposure. Most studies on response error
merely assume that the age effect results from changes in the life cycle. However,
differences may also result because younger generations are more educated and
better-informed, decreasing their propensity to answer DK. What does the research
that exists on this subject suggest?
Almost no research has been conducted on the age effect. One study does
indicate, though, that older respondents are more likely to say DK because they are
growing older and not because they are from earlier generations (Krosnick and
Milbum 1990). More research is needed before any definitive conclusion may be
drawn.
2.4.5 Education
Many studies show that the less educated are more likely to say DK than the
more educated (Gergen and Back 1966; Ferber 1966; Glenn 1969; Sudman and
Bradbum 1974; Francis and Busch 1975; Converse 1977; Faulkenberry and Mason
1978; Bishop et. al. 1980; Smith 1981, 1984; Narayan and Krosnick 1996; Krosnick
and Fabrigar 1997; Young 1999c). There are four different explanations for the
education effect.
First, research suggests that the less educated are less likely to be cognitively
sophisticated (Schuman and Presser 1981; Krosnick and Milbum 1990; Young
42
1999a, 1999b). Second, the more educated are also more likely to participate in
civic activities (Young 1999b; 1999c). Third, the less educated are more likely to
say DK because they are older (Gergen and Back 1966; Ferber 1966; Krosnick and
Milbum 1990; Young 1999b; 1999c). Fourth, the less educated are more likely to
say DK because they are more likely to be female (Rapoport 1982, 1985; Krosnick
and Milbum 1990; Young 1999b; 1999c).
2.4.6 Race
Research has also shown that blacks are more likely to say DK than non
blacks (Francis and Busch 1975; Rapoport 1982,1985; Krosnick and Milbum 1990;
Young 1999b). There exist three possible explanations for the race effect. First,
blacks have lower levels of education than non-blacks (Francis and Busch 1975;
Krosnick and Milbum 1990). Second, blacks are less involved in activities related
to the topics found on surveys than non-blacks (Kronsick and Milbum; Young
1999b, 1999c). Third, blacks are less knowledgeable about the topics found on
surveys than non-blacks (Rapoport 1982,1985; Krosnick and Milbum 1990; Young
1999b)
2.4.7 Other Possible Explanatory Factors
The literature has examined other correlates of DK, including occupational
prestige, subjective health, and work status. The research finds that:
43
• Those respondents with higher occupational prestige levels are
more likely to say DK than those respondents with lower
occupational prestige levels (Ferber 1966; Francis and Busch
1975). However, after controlling for other characteristics (age,
education, and cognitive sophistication), occupational prestige no
longer has an independent effect on DK (Young 1998c; 1999b;
1999c).
• Retired respondents are more likely to say DK than those

respondents who are not (Young 1998c; 1999b; 1999c).
However, after controlling for other characteristics (age,
education), being retired no longer has an independent effect on
DK (Young 1999b).
• Respondents with lower levels of subjective health are more

likely to say DK than those respondents who do not (Young
1998c; 1999b). However, after controlling for other
characteristics (age), health status no longer has an independent
effect on DK (Young 1998c; 1999b).
So, what does the above discussion suggest? Broadly defined, individual
level correlates of DK can be broken down into two general categories: (1)
cognitive correlates and (2) social correlates.
Cognitive factors include: (1) knowledge level; (2) exposure to information;
and (3) mental ability. Social factors include: (1) respondent motivation and (2)
adherence to social norms.
Put a slightly different way, DK can be conceptualized as a function of two
meta-variables: respondent motivation and respondent comprehension (Young
1999b; 1999c) (see equation 7below).
DK = Respondent Motivation + Respondent Comprehension (7)
44
Respondent motivation variables refer, specifically, to social factors, while
respondent comprehension variables refer to social factors.
2.5 DK and Missing Data
Are DK responses on attitude questions missing data? The answer to this
question is both yes and no. DK responses theoretically can be both true values as
well as missing data.
First, a DK response can represent a true value. For instance, some
respondents may have no opinion on a subject because they lack sufficient
knowledge to form an opinion. Converse (1964,1970) refers to this lack (or
absence) of attitudes as nonattitudes.
Second, DK can also be a form of missing data. Even though respondents
may have an attitude on a given subject, they may choose the DK option, to avoid
giving a socially undesirable answer; or to minimize the cognitive demands of
answering the question; or because a response option that more closely corresponds
to their own opinion is not offered. Research suggests that at least half of the DK
responses are ambivalent attitudes, hence missing data (section 1 in this chapter).
However, the empirical evidence indicates that there does not exist such a
clear distinction between non-attitudes (true values) and attitudes (missing data).
Indeed, recent research suggests that many “non-attitude holders” have stable
leanings towards issues if asked the same (or similar) questions repeated times on a
45
survey (Schuman and Presser 1981; Gilljam and Granberg 1993). So why do
respondents answer DK when they have an opinion?
Research into the cognitive processes associated with survey response
indicates that attitude formation does not follow the traditional file-drawer model,
where respondents first are administered the question; after which, they search for
the relevant mental file to see if they have an opinion on the subject; and, finally,
they respond (Schuman and Presser 1981; Sudman et. al. 1996; Tourangeau et. al.
2000). Instead, recent research shows that intervening factors, such as respondent
motivation and cognitive ability; question characteristics; interview characteristics;
and mode of questionnaire administrations all affect the probability of providing a
DK response. This model of attitudinai formation suggests that respondents have a
higher probability of providing a DK response when the DK option is offered than
when the DK option is not offered.
Furthermore, our discussion of the literature has also shown that the rate of
DK varies considerably by question and respondent characteristics. This empirical
evidence, without a doubt, further blurs the line between true value and missing data.
Based upon the evidence presented in this chapter, I contend that, in most
cases, DK responses are missing data because they are unusable in practice. I base
my argument on the following four points:
(1) DK responses to a given question are heterogeneous in

nature with respondents providing DK responses for a
variety of reasons. In practice, it is difficult, if not
impossible, to separate out the different DK typologies.
46
(2) The empirical evidence shows that a majority of DK
responses are ambivalent attitudes, a form of missing
data.
(3) Research shows that there exists no clear distinction

between attitudes and nonattitudes. Instead, the
probability that a respondent will provide a DK response
is a function of respondent, question, and interviewer
characteristics.
(4) Respondents who are more likely to provide a DK

response are systematically different from respondent
who are less likely. These findings suggest possible
problems with bias if DK responses are maximized.
This discussion, though, does not exclude the possibility that DK
responses may represent true underlying values. Indeed, analysts must evaluate
the aforementioned usability assumption on a case by case basis.
47
CHAPTER THREE
DATA AND METHODS
3.0 Data and Methods
In this chapter, I describe data and methods. The objective is to specify
general aspects of the data and methods employed in this thesis.
I break the chapter doyvn into four sections. In section 3.1,1 describe the
data source. In section 3.2,1 examine the specifics of DK correlates to be used in the
analysts. In section 3.3,1 discuss the methods that will be used to calculate sampling
variance. Finally, in section 3 .4,1 review the technical aspects of the regression
models that will be employed.
3.1 Data
In this thesis, I will be analyzing data from the 1987 General Social Survey
(GSS). I choose the 1987 round because it is the only GSS study that includes
measures of cognitive sophistication and civic participation. Research shows that
there is not a strong time by DK interaction (Krosnick and Milbrun 1990),
suggesting that restricting analysis to the 1987 GSS will not seriously limit
generalizability.
The GSS is a national probability sample of the US non-institutionalized
English speaking population, 18 years of age and older. I exclude the black
48
oversample which leaves a total sample size of 1466 respondents for analysis (see
Davis and Smith 1996 for further discussion of the GSS).
DK on the GSS is not typically an explicit response category, instead,
interviewers code DK only after first probing the respondent once for a substantive
1 . . . .
answer. The GSS/NORC, in turn, tries to avoid excessive probing in order to
minimize false reporting ofDKs.
3.1.1 Stimmated DK Scale
To measure the level of DK responses on the GSS, I create a summated DK
scale of 106 attitude items '(Cronbach5s Alpha=.9264). I create such a scale
because I want to examine the sernrk m M m m kio between DK and the correlates
of DK. By combining questions of varying difficulties and varying-topics, I believe
that the scale will minimize the effect of any one question type or topic.
1 use two basic decision rules to select items for the scale. First, the question
must be a subjective attitude item. The DK scale, then, does not include any
demographic items (e.g., parents education and occupation) or behavioral questions
(e.g., frequency of sexual intercourse).
Second, the question must have been administered to all the respondents.
The DK scale, therefore, does not include the approximately 60 attitude questions
1
see page 21 of “Basic Interviewing Techniques” in NORC’s Field Interviewer Reference
Material for fijrther-discussion of interviewer protocols concerning DK’s.
2
Cronbach’s Alpha is a commonly used indicator of scale quality. An alpha of .65 is generally
considered acceptable for sociological or political scales, while for test instruments a much higher alpha is
required. Cronbach’s alpha squared corresponds to the percent of variance that the scale items explain in
the underlying construct being measured (Nnnnally and Bernstein 1994).
49
from the International Social Survey Program (ISSP) module because about 10% of
those interviewed on the GSS did not respond to the ISSP supplement. I, then,
dichotomize the items (where 1=DK; 0=substantive response).
Respondents, on average, answered DK 3.09 times on the 1987 GSS. Figure
1 below indicates that the distribution of DK responses cm the 1987 GSS is skewed
right with approximately 55 percent of respondents either not giving a DK response
or providing only one.
Figure 1: Distribution of "Don’t Know" Responses from

Attitude Items on the I f 87 General Social Survey
50% -
45%
40%
35%
30%
20% -
15% -
10% -
5%
0% -
0 7 4 6 f? 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40
Number of "Don’t Know” Responses
50
3.2 DK Correlates
The explanatory variables include two summated scales (cognitive
sophistication and social participation), age, education, race, work status, health
status, gender and occupational prestige. The cognitive sophistication scale
(Cronbach’s Alpha=.7851) includes three items: a pre-existing 10 item verbal ability
scale (WORDSUM), a three item political knowledge scale which I create
(GOVERNOR, USREP, SCHLHEAD), and a question which asks respondents how
frequently they read the newspaper (NEWS).
Note 88 respondents refused or were otherwise classified as NA (No
Answer)3 to answer the verbal battery (WORDSUM). Sensitivity analysis
demonstrated that this group of refusers is systematically different in composition
than the group that answered the verbal battery. Refusers answered DK 9.2 times,
on average, compared to 2.7 times for non-refusers. Refusers also had, on average,
9.3 years of education and 60.1 years of age compared to 12.7 years of education and
44.38 years of age for non-refiisers. In addition, the pairwise correlations of
demographic variables (age, education, sex, and race) and DK differ significantly
between refusers and non-refusers.
This analysis suggests that the exclusion of the 88 cases would bias means,
correlations, and partial correlations. I, therefore, impute the missing values using a
regression based approach {see Little and Rubin 1987 for a further discussion of the
3
A NA code is given when “ ...the respondent does not give an answer, when the written
information is contradictory or too vague, and when the coder needs to supply a code in order to
resolve a tricky skip pattern” (Davis and Smith 1996, p. 1030).
51
method). In brief, I estimate a simple model where I regressed verbal ability
(WORDSUM) on age (years), education (years), gender, and race in addition to
nonlinear terms. I, then, take these estimated parameters and calculate predicted
WORDSUM values for the 88 refusers, adding a random residual to each predicted
score in order to account for the within imputation variance.
This data imputation model, however, slightly underestimates standard errors
because it does not take into consideration between imputation variation. The
theoretically correct procedure would be to impute the missing values multiple
times, using one of several multiple imputation techniques. These procedures, while
theoretically justified, are not practically useful because they require the analysis and
merging of multiple data sets. The above regression-based imputation is a
compromise which slightly underestimates the sampling error but minimizes
nonresponse biases.
The social participation scale (Cronbach’s Alpha=.6510) includes a pre
existing 16 item social and political participation scale (MEMNUM) and a 7 item
political involvement scale which I create (LOCPROB, LGCGRP, INTPOL,
POLRALLY, LOGLOBBY, OTHLOBBY, GAVEPOL). Table 2 below includes
simple means and standard deviations of all variables.
52
Table 2: Descriptive Statistics: Variable Names, Means, Standard

Deviations, and Ranges
Variable Name Mean SD Range

Min Max
Degree (ordinal) 1.24 1.13 0 4
Less than a High 0.243 0.429 0 1
School Degree
High School Degree 0.518 0.500 0 1
Junior College Degree 0.045 0.207 0 1
College Degree 0.141 0.348 0 1
Graduate Degree 0.054 0.225 0 1
Cognitive 8.080 2.101 0.36 12.97
Sophistication
Social Participation. 3.231 3.092 0 16
Age 45.26 17.63 18 89
Race (1 =White) 0.834 0.373 0 1
Work Status 0.136 0.342 0 1
(l=Retired)
Health Status (l=Poor 0.052 0.222 0 1
Health) .
Occupational Prestige : .098 . .298 ■ 0 1
(l=High Prestige)
DK Scale 3.037- ' 5.267 42
To capture the age effect, I use age measured in years. To measure
education, I use two different variables. For the first education variable, I use a five-
category ‘highest level of education achieved’ question (DEGREE). The five
categories include: (1) less than a high school degree; (2) a high school degree; (3) a
junior college degree; (4) a college degree; and (5) a graduate degree. Using a logit
transformation, I re-calibrate this variable in order to take into account that the
distance between education levels is uneven {see Master and Wright (1982) for
further discussion of logit transformations to correct for unequal spacing in ordinal
variables). For the second education variable, I create five dummy indicators which
represent the highest degree obtained by the respondent (Dl=less than a high school
53
degree; D2=a high school degree; D3=a junior college degree; D4=a college degree
and D5=a graduate degree.
To measure race, work status, health status, occupational prestige and
gender, I create dummy variables: (1) Race (white =1; 0=nonwhite); (2) Work
Status (retired=l; O=nonretired); (3) Health status (poor health=l; 0=other); and (4)
Prestige (top ten percent of occupational prestige-1; less than top ten percent=0); (5)
gender (l=female and 0 —male).
3.3 Methods
3.3.1 Variance Estimation

The GSS is a multi-stage stratified area cluster sample. Such sample designs
are intended to decrease costs. However, because respondents are selected within
given geographic units, respondent characteristics such as age, income, and
education tend to be highly correlated. By ignoring the complex sample design,
analysts underestimate standard errors, which, in turn, can lead to incorrect statistical
inferences.
To calculate proper variances and standard errors, I first create a cluster (or
PSU) variable which assigns a value for each of the 84 PSUs (primary sampling
units). This variable adjusts for the within PSU correlation. I also create a strata
variable, which matches each PSU with another geographically proximate PSU.
Using the PSU and Strata variables, I calculate DEFF (Design Effect) to
adjust standard errors. Specifically, I multiply the square root of DEFF (commonly
54
referred to as DEFT) by the standard error of the estimate (means, proportions, and
standard errors). What is DEFF?
DEFF = VAR (Cluster)/VAR (SRS) (8a)
Equation 8a above shows that DEFF as a function of the variance of a cluster
sample (in the case of the GSS, multi-stage stratified cluster sample) for a given
variable over the variance of simple random sample (SRS) for that given variable.
Cluster samples typically have larger variances than simple random samples and,
thus, larger standard errors (Kish 1965). But why are variances larger in cluster
samples?.
The answer to this question rests in one of the underlying assumptions of
simple random sampling—that respondents are independent draws from a
population, where any given draw is not correlated with any preceding or future
draw (Cochran 1977). In cluster sampling, draws are not independent because
respondents within a given geographic area are highly similar (e.g., race, income,
ethnic background). Simply put, due to the high degree of similarity within
geographic regions, if we were to select 10 respondents from a sampling point (e.g.,
Hyde Park), it would be actually more like selecting 7 or 8 respondents.
DEFF = 1 + ROH (b-1) (8b)
55
Considering the above discussion, DEFF can be expressed as the function of
the interclass correlation (ROH) and the number of interviews conducted in the
given sampling point (b) (see equation 8b above). ROH varies considerable from
respondent characteristic to characteristic. For instance, race (and questions related
to race) has a relatively large ROH, given that blacks and white are highly
segregated in the United States (e.g., Smith et. al 1993). Both DEFF and ROH have
been used as measures for segregation (Kish 1965).
For any given characteristic though, ROH can be treated as fixed. This
allows sampling statisticians to minimize ROH’s negative influence on variance by
decreasing the number of interviews done per sampling point (see equation 8b
above). Thinking along these lines, DEFF for simple random sampling is merely a
special case where: (1) ROH = 0 and (2) b = 1.
Table 3 below includes: the DEFF and DEFT for each of the variables cited
in the last section.4 What do we find?
T a b le 3: S ta n d a r d E r i w , D E F F * a m i D E F T o f D l iC o n - e l a te s ................
V ariable Standard Error (Unadjusted) Standard Error (Adjusted) D EFF . D1FT..
A ae . ...0,461....' .... 0.647 : 1,97 1,404
Degree 0,108 0,124 1,33 1.153
Civic P articip atio n 0,081 0,087 1,142 1.069
Cognitive Sophistication 0.231 0.268 1.347 1.161
Female 0.629 0.563 0.802 0.896
White 0.010 0.021 4,709 2.170
Poor Health 0.006 0.007 1.358 1.165
High Prestige 0.008 0.007 0.791 0.889
R etired 0.009 0,011 1,54 1,241
DKscale 0,138 0,211 2,349 1,533
I use the statistical package STATA to estimate regression models and standard errors.
For complex samples, STATA uses taylor series approximation to estimate variances.
56
Table 3 above shows that the size of DEFF varies considerably from
characteristic to characteristic (4.71 vs .791). Race, age, and DK have relatively
large DEFFs. Specifically, race has the highest DEFF at 4.71, while age and DK
have DEFFs of almost 2. These higher than average DEFFs suggest that these
variable will have larger sampling variances than would be expected using simple
random sampling. For instance, the sampling variance for race is almost 5 times
larger than the sampling variance for a simple random sample. Civic participation
(1.14), gender (.802), and degree (1.33)—all have relatively low DEFFs, suggesting
that their sampling variances will be approximately equivalent to that of a simple
random sample (SRS).
3.4 Model , ■ ;
In each of the four empirical chapters (4,5,6, and 7), I employ OLS
regression to test the effect of explanatory factors on DK. In order to meet
underlying assumptions (e.g., linearity, normality, homoskedasticity, and
autocorrelation), I adjust the OLS model in two ways. First, I transform the
dependent variable using the square root fimction to account for non-normal error
distribution. This correction substantially minimizes non-normality yet does not
eliminate it. To interpret results, I re-transform all estimates into their original
metric (number of DKs) by squaring them.
Second, to minimize problems of unequal variances (heteroskedasticity), I
transform the dependent variable. I also use the Aiken Transformation which
57
weights the variance-covariance matrix by the inverse of the estimated standard error
(Greene 1997).
58
CHAPTER FOUR
ANALYSIS OF THE BASELINE MODEL
4.0 Analysis of the Baseline Model
In this chapter, I attempt to answer one question:
(I) Are respondents who are more likely to answer DK

systematically different from respondents who are less
likely to answer DK?
The question above has already been thoroughly addressed in the
literature. Respondents who are more likely to answer DK are systemically different
than those who are less likely to say DK (Ferber 1966; Francis and Busch 1975)—
DK, in other words, is a non-random phenomenon. I, however, will address the
question in order to establish a baseline model before exploring new questions.
I organize the chapter into 5 sections. In section 4.1,1 quickly review the
literature discussed at length in chapter 3. In section 4.2,1 detail the models that I
will test. In section 4.3,1 analyze bivariate correlations, while, in section 4.4,
multivariate partial correlations. Finally, in section 4.5,1 conclude the chapter by
discussing the results of the analysis.
59
4.1 Literature Review

The literature on respondent level correlates has shown that DK is a non-
random phenomenon (Gergen and Back 1966; Glenn 1969; Sudman and Bradbum
1973; Francis and Busch 1975; Converse 1977; Faulkenberry and Mason 1978;
Bishop et. al. 1980; Smith 1981, 1984; Narayan and Krosnick 1996; Krosnick and
Fabrigar 1997). Specifically, the literature indicates that:
(1) Respondents who are more educated are less likely to answer DK
than respondents who are less educated.
(2) Older respondents are more likely to say DK than younger

respondents. V
(3) Respondents who are cognitively more sophisticated are less

likely to answer DK than the cognitively less sophisticated.
(4) Respondents who participate more in civic activities are less

likely to say DK than those who are less likely to participate in
civic activities.
(5) Female respondents are more likely so say DK than male

respondents.
(6) Respondents with high levels of prestige are more likely to say
DK than those with lower levels of prestige.
(7) Non-white respondents are more likely to say DK than white

respondents.
(8) Respondents who rate themselves as having poor subjective

health are more likely to answer DK than respondents who rate
themselves as having good subjective health.
(9) Retired respondents are more likely to say DK than respondents

who are not retired.
60
4.2 Methods
In this chapter, I am most interested in determining the best fitting model in
order to establish a baseline for the rest of the study. To do this, I first estimate a
main effects model (see equation 9 below).
(9)
DK = P0 + pi (Civic Participation) + p2(Cognitive Sophistication) +

p3(Age) + p4(Gender) + p5(Race) + p6(Subjective Health) +
P7(Prestige) + p8(Work Status) + p9(Education) + ei
This model hypothesizes that DK is a function of 9 independent variables:
(1) civic participation; (2) cognitive sophistication; (3) age; (4) gender; (5) race; (6)
subjective health; (7) prestige; (8) work status; and (9) education. Here I am making
no assumptions about possible non-linearity.
( 10)
DK = po + pi (Civic Participation) + p2(Cognitive Sophistication) +

p3(Age) + p4(Gender) + P5(Race) + {06(Civic Participation * Cognitive
Sophistication) + 0 7 (Age * Gender) + 08 (Civic Participation * Race) +
09 (Age * Civic Participation)} + ei
Furthermore, in equation 10 above, I test for interaction effects. This model
hypothesizes that DK is a function of the main effects, in addition to 4 interaction
effects: (1) civic participation * cognitive sophistication; (2) age * gender; (3) race *
civic participation; and (4) age * civic participation. I choose the interaction terms
61
based upon (1) a review of the literature and (2) an empirical analysis of inter-item
correlations.
To test for the best fitting model, I use the difference in R-square test for
nested models (see equation 11 below).
(11)
f = f ( R „ 2- R j ) / ( k b - k a ) ] / [ ( l ~ R b)/(n - kb- l ) ]
where Rb2 is the R square for the full model;

Ra2 is the R square for the parsimonious model;
kb is the number o f parameters for the full model;
k ’S the number o f parameters for the parsimonious
model,
a is the sample size.
4.3 Bivariate Analysis of Demographic and Behavioral Correlates

In this section, I analyze the bivariate correlations cited in the literature
review (note all bivariate coefficients (r) are standardized Pearson correlation
coefficients) . What do we find? Table 4 below shows that:
(1) Older respondents are significantly more likely to say DK than

younger respondents (r = .233; p.<.05).
(2) Respondents who participate more in civic activities are

significantly less likely to say DK than those who are less likely
to participate in civic activities (r = -.212; p.<05.).
(3) Respondents who are cognitively more sophisticated are

significantly less likely to say DK than those that are less
cognitively sophisticated (r = -.261 ; p.<-05).
(4) More educated respondents are significantly less likely to answer

DK than less educated respondents (r = -.177; p.<.05).
62
(5) Female respondents are not significantly more likely to answer

DK than male respondents (r = .086: p>.G5).
(6) Respondents who rate themselves as having poor subjective

health are significantly more likely to answer DK than
respondents who rate themselves as having good subjective
health (r = .033: p.<.05).
(7) Respondents with high occupational prestige are not

significantly more likely to say DK than respondents with lower
occupational prestige (r = -.049: p.>. 05).
(8) Retired respondents are significantly more likely to say DK than

respondents who are not retired (r = .146; p.<05).
(9) Non-white respondents are significantly more likely to say DK

than white respondents (r = .122; p.<.05).
The general conclusion based upon the results in Table 4 suggests that
responuems who are more likely to answer DK are systematically different from
those that are less likely answer DK. The bivariate correlations, however, can be
further divided into three groups according to the strength of the relationship.
Group 1 consists of age, civic participation, and cognitive sophistication with
an average correlation (r) of .235. These correlations account for about 6 percent of
the variation in DK (.235 * .235 * .06). These three variables should be robust
predictors of DK even in a multivariate context. Group 2 consists of education,
work status, and race with an average correlation (r) of .148 and, on average,
accounting for 2 percent of the variance in DK. Group 3 consists of sex, subjective
health status, and occupational prestige with an average correlation of .056. These
correlations explain only .03 percent of the variation in
rn
so
TABLE 4: Bivariate Correlations Between Don't Know and Other Selected Variables
DK AGE CIVIC COG EDUC FEMALE HEALTH HOSTILE COMP PRESTIGE RETIRED WHITE
DK 1.000
AGE 0.232 1.000
CIVIC -0.212 0.005 1.000
COGNITIVE -0.261 0.070 0.462 1.000
EDUCATION -0.177 -0.226 0.388 0.543 1.000
FEMALE 0.086 0.047 -0.076 -0.056 -0.076 1.000
POOR HEALTH 0.033 0.178 -0.061 -0.120 -0.130 0.014 1.000
HOSTILE 0.182 0.050 -0.085 -0.074 -0.045 0.026 0.038 1.000
POOR COMPRE 0.354 0.179 -0.276 -0.417 -0301 0.019 0.109 0.216 1.000
HIGH PRESTIGE -0.049 -0.038 0.225 0.269 0.428 0.037 -0.025 -0.005 -0.099 1.000
RETIRED 0.146 0.587 - -0,031 -0,008 -0.174 -0.084 0.141 0.058 0.146 -0.051 1.000
WHITE -0.122 0.081 0.119 0.272 0.116 -0.032 -0.044 -0.035 -0.167 0.080 0.011 1.000
* All coefficients in bold are significant at the .05
64
DK. Controlling for other respondent level correlates, the correlates in Group 3
probably will not have significant independent effects on DK.
Table 4 also demonstrates that there is a high degree of inter-item correlation
among the DK correlates. This suggests that any conclusion based upon the above
bivariate analysis is problematic considering that any given bivariate relationship
may result from a confounding third factor. For instance, even though the
correlation between age and DK is significant and strong (r = .232), a portion of the
bivariate relationship may result from education given the strong correlation between
age and education (r = -.226). To account for these confounding effects, I adjust
bivariate relationships by extracting out the confounding effect of other variables,
using multiple linear regression. So what correlates have independent effects on DK
rates? And, what subset of correlates test explains variation in DK?
4.4 Multivariate Analysis-of Demographic and Behavioral

Correlates
To determine the model that best explains variation in DK, I test three
separate regression models (see table 5 below). Model 1 includes all potential
correlates of DK discussed in the literature review. Model 2 excludes the non
significant correlates found in Model 1. Model 2 fits the data significantly better
than model 1 (/= .378; p>.001) (Note the change in/ from Model 1 to Model 2 is
not significant). In such cases, the best fitting model is the one with the least
parameters—most parsimonious). Model 3 includes interaction effects and fits the
65
data significantly better than Model 2 (/= 3.1; p.<.001). What respondent level
characteristics explain variations in DK?
4.4.1 Main Effects
After controlling for other respondent level characteristics, occupational
prestige, subjective health, and work status do not explain variation in DK (see
table 5 below). Note all betas (b) presented in table 5 are unstandardized with the
units in number ofDKs squared. Specifically, model 1 in table 5 shows that:
(1) Respondents With high occupational prestige are not

significantly more likely to say DK than respondents with lower
levels of occupational prestige, controlling for other respondent
level characteristics (b = .105; p>.001).
(2) Respondents with poor subjective health are not sisnificantlv

more likely to say DK than respondents with good subjective
health, controlling for other respondent level characteristics (b =
-.102; p >.001).
(3) Respondents who are retired are not sisnificantlv more likely to
say DK than respondents who are not retired, controlling for
other respondent level characteristics (b =.059; p.>.GQl).
These results are not surprising for two reasons. First, our bivariate analysis
showed that respondents with high occupational prestige levels were not
significantly more likely to answer DK than respondents with lower levels of
occupational prestige.
Second, previous research demonstrates that age explains the subjective
health and retirement effects (Young 1998c; 1999b; 1999c). Simply put,
66
Table 5s Siunmated DK Index regressed on DK correlates

(Umtmdanfized Betas with measmmsacttt unit In number of DK squared)
V ariables Mfodd 1 Model 2 lîlotSeii 3
Degree (1) -0.001 ** **
[.020]
Age (in years) 0.017 MM 0019
[.003] [.002] [.003]
Cognitive Sophistication - 0.107 -0103 -0.145
[.028] [.023] [.021]
Gvic Rutidpatfon - 0.056 -0052 0046
[.011] [.010] [.038]
Gender (Feraale=l) 0.166 016 -0113
[.064] [.061] [.027]
Race (White=l) - 0.209 -0211 -008
[.119]- [-121] [.015]
[High” f) 0.105 ** **
[.094]
Heatth (Poor Hm MIf 'I) -0.102 **
[.149]
Work Status (SMir«t=l) 0.059 **
r.128]
Cog Soj*ist*Qvlc Part ** 0020
[.006]
Age*Gender ** 0606
[.021]
Gvic Partiripation*White ## ** -0050
[.030]
Gvic Partidpation*Age ** - 0.002
[001]
Constant am 1.47 0.523
[341] [.222] [178]
Sample size (n=) 1446 1446 1446
Adjusted R Square 0.1446 0.1437 0.1588
* All coefficients in Hack italites are sigpiliceni at t e r . 1 level (two4afled test)
** Efepeedent w ia H e transformed usii^g (he square root
t SEs adjusted ftr couples design ofHiesample (clustering and stratification)

% SEsm b r a c k s tinder coefficients
67
respondents who are retired and rate themselves as having poor subjective
health are more likely to say DK because they are older.
The bivariate results in Table 4 support the above conclusion—older
respondents are more likely to rate themselves as having poor subjective health (r =
.178; p.< 05) and to be retired (r = .587; p.<.05). Specifically, a respondent’s age
accounts for approximately 3 percent of the variance in a respondent’s self-rating of
subjective health (.178*.178 «.03) and approximately 34 percent of the variance in a
respondent’s likelihood of being retired (.587*.587 «.34).
Model 1 in Table 5 also shows that education does not have an independent
effect on DK, after controlling for other respondent level characteristics. This non
significant effect is an unexpected finding for two reasons. First, our bivariate
analysis demonstrated a statistically significant and robust correlation between
education and DK (r = -.177; p.<.05).
Second, no study has produced similar results. Indeed, the methods
literature shows that education is not only a robust predictor of DK but also a
predictor of other forms of survey error, such as coverage bias, unit nonresponse,
and measurement error (e.g., Young 1999a, 2000c; Smith 1988; Smith 1981; Groves
and Couper 1998; Narayan and Krosnick 1996). Given the importance of this
finding, I will more closely examine the relationship between education and DK in
the next chapter (Chapter 5). What initial clues, though, might be gleaned from our
present analysis?
68
A cursory analysis of the bivariate correlations in Table 4 suggests that five
correlates may possibly explain the association between education and DK.
Specifically:
(1) Age: Older respondents are more likely to have lower levels of
education than younger respondents (r = .070; p .<.05).
(2) Cognitive Sophistication: Respondents who have higher levels

of cognitive sophistication are more likely to be educated than
respondents with lower levels of cognitive sophistication (r =
.543; p.<.05).
(3) Civic Participation: Respondents who are more likely to

participate in civic activities have higher levels of education than
respondents who are less likely to participate in civic activities (r
= .388; p.<05 ).
(4) Sex: Female respondents are less educated than male respondents
(r = -.076; p.<.Q5 ).
(5) Race: White respondents are less educated than non-white

respondents (r = .116; p.<05 ).
So what preliminary conclusions may be drawn? Even though it is quite
possible that the education effect results from a combination of all five factors,
cognitive sophistication is the strongest candidate for two reasons. First, research
on DK has typically used education as a proxy for cognitive sophistication—it
makes sense that the actual measure would explain the proxy. Second, the bivariate
correlation between education and cognitive sophistication is very strong (r = .562;
p.<05). In chapter 5 ,1 test this cognitive sophistication hypothesis in addition to
other possible explanations.
69
Model 2 in Table 5 also shows that five of the 9 respondent level correlates
of DK remain statistically significant, after controlling for other respondent level
characteristics. First, older respondents are more likely to answer DK than younger
respondents (b = .018; p .<05). Seventy-five year old respondents, for instance, are
2.3 times more likely to answer DK than 20 year-old respondents (7.5 vs. 3.3
DKs).1
Second, respondents with higher levels of cognitive sophistication (=10) are
less likely to say DK than those with lower levels (=1) of cognitive sophistication (b
= -.103; p.<.05). Respondents with high levels of cognitive sophistication are 9.6
times less likely to answer DK than respondents with low levels of cognitive
sophistication (. 19 vs. 1.87 DKs).
Third, respondents who participate more in civic activities (= 10) are less
likely to answer DK than those who participate less (= 1) in civic activities (b = -
.052; p.<.05). Specifically, a respondent who participates frequently in civic
activities is 2 .2 times less likely to say DK than one who does not participate
frequently in civic activities (.903 vs. 2.01 DKs).
Fourth, female respondents are more likely to say DK than male respondents
(b = .160; p .<05). Female respondents answer DK, on average, 1.23 times more on
the survey than men (2.66 vs. 2.16).
Note I calculated the average DK level in 4 steps. To simplify explanation, let us

calculate the average DK level for a respondent of 75 years of age. Step 1: multiply beta (b =
.018) times the age of the respondent [.018 * 7 5 = 1.275],Step 2: take the product in step 1
(1.275) and add the regression constant for model 1 in table 6 [1.275 + 1.47 = 2.745]. Step3:
square the result in step 2 (1.987) in order to transform the units from DKs squared to number of
DKs. [2.745 * 2.745 = 7.54]. Step 4: Seventy-five year old respondents, on average, answered
DK 7.5 times on the survey.
70
Finally, white respondents are less likely to answer DK than non-whites
respondents (b = -.209; p.< 05). Specifically, white respondents answered DK, on
average, 1.59 times on the survey, compared to 2.16 for non-white respondents.
None of these results are surprising—all having been cited in the literature
review. More elusive, however, is explaining why each of the above characteristics
is correlated with DK—explaining why will be one of the central challenges of this
thesis.
4.4.2 Interaction Effects
Model 3 in Table 5 includes interaction terms. If statistical models are to
reflect reality, it is essential to test for interaction effects. Yes, older respondents
are more likely to say DK than younger respondents. But so what? Everyone knows
that the real world is more complex than a simple bivariate relationship would
suggest.
For instance, are older respondents with higher levels of cognitive
sophistication less likely to answer DK than older respondents with lower levels of
cognitive sophistication? Or, are older female respondents more likely to say DK
than younger female respondents? To test for possible interactions, I draw upon
both the methods literature as well as empirical findings. What does the methods
literature tell us?
Most of the literature concerning DK correlates has examined main effects
and not interaction terms (Gergen and Back 1966; Glenn 1969; Sudman and
71
Bradbum 1974; Francis and Busch 1975). Several studies, however, have found
that:
(1) Older female respondent are more likely to say DK than

younger female respondents (Rapaport 1982,1985;
Krosnick and Milbum 1990). The explanations,
however, differ. Rapaport (1982,1985) argues that
generational change brought on by the women’s
movement increased the likelihood for women to express
their opinion, while Krosnick and Milbum (1990)
attribute the interaction to life-cycle differences between
men and women.
(2) Female respondents with higher levels of cognitive

sophistication are less likely to say DK than female with
lower levels of cognitive sophistication (Krosnick and
Milbum 1990).
(3) Older respondents with higher levels of cognitive

sophistication are less likely to answer DK than older
respondents with lower levels cognitive sophistication
(Krosnick and Milbum 1990).
(4) Non-white respondents with higher levels of cognitive

sophistication are less likely to say DK than non-white
respondents with lower levels of cognitive sophistication
(Krosnick and Milbum 1990).
Furthermore, an analysis of the standardized bivariate correlations in Table 4
2
suggests seven possible candidates for a three-way interaction. They include:
(1) Age * Cognitive Sophistication: Older respondents

have lower levels of cognitive sophistication than
younger respondents (r = .070; p. < .05).
(2) Age * White: White respondents, on average, are older

than non-white respondents (r = 081 ; p. <.05).
(3) Civic Participation* Cognitive Sophistication:

Respondents who have higher levels of cognitive
sophistication are more likely to participate frequently in
civic activities (r = .080; p. < .05).
(4) Civic Participation* Female: Female respondents are

less likely to participate in civic activities than male
respondents (r = -.076; p. < .05).
(5) Civic Participation* White: White respondents are

more likely to participate in civic activities than non
white respondents (r =116; p. < .05).
(6) Cognitive Sophistication* Female: Female respondents

are less likely to participate in civic activities than male
respondents (r = -.056; p. < .05).
(7) Cognitive Sophistication* White: White respondents

are more likely to be cognitively sophisticated than non
white respondents (r = 272; p. < .05).
Considering the bivariate correlations cited above, the strongest candidates
include the interaction between (1) civic participation and cognitive sophistication (r
For an interaction effect to be possible in a multivariate context, two separate

conditions must be met. First, there must be a correlation between the two independent variables,
even controlling for other factors. Second, the two independent variables must be correlated with
the dependent variable, even after controlling for the joint effect of other factors. Given these
conditions, one simple data analysis shortcut, in specifying interaction effects, is to determine
which independent variables are highly correlated. While this method is notfool proof, it can
help the data analysts make intelligent cuts in searching for potential interaction effects (Green
1997).
73
-.462); (2) race and civic participation (r =116); and (3) race and cognitive
sophistication (r =.272). So what do we find in the multivariate context?
I tested all possible interactions. However, only four interactions were found
to significantly improve upon the fit of Model 2 ( f =3.1; p .<.001).
Interaction I: Cognitive Sophistication by Civic Participation.
Respondents who are more both cognitively sophisticated and who participate more
frequently in civic activities are more likely to answer DK than those who are
cognitive sophisticated but do not participate frequently in civic activities (b = .020;
p<01). Specifically, a respondent who is cognitively sophisticated and who
participates frequently in civic activities is 2.14 times more likely to answer DK than
his counterpart who does not participate frequently in civic activities (.337 vs. .157
DKs). Why is this the case?
The initial hypothesis would, of course, be the opposite: respondents who are
cognitively sophisticated and who participate frequently in civic activities should be
the least likely to say DK. What, then, might be going on?
Two possible explanations may account for the interaction. First, given
high levels of competence, such respondents may be less likely to employ face-
saving strategies, like providing substantive answers in order to hide ignorance
concerning a topic. Second, such respondents may be more likely to criticize
poorly formulated questions by expressing DK.
Interaction 2: Gender by Age. Older women are much more likely to
express DK than older men, while younger women are about as likely to answer DK
as younger men (b = .006; p.<.05). Specifically, while 65 year-old female
74
respondents are 1.48 times more likely to answer DK than male respondents their
same age (4.8 vs. 3.3 DKs), younger women are only 1.13 times more likely to say
DK than younger men (1.15 vs. .91 DKs). These results confirm previous findings,
that the gender gap is decreasing. However, the reasons are less clear: the
interaction may result from cohort or from life cycle differences.
Interaction 3: Race by Civic Participation. White respondents who
participate frequently in civic activities are slightly less likely to express DK than
non-white respondents who participate frequently, while white respondents who do
not participate frequently in civic activities do so at about the same rate as their non
white counterparts (b = -.050; p.<.05). Specifically, non-white respondents who
participate frequently in civic activities are 1.2 times more likely to answer DK than
white respondents who participate frequently in civic activities (.32 vs. .26 DKs),
while whites who are less likely to participate in civic activities are only .89 times
more likely than their non-white counterparts. Once again, these results confirm
previous findings. But why?
The significant interaction term suggests that a portion of the race effect is
mediated through civic participation. In brief, non-whites, in part, are less likely to
answer DK because they are less likely to participate in civic activities.
Unfortunately, we can not determine from this analysis whether the effect results
from non-whites being less likely to be exposed to information and/or from being
less motivated to be “good respondents”.
Interaction 4: Civic Participation by Age. Older respondents who
participate more frequently in civic activities are less likely to answer DK than older
75
respondents who are less likely to participate in civic activities (b = -.002; p.<.05).
For instance, 65 year-old respondents who participate frequently in civic activities
are 1.4 times more likely to answer DK than 65 year-old respondents who
participate infrequently in civic activities (2.79 vs. 1.94 DKs). In closing, these
findings show that civic participation mediates a portion of the age effect. However,
like the other interaction effects, we know much less about why the significant
correlations exist.
Here it is also important to note that three of the main effects (gender: b = -
.113; p.>.05; race: b = -.080; p.>.05 and civic participation: b = .046; p>.05 ) are
no longer statistically significant, after controlling for interaction effects. In short,
the interaction terms explain away the main effect for gender, race, and civic
participation. What do these results suggest?
The main conclusion here it that gender, race, and civic participation do not
have direct effects on DK, instead being mediated by other respondent level
characteristics. Specifically, whites are less likely to answer DK because they are
more likely to participate in civic activities; female respondents are more likely to
say DK because they are older; and finally respondents who participate more in civic
activities are less likely to answer DK because they are more likely to be white;
older and are more cognitively sophisticated.
76
4.5 Conclusion
The analysis in this chapter both confirmed previous research as
well as uncovered new findings. Like past research, our analysis has shown that
DK is a non-random phenomenon—respondents who are more likely to answer DK
are systematically different than respondents who are less likely to answer DK.
Some of our results, however, were unexpected. First and foremost, we were
able to explain away the education effect, after controlling for other respondent level
characteristics. This finding is noteworthy because education is an important
predictor of all forms of survey error, including DK (e.g., Young 1999a; Smith 1988;
Smith 1981; Groves 1989; Groves and Couper 1998; Naiayan and Krosnick 1996).
What might explain this result?
One possible explanation is that this is one of the few studies that has
included a strong direct measure of cognitive sophistication—a direct measure of a
concept should explain away the proxy. I test this cognitive sophistication
hypothesis in the next chapter (chapter 5).
Second, our analysis also shows that the gender, race, and civic participation
effects are all mediated through other respondent level characteristics.
Third, we found a significant interaction effect between civic participation
and cognitive sophistication. This result suggests that DK is not necessarily an
indicator of low respondent performance. Instead, among high-performing
respondents, DK may actually reflect a thoughtful well-considered response.
Specifically, such respondents (1) may be more willing to express ignorance and/or
77
(2) may be more likely to critique poorly formulated questions (e.g., vagueness of
the question content, poor wording, discordance between question stem and response
scale, response scale with missing options, etc.) by expressing DK.
Finally, our analysis in this chapter made it very apparent that relatively little
is known about why respondent level characteristics are correlated with DK. Both
age and civic participation are correlated with DK, but so what? What do these
correlations mean? Put another way, quantitative analysis, while broad, is not very
deep. One of the primary objectives of this thesis is to uncover the why behind the
correlations.
78
CHAPTER FIVE
ANALYSIS OF THE ASSOCIATION BETWEEN DK AND

EDUCATION
5.0 Analysis of the Association between DK and Education
la the last chapter (chapter 4), we were able to explain away the relationship
between education and DK. This finding is quite important because education is not
only an important predictor of DK but also of many other sociological phenomena.
Indeed, education is an important socializing mechanism that has brought profound
changes to. American society, especially since the rapid expansion of higher
education after the Second World War. Extensive research oh a variety of topics has
shown the multi-faceted way in which education serves as an important socializing
mechanism. A sampling of this research shows that:
(1) More educated individuals are more knowledgeable

about politics and other related issues as well as are more
likely to be active in civic affairs (Nie et. al. 1997;
Campbell et. al. 1960; Hyman et. al. 1975).
(2) More educated individuals are more tolerant towards a

number of issues/groups, including civil liberties, race,
and gender (e.g., Davis 1975,1980,1992; Schuman et.
al. 1998; Mayer 1992).
(3) More educated individuals are more likely to earn more

money than less educated individuals (Blau and Duncan
1967).
79
(4) More educated individuals are more likely to express
opinions (Converse 1964,1970; Krosnick and Milbum
1990).
Furthermore, education has been shown to be an extremely important
predictor of survey error. The survey methods literature, for instance, has
consistently shown that the less educated are:
(1) More likely not to be included in the sampling frame

(noncoverage)—especially in telephone surveys (e.g.,
Smith 1988; Young 2000)
(2) More likely not to respond to a survey (e.g., Smith 1981;

Groves and Couper 1998)
(3) More likely to be influenced by changes in question

context; order, and wording (e.g., Narayan and Krosnick
1996)
The survey research Mterature typically treats education as a proxy for more
proximate characteristics. For instance, the research on question wording, order, and
context effects uses education as a measure of cognitive sophistication—which
loosely encompasses a respondent’s cognitive abilities, exposure to information, and
level of knowledge about the survey topic (e.g., Schuman and Presser 1981;
Krosnick 1991). Similarly, the survey literature on unit nonresponse uses education
as a measure for a respondent’s willingness to participate in a survey (Groves and
Couper 1998).
The wide-use of education as a methodological proxy variable may, in part,
be attributed to the fact that much of the methodological research is done on
secondary data which forces the researcher to make use of the measures found on the
80
study. Most of methods research, thus, does not specifically address what
mechanisms explain the association between education and survey error. As a result,
an important question is consistently left unanswered: what does education measure?
Is education a proxy for a respondent’s cognitive ability, socioeconomic status,
willingness to participate in a survey, motivation during a survey interview, or a
combination of factors?
To explain why we were able to explain away the education effect in chapter
4, I attempt to answer two specific questions. Furthermore, I ask a more general
question about the implication of the research in this chapter on our understanding of
the education effect as it relates to other phenomena.
Specific Questions: (1) what is the specific functional form of the

relationship between DK and education? and (2) what factors (if any) explain
this relationship? "
General Question: what might the results in this chapter imply about the
education effect in relationship to opinionation (likelihood to answer with a
substantive response) and other sociological phenomena?
81
5.1 Review of the Literature
5.1.1 Direction, Strength, and Functional Form of the

DK/Education Relationship
The research on DK suggests (1) that education is negatively related to the
level of DK and (2) that this negative relationship persists even when controlling for
both respondent and question level characteristics (e.g., Ferber 1966; Converse
1977). We find one exception to this general tendency. On fictitious and obscure
questions, the DK/education association actually becomes positive (Schuman and
Presser 1981; Bishop etal. 1980). Smith (1981) argues that this reversal in the
relationship results from the less educated’s greater issue confusion and greater need
to Mde their ignorance.
While under most conditions the correlation between DK and education is
negative, the strength of the relationship increases with the difficulty of the question
(Smith 1981).1 Smith (1981) explains that the strength of the DK/education
relationship depends on the relative distribution of nonattitude and ambivalent

2
attitude holders within the DK category. Specifically, more difficult questions
increase the proportion of nonattitudes. Nonattitudes, in turn, correlate (negatively)
1
Difficult questions are those that are less salient to the respondent and require more specific
knowledge (e.g., questions on specific government policies such as NAFTA). Conversely, less difficult
questions are those which are more salient to the respondent and require less personal knowledge (e.g.,
questions on general values or personal evaluations such as happiness).
2
Note later research suggests that the nonattitude subgroup can be further subdivided
between those respondents which lack attitudes (nonattitudes) and those which choose the DK option
to avoid the cognitive demands of the survey question (satisficers) (Krosnick 1991; Young 1999d).
82
with education, while ambivalent attitudes do not (Faulkenberry and Mason 1978;
Smith 1981). Thus, as the proportion of nonattitudes increases, the educational
differential becomes larger as well.
These findings are important because they suggest that one must be careful
when making generalizations about the association between education and DK, since
the relationship may vary from question to question.
The literature on DK, however, does not provide much insight into the
specific functional form of the relationship between DK and education. For those
studies that include education as a predictor of DK, none offers an a priori
theoretical justification for the specification of the DK/education relationship. The
practical outcome of this under-theorizing is that these studies implicitly assume that
education is linearly related to DK. -
Furthermore, any conclusions about the functional form of the relationship
based on a review of the empirical findings of past research is problematic for
several reasons. First, many of the studies use crude two category measures for
education. In such cases, the relationship by default is linear. And second, the vast
majority of the studies look at question-specific relationships. Thus, any
generalization about the DK/education relationship is difficult because, as previously
indicated, the strength (and presumably the functional form) of the association varies
with question difficulty. However, some indirect evidence in the survey literature
does suggest that the relationship might be non-linear. Narayan and Krosnick (1996)
find that response effects, including the propensity to give a DK response under
varying question conditions, occur disproportionately among those with less than a
83
high school degree while the magnitude of these effects declines at an increasing rate
at higher levels of education.
5.1.2 Explaining tie Relationship between DK and Education
The inclusion of education as a predictor of DK receives scant theoretical
justification in the survey literature. When explanations are provided, education is
often considered a proxy for other characteristics such as knowledge, cognitive
ability, interest, and willingness to participate in the survey interview (e.g., Converse
1964,1970; Converse 1977; Smith 1981). However, none of this research attempts
to empirically test the conceptual reasons for including education as a predictor of
DK. Based on a review of the survey literature, it seems that four possible
explanations might account for the DK/education relationship: (1) cognitive
sophistication, (2) civic participation, (3):ge»cter, and (4) age. I go into further detail
below.
5.1.2.1 Cognitive Sophistication
The existing literature on the correlates of DK responses suggests that the
DK/education relationship may be a function cognitive sophistication (e.g., Krosnick
1991; Schuman and Presser 1981). There are two possible explanations. First, the
cognitive sophistication effect may result from differential knowledge/exposure to
information. In support of this hypothesis, a number of studies demonstrate (1) that
the more educated are more informed/knowledgeable about the issues asked on
surveys (Hyman et. al. 1975; Smith 1981; Nie et. al. 1997) and (2) that the well-
84
informed are less likely to answer DK (Converse 1964,1970; Converse 1977;
Faulkenberry and Mason 1978; Francis and Busch 1975; Rapoport 1982,1985;
Smith 1981).
Second, the cognitive sophistication effect may also result from varying
levels of verbal ability. Considerable research suggests (1) that the less educated
are more likely to have lower verbal abilities and (2) that respondents with weaker
verbal skills are more likely to have difficulties understanding survey questions and,
in turn, are more likely to answer DK (Krosnick and Alwin 1987; Krosnick 1991;
Young 1999b, 1999c).
5X2.2 Civic Participation
■Other research suggests that the association between DK and education may
result from, participation: in civic activities. This research shows (1) that the more
educated are more likely to participate in civic activities (e.g.,Nie et. al. 1997) and
(2) that those more involved in civic activities are less likely to answer DK (Francis
and Busch 1975; Faulkenbeiry and Mason 1977; Rapoport 1985; Young 1999a;
1999c).
5.1.2.3 Gender
The relationship between DK and education may be a function of the gender
of the respondent. Several studies indicate (1) that men, on average, are more
educated than women (e.g., Rapoport 1982,1985; Young 1999b, 1999c) and (2) that
women are more likely to give a DK answer than men (Francis and Busch 1975;
85
Rapoport 1982,1985; Smith 1984; Sudman and Bradbum 1974; Young 1999b;
1999c).
5X2.4 Age
Other research indicates that the relationship between education and DK
results from age differences. A number of studies have shown (1) that older
individuals are less educated than younger individuals and (2) that older individuals,
on average, are more likely to express DK-like responses than younger people
(Gergen and Back 1966; Glenn 1969; Young 1999b; 1999c).
5X2.5 Other Possible Explanatory Factors
Additional factors may explain the association between DK and education,
such as occupational prestige, race, subjective health, and size of city (e.g., Ferber
1966; Francis and Busch 1975). In analysis of both cross-temporal as well as cross
national data, Young (1999b; 1999c) found that none of these factors, independently
of cognitive sophistication, civic participation, sex, and age, helped explain the
DK/education relationship.
5.2 Methods
To test for what factors explain the association between DK and education, I
test 5 successive regression models. Here I hypothesize that cognitive
sophistication, civic participation, gender, and age explain the DK/education
86
relationship. Based upon research in the last chapter, the strongest possible
candidate is probably cognitive sophistication.
to the first model, DK is hypothesized to be a function of education (see
equation 13 below). Specifically, education includes 4 education dummy variables:
(1) high school; (2) junior college; (3) college; and (4) graduate school with “less
than a high school degree” being the excluded category.
(13)
DK = p0 + Dl(High School) + D2(Jimior College) + D3(College) +

D4(Graduate) + ei
In model 2, DK is hypothesized to be a function of education as well as
cognitive sophistication (see equation 14 below). ■■
( 14)
DK = {30 + Dl(High School) + D2(Junidr College) + D3(College) +

D4(Graduate) + pi (Cognitive Sophistication) + ei
Here I am not interested in estimating the main effect of cognitive
sophistication (as well as civic participation, gender, and age). Instead, I want to
determine if controlling for cognitive sophistication changes the relationship
between DK and education. Put into statistical parlance, I am interested in the effect
of education on DK taking out the effect of cognitive sophistication. Using this
logic, I estimate successive models with additional predictors.
87
(15)
DK = P0 + Dl(ffigh School) + D2(Junior College) + D3(Coltege) +

D4(Graduate) + pi (Cognitive Sophistication) + p2(Civic
Participation) + ei
In model 3, DK is hypothesized to be a function of education, cognitive
sophistication, and civic participation (see equation 15 above).
(16)
DK = p0 + Dl(High School) + D2(Junior College) + D3(College) +

Participation) p3(Gender) + ei
In model 4, DK is hypothesized to be a function of education, cognitive
sophistication, civic participation, and gender (see equation 16 above). Finally, in
model 5, DK is hypothesized., to be a function of education, cognitive sophistication,
civic participation, gender, and age (see equation 17 below)
'■(17)-'
a
DK = pO + Dl(High School) + D2(Junior College) + D3(College) +
Participation) p3(Gender) + p4(Age) + ei
5.3 Examination of the Relationship between Education and DK

In the following section, I attempt to answer three basic questions:
(1) what is the direction of the relationship between

education and DK (positive or negative)?
(2) what is the functional form of the education/DK

relationship (linear or nonlinear)?
88
(3) what factors might explain the association between
education and DK?
Table 6: Unstandardized OLS Estimates for Models with Education

Main Effects controlling for DK predictors
Variables ........... Model 1 Mode! 2 Model 3 Model 4 Mode! 5

H igh School Degree - 0,641 - 0,358 - 0,336 - 0,349 -0,039
[.125] [.115] [•U4] [.114] [-119]
Ju n io r College Degree - 1,03 - 0,662 - 0,634 - 0,652 -0,208
[.175] [.182] [.178] [.178] [.196]
College Degree - 0,861 - 0,432 - 0,357 - 0,347 0,092
[.146] [.147] [.152] [.153] [.166]
Graduate Degree - 0,864 -0,307 -0,175 -0,165 0.253
[.156] [.197] [.206] [207] [.224]
** -0,073
| l
Cognitive Sophistication - 0,101 - 0,072
[.031] [.031] [-031]
Civic Participation ** ** - 0,055 - 0,053 - 0,056
[Oi l] [■Oil] [.011]
Gender (Fem ale=l) ** ** 0,211 0,177
[.069] [.062]
Age (in years) ** ** 0,018
r.oo2i
C onstant 2,27 1,51 1,48 1,49 1,22
. . r.1211 ; , _I-M81.... [.107] ....UMl . ......1.1081...
Sample size (n=) 1457 ' . 1451 1447 1447 1447
MasfeiKSflafflS— — .— — - M U .... A1401,,
* All coefficients in black, italics are significant at the .05 level (two-taiied test}
Dependent variable transformed using the square root
f Excluded Category for Education is 'Less ttan a High School Degree’1
t Standard Errors adjusted for complex design of the sample (clustering and stratification)
I t Standard Errors in brackets under coelfieicnis
Table 6 above includes the unstandardized OLS estimates that I use to create
Figures 2 through 5. In order to adjust for the non-normal distribution of the
dependent variable (note Figure 1 in chapter 3), I transformed the dependent variable
using the square root function. In order to easily interpret the results, I have re
transformed all estimates into their original metric (number of DKs) using the
quadratic function.
89
Using these re-transformed estimates, Figure 2 below presents the raw
association between level of education and DK.
Figure 2: Relationship between DK and Education
.5.1
4.0
I
Sa 30
2,0
2.0
&0
Less High School High ScSmmsSDegree College Degree College Degree1
Degree
•.Iw d o f Education
Confirming past research, Figure 2 above indicates that education and DK
are negatively related. The more educated, on average, are less likely to express
DK-like answers than the less educated (overall relationship significant; p.=.000).
Figure 2 furthermore adds light to previous research by demonstrating that
the association between education and DK is not linear but nonlinear.
Specifically, while respondents with less than a high school degree are
I assess the overall significance of the DK/education relationship by testing the joint
hypothesis that high school degree, junior college degree, college degree, and graduate degree are
simultaneously significant. I performed all such tests in STATA using the “test of linear
hypothesis” subcommand. Note I also correct all p-values using Bonferroni adjustments in order
to account for multiple comparisons.
90
approximately 2.0 times more likely to give a DK response than those with a high
school degree (difference statistically significant; p.<Q5), respondents with a high
school degree are only 1.3 times more likely to answer DK than those with a college
or graduate degree (difference not statistically significant; p>.Q5) 4 Put another
way, the gains from education occur overwhelmingly between those with less than a
high school degree and a high school degree with diminishing returns at higher
levels of education (some college and up). The above results lead to a natural
question. Why are individuals with lower levels of education more likely to give a
DK response than individuals with higher levels of education?
One reason may be that respondents with lower levels of education also have
correspondingly lower levels of cognitive sophistication. Bivariate analysis does
suggest that cognitive sophistication is related to DK and education making it a
strong explanatory candidate Specifically, table 4 in chapter 4 indicates that:
(1) a respondent’s propensity to give a DK-like answer decreases with

higher levels of cognitive sophistication . The bivariate relationship
between cognitive sophistication and DK is strong, negative (r = -.261)
and statistically significant and;
(2) More educated respondents are more cognitively sophisticated. The

bivariate relationship is positive (r =.543) and statistically significant.
Is cognitive sophistication an explanatory factor?
For all significance test of difference of two means, I combine the College and
Graduate School degree categories because the difference between the two groups is not
statistically significant.
91
Figure 3: Relationship between Education and DK (Controlling for

Cognitive Sophistication)
6.0
5.0
©
3.0
2.0
1.5
1.0
0.0
Less BBgik Sdsooi Higfe S ^ o o l Degree J u s e lr Oalkge Begree College ©agree
•Beêe
‘Level of Edo'cstion
Figure 3 above suggests, that yes, a respondent’s level of cognitive
sophistication explains at least some of the differences between individuals with
lower and higher levels of education. Specifically, while respondents with less than
a high school degree are still more likely (1.7 times) to give a DK response than
those with a high school degree (difference statistically significant; p.<.05),
respondents with a high school degree are not more likely (1.09 times) to say DK
than those that went to college or graduate school (difference not statistically
significant; p>.05).
92
Put another way, a respondent’s level of cognitive sophistication explains the
difference in DK rates between respondents with a high school degree and those
with a college/graduate degree. However, even while taking into consideration
respondents’ level of cognitive sophistication, the overall relationship between
education and DK remains statistically significant (p.=.003) with approximately the
same non-linear form.
Another possible explanation for the education/DK relationship may be that
respondents who participate more frequently in civic activities are less likely to
answer DK. Bivariate analysis suggests that a respondent’s level of civic
participation may indeed partially explain the DK/education relationship. Table 4
shows that:
(1) respondents that participate more frequently in civic activities are less
likely to give a DK-like answer. The bivariate relationship is relatively
strong (r = -.212) and statistically significant and;
(2) More educated respondents are more likely to participate in

civic activities. The bivariate correlation is also statistically significant (r
= -.226).
Figure 4 below, however, suggests that a respondent’s level of civic
participation does not seem to account for much of the relationship between
education and DK.
93

Cognitive Sophistication and Civic Participation)
6.0 -
5.0
3.0
2.0
1.0
0.0
Less High Sctosl High School Degjrw .JmwlrCoHegic Degree College Degree Gradtasite Degree
Degree
Education
, indeed, even after controlling for both cognitive sophistication and civic
participation, the basic form of the relationship does not change and the overall
association remains statistically significant (p.=004). Respondents with low levels
of education (less than a high school degree) are still more likely to give a DK-like
response than respondents with higher (college and graduate degree) and medium
levels (high school degree).
Another possible explanation: women, on average, are less educated than
men and are, therefore, more likely to express a DK-like answer. The following
bivariate analysis does not seem to support this gender hypothesis. Indeed, Table 4
indicates that:
94
(1) Women are more likely to answer DK than men. The bivariate
relationship however is weak ( r = .086) and not statistically significant
and;
(2) Women are less educated than men. The bivariate correlation, however,
is weak (r = -.084) but statistically significant.

cognitive sophistication, civic participation, and gender)
6.0 !
5.0
3.0
£ 2.8
1.7
1.0
0.0
Less High School High School Degree Jaitfor College Degree College Degree
Defp*ee
In further support of our bivariate analysis, Figure 5 above suggests that even
when controlling for gender in addition to cognitive sophistication and civic
participation, the basic form and direction of the relationship does not change
(p.=.004). Respondents with low levels of education (less than a high school degree)
are still more likely to give a DK-like response than respondents with higher (college
and graduate degree) and medium levels (high school degree). Might age account
for the remaining difference between DK and education?
95
Bivariate analysis suggests that yes, age may be a good explanatory
candidate. Specifically, table 4 indicates that:
(1) Older individuals are more likely to answer DK than younger

individuals. The bivariate relationship is both strong (r =.232) and
significant and;
(2) Older individuals are less educated than younger individuals. The
bivariate correlation is both strong (r = -.177) and statistically
significant.

cognitive sophistication, civic participation, gender, and age)
6.0
5.0
M 4.0
O
©
22
2.0
4^
1.0
0.0
Less High School High School Dogreo Junior Caiiegv Co3St*ge Degree Graduate Degi
Degree Degree
Education
In support of the above conclusions, Figure 6 above suggests that yes, age
explains the remaining variation between education and DK. Indeed, once we take
into consideration a respondent’s age in addition to cognitive sophistication, civic
participation, and gender, the overall nonlinear relationship between education and
DK no longer remains significant (p.=.969).
96
In brief, respondents with low levels of education (less than a high school
degree) are no longer more likely to give a DK-like response than respondents with
higher (college and graduate degree) and medium levels (high school and junior
college degree). Specifically, age accounts for the difference in mean levels of DK
between respondents with less than a high school degree and a high school degree
suggesting that age not education is the primary factor, contributing to differences in
DK rates among the medium and less educated. In addition, although not
statistically significant, age actually reverses the relationship between DK and
education—with the more educated more likely to give a DK response than the less
■• 5'
educated. What can we conclude from the above analysis?
5.4 Conclusion
At the beginning of this chapter, we asked two specific questions and one
general question:
Specific Question: (1) what is the functional form of the relationship

between education and DK? and (2) what factors explain away the
relationship between education and DK?
General Question: what might the results in this chapter imply

about the education effect in relationship to opinionation and other
sociological phenomena?
So what did we find in the above analysis?
Note I ran a series of regression models and found the results to be robust
irrespective of the order in which I entered cognitive sophistication, civic participation, gender,
and age.
97
5.4.1 Specific Questions
Taken as a whole, five main findings came out of the analysis. First, the
association between education and DK is negative. Second, the relationship is
nonlinear with a disproportionate number of lower educated respondents answering
DK. Third, two factors explain away the relationship between education and DK:
(1) cognitive sophistication and (2) age.
Fourth, cognitive sophistication and age explain different parts of the
association between education and DK. Specifically, cognitive sophistication
explains the difference in mean DKs between respondents with high (college and
graduate degree) and moderate (high school degree) levels of education. Age, in
turn, explains the difference in mean DKs between respondents with moderate and
low levels of education (less than a high school degree). Finally, while cognitive
sophistication and age explained away the overall relationship, those with a junior
college degree were consistently less, likely to answer DK than respondents with
higher and lower levels of education. What might explain this result?
Two possible explanations exist for why junior college respondents are less
likely to answer DK: (1) such respondents may know ju st enough to think that they
know everything and/or (2) they may be too embarrassed to express DK because
they feel that they should know.
This chapter has been quite fruitful in uncovering what specific
characteristics explain the education effect. However, several pending questions are
left unanswered. Why exactly is age correlated with DK? Similarly, thinking back
98
to chapter 4, why is civic participation related, though indirectly, to DK? In short,
what are the meanings behind these relationships?
5.4.2 General Question
The research presented here in this chapter also suggests new lines of
research concerning the education effect as it relates to other sociological
phenomena. Yes, education is an important socializing mechanism, but why?
What factors account for the education effect? Do these factors vary
from phenomenon to phenomenon? Or are they constant? Given the importance
of education as a sociological variable, new research should return to the basics
and take a new look at the variable.;Indeed, the results demonstrate that much
can still be learned by the 'simple analysis of a bivariate relationship.
CHAPTER SIX
AN ANALYSIS OF THE AGE AND CIVIC PARTICIPATION

EFFECTS
6.0 An Analysis of the Age and Civic Participation Effects
Multivariate techniques have become important tools for the quantitative
sociologist. Such tools have allowed the analyst to uncover important effects,
shedding light on underlying sociological processes. However, even with all the
advantages of quantitative techniques, one notable disadvantage still remains.
Quantitative research, while broad (and generalizable), is not very deep.
Unlike the ethnographer, quantitative analysts do a poor job of tapping the
underlying meanings of their variables, This shortcoming is compounded by the fact
that quantitative data analysis is often done on secondary data sources (such as the
GSS), seriously limiting the development of adequate measures. The quantitative
sociologist, therefore, is often left using proxy measures that only indirectly tap the
intended theoretical concepts.
This study, like most dissertations done in the quantitative social sciences, is
plagued by the same limitations. Yes, I have found several interesting correlations,
but so what? What do they mean?
To underscore this point, in chapter 4, we were unable to explain why
respondents who are younger and participate more in civic activities are less likely to
100
provide a DK response. Furthermore, while accounting for the relationship between
education and DK, we could not adequately explain why. Why are age and civic
participation correlated with DK?
In the last chapter (Chapter 5), I was able to unravel part of the meaning
conundrum, finding that the association between education and DK is a function of
two factors: age and cognitive sophistication. But, in so doing, I answered a riddle
with another riddle: what does age really measure?
6.0.1 Respondent Motivation and Respondent Cooperation
.. Might there be a broader meaning that accounts for the correlations? The
short answer is yes.. Both age and civic participation have been used as proxy
variables in the methods literature for respondent ..comprehension-and respondent
motivation (Groves and Couper 1998; Krosnick 1991; Young 1998;Young 1999b;
1999c). Indeed, the methods research has treated nonresponse as a specific case of
political participation. Political (social) participation here refers to actions—both
direct and indirect—that ordinary members of a political system take to influence
outcomes (Kaase and Marsch 1979; Nagel 1987; Verba and Nie 1972). One indirect
way in which to influence the system is to participate in a survey. Furthermore, this
same literature shows that those individuals who are more likely to participate are
also more likely to be informed about the society in which they are members (Verba
and Nie 1972). So what exactly does the survey methods literature have to say about
survey error and participation?
101
The methods literature links survey participation to the degree to which a
respondent feels bound or connected to the society (Glenn 1969; Mathiowetz et. al,
1991; Couper et. al. 1997; Groves and Couper 1998). Socially isolated and alienated
respondents are less likely to participate in social activities, like surveys. Groves
and Couper (1998) note that surveys researchers have long felt that respondent “
... feelings of ‘civic duty’ prompt survey participation... ”, while feelings of
alienation prompt survey non-participation (pp. 131). Linked to alienation and
social isolation, research shows that survey non-responders are also more likely to
have lower levels of political efficacy; are less likely to trust government; are more
likely to feel alienated from society; and are less likely to have confidence in societal
institutions, such as religion and the press (Southwell 1985; Weatherford 1991;
Groves and Couper 1998).
The concepts of civic duty and social isolation have also been used to
explain item nonresponse, particularly DK (Krosnick 1991; Young 1998; Young
1999b; 1999c). This literature argues that respondents with high levels of social
isolation and alienation are more likely to answer DK because they do not feel
socially obliged to be a “good respondent” by providing substantive responses.
Specifically, this line of research speculates that age and civic participation are
proxy variables for respondent motivation (Young 1998;Young 1999b; 1999c).
These additional explanations, however, only raise additional questions. Can
we really get at respondent motivation and comprehension? Or will we be forever
stuck speculating about correlations between poorly understood proxy variables?
102
6.0.2. Solutions
There exist two possible solutions to our problem. First, the GSS includes
two questions at the end of the interview that ask the interviewer to evaluate
respondent performance. The first question (COMPREND) asks the interviewer to
rate the respondent’s overall comprehension of the questions on the study. The
second question (COOP) asks the interviewer to evaluate the respondent’s attitude
toward the interview.
These measures will be used to separate out the possible social and cognitive
aspects of the age and civic participation effects. Specifically, I use COOP to
determine to what extent age and civic participation tap respondent motivation,
while COMPREND to determine the degree to which these same correlates capture
respondent comprehension.
The advantage of these questions is that they were designed to measure
respondent behavior and attitudes during the interview, as opposed to weak proxy
variables, like cognitive sophistication, age, and civic participation. The
disadvantage of these measures is that they are subjective evaluations of the
respondent’s performance during the interview. They suffer, therefore, from
potential external influences, such as the interviewer’s difficulty in convincing the
respondent to participate in the survey prior to the interview or the degree to which
the respondent provides substantive answers.
In short, all analysis in this chapter must be qualified due to the problems
associated with determining causal direction between COOP/COMPREND and
103
DK Indeed, are respondents who are evaluated as being less cooperative more
likely to answer DK because they are really less motivated? Or, do interviewers
classify respondents as being less cooperative because such respondents are more
likely to answer DK?
Second, the GSS includes items on confidence in institutions (14 confidence
items), and anomie (3-item anomia battery)—both possible measures of social
isolation and alienation. I use these items to determine whether age and civic
participation tap respondent motivation and alienation.
In response to the above discussion, I attempt to answer two related
questions in this chapter:
(1) why might older respondents be more likely to answer

DK than younger respondents?
(2) why might respondents who participate in fewer civic

activities be more likely to say DK?
6.1 Discussion of the Measures
6.1.1 Discussion of COOP and COMPREND
COOP and COMPREND appear at the end of the interview as interviewer
evaluations of respondent performance. Specifically, COOP asks: “In General, what
was the respondent’s attitude toward the interview?...friendly and interested,
cooperative but not particularly interested, impatient and restless, or hostile?”
COMPREND, in turn, asks: “Was the respondent’s understanding of the
104
questions... Good, Fair, or Poor?” Why does the GSS include interviewer
evaluations of respondent performance?
The simple answer is that we really do not know why—however, what we do
know is that COOP and COMPREND (or similar questions) began to appear in
NORC studies in the 1960’s and 1970’s. Some speculate that they appeared as a
result of Hyman’s (1954) important work on interviewer techniques. We do know
that researchers first included COMPREND as a measure of response quality in
order to give the analyst the option of excluding those respondents with poor
comprehension ratings. We know of no similar uses for COOP. Despite the fuzzy
institutional memory, researchers have used them in three ways:
(1) as correlates of survey error, nonresponse in particular (e.g.,

Smith 1986; 1987a; 1987b; 1991; 1992a; 1992b; 1992c)
(2) as controls for measurement error in substantive research
(3) as substantive measures of social engagement/disengagement and

anomie (e.g., McCutcheon 1987).
Let us concentrate on how researchers have used COOP and COMPREND
as correlates for survey error. This research, conducted internally by GSS staff, has
found that both respondent comprehension and cooperation are strongly correlated
with survey error, most notably with non-response (missingness). Specifically, these
studies show that:
(1) respondents who are perceived to be less cooperative (COOP)

are, in turn, less likely to respond to supplement self
administered questionnaires at the end of the main GSS
interview (Supplement Nonresponse) (Smith 1987a).
105
(2) respondents who are less likely to cooperate (COOP) and less
likely to understand survey questions (COMPREND) are more
likely to be item non-responders on factorial vignettes (Smith
1986).
(3) respondents who are less likely to cooperate are less likely to
have a household phone (telephone coverage bias). (Smith
1987b).
(4) respondents who are less likely to cooperate and less likely to
comprehend are more likely to be item non-responders on
household income questions (Smith 1991).
(5) Respondents who are less likely to cooperate and less likely to
understand survey questions are correspondingly less likely to
respond to re-interviews (Re-interview nonresponse) (Smith
1992a).
(6) respondents who are less cooperative and who are less likely to
understand survey questions are more likely to be item non
responders on sexual behavior questions (Smith 1992b).
(7) respondents who are less likely to cooperate and less likely to
understand survey questions are more likely to choose the
extreme ends of response scales (Smith 1992c).
So what does this all mean? fn brief, this research treats COMPREND as an
indicator of cognitive ability, while COOP as either a measure of alienation or as a
measure of the likelihood to participate in social events. In other words,
COMPREND and COOP have been used as measures of respondent motivation and
respondent comprehension. Of course, any analysis of these questions must consider
the problems associated with establishing causal direction between
COMPREND/COOP and DK
106
6.1.2 Discussion of Indicators of Social Isolation and

Alienation
In this chapter, I use two measures of social isolation and alienation: (1) a
13- item confidence in leaders of institutions scale (CONFID); and (2) a 3-item
summated anomie scale. I choose these two measures because research shows that
they tap distinct dimensions of alienation and social isolation (Smith 1997). First,
since 1973, the GSS has included a 13-item battery concerning confidence in the
leaders of institutions (CONFINAN, CONBUS, CONCLERG, CONEDUC,
CONFED, CONLABOR, COMPRESS, CONMEDIC, CONTV, CONJUDGE,
CONSCI, CONLEGIS, CONARMY). The question stem for the confidence battery
reads:
“I am going to name some institutions in this country. As far

as the people running these institutions are concerned, would
you say you had a great deal of confidence, only some
confidence, or hardly any confidence at all?”
The institutions include major companies, organized religion, education, the
executive branch of the federal government, organized labor, the press, medicine,
TV, the US supreme court, scientific community, the congress, the military, and
banks and financial institutions. Research shows that, in general, the level of
confidence in all institutions has declined since 1973 (Young 1998a, 1998b; Citrin
and Moste 1999). Furthermore, analyzing the 13-item summated scale, Young
107
(1999a, 1999b) found that (1) the less educated; (2) elderly respondents; and (3) and
non-whites are all less likely to be confident in leaders of institutions.
Second, the GSS includes a 3-item anomie scale that measures the level of
alienation from society (ANOMIA5, ANOMIA6, ANOMIA7). The wording of the
questions reads:
ANOMIA5: “In spite of what some people say, the lot

(situation/condition) of the average man is getting worse not
better... [do you agree or disagree?]”
ANOMIA6: “It’s hardly fair to bring a child into the world

with the way things look for the future... [do you agree or
disagree?]”
ANOMIA7: “Most public officials (people in public affairs)

are not really interested in the problems of the average
man... [do you agree or disagree?]”
Research, in turn, shows that alienation has increased overtime (Reef and
Knoke 1999). Furthermore, this same research indicates that: (1) older respondents;
(2) the less educated; and (3) non-whites are more likely to feel alienated.
6.2 Methods and Data
6.2.1 Methods: Validating Age and Civic Participation
The principal objective of this chapter is to determine the measurement
properties of age and civic participation. What do they measure? Why are they
correlated with DK?
108
Our preliminary hypothesis is that age and civic participation tap respondent
motivation and/or respondent comprehension. Older respondents and respondents
who do not participate in civic activities probably are more likely to say DK because
they either have greater problems understanding the questions or they are less likely
to feel a social obligation to provide substantive responses. To separate out the
relative importance of respondent motivation and respondent comprehension in the
composition of the age and civic participation effects, I use three validating
indicators:
(1) Interviewer Evaluation of Respondent

Comprehension (COMPREND): I use this item
as a measure of respondent comprehension.
(2) Interviewer Evaluation of Respondent

Cooperation (COOP): I use this item as a
measure of respondent motivation.
(3) Confidence in the Leaders of Institutions and

Anomie: I use these items as measures of
respondent motivation and alienation.
I organize the following analysis of age and civic participation into four
distinct sections. First, I analyze the bivariate relationship between DK and the
indicators of respondent motivation and comprehension. Second, I analyze the
bivariate relationship between age/civic participation and DK. Here I examine the
specific functional form of the bivariate relationships. Is the association between
civic participation/age and DK, linear or non-linear?
Third, I analyze the bivariate correlations among civic participation, age, and
our validating indicators of respondent motivation and comprehension. Here I am
109
interested in assessing two types of validity: (1) convergent validity and (2)
discriminate validity (Campbell and Fisk 1959).
Convergent validity exists when indicators of the same construct (e.g.,
cognitive ability) that employ different methods of measurement (e.g., education,
verbal ability, interviewer-evaluated ability, respondent-evaluation of ability) are
correlated. Specifically, different measures of respondent comprehension should be
highly correlated.
Discriminant validity exists when the correlations among indicators of
different constructs are low, independent of the method of measurement employed.
For instance, different indicators of respondent motivation should be highly
correlated, while being weakly correlated with indicators of respondent
comprehension.
Fourth, I use multiple regression to estimate the relative weight of
respondent motivation and respondent comprehension in the composition of age and
civic participation.
( 18)
Age = (Wrm) (Respondent Motivation) + (Wrc) (Respondent Comprehension)
where Wrm = the weight of respondent motivation in the

explanation of the age and civic participation effects.
Wrc = the weight of respondent comprehension in the

explanation of the age and civic participation effects.
Specifically, I regress age and civic participation on the validating indicators
of respondent motivation and comprehension and then use the beta weights
110
(standardized betas) to determine the importance of each indicator (see equation 18
above and 19 below).
(19)
CP = (Wrm) (Respondent Motivation) + (Wrc) (Respondent Comprehension)
Where CP = civic participation
I am most interested in what percent of the variance respondent motivation
and respondent comprehension explain in the composition of age and civic
participation. Here I am net making any causal inference—of course respondent
motivation does not cause age. Instead, I am using multiple regression to estimate
partial correlations.
6.2.2 Data and Measures
In this section, I have two objectives. First, I analyze the frequency
distributions and correlations of COOP and COMPREND. Second, I examine
frequency distributions; correlations; and the measurement properties of the
indicators of respondent motivation and alienation (anomie, CONFID).
6.2.2.1 Analysis of COOP and COMPREND
Tables 7a and 7b below show that most respondents are both cooperative and
able to understand survey questions. Specifically, 95 percent of respondents were
I ll
rated as friendly (76%) or cooperative (19%), while only 5 percent were rated as
restless (4%) or hostile (1%). Similarly, 79 percent of respondents were rated as
having good comprehension of survey questions, while 21 percent were seen by
interviewers as having fair (17%) or poor (4%) comprehension.
Table 7a: Frequency Distribution for COOP
Response Category Label Percent (%) Frequency

Friendly 76.3 1119
Cooperative 19.1 280
Restless 4.0 59
Hostile 0.6 9
Total. 100 1466
Table 7b: Frequency Distribution for COMPREND
Response Category Label Percent (%) Frequency

Good 79.3 1163
Fair 17.0 249
Poor 3.7 54
Total 100 1466
This analysis indicates that most respondents perform well during interviews
with only a small subgroup seen as problematic. Are less cooperative respondents
also less likely to understand survey questions? In other words, is there a respondent
cooperation by respondent comprehension interaction?
112
The answer to the above question is yes. First, the bivariate correlation
between COOP and COMPREND is positive and statistically significant (r = .216;
p.<.05) (see table 5). Second, a look at a two-way table of COOP and COMPREND
(see table 8 below) shows that the correlation is statistically significant (X2 = 186.68;
p =.000).
Table 8: T w o-W ay C rosstab o f C O M PR E N D and C O O P
Good Fair/Poor
Friendly/Cooperative 78.6% 16.9%

(1152) . (248)
Restless/Hostile. 1.8% 2.7%

........;..a n........ -.... ....-H 9 ) .............
* Bivariate correlation significant at the p-.G‘0G level
** COOP collapsed into two categories: (1) Friendly/Cooperative and Restless/Hostile
f COMPREND collapsed into two categoies: (!) Good and (2) Fair/Poor
%Cell sizes in parentheses
This analysis, however, revels that the interaction is weak. Specifically,
about 78 percent of respondents were rated as having both good comprehension and
being either friendly or cooperative during the interviewer with only 22 percent of
respondents falling in the other three cells:
(1) Three percent (3%) had a rating of poor/fair in comprehension

and restless/hostile in cooperation.
(2) Seventeen percent (17%) had a rating of poor/fair in

comprehension and cooperative/friendly in cooperation
(3) two percent (2%) had a rating of good in comprehension and

restless/hostile in cooperation.
113
Are respondents who do not cooperate during the interview and/or who do not
understand survey questions also more likely to say DK?
Yes, table 4 in chapter 4 shows that both COOP (r = .182; p.c.GS) and
COMPREND (r = .354; p.<.G5) are significantly correlated with DK. In particular,
respondents who are less likely to cooperative and/or who are less likely to
understand survey questions are correspondingly more likely to provide DK
response on the survey.
Figure 7: Relationship between OK and Respondent Cooperation

(COOP)
S
2.4
t
o
Restless/Hostile Cooperative Friendly
ftespowfenl Cooperation
In support of these findings, figure 7 above indicates that respondents rated
as friendly or cooperative, on average, answered DK 2.4 times on the survey,, while
those rated as cooperative answered DK 4.6 times; and those rated as restless or
hostile answered 7.6 times.
114
Furthermore, figure 8 below in d i c a t e s that respondents with good
comprehension answered DK only 2.1 times on the survey, while those rated as fair
answered 5.6 times and those rated as poor answered 11.8 times.
Figure 8: Relationship between DK and Respondent Comprehension

(COMPREND)
12
10
Poor Fair Good

Respondent Comprehension
These results suggest that while both respondent cooperation and
comprehension are correlated positively with DK, the association between
COMPREND and DK is stronger. Taken as a whole, the analysis in this section has
shown that:
(1) Most respondents perform adequately on surveys, both

by cooperating during the interview and comprehending
the survey questions presented to them.
115
(2) Respondents who are more likely to cooperate are also
more likely to understand survey questions. However,,
this interaction is weak.
(3) Respondents who are more likely to cooperate are less

likely to answer DK.
(4) Respondents who are more likely to understand survey

questions are correspondingly less likely to answer DK.
6.2J.2: Analysts of Indicators of Respondent Social Isolation

and Alienation
Table 9 below presents the descriptive statistics for our two indicators of
respondent alienation and social isolation. The 13-item CONFID scale varies from a
low of 3 to a high, of 39 with an average score of 26.5. CONFID also possesses good
measurement properties with a.cronbach’s.alpha o f. 8493.
Table 9: Descriptive Statistics, Indicators of-Respondent Motivation and Comprehension

Measure # of Items Cronbach's Mean SD Min Max
COOP 1 *** .045 .207 O 1
COMPREND. 1 *** .196 .297 0 1
Confidence 13 .8493 24.6 4.30 3 39
Anomie 3 .3894 4.25 1.06 1 6 |
* COOP ne-eoded; l=bostiIe/restJess, O^friesdly/eooperative
** COMPREND re-coded; l=poor/fair, 0=good
Factor analysis of the CONFID scale indicates that the measure is a
multidimensional construct with three dimensions. The first dimension explains 23
percent of the variance and seems to correspond to a general confidence in
institutions factor. The second factor explains 14 percent of the variance and seems
to represent an authoritarian/anti-authoritariaH dimension where confidence in the
116
military and the federal government are positively correlated and confidence in the
supreme court; the scientific community; the press; and TV are negatively correlated
with the factor. Finally, the third factor explains only 7 percent of the variance and
appears to correspond to a class cleavage dimension where confidence in organized
labor is positively correlated with the factor and confidence in major companies is
negatively correlated.
Second,, table 9 above shows that the 3-item anomie scale ranges from a low
of 1 (corresponding to low levels of anomie) to a high of 6 (corresponding to high
levels of anomie) with an average score of 4.25. Unlike the CONFID scale, the
anomie index is a weak measure with a cronbach's alpha of only .389. Factor
analysis, however, indicates that each of the 3 items load strongly on one factor
which explains 45.2 percent of the variance of the construct. Is respondent
alienation correlated with DK?
Table 10 below shows that neither of the attitudmal indicators of respondent
alienation is strongly correlated with DK. Indeed, anomie (r = .093; p.<Q5) is
weakly correlated with DK, while CONFID is correlated moderately with DK (r = -
.136; p.<.05).
Even given the weak relationships found above, both of the correlations are
significantly significant and in the hypothesized direction. Specifically, respondents
with lower levels of confidence and higher levels of anomie are more likely to
answer DK.
R ep ro d u ced with p erm ission o f the copyright ow ner. Further reproduction prohibited w ithout perm ission.
117
Table 10: Pearson’s Bivariate Correlation (r) of Respondent

Alienation
DK CONFID ANOMIE
DK 1.00 *** .
CONFID -136* 1.00 ***

ANOM IE .093* -.067* 1.00
* Correlations with asterisk significant at the .05 level (one-tailed test)
Table 10 above also suggests that the two scales tap distract dimensions.
Specifically, respondents who have lower levels of confidence have correspondingly
lower levels of anomie (r = -.067). However, the correlation is relatively weak,
suggesting that they tap different constructs.
6.3 Bivariate Analysis of Age and Civic Participation
Before analyzing the possible reasons for the age and civic participation
effects found in the last two chapters, let us first re-examine the relationship between
age/civic participation and DK. What is the functional form of the relationships?
Are they linear or non-linear? Were our initial assumptions about the relationships
valid?
Remember we assumed (implicitly in our regression models in chapter 4)
that both age and civic participation were continuous and linearly related to DK.
What do we find with a closer look?
R ep ro d u ced with p erm ission of th e copyright ow ner. Further reproduction prohibited w ithout p erm ission .
118
6.3.1 Bivariate Analysis of Age
Figure 9 below shows the relationship between age and DK. 1 re-code
age into approximately equal groups with 10 year intervals except for the youngest
(18 to 24 years of age) and oldest age categories (65 years of age or more).
Figure 9 indicates that age is not linearly related to DK. Indeed, there exists
only a slight increasing trend in DK from respondents who are 18-24 years of age
(2.40 DKs) to those who are 55-65 years of age ( 3.07 DKs)—an increase of only .67
DKs. M contrast, respondents who are 65 years of age or older are much more
likely to answer DK than younger respondents. Specifically, those 65 years of age
or older answered DK 5.70 times, on average, versus 2.5 times for those respondents
64 years of age or younger—a 230 percent increase in DK. What do these findings
suggest?
Figure 9: DK Mate by Age of tie Respondent

8
$
6.70
4SS4 6S+
Yam of Age
R ep ro d u ced with p erm ission o f th e copyright ow ner. Further reproduction prohibited w ithout perm ission.
119
First, these results indicate that any model (1) should not treat age as a
continuous variable and (2) should not specify age as being linearly related to DK.
Instead, age should either be treated as a binary variable (65 plus versus 64 less) or
as a three category variable with breaks between 35-44 and 45-54 and between 55-
64 and 65 plus.
Second, the large increase in the rate of DK for those respondents who
are 65 years of age or older lends support to the argument that the association
between age and DK results from aging and not generational change (cohort
effects). Indeed, if the age effect were actually the result of gradual cohort change
(perhaps resulting from increasing levels of schooling), we would expect a more
constant negative trend in DK from younger to older respondents. Of course, this
explanation does not exclude the possibility that the large increase in DK for older
respondents (65 years of age and older) results from a particularly strong period
effect (e.g., World War II), making this older generation qualitatively different from
younger generations. However, no convincing explanation comes to mind. Even
assuming that the age effect results from aging, we still do not know anything about
the possible underlying social or cognitive processes.
6.3.2 Bivariate Analysis of Civic Participation
Figure 10 below shows the association between civic participation and DK.
Here, due to the small number of cases, I re-code civic participation at the higher
120
levels into 2 categories: (1) 6 to 8 civic activities and (2) 9 plus civic activities.
What do we find?
Figure 10 indicates that the association between civic activities and DK more
closely resembles a linear relationship than the correlation between age and DK.
Indeed, at lower levels of civic participation (0-2 civic activities), the association is
actually linear with those respondents who do not participate in civic activities
answering DK 4.9 times; those participating in 1 civic activity answering 4.0 times;
and those participating in 2 civic activities answering 3.0 times.

Figure 10: DK Rate by Number of Civic Activities
7.9
6.0
5.0
40
3.0
2.0
V 1.3
1.0
0.0
0 1 2 3 4 5 6 to 8
Number of Civic Activities
At higher levels of civic participation however, the relationship takes on a
more non-linear form. Respondents who participate in 5 civic activities (average
DK =2.0) are only 1.5 times more likely to answer DK than those respondents who
participate in 9 civic activities or more (average DK = 1.3). Conversely, respondents
who do not participate in civic activities (average DK =4.9) are 3.3 times more likely
to answer DK than those who participate in 5 civic activities.
121
These results suggest that the association between civic participation and
DK is approximately linear, with non-linearity at higher levels of civic
participation. Taken as a whole, the above analysis suggests that:
(1) Age is nonlinearly related to DK.
(2) Civic participation is linearly related to DK.
6.4 Validation and Decomposition of Age and Civic

Participation
Now that we know more about the bivariate relationship between age/civic
participation and DK, what may explain these relationships? What, in other
words, are the underlying meanings behind these associations?
To answer these questions, I analyze the bivariate correlations among age,
civic participation, and the indicators of respondent motivation and comprehension.
I also estimate the relative importance of respondent motivation and comprehension
in the composition of the age and civic participation effects.
6.4.1 Validating Age
In this section, I have one simple objective: to determine the relative weight
of respondent motivation and comprehension in the composition of age. Table 11
below suggests that age is a function of both respondent motivation as well as
respondent comprehension. What do we find?
122
Table 11: Pearson's Bivariate Correlations of Age, Civic Participation and

indicators of Respondeat Motivation and Gowpreheaswit
DK Cognitive Age (I) O vie COOP C O M M END CONFID Anomie
DK 1.00 ft*ft *** ft** *«* *** ftftft *ftft
Cognitive - . 261* 1.00 ftftft Sftft ft** ftftft ft** ***
A ge(l) . 232 * -.027 1.00 *** ft** *** ft**
Civic -.212* . 462* - . 071 * •ft®* ftftft ftftft ftftft
COOP . 182 * - . 074 * . 063* -.885* L0O ftftft ft** ft**
COMMEND .354* - . 417 * . 153* - . 276* . 216 * 1.00 ftftft ft* ft
CONFID - . 136* .028 - . 091* .018 .000 - . 093* 1.00 ***
Anomie . 074 * - . 284 * . 076 * - . 215 * .010 . 178 * .085* 1.00
* Age re-coded where 1=65 phisyesa-s of age and CH34 less years of age
** COMBiEMD: good =1; fishras&f poor -D
f COOP: friendly and cooperative = 1; impatient and hostileK)
%AH codfrcsents vMt asterisk. are sipHfcasg at the .05 tewt
6.4X1 Respondent Motivation and Alienation
Table 11 above shows that a i of the indicators of respondent motivation and
alienation are correlated, though weakly, with age. Older respondents are more
likely to feel alienated; more likely to have lower levels of confidence in institutions;
and are less likely to cooperate. Specifically, Table 11 indicates that:
(1) CONFID: older respondents (65 years of age or more)

are less likely to be confident in the leaders of
institutions than younger respondents (r = -.091; p.<.05).
(2) Anomie: older respondents are more likely to have

higher levels of anomie than younger respondents (r =
.076; p.< 05).
(3) COOP: older respondents are more likely to be rated as

uncooperative or unfriendly during the interview than
younger respondents (r = .063: p. <.05).
123
6.4X2 Respondent Comprehension

Unlike the indicators of respondent motivation and alienation, age is not
correlated with all the indicators of respondent comprehension. In particular, age is
correlated with COMPREND but not cognitive sophistication.
Specifically, table 11 below shows that older respondents are more likely to
be rated by the interviewer as having only fair or poor comprehension of survey
questions (r = .153; p.<.05). In contrast, older respondents are no less cognitively
sophisticated than younger respondents (r = -.027; p.>.05). What might explain
these findings?
One explanation may be that cognitive sophistication captures respondent
abilities acquired in the past and/or accumulated over a long period of time (e.g.,
verbal ability, knowledge). Therefore, cognitive difficulties brought on by old age
(e.g., memory, ability to manipulate concepts, etc.) do not necessarily affect a
respondent’s cognitive sophistication (Salihouse 1999,1997; Herzog and Rodgers
1999). Unlike cognitive sophistication, interviewer rated comprehension
(COMPREND) seems to capture respondent’s present mental ability—memory,
reasoning ability, facility at manipulating new concepts (Salthouse 1999).
124
6.4.1.3 Decomposition of Age
Table 12 below includes the relative weights (variance explained) of
respondent motivation and comprehension in the composition of age. I calculated
the percent of variance explained in four stages. First, I regressed age on the
indicators of respondent motivation and comprehension.
Table 12: Age ass# the relative w eight o f respondent motivation a n d com pnfaem ien
Variable Beta Weight % Explained Variance
COMPREND 0.11 51
COOP 0.00 0
Cognitive -0.04 6
Anomie 0.09 33
CONPID -0.05 10
Civic 0.00 0
Total 0
* Standardized Beta; estimated using logistic
** CGM PREND: good =1; fair and poor =0
t COOP: friendly and cooperative = 1; im patient and hosfctte=0
J N ote the above variables explain 10- peteesA o f the total variance in age
Second, I squared the beta weights (beta weight * beta weight) taken from
the results of the multiple regression. Third, 1 summed the squared beta weights.
Fourth, I divided the squared beta weights by the sum of the squared beta weights.
The results in the table 12 above confirm our bivariate analysis in table 11—
both respondent motivation and comprehension vary with age. Specifically,
respondent motivation (COOP, CONFED and Anomie) accounts for 43 percent of the
variance (0% + 10% + 33%) with anomie explaining most of variance (33%). What
about respondent comprehension?
The results for respondent comprehension, however, are not as clear-cut.
Indeed, taken together, cognitive sophistication and COMPREND account for 57
125
percent of the variance in age, though they are correlated in opposite directions.
Specifically, cognitive sophistication is positively correlated with age—older
respondents are more likely to be cognitively sophisticated than younger respondents
(beta weight = - .04). COMPREND, in turn, is negatively correlated with age with
older respondents being less likely to comprehend survey questions (beta weight
= 11). On balance, COMPREND is the dominant effect, given its larger beta weight
(51%-6% = 45%).
Once again, the results above indicate that distinct differences exist between
cognitive ability acquired over the life-course (vocabulary; knowledge) and survey
performance. Most interesting of all, age seems to have counteracting effects. On
the one hand, older individuals have acquired knowledge and vocabulary over time
which make them, on average, more c&gmttiwly sophisticated. On the other hand,
they seem to be less cognitively agile in demanding situations, such as those required
by surveys.
The above results confirm our initial observation—age taps both respondent
motivation and comprehension. Older respondents are more likely to answer DK (1)
because they are more likely to feel alienated and (2) because they are less
cognitively agile. What do we find with civic participation?
6.4.2 Validating Civic Participation
In this section, I am interested in determining the relative weights of
respondent motivation and comprehension in composition of civic participation.
126
Table 11 above suggests that, like age, civic participation may also be a function of
both respondent motivation and comprehension. What do we find?
6.42.1 Respondent Motivation and Alienation
Table 11 shows that all the indicators of respondent motivation and
alienation, with the exception of CONFID, are correlated with civic participation.
Specifically, we find:
(1) COOP: respondents who participate less in civic

activities are more likely to be rated as being hostile or
unfriendly during the interview (r = -.085; p.<.05)
(2) CONFID: respondents who participate less in civic

activities are not iess likely to be confident in civic
activities (r = .018; p>.05).
(3) Anomie: respondents who participate less in civic

activities are more likely to have high levels of anomie (r
= -.215; p. <.05).
6.42.2 Respondent Comprehension
Table 11 also shows that civic participation is correlated quite strongly
with respondent comprehension. Specifically, we find that:
(1) Cognitive Sophistication: respondents who participate

more in civic activities are also more likely to be
cognitively sophisticated (r = .462; p.< 05).
(2) COMPREND: respondents who participate more in

civic activities are correspondingly less likely to be rated
as having poor or fair comprehension (r = -.276; p.< 05).
127
Table 13:Cwic Participation and th e relative weight of respondent motivation and comprehension
Variable Beta Weight % Explained Variance

COMPREND -0.072 3
COOP -0.043 1
Cognitiv 0.402 91
Anomie -0.089 4
CONFID 0.014 0
Age 0.009 0
Total *** 100
* Standardized Beta; estimated using OLS
** COMPREND: good =1; fair and poor =0
t COOP: friendly and cooperative = 1; impatient andhostile=0
| Note the above variables explain 23 percent of the total variance in Civic Participation
Both of the indicators of respondent comprehension are strongly correlated
with civic participation. We should expect, therefore, that respondent
comprehension will have a greater weight in explaining civic participation than
indicators of respondent motivation. What do we find empirically?
6.4.2 3 Decomposing Civic Participation
Table 13 above confirms our hypothesis. Respondent comprehension
explains almost all of the variance in civic participation (94%) with cognitive
sophistication accounting for 91 percent of the explained variance and COMPREND
only 3 percent. Conversely, respondent motivation explains only 6 percent of the
explained variance in civic participation.
These results run counter to the methods literature which assumes that civic
participation is a proxy variable for respondent motivation (Young 1998;Young
1999b; 1999c). Instead, it appears that respondents who participate more in civic
activities are less likely to say DK because they are more cognitively sophisticated
128
(higher verbal ability; more knowledgeable; and more exposed to information) and
not because they are more motivated to be “good respondents”.
This research, however, sheds no light on exactly why this is case. Is civic
participation a cause or an effect? Specifically, do respondents become more
cognitively sophisticated by participating more in civic activities? Or are more
cognitively sophisticated respondents more likely to participate in civic groups?
These questions can not be answered here.
6.4 Conclusion
I asked two questions at the beginning of this chapter: (1) why are older
respondents more likely to say DK than younger respondents?; and (2) why are
respondents who participate less in civic activities more likely to answer DK than
respondents who participate more in civic activities? So what did the research in this
chapter offer as evidence?
Our analysis of age ended in three important findings. First, age does not
appear to be linearly related to DK. Instead, the probability of providing a DK
response only increases among the oldest of respondents (65 years of age or more).
Second, age appears to capture both respondent comprehension and respondent
motivation. Specifically, older respondents (65 years or more) are less likely to
understand survey questions as well as more likely to feel alienated. Third, while
age is strongly correlated with performance during the interview (COMPREND), it
is not correlated with cognitive sophistication. However, as we have emphasized at
the beginning of this chapter, serious problems remain concerning the causal
direction of the correlation between COOP/COMPREND and DK.
Our analysis of civic participation resulted in two important findings. First,
unlike age, civic participation is linearly related to DK. Second, cognitive
sophistication (respondent comprehension) explains almost all the explained
variance in civic participation. So what explains these findings?
One possible hypothesis is that participation in civic activities actually
facilitates the accumulation of acquired cognitive ability (e.g., verbal ability,
knowledge, and information exposure), through increased contact and exposure to
those issues found on surveys. Another possible explanation is that those
respondents who are more cognitively sophistication are more prone to participate in
civic activities. In either case, these results run counter to the methods literature
which assumes that civic participation is a proxy for respondent motivation.
We are still left with two pending questions. First, does re-specifying age as
a binary variable (65+= 1) improve the explanatory power of the model? And
second, does the inclusion of the two-interviewer evaluations of respondent
performance (COMPREND and COOP) add explanatory power to our multivariate
model?
permission of the copyright owner. Further reproduction prohibited without permission.

130
C H A P T E R SEVEN
FINAL MULTIVARIATE MODELS
7.0 Final M ultivariate M odels
We have covered considerable ground in the last three chapters: (1) we
explained away the education effect and (2) we validated the age and civic
participation effects. In this chapter, my main objective is to answer three pending
questions. First, does the re-specification of the age variable improve the
explanatory power of our final model (model 3) in chapter 4? Given our analysis
earlier in chapter 6, we would expect that a non-linear specification would explain
more variance than a linear specification.
Second, do COMPREND and COOP increase the explanatory power of
model 3 in chapter 4? Our initial hypothesis would be yes—that both COMPREND
and COOP do significantly improve model fit.
Third, does the inclusion of COMPREND and COOP affect the relationship
between other correlates and DK? I have no a priori hypotheses concerning how
the relationships might change.
7.1 Models
To answer the above questions, I test two additional models against the final
model in chapter 4 (model 3). The baseline model hypothesizes that DK is a
function of 5 main effects (cognitive sophistication, civic participation, age, gender,
and race) and 4 interaction effects [(1) civic participation * cognitive sophistication;
(2) age * gender; (3) race * civic participation; and (4) age * civic participation)].
This model treats age (age in years) as being linearly related to DK (see equation 19
below)
(19)
DK = 00 + 01 (Civic Participation) + 02(Cognitive Sophistication) + fi3(Age

in years) + 04(Gender) +, 05(Kace) + f06(Civic Participation) *
(Cognitive Sophistication) + 07 (Age * Gender) + 08 (Civic
Participation * Race) + 09 (Age *■Civic Participation)/ + ei
I first test the baseline model against a second model (see equation 20 below)
which specifies age a dummy variable (65 years of age or more =1).
(20)
DK = 00 + 01 (Civic Participation) + 02(Cofpntive Sophistication) + /B(Age

binary) + 04(Gender) + 05(Race) + {06(Civic Participation) *
(Cognitive Sophistication) + 07 (Age * Gender) + 08 (Civic
Participation * Race) + 09 (Age * Civic Participation)/ + ei
Equation 20 above hypothesizes that DK is a function of five main effects (civic
participation, cognitive sophistication, age, gender, and race) and four interaction
effects (note bold parentheses). Note this model specifies age as a binary variable.
132
Finally, I test whether interviewer evaluation of respondent comprehension and
cooperation (COMPREND and COOP) significantly improves the model mentioned
above.
(21)
DK = po + pi (Civic Participation) + p2(Cognitive Sophistication) + p3(Age

binary) + p4(Gender) + p5(Race) + (P6(Civic Participation) *
(Cognitive Sophistication) + p7 (Age * Gender) + P8 (Civic
Participation * Race) + p9 (Age * Civic Participation)) +
IP 1OCOMPREND + pi 1 COOP) + ei
Equation 21 above hypothesizes that DK is a function of five main effects
(civic participation, cognitive sophistication, age (binary), gender, and race); four
interaction effects (note bold parentheses); and respondent comprehension and
cooperation (see bold brackets).
To test for the best fitting model, I use the difference in R-square test for
nested models (see equation 22 below).
(22)
/ = [(Rt —Ra2) / ( h - k J ]/[( 1 - R„2)/(n - h -1 ) ]
where S*2 is the R square for the M l model;

Ra" is the R square for the parsimonious model;
kb is the number o f parameters for the full model;
ka is the number o f parameters for the parsimonious
model;
n is the sample size.
133
7.2 Testing Model Fit
7.2.1 Testing Model Fit: Age
Table 14 below includes 3 regression models. First, Model 11 is the final
model (model 3) in chapter 4 where age is specified in years. Second, Model 12
includes the re-specified age variable. Here age has been re-coded into a binary
dummy variable where 1 =65 years of age or older and 0 = 64 years of age or
younger. Third, Model 13 includes the two interviewer ratings of respondent
performance .(COMPREND and COOP). Of these three-models which one best fits
the data?
Equation 22 above is designed'.to test nested models (e.g., a model with one
independent variable age versus a second model with two independent variables age
and education). Models 11 and 12, however, are not nested models—the only
difference being a re-specified age variable (age in years versus age (1=65+)).
Non-nested models can mot be directly tested using the above equation
because, in such cases, the equation has no mathematical solution.1 How then
should we proceed? What is the best model?
The econometrics literature is unclear as how best to proceed (Kennedy
1998). One simple rule is to assume that die model with a larger adjusted R-square
is the best fitting model. Using this decision rale, model 12 best fits the data
(adjusted-i?2 ^ .1656 versus .1588, though the difference is small). Another
where k b - k a = 0 and [jR/ - R j / kb - ka * null set].
134
strategy is to assume that the difference in the number of parameters is 1 (kb -k a =
1). Using this method, model 12 again is the best fitting model ( f - 11.7; p.<.001).
A final strategy is to employ the J-test for nan-nested models (Kennedy
1998). The J-test is a three-step procedure. First, I regress the dependent variable
(DK) on the re-coded age variable and the other explanatory variables. Second, I
estimate the predicted value, YAhat for this model.
Third, I regress the dependent variable (DK) on the explanatory variables;
age measured in years; and the predicted value (YAhat) estimated in step 2. If YAhat
is statistically significant, this suggests that the re-coded age variable fits the data
significantly better than age measured in years. If YAhat is not statistically
significant, this suggests that age measured in years fits the data significantly better
than the binary age variable.
What do we find? The J-test demonstrates that Model 12, once again,
significantly improves the fit of the data over Model 13 (t-value = 2.151; p=. 032).
So what should we conclude?
The above analysis suggests that the re-coded age variable fits the data better
than the age variable in years. Supporting our earlier bivariate analysis, these
findings underscore the initial hypothesis that age only becomes important in
explaining DK when comparing the oldest of respondents to all other respondents
135
7.2.2 Testing Model Fit: COMPREND and COOP
Model 14 in Table 14 below shows that COMPREND and COOP
significantly and substantially improve model fit i f —59.1 p.<.001). Indeed, model
14 increases the adjusted R-square from .1656 to .1986 (model 13 versus model 14).
These results suggest that respondent performance (COMPREND and
COOP) taps underlying constructs which are different than those captured by
respondent level characteristics (e.g., age, civic participation, and cognitive
sophistication). One possible explanation may be that COMPREND and COOP
measure actual behavior during the survey interview, while respondent level
characteristics (e.g., age) capture behavioralpropensities. '
Taken as a whole, the above results confirm our initial hypotheses—that
both the re-specified age variable and the measures of respondent performance
(COMPREND and COOP) significantly improve the explanatory power of our
multivariate model. So what about individual level correlates? How does the
introduction of COMPREND and COOP effect the relationship between DK and
other factors?
136
7.3 Analysis of DK Predictors
In this section, I examine 5 DK correlates: (1) age; (2) COMPREND; (3)
COOP; (4) cognitive sophistication; and (5) gender. Model 13 in Table 14 below
includes the independent effects of the DK predictors after controlling for
COMPREND and COOP. What do we find?
Age (65 plus =1): controlling for other factors, age has. a statistically
significant and independent effect on DK (b = .844; p.<.05). Specifically,
respondents who are 65 yearn of age or older are 3.5 times more likely to answer DK
than younger respondents (3.25 DKs versus .920 DKs). Furthermore, the joint
inclusion of COMPREND and COOP accounts for about 30% of the variance in age
(Model 13 versus Model 12 3.25 versus 4.72 DKs).
137
Table 14: Summated DK Index regressed on COMPREND, COOP,

Controlling for O tie r Predictors
(Unstandardized Regression Coefficients)
Variables M odel 11 Model 12 M odel 13
Age (1=65+) 0.972 0.844
[.202] [-179]
Age (in years) 0.019 *** ***
[.004]
Cognitive Sophistication -0.145 -0.141 -0.087
[.031] [.032] [.029]
Civic Participation -0.046 -0.078 0.010
[.043] [.035] [.032]
Gender (Female=l) -0.113 0.161 0.163
[182] [.068] [.063]
Race (White=l) -0.080 -0.037 0.028
[.172] [.173] [.163]
Cognitive*Civic 0.020 0.020 0.014
[.006] [.007] [.006]
Age*Gender 0.006 -0.029 0.004
[.004] [399] [.193]
Civic Participation*White -0.050 -0.052 -0.059
[.030] [.034] [030]
Civic Participation*Age -0.002 -0.090 -0.083
[.001] [-031] [.0028]
COMPREND (l=Fair, Poor) ** ** 0.611
[119]
COOP (l=Hositle, Uncooperative) ** ** 0.697
[.180]
Constant 0.573 1.20 0.959
[.341] [.183] [.167]
Sample size (n=) 1446 1446 1446
Adjusted R Square 0.1588 0.1656 0.1986
*AS coefficients in black italitcs are significant at the ,05 level (one-tailed test)
** Dependent variable transformed using the square root
f SEa aâstod for complex design of foe sample (cfostermg and siratificatioa)
$ SEs in brackets under coefficients
f
138
COMPREND and COOP: Both respondent comprehension and cooperation have
statistically significant and independent effects on DK (COMPREND: b = .611.
p.<.05 and COOP: b = .697; p.<,05). Pat simply, respondents who are rated as
having poor or fair comprehension of survey questions are 2.67 times more likely to
answer DK than those who are rated as having good comprehension (2.46 versus
.919 DKs). Similarly, respondents who are rated as hostile or restless during the
interview are approximately 3.0 times more likely to say DK than those who are
rated as being friendly or cooperative (2.75 versus .919 DKs).
Cognitive Sophistication: Controlling for other respondent characteristics,
cognitive sophistication has a statistically significant and especially strong
independent effect on DK (b = -.087; p <05). Specifically, respondents who have
low levels, of cognitive sophistication (low level =1) are 95 times more likely to
answer DK than respondents who have high levels (high level =10) of cognitive
sophistication (.760 versus .008 DKs). Furthermore, the inclusion of COMPREND
and COOP explains approximately 32 percent of the variance in cognitive
sophistication (model 12 versus model 13 1.12 versus .760 DKs).
Gender (female =1): controlling for other factors, gender has a statistically
significantly independent effect on DK (b = .163; p.<.05). Specifically, female
respondents are 1.4 times more likely to say DK than male respondents (1.29 versus
.912 DKs), So what does this mean? Why did gender become significant?
139
One possible explanation is the re-specification of the age variable. Indeed,
after re-specifying age, gender becomes statistically significant in model 12 (model
11 versus model 12 ^ t-value = -.621 versus 2.37), while the interaction effect (age
* gender) no longer remains significant after age is re-specified in model 12. This
finding is important because the literature has always treated age as a continuous
variable, linearly related to DK. This suggests that the significant interaction effect
found in the literature probably results from mis-specification error.
7.4 Conclusion
.:;So what did we find in the above.analysis? First, the re-specified age
variable(65+) fits the data better than the age in years variable. This result lends
further support to the argument that age captures life-cycle and m i cohort
differences . Future research must examine this question in more depth, given that
this.research is far from conclusive.
Second, both COMPREND and COOP significantly improve model fit. This
result is important because it shows that such measures tap qualitatively different
concepts than respondent level characteristics. But why is this the cases?
As I have already suggested, these two measures probably do a .betterjob of
capturing the underlying social and cognitive dynamics of the interview, while
respondent level measures probably tap more general behavioral propensities.
Considering these findings, three noteworthy suggestions are in order. First,
methodological studies should always employ similar measures, given their
explanatory power. Second, methodologists should revisit these measures in order
140
to perfect them. For instance, COMPREND uses a 3-point response scale, when a 5
or 7 point scale would probably do a better job of discriminating respondents.
Furthermore, other measures of respondent motivation and comprehension should be
developed (especially measures of respondent motivation). Hurd, given that they
explain additional variance in DK, statisticians should employ similar measures in
their data imputation models.
Third, COMPREND and COOP jointly account for 30% o f the
covariance in cognitive sophistication and age. This finding re-enforces our
above conclusion that respondent level characteristics probably tap different
underlying constructs: general behavioral tendencies versus actual survey
performance. Eiiitlieimore^ tie significant age effect suggests that measures of
respondent motivation and comprehension only partially explain why older
respondents are more likely to say DK. Future studies should examine other
possible explanations for the age effect.
141
CHAPTER EIGHT
CONCLUSION OF THESIS
8.0 Conclusion of Thesis
We have covered much ground in this thesis. So what have we learned? To
answer this question, it makes most sense to re-trace our first steps in chapter 1.
To summarize, we had two primary objectives: (1) to establish both a general
framework as well as specific guidelines for the treatment of item missing data and
(2) to understand why. certain survey respondents are more likely to answer DK than
other respondents. I specified four separate questions to operationalize the two
aforementioned objectives. Specifically, Tasked:
(1) Are respondents who are more likely to answer survey questions
different from than those who are less likely?
(2) Along what dimensions (demographic, behavioral, and/or attitudinal) are

respondents who are more likely to answer survey questions different
from those who are less likely?
(3) What social and/or psychological processes explain why some

respondents are more likely to answer survey questions than others?
(4) Can general principles be derived, so that missing data will not have to
be dealt with on a case by case basis?
To address these questions, I organize this concluding chapter into three
142
sections. In section 8.1, I detail a conceptual framework of DK responses (and item
nonresponse more generally). The primary objective of this conceptual framework
is to give survey researchers a rough idea about how they should think about item
missing data—both at the design stage as well as during post-survey data correction
(imputation) stage.
Throughout this thesis, I have assumed that DK is a function of underlying
social and cognitive processes. Here I more fully detail my logic.
In section 8.2,1 discuss the general findings with particular attention to the
question of why certain survey respondents are more likely to say DK than other
respondents. As I have stressed in this thesis, the mere existence of a correlation is
interesting, even useful in day-to-day research. However, understanding why a
given respondent characteristic is correlated with another is much more important
both theoretically and practically. In this section, I also discuss the practical
implications of my research.
Finally, in section 8.3,1 examine how each of the variables examined in
section 8.2 relate to the conceptual framework presented in section 8.3. To end
with, I discuss the use of data imputation in attitudinal.
8.1 Conceptual Framework
In this section, I first define and describe a generalized conceptual
framework of the survey interview. I, then, discuss how this framework can be used
to help guide survey methodologists in their treatment of missing data.
143
Research shows that responses to survey questions—including DK
responses—are a function of both the social and cognitive dynamics of the survey
interview (Sudman et. al. 1996; Rrosnick and Fabrigar 1997). On the one hand, the
survey interview has been shown to be a social encounter between the respondent
and the interviewer that is governed by certain norms and social rules. On the other
hand, the interview requires cognitive effort from the respondent who must first
understand the question, then retrieve the relevant information, and finally integrate
the information in order to answer the question. The social norms of the interview,
in turn, can influence how much cognitive effort a respondent exerts.
DK responses can be considered a fimctios of two respondent level meta
variables: (1) cognitive factors, such as knowledge level, exposure to information,
mental ability and respondent comprehension during the interview and (2) social
factors, such as respondent motivation and adherence to social norms (Young 1999b;
1999c). Equation 23 below shows that DK is a function of social and cognitive
factors:
DK = Social Factors + Cognitive Factors (23)
Furthermore, central to the interview is the task, which includes the
interviewer administering the question to the respondent and the respondent, in turn,
providing an answer. Task difficulty can vary depending on its characteristics (e.g.,
question wording, format, content). Considering this, the simple response model
144
presented above with two main effects expands to a four variable model with two
main effects and two interaction effects (see equation 24 below).
(24)
DK = Social Factors + Cognitive Factors + [ Social*Task

+ Cognitive*Task}
Specifically, equation 24 above hypothesizes that DK is a function of social
processes-, cognitive processes, and the characteristics of the survey task. Let us
further define what we mean by each of these concepts: (1) social; (2) cognitive; and
(3) task. . . .
8.1*1 Social Aspects of the Interview
By social aspects, I refer, to the degree to which a respondent feels social
pressure to give a substantive as opposed to a DK response. I take the concept of
social pressure (normative motivation, respondent motivation) both from the
literature on response effects which conceptualizes the survey interview as a micro-
social system (e.g., Sudman and Bradbum 1974; Bradbum 1983; Sudman et. al.
1996) as well as from the literature on satisficing which places central importance
on the social motivation, of the respondent to provide substantive responses
(Krosnick 1991; Krosnick and Fabrigar 1997; Krosnick 1999). So how does the
survey as a mi.cro-social system work?
145
In this social system, there are two participants, or social roles—that of the
interviewer and that of the respondent—joined by a common task which is to ask
and answer questions. The social roles of both the respondent and the interviewer, in
turn, afford certain rights and prescribe certain obligations. Let me briefly describe
the rights and obligations of respondents and interviewers.
Respondents have the right to refuse the interview; to refuse to answer a
given question; and to be treated with respect. Respondents also have certain
obligations such as answering questions truthfully and answering questions to the
best of their ability. The interviewer has certain rights, including the right to guide
the interview within the given constraints set by the researcher as well as the right to
limit the respondent’s comments about subjects relevant to the survey. The
interviewer is also obligated to ask the questions as intended by the researcher, to
keep answers confidential, and to treat the respondent with respect.
These .rights and obligations, in turn, dictate specific social norms which
govern the behavior of both the interviewer and the respondent. By understanding
this social system, researchers can design surveys (1) that do not violate the social
norms of the system (2) that emphasize both the rights and obligations of each
participant and (3) that emphasize those social norms which motivate the respondent
to provide substantive responses.
In the case of the interviewer for instance, survey designers want to limit
idiosyncratic interviewer behavior that might affect a respondent’s answers (e.g.,
explaining a term in a question). In order to minimize such behavior, standardized
interview protocols are used to strictly define how an interviewer should behave.
146
Protocols, in other words, are a mechanism used by the researcher to clearly define
an interviewer’s obligations.
In the case of the respondent, researchers expect respondents to thoroughly
consider questions and, in turn, to provide truthful answers. To accomplish this,
survey designers should stress those social norms which motivate respondents to
give reliable and valid answers (e.g., good respondent norm and norm of
truthfulness).
One example of how to motivate respondents is to emphasize the importance
of the survey. The General So« m S irvey (GSS), for instance, sends an introductory
letter to selected households poor to the survey interview stressing the importance of
the survey: “...the results of this research will be released quickly to officials in
Washington, to scholars in universities, around the country, and to the public”.
While survey organizations use such techniques to convince potential respondents to
participate in the survey, researchers also use this technique, and others like it, to
motivate respondents to answer questions more carefully and thoughtfully (Krosnick
1991).
The survey researcher, however, must keep in mind that all individuals have
multiple roles at any given time (e.g., the role of the father, son, professional) and
that these additional social roles can have different—even conflicting—rights,
obligations, and social norms. Some of these roles and norms can actually facilitate
a respondent in providing reliable and valid answers, while others can actually
hinder it. Good citizen norms, for instance, can be used to motivate respondents to
answer questions by linking participation in a survey to the greater good of society ,
147
while face-saving norms and norms o f politeness might actually cause respondents
to alter (or edit) their answers to make them more socially acceptable.
Put another way, on the one hand, there are norms that motivate respondents
to provide answers that more closely correspond to their true value. On the other
hand, there are norms that motivate respondents to give answers which do not
correspond to their underlying true value. As such, when designing a survey,
researchers should attempt to maximize the role of social norms which lead to more
reliable and valid data and to minimize the role of social norms which do not
(Sudman et. al. 1996).
In a later section, we will discuss in more detail the practical application of
methodologies that consider normative motivation. However, the above discussion
failed to address one, important issue. What about a lack or norms, or anomie?
How does a survey researcher manipulate the norms of a survey interview if one of
the actors does nothald the same values as the larger social group?
Underscoring this concern, research especially in chapter 6 showed that
respondents more likely to answer DK are correspondingly more likely to feel
alienated from society, more likely to be social isolates, and less likely to possess the
same underlying values as the majority. So what is the solution? Unfortunately, no
simple answer exists—what is needed is more research on how subgroup norms, as
they relate to survey response, differ from those of the larger social group.
148
8.1.2 Cognitive Aspects of the Interview
By cognitive aspects, I refer to a respondent’s ability to cognitively process
and retrieve the relevant information to answer a question. I take this concept from
the cognitive psychology literature on. context effects which hypothesizes that
question response is a four stage cognitive process where respondents must (1)
understand the question, (2) retrieve the information from memory, (3) form a
judgement from the retrieved information, and (4) format the answer to the response
category (Tourangeau and Rasinski 1988). The methods literature argues that
respondents who are more cognitively sophisticated are less likely to have problems
in answering survey questions (Krosnick 1991; Krosnick and Fabrigar 1997;
Krosnick 1999).
As has been discussed at length, cognitive sophistication is a multifaceted
concept including respondents’ level of knowledge, interest in the topic, verbal
ability, general political competence, and exposure to media. (Schuman and Presser
1981; Krosnick 1991; Young 1999b, 1999c). Respondents, in turn, vary
considerably in their level of cognitive sophistication (Converse 1977; Young
1999b; 1999c). Is cognitive sophistication really a singular concept?
The research in this thesis suggests that two distinct dimensions of cognitive
ability actually exist: (1) acquired cognitive ability and (2) cognitive agility. On the
one hand, cognitive sophistication appears not to tap a respondent’s ability to
process information during the interview and comprehend survey questions. In other
149
words, cognitive sophistication is not a direct measure of survey performance.
Instead, the concept seems to be more closely related to acquired cognitive ability
(vocabulary, knowledge, and information exposure). On the other hand, interviewer
evaluation of respondent comprehension (COMPREND) seems to tap cognitive
agility, specifically respondentperformance during the interview. Future research
needs to more felly examine the dual concepts of cognitive ability and survey
performance.
Even without a clear understanding of the specific dimensions of cognitive
ability, researchers should design surveys so that even the least cognitively able can
understand the questions. Such strategies may include the simple wording of
questions and the use of pictorial devices such as show cards (Sudman and Bradbum
1982; Sudman et. ai. 1996). In addition, even among the more cognitively able,
certain topics can be cognitively demanding, such as complex public policy issues
like NAFTA.
By understanding the cognitive complexity of survey topics, researchers can
design surveys that facilitate cognitive processing. For instance, in the case of
questions which require specialized knowledge, such as NAFTA, researchers might
want to include general questions concerning regional economic integration that
precede it in order to prime respondents’ memory.
150
8.1.3 Task: Characteristics of the Interview
The central task of the survey interview is for the respondent to answer
survey questions. The given characteristics of the task (e.g., question wording,
format), in turn, can make the response process more or less difficult for the survey
respondent.
Survey researchers have control over task difficulty through the
manipulation of task characteristics. While a respondent’s social or cognitive
characteristics can not be varied, survey researchers can control the normative and
cognitive difficulties (barriers) of any given question. Specifically, researchers can
manipulate three task variables to vary normative and cognitive task difficulty: (1)
question characteristics, (2) interviewer behavior, and (3) mode of administration.
The following two examples illustrate this point. First, questions concerning
household finances can be cognitively demanding because such questions require
substantial knowledge of very detailed information which the respondent must
retrieve from memory. To reduce the cognitive barriers associated with such
questions, questionnaire designers could include several intermediate questions
concerning different aspects of household finances (e.g., mortgage/rent, utilities,
food, gas, etc.).
This strategy decomposes the cognitive task for respondents, making it easier
to recall the relevant information from memory (Sudman and Bradbum 1982;
Sudman et. al. 1996). To further reduce the cognitive demands, researchers might
151
request that respondents access personal record when in doubt—this can either be
done in a face-to-face or in a self-administered survey.
Second, sensitive questions, such as those concerning drug-use, can create
high normative barriers for respondents as a result of pressures (e.g., face-saving
norm) to express socially acceptable answers.1
To reduce the normative barriers associated with drug-use questions, survey
researchers may want to administer a self-administered questionnaire. Researchers
could also assure respondents, in the introduction of the questionnaire, that their
answers are confidential.
8.1.4 'Summary Remarks
When designing a survey, researchers must take into consideration both the
social (normative motivation) and cognitive (cognitive ability) aspects of the survey
interview as well as the normative and cognitive difficulties associated with the task.
To illustrate this point, let us slightly alter equation 24 using the terminology
introduced in the last three sections (see equation 25 below).
(25)
DK = Normative Motivation + Cognitive Ability

+ [ Normative Motivation * Task + Cognitive
Ability * Task j
Note the literature has typically treated social desirability as a psychological trait (see
DeMaio 1984). However, some research suggests that such behavior is normatively determined
(Stockings 1979).
152
In place of social factors, equation 25 uses normative motivation, and, in
place of cognitive factors, it employs cognitive ability. So what does the above
model suggest?
Equation 25 shows that DK is a function of a respondent’s level of normative
motivation; cognitive ability and the difficulty of the task (e.g., question wording).
Task difficulty, however, does not have a direct effect on DK. Instead, both
normative motivation and cognitive ability mediate the effect of task on DK.
Specifically, the normative motivation by task interaction effect can be treated as the
normative barriers associated with a given task, while the cognitive ability by task
interaction effect can be treated as the cognitive barriers associated with a given
task. Simply put, these two interaction effects suggest that respondents who are
either less noitnatively motivated or less cognitively able will be more likely to
answer DK when the task is more difficult
In support of this model, the literature shows that respondents with lower
levels of cognitive ability and normative motivation are more likely to answer DK
when the task is more difficult (Sehtiman and Presser 1981; Narayan and Krosnick
1996). However, no methods research has attempted to determine exactly how
different respondent and task level characteristics are related. Such knowledge is
essential for a fully functional model of DK response. Even with these limitations
though, how can this model be used?
Survey researchers can use the above model in both questionnaire design and
data imputation. In the case of questionnaire design, the model stresses that survey
researchers can manipulate task characteristics to maximize or minimize DK
153
responses. For instance, researchers can minimize DK responses by employing
strategies that reduce normative and cognitive banders such as excluding the DK
option; using simple words and concepts; including an introduction emphasizing the
importance of the survey to the national public debate; using intermittent interviewer
probes to emphasize the importance of the survey.
In the case of post-hoc data fixes (imputation), the two interaction terms in
equation 25 above drop out because data imputation is usually concerned with
missing data on individual questions. Here the most important point is that DK
responses (and item missing data more generally) are a function of the social and
cognitive dynamics of the interview—something not discussed in the statistics
literature on data imputation. It is essential, therefore, that measures which capture
these processes be .included on surveys. As mentioned in chapter I, statisticians
should be most worried about unobserved heterogeneity in their models.
Huge gaps still remain with the conceptual framework presented above,
warranting further research. This model does establish general guidelines about how
to reduce DK response and finds strong theoretical and empirical support in the
methods literature, linking missing data with a well-known socio-cognitive
framework used to explain response effects. For statisticians imputing missing data,
the above model provides a general blueprint concerning the variables that should be
included the data imputation model. In sections 8.2,1 discuss in greater depth
specific imputation variables.
154
Filially, it is important to note that the above model is probably relevant for
other forms of missing data. However, future research still must test the validity of
the model on other forms of item missing data.
8.2 Analysis of Respondent Level Correlates
Both the research cited in this thesis as well as the research presented in the
previous chapters show that respondents who are more likely to answer DK are
systematically different than respondents who are less likely. Specifically, like past
studies, our research here has demonstrated that, on average, the less educated,
female, less knowledgeable, black, older, those less active in civic activities and the
less cognitively sophisticated are more likely to answer DEL In short, DK responses
are a non-random phenomenon. Survey researchers, therefore, should be worried
about potential bias in point estimates. But so what? What do these correlations
really mean? Can we ever get at meaning with quantitative research?
To address these questions, I examine six correlates of DK: (1) education;
(2) age; (3) cognitive sophistication; (4) civic participation; (5) COOP; and (6)
COMPREND. So what do we find?
Education: the research here shows that education is negatively and non-
linearly related to DK and can be completely explained away by two factors: (1) age
and (2) cognitive sophistication . These findings support the use of education as a
proxy for both cognitive ability and normative motivation. However, it is important
to note that no other research has explained away the education effect, suggesting
155
that future confirmation is needed before any definitive conclusions can be made.
So what are some practical lessons we can learn from these findings?
First, DK responses occur more often among the less educated than the more
educated. In principle, questionnaire designers should attempt to minimize question
difficulty and burden in order to minimize non-substantive responses such as DK
(Sudman and Bradbum 1981; Sudman et. al. 1996; Krosnick 1991). One simple rule
of thumb is to remember that approximately a quarter of Americans (24%) have less
than a high school degree—questionnaire designers should write at the same level
using simple words and concepts.
Second, DK responses occur overwhelmingly among the less educated. In
the above analysis, individuals with less than a high school degree represented 24
percent of the sampled population but were responsible for approximately 40 percent
of all DKs. Questionnaire designers should be keenly aware that a rather small
proportion of the population is responsible for a large proportion of all DK
responses. Three initial suggestions come to mind concerning the disproportionate
levels ofDK’s among the less educated.
(1) As stated earlier, question wording should be made simple, written so

that even respondents with less than a high school degree can easily
understand.
(2) Any and all devices should be employed in the design of the
questionnaire to ease cognitive difficulties of the survey interview—
always keeping in mind that a large portion of survey error probably is
disproportionately located among a small subgroup. Such devices could
include hand cards, clear transitions between sections and breaking long
blocks of questions.
156
(3) Pre-testing questionnaires should take into considerations that a
disproportionate level ofDK’s (and probably survey error in general)
occurs among a small group of respondents. Specifically, questionnaire
designers might want to disproportionately sample (oversample) low
education respondents in order to detect specific problems with the
instrument. At the pre-test stage, oversampling of low education groups
may also be desirable when pre-testing instruments using focus groups
and cognitive interviews. Pre-testing might also be further improved by
stratifying the pretest sample according to age and cognitive ability, as
these two variables actually account for the education effect.
Third, the research in this thesis also shows that other correlates of cognitive
ability, such as verbal ability, knowledge, and information exposure do a better job
of explaining DK than education. Should we use other variables in our imputation
models? The short answer is yes.. One initial suggestion is that other measures of
cognitive ability, such as verbal ability, objective knowledge questions; and
subjective interviewer and respondent evaluations of comprehension and knowledge
be included in the questionnaire.
A ge: the research has demonstrated that the age effect is, in part, a function
of cognitive ability as well as normative motivation. Specifically, older respondents
aremore likely to have problems understanding survey questions as well as less
likely to be normatively motivated to provide substantive answers. Our analysis,
however, suggests that older respondents are not less cognitively sophisticated but
instead are more likely to have problems understanding survey questions. In other
words, it is a question of survey performance and not acquired cognitive ability
(verbal ability, knowledge, and information exposure).
Furthermore, the research in this thesis has shown that age only has an effect
on DK for very old individuals. Specifically, respondents 64 years of age or
157
younger do not differ greatly in their probability of providing a DK responses.
Instead, the age effect is most prominent among respondents 65 years of age or older
who are much more likely to answer DK than their younger counterparts. This result
suggests that the age effect most probably results from life-cycle differences, such as
cognitive and social senescence, and not from cohort differences. However, this
conclusion must be seriously qualified considering that we only analyzed one point
in time. Future research needs to examine the relative weight of life cycle and
cohort differences in the composition of the age effect.
These results also suggest solutions and raise new questions. First, as
stressed with education, survey researchers should use age (especially a
dichotomized age variable) in stratifying pre-tests (e.g., elderly respondents versus
younger respondents). Second, the age effect, in part, results from lower levels of
normative motivation. In short, elderly respondents seem to either hold different
values from society and/or are less likely to adhere to the societal norms. So how do
we motivate such respondents?
The short answer is that we really do not know. Future research needs to
examine alternative strategies for normatively motivating elderly respondents. For
instance, research shows that older respondents are more likely to use stories and life
experiences when answering questions (Schwartz et. al. 1999). Perhaps the initiation
of stories, through interviewer probing, may increase normative motivation, even
though this method deviates from standardized interviewing practices.
158
Third, the results here suggest that methodologists should focus on the
specification of the age variable. Indeed, most of the methods research has treated
age a continuous variable linearly related to DK.
Cognitive Sophistication: our research has shown that cognitive
sophistication is an important and robust predictor of DK. Most importantly, by
using a direct measure of cognitive sophistication, we were able to explain away the
education effect.
Furthermore, the results demonstrate that cognitive sophistication and
survey performance are not necessarily the same thing. Yes, respondents’ general
cognitive sophistication does contribute significantly to their probability of giving a
DK answer. However, cognitive sophistication by itself should not be considered a
measure of survey performance. In short, measures, such as verbal ability,
knowledge, and information exposure, probably capture general behavioral
propensities, while interviewer evaluations of respondent comprehension, instead,
seem to be direct measures of survey performance.
In this thesis, I combined verbal ability, knowledge, and information
exposure into one global measure—cognitive sophistication. This aggregation was
both theoretically and empirically justified. My principal reason, though, for the
combination was to reduce the number of variables in order to facilitate
interpretation of the results. I sacrificed depth for efficiency. Future research should
break out the measures to determine their relative weight in explaining DK.
Lastly, direct measures of cognitive sophistication are probably most
important for the correct specification of data imputation models. My suggestion is
159
that when imputation is predicted to play an important role in a study—direct rather
than indirect measures of cognitive sophistication should be included in the survey.
Civic Participation: our research also suggests that civic participation
is a proxy for cognitive sophistication. Respondents who participate more in civic
activities are less likely to say DK because they are more cognitively sophisticated
and not because they are more normatively motivated. This finding runs counter the
methods literature which had assumed that civic participation is a proxy variable for
respondent motivation.
CO O P and COM PREND: the research in this thesis shows that
interviewer evaluations of respondent cooperation and comprehension are important
predictors of DK. This finding is important for three reasons. First, such measures
are easily administered and, therefore, should always be included when researchers
are considering data imputation. Second, the results suggest that methods
researchers should be aware that distinctions exist between actual survey
performance and general behavioral propensiti.es.
Third, the significant and robust independent effects of COOP and
COMPREND suggest that methods research, in general, should perfect interviewer
evaluations of survey performance. For instance, future questions could include
sub-dimensions of comprehension and cooperation. What sorts of questions did the
respondent not understand?
Of course, as we have already stressed, all results based on COOP and
COMPREND must be qualified when considering the serious questions concerning
causality. Does respondent comprehension and motivation influence DK? Or, does
160
the number of DK responses that a given respondent provides influence the
interviewer’s evaluation of that respondeat?
8.3 W rapping Things Up
So how does our general model of DK response presented in section 8.1
correspond to the specific variables presented in section 8.2? Which respondent
characteristics tap normative motivation and which tap cognitive ability? Where do
we need to develop measures? To answer these questions, table 15 below organizes
respondent characteristics by normative motivation and cognitive ability.
Table 15: R esp ond en t V ariables by N orm ative

M otivation and Cognitive A bility
Normative Motivation C!«ffnitive Afoilitv
Interviewer Evaluation Interview er Evaluation

Age
Age Education
Verbal Ability
Anomie Index Information Exposure
Objective/Subjective Knowledge
Social Norms Index Civic Participation
A quick glance at the above table indicates that many more proxy measures
of cognitive ability exist than do measures of normative motivation. This may be so
for two reasons. First, cognitive ability may simply be the more important predictor
of DK. Second, survey methodologists have not devoted the same energies to the
161
development of measures of normative motivation. Considering these deficiencies,
what should be done?
Future research should develop measures of social norms, specifically
targeted at capturing adherence to social norms associated with the survey interview.
Indeed, until such measures are developed, the socio-cognitive framework presented
in this thesis is much more cognitive than social.
Before closing out this study, we are left with one pending question—should
we really impute missing values on attitudinal items? Does data imputation have a
role in attitude surveys?
I would say yes and no. On questions with high levels of DK (20 percent or
more), especially those that require specified knowledge (e.g., NAFTA), I would
not recommend imputing DK responses for two reasons. First, the proportion of DK
responses typically is too large. The simple “rule of thumb” for imputation is that as
the percent of missing data cases reaches 25 or 30 percent, imputed data become less
reliable and valid (Little and Rubin 1987). Second, a significant portion of the DK
responses may actually be nonattitodes. Indeed, in such cases, the DK response
category most definitely says something substantive about what the public both
knows and thinks about this issue.
In the case of questions concerning general attitudes such as subjective well
being (e.g., happiness) however, data imputation may be the correct strategy for two
reasons. First, the rate of DK responses on such questions is very low—typically 0
to 3 percent. Second, it is reasonable to assume that all respondents have some
162
underlying true value when it comes to their subjective well-being. All respondents,
in other words, are more or less happy.
It is important to stress, though, that imputation of DK responses on attitude
items is not a common practice in the social sciences today (for an interesting
exception to this see Gelman et. al. 1998). In the end, the decision to impute on
attitude items should be dealt with on a case by case basis.
163
BIBLIOGRAPHY
Anderson, A.B, Balilevsky, A., and Hum, D.P. (1983) “Missing Data” in Handbook
o f Survey Research (eds) Rossi, P. et. al. Academic Press, Inc.: New York.
Bishop, G.F.(1980) ‘‘Experiments with the Middle Response Alternative in Survey

Questions,” Public Opinion Quarterly, 51, pp.220-232.
Bishop, G.F., Oldendick, R.W., Tuchfarber, A.J., Bennet, S.E. (1980) “Psuedo-
Opinions on Public Affairs,” Public Opinion Quarterly, 44, pp.198-209.
Bishop, G.F., Oldendick, R.W., Tuchfarber, AJ. (1983) “Effects of Filter Questions
in Public Opinion Surveys,” Public Opinion Quarterly, 47, pp.528-546.
Bishop, G.F., Tuchfarber, A J., Oldendick, R.W.,(1986) “Opinions on Fictitious

Issues: The Pressure to Answer Survey Questions,” Public Opinion
Quarterly, 50, pp.240-250.
Berk, R. (1983) “An Introduction to Sample Selection Bias in Sociological data,”

American Sociological Review, 48:386-98.
Blau, P. and O. Duncan (1967) The American Occupational Structure. Wiley:

London
Bohmstedt, G.W. (1983) “Measurement” in Handbook o f Survey Research (eds)

Rossi, P. et. al. Academic Press, Inc.. New York.
Bonmstedt, G.W., and Rnoke, D. (1994) Statisticsfo r Social Data Analysis. F.E.
Peacock Publishers, Inc.: New York.
Bradbum, N., Sudman S. and Associates (1979) Improving Interview Methods and
Questionnaire Desigp Joessy-Bass: San Francisco
Bradbum, N. (1983) “Response Effects” in Handbook o f Survey Research (eds)

Rossi, P. et. al. Academic Press, Inc.: New York.
Brody, C.J. (1986) “Things are Really Black and White: Admitting Gray into the
Converse Model of Attitude Stability,” American Journal o f Sociology, 92,
pp.657-677.
Campbell, Angus, Miller, W.E., Converse, P.E. (1960) The American Voter.
University of Chicago Press.
164
Campbell, C.F., and Fiske, D.W. (1959) “Convergent and Discriminant Validation
by the Multi-Trait Multi-Method Matrix,” Psychological Bulletin, Vol 56,
No.2pp.8M 05.
Ceci, SJ. (1992) “How Much Does Schooling Influence General Intelligence and Its
Cognitive Components? A Reassessment of the Evidence,” Developmental
Psychology, 27, pp.703-722.
Citrin, J., and Muste, C. (1999) “Trust in Government” in Measures o f Political

Attitudes (eds) Robinson, J.P., Shaver, P.R., and Wrightsman, L.S.
Academic Press: New York.
Cochran, W.G., (1977) Sampling Techniques. John Wiley & Sons: New York.
Converse, J.M., (1977) “Predicting No Opinion in the Polls ” Public Opinion

Converse, J.M., and Schuman, H. (1984) “The Manner of Inquiry: An Analysis

o f Survey Questions from Across Organizations and Over Time,” in C.F.
Turner and E. Martin (eds.), Surveying Subjective Phenomena (Vol 2),
pp. 283-316.
Converse, P.E., (1970) “Attitudes and Nonattitudes: Continuation of a Dialogue,”

inE.R. Tufts (ed.), The Quantitative Analysis o f Social Problems,
Reading, Mass.: Addison Wesley., pp.206-266.
Converse, P.E., (1964) “The Nature of Belief Systems in Mass Publics,” in D.E.
Apter (ed.), Ideology and Discontent, New York: Free Press, pp.206-266.
Coombs, C.H. and Coombs, L.C., (1977) “‘D on't Know’rltem Ambiguity or
Respondent Uncertainty,” Public Opinion Quarterly, 40, pp.497-514.
Couper, M P., Singer, E., and Kukla, R.A. (1997) “Participation in the Decennial
Census Census: Politics, Privacy, and Pressures,” America Politics
Crespi, I. (1988) Pre-Election Polling: Sources o f Accuracy and Error Russel

Sage Foundation: New York.
165
Davis, James A., (1980) "Conservative Weather in a Liberalizing Climate:
Change in Selected NORC General Social Survey Items, 1972-1978 in
Social Forces, 58, 1129-1156
Davis, James A., (1975) "Communism, Conformity, Cohorts, and Categories:

American Tolerance in 1954 and 1972-1973," in American Journal o f
Sociology, 81,491-513.
Davis, James A., (1992) "Changeable Weather in a Cooling Climate Atop the
Liberal Plateau: Conversion and Replacement in 42 Items, 1972-1989," in
Public Opinion Quarterly, 56, 261-306.
Davis, JA.,and Smith, T.W., (1996) General Social Surveys 1972-1996: Cumulative
Code Book The Roper Center for Public Opinion Research.
Dillman, D.A. (1978) Mail and Telephone Surveys: The Total Design Method. John
Wiley & Sons, Inc.: New York.
Dillman, D.A. (2000) Mail and Internet Surveys: The Tailored Design Method. John
Wiley & Sons, Inc.: New York.
DeMaio, T J. (1984) “Social Desirability and Survey Measurement: A Review,”

in C.F. Turner and E. Martin (eds.), Surveying Subjective Phenomena
(Vol 2),pp 257-282. '• .
Faulkenberry, G.D. and Mason, R, (1978) “Characteristics of Nonopinion and No

Opinion Response Groups,” Public Opinion Quarterly, 42, pp.533-543.
Feick, L.F., (1989) “Latent Class Analysis of Survey Questions that Include Don’t
Know Response,” Public Opinion Quarterly, 53, pp.525-547.
Ferber, R., (1966) “Item Nonresponse in A Consumer,” Public Opinion Quarterly,

30, pp.399-415.
Fowler, F.J., and Mangione, T.W. (1990) Standardized Survey Interviewing,

Newbury Park, CA: Sage Publication.
Francis, J. and Busch L., (1975) “What We Don’t Know about ‘I Don’t Know’,”
Public Opinion Quarterly, 39, pp.207-218.
Gelman, A., King, G., Lui, C. (1998) “Not Asked and Not Answered: Multiple
Imputation for Multiple Surveys,” Journal of the American Statistical
Association, Vol. 93, Nmn 443 pp.846-857.
166
Gergen, K.J., and Back, K.W., (1966) “Communication in the Interview and the
Disengaged Respondent,” Public Opinion Quarterly, 30, pp. 17-33.
Gilljam, M., and Granberg, D., (1993) “Should We Take Don’t Know for An
Answer,” Public Opinion Quarterly, 57, pp.348-357.
Glenn, N.G., (1969) “Aging, Disengagement, and Opinionation,” Public Opinion

Quarterly, 33, pp. 17-33.
Greene, W.H., (1997) Econometric Analysis. Prentice Hall: Upper Saddle River,
New Jersey.
Groves, GJML, (1989) Survey Errors and Survey Costs John Wiley & Sons, Inc.:
New York.
Groves, G.M., (1996) “Presidential Address: The Educational Infrastructure of the

Survey Research Profession,” Public Opinion Quarterly, 60, pp.477-490.
Groves, G M , and Couper,M.P, {199%) Nonresponse in Household Interview

Surveys John Wiiely & Sons, Inc.: New York.
Heckman, J.J. (1976) “Sample Bias as a SpecificationEiror” in Econometrica. 47:

pp. 153-162.
Heckman, JJ. (1979) “The Common Structure of Statistical Models of Truncation,

Sample Selection, and Limited Dependent Variables and a Simple Estimator
for Such Models” in Annuals o f Economic and Social Measurement. 5:
pp.475-492.
Herzog, A it., and Rodgers, W.L. (1999) “Cognitive Performance Measures in

Survey Research on Older Adults” in Cognition, Aging, and Self-Reports
(eds) Schwarz, N., Park, D., Knauper, B., abd Sudman, S. Academic Press:
New York.
Hippier, H J., and Schwarz, N. (1989) “No Opinion filters: A Cognitive Perspective”
in International Journal o f Public Opinion Research. 1:1, pp. 77-87.
Hyman, H.H. (1954) Interviewing in Social Research. University of Chicago Press:

Chicago.
Hyman, H.H., Wright, C.R., and Reed, J.S., (1975) The Enduring Effects o f
Education. University o f Chicago Press: Chicago.
167
Kennedy, P. (1998) A Guide to Econometrics. The MIT Press.: Cambridge,

Massachusetts.
Kish, L. (1965) Survey Sampling. John Wiley & Sons, Inc.: New York.
Kohut, A. (1981) “The 1980 Presidential Polls: A Review of Disparate Methods and
Results,” Proceedings o f the Section of m Survey Research Methods,
American Statistical Association, pp. 41-46.
Krosnick, J.A., (1991) “Response Strategies for Coping with the Cognitive Demands
of Attitude Measures in Surveys,” Applied Cognitive Psychology, 5, pp.213-
236.
Krosnick, J.A., (1999) “The Causes and Consequences of No-Opinion Responses in

Surveys,” presented at the International Conference on Survey Nonresponse.
Portland, Oregon. October 28-31.
Krosnick, J.A. and Alwin, D.F., (1987) “Satisficing: A Strategy for Dealing with the
Demands of Survey Questions,” GSS Methodological Report 46. March,
1987.
Krosnick, J.A. and Milbum, M.A. (1990)”Psychologjcal Determinants of Political

Opinionation” in Social Cognition 8:1, pp.49-72.
Krosnick, J.A., and Fabrigar, L.R.. (1997) “Designing Rating Scales for Effective
Measurement in Surveys,” in Lyberg et al. (eds.), Survey Measurement and
Process Quality, John Wfleiy & Sons, Inc.: New York, pp.141-164.
Leeuw de, Edith and Collins, Martin (1997) “Data Collection Methods and Survey
Quality: An Overview” in Lyberg et.al.(eds) (1997) Survey Measurement
and Process Quality, Jonh Wiley & Sons, Inc.: New York.
Lesser, Judith T., and Kalsbeek, William D. (1992) “Nonsampling Error in

Surveys” in Lyberg et.al.(eds) (1997) Survey Measurement and Process
Quality, John Wiley & Sons, Inc.: New York.
Little, J.A.R.,and Rubin, B.E., (1987) Statistical Analysis with Missing Data, John
Wiley & Sons, Inc: New York.
Lyberg, J. and Kaspryzyk, D, (1997) “Some Aspects of Post-Survey Processing” in

Lyberg et. al. (eds) Survey Measurement and Process Quality, John Wiley &
Sons, Inc.: New York.
168
Madow, W.G., and OUrin, I. (eds) Incomplete Data in Sample Surveys Vol. 3,
Proceedings of the Symposium. Academic: New York.
Mathiowetz, N.A., (1998) “Respondent Expression of Uncertainty: Data Source for

Imputation,” Public Opinion Quarterly, 62:2, pp.47-56.
Mathiowetz, N.A., DeMaio, T.J., and Martin, E. (1991) “Political Alienation, Voter
Registration, and the US 1990 Census,” Paper Presented at the annual
conference of the American Association of Public Opinion Research,
Phoenix, AZ.
Mayer, William G., (1992) The Changing American Mind: How and Why
American Public Opinion Changed Between I960 and 1988. Ann Arbor:
University o f Michigan Press, 1992.
McClendon, M and Alwin D.F., (1992) “No-Opinion Filters and Attitude

Measurement Reliability,” Sociological Methods & Research 21:4,
pp.438-464.
McCutcheon, A.L and Alwin D.F., (1987) Latent Class Analysis Sage Publications:
London.
Nagel, J.H. (1987) Participation Englewood Cliffs, NJ: Prentice-Hall
Narayan, S., and Krosnick, J.A., (1996) “Education Moderate Some Response
Effects .in Attitude Measurement,” Public Opinion Quarterly, 60,pp. 86-
96.
Nie, N.H., Junn, I., SteMik-Barry, K. (1996) Education and Democratic Citizenship
in America. The University of Chicago Press: Chicago and London.
Nuenaliy, J.C., and Bernstein, I. (1994} Psychometric Theory. McGraw-Hill, Inc:

New York.
O’Muircheartaigh, C., Krosnick, I.A., Helic, A., (1999)”Mddle Alternatives,

Acquiescence, and the Quality of Questionnaire Data” Paper Presented at the
1999 AAPOR Meeting in St. Petersburg Florida
Perry, P. (1979) “Certain Problems in Election Survey Methodology,” Public

Opinion Quarterly, 43, pp.312-325.
169
Rapoport, R.B., (1982) “Sex Differences in Attitude Expression: A Generational
Explanation,” Public Opinion Quarterly„46, pp.86-96.
Rapoport, R.B., (1985) “Like Mother, Like Daughter: Intergenerational

Transmission of DK Response Ratesf Public Opinion Quarterly, 49,
pp. 198-208.
Reef M I and Kuoke, D. (1999) “Political Alienation and Efficacy” in Measures o f

Put tical Attitudes (eds) Robinson, J.P., Shaver, P.R., and Wrightsman, L.S.
Rubin, D.B., (1976) “Inference and Missing Data,” Biometrika, 63, pp.581-592.
SaMtouse, T.A. (1999) “Pressing Issues in Cognitive Aging” in Cognition, Aging,

and Self-Reports (eds) Schwarz, N., Park, D., Knauper, B., and Sudman, S.
Sanchez, M.E and Morchio, G,, (1992) “Probing Don’t Know Answers: Effect on
■■ Survey Estimates andVariable Relationships,” Public Opinion Quarterly,
56, pp.454-474.
Schuman, H., and Presser, S., (1981) Questions & Answers in Attitude Surveys:
Experiments on Question Form, Wording, and Context. Sage Publications,
Inc.: Thousand Oaks, California. .
Schuman, Howard, Steeh, C , Bobo, L., and Kryson, M. (1986) Racial Attitudes
in America: Trends and Interpretations. Harvard University Press.
Schwarz, N., Park, D., Knauper, B., and Sudman, S. (1999) Cognition, Aging, and
Self-Reports. Academic Press: New York.
Schuman, H. and Scott, J., (1989) “Response Effects Over Time: Two Experiments”
in Sociological Methods & Research 17:4, pp.398-408.
Smith,T.W., (1981) “Educated Don’t Knows: An Analysis of the Relationship

Between Education and Item Nonresponse,” Political Methodology, pp. 47-
57.
Smith,T.W., (1984) “Nonattitudes: A Review and Evaluation,” in C.F. Turner and E.

Martin (eds.), Surveying Subjective Phenomena (Vol 2), pp. 215-255.
Smith,T.W., (1986) “A Study of Non-Response and Negative Values on the

Factorial Vignettes on Welfare,” GSS Methodological Report No. 44
November.
170
Smith,T.W., (1987a) “Attrition and Bias on the ISSP Supplement,” GSS

Methodological Report Mo. 42 February.
Smith,T.W., (1987b) “Phone Home?: An Analysis of Household Telephone

Ownership,” GSS Methodological Report No. 50. August.
Smith,T.W., (1991) “An Analysis o f Missing Income Information on the General

Social Surveys,” GSS Methodological Report No. 71 October.
Smith,!.W., (1992a) “The 1992 General Re-interview,” GSS Methodological Report

No. 75. December.
Smith,T.W., (1992b) “A Methodological Analysis of the Sexual Behavior Questions

on the General Social Surveys,” Journal o f Official Statistics Vol. 8: 3, pp.
309-325.
SmithJ.W, (1992c) “An Analysis of Response Patterns on 10-Point Scalometers,”

GSS Methodological Report No. 76 November.
Smith,T.W., (1993) “little Things Matter. A Sampler of How Differences in

. Questionnaire Format.can affect Survey Responses,” GSS Methodological
Report No. 78 July 1993.
Smith,T.W., (1997) “Factors Relating to Misanthropy in Contemporary American

Society,”.Social Science Research 26: pp. 170-195.
Southwell, P.L (1985) “ Alienation and Non-voting in the United States: A Refined
Operationalization” in Western Political Quarterly 38: PP. 663-674.
Stocking, Carol (1979) “ Reinterpreting the Marlowe-Crowne Scale” in Improving

Interview Method and Questionnaire Design (eds) Bradbum, N., and
Sudman, S. Jossey-Bass: San Francisco.
Stolzenberg, R.M., and Relies, D.A. (1997) “Tools for Intuition About Sample
Selection Bias and Its Correlates” in American Sociological Review, 62:
494-507.
Sudman, S., Bradbum, N.M., Schwarz, N. (1996) Thinking About Answers. Josey
Bass: San Francisco.
Sudman,S., and Bradbum, N.M., (1974) Response Effects in Surveys. Aldine

Publishing Company: Chicago
Sudman, S., and Bradbum, N.M., (1982) Asking Questions. Jossey-Baas: San
Francisco.
171
Taylor, M.C. (1983) “The Black-and-White Model of Attitude Stability: A Latent

Class Examination of Opinion and Non-opinion in the American Public,”
American Journal o f Sociology, 89, pp.373-401.
Tourangeau, R. andRasinsM, K..A, (1988) “Cognitive Processes Underlying Context

Effects in attitude Measurement” in Psychological Bulletin, 103, pp.299-14.
Tourangeau, R., Rips, L., and Rasinski, K.A., (2000) The Psychology o f Survey
Response, Cambridge University Press: New York.
Traugott, M.W., and Tucker, C. (1984) “Strategies for Predicting Whether a

Citizen will Vote and Estimation of Electoral Outcomes,” in Public
Opinion Quarterly 48:4 pp 330-343.
Verba, S and Nie, N.H. (1972) Participation in America New York: Harper &
Row.
Weatherford, M.S. (1991) “ Mapping the Ties that Bind: Legitimacy,

Representation, and Alienation” in Western Political Quarterly 44: pp. 251-
276.
Weeks, M.F., (1992) “Computer-Assisted Survey Information Collection: A review

of CASIC Methods and Their Implications for Survey Operations,” Journal
o f Official Statistics,:Vol 8, No 4, pp. 445-466.:
Wright, B.D., and Masters, G.N. (1982) Rating Scale Analysis. MESA Press:
Chicago
Young, C.A. (1998a) “Declining Confidence in American Political and Nonpolitical

Institutions, 1973-96: Effects of Cohort Succession and Attitndinal
Conversion,” presented at the regional seminar of the World Association for
Public Opinion Research. Manila, Philippines, January.
Young C.A. (1998b) “Sex, Crimes, and Economic Downturns: Why Americans
have become less confident in their Political and Non-political Institutions,”
Unpublished Masters Thesis. University of Chicago.
Young C.A. (1998c) “Cognitive Sophistication, Social Isolation and Role

Expectation: Don’t Knows on the General Social Survey” presented at the
Midwest Association for Public Opinion Research. Chicago, November.
Young, C.A. (1999a) “Mean Square Error: A Framework for Classifying Survey
Error,” paper presented to the research staff of DATA-UFF at the
Universidade Federal Fuminense. Niterio, Brazil. March 9,1999.
172
Young, C.A. (1999b) “An Analysis of Can’t Choose Responses on the 1993
International Social Survey Program,” paper presented at the annual meeting
of International. Social Survey Program. Madrid, Spain. April 26,1999.
Young, C.A. (1999c) “What.We Now Know about ‘I Don’t Know’: An Analysis of
the Relationship between ‘Don’t Know’ and Education,” paper presented at
the 54th annual meeting of the American Association of Public Opinion
Research. St.. Pete Beach, Florida. May 15,1999.
Young, C.A. (1999d) “Dealing with Don’t Know Responses on Behavioral,

Demographic and Attitiidinal Questions,” Unpublished Special Field Exam.
University of Chicago, June.
Young, C.A. (2000) “0 <pe qtieremos dizer quando falamos sobre quahdade em ■
pesquisa?: Defini?So e Glassificafao de Conceitos,” presented at the Ford
Foundation Series on Survey Methodology at DataUff, Universidade Federal
Fluminense. November, Rio de Janeiro.
Young, C.A,, Pansanato, K.A., and de Britto, F.C. (2000) “Estimating

Coverage Bias in Telephone Electoral Studies: Who Has a Phone at
Home? and .Does it Really Matter?,” presented at the SINAPE (Simposio
Nacional de Probabilidade e Estatistica). July Caxambu, Minas Gerais.
Young, C.A. (2001) ““In Search of.Country and House Effects: An International
Comparison of Data Quality Unpublished Paper, March. Sao Paulo.
Young, C.A., Andrade, F.C., and Moura, C.B. (2001) “Non-Response in Sample
Surveys: A Clarification of Concepts and Our Empirical Experience in
the United States and Brazil,” paper presented at the annual Pesquisa
Social Brasileira (PESB) seminar, April. Rio de Janeiro.

Young - Why Surveys

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Young - Why Surveys

Uploaded by

Copyright:

Available Formats

THE UNIVERSITY OF CHICAGO

EXPLAINING WHY SURVEY RESPONDENTS ANSWER T DON’T KNOW’:

UMI Microform 3029551

Bell & Howell Information and Learning Company

1.0 Introduction of Thesis........................................................................... 1

6.2 Methods and Data............................................................... ............. 107

Figure 1: Distribution of “Don’t Know” Responses from Attitude Items on

Table 1: Decomposing the DK Category: Ambivalent Attitudes vs. Other DK-

I write my acknowledgements from Sao Paulo, Brazil—a huge, unforgiving

Sociologists are important consumers of sample surveys. Modem

1.0 Introduction of Thesis

provided the means to develop and perfect the measurement of sociological

phenomena. Surveys, however, do not need to be mechanisms solely for testing

sociological concepts. Indeed, sociological concepts can also provide important

insights into how best to maximize survey quality.

than others. By understanding “why”, survey methodologists will be able to develop

better strategies to maximize survey response and survey quality. In addition, a

(2) Along what dimensions (demographic, behavioral, and/or attitudinal) are

(3) What social and/or psychological processes explain why some

1.1 The Problem

Quantitative sociologists use statistical techniques which have been designed

to examine rectangular data matrices. In standard statistical packages such as SPSS,

survey do not provide the requested information, or the provided information is

questionnaire formatting (e.g., improper skip patterns in questionnaires) as well as

for examples of questionnaire formatting problems). Process item nonresponse

minimized with redundant systems and checks. Computer assisted interviewing

nonresponse through computerized range, consistency, and m issing value checks

(Lyberg & Kasprzyk 1997; Weeks 1992).

By interview item nonresponse, I am referring to item missing data resulting

factors, in addition to individual level characteristics, result in missing data,

questionnaire administration. In this thesis, I will be analyzing interview

mean interview item nonresponse.

1.1.2 Item Missing Data and Its Effect on Substantive

define our terms.

increases the standard error.

standard error increases.

Standard Error (y) = Vvar(y) / n (1)

affluent than those without.

Bias (y) = E(|Y-y|) (2a)

Bias, as it relates to item nonresponse, is a function of two components: (1)

nonresponse subgroup (see equation 2b below).

incorrect inferences about the population of interest. Considering the

aforementioned limitations of excluding item missing data, how might analysts

effectively minimize error associated with item nonresponse?

1.1.3 Dealing with Item Nonresponse

these two methods to minimize problems of missing data.

1.1 3 .1 Questionnaire Design

Questionnaire designers have used a variety of strategies to minimize item

nonresponse through questionnaire design. The strategy, in general, has been to

sufficiently motivate the respondent to provide a substantive response (1) by

following examples will illustrate this point.

First, on income questions, questionnaire designers will typically ask

ended income questions because they ensure respondents a greater degree of

confidentiality (Bradbum, Sudman, and Associates 1979; Sudman and Bradbum

Second, in order to minimize don’t know (DK) responses, some surveys,

interviewer is instructed to record a DK response only after the respondent

whether minimizing DK responses improves or worsens data quality.

designed to provide respondents with a greater degree of privacy, eliminating the

(Perry 1979; Kohut 1981).

While survey methodologists have employed a variety of methods to

need of a conceptual framework to help guide methodologists in how best to

minimize the problem of item missing data.

1.1.3.2 Data Imputation