Professional Documents
Culture Documents
Dawes The Robust Beauty of Improper Linear Models PDF
Dawes The Robust Beauty of Improper Linear Models PDF
in Decision Making
ABSTRACT: Proper linear models are those in which A proper linear model is one in which the
predictor variables are given weights in such a way weights given to the predictor variables are chosen
that the resulting linear composite optimally predicts in such a way as to optimize the relationship be-
some criterion of interest; examples of proper linear tween the prediction and the criterion. Simple
models are standard regression analysis, discriminant
regression analysis is the most common example
function analysis, and ridge regression analysis. Re-
of a proper linear model; the predictor variables
search summarized in Paul Meehl's book on clinical
versus statistical prediction—and a plethora of re-
are weighted in such a way as to maximize the
search stimulated in part by that book—all indicates correlation between the subsequent weighted com-
that when a numerical criterion variable (e.g., graduate posite and the actual criterion. Discriminant
grade point average) is to be predicted from numerical function analysis is another example of a proper
predictor variables, proper linear models outperform linear model; weights are given to the predictor
clinical intuition. Improper linear models are those in variables in such a way that the resulting linear
which the weights of the predictor variables are ob- composites maximize the discrepancy between two
tained by some nonoptimal method; for example, they or more groups. Ridge regression analysis, an-
may be obtained on the basis of intuition, derived other example (Darlington, 1978; Marquardt &
from simulating a clinical judge's predictions, or set to Snee, 1975), attempts to assign weights in such
be equal. This article presents evidence that even
a way that the linear composites correlate maxi-
such improper linear models are superior to clinical in-
mally with the criterion of interest in a new set
tuition when predicting a numerical criterion from
numerical predictors. In fact, unit (i.e., equal) weight-
of data.
ing is quite robust for making such predictions. The Thus, there are many types of proper linear
article discusses, in some detail, the application of unit models and they have been used in a variety of
weights to decide what bullet the Denver Police De- contexts. One example (Dawes, 1971) was pre-
partment should use. Finally, the article considers sented in this Journal; it involved the prediction
commonly raised technical, psychological, and ethical
of faculty ratings of graduate students. All gradu-
resistances to using linear models to make important
social decisions and presents arguments that could
weaken these resistances.
Work on this article was started at the University of
Oregon and Decision Research, Inc., Eugene, Oregon; it
Paul MeehPs (1954) book Clinical Versus Statis- was completed while I was a James McKeen Cattell Sab-
tical Prediction: A Theoretical Analysis and a batical Fellow at the Psychology Department at the Uni-
Review of the Evidence appeared 25 years ago. versity of Michigan and at the Research Center for
It reviewed studies indicating that the prediction Group Dynamics at the Institute for Social Research there,
I thank all these institutions for their assistance, and I
of numerical criterion variables of psychological especially thank my friends at them who helped.
interest (e.g., faculty ratings of graduate students This article is based in part on invited talks given at
who had just obtained a PhD) from numerical the American Psychological Association (August 1977),
the University of Washington (February 1978), the
predictor variables (e.g., scores on the Graduate Aachen Technological Institute (June 1978), the Univer-
Record Examination, grade point averages, ratings sity of Groeningen (June 1978), the University of Am-
of letters of recommendation) is better done by a sterdam (June 1978), the Institute for Social Research at
the University of Michigan (September 1978), Miami Uni-
proper linear model than by the clinical intuition versity, Oxford, Ohio (November 1978), and the Uni-
of people presumably skilled in such prediction. versity of Chicago School of Business (January 1979).
The point of this article is to review evidence that I received valuable feedback from most of the audiences.
Requests for reprints should be sent to Robyn M.
even improper linear models may be superior to Dawes, Department of Psychology, University of Oregon,
clinical predictions. Eugene, Oregon 97403.
Prediction of neurosis vs. psychosis .28 .31 .30 .34 .46 .46
Illinois students' predictions of GPA .33 .50 .51 .60 .57 .69
Oregon students' predictions of GPA .37 .43 .51 .60 .57 .69
Prediction of later faculty ratings at Oregon .19 .25 .39 .48 .38 .54
Yntema & Torgerson's (1961) experiment .84 .89 .84 .97 .97
Note. GPA = grade point average.
(i.e., correlations) obtained by various methods in this experiment were asked to estimate the
were compared. In the first study, a pool of 861 value of each ellipse and were presented with out-
psychiatric patients took the MMPI in various come feedback at the end of each trial. The prob-
hospitals; they were later categorized as neurotic lem was to predict the true (i.e., experimenter-
or psychotic on the basis of more extensive in- assigned) value of each ellipse on the basis of its
formation. The MMPI profiles consist of 11 size, eccentricity, and grayness. .
scores, each of which represents the degree to The first column of Table 1 presents the average
which the respondent answers questions in a man- validity of the judges in these studies, and the
ner similar to patients suffering from a well-de- second presents the average validity of the para-
fined form of psychopathology. A set of 11 scores morphic model of these judges. In all cases, boot-
is thus associated with each patient, and the prob- strapping worked. But then what Corrigan and I
lem is to predict whether a later diagnosis will be constructed were random linear models, that is,
psychosis (coded 1) or neurosis (coded 0). models in which weights were randomly chosen
Twenty-nine clinical psychologists "of varying ex- except for sign and were then applied to standard-
perience and training" (Goldberg, 1970, p. 425) ized variables."
were asked to make this prediction on an 11-step
forced-normal distribution. The second two stud- The sign of each variable was determined on an a priori
basis so 'that it would have a positive relationship to the
ies concerned 90 first-year graduate students in the criterion. Then a normal deviate was selected at random
Psychology Department of the University of from a normal distribution with unit variance, and the
Illinois who were evaluated on 10 variables that absolute value of this deviate was used as a weight for
the variable. Ten thousand such models were constructed
are predictive of academic success. These vari- for each example. (Dawes & Corrigan, 1974, p. 102)
ables included aptitude test scores, college GPA,
various peer ratings (e.g., extraversion), and vari- On the average, these random linear models per-
ous self-ratings (e.g., conscientiousness). A first- form about as well as the paramorphic models of
year GPA was computed for all these students. the judges; these averages are presented in the
The problem was to predict the GPA from the 10 third column of the table. Equal-weighting mod-
variables. In the second study this prediction was els, presented in the fourth column, do even better.
made by 80 (other) graduate students at the Uni- (There is a mathematical reason why equal-weight-
versity of Illinois (Wiggins & Kohen, 1971), and ing models must outperform the average random
in the third study this prediction was made by 41 model.6) Finally, the last two columns present
graduate students at the University of Oregon.
The details of the fourth study have already been
covered; it is the one concerned with the predic- 5
Unfortunately, Dawes and Corrigan did not spell out
tion of later faculty ratings at Oregon. The final in detail that these variables must first be standardized
study (Yntema & Torgerson, 1961) was one in and that the result is a standardized dependent variable.
Equal or random weighting of incomparable variables—
which experimenters assigned values to ellipses for example, GRE score and GPA—without prior stan-
presented to the subjects, on the basis of figures' dardization would be nonsensical.
6
size, eccentricity, and grayness. The formula used Consider a set of standardized variables Si, X*, .Xm,
each of which is positively correlated with a standardized
was ij + kj + ik, where i, j, and k refer to values variable Y. The correlation of the average of the Xs
on the three dimensions just mentioned. Subjects with the Y is equal to the correlation of the sum of the