
If Mathematical Psychology Did Not Exist We Might Need to Invent It: A Comment on Theory Building in Psychology

Perspectives on Psychological Science, 2021, 1–10
https://doi.org/10.1177/1745691620974769
Danielle J. Navarro
School of Psychology, University of New South Wales

Abstract
It is commonplace, when discussing the subject of psychological theory, to write articles from the assumption that
psychology differs from the physical sciences in that we have no theories that would support cumulative, incremental
science. In this brief article I discuss one counterexample: Shepard’s law of generalization and the various Bayesian
extensions that it inspired over the past 3 decades. Using Shepard’s law as a running example, I argue that psychological
theory building is not a statistical problem, mathematical formalism is beneficial to theory, measurement and theory
have a complex relationship, rewriting old theory can yield new insights, and theory growth can drive empirical work.
Although I generally suggest that the tools of mathematical psychology are valuable to psychological theorists, I also
comment on some limitations to this approach.

Keywords
psychological theory, inductive generalization, mathematical psychology, cognitive modeling

In 1987 Roger Shepard published a brief article in Science with the ambitious title "Toward a Universal Law of Generalization for Psychological Science" (Shepard, 1987). Drawing on the empirical literature on stimulus generalization in several domains and species, he asserted that any stimulus-generalization function should be approximately exponential in form when measured with respect to an appropriately formulated stimulus representation. His article begins with the following remark:

    The tercentenary of the publication, in 1687, of Newton's "Principia" prompts the question of whether psychological science has any hope of achieving a law that is comparable in generality (if not in predictive accuracy) to Newton's universal law of gravitation. Exploring the direction that currently seems most favorable for an affirmative answer, I outline empirical evidence and a theoretical rationale in support of a tentative candidate for a universal law of generalization. (p. 1317)

Shepard's claim was remarkable in scope. He drew on data from multiple species (e.g., humans, pigeons, rats) and stimulus domains (e.g., visual, auditory), data that had until that point been assumed to be quite different from one another. To spot the invariance that holds across these data sets, Shepard used statistical insights from research on similarity modeling. He noted that the apparent noninvariance of observed stimulus-generalization functions stemmed largely from the fact that response data had previously been analyzed with respect to the physical dissimilarities of the stimulus. When the same responses were plotted as a function of distance in a psychological space constructed by multidimensional scaling, he found that the form of the stimulus generalization was remarkably regular in shape.

Taken by itself, Shepard's reanalysis would have been impressive. However, Shepard went on to provide a theoretical explanation for why we should expect to find this invariance. The theory was surprisingly simple: The learner presumes there exists some unknown consequential region of the stimulus space across which roughly the same properties hold (e.g., things that look like apples will probably taste the same as one another). When encountering a single stimulus that entails a particular consequence, the learner's task is to infer the location, shape, and size of the consequential region itself. This is naturally an underconstrained problem, as there are an infinite number of possible regions that might correspond to the true consequential region. Nevertheless, Shepard showed that under a range of assumptions that the learner might make about the nature of consequential regions, the shape of the generalization function across the stimulus space ends up being approximately exponential. A visual illustration of this idea is depicted in Figure 1.
Fig. 1. A schematic depiction of Shepard's (1987) theory of stimulus generalization. The main panel depicts a two-dimensional psychological space, in which possible stimuli can vary along two stimulus dimensions (e.g., brightness, orientation). The black marker shows the location of a "consequential stimulus" (e.g., a fruit with an unpleasant taste), and each of the gray rectangles represents one possible hypothesis about the set of possible stimuli that might also have this consequence (e.g., taste unpleasant). Not knowing which of these hypotheses represents the true extension of the "region of unpleasant fruits," the learner "averages" across his or her uncertainty, leading to the approximately exponential generalization gradients plotted above and to the right of the main panel. The curves are jagged rather than smooth because only a sample of possible regions is depicted; for ease of exposition the figure represents a simplified version of Shepard's theory.
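To make the averaging idea in the caption concrete, the following Python snippet is a minimal Monte Carlo sketch of my own, not code from the article, and far simpler than Shepard's analytic derivation. It assumes a one-dimensional psychological space, an exponential prior over region sizes, and a uniform prior over region locations; all of these specific choices are illustrative assumptions. The sketch samples candidate consequential regions consistent with a single observed stimulus and averages over them, which yields a generalization gradient that falls off in an approximately exponential fashion.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample candidate "consequential regions" (intervals) from an illustrative
# prior: exponential over region sizes, uniform over locations in a wide range.
n = 200_000
sizes = rng.exponential(scale=1.0, size=n)
centres = rng.uniform(-10.0, 10.0, size=n)
lower, upper = centres - sizes / 2, centres + sizes / 2

# A single consequential stimulus is observed at location 0.
# Keep only the candidate regions consistent with that observation.
keep = (lower <= 0.0) & (0.0 <= upper)
lower, upper = lower[keep], upper[keep]

# Generalization to a novel stimulus y is the probability that y falls inside
# the unknown region, estimated by averaging over the retained samples.
for y in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5):
    g = np.mean((lower <= y) & (y <= upper))
    print(f"distance {y:3.1f}   generalization {g:.3f}")
```

The printed values decay smoothly toward zero as distance grows, which is the qualitative pattern the figure depicts; different reasonable priors over region size change the details but not the roughly exponential shape, and that robustness is the core of Shepard's argument.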

Although brief, Shepard's article has been influential in the cognitive-science literature. It presented no new empirical data, and in substance it is mostly devoted to the derivation of a formal relation between one unobservable quantity (psychological distance) and another (stimulus generalizability). The universal law featured prominently in a special issue of Behavioral and Brain Sciences in 2001 and a first-person retrospective (Shepard, 2004), and for my contribution to this special issue I use it as an example of theory building in psychology.1 The decision to focus on a single contribution to theory is motivated by a desire to look at the particulars rather than speak solely in the abstract, and my decision to ignore disciplines outside of cognitive psychology is motivated by a desire to work toward what Flis (2019) calls "an indigenous epistemology." If psychology is to make theoretical progress we must do so on our own terms. There are limits to what we can learn from the physical sciences.
Some desiderata for scientific theories seem easy to list. A scientific theory should be independent of its creator, for instance. It is difficult to make much use of a theory otherwise. In practice this typically means a theory is mathematical or computational in nature. Likewise, psychological theories should of course make some connection with empirical data, giving an account of the generative mechanism that gave rise to those data. Theories should be usable in the sense of providing other scientists guidance for future research. Other criteria could also be named, including falsifiability, simplicity, compatibility with existing literature, generalizability, predictive ability, and so on. However, although it is easy to list desiderata and even easier to debate which elements of such lists are the most important, such "discussions in the abstract" rarely provide much guidance to the would-be theoretician. From the perspective of the working scientist, it is perhaps more useful to give concrete examples, and to that end I return to an examination of Shepard's (1987) article and the mathematical-psychology literature to which it belongs. I make the following claims: (a) Theory building is not a statistical problem, (b) mathematical formalism is beneficial to theory, (c) measurement and theory have a complex relationship, (d) rewriting old theory can yield new insights, and (e) theoretical growth can drive empirical work that might not otherwise have been considered worthwhile.

Theory Building Is Not a Statistical Problem

When reading Shepard's original 1987 article and the 2004 retrospective, some surprising characteristics of his work on theory stand out. First, the theory development was largely post hoc. The original article does not collect new data, and indeed the main empirical results reported in the article were based on a reanalysis of existing data. Second, the article reports no hypothesis tests. There are no p values, Bayes factors, or any confidence intervals or their Bayesian equivalents. Third, the article does not outline any specific predictions about future experiments. It makes a strong claim that the exponential law should hold broadly but does not prescribe how tests of this prediction should be constructed.

Viewed through the lens of the methodological reform culture documented by Flis (2019), these properties might seem strange and might even amount to a form of "questionable research practice." For instance, in the current zeitgeist it is sometimes argued with considerable vigor (especially on informal forums such as academic Twitter) that strong inferential claims cannot be justified without preregistered confirmatory tests. Shepard's (1987) article does not present any such tests but makes sweeping claims nonetheless. Likewise, one might wonder whether his post hoc theorizing is a form of hypothesizing after results are known (i.e., HARKing). The unwary reader might conclude that Shepard's work is of questionable value: Perhaps cognitive scientists have erred by according this article such high status?

Something seems awry in this description, and few researchers familiar with Shepard's work would endorse it. The problem, I suggest, arises from a subtle way in which the preceding paragraph misrepresents the inferential problems scientists face. Methodological prescriptions relating to confirmatory tests (e.g., Wagenmakers et al., 2012) or post hoc hypotheses (e.g., Kerr, 1998) are narrow in scope: They have been developed to guide statistical inferences about empirical data, and as I have argued before (Navarro, 2019), it is an error to presume that the same logic can be applied to the evaluation of scientific theories.2 To put it another way, the success of Shepard's theoretical work despite the (apparent) failure to meet these statistical prescriptions tells us something about what a theory is not. In my view neither empirical data nor statistical tests can be called a theoretical contribution, and prescriptions deemed sensible for empirical research or data analysis should not be considered suitable for the evaluation of psychological theory.

I suggest that the value for theory in Shepard's article was not the discovery of an exponential law but rather the explanation proposed for it, and theories need to be evaluated (in part) in terms of their explanatory value. For example, Shepard's article did not merely summarize data—it systematized an existing body of empirical findings. It separated aspects of the data that are invariant across studies from those that are not, sifting the wheat from the chaff, so to speak. The sieve that enabled this was a mathematical theory describing regularities in stimulus generalization in terms of simpler primitives. Thus, although Shepard's theory asserts that the form of a generalization curve should be exponential, this exponential form is an entailment of his theory and not its substance.
From the perspective of theory, this is important: If an exponential law were observed in a few terrestrial species with no deeper explanation provided, there would be little reason to believe that such a law might hold with any generality. Such an inference would be statistically unjustifiable, even as a "tentative suggestion." What Shepard does instead is note that an exponential law emerges as an entailment of sufficiently primitive rules that could be reasonably expected to hold in vastly different environments: "I tentatively suggest that because these regularities reflect universal principles of natural kinds and of probabilistic geometry, natural selection may favor their increasingly close approximation in sentient organisms wherever they evolve" (p. 1323).

In other words, his claim to generality arises not from any statistical quantification of the strength of evidence but from the formal structure of the theory. Statistical evidence and theoretical generality are quite different from one another. Statistical tools can tell us what we might expect to happen were an experiment to be precisely replicated in precisely the same context; theoretical tools exist to tell us how to generalize from one context to another. Insofar as all meaningful inferences that a practical scientist cares about are to some extent an act of generalization across contexts, statistical inferences are insufficient to guide scientific judgment. Theory-based inferences are a necessity, not a luxury.

Mathematical Formalism Is Beneficial for Theory

It is perhaps trite to say so, but the defining property of mathematical psychology is the emphasis on formal descriptions of human thought and behavior, either in the form of an abstract mathematical specification or a clearly defined computational model. To many psychologists it might seem strange that such a discipline even exists, but as Luce (1995) puts it, "mathematics becomes relevant to science whenever we uncover structure in what we are studying" (p. 2). If we believe that our empirical results have structure, we should attempt to articulate what that structure is as precisely as we can. It is with this task that mathematical psychology is concerned.

There are a number of reasons why formality is useful to the would-be theoretician, but first among them (in my view) is precision. Consider how Shepard's law of generalization might have looked had he not sought the precision that mathematics affords. My attempt to describe the law itself verbally, using ordinary English language and not substituting any mathematical words, is as follows: If an intelligent agent encounters one thing that has a particular property and encounters another thing and is uncertain whether it possesses that property, then all else being equal the agent will tend to treat those things similarly in regard to the unknown property to the extent that those two things are similar in regard to their known properties, and this tendency will fall away very quickly as this similarity decreases.

Except for that last part—which forms the substantive part of the exponential law—this seems intuitive, but in the stated form it also sounds vacuous and perilously close to tautological. What precisely do I mean when I use the word "similarity"? As philosophers (Goodman, 1972) and psychologists (Medin et al., 1993) alike have noted, the term similarity is not well defined and requires additional constraint to be psychologically meaningful. To make the theory workable, I must elaborate on this verbal definition and try to pin down what I mean by similarity. I also need to pin down what I mean when I refer to the "tendency" to act a certain way. Very quickly one finds that it is difficult to work out what underlying theoretical claim is being made if these claims are stated only in everyday language. Even if the theoretical claim is not entirely vacuous (in this case, if there is something of substance buried within my claim that "the tendency falls away very quickly"), I cannot work out what that substance may be when my theory is stated in this fashion. In other words, without precision it is hard to know what tests and what inferences are licensed by the theory.

Escaping this trap of vagueness is hard, and to illustrate how mathematical formalism can help, it is necessary to introduce some.3 In this article I use g(x, y) to refer to the generalization function: Specifically, g(x, y) is the probability that a newly encountered stimulus y shares a property that is already known to be possessed by a different stimulus x. Using this notation, Shepard's claim can be written in the following form:

    g(x, y) = e^{-λ d(x, y)}    (1)

where the constant e is approximately 2.718 and λ is an unknown parameter of little theoretical interest.4 The quantity of interest here is d(x, y), the "psychological distance" between stimulus x and stimulus y. Written like this, the theory's claim starts to become clearer: If it is possible to measure both the psychological distance d(x, y) and the strength of generalization g(x, y) in a defensible way, then we should expect a very specific nonlinear relationship to emerge between the two. Already some of the value of the theory should be clear. It tells us which measurement problems we need to solve.
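Read as code, Equation 1 is nothing more than the snippet below. The distances and the value of λ are invented purely for the purpose of the example; the only substantive point is the signature the law implies, namely that log-generalization should be a linear function of psychological distance with slope -λ.

```python
import numpy as np

def shepard_generalization(d, lam):
    """Shepard's law (Eq. 1): generalization decays exponentially with
    psychological distance d; lam plays the role of the parameter λ."""
    return np.exp(-lam * np.asarray(d, dtype=float))

# made-up psychological distances between a known stimulus and five new ones
d = np.array([0.2, 0.5, 1.0, 1.8, 2.5])
g = shepard_generalization(d, lam=1.3)

# on a log scale the law is a straight line through the origin, log g = -λ d,
# so a linear fit to (d, log g) recovers λ from these noise-free values
slope, intercept = np.polyfit(d, np.log(g), deg=1)
print(np.round(g, 3))      # predicted generalization strengths
print(round(-slope, 2))    # ≈ 1.3, the λ used to generate the values
```

This restates the equation rather than testing it, but it makes explicit that any empirical test requires distance measurements on at least an interval scale, which is exactly the measurement problem taken up below.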
The value of this should not be understated: Knowing what quantities need to be measured is of considerable importance to psychologists, and knowing when approximate measurements are "good enough" is similarly critical. In the generalization context, if the researcher can obtain only ordinal-scale information about psychological distances, then Shepard's law yields no predictions at all about the corresponding generalizations. Indeed, to the extent that one goal in methodological reform is to encourage researchers to be more precise in stating the contexts to which they believe their results may generalize (Simons et al., 2017), it is advantageous to have precisely stated theory to guide them. To comment sensibly on how an empirical result might be expected to generalize (or not) beyond the original context, one needs to know which properties of the sample or the study can be deemed inductively relevant to the new context. Formal theory helps by providing the researcher with guidance as to what matters and what does not. Indeed, Shepard's description of the generalization problem facing every learner seems pointedly appropriate to the generality problem facing psychological scientists:

    We generalize from one situation to another not because we cannot tell the difference between the two situations but because we judge that they are likely to belong to a set of situations having the same consequence. Generalization, which stems from uncertainty about the distribution of consequential stimuli in psychological space, is thus to be distinguished from failure of discrimination, which stems from uncertainty about the relative locations of individual stimuli in that space. (p. 1322)

If we hope to make sound generalizations as scientists, we must know what theoretical space attaches to our empirical work: My modest suggestion is that formal mathematical theories are the method by which we can do so.

Measurement and Theory Have a Complicated Relationship

Let us turn next to the question of measurement and its relation to theory. If one hopes to obtain empirical support for a theoretical claim, it must be tethered in some way to observational or experimental data. To accomplish this, one must have an appropriate measurement tool. For example, one of the key insights in Shepard's (1987) article is the recognition that although stimulus-generalization functions can be extremely irregular in form when we measure distance in "objective" terms, they are often very smooth when measured in more subjective terms: Color generalizations are predictable with respect to the appropriate color space (e.g., Ekman, 1954), tones are regular when described in an appropriate perceptual space, and so on. In retrospect this seems obvious, but at the time Shepard developed the theory he was faced with a substantive problem of how to extract the appropriate stimulus representation to which the theory might be applied. Setting aside the justifications for his choices, nonmetric multidimensional scaling (MDS; Kruskal, 1964) served as a measurement model for Shepard in 1987, and his analyses all use MDS-estimated psychological spaces to supply the relevant measure of distance.

As this discussion illustrates, the measurement instrument and the development of theory were tightly linked. Without MDS as a measurement tool Shepard would have found it almost impossible to formulate the empirical regularity of interest with any confidence. However, it is equally clear that MDS is merely a tool used to help define the phenomenon to be explained. It can be used to supply an approximate measure of the psychological distance d(x, y) between two stimuli, but it does not itself explain why a measure of stimulus generalization g(x, y) should diminish exponentially as a function of this distance. Although MDS and other latent-variable models (e.g., factor analysis) can be useful tools for organizing our measurements in a statistically meaningful way, we should not mistake them for psychological theory.
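As a concrete sketch of this division of labor, the snippet below pairs a measurement model with the theory. It is not Shepard's analysis: it uses a classical (metric) Torgerson MDS written in a few lines of NumPy rather than the nonmetric MDS he relied on, and the "behavioral" dissimilarities are simulated from an arbitrary two-dimensional configuration purely for illustration. The MDS step supplies an estimate of d(x, y); the exponential mapping to g(x, y) in the final lines is what the theory, not the measurement model, contributes.

```python
import numpy as np

rng = np.random.default_rng(0)

def classical_mds(dissimilarity, n_dims=2):
    """Classical (Torgerson) MDS: embed items so that Euclidean distances in
    the recovered space approximate the supplied dissimilarities."""
    D2 = np.asarray(dissimilarity, dtype=float) ** 2
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_dims]    # keep the largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# a made-up "true" psychological space for eight stimuli
truth = rng.uniform(0, 3, size=(8, 2))
true_d = np.linalg.norm(truth[:, None, :] - truth[None, :, :], axis=-1)

# pretend these are behavioural dissimilarity data; recover a space from them
coords = classical_mds(true_d, n_dims=2)
est_d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

# the theory (not MDS) then predicts generalization from recovered distances
lam = 1.0
g = np.exp(-lam * est_d)
print(np.round(g[0], 3))   # predicted generalization from stimulus 0 to the rest
```

Swapping in a different measurement model (nonmetric MDS, or no spatial model at all) leaves that final theoretical step untouched, which is the point the next paragraph makes.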
To illustrate the latter point, it is notable that in the stimulus-generalization literature it quickly became apparent that Shepard's law applies even in situations in which MDS does not: Shortly after the publication of Shepard's original article, Russell (1988) demonstrated that the same law holds for stimuli defined in terms of discrete features as well as for the continuous spaces for which Shepard's work was defined, a connection that was later extended by Tenenbaum and Griffiths (2001). Although the theoretical framework could not have come into existence without the scaffolding provided by the MDS measurement model, it quickly outgrew any need for this support. Many of the generalization problems discussed by Tenenbaum and Griffiths cannot be described with respect to any metric space extracted by MDS but are nevertheless consistent with Shepard's theory. In other words, although the measurement model supplied by MDS played a central role in developing theories of generalization, those theories are no longer dependent on MDS in any meaningful sense.

Rewriting Old Theory Can Provide New Insight

The specific mathematical form that Shepard used to implement his ideas is not unique, and the theory can be rewritten in a different notation. Cooper and Guest (2014) argued that work on theory need not be constrained to a particular "implementation" (or formalism) but is better captured by a more abstract notion of a "specification." As a concrete example, it is worth considering the manner in which Shepard's law was later reformulated by Tenenbaum and Griffiths as an (explicitly) Bayesian model and the effect this rewriting had on how the theory could be applied.
To illustrate what I mean here, it is worth considering how Bayesian cognitive models are typically described in the cognitive-science literature. It is typical now to introduce such a model by first saying "we propose to treat [psychological problem of interest] as a Bayesian inference problem" and then introducing the formula for Bayes's rule:

    P(h|x) = P(x|h)P(h) / P(x).    (2)

It would then be explained that P(h) defines the learner's prior degree of belief in some hypothesis h about the world, whereas P(h|x) is the posterior belief in that hypothesis after the learner encounters the information embodied by x, whatever x may happen to be in the specific application at hand. Next it would be noted that the likelihood term P(x|h) denotes the probability of the learner observing x if hypothesis h were true. The normalizing constant P(x) is also explained, additional context is filled in, and the end result is an abstract specification for a mathematical model.5

One finds nothing of the kind in Shepard's (1987) article. None of the "standard" notation is used, and there is no explicit appeal to Bayes's rule in the text. Instead, all that one finds is a discussion of "consequential regions" of unknown size, probability measures that are not entirely easy to understand for the casual reader, and so on. It does not look like a Bayesian model in the sense that cognitive modelers would easily recognize 30 years later. I can certainly attest to the fact that I did not perceive the connection to Bayesian learning until Tenenbaum and Griffiths (2001) recast Shepard's formalism using a different notation, expressing the same ideas rather differently.

The contribution to theory of Tenenbaum and Griffiths (2001) is worth expanding on because I think it was instrumental in allowing Shepard's theory to be extended beyond the original stimulus-generalization context. Whereas Shepard referred to the notion of a consequential region located within a psychological space—with all of the geometric connotations that this space entails—Tenenbaum and Griffiths took a more general view and framed their analysis in terms of "consequential sets." Moreover, any specific candidate for the true consequential set was labeled a "hypothesis" h and considered part of a broader "hypothesis space" H, and the underlying problem of generalizing from one stimulus to another could be recast as Bayesian reasoning about (collections of) such hypotheses.
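The move from geometric "consequential regions" to hypotheses about "consequential sets" is easy to express in code. The toy sketch below is not the Tenenbaum and Griffiths model itself; the four stimuli, the hand-picked hypothesis space, the uniform prior, and the weak-sampling likelihood are all assumptions chosen purely for illustration. What it shows is the basic recipe: compute P(h|x) by Bayes's rule and then obtain g(x, y) by summing the posterior probability of the hypotheses that contain y.

```python
# Toy Bayesian generalization over discrete "consequential sets" (hypotheses).
stimuli = ["a", "b", "c", "d"]
hypotheses = [{"a"}, {"a", "b"}, {"a", "b", "c"}, {"a", "b", "c", "d"}, {"c", "d"}]
prior = {frozenset(h): 1 / len(hypotheses) for h in hypotheses}   # uniform prior

def posterior(observed, prior):
    """Weak-sampling posterior: hypotheses inconsistent with the observed
    consequential stimuli are eliminated; the rest keep their prior weight."""
    weights = {h: p for h, p in prior.items() if set(observed) <= h}
    total = sum(weights.values())
    return {h: p / total for h, p in weights.items()}

def generalize(y, post):
    """g(x, y): the posterior probability that y shares the consequence, i.e.
    the summed weight of the consequential sets that contain y."""
    return sum(p for h, p in post.items() if y in h)

post = posterior(["a"], prior)            # a single consequential stimulus, "a"
print({y: round(generalize(y, post), 2) for y in stimuli})
# prints {'a': 1.0, 'b': 0.75, 'c': 0.5, 'd': 0.25}
```

Because nothing in this recipe requires the hypotheses to be regions of a metric space, the same few lines apply equally well to discrete-feature stimuli, which is one reason the reformulation travels so far beyond the original stimulus-generalization setting.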
The Bayesian reformulation of Shepard's theory that Tenenbaum and Griffiths presented allowed them to generalize Shepard's theory in three distinct ways. First, as mentioned earlier, they showed (much like Russell, 1988) that Shepard's theory could encompass stimuli that were not representable as points in a geometric space: In their notation, this is accomplished by substituting a new hypothesis space H. Second, this formulation allowed the theory to naturally accommodate inductive-generalization problems in which the learner has encountered more than one consequential stimulus. Earlier approaches for allowing the model to account for multi-item generalization (e.g., Shepard & Kannappan, 1991) were not quite so adaptable.

Finally, this formalism called attention to a potentially limiting assumption in Shepard (1987). Shepard argued that "in the absence of any information to the contrary, an individual might best assume that nature selects the consequential region and the first stimulus independently" (p. 1321). This so-called weak sampling assumption places strong constraints on the inferences that the learner can make, and when formally instantiated within the model it leads to a situation in which the learner necessarily behaves like a naive falsificationist: The only role that observed stimuli x can play is indicating which hypotheses h are consistent with the observations and which are not. Nevertheless, this is by no means the only assumption a sensible reasoner might make, and by highlighting Shepard's assumption more clearly, Tenenbaum and Griffiths (2001) allowed later work to explore alternative sampling models that allow the reasoner to use the stimulus information in a more sophisticated manner (e.g., Hayes et al., 2019; Shafto et al., 2014). Each of these insights has led to new empirical and theoretical work, a point that I expand on in the next section.
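The role of the sampling assumption is easiest to see in the likelihood term. Continuing the same toy setup (again with made-up hypotheses and a uniform prior), the sketch below contrasts weak sampling, in which any consistent hypothesis receives likelihood 1, with a strong-sampling alternative, in which each observed example is assumed to be drawn from the consequential set itself and therefore contributes a likelihood of 1/|h|.

```python
# Weak versus strong sampling in a toy space of "consequential set" hypotheses.
hypotheses = [{"a"}, {"a", "b"}, {"a", "b", "c"}, {"a", "b", "c", "d"}]

def posterior(observed, weak=True):
    """Posterior over hypotheses given observed consequential stimuli.
    Weak sampling: every consistent hypothesis has likelihood 1.
    Strong sampling: each observation has likelihood 1/|h| under hypothesis h."""
    weights = {}
    for h in hypotheses:
        if not set(observed) <= h:
            continue                                  # inconsistent: ruled out
        weights[frozenset(h)] = 1.0 if weak else (1.0 / len(h)) ** len(observed)
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

def generalize(y, post):
    # g(x, y): summed posterior weight of the hypotheses that contain y
    return sum(p for h, p in post.items() if y in h)

for label, weak in [("weak  ", True), ("strong", False)]:
    post = posterior(["a", "a", "a"], weak=weak)      # "a" observed three times
    print(label, {y: round(generalize(y, post), 2) for y in "abcd"})
```

Under weak sampling, observing the same stimulus three times changes nothing, because observations can only rule hypotheses out; under strong sampling, the repeated observations concentrate the posterior on small hypotheses and generalization tightens sharply. Alternative sampling models of the kind explored in later work amount to further, more psychologically motivated choices for this likelihood term.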
Theory Growth Can Drive Experimental Innovation

The final point I want to make pertains to the relationship between theoretical growth and empirical innovation. I have heard it suggested on occasion that psychology needs to solve its empirical problems first and only then consider how to construct good theory. I am less than convinced by such claims and hope to illustrate in this section why the two problems go hand in hand, again using the stimulus-generalization theories introduced by Shepard (1987) and Tenenbaum and Griffiths (2001) as my example.
problems. Shepard’s original construction, although pur- in Navarro (2019), I was able to resolve much of this
ported to be a very general law itself, was formulated uncertainty: Most of our experimental findings were
with respect to a narrow class of psychological prob- indeed consistent with the theory, but some were
lems: inductive generalization from a single observation. emphatically not. By adopting a mathematically precise,
Moreover, because the origins of his work lay in the theory-motivated approach to exploring this phenom-
study of human perception and the animal-learning lit- enon, my colleagues and I were able to obtain clarity
erature, it was not immediately clear—at least it was about what we were seeing in our empirical data. I
not clear to me—how the theory should be extended know of no other process that would have allowed me
to higher-order cognition. The reformulation offered by to do so.
Tenenbaum and Griffiths made it quite apparent that
Shepard’s original theory is a special case of a broader
A Word of Warning
class of Bayesian generalization models. By abstracting
away from the specific problem Shepard’s theory sought In this article I have argued that the toolkit provided
to explain and casting it in a language (Bayesian infer- by mathematical psychology can be a powerful aid to
ence) that is naturally extensible to new problems, I those seeking to build psychological theories. I would
was able to see how I could extend Shepard’s theory be remiss, however, if I did not comment on the limita-
on my own. tions to this approach. As a mathematical psychologist
Perhaps the cleanest example of this interplay in my studying human inductive reasoning, what I want is a
own research is the work presented by Hayes et  al. “mathematical theory of human reason” that explains
(2019), which was motivated by a puzzling finding pre- the entire psychological process of human reasoning
sented by Lawson and Kalish (2009) in which people about underconstrained problems. However, my skill
appeared to solve inductive-reasoning problems differ- and knowledge are both limited, and I cannot fathom
ently depending on how the information in the reason- what class of theoretical models might be applicable
ing problem was selected. At the time the original work to the entire psychological process at hand. Nor can I
was presented, no clear explanation for why people think of a way to circumscribe the scientific problem
would do this was available, so we considered the pos- in a fashion that allows me to render the entire domain
sibility that—following Tenenbaum and Griffiths’s of human reason subject to any kind of direct measure-
observation that from a statistical-learning perspective ment. This limitation has consequences. My experiment
inductive generalization should depend on the learner’s is a measurement tool that captures some aspects to
beliefs about how information is selected—the earlier human reasoning but inevitably confounds it with the
results by Lawson and Kalish (2009) represented the measurement of some other phenomena. If I try to
same kind of effect. account theoretically for all things in my data I must
The process I followed when adapting the theory to provide an account of these unknown things as well as
a new context may be informative. In my first pass at the thing I am trying to study. But if my experiment is
adapting the theory (Hayes et al., 2017), I constructed too complex then these unknown things will them-
a model that was only very slightly different from the selves become quite complex, leading to the risk that
Tenenbaum and Griffiths version and used it to derive any theoretical explanation I construct is little more
qualitative predictions regarding what kind of empirical than wild speculation.
manipulations should be expected to modulate the When faced with this concern, a sensible but poten-
effect reported by Lawson and Kalish (2009). My col- tially dangerous strategy is to make the task simpler.
leagues and I then undertook a series of experimental Make the task so small and so simple that we actually
tests, reported in Hayes et  al. (2019), showing that can write down models that specify precise assump-
under some circumstances (not all) the effects predicted tions about every aspect of the task. This may lead to
by (my trivial adaptation of) the Tenenbaum and Griffiths better theoretical models, but it may come at the price
model occur almost exactly as expected. However, from of limiting their theoretical scope to an unreasonable
my perspective this initial work was unsatisfying: extent. It is inconvenient, perhaps, but it remains true
Because the new experimental results involved a very that our theoretical models are defined with respect to
different design to the kind of “stimulus-generalization” simplified “toy worlds”; humans, however, must occupy
tasks with which Shepard was originally concerned, it the real one. If we emphasize formal rigor too much
was difficult to be certain which aspects of our data (and adapt all of our measurements to let us satisfy
could be explained as a “sampling effect” and which these demands) the experimental paradigms may
could not. This led me to a develop a more substantive become ossified and highly restricted, adapted to suit
modification of the Tenenbaum and Griffiths model, 6 only those phenomena that we know how to model in
and following the model-evaluation procedure outlined full. This can be dangerous insofar as it provides an
This is not a novel observation: For example, Hacking (1992) argued that over time the laboratory sciences can create a self-vindicating system by building theories and methods that are "mutually adjusted to each other" and cannot be falsified, quite irrespective of their real-world utility: "The theories of the laboratory sciences are not directly compared to 'the world'; they persist because they are true to phenomena produced or even created by apparatus in the laboratory and are measured by instruments we have engineered" (p. 30).

Theory-inclined psychologists should not shy away from the concerns this raises. When seeking to develop theories, one should take some care to reflect on how the perspective from theory may serve to circumscribe the problem at hand in too narrow a way. Precisely because mathematical models are hard to build and experimental paradigms are easy to simplify, those of us who advocate formal theory building must, I suggest, be especially wary of this trap.

Conclusion

Mathematical psychology is something of an oddity in the discipline. It does not eschew empirical research, but neither does it view the goal of psychological science to be the accrual of empirical effects. Quite unlike most areas of psychology with which I am familiar, mathematical psychologists place a high value on theory development, particularly when such theories can be stated in a formal manner. My goal in this article was to highlight the manner in which cumulative work on theory has developed in this discipline, using Shepard's law as an example. From its origins in associative learning and stimulus generalization to its reformulation as a Bayesian model and its extension to a variety of novel contexts, a single theoretical claim can be shown to connect to a variety of empirical findings in superficially distinct domains.

Although I have focused on Shepard's law and its extensions in this article, I suspect that the underlying pattern is quite general. I could have chosen the Rescorla-Wagner model of associative learning as the basis for this discussion (Rescorla & Wagner, 1972) or the generalized context model of human categorization (Nosofsky, 1986). I could have chosen to focus on models such as ALCOVE (attention learning covering map) that sought to unify associative learning and categorization (Kruschke, 1992) or models such as the hierarchical Dirichlet process that sought to unify various category-learning models within a common theoretical language (Griffiths et al., 2007). I could have revisited Ebbinghaus's work on memory (Ebbinghaus, 1885/1913). I could have examined sequential sampling models of choice reaction time (Luce, 1986) and the rich theoretical tradition that mathematical psychologists have developed in that domain. In each of these areas psychologists have been slowly and carefully building psychological theories. The work is painstaking and slow and the articles are often difficult to read, but I would argue that the development of theory in this domain has been genuinely cumulative.

These advances in theory have something in common. In each of these areas psychological scientists have built up a considerable body of theoretical knowledge that is instantiated in formal models of psychological processes. In every case the underlying theoretical models are more than mere summaries of empirical results and more substantive than a mere statistical model. In all cases the formalism can be used to generate novel predictions in experimental paradigms that differ markedly from the experimental contexts used to develop the model (and, remarkably, some of those predictions have even turned out to be correct). By judiciously combining abstraction and formalism, mathematical psychologists have been able to develop a toolkit that allows anyone to derive theory predictions in completely novel paradigms. If it is indeed the case that psychology suffers from a kind of "theoretical amnesia" (Borsboom, 2013), perhaps the machinery of mathematical psychology can aid its memory. Perhaps fittingly, the words of Shepard (1987) seem an appropriate way to conclude:

    Undoubtedly, psychological science has lagged behind physical science by at least 300 years. Undoubtedly, too, prediction of behavior can never attain the precision for animate bodies that it has for celestial bodies. Yet psychology may not be inherently limited merely to the descriptive characterization of the behaviors of particular terrestrial species. Possibly, behind the diverse behaviors of humans and animals, as behind the various motions of planets and stars, we may discern the operation of universal laws. (p. 1323)

Transparency
Action Editors: Travis Proulx and Richard Morey
Advisory Editor: Richard Lucas
Editor: Laura A. King
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

ORCID iD
Danielle J. Navarro https://orcid.org/0000-0001-7648-6578
Acknowledgments

This article grew out of numerous conversations with several people, most notably Berna Devezer, to whom I am deeply indebted and without whose thoughtful contribution this article would not exist. I would also like to thank Richard Morey, Olivia Guest, and an anonymous reviewer for thoughtful (and kind) comments on the initial version of the manuscript, which was submitted in a less-than-polished form because of the outbreak of COVID-19. Source material associated with this article is available on GitHub (https://github.com/djnavarro/shepard-theory) and OSF (https://osf.io/7cvtk).

Notes

1. It should be noted that I do not discuss the empirical evidence for (or against) Shepard's theory. It is not my intent to argue for any specific theory, so much as to describe some of the processes that go into constructing, extending, and evaluating one. Most theories are, of course, wrong. I would not be surprised if Shepard's work (or indeed my own) turns out to be misguided. That is not the point of this article. The point is to present my views on what psychological theories are and how they can be useful.
2. The reader may wonder then if I am constructing a strawman argument by implying that theory building might be dismissed on this basis. All I can say in response is that I have received reviews in recent years (including by open-science advocates) that have accused me of questionable research practice precisely because my theoretical work does not meet these statistical criteria. It is easy to claim that no one would fall prey to the fallacy of conflating statistical with theoretical claims, but it does happen, and I believe there is value to pointing out the error in the open literature rather than arguing it invisibly in the review process.
3. It is worth noting that doing so is often viewed as a risky proposition in psychology. In my experience journal editors and reviewers are less likely to accept a manuscript that contains formal exposition, will often ask for such things to be removed or relegated to supplemental materials or appendices, or will even recommend that manuscripts be sent to a more specialized journal. Although I have been as guilty of this practice as anyone else, I am of the view that the hostility of institutional gatekeepers to mathematical methods in psychology is part of the very problem that needs to be addressed.
4. This is not quite true. The specificity parameter λ describes how quickly the generalization gradient falls away as a function of distance, and there are many situations in which the researcher may care primarily about how λ changes across contexts. However, those situations were not the focus of Shepard's work.
5. A complete discussion of Bayesian cognitive modeling is beyond the scope of this brief article. See Perfors et al. (2011) for a tutorial introduction.
6. Crudely put, I modified the hypothesis space H: instead of each h corresponding to a single "consequential set" indicating which stimuli possess an unknown property, the Hayes et al. (2019) model is probabilistic, and a hypothesis h is a function defined over the stimulus space that describes the probability with which stimuli possess an unknown property.
References

Borsboom, D. (2013, November 20). Theoretical amnesia. Center for Open Science. http://osc.centerforopenscience.org/2013/11/20/theoretical-amnesia
Cooper, R. P., & Guest, O. (2014). Implementations are not specifications: Specification, replication and experimentation in computational cognitive modeling. Cognitive Systems Research, 27, 42–49.
Ebbinghaus, H. (1913). Memory: A contribution to experimental psychology (H. Ruger & C. Bussenius, Trans.). Teachers College. (Original work published 1885)
Ekman, G. (1954). Dimensions of color vision. The Journal of Psychology, 38(2), 467–474.
Flis, I. (2019). Psychologists psychologizing scientific psychology: An epistemological reading of the replication crisis. Theory & Psychology, 29(2), 158–181.
Goodman, N. (1972). Seven strictures on similarity. In N. Goodman (Ed.), Problems and projects (pp. 437–446). Bobbs-Merrill.
Griffiths, T., Canini, K., Sanborn, A., & Navarro, D. (2007). Unifying rational models of categorization via the hierarchical Dirichlet process. In D. S. McNamara & J. G. Trafton (Eds.), Proceedings of the 29th annual conference of the Cognitive Science Society (pp. 323–328). Psychology Press.
Hacking, I. (1992). The self-vindication of the laboratory sciences. In A. Pickering (Ed.), Science as practice and culture (pp. 29–64). University of Chicago Press.
Hayes, B. K., Banner, S., Forrester, S., & Navarro, D. J. (2019). Selective sampling and inductive inference: Drawing inferences based on observed and missing evidence. Cognitive Psychology, 113, Article 101221. https://doi.org/10.1016/j.cogpsych.2019.05.003
Hayes, B. K., Banner, S., & Navarro, D. J. (2017). Sampling frames, Bayesian inference and inductive reasoning. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. J. Davelaar (Eds.), Proceedings of the 39th annual conference of the Cognitive Science Society (pp. 488–493). Cognitive Science Society. https://cognitivesciencesociety.org/wp-content/uploads/2019/01/cogsci17_proceedings.pdf
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2(3), 196–217.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99(1), 22–44.
Kruskal, J. B. (1964). Nonmetric multidimensional scaling: A numerical method. Psychometrika, 29(2), 115–129.
Lawson, C. A., & Kalish, C. W. (2009). Sample selection and inductive generalization. Memory & Cognition, 37(5), 596–607.
Luce, R. D. (1986). Response times: Their role in inferring elementary mental organization. Oxford University Press.
Luce, R. D. (1995). Four tensions concerning mathematical modeling in psychology. Annual Review of Psychology, 46(1), 1–27.
Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for similarity. Psychological Review, 100(2), 254–278.
Navarro, D. J. (2019). Between the devil and the deep blue sea: Tensions between scientific judgement and statistical model selection. Computational Brain & Behavior, 2(1), 28–34. https://doi.org/10.1007/s42113-018-0019-z
Nosofsky, R. M. (1986). Attention, similarity, and the identification–categorization relationship. Journal of Experimental Psychology: General, 115(1), 39–57.
Perfors, A., Tenenbaum, J. B., Griffiths, T. L., & Xu, F. (2011). A tutorial introduction to Bayesian models of cognitive development. Cognition, 120(3), 302–321.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64–99). Appleton-Century-Crofts.
Russell, S. (1988). Analogy by similarity. In D. H. Helman (Ed.), Analogical reasoning (pp. 251–269). Springer.
Shafto, P., Goodman, N. D., & Griffiths, T. L. (2014). A rational account of pedagogical reasoning: Teaching by, and learning from, examples. Cognitive Psychology, 71, 55–89.
Shepard, R. N. (1987). Toward a universal law of generalization for psychological science. Science, 237(4820), 1317–1323.
Shepard, R. N. (2004). How a cognitive psychologist came to seek universal laws. Psychonomic Bulletin & Review, 11(1), 1–23.
Shepard, R. N., & Kannappan, S. (1991). Toward a connectionist implementation of a theory of generalization. Advances in Neural Information Processing Systems, 3, 665–671.
Simons, D. J., Shoda, Y., & Lindsay, D. S. (2017). Constraints on generality (COG): A proposed addition to all empirical papers. Perspectives on Psychological Science, 12(6), 1123–1128. https://doi.org/10.1177/1745691617708630
Tenenbaum, J. B., & Griffiths, T. L. (2001). Generalization, similarity, and Bayesian inference. Behavioral and Brain Sciences, 24(4), 629–640.
Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7, 632–638. https://doi.org/10.1177/1745691612463078
