by
Richard C. Atkinson and William K. Estes
TECHNICAL REPORT NO. 48
July 1, 1962
PSYCHOLOGY SERIES
Reproduction in Whole or in Part is Permitted for
any Purpose of the United States Government
INSTITUTE FOR MATHEMATICAL STUDIES IN THE SOCIAL SCIENCES
Applied Mathematics and Statistics Laboratories
STANFORD UNIVERSITY
Stanford, California
STIMULUS SAMPLING THEORY¹
by
Richard C. Atkinson and William K. Estes
¹Preparation of this chapter was supported in part by Contract
Nonr-908(16) between the Office of Naval Research and Indiana
University, Contract M-5184 between the National Institute of
Mental Health and Stanford University, and Contract Nonr-225(17)
between the Office of Naval Research and Stanford University.
This is a draft of a chapter for the Handbook of Mathematical
Psychology, edited by R. R. Bush, E. Galanter, and R. D. Luce.
TABLE OF CONTENTS

Section                                                              Page
1. Introduction . . . 1
2. One-Element Models . . . 5
   2.1 Learning of a single stimulus-response association . . . 6
   2.2 Paired-associate learning . . . 10
   2.3 Probabilistic reinforcement schedules . . . 28
3. Pattern Models . . . 47
   3.1 General formulation . . . 47
   3.2 Treatment of the simple noncontingent case . . . 63
   3.3 Analysis of a paired-comparison learning experiment
4. A Component Model for Stimulus Compounding and
   Generalization . . . 111
   4.1 Basic concepts; conditioning and response axioms . . . 111
   4.2 Stimulus compounding . . . 112
   4.3 Sampling axioms and major response theorem of fixed
       sample size model . . . 121
   4.4 Interpretation of stimulus generalization . . . 124
5. Component and Linear Models for Simple Learning . . . 135
   5.1 Component models with fixed sample size . . . 135
   5.2 Component models with stimulus fluctuation . . . 155
   5.3 The linear model as a limiting case . . . 168
   5.4 Applications to multiperson interactions . . . 182
6. Discrimination Learning . . . 192
   6.1 The pattern model for discrimination learning . . . 193
   6.2 A mixed model . . . 199
   6.3 Component models . . . 209
   6.4 Analysis of a signal detection experiment . . . 211
   6.5 Multiple process models . . . 221
REFERENCES . . . 236
LIST OF TABLES

Table                                                                Page
1. Observed and predicted (one-element model) values for response
   sequences over first three trials of a paired-associate
   experiment . . . 9a
2. Observed and predicted values for p_k, the probability of no
   errors following the kth success. (Interpret p_0 as the
   probability of no errors at all during the course of learning) . . . 27a
3. Predicted (pattern model) and observed values of sequential
   statistics for final 100 trials of the .6 series . . . 74a
4. Theoretical and observed asymptotic choice proportions for
   paired-comparison learning experiment . . . 104a
5. Theoretical and observed proportions of response A_1 to triads
   of lights in stimulus compounding test . . . 117a
6. Observed joint response proportions for RTT experiment and
   predictions from linear retention-loss model and sampling
   model . . . 174a
7. Predicted and observed asymptotic response probabilities for
   visual detection experiment . . . 216a
8. Predicted and observed asymptotic sequential response
   probabilities in visual detection experiment . . . 219a
9. Predicted and observed asymptotic response probabilities in
   observing response experiment . . . 232a
LIST OF ILLUSTRATIONS

Figure                                                               Page
1. The average probability of an error on trial n in Bower's
   paired-associate experiment . . . 19a
2. Branching process, starting from state C_1 on trial n, for
   one-element model in two-choice, noncontingent case . . . 32a
3. Branching process, starting from state C_0 on trial n, for
   one-element model in two-choice, noncontingent case . . . 33a
4. Branching process for one-element model in two-choice,
   contingent case
5. Branching process for N-element model on a trial when the
   subject starts in state C_i and an E_1 event occurs . . . 51a
6. Branching process for a diad probability on a paired-comparison
   learning trial . . . 99a
7. Branching process for a triad probability on a paired-comparison
   learning trial . . . 100a
8. Observed proportion of A_1 responses in successive 20-trial
   blocks for paired-comparison experiment . . . 103a
9. Generalization from a training stimulus, S_a, to a test
   stimulus, S_b, at several stages of training. The parameters
   are the proportion of overlap between S_a and S_b (.5) and the
   probability of response A_1 to S_a prior to training (.1) . . . 126a
10. Branching process, starting in state <1122>, for a single
    trial in the two-process discrimination learning model . . . 229a
A. and E. 1
1. INTRODUCTION
Stimulus sampling theory is concerned with providing a mathematical
language in which may be expressed assumptions about learning and
performance in relation to stimulus variables. A special advantage
of the formulations to be discussed is that their mathematical properties
permit application of the simple and elegant theory of Markov chains
(Feller, 1957; Kemeny, Snell, and Thompson, 1957; Kemeny and Snell, 1959)
to the tasks of deriving theorems and generating statistical tests of the
agreement between assumptions and data. This branch of learning theory
has developed in close interaction with certain types of experimental
analysis; consequently it will be both natural and convenient to organize
this presentation around the theoretical treatments of a few standard
reference experiments.
At the level of experimental interpretation, most contemporary
learning theories utilize a common conceptualization of the learning
situation in terms of stimulus, response, and reinforcement. The stimulus
term of this triumvirate refers to the environmental situation with respect
to which behavior is being observed, the response term to the class of
observable behaviors whose measurable properties change in some orderly
fashion during learning, and the reinforcement term to the experimental
operations or events believed to be critical in producing learning. Thus,
in a simple pairedassociate experiment concerned with the learning of
English equivalents to Russian words, the stimulus might consist in
presentation of the printed Russian word alone, the response measure in
the relative frequency with which the learner is able to supply the English
equivalent from memory, and reinforcement in paired presentation of the
stimulus and response words.
In other chapters of this volume, and in the general literature on
learning theory, the reader will encounter the notions of sets of responses
and sets of reinforcing events. In the present chapter, mathematical sets
will be used to represent certain aspects of the stimulus situation. It
should be emphasized from the outset, however, that the mathematical models
to be considered are somewhat abstract and that the empirical interpretations
of stimulus sets and their elements are not to be considered fixed
and immutable. Two main types of interpretation will be discussed: in
one of these the empirical correspondent of a stimulus element is the full
pattern of stimulation effective on a given trial, in the other the
correspondent of an element is a component, or aspect, of the full pattern
of stimulation. In the former case, we speak of "pattern models" and in
the latter of "component models" (Estes, 1959b).
There are a number of ways in which characteristics of the stimulus
situation are known to affect learning and transfer. Rates and limits of
conditioning and learning generally depend upon both stimulus magnitude,
or intensity, and upon stimulus variability from trial to trial. Retention
and transfer of learning depend upon the similarity, or communality, between
the stimulus situations obtaining during training and during the test for
retention or transfer. These aspects of the stimulus situation can be
given direct and natural representations in terms of mathematical sets
and relations between sets.
The basic notion common to all stimulus sampling theories is the
conceptualization of the totality of stimulus conditions that may be
effective during the course of an experiment in terms of a mathematical
set. Although it is not a necessary restriction, it is convenient for
mathematical reasons to deal only with finite sets, and this limitation
will be assumed throughout our presentation. Stimulus variability is
taken into account by assuming that of the total population of stimuli
available in an experimental situation, generally only a part actually
affects the subject on any one trial. Translating this idea into the
terms of a stimulus sampling model, one may represent the total population
by a set of "stimulus elements" and the stimulation effective on
any one trial by a sample from this set. Many of the simple mathematical
properties of the models to be discussed arise from the assumption that
these trial samples are drawn randomly from the population, with all
samples of a given size having equal probabilities. Although it is
sometimes convenient and suggestive to speak in such terms, one should
not assume that the stimulus elements are to be identified with any
simple neurophysiological unit, as, for example, receptor cells. At
the present stage of theory construction, we mean to assume only that
certain properties of the set-theoretical model represent certain
properties of the process of stimulation. If these assumptions prove
to be sufficiently well substantiated when the model is tested against
behavioral data, then it will be in order to look for neurophysiological
variables which might underlie the correspondences.
Just as the ratio of sample size to population size is a natural
way of representing stimulus variability, sample size per se may
be taken as a correspondent of stimulus intensity, and the amount
of overlap (i.e., proportion of common elements) between two stimulus
sets may be taken to represent the degree of communality between
two stimulus situations.
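These set-theoretic correspondences are easy to make concrete. The sketch below is illustrative only: the particular element sets, the sample size, and the convention of measuring overlap relative to the training set are our choices, not the chapter's.

```python
# Two hypothetical stimulus populations, represented as sets of element labels.
S_a = set(range(0, 20))      # training situation: elements 0-19
S_b = set(range(10, 30))     # test situation: elements 10-29

# Ratio of sample size to population size as an index of stimulus variability.
sample_size = 5
variability_index = sample_size / len(S_a)

# Overlap: proportion of common elements, here taken relative to S_a,
# as a correspondent of the communality between the two situations.
overlap = len(S_a & S_b) / len(S_a)
```

Half the elements of S_a also belong to S_b, so this pair of sets would represent two moderately similar stimulus situations.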
Our concern in this chapter is not to survey the rapidly
developing area of stimulus sampling theory, but simply to present
some of the fundamental mathematical methods and illustrate their
applications. For general background, the reader is referred to
Bush (1960), Bush and Estes (1959), Estes (1959a, 1962), and Suppes
and Atkinson (1960). We shall consider first, and in some detail,
the very simplest of all learning models, the pattern model for
simple learning. In this model, the population of available
stimulation is assumed to comprise a set of distinct stimulus
patterns, exactly one of which is sampled on each trial. In the
important special case of the one-element model, it is assumed that
there is only one such pattern and that it recurs intact at the
beginning of each experimental trial. Granting that the one-element
model represents a radical idealization of even the most simplified
conditioning situations, we shall find that it is worthy of study
not only for expositional purposes but also for its value as an
analytic device in relation to certain types of learning data.
After a relatively thorough treatment of pattern models for simple
acquisition and for learning under probabilistic reinforcement schedules,
we shall take up more briefly the conceptualization of generalization
and transfer; the component models in which the patterns of stimulation
effective on individual trials are treated, not as distinct elements, but
as overlapping samples from a common population; and, finally, some
examples of the more complex multiple-process models which are becoming
increasingly important in the analysis of discrimination learning, concept
formation, and related phenomena.
2. ONE-ELEMENT MODELS
We begin by considering some one-element models which are special
cases of the more general theory. These examples are especially simple
mathematically and provide us with the opportunity to develop some
mathematical tools which will be necessary in later discussions.
Application of these models is appropriate if the stimulus situation
is sufficiently stable from trial to trial that it may be theoretically
represented to a good approximation by a single stimulus element which
is sampled with probability 1 on each trial. At the start of a trial
the element is in one of several possible conditioning states; it may or
may not remain in this conditioning state, depending on the reinforcing
event for that trial. In the first part of this section we consider a
model for paired-associate learning which has been intensively analyzed
by Bower (1961, 1962). In the second part of this section
we consider a one-element model for a two-choice learning situation
involving a probabilistic reinforcement schedule. The model generates
some predictions which are undoubtedly incorrect, except possibly under
ideal experimental conditions; nevertheless it provides a useful
introduction to more general cases which we pursue in Section 3.
2.1 Learning of a Single Stimulus-Response Association
Imagine the simplest possible learning situation. A single stimulus
pattern, S, is to be presented on each of a series of trials and each
trial is to terminate with reinforcement of some designated response, the
"correct response" in this situation. According to stimulus sampling
theory, learning occurs in an all-or-none fashion with respect to S.
This means that:
1. If the correct response is not originally conditioned to
("connected to") S, then, until learning occurs, the probability of the
correct response is zero.
2. There is a fixed probability c that the reinforced response
will become conditioned to S on any trial.
3. Once conditioned to S, the correct response occurs with
probability one on every subsequent trial.
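The three assumptions above can be put into a minimal Monte Carlo sketch; the function and variable names below are ours, and the error/correct coding (1 = error, 0 = correct) follows the convention used later in this section.

```python
import random

def simulate_item(c, n_trials, rng):
    """Simulate one item under the one-element pattern model.

    Before conditioning the correct response has probability zero
    (assumption 1); conditioning occurs with fixed probability c on each
    reinforced trial (assumption 2); once conditioned, the correct
    response occurs on every subsequent trial (assumption 3).
    Returns a list coded 1 = error, 0 = correct.
    """
    conditioned = False
    responses = []
    for _ in range(n_trials):
        responses.append(0 if conditioned else 1)  # response precedes reinforcement
        if not conditioned and rng.random() < c:   # reinforcement ends the trial
            conditioned = True
    return responses

rng = random.Random(0)
protocols = [simulate_item(0.39, 10, rng) for _ in range(20000)]
# The proportion correct on trial 2 estimates c, since a correct response
# there requires conditioning on the first reinforcement.
p2 = sum(p[1] == 0 for p in protocols) / len(protocols)
```

Every simulated protocol is a run of 1's followed by a run of 0's, which is exactly the idealized picture described in the text.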
These assumptions constitute the simplest case of the "one-element pattern
model." Learning situations which completely meet the specifications
laid down above are as unlikely to be realized in psychological experiments
as perfect vacuums or frictionless planes in the physics laboratory.
However, reasonable approximations to these conditions can be attained.
The requirement that the same stimulus pattern be reproduced on each
trial is probably fairly well met in the standard paired-associate
experiment with human subjects. In one such experiment, conducted in
the laboratory of one of the writers (W. K. E.), the stimulus member of
each item was a trigram and the correct response an English word, e.g.,
S      R
xvk    house
On a reinforced trial the stimulus and response members were exposed
together, as shown. Then, after several such items had received a
single reinforcement, each of the stimuli was presented alone, the
subject being instructed to give the correct response from memory, if
he could. Then each item was given a second reinforcement, followed by
a second test, and so on.
According to the assumptions of the one-element pattern model, a
subject should be expected to make an incorrect response on each test
with a given stimulus until learning occurs, then a correct response on
every subsequent trial; if we represent an error by a 1 and a correct
response by a 0, the protocol for an individual item over a series of
trials should, then, consist in a sequence of 0's preceded, in most
cases, by a sequence of 1's. Actual protocols for several subjects are
shown below:
a  0 0 0 0 0 0 0 0 0 0
b  1 1 1 1 1 1 1 1 1 1
c  1 0 0 0 0 0 0 0 0 0
d  0 0 0 0 0 0 0 0 0 0
e  1 1 0 0 0 0 0 0 0 0
f  1 1 0 0 0 0 0 0 0 0
g  1 1 1 1 1 0 0 0 0 0
h  1 0 0 0 0 0 0 1 0 0
i  1 1 1 1 0 1 1 0 0 0
The first seven of these correspond perfectly to the idealized theoretical
picture; the last two deviate slightly. The proportion of "fits" and
"misfits" in this sample is about the same as in the full set of 80 cases
from which the sample was taken. The occasional lapses, i.e., errors
following correct responses, may be symptomatic of a forgetting process
which should be incorporated into the theory, or they may be simply the
result of minor uncontrolled variables in the experimental situation
which are best ignored for theoretical purposes. Without judging this
issue, we may conclude that the simple one-element model at least merits
further study.
Before we can make quantitative predictions we need to know the
value of the conditioning parameter c. Statistical learning theory
includes no formal axioms specifying precisely what variables determine
the value of c, but on the basis of considerable experience we can
safely assume that this parameter will vary with characteristics of
the populations of subjects and items represented in a particular
experiment. An estimate of
the value of c for the experiment under consideration is easy to come by.
In the full set of 80 cases (40 subjects, each tested on two items), the
proportion of correct responses on the test given after a single
reinforcement was .39. According to the model, the probability is c
that a reinforced response will become conditioned to its paired stimulus;
consequently, c is the expected proportion of successful conditionings
out of 80 cases, and therefore the expected proportion of correct responses
on the subsequent test. Thus we may simply take the observed proportion,
.39, as an estimate of c.
In order to test the model, we need now to derive theoretical
expressions for other aspects of the data. Suppose we consider the
sequences of correct and incorrect responses, 000, 001, etc., on the
first three trials. According to the model, a correct response should
never be followed by an error, so the probability of the sequence 000 is
simply c, and the probabilities of 001, 010, 011, and 101 all zero.
To obtain an error on the first trial followed by a correct response on
the second, conditioning must fail on the first reinforcement but occur
on the second, and this joint event has probability (1-c)c. Similarly,
the probability that the first correct response occurs on the third trial
is given by (1-c)^2 c and the probability of no correct response in
three trials by (1-c)^3. Substituting the estimate .39 for c in each
of these expressions, we obtain the predicted values which are compared
with the corresponding empirical values for this experiment in Table 1.
Table 1 about here
Table 1
Observed and predicted (one-element model) values for response sequences
over first three trials of a paired-associate experiment.

Sequence*    Observed Proportions    Theoretical Proportions
000          .36                     .39
001          .02                     0
010          .01                     0
011          0                       0
100          .27                     .24
101          0                       0
110          .11                     .14
111          .23                     .23

* 0 = correct response, 1 = error
The correspondences are seen to be about as close as could be expected
with proportions based on 80 response sequences.
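The theoretical proportions in Table 1 follow directly from the sequence probabilities just derived; a brief check, with ĉ = .39 (variable names are ours):

```python
c = 0.39  # estimate of the conditioning parameter

# Probability of each error (1) / correct (0) sequence over the first three
# trials; a correct response is never followed by an error in this model.
pred = {
    "000": c,                   # conditioned on the first reinforcement
    "100": (1 - c) * c,         # first success on trial 2
    "110": (1 - c) ** 2 * c,    # first success on trial 3
    "111": (1 - c) ** 3,        # no correct response in three trials
    "001": 0.0, "010": 0.0, "011": 0.0, "101": 0.0,
}
```

The four nonzero probabilities exhaust the possibilities, so they sum to one.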
2.2 Paired-Associate Learning
In order to apply the one-element model to paired-associate
experiments involving fixed lists of items, it is necessary to adjust
the "boundary conditions" appropriately. Consider, for example, an
experiment reported by Estes, Hopkins, and Crothers (1960). The task
assigned their subjects was to learn associations between the numbers
1 through 8, serving as responses, and eight consonant trigrams, serving
as stimuli. Each subject was given two practice trials and two test
trials. On the first practice trial, the eight syllablenumber pairs
were exhibited singly in a random order. Then a test was given, the
syllables alone being presented singly in a new random order and the
subjects attempting to respond to each syllable with the correct number.
Then four of the syllablenumber pairs were presented on a second
practice trial and all eight syllables were included in a final test trial.
In writing an expression for the probability of a correct response
on the first test in this experiment, we must take account of the fact
that after the first practice trial, the subjects knew that the responses
were the numbers 1 - 8, and were in a position to guess at the correct
answers when shown syllables that they had not yet learned. The minimum
probability of achieving a correct response to an unlearned item by
guessing would be 1/8. Thus we would have for p_0, the probability of
a correct response on the first test,

   p_0 = c + (1-c)/8 ,

i.e., the probability c that the correct association was formed plus the
probability (1-c)/8 that the association was not formed but the correct
response was achieved by guessing. Setting this expression equal to the
observed proportion of correct responses on the first trial for the
twice-reinforced items, we readily obtain an estimate of c for these
experimental conditions,

   .404 = c + (1-c)(.125)
   c = .32 .
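Solving the guessing-corrected equation for c is a one-line computation; a sketch (variable names are ours):

```python
p0, g = 0.404, 0.125     # observed first-test proportion; guessing probability 1/8

# Solve p0 = c + (1 - c) * g for the conditioning parameter c.
c = (p0 - g) / (1 - g)
```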
Now we can proceed to derive expressions for the joint probabilities of
various combinations of correct and incorrect responses on the first
and second tests for the twice-reinforced items. For the probability
of correct responses to a given item in both tests, we have

   p_00 = c + (1-c)c(.125) + (1-c)^2 (.125)^2 .

With probability c, conditioning occurs on the first reinforced trial,
and then correct responses necessarily occur on both tests; with probability
(1-c)c(.125), conditioning does not occur on the first reinforced trial
but does on the second and a correct response is achieved by guessing on
the first test; with probability (1-c)^2 (.125)^2, conditioning occurs on
neither reinforced trial but correct responses are achieved by guessing
on both tests. Similarly, we obtain

   p_01 = (1-c)^2 (.875)(.125) ,
   p_10 = (1-c)(.875)[c + (1-c)(.125)] ,

and

   p_11 = (1-c)^2 (.875)^2 .
           Observed    Predicted
p_00       .35         .35
p_01       .05         .05
p_10       .27         .24
p_11       .33         .35
Although this comparison reveals some disparities which we might hope
to reduce with a more elaborate theory, it is surprising, to the writers
at least, that the patterns of observed response proportions in both
experiments considered can be predicted as well as they are by such
an extremely simple model.
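The four predicted entries can be recomputed directly from the joint-probability expressions; a sketch with ĉ = .32 and guessing probability .125 (variable names are ours):

```python
c, g = 0.32, 0.125            # conditioning estimate; guessing probability (1/8)
w = 1 - c                     # probability that conditioning fails on a trial

p00 = c + w * g * c + (w * g) ** 2   # correct on both tests
p01 = w ** 2 * g * (1 - g)           # correct on first test, error on second
p10 = w * (1 - g) * (c + w * g)      # error on first test, correct on second
p11 = (w * (1 - g)) ** 2             # error on both tests
```

Since the four cases are exhaustive and mutually exclusive, the probabilities sum exactly to one.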
Ordinarily, experiments concerned with pairedassociate learning
are not limited to a couple of trials, like those just considered, but
continue until the subjects meet some criterion of learning. Under these
circumstances it is impractical to derive theoretical expressions for all
possible sequences of correct and incorrect responses. A reasonable goal
is, instead, to derive expressions for various statistics which can be
The probability of the correct response when the element is in state C̄
depends on the experimental procedure. In Bower's experiment the subjects
were told the r responses available to them and each occurred equally
often as the to-be-learned response. Therefore, we may assume that in the
unconditioned state the probability of a correct response is 1/r, where
r is the number of alternative responses.
The conditioning assumptions can readily be restated in terms of the
conditioning states:
1. On any reinforced trial, if the sampled element is in state C̄,
it has probability c of going into state C.
2. The parameter c is fixed in value in a given experiment.
3. Transitions from state C to state C̄ have probability zero.
We shall now derive some predictions from the model and compare these
with observed data. The data of particular interest will be a subject's
sequence of correct and incorrect responses to a specific stimulus item
over trials. Similarly, in deriving results from the model we shall only
consider an isolated stimulus item and its related sequence of responses.
However, when we apply the model to data we assume that all items in the
list are comparable, i.e., all items have the same conditioning parameter
c and all items start out in the same conditioning state (C̄).
Consequently the response sequence associated with any given item is
viewed as a sample of size 1 from a population of sequences all generated
by the same underlying process.
A feature of this model which makes it especially tractable for
purposes of deriving various statistics is the fact that the sequence
of transitions between states C and C̄ constitutes a Markov chain. This
means that, given the state on any one trial, we can specify the
probability of each state on the next trial without regard to the previous
history. If we represent by C_n and C̄_n the events that an item is in
the conditioned or unconditioned state, respectively, on trial n, the
conditioning assumptions lead directly to the relation²

   Pr(C_{n+1} | C̄_n) = c ,   Pr(C̄_{n+1} | C_n) = 0 ,

and

        | 1      0  |
   Q =  | c    1-c  |

where Q is the matrix of one-step transition probabilities, the first
row and column referring to C and the second row and column to C̄.

²See Feller (1957) for a discussion of conditional probabilities. In
brief, if H_1, ..., H_n are a set of mutually exclusive events of which
one necessarily occurs, then any event A can occur only in conjunction
with some H_j. Since the AH_j are mutually exclusive, their probabilities
add. Applying the well-known theorem on compound probabilities, we
obtain Pr(A) = Σ_j Pr(A H_j) = Σ_j Pr(A | H_j) Pr(H_j).

Now the matrix of probabilities for transitions between any two states in n
trials is simply the nth power of Q, as may be verified by mathematical
induction (see, e.g., Kemeny, Snell, and Thompson, 1957, p. 327),

         |  1              0       |
   Q^n = |  1-(1-c)^n    (1-c)^n   |

Henceforth we shall assume that all stimulus elements are in state C̄ at
the onset of the first trial of our experiment. Given that the state is
C̄ on trial 1, the probability of being in state C̄ at the start of trial
n is (1-c)^(n-1), which goes to 0 as n becomes large, for c > 0.
Thus, with probability 1 the subject is eventually to be found in the
conditioned state.
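The form of Q^n and the convergence to the conditioned state are easy to verify numerically; a sketch with an illustrative value of c (the helper functions are ours):

```python
c = 0.3  # illustrative conditioning parameter

def mat_mul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(A, n):
    """Compute the nth power of a 2x2 matrix by repeated multiplication."""
    R = [[1.0, 0.0], [0.0, 1.0]]  # identity
    for _ in range(n):
        R = mat_mul(R, A)
    return R

Q = [[1.0, 0.0],       # from C: absorbing, stays conditioned
     [c, 1.0 - c]]     # from C-bar: conditions with probability c
Qn = mat_pow(Q, 25)
# Row 2 of Q^n is [1-(1-c)^n, (1-c)^n]: the chance of remaining
# unconditioned after n trials vanishes geometrically.
```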
Next we prove some theorems about the observable sequence of
correct and incorrect responses in terms of the underlying sequence of
unobservable conditioning states. We define the response random variable

   A_n = 0 if a correct response occurred on trial n ,
         1 if an error occurred on trial n .

By our assumed response rule, the probabilities of an error given that
the subject is in the conditioned or unconditioned state, respectively,
are

   Pr(A_n = 1 | C_n) = 0

and

   Pr(A_n = 1 | C̄_n) = 1 - 1/r .
To obtain the probability of an error on trial n, namely
Pr(A_n = 1), we sum these conditional probabilities weighted by the
probabilities of being in the respective states:

   Pr(A_n = 1) = Pr(A_n = 1 | C_n)Pr(C_n) + Pr(A_n = 1 | C̄_n)Pr(C̄_n)
               = (1 - 1/r)(1 - c)^(n-1) ,                               (1)

and, summing over trials, the expected total number of errors T satisfies

   E(T) = Σ_{n=1}^∞ (1 - 1/r)(1 - c)^(n-1) = (1 - 1/r)/c .              (2)

Thus the number of errors expected during the learning of any given item
is given by Eq. 2.
Equation 2 provides an easy method for estimating c. For.any
given subject we can obtain his average number of errors over stimulus
items, equate this number to the righthand side of Eq. 2 with r = 2 ,
and solve for c. We thereby obtain an estimate of c for each subject,
and intersubject differences in learning are reflected in the variability
of these estimates. Bower, in analyzing his data, chose to assume that c
was the same for all subjects; thus he set E(T) equal to the observed
number of errors averaged over both list items and subjects and obtained
a single estimate of c. ·This group estimate of c simplifies the
computations involved in generating predictions.. However, it has the
disadvantage that a discrepancy between observed and predicted values
may arise as a consequence of assuming equal c's when, in fact, the
theory is correct but c varies from subject to subject. Fortunately,
Bower has obtained excellent agreement between theory and observation
using the group estimate of c and, for the particular conditions he
investigated, any increase in precision that might be achieved by
individual estimates of c does not seem crucial.
For the experiment described above, Bower reports 1.45 errors per
stimulus item averaged over all subjects. Equating E(T) in Eq. 2
to 1.45, with r = 2, we obtain the estimate ĉ = .344. All predictions
that we derive from the model for this experiment will be based on this
single estimate of c. It should be remarked that the estimate of c
in terms of Eq. 2 represents only one of many methods that could have
been used. Which method one selects depends on the properties of the
particular estimator (e.g., whether the estimator is unbiased and efficient
relative to other estimators). Parameter estimation is a theory in
its own right, and we shall not be able to discuss the many problems
involved in the estimation of learning parameters. The reader is referred
to Suppes and Atkinson (1960), and Estes and Suppes (1962), for discussions
of various methods and their properties. Associated with this
topic is the problem of assessing the statistical agreement between data
and theory, once parameters have been estimated; that is, the
goodness-of-fit between predicted and observed values. In our analysis of data in
this chapter we shall offer no statistical evaluation of the predictions
but shall simply display the results for the reader's inspection. Our
reason is that we present
the data only to illustrate features of the theory and its application;
these results are not intended to provide a test of the model. However,
in rigorous analyses of such models the problem of goodness-of-fit is
extremely important and needs careful consideration. Here again the
reader is referred to Suppes and Atkinson (1960) for a discussion of
some of the problems and possible statistical tests.
By using Eq. 1 with the estimate of c obtained above we have
generated the predicted learning curve presented in Fig. 1. The fit is
sufficiently close that most of the predicted and observed points cannot
be distinguished on the scale of the graph.
Insert Fig. 1 about here
As a basis for the derivation of other statistics of total errors,
we require an expression for the probability distribution of T. To
obtain this, we note first that the probability of no errors at all
occurring during learning is given by

   Pr(T = 0) = c(1/r) + (1-c)(1/r)^2 c + ...
             = (c/r) Σ_{i=0}^∞ [(1-c)/r]^i
             = c / (r[1 - (1-c)/r])
             = b/r ,

where

   b = c / [1 - (1-c)/r] .

This event may arise if a correct response occurs by guessing on the
first trial and conditioning occurs on the first reinforcement, if a
correct response occurs by guessing on the first two trials and
conditioning occurs on the second reinforcement, and so on.
Similarly, the probability of no additional errors following an error on
any given trial is given by
[Figure 1 here: theoretical curve and observed data points, probability of an error plotted over trials 1-13.]

Fig. 1. The average probability of an error on trial n in Bower's paired-associate experiment.
   c + c(1-c)/r + ... = c Σ_{i=0}^∞ [(1-c)/r]^i = c / [1 - (1-c)/r] = b .

To have exactly k errors, we must have a first error (if k > 0),
which has probability 1 - b/r, k - 1 additional errors, each of which
has probability 1 - b, and then no more errors. Therefore the
probability distribution is

   Pr(T = 0) = b/r
   Pr(T = k) = (1 - b/r)(1 - b)^(k-1) b ,  for k > 0 .                  (3)

Equation 3 can be applied to data directly to predict the form of the
distribution of total errors. It may also be utilized in
deriving, e.g., the variance of this distribution.
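As an internal consistency check, the distribution just derived should sum to one and reproduce the mean of Eq. 2; a sketch (function names are ours):

```python
c, r = 0.344, 2
b = c / (1 - (1 - c) / r)   # as defined in the text

def p_T(k):
    """Eq. 3: probability of exactly k total errors."""
    return b / r if k == 0 else (1 - b / r) * (1 - b) ** (k - 1) * b

# Truncating the sums at k = 400 leaves a negligible geometric tail.
total = sum(p_T(k) for k in range(400))
mean = sum(k * p_T(k) for k in range(400))
```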
Preliminary to computing the variance, we need the expectation of T^2,

   E(T^2) = Σ_{k=0}^∞ k^2 (1 - b/r)(1 - b)^(k-1) b
          = b(1 - b/r) Σ_{k=0}^∞ [k(k-1) + k](1 - b)^(k-1)
          = (1-b) b(1 - b/r) Σ_{k=0}^∞ k(k-1)(1 - b)^(k-2)
               + b(1 - b/r) Σ_{k=0}^∞ k(1 - b)^(k-1) ,

where the second step is taken in order to facilitate the summation.
Using now the familiar expression,

   Σ_{k=1}^∞ (1 - b)^(k-1) = 1/b ,

for the sum of a geometric series together with the relations
A. and E. 21
and
1
2'
b
2
3'
b
we obtain
and
 2  2
VariA) = E(A )  [E(A1] ,..,.... __ :.;.i.
1 . 1 2 2
+ ]  (1  ) / c
b2 r
(rl) (2ccr+rl)
=
rc rc
=
..........
(rl) (cr+2c2cr+rl) = (rl)[l+ (2Cl)(1r)]
rc rc rc rc
(4)
Inserting the estimates
E(A) = 1.45

and c = .344 from Bower's data
in Eq. 4, we obtain 1.44 for the predicted standard derivation of total
errors, which may be compared with the observed value of 1.37.
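As a numerical check on Eqs. 3 and 4, the distribution of total errors can be summed directly and its mean and variance compared with the closed-form expressions. The following sketch is an editorial addition, not part of the original chapter; c = .344 and r = 2 are the estimates quoted from Bower's experiment, and E(T) = (r-1)/(rc) follows from Eq. 3.

```python
def total_error_dist(c, r, kmax=2000):
    """Pr(T = k), k = 0..kmax, for the one-element model (Eq. 3)."""
    b = c / (1 - (1 - c) / r)
    probs = [b / r]                     # Pr(T = 0) = b/r
    for k in range(1, kmax + 1):
        probs.append(b * (1 - b / r) * (1 - b) ** (k - 1))
    return probs

c, r = 0.344, 2                         # estimates from Bower's data
p = total_error_dist(c, r)
mean = sum(k * pk for k, pk in enumerate(p))
var = sum(k * k * pk for k, pk in enumerate(p)) - mean ** 2

# Closed forms: E(T) = (r-1)/(rc); Var(T) = (r-1)(2c - cr + r - 1)/(rc)^2
mean_formula = (r - 1) / (r * c)
var_formula = (r - 1) * (2 * c - c * r + r - 1) / (r * c) ** 2
```

With these estimates the predicted mean is about 1.45 and the predicted standard deviation about 1.45, in line with the values quoted in the text.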
Another useful statistic of the error sequence is E(x_n x_{n+k}),
namely, the expectation of the product of error random variables on
trials n and n+k (where x_n = 1 if an error occurs on trial n and
x_n = 0 otherwise). This quantity is related to the autocorrelation
between errors on trial n+k and trial n. By elementary probability
theory,

   E(x_n x_{n+k}) = Pr(x_{n+k} = 1 | x_n = 1) Pr(x_n = 1) .

But for an error to occur on trial n+k it must be the case that
conditioning has failed to occur during the intervening k trials and
that the subject guessed incorrectly on trial n+k. Hence

   Pr(x_{n+k} = 1 | x_n = 1) = (1 - c)^k (1 - 1/r) .

Substituting this result into the preceding expression, along with the
result presented in Eq. 1, yields the following expression:

   E(x_n x_{n+k}) = (1 - c)^{n+k-1}(1 - 1/r)² .

A convenient statistic for comparison with data (directly related to the
average autocorrelation of errors with lag k, but easier to compute)
is obtained by summing the cross product of x_n and x_{n+k} over all
trials. We define c_k as the mean of this random variable, where

   c_k = E( Σ_{n=1}^∞ x_n x_{n+k} ) = Σ_{n=1}^∞ (1 - c)^{n+k-1}(1 - 1/r)² = (1 - c)^k (1 - 1/r)²/c .   (6)

To be explicit, consider the following response protocol running in time
from left to right (1 denoting an error and 0 a success): 1101010010000.
The observed values are c_1 = 1, c_2 = 2, c_3 = 2, and so on.
The predictions for c_1, c_2, and c_3 computed from the c estimate
given above for Bower's experiment were .479, .310, and .201. Bower's
observed values were .486, .292, and .187.
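The counting rule and the theoretical expression of Eq. 6 can both be computed directly. The sketch below is an editorial addition; it reads 1 as an error in the sample protocol 1101010010000 and uses c = .344, r = 2 as illustrative estimates.

```python
protocol = [int(ch) for ch in "1101010010000"]   # 1 = error, 0 = success

def observed_ck(seq, k):
    """Number of trial pairs (n, n+k) with errors on both trials."""
    return sum(seq[n] * seq[n + k] for n in range(len(seq) - k))

def predicted_ck(c, r, k):
    """Eq. 6: c_k = (1-c)^k (1 - 1/r)^2 / c."""
    return (1 - c) ** k * (1 - 1 / r) ** 2 / c
```

For the sample protocol the observed counts are c_1 = 1, c_2 = 2, c_3 = 2; the theoretical values decline geometrically in k, as Eq. 6 requires.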
Next we consider the distribution of the number of errors between
the kth and (k+1)st success. The methods to be used in deriving this
result are general and can be used to derive the distribution of errors
between the kth and (k+m)th success for any nonnegative integer m;
the only limitation is that the expressions become unwieldy as m
increases. We shall define J_k as the random variable for the number
of errors between the kth and (k+1)st success; its values are 0, 1, 2, ....
An error following the kth success can only occur if the kth success
itself occurs as a result of guessing; that is, the subject necessarily
is in state C̄ when the kth success occurs. Letting g_k denote the
probability that the kth success occurs by guessing, we can write the
probability distribution

   Pr(J_k = i) = 1 - a g_k            for i = 0
   Pr(J_k = i) = (1 - a)a^i g_k       for i > 0 ,       (7)

where a = (1-c)(1 - 1/r). To obtain Pr(J_k = 0) we note that 0 errors
can occur in one of three ways: (1) the kth success occurs because the
subject is in state C (which has probability 1 - g_k), and necessarily a
correct response occurs on the next trial; (2) the kth success occurs
by guessing, the subject remaining in state C̄ and again guessing cor-
rectly on the next trial [which has probability g_k(1-c)(1/r)]; or
(3) the kth success occurs by guessing but conditioning is effective
on the trial (which has probability g_k c). Thus

   Pr(J_k = 0) = 1 - g_k + g_k(1-c)(1/r) + g_k c = 1 - a g_k .

The event of i errors (i > 0) bet-
ween the kth and (k+1)st successes can occur in one of two ways:
(1) the kth and (k+1)st successes both occur by guessing [with probability
g_k(1-c)^{i+1}(1 - 1/r)^i (1/r)], or (2) the kth success occurs by guessing and
conditioning does not take place until the trial immediately preceding
the (k+1)st success [with probability g_k(1-c)^i(1 - 1/r)^i c]. Hence

   Pr(J_k = i) = g_k a^i [(1-c)/r + c] = (1 - a)a^i g_k .

From Eq. 7 we may obtain the mean and variance of J_k, namely

   E(J_k) = Σ_{i=0}^∞ i Pr(J_k = i) = a g_k/(1 - a)       (8)

and

   Var(J_k) = Σ_{i=0}^∞ i² Pr(J_k = i) - [E(J_k)]² = a(1+a)g_k/(1-a)² - a²g_k²/(1-a)² .   (9)
In order to evaluate the quantities above we require an expression
for g_k. Consider g_1, the probability that the first success occurs
by guessing. It could occur in one of the following ways: (1) the
subject guesses correctly on trial 1 (with probability 1/r); or (2)
the subject guesses incorrectly on trial 1, conditioning does not occur,
and the subject guesses successfully on trial 2 [this joint event having
probability (1 - 1/r)(1-c)(1/r)]; or (3) conditioning does not occur on trials
1 and 2, and the subject guesses incorrectly on both of these trials but
guesses correctly on trial 3 [with probability (1 - 1/r)²(1-c)²(1/r)]; and
so forth. Thus

   g_1 = (1/r) + (1 - 1/r)(1-c)(1/r) + (1 - 1/r)²(1-c)²(1/r) + ...
       = (1/r) Σ_{i=0}^∞ (1 - 1/r)^i (1-c)^i = 1/[(1 - a)r] .

Now consider the probability that the kth success occurs by guessing, for
k > 1. In order for this event to occur it must be the case that (1)
the (k-1)st success occurs by guessing, (2) conditioning fails to occur
on the trial of the (k-1)st success, and (3) since the subject is then
in state C̄ on the trial following the (k-1)st success, the next
correct response occurs by guessing, an event with probability g_1. Hence

   g_k = g_{k-1}(1-c)g_1 .

Solving this difference equation,3 we obtain

   g_k = (1-c)^{k-1} g_1^k .

3 The solution of this equation can quickly be obtained. Note that
g_2 = g_1(1-c)g_1 = (1-c)g_1². Similarly, g_3 = g_2(1-c)g_1; substituting
the above result for g_2 we obtain g_3 = (1-c)g_1²(1-c)g_1 = (1-c)²g_1³.
If we continue in this fashion it will be obvious that g_k = (1-c)^{k-1}g_1^k.

Finally, substituting the expression obtained above for g_1 yields

   g_k = (1-c)^{k-1}/[(1 - a)^k r^k] .       (10)

We may now combine Eqs. 7 and 10, inserting our original estimate
of c, to obtain predictions about the number of errors between the
kth and (k+1)st success in Bower's data. To illustrate, for k = 1,
the predicted mean is .361 and the observed value is .350.
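The prediction for the mean of J_k can be checked by chaining Eqs. 7, 8, and 10. This short computation is an editorial addition; the values c = .344 and r = 2 are the estimates used throughout for Bower's experiment.

```python
def mean_errors_between(c, r, k):
    """E(J_k) of Eq. 8, with a from Eq. 7 and g_k from Eq. 10."""
    a = (1 - c) * (1 - 1 / r)
    g1 = 1 / ((1 - a) * r)              # first success occurs by guessing
    gk = (1 - c) ** (k - 1) * g1 ** k   # Eq. 10
    return a * gk / (1 - a)             # Eq. 8

m1 = mean_errors_between(0.344, 2, 1)
```

The value for k = 1 comes out near .36, agreeing with the predicted mean quoted in the text, and the means decrease with k as g_k does.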
To conclude our analysis of this model, we consider the probability
P_k that a response sequence to a stimulus item will exhibit the property
of no errors following the kth success. This event can occur in one
of two ways: (1) the kth success occurs when the subject is in state
C [the probability of which we have already calculated to be 1 - g_k], or (2) the kth
success occurs when the subject is in state C̄ and no errors occur on
subsequent trials. Letting b denote the probability of no more errors
following a correct guess, we have

   P_k = (1 - g_k) + g_k b = 1 - g_k(1 - b) .       (11)

But the probability of no more errors following a guess is
simply b = c/(a + c) [note that a + c = 1 - (1-c)/r, so this agrees with
the expression for b given earlier]; substituting this, together with
Eq. 10, into Eq. 11 yields

   P_k = 1 - a(1-c)^{k-1}/[(a + c)(r - ar)^k] .     (12)

Observed and predicted values of P_k for Bower's experiment are shown
in Table 2.
Insert Table 2 about here
We shall not pursue more consequences of this model.4 The particular
results we have examined were selected because they illustrated
fundamental features of the model and also introduced mathematical
techniques which will be needed later. In Bower's paper, more than 30
predictions of the type presented here are tested, with results comparable
to those exhibited above. The goodness-of-fit of theory to data in these
instances is quite representative of that which one may now expect to
obtain routinely in simple learning experiments when experimental
conditions have been appropriately arranged to approximate the simplifying
assumptions of the mathematical model.

4 Bower also has compared the one-element model with a comparable
single-operator linear model presented by Bush and Sternberg (1959).
The linear model assumes that the probability of an incorrect response
on trial n is a number p_n, where p_{n+1} = (1-c)p_n (so that, with
p_1 = 1 - 1/r, the two models yield the same mean learning curve).
The one-element model and the linear model yield many
identical predictions (e.g., the mean learning curve), and it is necessary to
look at the finer structure of the data to differentiate the models. Of the
20 possible comparisons Bower makes between the two models, he finds
that the one-element model comes closer to the data on 18.
Table 2

Observed and predicted values for P_k, the probability of no errors
following the kth success. (Interpret P_0 as the probability of no
errors at all during the course of learning.)

    k     Observed P_k     Predicted P_k
    0        .255              .256
    1        .628              .636
    2        .812              .822
    3        .869              .912
    4        .928              .957
    5        .963              .979
    6        .973              .990
    7        .990              .995
    8        .990              .997
    9        .993              .998
   10        .996              .999
   11       1.000             1.000
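The predicted column of Table 2 can be regenerated from Eq. 12. The computation below is an editorial addition, using c = .344 and r = 2.

```python
def predicted_Pk(c, r, k):
    """Eq. 12: P_k = 1 - a(1-c)^(k-1) / [(a+c)(r - ar)^k]."""
    a = (1 - c) * (1 - 1 / r)
    return 1 - a * (1 - c) ** (k - 1) / ((a + c) * (r - a * r) ** k)

preds = [predicted_Pk(0.344, 2, k) for k in range(12)]
```

The k = 0 entry reproduces the probability b/r of no errors at all (.256), and the sequence increases monotonically toward 1, matching the table.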
Concepts of the sort developed in this section can be extended to
more traditional types of verbal learning situations involving stimulus
similarity, meaningfulness, and the like. For example, Atkinson (1957)
has presented a model for rote serial learning which is based on similar
ideas and deals with such variables as intertrial interval, list length,
and types of errors (perseverative, anticipatory, or response failures).
Unfortunately, theoretical analyses of this sort for traditional
experimental routines often lead to extremely complicated mathematical
models with the result that only a few consequences of the axioms can be
derived. Stated differently, a set of concepts may be very general in
terms of the range of situations to which it is applicable; nevertheless,
in order to provide rigorous and detailed tests of these concepts, it is
frequently necessary to contrive special experimental routines where the
theoretical analyses generate tractable mathematical systems.
2.3 Probabilistic Reinforcement Schedules

We shall now examine a one-element model for some simple two-choice
learning problems. The one-element model for this situation, as
contrasted with the paired-associate model, generates some predictions
of behavior which are quite unrealistic, and for this reason we defer an
analysis of experimental data until we consider comparable multi-element
processes. The reason for presenting the one-element model is that it
represents a convenient introduction to multi-element models and permits
us to develop some mathematical tools in a simple fashion. Further, when
we do discuss multi-element models we shall employ a rather restrictive
set of conditioning axioms. However, for the one-element model we may
present an extremely general set of conditioning assumptions without
getting into too much mathematical complexity. Therefore, the analysis
of the one-element case will suggest lines along which the multi-element
models can be generalized.
The reference experiment (see, e.g., Estes and Straughan, 1954;
Suppes and Atkinson, 1960) involves a long series of discrete trials.
Each trial is initiated by the onset of a signal. To the signal the
subject is required to make one of two responses, which we denote A_1
and A_2. The trial is terminated with an E_1 or E_2 reinforcing event;
the occurrence of E_i indicates that response A_i was the correct
response for that trial. Thus in a human learning situation the subject
is required to predict on each trial which reinforcing event he expects
will occur by making the appropriate response: an A_1 if he expects E_1
and an A_2 if he expects E_2; at the end of the trial he is permitted to
observe which event actually occurred. Initially the subject may have no
preference between responses, but as information accrues to him over trials,
his pattern of choices undergoes systematic changes. The role of a model is
to predict the detailed features of these changes.
The experimenter may devise various schedules for determining the
sequence of reinforcing events over trials. For example, the probability
of an E_1 may be (1) some function of the trial number, (2) dependent
on previous responses of the subject, (3) dependent on the previous
sequence of reinforcing events, or (4) some combination of the above.
For simplicity, we consider a noncontingent reinforcement schedule. The
case is defined by the condition that the probability of E_1 is constant
over trials and independent of previous responses and reinforcements. It
is customary in the literature to call this probability π; thus

   Pr(E_{1,n}) = π    for all n.

Here we are denoting by E_{i,n} the event that reinforcing event E_i
occurs on trial n. Similarly, we shall represent by A_{i,n} the event
that response A_i occurs on trial n.
We assume that the stimulus situation comprising the signal light
and the context in which it occurs can be represented theoretically by a
single stimulus element which is sampled with probability 1 when the
signal occurs. At the start of a trial, the element is in one of three
conditioning states: in state C_1 the element is conditioned to the A_1
response and in state C_2 to the A_2 response; in state C_0 the
element is not conditioned to either A_1 or A_2. The response rules
are similar to those presented earlier. When the subject is in C_1 or
C_2, the A_1 or A_2 response, respectively, occurs with probability 1.
In state C_0 we assume that either response is equally likely to be
elicited; that is,

   Pr(A_{1,n} | C_{0,n}) = 1/2 .

For some subjects a response bias may exist which would require that we
assume Pr(A_{1,n} | C_{0,n}) = β, where β ≠ 1/2. For these subjects it
would be necessary to estimate β in applying the model. However, for
simplicity we shall pursue only the case where responses are equiprobable
when the subject is in C_0.
We now present a general set of rules governing changes in
conditioning states. As the model is developed it will become obvious
that for some experimental problems restrictions can be imposed which
greatly simplify the process.
If the subject is in state C_1 and an E_1 occurs (i.e., the subject
makes an A_1 response which is correct), then he will remain in C_1.
However, if the subject is in C_1 and an E_2 occurs, then with
probability c the subject goes to C_2 and with probability c' to
C_0. Comparable rules apply when the subject is in C_2. Thus, if the
subject is in C_1 or C_2 and his response is correct, he will remain
in that state. If, however, he is in C_1 or C_2 and his response is
not correct, then he may shift to one of the other conditioning states,
thereby reducing the probability of repeating the same response on the
next trial.
Finally, if the subject is in C_0 and an E_1 or E_2 occurs,
then with probability c'' the subject moves to C_1 or C_2, respectively.5

5 Here we assume that the subject's response does not affect the change.
That is, if the subject is in C_0 and an E_1 occurs, then he moves to
C_1 with probability c'' independently of whether A_1 or A_2 occurred.
This assumption is not necessary, and we could readily have the actual
response affect the change; for example, we might postulate c''_1 for an
E_i A_i combination and c''_2 for an E_i A_j (j ≠ i) combination; that is,

   Pr(C_{1,n+1} | E_{1,n} A_{1,n} C_{0,n}) = Pr(C_{2,n+1} | E_{2,n} A_{2,n} C_{0,n}) = c''_1

and

   Pr(C_{1,n+1} | E_{1,n} A_{2,n} C_{0,n}) = Pr(C_{2,n+1} | E_{2,n} A_{1,n} C_{0,n}) = c''_2 ,

where c''_1 ≠ c''_2. However, such additions make the mathematical process more
complicated and should be introduced only when the data clearly require
them.
Thus, to summarize, for i, j = 1, 2 and i ≠ j,

   Pr(C_{i,n+1} | E_{i,n} C_{i,n}) = 1
   Pr(C_{0,n+1} | E_{j,n} C_{i,n}) = c'
   Pr(C_{j,n+1} | E_{j,n} C_{i,n}) = c          (13)
   Pr(C_{i,n+1} | E_{i,n} C_{0,n}) = c''

where 0 < c'' < 1 and 0 < c + c' < 1.
We now use the assumptions of the preceding paragraphs and the
particular assumptions for the noncontingent case to derive the transition
matrix for the conditioning states. In making such a derivation it is
convenient to represent the various possible occurrences on a trial by a
tree. Each set of branches emanating from a point represents a mutually
exclusive and exhaustive set of possibilities. For example, suppose that
at the start of trial n the subject is in state C_1; then the tree in
Fig. 2 represents the possible changes that can occur in the conditioning
state.

Insert Fig. 2 here

The first set of branches is associated with the reinforcing event
(E_1 or E_2). If the subject is in C_1 and an E_1 occurs, then he
will stay in state C_1 on the next trial. However, if an E_2 occurs,
then with probability c he will go to C_2, with probability c' he
will go to C_0, and with probability 1 - c - c' he will remain in C_1.

Each path of a tree, from a beginning point to a terminal point,
represents a possible outcome on a given trial. The probability of each
Fig. 2. Branching process, starting from state C_1 on trial n,
for one-element model in two-choice, noncontingent case.
path is obtained by multiplying the appropriate conditional probabilities.
Thus, for the tree in Fig. 2 the probability of the bottom path may be
represented by

   Pr(E_{2,n} | C_{1,n}) Pr(C_{1,n+1} | E_{2,n} C_{1,n}) = (1 - π)(1 - c - c') .

Of the four paths, two lead from C_1 to C_1; hence

   p_11 = π + (1 - π)(1 - c - c') = 1 - (1 - π)(c + c') .

Similarly, p_10 = (1 - π)c' and p_12 = (1 - π)c, where p_ij denotes the
probability of a one-step transition from C_i to C_j.
For the C_0 state we have the tree given in Fig. 3. On the top
branch an E_1 event is indicated, and by Eq. 13 the probability of going
to C_1 is c'' and of staying in C_0 is 1 - c''. A similar analysis
holds for the bottom branches. Thus we have

   p_01 = πc'' ,    p_02 = (1 - π)c'' ,    p_00 = 1 - c'' .

Insert Fig. 3 here

Combining these results and the comparable results for C_2 yields the
following transition matrix:

              C_1                  C_0           C_2
       C_1 [ 1 - (1-π)(c'+c)    c'(1-π)       c(1-π)      ]
   P = C_0 [ c''π               1 - c''       c''(1-π)    ]       (14)
       C_2 [ cπ                 c'π           1 - π(c'+c) ]
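Since each row of Eq. 14 collects the branch probabilities of one tree, the rows must sum to 1 for any admissible parameter values. The check below is an editorial addition; the particular parameter values are arbitrary illustrations.

```python
def transition_matrix(pi, c, cp, cpp):
    """Eq. 14; rows and columns ordered C1, C0, C2 (cp = c', cpp = c'')."""
    return [
        [1 - (1 - pi) * (cp + c), cp * (1 - pi), c * (1 - pi)],
        [cpp * pi, 1 - cpp, cpp * (1 - pi)],
        [c * pi, cp * pi, 1 - pi * (cp + c)],
    ]

P = transition_matrix(0.6, 0.2, 0.1, 0.3)
```

Each row is a probability distribution over next-trial conditioning states, so P is a proper stochastic matrix.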
Fig. 3. Branching process, starting from state C_0 on trial n,
for one-element model in two-choice, noncontingent case.
A. and E. 34
As in the case of the pairedassociate model, a large number of
predictions can be derived easily for this process. However, we shall
only select a few which are useful in clarifying the fundamental properties
of the model. We begin by considering the asymptotic probability of a
particular conditioning state and, in turn, the asymptotic probability of
an ~ response. The following notation will prove useful: Let
be the transition matrix and define as the probability of
being in state j on trial r+n, given that at trial r the subject
was in state i. The quantity is defined recursively:
Moreover, if the appropriate limit exists and is independent of i ,we
set
lim p ~ n )
n > 00 ~ j
The limiting quantities u
j
exist for any finitestate Markov chain
that is irreducible and aperiodic. A Markov chain is irreducible if there
is no closed proper subset of states; that is, no proper subset of states
such that once wi thin this set the probability of leaving it is a For
example, the chain whose transition matrix is
1 2
3
1
3/
4 1/4 a
2 1/2 1/2 a
3 1/3 1/3 1/,3
A. and E. 35
is reducible because the set (1, 2) of states is a proper closed subset.
A Markov chain is aperiodic if there is no fixed period for return to
any state, and periodic if a return to some initial state j is impossible
except at t, 2t, 3t, ... trials for t > 1. Thus the chain whose
matrix is

       1   2   3
   1 [ 0   1   0 ]
   2 [ 0   0   1 ]
   3 [ 1   0   0 ]

has period t = 3 for return to each state.

If there are r states, we call the vector u = [u_1, ..., u_r] the
stationary probability vector of the chain. It may be shown [Feller (1957),
Kemeny and Snell (1959)] that the components of this vector are the solutions
of the r linear equations

   u_j = Σ_{v=1}^r u_v p_vj          (15)

such that Σ_{v=1}^r u_v = 1. Thus to find the asymptotic probabilities
of the states, we need find only the solution of the r equations. The
intuitive basis of this system of equations seems clear. Consider a two-
state chain. Then the probability p_{n+1} of being in state 1 on
trial n+1 is the probability of being in state 1 on trial n and going
to 1 plus the probability of being in state 2 on trial n and going to 1;
that is,

   p_{n+1} = p_n p_11 + (1 - p_n) p_21 .

But at asymptote p_{n+1} = p_n = u_1 (and 1 - p_n = u_2), so that
u_1 = u_1 p_11 + u_2 p_21, which is the first of the two equations of
the system when r = 2.

It is clear that the chain represented by the matrix P of Eq. 14
is irreducible and aperiodic; thus the asymptotes exist and are indepen-
dent of the initial probability distribution on the states. We seek
numbers u_j such that u_j = Σ_v u_v p_vj and Σ_j u_j = 1. For a
three-state chain the solution is given by u_j = D_j/D, where
D = D_1 + D_2 + D_3 and

   D_1 = p_31(1 - p_22) + p_21 p_32
   D_2 = p_31 p_12 + p_32(1 - p_11)          (16)
   D_3 = (1 - p_11)(1 - p_22) - p_21 p_12 .
Inserting in these equations the equivalents of the p_ij from the
transition matrix and renumbering the states appropriately, we obtain

   D_1 = πc''(c + c'π)
   D_0 = π(1 - π)c'(c' + 2c)
   D_2 = (1 - π)c''[c + c'(1 - π)] .
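The D_j expressions can be verified numerically: after normalization they should satisfy the stationarity equations u_j = Σ_v u_v p_vj of Eq. 15. The check below is an editorial addition with arbitrary illustrative parameter values.

```python
def stationary(pi, c, cp, cpp):
    """Normalized [u1, u0, u2] from the D_j expressions (cp = c', cpp = c'')."""
    D1 = pi * cpp * (c + cp * pi)
    D0 = pi * (1 - pi) * cp * (cp + 2 * c)
    D2 = (1 - pi) * cpp * (c + cp * (1 - pi))
    D = D1 + D0 + D2
    return [D1 / D, D0 / D, D2 / D]

pi, c, cp, cpp = 0.6, 0.2, 0.1, 0.3
P = [  # Eq. 14, rows and columns ordered C1, C0, C2
    [1 - (1 - pi) * (cp + c), cp * (1 - pi), c * (1 - pi)],
    [cpp * pi, 1 - cpp, cpp * (1 - pi)],
    [c * pi, cp * pi, 1 - pi * (cp + c)],
]
u = stationary(pi, c, cp, cpp)
uP = [sum(u[i] * P[i][j] for i in range(3)) for j in range(3)]
```

Stationarity (u = uP) holds to machine precision, confirming that the D_j are the unnormalized components of the stationary vector.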
Since D is the sum of the D_j's and since u_j = D_j/D, we may divide
numerator and denominator by (c'')² and obtain

   u_1 = π[ρ + επ] / {π[ρ + επ] + π(1 - π)ε[ε + 2ρ] + (1 - π)[ρ + ε(1 - π)]}
                                                                            (17)
   u_0 = π(1 - π)ε[ε + 2ρ] / {π[ρ + επ] + π(1 - π)ε[ε + 2ρ] + (1 - π)[ρ + ε(1 - π)]} ,

where ρ = c/c'' and ε = c'/c''. By our response axioms we have, for all n,

   Pr(A_{1,n}) = Pr(C_{1,n}) + (1/2)Pr(C_{0,n}) .

Hence

   lim_{n→∞} Pr(A_{1,n}) = u_1 + (1/2)u_0
     = {π[ρ + ερ + (1/2)ε²] + π²[ε - ερ - (1/2)ε²]} / {π[ε² + 2ερ - 2ε] + π²[2ε - ε² - 2ερ] + ρ + ε} .   (18)

An inspection of Eq. 18 indicates that the asymptotic probability of
an A_1 response is a function of π, ρ, and ε. As will become clear later,
the value of Pr(A_{1,∞}) is bounded in the open interval from 1/2 to
π²/[π² + (1 - π)²]; whether Pr(A_{1,∞}) is above or below π depends on the
values of ρ and ε.
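The closed form of Eq. 18 can be cross-checked against u_1 + u_0/2 computed directly from Eq. 17. The comparison below is an editorial addition; agreement over a grid of arbitrary parameter values confirms that the two expressions are algebraically identical.

```python
def asymptote_eq18(pi, rho, eps):
    """Closed form of Eq. 18."""
    num = (pi * (rho + eps * rho + 0.5 * eps ** 2)
           + pi ** 2 * (eps - eps * rho - 0.5 * eps ** 2))
    den = (pi * (eps ** 2 + 2 * eps * rho - 2 * eps)
           + pi ** 2 * (2 * eps - eps ** 2 - 2 * eps * rho)
           + rho + eps)
    return num / den

def asymptote_from_states(pi, rho, eps):
    """u1 + u0/2 built from the unnormalized components of Eq. 17."""
    D1 = pi * (rho + eps * pi)
    D0 = pi * (1 - pi) * eps * (eps + 2 * rho)
    D2 = (1 - pi) * (rho + eps * (1 - pi))
    return (D1 + 0.5 * D0) / (D1 + D0 + D2)
```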
A. and E. 38
We now consider two special cases' of our oneelement model. The first
case is comparable to the multielement models to be discussed later, whereas
the second case is, in some respects, the complement Of ,the first case.
Case of c' = O. Let us rewrite Eq. 14 with c' = O. Then the tran
sition matrix will have the following canonical farm:
C
L
'
C
2
Co
C
l
1

c(l  rt) c(l  rt) 0
P = C
2
Crt 1  Crt 0 (19)
Co
cllrt
C" (1  rt) 1 
en
We note that once the subject has left state Co he can never return. In
fact it is obvious that Pr(Co,n) =Pr(CO,l)(l  c,,)ul where Pt(CO,l) is
the initial probability of being in CO' Thus, except on early trials, Co
is not part of the process and the subject in the long run fluctuates between
C
l
and C
2
, being in C
l
on a proportion rt of the trials.
From Eq. 19 we have also

   Pr(C_{1,n+1}) = Pr(C_{1,n})[1 - c(1-π)] + Pr(C_{2,n})cπ + Pr(C_{0,n})c''π .

That is, the probability of being in C_1 on trial n + 1 is equal to the
probability of being in C_1 on trial n times the probability p_11 of
going from C_1 to C_1, plus the probability of being in C_2 times p_21,
plus the probability of being in C_0 times p_01. For simplicity let
x_n = Pr(C_{1,n}), y_n = Pr(C_{2,n}), and z_n = Pr(C_{0,n}). Now we know that

   z_n = z_1(1 - c'')^{n-1}

and also that x_n + y_n + z_n = 1, or y_n = 1 - x_n - z_1(1 - c'')^{n-1}.
Making these substitutions in the recursion above yields

   x_{n+1} = x_n[1 - c(1 - π)] + z_1 c''π(1 - c'')^{n-1} + cπ[1 - x_n - z_1(1 - c'')^{n-1}] .
This difference equation has the following solution:6

   x_n = π - (π - x_1)(1 - c)^{n-1} + πz_1[(1 - c)^{n-1} - (1 - c'')^{n-1}] .

6 The solution of such a difference equation can readily be obtained.
Consider

   x_{n+1} = ax_n + bc^{n-1} + d ,

where a, b, c, and d are constants. Then

   x_2 = ax_1 + b + d .    (i)

Similarly x_3 = ax_2 + bc + d, and substituting (i) for x_2 we obtain

   x_3 = a²x_1 + ab + ad + bc + d .    (ii)

Similarly x_4 = ax_3 + bc² + d, and substituting (ii) for x_3 we obtain

   x_4 = a³x_1 + a²b + abc + bc² + a²d + ad + d .

If we continue in this fashion it will be obvious that for n ≥ 2

   x_n = a^{n-1}x_1 + b Σ_{i=0}^{n-2} a^{n-2-i}c^i + d Σ_{i=0}^{n-2} a^i .

Carrying out the summations yields the desired result. See Jordan (1950,
pp. 583-584) for a detailed treatment.

But Pr(A_{1,n}) = x_n + (1/2)z_n; hence

   Pr(A_{1,n}) = π - [π - Pr(C_{1,1}) - πPr(C_{0,1})](1 - c)^{n-1} - (π - 1/2)Pr(C_{0,1})(1 - c'')^{n-1} .   (20)

If Pr(C_{0,1}) = 0, then we have a simple exponential learning function starting
at Pr(C_{1,1}) and approaching π at a rate determined by c. If Pr(C_{0,1}) ≠ 0,
then the rate of approach is a function of both c and c''.
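The learning function of Eq. 20 can be checked by iterating the chain of Eq. 19 directly. The sketch below is an editorial addition; the parameter values and initial distribution are arbitrary illustrations.

```python
pi, c, cpp = 0.7, 0.3, 0.5          # cpp = c''
x1, z1 = 0.2, 0.5                   # Pr(C1,1) and Pr(C0,1)

def closed_form(n):
    """Eq. 20: Pr(A1 on trial n) for the case c' = 0."""
    return (pi
            - (pi - x1 - pi * z1) * (1 - c) ** (n - 1)
            - (pi - 0.5) * z1 * (1 - cpp) ** (n - 1))

# Direct iteration of the chain of Eq. 19 (state order C1, C2, C0)
vals = []
x, y, z = x1, 1 - x1 - z1, z1
for n in range(1, 30):
    vals.append(x + 0.5 * z)        # Pr(A1,n) = Pr(C1,n) + Pr(C0,n)/2
    x, y, z = (x * (1 - c * (1 - pi)) + y * c * pi + z * cpp * pi,
               x * c * (1 - pi) + y * (1 - c * pi) + z * cpp * (1 - pi),
               z * (1 - cpp))
```

The iterated values agree with the closed form on every trial and approach π, as the text asserts.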
We now consider one simple sequential prediction to illustrate another
feature of the one-element model for c' = 0. Specifically, consider the prob-
ability of an A_1 response on trial n + 1 given a reinforced A_1 response on
trial n; namely Pr(A_{1,n+1} | E_{1,n} A_{1,n}). Note first of all that

   Pr(A_{1,n+1} | E_{1,n} A_{1,n}) Pr(E_{1,n} A_{1,n}) = Pr(A_{1,n+1} E_{1,n} A_{1,n}) .

Further we may write

   Pr(A_{1,n+1} E_{1,n} A_{1,n})
     = Σ_{i,j} Pr(A_{1,n+1} C_{i,n+1} E_{1,n} A_{1,n} C_{j,n})
     = Σ_{i,j} Pr(A_{1,n+1} | C_{i,n+1} E_{1,n} A_{1,n} C_{j,n}) Pr(C_{i,n+1} | E_{1,n} A_{1,n} C_{j,n})
              × Pr(E_{1,n} | A_{1,n} C_{j,n}) Pr(A_{1,n} | C_{j,n}) Pr(C_{j,n}) .

But by assumption the probability of a response is determined solely by the
conditioning state, and hence

   Pr(A_{1,n+1} | C_{i,n+1} E_{1,n} A_{1,n} C_{j,n}) = Pr(A_{1,n+1} | C_{i,n+1}) ;

further, by assumption the probability of an E_1 event is independent of
other events, and hence

   Pr(E_{1,n} | A_{1,n} C_{j,n}) = π .

Substituting these results in the above expression we obtain

   Pr(A_{1,n+1} E_{1,n} A_{1,n}) = π Σ_{i,j} Pr(A_{1,n+1} | C_{i,n+1}) Pr(C_{i,n+1} | E_{1,n} A_{1,n} C_{j,n}) Pr(A_{1,n} | C_{j,n}) Pr(C_{j,n}) .

Both i and j run over 0, 1, and 2, and therefore there are nine terms
in the sum; but note that when either i or j is 2 the terms
Pr(A_{1,n+1} | C_{2,n+1}) and Pr(A_{1,n} | C_{2,n}) both equal 0. Consequently, it
suffices to limit i and j to 0 and 1, and we have

   Pr(A_{1,n+1} E_{1,n} A_{1,n})
     = π Σ_{i=0}^{1} Pr(A_{1,n+1} | C_{i,n+1}) Pr(C_{i,n+1} | E_{1,n} A_{1,n} C_{1,n}) Pr(A_{1,n} | C_{1,n}) Pr(C_{1,n})
     + π Σ_{i=0}^{1} Pr(A_{1,n+1} | C_{i,n+1}) Pr(C_{i,n+1} | E_{1,n} A_{1,n} C_{0,n}) Pr(A_{1,n} | C_{0,n}) Pr(C_{0,n}) .

Since the subject cannot leave state C_1 on a trial when A_1 is reinforced,
we know that Pr(C_{1,n+1} | E_{1,n} A_{1,n} C_{1,n}) = 1 and Pr(C_{0,n+1} | E_{1,n} A_{1,n} C_{1,n}) = 0;
further, Pr(A_{1,n+1} | C_{1,n+1}) = 1. Therefore, the first sum is simply πPr(C_{1,n}).

For the second sum, Pr(C_{1,n+1} | E_{1,n} A_{1,n} C_{0,n}) = c'' and
Pr(C_{0,n+1} | E_{1,n} A_{1,n} C_{0,n}) = 1 - c''. Further, Pr(A_{1,n} | C_{0,n}) = 1/2; hence
for the second sum we obtain

   π[c'' + (1/2)(1 - c'')](1/2)Pr(C_{0,n}) .

Combining these results,

   Pr(A_{1,n+1} E_{1,n} A_{1,n}) = πPr(C_{1,n}) + (π/2)[c'' + (1/2)(1 - c'')]Pr(C_{0,n}) .

But Pr(E_{1,n} A_{1,n}) = Pr(E_{1,n} | A_{1,n})Pr(A_{1,n}) = πPr(A_{1,n}), whence

   Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = {Pr(C_{1,n}) + (1/2)[c'' + (1/2)(1 - c'')]Pr(C_{0,n})} / Pr(A_{1,n}) .

We know that Pr(C_{1,n}) and Pr(A_{1,n}) both approach π in the limit
and that Pr(C_{0,n}) approaches 0. Therefore, we predict that

   lim_{n→∞} Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = 1 .
This prediction provides a very sharp test for this particular case of
the model, and one that is certain to fail in almost any experimental situa-
tion. That is, even after a large number of trials it is hard to conceive
of an experimental procedure such that a response will be repeated with
probability 1 if it occurred and was reinforced on the preceding trial.
Later we shall consider a multi-element model which provides an excellent
description of many sets of data but is based on essentially the same condi-
tioning rules specified by this case of c' = 0. It should be emphasized that
deterministic predictions of the sort given in the equation above are
peculiar to one-element models; for the multi-element case such difficul-
ties do not arise. This point will be amplified later.
Case of c = 0. We now consider the case in which direct counter-
conditioning does not occur, i.e., c = 0, and thus ρ = 0 and 0 < ε < ∞.
With this restriction the chain is still ergodic, since it is possible
to go from every state to every other state, but transitions between
C_1 and C_2 must go by way of C_0. Letting ρ = 0 in Eq. 18 we obtain

   Pr(A_{1,∞}) = [π² + (1/2)π(1 - π)ε] / [π² + (1 - π)² + π(1 - π)ε] .    (21)

From Eq. 21 we can draw some interesting conclusions about the
relationship of the asymptotic response probabilities to the ratio
ε = c'/c''. Differentiating with respect to ε, we obtain

   ∂Pr(A_{1,∞})/∂ε = π(1 - π)(1/2 - π) / [π² + (1 - π)² + π(1 - π)ε]² .

If π ≠ 1/2, then Pr(A_{1,∞}) has no maximum
for ε in the open interval (0, ∞), which is the permissible range on ε.
In fact, since the sign of the derivative is independent of ε, we know
that Pr(A_{1,∞}) is either monotone increasing or monotone decreasing in ε:
strictly increasing if π(1 - π)(1/2 - π) > 0 (i.e., π < 1/2) and decreasing if
π(1 - π)(1/2 - π) < 0 (i.e., π > 1/2). Moreover, because of the monotonicity of
Pr(A_{1,∞}) in ε, it is easy to compute bounds from Eq. 21. Firstly, we
see immediately that the lower bound (assuming π > 1/2) is

   lim_{ε→∞} Pr(A_{1,∞}) = 1/2 .

Secondly, when ε is very small, Pr(A_{1,∞}) approaches

   π² / [π² + (1 - π)²] .

Note, however, that Eq. 21 is inapplicable when ε = 0, for if both c = 0
and c' = 0, the transition matrix (Eq. 14) reduces to

              C_1      C_0          C_2
       C_1 [ 1        0           0         ]
   P = C_0 [ c''π     1 - c''     c''(1-π)  ]
       C_2 [ 0        0           1         ]

and if the process starts in C_0, Pr(A_{1,∞}) = π. But for ε > 0, if
π > 1/2, Pr(A_{1,∞}) is a decreasing function of ε and its values lie in
the interval (1/2, π²/[π² + (1 - π)²]).
It is readily determined that probability matching would not be predicted
in this case.
When the ratio c'/c'' is greater than 2, the predicted value of
Pr(A_{1,∞}) is less than π, and when this ratio is less than 2, the
predicted value of Pr(A_{1,∞}) is greater than π.

Finally we derive Pr(A_{1,n+1} | E_{1,n} A_{1,n}) for this case. The derivation
is identical to that given for the case of c' = 0. Hence

   lim_{n→∞} Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = {u_1 + (1/2)u_0[c'' + (1/2)(1 - c'')]} / [u_1 + (1/2)u_0] .

Note, however, that for c = 0, the quantity u_0 is never 0 (except for
π = 0, 1), and consequently Pr(A_{1,n+1} | E_{1,n} A_{1,n}) is always less than 1.
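The matching property of Eq. 21 can be verified numerically: the asymptote equals π exactly when ε = 2, exceeds π for smaller ε, and falls below π for larger ε (with π > 1/2). The check below is an editorial addition; π = 0.7 is an arbitrary illustration.

```python
def asymptote_c0(pi, eps):
    """Eq. 21: asymptotic Pr(A1) when c = 0, with eps = c'/c''."""
    num = pi ** 2 + 0.5 * pi * (1 - pi) * eps
    den = pi ** 2 + (1 - pi) ** 2 + pi * (1 - pi) * eps
    return num / den

pi = 0.7
match = asymptote_c0(pi, 2.0)       # probability matching occurs at eps = 2
```

For very large ε the asymptote approaches 1/2 from above, and for very small ε it approaches π²/[π² + (1-π)²] from below, consistent with the bounds stated in the text.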
Contingent reinforcement. As a final example we shall apply the
one-element model to a situation where the reinforcing event on trial n
is contingent on the response on that trial. Simple contingent reinforce-
ment is defined by two probabilities π_1 and π_2 such that

   Pr(E_{1,n} | A_{1,n}) = π_1 ,    Pr(E_{1,n} | A_{2,n}) = π_2 .

We consider the case of the model in which c' = 0 and Pr(C_{0,1}) = 0.
That is, the subject is not in state C_0 on trial 1 and (since c' = 0)
he can never reach C_0 from C_1 or C_2. Hence, on all trials he is in
C_1 or C_2, and transitions between these states are governed by the
single parameter c. The trees for the C_1 and C_2 states are given
in Fig. 4.

Insert Fig. 4 about here

The transition matrix is

              C_1              C_2
       C_1 [ 1 - (1-π_1)c    (1-π_1)c ]
   P =
       C_2 [ cπ_2            1 - cπ_2 ]

and in terms of this matrix we may write

   Pr(C_{1,n+1}) = Pr(C_{1,n})[1 - (1-π_1)c] + Pr(C_{2,n})cπ_2 .
Fig. 4. Branching process for one-element model in two-choice,
contingent case.
But Pr(C_{2,n}) = 1 - Pr(C_{1,n}) and Pr(C_{1,n}) = Pr(A_{1,n}); hence

   Pr(A_{1,n+1}) = Pr(A_{1,n})[1 - c(1 - π_1 + π_2)] + cπ_2 .

This difference equation has the solution

   Pr(A_{1,n}) = π_2/(1 - π_1 + π_2) - [π_2/(1 - π_1 + π_2) - Pr(A_{1,1})][1 - c(1 - π_1 + π_2)]^{n-1} .

The asymptote is independent of c, and the rate of approach is
determined by the quantity c(1 - π_1 + π_2). It is interesting to note that
the learning function for Pr(A_{1,n}) in this case of the one-element model
is identical to that of the linear model (cf. Estes and Suppes, 1959).
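The contingent-case learning function can be checked by iterating the response recursion directly. The sketch below is an editorial addition; the parameter values are arbitrary illustrations.

```python
c, pi1, pi2 = 0.25, 0.8, 0.3
p1 = 0.5                            # Pr(A1,1)
theta = 1 - c * (1 - pi1 + pi2)     # rate term
p_inf = pi2 / (1 - pi1 + pi2)       # asymptote, independent of c

def closed_form(n):
    """Pr(A1,n) = p_inf - (p_inf - p1) * theta^(n-1)."""
    return p_inf - (p_inf - p1) * theta ** (n - 1)

# Iterate Pr(A1,n+1) = Pr(A1,n)[1 - (1-pi1)c] + [1 - Pr(A1,n)] c pi2
vals = []
p = p1
for n in range(1, 25):
    vals.append(p)
    p = p * (1 - (1 - pi1) * c) + (1 - p) * c * pi2
```

The iterated values match the closed form exactly on every trial and approach the asymptote π_2/(1 - π_1 + π_2).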
3. MULTI-ELEMENT PATTERN MODELS

3.1 General Formulation
In the literature of stimulus sampling theory a variety of proposals
have been made for conceptually representing the stimulus situation. Funda-
mental to all of these suggestions has been the distinction between pattern
elements and component elements. For the one-element case this distinc-
tion does not play a serious role, but for multi-element formulations
these alternative representations of the stimulus situation specify
different mathematical processes.
In component models, the stimulating situation is represented as a
population of elements which the learner is viewed as sampling from trial
to trial. It is assumed that the conditioning of individual elements to
responses occurs independently as the elements are sampled in conjunction
with reinforcing events, and that the response probability in the pres-
ence of a sample containing a number of elements is determined by an
averaging rule. The principal consideration has been to account for
response variability to an apparently constant stimulus situation by
postulating random fluctuations from trial to trial in the particular
sample of stimulus elements affecting the learner. These component
models have provided a mechanism for effecting a reconciliation between
the picture of gradual change usually exhibited by the learning curve
and the all-or-none law of association.
For many experimental situations a detailed account of the quanti-
tative properties of learning can be given by component models that
assume discrete associations between responses and the independently
variable elements of a stimulating situation. However, in some cases
predictions from component models fail, and it appears that a simple
account of the learning process requires the assumption that responses
become associated, not with separate components or aspects of a stimulus
situation, but with total patterns of stimulation considered as units.
The model presented in this section is intended to represent such a case.
In it we assume that an experimentally specified stimulating situation
can be conceived as an assemblage of distinct, mutually exclusive patterns
of stimulation, each of which becomes conditioned to responses on an
all-or-none basis. By "mutually exclusive" we mean that exactly one of
the patterns occurs (is sampled by the subject) on each trial. By "distinct"
we mean that no generalization occurs from one pattern to another. Thus
the clearest experimental interpretation would involve a set of patterns
having no common elements (i.e., common properties or components). In
practice the pattern model has also been applied with considerable success
to experiments in which the alternative stimuli have some common elements,
but nevertheless are sufficiently discriminable so that generalization
effects (e.g., "confusion errors") are small and can be neglected without
serious error.
In this presentation we shall limit consideration to cases in which
patterns are sampled randomly with equal likelihood so that, if there
are N patterns, each has probability 1/N of being sampled on a trial.
This sampling assumption represents only one way of formulating the model
and is presented here because it generates a fairly simple mathematical
process and provides a good account of a variety of experimental results.
However, this particular scheme for sampling patterns has restricted
applicability. For example, in certain experiments it can be demonstrated
that the stimulus array to which the subject responds is in large part
determined by events on previous trials; that is, trace stimulation
associated with previous responses and rewards determines the stimulus
pattern to which the subject responds. When this is the case, it is
necessary to postulate a more general rule for sampling patterns than the
random scheme proposed above (e.g., see the discussion of "hypothesis
models" in Suppes and Atkinson, 1960).
Before stating the axioms for the pattern model to be considered in
this section we define the following notions. As before, the behaviors
available to the subject are categorized into mutually exclusive and
exhaustive response classes (A_1, ..., A_r). The possible experimenter-defined
outcomes of a trial (e.g., giving or withholding reward, unconditioned
stimulus, knowledge of results) are classified by their effect on response
probability and are represented by a mutually exclusive and exhaustive set
of reinforcing events (E_0, E_1, ..., E_r). The event E_i (i ≠ 0) indicates
that response A_i is reinforced, and E_0 represents any trial outcome
whose effect is neutral (i.e., reinforces none of the A_i's). The
subject's response and the experimenter-defined outcomes are observable,
but the occurrence of E_i is a purely hypothetical event that represents
the reinforcing effect of the trial outcome. Event E_i is said to have
occurred when the outcome of a trial is such as to increase the probability
of response A_i in the presence of the given stimulus, provided, of course,
that this probability is not already at its maximum value.
We now present the axioms. The first group of axioms deals with the
conditioning of sampled patterns, the second group with the sampling of
patterns, and the third group with responses.
Conditioning Axioms

C1. On every trial each pattern is conditioned to exactly one response.

C2. If a pattern is sampled on a trial, it becomes conditioned with
probability c to the response (if any) that is reinforced on the
trial; if it is already conditioned to that response, it remains so
conditioned.

C3. If no reinforcement occurs on a trial (i.e., E_0 occurs), there
is no change in conditioning on that trial.

C4. Patterns that are not sampled on a trial do not change their
conditioning on that trial.

C5. The probability c that a sampled pattern will be conditioned to a
reinforced response is independent of the trial number and of
preceding events.

Sampling Axioms

S1. Exactly one pattern is sampled on each trial.

S2. Given the set of N patterns available for sampling on a trial, the
probability of sampling a given pattern is 1/N, independently of
the trial number and the preceding events.

Response Axiom

R1. On any trial, that response is made to which the sampled pattern is
conditioned.
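For concreteness, the axioms can be rendered as a single-trial simulation. The sketch below is ours, not part of the original formulation: Python is used for exposition, the model is restricted to two responses, the neutral event E_0 is omitted, and a simple contingent reinforcement schedule (introduced formally later in this section) is assumed.

```python
import random

def pattern_model_trial(k, N, c, pi21, pi12, rng=random):
    """One trial of the two-response pattern model.

    k is the number of the N patterns conditioned to A_1 (state C_k).
    pi21 = Pr(E_1 | A_2 response), pi12 = Pr(E_2 | A_1 response).
    Returns (response, reinforcing_event, new_k).
    """
    # S1, S2: exactly one pattern is sampled, each with probability 1/N.
    sampled_is_A1 = rng.random() < k / N
    # R1: the response is the one to which the sampled pattern is conditioned.
    response = 1 if sampled_is_A1 else 2
    # Simple contingent schedule: reinforcement depends only on the response.
    if response == 1:
        event = 2 if rng.random() < pi12 else 1
    else:
        event = 1 if rng.random() < pi21 else 2
    # C2, C4: only the sampled pattern can change, with probability c, and
    # only when the reinforced response differs from its current conditioning.
    new_k = k
    if rng.random() < c:
        if event == 1 and not sampled_is_A1:
            new_k = k + 1
        elif event == 2 and sampled_is_A1:
            new_k = k - 1
    return response, event, new_k
```

Because only the sampled pattern can change, the state index moves by at most one per trial, which is the Markov-chain structure exploited throughout this section.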
Later in this section we apply these axioms to a learning
experiment and to a paired-comparison study. First, however, we shall
prove several general theorems. Before we can begin our analysis it is
necessary to define the notion of a conditioning state. For the axioms
above, all patterns are sampled with equal probability, and it suffices
to let the state of conditioning indicate the number of patterns conditioned
to each response. Hence for r responses the conditioning states
are the ordered r-tuples <k_1, k_2, ..., k_r>, where k_i = 0, 1, 2, ..., N
and k_1 + k_2 + ... + k_r = N; the integer k_i denotes the number of
patterns conditioned to the i-th response. The number of possible
conditioning states is (N+r−1 choose r−1). (In a generalized model which permitted
different patterns to have different likelihoods of being sampled, it
would be necessary to specify not only the number of patterns conditioned
to a response but also the sampling probabilities associated with the
patterns.)
For simplicity, in this section we limit consideration to the case
of two alternatives except for one example where r = 3. Given only
two alternatives we denote the conditioning state on trial n of an
experiment as C_{i,n}, where i = 0, 1, 2, ..., N; the subscript i
indicates the number of patterns conditioned to A_1 and N − i the number
conditioned to A_2.
Transition Probabilities. Only one pattern is sampled per trial;
therefore, the subject can go from state C_i only to one of the three
states C_{i−1}, C_i, or C_{i+1} on any given trial. The probabilities of
these transitions depend on the value of the conditioning parameter c,
the reinforcement schedule, and the value of i. We now proceed to
compute these probabilities.
If the subject is in state C_i on trial n and an E_1 occurs,
then the possible outcomes are indicated by the tree in Figure 5.
Insert Fig. 5 about here
Fig. 5. Branching process for the N-element model on a trial when the
subject starts in state C_i and an E_1 event occurs.
On the upper main branch, which has probability i/N, a pattern that is
conditioned to A_1 is sampled and, since an E_1 reinforcement occurs,
the pattern remains conditioned to A_1. Hence, the conditioning state
on trial n + 1 is the same as on trial n (see Axiom C2). On the
lower main branch, which has probability (N−i)/N, a pattern conditioned to
A_2 is sampled; then with probability c the pattern is conditioned to A_1
and the subject moves to conditioning state C_{i+1}, whereas with
probability 1 − c conditioning is not effective and the subject remains
in state C_i. Putting these results together we obtain

    Pr(C_{i+1,n+1} | E_{1,n} C_{i,n}) = c(N−i)/N ,
                                                                    (22a)
    Pr(C_{i,n+1} | E_{1,n} C_{i,n}) = 1 − c + c i/N .

Similarly, if an E_2 occurs on trial n,

    Pr(C_{i−1,n+1} | E_{2,n} C_{i,n}) = c i/N ,
                                                                    (22b)
    Pr(C_{i,n+1} | E_{2,n} C_{i,n}) = 1 − c + c(N−i)/N .

By Axiom C3, if an E_0 occurs, then

    Pr(C_{i,n+1} | E_{0,n} C_{i,n}) = 1 .                           (22c)
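Equations 22a-c can be checked mechanically. The following sketch (Python; the function name and dictionary representation are our own conventions) returns the conditional transition distribution over states for a given reinforcing event:

```python
def transition_given_event(i, N, c, event):
    """Pr(next state | E_event, C_i) per Eqs. 22a-c, as a dict {j: prob}."""
    if event == 1:                           # Eq. 22a
        up = c * (N - i) / N
        return {i + 1: up, i: 1 - up} if i < N else {i: 1.0}
    if event == 2:                           # Eq. 22b
        down = c * i / N
        return {i - 1: down, i: 1 - down} if i > 0 else {i: 1.0}
    return {i: 1.0}                          # Eq. 22c, neutral event E_0
```

Note that the two boundary states behave correctly without special axioms: at i = N the upward probability c(N−i)/N vanishes, and at i = 0 the downward probability c·i/N vanishes.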
Noting that a transition upward can occur only when a pattern conditioned
to A_2 is sampled on an E_1 trial, and a transition downward can
occur only when a pattern conditioned to A_1 is sampled on an E_2 trial,
we can combine the results from Eqs. 22a-c to obtain

    Pr(C_{i+1,n+1} | C_{i,n}) = c [(N−i)/N] Pr(E_{1,n} | A_{2,n} C_{i,n})   (23a)

    Pr(C_{i−1,n+1} | C_{i,n}) = c (i/N) Pr(E_{2,n} | A_{1,n} C_{i,n})       (23b)

    Pr(C_{i,n+1} | C_{i,n}) = 1 − c + c [ (i/N) Pr(E_{1,n} | A_{1,n} C_{i,n})
                              + ((N−i)/N) Pr(E_{2,n} | A_{2,n} C_{i,n})
                              + Pr(E_{0,n} | C_{i,n}) ]                     (23c)

for the probabilities of one-step transitions between states. Equation 23a,
for example, states that the probability of moving from the state with i
elements conditioned to A_1 to the state with i + 1 elements conditioned
to A_1 is the product of the probability (N−i)/N that an element not already
conditioned to A_1 is sampled and the probability c·Pr(E_{1,n}|A_{2,n}C_{i,n})
that, under the given circumstances, conditioning occurs.
As defined earlier, we have a Markov process in the conditioning states
if the probability of a transition from any state to any other state
depends at most on the state existing on the trial preceding the
transition. By inspection of Eq. 23 we see that the Markov condition may be
satisfied by limiting ourselves to reinforcement schedules in which the
probability of a reinforcing event E_i depends at most on the response
of the given trial; that is, in learning-theory terminology, to
noncontingent and simple contingent schedules. This restriction will be
assumed throughout the present section except for a few remarks in which
we explicitly consider various lines of generalization.
With these restrictions in mind, we define

    π_{ij} = Pr(E_{j,n} | A_{i,n}) ,

where j = 0, ..., r; i = 1, ..., r; and Σ_j π_{ij} = 1. That is, the
reinforcement on a trial depends at most on the response of the given
trial; further, the reinforcement probabilities do not depend on the trial
number. We may then rewrite Eq. 23 as follows:
    q_{i,i+1} = c [(N−i)/N] π_{21}                                  (24a)

    q_{i,i−1} = c (i/N) π_{12}                                      (24b)

    q_{i,i} = 1 − c [(N−i)/N] π_{21} − c (i/N) π_{12}               (24c)

Note that we use the notation q_{ij} in place of Pr(C_{j,n+1} | C_{i,n}). The
reason is that the transition probabilities do not depend on n, given
the restrictions on the reinforcement schedule stated above, and the
simpler notation expresses this fact.
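Under these restrictions the whole process is summarized by a single (N+1) x (N+1) matrix of the q_ij. A sketch in Python (the function name is ours):

```python
def transition_matrix(N, c, pi12, pi21):
    """(N+1) x (N+1) matrix Q with Q[i][j] = q_ij from Eqs. 24a-c."""
    Q = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        up = c * (N - i) / N * pi21       # Eq. 24a
        down = c * i / N * pi12           # Eq. 24b
        if i < N:
            Q[i][i + 1] = up
        if i > 0:
            Q[i][i - 1] = down
        Q[i][i] = 1.0 - up - down         # Eq. 24c
    return Q
```

Each row sums to one, as a stochastic matrix must, and the tridiagonal form reflects the fact that only one pattern is sampled per trial.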
Response Probabilities and Moments. By Axioms S1, S2, and R1 we
know that the relation between response probability and the conditioning
state is simply

    Pr(A_{1,n} | C_{i,n}) = i/N .
Hence

    Pr(A_{1,n}) = Σ_{i=0}^{N} Pr(A_{1,n} | C_{i,n}) Pr(C_{i,n})
                = Σ_{i=0}^{N} (i/N) Pr(C_{i,n}) .                   (25)

But note that by definition of the transition probabilities q_{ij},

    Pr(C_{i,n}) = Σ_{j=0}^{N} Pr(C_{j,n−1}) q_{ji} .                (26)

The latter expression, together with Eq. 25, serves as the basis for a
general recursion in Pr(A_{1,n}):

    Pr(A_{1,n}) = Σ_{i=0}^{N} (i/N) Σ_{j=0}^{N} Pr(C_{j,n−1}) q_{ji} .

Now substituting for q_{ji} in terms of Eq. 24 and rearranging the sum we
have
    Pr(A_{1,n}) = Σ_{i=0}^{N} (i/N) Pr(C_{i,n−1})
                − c π_{12} Σ_{i=0}^{N} (i²/N²) Pr(C_{i,n−1})
                − c π_{21} Σ_{i=0}^{N−1} [i(N−i)/N²] Pr(C_{i,n−1})
                + c π_{21} Σ_{i=0}^{N−1} [(i+1)(N−i)/N²] Pr(C_{i,n−1})
                + c π_{12} Σ_{i=0}^{N} [i(i−1)/N²] Pr(C_{i,n−1}) .

The first sum is, by Eq. 25, Pr(A_{1,n−1}). Let us define
α_{2,n} = Σ_{i=0}^{N} (i²/N²) Pr(C_{i,n}); then the second sum is simply
c π_{12} α_{2,n−1}. Similarly the third sum is
c π_{21}[Pr(A_{1,n−1}) − Pr(C_{N,n−1}) − α_{2,n−1} + Pr(C_{N,n−1})]
= c π_{21}[Pr(A_{1,n−1}) − α_{2,n−1}], and so forth. Carrying out
the summation and simplifying we obtain the following recursion in
Pr(A_{1,n}):

    Pr(A_{1,n}) = Pr(A_{1,n−1}) + (c/N)[π_{21} − (π_{12} + π_{21}) Pr(A_{1,n−1})] .   (27)

This difference equation has the well-known solution (cf. Bush and
Mosteller, 1955; Estes, 1959b; Estes and Suppes, 1959)

    Pr(A_{1,n}) = Pr(A_{1,∞}) − [Pr(A_{1,∞}) − Pr(A_{1,1})][1 − (c/N)(π_{12} + π_{21})]^{n−1} ,   (28)

where Pr(A_{1,∞}) = lim_{n→∞} Pr(A_{1,n}) = π_{21}/(π_{12} + π_{21}).
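The closed-form solution, Eq. 28, can be checked against direct iteration of the recursion, Eq. 27. A minimal sketch (Python; the function name and the parameter values in the test are arbitrary illustrations):

```python
def p_a1(n, N, c, pi12, pi21, p1):
    """Pr(A_1,n) from the explicit solution, Eq. 28; p1 = Pr(A_1,1)."""
    p_inf = pi21 / (pi12 + pi21)            # asymptote of the recursion
    rate = 1 - (c / N) * (pi12 + pi21)      # per-trial shrink factor
    return p_inf - (p_inf - p1) * rate ** (n - 1)
```

Iterating Eq. 27 from an arbitrary starting probability reproduces this expression to machine precision, and for very large n the value approaches the asymptote pi21/(pi12 + pi21).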
At this point it will be instructive to calculate the variance
of the distribution of response probabilities Pr(A_{1,n} | C_{i,n}).
The second raw moment as defined above is

    α_{2,n} = Σ_{i=0}^{N} (i²/N²) Pr(C_{i,n}) .

Carrying out the summation as was done in the case of Eq. 27 we obtain

    α_{2,n} = α_{2,n−1}[1 − (2c/N)(π_{12} + π_{21})] + (2c/N) π_{21} Pr(A_{1,n−1})
            + (c/N²)[π_{21} + (π_{12} − π_{21}) Pr(A_{1,n−1})] .    (29)

Subtracting the square of Pr(A_{1,n}) as given in Eq. 28 from α_{2,n}
yields the variance of the response probabilities. The second and
higher moments of the response probabilities are of interest
primarily because they enter into predictions concerning various
sequential statistics. We return to this point later.
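The recursion for the second moment, Eq. 29, can be verified against an exact one-step evolution of the state distribution. The helper below (Python; the function name and return convention are ours) advances an arbitrary distribution over C_0, ..., C_N one trial using the q_ij of Eq. 24 and reports the moments of the starting distribution:

```python
def moments_step(dist, N, c, pi12, pi21):
    """One exact step of the state distribution under Eqs. 24a-c.

    dist[i] = Pr(C_i,n).  Returns the distribution on trial n+1 together
    with the first and second raw moments (alpha_1, alpha_2) on trial n.
    """
    a1 = sum((i / N) * p for i, p in enumerate(dist))
    a2 = sum((i / N) ** 2 * p for i, p in enumerate(dist))
    new = [0.0] * (N + 1)
    for i, p in enumerate(dist):
        up = c * pi21 * (N - i) / N       # q_{i,i+1}, Eq. 24a
        down = c * pi12 * i / N           # q_{i,i-1}, Eq. 24b
        new[i] += p * (1.0 - up - down)   # q_{i,i},   Eq. 24c
        if i < N:
            new[i + 1] += p * up
        if i > 0:
            new[i - 1] += p * down
    return new, a1, a2
```

Computing the second moment of the evolved distribution directly and comparing it with the right-hand side of Eq. 29 gives agreement to machine precision for any starting distribution.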
Asymptotic Distributions. The pattern model has one particularly
useful feature not shared by many other models that
have appeared in the literature. This feature is a simple calculational
procedure for obtaining the complete asymptotic distribution
of conditioning states and therefore the asymptotic distribution of
responses. The derivation to be given assumes that all elements
q_{i,i−1}, q_{i,i}, q_{i,i+1} of the transition matrix are nonzero; the
same technique can be applied if there are zero entries, except, of
course, that in forming ratios one must keep the zeros out of the
denominators.
As in Sec. 2.3, we let lim_{n→∞} Pr(C_{i,n}) = u_i. The theorem to be
proved is that all of the asymptotic conditioning-state probabilities
u_i can be expressed recursively in terms of u_0; since the u_i's
must sum to unity, this recursion suffices to determine the entire
distribution.

By Eq. 26 we note that

    u_0 = u_0 q_{0,0} + u_1 q_{1,0} ,

and hence

    u_1 = u_0 (q_{0,1}/q_{1,0}) .

We now prove by induction that a similar relation holds for any adjacent
pair of states; that is,

    u_{i+1} = u_i (q_{i,i+1}/q_{i+1,i}) .

For any state i, we have by Eq. 26,

    u_i = u_{i−1} q_{i−1,i} + u_i q_{i,i} + u_{i+1} q_{i+1,i} .

Rearranging,

    u_{i+1} q_{i+1,i} = u_i (1 − q_{i,i}) − u_{i−1} q_{i−1,i} .

However, under the inductive hypothesis we may replace u_{i−1} by
its equivalent u_i q_{i,i−1}/q_{i−1,i}. Hence

    u_{i+1} q_{i+1,i} = u_i (1 − q_{i,i} − q_{i,i−1}) .

However, 1 − q_{i,i} − q_{i,i−1} = q_{i,i+1}, since

    q_{i,i−1} + q_{i,i} + q_{i,i+1} = 1 ,

and therefore

    u_{i+1} = u_i (q_{i,i+1}/q_{i+1,i}) ,

which concludes the proof. Thus we may write

    u_1 = u_0 (q_{0,1}/q_{1,0}) ,   u_2 = u_1 (q_{1,2}/q_{2,1}) ,

and so forth. Since the u_i's must sum to unity, u_0 also is determined.
To illustrate the application of this technique we consider
some simple cases. For the noncontingent case discussed in Sec. 2.3,
π_{11} = π_{21} = π. By Eq. 24 we have

    q_{i,i+1} = c [(N−i)/N] π ,

    q_{i,i−1} = c (i/N)(1 − π) .

Applying the technique of the previous paragraph,

    u_1/u_0 = cπ / [c(1/N)(1 − π)] = Nπ/(1 − π) ,

    u_2/u_1 = c[(N−1)/N]π / [c(2/N)(1 − π)] = (N−1)π / [2(1 − π)] ,

and in general

    u_k/u_{k−1} = (N − k + 1)π / [k(1 − π)] .

This result has two interesting features. First, we note that the
asymptotic probabilities are independent of the conditioning parameter c.
Second, the ratio of u_k to u_{k−1} is the same as that of the
neighboring terms (N choose k) π^k (1 − π)^{N−k} and
(N choose k−1) π^{k−1} (1 − π)^{N−k+1} in the expansion of
[π + (1 − π)]^N. Therefore, the asymptotic
probabilities in this case are binomially distributed. For a
population of subjects whose learning is described by the model,
the limiting proportion of subjects having all N patterns conditioned
to A_1 is π^N; the proportion having all but one of the
N patterns conditioned to A_1 is Nπ^{N−1}(1 − π); and so on.
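The ratio recursion just proved gives a direct computational route to the asymptotic distribution. A sketch (Python; the function name is ours) — note that the conditioning parameter c cancels out of every ratio, in line with the observation just made:

```python
def stationary(N, c, pi12, pi21):
    """Asymptotic state probabilities u_0, ..., u_N via the ratio
    recursion u_k = u_{k-1} * q_{k-1,k} / q_{k,k-1} (Eq. 24)."""
    ratios = [1.0]                            # u_k / u_0, starting with u_0
    for k in range(1, N + 1):
        q_up = c * (N - (k - 1)) / N * pi21   # q_{k-1,k}
        q_down = c * k / N * pi12             # q_{k,k-1}
        ratios.append(ratios[-1] * q_up / q_down)
    total = sum(ratios)                       # normalize so the u_k sum to 1
    return [r / total for r in ratios]
```

For the simple contingent case the result agrees term by term with the binomial distribution having parameter pi21/(pi12 + pi21), as derived below.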
For the case of simple contingent reinforcement,

    u_k/u_{k−1} = (N − k + 1) π_{21} / (k π_{12}) .

Again we note that the u_i are independent of c. Further, the ratio of
u_k to u_{k−1} is the same as that of (N choose k) π_{21}^k π_{12}^{N−k}
to (N choose k−1) π_{21}^{k−1} π_{12}^{N−k+1}.
Therefore the asymptotic state probabilities are the terms in the
expansion of

    [π_{21}/(π_{12} + π_{21}) + π_{12}/(π_{12} + π_{21})]^N .
Explicit formulas for state probabilities are useful primarily as
intermediary expressions in the derivation of other quantities, as will
be seen below. In the special case of the pattern model (unlike other
types of stimulus sampling models) the strict determination of the
response on any trial by the conditioning state of the trial sample
permits a relatively direct empirical interpretation, for the moments
of the distribution of state probabilities are identical with the moments
of the response random variable. Thus, in the simple contingent case
we have immediately for the mean and variance of the response random
variable

    E(A_n) = π_{21}/(π_{21} + π_{12})

and

    Var(A_n) = Σ_{k=1}^{N} (k²/N²) (N choose k)
               [π_{21}/(π_{21}+π_{12})]^k [π_{12}/(π_{21}+π_{12})]^{N−k}
               − [E(A_n)]² .

A bit of caution is needed in applying this last expression to data.
If we select some fixed trial n (sufficiently large so that the
learning process may be assumed asymptotic), then the theoretical
variance for the A_1 response total of a number of independent samples
of K subjects on trial n is simply

    K π_{21} π_{12} / (π_{21} + π_{12})² ,

by the familiar theorem for the variance of a sum of independent random
variables. However, this expression does not hold for the variance of
A_1 response totals over a block of K successive trials. The additional
considerations involved in the latter case will be discussed in the next
section.
3.2 Treatment of the Simple Noncontingent Case

In this section we shall consider various predictions that may be
derived from the pattern model for simple predictive behavior in a two-choice
situation with noncontingent reinforcement. Each trial in the
reference experiment begins with presentation of a ready signal; the
subject's task is to respond to the signal by operating one of a pair
of response keys, A_1 or A_2, indicating his prediction as to which of
two reinforcing lights will appear. The reinforcing lights are programmed
by the experimenter to occur in random sequence, exactly one on
each trial, with probabilities which are constant throughout the series
and independent of the subject's behavior.

For illustrative purposes, we shall use data from two experiments
of this sort. In one of these, henceforth designated the .6 series,
thirty subjects were run, each for a series of 240 trials, with
probabilities of .6 and .4 for the two reinforcing lights. Details of the
experimental procedure, and a more complete analysis of the data than
we shall undertake here, are given by Suppes and Atkinson (1960, Ch. 10).
In the other experiment, henceforth designated the .8 series, eighty
subjects were run, each for a series of 288 trials, with probabilities
of .8 and .2 for the two reinforcing lights. Details of the procedure
and results have been reported by Friedman et al. (1960). A possibly
important difference between the conditions of the two experiments is
that in the .6 series the subjects were new to this type of experiment
whereas in the .8 series the subjects were highly practiced, having had
experience with a variety of noncontingent schedules in two previous
experimental sessions.
For our present purposes it will suffice to consider only the
simplest possible interpretation of the experimental situation in
terms of the pattern model. Let O_1 denote the more frequently
occurring reinforcing light and O_2 the less frequent light. We then
postulate a one-to-one correspondence between the appearance of light
O_i and the reinforcing event E_i which is associated with A_i (the
response of predicting O_i). Also we assume that the experimental
conditions determine a set of N distinct stimulus patterns, exactly
one of which is present at the onset of any given trial. Since, in
experiments of the sort under consideration, the experimenter usually
presents the same ready signal at the beginning of every trial, one
might assume that N would necessarily equal unity. However, we shall
not impose this restriction on the model. Rather, we shall let N
appear as a free parameter in theoretical expressions; then we shall
seek to determine from the data what value of N is required to minimize
the disparities between theoretical and observed values.
If the data of a particular experiment yield an estimate of N
greater than unity, and if with this estimate the model provides a
satisfactory account of the empirical relationships in question, we
shall conclude that the learning process proceeds as described by the
model but that, regardless of the experimenter's intention, the subjects
are sampling a population of stimulus patterns. The pattern effective
at the onset of a given trial might comprise the experimenter's ready
signal together with stimulus traces (perhaps verbally mediated) of the
reinforcing events and responses of one or more preceding trials.
It will be apparent that the pattern model could scarcely be
expected to provide a completely adequate account of the data of two
choice experiments run under the conditions sketched above. Firstly,
if the stimulus patterns to which the subject responds include cues
from preceding events, then it is extremely unlikely that all of the
available patterns would have equal sampling probabilities as assumed
in the model. Secondly, the different patterns must have component
cues in common and these would be expected to yield transfer effects
(at least on early trials) so that the response to a pattern first
sampled on trial n would be influenced by conditioning that occurred
when components of that pattern were present on earlier trials. How
ever, the pattern model assumes that all of the patterns available for
sampling are distinct in the sense that reinforcement of a response to
one pattern has no effect on response probabilities associated with
other patterns.
Despite these complications, many investigators (e.g., Suppes and
Atkinson, 1960; Estes, 1961b; Suppes and Ginsberg, 1962b; Bower, 1961)
have found it a useful strategy to apply the pattern model in the simple
form presented in the preceding section. The goal in these applications
is not the perhaps impossible one of accounting for every detail of the
experimental results, but rather the more modest, yet realizable, one
of obtaining valuable information about various theoretical assumptions
by comparing manageably simple models that embody different combinations
of assumptions. This procedure will be illustrated in the remainder of
the section.
Sequential Predictions. We begin our application of the pattern
model with a discussion of sequential statistics. It should be emphasized
that one of the major contributions of mathematical learning
theory has been to provide a framework within which the sequential
aspects of learning can be scrutinized. Prior to the development of
mathematical models, relatively little attention was paid to trial-by-trial
phenomena; at the present time, for many experimental problems,
such phenomena are viewed as the most interesting aspect of the data.
Although we consider only the noncontingent case, the same methods
may be used to obtain results for more general reinforcement schedules.
We shall develop the proofs in terms of two responses but the results
hold for any number of alternatives. If there are r responses in a
given experimental application, any one response can be denoted A_1
and the rest regarded as members of a single class, A_2.
We consider first the probability of an A_1 response given that
it occurred and was reinforced on the preceding trial; i.e.,
Pr(A_{1,n+1} | E_{1,n} A_{1,n}). It is convenient to deal first with the joint
probability Pr(A_{1,n+1} E_{1,n} A_{1,n}), then to conditionalize later. First
we note that

    Pr(A_{1,n+1} E_{1,n} A_{1,n})
        = Σ_{i,j} Pr(A_{1,n+1} C_{j,n+1} E_{1,n} A_{1,n} C_{i,n}) ,         (30)

and that Pr(A_{1,n+1} C_{j,n+1} E_{1,n} A_{1,n} C_{i,n}) may be expressed in
terms of conditional probabilities as
    Pr(A_{1,n+1} | C_{j,n+1} E_{1,n} A_{1,n} C_{i,n}) Pr(C_{j,n+1} | E_{1,n} A_{1,n} C_{i,n})
        · Pr(E_{1,n} | A_{1,n} C_{i,n}) Pr(A_{1,n} | C_{i,n}) Pr(C_{i,n}) .

But from the sampling and response axioms the probability of a response
on a trial is determined solely by the conditioning state on that
trial; i.e., the first factor in the expansion can be rewritten simply as
Pr(A_{1,n+1} | C_{j,n+1}). Further, by Axiom R1, we have
Pr(A_{1,n+1} | C_{j,n+1}) = j/N.
For the noncontingent case the probability of an E_1 on any trial is
independent of previous events and consequently we may write

    Pr(E_{1,n} | A_{1,n} C_{i,n}) = π .

Next, we note that

    Pr(C_{j,n+1} | E_{1,n} A_{1,n} C_{i,n}) = 1 , if i = j ,
                                            = 0 , if i ≠ j .

That is, an element conditioned to A_1 is sampled on trial n (since
an A_1 response occurs on n) and thus, by Axiom C2, no change in the
conditioning state can occur.

Putting these results together and substituting in Eq. 30 we
obtain
    Pr(A_{1,n+1} E_{1,n} A_{1,n}) = Σ_i (i/N) π (i/N) Pr(C_{i,n})
                                  = π Σ_i (i²/N²) Pr(C_{i,n})
                                  = π α_{2,n} ,                     (31a)

and

    Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = π α_{2,n} / Pr(E_{1,n} A_{1,n})
                                    = α_{2,n} / Pr(A_{1,n}) .       (31b)
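Equation 31a can be confirmed by brute-force summation over conditioning states. The fragment below (Python; names ours) computes the joint probability directly from an arbitrary state distribution, using the collapsed product of factors derived above:

```python
def joint_A1_E1_A1(dist, N, pi):
    """Pr(A_1,n+1 & E_1,n & A_1,n) by summing over conditioning states.

    dist[i] = Pr(C_i,n); pi = Pr(E_1) in the noncontingent case.
    """
    total = 0.0
    for i, p in enumerate(dist):
        # A_1 on trial n requires sampling a pattern conditioned to A_1
        # (probability i/N); E_1 then leaves the state unchanged (Axiom C2),
        # so A_1 recurs on trial n+1 with probability i/N.
        total += p * (i / N) * pi * (i / N)
    return total
```

The result equals pi times the second raw moment of the distribution, exactly as Eq. 31a states.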
In order to express this probability in terms of the
parameters π, c, N, and Pr(A_{1,1}), we simply substitute into Eq. 31b
the expression given for Pr(A_{1,n}) in Eq. 28 and the corresponding
expression for α_{2,n} that would be given by the solution of the
difference equation, Eq. 29. Unfortunately, the expression so obtained
is extremely cumbersome to work with. Consequently it is usually
preferable in working with data to proceed in a different way.
Suppose the data to be treated consist of proportions of occurrences
of the various trigrams A_{k,n+1} E_{j,n} A_{i,n} over blocks of M trials.
If, for example, M = 5, then in the protocol

    Trial:   1       2       3       4       5
    Event:   A_1E_1  A_1E_1  A_1E_1  A_2E_1  A_1E_1

there are four opportunities for such trigrams. The combination
A_{1,n+1} E_{1,n} A_{1,n} occurs on two of these, A_{2,n+1} E_{1,n} A_{1,n}
on one, and A_{1,n+1} E_{1,n} A_{2,n} on one; hence the proportions of
occurrence of these trigrams are .5, .25, and .25, respectively. To deal
theoretically with quantities such as these, we need only average both
sides of Eq. 31a (and the corresponding expressions for other trigrams)
over the appropriate block of trials, obtaining, e.g., for the block
running from trial n through trial n + M − 1,
    P_111 = (1/M) Σ_{n'=n}^{n+M−1} Pr(A_{1,n'+1} E_{1,n'} A_{1,n'})
          = π α_2(n,M) ,                                            (32a)

where α_2(n,M) is the average value of the second moment of the response
probabilities over the given trial block. By strictly analogous methods,
we can derive theoretical expressions for other trigram proportions, e.g.,

    P_112 = (1/M) Σ_{n'=n}^{n+M−1} Pr(A_{1,n'+1} E_{1,n'} A_{2,n'})
          = π[α_1(n,M) − α_2(n,M) + (c/N)(1 − α_1(n,M))] ,          (32b)

    P_121 = (1/M) Σ_{n'=n}^{n+M−1} Pr(A_{1,n'+1} E_{2,n'} A_{1,n'})
          = (1 − π)[α_2(n,M) − (c/N) α_1(n,M)] ,                    (32c)

    P_122 = (1/M) Σ_{n'=n}^{n+M−1} Pr(A_{1,n'+1} E_{2,n'} A_{2,n'})
          = (1 − π)[α_1(n,M) − α_2(n,M)] ,                          (32d)

and so on; the quantity α_1(n,M) denotes the average A_1 probability
(or, equivalently, the proportion of A_1 responses) over the given
trial block.
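The counting of trigram proportions from a protocol can be sketched as follows (Python; the encoding of trials as (response, event) pairs is our own convention):

```python
def trigram_proportions(protocol):
    """protocol: list of (response, event) pairs per trial, e.g. (1, 1)
    for an A_1 response followed by an E_1 event.  Returns proportions
    of triples A_i,n  E_j,n  A_k,n+1, keyed as (i, j, k), out of
    len(protocol) - 1 opportunities."""
    counts = {}
    for (resp, event), (next_resp, _) in zip(protocol, protocol[1:]):
        key = (resp, event, next_resp)
        counts[key] = counts.get(key, 0) + 1
    m = len(protocol) - 1
    return {k: v / m for k, v in counts.items()}
```

Applied to a five-trial protocol with the trigram counts described in the text (two A_1E_1-A_1, one A_1E_1-A_2, one A_2E_1-A_1), it returns the proportions .5, .25, and .25.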
Now the average moments α_i(n,M) can be treated as parameters to be
estimated from the data in order to mediate theoretical predictions. To
illustrate, let us consider a sample of data from the .8 series. Over
the first 12 trials of the .8 series, the observed proportion of
A_1 responses for the group of 80 subjects was .63 and the observed
values for the trigrams of Eqs. 32a-d were P_111 = .379, P_112 = .168,
P_121 = .061, and P_122 = .035. Using P_111 to estimate α_2(1,12),
we have from Eq. 32a

    .379 = .8 α_2(1,12) ,

which yields as our estimate α_2(1,12) = .47. Now we are in a position
to predict the value of P_122. Substituting the appropriate parameter
values into Eq. 32d, we have

    P_122 = .2(.63 − .47) = .032 ,

which is not far from the observed value of .035. Proceeding similarly,
we can use Eq. 32b to estimate c/N, viz.,

    P_112 = .168 = .8[(.63 − .47) + (c/N)(1 − .63)] ,

from which

    c/N = .135 .

With this estimate in hand, together with those already obtained for
the first and second moments, we can substitute into Eq. 32c and predict
the value of
    P_121 = .2[.47 − (.135)(.63)] = .077 ,

which is somewhat high in relation to the observed value of .061.

It should be mentioned that the simple estimation method used above
for illustrative purposes would be replaced, in a serious application
of the model, by a more systematic procedure. For example, one might
simultaneously estimate α_2 and c/N by least squares, employing all
eight of the P_ijk; this procedure would yield a better overall fit of
the theoretical and observed values.
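The arithmetic of these illustrative estimates for the .8 series can be reproduced directly from Eqs. 32a-d (Python; the two-decimal rounding of the α_2 estimate follows the text):

```python
pi = 0.8                        # Pr(E_1), .8 series
a1 = 0.63                       # observed A_1 proportion, first 12 trials
P111, P112 = 0.379, 0.168       # observed trigram proportions

alpha2 = round(P111 / pi, 2)                           # Eq. 32a -> .47
c_over_N = (P112 / pi - (a1 - alpha2)) / (1 - a1)      # Eq. 32b -> .135
P122_pred = (1 - pi) * (a1 - alpha2)                   # Eq. 32d -> .032
P121_pred = (1 - pi) * (alpha2 - c_over_N * a1)        # Eq. 32c -> .077
```

The predicted P_122 and P_121 can then be compared with the observed .035 and .061, as in the text.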
A limitation of the method just described is that it permits
estimation of the ratio c/N, but not estimation of N and c separately.
Fortunately, in the asymptotic case, the expressions for the moments
α_i are simple enough so that expressions for the trigrams in terms of
the parameters are manageable, and it turns out to be easy to evaluate
the conditioning parameter and the number of elements from these
expressions. The limit of α_{1,n} for large n is, of course, π in the
simple noncontingent case. The limit, α_2, of α_{2,n} may be obtained
from the solution of Eq. 29; however, a simpler method of obtaining
the same result is to note that, by definition,

    α_2 = Σ_{i=0}^{N} (i²/N²) u_i ,

where again u_i represents the asymptotic probability of the state in
which i elements are conditioned to A_1. Recalling that the u_i
are terms of the binomial distribution, we may then write

    α_2 = (1/N²) Σ_{i=0}^{N} i² (N choose i) π^i (1 − π)^{N−i} .

The summation is the second raw moment of the binomial distribution
with parameter π and sample size N. Therefore

    α_2 = [Nπ(1 − π) + N²π²]/N² = π² + π(1 − π)/N .                 (33)

Using Eq. 33 and the fact that lim Pr(A_{1,n}) = π, we have

    lim Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = α_2/π = π(1 − 1/N) + 1/N .   (34a)
By identical methods one can establish that

    lim Pr(A_{1,n+1} | E_{1,n} A_{2,n}) = π(1 − 1/N) + c/N ,        (34b)

    lim Pr(A_{1,n+1} | E_{2,n} A_{1,n}) = π(1 − 1/N) + (1 − c)/N ,  (34c)

and

    lim Pr(A_{1,n+1} | E_{2,n} A_{2,n}) = π(1 − 1/N) .              (34d)
With these formulas in hand, we need only apply elementary
probability theory to obtain expressions for dependencies of responses
on responses or responses on reinforcements, viz.,

    lim Pr(A_{1,n+1} | A_{1,n}) = π + (1 − c)(1 − π)/N ,            (35a)

    lim Pr(A_{1,n+1} | A_{2,n}) = π − (1 − c)π/N ,                  (35b)

    lim Pr(A_{1,n+1} | E_{1,n}) = (1 − c/N)π + c/N ,                (35c)

    lim Pr(A_{1,n+1} | E_{2,n}) = (1 − c/N)π .                      (35d)

Given a set of trigram proportions from the asymptotic data of a
two-choice experiment, we are now in a position to achieve a rigorous
test of the model by using part of the data to estimate the parameters
c and N, and then substituting these estimates into Eqs. 34a-d and
35a-d to predict the values of all eight of these statistics.
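For convenience in checking data against the model, Eqs. 34a-d and 35a-d can be collected into a single routine (Python; the function and key names are ours):

```python
def asymptotic_stats(pi, c, N):
    """The eight asymptotic sequential statistics of Eqs. 34a-d, 35a-d."""
    base = pi * (1 - 1 / N)
    return {
        "A1|E1A1": base + 1 / N,                 # Eq. 34a
        "A1|E1A2": base + c / N,                 # Eq. 34b
        "A1|E2A1": base + (1 - c) / N,           # Eq. 34c
        "A1|E2A2": base,                         # Eq. 34d
        "A1|A1": pi + (1 - c) * (1 - pi) / N,    # Eq. 35a
        "A1|A2": pi - (1 - c) * pi / N,          # Eq. 35b
        "A1|E1": (1 - c / N) * pi + c / N,       # Eq. 35c
        "A1|E2": (1 - c / N) * pi,               # Eq. 35d
    }
```

With the .6-series parameter estimates obtained below (pi = .6, c = .605, N = 3.48), this routine reproduces the predicted column of Table 3.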
We shall illustrate this procedure with the data of the .6 series. The
observed transition frequencies F(A_{i,n+1} | E_{j,n} A_{k,n}) for the last 100
trials, aggregated over subjects, are as follows (columns give the
response on trial n + 1):

                  A_1     A_2
    A_1 E_1       748     298
    A_1 E_2       394     342
    A_2 E_1       462     305
    A_2 E_2       186     264
An estimate of the asymptotic probability of an A_1 response given
an A_1E_1 event on the preceding trial can be obtained by dividing
the first entry in row one by the sum of the row; i.e.,
748/(748 + 298) = .715. But if we turn to Eq. 34a,
we note that lim Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = π(1 − 1/N) + 1/N.
Hence, letting .715 = .6(1 − 1/N) + 1/N, we obtain an estimate
of N = 3.48.7 Similarly,

    462/(462 + 305) = .602 ,

which by Eq. 34b is an estimate of π(1 − 1/N) + c/N; using our values of
π and N we find that c/N = .174 and c = .605.

7 For any one subject, N must, of course, be an integer. The fact
that our estimation procedures generally yield nonintegral values for
N may signify that N varies somewhat between subjects, or it may
simply reflect some contamination of the data by sources of experimental
error not represented in the model.

Having estimated c and N we may now generate predictions for
any of our asymptotic quantities. Table 3 presents predicted and
observed values for the quantities given in Eq. 34a to Eq. 35d. Considering
that only two degrees of freedom have been utilized in estimating
parameters, the close correspondence between theoretical and observed
quantities in Table 3 may be interpreted as giving considerable support
to the assumptions of the model. A similar analysis of the asymptotic
Insert Table 3 about here
Table 3

Predicted (pattern model) and observed values of sequential statistics
for final 100 trials of the .6 series.

    Asymptotic Quantity      Predicted    Observed
    Pr(A_1|E_1A_1)             .715         .715
    Pr(A_1|E_2A_1)             .541         .535
    Pr(A_1|E_1A_2)             .601         .601
    Pr(A_1|E_2A_2)             .428         .413
    Pr(A_1|A_1)                .645         .641
    Pr(A_1|A_2)                .532         .532
    Pr(A_1|E_1)                .669         .667
    Pr(A_1|E_2)                .496         .489
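The estimation chain of this example — observed conditional proportions from the frequency table, then N from Eq. 34a and c from Eq. 34b — can be reproduced as follows (Python; slight discrepancies from the rounded values quoted in the text are expected):

```python
freq = {  # F(A_i,n+1 | E_j,n A_k,n), last 100 trials of the .6 series
    ("A1", "E1"): (748, 298),
    ("A1", "E2"): (394, 342),
    ("A2", "E1"): (462, 305),
    ("A2", "E2"): (186, 264),
}
# Observed Pr(A_1 on n+1 | response, event on n) for each row.
obs = {k: a / (a + b) for k, (a, b) in freq.items()}

pi = 0.6
# Eq. 34a:  obs[A1,E1] = pi(1 - 1/N) + 1/N = pi + (1 - pi)/N  ->  solve for N
N = (1 - pi) / (obs[("A1", "E1")] - pi)
# Eq. 34b:  obs[A2,E1] = pi(1 - 1/N) + c/N  ->  solve for c
c = N * (obs[("A2", "E1")] - pi * (1 - 1 / N))
```

The unrounded estimates come out near N = 3.48 and c = .605, matching the values used for the predicted column of Table 3.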
data from the .8 series, which has been reported elsewhere (Estes and
Suppes, 1962), yields comparable agreement between theoretical
and observed trigram proportions. The estimate of c/N for the .8
data is very close to that for the .6 data (.172 vs. .174), but the
estimates of c and N (.31 and 1.84, respectively) are both smaller
for the .8 data. It appears that the more highly practiced subjects of
the .8 series are, on the average, sampling from a smaller population
of stimulus patterns and at the same time are less responsive to the
reinforcing lights than the more naive subjects of the .6 series.
Since no model can be expected to give a perfect account of fallible
data arising from real experiments (as distinguished from the idealized
experiments to which the model should apply strictly), it is difficult
to know how to evaluate the goodness-of-fit of theoretical to observed
values. In practice, investigators usually proceed on a largely
intuitive basis, evaluating the fit in a given instance against that
which it appears reasonable to hope for in the light of what is
known about the precision of experimental control and measurement.
Statistical tests of goodness-of-fit are sometimes possible (discussions
of some tests which may be used in conjunction with stimulus sampling
models are given by Suppes and Atkinson, 1960, and by Estes and Suppes,
1962); however, statistical tests are not entirely satisfactory
taken by themselves, for a sufficiently precise test will often indicate
significant differences between theoretical and observed values
even in cases where the agreement is as close as could reasonably be
hoped for. Generally, once a degree of descriptive accuracy has been
attained which appears satisfactory to investigators familiar with the
given area, further progress must come largely via differential tests of
alternative models.
In the case of the two-choice noncontingent situation, the ingredients
for one such test are immediately at hand; for we developed in
Sec. 2.3 a one-element, guessing-state model that is comparable to the
N-element model with respect to the number of free parameters, and which
to many might seem equally plausible on psychological grounds. These
models both embody the all-or-none assumption concerning the formation
of learned associations, but they differ in the means by which they
escape the deterministic features of the simple one-element model. It
will be recalled that the one-element model cannot handle the sequential
statistics considered in this section because it requires, for example,
a probability of unity for response A_i on any trial following a trial
on which A_i occurred and was reinforced. In the N-element model (with
N ≥ 2), there is no such constraint, for the stimulus pattern present
on the preceding reinforced trial may be replaced by another pattern,
possibly conditioned to a different response, on the following trial.
In the guessing-state model, there is no strict determinacy since the
A_i response may occur on the reinforced trial by guessing, if the
subject is in state C_0; and, if the reinforcement was not effective,
a different response may occur, again through guessing, on the
following trial.
The case of the guessing-state model with c = 0 (c, it will be recalled, being the counterconditioning parameter) provides a two-parameter model which may be compared with the two-parameter, N-element model. We will require an expression for at least one of the trigram proportions that have been studied above in connection with the N-element model. Let us take Pr(A_{1,n+1} E_{1,n} A_{1,n}) for this purpose. In Sec. 2.3 we obtained an expression for Pr(A_{1,n+1} | E_{1,n} A_{1,n}) for the case with c = 0, and thus we can write at once

    Pr(A_{1,n+1} E_{1,n} A_{1,n}) = pi Pr(A_{1,n+1} | E_{1,n} A_{1,n}) Pr(A_{1,n}) .    (36a)
Since we are interested only in the asymptotic case, we shall drop the n subscript from the right-hand side of Eq. 36a to obtain the desired theoretical asymptotic expression, Eq. 36b. Substituting into Eq. 36b the expressions for u_1 and u_0 derived in Sec. 2.3 then yields the final form, Eq. 36c.
To apply this model to the asymptotic data of the .6 series, we may first evaluate the parameter e by setting the observed proportion of A_1 responses over the terminal 100 trials, .593, equal to the right-hand side of Eq. 21 and solving for e, viz.,

    .593 = .6(.6 + .2e) / (.52 + .24e) .
Now introducing this value for e into Eq. 36c, and simplifying, we obtain the prediction

    p_111 = .2782 + .0775 c'' .

Since the observed value of p_111 for the .6 data is .249, it is apparent that no matter what value (in the admissible range 0 < c'' < 1) is chosen for the parameter c'', the value predicted from the guessing-state model will be too large. Further analysis, using the methods illustrated above, makes it clear that for no combination of parameter estimates can the guessing-state model achieve predictive accuracy comparable to that demonstrated for the N-element model in Table 3. Although this one comparison cannot be considered decisive, one might be inclined to suspect that, for interpretation of two-choice probability learning, the notion of a reaccessible guessing state is on the wrong track, whereas the N-element sampling model merits further investigation.
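The argument that the guessing-state prediction must overshoot can be checked mechanically: the prediction above is linear and increasing in c'', so its minimum over the admissible range is attained at c'' = 0. A minimal sketch (the coefficients .2782 and .0775 and the observed value .249 are those quoted in the text):

```python
# Guessing-state prediction for the trigram proportion p_111 (.6 series):
# p_111 = .2782 + .0775 * c2, where c2 stands for the parameter written
# c'' in the text and is confined to the unit interval.
def p111(c2: float) -> float:
    return 0.2782 + 0.0775 * c2

observed = 0.249
# The prediction increases with c2, so its minimum over [0, 1] is at c2 = 0.
predictions = [p111(c2 / 100) for c2 in range(101)]
assert min(predictions) == p111(0.0) == 0.2782
# Even the smallest attainable prediction exceeds the observed proportion.
assert all(p > observed for p in predictions)
```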
Mean and variance of response proportion. By letting pi_11 = pi_21 = pi in Eq. 28, we have immediately an expression for the probability of an A_1 response on trial n in the noncontingent case, viz.

    Pr(A_{1,n}) = pi - [pi - Pr(A_{1,1})](1 - c/N)^{n-1} .    (37)
If we define a response random variable A_n which equals 1 or 0 according as A_1 or A_2, respectively, occurs on trial n, then the right side of Eq. 37 also represents the expectation of this random variable on trial n. The expected number of A_1 responses in a series of K trials is, then, given by the summation of Eq. 37 over trials,

    E(Sum_{n=1}^{K} A_n) = Sum_{n=1}^{K} E(A_n) = K pi - (N/c)[pi - Pr(A_{1,1})][1 - (1 - c/N)^K] .    (38)
In experimental applications, one is frequently interested in the learning curve obtained by plotting the proportion of A_1 responses per K-trial block. A theoretical expression for this learning function is readily obtained by an extension of the method used to derive Eq. 38. Let x be the ordinal number of a K-trial block running from trial K(x-1) + 1 to Kx, where x = 1, 2, ..., and define P(x) as the proportion of A_1 responses in block x. Then

    P(x) = (1/K) Sum_{n=K(x-1)+1}^{Kx} Pr(A_{1,n})

         = pi - (N/Kc)[pi - Pr(A_{1,1})][1 - (1 - c/N)^K](1 - c/N)^{K(x-1)} .    (39a)
The value of Pr(A_{1,1}) should be in the neighborhood of .5 if response bias does not exist. However, to allow for sampling deviations we may eliminate Pr(A_{1,1}) in favor of the observed value of P(1). This can be done in the following way. Note that

    P(1) = pi - (N/Kc)[pi - Pr(A_{1,1})][1 - (1 - c/N)^K] .

Solving for [pi - Pr(A_{1,1})] and substituting the result in Eq. 39a, we obtain

    P(x) = pi - [pi - P(1)](1 - c/N)^{K(x-1)} .    (39b)

Applications of Eq. 39b to data have led to results that are
satisfying in some respects but perplexing in others (see, e.g., Estes, 1959a). In most instances the implication that the learning curve have pi as an asymptote has been borne out (Estes, 1961b, 1962), and, further, with a suitable choice of values for c/N, the function given by Eq. 39b has sufficed to describe the course of learning. However, in experiments run with naive subjects, as has been nearly always the case, the value of c/N required to fit the mean learning curve has been substantially smaller than the value required to handle the sequential statistics discussed in the preceding section. Consider, for example, the learning curve for the .6 series plotted by 20-trial blocks. The observed value of P(1) is .48 and the value of c/N estimated from the sequential statistics of the second 20-trial block is .12. With these parameter values, Eq. 39b yields a prediction of .59 for P(3) and the theoretical curve is essentially at asymptote from block 4 on. The empirical learning curve, however, does not approach .59 until block 6 and is still short of asymptote at the end of 12 blocks, the mean proportion of A_1 responses over the last five blocks being .593 (Suppes and Atkinson, 1960, p. 197).
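The block-by-block behavior of Eq. 39b is easy to reproduce numerically. A minimal sketch, using the parameter values quoted above for the .6 series (pi = .6, P(1) = .48, c/N = .12, K = 20):

```python
# Theoretical learning curve of Eq. 39b for the N-element pattern model:
# P(x) = pi - [pi - P(1)] * (1 - c/N)^(K*(x-1))
pi_, p1, c_over_n, k = 0.6, 0.48, 0.12, 20

def p_block(x: int) -> float:
    """Predicted proportion of A1 responses in the x-th K-trial block."""
    return pi_ - (pi_ - p1) * (1.0 - c_over_n) ** (k * (x - 1))

curve = [p_block(x) for x in range(1, 13)]
assert abs(curve[0] - p1) < 1e-12                     # block 1 reproduces P(1)
assert all(a < b for a, b in zip(curve, curve[1:]))   # curve rises monotonically
assert abs(curve[-1] - pi_) < 1e-3                    # essentially at asymptote pi
```

As the text observes, with c/N = .12 the theoretical curve is essentially flat at pi after only a few blocks, whereas the empirical curve rises much more slowly.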
In the case of the .8 series there is a similar disparity between the value of c/N estimated from the sequential statistics and the value estimated from the mean learning curve. As we have noted above, an optimal account of the trigram proportions Pr(A_{k,n+1} E_{j,n} A_{i,n}) requires a c/N value of approximately .17. But if this estimate is substituted into Eq. 39a, the predicted A_1 frequency in the first block of 12 trials is .67, compared to an observed value of .63, and the theoretical curve runs appreciably above the empirical curve for another five blocks. A c/N value of .06 yields a satisfactory graduation of the observed mean curve in terms of Eq. 39a, and a fit to the trigrams that does not look bad by usual standards for prediction in learning experiments. However, comparing predictions based on the two c/N estimates for the trigrams which contain this parameter, we see that the estimate of .17 is distinctly superior. For the trigrams averaged over the first 12 trials, the result is
             Observed    Theoretical, c/N = .17    Theoretical, c/N = .06
    p_112      .168              .177                      .144
    p_121      .061              .073                      .087
    p_212      .121              .119                      .152
    p_221      .062              .053                      .039
The reason for this discrepancy in the value of c/N required to give optimal descriptions of two different aspects of the data is not clear even after much investigation. One contributing factor might be individual differences in learning rates (c/N values) among subjects; these would be expected to affect the two types of statistics differently. However, in the case of the .8 series, when a more homogeneous subgroup of subjects (the middle 50% on total A_1 frequency) is analyzed, the disparity, although somewhat reduced, is not eliminated; optimal c/N values for the mean curve and the trigram statistics are now .08 and .15, respectively. The principal source of the remaining discrepancy in this homogeneous subgroup is a much smaller increment in A_1 frequency from the first to the second 12-trial block than was predicted. Over the first three blocks the observed proportions were .633, .665 and .790; the proportions predicted from Eq. 39a with c/N = .15 run .657, .779 and .800. A possible explanation is that in the early part of the series the subjects are responding to cues, perhaps verbal in character, which are discarded (i.e., are not resampled) when they fail to elicit consistently correct responding. An interpretation of this sort could be incorporated into the model and subjected to formal testing, but this has not yet been done. In any event, one can see that analysis of data in terms of a model enables us to determine precisely which aspects of the subjects' behavior are and which are not accounted for in terms of a particular set of assumptions.
Next to the mean learning curve, the most frequently used behavioral measure in learning experiments is perhaps the variance of response occurrences in a block of trials. Predicting this variance from a theoretical model is an exceedingly taxing assignment; for the effects of individual differences in learning rate, together with those of all sources of experimental error not represented in the model, must be expected to increase the observed response variance. However, this statistic is relatively easy to compute for the pattern model, and the derivation may serve as a prototype for derivations of similar expressions in other learning models. For simplicity, we shall limit consideration here to the case of the variance of A_1 response frequency in a trial block after the mean curve has reached asymptote.
As a preliminary to computation of the variance, we require a statistic which is also of interest in its own right, the covariance of A_1 responses on any two trials; that is (using the notation of Eq. 25),

    Cov(A_{n+k}, A_n) = Pr(A_{1,n+k} A_{1,n}) - Pr(A_{1,n+k}) Pr(A_{1,n}) .    (40)

First, we can establish by induction that

    Pr(A_{1,n+k} A_{1,n}) = pi Pr(A_{1,n}) - [pi Pr(A_{1,n}) - Pr(A_{1,n+1} A_{1,n})](1 - c/N)^{k-1} .
~ ; i . s form1).la is obviously an identity for k '" 1. Thus, .assuming that
the formula holds for trials nand n + k , ,we may proceed to establish
it for trials nand n + k + 1 First we 1).se our standard procedure
A. and E. 84
to expand the desired quantity in terms of' reinf'orcing events and states
of' Letting C. denote the state in which exactly j
J,n
of the N eIementsare conditioned to response AI' we may write
Pr(A A );L Pr(A ,E.. C . A )
I,n+k+II,n " I,n+k+I j,n+k I,n
J
;LPr(A IE. C. A )Pr(E. .C. A).
" I,n+k+I J,n+k I,n J,n+k I,n
Now we can make use of the assumptions that specify the noncontingent case to simplify the second factor to

    pi Pr(C_{j,n+k} A_{1,n})    and    (1 - pi) Pr(C_{j,n+k} A_{1,n})

for i = 1, 2, respectively. Also, we may apply the learning axioms to the first factor, obtaining

    Pr(A_{1,n+k+1} | E_{1,n+k} C_{j,n+k} A_{1,n}) = (1 - c)(j/N) + c[(j/N)^2 + ((N - j)/N)((j + 1)/N)]

                                                 = (1 - c/N)(j/N) + c/N

and

    Pr(A_{1,n+k+1} | E_{2,n+k} C_{j,n+k} A_{1,n}) = (1 - c/N)(j/N) .
Combining these results, we have

    Pr(A_{1,n+k+1} A_{1,n}) = Sum_j {pi[(1 - c/N)(j/N) + c/N] + (1 - pi)(1 - c/N)(j/N)} Pr(C_{j,n+k} A_{1,n})

                            = (1 - c/N) Pr(A_{1,n+k} A_{1,n}) + pi (c/N) Pr(A_{1,n}) .

Substituting into this expression in terms of our inductive hypothesis yields

    Pr(A_{1,n+k+1} A_{1,n}) = (1 - c/N){pi Pr(A_{1,n}) - [pi Pr(A_{1,n}) - Pr(A_{1,n+1} A_{1,n})](1 - c/N)^{k-1}} + pi (c/N) Pr(A_{1,n})

                            = pi Pr(A_{1,n}) - [pi Pr(A_{1,n}) - Pr(A_{1,n+1} A_{1,n})](1 - c/N)^k ,

as required.
We wish to take the limit of the right side of Eq. 40 as n -> infinity in order to obtain the covariance of the response random variable on any two trials at asymptote. The limits of Pr(A_{1,n}) and Pr(A_{1,n+k}) we know to be equal to pi, and from Eq. 35 we have the expression

    pi^2 + pi(1 - pi)(1 - c)/N

for the limit of Pr(A_{1,n+1} A_{1,n}). Making the appropriate substitutions in Eq. 40 yields the simple result

    lim_{n->inf} Cov(A_{n+k}, A_n) = pi^2 + pi(1 - pi)[(1 - c)/N](1 - c/N)^{k-1} - pi^2

                                   = (1 - c/N)^{k-1} pi(1 - pi)(1 - c)/N .    (41)
Now we are ready to compute the variance of A_1 response frequencies in a block of K trials at asymptote, by applying the standard theorem for the variance of a sum of random variables (Feller, 1957):

    Var(Sum_{n=1}^{K} A_n) = Sum_{n=1}^{K} Var(A_n) + 2 Sum_{i=1}^{K-1} Sum_{j=i+1}^{K} Cov(A_i, A_j) .

Since

    lim_{n->inf} E(A_n^2) = lim_{n->inf} E(A_n) = pi ,

the limiting variance of A_n is simply

    lim_{n->inf} Var(A_n) = pi - pi^2 = pi(1 - pi) .

Substituting this result and that for lim_{n->inf} Cov(A_{n+k}, A_n) into the general expression for Var(A_K), where A_K denotes the number of A_1 responses in the block, we obtain

    Var(A_K) = K pi(1 - pi) + 2 pi(1 - pi)[(1 - c)/N] Sum_{j=1}^{K-1} (N/c)[1 - (1 - c/N)^j]

             = K pi(1 - pi) + 2 pi(1 - pi)[(1 - c)/c] {K - (N/c)[1 - (1 - c/N)^K]} .    (42)
Application of this formula can be conveniently illustrated in terms of the asymptotic data for the .8 series. Least squares determinations of c/N and N from the trigram proportions (using Eq. 34a-d) yielded estimates of .17 and 1.84, respectively (Estes and Suppes, 1962). Inserting these values into Eq. 42, we obtain for a 48-trial block at asymptote Var(A_K) = 37.50; this variance corresponds to a standard deviation of 6.12. The observed standard deviation for the final 48-trial block was 6.94. Thus, the theory predicts a variance of the right order of magnitude, but, as anticipated, underestimates the observed value.
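The closed form of Eq. 42 can be checked against a direct summation of the asymptotic per-trial variances and the covariances of Eq. 41. A minimal sketch using the parameter estimates quoted for the .8 series (pi = .8, c/N = .17, N = 1.84); small rounding differences from the reported 37.50 are to be expected:

```python
# Variance of A1 response frequency in a K-trial block at asymptote
# (N-element pattern model), computed two ways: Eq. 42 and brute force.
pi_, n_elems, c_over_n, k_trials = 0.8, 1.84, 0.17, 48
c = c_over_n * n_elems

def var_closed_form() -> float:
    """Eq. 42."""
    tail = k_trials - (1.0 / c_over_n) * (1.0 - (1.0 - c_over_n) ** k_trials)
    return (k_trials * pi_ * (1 - pi_)
            + 2 * pi_ * (1 - pi_) * ((1 - c) / c) * tail)

def var_brute_force() -> float:
    """K per-trial variances plus twice the pairwise covariances (Eq. 41)."""
    var_terms = k_trials * pi_ * (1 - pi_)
    cov = lambda lag: ((1 - c) / n_elems) * pi_ * (1 - pi_) * (1 - c_over_n) ** (lag - 1)
    cov_terms = sum(cov(j - i) for i in range(1, k_trials + 1)
                    for j in range(i + 1, k_trials + 1))
    return var_terms + 2 * cov_terms

assert abs(var_closed_form() - var_brute_force()) < 1e-9
assert abs(var_closed_form() ** 0.5 - 6.1) < 0.1  # sd near the reported 6.12
```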
Of the many other statistics that can be derived from the model for learning data, we shall give one final example, selected primarily for the purpose of reviewing the technique for deriving sequential statistics. This technique is so generally useful that the major steps should be emphasized: first, expand the desired expression in terms of the conditioning states (as done, for example, in the case of Eq. 30); second, conditionalize responses and reinforcing events on the preceding sequence of events, introducing whatever simplifications are permitted by the boundary conditions of the case under consideration; third, apply the axioms and simplify to obtain the appropriate result. These steps will now be followed in deriving an expression of considerable interest in its own right, the probability of an A_1 response following a sequence of exactly v E_1 reinforcing events:
    Pr(A_{1,n+v} | E_{1,n+v-1} ... E_{1,n} E_{2,n-1})

      = Pr(A_{1,n+v} E_{1,n+v-1} ... E_{1,n} E_{2,n-1}) / [pi^v (1 - pi)]

      = [1 / pi^v (1 - pi)] Sum_{i,j} Pr(A_{1,n+v} C_{i,n+v} E_{1,n+v-1} ... E_{1,n} E_{2,n-1} C_{j,n-1})

      = [1 / pi^v (1 - pi)] Sum_{i,j} Pr(A_{1,n+v} | C_{i,n+v}) Pr(C_{i,n+v} | E_{1,n+v-1} ... E_{1,n} E_{2,n-1} C_{j,n-1})
            x Pr(E_{1,n+v-1} ... E_{1,n} E_{2,n-1} | C_{j,n-1}) Pr(C_{j,n-1})

      = Sum_{i,j} (i/N) Pr(C_{i,n+v} | E_{1,n+v-1} ... E_{1,n} E_{2,n-1} C_{j,n-1}) Pr(C_{j,n-1})

      = Sum_{j=0}^{N} [(1 - c j/N){1 - (1 - j/N)(1 - c/N)^v} + (c j/N){1 - (1 - (j-1)/N)(1 - c/N)^v}] Pr(C_{j,n-1})

      = 1 - (1 - p_{n-1})(1 - c/N)^v - p_{n-1}(c/N)(1 - c/N)^v .    (43)
The derivation has a formidable appearance, mainly because we have spelled out the steps in more than customary detail, but each step can readily be justified. The first involves simply using the definition of a conditional probability together with the fact that, in the simple noncontingent case, Pr(E_{1,n}) = pi and Pr(E_{2,n}) = 1 - pi for all n, and Pr(E_{1,n+v-1} ... E_{1,n} E_{2,n-1}) = pi^v (1 - pi). The second step introduces the conditioning states C_{i,n+v} and C_{j,n-1}, denoting the states in which i elements are conditioned to A_1 on trial n + v and j elements on trial n - 1, respectively. Their insertion into the right-hand expression of line 1 is permissible since the summation of Pr(C_i) over all values of i is unity and similarly for the summation of Pr(C_j). The third step is based solely on repeated application of the defining equation for a conditional probability, which permits the expansion Pr(ABC...J) = Pr(A | BC...J) Pr(B | C...J) ... Pr(J). The fourth step involves assumptions of the model: the conditionalization of A_{1,n+v} on the preceding sequence can be reduced to Pr(A_{1,n+v} | C_{i,n+v}) = i/N since, according to the theory, the preceding history affects response probability on a given trial only insofar as it determines the state of conditioning, i.e., the proportion of elements conditioned to the given response. The decomposition of Pr(E_{1,n+v-1} ... E_{1,n} E_{2,n-1} | C_{j,n-1}) into pi^v (1 - pi) is justified by the special assumptions of the simple noncontingent case. The fifth step involves calculating, for each value of j on trial n - 1, the expected proportion of elements conditioned to A_1 on trial n + v.
There are two main branches to the process starting with state C_j on trial n - 1. In one, which by the axioms has probability 1 - c(j/N), the state of conditioning is unchanged by the E_2 event on trial n - 1; then, applying Eq. 37 with n = 1 (since from trial n onward we are dealing with a sequence of E_1 events, so that pi = 1) and Pr(A_{1,1}) = j/N, we obtain the expression

    1 - (1 - j/N)(1 - c/N)^v

for the expected proportion of elements connected to A_1 on trial n + v in this branch. In the other branch, which has probability c(j/N), application of Eq. 37 with n = 1 and Pr(A_{1,1}) = (j - 1)/N yields the expression 1 - (1 - (j-1)/N)(1 - c/N)^v for the expected proportion of elements connected to A_1 on trial n + v. Carrying out the summation over j, and using the by now familiar property of the model that

    Sum_{j=0}^{N} (j/N) Pr(C_{j,n-1}) = Pr(A_{1,n-1}) = p_{n-1} ,

we finally arrive at the desired expression for the probability of A_1 following exactly v E_1's.
Application of Eq. 43 can conveniently be illustrated in terms of the .8 series. Using the estimate of .17 for c/N (obtained previously from the trigram statistics) and taking p_{n-1} = .83 (the mean proportion of A_1 responses over the last 96 trials of the .8 series), we can compute the following values for the conditional response proportions:

    v              0      1      2      3      4
    Theoretical  .689   .742   .786   .822   .852
    Observed     .695    --    .838   .859   .897
It can be seen that the trend of the theoretical values represents quite well the trend of the observed proportions over the last 96 trials. Somewhat surprisingly, the observed proportions run slightly above the predicted values. There is no indication here of the "negative recency effect" (decrease in A_1 proportion with increasing length of the E_1 sequence) reported in a number of published two-choice studies (e.g., Jarvik, 1951; Nicks, 1959). It may be significant that no negative recency effect is observed in the .8 series, which, it will be recalled, involved well-practiced subjects who had had experience with a wide range of pi values in preceding series. However, the effect is observed in the .6 series, conducted with subjects new to this type of experiment (cf. Suppes and Atkinson, 1960, pp. 212-213). This differential result appears to support the idea (Estes, 1962) that the negative recency phenomenon is attributable to guessing habits carried over from everyday life to the experimental situation and extinguished during a long training series conducted with noncontingent reinforcement.
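The theoretical row of the table above follows directly from Eq. 43. A minimal sketch of the computation, with c/N = .17 and p_{n-1} = .83 as quoted in the text:

```python
# Eq. 43: probability of an A1 response after exactly v successive E1 events
# (preceded by an E2), evaluated for the .8 series at p = .83, c/N = .17.
c_over_n, p = 0.17, 0.83

def p_after_run(v: int) -> float:
    decay = (1.0 - c_over_n) ** v
    return 1.0 - (1.0 - p) * decay - p * c_over_n * decay

theoretical = {v: round(p_after_run(v), 3) for v in range(5)}
assert theoretical[0] == 0.689
assert theoretical[2] == 0.786
assert theoretical[3] == 0.822
assert theoretical[4] == 0.852
```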
We shall conclude our analysis of the N-element pattern model by proving a very general "matching theorem." The substance of this theorem is that, so long as either an E_1 or an E_2 reinforcing event occurs on each trial, the proportion of A_1 responses for any individual subject should tend to match the proportion of E_1 events over a sufficiently long series of trials regardless of the reinforcement schedule.
For purposes of this derivation, we shall identify by a subscript x the probabilities and events associated with the individual x in a population of subjects; thus p_{x1,n} will denote the probability of an A_1 response by subject x on trial n, and E_{x1,n} and A_{x1,n} will denote random variables which take on the values 1 or 0 according as an E_1 event and an A_1 response do or do not occur in this subject's protocol on trial n. With this notation, the probability of an A_1 response by subject x on trial n + 1 can be expressed by the recursion

    p_{x1,n+1} = p_{x1,n} + (c/N)(E_{x1,n} - A_{x1,n}) .    (44)
The genesis of Eq. 44 should be reasonably obvious if we recall that p_{x1,n} is equal to the proportion of elements currently conditioned to the A_1 response. This proportion can change only if an E_1 event occurs on a trial when a stimulus pattern conditioned to A_2 is sampled, in which case E_{x1,n} - A_{x1,n} = 1 - 0 = 1, or if an E_2 event occurs on a trial when a pattern conditioned to A_1 is sampled, in which case E_{x1,n} - A_{x1,n} = 0 - 1 = -1. In the former case, the proportion of patterns conditioned to A_1 increases by 1/N if conditioning is effective (which has probability c) and in the latter case this proportion decreases by 1/N (again with probability c).
Considering now a series of, say, n* trials, we can convert Eq. 44 into an analogous recursion for response proportions over the series simply by summing both sides over n and dividing by n*, viz.

    (1/n*) Sum_{n=1}^{n*} p_{x1,n+1} = (1/n*) Sum_{n=1}^{n*} p_{x1,n} + (c/N)(1/n*) Sum_{n=1}^{n*} (E_{x1,n} - A_{x1,n}) .

Now we subtract the first sum on the right from both sides of the equation, and distribute the second sum on the right, yielding

    (p_{x1,n*+1} - p_{x1,1}) / n* = (c/N)[(1/n*) Sum_{n=1}^{n*} E_{x1,n} - (1/n*) Sum_{n=1}^{n*} A_{x1,n}] .

The limit of the left side of this last equation is obviously zero as n* -> infinity; thus, taking the limit and rearranging, we have
7 Equation 45 holds only if the two limits exist, which will be the case if the reinforcing event on trial n depends at most on the outcomes of some finite number of preceding trials. When this restriction is not satisfied, a substantially equivalent theorem can be derived simply by dividing both sides of the equation immediately preceding by (c/N)(1/n*) Sum_{n=1}^{n*} E_{x1,n} before passing to the limit; that is,

    (p_{x1,n*+1} - p_{x1,1}) / [(c/N) Sum_{n=1}^{n*} E_{x1,n}] = 1 - Sum_{n=1}^{n*} A_{x1,n} / Sum_{n=1}^{n*} E_{x1,n} .

Except for special cases in which the sum in the denominators converges, the limit of the left-hand side is zero and

    lim_{n*->inf} Sum_{n=1}^{n*} A_{x1,n} / Sum_{n=1}^{n*} E_{x1,n} = 1 .
    lim_{n*->inf} (1/n*) Sum_{n=1}^{n*} A_{x1,n} = lim_{n*->inf} (1/n*) Sum_{n=1}^{n*} E_{x1,n} .    (45)
To appreciate the strength of this prediction one should note that it holds for the data of an individual subject starting at any arbitrarily selected point in a learning series, provided only that a sufficiently long block of trials following that point is available for analysis. Further, it holds regardless of the values of the parameters N and c (provided that the latter is not zero) and regardless of the way in which the schedule of reinforcement may depend on preceding events, the trial number, the subject's behavior, or even events outside the system (e.g., the behavior of another individual in a competitive or cooperative social situation). Examples of empirical applications of this theorem under a variety of reinforcement schedules are to be found in studies reported by Estes, 1957a, and Friedman et al., 1963.
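The matching prediction of Eq. 45 is easy to illustrate by simulation, even under a response-contingent schedule. The sketch below assumes an arbitrary illustrative schedule (E_1 with probability .9 after an A_2 response and .5 after an A_1 response; these numbers, and the parameter values, are hypothetical) and checks that the A_1 proportion still tracks the E_1 proportion:

```python
import random

# Monte Carlo check of the matching theorem (Eq. 45) for the N-element
# pattern model: the long-run A1 proportion matches the long-run E1
# proportion even when reinforcement depends on the subject's behavior.
random.seed(1)
n_elems, c = 4, 0.3
state = [random.randint(0, 1) for _ in range(n_elems)]  # 1 = conditioned to A1

a_count = e_count = 0
trials = 200_000
for _ in range(trials):
    i = random.randrange(n_elems)          # sample one stimulus pattern
    a1 = state[i]                          # respond per its conditioning
    # hypothetical contingent schedule: E1 more likely after an A2 response
    e1 = random.random() < (0.5 if a1 else 0.9)
    if random.random() < c:                # conditioning effective w.p. c
        state[i] = 1 if e1 else 0
    a_count += a1
    e_count += e1

assert abs(a_count / trials - e_count / trials) < 0.02
```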
3.3 Analysis of a Paired Comparison Learning Experiment
In order to exhibit a somewhat different interpretation of the axioms of Sec. 3.1, we shall now analyze an experiment involving a paired-comparison procedure. The experimental situation consists of a sequence of discrete trials. There are r objects, denoted A_i (i = 1, ..., r). On each trial two (or more) of these objects are presented to the subject and he is required to choose between them. Once his response has been made the trial terminates with the subject winning or losing a fixed amount of money. The subject's task is to win as frequently as possible. There are many aspects of the situation that can be manipulated by the experimenter; for example, the strategy by which the experimenter makes available certain subsets of objects from which the subject must choose, the schedule by which the experimenter determines whether the selection of a given object leads to a win or loss, and the amount of money won or lost on each trial.
The particular experiment for which we shall essay a theoretical analysis was reported by Suppes and Atkinson (1960, Ch. 11). The problem for the subjects involved repeated choices from subsets of a set of three objects, which may be denoted A_1, A_2, and A_3. On each trial one of the presentation sets was displayed; the subject selected one of the objects in the presentation set; then the trial terminated with a win or a loss of a small sum of money. The four presentation sets (A_1 A_2), (A_1 A_3), (A_2 A_3), and (A_1 A_2 A_3) occurred with equal probabilities over the series of trials. Further, if object A_i was selected on a trial, then with probability lambda_i the subject lost and with probability 1 - lambda_i he won the predesignated amount. More complex schedules of reinforcement could be used; of particular interest is a schedule where the likelihood of a win following the selection of a given object depends on the other available objects in the presentation group. For example, the probability of a win following an A_1 choice could differ depending on whether the (A_1 A_2), (A_1 A_3), or (A_1 A_2 A_3) presentation group occurred. The analysis of these more complex schedules does not introduce new mathematical problems and may be pursued by the same methods we shall use for the simpler case.
Before the axioms of Sec. 3.1 can be applied to the present experiment we need to provide an interpretation of the stimulus situation confronting the subject from trial to trial. The one we select is somewhat arbitrary, and in Part 4 alternative interpretations are examined. Of course, discrepancies between predicted and observed quantities will indicate ways in which our particular analysis of the stimulus needs to be modified.
We shall represent the stimulus display associated with the presentation of the pair of objects (A_i A_j) by a set S_ij of stimulus patterns of size N; the triple of objects (A_1 A_2 A_3) will be represented by a set S_123 of stimulus patterns of size N*. Thus, there are four sets of stimulus patterns, and we assume that the sets are pairwise disjoint (i.e., have no patterns in common). Since, in the model under consideration, the stimulus element sampled on any trial represents the full pattern of stimulation effective on the trial, one might wonder why a given combination of objects, say (A_1 A_2), should have more than one element associated with it. It might be remarked in this connection that in introducing a parameter N to represent set size, we do not necessarily assume N > 1. We simply allow for the possibility that such variations in the situation or different orders of presentation of the same set of objects on different trials might give rise to different stimulus patterns. The assumption that the stimulus patterns associated with a given presentation set are pairwise disjoint does not seem appealing on common sense grounds; nevertheless it is of interest to see how far we can go in predicting the data of a paired-comparison learning experiment with the simplified model incorporating this highly restrictive assumption. Even though we cannot attempt to handle the positive and negative transfer effects that must occur between different members of the set of patterns associated with a given combination of objects during learning, we may hope to account for statistics of asymptotic data.
When the pair of objects (A_i A_j) is presented the subject must select A_i or A_j (i.e., make response A_i or A_j); hence all pattern elements in S_ij become conditioned to A_i or A_j. Similarly all elements in S_123 become conditioned to A_1, A_2, or A_3. When (A_i A_j) is presented the subject samples a single pattern from S_ij and makes the response to which the pattern is conditioned.
The final step, before applying the axioms of Sec. 3.1, is to provide an interpretation of reinforcing events. Our analysis is as follows: If (A_i A_j) is presented and the A_i object is selected, then (a) the E_i reinforcing event occurs if the A_i response is followed by a win, and (b) the E_j event occurs if the A_i response is followed by a loss. If (A_i A_j A_k) is presented and the A_i object is selected, then (a) the E_i event occurs if the A_i response is followed by a win, and (b) E_j or E_k occurs, the two events having equal probabilities, if the A_i response is followed by a loss. This collection of rules represents only one way of relating the observable trial outcomes to the hypothetical reinforcing events. For example, when A_i is selected given (A_i A_j A_k) and followed by a loss, rather than having E_j or E_k occur with equal likelihoods, one might postulate that they occur with probabilities dependent on the ratio of wins following A_j responses to wins following A_k responses over previous trials. Many such variations in the rules of correspondence between trial outcomes and reinforcing events have been explored; these variations become particularly important when the experimenter manipulates the amount of money won or lost, the magnitude of reward in animal studies, and related variables (see Estes, 1960; Atkinson, 1962; and Suppes and Atkinson, 1960, Chapter 11, for discussions of this point).
In analyzing the model we shall use the following notation:

    A^{(ij)}_{i,n} = occurrence of an A_i response on the nth presentation of (A_i A_j) [note that the reference is not to the nth trial of the experiment but to the nth presentation of (A_i A_j)];

    W^{(ij)}_n = a win on the nth presentation of (A_i A_j);

    L^{(ij)}_n = a loss on the nth presentation of (A_i A_j).
We now proceed to derive the probability of an A_i response on the nth presentation of (A_i A_j), namely Pr(A^{(ij)}_{i,n}). First we note that the state of conditioning of a stimulus pattern can change only when it is sampled. Since all of the sets of stimulus patterns are pairwise disjoint, the sequence of trials on which (A_i A_j) is presented forms a learning process that may be studied independently of what happens on other trials (see Axiom C4); that is, the interspersing of other types of trials between the nth and (n + 1)st presentation of (A_i A_j) has no effect on the conditioning of patterns in set S_ij. We now want to obtain a recursive expression for Pr(A^{(ij)}_{i,n}). This can be done by using the same methods employed in the preceding section. But to illustrate another approach we proceed differently in this case.
Let Y_n = Pr(A^{(ij)}_{i,n}) and 1 - Y_n = Pr(A^{(ij)}_{j,n}). Then the possible changes in Y_n are given in Figure 6.

Insert Figure 6 about here

With probability 1 - c no change occurs in conditioning regardless of trial events and hence Y_{n+1} = Y_n; with probability c a change can occur. If A_i occurs and is followed by a win then the sampled element remains conditioned to A_i; however, if a loss occurs the sampled element (which was conditioned to A_i) becomes conditioned to A_j and thus Y_{n+1} = Y_n - 1/N. If A_j occurs and is followed by a win then Y_{n+1} = Y_n; however, if it is followed by a loss the sampled element (which was conditioned to A_j) becomes conditioned to A_i; hence Y_{n+1} = Y_n + 1/N. Putting these results together we have

    Y_{n+1} = (1 - c)Y_n + c{Y_n[(1 - lambda_i)Y_n + lambda_i(Y_n - 1/N)] + (1 - Y_n)[(1 - lambda_j)Y_n + lambda_j(Y_n + 1/N)]} ,

which simplifies to the expression

    Y_{n+1} = Y_n[1 - (c/N)(lambda_i + lambda_j)] + (c/N) lambda_j .    (46)
Solving this difference equation, we obtain

    Y_n = lambda_j/(lambda_i + lambda_j) - [lambda_j/(lambda_i + lambda_j) - Y_1][1 - (c/N)(lambda_i + lambda_j)]^{n-1} .    (47)

Fig. 6. Branching process for a diad probability on a paired comparison learning trial.
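The solution just given can be verified by iterating Eq. 46 directly and comparing with Eq. 47 at each step. A minimal sketch; the parameter values are hypothetical illustrations, not estimates from the experiment:

```python
# Dyad learning process: iterate Eq. 46 and compare with the closed-form
# solution (Eq. 47). All parameter values below are hypothetical.
lam_i, lam_j = 0.4, 0.8          # loss probabilities for objects A_i, A_j
c_over_n, y1 = 0.05, 0.5         # c/N and the initial probability Y_1

def y_closed(n: int) -> float:
    """Eq. 47."""
    y_inf = lam_j / (lam_i + lam_j)
    rate = 1.0 - c_over_n * (lam_i + lam_j)
    return y_inf - (y_inf - y1) * rate ** (n - 1)

y = y1
for n in range(1, 500):
    assert abs(y - y_closed(n)) < 1e-12   # iteration reproduces Eq. 47
    y = y * (1.0 - c_over_n * (lam_i + lam_j)) + c_over_n * lam_j  # Eq. 46

# At asymptote the choice probability favors the object less often punished:
assert abs(y_closed(10_000) - lam_j / (lam_i + lam_j)) < 1e-9
```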
We now consider the presentation set (A_1 A_2 A_3); for simplicity let alpha_n = Pr(A^{(123)}_{1,n}), beta_n = Pr(A^{(123)}_{2,n}), and 1 - alpha_n - beta_n = Pr(A^{(123)}_{3,n}). The possible changes in alpha_n are given in Figure 7. For example, on the bottom branch conditioning is effective and an A_3 response occurs which leads to a loss; hence E_1 or E_2 occur with equal probabilities. But an A_3 followed by E_1 makes alpha_{n+1} = alpha_n + 1/N*, whereas an A_3 followed by E_2 makes alpha_{n+1} = alpha_n.

Insert Figure 7 about here

Combining the results in this figure yields the following difference equation:

    alpha_{n+1} = alpha_n(1 - c) + c{alpha_n[(1 - lambda_1)alpha_n + lambda_1(alpha_n - 1/N*)]
                  + beta_n[(1 - lambda_2)alpha_n + lambda_2((1/2)(alpha_n + 1/N*) + (1/2)alpha_n)]
                  + (1 - alpha_n - beta_n)[(1 - lambda_3)alpha_n + lambda_3((1/2)(alpha_n + 1/N*) + (1/2)alpha_n)]} .

Simplifying this result we obtain

    alpha_{n+1} = alpha_n[1 - (c/N*)(lambda_1 + lambda_3/2)] + beta_n(c/2N*)(lambda_2 - lambda_3) + (c/2N*)lambda_3 .    (48a)

By a similar argument we obtain

    beta_{n+1} = beta_n[1 - (c/N*)(lambda_2 + lambda_3/2)] + alpha_n(c/2N*)(lambda_1 - lambda_3) + (c/2N*)lambda_3 .    (48b)
Fig. 7. Branching process for a triad probability on a paired comparison learning trial.
Solutions for the pair of difference equations given by Eqs. 48a and 48b are well known and can be obtained by a number of different techniques (see Goldberg, 1958, pp. 130-133, or Jordan, 1950). Any solution presented can be verified by substituting into the appropriate difference equations. However, for now we shall limit consideration to asymptotic results. In terms of the Markov chain property of our process it can be shown that the limits alpha = lim_{n->inf} alpha_n and beta = lim_{n->inf} beta_n exist. Letting alpha_{n+1} = alpha_n = alpha and beta_{n+1} = beta_n = beta in Eqs. 48a and 48b, we obtain

    alpha(lambda_1 + lambda_3/2) = (1/2)beta(lambda_2 - lambda_3) + (1/2)lambda_3 ,

    beta(lambda_2 + lambda_3/2) = (1/2)alpha(lambda_1 - lambda_3) + (1/2)lambda_3 .
Solving for a and 13, and rewriting we have
lim Pr(A(12
3
l) ~
).,2).,3
(49a)
l,n
).,1).,2 + ).,1).,3 + ).,2).,3
,
lim p r ( ~ 1 2 3 ) )
).,1).,3
(49b)
,n
).,1).,2 + ).,1).,3 + ).,2).,3
,
and
).,1).,2
lim
pr(A(12
3
) )
(Mc)
~
).,1).,2 + ).,1).,3 +).,2).,3
3,n
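As a numerical check on Eq. 49a-c, the triad process can be examined directly in a short Python sketch. The parameter values (c, N*, and the λ's) below are purely illustrative, and the transition rule assumed is the branching process described above: on each triad trial one pattern is sampled, a loss occurs with the corresponding λ, and with probability c the sampled pattern is reconditioned to one of the other two responses, chosen with equal probabilities. The exact transition matrix on conditioning states is power-iterated to its stationary vector, and the stationary mean of the proportion conditioned to A_1 is compared with Eq. 49a.

```python
from itertools import product

def stationary_mean_alpha(n_star, c, lam):
    # Enumerate conditioning states (k1, k2, k3) with k1 + k2 + k3 = N*.
    states = [s for s in product(range(n_star + 1), repeat=3) if sum(s) == n_star]
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    T = [[0.0] * n for _ in range(n)]
    for s in states:
        i, stay = index[s], 1.0
        for j in range(3):              # a pattern conditioned to A_{j+1} is sampled
            if s[j] == 0:
                continue
            for m in range(3):          # on a loss, recondition it to another response
                if m == j:
                    continue
                p = (s[j] / n_star) * lam[j] * c * 0.5
                t = list(s); t[j] -= 1; t[m] += 1
                T[i][index[tuple(t)]] += p
                stay -= p
        T[i][i] += stay
    pi = [1.0 / n] * n
    for _ in range(5000):               # power iteration to the stationary vector
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return sum(pi[index[s]] * s[0] / n_star for s in states)

lam = (0.2, 0.5, 0.7)                   # illustrative loss probabilities
alpha = stationary_mean_alpha(4, 0.3, lam)
pred = lam[1] * lam[2] / (lam[0] * lam[1] + lam[0] * lam[2] + lam[1] * lam[2])
```

The stationary mean agrees with Eq. 49a because the expected-value recursion is linear in α_n and β_n, so its fixed point is exactly the asymptote derived above.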
The other moments of the distribution of response probabilities can be obtained following the methods employed in Sec. 3.1; and, at asymptote, we can generate the entire distribution. In particular, for set S_ij the asymptotic probability that k patterns are conditioned to A_i and N - k to A_j is simply

   [N! / k!(N - k)!] [λ_j/(λ_i + λ_j)]^k [λ_i/(λ_i + λ_j)]^(N-k) .

For the set S_123 the asymptotic probability of k_1 patterns conditioned to A_1, k_2 to A_2, and k_3 to A_3 (where k_1 + k_2 + k_3 = N*) is

   [N*! / (k_1! k_2! k_3!)] α^(k_1) β^(k_2) (1 - α - β)^(k_3) ,

where α, β, and 1 - α - β are the asymptotic probabilities given by Eq. 49a-c.
In analyzing data, it also is helpful to examine the marginal limiting probability of an A_i response, Pr(A_i), in addition to the other quantities mentioned above. We define Pr(A_i) as the probability of an A_i response on any trial (regardless of the stimulus display) once the process has reached asymptote. Theoretically,

   Pr(A_1) = Pr(D(12)) lim_{n→∞} Pr(A(12)_{1,n}) + Pr(D(13)) lim_{n→∞} Pr(A(13)_{1,n}) + Pr(D(123)) lim_{n→∞} Pr(A(123)_{1,n}) ,

and similarly for Pr(A_2) and Pr(A_3), where Pr(D(ij)) is the probability of presenting the pair of objects (A_i, A_j) and Pr(D(123)) the probability of presenting the triad.
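With the four presentation sets occurring with equal probabilities (as in the experiment analyzed below), the marginal Pr(A_1) is a simple average of the per-set asymptotes; a quick Python check using the predicted values quoted in Table 4:

```python
# Predicted asymptotic probability of an A_1 response on each of the four
# equiprobable presentation sets; A_1 cannot occur when (A_2 A_3) is shown.
p_a1_given_set = {"(12)": 0.643, "(13)": 0.706, "(23)": 0.0, "(123)": 0.507}
pr_a1 = sum(p_a1_given_set.values()) / len(p_a1_given_set)
print(round(pr_a1, 3))   # 0.464, the predicted marginal in Table 4
```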
The experimental results we consider were reported in preliminary form in Suppes and Atkinson (1960). Two groups were run, each involving 48 subjects; subjects in one group won or lost one cent on each trial, and those in the other group won or lost five cents on each trial. We shall consider only the one-cent group, for an analysis of the differential effects of the two reward values requires a more elaborate interpretation of reinforcing events. Subjects were run for 400 trials with the following reinforcement schedule:

   λ_1 = 1/3 ,  λ_2 = 6/10 ,  λ_3 = 8/10 .
Figure 8 presents the observed proportions of A_1, A_2, and A_3 responses in successive 20-trial blocks.

Insert Figure 8 about here

The three curves appear to be very stable over the last 10 or so blocks; consequently we treat the data over trials 301 to 400 as asymptotic.
By Eq. 47 and Eq. 49a-c we may generate predictions for the asymptotic diad probabilities lim Pr(A(ij)_{i,n}) and the triad probabilities lim Pr(A(123)_{i,n}). Given these values and the fact that the four presentation sets occur with equal probabilities we may, as shown above, generate predictions for Pr(A_i). The predicted values for these quantities and the observed proportions over the last 100 trials are presented in Table 4. The correspondence between predicted and observed values is
Fig. 8. Observed proportion of A_i responses in successive 20-trial blocks for paired-comparison experiment.
Insert Table 4 about here
very good, particularly for Pr(A_i)_∞ and the diad probabilities Pr(A(ij)_{i,∞}). The largest discrepancy is for the triple presentation set, where we note that the observed value of Pr(A(123)_{1,∞}) is .041 above the predicted value of .507.
The statistical problem of determining whether or not this particular difference is significant is a complex matter and we do not undertake it here. However, it should be noted that similar discrepancies have been found in other studies dealing with three or more responses (see Gardner, 1957; Detambel, 1955) and it may be necessary, in subsequent developments of the theory, to consider some reinterpretation of reinforcing events in the multiple-response case.
In order to make predictions for more complex aspects of the data it is necessary to obtain estimates of c, N, and N*. Estimation procedures of the sort referred to in Sec. 3.2 are applicable, but the analysis becomes tedious and such details are not appropriate here. However, some comparisons can be made between sequential statistics that do not depend on parameter values. For example, certain nonparametric comparisons can be made between statistics where each individually depends on c and N, but where the difference is independent of these parameters. Such comparisons are particularly helpful when they permit us to discriminate among different models without introducing the complicating factor of having to estimate parameters.
Table 4

Theoretical and observed asymptotic choice proportions
for paired-comparison learning experiment.

                   Predicted   Observed
Pr(A_1)              .464        .473
Pr(A_2)              .302        .294
Pr(A_3)              .234        .233
Pr(A_1(12))          .643        .651
Pr(A_1(13))          .706        .700
Pr(A_2(23))          .571        .561
Pr(A_1(123))         .507        .548
Pr(A_2(123))         .282        .258
Pr(A_3(123))         .211        .194
To indicate the types of comparisons that are possible, we may consider the subsequence of trials on which (A_1 A_2) is presented and, in particular, the expression

   Pr(A(12)_{1,n+1} | W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1}) .

That is, the probability of an A_1 response on the (n+1)st presentation of (A_1 A_2), given that on the nth presentation of (A_1 A_2) an A_1 occurred and was followed by a win, and that on the (n-1)st presentation of (A_1 A_2) an A_2 occurred followed by a win. To compute this probability we note that

   Pr(A(12)_{1,n+1} | W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1})
        = Pr(A(12)_{1,n+1} W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1}) / Pr(W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1}) .

Now our problem is to compute the two quantities on the right-hand side of this equation. We first observe that

   Pr(A(12)_{1,n+1} W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1})
        = Σ_i Pr(A(12)_{1,n+1} W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1} C(12)_{i,n-1}) ,
where C(12)_{i,n} denotes the conditioning state for set S_12 in which i elements are conditioned to A_1 and N - i to A_2 on the nth presentation of (A_1 A_2). Conditionalizing and applying the axioms, we may expand the last expression into

   Σ_i Σ_j Pr(A(12)_{1,n+1} | C(12)_{j,n+1}) Pr(C(12)_{j,n+1} | W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1} C(12)_{i,n-1})
        × (1 - λ_1) Pr(A(12)_{1,n} | C(12)_{i,n-1}) (1 - λ_2) Pr(A(12)_{2,n-1} | C(12)_{i,n-1}) Pr(C(12)_{i,n-1}) .

Further, the sampling and response axioms permit the simplifications

   Pr(A(12)_{1,n} | C(12)_{i,n-1}) = i/N

and

   Pr(A(12)_{2,n-1} | C(12)_{i,n-1}) = (N - i)/N .
Finally, in order to carry out the summation, we make use of the relation

   Pr(C(12)_{j,n+1} | C(12)_{i,n-1} W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1}) = 1 for i = j
                                                                                  = 0 for i ≠ j ,

which expresses the fact that no change in the conditioning state can occur if the pattern sampled leads to a win (see Axiom C2). Combining these results and simplifying we have

   Pr(A(12)_{1,n+1} W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1}) = (1 - λ_1)(1 - λ_2) Σ_i (i/N)^2 [(N - i)/N] Pr(C(12)_{i,n-1}) .   (50a)

Similarly we obtain

   Pr(W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1}) = (1 - λ_1)(1 - λ_2) Σ_i (i/N) [(N - i)/N] Pr(C(12)_{i,n-1}) ,   (50b)

and finally, taking the quotient of the last two expressions,

   Pr(A(12)_{1,n+1} | W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1})
        = Σ_i (i/N)^2 [(N - i)/N] Pr(C(12)_{i,n-1}) / Σ_i (i/N) [(N - i)/N] Pr(C(12)_{i,n-1}) .   (50c)
We next consider the same sequential statistic but with the responses reversed on trials n and n - 1; namely,

   Pr(A(12)_{1,n+1} | W(12)_n A(12)_{2,n} W(12)_{n-1} A(12)_{1,n-1}) .

Interestingly enough, if we compute the corresponding joint and marginal probabilities, they turn out to be expressed by the right sides of Eq. 50a and 50b, respectively. Hence, for all n,

   Pr(A(12)_{1,n+1} | W(12)_n A(12)_{1,n} W(12)_{n-1} A(12)_{2,n-1}) = Pr(A(12)_{1,n+1} | W(12)_n A(12)_{2,n} W(12)_{n-1} A(12)_{1,n-1}) .   (51)

Comparable predictions, of course, hold for the subsequences of trials on which (A_1 A_3) or (A_2 A_3) are presented.
Equation 51 provides a test of the theory which does not depend on parameter estimates. Further, it is a prediction that differentiates between this model and many other models. For example, in the next section we consider a certain class of linear models, and it can be shown that they generate the same predictions for the quantities in Table 4 as the pattern model. However, the sequential prediction in Eq. 51 does not hold for the linear model.
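The equality in Eq. 51 is easy to probe by simulation. The sketch below (Python; the parameter values are illustrative, and the conditioning rule assumed is the one used throughout this section: on a loss the sampled pattern is reconditioned to the other response with probability c, while wins leave the state unchanged) estimates both conditional probabilities for a single (A_1 A_2) subsequence:

```python
import random

def sequential_stats(n_trials, N, c, lam1, lam2, seed=1):
    # Pattern model for the pair (A_1, A_2): k of the N patterns are
    # conditioned to A_1; one pattern is sampled at random on each trial.
    rng = random.Random(seed)
    k, history = N // 2, []
    for _ in range(n_trials):
        resp = 1 if rng.random() < k / N else 2
        lam = lam1 if resp == 1 else lam2          # lam_i = loss probability
        won = rng.random() >= lam
        if not won and rng.random() < c:           # loss: recondition the pattern
            k += -1 if resp == 1 else 1
        history.append((resp, won))
    num = {"112": 0, "121": 0}
    den = {"12": 0, "21": 0}
    for (r2, w2), (r1, w1), (r0, _) in zip(history, history[1:], history[2:]):
        if w1 and w2 and {r1, r2} == {1, 2}:       # win-win with alternated responses
            key = f"{r1}{r2}"                      # responses on trials n, n-1
            den[key] += 1
            num["1" + key] += (r0 == 1)
    return num["112"] / den["12"], num["121"] / den["21"]

lhs, rhs = sequential_stats(400_000, 4, 0.3, 0.2, 0.5)
```

With a long run the two estimated statistics agree to within sampling error, as Eq. 51 asserts.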
To check these predictions, we shall utilize the data over all trials of the (A_1 A_2) subsequence and not restrict the analysis to asymptotic performance. Specifically, we define S_112 as the frequency, for a given subject, of an A_1 response on trial n+1 of the subsequence when trial n involved a win following an A_1 and trial n-1 a win following an A_2; S_121 is defined similarly, with the responses on trials n and n-1 reversed; and S_12 and S_21 denote the frequencies of the corresponding two-trial win sequences regardless of the response on trial n+1. But by the results just obtained we have E(S_121) = E(S_112) and E(S_21) = E(S_12) for any given subject. Further, if we define t_ijk as the sum of the S_ijk's over all subjects, then it follows that E(t_121) = E(t_112) independent of intersubject differences in c and N. Similarly, E(t_21) = E(t_12). Thus we have a set of predictions which are not only nonparametric but which require no restrictive assumptions on variability between subjects. Observed frequencies corresponding to these theoretical quantities are as follows:

   t_121 = 140 ;   t_112 = 138 ;   t_21 = 243 ;   t_12 = 244 ;

   t_121/t_21 = .576 ;   t_112/t_12 = .566 .
Similarly, for the (A_1 A_3) subsequence,

   t_131 = 67 ;   t_113 = 64 ;   t_31 = 120 ;   t_13 = 122 ;

   t_131/t_31 = .558 ;   t_113/t_13 = .525 .
Finally, for the (A_2 A_3) subsequence,

   t_232 = 45 ;   t_223 = 49 ;   t_32 = 82 ;   t_23 = 87 ;

   t_232/t_32 = .549 ;   t_223/t_23 = .563 .
Further analyses will be required to determine whether the pattern model gives an entirely satisfactory interpretation of paired-comparison learning. It is already apparent, however, that it may be very difficult indeed to find another theory that takes us further in this direction than the pattern model with equally simple machinery.
4. A COMPONENT MODEL FOR STIMULUS COMPOUNDING AND GENERALIZATION

4.1 Basic Concepts; Conditioning and Response Axioms
In the preceding section we simplified our analysis of learning in terms of the N-element pattern model by assuming that all of the patterns involved in a given experiment are disjoint, or at any rate that generalization effects from one stimulus pattern to another are negligible. Now we shall go to the other extreme and treat problems of simple transfer of training between different stimulus situations that have elements in common in a purely cross-sectional manner, with no reference to a learning process occurring over trials. Again the basic mathematical apparatus will be that of sets and elements, but with a reinterpretation which needs to be clearly distinguished from that of the pattern model. In Sections 2 and 3 we regarded the pattern of stimulation effective on any trial as a single element sampled from a larger set of such patterns; now we shall consider the trial pattern as itself constituting a set of elements, the elements representing the various components or aspects of the stimulus situation which may be sampled by the subject in different combinations on different trials. We shall proceed first to give the two basic axioms that establish the dependence of response probability on the conditioning state of the stimulus sample. Then some theorems will be derived that specify relationships between response probabilities in overlapping stimulus samples, and these will be illustrated in terms of applications to experiments on simple stimulus compounding. Consideration of the process whereby trial samples are drawn from a larger stimulus population will be deferred to Section 4.3.
The basic axioms of the component model are as follows:

C1. The sample s of stimulation effective on any trial is partitioned into subsets s_i (i = 1, 2, ..., r, where r is the number of response alternatives), the ith subset containing the elements conditioned to (or "connected to") response A_i.

C2. The probability of response A_i in the presence of the stimulus sample s is given by

   Pr(A_i | s) = N(s_i) / N(s) ,

where N(x) denotes the number of elements in the set x.
In C1 we modify the usual definition of a partition to the extent of permitting some of the subsets to be empty; that is, there may be some response alternatives which are conditioned to none of the elements of s. We do mean to assume, however, that each element of s is conditioned to exactly one response. The substance of C2 is, then, to make the probability that a given response will be evoked by s equal to the proportion of elements of s that are conditioned to that response.
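Axioms C1 and C2 translate directly into a few lines of code. In the Python sketch below (the element labels and the conditioning map are illustrative), a conditioning map assigns each sampled element to exactly one response, and the response probability is the proportion of the sample conditioned to it:

```python
from fractions import Fraction

def response_probabilities(sample, conditioning):
    """Axiom C2: Pr(A_i | s) = N(s_i) / N(s), where the subsets s_i form
    the partition of the sample required by Axiom C1."""
    counts = {}
    for element in sample:
        response = conditioning[element]   # each element conditioned to one response
        counts[response] = counts.get(response, 0) + 1
    return {r: Fraction(n, len(sample)) for r, n in counts.items()}

# Illustrative training outcome: A_1 conditioned to a, b, c and A_2 to d, e, f.
conditioning = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
probs = response_probabilities(["a", "b", "d"], conditioning)
print(probs[1], probs[2])    # 2/3 1/3
```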
4.2 Stimulus Compounding
An elementary transfer situation arises if one reinforces two
responses, each in the presence of a different stimulus sample, then
combines all or part of one sample with all or part of the other to
form a new test situation. To begin with a special case, let us consider
an experiment conducted in the laboratory of one of the writers (W. K. E.).8

8This experiment was conducted at Indiana University with the assistance of Miss Joan SeBreny.
In one stage of the experiment, a number of disjoint samples of three
distinct cues drawn from a large population were used as the stimulus
members of paired-associate items, and by the usual method of paired
presentation one response was reinforced in the presence of some of
these samples and a different response in the presence of others. The
constituent cues, intended to serve as the empirical counterparts of
stimulus elements, were various typewriter symbols, which for present
purposes we shall designate by small letters a, b, c, etc., and the
responses were the numbers "one" and "two," spoken aloud. Instructions
to the subjects indicated that the cues represented symptoms and the
numbers diseases with which the symptoms were associated. Following
the training trials, new combinations of "symptoms" were formed, and the
subjects were instructed to make their best guesses at the correct
diagnoses.
Suppose now that response A_1 had been reinforced in the presence of the sample (abc) and response A_2 in the presence of the sample (def). If a test trial were given subsequently with the sample (abd), direct application of Axiom C2 yields the prediction that response A_1 should occur with probability 2/3. Similarly, if a test were given with the sample (ade), response A_1 would be predicted to occur with probability 1/3. Results obtained with 40 subjects, each given 24 tests of each type, were as follows:

   Percentage overlap of training and test sets    .667    .333
   Percentage response 1 to test set               .669    .332
Success in bringing off a priori predictions of this sort depends not only on the basic soundness of the theory but also on one's success in realizing various simplifying assumptions in the experimental situation. As mentioned above, it was our intention in designing the experiment just cited to choose cues, a, b, c, etc., which would take on the role of stimulus elements. Actually, in order to justify our theoretical predictions, it was necessary only that the cues behave as equal-sized sets of elements. To bring out the importance of the equal-N assumption, let us suppose that the individual cues actually correspond to sets s_a, s_b, etc., of elements. Then, given the same training (response A_1 reinforced to the combination abc and response A_2 to def), and assuming the training effective in conditioning all elements of each subset to the reinforced response, application of Axiom C2 yields for the probability of response A_1 to the test compound abd

   Pr(A_1 | s_a s_b s_d) = (N_a + N_b) / (N_a + N_b + N_d) ,

where we have used the obvious abbreviation N(s_i) = N_i. This equation reduces to Pr(A_1 | s_a s_b s_d) = 2/3 only if N_a = N_b = N_d.
In this experiment we depended on common-sense considerations to choose cues which could be expected to satisfy the equal-N requirement, and also counterbalanced the design of the experiment so that minor deviations might be expected to average out. Sometimes it may not be possible to depend on common-sense considerations. In that case, one can utilize a preliminary experiment to check on the simplifying assumptions. Suppose, for example, we had been in doubt as to whether cues a and b would behave as equal-sized sets. To check on this, we could have run a preliminary experiment in which we reinforced, say, response A_1 to a and response A_2 to b, then tested with the compound ab. Probability of response A_1 to ab is, according to the model, given by

   Pr(A_1 | s_a s_b) = N_a / (N_a + N_b) ,

which should deviate in the appropriate direction from 1/2 if N_a and N_b are not equal. By means of calibration experiments of this sort, sets of cues satisfying the equal-N assumption can be assembled for use in further research involving applications of the model.
The expressions obtained above for probabilities of response to stimulus compounds can readily be generalized with respect both to set sizes and level of training. Suppose that a collection of cues a, b, c, ... corresponds to a collection of stimulus sets s_a, s_b, s_c, ... of sizes N_a, N_b, N_c, ..., and that some response A_j is conditioned to a proportion p_aj of the elements in s_a, a proportion p_bj of the elements in s_b, and so on. Then the probability of response A_j to a compound of these cues is, by Axiom C2, expressed by the relation

   Pr(A_j | s_a, s_b, s_c, ...) = (N_a p_aj + N_b p_bj + N_c p_cj + ...) / (N_a + N_b + N_c + ...) .   (52)
Application of Eq. 52 can be illustrated in terms of a study of probabilistic discrimination learning reported by Estes, Burke, Atkinson, and Frankmann (1957). In this study the individual cues were lights which differed from each other only in their positions on a panel. The first stage of the experiment consisted in discrimination training according to a routine which we shall not describe here except to say that on theoretical grounds it was predicted that at the end of training the proportion of elements in a sample associated with the ith light conditioned to the first of two alternative responses would be given by p_i1 = i/13. Following this training, the subjects were given compounding tests with various triads of lights. Considering, say, the triad of lights 1, 2, and 3, the values of p_i1 should be p_11 = 1/13, p_21 = 2/13, and p_31 = 3/13; assuming N_1 = N_2 = N_3 = N, and substituting these values into Eq. 52, we obtain

   Pr(A_1 | 1,2,3) = (1/13 + 2/13 + 3/13)/3 = 2/13 = .15

as the predicted probability of response 1 to the compound 1,2,3. Theoretical values similarly computed for a number of triads are compared with the empirical test proportions reported by Estes et al. in Table 5.
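Since Eq. 52 is just a weighted mean, the prediction for any compound is a one-line computation. The Python sketch below reproduces the 2/13 value for the triad of lights 1, 2, 3, under the equal-N assumption and the trained proportions p_i1 = i/13 stated above:

```python
def compound_probability(sizes, props):
    # Eq. 52: Pr(A_j | compound) = sum_i N_i p_ij / sum_i N_i
    return sum(N * p for N, p in zip(sizes, props)) / sum(sizes)

# Lights 1, 2, 3 with p_i1 = i/13 and equal N_i (the common size cancels).
sizes = [1, 1, 1]
props = [i / 13 for i in (1, 2, 3)]
p = compound_probability(sizes, props)
print(round(p, 2))   # 0.15, the value predicted in the text
```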
Insert Table 5 about here
An important consideration in applications of models for stimulus
compounding is the question of whether the experimental situation contains
an appreciable amount of background stimulation in addition to the controlled
stimuli manipulated by the experimenter. Suppose, for example, we are interested in the problem of whether a compound of two conditioned stimuli, say a light and a tone, each of which has been paired with the same unconditioned stimulus, may have a higher probability of evoking a conditioned response (CR) than either of the stimuli presented separately. To analyze this problem in terms of the present model, we may represent the light and the tone by stimulus sets s_L and s_T. Assuming that, as a result of the previous reinforcement, the proportions of conditioned elements in s_L and s_T (and therefore the probabilities of CRs to the stimuli taken separately) are p_L and p_T, respectively, application of Axiom C2 yields for the probability of a CR to the compound of light and tone presented together, neglecting any possible background stimulation,

   Pr(CR | L,T) = (N_L p_L + N_T p_T) / (N_L + N_T) .

Clearly, the probability of a CR to the compound is simply a weighted mean of p_L and p_T, and therefore its value must fall between the probabilities of a CR to the two conditioned stimuli taken separately. No "summation" effect is predicted.
Table 5

Theoretical and observed proportions of response A_1
to triads of lights in stimulus compounding test.
Often, however, it may be unrealistic to assume background stimulation from the apparatus and surroundings to be negligible. In fact, the experimenter may have to count on an appreciable amount of background stimulation, predominantly conditioned to behaviors incompatible with the CR, to prevent "spontaneous" occurrences of the to-be-conditioned response during intervals between presentations of the experimentally controlled stimuli. Let us now expand our representation of the conditioning situation by defining a set s_b of background elements, a proportion p_b of which are conditioned to the CR. For simplicity, we shall consider only the special case of p_b = 0. Then the theoretical probabilities of evocation of the CR by the light, the tone, and the compound of light and tone (together with background stimulation in each case) are given by

   Pr(CR | L) = N_L p_L / (N_L + N_b) ,

   Pr(CR | T) = N_T p_T / (N_T + N_b) ,

and

   Pr(CR | L,T) = (N_T p_T + N_L p_L) / (N_T + N_L + N_b) ,

respectively. Under these conditions it is possible to obtain a summation effect. Assume, for example, that N_T = N_L = N_b and p_T > p_L ,
so Pr(CR|T) > Pr(CR|L). Taking the difference between the probability of a CR to the compound and the probability of a CR to the tone alone, we have

   Pr(CR | L,T) - Pr(CR | T) = (p_T + p_L)/3 - p_T/2 = (2p_L - p_T)/6 ,

which is positive if the inequality 2p_L > p_T holds. Thus, in this case, probability of a CR to the compound will exceed probability of a CR to either conditioned stimulus alone, provided that p_T is not more than twice p_L.
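The summation condition is easy to verify numerically. In the Python sketch below (the set sizes and conditioned proportions are illustrative), background elements with p_b = 0 are pooled with each display; with N_T = N_L = N_b and 2p_L > p_T, the compound indeed exceeds the tone alone:

```python
def p_cr(presented, n_background, p_background=0.0):
    """Axiom C2 applied to the presented stimuli plus background elements:
    a weighted mean of conditioned proportions over all elements present."""
    num = sum(N * p for N, p in presented) + n_background * p_background
    den = sum(N for N, _ in presented) + n_background
    return num / den

N = 10                              # N_T = N_L = N_b, as in the example above
tone, light = (N, 0.8), (N, 0.5)    # p_T = .8 < 2 * p_L = 1.0, so summation holds
assert p_cr([tone, light], N) > p_cr([tone], N) > p_cr([light], N)
print(p_cr([tone], N), p_cr([tone, light], N))   # 0.4 versus about 0.433
```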
The role of background stimuli has been particularly important in
the interpretation of drive stimuli. It has been assumed (Estes, 1958, 1961a) that in simple animal learning experiments (e.g., those involving the learning of running or bar-pressing responses with food or water reward), the stimulus sample to which the animal responds at any time is compounded from several sources: the experimentally controlled conditioned stimulus (CS) or equivalent; stimuli, perhaps largely intraorganismic in origin, controlled by the level of food or water deprivation; and extraneous stimuli which are not systematically correlated with reward of the response undergoing training and therefore remain for the most part connected to competing responses. It is assumed further that the
sizes of samples of elements associated with the CS and with extraneous sources, s_C and s_E, are independent of drive, but that the size of the sample of drive-stimulus elements, s_D, increases as a function of deprivation. In most simple reward-learning experiments, conditioning to the CS and drive cues would proceed concurrently, and one might expect that at a given stage of learning the proportions of elements in samples from these sources conditioned to the rewarded response, R, would be equal, i.e., p_C = p_D. If this were the case, then probability of the rewarded response would be independent of deprivation; for, letting D and D' correspond to levels of deprivation such that N_D < N_D', we have as the theoretical probabilities of response R at the two deprivations,

   Pr(R | CS,D) = (N_C p_C + N_D p_D) / (N_C + N_D)

and

   Pr(R | CS,D') = (N_C p_C + N_D' p_D') / (N_C + N_D') .

If the same training were given at the two drive levels, then we would have p_D = p_D', as well as p_C = p_D; in this case the difference between the two expressions is zero. Considering the same assumptions but with extraneous cues taken explicitly into account, we arrive at a quite different picture. In this case, the two expressions for response probability are
   Pr(R | CS,D,E) = (N_C p_C + N_D p_D + N_E p_E) / (N_C + N_D + N_E)

and

   Pr(R | CS,D',E) = (N_C p_C + N_D' p_D' + N_E p_E) / (N_C + N_D' + N_E) .

Now, letting p_C = p_D = p_D' = p, and for simplicity taking p_E = 0, we obtain for the difference

   Pr(R | CS,D',E) - Pr(R | CS,D,E) = p N_E (N_D' - N_D) / [(N_C + N_D' + N_E)(N_C + N_D + N_E)] ,

which is obviously greater than zero given the assumption N_D' > N_D. Thus, in this theory, the principal reason why probability of the rewarded response tends, other things equal, to be higher at higher deprivations is that the larger the sample of drive stimuli, the more effective it is in outweighing the effects of extraneous stimuli.
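The argument can be made concrete in a few lines of Python (all set sizes and the common proportion p below are illustrative): with no extraneous elements the two deprivation levels yield identical response probabilities, while adding extraneous elements with p_E = 0 makes the higher deprivation win.

```python
def p_reward(n_cs, n_drive, n_ext, p=0.9, p_ext=0.0):
    # Pr(R | CS, D, E) with p_C = p_D = p and extraneous proportion p_E
    return (n_cs * p + n_drive * p + n_ext * p_ext) / (n_cs + n_drive + n_ext)

low_drive, high_drive = 5, 15          # N_D < N_D'
# Without extraneous cues, deprivation level is irrelevant:
assert abs(p_reward(10, low_drive, 0) - p_reward(10, high_drive, 0)) < 1e-12
# With extraneous cues conditioned to competing responses, it matters:
assert p_reward(10, high_drive, 10) > p_reward(10, low_drive, 10)
```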
4.3 Sampling Axioms and Major Response Theorem of Fixed Sample Size Model
In Sec. 4.2 we considered some transfer effects which can be derived within a component model by considering only relationships among stimulus samples that have had different reinforcement histories. Generally,
however, it is desirable to take account of the fact that there may not always be a one-to-one correspondence between the experimental stimulus display and the stimulation actually influencing the subject's behavior. Owing to a number of factors, e.g., variations in receptor-orienting responses, fluctuations in the stimulating situation, and variations in states or thresholds of receptors, the subject often may sample only a portion of the stimulation made available by the experimenter. One of the chief problems of statistical learning theories has been to formulate conceptual representations of the stimulus sampling process and to develop their implications for learning phenomena. With respect to specific mathematical properties of the sampling process, component models that have appeared in the literature may be classified into two main types: (1) models assuming fixed sampling probabilities for the individual elements of a stimulus population, in which case sample size varies randomly from trial to trial; and (2) models assuming a fixed ratio between sample size and population size. The former type was first discussed by Estes and Burke (1953), the latter by Estes (1950); and some detailed comparisons of the two types have been presented by Estes (1959b). In this section we shall limit consideration to models of the second type, since these are in most respects easier to work with.
In the remainder of this section we shall distinguish stimulus populations and samples by using S, with subscripts as needed, for a population, and s for a sample. The sampling axioms to be utilized are as follows:

S1. For any fixed, experimenter-defined stimulating situation, sample size and population size are constant over trials.

S2. All samples of size N(s) have equal probabilities.
A prerequisite to nearly all applications of the model is a theorem relating response probability to the state of conditioning of a stimulus population. We shall derive the theorem in terms of a stimulus situation S containing N elements from which a sample of size N(s) = σ is drawn on each trial. Assuming that some number N_i of the elements of S are conditioned to response A_i, we wish to obtain an expression for the expected proportion of elements conditioned to A_i in samples drawn from S, since this proportion will, by Axiom C2, be equal to the probability of evocation of response A_i by samples from S. We begin, as usual, with the probability in which we are interested, then, using the axioms of the model as appropriate, proceed to expand in terms of the state of conditioning and possible stimulus samples:

   Pr(A_i | S) = Σ_s Pr(A_i | s) Pr(s | S) ,

the summation being over all samples of size σ that can be drawn from S. In the last expression on the right, Pr(A_i | s) represents the probability of A_i in the presence of a sample of size σ containing a subset s_i of elements conditioned to A_i. Next, substituting expressions for the conditional probabilities, we obtain

   Pr(A_i | S) = Σ_{N(s_i)} [N(s_i)/σ] C(N_i, N(s_i)) C(N - N_i, σ - N(s_i)) / C(N, σ) ,

where C(m, k) denotes the binomial coefficient; the product of binomial coefficients denotes the number of ways of obtaining exactly N(s_i) elements conditioned to A_i in a sample of size σ, so the ratio of this product to the number of ways of drawing a sample of size σ is the probability of obtaining the given value of N(s_i). The resulting formula will be recognized as the familiar expression for the mean of a hypergeometric distribution (Feller, 1957, p. 218), so we have the pleasingly simple outcome that the probability of a response to the stimulating situation represented by a set S is equal to the proportion of elements of S that are conditioned to the given response:

   Pr(A_i | S) = N_i / N .   (53)
This result may seem too intuitively obvious to have needed a proof, but one should note that the same theorem does not hold in general for component models with fixed sampling probabilities for the elements (cf. Estes and Suppes, 1959b).
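The hypergeometric computation behind Eq. 53 can be checked directly: summing the sample-level response probability over all possible sample compositions gives N_i/N for every sample size, as this Python sketch shows.

```python
from math import comb

def p_response(N, N_i, sigma):
    """Mean proportion of A_i-conditioned elements over all equiprobable
    samples of size sigma from an N-element population (Eq. 53 setup)."""
    total = 0.0
    for k in range(max(0, sigma - (N - N_i)), min(sigma, N_i) + 1):
        hypergeom = comb(N_i, k) * comb(N - N_i, sigma - k) / comb(N, sigma)
        total += (k / sigma) * hypergeom
    return total

# Independent of sigma, the result equals N_i / N, as Eq. 53 asserts.
for sigma in (1, 3, 5, 8):
    assert abs(p_response(12, 9, sigma) - 9 / 12) < 1e-12
```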
4.4 Interpretation of Stimulus Generalization

Our approach to the problem of stimulus generalization is to represent the similarity between two stimuli by the amount of overlap between two sets of elements.9

9A model similar in most essentials has been presented by Bush and Mosteller (1951).
In the simplest experimental paradigm for exhibiting generalization, we begin with two stimulus situations, represented by sets S_a and S_b, neither of which has any of its elements conditioned to a reference response A_1. Training is given by reinforcement of A_1 in the presence of S_a only, until the probability of A_1 in that situation reaches some value p_a1 > 0. Then test trials are given in the presence of S_b, and if p_b1 now proves to be greater than zero, we say that stimulus generalization has occurred. If the axioms of the component model are satisfied, the value of p_b1 provides, in fact, a measure of the overlap of S_a and S_b; for, by Eq. 53, we have immediately

   p_b1 = N(S_a ∩ S_b) p_a1 / N(S_b) ,

where S_a ∩ S_b denotes the set of elements common to S_a and S_b, since the numerator of this fraction is simply the number of elements in S_b that are now conditioned to response A_1. More generally, if the proportion of elements of S_b conditioned to A_1 prior to the experiment were equal to g_b1, not necessarily zero, the probability of response A_1 to stimulus S_b after training in S_a would be given by

   p_b1 = [N(S_a ∩ S_b) p_a1 + (N(S_b) - N(S_a ∩ S_b)) g_b1] / N(S_b) ,

or, with the more compact notation N_ab = N(S_a ∩ S_b), etc.,

   p_b1 = [N_ab p_a1 + (N_b - N_ab) g_b1] / N_b .   (54a)
This relation can be put in still more convenient form by letting N_ab/N_b = w_ab, viz.,

   p_b1 = w_ab p_a1 + (1 - w_ab) g_b1 .

This may be rearranged to read

   p_b1 = g_b1 + (p_a1 - g_b1) w_ab ,   (54b)

and we see that the difference (p_a1 - g_b1) between the post-training probability of A_1 in S_a and the pre-training probability in S_b can be regarded as the slope parameter of a linear "gradient" of generalization in which p_b1 is the dependent variable and the proportion of overlap between S_a and S_b is the independent variable. If we hold g_b1 constant and let p_a1 vary as the parameter, we generate a family of generalization gradients which have their greatest disparities at w_ab = 1 (i.e., when the test stimulus S_b is identical with S_a) and converge as the overlap between S_b and S_a decreases, until the gradients meet at w_ab = 0, when p_b1 = g_b1. Thus the family of gradients shown in Fig. 9 illustrates the picture to be expected if a

Insert Fig. 9 about here

series of generalization tests is given at each of several different stages of training in S_a, or, alternatively, at several different stages of extinction following training in S_a, as was done, for example, by Guttman and Kalish (1956). The problem of "calibrating" a physical stimulus dimension so as to obtain a series of values which represent
Fig. 9. Generalization from a training stimulus, S_a, to a test stimulus, S_b, at several stages of training. The parameters are w_ab, the proportion of overlap between S_a and S_b, and g_b1, the probability of response A_1 to S_b prior to training in S_a.
equal differences in the value of w_ab has been discussed by Carterette (1961).
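Equation 54b, and the family of gradients sketched in Fig. 9, can be rendered in a few lines of Python (the particular p_a1 and g_b1 values are illustrative):

```python
def p_b1(w_ab, p_a1, g_b1):
    # Eq. 54b: a linear "gradient" with slope (p_a1 - g_b1) in the overlap w_ab
    return g_b1 + (p_a1 - g_b1) * w_ab

g = 0.1                                 # pretraining probability in S_b
for p_a1 in (0.4, 0.6, 0.8):            # several stages of training in S_a
    assert abs(p_b1(1.0, p_a1, g) - p_a1) < 1e-12   # identical stimuli
    assert abs(p_b1(0.0, p_a1, g) - g) < 1e-12      # disjoint sets: gradients meet
```

Varying p_a1 traces out the converging family of lines: maximal separation at w_ab = 1 and a common intercept g_b1 at w_ab = 0.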
One might regard the parameter w_ab as an index of the similarity of S_a to S_b. In general, similarity is not a symmetrical relation, for w_ab is not equal to w_ba (the former being given by N_ab/N_b and the latter by N_ab/N_a) except in the special case N_a = N_b. When N_a ≠ N_b, generalization from training with the larger set to a test with the smaller set will be greater than generalization from training with the smaller set to a test with the larger set (assuming that the reinforcement given the reference response A_1 in the presence of the training set S_i is such as to establish the same value of p_i1 in each case prior to testing in S_j). We shall give no formal assumption relating size of a stimulus set to observable properties; however, it is reasonable to expect that larger sets will be associated with more intense (where the notion of intensity is applicable) or attention-getting stimuli. Thus if S_a and S_b represent tones a and b of the same frequency but with tone a more intense than b, we should predict greater generalization if we train the reference response to a given level with a and test with b than if we train to the same level with b and test with a.
It is worth noting that, although in the psychological literature
the notion of stimulus generalization has nearly always been taken to
refer to generalization along some physical continuum such as wavelength
of light, intensity of sound, or the like, the set-theoretical model is
not restricted to such cases. Predictions of generalization in the case
of complex stimuli may be generated by first evaluating the overlap parameter w_ab for a given pair of situations a and b from a set of observations obtained with some particular combination of values of p_a1 and g_b1, then computing theoretical values of p_b1 for new conditions involving different levels of p_a1 and g_b1. The problem of treating a simple "stimulus dimension" is of special interest, however, and we shall conclude our discussion of generalization by sketching one approach to this problem.10

10We follow, in most respects, the treatment given by W. K. Estes and D. L. La Berge in unpublished notes prepared for the 1957 SSRC Summer Institute in Social Science for College Teachers of Mathematics. For an approach combining essentially the same set-theoretical model with somewhat different learning assumptions, the reader is referred to Restle (1961).
We shall consider the type of stimulus dimension that Stevens (1957) has termed substitutive, or metathetic, i.e., one which involves the notion of a simple ordering of stimuli along a dimension without variation in intensity or magnitude. Let us denote by Z a physical dimension of this sort, e.g., wavelength of visible light, which we wish to represent by a sequence of stimulus sets. First we shall briefly outline the properties that we wish this representation to have, then we shall spell out the assumptions of the model more rigorously.
It is part of the intuitive basis of a substitutive dimension that one moves from point to point by exchanging some of the elements of one stimulus for some new ones belonging to the next. Consequently, we shall assume that as values of Z change by constant increments, each successive stimulus set should be generated by deleting some constant number of elements from the preceding set and adding the same number of new elements to form the next set. But to ensure that the organism's behavior can reflect the ordering of stimuli along the Z scale without ambiguity, we need also to assume that once an element is deleted as we go along the Z scale, it must not reappear in the set corresponding to any higher Z value. Further, in view of the abundant empirical evidence that generalization declines in an orderly fashion as the distance between two stimuli on such a dimension increases, we must assume that, at least up to the point where sets corresponding to larger differences in Z are disjoint, the overlap between two stimulus sets should be directly related to the interval between the corresponding stimuli on the Z scale. These properties, taken together, enable us to establish an intuitively reasonable correspondence between characteristics of a sequence of stimulus sets and the empirical notion of generalization along a dimension.
These ideas are incorporated more formally in the following set of axioms. The basis for these axioms is a stimulus dimension Z, which may be either continuous or discontinuous; a collection S* of stimulus sets; and a function x(Z) having a finite number of consecutive integers in its range. The mapping of the set (x) of scaled stimulus values onto the subsets (S_i) of S* must satisfy the axioms:

G1. For all i < j < k in (x), S_i ∩ S_k ⊂ S_j.

G2. For all i ≤ j ≤ k in (x), if S_i ∩ S_k ≠ φ, where φ is the null set, then S_j ⊂ (S_i ∪ S_k).

G3. For all h < i, j < k in (x), if i − h = k − j, then N_hi = N_jk; and for all i in (x), N_ii = N.
The set (x) may simply be a set of Z scale values, or it may be a set of Z values rescaled by some transformation. The reasons for introducing (x) are twofold. First, for reasons of mathematical simplicity we find it advisable to restrict ourselves, at least for present purposes, to a finite set of Z values, and therefore to a finite collection of stimulus sets. Second, there is no reason to think that equal distances along physical dimensions will in general correspond to equal overlaps between stimulus sets. All that is required, however, to make the theory workable is that for any given physical dimension, wavelength of light, frequency of a tone, or whatever, we can find experimentally a transformation x such that equal distances on the x scale do correspond to equal overlaps.
Axiom G1 states that if an element belongs to any two sets it also belongs to all sets which fall between these two sets on the x scale. Axiom G2 states that, if two sets have any common elements, then all of the elements of any set falling between them belong to one or the other (or both) of the given sets; this property ensures that the elements drop out of the sets in order as we move along the dimension. Axiom G3 states the property which distinguishes a simple substitutive dimension from an additive, or intensity (in Stevens' terminology, prothetic) dimension. It should be noted that only if the number of values in the range of x(Z) is no greater than N(S*) − N + 1 can Axiom G3 be satisfied. This restriction is necessary in order to obtain a one-to-one mapping of the x values into the subsets (S_i) of S*.
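The axioms are easy to exhibit concretely. The following sketch (present-day Python, with arbitrary illustrative set and window sizes) builds the simplest representation satisfying them, "sliding-window" sets in which each unit step on the x scale exchanges one element for a new one, and verifies G1 to G3 directly:

```python
from itertools import combinations

M = 5                                        # N, the common size of every stimulus set
X = range(8)                                 # the set (x); note 8 = N(S*) - M + 1 with N(S*) = 12
S = {x: set(range(x, x + M)) for x in X}     # sliding windows: one step exchanges one element

overlap = lambda a, b: len(S[a] & S[b])      # N_ab, the number of common elements

# G1: elements common to S_i and S_k belong to every intermediate S_j.
g1 = all(S[i] & S[k] <= S[j] for i, j, k in combinations(X, 3))

# G2: if S_i and S_k overlap, every intermediate S_j lies inside S_i union S_k.
g2 = all(not (S[i] & S[k]) or S[j] <= (S[i] | S[k])
         for i, j, k in combinations(X, 3))

# G3: equal x intervals give equal overlaps, and every set has N_ii = M elements.
g3 = (all(overlap(h, i) == overlap(j, k)
          for h, i in combinations(X, 2)
          for j, k in combinations(X, 2) if i - h == k - j)
      and all(overlap(i, i) == M for i in X))
```

The choice of eight scale values respects the restriction noted above, since the total element pool here has twelve members and 12 − 5 + 1 = 8.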
One advantage in having the axioms set forth explicitly is that it then becomes relatively easy to design experiments bearing upon various aspects of the model. Thus, to obtain evidence concerning the empirical tenability of Axiom G1, we might choose a response A_1 and a set (x) of stimuli, including a pair i and k such that Pr(A_1|i) ≈ Pr(A_1|k) ≈ 0; then train subjects with stimulus i only until Pr(A_1|i) ≈ 1, and finally test with stimulus k. If Pr(A_1|k) is found to be greater than zero, it must be concluded, in terms of the model, that S_i ∩ S_k ≠ φ; i.e., the sets corresponding to i and k have some elements in common. Given Pr(A_1|k) > 0, it must be predicted that for every stimulus j in (x) such that i < j < k, Pr(A_1|j) ≥ Pr(A_1|k). Axiom G1 ensures that all of the elements of S_k which are now conditioned to A_1 by virtue of belonging also to S_i must be included in S_j, possibly augmented by other elements of S_i which are not in S_k.
To deal similarly with Axiom G2, we proceed in the same way to locate two members i and k of a set (x) such that S_i ∩ S_k ≠ φ. Then we train subjects on both stimulus i and stimulus k until Pr(A_1|i) ≈ Pr(A_1|k) ≈ 1, response A_1 being one which before this training had probability of less than unity to all stimuli in (x). Now, by G2, if any stimulus j falls between i and k, the set S_j must be contained entirely in the union S_i ∪ S_k; consequently, we must predict that we will now find Pr(A_1|j) = 1 for any stimulus j such that i ≤ j ≤ k.
To evaluate Axiom G3 empirically we require four stimuli h < i, j < k such that i − h = k − j. If the four stimuli are all different, we can simply train subjects on h and test generalization to i, then train subjects to an equal degree on j and test generalization to k. If the amount of generalization, as measured by the probability of the test response, is the same in the two cases, then the axiom is supported. In the special case when h = i and j = k, we would be testing the assertion that the sets associated with different values of x are of equal size. To accomplish this test, we need only take any two neighboring values of x, say i and j, train subjects to some criterion on i and test on j, then reverse the procedure by training (different) subjects to the same criterion on j and testing on i. If the axiom is satisfied, the amount of generalization should be the same in both directions.
Once we have introduced the notion of a dimension, it is natural to inquire whether the parameter which represents the degree of communality between pairs of stimulus sets might not be related in some simple way to a measure of distance along the dimension. With one qualification, which we will mention later, the quantity d_ij = 1 − w_ij could serve as a suitable measure of the distance between stimuli i and j. We can check to see whether the familiar axioms for a metric are satisfied.
These axioms are

1. d_ij = 0 if and only if i = j;
2. d_ij ≥ 0;
3. d_ij = d_ji;
4. d_ij + d_jk ≥ d_ik;
where it is understood that i, j, and k are any members of the set (x) associated with a given dimension. The first three of these obviously hold, but the fourth requires a bit of analysis. To carry out a proof, we shall use the notation N_ij for the number of elements common to S_i and S_j, N_ijk̄ for the number of elements in both S_i and S_j but not in S_k (a bar over an index denoting exclusion), and so on. The difference between the two sides of the inequality we wish to establish can be expanded in terms of this notation as follows:

   d_ij + d_jk − d_ik = (1 − N_ij/N) + (1 − N_jk/N) − (1 − N_ik/N)
                      = (1/N)(N − N_ij − N_jk + N_ik)
                      = (1/N)(N_ījk̄ + N_ij̄k),

the last step using the partition N = N_ijk + N_ijk̄ + N_ījk + N_ījk̄ of the elements of S_j. The last expression on the right is nonnegative, which establishes the desired inequality. To find the restrictions under which d is additive, let us assume that stimuli i, j, and k fall in the order i < j < k on the dimension. Then, by Axiom G1, we know that N_ij̄k = 0. However,
it is only in the special cases when S_i and S_k are either overlapping or adjacent that N_ījk̄ = 0 and, therefore, that d_ij + d_jk = d_ik.
It is possible to define an additive distance measure which is not
subject to this restriction, but such extensions raise new problems and
we shall not be able to pursue them here.
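For the sliding-window representation used above, the metric properties just established, and the stated limits of additivity, can be checked numerically; the following sketch (illustrative Python, arbitrary sizes) does so:

```python
from itertools import combinations

M, X = 5, range(8)                           # illustrative set size and scale values
S = {x: set(range(x, x + M)) for x in X}
d = {(a, b): 1 - len(S[a] & S[b]) / M for a in X for b in X}   # d_ab = 1 - w_ab

identity = all((d[a, b] == 0) == (a == b) for a in X for b in X)
symmetry = all(d[a, b] == d[b, a] for a in X for b in X)
triangle = all(d[a, b] + d[b, c] >= d[a, c] - 1e-12
               for a in X for b in X for c in X)

# Additivity d_ij + d_jk = d_ik holds exactly when S_j is exhausted by
# S_i and S_k, i.e. when the outer sets overlap or are adjacent.
additive = all((abs(d[i, j] + d[j, k] - d[i, k]) < 1e-12)
               == (S[j] <= (S[i] | S[k]))
               for i, j, k in combinations(X, 3))
```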
In concluding this section, we should like to emphasize one difference between the model for generalization sketched here and some of those already familiar in the literature (see, e.g., Spence, 1936; Hull, 1943). We do not postulate a particular form for generalization of response strength or excitatory tendency. Rather, we introduce certain assumptions about the properties of the set of stimuli associated with a sensory dimension; then we take these together with learning assumptions and information about reinforcement schedules as a basis for deriving theoretical gradients of generalization for particular types of experiments. Under the special conditions assumed in the example considered above, the theory predicts that a family of linear gradients with very simple properties will be observed when response probability is plotted as a function of distance from the point of reinforcement. Predictions of this sort may reasonably be tested by means of experiments in which suitable measures are taken to meet the conditions assumed in
the derivations (see, e.g., Carterette, 1961). But to deal with
experiments involving different training conditions, or response measures
other than relative frequencies, further theoretical analysis is called
for; and one must be prepared to find substantial differences in the
phenotypic properties of generalization gradients derived from the same
basic theory for different experimental situations.
5. COMPONENT AND LINEAR MODELS FOR SIMPLE LEARNING

In this section we combine, in a sense, the theories discussed in the preceding sections. Until now it was convenient, for expository purposes, to treat the problems of learning and generalization separately.
We first considered a type of learning model in which the different possible samples of stimulation from trial to trial were assumed to be entirely distinct, and then turned to an analysis of generalization, or transfer, effects that could be measured on an isolated test trial following a series of learning trials. Prediction of these transfer effects depended on information concerning the state of the stimulus population just prior to the test trial but did not depend on information about the course of learning over preceding training trials. However, in many (perhaps most) learning situations, it is not reasonable to assume that the samples, or patterns, of stimulation affecting the organism on different trials of a series are entirely disjoint; rather, they must overlap to various intermediate degrees, thus generating transfer effects throughout the learning series. In the "component models" of stimulus sampling theory, one simply takes the learning assumptions of the pattern model (Sec. 3) together with the sampling axioms and response rule of the generalization model (Sec. 4) to generate an account of learning for this more general case.
5.1 Component Models with Fixed Sample Size
As indicated earlier, the analysis of a simple learning experiment
in terms of a component model is based on the representation of the
stimulus as a set S of N stimulus elements from which the subject draws a sample on each trial. At any time each element in the set S is conditioned to exactly one of the r response alternatives A_1, ..., A_r; by the response axiom of Sec. 4.1 the probability of a response is equal to the proportion of elements in the trial sample conditioned to that response. At the termination of a trial, if reinforcing event E_i (i ≠ 0) occurs, then with probability c all elements in the trial sample become conditioned to response A_i. If E_0 occurs the conditioned status of elements in the sample does not change. The conditioning parameter c plays the same role here as in the pattern model. It should be noted that in the early literature of stimulus sampling theory, this parameter was usually assumed to be equal to unity.
Two general types of component models can be distinguished. For the fixed sample size model we assume that the sample size is a fixed number s throughout any given experiment. For the independent sampling model we assume that the elements of the stimulus set S are sampled independently on each trial, each element having some fixed probability θ of being drawn. In this section we discuss the fixed sample size model and consider the case in which all possible samples of size s are sampled with equal probability.
Formulation for RTT Experiments. To illustrate the model we first consider an experimental procedure in which a particular stimulus item is given a single reinforced trial followed by two consecutive nonreinforced test trials. The design may be conveniently symbolized RT_1T_2. Procedures and results for a number of experiments using an RTT design have been reported elsewhere (Estes, 1960a; Estes, Hopkins and Crothers, 1960; Estes, 1961b; Crothers, 1961). For simplicity, suppose one selects
a situation in which the probability of a correct response is zero before the first reinforcement (and in which the likelihood of a subject's obtaining correct responses by guessing is negligible on all trials). In terms of the fixed sample size model we can readily generate predictions for the probabilities, p_ij, of various combinations of response i on T_1 and response j on T_2. If i, j = 0 denotes a correct response and i, j = 1 denotes an error, then

   p_00 = c(s/N)²,
   p_01 = p_10 = c(s/N)(1 − s/N),      (55)
   p_11 = (1 − c) + c(1 − s/N)².

To obtain the first result, we note that the correct response can occur on either trial only if conditioning occurs on the reinforced trial, which has probability c. On occasions when conditioning occurs, the whole sample of s elements becomes conditioned to the correct response and the probability of this response on each of the test trials is s/N. On occasions when conditioning does not occur on the reinforced trial, the probability of a correct response remains at zero over both test trials.
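These RTT predictions can be confirmed by summing explicitly over the possible test samples; the sketch below (illustrative Python, arbitrary parameter values) checks that the expected proportion of conditioned elements in a fresh test sample is exactly s/N, from which the cell probabilities follow:

```python
from math import comb

N, s, c = 8, 3, 0.4                      # arbitrary illustrative values

def sample_prob(i, j):
    # hypergeometric probability that a sample of size s contains i of the
    # j conditioned elements in a set of N
    return comb(j, i) * comb(N - j, s - i) / comb(N, s)

# After an effective reinforcement exactly s elements are conditioned (j = s);
# the response probability on a test trial, averaged over samples, is E[i/s]:
p_test = sum((i / s) * sample_prob(i, s) for i in range(s + 1))

p00 = c * p_test**2                      # conditioning effective, correct on both tests
p01 = p10 = c * p_test * (1 - p_test)
p11 = (1 - c) + c * (1 - p_test)**2
```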
Note that when s = N = 1 this model is equivalent to the one-element model discussed in Sec. 2.1. If more than one reinforcement is given prior to T_1, the predictions are essentially unchanged. In general, for k preceding reinforcements the expected proportion of elements conditioned to the correct response (i.e., the probability of a correct response) at the time of the first test is

   p_0 = 1 − (1 − cs/N)^k,

and the probability of correct responses on both T_1 and T_2 is given by

   p_00 = Σ_i C(k, i) c^i (1 − c)^(k−i) [1 − (1 − s/N)^i]²,

where C(k, i) denotes the binomial coefficient and the sum runs from i = 0 to k.
To obtain this last expression, we note that a subject for whom i of the k reinforcements have been effective will have probability [1 − (1 − s/N)^i] of making a correct response on each test, and the probability that exactly i reinforcements are effective is C(k, i) c^i (1 − c)^(k−i). Similarly,

   p_10 = p_01 = Σ_i C(k, i) c^i (1 − c)^(k−i) [1 − (1 − s/N)^i](1 − s/N)^i,

and

   p_11 = Σ_i C(k, i) c^i (1 − c)^(k−i) (1 − s/N)^(2i).
If s = N, these expressions reduce to

   p_00 = 1 − (1 − c)^k,
   p_10 = p_01 = 0,
   p_11 = (1 − c)^k.

This special case appears well suited to the interpretation of data obtained by G. H. Bower (personal communication) from a study in which the T_1T_2 procedure was applied following various numbers of presentations of word-word paired associates. For 32 subjects each tested on 10 items, Bower reports observed proportions of p_00 = .894, p_10 = p_01 = .003, and p_11 = .100.
When applied to other RTT experiments, this model has, however, not yielded consistently accurate predictions. The difficulty apparently stems from the fact that our assumptions do not take account of the retention loss that is usually observed from T_1 to T_2 (see, e.g., Estes, 1961b). An extension of the model which is capable of handling retention decrement as well as the acquisition process will be discussed in Sec. 5.2 below.
For RTT experiments in which the probability of successful guessing is not negligible (as in paired-associate tasks involving a fixed list of responses which are known to the subject from the start) some additional considerations arise. Perhaps the most natural extension of the preceding treatment is to assume that the subject starts the experiment with a proportion 1/r of the elements of a given set S connected to the correct response and a proportion (1 − 1/r) connected to incorrect responses, r being the number of alternative responses. Then for a fixed sample size model, the probability, p_0, of a correct response to a given item on the first test trial after a single reinforcement is

   p_0 = (1 − c)(1/r) + c[s + (N − s)/r]/N
       = 1/r + (cs/N)(1 − 1/r),
the bracketed quantity being the proportion of elements connected to the correct response in the event that the reinforcement is effective. Then the probabilities of various combinations of correct and incorrect responses on the two test trials are given by

   p_00 = (1 − c)/r² + cφ²,
   p_10 = p_01 = (1 − c)(1/r)(1 − 1/r) + cφ(1 − φ),      (56)
   p_11 = (1 − c)(1 − 1/r)² + c(1 − φ)²,

where φ = 1/r + (1 − 1/r)(s/N).
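Equation 56 can be checked for internal consistency (an illustrative sketch, arbitrary parameter values): the four cell probabilities must sum to one, and the marginal probability of a correct first-test response must agree with the expression for p_0 above:

```python
N, s, c, r = 8, 3, 0.4, 4                # arbitrary illustrative values

phi = 1 / r + (1 - 1 / r) * (s / N)      # proportion correct given effective reinforcement

p00 = (1 - c) / r**2 + c * phi**2
p10 = p01 = (1 - c) * (1 / r) * (1 - 1 / r) + c * phi * (1 - phi)
p11 = (1 - c) * (1 - 1 / r)**2 + c * (1 - phi)**2

marginal = p00 + p01                     # P(correct on the first test)
p0 = 1 / r + (c * s / N) * (1 - 1 / r)   # closed form for p_0
```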
An alternative approach to the type of experiment in which the subject guesses on unlearned items is to assume that initially all elements are neutral, i.e., are connected neither to correct nor to incorrect responses. In the presence of a sample containing only neutral elements, the subject guesses, with probability 1/r of being correct. If the sample contains any conditioned elements, then the proportion of conditioned elements in the sample connected to the correct response determines its probability (e.g., if the sample contains nine elements, three conditioned to the correct response, two conditioned to an incorrect response, and four unconditioned, then the probability of a correct response is simply 3/5). These assumptions seem in some respects more intuitively satisfactory
than those considered above. Perhaps the most important difference with respect to empirical implications lies in the fact that with the latter set of assumptions, exposure time on test trials must be taken into account. If the stimulus exposure time is just long enough to permit a response (in terms of the theory, just long enough to permit the subject to draw a single sample of stimulus elements), then the probabilities of correct and incorrect response combinations on T_1 and T_2 are

   p_00 = (1 − c)/r² + cφ'²,
   p_10 = p_01 = (1 − c)(1/r)(1 − 1/r) + cφ'(1 − φ'),      (57)
   p_11 = (1 − c)(1 − 1/r)² + c(1 − φ')²,

where φ' = 1 − (1 − 1/r)C(N − s, s)/C(N, s). The factor C(N − s, s)/C(N, s) is the probability that the subject draws a sample containing none of the s elements that became conditioned on the reinforced trial; therefore 1 − φ' represents the probability that a subject for whom the reinforced trial was effective nevertheless draws a sample containing no conditioned elements and makes an incorrect guess, whereas φ' is the probability that such a subject makes a correct response on either test trial.
The two sets of expressions, Eqs. 56 and 57, are formally identical, and thus cannot be distinguished in application to RTT data. Like Eq. 55, they have the limitation of not allowing for the retention loss usually observed (see, e.g., Estes, Hopkins, and Crothers, 1960); we shall return to this point in Sec. 5.2.

If exposure time is sufficiently long on the test trials, then we assume that the subject continues to draw successive random samples from S and only makes a response when he finally draws a sample containing at least one conditioned element. Thus, in cases in which the reinforcement has been effective on a previous trial (so that S contains a subset of s conditioned elements), the subject will eventually draw a sample containing one or more conditioned elements and will respond on the basis of these elements, thereby making a correct response with probability 1. Therefore, for the case of unlimited exposure time, φ' = 1 and Eq. 57 reduces to

   p_00 = (1 − c)/r² + c,
   p_10 = p_01 = (1 − c)(1/r)(1 − 1/r),      (58)
   p_11 = (1 − c)(1 − 1/r)²,

which are identical with the corresponding expressions for the one-element model of Sec. 2.2.
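A quick consistency sketch for Eqs. 57 and 58 (illustrative values): with φ' computed from the binomial-coefficient ratio, Eq. 57 sums to one, and setting φ' = 1 reproduces Eq. 58:

```python
from math import comb

N, s, c, r = 8, 3, 0.4, 4                     # arbitrary illustrative values

def eq57(phi):
    # cell probabilities of Eq. 57 for a given value of phi'
    p00 = (1 - c) / r**2 + c * phi**2
    p10 = (1 - c) * (1 / r) * (1 - 1 / r) + c * phi * (1 - phi)
    p11 = (1 - c) * (1 - 1 / r)**2 + c * (1 - phi)**2
    return p00, p10, p11

phi_p = 1 - (1 - 1 / r) * comb(N - s, s) / comb(N, s)   # single-sample test
p00, p10, p11 = eq57(phi_p)

q00, q10, q11 = eq57(1.0)                     # unlimited exposure: phi' = 1 (Eq. 58)
```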
General Formulation. We turn now to the problem of deriving predictions from the fixed sample size model concerning the course of learning over an experiment consisting of a series of trials run under some prescribed reinforcement schedule. We shall limit consideration to the case in which each element in S is conditioned to exactly one of the two response alternatives, A_1 or A_2, so that there are N + 1 conditioning states. Again, we let C_i (i = 0, ..., N) denote the state in which i elements of the set S are conditioned to A_1 and N − i to A_2. As in the pattern model, the transition probabilities among conditioning states are functions of the reinforcement schedules and the set-theoretical parameters c, s, and N. Following our approach in Sec. 3, we shall restrict the analysis to cases in which the probability of reinforcement depends at most upon the response on the given trial; we thereby guarantee that all elements in the transition matrix for conditioning states are constant over trials. Thus the sequence of conditioning states can again be conceived as a Markov chain.
Transition Probabilities. Let s_i denote the event of drawing a sample on trial n with i elements conditioned to A_1 and s − i conditioned to A_2. Then the probability of a one-step transition from state C_j to state C_(j+v), for v > 0, is given by

   q_(j,j+v) = c Pr(E_1 | s_(s−v) ∩ C_j) C(N − j, v)C(j, s − v)/C(N, s),      (59a)

where Pr(E_1 | s_(s−v) ∩ C_j) is the probability of an E_1 event given conditioning state C_j and a sample with v elements conditioned to A_2, and C(a, b) again denotes a binomial coefficient. To obtain Eq. 59a we note that an E_1 must occur, that the conditioning must be effective (probability c), and that the subject must sample exactly v elements from the N − j elements not already conditioned to A_1; the probability of the latter event is the number of ways of drawing samples with v elements conditioned to A_2 divided by the total number of ways of drawing samples of size s. Similarly,

   q_(j,j−v) = c Pr(E_2 | s_v ∩ C_j) C(j, v)C(N − j, s − v)/C(N, s),      (59b)

and

   q_jj = 1 − Σ_(v>0) [q_(j,j+v) + q_(j,j−v)].      (59c)
Although it is an obvious conclusion, it is important for the reader to realize that the pattern model discussed in Part 3 is identical to the fixed sample size model when s = 1. This correspondence between the two models is indicated by the fact that Eqs. 59 reduce to Eq. 23 when we let s = 1.
For the simple noncontingent schedule in which only the two events E_1 and E_2 occur (with probabilities π and 1 − π, respectively) Eqs. 59a to 59c simplify to

   q_(j,j+v) = cπ C(N − j, v)C(j, s − v)/C(N, s),      (60a)
   q_(j,j−v) = c(1 − π) C(j, v)C(N − j, s − v)/C(N, s),      (60b)
   q_jj = 1 − Σ_(v>0) [q_(j,j+v) + q_(j,j−v)].      (60c)

It is apparent that state C_N is an absorbing state when π = 1 and C_0 is an absorbing state when π = 0. Otherwise all states are ergodic.
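The transition probabilities of Eq. 60 are easy to tabulate; the sketch below (illustrative parameter values) checks that every entry is a proper probability, that each row of the matrix sums to one, and that for s = 1 the entries reduce to the pattern-model values cπ(N − j)/N and c(1 − π)j/N of Eq. 23:

```python
from math import comb

def q(j, k, N, s, c, pi):
    """One-step probability of C_j -> C_k under Eq. 60."""
    v = abs(k - j)
    if v > s:
        return 0.0
    if k > j:        # E1 (prob. pi), conditioning effective (prob. c)
        return c * pi * comb(N - j, v) * comb(j, s - v) / comb(N, s)
    if k < j:        # E2 (prob. 1 - pi), conditioning effective
        return c * (1 - pi) * comb(j, v) * comb(N - j, s - v) / comb(N, s)
    # diagonal entry absorbs the remaining probability (Eq. 60c)
    return 1 - sum(q(j, m, N, s, c, pi) for m in range(N + 1) if m != j)

N, s, c, pi = 4, 2, 0.3, 0.7                  # arbitrary illustrative values
entries_ok = all(-1e-12 <= q(j, k, N, s, c, pi) <= 1 + 1e-12
                 for j in range(N + 1) for k in range(N + 1))
rows_ok = all(abs(sum(q(j, k, N, s, c, pi) for k in range(N + 1)) - 1) < 1e-12
              for j in range(N + 1))

up   = q(1, 2, 4, 1, c, pi)                   # s = 1: should equal c*pi*(N - j)/N
down = q(1, 0, 4, 1, c, pi)                   # s = 1: should equal c*(1 - pi)*j/N
```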
Mean Learning Curve. Following the same techniques used in connection with Eq. 27 we obtain for the component model in the simple, noncontingent case

   Pr(A_(1,n)) = π − [π − Pr(A_(1,1))](1 − cs/N)^(n−1).      (61)

This mean learning function traces out a smooth growth curve that can take any value between 0 and 1 on trial n if parameters are selected appropriately. However, it is important to note that for a given realization of the experiment the actual response probabilities for individual subjects (as opposed to expectations) can only take on the values 0, 1/N, 2/N, ..., (N − 1)/N, 1; i.e., the values associated with the conditioning states. This stepwise aspect of the process is particularly important when one attempts to distinguish between this model and models that assume gradual continuous increments in the strength or probability of a response over time (Hull, 1943; Bush and Mosteller, 1955; Estes and Suppes, 1959a).
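The closed form of the mean learning function can be verified against direct iteration of the chain defined by Eq. 60; the sketch below (illustrative parameters and an arbitrary starting state) compares the two trial by trial:

```python
from math import comb

def q(j, k, N, s, c, pi):
    # one-step transition probability of Eq. 60
    v = abs(k - j)
    if v > s:
        return 0.0
    if k > j:
        return c * pi * comb(N - j, v) * comb(j, s - v) / comb(N, s)
    if k < j:
        return c * (1 - pi) * comb(j, v) * comb(N - j, s - v) / comb(N, s)
    return 1 - sum(q(j, m, N, s, c, pi) for m in range(N + 1) if m != j)

N, s, c, pi = 4, 2, 0.3, 0.7
dist = [0.0] * (N + 1)
dist[1] = 1.0                                  # start everyone in state C_1
mean = lambda d: sum((j / N) * d[j] for j in range(N + 1))

alpha1 = [mean(dist)]                          # Pr(A1,n) from the chain itself
for _ in range(9):
    dist = [sum(dist[j] * q(j, k, N, s, c, pi) for j in range(N + 1))
            for k in range(N + 1)]
    alpha1.append(mean(dist))

closed = [pi - (pi - alpha1[0]) * (1 - c * s / N)**n for n in range(10)]  # Eq. 61
```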
To illustrate this point we consider an experiment on avoidance learning reported by Theios (1961). Fifty rats were used as subjects. The apparatus was a modified Miller-Mowrer electric shock box. The animal was always placed in the black compartment; shortly thereafter a buzzer and light came on as the door between the compartments was opened. The correct response (A_1) was to run into the other compartment within 3 seconds. If A_1 did not occur the subject was given a high intensity shock (255 volts) until it escaped into the other compartment. After 20 seconds the subject was returned to the black compartment, and another trial was given. Each rat was run until it met a criterion of 20 consecutive successful avoidance responses.

Theios analyzes the situation in terms of a component model in which N = 2 and s = 1. Further, he assumes that Pr(A_(1,1)) = 0 and hence on trial 1 the subject is in conditioning state C_0. Employing Eq. 60 with π = 1, N = 2, and s = 1, we obtain the following transition matrix:
            C_2        C_1        C_0
   C_2       1          0          0
   C_1      c/2      1 − c/2       0
   C_0       0          c        1 − c
And the expected probability of an A_1 response on trial n is readily obtained by specialization of Eq. 61,

   Pr(A_(1,n)) = 1 − (1 − c/2)^(n−1).
Applying this model Theios estimates c = .43 and provides an impressive account of such statistics as total errors, the mean learning curve, trial number of last error, autocorrelation of errors with lags of 1, 2, 3 and 4 trials, mean number of runs, probability of no reversals, and many others. However, for our immediate purposes we are interested in only one feature of his data; namely, whether the underlying response probabilities are actually fixed at 0, 1/2, and 1 as specified by the model. First, we note that it is not possible to establish the exact trial on which the subject moves from C_0 to C_1 or from C_1 to C_2.
Nevertheless, if there are some trials between the first success (A_1 response) and the last error (A_2 response), we can be sure that the subject is in state C_1 on these trials. For if the subject has made one success, at least one of the two stimulus elements is conditioned to the A_1 response; if on a later trial the subject makes an error, then, up to that trial, at least one of the elements is not conditioned to the A_1 response. Since deconditioning does not occur in the present model, the subject must be in conditioning state C_1. Thus, according to the model, the sequence of responses after the first success and before the last error should form a sequence of Bernoulli trials with constant probability 1/2 of an A_1 response. Theios has applied several statistical tests to check this hypothesis and none suggest that the assumption is incorrect. For example, the response sequences for the trials between the first success and last error were divided into blocks
of four trials and the number of A_1 responses in each block was counted. The obtained frequencies for 0, 1, 2, 3 and 4 successes were 2, 12, 17, 15, and 4, respectively; the predicted binomial frequencies were 3.1, 12.5, 18.5, 12.5 and 3.1. The correspondence between predicted and observed frequencies is excellent as indicated by a goodness-of-fit test that yielded a value of 1.47 with 4 degrees of freedom.
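The binomial expectations quoted above are easily recomputed (an illustrative sketch); the exact expectations round to the tabled values, while the goodness-of-fit statistic computed here from the rounded cell counts comes out somewhat below the published 1.47, a discrepancy that presumably reflects rounding or a finer partition of the original data:

```python
from math import comb

blocks, p = 50, 0.5                       # 50 four-trial blocks, Bernoulli parameter 1/2
expected = [blocks * comb(4, k) * p**4 for k in range(5)]   # exact binomial expectations
observed = [2, 12, 17, 15, 4]             # obtained frequencies from the text
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))
```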
Theios has applied the same analysis to data from an experiment by Solomon and Wynne (1953) where dogs were required to learn an avoidance response. The findings with regard to the binomial property on trials after the first success and before the last error are in agreement with his own data but suggest that the binomial parameter is other than 1/2. From a stimulus sampling viewpoint this observation would suggest that the two elements are not sampled with equal probabilities. For a detailed discussion of this Bernoulli stepwise aspect of certain stimulus sampling models, related statistical tests, and a review of relevant experimental data the reader is referred to Suppes and Ginsberg (1962a).
The mean learning curve for the fixed sample size model given by Eq. 61 is identical to the corresponding equation for the pattern model, with the sampling ratio cs/N taking the role of c/N. However, we need not look far to find a difference in the predictions generated by the two models. If we define α_(2,n) as in Eq. 29, i.e.,

   α_(2,n) = Σ_j (j/N)² Pr(C_(j,n)),
then by carrying out the summation using the same methods as in the case of Eq. 27, we obtain

   α_(2,n) = [1 − 2cs/N + cs(s − 1)/(N(N − 1))] α_(2,n−1)
           + c[s(N − s)/(N²(N − 1)) + 2π(s/N)(1 − s/N)] α_(1,n−1)
           + cπs²/N².      (62)
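Equation 62 can be checked against exact iteration of the chain of Eq. 60 (an illustrative sketch, arbitrary parameters): at every trial the second raw moment produced by the chain must satisfy the recursion.

```python
from math import comb

def q(j, k, N, s, c, pi):
    # one-step transition probability of Eq. 60
    v = abs(k - j)
    if v > s:
        return 0.0
    if k > j:
        return c * pi * comb(N - j, v) * comb(j, s - v) / comb(N, s)
    if k < j:
        return c * (1 - pi) * comb(j, v) * comb(N - j, s - v) / comb(N, s)
    return 1 - sum(q(j, m, N, s, c, pi) for m in range(N + 1) if m != j)

N, s, c, pi = 4, 2, 0.3, 0.7
m1 = lambda d: sum((j / N) * d[j] for j in range(N + 1))      # alpha_1
m2 = lambda d: sum((j / N)**2 * d[j] for j in range(N + 1))   # alpha_2

# coefficients of Eq. 62
A = 1 - 2 * c * s / N + c * s * (s - 1) / (N * (N - 1))
B = c * (s * (N - s) / (N**2 * (N - 1)) + 2 * pi * (s / N) * (1 - s / N))
D = c * pi * s**2 / N**2

dist = [0.0] * (N + 1)
dist[1] = 1.0
recursion_ok = True
for _ in range(8):
    a1, a2 = m1(dist), m2(dist)
    dist = [sum(dist[j] * q(j, k, N, s, c, pi) for j in range(N + 1))
            for k in range(N + 1)]
    recursion_ok = recursion_ok and abs(m2(dist) - (A * a2 + B * a1 + D)) < 1e-12
```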
The asymptotic variance of the response probabilities for the component model is simply

   σ²_∞ = α_(2,∞) − [α_(1,∞)]².

Letting α_(2,n) = α_(2,n−1) = α_(2,∞), noting that Pr(A_(1,∞)) = π, and carrying out the appropriate computations we obtain

   σ²_∞ = [π(1 − π)/N][N + (N − 2)s]/(2N − s − 1).      (63)

This asymptotic variance of the response probabilities depends in
relatively simple ways on s and N. If we hold N fixed and differentiate with respect to s, we find that σ²_∞ increases monotonically with s; in particular, then, this variance for a fixed sample size model with s > 1 is larger than that of the pattern model with the same number of elements. If we hold the sampling ratio s/N fixed and take the partial derivative with respect to N, we find σ²_∞ to be a decreasing function of N. In the limit, if N → ∞ in such a way that s/N = θ remains constant, then

   σ²_∞ = π(1 − π)θ/(2 − θ),      (64)

which, we will see later, is the variance for the linear model (Estes and Suppes, 1959a). In contrast, for the pattern model the variance of the p values approaches 0 as N becomes large. We return to comparisons between the two models in Sec. 5.3.
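The asymptotic formulas can likewise be verified numerically (illustrative sketch): iterating the chain of Eq. 60 to near-stationarity and comparing the mean and variance of j/N with π and Eq. 63, and comparing Eq. 63 at large N (with s/N fixed) against the limit of Eq. 64:

```python
from math import comb

def q(j, k, N, s, c, pi):
    # one-step transition probability of Eq. 60
    v = abs(k - j)
    if v > s:
        return 0.0
    if k > j:
        return c * pi * comb(N - j, v) * comb(j, s - v) / comb(N, s)
    if k < j:
        return c * (1 - pi) * comb(j, v) * comb(N - j, s - v) / comb(N, s)
    return 1 - sum(q(j, m, N, s, c, pi) for m in range(N + 1) if m != j)

N, s, c, pi = 4, 2, 0.3, 0.7
dist = [1.0 / (N + 1)] * (N + 1)
for _ in range(400):                          # far beyond the mixing time for these values
    dist = [sum(dist[j] * q(j, k, N, s, c, pi) for j in range(N + 1))
            for k in range(N + 1)]

a1 = sum((j / N) * dist[j] for j in range(N + 1))
a2 = sum((j / N)**2 * dist[j] for j in range(N + 1))

var_63 = (pi * (1 - pi) / N) * (N + (N - 2) * s) / (2 * N - s - 1)   # Eq. 63
theta = s / N
var_64 = pi * (1 - pi) * theta / (2 - theta)                         # Eq. 64 limit
big = (pi * (1 - pi) / 1000) * (1000 + 998 * 500) / (2 * 1000 - 500 - 1)
```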
Sequential Predictions. We now examine some sequential statistics for the fixed sample size model that later will help clarify relationships among the various stimulus sampling models. In particular, we consider the probability of an A_1 response given that on the preceding trial E_0, E_1, or E_2 occurred.

Consider first Pr(A_(1,n+1) | E_(1,n)). By taking account of the conditioning states on trial n + 1 and trial n and also the sample on trial n we may write

   Pr(A_(1,n+1) | E_(1,n)) = [1/Pr(E_(1,n))] Σ_(i,j,k) Pr(A_(1,n+1) ∩ C_(j,n+1) ∩ E_(1,n) ∩ s_(i,n) ∩ C_(k,n)),

where, as before, s_(i,n) denotes the event of drawing a sample on trial n with i elements conditioned to A_1 and s − i conditioned to A_2. Conditionalizing, with our learning axioms in mind, we obtain
   = [1/Pr(E_(1,n))] Σ_(i,j,k) Pr(A_(1,n+1) | C_(j,n+1)) Pr(C_(j,n+1) | E_(1,n) ∩ s_(i,n) ∩ C_(k,n)) Pr(E_(1,n) | s_(i,n) ∩ C_(k,n)) Pr(s_(i,n) | C_(k,n)) Pr(C_(k,n)).

Our reinforcement procedures depend at most on the responses of the subject, and hence Pr(E_(1,n)) = Pr(E_(1,n) | s_(i,n) ∩ C_(k,n)). Further,

   Pr(C_(j,n+1) | E_(1,n) ∩ s_(i,n) ∩ C_(k,n)) = c,      if j = k + s − i;
                                               = 1 − c,  if j = k;
                                               = 0,      otherwise.

That is, the s − i elements in the sample originally conditioned to A_2 now become conditioned to A_1 with probability c, and hence a move from state C_k to C_(k+s−i) occurs. Also, as noted with regard to Eq. 59,

   Pr(s_(i,n) | C_(k,n)) = C(k, i)C(N − k, s − i)/C(N, s).

Substituting these results in our last expression for Pr(A_(1,n+1) | E_(1,n)) yields

   Pr(A_(1,n+1) | E_(1,n)) = Σ_(i,k) [(1 − c)(k/N) + c(k + s − i)/N] C(k, i)C(N − k, s − i)/C(N, s) Pr(C_(k,n)).
We now need the fact that the first raw moment of the hypergeometric distribution is

   Σ_i i C(k, i)C(N − k, s − i)/C(N, s) = sk/N,

permitting the simplification

   Pr(A_(1,n+1) | E_(1,n)) = Σ_k [cs/N + (k/N)(1 − cs/N)] Pr(C_(k,n)).

But by definition

   Pr(A_(1,n)) = Σ_k (k/N) Pr(C_(k,n)),

whence

   Pr(A_(1,n+1) | E_(1,n)) = cs/N + (1 − cs/N) Pr(A_(1,n)).      (65a)

By the same method of proof we may show that

   Pr(A_(1,n+1) | E_(2,n)) = (1 − cs/N) Pr(A_(1,n)),      (65b)

and

   Pr(A_(1,n+1) | E_(0,n)) = Pr(A_(1,n)).      (65c)
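Equations 65a and 65b can be confirmed by carrying out the conditional expectation directly for an arbitrary distribution over conditioning states (an illustrative sketch, arbitrary parameter values):

```python
from math import comb

N, s, c = 4, 2, 0.3                        # arbitrary illustrative values

def sample_prob(i, j):
    # hypergeometric probability of i conditioned elements in the sample
    return comb(j, i) * comb(N - j, s - i) / comb(N, s)

P = [0.1, 0.2, 0.3, 0.25, 0.15]            # any distribution over C_0 ... C_N
alpha1 = sum((j / N) * P[j] for j in range(N + 1))

# expected A1 probability on trial n+1 given E1 (resp. E2) on trial n
after_e1 = sum(P[j] * sum(sample_prob(i, j) * ((1 - c) * j / N + c * (j + s - i) / N)
                          for i in range(s + 1))
               for j in range(N + 1))
after_e2 = sum(P[j] * sum(sample_prob(i, j) * ((1 - c) * j / N + c * (j - i) / N)
                          for i in range(s + 1))
               for j in range(N + 1))
```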
Finally, for comparison with other models, we present the expressions for Pr(A_(k,n+1) ∩ E_(j,n) ∩ A_(i,n)). As in previous cases (e.g., Eq. 31a), we give results only for the noncontingent situation in which Pr(E_(0,n)) = 0 and r = 2. Derivations of these probabilities are based on the same methods used in connection with Eq. 65a.

   Pr(A_(1,n+1) ∩ E_(1,n) ∩ A_(1,n)) = π{[1 − c(s−1)/(N−1)]α_(2,n) + [c(s−1)/(N−1)]α_(1,n)}      (66a)

   Pr(A_(1,n+1) ∩ E_(2,n) ∩ A_(1,n)) = (1 − π){(1 − cs/N)α_(2,n) − [c(N−s)/(N(N−1))](α_(1,n) − α_(2,n))}      (66b)

   Pr(A_(1,n+1) ∩ E_(2,n) ∩ A_(2,n)) = (1 − π)[1 − c(s−1)/(N−1)](α_(1,n) − α_(2,n))      (66c)

   Pr(A_(2,n+1) ∩ E_(1,n) ∩ A_(1,n)) = π[1 − c(s−1)/(N−1)](α_(1,n) − α_(2,n))      (66d)

   Pr(A_(1,n+1) ∩ E_(1,n) ∩ A_(2,n)) = π{[1 − c(s−1)/(N−1)](α_(1,n) − α_(2,n)) + (cs/N)(1 − α_(1,n))}      (66e)

   Pr(A_(2,n+1) ∩ E_(1,n) ∩ A_(2,n)) = π{(1 − cs/N)(1 − α_(1,n)) − [1 − c(s−1)/(N−1)](α_(1,n) − α_(2,n))}      (66f)

   Pr(A_(2,n+1) ∩ E_(2,n) ∩ A_(1,n)) = (1 − π){[1 + c(N−s)/(N(N−1))]α_(1,n) − [1 − c(s−1)/(N−1)]α_(2,n)}      (66g)

   Pr(A_(2,n+1) ∩ E_(2,n) ∩ A_(2,n)) = (1 − π){1 − α_(1,n) − [1 − c(s−1)/(N−1)](α_(1,n) − α_(2,n))}      (66h)
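The eight expressions of Eq. 66 must sum to one and must reproduce the sequential probabilities of Eq. 65 when summed appropriately; the following sketch (arbitrary illustrative values of the moments) verifies both identities:

```python
N, s, c, pi = 4, 2, 0.3, 0.7
a1, a2 = 0.55, 0.35                      # arbitrary first and second raw moments

beta  = c * (s - 1) / (N - 1)            # recurring unit c(s-1)/(N-1)
gamma = c * (N - s) / (N * (N - 1))      # recurring unit c(N-s)/(N(N-1))

e66 = [
    pi * ((1 - beta) * a2 + beta * a1),                            # 66a
    (1 - pi) * ((1 - c * s / N) * a2 - gamma * (a1 - a2)),         # 66b
    (1 - pi) * (1 - beta) * (a1 - a2),                             # 66c
    pi * (1 - beta) * (a1 - a2),                                   # 66d
    pi * ((1 - beta) * (a1 - a2) + (c * s / N) * (1 - a1)),        # 66e
    pi * ((1 - c * s / N) * (1 - a1) - (1 - beta) * (a1 - a2)),    # 66f
    (1 - pi) * ((1 + gamma) * a1 - (1 - beta) * a2),               # 66g
    (1 - pi) * (1 - a1 - (1 - beta) * (a1 - a2)),                  # 66h
]
total = sum(e66)
# P(A1 on n+1 and E1 on n), from 66a + 66e, must equal pi times Eq. 65a:
joint_a1_e1 = e66[0] + e66[4]
from_65a = pi * (c * s / N + (1 - c * s / N) * a1)
```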
Application of these equations to the corresponding set of trigram proportions for a preasymptotic trial block is not particularly rewarding. The difficulty is that certain combinations of parameters, e.g., [1 − c(s−1)/(N−1)](α_(1,n) − α_(2,n)) and cs/N, behave as units; consequently, the basic parameters c, s, and N cannot be estimated individually and, as a result, the predictions available from the simpler N-element pattern model via Eq. 32 cannot be improved upon by use of Eq. 66. For asymptotic data, the situation is somewhat different. By substituting the limiting values for α_(1,n) and α_(2,n) in Eq. 66, i.e.,
q ~ rc
1
and fromEq. 63
2
+ rc
~ rc(l rc)fN+ (N_2)S]+
N l 2N  s  1
2
rc ~
rc[N 2s+Ns+2rc(Ns)(Nl)]
 N(2Nsl) ,
we can express the trigram probabilities Pr(A_{k,∞} E_{j,∞} A_{i,∞}) in terms of the basic parameters of the model. The resulting expressions are somewhat cumbersome, however, and we shall not pursue this line of analysis further in the present article.
5.2 Component Models with Stimulus Fluctuation
In the preceding section, as in most of the literature on stimulus
sampling models for learning, we restricted attention to the special case
in which the stimulation effective on successive trials of an experiment
may be considered to represent independent random samples from the
population of elements available under the given experimental conditions.
More generally, we would expect that the independence of successive
samples would depend on the interval between trials. The concept of stimulus sampling in the model corresponds to the process of stimulation in the empirical situation. Thus sampling and resampling from a stimulus population must take time; and if the interval between trials is sufficiently short, there will not be time to draw a completely new sample. We should expect the correlation, or degree of overlap, between successive stimulus samples to vary inversely with the intertrial interval, running from perfect overlap in the limiting case (not empirically realizable) of a zero interval to independence at sufficiently long intervals. These notions have been embodied in the stimulus fluctuation model (Estes, 1955a, 1955b, 1959a). In this section, we shall develop the assumption of stimulus fluctuation in connection with fixed sample size models; consequently, the expressions derived will differ in minor respects from those of the earlier presentations (cited above), which were not restricted to the case of fixed sample size.
Assumptions and Derivation of Retention Curves. Following the convention of previous articles on stimulus fluctuation models, we shall denote by S* the set of stimulus elements potentially available for
sampling under a given set of experimental conditions, by S the subset of elements available for sampling at any given time, and by S' the subset of elements that are temporarily unavailable (so that S* = S ∪ S'). The trial sample, s, is in turn a subset of S; however, in this presentation we shall assume for simplicity that all of the temporarily available elements are sampled on each trial (i.e., S = s). We denote by N, N', and N*, respectively, the numbers of elements in S, S', and S*.
The interchange between the stimulus sample and the remainder of the population, that is, between S and S', is assumed to occur at a constant rate over time. Specifically, we assume that during an interval Δt which is just long enough to permit the interchange of a single element between S and S', there is probability g that such an interchange will occur, the parameter g being constant over time. We shall limit consideration to the special case in which all stimulus elements are equally likely to participate in an interchange. With this restriction, the fluctuation process can be characterized by the difference equation

(67)    f(t+1) = [1 - g(1/N + 1/N')] f(t) + g/N' ,

where f(t) denotes the probability that any given element of S* is in S at time t. This recursion can be solved by standard methods
to yield the explicit formula

(68)    f(t) = N/N* - [N/N* - f(0)][1 - g(1/N + 1/N')]^t
             = J - [J - f(0)] a^t ,

where J = N/N*, the proportion of all elements which are in the sample, and a = 1 - g(1/N + 1/N').
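The agreement between the recursion (67) and its solution (68) is easy to confirm numerically; the sketch below iterates Eq. 67 and compares each step with Eq. 68 (all parameter values are arbitrary illustrations):

```python
from math import isclose

N, Nprime = 4, 6          # elements in S and S'
Nstar = N + Nprime        # total population S*
g = 0.3
J = N / Nstar
a = 1 - g * (1 / N + 1 / Nprime)

f = 1.0                   # f(0) = 1: element starts in S
for t in range(1, 21):
    f = (1 - g * (1 / N + 1 / Nprime)) * f + g / Nprime    # Eq. 67
    closed = J - (J - 1.0) * a ** t                        # Eq. 68
    assert isclose(f, closed)
# f drifts from 1 toward the fixed point J = N/N*
assert J < f < 1.0
```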
Equation 68 can now serve as the basis for deriving numerous expressions of experimental interest. Suppose, for example, that at the end of a conditioning (or extinction) period there were j_0 conditioned elements in S and k_0 conditioned elements in S', the momentary probability of a conditioned response thus being p_0 = j_0/N. To obtain an expression for the probability of a conditioned response after a rest interval of duration t, we proceed as follows. For each conditioned element in S at the beginning of the interval, we need only set f(0) = 1 in Equation 68 to obtain the probability that the element is in S at time t. Similarly, for a conditioned element initially in S', we set f(0) = 0 in Equation 68. Combining the two types, we obtain for the expected number of conditioned elements in S at time t,

j_0[J + (1 - J)a^t] + k_0 J(1 - a^t) .
Dividing by N (and noting that J = N/N*), we have, then, for the probability of a conditioned response at time t,

(69)    p(t) = p*_0 + (p_0 - p*_0) a^t ,

where p*_0 and p_0 denote the proportion of conditioned elements in the total population S* and the initial proportion in S, respectively. If the rest interval begins following a conditioning period, we would ordinarily have p_0 > p*_0, in which case Equation 69 describes a decreasing function (forgetting, or spontaneous regression). If the rest interval begins following an extinction period, we would have p_0 < p*_0, in which case Equation 69 describes an increasing function (spontaneous recovery). The manner in which cases of spontaneous regression or recovery depend on the amount and spacing of previous acquisition or extinction has been discussed in detail elsewhere (Estes, 1955a).
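As a check on the algebra leading to Eq. 69, the element-wise combination and the compact form can be compared directly; the following sketch (our own, with illustrative numbers) does so for a forgetting case (p_0 > p*_0) and verifies that the curve decreases monotonically toward p*_0:

```python
from math import isclose

N, Nprime, g = 4, 6, 0.3
Nstar = N + Nprime
J = N / Nstar
a = 1 - g * (1 / N + 1 / Nprime)

j0, k0 = 3, 1                  # conditioned elements initially in S and S'
p0 = j0 / N                    # initial response probability
pstar = (j0 + k0) / Nstar      # proportion conditioned in all of S*

prev = None
for t in range(0, 30):
    in_S = j0 * (J + (1 - J) * a ** t) + k0 * J * (1 - a ** t)
    p_elementwise = in_S / N             # combine per-element curves, divide by N
    p_eq69 = pstar + (p0 - pstar) * a ** t
    assert isclose(p_elementwise, p_eq69)
    if prev is not None:                 # p0 > pstar: strictly decreasing curve
        assert p_eq69 < prev
    prev = p_eq69
```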
Application to the RTT Experiment. We noted in the preceding section that the fixed sample size model could not provide a generally satisfactory account of RTT experiments because it did not allow for the retention loss usually observed between the first and second tests. It seems reasonable that this defect might be remedied by removing the restriction on independent sampling. To illustrate application of the
more general model with provision for stimulus fluctuation, we shall again consider the case of an RTT experiment in which the probability of a correct response is negligible prior to the reinforced trial (and also on later trials if learning has not occurred). Letting t_1 and t_2 denote the intervals between R and T_1 and between T_1 and T_2, respectively, we may obtain the following basic expressions by setting f(0) equal to 1 or 0, as appropriate, in Equation 68. For the probability that an element sampled on R is sampled again on T_1,

f_1 = J + (1 - J)a^{t_1} ;

for the probability that an element sampled on T_1 is sampled again on T_2,

f_2 = J + (1 - J)a^{t_2} ;

and for the probability that an element not sampled on T_1 is sampled on T_2,

f_3 = J(1 - a^{t_2}) .
Assuming now that N = 1, so that we are dealing with a generalized form of the pattern model, we can write the probabilities of the four combinations of correct and incorrect responses on T_1 and T_2 in
terms of the conditioning parameter c and the parameters f_i:

(70)    p_00 = c f_1 f_2 ,
        p_01 = c f_1 (1 - f_2) ,
        p_10 = c (1 - f_1) f_3 ,
        p_11 = 1 - c + c(1 - f_1)(1 - f_3) ,
where, as before, the subscripts 0 and 1 denote correct responses and errors, respectively. As they stand, Eqs. 70 are not suitable for application to data, for there are too many parameters to be estimated. This difficulty could be surmounted by adding a third test trial, for the resulting eight observation equations would permit overdetermination of the four parameters. In the case of some published studies (e.g., Estes, 1961b) the data can be handled quite well on the assumption that f_1 is approximately unity, in which case Eqs. 70 reduce to
p_00 = c f_2 ,
p_01 = c(1 - f_2) ,
p_10 = 0 ,
p_11 = 1 - c .
In the general case of Eqs. 70, some predictions can be made without information as to exact parameter values. It has been noted in published studies (Estes, Hopkins and Crothers, 1960; Estes, 1961b) that the observed proportion p_01 is generally larger than p_10. Taking the difference between the theoretical expressions for these we have

p_01 - p_10 = c f_1(1 - f_2) - c(1 - f_1) f_3
            = c[J + (1-J)a^{t_1}](1-J)(1 - a^{t_2}) - c(1-J)(1 - a^{t_1}) J(1 - a^{t_2})
            = c(1-J)(1 - a^{t_2})[J + (1-J)a^{t_1} - J(1 - a^{t_1})]
            = c(1-J)(1 - a^{t_2}) a^{t_1} ,

which obviously must be equal to or greater than zero. The experiments cited above have in all cases had t_1 < t_2, and therefore f_1 > f_2.
Since f_2, which is directly estimated by the proportion of instances in which correct responses on T_1 are repeated on T_2, has ranged from about .6 to .9 in these experiments (and f_1 must be larger), it is clear that p_10, the probability of an incorrect followed by a correct response, should be relatively small. This theoretical prediction accords well with observation.
Numerous predictions can be generated concerning the effects of varying the durations of t_1 and t_2. The probability of repeating a correct response from T_1 to T_2, for example, should depend solely on the parameter f_2, decreasing as t_2 increases (and f_2 therefore decreases). The probability of a correct response on T_2 following an incorrect response on T_1 should depend most strongly on f_3, increasing as t_2 (and therefore f_3) increases. The overall proportion correct per test should, of course, decrease from T_1 to T_2 (although the difference between proportions on T_1 and T_2 tends to zero as t_1 becomes large). Data relevant to these and other predictions are available in studies by Estes, Hopkins, and Crothers (1960), Peterson, Saltzman, Hillner, and Land (1962), and Witte (R. Witte, personal communication). The predictions concerning effects of variation of t_2 are well confirmed by these studies. Results bearing on predictions concerning variation in t_1 are not consistent over the set of experiments, possibly because of artifacts arising from item selection (discussed by Peterson et al., 1962).
Application to the Simple Noncontingent Case. We shall restrict consideration to the special case of N = 1; thus, we shall be dealing with a variant of the pattern model in which the pattern sampled on any trial is the one most likely to be sampled on the next trial. No new concepts are required beyond those introduced in connection with the RTT experiment, but it will be convenient to denote by a single symbol, say g, the probability that the stimulus pattern sampled on any trial n is exchanged for another pattern on trial n + 1. In terms of the notation used above,

g = (1 - J)(1 - a^t) ,

where t is now taken to denote the intertrial interval. Also, we denote by u_{1m,n} the probability of the state of the organism in which m stimulus patterns are conditioned to the A_1 response and one of these is sampled, and by u_{2m,n} the probability that m patterns are conditioned to A_1 but a pattern conditioned to A_2 is sampled. Obviously,

p_n = Σ_{m=0}^{N*} u_{1m,n} ,

where, as usual, p_n denotes the probability of the A_1 response on trial n.
Now we can write expressions for trigram probabilities, following essentially the same reasoning used before in the case of the pattern model with independent sampling. For the joint event A_1 E_1 A_1, we obtain
Pr(A_{1,n+1} E_{1,n} A_{1,n}) = π Σ_m u_{1m,n}[1 - g + g(m-1)/N']
                             = π[(1 - g - g/N')p_n + (g/N') Σ_m m u_{1m,n}] ;

for if an element conditioned to A_1 is sampled on trial n, then with probability 1 - g it is resampled, and with probability g(m-1)/N' it is replaced by another element conditioned to A_1; in either event an A_1 response must occur on trial n + 1. Using the abbreviations U_n = Σ_m u_{1m,n} m/N' and V_n = Σ_m u_{2m,n} m/N', the trigram probabilities can be written in relatively compact form:
(71)    Pr(A_{1,n+1} E_{1,n} A_{1,n}) = π[(1 - g - g/N')p_n + gU_n] ,
        Pr(A_{1,n+1} E_{2,n} A_{1,n}) = (1-π)[((1-c)(1-g) - g/N')p_n + gU_n] ,
        Pr(A_{1,n+1} E_{1,n} A_{2,n}) = π[c(1-g)(1 - p_n) + gV_n] ,
        Pr(A_{1,n+1} E_{2,n} A_{2,n}) = (1-π) gV_n ,
        Pr(A_{2,n+1} E_{1,n} A_{1,n}) = π[(g + g/N')p_n - gU_n] ,
        Pr(A_{2,n+1} E_{2,n} A_{1,n}) = (1-π)[(c - cg + g + g/N')p_n - gU_n] ,
        Pr(A_{2,n+1} E_{1,n} A_{2,n}) = π[(1 - c + cg)(1 - p_n) - gV_n] ,
        Pr(A_{2,n+1} E_{2,n} A_{2,n}) = (1-π)[1 - p_n - gV_n] .
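Equations 71 can be checked for internal consistency: the eight trigram probabilities must sum to one (the U_n and V_n terms cancel in the marginals), and the first four must reproduce the recursion for p_{n+1} derived in the text. A short numerical sketch with arbitrary (hypothetical) values:

```python
from math import isclose

pi_, c, g, Np = 0.6, 0.4, 0.2, 5      # Np plays the role of N'
p, U, V = 0.55, 0.30, 0.12            # p_n, U_n, V_n (illustrative)

trigrams = [
    pi_ * ((1 - g - g / Np) * p + g * U),                    # A1 E1 A1
    (1 - pi_) * (((1 - c) * (1 - g) - g / Np) * p + g * U),  # A1 E2 A1
    pi_ * (c * (1 - g) * (1 - p) + g * V),                   # A1 E1 A2
    (1 - pi_) * g * V,                                       # A1 E2 A2
    pi_ * ((g + g / Np) * p - g * U),                        # A2 E1 A1
    (1 - pi_) * ((c - c * g + g + g / Np) * p - g * U),      # A2 E2 A1
    pi_ * ((1 - c + c * g) * (1 - p) - g * V),               # A2 E1 A2
    (1 - pi_) * (1 - p - g * V),                             # A2 E2 A2
]
assert isclose(sum(trigrams), 1.0)

p_next = sum(trigrams[:4])            # events with A1 on trial n + 1
recursion = (1 - c - g - g / Np + c * g) * p + c * (1 - g) * pi_ + g * (U + V)
assert isclose(p_next, recursion)
```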
The chief difference between these expressions and the corresponding ones for the independent sampling models is that sequential effects now depend on the intertrial interval. Consider, for example, the first two of Equations 71, involving repetitions of response A_1. It will be noted that both of these expressions represent linear combinations of p_n and U_n, with the relative contribution of p_n increasing as the intertrial interval (and therefore g) decreases. Also, it is apparent from the defining equations for p_n and U_n that p_n ≥ U_n, with equality obtaining only in the special cases in which both are equal to unity or both equal to zero. Therefore, the probability of a repetition is inversely related to the intertrial interval. In particular, the probability that a correct A_1 or A_2 response will be repeated tends to unity in the limit as the intertrial interval goes to zero. When the intertrial interval becomes large, the parameter g approaches 1 - 1/N*, and Equations 71 reduce to those of a pattern model with N* elements and independent sampling.
Summing the first four of Equations 71, we obtain a recursion for the probability of the A_1 response:

p_{n+1} = (1 - c - g - g/N' + cg)p_n + c(1-g)π + g(U_n + V_n) .

Now, although a full proof would be quite involved, it is not hard to show heuristically that the asymptote is independent of the intertrial interval. We note first that asymptotically we will have
u_{1m,n} ≈ u_{m,n}(m/N*)    and    u_{2m,n} ≈ u_{m,n}(1 - m/N*) ,

where u_{m,n} is the probability that m elements are conditioned to A_1. The substitution of u_{m,n}(m/N*) for u_{1m,n} is possible in view of the intuitively evident fact that, asymptotically, the probability that an element conditioned to A_1 constitutes the trial sample is simply equal to the proportion of such elements in the total population. Substituting this relation, and the analogous one for V_n, into the recursion for p_n, we obtain

p_{n+1} = (1 - c - g - g/N' + cg)p_n + c(1-g)π + g(N*/N')p_n
        = (1 - c + cg)p_n + c(1-g)π ,

the simplification in the last line having been effected by means of the identity

g + g/N' = g(N'+1)/N' = g(N*/N') ,

which holds because N* = N' + 1 when N = 1.
Setting p_{n+1} = p_n = p_∞ and solving for p_∞, we arrive at the tidy outcome

p_∞ = (1 - c + cg)p_∞ + c(1-g)π ,

whence

p_∞ = π .
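The independence of the asymptote from g can also be verified by exact computation: for small N* the full Markov chain over states (m, i), where m counts the patterns conditioned to A_1 and i indicates whether the sampled pattern is conditioned, can be iterated to its stationary distribution. The construction below is our own illustration, with arbitrary parameter values; it finds the stationary probability of sampling a conditioned pattern to be π for quite different values of g:

```python
def stationary_p(pi_, c, g, Nstar):
    Np = Nstar - 1                     # number of temporarily unavailable patterns
    states = [(m, i) for m in range(Nstar + 1) for i in (0, 1)
              if (i == 1 and m >= 1) or (i == 0 and m <= Np)]
    idx = {s: j for j, s in enumerate(states)}
    T = [[0.0] * len(states) for _ in states]

    def exchange(m, i):
        # with prob g the sampled pattern is swapped for one of the Np others,
        # of which m - i are conditioned to A1
        q = (m - i) / Np
        return [((m, 1), g * q), ((m, 0), g * (1 - q)), ((m, i), 1 - g)]

    for (m, i) in states:
        if i == 0:   # sampled pattern unconditioned; E1 (prob pi_) conditions it w.p. c
            outs = [((m + 1, 1), pi_ * c), ((m, 0), pi_ * (1 - c) + (1 - pi_))]
        else:        # sampled pattern conditioned; E2 deconditions it w.p. c
            outs = [((m - 1, 0), (1 - pi_) * c), ((m, 1), pi_ + (1 - pi_) * (1 - c))]
        for (m2, i2), w in outs:
            for (m3, i3), w2 in exchange(m2, i2):
                if w * w2 > 0:
                    T[idx[(m, i)]][idx[(m3, i3)]] += w * w2

    dist = [1.0 / len(states)] * len(states)
    for _ in range(4000):              # power iteration to stationarity
        new = [0.0] * len(states)
        for j, pj in enumerate(dist):
            for k, t in enumerate(T[j]):
                new[k] += pj * t
        dist = new
    return sum(pj for (m, i), pj in zip(states, dist) if i == 1)

for g in (0.1, 0.6):                   # short and long intertrial intervals
    assert abs(stationary_p(0.6, 0.3, g, 5) - 0.6) < 1e-6
```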
The recursion in p_n can be solved, but the resulting formula expressing p_n as a function of n and the parameters is too cumbersome to yield much useful information by visual inspection. It seems intuitively obvious that for g < 1 - 1/N* (i.e., for any but very long intertrial intervals) the learning curve will rise more sharply on early trials than the corresponding curve for the independent sampling case. This is so because only sampled elements can undergo conditioning, and once sampled, an element is more likely to be resampled the shorter the intertrial interval. However, the curves for longer and shorter intervals must cross ultimately, with the curve for the longer interval approaching asymptote more rapidly on later trials (Estes, 1955b). If π = 1, the total number of errors expected during learning must be independent of the intertrial interval; for each initially unconditioned element will continue to produce an error each time it is sampled until it is finally conditioned, and the probability of any specified number of errors prior to conditioning depends only on the value of the
conditioning parameter c. Similarly, if π is set equal to 0 after a conditioning session, the total number of conditioned responses during extinction is independent of the intertrial interval.
5.3 The Linear Model as a Limiting Case
For those experiments in which the available stimuli are the same
on all trials, the possibility arises of using a model that suppresses
the concept of stimuli. In such a "pure" reinforcement model the
learning assumptions specify directly how response probability changes
on a reinforced trial. By all odds the most popular models of this sort are those which assume the probability of a response on a given trial to be a linear function of the probability of that response on the previous trial.11

11 For a discussion of this general class of "incremental" models see the chapter by Sternberg in this volume.

The so-called "linear models" received their first systematic treatment by Bush and Mosteller (1951a, 1955) and have been investigated and developed further by many others. We shall be concerned only with a certain class of such models based on a single learning parameter θ. A more extensive analysis of this class of linear models has been given by Estes and Suppes (1959a).
The linear theory is formulated for the probability of a response on trial n + 1, given the entire preceding sequence of responses and reinforcements.12 Let x_n be the sequence of responses and reinforcements of a given subject through trial n; that is, x_n is a sequence of length 2n with j's (where j = 1 to r) in the odd positions indicating responses and i's (where i = 0 to r) in the even positions indicating reinforcements. The axioms of the linear model are as follows, for every i, i', and k such that 1 ≤ i, i' ≤ r and 0 ≤ k ≤ r:

12 In the language of stochastic processes we have a chain of infinite order.
Ll. If Pr(E. A. I x 1) > 0 then
l,n 1 ,u n
Pr(A. llE
i
A. , x 1) = (1  e)
Pr(A
i
n1xn_l) +
e
1,n+ ,n 1 ,u n ,
L2. If
Pr(E
k
A. I x 1) > 0 ,
k ~
i and
k ~ 0 , then
,TIl ,U u,:","
Pr(A. +llEk A. I x 1) = (1 e) Pr(A. Ix 1)
l,n,ll l,n n J., n n
L3. If
Pr(E
o
A. I x 1) >
0 then
,nJ.,un
Pr(A. +lIE
O
A., x 1) = Pr(A. Ix 1)
l, n .,n ]. ,u n l,n u::
By L1, if the reinforcing event E_i corresponding to response A_i occurs on trial n, then (regardless of the response occurring on trial n) the probability of A_i increases by a linear transform of its old value. By L2, if some reinforcing event other than E_i occurs on trial n, then the probability of A_i decreases by a linear transform of its old value. And by L3, occurrence of the ("neutral") event E_0 leaves response probabilities unchanged. The axioms may be written more compactly in terms of the probability p_{xi,n} that a subject identified with sequence x makes an A_i response on trial n; namely,

1. if the subject receives an E_i event on trial n,

    p_{xi,n+1} = (1 - θ)p_{xi,n} + θ ;

2. if the subject receives an E_k event (k ≠ i and k ≠ 0) on trial n,

    p_{xi,n+1} = (1 - θ)p_{xi,n} ;

3. if the subject receives an E_0 event on trial n,

    p_{xi,n+1} = p_{xi,n} .
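Rules 1-3 are simple enough to state as an update function; the sketch below (our own illustration, arbitrary parameter values) also verifies that iterating the expected update under a noncontingent schedule with Pr(E_1) = π reproduces the familiar mean learning curve π - (π - p_1)(1 - θ)^{n-1}:

```python
from math import isclose

# the compact rules 1-3 as an update function
def linear_update(p, event, theta):
    if event == 1:            # E_i: reinforcement of the response in question
        return (1 - theta) * p + theta
    if event == 0:            # E_0: neutral event, probability unchanged
        return p
    return (1 - theta) * p    # any other reinforcing event E_k

theta, pi_, p1 = 0.15, 0.7, 0.25
p_bar = p1                    # expected probability on trial 1
for n in range(1, 40):
    assert isclose(p_bar, pi_ - (pi_ - p1) * (1 - theta) ** (n - 1))
    # expectation over a noncontingent schedule: E_1 w.p. pi_, E_2 otherwise;
    # the update is linear in p, so expectations can be iterated directly
    p_bar = pi_ * linear_update(p_bar, 1, theta) + (1 - pi_) * linear_update(p_bar, 2, theta)
```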
From a mathematical standpoint it is important to note that for the linear model the response probability associated with a particular subject is free to vary continuously over the entire interval from 0 to 1, since this probability undergoes linear transformations as a result of reinforcement. Consequently, if one wishes to interpret changes in response probability as transitions among states of a Markov process, one must deal with a continuous state space. Thus the Markov interpretation is of little practical value for calculational purposes. In stimulus sampling models, response probability is defined in terms of the proportion of stimuli conditioned; since the set of stimuli is finite, so also is the set of values taken on by the response probability of any individual subject. It is this finite character of stimulus sampling models that makes possible the extremely useful interpretation of the models as finite Markov chains.

An inspection of the three axioms for the linear model indicates that they have the same general form as Equations 60, which describe changes in response probability for the fixed sample size component model; that is, if we let θ = cs/N, then the two sets of rules are similar. As one might expect from this observation, many of the predictions generated by the two models are identical when θ = cs/N.
For example, in the simple noncontingent situation the mean learning curve for the linear model is

(72)    Pr(A_{1,n}) = π - (π - p_1)(1 - θ)^{n-1} ,
which is the same as that of the component model (see Estes and Suppes, 1959a, for a derivation of results for the linear model). However, the two models are not identical in all respects, as is indicated by a comparison of the asymptotic variances of the response distributions. For the linear model

σ²_∞ = π(1-π) θ/(2-θ) ,

as contrasted to Equation 63 for the component model. However, as noted above in connection with Equation 63, in the limit (as N → ∞) σ²_∞ for the component model equals the predicted value for the linear model.
The last result suggests that the component model may converge to the linear process as N → ∞. This conjecture is substantially correct; it can be shown that in the limit both the fixed sample size model and the independent sampling model approach the linear model for a broad class of assumptions governing the sampling of elements. The derivation of the linear model from component models holds for any reinforcement schedule, for any finite number r of responses, and for every trial n, not simply at asymptote. The proof of this convergence theorem is lengthy and we shall not present it here. However, as one might expect, the proof depends on the fact that the variance of the sampling distribution for any statistic of the trial sample approaches 0 as N becomes large. A proof of the convergence theorem is given by Estes and Suppes (1959b). Kemeny and Snell (1957)
also have considered the problem, but their proof is restricted to the two-choice noncontingent situation at asymptote.
Comparison of the Linear and Pattern Models. The same limiting result, of course, does not hold for the pattern model discussed in Sec. 3. For the pattern model only one element is sampled on each trial, and it is obvious that as N → ∞ the learning effect of this sampling scheme would diminish to zero. For experimental situations where both the linear model and the pattern model appear to be applicable it is important to derive differential predictions from the two models which, on empirical grounds, will permit the researcher to choose between them. To this end we display a few predictions for the linear model applied to both the RTT situation and the simple two-response noncontingent situation; these results will be compared with the corresponding equations for the pattern model.
For simplicity, let us assume that in the case of the RTT situation the likelihood of a correct response by guessing is negligible on all trials. Then, according to the linear model, the probability of a reinforced response changes in accordance with the equation

p_{n+1} = (1 - θ)p_n + θ .

In the present application the probability of a correct response on the first trial (the R trial) is zero, and hence the probability of a correct response on the first test trial is simply θ. No reinforcement is given on T_1 and consequently the probability of a
correct response does not change between T_1 and T_2. Therefore p_00, the probability of a correct response on both T_1 and T_2 (as defined in connection with Equation 55), is θ². Similarly, we obtain p_01 = p_10 = θ(1-θ), and p_11 = (1-θ)². Some relevant data are
Insert Table 6 about here
presented in Table 6 (from Estes, 1961b). They represent joint response proportions for 40 subjects, each tested on 15 paired associate items of the type described in Sec. 2.1, the RTT design applied to each item. In order to minimize the probability of correct responses occurring by guessing, these items were introduced (one per trial) into a larger list, the composition of which changed from trial to trial. A critical item introduced on trial n received one reinforcement (paired presentation of stimulus and response members) followed by a test (presentation of stimulus alone) on trial n and trial n + 1, following which it was dropped from the list.
From an inspection of the data column of Table 6 it is obvious that the simple linear model cannot handle these proportions. It suffices to note that the model requires p_01 = p_10, whereas the difference between these two entries in the data column is quite large.
One might try to preserve the linear model by arguing that the pattern of observed results in Table 6 could have arisen as an artifact. If, for example, there are differences in difficulty among items (or, equivalently, differences in learning rate among subjects), then the
Table 6

Observed Joint Response Proportions for RTT Experiment and Predictions from Linear Retention-Loss Model and Sampling Model

Proportion    Observed    Linear Model    Sampling Model
  p_00          .238          .238            .238
  p_01          .147          .238            .152
  p_10          .017          .018            0
  p_11          .598          .506            .610
instances of incorrect response on T_1 would predominantly represent smaller θ values than instances of correct responses. On this account one might expect that the predicted proportion of correct following incorrect responses would be smaller than that allowed for under the "equal θ" assumption, and therefore that the linear model might not actually be incompatible with the data of Table 6. We can easily check the validity of such an argument. Suppose that parameter θ_i is associated with a proportion f_i of the items (or subjects). Then in each case where θ_i is applicable, the probability of a correct response on either test is θ_i. Clearly then, p_01 estimated from a group of items described by differences in θ would be

p_01 = Σ_i f_i θ_i (1 - θ_i) .

But a similar argument yields

p_10 = Σ_i f_i (1 - θ_i) θ_i .

Since, again, the expressions for p_01 and p_10 are equal for all distributions of θ_i, it is clear that individual differences in learning rates alone could not account for the observed results.
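The equality of p_01 and p_10 under any mixture of learning rates follows from the term-by-term symmetry of the two sums; a direct numerical confirmation over random mixtures (our own illustrative sketch):

```python
import random
from math import isclose

random.seed(7)
for _ in range(100):
    k = random.randint(2, 6)
    thetas = [random.random() for _ in range(k)]       # learning rates theta_i
    raw = [random.random() for _ in range(k)]
    f = [w / sum(raw) for w in raw]                    # mixture weights f_i
    p01 = sum(fi * ti * (1 - ti) for fi, ti in zip(f, thetas))
    p10 = sum(fi * (1 - ti) * ti for fi, ti in zip(f, thetas))
    assert isclose(p01, p10)   # equal for every distribution of theta_i
```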
A related hypothesis that might seem to merit consideration is that of individual differences in rates of forgetting. Since the proportion of correct responses on T_2 is less than that on T_1, there is evidently some retention loss, and differences among subjects or items in susceptibility to this retention loss might be a source of
bias in the data. The hypothesis can be formulated in the linear model as follows: The probability of the correct response on T_1 is equal to θ; if, however, there is a retention loss, then the probability of a correct response on T_2 will have declined to some value ρ such that ρ < θ. If there are individual differences in amount of retention loss, then we should again categorize the population of subjects and items into subgroups, with a proportion f_i of the subjects characterized by retention parameter ρ_i. Theoretical expressions for the p_{ij} can be derived for such a population by the same method used in the preceding case; the results are as follows:

p_00 = θ Σ_i f_i ρ_i ,
p_01 = θ Σ_i f_i (1 - ρ_i) ,
p_10 = (1 - θ) Σ_i f_i ρ_i ,
p_11 = (1 - θ) Σ_i f_i (1 - ρ_i) .
This time the expressions for p_01 and p_10 are different; with a suitable choice of parameter values, they could accommodate the difference between the observed proportions p_01 and p_10. However, another difficulty remains. To obtain a near zero value for p_10 would require either a θ near unity, which would be incompatible with the observed proportion of .385 correct on T_1, or a value of Σ_i f_i ρ_i near zero, which would be incompatible with the observed
proportion of .255 correct on T_2. Thus, we have no support for the hypothesis that individual differences in amount of retention loss might account for the pattern of empirical values.
One can go on in a similar fashion and examine the results of supplementing the original linear model by hypotheses involving more complex combinations or interactions of possible sources of bias (see Estes, 1961b). For example, one might assume that there are large individual differences in both learning and retention parameters. But even with this latitude it is not easy to adjust the linear model to the RTT data. Suppose that we admit different learning parameters, θ_1 and θ_2, and different retention parameters, ρ_1 and ρ_2, the combination θ_1 ρ_1 obtaining for half the items and the combination θ_2 ρ_2 for the other half. Now the formulas become

p_00 = [θ_1 ρ_1 + θ_2 ρ_2]/2 ,
p_01 = [θ_1(1 - ρ_1) + θ_2(1 - ρ_2)]/2 ,
p_10 = [(1 - θ_1)ρ_1 + (1 - θ_2)ρ_2]/2 ,
p_11 = [(1 - θ_1)(1 - ρ_1) + (1 - θ_2)(1 - ρ_2)]/2 .

From the data column of Table 6, the proportions of correct responses on the first and second test trials are p_{0.} = .385 and p_{.0} = .255, respectively. Adding the first and second of the equations above to obtain the theoretical expression for p_{0.}, and the first and third equations to get p_{.0}, we have
p_{0.} = (θ_1 + θ_2)/2    and    p_{.0} = (ρ_1 + ρ_2)/2 .

Equating theoretical and observed values, we obtain the constraints θ_1 + θ_2 = .77 and ρ_1 + ρ_2 = .51, which should be satisfied by the parameter values. If the proportion p_00 in Table 6 is to be predicted correctly, we must have further

[θ_1 ρ_1 + θ_2 ρ_2]/2 = .238 ,

or, substituting from the two constraints just above,

θ_1 ρ_1 + (.77 - θ_1)(.51 - ρ_1) = .476 ,

which may be solved for θ_1:

θ_1 = (.083 + .77 ρ_1)/(2ρ_1 - .51) .

Now the admissible range of parameter values can be further reduced. For the right-hand side of this last equation to have a value between 0 and 1, ρ_1 must be greater than .48, so we have the relatively
narrow bounds .48 < ρ_1 ≤ .51 and 0 ≤ ρ_2 < .03 on the parameters ρ_i. Using these bounds on ρ_1, we find from the equation expressing θ_1 as a function of ρ_1 that θ_1 must in turn satisfy .93 ≤ θ_1 ≤ 1.0.
But now the model is in trouble, for in order also to satisfy the constraint θ_1 + θ_2 = .77, θ_2 would have to be negative (and the correct response probabilities for half of the items on T_1 would also be negative). About the best we can do, without allowing "negative probabilities," is to use the limits we have obtained for ρ_1, ρ_2, and θ_1 and arbitrarily assign a zero or small positive value to θ_2. Choosing the combination θ_1 = .95, θ_2 = .01, ρ_1 = .5, and ρ_2 = .01, we obtain the theoretical values listed for the linear model in Table 6.
By introducing additional assumptions or additional parameters, we could improve the fit of the linear model to these data, but there would seem to be little point in doing so. The refractoriness of the data to description by any reasonably simple form of the model suggests that perhaps the learning process is simply not well represented by the type of growth function embodied in the linear model.

By contrast, these data can be quite readily handled by the stimulus fluctuation model developed in the preceding section. Letting f_1 = 1 in Equations 70, and using the estimates c = .39 and f_2 = .61, we obtain the theoretical values listed under "Sampling Model" in Table 6.
One would not, of course, claim that the sampling model has been rigorously tested, since two parameters had to be estimated and there are only three degrees of freedom in this set of data. However, the model does seem more promising than any of the variants of the linear model that have been investigated. More stringent tests of the sampling model can readily be obtained by running similar experiments with longer sequences of test trials, since predictions concerning joint response proportions over blocks of three or more test trials can be generated without additional assumptions.
Additional Comparisons Between the Linear and Pattern Model. We now turn to a few comparisons between the linear model and the multi-element pattern model for the simple noncontingent situation. First of all, we note that the mean learning curves for the two models (as given in Equation 37 and Equation 72) are identical if we let

    c/N = θ .

However, the expressions for the variance of the asymptotic response distribution are different; for the linear model

    σ²_∞ = π(1 − π) θ/(2 − θ) ,

whereas for the pattern model

    σ²_∞ = π(1 − π)/N .
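The two asymptotic variance expressions can be checked numerically. The following sketch (parameter values are illustrative only) estimates the linear-model variance by Monte Carlo, assuming the standard linear-operator updating p → (1 − θ)p + θ on E_1 trials and p → (1 − θ)p on E_2 trials, and compares it with both closed forms.

```python
import random

def linear_model_var(pi, theta, n_trials=200000, burn_in=1000, seed=1):
    """Monte Carlo estimate of the asymptotic variance of the response
    probability p_n in the linear model with noncontingent schedule pi."""
    rng = random.Random(seed)
    p = 0.5
    samples = []
    for t in range(n_trials):
        # E1 reinforcement with probability pi, E2 otherwise
        if rng.random() < pi:
            p = (1 - theta) * p + theta
        else:
            p = (1 - theta) * p
        if t >= burn_in:
            samples.append(p)
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** 2 for x in samples) / len(samples)

pi, theta, N = 0.7, 0.2, 10
pred_linear = pi * (1 - pi) * theta / (2 - theta)   # linear-model formula
pred_pattern = pi * (1 - pi) / N                    # pattern-model formula
est = linear_model_var(pi, theta)
print(round(pred_linear, 4), round(pred_pattern, 4), round(est, 4))
```

Even with c/N = θ (matched mean curves), the two closed-form variances differ, which is the point of the comparison in the text.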
This difference is reflected in another prediction that provides a more direct experimental test of the two models. This concerns the asymptotic variance of the distribution of the number of A_1 responses in a block of K trials, which we denote Var(K). For the linear model (cf. Estes and Suppes, 1959a),

    Var(K) = Kπ(1 − π) + [2π(1 − π)(1 − θ)/θ(2 − θ)]{Kθ − [1 − (1 − θ)^K]} ;

for the pattern model,

    Var(K) = Kπ(1 − π) + [2π(1 − π)(1 − c)N/c²]{Kc/N − [1 − (1 − c/N)^K]} .

Note that, for c = θ, the variance for the pattern model is larger than for the linear model. However, for the case of θ = c/N, the variance for the pattern model can be larger or smaller than for the linear model, depending on the particular values of c and N.
Finally, we present certain asymptotic sequential predictions for the linear model in the noncontingent situation; namely,

    lim_{n→∞} Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = (1 − θ)a + θ ,

    lim_{n→∞} Pr(A_{1,n+1} | E_{1,n} A_{2,n}) = 1 − (1 − θ)b ,

where a = [2π(1 − θ) + θ]/(2 − θ) and b = [2(1 − π)(1 − θ) + θ]/(2 − θ).
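These limiting sequential probabilities are functions of π and θ only, so they can be computed directly and cross-checked by simulation. A sketch (illustrative parameter values; the simulation tallies the conditional proportions at asymptote):

```python
import random

def seq_probs(pi, theta, n_trials=400000, burn_in=2000, seed=2):
    """Estimate Pr(A1 on n+1 | E1 and A1 on n) and
    Pr(A1 on n+1 | E1 and A2 on n) for the linear model."""
    rng = random.Random(seed)
    p = 0.5
    counts = {"11": [0, 0], "12": [0, 0]}   # key: (E1, A1) or (E1, A2) on trial n
    prev_key = None
    for t in range(n_trials):
        resp1 = rng.random() < p            # A1 response on this trial?
        if prev_key is not None and t > burn_in:
            counts[prev_key][0] += 1
            if resp1:
                counts[prev_key][1] += 1
        e1 = rng.random() < pi              # E1 reinforcement with probability pi
        p = (1 - theta) * p + (theta if e1 else 0.0)
        prev_key = ("11" if resp1 else "12") if e1 else None
    return {k: v[1] / v[0] for k, v in counts.items()}

pi, theta = 0.7, 0.2
a = (2 * pi * (1 - theta) + theta) / (2 - theta)
b = (2 * (1 - pi) * (1 - theta) + theta) / (2 - theta)
pred = {"11": (1 - theta) * a + theta, "12": 1 - (1 - theta) * b}
est = seq_probs(pi, theta)
print(pred, est)
```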
These predictions are to be compared with Eq. 34 for the pattern model. In the case of the pattern model we note that Pr(A_{1,n+1} | E_{1,n} A_{1,n}) and Pr(A_{1,n+1} | E_{2,n} A_{2,n}) depend only on π and N, whereas Pr(A_{1,n+1} | E_{2,n} A_{1,n}) and Pr(A_{1,n+1} | E_{1,n} A_{2,n}) depend on π, N, and c. In contrast, all four sequential probabilities depend on π and θ in the linear model. For detailed comparisons between the linear model and the pattern model in application to two-choice data, the reader is referred to Suppes and Atkinson (1960) and Estes and Suppes (1962).
5.4 Applications to Multiperson Interactions
In this section we apply the linear model to experimental situations involving multiperson interactions in which the reinforcement for any given subject depends both on his response and on the responses of other subjects. Several recent investigations have provided evidence indicating the fruitfulness of this line of development. For example, Bush and Mosteller (1955) have analyzed a study of imitative behavior in terms of their linear model, and Estes (1957a), Burke (1959, 1960), and Atkinson and Suppes (1958) have derived and tested predictions from linear models for behavior in two- and three-person games. Suppes and Atkinson (1960) have also provided a comparison between pattern models and linear models for such experiments and have extended the analysis to situations involving communication between subjects, monetary payoff, social pressure, economic oligopolies, and related variables.
The simple two-person game has particular advantages for expository purposes, and we use this situation to illustrate the technique of extending the linear model to multiperson interactions. We consider a situation which, from the standpoint of game theory (see, e.g., Luce and Raiffa, 1957), may be characterized as a game in normal form with a finite number of strategies available to each player. Each play of the game constitutes a trial, and a player's choice of a strategy for a given trial corresponds to the selection of a response. To avoid problems having to do with the measurement of utility (or, from the viewpoint of learning theory, problems of reward magnitude), we assume a unit reward that is assigned on an all-or-none basis. Rules of the game require the two players to exhibit their choices simultaneously on all trials (as in a game of matching pennies), and each player is informed that, given the choice of the other player on the trial, there is exactly one choice leading to the unit reward.
We designate the two players as A and B and let A_i (i = 1, ..., r) and B_j (j = 1, ..., r') denote the responses available to the two players. The set of reinforcement probabilities prescribed by the experimenter may be represented in a matrix (a_ij, b_ij) analogous to the "payoff matrix" familiar in game theory. The number a_ij represents the probability of Player A being correct on any trial of the experiment given the response pair A_i B_j; similarly, b_ij is the probability of Player B being correct given the response pair A_i B_j. For example, consider the matrix

                 B_1          B_2

    A_1       1/2, 1/2       1, 0

    A_2        1, 0          0, 1

When both subjects make response 1, each has probability 1/2 of receiving reward; when both make response 2, then only Player B receives reward; when either of the other possible response pairs occurs (i.e., A_2 B_1 or A_1 B_2), then only Player A receives reward. It should be emphasized that although one usually thinks of one player winning and the other losing on any given play of a game, this is not a necessary restriction on the model. In theory, and in experimental tests of the theory, it is quite possible to permit both or neither of the players to be rewarded on any trial. However, to provide a relatively simple theoretical interpretation of reinforcing events it is essential that on a nonrewarded trial the player be informed (or led to infer) that some other choice, had he made it under the same circumstances, would have been successful. We return to this point later.
Let E^(A)_i denote the event of reinforcing the A_i response for Player A and E^(B)_j the event of reinforcing the B_j response for Player B. To simplify our analysis we consider the case in which each subject has two response alternatives, and we define the probability of occurrence of a reinforcing event in terms of the payoff parameters as follows (for i ≠ i' and j ≠ j'):

    Pr(E^(A)_i | A_{i,n} B_{j,n}) = a_ij ,        Pr(E^(B)_j | A_{i,n} B_{j,n}) = b_ij ,
                                                                                    (73)
    Pr(E^(A)_{i'} | A_{i,n} B_{j,n}) = 1 − a_ij ,  Pr(E^(B)_{j'} | A_{i,n} B_{j,n}) = 1 − b_ij .
For example, if Player A makes an A_1 response and is rewarded, then an E^(A)_1 occurs; however, if an A_1 is made and no reward occurs, then we assume that the other response is reinforced; i.e., an E^(A)_2 occurs.
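The definitions in (73) translate directly into a sampling rule: with probability a_ij the response Player A just made is reinforced, and with probability 1 − a_ij the other response is; similarly for B. A minimal sketch, using the illustrative payoff matrix from the text (responses indexed 0 and 1 rather than 1 and 2):

```python
import random

# Payoff matrices for the game in the text: entry [i][j] gives the
# probability that A (resp. B) is "correct" when the response pair
# (A_i, B_j) occurs, with 0-based indices.
A_PAYOFF = [[0.5, 1.0], [1.0, 0.0]]
B_PAYOFF = [[0.5, 0.0], [0.0, 1.0]]

def reinforcement_events(i, j, rng):
    """Sample the reinforcing events of Eq. 73: with probability a_ij
    Player A's own response A_i is reinforced, otherwise the other
    response; likewise for B with b_ij.  Returns the reinforced
    response index for each player."""
    e_a = i if rng.random() < A_PAYOFF[i][j] else 1 - i
    e_b = j if rng.random() < B_PAYOFF[i][j] else 1 - j
    return e_a, e_b

rng = random.Random(0)
# When both players choose response 2 (index 1), B is always correct
# and A's other response is always reinforced:
print(reinforcement_events(1, 1, rng))   # → (0, 1)
```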
Finally, one last definition to simplify notation. We denote Player A's response probability by α and Player B's by β, and we denote by γ the joint probability of an A_1 and B_1 response. Specifically,

    α_n = Pr(A_{1,n}) ,   β_n = Pr(B_{1,n}) ,   γ_n = Pr(A_{1,n} B_{1,n}) .
We now derive a theorem that provides recursive expressions for α_n and β_n and points up a property of the model that greatly complicates the mathematics; namely, that both α_{n+1} and β_{n+1} depend on the joint probability γ_n = Pr(A_{1,n} B_{1,n}).

Theorem.

    α_{n+1} = (1 − θ_A)α_n + θ_A[a_11 γ_n + a_12(α_n − γ_n) + (1 − a_21)(β_n − γ_n)
              + (1 − a_22)(1 − α_n − β_n + γ_n)] ,                                  (75a)

    β_{n+1} = (1 − θ_B)β_n + θ_B[b_11 γ_n + b_21(β_n − γ_n) + (1 − b_12)(α_n − γ_n)
              + (1 − b_22)(1 − α_n − β_n + γ_n)] ,                                  (75b)

where θ_A and θ_B are the learning parameters for players A and B.
Proof. It will suffice to derive the difference equation for α_{n+1}, since the derivation for β_{n+1} is identical. To begin with, from Axioms L1 and L2 we can easily show that the general form of a recursion for α_n is

    α_{n+1} = (1 − θ_A)α_n + θ_A Pr(E^(A)_{1,n}) .

The term Pr(E^(A)_{1,n}) can then be expanded as follows:

    Pr(E^(A)_{1,n}) = Σ_{i,j} Pr(E^(A)_{1,n} A_{i,n} B_{j,n})
                    = Σ_{i,j} Pr(E^(A)_{1,n} | A_{i,n} B_{j,n}) Pr(A_{i,n} B_{j,n}) ,

and by (73)

    Pr(E^(A)_{1,n}) = a_11 Pr(A_{1,n} B_{1,n}) + a_12 Pr(A_{1,n} B_{2,n})
                      + (1 − a_21) Pr(A_{2,n} B_{1,n}) + (1 − a_22) Pr(A_{2,n} B_{2,n}) .   (76)
Next we observe that

    Pr(A_{1,n} B_{2,n}) = Pr(B_{2,n} | A_{1,n}) Pr(A_{1,n})
                        = [1 − Pr(B_{1,n} | A_{1,n})] Pr(A_{1,n})
                        = Pr(A_{1,n}) − Pr(A_{1,n} B_{1,n}) .                       (77a)

Similarly,

    Pr(A_{2,n} B_{1,n}) = Pr(B_{1,n}) − Pr(A_{1,n} B_{1,n}) ,                       (77b)

and

    Pr(A_{2,n} B_{2,n}) = [1 − Pr(A_{1,n} | B_{2,n})] Pr(B_{2,n})
                        = Pr(B_{2,n}) − Pr(A_{1,n} B_{2,n})
                        = 1 − Pr(B_{1,n}) − Pr(A_{1,n}) + Pr(A_{1,n} B_{1,n}) .     (77c)
Substituting into Equation 76 from Equations 77a, 77b, and 77c and simplifying by means of the definitions of α, β, and γ, we obtain

    Pr(E^(A)_{1,n}) = a_11 γ_n + a_12(α_n − γ_n) + (1 − a_21)(β_n − γ_n)
                      + (1 − a_22)(1 − α_n − β_n + γ_n) .

Substituting this expression into the general recursion for α_n yields the desired result, which completes the proof.
It has been shown by Lamperti and Suppes (1959) that the limits α, β, and γ exist, whence (letting α_{n+1} = α_n = α, β_{n+1} = β_n = β, and γ_n = γ in 75a and 75b) we have two linear relations that are independent of θ_A and θ_B; namely,

    aα = bβ + cγ + d ,   eβ = fα + gγ + h ,                                         (78)

where

    a = 2 − a_12 − a_22 ,  b = a_22 − a_21 ,  c = a_11 − a_12 + a_21 − a_22 ,  d = 1 − a_22 ,
                                                                                    (79)
    e = 2 − b_21 − b_22 ,  f = b_22 − b_12 ,  g = b_11 − b_21 + b_12 − b_22 ,  h = 1 − b_22 .

By eliminating γ from Equations 78 we obtain the following linear relation in α and β:

    (−ag − cf)α + (bg + ce)β = ch − dg .                                            (80)
Unfortunately, this relationship is one of the few quantitative results that can be directly computed for the linear model. It has, however, the advantageous feature that it is independent of the learning parameters θ_A and θ_B and therefore may be compared directly with experimental data. Application of this result can be illustrated in terms of the game cited earlier, in which the payoff matrix takes the form

                 B_1          B_2

    A_1       1/2, 1/2       1, 0

    A_2        1, 0          0, 1

From Equations 79 we obtain

    a = 1 ,   b = −1 ,    c = 1/2 ,   d = 1 ,
    e = 1 ,   f = 1 ,     g = −1/2 ,  h = 0 ,
and Equation 80 becomes

    β = 1/2 .

From this result we predict immediately that the long-run proportion of B_1 responses for Player B will tend to 1/2. To derive a prediction for Player A, we substitute the known values of the parameters into the first part of Eq. 78 to obtain

    α = 1/2 + γ/2 .

Unfortunately, we cannot compute γ, the asymptotic probability of the A_1 B_1 response pair. However, we know that γ is positive and, since only half of Player B's responses are B_1's, γ cannot be greater than 1/2. Therefore, we have 0 < γ ≤ 1/2, and as a result can set definite bounds on the probability of an A_1 response; namely,

    1/2 < α ≤ 3/4 .
Thus, we have the basis for a rather exacting experimental test, since the asymptotic predictions for both subjects are parameter-free; i.e., they do not depend on the θ values of either subject or on initial response probabilities.
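The parameter-free bounds can be checked by simulating the whole two-person process. The sketch below assumes linear-operator updating for each player and reinforcing events generated as in Eq. 73; over the second half of a long run, Player B's proportion of B_1 responses should hover near 1/2 and Player A's proportion of A_1 responses should fall between 1/2 and 3/4.

```python
import random

def play_game(theta_a, theta_b, n_trials=100000, seed=3):
    """Monte Carlo sketch of the two-person linear model for the payoff
    matrix in the text (responses indexed 0 and 1)."""
    a_pay = [[0.5, 1.0], [1.0, 0.0]]   # a_ij for Player A
    b_pay = [[0.5, 0.0], [0.0, 1.0]]   # b_ij for Player B
    rng = random.Random(seed)
    alpha = beta = 0.5                 # Pr(A_1) and Pr(B_1)
    a1 = b1 = 0                        # response counts over the last half
    for t in range(n_trials):
        i = 0 if rng.random() < alpha else 1
        j = 0 if rng.random() < beta else 1
        if t >= n_trials // 2:
            a1 += i == 0
            b1 += j == 0
        # reinforcing events per Eq. 73
        e_a = i if rng.random() < a_pay[i][j] else 1 - i
        e_b = j if rng.random() < b_pay[i][j] else 1 - j
        alpha = (1 - theta_a) * alpha + (theta_a if e_a == 0 else 0.0)
        beta = (1 - theta_b) * beta + (theta_b if e_b == 0 else 0.0)
    half = n_trials - n_trials // 2
    return a1 / half, b1 / half

a_prop, b_prop = play_game(0.1, 0.1)
print(round(a_prop, 2), round(b_prop, 2))
```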
Of course, by imposing restrictions on the experimentally determined parameters a_ij and b_ij, a variety of results can be obtained. We limit ourselves to the consideration of one such case: choice of the parameters so that the coefficients of γ_n vanish in the recursive equations 75a and 75b. Specifically, if we let c = g = 0 and ae − bf ≠ 0, then
    α_{n+1} = (1 − θ_A a)α_n + θ_A b β_n + θ_A d ,
                                                                                    (81)
    β_{n+1} = (1 − θ_B e)β_n + θ_B f α_n + θ_B h .

Solutions for this system are well known and can be obtained by a number of different techniques; for a detailed discussion of the problem of obtaining explicit expressions for α_n and β_n for arbitrary n, the reader is referred to an article by Burke (1960). We do know, however, that the limits for α_n and β_n exist and are independent of both the initial conditions and θ_A and θ_B. By substituting α = α_{n+1} = α_n and β = β_{n+1} = β_n into the two recursions we obtain

    α = (bh + de)/(ae − bf)   and   β = (ah + df)/(ae − bf) .
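To illustrate the restricted case, one can iterate the expected-value recursions with payoff values chosen so that the γ_n coefficients vanish. The payoff entries below are hypothetical, picked only to satisfy a_11 − a_12 + a_21 − a_22 = 0 and b_11 − b_21 + b_12 − b_22 = 0; on these assumptions the limits should come out independent of the learning parameters.

```python
def iterate_expectations(theta_a, theta_b, a_pay, b_pay, n=5000):
    """Iterate the expected-value recursions of the two-person linear
    model.  gamma is set to alpha*beta here; with the payoff values
    below its coefficient vanishes, so this choice is immaterial."""
    alpha = beta = 0.9   # arbitrary starting point
    for _ in range(n):
        gamma = alpha * beta
        pr_e1a = (a_pay[0][0] * gamma + a_pay[0][1] * (alpha - gamma)
                  + (1 - a_pay[1][0]) * (beta - gamma)
                  + (1 - a_pay[1][1]) * (1 - alpha - beta + gamma))
        pr_e1b = (b_pay[0][0] * gamma + b_pay[1][0] * (beta - gamma)
                  + (1 - b_pay[0][1]) * (alpha - gamma)
                  + (1 - b_pay[1][1]) * (1 - alpha - beta + gamma))
        alpha = (1 - theta_a) * alpha + theta_a * pr_e1a
        beta = (1 - theta_b) * beta + theta_b * pr_e1b
    return alpha, beta

# hypothetical payoffs with vanishing gamma coefficients
a_pay = [[0.9, 0.3], [0.2, 0.8]]
b_pay = [[0.8, 0.3], [0.5, 0.6]]
for thetas in [(0.05, 0.2), (0.3, 0.1)]:
    print([round(x, 4) for x in iterate_expectations(*thetas, a_pay, b_pay)])
```

Both parameter pairs drive the expectations to the same limit, illustrating that the asymptotes depend only on the experimentally manipulable payoff parameters.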
The fact that α and β are independent of θ_A and θ_B under the restrictions imposed on the parameters in no way implies that γ is also independent of these quantities.

Equations 81 provide a very precise test of the model, and the necessary conditions for this test involve only experimentally manipulable parameters. A great deal of experimental work has been conducted on this restricted problem and, in general, the correspondence between predicted and observed values has been very good; for an account of this work see Atkinson and Suppes (1958), Burke (1959, 1960), and Suppes and Atkinson (1960).
In conclusion we should mention that all of the predictions presented in this section are identical to those that can be derived from the pattern model of Section 2. However, in general, only the grosser predictions, such as those for α_n and β_n, are the same for the two models.
6. DISCRIMINATION LEARNING
The distinction between simple learning and discrimination
learning is somewhat arbitrary. By discrimination we refer, roughly
speaking, to the process by which the subject learns to make one response to one of a pair of stimuli and a different response to the
other. But there is an element of discrimination in any learning
situation. Even in the simplest conditioning experiment, the subject learns to make a conditioned response only when the conditioned stimulus is presented, and therefore to do something else when that stimulus is absent. In the paired-associate situation (referred to several times in previous sections) the subject learns to associate the appropriate member of a response set with each member of a set of stimuli, and therefore to "discriminate" the stimuli. The principal basis for differentiation between the two categories of learning seems to be that in the case of discrimination learning the similarity, or communality, between stimuli is a major independent variable; in the case of simple learning, stimulus similarity is an extraneous factor, to be minimized experimentally and neglected in theory so far as possible.
One of the general strategic assumptions of the type of stimulus-response theory which has been associated with the development of stimulus sampling models is that discrimination learning involves simply a combination of processes, each of which can be studied independently in simpler situations: the learning aspect in experiments on simple
acquisition or extinction, and the stimulus relationships in experiments
on stimulus generalization or transfer of training. Thus, there will be nothing new at the conceptual level in our treatment of discrimination. There is adequate scope for analysis of different types of discriminative situations; but since our main concern in this article is with methods rather than content, we shall not go far in this direction. We propose only to show how the processes of association and generalization treated in preceding sections enter into discrimination learning, and this can be accomplished by formulating assumptions and deriving results of general interest for a few important cases.
6.1 The Pattern Model for Discrimination Learning
As in the cases of simple acquisition and probability learning, it is sometimes useful in the treatment of discriminative situations to ignore generalization effects among the stimuli involved in an experiment and regard each stimulus display as a unique pattern. Thus, behavior elicited by the stimulus display will depend only on the subject's reinforcement history with respect to that particular pattern. Two important variants of the model arise according as experimental arrangements do or do not ensure that the subject will sample the entire stimulus display presented on each trial.

Case 1. All cues presented are sampled on each trial. For a classical two-stimulus, two-response discrimination problem (e.g., a Lashley situation with the rat being differentially rewarded for jumping to a black card and avoiding a grey card), our conceptualization requires
a distinction among three types of cues. We shall denote by S_1 the set of component cues present only in the stimulus situation associated with reinforcement of response A_1, by S_2 the set of cues present only in the situation associated with reinforcement of response A_2, and by S_c the set of cues present in both situations. In the example of the Lashley situation, A_1 might be the response of jumping to the left-hand window, A_2 the response of jumping to the right-hand window, S_1 the stimulation present only on trials with black cards, S_2 the stimulation present only on trials with grey cards, and S_c the stimulation common to both types of trials. And we denote by N_1, N_2, and N_c the number of cues in each of these subsets. In standard experiments, the "cues" refer to experimentally manipulable aspects of the situation, such as tones, objects, colors, symbols, or the like, and it is reasonably well known just how many different combinations of these cues will be responded to by subjects as distinct patterns. In some instances, however, the experimenter may have no a priori knowledge as to the patterns distinguishable by the subject; in such instances, the N_i may be treated as unknown parameters to be estimated from data, and the model may thus serve as a tool to aid in securing evidence as to the subject's perceptions of the physical situation.
Suppose, now, that the experimenter's procedure is to present on some trials (T_1 trials) a set of cues including m_1 from S_1 and m_c from S_c, and on the remaining trials (T_2 trials) m_2 cues from S_2 and m_c from S_c. Further, let the two types of trials occur with equal probabilities. We can obtain an expression for the probability of a correct response on a T_1 trial simply by appropriate substitution into Eq. 28, viz.,
    Pr(A_{1,n_1} | T_1) = 1 − [1 − Pr(A_{1,1} | T_1)][1 − c/((N_1 choose m_1)(N_c choose m_c))]^{n_1 − 1} ,   (82)

where n_1 is the ordinal number of the T_1 trial. The corresponding expression for T_2 trials is obtained by replacing N_1 and m_1 with N_2 and m_2.
In the discrimination literature, cues in the sets S_1 and S_2 are commonly referred to as relevant, and those in S_c as irrelevant, since the former are correlated with reinforcing events whereas the latter are not. It is apparent by inspection of Eq. 82 that (for the above-specified experimental conditions) the pattern model predicts that the probability of correct responding will go asymptotically to unity regardless of the numbers of relevant and irrelevant cues, provided only that neither m_1 nor m_2 is equal to zero. Rate of approach to asymptote on each type of trial is inversely related to the total number of patterns available for sampling. Therefore, other things being equal, rate of learning is decreased (and total errors to criterion increased) by the addition of either relevant or irrelevant cues.
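The qualitative claims just made (asymptote of unity; rate governed by the total number of patterns) can be illustrated numerically. In this sketch we assume the number of distinct patterns available on T_1 trials is comb(N1, m1) * comb(Nc, mc); that count is our reading of the experimental setup, not something stated explicitly in this excerpt.

```python
from math import comb

def correct_prob(n1, c, N1, m1, Nc, mc, p1=0.5):
    """Pattern-model learning curve for T1 trials (a sketch of Eq. 82,
    assuming comb(N1, m1) * comb(Nc, mc) distinct patterns).  n1 is the
    ordinal number of the T1 trial, c the conditioning parameter, and
    p1 the initial probability of a correct response."""
    n_patterns = comb(N1, m1) * comb(Nc, mc)
    return 1 - (1 - p1) * (1 - c / n_patterns) ** (n1 - 1)

# adding irrelevant cues (larger Nc) slows learning but does not
# change the asymptote
fast = correct_prob(50, 0.4, N1=3, m1=2, Nc=2, mc=1)
slow = correct_prob(50, 0.4, N1=3, m1=2, Nc=4, mc=1)
print(round(fast, 3), round(slow, 3))
```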
Case 2. Partial sampling of the cues presented on each trial. We consider now the situation which arises if the number of cues presented per trial is too large, or the exposure time too short, for the entire stimulus display to be sampled by the subject. Let us suppose for simplicity that there are only two stimulus displays: the display on T_1 trials comprises the N_1 cues of S_1 together with the N_c cues of S_c, and that on T_2 trials the N_2 cues of S_2 together with the N_c cues of S_c. For a given fixed exposure time, we assume a fixed sample of s cues, there being (N_1 choose s_1)(N_c choose s − s_1) ways of filling the sample with s_1 cues from S_1 and the remainder from S_c. The asymptote of discriminative performance will depend on the size of s relative to N_c. If s ≤ N_c, so that the entire sample can come from the set of irrelevant cues, then the asymptotic probability of a correct response will be less than unity.
In Case 2, two types of patterns need to be distinguished for each type of trial. We can limit consideration to T_1 trials, since analogous arguments hold for T_2. There may be some patterns including only cues from S_c, and learning with respect to these will be on a simple random reinforcement schedule. The proportion of such patterns, w_c, is given by

    w_c = (N_c choose s) / (N_1 + N_c choose s) ,

which is equal to zero if s > N_c.
If T_1 and T_2 trials have equal probabilities, then the probability, to be denoted V_n, that a pattern containing only cues from S_c will be conditioned to the A_1 response on trial n can be obtained from Eq. 28 by setting π_12 = π_21 = 1/2 and Pr(A_{1,n}) = V_n, where

    b_1c = w_c / (N_c choose s) = 1 / (N_1 + N_c choose s)

is the probability that any particular pattern is sampled on a trial; i.e.,

    V_n = 1/2 − (1/2 − V_1)(1 − c b_1c)^{n−1} .                                     (83)
The remaining patterns available on T_1 trials all contain at least one cue from S_1, and thus occur only on trials when response A_1 is reinforced. The probability, to be denoted U_n, that any one of these is conditioned to A_1 on trial n may be similarly obtained by rewriting Eq. 28, this time with π_12 = 0, π_21 = 1, Pr(A_{1,n}) = U_n, and c/N = (1/2) c b_1c; i.e.,

    U_n = 1 − (1 − U_1)(1 − (1/2) c b_1c)^{n−1} ,                                   (84)

where the factor 1/2 enters because these patterns are available for sampling on only 1/2 of the trials.
Now to obtain the probability of an A_1 response if a T_1 display is presented on trial n, we need only combine Eqs. 83 and 84, weighting each by the probability of the appropriate type of pattern, viz.,

    Pr(A_{1,n} | T_{1,n}) = (1 − w_c)U_n + w_c V_n
                          = 1 − (1/2)w_c − (1 − w_c)(1 − U_1)(1 − (1/2) c b_1c)^{n−1}
                            − w_c (1/2 − V_1)(1 − c b_1c)^{n−1} ,                   (85)

which may be simplified, if U_1 = V_1 = 1/2, to

    Pr(A_{1,n} | T_{1,n}) = 1 − (1/2)w_c − (1/2)(1 − w_c)(1 − (1/2) c b_1c)^{n−1} .   (85a)
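The proportion w_c and the implied asymptote 1 − w_c/2 are easy to tabulate. A sketch (cue counts illustrative only):

```python
from math import comb

def irrelevant_pattern_proportion(N1, Nc, s):
    """w_c: proportion of the comb(N1 + Nc, s) samples of size s drawn
    on a T1 trial that contain irrelevant cues only (zero if s > Nc)."""
    if s > Nc:
        return 0.0
    return comb(Nc, s) / comb(N1 + Nc, s)

# asymptotic probability of a correct response is 1 - w_c / 2
for N1, Nc, s in [(4, 2, 3), (4, 2, 2), (2, 4, 2)]:
    wc = irrelevant_pattern_proportion(N1, Nc, s)
    print(round(wc, 3), round(1 - wc / 2, 3))
```

With a sample large enough that it cannot consist entirely of irrelevant cues (s > Nc), w_c = 0 and the asymptote is unity; otherwise the asymptote falls below unity, as stated in the text.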
The resulting expression for the probability of a correct response has a number of interesting general properties. The asymptote, as anticipated, depends in a simple way on w_c, the proportion of "irrelevant patterns." When w_c = 0, the asymptotic probability of a correct response is unity; when w_c = 1, the whole process reduces to simple random reinforcement. Between these extremes, asymptotic performance varies inversely with w_c, so that the terminal proportion of correct responses on either type of trial provides a simple estimate of this parameter from data. The slope parameter, c b_1c, could then be estimated from total errors over a series of trials. As in Case 1, the rate of approach to asymptote proves to depend only on the conditioning parameters and the total number of patterns available for sampling; thus it is a joint function of the total number of cues, N_1 + N_c, and the sample size, s, but does not depend on the relative proportions of relevant and irrelevant cues. The last result may seem implausible, but it should be noted that it depends on the simplifying assumption of the pattern model that there are no transfer effects from learning on one pattern to performance on another pattern which has component cues in common with the first. The situation in this regard will be different for the "mixed model" to be discussed below.
6.2 A Mixed Model

The pattern model may provide a relatively complete account of discrimination data in situations involving only distinct, readily discriminable patterns of stimulation, as, for example, the "paired comparison" experiment discussed in Sec. 3.3 or the verbal discrimination experiment treated by Bower (1962). Also, this model may account for some aspects of the data (e.g., asymptotic performance level, trials to criterion) even in discrimination experiments where similarity, or communality, among stimuli is a major variable. But to account for other aspects of the data in cases of the latter type, it is necessary to deal with transfer effects throughout the course of learning. The approach to this problem which we now wish to consider employs no new conceptual apparatus, but simply a combination of ideas developed in preceding sections.

In the mixed model, the conceptualization of the discriminative situation and the learning assumptions are exactly the same as those of the pattern model discussed in Sec. 6.1. The only change is in the response rule, and that is altered in only one respect. As before, we assume that, once a stimulus pattern has become conditioned to a response, it will evoke that response on each subsequent occurrence (unless on some later trial the pattern becomes reconditioned to a different response, as may occur during reversal of a discrimination). The new feature concerns patterns which have not yet become conditioned to any of the response alternatives of the given experimental situation, but which have component cues in common with other patterns which have been so conditioned. Our assumption is simply that transfer occurs from a conditioned to an unconditioned pattern in accordance with the assumptions utilized in our earlier treatment of compounding and generalization (specifically, by Axiom C2, together with a modified version of C1, of Sec. 4.1).
Before the assumptions about transfer can be employed unambiguously in connection with the mixed model, the notion of the conditioned status of a component cue needs to be clarified. We shall say that a cue is conditioned to response A_i if it is a component of a stimulus pattern that has become conditioned to response A_i. If a cue belongs to two patterns, one of which is conditioned to response A_i and one to response A_j (i ≠ j), then the conditioning status of the cue follows that of the more recently conditioned pattern. If a cue belongs to no conditioned pattern, then it is said to be in the unconditioned, or "guessing," state. Note that a pattern may be unconditioned even though all of its cues are conditioned. Suppose, for example, that a pattern
consisting of cues x, y, and z in a particular arrangement has never been presented during the first n trials of an experiment, but that each of the cues has appeared in other patterns, say wxy and wvz, which have been presented and conditioned. Then all of the cues of pattern xyz would be conditioned, but the pattern would still be in the unconditioned state. Consequently, if wxy had been conditioned to response A_1 and wvz to A_2, the probability of A_1 in the presence of pattern xyz would be 2/3. But if now response A_2 were effectively reinforced in the presence of xyz, its probability of evocation by that pattern would henceforth be unity.
The only new complication arises if an unconditioned pattern includes some cues which are still in the unconditioned state. Several alternative ways of formulating the response rule for this case have some plausibility, and it is by no means sure that any one choice will prove to hold for all types of situations. We shall here limit consideration to the formulation suggested by a recent study of discrimination and transfer which has been analyzed in terms of the mixed model (Estes and Hopkins, 1961). The amended response rule for patterns including unconditioned cues is as follows in this formulation: Axiom C2 of Sec. 4.1 is reinterpreted so that, in a situation involving r response alternatives,

1. if all cues in a pattern are unconditioned, the probability of any response is equal to 1/r;

2. if a pattern (sample) comprises m cues conditioned to response A_i, m' cues conditioned to other responses, and m'' unconditioned cues, then the probability that A_i will be evoked by this pattern is given by

    Pr(A_i) = (m + m''/r) / (m + m' + m'') .

In other words, Axiom C2 holds, but with each unconditioned cue contributing weight 1/r toward the evocation of each of the alternative responses.
To illustrate these assumptions in operation, let us consider a simple classical discrimination experiment involving three cues, a, b, and c, and two responses, A_1 and A_2. We shall assume that the pattern ac is presented on half of the trials, with A_1 reinforced, and bc on the other half of the trials, with A_2 reinforced, the two types of trials occurring in random sequence. We assume further that conditions are such as to ensure the subject's sampling both cues presented on each trial. The possible conditioning states of each pattern and the probability of response A_1 associated with each may now be tabulated as follows:
                        A_1 Probability
        States          to each Pattern
       ac    bc           ac      bc

        1     2            1       0
        1     1            1       1
        2     2            0       0
        2     1            0       1
        0     1           3/4      1
        0     2           1/4      0
        1     0            1      3/4
        2     0            0      1/4
        0     0           1/2     1/2

where a 1, 2, or 0, respectively, in a States column indicates that the pattern is conditioned to A_1, conditioned to A_2, or unconditioned. For each pair of values under States, the associated probabilities of response A_1, computed according to the modified response rule, are given in the corresponding positions under A_1 Probability. To reduce algebraic complications, we shall carry derivations for the special case in which the subject starts the experiment with both patterns unconditioned; then, under the conditions of reinforcement specified above, only the states represented in the first, seventh, sixth, and ninth rows of the table are available to the subject, and for brevity we shall number these states 3, 2, 1, and 0, in the order just listed. That is,
State 3 = pattern ac conditioned to A_1, and pattern bc conditioned to A_2,

State 2 = pattern ac conditioned to A_1, and pattern bc unconditioned,

State 1 = pattern ac unconditioned, and pattern bc conditioned to A_2,

State 0 = both patterns ac and bc unconditioned.
Now these states can be interpreted as the states of a Markov chain, since the probability of transition from any one of them to any other on a given trial is independent of the preceding history. The matrix of probabilities for one-step transitions among the four states takes the following form:

    Q =
        |  1        0         0        0    |
        | c/2    1 − c/2      0        0    |
        | c/2       0      1 − c/2     0    |                                       (86)
        |  0       c/2       c/2    1 − c   |

where the states are ordered 3, 2, 1, 0 from top to bottom and left to right. Thus, State 3 (in which ac is conditioned to A_1, and bc to A_2) is an absorbing state, and the process must terminate in this state, with asymptotic probability of a correct response to each pattern equal to unity. In State 2, ac is conditioned to A_1 but bc is still unconditioned. This state can be reached only from State 0, in which both patterns are unconditioned; the probability of the transition is 1/2 (the probability that pattern ac is presented) times c (the probability that the reinforcing event produces conditioning); thus the entry in the second cell of the bottom row is c/2. From State 2, the subject can go only to State 3, and this transition again has probability c/2. The other cells are filled in similarly.
Now the probability, u_{i,n}, of being in State i on trial n can be derived quite easily for each state. The subject is assumed to start the experiment in State 0 and has probability c of leaving this state on each trial; hence

    u_{0,n} = (1 − c)^{n−1} .
For State 1, we can write a recursion,

    u_{1,n} = (1 − c/2)u_{1,n−1} + (c/2)u_{0,n−1} ,

which holds if n ≥ 2. For, to be in State 1 on trial n, the subject must have entered at the end of trial 1, which has probability c/2, and then remained for n − 2 trials, which has probability (1 − c/2)^{n−2}; or have entered at the end of trial 2, which has probability (1 − c)(c/2), and then remained for n − 3 trials, which has probability (1 − c/2)^{n−3}; ...; or have entered at the end of trial n − 1, which has probability (1 − c)^{n−2}(c/2). The right-hand side of this recursion can be summed to yield

    u_{1,n} = (c/2)(1 − c)^{n−2} Σ_{v=0}^{n−2} [(1 − c/2)/(1 − c)]^v
            = (1 − c/2)^{n−1} − (1 − c)^{n−1} .
By an identical argument, we obtain

    u_{2,n} = (1 − c/2)^{n−1} − (1 − c)^{n−1} ,

and then by subtraction,

    u_{3,n} = 1 − u_{2,n} − u_{1,n} − u_{0,n}
            = 1 − 2(1 − c/2)^{n−1} + (1 − c)^{n−1} .
From the tabulation of states and response probabilities, we know that the probability of response A_1 to pattern ac is equal to 1, 1, 1/4, and 1/2, respectively, when the subject is in State 3, 2, 1, or 0. Consequently, the probability of a correct (A_1) response to ac is obtained simply by summing these response probabilities, each weighted by the state probability, viz.,

    Pr(A_{1,n} | ac) = u_{3,n} + u_{2,n} + (1/4)u_{1,n} + (1/2)u_{0,n}
                     = 1 − 2(1 − c/2)^{n−1} + (1 − c)^{n−1}
                       + (1 − c/2)^{n−1} − (1 − c)^{n−1}
                       + (1/4)[(1 − c/2)^{n−1} − (1 − c)^{n−1}]
                       + (1/2)(1 − c)^{n−1}
                     = 1 − (3/4)(1 − c/2)^{n−1} + (1/4)(1 − c)^{n−1} .              (87)
Equation 87 is written for the probability of an A_1 response to ac on trial n; however, the expression for the probability of an A_2 response to bc is identical, and consequently Eq. 87 expresses also the probability, p_n, of a correct response on any trial, without regard to the stimulus pattern presented. A simple estimator of the conditioning parameter c is now obtainable by summing the error probability over trials. Letting ē denote the expected total errors during learning, we have

    ē = Σ_{n=1}^{∞} (1 − p_n) = (3/4)(2/c) − (1/4)(1/c) = 5/(4c) .
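The chain of Eq. 86, the closed form of Eq. 87, and the total-error expression can be cross-checked by iterating the transition matrix numerically. A sketch (the state ordering and response weights follow the text; c is illustrative):

```python
def state_probs(c, n):
    """Probabilities u_i,n of the four states of the mixed-model Markov
    chain (Eq. 86), found by iterating the one-step transitions from
    State 0 for n - 1 trials."""
    u = [0.0, 0.0, 0.0, 1.0]           # order: state 3, 2, 1, 0
    for _ in range(n - 1):
        u3, u2, u1, u0 = u
        u = [u3 + (c / 2) * (u2 + u1),
             (1 - c / 2) * u2 + (c / 2) * u0,
             (1 - c / 2) * u1 + (c / 2) * u0,
             (1 - c) * u0]
    return u

def p_correct(c, n):
    """Probability of a correct response on trial n (weights 1, 1,
    1/4, 1/2 for states 3, 2, 1, 0)."""
    u3, u2, u1, u0 = state_probs(c, n)
    return u3 + u2 + 0.25 * u1 + 0.5 * u0

c = 0.2
# closed form of Eq. 87 agrees with the chain iteration
closed = 1 - 0.75 * (1 - c / 2) ** 9 + 0.25 * (1 - c) ** 9
print(round(p_correct(c, 10), 6), round(closed, 6))
# expected total errors sum to 5 / (4c)
total_err = sum(1 - p_correct(c, n) for n in range(1, 2001))
print(round(total_err, 3), round(5 / (4 * c), 3))
```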
An example of the sort of prediction involving a relatively direct assessment of transfer effects is the following. Suppose the first stimulus pattern to appear is ac; the probability of a correct response to it is, by hypothesis, 1/2, and if there were no transfer between patterns, the probability of a correct response to bc when it first appeared on a later trial should be 1/2 also. Under the assumptions of the mixed model, however, the probability of a correct response to bc, if it first appeared on trial 2, should be

   (1/4)c + (1/2)(1 - c) = 1/2 - c/4 ;

if it first appeared on trial 3, should be

   (1/4)[1 - (1 - c)^2] + (1/2)(1 - c)^2 = 1/4 + (1/4)(1 - c)^2 ;

and so on, tending to 1/4 after a sufficiently long prior sequence of ac trials.
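These transfer predictions follow a single closed form: if bc first appears on trial k, it has been preceded by k - 1 reinforced ac trials, so the probability of a correct response to it is 1/4 + (1/4)(1 - c)^{k-1}. A minimal sketch (c = 0.4 is an arbitrary illustrative value):

```python
def p_correct_bc_first_appearance(k, c):
    """Probability of a correct response to bc when it first appears on
    trial k, after k - 1 reinforced ac trials (mixed model)."""
    return 0.25 + 0.25 * (1 - c) ** (k - 1)

c = 0.4
print(p_correct_bc_first_appearance(1, c))   # 0.5: no prior training, no transfer
print(p_correct_bc_first_appearance(2, c))   # 1/2 - c/4 = 0.4
print(p_correct_bc_first_appearance(50, c))  # near the limiting value 1/4
```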
Simply by inspection of the transition matrix, we can develop an interesting prediction concerning behavior during the presolution period of the experiment. By presolution period, we mean the sequence of trials prior to the last error for any given subject. We know that the subject cannot be in State 3 on any trial prior to the last error. On all trials of the presolution period, probability of a correct response should be equal either to 1/2 (if no conditioning has occurred) or to 5/8 (if exactly one of the two stimulus patterns has been conditioned to its correct response). Thus the proportion, which we may denote by P_ps, of correct responses over the presolution trial sequence should fall in the interval

   1/2 ≤ P_ps ≤ 5/8 ,

and, in fact, the same bounds obtain for any subset of trials within the presolution sequence. Clearly predictions from this model concerning presolution responding differ sharply from those derivable from any model that assumes a continuous increase in probability of correct responding during the presolution period; this model also differs, though not so sharply, from a pure "insight" model assuming no learning on presolution trials. So far as we know, no data relevant to these differential predictions are available in the literature (though similar predictions have been tested in somewhat different situations: Suppes and Ginsberg, 1962a; Theios, 1961). Now that the predictions are in hand, it seems likely that pertinent analyses will be forthcoming.
The development in this section was for the case where there were only three cues a, b, and c. For the more general case we could assume that there are N_a cues associated with stimulus a, N_b with stimulus b, and N_c with stimulus c. If we assume, as we have in this section, that conditions are such as to ensure the subject's sampling of all cues presented on each trial, then Eq. 87 may be rewritten as

   Pr(A_{1,n} | ac) = 1 - [(1 + w_1)/2](1 - c/2)^{n-1} + (w_1/2)(1 - c)^{n-1} ,

where w_1 = N_c/(N_a + N_c) and, correspondingly, w_2 = N_c/(N_b + N_c). The parameters w_1 and w_2 are indices of similarity between the stimuli ac and bc; as they approach their maximum value of 1, the number of total errors increases. In particular,

   e = 1/c + (w_1 + w_2)/(4c) .

Further, the proportion of correct responses over the presolution trial sequence should fall in either the interval

   1/2 ≤ P_ps ≤ 1/2 + (1/4)(1 - w_2)

or the interval

   1/2 ≤ P_ps ≤ 1/2 + (1/4)(1 - w_1) ,

depending on whether ac or bc is conditioned first. (With N_a = N_b = N_c = 1 we have w_1 = w_2 = 1/2, and these expressions reduce to the results obtained above.)

6.3 Component Models
6.3 Component Models
So long as the number of stimulus patterns involyed in a discrim
ination experiment is relatively small, an analysis in terms of an
appropriate case of the mixed model can be effected along the lines
A. and E. 209b
indicated in Sec. 6:.2. But the number of cues need become only moder
ately large in order to generate a number of patterns so great as to be
unmanageable by these methods. However, if the number of patterns is
large enough so that any particular pattern is unlikely to be sampled
A. and E., 210
more than once during an experiment, the emendations of the response
rule presented in Sec, 6.2 can be neglected and the process treated as
a simple extension of the component model of Sec. 5.1 •
Suppose, for example, that a classical discrimination involved a set S_1 of cues available only on trials when A_1 is reinforced, a set S_2 of cues available only on trials when A_2 is reinforced, and a set S_c of cues common to S_1 and S_2; further, assume that a constant fraction of each set presented is sampled by the subject on any trial. If the two types of trials occur with equal probabilities, and if the numbers of cues in the various sets are large enough so that the number of possible trial samples is larger than the number of trials in the experiment, then we may apply Eq. 53 of Sec. 4.3 to obtain approximate expressions for response probabilities. For example, asymptotically all of the elements of S_1 and half of the elements of S_c (on the average) would be conditioned to response A_1, and therefore probability of A_1 on a trial when S_1 was presented would be predicted by the component model to be

   (N_1 + N_c/2)/(N_1 + N_c) ,

where N_1 and N_c here denote the numbers of elements of S_1 and S_c in the trial sample, which will, in general, be a value intermediate between 1/2 and unity.
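A minimal sketch of this asymptotic prediction (the sampled-set sizes are arbitrary illustrations, and the function name is ours):

```python
def p_a1_given_s1(n1, nc):
    """Asymptotic probability of A1 when S1 is presented: all n1 sampled
    elements of S1 and, on average, half the nc sampled common elements
    are conditioned to A1."""
    return (n1 + nc / 2) / (n1 + nc)

print(p_a1_given_s1(10, 6))  # 13/16 = 0.8125, between 1/2 and 1
```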
Functions for learning curves and other aspects of the data can be derived for various types of discrimination experiments from the assumptions of the component model. Numerous results of this sort have been published (Burke and Estes, 1957; Bush and Mosteller, 1951b; Estes, 1958; Estes, Burke, Atkinson, and Frankmann, 1957; Popper, 1959; Popper and Atkinson, 1958).
6.4 Analysis of a Signal Detection Experiment

Although thus far we have developed stimulus sampling models only in connection with simple associative learning and discrimination learning, it should be noted that such models may have much broader areas of application. On occasion one may even see possibilities of using the concepts of stimulus sampling and association to interpret experiments that, by conventional classifications, do not fall within the area of learning. In this section we examine such a case.

The experiment to be considered fits one of the standard paradigms associated with studies of signal detection (see, e.g., Tanner and Swets, 1954; Swets, Tanner and Birdsall, 1961). The subject's task in this experiment, like that of an observer monitoring a radar screen, is to detect the presence of a visual signal which may occur from time to time in one of several possible locations. Problems of interest in connection with theories of signal detection arise when the signals are faint enough so that the observer is unable to report them with complete accuracy on all occasions. One empirical relation that we would want to account for, in quantitative detail, is that between detection probabilities and the relative frequencies with which signals occur in different locations. Another is the improvement in detection rate that may occur over a series of trials even when the observer receives no knowledge of results.
A possible way of accounting for the "practice effect" is suggested by some rather obvious analogies between the detection experiment and the probability learning experiment considered earlier. We would expect that, when the subject actually detects a signal (in terms of stimulus sampling theory, samples the corresponding stimulus element), he will make the appropriate verbal report. Further, in the absence of any other information, this detection of the signal may act as a reinforcing event, leading to conditioning of the verbal report to other cues in the situation which may have been available for sampling prior to the occurrence of the signal. If so, and if signals occur in some locations more often than in others, then on the basis of the theory developed in earlier sections we should predict that the subject will come to report the signal in the preferred location more frequently than in others on trials when he fails to detect a signal and is forced to respond to background cues. These notions will be made more explicit in connection with the following analysis of a visual recognition experiment reported by Kinchla (1962).

Kinchla employed a forced-choice visual detection situation involving a series of over 900 discrete trials for each subject. Two areas were outlined on a uniformly illuminated milk glass screen. Each trial began with an auditory signal. During the auditory signal one of the following events occurred:

(1) A fixed increment in radiant intensity occurred in area 1 (a T_1 type trial).

(2) A fixed increment in radiant intensity occurred in area 2 (a T_2 type trial).

(3) No change in the radiant character of either signal area occurred (a T_0 type trial).
Subjects were told that a change in illumination would occur in one of the two areas on each trial. Following the auditory signal, the subject was required to make either an A_1 or A_2 response (i.e., select one of two keys placed below the signal areas) to indicate which area he believed had changed in brightness. The subject was given no information at the end of the trial as to whether or not his response was correct. Thus, on a given trial one of three events occurred (T_1, T_2, T_0), the subject made either an A_1 or A_2 response, and a short time later the next trial began.

For a fixed signal intensity the experimenter has the option of specifying a schedule for presenting the T_i events. Kinchla selected a simple probabilistic procedure in which Pr(T_{i,n}) = ζ_i, where ζ_1 + ζ_2 + ζ_0 = 1. Two groups of subjects were run. For Group I, ζ_1 = ζ_2 = .4 and ζ_0 = .2. For Group II, ζ_1 = ζ_0 = .2 and ζ_2 = .6. The purpose of Kinchla's study was to determine how these event schedules influenced the likelihood of correct detections.
The model that we will use to analyze the experiment combines two quite distinct processes: a simple perceptual process defined with regard to the signal events, and a learning process associated with background cues. The stimulus situation is conceptually represented in terms of two sensory elements s_1 and s_2, corresponding to the two alternative signals, and a set S of elements associated with stimulus features common to all trials. On every trial the subject is assumed to sample a single element from the background set S, and he may or may not sample one of the sensory elements. If the s_1 element is sampled an A_1 occurs; if s_2 is sampled an A_2 occurs. If neither sensory element is sampled the subject makes the response to which the background element is conditioned. Conditioning of elements in S changes from trial to trial via a learning process. The sampling of sensory elements depends on the trial type (T_1, T_2, T_0) and is described by a simple probabilistic model. The learning process associated with S is assumed to be the multi-element pattern model presented in Sec. 3. Specifically, the assumptions of the model are embodied in the following statements:
1. If T_i (i = 1, 2) occurs, then sensory element s_i will be sampled with probability h (with probability 1 - h, neither s_1 nor s_2 will be sampled). If T_0 occurs, then neither s_1 nor s_2 will be sampled.

2. Exactly one element is sampled from S on every trial. Given the set S of N elements, the probability of sampling a particular element is 1/N.

3. If s_i (i = 1, 2) is sampled on trial n, then with probability c' the element sampled from S on the trial becomes conditioned to A_i at the end of trial n. If neither s_1 nor s_2 is sampled, then with probability c the element sampled from S becomes conditioned with equal likelihood to A_1 or A_2 at the end of trial n.

4. If sensory element s_i is sampled, then A_i will occur. If neither sensory element is sampled, then the response to which the sampled element from S is conditioned will occur.
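Statements 1-4 specify the model completely, so its response probabilities can also be obtained by direct Monte Carlo simulation. In the following minimal sketch the schedule is Group II's and h is the estimate quoted later in this section, while c = .25, c' = .70 are arbitrary choices preserving the ratio c'/c = 2.8; the function name and the value of N are ours:

```python
import random

def simulate(zeta, h, c, cp, N, trials, seed=1):
    """Simulate the detection model of statements 1-4.
    zeta: (z1, z2, z0) presentation schedule; cp is c'.
    Returns the proportion of A1 responses on T0 trials over the last half
    of the series, an estimate of the asymptotic conditioning proportion p."""
    rng = random.Random(seed)
    background = [rng.randint(1, 2) for _ in range(N)]  # conditioning of S
    a1_t0 = t0 = 0
    for t in range(trials):
        trial = rng.choices((1, 2, 0), weights=zeta)[0]
        elem = rng.randrange(N)                            # statement 2
        sensed = trial in (1, 2) and rng.random() < h      # statement 1
        response = trial if sensed else background[elem]   # statement 4
        if sensed:                                         # statement 3
            if rng.random() < cp:
                background[elem] = trial
        elif rng.random() < c:
            background[elem] = rng.randint(1, 2)
        if trial == 0 and t >= trials // 2:
            t0 += 1
            a1_t0 += (response == 1)
    return a1_t0 / t0

est = simulate((0.2, 0.6, 0.2), h=0.289, c=0.25, cp=0.70, N=10, trials=200_000)
print(est)  # should lie near the theoretical asymptote derived below
```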
If we let p_n denote the expected proportion of elements in S conditioned to A_1 at the start of trial n, then (in terms of statements 1 and 4 above) we can immediately write an expression for the likelihood of an A_i response given a T_j event; namely,

   Pr(A_{1,n} | T_{1,n}) = h + (1 - h)p_n ,          (88a)

   Pr(A_{2,n} | T_{2,n}) = h + (1 - h)(1 - p_n) ,    (88b)

   Pr(A_{1,n} | T_{0,n}) = p_n .                     (88c)
An expression for p_n can be obtained from statements 2 and 3 by the methods used throughout Sec. 3 of this chapter (for a derivation of this result see Atkinson, 1962a). At asymptote, dividing the numerator and denominator by c yields

   p_∞ = [ζ_1 hψ + (1/2)(1 - h(ζ_1 + ζ_2))] / [(ζ_1 + ζ_2)hψ + 1 - h(ζ_1 + ζ_2)] ,     (89)

where ψ = c'/c. Thus the asymptotic value of p_n does not depend on the absolute values of c' and c but only on their ratio.

An inspection of Kinchla's data indicated that the values of Pr(A_i|T_j) were essentially stable over the last 400 or so trials of the experiment; we shall treat this portion of the data as asymptotic. Table 7 presents the observed values of Pr(A_i|T_j) for the last 400 trials, together with the asymptotic predictions computed in terms of Eq. 88 and Eq. 89.

Insert Table 7 about here
Table 7

Predicted and Observed Asymptotic Response Probabilities
for Visual Detection Experiment

                    Group I                Group II
               Observed  Predicted    Observed  Predicted
Pr(A_1|T_1)      .645      .645         .558      .565
Pr(A_2|T_2)       --       .645          --       .724
Pr(A_1|T_0)      .494      .500         .386      .386
   lim_{n→∞} Pr(A_{1,n} | T_{1,n}) = h + (1 - h)p_∞ ,          (90a)

   lim_{n→∞} Pr(A_{2,n} | T_{2,n}) = h + (1 - h)(1 - p_∞) ,    (90b)

   lim_{n→∞} Pr(A_{1,n} | T_{0,n}) = p_∞ .                     (90c)
In order to generate asymptotic predictions we need values for h and ψ. We first note by inspection of Eq. 89 that p_∞ = 1/2 for Group I; in fact, whenever ζ_1 = ζ_2 we have p_∞ = 1/2. Hence, taking the observed asymptotic value for Pr(A_1|T_1) in Group I (i.e., .645) and setting it equal to h + (1 - h)(1/2) yields an estimate of h = .289. The background illumination and the increment in radiant intensity are the same for both experimental groups, and therefore we would require an estimate of h obtained from Group I to be applicable to Group II. In order to estimate ψ, we take the observed asymptotic value of Pr(A_1|T_0) in Group II and set it equal to the right side of Eq. 89 with h = .289, ζ_1 = ζ_0 = .2 and ζ_2 = .6; solving for ψ we obtain ψ = 2.8. Using these estimates of h and ψ, Eqs. 89 and 90 yield the asymptotic predictions given in Table 7.
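The estimation steps and the resulting Table 7 predictions can be reproduced in a few lines (a sketch of Eqs. 89 and 90 using the parameter estimates quoted above; the computed values come out within a few thousandths of the tabled entries):

```python
def p_inf(z1, z2, h, psi):
    """Asymptotic proportion of background elements conditioned to A1 (Eq. 89)."""
    s = h * (z1 + z2)
    return (z1 * h * psi + 0.5 * (1 - s)) / ((z1 + z2) * h * psi + 1 - s)

h, psi = 0.289, 2.8

# Group I (z1 = z2 = .4): p_inf = 1/2 exactly, for any h and psi
print(p_inf(0.4, 0.4, h, psi))

# Group II (z1 = .2, z2 = .6): Eq. 90 predictions
p = p_inf(0.2, 0.6, h, psi)
print(round(h + (1 - h) * p, 3))        # Pr(A1|T1), near the tabled .565
print(round(h + (1 - h) * (1 - p), 3))  # Pr(A2|T2), near the tabled .724
```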
Over all, the equations give an excellent account of these particular response measures. However, a more crucial test of the model is provided by an analysis of the sequential data. To indicate the nature of the sequential predictions that can be obtained, consider the probability of an A_1 response on a T_1 trial given the various trial types and responses that can occur on the preceding trial, i.e.,

   Pr(A_{1,n+1} | T_{1,n+1} A_{i,n} T_{j,n}) ,

where i = 1, 2 and j = 0, 1, 2. Explicit expressions for these quantities can be derived from the axioms by the same methods used throughout this chapter. To indicate their form, theoretical expressions for lim_{n→∞} Pr(A_{1,n+1} | T_{1,n+1} A_{i,n} T_{j,n}) will be given; to simplify notation, the trial subscripts are suppressed in writing these limits. The expressions are as follows:
   Pr(A_1 | T_1 A_1 T_1) = {[h + (1 - h)δ]p_∞ + hγ'(1 - p_∞)}/(NX) + (N - 1)X/N ,      (91a)

   Pr(A_1 | T_1 A_2 T_1) = (1 - h)δ'(1 - p_∞)/[N(1 - X)] + (N - 1)X/N ,                (91b)

   Pr(A_1 | T_1 A_2 T_2) = {hγp_∞ + [h^2 + (1 - h)δ'](1 - p_∞)}/(NY) + (N - 1)X/N ,    (91c)

   Pr(A_1 | T_1 A_1 T_2) = (1 - h)δp_∞/[N(1 - Y)] + (N - 1)X/N ,                       (91d)

   Pr(A_1 | T_1 A_1 T_0) = δ/N + (N - 1)X/N ,                                          (91e)

   Pr(A_1 | T_1 A_2 T_0) = δ'/N + (N - 1)X/N ,                                         (91f)

where γ = c'h + (1 - c'), γ' = c' + (1 - c')h, δ = h + (1 - h)(1 - c/2), δ' = h + (1 - h)(c/2), X = h + (1 - h)p_∞, and Y = h + (1 - h)(1 - p_∞).
It is interesting to note that the asymptotic expressions for lim Pr(A_{i,n}|T_{j,n}) depend only on h and ψ, whereas the conditional probabilities in Eq. 91 are functions of all four parameters N, c, c' and h. Comparable sets of equations can be written for Pr(A_2|T_2 A_i T_j) and Pr(A_2|T_0 A_i T_j).
The expressions in Eq. 91 are rather formidable, but numerical predictions can easily be calculated once values for the parameters have been obtained. Further, independently of the parameter values, certain relations among the sequential probabilities can be specified. As an example of such a relation, it can be shown that Pr(A_1|T_1 A_1 T_0) > Pr(A_1|T_1 A_2 T_0) for any stimulus schedule and any set of parameter values. To see this, simply subtract Eq. 91f from Eq. 91e and note that δ > δ'.
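This inequality is easy to check numerically: under Eqs. 91e and 91f the two conditionals share the term (N - 1)X/N and differ only in δ versus δ', so their difference is (δ - δ')/N = (1 - h)(1 - c)/N > 0. A sketch (the parameter values are illustrative, chosen near the estimates reported below for these data):

```python
def seq_t0(prev_a1, h, c, N, X):
    """Asymptotic Pr(A1 | T1 trial, preceded by an A1 or A2 response on a
    T0 trial), per Eqs. 91e and 91f: delta/N + (N-1)X/N, where delta
    depends on the preceding response."""
    delta = h + (1 - h) * ((1 - c / 2) if prev_a1 else c / 2)
    return delta / N + (N - 1) * X / N

h, c, N, X = 0.289, 0.357, 4, 0.6445   # X = h + (1-h)p with p = 1/2
e91 = seq_t0(True, h, c, N, X)
f91 = seq_t0(False, h, c, N, X)
print(round(e91, 2), round(f91, 2))    # e91 exceeds f91 by (1-h)(1-c)/N
```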
Insert Table 8 about here
In Table 8 the observed values for these sequential statistics are presented as reported by Kinchla. Estimates of these conditional probabilities were computed for individual subjects using the data over the last 400 trials; the averages of these individual estimates are the values given in the table. Each entry is based on 24 subjects.

In order to generate theoretical predictions for the observed entries in Table 8, values for N, c, c' and h are needed. Of course, estimates of h and ψ = c'/c already have been made for this set of data, and therefore it is only necessary to estimate N and either c or c'. We obtain our estimates of N and c by a least squares
Table 8

Predicted and Observed Asymptotic Sequential Response
Probabilities in Visual Detection Experiment

                          Group I                Group II
                     Observed  Predicted    Observed  Predicted
Pr(A_1|T_1 A_1 T_1)    .73       .71          .70       .65
Pr(A_1|T_1 A_2 T_1)    .62       .59          .59       .52
Pr(A_1|T_1 A_2 T_2)    .53       .58          .53       .51
Pr(A_1|T_1 A_1 T_2)    .66       .70          .64       .64
Pr(A_1|T_1 A_1 T_0)    .72       .70          .61       .63
Pr(A_1|T_1 A_2 T_0)    .61       .59          .48       .52
Pr(A_2|T_2 A_1 T_1)    .57       .58          .59       .64
Pr(A_2|T_2 A_2 T_1)    .65       .69          .70       .76
Pr(A_2|T_2 A_2 T_2)    .71       .71          .79       .77
Pr(A_2|T_2 A_1 T_2)    .61       .59          .69       .66
Pr(A_2|T_2 A_1 T_0)    .54       .59          .44       .66
Pr(A_2|T_2 A_2 T_0)    .66       .70          .71       .76
Pr(A_2|T_0 A_1 T_1)    .38       .40          .47       .49
Pr(A_2|T_0 A_2 T_1)    .56       .58          .59       .66
Pr(A_2|T_0 A_2 T_2)    .64       .60          .67       .68
Pr(A_2|T_0 A_1 T_2)    .47       .42          .51       .51
Pr(A_2|T_0 A_1 T_0)    .47       .42          .50       .51
Pr(A_2|T_0 A_2 T_0)    .60       .58          .65       .66
method; i.e., we select values of N and c (where c' = cψ) so that the sum of squared deviations between the 36 observed values in Table 8 and the corresponding theoretical quantities is minimized. The theoretical quantities for Pr(A_1|T_1 A_i T_j) are computed from Eq. 91; theoretical expressions for Pr(A_2|T_2 A_i T_j) and Pr(A_2|T_0 A_i T_j) have not been presented here but are of the same general form as those given in Eq. 91. Using this technique, estimates of the parameters are as follows:

   h = .289 ,   c = .357 ,   c' = 1.00 ,   N = 4 .     (92)

The predictions corresponding to these parameter values are presented in Table 8. When one considers that only four of the possible 36 degrees of freedom represented in Table 8 have been utilized in estimating parameters, the close correspondence between theoretical and observed quantities may be interpreted as giving considerable support to the assumptions of the model.
A great deal of research needs to be done to explore the consequences of this approach to signal detection. In terms of the experimental problem considered in this section, much progress can be made via differential tests among alternative formulations of the model. For example, we postulated a multi-element pattern model to describe the learning process associated with background stimuli; it would be important to determine whether other formulations of the learning process, such as those developed in Sec. 5 or those proposed by Bush and Mosteller (1955), would provide as good or even better theoretical fits than the ones displayed in Tables 7 and 8. Also, it would be valuable to examine variations in the scheme for sampling sensory elements along lines developed by Luce (1959) and Restle (1961).

More generally, further development of the theory is required before one could attempt to deal with the wide range of empirical phenomena encompassed in the approach to perception via decision theory proposed by Swets, Tanner, and Birdsall (1961) and others. Some theoretical work has been done by Atkinson (1961b) along the lines outlined in this section to account for the ROC curves that are typically observed in detection studies and to specify the relation between forced-choice and yes-no experiments. However, this work is still quite tentative, and an evaluation of the approach will require extensive analyses of the detailed sequential properties of psychophysical data.
6.5 Multiple Process Models

Analyses of certain behavioral situations have proved to require formulations in terms of two or more distinguishable, though possibly interdependent, learning processes that proceed simultaneously. For some situations these separate processes may be directly observable; for other situations we may find it advantageous to postulate processes that are unobservable but which determine in some well-defined fashion the sequence of observable behaviors.
A. and E. 222
For example, in Restle's (1955) treatment of discrimination
learning it is assumed that irrelevant stimuli may become "adapted"
over a period of time and thus be: rendered nonfunctionaL Such an
analysis entails. a. twoprocess system. One process has to do with
the conditioning of stimuli to responses, whereas ·,the other process
prescribes both the conditions under which cues become irrelevant and
the rate at which adaptation occurs.
Another application of multiple process models arises with regard
to discrimination prOblems in which either a covert or a directly ob
servable orienting response is required. One process might describe
how the stimuli presented to the subject become conditioned to discrim
inative responses. Another process might specify the acquisition and
extinction of various orienting responses; these orienting responses
would determine the specific subset of the environment that the subject
would perceive on a given triaL For models dealing with this type of
problem see Atkinson (1958), Bower (1959), and WYckoff (1952).
As another example, consider a two-process scheme developed by Atkinson (1960) to account for certain types of discrimination behavior. This model makes use of the distinction, developed in Secs. 3 and 4 of the present chapter, between component models and pattern models, and suggests that the subject may (at any instant in time) perceive the stimulus situation either as a unit pattern or as a collection of individual components. Thus, two perceptual states are defined: one in which the subject responds to the pattern of stimulation and one in which he responds to the separate components of the situation. Two learning processes are also defined. One process specifies how the patterns and components become conditioned to responses, and the second process describes the conditions under which the subject shifts from one perceptual state to another. The second process is governed by the reinforcing schedule, the subject's sequence of responses, and the similarity of the discriminanda. In this model neither the conditioning states nor the perceptual states are observable; nevertheless, the behavior of the subject is rigorously defined in terms of these hypothetical states.

Models of the sort described above are generally difficult to work with mathematically and consequently have had only limited development and analysis. It is for this reason that we select a particularly simple example to illustrate the type of formulation that is possible. The example deals with a discrimination learning task investigated by Atkinson (1961a) in which observing responses are categorized and directly measured.
The experimental situation consists of a sequence of discrete trials. Each trial is specified in terms of the following classifications:

T_1, T_2: Trial type. Each trial is either a T_1 or a T_2. The trial type is set by the experimenter and determines in part the stimulus event occurring on the trial.

R_1, R_2: Observing responses. On each trial, the subject makes either an R_1 or R_2. The particular observing response determines in part the stimulus event for that trial.

s_1, s_2, s_b: Stimulus events. Following the observing response, one and only one of these stimulus events (discriminative cues) can occur; on a T_1 trial either s_1 or s_b can occur, and on a T_2 trial either s_2 or s_b can occur.¹³

A_1, A_2: Discriminative responses. On each trial the subject makes either an A_1 or A_2 response to the presentation of a stimulus event.

O_1, O_2: Trial outcome. Each trial is terminated with the occurrence of one of these events. An O_1 indicates that A_1 was the correct response for that trial, and O_2 indicates that A_2 was correct.

¹³The subscript b has been used to denote the stimulus event that may occur on both T_1 and T_2 trials; the subscripts 1 and 2 denote stimulus events unique to T_1 and T_2 trials, respectively.
The sequence of events on a trial is as follows: (1) The ready signal occurs and the subject responds with R_1 or R_2. (2) Following the observing response, s_1, s_2 or s_b is presented. (3) To the onset of the stimulus event the subject responds with either A_1 or A_2. (4) The trial terminates with either an O_1 or O_2 event.
To keep the analysis simple we consider an experimenter-controlled reinforcement schedule. On a T_1 trial, either an O_1 occurs with probability π_1 or an O_2 with probability 1 - π_1; on a T_2 trial an O_1 occurs with probability π_2, or an O_2 with probability 1 - π_2. The T_1 type trial occurs with probability β and T_2 with probability 1 - β. Thus a T_1 O_1 combination occurs with probability βπ_1; T_1 O_2 with probability β(1 - π_1); and so on.
The particular stimulus event s_i (i = 1, 2, b) that the experimenter presents on any trial depends on the trial type (T_1 or T_2) and the subject's observing response (R_1 or R_2). Specifically:

(i) If an R_1 is made, then (a) with probability α the s_1 event occurs on a T_1 trial and the s_2 event on a T_2 trial; (b) with probability 1 - α the s_b event occurs, regardless of the trial type.

(ii) If an R_2 is made, then (a) with probability α the s_b event occurs, regardless of the trial type; (b) with probability 1 - α the s_1 event occurs on a T_1 trial and s_2 on a T_2 trial.
To clarify this procedure, consider the case where α = 1, π_1 = 1, and π_2 = 0. If the subject is to be correct on every trial, he must make an A_1 on a T_1 type trial and an A_2 on a T_2 type trial. However, the subject can ascertain the trial type only by making the appropriate observing response. That is, R_1 must be made in order to identify the trial type, for the occurrence of R_2 always leads to the presentation of s_b regardless of the trial type. Hence, for perfect responding the subject must make R_1 with probability 1 and then make A_1 to s_1 or A_2 to s_2. The purpose of the Atkinson study was to determine how variations in π_1, π_2 and α would affect both the observing responses and the discriminative responses.
Our analysis of this experimental procedure will be based on the axioms presented in Secs. 2 and 3. However, in order to apply the theory we must first identify the stimulus and reinforcing events in terms of the experimental operations. The identification we offer seems quite natural to us and is in accord with the formulations given in Secs. 2 and 3.

We assume that associated with the ready signal is a set S_R of pattern elements. Each element in S_R is conditioned to either the R_1 or the R_2 observing response; there are N' such elements. At the start of each trial (i.e., with the onset of the ready signal) an element is sampled from S_R and the subject makes the response to which the element is conditioned.

Associated with each stimulus event s_i (i = 1, 2, b) is a set S_i of pattern elements; elements in S_i are conditioned to either the A_1 or the A_2 discrimination response. There are N such elements in each set S_i and, for simplicity, we assume the sets are pairwise disjoint. When the stimulus event s_i occurs, one element is randomly sampled from S_i and the subject makes the discriminative response to which the element is conditioned.
Thus, we have two types of learning processes: one defined on the set S_R and the other defined on the sets S_1, S_b and S_2. Once the reinforcing events have been specified for these processes we can apply our axioms. The interpretation of reinforcement for the discrimination response process is identical to that given in Sec. 3. If a pattern element is sampled from set S_i (i = 1, 2, b) and followed by an O_j (j = 1, 2) outcome, then with probability c the element becomes conditioned to A_j, and with probability 1 - c the conditioning state of the sampled element remains unchanged.
The conditioning process for the S_R set is somewhat more complex in that the reinforcing events for the observing responses are assumed to be subject-controlled. Specifically, if an element conditioned to R_i is sampled from S_R and followed by either an A_1 O_1 or an A_2 O_2 event, then the element will remain conditioned to R_i; however, if A_1 O_2 or A_2 O_1 occurs, then with probability c' the element will become conditioned to the other observing response. Otherwise stated, if an element from S_R elicits an observing response that selects a stimulus event and, in turn, the stimulus event elicits a correct discrimination response (i.e., A_1 O_1 or A_2 O_2), then the sampled element will remain conditioned to that observing response. However, if the observing response selects a stimulus event that gives rise to an incorrect discrimination response (i.e., A_1 O_2 or A_2 O_1), then there will be a decrement in the tendency to repeat that observing response on the next trial.
Given the above identification of events we can now generate a mathematical model for the experiment. To simplify the analysis we let N' = N = 1; namely, we assume that there is one element in each of our stimulus sets and consequently the single element is sampled with probability 1 whenever the set is available. With this restriction we may describe the conditioning state of a subject, at the start of each trial, by an ordered four-tuple <i j k l> where

(1) the first member i is 1 or 2 and indicates whether the single element of S_R is conditioned to R_1 or R_2;

(2) the second member j is 1 or 2 and indicates whether the single element of S_1 is conditioned to A_1 or A_2;

(3) the third member k is 1 or 2 and indicates whether the element of S_b is conditioned to A_1 or A_2;

(4) the fourth member l is 1 or 2 and indicates whether the element of S_2 is conditioned to A_1 or A_2.

Thus, if the subject is in state <i j k l> he will make the R_i observing response; then, to s_1, s_b or s_2, he will make discriminative response A_j, A_k or A_l, respectively.
From our assumptions it follows that the sequence of random variables that take the subject states <i j k l> as values is a 16-state Markov chain. Figure 10 displays the possible transitions that can occur when the subject is in state <1122> on trial n.

Insert Fig. 10 about here

To clarify this tree, let us trace out the top branch. An R_1 is elicited with probability 1, and with probability βπ_1 a T_1 trial with an O_1 outcome occurs; further, given an R_1 response on a T_1 trial there is probability α that the s_1 stimulus event occurs; the onset of the s_1 event elicits a correct response and hence no change occurs in the conditioning state of any of the stimulus patterns. Now consider the next set of branches: an R_1 occurs and we have a T_1 O_1 trial; with probability 1 - α the s_b stimulus is presented and an A_2 occurs; the response is incorrect (in that it is followed by an O_1 event), hence with probability c the element of set S_b becomes conditioned to A_1 and with independent probability c' the element of set S_R becomes conditioned to the alternative observing response, namely R_2.

From this tree we obtain the probabilities corresponding to the <1122> row in the transition matrix. For example, the probability of going from <1122> to <2112> is simply βπ_1(1 - α)cc' + (1 - β)π_2(1 - α)cc'; that is, the sum over the two branches on which s_b is presented, A_2 occurs, and the outcome is O_1. An inspection of the transition matrix yields some important results. For example, if α = 1, π_1 = 1, and π_2 = 0, then states <1112> and <1122> are absorbing and
Fig. 10. Branching process, starting in state <1122>, for a single trial in the two-process discrimination learning model.
hence in the limit Pr(R_{1,n}) = 1, Pr(A_{1,n}|T_{1,n}) = 1, and Pr(A_{2,n}|T_{2,n}) = 1.

As before, let u^(n)_{ijkl} denote the probability of being in state <i j k l> on trial n; when the limit exists, let

   u_{ijkl} = lim_{n→∞} u^(n)_{ijkl} .
Experimentally, we shall be interested in evaluating the following theoretical predictions:

   Pr(R_{1,n}) = u^(n)_{1111} + u^(n)_{1112} + u^(n)_{1121} + u^(n)_{1122}
               + u^(n)_{1211} + u^(n)_{1212} + u^(n)_{1221} + u^(n)_{1222} ,     (93a)

   Pr(A_{1,n}|T_{1,n}) = u^(n)_{1111} + u^(n)_{1112} + u^(n)_{2111} + u^(n)_{2112}
               + α[u^(n)_{1121} + u^(n)_{1122} + u^(n)_{2211} + u^(n)_{2212}]
               + (1 - α)[u^(n)_{1211} + u^(n)_{1212} + u^(n)_{2121} + u^(n)_{2122}] ,     (93b)

   Pr(A_{1,n}|T_{2,n}) = u^(n)_{1111} + u^(n)_{1211} + u^(n)_{2111} + u^(n)_{2211}
               + α[u^(n)_{1121} + u^(n)_{1221} + u^(n)_{2112} + u^(n)_{2212}]
               + (1 - α)[u^(n)_{1112} + u^(n)_{1212} + u^(n)_{2121} + u^(n)_{2221}] ,     (93c)

   Pr(R_{1,n} ∩ A_{1,n}) = u^(n)_{1111} + α u^(n)_{1121} + (1 - α)u^(n)_{1212}
               + (1 - α/2)[u^(n)_{1112} + u^(n)_{1211}]
               + (α/2)[u^(n)_{1122} + u^(n)_{1221}] ,     (93d)

   Pr(R_{2,n} ∩ A_{1,n}) = u^(n)_{2111} + α u^(n)_{2212} + (1 - α)u^(n)_{2121}
               + [(1 + α)/2][u^(n)_{2112} + u^(n)_{2211}]
               + [(1 - α)/2][u^(n)_{2122} + u^(n)_{2221}] .     (93e)

(The joint probabilities in Eqs. 93d and 93e are written for the experimental value β = 1/2.)
The first equation gives the probability of an R_1 response. The second and third equations give the probability of an A_1 response on T_1 and T_2 trials, respectively. Finally, the last two equations present the probability of the joint occurrence of each observing response with an A_1 response.
In the experiment reported by Atkinson (1961a) six groups were run with 40 subjects in each group. For all groups π_1 = .9 and β = .5. The groups differed with respect to the values of α and π_2. For Groups I-III, α = 1; for Groups IV-VI, α = .75. For Groups I and IV, π_2 = .9; for Groups II and V, π_2 = .5; and for Groups III and VI, π_2 = .1. The design can be described by the following array:

                π_2 = .9   π_2 = .5   π_2 = .1
   α = 1.00        I          II         III
   α = .75         IV         V          VI
Given these values of π_1, π_2, and α, our 16-state Markov chain is
irreducible and aperiodic. Thus,

    lim_{n -> infinity} u^(n)_ijkl = u_ijkl

exists and can be obtained by solving the appropriate set of 16 linear
equations (see Eq. 16). The values predicted by the model are given in
Table 9 for the case where c = c'. Values for the u_ijkl were computed
Insert Table 9 about here
and then combined by Eq. 93 to predict the response probabilities. By
presenting a single value for each theoretical quantity in the table we
imply that these predictions are independent of c and c' • Actually
this is not always the case. However, for the schedules employed in
this experiment the dependency of these· asymptotic predictions on c and
c' is virtually negligible. For c = c' ranging over the interval
from .0001 to 1.0 the predicted values given in Table 9 are affected
in only the third or fourth decimal place; it is for this reason that
we present theoretical values to only two decimal places.
In view of these comments it should be clear that the predictions
in Table 9 are based solely on the experimental parameter values.
Consequently, differences between subjects (that may be represented by
intersubject variability in c and c') do not substantially affect
these predictions.
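The linear-equation computation referred to above can be sketched as follows. This is a generic routine for the limiting probabilities of an irreducible, aperiodic chain; the three-state transition matrix at the end is an arbitrary stand-in for the model's 16-state matrix, which would have to be built from the conditioning axioms and the schedule parameters.

```python
# Sketch: for an irreducible, aperiodic chain with transition matrix P,
# the limiting probabilities u solve u P = u together with sum(u) = 1.

def stationary(P):
    n = len(P)
    # Build the system (P^T - I) u = 0, replacing the last equation
    # by the normalization sum(u) = 1.
    A = [[P[j][i] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    A[n - 1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            b[r] -= f * b[col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (b[r] - s) / A[r][r]
    return u

P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
print(stationary(P))  # -> approximately [0.25, 0.5, 0.25]
```

For the model, the same routine applied to the 16 x 16 matrix yields the u_ijkl, which are then combined by Eq. 93.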
Table 9

Predicted and Observed Asymptotic Response Probabilities
in Observing Response Experiment

                    Group I             Group II            Group III
                 Pred. Obs.  SD      Pred. Obs.  SD      Pred. Obs.  SD
Pr(A_1|T_1)       .90  .94  .014      .81  .85  .164      .79  .79  .158
Pr(A_1|T_2)       .90  .94  .014      .59  .61  .134      .21  .23  .182
Pr(R_1)           .50  .45  .279      .55  .59  .279      .73  .70  .285
Pr(R_1 ∩ A_1)     .45  .43  .266      .39  .42  .226      .37  .36  .164
Pr(R_2 ∩ A_1)     .45  .47  .293      .31  .31  .232      .13  .16  .161

                    Group IV            Group V             Group VI
                 Pred. Obs.  SD      Pred. Obs.  SD      Pred. Obs.  SD
Pr(A_1|T_1)       .90  .93  .063      .80  .82  .114      .73  .73  .138
Pr(A_1|T_2)       .90  .95  .014      .60  .68  .114      .27  .25  .138
Pr(R_1)           .49  .50  .257      .52  .53  .305      .63  .72  .263
Pr(R_1 ∩ A_1)     .44  .47  .241      .35  .38  .219      .32  .36  .138
Pr(R_2 ∩ A_1)     .46  .47  .247      .34  .36  .272      .19  .13  .168
In the Atkinson study 400 trials were run and the response proportions
appear to have reached a fairly stable level over the last half
of the experiment. Consequently, the proportions computed over the
final block of 160 trials were used as estimates of asymptotic quantities.
Table 9 presents the mean and standard deviation of the 40 observed
proportions obtained under each experimental condition. As can be seen,
the agreement between theoretical and observed quantities is fairly good.
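For reference, the summary statistics entered in Table 9 (a mean and a standard deviation over the individual-subject proportions in each condition) amount to the following computation. The four proportions listed are invented stand-ins for the 40 observed values, and the text does not specify whether the population or sample form of the standard deviation was used; the population form is shown here.

```python
# Minimal sketch of the summary statistics reported in Table 9.
# The list of proportions below is fabricated for illustration.
from math import sqrt

def mean_sd(props):
    n = len(props)
    m = sum(props) / n
    # Population form of the variance (n in the denominator); whether
    # the original analysis divided by n or n - 1 is an assumption.
    var = sum((p - m) ** 2 for p in props) / n
    return m, sqrt(var)

props = [0.90, 0.95, 0.92, 0.95]   # stand-in for 40 subject proportions
m, sd = mean_sd(props)
print(round(m, 3), round(sd, 3))   # prints: 0.93 0.021
```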
Despite the fact that these gross asymptotic predictions hold up
quite well, it is obvious that some of the predictions from the model
will not be confirmed. The difficulty with the one-element assumption
is that the fundamental theory laid down by the axioms of Sec. 3 is
completely deterministic in many respects. For example, when N' = 1
we have
    Pr(R_1,n+1 | O_1,n ∩ A_1,n ∩ R_1,n) = 1 ;

namely, if an R_1 occurs on trial n and is reinforced (i.e., followed
by an O_1 event), then R_1 will reoccur with probability 1 on trial
n + 1. This prediction, of course, is a consequence of the assumption
that we have but one element in set SR which necessarily is sampled
on every trial. If we assume more than one element, the deterministic
features of the model no longer hold and such sequential statistics
become functions of c, c', N and N'. Unfortunately, for elaborate
experimental procedures of the sort described in this section, the
multi-element case leads to complicated mathematical processes for which it is
extremely difficult to carry out computations. Thus, the generality
of the multi-element assumption may often be offset by the difficulty
involved in making predictions.
Naturally it is usually preferable to choose from the available
models the one that best fits the data, but in the present state of
psychological knowledge no single model is clearly superior to all others
in every facet of analysis. The one-element assumption, despite some of
its erroneous features, may prove to be a valuable instrument for the
rapid exploration of a wide variety of complex phenomena. For most of
the cases we have examined, the predicted mean response probabilities
are usually independent of (or only slightly dependent on) the number of
elements assumed. Thus the one-element assumption may be viewed as a
simple device for computing the grosser predictions of the general theory.
For exploratory work in complex situations, then, we recommend using
the one-element model because of the greater difficulty of computations
for the multi-element models. In advocating this approach we are taking
a methodological position with which some scientists do not agree. Our
position is in contrast to one which asserts that a model should be
discarded once it is clear that certain of its predictions are in error.
We do not take it to be the principal goal (or even, in many cases, an
important goal) of theory construction to provide models for particular
experimental situations. The assumptions of stimulus sampling theory
are intended to describe processes or relationships that are common to a
wide variety of learning situations, but with no implication that behavior
in these situations is a function solely of the variables represented in
the theory. As we have attempted to illustrate by means of numerous
examples, formulation of a model within this framework for a particular
experiment is a matter of selecting the relevant assumptions, or axioms,
of the general theory and interpreting these in terms of the conditions
of the experiment. How much of the variance in a set of data can be
accounted for by a model depends jointly on the adequacy of the theoret
ical assumptions and on the extent to which it has been possible to
realize experimentally the boundary conditions envisaged in the theory
thereby minimizing the effects of variables not represented. In our
view, a model, in application to a given experiment, is not to be
classified as "correct" or "incorrect"; rather, the degree to which it
accounts for the data may provide evidence tending either to support or
to cast doubt on the theory from which the particular model was derived.
REFERENCES
Atkinson, R. C. Mathematical models in research on perception and
learning. In M. Marx (Ed.), Psychological Theory (2nd
Edition). New York: Macmillan Co., 1962a, in press.
Atkinson, R. C. Choice behavior and monetary payoffs. In J. Criswell,
H. Solomon and P. Suppes (Eds.), Mathematical methods in
small group processes. Stanford, California: Stanford
University Press, 1962b, in press.
Atkinson, R. C. A variable sensitivity theory of signal detection.
Technical Report No. 47, Psychology Series, Institute for
Mathematical Studies in the Social Sciences, Stanford
University, 1962c (to be published in Psychological Review).
Atkinson, R. C. The observing response in discrimination learning.
J. exp. Psychol., 1961a, 62, 253-262.
Atkinson, R. C. A theory of stimulus discrimination learning. In
K. J. Arrow, S. Karlin and P. Suppes (Eds.), Mathematical
methods in the social sciences. Stanford, California:
Stanford University Press, 1960, 221241.
Atkinson, R. C. A Markov model for discrimination learning.
Psychometrika, 1958, 23, 309-322.
Atkinson, R. C. A stochastic model for rote serial learning.
Psychometrika, 1957, 22, 87-96.
Atkinson, R. C. and Suppes, P. An analysis of two-person game situations
in terms of statistical learning theory. J. exp. Psychol.,
1958, 55, 369-378.
Billingsley, P. Statistical inference for Markov processes. Chicago:
University of Chicago Press, 1961.
Bower, G. H. A model for response and training variables in paired-
associate learning. Psychol. Rev., 1962, 69, 34-53.
Bower, G. H. Application of a model to paired-associate learning.
Psychometrika, 1961, 26, 255-280.
Bower, G. H. Choice-point behavior. In R. R. Bush and W. K. Estes
(Eds.), Studies in mathematical learning theory. Stanford,
California: Stanford University Press, 1959, 109-124.
Burke, C. J. Some two-person interactions. In K. J. Arrow, S. Karlin
and P. Suppes (Eds.), Mathematical methods in the social
sciences. Stanford, California: Stanford University Press,
1960, 242-253.
Burke, C. J. Applications of a linear model to two-person interactions.
In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical
learning theory. Stanford, California: Stanford University
Press, 1959, 180-203.
Burke, C. J. and Estes, W. K. A component model for stimulus variables
in discrimination learning. Psychometrika, 1957, 22.
Bush, R. R. A survey of mathematical learning theory. In R. Duncan Luce
(Ed.), Developments in mathematical psychology. Glencoe,
Illinois: The Free Press, 1960.
Bush, R. R. and Estes, W. K. (Eds.). Studies in mathematical learning
theory. Stanford, California: Stanford University Press, 1959.
Bush, R. R. and Mosteller, F. Stochastic models for learning.
New York: Wiley, 1955.
Bush, R. R. and Sternberg, S. A single-operator model. In R. R. Bush
and W. K. Estes (Eds.), Studies in mathematical learning
theory. Stanford, California: Stanford University Press,
1959, 204-214.
Bush, R. R. and Mosteller, F. A mathematical model for simple
learning. Psychol. Rev., 1951a, 58, 313-323.
Bush, R. R. and Mosteller, F. A model for stimulus generalization
and discrimination. Psychol. Rev., 1951b, 58, 413-423.
Carterette, T. S. An application of stimulus sampling theory to
summated generalization. J. exp. Psychol., 1961, 62, 448-455.
Crothers, E. J. All-or-none paired associate learning with unit and
compound responses. Unpublished doctoral dissertation,
Indiana University, 1961.
Detambel, M. H. A test of a model for multiple-choice behavior.
J. exp. Psychol., 1955, 49, 97-104.
Estes, W. K. Learning theory. Annual Review of Psychology, 1962, 13.
Estes, W. K. Growth and function of mathematical models for learning.
In Current trends in psychological theory. University of
Pittsburgh Press, 1961a, 134-151.
Estes, W. K. New developments in statistical behavior theory:
differential tests of axioms for associative learning.
Psychometrika, 1961b, 26, 73-84.
Estes, W. K. Learning theory and the new mental chemistry. Psychol.
Rev., 1960a, 67, 207-223.
Estes, W. K. A random-walk model for choice behavior. In K. J. Arrow,
S. Karlin and P. Suppes (Eds.), Mathematical methods in the
social sciences. Stanford, California: Stanford University
Press, 1960b, 265-276.
Estes, W. K. The statistical approach to learning theory. In S. Koch
(Ed.), Psychology: A study of a science. Vol. 2. New York:
McGraw-Hill, 1959a, 380-491.
Estes, W. K. Component and pattern models with Markovian interpretations.
In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical
learning theory. Stanford, California: Stanford University
Press, 1959b.
Estes, W. K. Stimulus-response theory of drive. In M. R. Jones (Ed.),
Nebraska symposium on motivation. Vol. 6. Nebraska:
University of Nebraska Press, 1958.
Estes, W. K. Of models and men. Amer. Psychol., 1957a, 12, 609-617.
Estes, W. K. Theory of learning with constant, variable, or contingent
probabilities of reinforcement. Psychometrika, 1957b, 22,
113-132.
Estes, W. K. Statistical theory of spontaneous recovery and regression.
Psychol. Rev., 1955a, 62, 145-154.
Estes, W. K. Statistical theory of distributional phenomena in learning.
Psychol. Rev., 1955b, 62, 369-377.
Estes, W. K. Toward a statistical theory of learning. Psychol. Rev.,
1950, 57, 94-107.
Estes, W. K. and Burke, C. J. A theory of stimulus variability in
learning. Psychol. Rev., 1953, 60, 276-286.
Estes, W. K., Burke, C. J., Atkinson, R. C., and Frankmann, J. P.
Probabilistic discrimination learning. J. exp. Psychol.,
1957, 233-239.
Estes, W. K. and Hopkins, B. L. Acquisition and transfer in pattern
vs. component discrimination learning. J. exp. Psychol., 1961.
Estes, W. K., Hopkins, B. L., and Crothers, E. J. All-or-none and
conservation effects in the learning and retention of
paired associates. J. exp. Psychol., 1960, 60, 329-339.
Estes, W. K. and Straughan, J. H. Analysis of a verbal conditioning
situation in terms of statistical learning theory. J. exp.
Psychol., 1954, 47, 225-234.
Estes, W. K. and Suppes, P. Foundations of statistical learning theory,
III. Analysis of the noncontingent case. Technical Report,
Psychology Series, Institute for Mathematical Studies in the
Social Sciences, Stanford University, 1962, in preparation.
Estes, W. K. and Suppes, P. Foundations of linear models. In R. R. Bush
and W. K. Estes (Eds.), Studies in mathematical learning theory.
Stanford, California: Stanford University Press, 1959a, 137-179.
Estes, W. K. and Suppes, P. Foundations of statistical learning theory,
II. The stimulus sampling model for simple learning. Technical
Report No. 26, Psychology Series, Institute for Mathematical
Studies in the Social Sciences, Stanford University, 1959b.
Feller, W. An introduction to probability theory and its applications.
(2nd Edition). New York: Wiley, 1957.
Friedman, M. P., Burke, C. J., M., Estes, W. K., and Millward, R. B.
Extended training in a noncontingent situation with
shifting reinforcement probabilities. Paper given at the First
Meetings of the Psychonomic Society, Chicago, Illinois, 1960.
Gardner, R. A. Probability-learning with two and three choices.
Amer. J. Psychol., 1957, 70, 174-185.
Guttman, N. and Kalish, H. I. Discriminability and stimulus
generalization. J. exp. Psychol., 1956, 51, 79-88.
Goldberg, S. Introduction to difference equations. New York: Wiley, 1958.
Hull, C. L. Principles of behavior: An introduction to behavior theory.
New York: Appleton-Century-Crofts, 1943.
Jarvik, M. E. Probability learning and a negative recency effect in the
serial anticipation of alternative symbols. J. exp. Psychol.,
1951, 41, 291-297.
Jordan, C. Calculus of finite differences. New York: Chelsea.
Kemeny, J. G. and Snell, J. L. Finite Markov chains. Princeton,
New Jersey: D. Van Nostrand Co., 1960.
Kemeny, J. G. and Snell, J. L. Markov processes in learning theory.
Psychometrika, 1957, 22.
Kemeny, J. G., Snell, J. L., and Thompson, G. L. Introduction to
finite mathematics. New York: Prentice-Hall, 1957.
Kinchla, R. A. Learned factors in visual discrimination. Unpublished
doctoral dissertation, University of California, Los Angeles,
1962.
Lamperti, J. and Suppes, P. Chains of infinite order and applications
to learning theory. Pacific J. Math., 1959, 9, 739-754.
Luce, R. D. Individual choice behavior: A theoretical analysis.
New York: Wiley, 1959.
Luce, R. D. and Raiffa, H. Games and decisions. New York: Wiley, 1957.
Nicks, D. C. Prediction of sequential two-choice decisions from event
runs. J. exp. Psychol., 1959, 57, 105-114.
Peterson, L. R., Saltzman, D., Hillner, K., and Land, V. Recency and
frequency in paired-associate learning. J. exp. Psychol., 1962.
Popper, J. Mediated generalization. In R. R. Bush and W. K. Estes (Eds.),
Studies in mathematical learning theory. Stanford, California:
Stanford University Press, 1959.
Popper, J. and Atkinson, R. C. Discrimination learning in a verbal
conditioning situation. J. exp. Psychol., 1958, 56, 21-26.
Restle, F. Psychology of choice. New York: Wiley, 1961.
Restle, F. A theory of discrimination learning. Psychol. Rev., 1955,
62, 11-19.
Solomon, R. L. and Wynne, L. C. Traumatic avoidance learning:
acquisition in normal dogs. Psychol. Monogr., 1953, No. 4
(Whole No. 354).
Spence, K. W. The nature of discrimination learning in animals.
Psychol. Rev., 1936, 43, 427-449.
Stevens, S. S. On the psychophysical law. Psychol. Rev., 1957, 64,
153-181.
Suppes, P. and Atkinson, R. C. Markov learning models for multiperson
interactions. Stanford, California: Stanford University
Press, 1960.
Suppes, P. and Ginsberg, R. A fundamental property of all-or-none
models. Psychol. Rev., 1962a, in press.
Suppes, P. and Ginsberg, R. Application of a stimulus sampling model
to children's concept formation of binary numbers with and
without an overt correction response. J. exp. Psychol., 1962b.
Swets, J. A., Tanner, W. P., and Birdsall, T. G. Decision processes in
perception. Psychol. Rev., 1961, 68, 301-340.
Tanner, W. P., Jr. and Swets, J. A. A decision-making theory of visual
detection. Psychol. Rev., 1954, 61, 401-409.
Theios, John. A three-state Markov model for learning. Technical
Report No. 40, Psychology Series, Institute for Mathematical
Studies in the Social Sciences, Stanford University, 1961.
Wyckoff, L. B., Jr. The role of observing responses in discrimination
behavior. Psychol. Rev., 1952, 59, 431-442.
the population of available stimulation is assumed to comprise a set of distinct stimulus patterns. that it is worthy of study not only for exposi. sample size per se may be taken as a correspondent of stimulus intensity.A. and E.le size to population size is a natural way of representing stimulus variability. . We shall consider first. Our concern in thi s chapter is not to survey the rapi<ll.ng that the oneelement model represents a radical idealization of even the most simplified condi.. we shall find. 1962). learning models . it is assumed. proportion of common elements) between two stimulus sets may be taken to represent the degree of communality between two stimulus situations. but simply to present some of the fundamental mathematical applications. Just as the ratio of samp. and Suppes and Atkinson (1960).al. and the amount of overlap (i.the pattern model for simple learning. techni~ues and illustrate their For general background.ch might underlie the cor'respondences. that there is only one such pattern and that it recurs intact at the beginning of each experim.y developing area of stimulus sampling theory. In this model.e. the very simp.el. the reader i.tioning situations.tional purposes but also for its value as an analytic device in relation to certain types of learning data. Granti.s referred to Bush (1960)" Bush and Estes (1959)" Estes (1959a. 4 behavioral data. exactly one of which is sampled on each trial.ental. then it will be in order to look for neurophysiological variables whi. tri. and in some detail.lest of aU. In the important special case of the one··element mod.
2. These examples are especially simple mathematically and provide us with the opportunity to develop some mathematical tools which will be necessary in later discussions. In the first part of this section we consider a model for pairedassociate learning which has been intensively analyzed by Bower (1961. In the second part of this section . At the start of a trial the element is in one of several possible conditioning states. ONEELEMENT MODELS We begin by considering some oneelement modelS which are special cases of the more general theory. 5 After a relatively thorough treatment of pattern models for simple acquisition and for learning under probabilistic reinforcement schedules. but as overlapping samples from a common population. and. depending on the reinforcing event for that trial. we shall take up more briefly the conceptualization of generalization and transfer. 1962). it mayor may not remain in this conditioning state. not as distinct elements. Application: of these models is appropriate if the stimulus situation is sufficiently stable from trial to trial that it may be theoretically represented to a good approximation by a single stimulus element which is sampled with probability 1 on each trial. and related phenomena. the component models in which the patterns of stimulation effective on individual trials are treated. finally. some examples of the: more complex multipleprocess models which are becoming increasingly important in the analysis of discrimination learning. concept formation.A. and E.
If the correct response is not originally conditioned to S. and E.until learning occurs. to be presented on each of a series of trials and each trial is to terminate with reinforcement of some designated response. S . pattern. 2. S A single stimulus . reasonable apprOXimations to these conditions can be attained. The model generates some predictions which are undoubtedly incorrect. theory." Learning situations which completely meet the specifications laid down above are as unlikely to be realized in psychological experiments as perfect vacuums or frictionless planes in the physics laboratory. 2. the probability of the According to stimulus sampling fashion with respect to S allor~none ("connected to") correct response is zero.1 Learning of a Single StimulusResponse Association Imagine the simplest possible learning situation.the correct response occurs with 3. However. the "correct response" in this situation. Once conditioned to probability one on . 6 we consider a oneelement model for a twochoice learning situation involving a probabilistic reinforcement schedule. These assumptions constitute the simplest ·caseof the "oneelement pattern model. There is a fixed probability S c that the reinforced response will become conditioned to on any trial. learning occurs in an This means that: 1. .every subsequent trial.A. then.is. nevertheless it provides a useful introduction to more general cases which we pursue in Section 2. except possibly under ideal experimental conditions.
7The requirement that the same stimulus pattern be reproduced on each trial is probably fairly well met in the standard pairedassociate experiment with human subjects. Then. after several such items had received a single reinforcement. K..E. a subject should be expected to make an incorrect response on each test with a given stimulus until learning occurs. the stimulus member of each item was a trigram and the correct response an English word. e. shown below: Actual protocols for several subjects are . Then each item was given a second reinforcement. In one such experiment. in most cases.A. E. by a sequence of lIs. then. and. S xvk R house On a reinforced trial the stimulus and response members were exposed together. the protocol for an individual item over a series of trials should. each of the stimuli was presented alone. the subject being instructed to give the correct response from memory. conducted in the laboratory of one of the writers (W.). followed by a second test. consist in a sequence of O's preceded. if we represent an error by a 1 and a correct response by a 0. and so on. if he could. According to the assumptions of the oneelement pattern model.g. as shown. then a correct response on every subsequent trial.
8a b c d e f g h i 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 . Without judging this issue. but on the basis of considerable experience we can safely assume that this parameter will vary with characteristics of the populations of subjects and items represented in a particular experiment. The proportion of "fits" and "misfits" in this sample is about the same as in the full set of 80 cases from which the sample was taken. errors following correct responses.1 1 0 0 1 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 The first seven of these correspond perfectly to the idealized theoretical picture. the last two deviate slightly..of minor uncontrolled variables in the experimental situation which are best ignored for theoretical purposes. Statistical learning theory value of the conditioning parameter includes no formal axioms specifying precisely what variables determine the value of c.ccasional lapses.A. we may conclude that the simple oneelement model at least merits further study. and E. e. Before we can make ~uantitative predictions we need to know the c. An estimate of . i. may be symptomatic of a forgetting process which should be incorporated into the theory or they may be simply the result . The o.
. and E.39. of these expressions. According to the model. the proportion of correct responses on the test given after a single reinforcement was . conditioning must fail on the first reinforcement but occurs on the second. In order to test the model. each tested on two items).39. ahd therefore the expected proportion of correct responses on the subsequent test. According to the model.. 001. we obtain the predicted values which are compared with the corresponding empirical values for this experiment in Table 1. 000. as an estimate of c Thus we may simply take the observed proportion. consequently. the probability that the first correct response occurs on the third trial is given by (1_c)2c and the probability of no correct response in Substituting the estimate .. we need now to derive theoretical expressions for other aspects of the data. on the first three trials. so the probability of the sequence 000 is simply c . etc. 011. 39 for c in each three trials by (1_c)3. 010. In the full set of 80 cases (40 subjects.A. Suppose we consider the sequences of correct and incorrect responses. 9 the value of c for the experiment under consideration is easy to come by. and this joint event has probability (lc)c Similarly. and the probabilities of 001. Table 1 about here . Tb obtain an error on the first trial followed by a correct response on the second. a correct response should never be followed by an error. the probability is c that a reinforced response will become conditioned to its paired stimulus. c is the expected proportion of successful conditionings out of 80 cases. and 101 all zero.
Table 1

Observed and predicted (one-element model) values for response sequences over first three trials of a paired-associate experiment.

    Sequence*   Observed Proportions   Theoretical Proportions
      000              .36                     .39
      001              .02                      0
      010              .01                      0
      011               0                       0
      100              .27                     .24
      101               0                       0
      110              .11                     .14
      111              .23                     .23

    * 0 = correct response, 1 = error
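The theoretical values just derived are simple to generate; a minimal sketch in Python (the estimate c = .39 is from the text; the function name is ours):

```python
# Predicted probabilities of the eight correct(0)/error(1) sequences
# over the first three test trials under the one-element model.
# A correct response can never be followed by an error, so only
# 000, 100, 110, and 111 receive positive probability.
def sequence_probs(c):
    return {
        "000": c,                    # conditioned on 1st reinforcement
        "100": (1 - c) * c,          # conditioned on 2nd reinforcement
        "110": (1 - c) ** 2 * c,     # conditioned on 3rd reinforcement
        "111": (1 - c) ** 3,         # never conditioned in three trials
        "001": 0.0, "010": 0.0, "011": 0.0, "101": 0.0,
    }

probs = sequence_probs(0.39)
print({k: round(v, 2) for k, v in sorted(probs.items())})
```

The four nonzero entries necessarily sum to 1, since they exhaust the possible trials of first conditioning.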
The correspondences are seen to be about as close as could be expected with proportions based on 80 response sequences.

2.2 Paired-Associate Learning

In order to apply the one-element model to paired-associate experiments involving fixed lists of items, it is necessary to adjust the "boundary conditions" appropriately. Consider, for example, an experiment reported by Estes, Hopkins, and Crothers (1960). The task assigned their subjects was to learn associations between the numbers 1 through 8, serving as responses, and eight consonant trigrams, serving as stimuli. Each subject was given two practice trials and two test trials. On the first practice trial, the eight syllable-number pairs were exhibited singly in a random order. Then a test was given, the syllables alone being presented singly in a new random order and the subjects attempting to respond to each syllable with the correct number. Then four of the syllable-number pairs were presented on a second practice trial, and all eight syllables were included in a final test trial.

In writing an expression for the probability of a correct response on the first test in this experiment, we must take account of the fact that after the first practice trial the subjects knew that the responses were the numbers 1-8, and were in a position to guess at the correct answers when shown syllables that they had not yet learned. The minimum probability of achieving a correct response to an unlearned item by guessing would be 1/8. Thus we would have for p_0, the probability of a correct response on the first test,
    p_0 = c + (1-c)(.125) ,

i.e., the probability that the correct association was formed plus the probability that the association was not formed but the correct response was achieved by guessing. Setting this expression equal to the observed proportion of correct responses on the first test for the twice-reinforced items,

    .404 = c + (1-c)(.125) ,

we readily obtain an estimate of c for these experimental conditions, namely c = .32.

Now we can proceed to derive expressions for the joint probabilities of various combinations of correct and incorrect responses on the first and second tests for the twice-reinforced items. For the probability of correct responses to a given item on both tests we have the following possibilities: with probability c, conditioning occurs on the first reinforced trial, and then correct responses necessarily occur on both tests; with probability (1-c)c(.125), conditioning does not occur on the first reinforced trial but does on the second, and a correct response is achieved by guessing on the first test; and with probability (1-c)^2 (.125)^2, conditioning occurs on neither reinforced trial but correct responses are achieved by guessing on both tests. Thus

    p_00 = c + (1-c)c(.125) + (1-c)^2 (.125)^2 .

Similarly,
for a correct response on the first test followed by an error on the second we have

    p_01 = (1-c)^2 (.125)(.875) ,

for an error on the first test followed by a correct response on the second,

    p_10 = (1-c)(.875)[c + (1-c)(.125)] ,

and for errors on both tests,

    p_11 = (1-c)^2 (.875)^2 .

Substituting for c in these expressions the estimate computed above, we arrive at the predicted values which are compared with the corresponding observed values below:

                p_00   p_01   p_10   p_11
    Observed    .35    .05    .27    .33
    Predicted   .35    .05    .24    .35

Although this comparison reveals some disparities which we might hope to reduce with a more elaborate theory, it is surprising, to the writers at least, that the patterns of observed response proportions in both experiments considered can be predicted as well as they are by such an extremely simple model.

Ordinarily, experiments concerned with paired-associate learning are not limited to a couple of trials, like those just considered, but continue until the subjects meet some criterion of learning. Under these circumstances it is impractical to derive theoretical expressions for all possible sequences of correct and incorrect responses. A reasonable goal, instead, is to derive expressions for various statistics which can be computed from the data.
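The four joint probabilities can be evaluated directly; a minimal sketch in Python (the estimate c = .32 and the guessing probability .125 are from the text; variable names are ours):

```python
# Joint probabilities of correct (0) and error (1) responses on the two
# tests of the twice-reinforced items, one-element model with guessing.
c, g = 0.32, 0.125   # conditioning parameter estimate; guessing prob. 1/8

p00 = c + (1 - c) * c * g + (1 - c) ** 2 * g ** 2
p01 = (1 - c) ** 2 * g * (1 - g)
p10 = (1 - c) * (1 - g) * (c + (1 - c) * g)
p11 = ((1 - c) * (1 - g)) ** 2

print(round(p00, 2), round(p01, 2), round(p10, 2), round(p11, 2))
# -> 0.35 0.05 0.24 0.35
```

The four cases are exhaustive and mutually exclusive, so the probabilities sum to 1, which provides a quick check on the derivation.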
The data of particular interest will be a subject's sequence of correct and incorrect responses to a specific stimulus item over trials. When we apply the model to data we assume that all items in the list are comparable; i.e., all items have the same conditioning parameter c, and all items start out in the same conditioning state (C̄). Consequently the response sequence associated with any given item is viewed as a sample of size 1 from a population of sequences all generated by the same underlying process. In deriving results from the model, then, we shall consider only an isolated stimulus item and its related sequence of responses.

The conditioning assumptions can readily be restated in terms of the conditioning states:

1. On any reinforced trial, if the sampled element is in state C̄, it has probability c of going into state C.
2. The parameter c is fixed in value in a given experiment.
3. Transitions from state C to state C̄ have probability zero.

When the element is in state C, the correct response occurs with probability 1. The probability of the correct response when the element is in state C̄ depends on the experimental procedure. In Bower's experiment the subjects were told the r responses available to them, and each occurred equally often as the to-be-learned response. Therefore we may assume that in the unconditioned state the probability of a correct response is 1/r, where r is the number of alternative responses.

We shall now derive some predictions from the model and compare these with observed data. A feature of this model which makes it especially tractable for purposes of deriving various statistics is the fact that the sequence
of transitions between states C and C̄ constitutes a Markov chain.² This means that, given the state on any one trial, we can specify the probability of each state on the next trial without regard to the previous history. If we represent by C_n and C̄_n the events that an item is in the conditioned or unconditioned state, respectively, on trial n, then the conditioning assumptions lead directly to the relation

             C_{n+1}   C̄_{n+1}
    C_n   [    1          0    ]
    C̄_n   [    c        1-c    ]   =  Q ,

where Q is the matrix of one-step transition probabilities, the first row and column referring to C and the second to C̄. Now the matrix of probabilities for transitions between any two states in n trials is simply the nth power of Q, as may be verified by mathematical

² See Feller (1957) for a discussion of conditional probabilities. In brief, if H_1, H_2, ... are a set of mutually exclusive events of which one necessarily occurs, and if A is an event which can occur only in conjunction with one of the H_j, then, applying the well-known theorem on compound probabilities, we obtain Pr(AH_j) = Pr(A|H_j)Pr(H_j). Since the events AH_j are mutually exclusive, their probabilities add; hence Pr(A) = Σ_j Pr(A|H_j)Pr(H_j).
induction (see, e.g., Kemeny, Snell, and Thompson, 1957, p. 327). Thus

    Q^n = [ 1                 0        ]
          [ 1-(1-c)^n     (1-c)^n ] .

Henceforth we shall assume that all stimulus elements are in state C̄ at the onset of the first trial of our experiment. Then the probability of being in state C̄ at the start of trial n is (1-c)^{n-1}, which goes to 0 as n becomes large, for c > 0. Thus, with probability 1 the subject is eventually to be found in the conditioned state.

Next we prove some theorems about the observable sequence of correct and incorrect responses in terms of the underlying sequence of unobservable conditioning states. We define the response random variable

    A_n = 1  if an error occurred on trial n
    A_n = 0  if a correct response occurred on trial n .

By our assumed response rule, the probabilities of an error given that the subject is in the conditioned or unconditioned state, respectively, are

    Pr(A_n = 1 | C_n) = 0   and   Pr(A_n = 1 | C̄_n) = 1 - 1/r .

To obtain the unconditional probability of an error on trial n, Pr(A_n = 1), we sum these conditional probabilities weighted by the probabilities of being in the respective states:
    Pr(A_n = 1) = Pr(A_n = 1 | C_n)Pr(C_n) + Pr(A_n = 1 | C̄_n)Pr(C̄_n)
                = (1 - 1/r)(1-c)^{n-1} .   (1)

Thus the number of errors expected during the learning of any given item is given by

    E(K) = Σ_{n=1}^∞ Pr(A_n = 1) = (1 - 1/r)/c .   (2)

Equation 2 provides an easy method for estimating c. For any given subject we can obtain his average number of errors over stimulus items, equate this number to the right-hand side of Eq. 2, and solve for c. We thereby obtain an estimate of c for each subject, and intersubject differences in learning are reflected in the variability of these estimates. Bower, in analyzing his data, chose to assume that c was the same for all subjects; thus he set E(K) equal to the observed number of errors averaged over both list items and subjects, and obtained a single estimate of c. This group estimate of c simplifies the computations involved in generating predictions. However, it has the
disadvantage that a discrepancy between observed and predicted values may arise as a consequence of assuming equal c's when, in fact, c varies from subject to subject. Fortunately, for the particular conditions he investigated, Bower obtained excellent agreement between theory and observation using the group estimate, and consequently any increase in precision that might be achieved by individual estimates of c does not seem crucial. For the experiment described above, Bower reports 1.45 errors per stimulus item averaged over all subjects. Equating E(K) in Eq. 2 to 1.45, with r = 2, we obtain the estimate c = .344. All predictions that we derive from the model for this experiment will be based on this single estimate of c.

It should be remarked that the estimate of c in terms of Eq. 2 represents only one of many methods that could have been used. Which method one selects depends on the properties of the particular estimator (e.g., whether the estimator is unbiased and efficient relative to other estimators). Parameter estimation is a theory in its own right, and we shall not be able to discuss here the many problems involved in the estimation of learning parameters; the reader is referred to Suppes and Atkinson (1960) and Estes and Suppes (1962) for discussions of various methods and their properties. Associated with this topic is the problem of assessing, once parameters have been estimated, the statistical agreement between data and theory, that is, the goodness-of-fit between predicted and observed values. In our analysis of data in this chapter we shall offer no statistical evaluation of the predictions, but shall simply display the results for the reader's inspection. The reason is that we present
the data only to illustrate features of the theory and its application; these results are not intended to provide a test of the model. However, in rigorous analyses of such models the problem of goodness-of-fit is extremely important and needs careful consideration. Here again the reader is referred to Suppes and Atkinson (1960) for a discussion of some of the problems and possible statistical tests.

By using Eq. 1 with the estimate of c obtained above we have generated the predicted learning curve presented in Fig. 1. The fit is sufficiently close that most of the predicted and observed points cannot be distinguished on the scale of the graph.

Insert Fig. 1 about here

As a basis for the derivation of other statistics of total errors, we require an expression for the probability distribution of K, the total number of errors. To obtain this, we note first that the probability of no errors at all occurring during learning is given by

    Pr(K = 0) = c(1/r) + (1-c)(1/r)^2 c + ··· = (c/r) Σ_{i=0}^∞ [(1-c)/r]^i = c / {r[1 - (1-c)/r]} = b/r ,

where b = c/[1 - (1-c)/r]. This event may arise if a correct response occurs by guessing on the first trial and conditioning occurs on the first reinforcement, if correct responses occur by guessing on the first two trials and conditioning occurs on the second reinforcement, and so on. Similarly, the probability of no additional errors following an error on any given trial is given by
Fig. 1. The average probability of an error on trial n in Bower's paired-associate experiment (theory and data, trials 1-13).
    c + c(1-c)/r + ··· = c Σ_{i=0}^∞ [(1-c)/r]^i = c/[1 - (1-c)/r] = b .

To have exactly k errors (k ≥ 1), we must have a first error (which has probability 1 - b/r), then k - 1 additional errors, each of which has probability 1 - b, and then no more errors, which has probability b. Therefore the required probability distribution is

    Pr(K = 0) = b/r
    Pr(K = k) = (1 - b/r)b(1-b)^{k-1} ,  for k ≥ 1 .   (3)

Equation 3 can be applied to data directly to predict the form of the frequency distribution of total errors. It may also be utilized in deriving, e.g., the variance of this distribution. Preliminary to computing the variance, we need the expectation of K^2:

    E(K^2) = Σ_{k=0}^∞ k^2 (1 - b/r)b(1-b)^{k-1}
           = (1 - b/r)b Σ_{k=0}^∞ [k(k-1) + k](1-b)^{k-1}
           = (1-b)(1 - b/r)b Σ_{k=0}^∞ k(k-1)(1-b)^{k-2} + (1 - b/r)b Σ_{k=0}^∞ k(1-b)^{k-1} ,

where the second step is taken in order to facilitate the summation. Using now the familiar expression for the sum of a geometric series,

    Σ_{k=0}^∞ (1-b)^k = 1/b ,

together with the relations
    Σ_{k=0}^∞ k(1-b)^{k-1} = 1/b^2   and   Σ_{k=0}^∞ k(k-1)(1-b)^{k-2} = 2/b^3 ,

we obtain

    E(K^2) = (1-b)(1 - b/r)b(2/b^3) + (1 - b/r)b(1/b^2) = (1 - b/r)(2 - b)/b^2 ,

and therefore

    Var(K) = E(K^2) - [E(K)]^2 = (r-1)(r - 1 + 2c - cr)/(rc)^2 = E(K)[1 + E(K)(1 - 2c)] .   (4)

Inserting the estimates c = .344 and E(K) = 1.45 in Eq. 4, we obtain 1.44 for the predicted standard deviation of total errors, which may be compared with the observed value of 1.37.

Another useful statistic of the error sequence is E(A_n A_{n+k}), the expectation of the product of error random variables on trials n and n + k. This quantity is related to the autocorrelation
between errors on trials n and n + k. By elementary probability theory,

    Pr(A_{n+k} = 1 and A_n = 1) = Pr(A_{n+k} = 1 | A_n = 1)Pr(A_n = 1) .

But for an error to occur on trial n + k, it must be the case that conditioning has failed to occur during the intervening k trials and that the subject guessed incorrectly on trial n + k. Hence

    Pr(A_{n+k} = 1 | A_n = 1) = (1-c)^k (1 - 1/r) .

Substituting this result into the preceding expression, along with the result presented in Eq. 1, yields the following expression:

    E(A_n A_{n+k}) = (1 - 1/r)^2 (1-c)^{n+k-1} .   (5)

A convenient statistic for comparison with data (directly related to the average autocorrelation of errors with lag k, but easier to compute) is obtained by summing the cross products of errors on trials n and n + k over all trials. Denoting this statistic c_k, we have for its expected value

    c_k = Σ_{n=1}^∞ E(A_n A_{n+k}) = (1 - 1/r)^2 (1-c)^k / c .   (6)

To be explicit, consider the following response protocol running in time from left to right: 1101010010000. For this protocol, c_1 is estimated by counting the pairs of errors occurring on adjacent trials, c_2 by counting the pairs of errors two trials apart, and so on.
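Several of the quantities derived in this subsection (Eqs. 1-4 and 6) can be checked numerically; a minimal sketch in Python using the group estimate c = .344 with r = 2 (variable and function names are ours):

```python
c, r = 0.344, 2

# Eq. 1: probability of an error on trial n.
def p_error(n):
    return (1 - 1 / r) * (1 - c) ** (n - 1)

# Eq. 2: expected total errors per item.
E_K = (1 - 1 / r) / c

# Eq. 3: distribution of total errors K, with b = c/[1-(1-c)/r].
b = c / (1 - (1 - c) / r)
def pr_K(k):
    return b / r if k == 0 else (1 - b / r) * b * (1 - b) ** (k - 1)

# The distribution reproduces the mean of Eq. 2 and the variance of Eq. 4.
mean = sum(k * pr_K(k) for k in range(400))
var = sum(k * k * pr_K(k) for k in range(400)) - mean ** 2
assert abs(mean - E_K) < 1e-9
assert abs(var - E_K * (1 + E_K * (1 - 2 * c))) < 1e-9   # Eq. 4

# Eq. 6: summed lag-k error cross products, and the corresponding count
# for the sample protocol quoted in the text (1 = error, 0 = correct).
def c_lag(k):
    return (1 - 1 / r) ** 2 * (1 - c) ** k / c

protocol = [int(x) for x in "1101010010000"]
def count_pairs(p, k):
    return sum(p[n] * p[n + k] for n in range(len(p) - k))

print(round(E_K, 2), round(var ** 0.5, 2), round(c_lag(1), 3))
```

With unrounded intermediate values the predicted standard deviation comes out near 1.45, close to the 1.44 quoted in the text, and c_1 comes out near .48.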
The predictions for c_1, c_2, and c_3 computed from the estimate of c given above for Bower's experiment were .479, .310, and .201; the corresponding observed values were .486, .292, and .187.

Next we consider the distribution of the number of errors between the kth and (k+1)st success. We shall define J_k as the random variable for the number of errors between the kth and (k+1)st success. The methods to be used in deriving its distribution are general and can be used to derive the distribution of errors between the kth and (k+m)th success for any nonnegative integer m; the only limitation is that the expressions become unwieldy as m increases. An error following the kth success can occur only if the kth success itself occurs as a result of guessing; that is, the subject must be in state C̄ when the kth success occurs. Letting g_k denote the probability that the kth success occurs by guessing, we can write the probability distribution

    Pr(J_k = i) = 1 - a g_k          for i = 0
    Pr(J_k = i) = (1-a)a^i g_k       for i > 0 ,   (7)

where a = (1-c)(1 - 1/r). To obtain the first line of Eq. 7, we note that 0 errors can occur in one of three ways: (1) the kth success occurs when the subject is in state C (which has probability 1 - g_k), and necessarily a correct response occurs on the next trial; (2) the kth success occurs by guessing, conditioning is not effective, and the subject again guesses correctly on the next trial [which has probability g_k(1-c)(1/r)]; or (3) the kth success occurs by guessing but conditioning is effective
on that trial (which has probability g_k c), so that a correct response necessarily occurs on the next trial. Thus

    Pr(J_k = 0) = (1 - g_k) + g_k(1-c)(1/r) + g_k c = 1 - a g_k .

The event of i errors (i > 0) between the kth and (k+1)st successes can occur in one of two ways: (1) the kth and (k+1)st successes both occur by guessing, with i incorrect guesses between them [which has probability g_k(1-c)^{i+1}(1 - 1/r)^i (1/r)], or (2) the kth success occurs by guessing and conditioning does not take place until the trial immediately preceding the (k+1)st success [with probability g_k(1-c)^i (1 - 1/r)^i c]. Hence

    Pr(J_k = i) = g_k(1-c)^{i+1}(1 - 1/r)^i (1/r) + g_k(1-c)^i (1 - 1/r)^i c = (1-a)a^i g_k ,

as stated in Eq. 7. From Eq. 7 we may obtain the mean and variance of J_k, namely

    E(J_k) = Σ_{i=0}^∞ i Pr(J_k = i) = a g_k/(1-a)   (8)

and

    Var(J_k) = Σ_{i=0}^∞ i^2 Pr(J_k = i) - [E(J_k)]^2 = a g_k(1 + a - a g_k)/(1-a)^2 .   (9)

In order to evaluate the quantities above we require an expression for g_k. Consider first g_1, the probability that the first success occurs by guessing. This event could occur in one of the following ways: (1) the subject guesses correctly on trial 1 (with probability 1/r); or (2) the subject guesses incorrectly on trial 1, conditioning does not occur, and the subject guesses successfully on trial 2 [this joint event having
probability (1 - 1/r)(1-c)(1/r)]; or (3) conditioning does not occur on trials 1 and 2, and the subject guesses incorrectly on both of these trials but guesses correctly on trial 3 [with probability (1 - 1/r)^2 (1-c)^2 (1/r)]; and so forth. Thus

    g_1 = (1/r) Σ_{i=0}^∞ [(1-c)(1 - 1/r)]^i = (1/r) Σ_{i=0}^∞ a^i = 1/[(1-a)r] .

Now consider the probability that the kth success occurs by guessing, for k > 1. In order for this event to occur it must be the case that (1) the (k-1)st success occurs by guessing, (2) conditioning fails to occur on the trial of the (k-1)st success, and (3) since the subject then is in state C̄ on the trial following the (k-1)st success, the next success occurs by guessing, with probability g_1. Hence

    g_k = g_{k-1}(1-c)g_1 .

Solving this difference equation³ we obtain

    g_k = (1-c)^{k-1} g_1^k ,

and, finally, substituting the expression obtained above for g_1 yields

    g_k = (1-c)^{k-1} / [(1-a)r]^k .   (10)

³ The solution of this equation can quickly be obtained. Note that g_2 = g_1(1-c)g_1 = (1-c)g_1^2. Similarly, g_3 = g_2(1-c)g_1, and substituting the above result for g_2 we obtain g_3 = (1-c)^2 g_1^3. If we continue in this fashion it will be obvious that g_k = (1-c)^{k-1} g_1^k.
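The recursion for g_k and the mean of J_k can be checked in a few lines; a sketch in Python (group estimate c = .344, r = 2; variable names are ours):

```python
c, r = 0.344, 2
a = (1 - c) * (1 - 1 / r)       # the quantity a of Eq. 7
g1 = 1 / ((1 - a) * r)          # probability the first success is a guess

# Check the closed form g_k = (1-c)^(k-1) g1^k (Eq. 10) against the
# recursion g_k = g_{k-1} (1-c) g1.
gk = g1
for k in range(2, 7):
    gk = gk * (1 - c) * g1
    assert abs(gk - (1 - c) ** (k - 1) * g1 ** k) < 1e-12

# Mean number of errors between the first and second success (Eq. 8).
mean_J1 = a * g1 / (1 - a)
print(round(mean_J1, 3))
```

This reproduces, to within rounding, the predicted mean of about .36 quoted in the text for k = 1.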
We may now combine Eqs. 7 and 10 to obtain predictions about the number of errors between the kth and (k+1)st success in Bower's data. To illustrate, inserting our original estimate of c, the predicted mean number of errors between the first and second success is .361; the observed value is .350.

To conclude our analysis of this model, we consider the probability P_k that a response sequence to a stimulus item will exhibit no errors following the kth success. This event can occur in one of two ways: (1) the kth success occurs when the subject is in state C [which we have already calculated to have probability 1 - g_k], in which case no errors can follow; or (2) the kth success occurs by guessing, and no more errors occur on subsequent trials. Letting b denote the probability of no more errors following a successful guess, we have

    P_k = (1 - g_k) + g_k b = 1 - g_k(1 - b) .   (11)

But the probability of no more errors following a successful guess satisfies

    b = c + (1-c)(1/r)b ,

whence b = c/[1 - (1-c)/r] = c/(a + c), in agreement with the quantity b introduced earlier; hence 1 - b = a/(a + c). Substituting this result and Eq. 10 into Eq. 11 yields

    P_k = 1 - a(1-c)^{k-1} / {(a + c)[(1-a)r]^k} .   (12)
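Equation 12 is easy to evaluate numerically; a short Python sketch using the group estimate c = .344 with r = 2 (variable names are ours):

```python
c, r = 0.344, 2
a = (1 - c) * (1 - 1 / r)   # the quantity a of Eq. 7
b = c / (a + c)             # prob. of no more errors after a correct guess

def P(k):
    # Eq. 12; P(0) is the probability of no errors at all, b/r.
    if k == 0:
        return b / r
    return 1 - (1 - b) * (1 - c) ** (k - 1) / ((1 - a) * r) ** k

print([round(P(k), 3) for k in range(6)])
```

These values agree with the predicted column of Table 2 to within rounding of the intermediate quantities.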
Observed and predicted values of P_k for Bower's experiment are shown in Table 2.

Insert Table 2 about here

We shall not pursue more consequences of this model, although other statistics may be derived and compared with data, with results comparable to those exhibited above.⁴ The particular results we have examined were selected because they illustrate fundamental features of the model and also introduce mathematical techniques which will be needed later.

⁴ Bower also has compared the one-element model with a comparable single-operator linear model presented by Bush and Sternberg (1959). The linear model assumes that the probability of an incorrect response on trial n decreases according to p_{n+1} = (1-c*)p_n, where p_1 and c* are fixed numbers. The one-element model and the linear model generate many identical predictions (e.g., the mean learning curve), and it is necessary to look at the finer structure of the data to differentiate the models. In Bower's paper, more than 30 predictions of the type presented here are tested. Of the 20 possible comparisons Bower makes between the two models, he finds that the one-element model comes closer to the data on 18.
Table 2

Observed and predicted values for P_k, the probability of no errors following the kth success. (Interpret P_0 as the probability of no errors at all during the course of learning.)

     k    Observed P_k    Predicted P_k
     0       .255             .256
     1       .628             .636
     2       .812             .822
     3       .869             .912
     4       .928             .957
     5       .963             .979
     6       .973             .990
     7       .990             .995
     8       .993             .998
     9       .996             .999
    10       .997            1.000
    11      1.000            1.000
Concepts of the sort developed in this section can be extended to more traditional types of verbal learning situations involving stimulus similarity, meaningfulness, and the like. Atkinson (1957), for example, has presented a model for rote serial learning which is based on similar ideas and deals with such variables as intertrial interval, list length, and types of errors (perseverative, anticipatory, or response failures). Unfortunately, theoretical analyses of this sort for traditional experimental routines often lead to extremely complicated mathematical models, with the result that only a few consequences of the axioms can be derived. Stated differently, a set of concepts may be very general in terms of the range of situations to which it is applicable; nevertheless, in order to provide rigorous and detailed tests of these concepts, it is frequently necessary to contrive special experimental routines for which the theoretical analyses generate tractable mathematical systems.

2.3 Probabilistic Reinforcement Schedules

We shall now examine a one-element model for some simple two-choice learning problems. The one-element model for this situation, as contrasted with the paired-associate model, generates some predictions of behavior which are quite unrealistic, and for this reason we defer an analysis of experimental data until we consider comparable multi-element processes. The reason for presenting the one-element model is that it represents a convenient introduction to multi-element models and permits us to develop some mathematical tools in a simple fashion. Further, when we do discuss multi-element models we shall employ a rather restrictive set of conditioning axioms; for the one-element model, however, we may
present an extremely general set of conditioning assumptions without getting into too much mathematical complexity. Therefore, the analysis of the one-element case will suggest lines along which the multi-element models can be generalized.

The reference experiment (see, e.g., Estes and Straughan, 1954; Suppes and Atkinson, 1960) involves a long series of discrete trials. Each trial is initiated by the onset of a signal. To the signal the subject is required to make one of two responses, which we denote A_1 and A_2. The trial is terminated with an E_1 or E_2 reinforcing event; the occurrence of E_i indicates that response A_i was the correct response for that trial. Thus, in a human learning situation, the subject is required to predict on each trial which reinforcing event he expects will occur by making the appropriate response: an A_1 if he expects E_1 and an A_2 if he expects E_2. At the end of the trial he is permitted to observe which event actually occurred. Initially the subject may have no preference between responses, but as information accrues to him over trials his pattern of choices undergoes systematic changes. The role of a model is to predict the detailed features of these changes.

The experimenter may devise various schedules for determining the sequence of reinforcing events over trials. For example, the probability of an E_1 may be (1) some function of the trial number, (2) dependent on the previous responses of the subject, (3) dependent on the previous sequence of reinforcing events, or (4) some combination of the above. For simplicity, we consider a noncontingent reinforcement schedule. This case is defined by the condition that the probability of E_1 is constant
over trials and independent of previous responses and reinforcements. It is customary in the literature to call this probability π; thus

    Pr(E_{1,n}) = π   for all n .

Here we are denoting by E_{i,n} the event that reinforcement E_i occurs on trial n; similarly, we shall represent by A_{i,n} the event that response A_i occurs on trial n.

We assume that the stimulus situation comprising the signal light and the context in which it occurs can be represented theoretically by a single stimulus element which is sampled with probability 1 when the signal occurs. At the start of a trial, the element is in one of three conditioning states: in state C_1 the element is conditioned to the A_1 response, in state C_2 to the A_2 response, and in state C_0 the element is conditioned to neither A_1 nor A_2. The response rules are similar to those presented earlier. When the subject is in C_1 or C_2, the corresponding response occurs with probability 1. In state C_0 we assume that either response will be elicited equiprobably; that is,

    Pr(A_{1,n} | C_0) = 1/2 .

For some subjects a response bias may exist which would require that we assume Pr(A_{1,n} | C_0) = β, where β ≠ 1/2; for these subjects it would be necessary to estimate β in applying the model. However, for simplicity we shall pursue only the case in which the responses are equiprobable when the subject is in C_0. As the model is developed it will become obvious that for some experimental problems restrictions can be imposed which greatly simplify the process.

We now present a general set of rules governing changes in conditioning states. If the subject is in state C_1 and an E_1 occurs (i.e., the subject
makes an A_1 response which is correct), then he will remain in C_1. However, if he is in C_1 or C_2 and his response is not correct, then he may shift to one of the other conditioning states, thereby reducing the probability of repeating the same response on the next trial. That is, if the subject is in C_1 and an E_2 occurs, then with probability c he moves to C_0 and with probability c' he moves to C_2; comparable rules apply when the subject is in C_2 and an E_1 occurs. Finally, if the subject is in C_0 and an E_1 occurs, then with probability c'' he moves to C_1; if he is in C_0 and an E_2 occurs, then with probability c'' he moves to C_2.⁵ Thus, to summarize: if the subject's response is correct, he remains in his present state; if it is not correct, he may move toward the state favoring the reinforced response.

⁵ Here we assume that the subject's response does not affect the change in conditioning state; that is, for example,

    Pr(C_{2,n+1} | E_{2,n}A_{1,n}C_{0,n}) = Pr(C_{2,n+1} | E_{2,n}A_{2,n}C_{0,n}) = c'' .

This assumption is not necessary, and we could readily let the actual response affect the change; for example, we might postulate c''_1 ≠ c''_2 for an A_1E_1 combination and an A_2E_1 combination, respectively. However, such additions make the mathematical process more complicated and should be introduced only when the data clearly require them.
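The rules just stated are easy to simulate; the following Monte Carlo sketch is our own construction, with illustrative parameter values (cp and cpp stand for c' and c''), and tallies the late-trial proportion of A_1 responses under a noncontingent schedule:

```python
import random

random.seed(1)

def simulate(pi, c, cp, cpp, n_trials=300, n_subjects=1000):
    # states: 1 = C1, 2 = C2, 0 = C0 (unconditioned); all start in C0
    a1_late, late = 0, 0
    for _ in range(n_subjects):
        state = 0
        for t in range(n_trials):
            # response rule: C1 -> A1, C2 -> A2, C0 -> guess equiprobably
            if state == 1:
                a1 = True
            elif state == 2:
                a1 = False
            else:
                a1 = random.random() < 0.5
            e1 = random.random() < pi        # noncontingent reinforcement
            # conditioning-state changes, following the rules above
            if state == 0:
                state = (1 if e1 else 2) if random.random() < cpp else 0
            elif (state == 1) != e1:         # response was not correct
                u = random.random()
                if u < c:
                    state = 0
                elif u < c + cp:
                    state = 2 if state == 1 else 1
            if t >= n_trials - 100:          # tally only late trials
                late += 1
                a1_late += a1
    return a1_late / late

print(round(simulate(0.7, 0.2, 0.1, 0.3), 2))
```

Such a simulation gives an empirical check on the asymptotic response probabilities derived analytically below, up to sampling error.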
A.n) = c n = c pr(C. and E.n+lIEj. l.r'(C O.u C. 32 = P. Insert Fig.n Co . represents a possible outcome on a given trial. l. he will go to However. The probability of each . from a beginning point to a terminal point. c' he C .n C.ll ) 1 = c' P. 2 represents the possible changes that can occur in the conditioning state.n+liE. J. +lIE.n ) (13) pr(Cj..n l.r'(C. n) where 0 < c" < 1 and 0 < c + c' < 1 We now use the assumptions of the preceding paragraphs and the particular assumptions for the noncontingent case to derive the transition matrix in the conditioning states. l will stay in state on the next trial.n Ci. l. 2 here The first set of branches is associated with the reinforcing event on trial n If the subject is in C l c C l and an E l occurs. Each set of branches emanating from a point represents a mutually For example. In making such a derivation it is convenient to represent the various possible occurrences on a trial by a tree. if an then with probability will go to C ' with probability 2 lcc' he will remain in CO' and with probability Each path of a tree. L"n l. suppose that C ' then the tree in l exclusive and exhaustive set of possibilities. +l!E. at the start of trial n the subject is in state Fig. then he E 2 occurs.
Fig. 2. Branching process, starting from state C_1 on trial n, for the one-element model in the two-choice noncontingent case.
path is obtained by multiplying the appropriate conditional probabilities. Thus, for the tree in Fig. 2, the probability of the bottom path may be represented by

    Pr(E_{2,n})Pr(C_{2,n+1} | C_{1,n}E_{2,n}) = (1-π)c' .

Of the four paths, two lead from C_1 to C_1; hence the probability p_11 of a one-step transition from C_1 to C_1 is π + (1-π)(1 - c - c') = 1 - (1-π)(c + c'). Similarly, p_10 = (1-π)c and p_12 = (1-π)c', where p_ij denotes the probability of a one-step transition from state C_i to state C_j. For the C_0 state we have the tree given in Fig. 3.

Insert Fig. 3 about here

On the top branch an E_1 event is indicated, and by Eq. 13 the probability of going to C_1 is c'' and of staying in C_0 is 1 - c''. A similar analysis holds for the bottom branches. Combining these results with the comparable results for C_2 yields the following transition matrix:

              C_1                 C_0          C_2
    C_1  [ 1 - (1-π)(c+c')     (1-π)c      (1-π)c'      ]
    C_0  [ πc''                1 - c''     (1-π)c''     ]   (14)
    C_2  [ πc'                 πc          1 - π(c+c')  ]
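The matrix of Eq. 14 can be set up programmatically and its row sums checked; a minimal sketch (parameter values are illustrative; cp and cpp stand for c' and c''):

```python
# Transition matrix of Eq. 14 for the noncontingent case, with states
# ordered (C1, C0, C2); pi is Pr(E1).
def transition_matrix(pi, c, cp, cpp):
    return [
        [1 - (1 - pi) * (c + cp), (1 - pi) * c, (1 - pi) * cp],
        [pi * cpp, 1 - cpp, (1 - pi) * cpp],
        [pi * cp, pi * c, 1 - pi * (c + cp)],
    ]

P = transition_matrix(0.7, 0.2, 0.1, 0.3)   # illustrative parameter values
for row in P:
    assert abs(sum(row) - 1.0) < 1e-12      # each row is a distribution
print(P[0])
```

That every row sums to 1 is a useful check that no path of the trees has been dropped or counted twice.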
Fig. 3. Branching process, starting from state C_0 on trial n, for the one-element model in the two-choice noncontingent case.
As in the case of the paired-associate model, a large number of predictions can be derived easily for this process; however, we shall select only a few which are useful in clarifying the fundamental properties of the model. We begin by considering the asymptotic probability of a particular conditioning state and, in turn, the asymptotic probability of an A_1 response.

The following notation will prove useful. Let P be the transition matrix, and define p_ij^(n) as the probability of the subject being in state j on trial r + n, given that he was in state i on trial r. The quantity is defined recursively:

    p_ij^(1) = p_ij ,    p_ij^(n) = Σ_v p_iv^(n-1) p_vj .

Moreover, if the appropriate limit exists and is independent of i, we set

    u_j = lim_{n→∞} p_ij^(n) .

The limiting quantities u_j exist for any finite-state Markov chain that is irreducible and aperiodic. A Markov chain is irreducible if there is no closed proper subset of states, that is, no proper subset of states such that once within this set the probability of leaving it is 0. For example, the chain whose transition matrix is

          1     2     3
    1 [  3/4   1/4    0  ]
    2 [  1/2   1/2    0  ]
    3 [  1/3   1/3   1/3 ]
is reducible because the set (1, 2) of states is a proper closed subset. A Markov chain is aperiodic if there is no fixed period for return to any state; a chain is periodic if a return to some initial state is impossible except at t, 2t, 3t, ... trials for t > 1. Thus the chain whose matrix is

          1   2   3
    1     0   1   0
    2     0   0   1
    3     1   0   0

has period t = 3 for return to each state. It may be shown [Feller (1957), Kemeny and Snell (1959)] that the components of the vector u = [u_1, u_2, ..., u_r], the stationary probability vector of an irreducible, aperiodic chain with r states, are the solutions of the r linear equations

    u_j = Σ_{v=1}^{r} u_v p_vj        (15)

such that

    Σ_{v=1}^{r} u_v = 1 .

Thus, to find the asymptotic probabilities we need find only the solution of this system of r equations. The intuitive basis of the system of equations seems clear. Consider a two-state chain. Then the probability p_{n+1} of being in state 1 on
trial n+1 is the probability of being in state 1 on trial n and staying in state 1, plus the probability of being in state 2 on trial n and going to state 1; that is,

    p_{n+1} = p_n p_11 + (1 - p_n) p_21 .

But at asymptote p_{n+1} = p_n = u_1, which is the first of the two equations of the system when r = 2. It is clear that the chain represented by the matrix of Eq. 14 is irreducible and aperiodic; thus the asymptotes exist and are independent of the initial probability distribution on the states. For a three-state chain, let the numbers u_j be such that u_j = D_j/D, where D = Σ_v D_v, Σ_j u_j = 1, and

    D_1 = p_31(1-p_22) + p_21 p_32
    D_2 = p_31 p_12 + p_32(1-p_11)                    (16)
    D_3 = (1-p_11)(1-p_22) - p_21 p_12 .

Inserting in these equations the equivalents of the p_ij from the transition matrix and renumbering the states appropriately, we obtain

    D_1 = πc''(c + c'π)
    D_2 = (1-π)c''[c + c'(1-π)]
    D_0 = π(1-π)c'(c' + 2c) .
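Since the matrix entries and the quantities D_j are all given in closed form, the stationary vector can be checked numerically. The sketch below solves the stationary equations of Eq. 15 directly and compares the result with the D_j expressions; the matrix follows our reading of Eq. 14, and the parameter values are arbitrary illustrative choices, not estimates from data.

```python
import numpy as np

pi, c, cp, cpp = 0.7, 0.3, 0.2, 0.4   # pi, c, c', c'' (illustrative values)

# Transition matrix of Eq. 14, states ordered C1, C2, C0.
P = np.array([
    [1 - (1 - pi) * (c + cp), (1 - pi) * c,       (1 - pi) * cp],   # from C1
    [pi * c,                  1 - pi * (c + cp),  pi * cp],         # from C2
    [pi * cpp,                (1 - pi) * cpp,     1 - cpp],         # from C0
])

# Stationary vector: two stationary equations of Eq. 15 plus normalization.
A = np.vstack([(P.T - np.eye(3))[:-1], np.ones(3)])
u = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))

# Closed-form unnormalized weights (Eq. 16 after substitution), order C1, C2, C0.
D1 = pi * cpp * (c + cp * pi)
D2 = (1 - pi) * cpp * (c + cp * (1 - pi))
D0 = pi * (1 - pi) * cp * (cp + 2 * c)
D = np.array([D1, D2, D0])
```

The solved vector u should equal D/ΣD componentwise, confirming that the three weights are internally consistent with the matrix.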
Since D is the sum of the D_j's, and since u_j = D_j/D, we may divide the numerator and denominator by (c'')² and obtain

    u_1 = π[ρ+επ] / { π[ρ+επ] + π(1-π)ε[ε+2ρ] + (1-π)[ρ+ε(1-π)] }
                                                                            (17)
    u_0 = π(1-π)ε[ε+2ρ] / { π[ρ+επ] + π(1-π)ε[ε+2ρ] + (1-π)[ρ+ε(1-π)] } ,

where ρ = c/c'' and ε = c'/c''. By our response axioms we have

    Pr(A_{1,n}) = Pr(C_{1,n}) + ½Pr(C_{0,n})    for all n.

Hence

    lim_{n→∞} Pr(A_{1,n}) = u_1 + ½u_0

        = { π[ρ+επ] + ½π(1-π)ε[ε+2ρ] } / { π[ρ+επ] + π(1-π)ε[ε+2ρ] + (1-π)[ρ+ε(1-π)] } .   (18)

An inspection of Eq. 18 indicates that the asymptotic probability of an A_1 response is a function of π, ρ, and ε. As will become clear later, the value of Pr(A_{1,∞}) is bounded in an open interval, and whether Pr(A_{1,∞}) is above or below π depends on the values of ρ and ε.
We now consider two special cases of our one-element model. The first case is comparable to the multi-element models to be discussed later; the second leads, in some respects, to quite different predictions.

Case of c' = 0. Let us rewrite Eq. 14 with c' = 0. Then the transition matrix will have the following canonical form:

            C_1          C_2          C_0
    C_1     1-c(1-π)     c(1-π)       0
    C_2     cπ           1-cπ         0                (19)
    C_0     c''π         c''(1-π)     1-c''

We note that once the subject has left state C_0 he can never return; in fact, it is obvious that

    Pr(C_{0,n}) = Pr(C_{0,1})(1-c'')^{n-1} ,

where Pr(C_{0,1}) is the initial probability of being in C_0. Thus C_0 is, except on early trials, not part of the process, and the subject in the long run fluctuates between C_1 and C_2, being in C_1 on a proportion π of the trials. From Eq. 19 we have

    Pr(C_{1,n+1}) = Pr(C_{1,n})[1-c(1-π)] + Pr(C_{2,n})cπ + Pr(C_{0,n})c''π .

That is, the probability of being in C_1 on trial n+1 is equal to the probability of being in C_1 on trial n times p_11, plus the probability of being in C_2 times p_21 (the probability of going from C_2 to C_1), plus the probability of being in C_0 times p_01. For simplicity let x_n = Pr(C_{1,n}), y_n = Pr(C_{2,n}), and z_n = Pr(C_{0,n}). Now we know that z_n = z_1(1-c'')^{n-1}
and also that x_n + y_n + z_n = 1, or y_n = 1 - x_n - z_n. Making these substitutions in the recursion above yields

    x_{n+1} = x_n[1-c(1-π)] + cπ[1 - x_n - z_1(1-c'')^{n-1}] + c''π z_1(1-c'')^{n-1} .

This difference equation has the following solution:^6

________________
^6 The solution of such a difference equation can readily be obtained. Consider

    x_{n+1} = a x_n + b q^{n-1} + d ,

where a, b, q, and d are constants. Then

    x_2 = a x_1 + b + d .        (1)

Similarly x_3 = a x_2 + bq + d, and substituting (1) for x_2 we obtain

    x_3 = a² x_1 + ab + ad + bq + d .        (2)

Similarly x_4 = a x_3 + bq² + d, and substituting (2) for x_3 we obtain

    x_4 = a³ x_1 + a²b + a²d + abq + ad + bq² + d .

If we continue in this fashion it will be obvious that for n ≥ 2

    x_n = a^{n-1} x_1 + d Σ_{i=0}^{n-2} a^i + b Σ_{i=0}^{n-2} a^i q^{n-2-i} .

See Jordan (1950, pp. 583-584) for a detailed treatment.
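The footnote's general solution can be verified by brute force. In the sketch below the constants a, b, d and the geometric base (written theta in the code, to avoid clashing with the conditioning parameter) take arbitrary illustrative values; in the text the roles are played by a = 1-c, base = 1-c'', and constants built from π, c, c'', and z_1.

```python
# Verify: x_{n+1} = a*x_n + b*theta^(n-1) + d has the stated closed form.
a, b, theta, d, x1 = 0.9, 0.05, 0.8, 0.02, 0.3   # illustrative constants

def x_closed(n):
    """Closed form: a^(n-1) x_1 + d*sum_{i=0}^{n-2} a^i + b*sum_{i=0}^{n-2} a^i theta^(n-2-i)."""
    s_d = sum(a ** i for i in range(n - 1))
    s_b = sum(a ** i * theta ** (n - 2 - i) for i in range(n - 1))
    return a ** (n - 1) * x1 + d * s_d + b * s_b

# Iterate the recursion itself; xs[k] holds x_{k+1}.
x = x1
xs = [x1]
for n in range(1, 30):
    x = a * x + b * theta ** (n - 1) + d
    xs.append(x)
```

The iterated values agree with the closed form at every trial, including the first step x_2 = a x_1 + b + d.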
Carrying out the summations yields the desired result:

    Pr(A_{1,n}) = x_n + ½z_n
                = π + [x_1 - π](1-c)^{n-1} + πz_1[(1-c)^{n-1} - (1-c'')^{n-1}] + ½z_1(1-c'')^{n-1} .    (20)

If Pr(C_{0,1}) = 0, then we have a simple exponential learning function starting at Pr(C_{1,1}) and approaching π at a rate determined by c. If Pr(C_{0,1}) ≠ 0, then the rate of approach is a function of both c and c''.

We now consider one simple sequential prediction to illustrate another feature of the one-element model for c' = 0; namely, the probability of an A_1 response on trial n+1 given a reinforced A_1 response on trial n, Pr(A_{1,n+1}|E_{1,n}A_{1,n}). Note first of all that

    Pr(A_{1,n+1}|E_{1,n}A_{1,n}) = Pr(A_{1,n+1} E_{1,n} A_{1,n}) / Pr(E_{1,n} A_{1,n}) .

Further we may write

    Pr(A_{1,n+1} E_{1,n} A_{1,n}) = Σ_{i,j} Pr(A_{1,n+1}|C_{j,n+1}) Pr(C_{j,n+1}|E_{1,n} C_{i,n}) Pr(E_{1,n}|A_{1,n} C_{i,n}) Pr(A_{1,n}|C_{i,n}) Pr(C_{i,n}) .
But by assumption the probability of a response is determined solely by the conditioning state, and by assumption the probability of an E_1 event is independent of other events; hence

    Pr(E_{1,n}|A_{1,n}C_{i,n}) = π ,

and we have

    Pr(A_{1,n+1} E_{1,n} A_{1,n}) = π Σ_{i,j} Pr(A_{1,n+1}|C_{j,n+1}) Pr(C_{j,n+1}|E_{1,n} C_{i,n}) Pr(A_{1,n}|C_{i,n}) Pr(C_{i,n}) .

Both i and j run over 0, 1, and 2, and therefore there are nine terms in the sum; but note that when either i or j is 2 the corresponding term vanishes, since Pr(A_1|C_2) = 0. Consequently it suffices to limit i and j to 0 and 1. Since the subject cannot leave state C_1 on a trial when A_1 is reinforced, Pr(C_{1,n+1}|E_{1,n}C_{1,n}) = 1, and the terms with i = 1 contribute simply πPr(C_{1,n}). For i = 0 we have Pr(C_{1,n+1}|E_{1,n}C_{0,n}) = c'' and Pr(C_{0,n+1}|E_{1,n}C_{0,n}) = 1 - c'', while Pr(A_{1,n}|C_{0,n}) = Pr(A_{1,n+1}|C_{0,n+1}) = ½. Substituting these results in the above expression we obtain
for the second sum

    π[c'' + ½(1-c'')] ½ Pr(C_{0,n}) .

Combining these results,

    Pr(A_{1,n+1} E_{1,n} A_{1,n}) = π{ Pr(C_{1,n}) + ½[c'' + ½(1-c'')]Pr(C_{0,n}) } .

Further,

    Pr(E_{1,n} A_{1,n}) = Pr(E_{1,n}|A_{1,n})Pr(A_{1,n}) = πPr(A_{1,n}) ,

whence

    Pr(A_{1,n+1}|E_{1,n}A_{1,n}) = { Pr(C_{1,n}) + ½[c'' + ½(1-c'')]Pr(C_{0,n}) } / { Pr(C_{1,n}) + ½Pr(C_{0,n}) } .

We know that Pr(C_{0,n}) approaches 0 and that Pr(C_{1,n}) and Pr(A_{1,n}) both approach π in the limit. Therefore, we predict that

    lim_{n→∞} Pr(A_{1,n+1}|E_{1,n}A_{1,n}) = 1 .

This prediction provides a very sharp test for this particular case of the model, and one that is certain to fail in almost any experimental situation. That is, even after a large number of trials it is hard to conceive of an experimental procedure such that a response will be repeated with probability 1 if it occurred and was reinforced on the preceding trial. Later we shall consider a multi-element model which provides an excellent description of many sets of data but is based on essentially the same conditioning rules specified by this case of c' = 0. It should be emphasized that
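The approach of the repetition probability to unity can be traced exactly by iterating the state probabilities under the matrix of Eq. 19. The sketch below uses illustrative parameter values and our reading of the conditional-probability expression derived above; no data are involved.

```python
# Exact iteration of Pr(A1,n+1 | E1,n A1,n) for the case c' = 0.
pi, c, cpp = 0.6, 0.2, 0.3     # pi, c, c'' (illustrative values)
x, y, z = 0.0, 0.0, 1.0        # start fully unconditioned: Pr(C0,1) = 1

repeat_probs = []
for n in range(200):
    # conditional probability of repeating a reinforced A1 response
    numer = x + 0.5 * (cpp + 0.5 * (1 - cpp)) * z
    denom = x + 0.5 * z
    if denom > 0:
        repeat_probs.append(numer / denom)
    # one step of the chain of Eq. 19 (rows ordered C1, C2, C0)
    x, y, z = (x * (1 - c * (1 - pi)) + y * c * pi + z * cpp * pi,
               x * c * (1 - pi) + y * (1 - c * pi) + z * cpp * (1 - pi),
               z * (1 - cpp))
```

As z_n dies out geometrically, the computed repetition probability climbs toward 1, illustrating why this deterministic prediction is so easily falsified by data.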
deterministic predictions of the sort given in the equation above are peculiar to one-element models; for the multi-element case such difficulties do not arise. This point will be amplified later.

Case of c = 0. We now consider the case in which direct counterconditioning does not occur, i.e., c = 0, and thus ρ = 0 and 0 < ε < ∞, which is the permissible range on ε. With this restriction the chain is still ergodic, since it is possible to go from every state to every other state; but transitions between C_1 and C_2 must go by way of C_0. Letting ρ = 0 in Eq. 18 we obtain

    Pr(A_{1,∞}) = [π² + ½π(1-π)ε] / [π² + (1-π)² + π(1-π)ε] .    (21)

From Eq. 21 we can draw some interesting conclusions about the relationship of the asymptotic response probabilities to the ratio ε = c'/c''. Differentiating with respect to ε, we obtain

    ∂Pr(A_{1,∞})/∂ε = π(1-π)(½-π) / [π² + (1-π)² + π(1-π)ε]² .

Since the sign of the derivative is independent of ε, we know that Pr(A_{1,∞}) is either monotone increasing or monotone decreasing in ε: strictly increasing if π(1-π)(½-π) > 0 (i.e., π < ½) and decreasing if π(1-π)(½-π) < 0 (i.e., π > ½); if π = ½, then Pr(A_{1,∞}) = ½ for every ε. Moreover, because of the monotonicity of Pr(A_{1,∞}) in ε, Pr(A_{1,∞}) has no maximum for ε in the open interval (0,∞). Firstly, it is easy to compute bounds from Eq. 21: we see immediately that the lower bound (assuming π > ½) is

    lim_{ε→∞} Pr(A_{1,∞}) = ½ ,

while as ε → 0, Pr(A_{1,∞}) approaches π²/[π² + (1-π)²].
Secondly, for π > ½, Pr(A_{1,∞}) is a decreasing function of ε, and its values lie in the half-open interval (½, π²/(π² + (1-π)²)]. It is readily determined that probability matching would not in general be predicted in this case: when the ratio ε = c'/c'' is greater than 2, the predicted value of Pr(A_{1,∞}) is less than π, and when this ratio is less than 2, the predicted value of Pr(A_{1,∞}) is greater than π. Note, however, that Eq. 21 is inapplicable when both c = 0 and c' = 0, for then the transition matrix (Eq. 14) reduces to

            C_1     C_2        C_0
    C_1     1       0          0
    C_2     0       1          0
    C_0     c''π    c''(1-π)   1-c''

and, if the process starts in C_0, the subject is eventually absorbed in C_1 with probability π; hence in this degenerate case Pr(A_{1,∞}) = π.

Finally, we derive Pr(A_{1,n+1}|E_{1,n}A_{1,n}) for this case, c = 0. The derivation is identical to that given for the case of c' = 0. But for ε > 0 the quantity u_0 is never 0, and consequently Pr(A_{1,n+1}|E_{1,n}A_{1,n}) is always less than 1, even at asymptote.
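Under our reconstruction of Eq. 21 these properties are easy to exhibit numerically: matching holds exactly at ε = 2, and for π > ½ the asymptote decreases in ε toward ½. The parameter values below are illustrative only.

```python
# Behavior of Pr(A1,inf) in the c = 0 case (Eq. 21 as reconstructed here).
def p_inf(pi, eps):
    num = pi ** 2 + 0.5 * pi * (1 - pi) * eps
    den = pi ** 2 + (1 - pi) ** 2 + pi * (1 - pi) * eps
    return num / den

pi = 0.7
# probability matching occurs exactly at eps = c'/c'' = 2
match = p_inf(pi, 2.0)
# for pi > 1/2 the function is monotone decreasing in eps, with limit 1/2
vals = [p_inf(pi, e) for e in (0.5, 1.0, 2.0, 4.0, 50.0, 5000.0)]
```

At ε = 2 the numerator reduces to π and the denominator to 1, so matching is exact; above and below 2 the asymptote falls below or above π, as stated in the text.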
Contingent reinforcement. As a final example we shall apply the one-element model to a situation where the reinforcing event on trial n is contingent on the response on that trial. Simple contingent reinforcement is defined by two probabilities π_1 and π_2 such that

    Pr(E_{1,n}|A_{1,n}) = π_1    and    Pr(E_{1,n}|A_{2,n}) = π_2 .

We consider the case of the model in which c' = 0 and Pr(C_{0,1}) = 0. That is, the subject is not in state C_0 on trial 1, and (since c' = 0) he can never reach C_0 from C_1 or C_2; hence on all trials he is in C_1 or C_2, and transitions between these states are governed by the single parameter c. The trees for the C_1 and C_2 states are given in Figure 4.

Insert Fig. 4 about here

The transition matrix is

            C_1            C_2
    C_1     1-(1-π_1)c     (1-π_1)c
    C_2     π_2 c          1-π_2 c

and in terms of this matrix we may write
[Figure 4]
Fig. 4. Branching process for one-element model in two-choice, contingent case.
    Pr(A_{1,n}) = Pr(C_{1,n}) .

Hence

    Pr(A_{1,n+1}) = Pr(A_{1,n})[1 - c(1-π_1)] + [1 - Pr(A_{1,n})]cπ_2 .

This difference equation has the solution

    Pr(A_{1,n}) = Pr(A_{1,∞}) - [Pr(A_{1,∞}) - Pr(A_{1,1})][1 - c(1-π_1+π_2)]^{n-1} ,

where

    Pr(A_{1,∞}) = π_2 / (1-π_1+π_2) .

The asymptote is independent of c, and the rate of approach is determined by the quantity c(1-π_1+π_2). It is interesting to note that the learning function for Pr(A_{1,n}) in this case of the one-element model is identical to that of the linear model (cf. Estes and Suppes, 1959a).
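A quick numerical check of this solution (illustrative parameter values, not estimates from any experiment): iterating the difference equation reproduces the stated closed form at every trial and approaches π_2/(1-π_1+π_2).

```python
# Contingent-case learning curve: recursion vs. closed-form solution.
c, pi1, pi2, p1 = 0.25, 0.8, 0.3, 0.5   # illustrative values; p1 = Pr(A1,1)
rate = 1 - c * (1 - pi1 + pi2)
asymptote = pi2 / (1 - pi1 + pi2)

p = p1
curve = [p]                              # curve[k] holds Pr(A1,k+1)
for _ in range(50):
    p = p * (1 - c * (1 - pi1) - c * pi2) + c * pi2   # the difference equation
    curve.append(p)

def p_closed(n):
    """Stated solution: exponential approach to pi2/(1 - pi1 + pi2)."""
    return asymptote - (asymptote - p1) * rate ** (n - 1)
```

Note that c affects only the factor `rate`, i.e. the speed of approach, while the asymptote depends solely on π_1 and π_2, as the text asserts.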
3. MULTI-ELEMENT PATTERN MODELS

3.1 General Formulation

In the literature of stimulus sampling theory a variety of proposals have been made for conceptually representing the stimulus situation. Fundamental to all of these suggestions has been the distinction between pattern elements and component elements. For the one-element case this distinction does not play a serious role, but for multi-element formulations these alternative representations of the stimulus situation specify different mathematical processes. In component models, the stimulating situation is represented as a population of elements which the learner is viewed as sampling from trial to trial. It is assumed that the conditioning of individual elements to responses occurs independently as the elements are sampled in conjunction with reinforcing events, and that the response probability in the presence of a sample containing a number of elements is determined by an averaging rule. The principal consideration has been to account for response variability to an apparently constant stimulus situation by postulating random fluctuations from trial to trial in the particular sample of stimulus elements affecting the learner. These component models have provided a mechanism for effecting a reconciliation between the picture of gradual change usually exhibited by the learning curve and the all-or-none law of association. For many experimental situations a detailed account of the quantitative properties of learning can be given by component models that assume discrete associations between responses and the independently variable elements of a stimulus situation. However, in some cases it appears
that a simple account of the learning process requires the assumption that responses become associated, not with separate components or aspects of a stimulus situation, but with total patterns of stimulation considered as units. The model presented in this section is intended to represent such a case. In it we assume that an experimentally specified stimulating situation can be conceived as an assemblage of distinct, mutually exclusive patterns of stimulation, each of which becomes conditioned to responses on an all-or-none basis. By "distinct" we mean that no generalization occurs from one pattern to another. Thus the clearest experimental interpretation would involve a set of patterns having no common elements (i.e., common properties or components). In practice the pattern model has also been applied with considerable success to experiments in which the alternative stimuli have some common elements but nevertheless are sufficiently discriminable so that generalization effects (e.g., "confusion errors") are small and can be neglected without serious error. By "mutually exclusive" we mean that exactly one of the patterns occurs (is sampled by the subject) on each trial. In this presentation we shall limit consideration to cases in which patterns are sampled randomly with equal likelihood, so that if there are N patterns, each has probability 1/N of being sampled on a trial. This sampling assumption represents only one way of formulating the model and is presented here because it generates a fairly simple mathematical process and provides a good account of a variety of experimental results. However, this particular scheme for sampling patterns has restricted
applicability. For example, in certain experiments it can be demonstrated that the stimulus array to which the subject responds is in large part determined by events on previous trials; that is, trace stimulation associated with previous responses and rewards determines the stimulus pattern to which the subject responds. When this is the case, it is necessary to postulate a more general rule for sampling patterns than the random scheme proposed above (e.g., see the discussion of "hypothesis models" in Suppes and Atkinson, 1960).

Before stating the axioms for the pattern model to be considered in this section we define the following notions. As before, the behaviors available to the subject are categorized into mutually exclusive and exhaustive response classes (A_1, A_2, ..., A_r). The possible experimenter-defined outcomes of a trial (e.g., giving or withholding reward, knowledge of results, presentation of an unconditioned stimulus) are classified by their effect on response probability and are represented by a mutually exclusive and exhaustive set of reinforcing events (E_0, E_1, ..., E_r). The event E_i (i ≠ 0) indicates that response A_i is reinforced, and E_0 represents any trial outcome whose effect is neutral (i.e., reinforces none of the A_i's). The subject's response and the experimenter-defined outcomes are observable, but the occurrence of E_i is a purely hypothetical event that represents the reinforcing effect of the trial outcome. Event E_i is said to have occurred when the outcome of a trial is such as to increase the probability of response A_i in the presence of the given stimulus, provided, of course, that this probability is not already at its maximum value. We now present the axioms. The first group of axioms deals with the conditioning of sampled patterns, the second group with the sampling of patterns, and the third group with responses.
Conditioning Axioms

C1. On every trial each pattern is conditioned to exactly one response.

C2. If a pattern is sampled on a trial, it becomes conditioned with probability c to the response (if any) that is reinforced on that trial; if it is already conditioned to that response, it remains so.

C3. If no reinforcement occurs on a trial (i.e., E_0 occurs), there is no change in conditioning on that trial.

C4. Patterns that are not sampled on a trial do not change their conditioning on that trial.

C5. The probability c that a sampled pattern will be conditioned to a reinforced response is independent of the trial number and preceding events.

Sampling Axioms

S1. Exactly one pattern is sampled on each trial.

S2. Given the set of N patterns available for sampling on a trial, all patterns are sampled with equal probability; that is, the probability of sampling a given pattern is independent of the trial number and the preceding events.

Response Axiom

R1. On any trial, that response is made to which the sampled pattern is conditioned.

Later in this section we apply these axioms to a two-choice learning experiment and to a paired-comparison study. First, however, we shall prove several general theorems. Before we can begin our analysis it is necessary to define the notion of a conditioning state. For the axioms above, it suffices
to let the state of conditioning indicate the number of patterns conditioned to each response. Hence, for r responses the conditioning states are the ordered r-tuples <k_1, k_2, ..., k_r>, where k_i = 0, 1, 2, ..., N and k_1 + k_2 + ... + k_r = N; the integer k_i denotes the number of patterns conditioned to the i-th response. The number of possible conditioning states is accordingly the number of such r-tuples, (N+r-1 choose r-1). (In a generalized model which permitted different patterns to have different likelihoods of being sampled, it would be necessary to specify not only the number of patterns conditioned to a response but also the sampling probabilities associated with the patterns.) For simplicity, in this section we limit consideration to the case of two alternatives, except for one example where r = 3. Given only two alternatives, we denote the conditioning state on trial n of an experiment as C_{i,n}, where i = 0, 1, 2, ..., N; the subscript i indicates the number of patterns conditioned to A_1, and N-i is the number conditioned to A_2.

Transition Probabilities. Only one pattern is sampled per trial; therefore, the subject can go from state C_i only to states C_{i-1}, C_i, or C_{i+1} on any given trial. These transitions depend on the value of the conditioning parameter c and on the reinforcement schedule. We now proceed to compute these probabilities. If the subject is in state C_i on trial n and an E_1 occurs, then the possible outcomes are indicated by the tree in Figure 5.

Insert Fig. 5 about here
[Figure 5]
Fig. 5. Branching process for N-element model on a trial when the subject starts in state C_i and an E_1 event occurs.
On the upper main branch, a pattern conditioned to A_1 is sampled, which has probability i/N; since an E_1 occurs, the pattern remains conditioned to A_1, and by Axiom C2 the conditioning state on trial n+1 is the same as on trial n. On the lower main branch, a pattern conditioned to A_2 is sampled, which has probability (N-i)/N; then with probability c the sampled pattern becomes conditioned to A_1 and the subject moves to conditioning state C_{i+1}, whereas with probability 1-c conditioning is not effective and the subject remains in state C_i. Putting these results together we obtain

    Pr(C_{i+1,n+1}|E_{1,n}C_{i,n}) = c(N-i)/N        (22a)

    Pr(C_{i,n+1}|E_{1,n}C_{i,n}) = 1 - c + c(i/N) .   (22b)

Similarly, if an E_2 occurs on trial n,

    Pr(C_{i-1,n+1}|E_{2,n}C_{i,n}) = c(i/N)

    Pr(C_{i,n+1}|E_{2,n}C_{i,n}) = 1 - c + c(N-i)/N ;

and by Axiom C3, if an E_0 occurs, then

    Pr(C_{i,n+1}|E_{0,n}C_{i,n}) = 1 .        (22c)

Noting that a transition upward can occur only when a pattern conditioned to A_2 is sampled on an E_1 trial, and a transition downward can
occur only when a pattern conditioned to A_1 is sampled on an E_2 trial, we can combine the results from Eqs. 22a-c to obtain, for the probabilities of one-step transitions between states,

    Pr(C_{i+1,n+1}|C_{i,n}) = c[(N-i)/N] Pr(E_{1,n}|A_{2,n}C_{i,n})        (23a)

    Pr(C_{i-1,n+1}|C_{i,n}) = c[i/N] Pr(E_{2,n}|A_{1,n}C_{i,n})           (23b)

    Pr(C_{i,n+1}|C_{i,n}) = 1 - c[(N-i)/N]Pr(E_{1,n}|A_{2,n}C_{i,n}) - c[i/N]Pr(E_{2,n}|A_{1,n}C_{i,n}) .   (23c)

Equation 23a states that the probability of moving from the state with i elements conditioned to A_1 to the state with i+1 elements conditioned to A_1 is the product of the probability (N-i)/N that a pattern conditioned to A_2 is sampled, the probability Pr(E_{1,n}|A_{2,n}C_{i,n}) that an E_1 occurs, and the probability c that, under the given circumstances, conditioning occurs, i.e., that an element not already conditioned to A_1 becomes so conditioned. As defined earlier, we have a Markov process in the conditioning states if the probability of a transition from any state to any other state depends at most on the state existing on the trial preceding the transition. By inspection of Eq. 23 we see that the Markov condition may be satisfied by limiting ourselves to reinforcement schedules in which the probability of a reinforcing event E_i depends at most on the response of the given trial; that is, in learning-theory terminology, to
noncontingent and simple contingent schedules. This restriction will be assumed throughout the present section, except for a few remarks in which we explicitly consider various lines of generalization. Further, we assume that the reinforcement probabilities do not depend on the trial number. With these restrictions in mind, we define

    π_ij = Pr(E_{j,n}|A_{i,n}) ,

where i runs from 1 to r, j runs from 0 to r, and Σ_j π_ij = 1. We may then rewrite Eq. 23 as follows:

    q_{i,i+1} = c[(N-i)/N]π_21        (24a)

    q_{i,i-1} = c[i/N]π_12            (24b)

    q_{i,i} = 1 - c[(N-i)/N]π_21 - c[i/N]π_12 .    (24c)

Note that we use the notation q_ij in place of Pr(C_{j,n+1}|C_{i,n}); the reason is that, given the restrictions on the reinforcement schedule stated above, the transition probabilities do not depend on n, and the simpler notation expresses this fact.

Response Probabilities and Moments. By Axioms S1, S2, and R1 we know that the relation between response probability and the conditioning state is simply

    Pr(A_{1,n}|C_{i,n}) = i/N .
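Equations 24a-c define a tridiagonal transition matrix over the N+1 conditioning states. A minimal sketch of its construction follows; the values of N, c, and the π_ij are illustrative assumptions, not quantities from the text.

```python
import numpy as np

N, c, pi21, pi12 = 4, 0.3, 0.6, 0.4   # illustrative; pi21 = Pr(E1|A2), pi12 = Pr(E2|A1)

Q = np.zeros((N + 1, N + 1))
for i in range(N + 1):
    up = c * (N - i) / N * pi21        # q_{i,i+1}, Eq. 24a
    down = c * i / N * pi12            # q_{i,i-1}, Eq. 24b
    if i < N:
        Q[i, i + 1] = up
    if i > 0:
        Q[i, i - 1] = down
    Q[i, i] = 1 - up - down            # q_{i,i}, Eq. 24c
```

Each row sums to one, and the boundary states behave correctly: from C_0 only an upward step (probability cπ_21) is possible, and from C_N only a downward step (probability cπ_12).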
Hence

    Pr(A_{1,n}) = Σ_{i=0}^{N} Pr(A_{1,n}|C_{i,n}) Pr(C_{i,n}) = (1/N) Σ_{i=0}^{N} i Pr(C_{i,n}) .    (25)

But note that by definition of the transition probabilities q_ij,

    Pr(C_{i,n+1}) = Σ_{j=0}^{N} Pr(C_{j,n}) q_{ji} .    (26)

The latter expression, together with Eq. 25, serves as the basis for a general recursion in Pr(A_{1,n}):

    Pr(A_{1,n+1}) = (1/N) Σ_{i=0}^{N} i Pr(C_{i,n+1}) = (1/N) Σ_{i=0}^{N} Σ_{j=0}^{N} i Pr(C_{j,n}) q_{ji} .

Now, substituting for q_{ji} in terms of Eq. 24 and rearranging the sum, we have
sums of the form (cπ_21/N²)Σ_i (i+1)(N-i)Pr(C_{i,n-1}) and (cπ_12/N²)Σ_i i(i-1)Pr(C_{i,n-1}). Let us define the second raw moment

    α_{2,n} = (1/N²) Σ_{i=0}^{N} i² Pr(C_{i,n}) .

Carrying out the summation and simplifying, the terms in α_{2,n-1} cancel identically, and we obtain the following recursion in Pr(A_{1,n}):

    Pr(A_{1,n}) = Pr(A_{1,n-1})[1 - (c/N)(π_12 + π_21)] + (c/N)π_21 .    (27)

This difference equation has the well-known solution (cf. Bush and Mosteller, 1955; Estes, 1959b; Estes and Suppes, 1959)

    Pr(A_{1,n}) = π_21/(π_12+π_21) - [π_21/(π_12+π_21) - Pr(A_{1,1})][1 - (c/N)(π_12+π_21)]^{n-1} .    (28)
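Because the recursion in Pr(A_{1,n}) is exactly linear, the curve of Eq. 28 should agree with the mean of the full Markov chain at every trial, not merely approximately. The sketch below (illustrative parameters) evolves the complete state distribution under the transition probabilities of Eq. 24 and compares E(i/N) with Eq. 28.

```python
import numpy as np

N, c, pi21, pi12 = 5, 0.2, 0.7, 0.3   # illustrative values

# Transition matrix from Eq. 24.
Q = np.zeros((N + 1, N + 1))
for i in range(N + 1):
    up, down = c * (N - i) / N * pi21, c * i / N * pi12
    if i < N:
        Q[i, i + 1] = up
    if i > 0:
        Q[i, i - 1] = down
    Q[i, i] = 1 - up - down

dist = np.zeros(N + 1)
dist[0] = 1.0                                     # start with no patterns conditioned to A1
p1 = np.dot(dist, np.arange(N + 1)) / N           # Pr(A1,1), here 0

def p_theory(n):
    """Eq. 28: exponential approach to pi21/(pi21 + pi12)."""
    asym = pi21 / (pi21 + pi12)
    return asym - (asym - p1) * (1 - (c / N) * (pi12 + pi21)) ** (n - 1)

means = []
for n in range(1, 40):
    means.append(np.dot(dist, np.arange(N + 1)) / N)   # E(i/N) on trial n
    dist = dist @ Q
```

The agreement is exact (to rounding), which is the computational counterpart of the cancellation of the second-moment terms noted above.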
At this point it will also be instructive to calculate the variance of the distribution of the response probabilities Pr(A_{1,n}|C_{i,n}). The second raw moment as defined above is

    α_{2,n} = Σ_{i=0}^{N} (i/N)² Pr(C_{i,n}) .

Carrying out the summation as was done in the case of Eq. 27, we obtain

    α_{2,n} = α_{2,n-1}[1 - (2c/N)(π_12+π_21)] + (2c/N)π_21 Pr(A_{1,n-1}) + (c/N²)[π_21 - (π_21-π_12)Pr(A_{1,n-1})] .

Subtracting the square of Pr(A_{1,n}) as given in Eq. 28 then yields the variance of the response probabilities. The second and higher moments of the response probabilities are of experimental interest primarily because they enter into predictions concerning various sequential statistics; we return to this point later.

Asymptotic Distributions. The pattern model has one particularly advantageous feature not shared by many other models that have appeared in the learning literature: a simple calculational procedure for generating the complete asymptotic distribution
of conditioning states, and therefore the asymptotic distribution of responses. As in Sec. 2, we let

    lim_{n→∞} Pr(C_{i,n}) = u_i .

The derivation to be given assumes that all of the relevant elements of the transition matrix are nonzero; the same technique can, of course, be applied if there are zero entries, except that in forming ratios one must keep the zeros out of the denominators. The theorem to be proved is that all of the asymptotic conditioning-state probabilities u_i can be expressed recursively in terms of u_0; since the u_i's must sum to unity, this recursion suffices to determine the entire distribution. By Eq. 26 we note that

    u_0 = u_0 q_00 + u_1 q_10 ,

and hence, since 1 - q_00 = q_01,

    u_1 = u_0 (q_01/q_10) .

We now prove by induction that a similar relation holds for any pair of adjacent states; that is,

    u_{i+1} = u_i (q_{i,i+1}/q_{i+1,i}) .
By Eq. 26,

    u_i = u_{i-1} q_{i-1,i} + u_i q_{i,i} + u_{i+1} q_{i+1,i} .

Under the inductive hypothesis we may replace u_{i-1} by its equivalent u_i q_{i,i-1}/q_{i-1,i}, obtaining

    u_i = u_i q_{i,i-1} + u_i q_{i,i} + u_{i+1} q_{i+1,i} .

Rearranging, we have

    u_i (1 - q_{i,i} - q_{i,i-1}) = u_{i+1} q_{i+1,i} ;

however, 1 - q_{i,i} - q_{i,i-1} = q_{i,i+1}, since each row of the transition matrix must sum to unity. Therefore

    u_{i+1} = u_i (q_{i,i+1}/q_{i+1,i}) ,

which concludes the proof. Since the u_i's must sum to unity, u_0 also is determined. To illustrate the application of this technique we consider some simple cases. For the noncontingent case discussed in Sec. 2,
by Eq. 24 we have

    q_{i,i+1} = c[(N-i)/N]π    and    q_{i+1,i} = c[(i+1)/N](1-π) .

Applying the technique of the previous paragraph,

    u_1 = u_0 · Nπ/(1-π) ,

    u_2 = u_1 · (N-1)π/[2(1-π)] ,

and in general

    u_k = u_{k-1} · (N-k+1)π / [k(1-π)] .

This result has two interesting features. First, we note that the asymptotic probabilities are independent of the conditioning parameter c. Second, the ratio of u_k to u_{k-1} is the same as that of the neighboring terms

    (N choose k) π^k (1-π)^{N-k}    and    (N choose k-1) π^{k-1} (1-π)^{N-k+1}

in the expansion of [π + (1-π)]^N. Therefore the asymptotic probabilities are binomially distributed,

    u_k = (N choose k) π^k (1-π)^{N-k} .

For a
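The ratio recursion can be run directly to confirm that the normalized u_k are the binomial terms and involve neither c nor the starting state. A minimal sketch with illustrative N and π:

```python
from math import comb

N, pi = 6, 0.6   # illustrative values

# Build the unnormalized u_k from the ratio u_k / u_{k-1} = (N-k+1) pi / (k (1-pi)).
u = [1.0]
for k in range(1, N + 1):
    u.append(u[-1] * (N - k + 1) * pi / (k * (1 - pi)))
total = sum(u)
u = [x / total for x in u]

# Binomial terms C(N,k) pi^k (1-pi)^(N-k) for comparison.
binom = [comb(N, k) * pi ** k * (1 - pi) ** (N - k) for k in range(N + 1)]
```

The two lists coincide term by term, so u_N = π^N, u_{N-1} = Nπ^{N-1}(1-π), and so on, as stated in the text.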
population of subjects whose learning is described by the model, the limiting proportion of subjects having all N patterns conditioned to A_1 is π^N, the proportion having all but one of the N patterns conditioned to A_1 is Nπ^{N-1}(1-π), and so on. For the case of simple contingent reinforcement, the ratio of u_k to u_{k-1} is the same as that of

    (N choose k) π_21^k π_12^{N-k}    to    (N choose k-1) π_21^{k-1} π_12^{N-k+1} .

Therefore the asymptotic state probabilities are the terms in the expansion of

    [π_21/(π_12+π_21) + π_12/(π_12+π_21)]^N .

Again we note that the u_k are independent of c. Explicit formulas for state probabilities are useful primarily as intermediary expressions in the derivation of other quantities. In the special case of the pattern model (unlike other types of stimulus sampling models), however, the strict determination of the response on any trial by the conditioning state of the trial sample permits a relatively direct empirical interpretation, as will be seen below, for the moments of the distribution of state probabilities are identical with the moments
of the response random variable. Thus, in the simple contingent case we have immediately for the mean of the response random variable

    E(A_n) = Σ_{k=0}^{N} (k/N)(N choose k)[π_21/(π_21+π_12)]^k [π_12/(π_21+π_12)]^{N-k} = π_21/(π_21+π_12) ,

and for its variance

    Var(A_n) = E(A_n) - [E(A_n)]² = π_21π_12/(π_21+π_12)² .

If we select some fixed trial n (sufficiently large so that the learning process may be assumed asymptotic), then, by the familiar theorem for the variance of a sum of independent random variables, the theoretical variance of the response totals for a number of independent samples of K subjects on trial n is simply K·π_21π_12/(π_21+π_12)². A bit of caution is needed in applying this last expression to data, however: this expression does not hold for the variance of A_1 response totals over a block of K successive trials. The additional considerations involved in the latter case will be discussed in the next section.
3.2 Treatment of the Simple Noncontingent Case

In this section we shall consider various predictions that may be derived from the pattern model for simple predictive behavior in a two-choice situation with noncontingent reinforcement. Each trial in the reference experiment begins with presentation of a ready signal; the subject's task is to respond to the signal by operating one of a pair of response keys, indicating his prediction as to which of two reinforcing lights will appear. The reinforcing lights are programmed by the experimenter to occur in random sequence, exactly one on each trial, with probabilities which are constant throughout the series and independent of the subject's behavior. For illustrative purposes, we shall use data from two experiments of this sort. In one of these, henceforth designated the .8 series, eighty subjects were run, each for a series of 288 trials, with probabilities of .8 and .2 for the two reinforcing lights. Details of the procedure and results have been reported by Friedman et al. (1960). In the other experiment, henceforth designated the .6 series, thirty subjects were run, each for a series of 240 trials, with probabilities of .6 and .4 for the two reinforcing lights. Details of the experimental procedure, and a more complete analysis of the data than we shall undertake here, are given by Suppes and Atkinson (1960, Ch. 10). A possibly important difference between the conditions of the two experiments is that in the .6 series the subjects were new to this type of experiment, whereas in the .8 series the subjects were highly practiced, having had experience with a variety of noncontingent schedules in two previous experimental sessions.
For our present purposes it will suffice to consider only the simplest possible interpretation of the experimental situation in terms of the pattern model. Let O_1 denote the more frequently occurring reinforcing light and O_2 the less frequent light. We then postulate a one-to-one correspondence between the appearance of light O_i and the reinforcing event E_i, which is associated with A_i (the response of predicting O_i). Also we assume that the experimental conditions determine a set of N distinct stimulus patterns, exactly one of which is present at the onset of any given trial. Since, in experiments of the sort under consideration, the experimenter usually presents the same ready signal at the beginning of every trial, one might assume that N would necessarily equal unity. However, regardless of the experimenter's intention, the pattern effective at the onset of a given trial might comprise the experimenter's ready signal together with stimulus traces (perhaps verbally mediated) of the reinforcing events and responses of one or more preceding trials. Consequently, we shall not impose this restriction on the model. Rather, we shall let N appear as a free parameter in theoretical expressions, and we shall seek to determine from the data what value of N is required to minimize the disparities between theoretical and observed values. If the data of a particular experiment yield an estimate of N greater than unity, and if with this estimate the model provides a satisfactory account of the empirical relationships in question, then we shall conclude that the learning process proceeds as described by the model but that, regardless of the experimenter's intention, the subjects are sampling a population of N stimulus patterns.
many investigators (e. The goal in these applications is not the perhaps impossible one of accounting for every detail of the experimental results. one of obtaining valuable information about various theoretical assumptions by comparing manageably siml'le models that embody different combinations of assumptions. the different patterns must have component cues in common and these would be expected to yield transfer effects (at least on early trials) so that the response to a pattern first sampled on trial n would be influenced by conditioning that occurred How when components of that pattern were present on earlier trials. 65It will be apparent that the pattern model could scarcely be expected to provide a completely adequate account of the data of twochoice experiments run under the conditions sketched above. 1962b. Suppes and Atkinson. Firstly. then it is extremely unlikely that all of the available patterns would have equal sampling probabilities as assumed in the model. This procedure will be illustrated in the remainder of . 1960. the pattern model assumes that all of the patterns available for sampling are distinct in the sense that reinforcement of a response to one pattern has no effect on response probabilities associated with other patterns. Despite these complications. Secondly. Bower.. ever. the section. 196+) have found it a useful strategy to apply the pattern model in the simple form presented in the preceding section. Suppes and Ginsberg.g. 1961b. but rather the more modest. yet realizable. and E. Estes.A. if the stimulus patterns to which the subject responds include cues from preceding events.
Sequential Predictions. We begin our application of the pattern model with a discussion of sequential statistics. It should be emphasized that one of the major contributions of mathematical learning theory has been to provide a framework within which the sequential aspects of learning can be scrutinized. Prior to the development of mathematical models, relatively little attention was paid to trial-by-trial phenomena; at the present time, for many experimental problems, such phenomena are viewed as the most interesting aspect of the data. We shall develop the proofs in terms of two responses, but the results hold for any number of alternatives: if there are r responses in a given experimental application, any one response can be denoted A_1 and the rest regarded as members of a single class. Although we consider only the noncontingent case, the same methods may be used to obtain results for more general reinforcement schedules.

We consider first the probability of an A_1 response given that it occurred and was reinforced on the preceding trial, Pr(A_{1,n+1}|E_{1,n}A_{1,n}). It is convenient to deal first with the joint probability Pr(A_{1,n+1}E_{1,n}A_{1,n}), then to conditionalize later. First we note that this joint probability may be expressed in terms of the conditioning states as

Pr(A_{1,n+1}E_{1,n}A_{1,n}) = Σ_{i,j} Pr(A_{1,n+1}C_{j,n+1}E_{1,n}A_{1,n}C_{i,n}),   (30)

and that each term of the sum may in turn be expanded in terms of conditional probabilities.
Each term of the sum in Eq. 30 may be expanded as

Pr(A_{1,n+1}C_{j,n+1}E_{1,n}A_{1,n}C_{i,n}) = Pr(A_{1,n+1}|C_{j,n+1}E_{1,n}A_{1,n}C_{i,n}) Pr(C_{j,n+1}|E_{1,n}A_{1,n}C_{i,n}) Pr(E_{1,n}|A_{1,n}C_{i,n}) Pr(A_{1,n}|C_{i,n}) Pr(C_{i,n}).

But from the sampling and response axioms, the probability of a response on any trial is determined solely by the conditioning state on that trial; consequently the first factor in the expansion can be rewritten simply as Pr(A_{1,n+1}|C_{j,n+1}), and by Axiom R1,

Pr(A_{1,n+1}|C_{j,n+1}) = j/N.

Next, we note that

Pr(C_{j,n+1}|E_{1,n}A_{1,n}C_{i,n}) = 1 if i = j, and 0 if i ≠ j.

That is, since an A_1 response occurs on trial n, an element conditioned to A_1 is sampled on trial n, and thus by Axiom C2 no change in the conditioning state can occur. For the noncontingent case the probability of an E_1 on any trial is independent of previous events, and consequently we may write

Pr(E_{1,n}|A_{1,n}C_{i,n}) = Pr(E_{1,n}) = π.

Finally, again by the response axiom, Pr(A_{1,n}|C_{i,n}) = i/N. Putting these results together and substituting in Eq. 30, we obtain
Pr(A_{1,n+1}E_{1,n}A_{1,n}) = Σ_i (i/N) π (i/N) Pr(C_{i,n}) = π Σ_i (i/N)² Pr(C_{i,n}) = π a_{2,n}   (31a)

and

Pr(A_{1,n+1}|E_{1,n}A_{1,n}) = π a_{2,n} / π a_{1,n} = a_{2,n}/a_{1,n},   (31b)

where a_{k,n} = Σ_i (i/N)^k Pr(C_{i,n}) denotes the kth moment of the distribution of conditioning states. In order to express this conditional probability in terms of the parameters n, N, c, and π, we need simply substitute into Eq. 31b the expression given for a_{2,n} in Eq. 28 and the corresponding expression for a_{1,n} in Eq. 29. Unfortunately, the expression that would be given by the solution of the difference equation is extremely cumbersome to work with. Consequently it is usually preferable in working with data to proceed in a different way. Suppose the data to be treated consist of proportions of occurrences of the various trigrams A_{k,n+1}E_{j,n}A_{i,n} over blocks of M trials. Consider, for example, M = 4 and the protocol

Trial:  1      2      3      4      5
Event:  A_1E_1   A_2E_2   A_1E_1   A_1E_2   A_2E_1
In this protocol there are four opportunities for the occurrence of such trigrams. The combination E_{1,n}A_{1,n} occurs on two of these, the trigram A_{1,n+1}E_{1,n}A_{1,n} on one, and A_{2,n+1}E_{1,n}A_{1,n} on one; hence the proportions of occurrence of these trigrams are .5, .25, and .25, respectively. To deal theoretically with quantities such as these, we need only average both sides of Eq. 31a (and the corresponding expressions for other trigrams) over the appropriate block of trials, obtaining, e.g., for the block running from trial n through trial n+M−1,

P_111 = (1/M) Σ_{n'=n}^{n+M−1} Pr(A_{1,n'+1}E_{1,n'}A_{1,n'}) = π ā_2(n,M),   (32a)

where ā_2(n,M) is the average value of the second moment of the response probabilities over the given trial block, and ā_1(n,M) correspondingly denotes the average probability of A_1 responses over the block. By strictly analogous methods, we can derive theoretical expressions for the other trigram proportions, viz.

P_112 = π[ā_1(n,M) − ā_2(n,M) + (c/N)(1 − ā_1(n,M))],   (32b)

P_121 = (1−π)[ā_2(n,M) − (c/N)ā_1(n,M)],   (32c)

P_122 = (1−π)[ā_1(n,M) − ā_2(n,M)],   (32d)

where P_ijk denotes the proportion of occurrences of the trigram A_{i,n+1}E_{j,n}A_{k,n}.
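Trigram proportions of this kind are simple to tabulate from a response-reinforcement protocol. The counting routine below is ours, for illustration; the protocol is the hypothetical five-trial record discussed above.

```python
from collections import Counter

def trigram_proportions(protocol):
    """protocol: list of (response, event) pairs, one per trial,
    e.g. ('A1', 'E1').  Returns proportions of trigrams
    (response on n+1, event on n, response on n) over the
    len(protocol) - 1 available opportunities."""
    counts = Counter()
    opportunities = len(protocol) - 1
    for n in range(opportunities):
        resp_n, event_n = protocol[n]
        resp_next, _ = protocol[n + 1]
        counts[(resp_next, event_n, resp_n)] += 1
    return {k: v / opportunities for k, v in counts.items()}

# The five-trial protocol from the text: E1,A1 occurs on two of the
# four opportunities, once followed by A1 and once by A2.
protocol = [('A1', 'E1'), ('A2', 'E2'), ('A1', 'E1'), ('A1', 'E2'), ('A2', 'E1')]
p = trigram_proportions(protocol)
```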
Now the average moments ā_i can be treated as parameters to be estimated from the data in order to mediate theoretical predictions. To illustrate, let us consider a sample of data from the .8 series. Over the first 12 trials of the .8 series, the observed proportion of A_1 responses for the group of 80 subjects was .63, so that ā_1(1,12) = .63; and the observed values for the trigrams of Eqs. 32a-d were P_111 = .379, P_112 = .168, P_121 = .061, and P_122 = .035. Using P_111 = .379, we have from Eq. 32a the estimate

ā_2(1,12) = P_111/π = .379/.8 = .47.

With this estimate in hand, we can substitute into Eq. 32b to estimate c/N, viz.

P_112 = .168 = .8[(.63 − .47) + (c/N)(1 − .63)],

which yields as our estimate c/N = .135. Now we are in a position to predict the values of the remaining trigram proportions. Substituting the appropriate parameter values into Eq. 32d, we have

P_122 = .2(.63 − .47) = .032,

which is not far from the observed value of .035. Proceeding similarly with Eq. 32c, we obtain

P_121 = .2[.47 − (.135)(.63)] = .077,

which is somewhat high in relation to the observed value of .061.
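The chain of estimates just carried out is mechanical enough to script. The fragment below is an illustrative check (variable names are ours) that reproduces the arithmetic for the first 12 trials of the .8 series:

```python
pi = 0.8                      # probability of E1 in the .8 series
a1 = 0.63                     # observed A1 proportion, first 12 trials
P111, P112 = 0.379, 0.168     # observed trigram proportions

a2 = round(P111 / pi, 2)                        # Eq. 32a: a2 = .47
c_over_N = (P112 / pi - (a1 - a2)) / (1 - a1)   # Eq. 32b solved for c/N
P122_pred = (1 - pi) * (a1 - a2)                # Eq. 32d
P121_pred = (1 - pi) * (a2 - c_over_N * a1)     # Eq. 32c
```

The computed values match those quoted in the text (c/N = .135, predicted P_122 = .032 and P_121 = .077).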
It should be mentioned that the simple estimation method used above for illustrative purposes would be replaced, in a serious application of the model, by a more systematic procedure employing all of the P_ijk; for example, one might simultaneously estimate ā_1, ā_2, and c/N by least squares. This procedure would yield a better overall fit of the theoretical and observed values.

A limitation of the method just described is that it permits estimation of the ratio c/N, but not estimation of c and N separately. Fortunately, in the asymptotic case the expressions for the moments a_i in the simple noncontingent situation are simple enough so that the expressions for the trigrams in terms of the parameters are manageable, and it turns out to be easy to evaluate the conditioning parameter and the number of elements from these expressions. The limit of a_{1,n} for large n is, of course, π. The limit of a_{2,n} may be obtained from the solution of Eq. 29; however, a simpler method of obtaining the same result is to note that, by definition,

lim_{n→∞} a_{2,n} = Σ_i (i/N)² u_i,

where u_i again represents the asymptotic probability of the state in which i elements are conditioned to A_1.
Recalling that the u_i are terms of the binomial distribution, we note that the summation Σ_i (i/N)² u_i is the second raw moment of the proportion i/N under the binomial distribution with parameter π and sample size N. Therefore

lim_{n→∞} a_{2,n} = π(1−π)/N + π² = π[π(1 − 1/N) + 1/N].   (33)

Using Eq. 33 and the fact that lim_{n→∞} a_{1,n} = π, we have from Eq. 31b

lim_{n→∞} Pr(A_{1,n+1}|E_{1,n}A_{1,n}) = π(1 − 1/N) + 1/N.   (34a)

By identical methods one can establish that

lim_{n→∞} Pr(A_{1,n+1}|E_{2,n}A_{1,n}) = π(1 − 1/N) + (1−c)/N,   (34b)

lim_{n→∞} Pr(A_{1,n+1}|E_{1,n}A_{2,n}) = π(1 − 1/N) + c/N,   (34c)

lim_{n→∞} Pr(A_{1,n+1}|E_{2,n}A_{2,n}) = π(1 − 1/N).   (34d)
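Equation 34a can be checked directly against the binomial stationary distribution. The snippet below (ours) computes the second raw moment Σ_i (i/N)² u_i for an arbitrary N and π and confirms that a_2/a_1 equals π(1 − 1/N) + 1/N:

```python
from math import comb

def asymptotic_a2(N, pi):
    """Second raw moment of i/N under the binomial stationary
    distribution u_i = C(N, i) pi^i (1 - pi)^(N - i)."""
    return sum((i / N) ** 2 * comb(N, i) * pi**i * (1 - pi) ** (N - i)
               for i in range(N + 1))

N, pi = 4, 0.6
ratio = asymptotic_a2(N, pi) / pi        # a2/a1, with a1 = pi at asymptote
formula = pi * (1 - 1 / N) + 1 / N       # right-hand side of Eq. 34a
```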
With these formulas in hand, we need only apply elementary probability theory to obtain expressions for dependencies of responses on responses or responses on reinforcements, viz.

lim_{n→∞} Pr(A_{1,n+1}|A_{1,n}) = π + (1−c)(1−π)/N,   (35a)

lim_{n→∞} Pr(A_{1,n+1}|A_{2,n}) = π − (1−c)π/N,   (35b)

lim_{n→∞} Pr(A_{1,n+1}|E_{1,n}) = π + c(1−π)/N,   (35c)

lim_{n→∞} Pr(A_{1,n+1}|E_{2,n}) = π − cπ/N.   (35d)

We are now in a position to achieve a rigorous test of the model by using part of the data to estimate the parameters c and N, and then substituting these estimates into Eqs. 34a-d and 35a-d to predict the values of all eight of these sequential statistics. We shall illustrate this procedure with the data of the .6 series. The observed transition frequencies for the last 100 trials, aggregated over subjects, are as follows:

Event on trial n     A_1 on n+1     A_2 on n+1
A_1E_1                  748            298
A_1E_2                  394            342
A_2E_1                  462            305
A_2E_2                  186            264
An estimate of the asymptotic probability of an A_1 response given an A_1E_1 event on the preceding trial can be obtained by dividing the first entry in row one by the sum of the row, i.e., Pr(A_1|E_1A_1) = 748/(748 + 298) = .715. But if we turn to Eq. 34a we note that

.715 = lim_{n→∞} Pr(A_{1,n+1}|E_{1,n}A_{1,n}) = .6(1 − 1/N) + 1/N.

Solving, we obtain an estimate⁷ of N = 3.48. Similarly, the observed value of Pr(A_1|E_1A_2) is 462/(462 + 305) = .602, which by Eq. 34c is an estimate of π(1 − 1/N) + c/N; using our values of π and N, we find that c = .605. Having estimated c and N, we may now generate predictions for any of our asymptotic quantities. Table 3 presents predicted and observed values for the quantities given in Eqs. 34a-d and 35a-d. Considering that only two degrees of freedom have been utilized in estimating parameters, the close correspondence between theoretical and observed quantities in Table 3 may be interpreted as giving considerable support to the assumptions of the model.

----------------------------
Insert Table 3 about here
----------------------------

⁷For any one subject, N must, of course, be an integer. The fact that our estimation procedures generally yield nonintegral values for N may signify that N varies somewhat between subjects, or it may simply reflect some contamination of the data by sources of experimental error not represented in the model.
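The two estimating equations are trivial to invert. The check below (ours) recovers N and c from the frequencies in the transition table:

```python
pi = 0.6
obs_34a = 748 / (748 + 298)   # observed Pr(A1 | E1 A1) = .715
obs_34c = 462 / (462 + 305)   # observed Pr(A1 | E1 A2) = .602

# Eq. 34a:  obs = pi(1 - 1/N) + 1/N   =>   N = (1 - pi) / (obs - pi)
N = (1 - pi) / (obs_34a - pi)
# Eq. 34c:  obs = pi(1 - 1/N) + c/N   =>   c = N [obs - pi(1 - 1/N)]
c = N * (obs_34c - pi * (1 - 1 / N))
```

The recovered values agree with the text's estimates N = 3.48 and c = .605 to within rounding.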
Table 3

Predicted (pattern model) and observed values of sequential statistics for final 100 trials of the .6 series

Asymptotic Quantity     Predicted     Observed
Pr(A_1|E_1A_1)            .715          .715
Pr(A_1|E_2A_1)            .541          .535
Pr(A_1|E_1A_2)            .601          .601
Pr(A_1|E_2A_2)            .428          .413
Pr(A_1|A_1)               .645          .641
Pr(A_1|A_2)               .532          .532
Pr(A_1|E_1)               .669          .667
Pr(A_1|E_2)               .496          .489
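All eight entries in the predicted column of Table 3 follow from Eqs. 34a-d and 35a-d once c and N are fixed. A verification sketch (ours; we take c = .605 and N = 3.48 as estimated above):

```python
pi, c, N = 0.6, 0.605, 3.48
base = pi * (1 - 1 / N)
predicted = {
    'A1|E1A1': base + 1 / N,                   # Eq. 34a
    'A1|E2A1': base + (1 - c) / N,             # Eq. 34b
    'A1|E1A2': base + c / N,                   # Eq. 34c
    'A1|E2A2': base,                           # Eq. 34d
    'A1|A1':   pi + (1 - c) * (1 - pi) / N,    # Eq. 35a
    'A1|A2':   pi - (1 - c) * pi / N,          # Eq. 35b
    'A1|E1':   pi + c * (1 - pi) / N,          # Eq. 35c
    'A1|E2':   pi - c * pi / N,                # Eq. 35d
}
table3 = {'A1|E1A1': .715, 'A1|E2A1': .541, 'A1|E1A2': .601,
          'A1|E2A2': .428, 'A1|A1': .645, 'A1|A2': .532,
          'A1|E1': .669, 'A1|E2': .496}
```

Every computed value agrees with the tabled prediction to within rounding.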
A similar analysis of the asymptotic data from the .8 series, which has been reported elsewhere (Estes, 1961b), yields comparable agreement between theoretical and observed trigram proportions. The estimate of c/N for the .8 data is very close to that for the .6 data (.172 vs. .174), but the estimates of c and N (.31 and 1.84, respectively) are both smaller for the .8 data. It appears that the more highly practiced subjects of the .8 series are, on the average, sampling from a smaller population of stimulus patterns and at the same time are less responsive to the reinforcing lights than the more naive subjects of the .6 series.

Since no model can be expected to give a perfect account of fallible data arising from real experiments (as distinguished from the idealized experiments to which the model should apply strictly), it is difficult to know how to evaluate the goodness-of-fit of theoretical to observed values. Statistical tests of goodness-of-fit are sometimes possible (discussions of some tests which may be used in conjunction with stimulus sampling models are given by Suppes and Atkinson, 1960, and by Estes and Suppes, 1962). In practice, however, statistical tests are not entirely satisfactory taken by themselves, for a sufficiently precise test will often indicate significant differences between theoretical and observed values even in cases where the agreement is as close as could reasonably be hoped for. Generally, investigators proceed on a largely intuitive basis, evaluating the fit in a given instance against that which it appears reasonable to hope for in the light of what is known about the precision of experimental control and measurement. Once a degree of descriptive accuracy has been attained which appears satisfactory to investigators familiar with the given area, further progress must come largely via differential tests of alternative models.
In the case of the two-choice noncontingent situation, the ingredients for one such test are immediately at hand, for we developed in Sec. 2.3 a one-element, guessing-state model that is comparable to the N-element model with respect to the number of free parameters, and which to many might seem equally plausible on psychological grounds. These models both embody the all-or-none assumption concerning the formation of learned associations, but they differ in the means by which they escape the deterministic features of the simple one-element model (Sec. 2). It will be recalled that the one-element model cannot handle the sequential statistics considered in this section because it requires a probability of unity for response A_i on any trial following a trial on which A_i occurred and was reinforced. In the N-element model (with N > 1) there is no strict determinacy, for the stimulus pattern present on the preceding reinforced trial may be replaced on the following trial by another pattern, possibly conditioned to a different response. In the guessing-state model there is no such constraint: the A_i response may occur on the reinforced trial by guessing, if the subject is in state C_0, and, if the reinforcement was not effective, a different response may occur, again through guessing, on the following trial. The case of the guessing-state model with c′ = 0 (c″ being the counterconditioning parameter) provides a two-parameter model which may be compared with the two-parameter, N-element model.
To apply this model to the asymptotic data of the .6 series, we may first evaluate the parameter ε by setting the observed proportion of A_1 responses over the terminal 100 trials, .593, equal to the right-hand side of Eq. 21 and solving for ε. We will also require an expression for at least one of the trigram proportions that have been studied above in connection with the N-element model; let us take Pr(A_{1,n+1}E_{1,n}A_{1,n}) for this purpose. In Sec. 2.3 we obtained an expression for Pr(A_{1,n+1}|E_{1,n}A_{1,n}) for the case with c′ = 0, and thus we can write at once

Pr(A_{1,n+1}E_{1,n}A_{1,n}) = Pr(A_{1,n+1}|E_{1,n}A_{1,n}) π Pr(A_{1,n}).   (36a)

Since we are interested only in the asymptotic case, we shall drop the n subscript from the right-hand side of Eq. 36a and so have for the desired theoretical asymptotic expression

P_111 = π Pr(A_1|E_1A_1) Pr(A_1).   (36b)
probability learning.0775 c" • Since the observed value of Pll. we have immediately an expression for the .52 +. i t is apparent that no matter what value (in the admissible range chosen for the parameter 0 < c" < 1) is c".249.6 data is ..2E) .A. whereas the N~element sampling model merits further investigation. using the methods illustrated above. the value predicted from the guessing state model will be too large. 78 "(.. makes it clear that for no combination of parameter estimates can the guessing state model achieve predictive accuracy comparable to that demonstrated for the N element model in Table 3.l for the . Although this one comparison cannot be considered decisive.) . + (1. and simplifying. Further analysis. By letting "11 = "21 =" in E'l. 28. Now introducing this value for obtain the prediction € into E'l.2782 + . 36c. Mean and variance of response proportion.24E and . one might be inclined to suspect that..6 + .~ ] ·593 .6(. for interpretation of twochoice. the notion of a reaccessible guessing state is on the wrong track. we Plll = . and E.
In the noncontingent case, then, Eq. 28 yields

Pr(A_{1,n}) = π − [π − Pr(A_{1,1})](1 − c/N)^{n−1}.   (37)

If we define a response random variable A_n which equals 1 or 0 according as A_1 or A_2 occurs on trial n, then the right side of Eq. 37 also represents the expectation of this random variable on trial n. The expected number of A_1 responses in a series of K trials is, then, given by the summation of Eq. 37 over trials,

E(Σ_{n=1}^{K} A_n) = Σ_{n=1}^{K} E(A_n) = Kπ − [π − Pr(A_{1,1})](N/c)[1 − (1 − c/N)^K].   (38)

In experimental applications, one is frequently interested in the learning curve obtained by plotting the proportion of A_1 responses per K-trial block. A theoretical expression for this learning function is readily obtained by an extension of the method used to derive Eq. 38. Let x be the ordinal number of a K-trial block running from trial K(x−1) + 1 to Kx, where x = 1, 2, ..., and define P(x) as the proportion of A_1 responses in block x. Then

P(x) = π − (1/K)[π − Pr(A_{1,1})](N/c)[1 − (1 − c/N)^K](1 − c/N)^{K(x−1)}.   (39a)
The value of Pr(A_{1,1}) should be in the neighborhood of .5 if response bias does not exist. However, to allow for sampling deviations we may eliminate Pr(A_{1,1}) in favor of the observed value of P(1). This can be done in the following way. Setting x = 1 in Eq. 39a and solving, we obtain

(1/K)[π − Pr(A_{1,1})](N/c)[1 − (1 − c/N)^K] = π − P(1),

and substituting the result in Eq. 39a yields

P(x) = π − [π − P(1)](1 − c/N)^{K(x−1)}.   (39b)

Applications of Eq. 39b to data have led to results that are satisfying in some respects but perplexing in others (see, for example, Estes, 1959b, 1961b, 1962). In most instances the implication that the learning curve should have π as an asymptote has been borne out (Estes, 1961b, 1962). However, in experiments run with naive subjects, as has been nearly always the case, the value of c/N required to fit the mean learning curve has been substantially smaller than the value required to handle the sequential statistics discussed in the preceding section. Consider, for example, the learning curve for the .6 series plotted by 20-trial blocks. The observed value of P(1) is .48, and the value of c/N estimated from the sequential statistics of the second 20-trial block is .12. With these parameter values, Eq. 39b yields predictions of .59 for P(2) and .60 for P(3), and the theoretical curve is essentially at asymptote from block 4 on.
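Equation 39b makes these block predictions one-liners. The check below (ours) reproduces the .6-series values just quoted, and also a .8-series first-block prediction from Eq. 39a with Pr(A_{1,1}) taken as .5 (our choice of unbiased initial value):

```python
def P_block(x, pi, P1, c_over_N, K):
    """Eq. 39b: expected A1 proportion in K-trial block x."""
    return pi - (pi - P1) * (1 - c_over_N) ** (K * (x - 1))

# .6 series: 20-trial blocks, P(1) = .48, c/N = .12
p2 = P_block(2, 0.6, 0.48, 0.12, 20)
p3 = P_block(3, 0.6, 0.48, 0.12, 20)

# .8 series: Eq. 39a for the first 12-trial block, Pr(A1,1) = .5, c/N = .17
pi, p_init, k, K = 0.8, 0.5, 0.17, 12
p1_8 = pi - (1 / K) * (pi - p_init) * (1 / k) * (1 - (1 - k) ** K)
```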
The empirical learning curve, however, does not approach .59 until block 6 and is still short of asymptote at the end of 12 blocks, the mean proportion of A_1 responses over the last five blocks being .593 (Suppes and Atkinson, 1960, p. 197); thus the theoretical curve runs appreciably above the empirical curve over much of the series.

In the case of the .8 series there is a similar disparity between the value of c/N estimated from the sequential statistics and the value estimated from the mean learning curve. An optimal account of the trigram proportions averaged over the first 12 trials requires a c/N value of approximately .17. But if this estimate is substituted into Eq. 39a, the predicted A_1 frequency in the first block of 12 trials is .67, compared to an observed value of .63, and the theoretical curve runs appreciably above the empirical curve for another five blocks. A c/N value of .06 yields a satisfactory graduation of the observed mean curve in terms of Eq. 39a, and a fit to the trigrams that does not look bad by usual standards for prediction in learning experiments. However, comparing predictions based on the two c/N estimates for the trigrams which contain this parameter, we see that the estimate of .17 is distinctly superior; the result is

                          P_112    P_121    P_212    P_221
Observed                  .168     .061     .121     .062
Theoretical, c/N = .17    .177     .073     .119     .053
Theoretical, c/N = .06    .144     .087     .152     .039
The reason for this discrepancy in the value of c/N required to give optimal descriptions of two different aspects of the data is not clear even after much investigation. One contributing factor might be individual differences in learning rates (c/N values) among subjects, for these would be expected to affect the two types of statistics differently. However, in the case of the .8 series, the disparity, although somewhat reduced, is not eliminated when a more homogeneous subgroup of subjects (the middle 50% on total A_1 frequency) is analyzed; optimal c/N values for the mean curve and the trigram statistics are now .08 and .15, respectively. For the first three blocks the observed proportions for this subgroup were .633, .657, and .665, whereas the proportions predicted from Eq. 39b with c/N = .15 run .633, .779, and .800. The principal source of the remaining discrepancy is thus a much smaller increment in A_1 frequency from the first to the second 12-trial block than was predicted. A possible explanation is that in the early part of the series the subjects are responding to cues, perhaps verbal in character, which are discarded (i.e., are not resampled) when they fail to elicit consistently correct responding. An interpretation of this sort could be incorporated into the model and subjected to formal testing, but this has not yet been done. In any event, one can see that analyses of data in terms of a model enable us to determine precisely which aspects of the subjects' behavior are and which are not accounted for in terms of a particular set of assumptions.

Next to the mean learning curve, perhaps the most frequently used behavioral measure in learning experiments is the variance of response occurrences in a block of trials.
Predicting this variance from a theoretical model is an exceedingly taxing assignment, for the effects of individual differences in learning rate, together with those of all sources of experimental error not represented in the model, must be expected to increase the observed response variance. For simplicity, we shall limit consideration here to the case of the variance of A_1 response frequency in a trial block after the mean curve has reached asymptote. As a preliminary to computation of the variance, we require a statistic which is also of interest in its own right: the covariance of A_1 responses on any two trials. This statistic is relatively easy to compute for the pattern model, and the derivation may serve as a prototype for derivations of similar expressions in other learning models. First, we can establish by induction that (using the notation of Eq. 25)

Pr(A_{1,n+k}A_{1,n}) = πPr(A_{1,n}) + [Pr(A_{1,n+1}A_{1,n}) − πPr(A_{1,n})](1 − c/N)^{k−1}.   (40)

Equation 40 is obviously an identity for k = 1. Assuming that the formula holds for trials n and n+k, we may proceed to establish it for trials n and n+k+1. First we use our standard procedure of expanding the desired quantity in terms of reinforcing events and states of conditioning.
Letting C_{j,n+k} denote the state in which exactly j elements are conditioned to response A_1, we may write

Pr(A_{1,n+k+1}A_{1,n}) = Σ_{i,j} Pr(A_{1,n+k+1}|E_{i,n+k}C_{j,n+k}A_{1,n}) Pr(E_{i,n+k}|C_{j,n+k}A_{1,n}) Pr(C_{j,n+k}A_{1,n}).

Now we can make use of the assumptions that specify the noncontingent case to simplify the second factor: Pr(E_{i,n+k}|C_{j,n+k}A_{1,n}) is simply π for i = 1 and 1−π for i = 2. Also, we may apply the learning axioms to the first factor, obtaining

Pr(A_{1,n+k+1}|E_{1,n+k}C_{j,n+k}A_{1,n}) = j/N + (c/N)(1 − j/N) = (1 − c/N)(j/N) + c/N

and

Pr(A_{1,n+k+1}|E_{2,n+k}C_{j,n+k}A_{1,n}) = j/N − (c/N)(j/N) = (1 − c/N)(j/N);

for given E_1 the sampled element, if not already conditioned to A_1, becomes so conditioned with probability c, while given E_2 a sampled element conditioned to A_1 is lost with probability c. Combining these results, and noting that Σ_j (j/N) Pr(C_{j,n+k}A_{1,n}) = Pr(A_{1,n+k}A_{1,n}), we have

Pr(A_{1,n+k+1}A_{1,n}) = (1 − c/N) Pr(A_{1,n+k}A_{1,n}) + (πc/N) Pr(A_{1,n}).
Substituting into this expression in terms of our inductive hypothesis yields

Pr(A_{1,n+k+1}A_{1,n}) = (1 − c/N){πPr(A_{1,n}) + [Pr(A_{1,n+1}A_{1,n}) − πPr(A_{1,n})](1 − c/N)^{k−1}} + (πc/N)Pr(A_{1,n})
= πPr(A_{1,n}) + [Pr(A_{1,n+1}A_{1,n}) − πPr(A_{1,n})](1 − c/N)^k,

as required. We wish now to take the limit of the right side of Eq. 40 as n → ∞ in order to obtain the covariance of the response random variable on any two trials at asymptote. The limit of Pr(A_{1,n}) we know to be equal to π, and for the limit of Pr(A_{1,n+1}A_{1,n}) we have, from Eq. 35a, the expression

lim_{n→∞} Pr(A_{1,n+1}A_{1,n}) = π[π + (1−c)(1−π)/N] = π² + π(1−π)(1−c)/N.

Making the appropriate substitutions in Eq. 40, we obtain

lim_{n→∞} Cov(A_n, A_{n+k}) = lim_{n→∞} [Pr(A_{1,n+k}A_{1,n}) − Pr(A_{1,n+k})Pr(A_{1,n})] = [π(1−π)(1−c)/N](1 − c/N)^{k−1}.   (41)

Now we are ready to compute the variance of A_1 response frequencies in a block of K trials at asymptote, by applying the standard theorem for the variance of a sum of random variables (Feller, 1957):

Var(Σ_{n=1}^{K} A_n) = Σ_{n=1}^{K} Var(A_n) + 2 Σ_{i=1}^{K−1} Σ_{j=i+1}^{K} Cov(A_i, A_j).
Since lim_{n→∞} Var(A_n) = lim_{n→∞} Pr(A_{1,n})[1 − Pr(A_{1,n})] = π(1−π), the limiting variance of A_n is simply π(1−π). Substituting this result and that for the covariance, Eq. 41, into the general expression for the variance of the sum, we obtain

Var(Σ_{n=1}^{K} A_n) = Kπ(1−π) + 2 Σ_{i=1}^{K−1} Σ_{j=i+1}^{K} [π(1−π)(1−c)/N](1 − c/N)^{j−i−1}
= Kπ(1−π) + 2π(1−π)[(1−c)/c]{K − (N/c)[1 − (1 − c/N)^K]}.   (42)

Application of this formula can be conveniently illustrated in terms of the asymptotic data for the .8 series. Least-squares determinations of c and N from the trigram proportions (using Eqs. 34a-d) yielded estimates of .31 and 1.84, respectively (Estes and Suppes, 1962). Inserting these values into Eq. 42, we obtain, for a 48-trial block at asymptote, a variance of 37.50; this corresponds to a standard deviation of 6.12. The observed standard deviation for the final 48-trial block was 6.84. Thus, the theory predicts a variance of the right order of magnitude but, as anticipated, underestimates the observed value.
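The numerical step from Eq. 42 to the reported standard deviation can be retraced directly. The check below (ours) reproduces the reported values to within rounding of the parameter estimates:

```python
def var_block(K, pi, c, N):
    """Eq. 42: asymptotic variance of A1 response frequency in a
    K-trial block for the N-element pattern model."""
    rho = 1 - c / N
    return (K * pi * (1 - pi)
            + 2 * pi * (1 - pi) * ((1 - c) / c)
              * (K - (N / c) * (1 - rho ** K)))

v = var_block(48, 0.8, 0.31, 1.84)   # parameter values as in the text
sd = v ** 0.5
```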
Of the many other statistics that can be derived from the model for two-choice N-element learning data, we shall give one final example, selected primarily for the purpose of reviewing the technique for deriving sequential statistics. This technique is so generally useful that the major steps should be emphasized: first, expand the desired expression in terms of the conditioning states (as was done, for example, in the case of Eq. 30); second, conditionalize responses and reinforcing events on the preceding sequence of events; third, apply the axioms and simplify, introducing whatever simplifications are permitted by the boundary conditions of the case under consideration. These steps will now be followed in deriving an expression of considerable interest in its own right: the probability of an A_1 response following a sequence of exactly ν E_1 reinforcing events:
Pr(A_{1,n+ν}|E_{1,n+ν−1}···E_{1,n}E_{2,n−1})

= Pr(A_{1,n+ν}E_{1,n+ν−1}···E_{1,n}E_{2,n−1}) / Pr(E_{1,n+ν−1}···E_{1,n}E_{2,n−1})

= Σ_{i,j} Pr(A_{1,n+ν}C_{j,n+ν}E_{1,n+ν−1}···E_{1,n}E_{2,n−1}C_{i,n−1}) / π^ν(1−π)

= Σ_{i,j} Pr(A_{1,n+ν}|C_{j,n+ν}···) Pr(C_{j,n+ν}|E_{1,n+ν−1}···E_{2,n−1}C_{i,n−1}) Pr(E_{1,n+ν−1}···E_{2,n−1}|C_{i,n−1}) Pr(C_{i,n−1}) / π^ν(1−π)

= Σ_{i,j} (j/N) Pr(C_{j,n+ν}|E_{1,n+ν−1}···E_{2,n−1}C_{i,n−1}) π^ν(1−π) Pr(C_{i,n−1}) / π^ν(1−π)

= Σ_i {1 − [1 − (i/N)(1 − c/N)](1 − c/N)^ν} Pr(C_{i,n−1})

= 1 − [1 − p_{n−1}(1 − c/N)](1 − c/N)^ν,   (43)

where p_{n−1} = Pr(A_{1,n−1}) = Σ_i (i/N) Pr(C_{i,n−1}).
The derivation has a formidable appearance, mainly because we have spelled out the steps in more than customary detail, but each step can readily be justified. The first step involves simply the definition of a conditional probability, Pr(A|B) = Pr(AB)/Pr(B), together with the fact that, in the simple noncontingent case, Pr(E_1) = π and Pr(E_2) = 1 − π on every trial, so that the denominator is π^ν(1−π). The second step introduces the conditioning states C_{j,n+ν} and C_{i,n−1}, denoting the states in which j and i elements, respectively, are conditioned to A_1 on trial n+ν and on trial n−1. Their insertion into the right-hand expression of line one is permissible since the summation of Pr(C_{j,n+ν}|···) over all values of j is unity, and similarly for the summation of Pr(C_{i,n−1}|···) over i. The third step is based solely on repeated application of the defining equation for a conditional probability, which permits the expansion Pr(ABC···J) = Pr(A|BC···J)Pr(B|C···J)···Pr(J). The fourth step involves two assumptions of the model. The conditionalization of A_{1,n+ν} on the preceding sequence can be reduced to Pr(A_{1,n+ν}|C_{j,n+ν}) = j/N since, according to the theory, the preceding history affects response probability on a given trial only insofar as it determines the state of conditioning; i.e., for all n, Pr(A_{1,n}|C_{j,n}···) = j/N. The decomposition of Pr(E_{1,n+ν−1}···E_{1,n}E_{2,n−1}|C_{i,n−1}) into π^ν(1−π) is justified by the special assumptions of the simple noncontingent case. The fifth step involves calculating, for each value of j, the probability that the process starting with state C_{i,n−1} on trial n−1 terminates in state C_{j,n+ν} on trial n+ν; summing (j/N) over j then gives the expected proportion of elements conditioned to A_1 on trial n+ν. There are two main branches to the process starting in C_{i,n−1}. In one, which has probability c, the E_2 event on trial n−1 is effective; the expected proportion of elements connected to A_1 on trial n is then (i/N)(1 − 1/N), and application of Eq. 37 with π = 1 over the run of E_1 trials yields the expression

1 − [1 − (i/N)(1 − 1/N)](1 − c/N)^ν

for the expected proportion of elements connected to A_1 on trial n+ν in this branch.
In the other branch, which has probability 1−c, the E_2 event on trial n−1 is ineffective; the expected proportion of elements connected to A_1 on trial n is then simply i/N, and application of Eq. 37 with π = 1 yields

1 − (1 − i/N)(1 − c/N)^ν

for the expected proportion in this branch. Combining the two branches and carrying out the summation over i, using the by now familiar property of the model that Σ_i (i/N) Pr(C_{i,n−1}) = Pr(A_{1,n−1}) = p_{n−1}, we finally arrive at the desired expression, Eq. 43, for the probability of A_1 following exactly ν E_1's.

Application of Eq. 43 can conveniently be illustrated in terms of the .8 series. Using the estimate of .17 for c/N (obtained previously from the trigram statistics) and taking p_{n−1} = .83 (the mean proportion of A_1 responses over the last 96 trials of the .8 series), we can compute the following values for the conditional response proportions:

ν:             0       1       2       3       4
Theoretical:  .689    .742    .786    .822    .852
Observed:     .695     —      .838    .859    .897

It can be seen that the trend of the theoretical values represents quite well the trend of the observed proportions over the last 96 trials. Somewhat surprisingly, the observed proportions run slightly above the predicted values.
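The theoretical row of the small table above comes straight from Eq. 43. A check (ours):

```python
def p_after_run(nu, p_prev, c_over_N):
    """Eq. 43: probability of A1 after exactly nu consecutive E1's
    preceded by an E2, given A1 probability p_prev before the run."""
    rho = 1 - c_over_N
    return 1 - (1 - p_prev * rho) * rho ** nu

# Parameter values from the text: c/N = .17, p = .83 (.8 series)
theoretical = [round(p_after_run(nu, 0.83, 0.17), 3) for nu in range(5)]
```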
There is no indication here of the "negative recency effect" (decrease in the A_1 proportion with increasing length of the E_1 sequence) reported in a number of published two-choice studies (e.g., Jarvik, 1951; Nicks, 1959). It may be significant that no negative recency effect is observed in the .8 series, which, it will be recalled, involved well-practiced subjects who had had experience with a wide range of π values in preceding series; the effect is observed in the .6 series, conducted with subjects new to this type of experiment (cf. Suppes and Atkinson, 1960, pp. 212-213). This differential result appears to support the idea (Estes, 1962) that the negative recency phenomenon is attributable to guessing habits carried over from everyday life to the experimental situation and extinguished during a long training series conducted with noncontingent reinforcement.

We shall conclude our analysis of the N-element pattern model by proving a very general "matching theorem." The substance of this theorem is that, so long as either an E_1 or an E_2 reinforcing event occurs on each trial, the proportion of A_1 responses for any individual subject should tend to match the proportion of E_1 events over a sufficiently long series of trials, regardless of the reinforcement schedule. For purposes of this derivation, we shall identify by a subscript x the probabilities and events associated with the individual subject x; thus p_{x1,n} will denote the probability of an A_1 response by subject x on trial n, and E_{x1,n} and A_{x1,n} will denote random variables which take on the values 1 or 0 according as an E_1 event and an A_1 response, respectively, do or do not occur on trial n in this subject's protocol.
A. andE. 92event and an on trial by subject n x
A
l
response do or do not occur in this subject's protocol
With this notation, the probability of an A1 response on trial n+1 can be expressed by the recursion

   p_{x1,n+1} = p_{x1,n} + (c/N)(E_{x1,n} - A_{x1,n}) .        (44)

The genesis of Eq. 44 should be reasonably obvious if we recall that p_{x1,n} is equal to the proportion of stimulus patterns currently conditioned to the A1 response. This proportion can change only if an E1 event occurs on a trial when a pattern conditioned to A2 is sampled, in which case E_{x1,n} - A_{x1,n} = 1 - 0 = 1, or if an E2 event occurs on a trial when a pattern conditioned to A1 is sampled, in which case E_{x1,n} - A_{x1,n} = 0 - 1 = -1. In the former case the proportion of patterns conditioned to A1 increases by 1/N if conditioning is effective (which has probability c), and in the latter case this proportion decreases by 1/N (again with probability c).

Considering now a series of, say, n* trials, we can convert Eq. 44 into an analogous recursion for response proportions over the series simply by summing both sides over n and dividing by n*, viz.

   (1/n*) sum_{n=1}^{n*} p_{x1,n+1} = (1/n*) sum_{n=1}^{n*} p_{x1,n} + (c/N)(1/n*) sum_{n=1}^{n*} (E_{x1,n} - A_{x1,n}) .

Now we subtract the first sum on the right from both sides of the equation and distribute the second sum on the right, yielding

   (p_{x1,n*+1} - p_{x1,1})/n* = (c/N)[(1/n*) sum_{n=1}^{n*} E_{x1,n} - (1/n*) sum_{n=1}^{n*} A_{x1,n}] .
The limit of the left side of this last equation is obviously zero as n* approaches infinity; thus, taking the limit and rearranging, we have⁷

   lim_{n*->inf} (1/n*) sum_{n=1}^{n*} A_{x1,n} = lim_{n*->inf} (1/n*) sum_{n=1}^{n*} E_{x1,n} .        (45)
⁷Equation 45 holds only if the two limits exist, which will be the case if the reinforcing event on trial n depends at most on the outcomes of some finite number of preceding trials. When this restriction is not satisfied, a substantially equivalent theorem can be derived simply by dividing both sides of the equation immediately preceding by (1/n*) sum_{n=1}^{n*} E_{x1,n} before passing to the limit; that is,

   (p_{x1,n*+1} - p_{x1,1}) / sum_{n=1}^{n*} E_{x1,n} = (c/N)[1 - (sum_{n=1}^{n*} A_{x1,n})/(sum_{n=1}^{n*} E_{x1,n})] .

Except for special cases in which the sum in the denominators converges, the limit of the left-hand side is zero and

   lim_{n*->inf} (sum_{n=1}^{n*} A_{x1,n})/(sum_{n=1}^{n*} E_{x1,n}) = 1 .
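The matching property of Eq. 45 is easy to exhibit by simulation. The sketch below runs a single simulated subject through the pattern model; the values of N, c, the trial count, and the simple noncontingent schedule (E1 with probability .7) are illustrative assumptions, not values from the text:

```python
import random

def simulate_subject(trials=20000, N=8, c=0.3, pi=0.7, seed=1):
    """Pattern model for one subject: returns (A1 proportion, E1 proportion)."""
    rng = random.Random(seed)
    k = N // 2                              # patterns currently conditioned to A1
    a1 = e1 = 0
    for _ in range(trials):
        response_is_a1 = rng.random() < k / N   # sampled pattern's response
        event_is_e1 = rng.random() < pi         # noncontingent E1 schedule
        a1 += response_is_a1
        e1 += event_is_e1
        if rng.random() < c:                    # conditioning effective
            if event_is_e1 and not response_is_a1:
                k += 1                          # sampled A2-pattern joins A1
            elif response_is_a1 and not event_is_e1:
                k -= 1                          # sampled A1-pattern joins A2
    return a1 / trials, e1 / trials

a_prop, e_prop = simulate_subject()
print(a_prop, e_prop)    # the two proportions agree closely
```

Because Eq. 45 holds regardless of the schedule, replacing the noncontingent rule for event_is_e1 by any history-dependent rule leaves the agreement intact.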
To appreciate the strength of this prediction one should note that it holds for the data of an individual subject starting at any arbitrarily selected point in a learning series, provided only that a sufficiently long block of trials following that point is available for analysis. Further, it holds regardless of the values of the parameters N and c (provided that the latter is not zero) and regardless of the way in which the schedule of reinforcement may depend on preceding events, the trial number, the subject's behavior, or even events outside the system (e.g., the behavior of another individual in a competitive or cooperative social situation). Examples of empirical applications of this theorem under a variety of reinforcement schedules are to be found in studies reported by Estes (1957a) and Friedman et al. (1963).

3.3 Analysis of a Paired Comparison Learning Experiment

In order to exhibit a somewhat different interpretation of the axioms of Sec. 3.1, we shall now analyze an experiment involving a paired-comparison procedure. The experimental situation consists of a sequence of discrete trials. There are r objects, denoted A_i (i = 1 to r). On each trial two (or more) of these objects are presented to the subject and he is required to choose between them. Once his response has been made the trial terminates with the subject winning or losing a fixed amount of money. The subject's task is to win as frequently as possible. There are many aspects of the situation that can be manipulated by the experimenter; for example, the strategy by which the experimenter makes available certain subsets of objects from which the subject must choose,
the schedule by which the experimenter determines whether the selection of a given object leads to a win or loss, and the amount of money won or lost on each trial.

The particular experiment for which we shall essay a theoretical analysis was reported by Suppes and Atkinson (1960, Ch. 11). The problem for the subjects involved repeated choices from subsets of a set of three objects, which may be denoted A1, A2, and A3. The four presentation sets (A1 A2), (A1 A3), (A2 A3), and (A1 A2 A3) occurred with equal probabilities over the series of trials. On each trial the subject selected one of the objects in the presentation set; then the trial terminated with a win or a loss of a small sum of money. Further, if object A_i was selected on a trial, then with probability lambda_i the subject lost, and with probability 1 - lambda_i he won the predesignated amount. More complex schedules of reinforcement could be used; of particular interest is a schedule where the likelihood of a win following the selection of a given object depends on the other available objects in the presentation group. For example, the probability of a win following an A1 choice could differ depending on whether the presentation group (A1 A2), (A1 A3), or (A1 A2 A3) occurred. The analysis of these more complex schedules does not introduce new mathematical problems and may be pursued by the same methods we shall use for the simpler case.

Before the axioms of Sec. 3.1 can be applied to the present experiment we need to provide an interpretation of the stimulus situation
confronting the subject from trial to trial. The one we select is somewhat arbitrary, and in Part 4 alternative interpretations are examined. Of course, discrepancies between predicted and observed quantities will indicate ways in which our particular analysis of the stimulus needs to be modified.

We shall represent the stimulus display associated with the presentation of the pair of objects (A_i A_j) by a set S_ij of stimulus patterns of size N; the triple of objects (A1 A2 A3) will be represented by a set S_123 of stimulus patterns of size N*. Thus there are four sets of stimulus patterns, and we assume that the sets are pairwise disjoint (i.e., have no patterns in common). Since, in the model under consideration, the stimulus element sampled on any trial represents the full pattern of stimulation effective on the trial, one might wonder why a given combination of objects, say (A1 A2), should have more than one element associated with it. It might be remarked in this connection that in introducing a parameter N to represent set size, we do not necessarily assume N > 1; we simply allow for the possibility that variations in the situation or different orders of presentation of the same set of objects on different trials might give rise to different stimulus patterns. The assumption that the stimulus patterns associated with the different presentation sets are pairwise disjoint does not seem appealing on common-sense grounds; nevertheless, it is of interest to see how far we can go in predicting the data of a paired-comparison learning experiment with the simplified model incorporating this highly restrictive assumption. Even though we cannot attempt to handle the positive and
negative transfer effects that must occur during learning between members of the sets of patterns associated with different combinations of objects, we may hope to account for statistics of asymptotic data.

The final step, before applying the axioms of Sec. 3.1, is to provide an interpretation of reinforcing events. When the pair of objects (A_i A_j) is presented, the subject samples a single pattern from S_ij and makes the response to which the pattern is conditioned; since he must select A_i or A_j, all patterns in S_ij become conditioned to A_i or A_j responses. Similarly, all patterns in S_123 become conditioned to A1, A2, or A3. Our analysis is as follows. For the pair sets: (a) if (A_i A_j) is presented and the A_i object is selected, then the E_i reinforcing event occurs if the A_i response is followed by a win, and (b) the E_j event occurs if the A_i response is followed by a loss. For the triple set: if (A_i A_j A_k) is presented and the A_i response is followed by a win, then the E_i event occurs; if the A_i response is followed by a loss, then E_j or E_k occurs, the two events having equal probabilities. This collection of rules represents only one way of relating the observable trial outcomes to the hypothetical reinforcing events. For example, when A_i is selected and followed by a loss, rather than having E_j and E_k occur with equal likelihoods, one might postulate that they occur with probabilities dependent on the ratio of wins following A_j responses to wins following A_k responses over previous trials. Many such variations in the rules of correspondence between trial outcomes and reinforcing
events have been explored; these variations become particularly important when the experimenter manipulates the amount of money won or lost, the magnitude of reward in animal studies, and related variables (see Estes, 1962; Suppes and Atkinson, 1960, Ch. 11, for discussions of this point). In analyzing the model we shall use the following notation:

   A^(ij)_{i,n} = occurrence of an A_i response on the nth presentation of (A_i A_j) [note that the reference is not to the nth trial of the experiment but to the nth presentation of (A_i A_j)];
   W^(ij)_n = a win on the nth presentation of (A_i A_j);
   L^(ij)_n = a loss on the nth presentation of (A_i A_j).

We now proceed to derive the probability of an A_i response on the nth presentation of (A_i A_j), namely Pr(A^(ij)_{i,n}). First we note that the state of conditioning of a stimulus pattern can change only when it is sampled. Since all of the sets of stimulus patterns are pairwise disjoint, the sequence of trials on which (A_i A_j) is presented forms a learning process that may be studied independently of what happens on other trials (see Axiom C4); that is, the interspersing of other types of trials between the nth and n+1st presentations of (A_i A_j) has no effect on the conditioning of patterns in set S_ij. We now want to obtain a recursive expression for Pr(A^(ij)_{i,n}). This can be done by using the same methods employed in the preceding section, but to illustrate another approach we proceed differently in this case.
Let Y_n = Pr(A^(ij)_{i,n}); that is, Y_n is the expected proportion of patterns in S_ij conditioned to A_i on the nth presentation. The possible changes in Y_n are given in Figure 6.

Insert Figure 6 about here

With probability 1 - c no change occurs in conditioning regardless of trial events, and hence Y_{n+1} = Y_n; with probability c a change can occur. If A_i occurs and is followed by a win, then the sampled pattern remains conditioned to A_i and Y_{n+1} = Y_n; however, if it is followed by a loss, the sampled pattern (which was conditioned to A_i) becomes conditioned to A_j, and thus Y_{n+1} = Y_n - 1/N. Similarly, if A_j occurs and is followed by a win then Y_{n+1} = Y_n; if it is followed by a loss, the sampled pattern (which was conditioned to A_j) becomes conditioned to A_i, hence Y_{n+1} = Y_n + 1/N. Putting these results together we have

   Y_{n+1} = Y_n(1 - c lambda_i /N) + (1 - Y_n)(c lambda_j /N) ,

which simplifies to the expression

   Y_{n+1} = Y_n[1 - (c/N)(lambda_i + lambda_j)] + (c/N) lambda_j .        (46)

Solving this difference equation, we obtain

   Y_n = lambda_j/(lambda_i + lambda_j) - [lambda_j/(lambda_i + lambda_j) - Y_1][1 - (c/N)(lambda_i + lambda_j)]^(n-1) .        (47)
Fig. 6. Branching process for a diad probability on a paired comparison learning trial.
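The approach of Y_n to its asymptote lambda_j/(lambda_i + lambda_j) can be checked numerically. In the sketch below, the lambda values are those of the (A1 A2) pair under the schedule used later in this section, while the value of c/N and the initial probability Y_1 = .5 are illustrative assumptions:

```python
def iterate_y(y1, lam_i, lam_j, c_over_n, n):
    """Apply the recursion of Eq. 46 to obtain Y_n from Y_1."""
    y = y1
    for _ in range(n - 1):
        y = y * (1 - c_over_n * (lam_i + lam_j)) + c_over_n * lam_j
    return y

def closed_form_y(y1, lam_i, lam_j, c_over_n, n):
    """Eq. 47, the closed-form solution of Eq. 46."""
    asym = lam_j / (lam_i + lam_j)
    return asym - (asym - y1) * (1 - c_over_n * (lam_i + lam_j)) ** (n - 1)

y40_iter = iterate_y(0.5, 1/3, 0.6, 0.05, 40)
y40_closed = closed_form_y(0.5, 1/3, 0.6, 0.05, 40)
print(y40_iter, y40_closed)    # identical; both drifting toward 0.643
```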
We now consider Pr(A^(123)_{1,n}). For simplicity let alpha_n = Pr(A^(123)_{1,n}) and beta_n = Pr(A^(123)_{2,n}). The possible changes in alpha_n are given in Figure 7.

Insert Figure 7 about here

For example, on the bottom branch conditioning is effective and an A3 response occurs which leads to a loss; the E1 and E2 events then occur with equal probabilities, and an A3 followed by E1 makes alpha_{n+1} = alpha_n + 1/N*, whereas an A3 followed by E2 makes alpha_{n+1} = alpha_n. Combining the results in this figure yields the following difference equation:

   alpha_{n+1} = alpha_n - (c/N*) lambda_1 alpha_n + (c/2N*) lambda_2 beta_n + (c/2N*) lambda_3 (1 - alpha_n - beta_n) .

Simplifying this result we obtain

   alpha_{n+1} = alpha_n[1 - (c/N*)(lambda_1 + lambda_3/2)] + beta_n (c/2N*)(lambda_2 - lambda_3) + (c/2N*) lambda_3 .        (48a)

By a similar argument we obtain

   beta_{n+1} = beta_n[1 - (c/N*)(lambda_2 + lambda_3/2)] + alpha_n (c/2N*)(lambda_1 - lambda_3) + (c/2N*) lambda_3 .        (48b)
Fig. 7. Branching process for a triad probability on a paired comparison learning trial.
Solutions for the pair of difference equations given by Eqs. 48a and 48b are well known and can be obtained by a number of different techniques (see Goldberg, 1958, pp. 130-133; or Jordan, 1950). Any solution presented can be verified by substituting it into the appropriate difference equations. However, for now we shall limit consideration to asymptotic results. In terms of the Markov chain property of our process it can be shown that the limits

   alpha = lim_{n->inf} alpha_n     and     beta = lim_{n->inf} beta_n

exist.
Letting alpha_{n+1} = alpha_n = alpha and beta_{n+1} = beta_n = beta in Eqs. 48a and 48b, we obtain a pair of linear equations in alpha and beta. Solving for alpha and beta, and rewriting, we have

   lim_{n->inf} Pr(A^(123)_{1,n}) = lambda_2 lambda_3 / (lambda_1 lambda_2 + lambda_1 lambda_3 + lambda_2 lambda_3) ,        (49a)

   lim_{n->inf} Pr(A^(123)_{2,n}) = lambda_1 lambda_3 / (lambda_1 lambda_2 + lambda_1 lambda_3 + lambda_2 lambda_3) ,        (49b)

and

   lim_{n->inf} Pr(A^(123)_{3,n}) = lambda_1 lambda_2 / (lambda_1 lambda_2 + lambda_1 lambda_3 + lambda_2 lambda_3) .        (49c)
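As a check on the algebra, the difference equations 48a and 48b can be iterated to their stationary point and compared with the closed forms of Eqs. 49a-c; the value of c/N* and the uniform starting point below are illustrative assumptions (the lambda values are the schedule used later in this section):

```python
def asymptotes(l1, l2, l3):
    """Eqs. 49a-c: limiting choice probabilities for the triple set."""
    s = l1 * l2 + l1 * l3 + l2 * l3
    return l2 * l3 / s, l1 * l3 / s, l1 * l2 / s

def iterate_triple(l1, l2, l3, c_over_n=0.05, steps=5000):
    """Iterate Eqs. 48a-48b from alpha_1 = beta_1 = 1/3."""
    a = b = 1 / 3
    for _ in range(steps):
        g = 1 - a - b                       # proportion conditioned to A3
        a, b = (a - c_over_n * l1 * a + c_over_n * (l2 * b + l3 * g) / 2,
                b - c_over_n * l2 * b + c_over_n * (l1 * a + l3 * g) / 2)
    return a, b, 1 - a - b

print([round(x, 3) for x in iterate_triple(1/3, 0.6, 0.8)])
print([round(x, 3) for x in asymptotes(1/3, 0.6, 0.8)])
# both print [0.507, 0.282, 0.211]
```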
The other moments of the distribution of response probabilities can be obtained following the methods employed in Sec. 3.1; and, at asymptote, we can write down the entire distribution of conditioning states. In particular, for a pair set S_ij the asymptotic probability that k of the N patterns are conditioned to A_i and N - k to A_j is simply the binomial

   (N choose k) [lambda_j/(lambda_i + lambda_j)]^k [lambda_i/(lambda_i + lambda_j)]^(N-k) .

For the set S_123 the asymptotic probability of k_1 patterns conditioned to A1, k_2 to A2, and k_3 to A3 (where k_1 + k_2 + k_3 = N*) is the corresponding multinomial

   [N*!/(k_1! k_2! k_3!)] t_1^{k_1} t_2^{k_2} t_3^{k_3} ,

where t_1, t_2, and t_3 denote the limiting probabilities given by Eqs. 49a-c.
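The asymptotic state distribution for a pair set can be sketched in a few lines; the set size N = 4 below is an illustrative assumption, and the binomial form reflects the asymptotic independence of the patterns:

```python
from math import comb

def pair_state_distribution(N, lam_i, lam_j):
    """Asymptotic Pr(k of the N patterns in S_ij conditioned to A_i)."""
    t = lam_j / (lam_i + lam_j)        # per-pattern probability
    return [comb(N, k) * t**k * (1 - t) ** (N - k) for k in range(N + 1)]

dist = pair_state_distribution(4, 1/3, 0.6)
print(round(sum(dist), 6))                                   # 1.0
print(round(sum(k * p for k, p in enumerate(dist)) / 4, 3))  # mean/N = 0.643
```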
In analyzing data it also is helpful to examine the marginal limiting probability of an A_i response, Pr(A_i), in addition to the other quantities mentioned above. We define Pr(A_i) as the probability of an A_i response on any trial (regardless of the stimulus display) once the process has reached asymptote. Theoretically,

   Pr(A1) = Pr(A^(12)_1) Pr(D^(12)) + Pr(A^(13)_1) Pr(D^(13)) + Pr(A^(123)_1) Pr(D^(123)) ,

and similarly for Pr(A2) and Pr(A3), where Pr(D^(ij)) is the probability of presenting the pair of objects (A_i A_j) and Pr(D^(123)) the probability of presenting the triple.
The experimental results we consider were reported in preliminary form in Suppes and Atkinson (1960). Two groups were run, each involving 48 subjects; subjects in one group won or lost one cent on each trial, and those in the other group won or lost five cents on each trial. We shall consider only the one-cent group, for an analysis of the differential effects of the two reward values requires a more elaborate interpretation of reinforcing events. Subjects were run for 400 trials with the following reinforcement schedule:

   lambda_1 = 1/3 ,   lambda_2 = 6/10 ,   lambda_3 = 8/10 .

Figure 8 presents the observed proportions of A1, A2, and A3 responses in successive 20-trial blocks.

Insert Figure 8 about here

The three curves appear to be very stable over the last 10 or so blocks; consequently we treat the data over trials 301 to 400 as asymptotic. By Eq. 47 and Eqs. 49a-c we may generate predictions for the asymptotic values of Pr(A^(ij)_i) and Pr(A^(123)_i). Given these values and the fact that the four presentation sets occur with equal probabilities we may, as shown above, generate predictions for the limiting Pr(A_i). The predicted values for these quantities and the observed proportions over the last 100 trials are presented in Table 4.
Fig. 8. Observed proportions of A1, A2, and A3 responses in successive 20-trial blocks for the paired comparison experiment.
The correspondence between predicted and observed values is very good, particularly for Pr(A_i) and Pr(A^(ij)_i).

Insert Table 4 about here

The largest discrepancy is for the triple presentation set, where we note that the observed value of Pr(A^(123)_1) is .041 above the predicted value of .507. The statistical problem of determining whether or not this particular difference is significant is a complex matter and we do not undertake it here. However, it should be noted that similar discrepancies have been found in other studies dealing with three or more responses (see Detambel, 1955; Gardner, 1957), and it may be necessary, in subsequent developments of the theory, to consider some reinterpretation of reinforcing events in the multiple response case.

In order to make predictions for more complex aspects of the data it is necessary to obtain estimates of c, N, and N*. Estimation procedures of the sort referred to in Sec. 3.2 are applicable, but the analysis becomes tedious and such details are not appropriate here. However, some comparisons can be made between sequential statistics that do not depend on parameter values. For example, certain nonparametric comparisons can be made between statistics where each individually depends on the parameters c and N, but where the difference is independent of these parameters. Such comparisons are particularly helpful when they permit us to discriminate among different models without introducing the complicating factor of having to estimate parameters.
Table 4

Theoretical and observed asymptotic choice proportions for the paired-comparison learning experiment.

                        Predicted    Observed
   Pr(A1)                 .464         .473
   Pr(A2)                 .302         .294
   Pr(A3)                 .233         .234
   Pr(A1^(12))            .643         .651
   Pr(A1^(13))            .706         .700
   Pr(A2^(23))            .571         .561
   Pr(A1^(123))           .507         .548
   Pr(A2^(123))           .282         .258
   Pr(A3^(123))           .211         .194
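The predicted column of Table 4 can be reproduced from the loss probabilities lambda = (1/3, 6/10, 8/10) in a few lines; only the weighting of the four equiprobable presentation sets is involved in the marginals (the code itself is a sketch, not part of the original analysis):

```python
lam = {1: 1/3, 2: 0.6, 3: 0.8}          # loss probabilities

def pair(i, j):
    """Limiting Pr(A_i on presentations of (A_i A_j))."""
    return lam[j] / (lam[i] + lam[j])

s = lam[1]*lam[2] + lam[1]*lam[3] + lam[2]*lam[3]
triple = {1: lam[2]*lam[3] / s, 2: lam[1]*lam[3] / s, 3: lam[1]*lam[2] / s}

# Marginal Pr(A_i): average over the four equiprobable presentation sets,
# counting zero for the one set from which A_i is absent.
marginal = {i: (sum(pair(i, j) for j in lam if j != i) + triple[i]) / 4
            for i in lam}

print(round(pair(1, 2), 3), round(pair(1, 3), 3), round(pair(2, 3), 3))
print(round(triple[1], 3), round(marginal[1], 3))
```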
To indicate the types of comparisons that are possible, we may consider the subsequence of trials on which (A1 A2) is presented and, in particular, the probability of an A1 response on the n+1st presentation of (A1 A2), given that on the nth presentation an A1 occurred and was followed by a win, and that on the n-1st presentation an A2 occurred followed by a win; that is,

   Pr(A^(12)_{1,n+1} | W^(12)_n A^(12)_{1,n} W^(12)_{n-1} A^(12)_{2,n-1}) .

To compute this probability we note that

   Pr(A_{1,n+1} | W_n A_{1,n} W_{n-1} A_{2,n-1}) = Pr(A_{1,n+1} W_n A_{1,n} W_{n-1} A_{2,n-1}) / Pr(W_n A_{1,n} W_{n-1} A_{2,n-1}) ,

where for brevity we henceforth omit the superscript (12). Now our problem is to compute the two quantities on the right-hand side of this equation. We first observe that each joint probability may be expanded by summing over the possible conditioning states, where C^(12)_{i,n} denotes the conditioning state in which i elements of S_12 are conditioned to A1 and N - i to A2 on the nth presentation of (A1 A2). Conditionalizing and applying the axioms, the sampling and response axioms permit the simplifications

   Pr(A^(12)_{1,n} | C^(12)_{i,n}) = i/N     and     Pr(A^(12)_{2,n} | C^(12)_{i,n}) = (N - i)/N .

Finally, in order to carry out the summation, we make use of the relation

   Pr(C^(12)_{j,n+1} | W^(12)_n C^(12)_{i,n}) = 1 for j = i, and 0 for j not equal to i,

which expresses the fact that no change in the conditioning state can occur if the pattern sampled leads to a win (see Axiom C2). Combining these results and simplifying, we have

   Pr(A_{1,n+1} W_n A_{1,n} W_{n-1} A_{2,n-1}) = (1 - lambda_1)(1 - lambda_2) sum_i (i/N)^2 [(N-i)/N] Pr(C_{i,n-1})        (50a)

and

   Pr(W_n A_{1,n} W_{n-1} A_{2,n-1}) = (1 - lambda_1)(1 - lambda_2) sum_i (i/N) [(N-i)/N] Pr(C_{i,n-1}) ;        (50b)

taking the quotient of the last two expressions, we obtain

   Pr(A_{1,n+1} | W_n A_{1,n} W_{n-1} A_{2,n-1}) = sum_i i^2 (N-i) Pr(C_{i,n-1}) / [N sum_i i(N-i) Pr(C_{i,n-1})] .        (50c)

We next consider the same sequential statistic but with the responses reversed on trials n and n-1, namely

   Pr(A_{1,n+1} | W_n A_{2,n} W_{n-1} A_{1,n-1}) .
Interestingly enough, if we compute this latter probability, it turns out to be expressed by the right side of Eq. 50c as well. Hence, for all n,

   Pr(A_{1,n+1} | W_n A_{1,n} W_{n-1} A_{2,n-1}) = Pr(A_{1,n+1} | W_n A_{2,n} W_{n-1} A_{1,n-1}) .        (51)

Equation 51 provides a test of the theory which does not depend on parameter estimates. Further, it is a prediction that differentiates between this model and many other models. For example, in the next section we consider a certain class of linear models; it can be shown that they generate the same predictions for the quantities in Table 4 as the pattern model, but the sequential equality displayed in Eq. 51 does not hold for the linear model. Comparable predictions hold, of course, for the subsequences of trials on which (A1 A3) or (A2 A3) are presented.
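That Eq. 51 is an identity for any distribution over conditioning states can be verified directly; in the sketch below, the set size, the lambda values, and the binomial state distribution are all arbitrary illustrative choices:

```python
from math import comb

N = 6
l1, l2 = 1/3, 0.6                     # loss probabilities (illustrative)
p = 0.37                              # arbitrary per-pattern probability
state = [comb(N, i) * p**i * (1 - p) ** (N - i) for i in range(N + 1)]

# Joint probabilities built up trial by trial; a win leaves the state fixed.
# Order A2-win, A1-win, then A1 (left side of Eq. 51):
num_a = sum(((N-i)/N) * (1-l2) * (i/N) * (1-l1) * (i/N) * state[i] for i in range(N+1))
den_a = sum(((N-i)/N) * (1-l2) * (i/N) * (1-l1) * state[i] for i in range(N+1))
# Order A1-win, A2-win, then A1 (right side of Eq. 51):
num_b = sum((i/N) * (1-l1) * ((N-i)/N) * (1-l2) * (i/N) * state[i] for i in range(N+1))
den_b = sum((i/N) * (1-l1) * ((N-i)/N) * (1-l2) * state[i] for i in range(N+1))

print(num_a / den_a, num_b / den_b)   # the two conditional probabilities agree
```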
To check these predictions we shall utilize the data over all trials of the (A1 A2) subsequence and not restrict the analysis to asymptotic performance. Specifically, for a given subject we define

   s_12 = the number of occurrences of the event W_n A_{1,n} W_{n-1} A_{2,n-1} ;
   s_112 = the number of these occurrences that were followed by an A1 response on presentation n+1 ;
   s_21 = the number of occurrences of the event W_n A_{2,n} W_{n-1} A_{1,n-1} ;
   s_121 = the number of these occurrences that were followed by an A1 response on presentation n+1 .

But by the results just obtained we have, for any given subject, E(s_121) = E(s_112) and E(s_21) = E(s_12), whatever the values of c and N. Further, if we define S_ijk (and S_ij) as the sum of the s_ijk's (and s_ij's) over all subjects, then it follows that the corresponding equalities hold for the summed frequencies, independent of intersubject differences in c and N. Thus we have a set of predictions which are not only nonparametric but which require no restrictive assumptions about variability between subjects. Observed frequencies corresponding to these theoretical quantities are as follows:

   S_121 = 140 ,   S_112 = 138 ;
   S_21 = 243 ,    S_12 = 244 ;
   S_121/S_21 = .576 ,   S_112/S_12 = .566 .
Similarly, for the (A1 A3) subsequence the observed proportions are 67/120 = .558 for one order of the A1 and A3 responses and .525 (based on 122 occurrences of the conditioning event) for the reversed order. Finally, for the (A2 A3) subsequence, S_232/S_32 = 45/82 = .549 and S_223/S_23 = 49/87 = .563.

Further analyses will be required to determine whether the pattern model gives an entirely satisfactory interpretation of paired-comparison learning. It is already apparent, however, that it may be very difficult indeed to find another theory that takes us further in this direction than the pattern model, with equally simple machinery.
4. A COMPONENT MODEL FOR STIMULUS COMPOUNDING AND GENERALIZATION

4.1 Basic Concepts; Conditioning and Response Axioms

In the preceding section we simplified our analysis of learning in terms of the N-element pattern model by assuming that all of the patterns involved in a given experiment are disjoint, or at any rate that generalization effects from one stimulus pattern to another are negligible. Now we shall go to the other extreme and treat problems of simple transfer of training between different stimulus situations that have elements in common. In Sections 2 and 3 we regarded the pattern of stimulation effective on any trial as a single element sampled from a larger set of such patterns; now we shall consider the trial pattern as itself constituting a set of elements, the elements representing the various components or aspects of the stimulus situation which may be sampled in differing combinations on different trials. Again the basic mathematical apparatus will be that of sets and elements, but with a reinterpretation which needs to be clearly distinguished from that of the pattern model. We shall proceed first to give the two basic axioms that establish the dependence of response probability on the conditioning state of the stimulus sample. Then some theorems will be derived that specify relationships between response probabilities in overlapping stimulus samples, and these will be illustrated in terms of applications to experiments on simple stimulus compounding, treated in a purely cross-sectional manner, with no reference to a learning process occurring over trials. Consideration of the process whereby trial samples are drawn from a larger stimulus population will be deferred to Sec. 4.3.
The basic axioms of the component model are as follows:

C1. The sample s of stimulation effective on any trial is partitioned into subsets s_i (i = 1, 2, ..., r, where r is the number of response alternatives), the ith subset containing the elements conditioned to (or "connected to") response A_i.

C2. The probability of response A_i in the presence of the stimulus sample s is given by

   Pr(A_i | s) = N(s_i)/N(s) ,

where N(x) denotes the number of elements in the set x.

In C1 we modify the usual definition of a partition to the extent of permitting some of the subsets to be empty; that is, there may be some response alternatives which are conditioned to none of the elements of s. We do mean to assume, however, that each element of s is conditioned to exactly one response. The substance of C2 is, then, to make the probability that a given response will be evoked by s equal to the proportion of elements of s that are conditioned to that response.

4.2 Stimulus Compounding

An elementary transfer situation arises if one reinforces two responses, each in the presence of a different stimulus sample, then combines all or part of one sample with all or part of the other to form a new test situation. To begin with a special case, let us consider
an experiment conducted in the laboratory of one of the writers (W. K. E.).⁸ The constituent cues, intended to serve as the empirical counterparts of stimulus elements, were various typewriter symbols, which for present purposes we shall designate by small letters a, b, c, etc., and the responses were the numbers "one" and "two," spoken aloud. Instructions to the subjects indicated that the cues represented symptoms and the numbers diseases with which the symptoms were associated. In one stage of the experiment, a number of disjoint samples of three distinct cues drawn from a large population were used as the stimulus members of paired-associate items, and by the usual method of paired presentation one response was reinforced in the presence of some of these samples and a different response in the presence of others. Following the training trials, new combinations of "symptoms" were formed. Suppose that response A1 had been reinforced in the presence of the sample (abc) and response A2 in the presence of the sample (def). If a test trial were given subsequently with the sample (abd), direct application of Axiom C2 yields the prediction that the A1 response should occur with probability 2/3. Similarly, if a test were given

⁸This experiment was conducted at Indiana University with the assistance of Miss Joan SeBreny.
with the sample (ade), response A1 would be predicted to occur with probability 1/3. Results obtained with 40 subjects, each given 24 tests of each type, were as follows:

   Percentage overlap of training and test sets     67      33
   Percentage response 1 to test set               66.9    33.2

Success in bringing off a priori predictions of this sort depends not only on the basic soundness of the theory but also on one's success in realizing various simplifying assumptions in the experimental situation. As mentioned above, it was our intention in designing the experiment just cited to choose cues, a, b, c, etc., which would take on the role of stimulus elements. Actually, in order to justify our theoretical predictions, it was necessary only that the cues behave as equal-sized sets of elements. To bring out the importance of the equal-N assumption, let us suppose that the individual cues actually correspond to sets s_a, s_b, etc., of elements, of sizes N_a, N_b, and so on. Then, given the same training (response A1 reinforced to abc and response A2 to def), application of Axiom C2 yields for the probability of response A1 to the combination abd

   Pr(A1 | s_a s_b s_d) = (N_a + N_b)/(N_a + N_b + N_d) ,

where we have used the obvious abbreviation for the compound sample. This equation reduces to Pr(A1 | s_a s_b s_d) = 2/3 only if N(s_i) = N for all of the cue sets.
and E. and b Suppose. which should deviate in the appropriate direction from 1/2 if N b are not equal. 115In this experiment we depended on commonsense considerations to choose cues which could be expected to satisfy the e'lualN re'luirement.theelements in sa' a proportion . say. response Ai to a and response ~ Ai to to b ab . then tested with the compound abo Probability of response by is) according to the model.A. sb J S corresponds to a collection of stimulus sets N . . one possible to depend on commonsense considerations. assembled for use in further research involving applications of the model. N a and By meanS of calibration experiments of this sort.•• Suppose that a collection of cues s a .we could a have run a preliminary experiment in which we reinforced.To check on this. for example.. c . N . b. The expressions obtained above for probabilities of response to stimulus compounds can readily be sizes and level of training 0 generalize~ with respect both to set a. sets of cues satisfying the equalN assumption can be. given . Sometimes it may not be In that case. we had been in doubt as to whether cues would behave as equalsized sets . can utilize a preliminary experiment to check on the simplifying as sumptions. and also counterbalanced the design of the experiment so that minor deviations might be expected to average out. • a c b c J 0 0 D of sizes and that some response A j is conditioned to a proporof the elements tion of . N .
of the elements in s_b, and so on. Then the probability of response A_j to a compound of these cues is, by Axiom C2, expressed by the relation

   Pr(A_j | s_a s_b s_c ...) = (N_a p_aj + N_b p_bj + N_c p_cj + ...)/(N_a + N_b + N_c + ...) .        (52)

Application of Eq. 52 can be illustrated in terms of a study of probabilistic discrimination learning reported by Estes, Burke, Atkinson, and Frankmann (1957). In this study the individual cues were lights which differed from each other only in their positions on a panel. The first stage of the experiment consisted in discrimination training according to a routine which we shall not describe here except to say that on theoretical grounds it was predicted that at the end of training the proportion of elements in the sample associated with the ith light conditioned to the first of two alternative responses would be given by p_i1 = i/13. Following this training, the subjects were given compounding tests with various triads of lights. Considering, say, the triad of lights 1, 2, and 3, the values of p_i1 should be 1/13, 2/13, and 3/13; assuming N1 = N2 = N3 = N and substituting these values into Eq. 52, we obtain

   (1/13 + 2/13 + 3/13)/3 = 2/13 = .15

as the predicted probability of response 1 to the compound of lights 1, 2, and 3. Theoretical values similarly computed for a number of triads are compared with the empirical test proportions reported by Estes et al. in Table 5.
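Equation 52 can be sketched as a one-line computation; the particular set sizes used below are illustrative assumptions (only the proportions matter when the sets are equal in size):

```python
def compound_probability(cues):
    """Eq. 52: cues is a list of (N_k, p_kj) pairs for the compounded cues."""
    total = sum(n for n, _ in cues)
    return sum(n * p for n, p in cues) / total

# Compound (abd) after A1 was trained to (abc) and A2 to (def),
# with equal-sized cue sets: predicted Pr(A1) = 2/3.
print(compound_probability([(10, 1.0), (10, 1.0), (10, 0.0)]))

# Triad of lights 1, 2, 3 with p_i1 = i/13: predicted value .15.
print(round(compound_probability([(5, 1/13), (5, 2/13), (5, 3/13)]), 2))
```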
Insert Table 5 about here

An important consideration in applications of models for stimulus compounding is the question of whether the experimental situation contains an appreciable amount of background stimulation in addition to the controlled stimuli manipulated by the experimenter. Suppose, for example, we are interested in the problem of whether a compound of two conditioned stimuli, say a light and a tone, each of which has been paired with the same unconditioned stimulus, may have a higher probability of evoking a conditioned response (CR) than either of the stimuli presented separately. To analyze this problem in terms of the present model, neglecting any possible background stimulation, we may represent the light and the tone by stimulus sets s_L and s_T. Assuming that as a result of the previous reinforcement the proportions of conditioned elements in s_L and s_T (and therefore the probabilities of CRs to the stimuli taken separately) are p_L and p_T, respectively, application of Axiom C2 yields for the probability of a CR to the compound of light and tone presented together

   Pr(CR | L,T) = (N_L p_L + N_T p_T)/(N_L + N_T) .

Clearly, the probability of a CR to the compound is simply a weighted mean of p_L and p_T, and therefore its value must fall between the probabilities of a CR to the two conditioned stimuli taken separately. No "summation" effect is predicted.
Table 5

Theoretical and observed proportions of response A1 to triads of lights in stimulus compounding test.
Often, however, it may be unrealistic to assume background stimulation from the apparatus and surroundings to be negligible. In fact, the experimenter may have to count on an appreciable amount of background stimulation, predominantly conditioned to behaviors incompatible with the CR, to prevent "spontaneous" occurrences of the to-be-conditioned response during intervals between presentations of the experimentally controlled stimuli. Let us now expand our representation of the conditioning situation by defining a set S_b of background elements, a proportion p_b of which are conditioned to the CR. For simplicity, we shall consider only the special case of p_b = 0. Then the theoretical probabilities of evocation of the CR by the light, the tone, and the compound of light and tone (together with background stimulation in each case) are given by

    Pr(CR|L) = N_L p_L / (N_L + N_b) ,

    Pr(CR|T) = N_T p_T / (N_T + N_b) ,

and

    Pr(CR|L,T) = (N_T p_T + N_L p_L) / (N_T + N_L + N_b) ,

respectively.
Under these conditions it is possible to obtain a summation effect. Assume, for example, that N_T = N_L = N_b = N and p_T > p_L, so that Pr(CR|T) > Pr(CR|L). Taking the difference between the probability of a CR to the compound and the probability of a CR to the tone alone, we have

    Pr(CR|L,T) - Pr(CR|T) = (p_T + p_L)/3 - p_T/2 = (2p_L - p_T)/6 ,

which is positive if the inequality 2p_L > p_T holds. Thus, in this case, probability of a CR to the compound will exceed probability of a CR to either conditioned stimulus alone, provided that p_T is not more than twice p_L.

The role of background stimuli has been particularly important in the interpretation of drive stimuli. It has been assumed (Estes, 1958, 1961a) that in simple animal learning experiments (e.g., those involving the learning of running or bar-pressing responses with food or water reward) the stimulus sample to which the animal responds at any time is compounded from several sources: the experimentally controlled conditioned stimulus (CS) or equivalent; stimuli, perhaps largely intraorganismic in origin, controlled by the level of food or water deprivation; and extraneous stimuli which are not systematically correlated with reward of the response undergoing training and therefore remain for the most part connected to competing responses. It is assumed further that the sizes of the samples of elements associated with the CS and with extraneous sources, s_C and s_E, are independent of drive, but that the size of s_D, the sample of drive-stimulus elements, increases as a function of deprivation.
In most simple reward-learning experiments, conditioning to the CS and to the drive cues would proceed concurrently, and one might expect that at a given stage of learning the proportions of elements in samples from these sources conditioned to the rewarded response, R, would be equal, i.e., p_C = p_D. If this were the case, then probability of the rewarded response would be independent of deprivation; for, letting D and D' correspond to levels of deprivation such that N_D < N_D', we have as the theoretical probabilities of response R at the two deprivations

    Pr(R|CS,D) = (N_C p_C + N_D p_D) / (N_C + N_D)

and

    Pr(R|CS,D') = (N_C p_C + N_D' p_D') / (N_C + N_D') .

If the same training were given at the two drive levels, then we would have p_D = p_D', as well as p_C = p_D; in this case the difference between the two expressions is zero. Considering the same assumptions but with extraneous cues taken explicitly into account, we arrive at a quite different picture. In this case, the two expressions for response probability are

    Pr(R|CS,D,E) = (N_C p_C + N_D p_D + N_E p_E) / (N_C + N_D + N_E)

and

    Pr(R|CS,D',E) = (N_C p_C + N_D' p_D' + N_E p_E) / (N_C + N_D' + N_E) .

Now, letting p_C = p_D = p_D' = p, and for simplicity taking p_E = 0, we obtain for the difference

    Pr(R|CS,D',E) - Pr(R|CS,D,E) = p N_E (N_D' - N_D) / [(N_C + N_D' + N_E)(N_C + N_D + N_E)] ,

which is obviously greater than zero given the assumption N_D' > N_D. Thus, in this theory, the principal reason why probability of the rewarded response tends, other things equal, to be higher at higher deprivations is that the larger the sample of drive stimuli, the more effective it is in outweighing the effects of extraneous stimuli.
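Both the summation effect and the drive effect derived above follow from the same weighted-mean response rule. A brief numerical sketch can make the two predictions concrete; all parameter values below are hypothetical, chosen only for illustration:

```python
# Weighted-mean response probability for a compound of stimulus sources
# (Axiom C2): Pr = sum(N_i * p_i) / sum(N_i).
def response_prob(sources):
    """sources: list of (N, p) pairs, one per stimulus source."""
    total_n = sum(n for n, _ in sources)
    return sum(n * p for n, p in sources) / total_n

# Summation effect: with equal-sized light, tone, and background sets
# (N_L = N_T = N_b) and p_b = 0, the compound beats the tone alone
# whenever 2*p_L > p_T.
N, p_L, p_T = 10, 0.5, 0.8                                  # hypothetical values
p_tone     = response_prob([(N, p_T), (N, 0.0)])            # tone + background
p_compound = response_prob([(N, p_L), (N, p_T), (N, 0.0)])  # light + tone + background
assert p_compound > p_tone                                  # 2*0.5 > 0.8 holds

# Drive effect: with extraneous elements conditioned to competing
# responses (p_E = 0), a larger drive-stimulus sample outweighs them.
N_C, N_E, p = 8, 6, 0.9
low  = response_prob([(N_C, p), (4, p),  (N_E, 0.0)])       # N_D  = 4
high = response_prob([(N_C, p), (12, p), (N_E, 0.0)])       # N_D' = 12
assert high > low
```

The same function reproduces the no-background case of the preceding subsection, where the compound probability necessarily falls between p_L and p_T.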
4.3 Sampling Axioms and Major Response Theorem of Fixed Sample Size Model

In 4.2 we considered some transfer effects which can be derived within a component model by considering only relationships among stimulus samples that have had different reinforcement histories. Generally, however, it is desirable to take account of the fact that there may not always be a one-to-one correspondence between the experimental stimulus display and the stimulation actually influencing the subject's behavior. Owing to a number of factors, e.g., variations in receptor-orienting responses, variations in states or thresholds of receptors, and fluctuations in the environmental situation, the subject often may sample only a portion of the stimulation made available by the experimenter. One of the chief problems of statistical learning theories has been to formulate conceptual representations of this stimulus sampling process and to develop their implications for learning phenomena. With respect to specific mathematical properties of the sampling process, the component models that have appeared in the literature may be classified into two main types: (1) models assuming fixed sampling probabilities for the individual elements of a stimulus population, in which case sample size varies randomly from trial to trial, and (2) models assuming a fixed ratio between sample size and population size. The former type was first discussed by Estes and Burke (1953), the latter by Estes (1950); some detailed comparisons of the two types have been presented by Estes (1959b). In this section we shall limit consideration to models of the second type, since these are in most respects easier to work with. In the remainder of this section we shall distinguish stimulus populations and samples by using S for a population and s for a sample, with subscripts as needed. The sampling axioms to be utilized are as follows:
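The distinction between the two types of sampling assumption can be sketched in a few lines; the population size, sampling probability, and fixed sample size below are arbitrary illustrative values. Under type (1) each element is sampled independently, so sample size is binomially distributed; under type (2) every trial yields exactly the same number of elements:

```python
import random

random.seed(0)
N = 20          # population size (illustrative)
theta = 0.25    # per-element sampling probability, type (1)
sigma = 5       # fixed sample size, type (2)

def sample_type1(elements):
    # Type (1): independent sampling; sample size varies from trial to trial.
    return [e for e in elements if random.random() < theta]

def sample_type2(elements):
    # Type (2): every trial yields exactly sigma elements, all samples
    # of size sigma being equiprobable.
    return random.sample(elements, sigma)

elements = list(range(N))
sizes1 = {len(sample_type1(elements)) for _ in range(2000)}
sizes2 = {len(sample_type2(elements)) for _ in range(2000)}

assert len(sizes1) > 1      # type (1): sample size is a random variable
assert sizes2 == {sigma}    # type (2): sample size is constant
```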
S1. All samples of the same size have equal probabilities.

S2. For any fixed, experimenter-defined stimulating situation S containing N elements, a sample s of fixed size N(s) = σ is drawn on each trial.

A prerequisite to nearly all applications of the model is a theorem relating response probability to the state of conditioning of a stimulus population. We shall derive the theorem in terms of a stimulus situation S of N elements from which a sample of size σ is drawn on each trial. Assuming that some number N_i of the elements of S are conditioned to response A_i, we wish to obtain an expression for the probability of evocation of A_i by samples from S; by Axiom C2, this probability is equal to the expected proportion of elements conditioned to A_i in samples drawn from S. We begin, as usual, with the probability in which we are interested, and proceed to expand it in terms of the state of conditioning and the possible stimulus samples:

    Pr(A_i|S) = Σ_s Pr(A_i|s) Pr(s|S) ,

the summation being over all samples of size σ that can be drawn from S. In the last expression on the right, Pr(A_i|s) = N_i(s)/σ represents the probability of A_i in the presence of a sample of size σ containing N_i(s) elements conditioned to A_i. Substituting these expressions for the conditional probabilities, we obtain

    Pr(A_i|S) = Σ_{N_i(s)} [N_i(s)/σ] C(N_i, N_i(s)) C(N - N_i, σ - N_i(s)) / C(N, σ) ,

where the product of binomial coefficients C(N_i, N_i(s)) C(N - N_i, σ - N_i(s)) denotes the number of ways of obtaining a sample of size σ containing a subset s_i of exactly N_i(s) elements conditioned to A_i, so that the ratio of this product to C(N, σ), the number of ways of drawing a sample of size σ, is the probability of obtaining the given value of N_i(s). The sum will be recognized, apart from the factor 1/σ, as the familiar expression for the mean of a hypergeometric distribution (Feller, 1957, p. 218); since this mean is σN_i/N, we have the pleasingly simple outcome that the probability of a response to the stimulating situation represented by a set S of elements is equal to the proportion of elements of S that are conditioned to the given response:

    Pr(A_i|S) = N_i/N .    (53)

This result may seem too intuitively obvious to have needed a proof, but one should note that the same theorem does not hold in general for component models with fixed sampling probabilities for the elements (cf. Estes and Suppes, 1959b).

4.4 Interpretation of Stimulus Generalization

Our approach to the problem of stimulus generalization is to represent the similarity between two stimuli by the amount of overlap between two sets of elements.9

9 A model similar in most essentials has been presented by Bush and Mosteller (1951b).
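The response theorem of Eq. 53 can be checked by brute-force enumeration. The sketch below (the population size and conditioning count are arbitrary illustrative values) averages Pr(A_i|s) = N_i(s)/σ over all equiprobable samples of fixed size σ:

```python
from itertools import combinations
from fractions import Fraction

N, N_i, sigma = 8, 3, 4        # illustrative: 8 elements, 3 conditioned to A_i
conditioned = set(range(N_i))  # identify the conditioned elements with 0..N_i-1

# Average, over all C(N, sigma) equiprobable samples, of the proportion of
# sampled elements conditioned to A_i (the response probability given that
# sample, by the response axiom). Exact rational arithmetic avoids rounding.
samples = list(combinations(range(N), sigma))
prob = sum(Fraction(len(conditioned & set(s)), sigma) for s in samples) / len(samples)

assert prob == Fraction(N_i, N)   # Eq. 53: Pr(A_i|S) = N_i / N
```

The equality is exact, as the derivation via the hypergeometric mean requires.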
In the simplest experimental paradigm for exhibiting generalization, we begin with two stimulus situations, represented by sets S_a and S_b, neither of which has any of its elements conditioned to a reference response A_1. Training is given by reinforcement of A_1 in the presence of S_a only, until the probability of A_1 in that situation reaches some value p_a1 > 0. Then test trials are given in the presence of S_b, and if the probability of A_1, p_b1, now proves to be greater than zero, we say that stimulus generalization has occurred. If the axioms of the component model are satisfied, the value of p_b1 provides, in fact, a measure of the overlap of S_a and S_b; for, by Eq. 53, we have immediately

    p_b1 = [N(S_a ∩ S_b)/N_b] p_a1 ,

or, with the more compact notation N_ab = N(S_a ∩ S_b),

    p_b1 = (N_ab/N_b) p_a1 ,

where S_a ∩ S_b denotes the set of elements common to S_a and S_b; the numerator of this fraction is simply the number of elements in S_b that are now conditioned to response A_1. More generally, if the proportion of elements of S_b conditioned to A_1 prior to the experiment were equal to some value g_b1, not necessarily zero, the probability of A_1 after training in S_a would be given by

    p_b1 = [N_ab p_a1 + (N_b - N_ab) g_b1] / N_b .    (54a)

This relation can be put in still more convenient form by letting N_ab/N_b = w_ab, viz.,

    p_b1 = (p_a1 - g_b1) w_ab + g_b1 ,    (54b)

and we see that the difference (p_a1 - g_b1) between the post-training probability of A_1 in S_a and the pretraining probability can be regarded as the slope parameter of a linear "gradient" of generalization in which p_b1 is the dependent variable and the proportion of overlap w_ab is the independent variable. If we hold g_b1 constant and let p_a1 vary as the parameter, we generate a family of generalization gradients which have their greatest disparities at w_ab = 1 (i.e., when the test stimulus S_b is identical with S_a) and converge as the overlap between S_b and S_a decreases, until the gradients meet at w_ab = 0. Thus the family of gradients shown in Fig. 9 illustrates the picture to be expected if a series of generalization tests is given at each of several different stages of training in S_a, or, alternatively, at several different stages of extinction following training in S_a, as was done, for example, by Guttman and Kalish (1956).

Insert Fig. 9 about here
Fig. 9. Generalization from a training stimulus, S_a, to a test stimulus, S_b, at several stages of training. The probability of response A_1 to S_b is plotted against the proportion of overlap, w_ab, between S_a and S_b; the parameters are the values of p_a1, and g_b1 denotes the probability of A_1 prior to training.
The problem of "calibrating" a physical stimulus dimension so as to obtain a series of stimulus values which represent equal differences in the value of w_ab has been discussed by Carterette (1961). One might regard the parameter w_ab as an index of the similarity of stimulus a to stimulus b. In general, however, similarity so measured is not a symmetrical relation, for w_ab is not equal to w_ba (the former being given by N_ab/N_b and the latter by N_ab/N_a) except in the special case N_a = N_b. When N_a ≠ N_b, generalization from training with the larger set to a test with the smaller set will be greater than generalization from training with the smaller set to a test with the larger set (assuming that the reinforcement given the reference response A_1 in the presence of the training set S_i is such as to establish the same value of p_i1 in each case prior to testing in S_j). We shall give no formal assumption relating the size of a stimulus set to observable properties; however, it is reasonable to expect that larger sets will be associated with more intense (where the notion of intensity is applicable) or attention-getting stimuli. Thus if S_a and S_b represent tones a and b of the same frequency but with tone a more intense than b, we should predict greater generalization if we train the reference response to a given level with a and test with b than if we train to the same level with b and test with a. It is worth noting that, although in the psychological literature the notion of stimulus generalization has nearly always been taken to refer to generalization along some physical continuum, such as wavelength of light or intensity of sound, the set-theoretical model is not restricted to such cases. Predictions of generalization in the case of complex stimuli may be generated by first evaluating the overlap parameter w_ab for a given pair of situations a and b from a set of observations obtained with some particular combination of values of p_a1 and g_b1, and then computing theoretical values of p_b1 for new conditions involving different levels of p_a1. The problem of treating a simple "stimulus dimension" is of special interest, and we shall conclude our discussion of generalization by sketching one approach to this problem.10 We shall consider the type of stimulus dimension that Stevens (1957) has termed substitutive, or metathetic, i.e., one which involves the notion of a simple ordering of stimuli along a dimension without variation in intensity or magnitude. Let us denote by Z a physical dimension of this sort, e.g., wavelength of visible light, which we wish to represent by a sequence of stimulus sets. First we shall briefly outline the properties that we wish this representation to have; then we shall spell out the assumptions of the model more rigorously.

10 We follow, in most respects, the treatment given by W. K. Estes and D. La Berge in unpublished notes prepared for the 1957 SSRC Summer Institute in Social Science for College Teachers of Mathematics. For an approach combining essentially the same set-theoretical model with somewhat different learning assumptions, the reader is referred to Restle (1961).
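The family of straight-line gradients generated by Eq. 54b fans out from the common point p_b1 = g_b1 at w_ab = 0. A small numerical illustration (the training levels and pretraining probability below are arbitrary):

```python
def p_test(p_a1, g_b1, w_ab):
    # Eq. 54b: linear generalization gradient in the overlap w_ab.
    return (p_a1 - g_b1) * w_ab + g_b1

g = 0.10                           # pretraining probability g_b1 (illustrative)
training_levels = [0.4, 0.7, 1.0]  # p_a1 at successive stages of training

for p_a1 in training_levels:
    # Identical test stimulus (w = 1) recovers the training level; disjoint
    # sets (w = 0) leave only the pretraining probability.
    assert abs(p_test(p_a1, g, 1.0) - p_a1) < 1e-12
    assert p_test(p_a1, g, 0.0) == g

# Disparity between gradients grows with overlap: maximal at w = 1,
# vanishing at w = 0, where all gradients meet.
spread = lambda w: p_test(1.0, g, w) - p_test(0.4, g, w)
assert spread(1.0) > spread(0.5) > spread(0.0) == 0.0
```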
It is part of the intuitive basis of a substitutive dimension that one moves from point to point by exchanging some of the elements of one stimulus for some new ones belonging to the next. But to ensure that the organism's behavior can reflect the ordering of stimuli along the Z scale without ambiguity, we must assume that, at least up to the point where sets corresponding to larger differences in Z are disjoint, the overlap between two stimulus sets is directly related to the interval between the corresponding stimuli on the Z scale. Further, in view of the abundant empirical evidence that generalization declines in an orderly fashion as the distance between two stimuli on such a dimension increases, we need also to assume that once an element is deleted as we go along the Z scale, it must not reappear in the set corresponding to any higher Z value. For simplicity, we shall assume that values of Z change by constant increments; consequently, each successive stimulus set should be generated by deleting some constant number of elements from the preceding set and adding the same number of new elements to form the next set. These properties, taken together, enable us to establish an intuitively reasonable correspondence between characteristics of a sequence of stimulus sets and the empirical notion of generalization along a dimension.

These ideas are incorporated more formally in the following set of axioms. The basis for these axioms is a stimulus dimension Z, which may be either continuous or discontinuous; a collection S* of stimulus sets; and a function x(Z) having a finite number of consecutive integers in its range. The mapping of the set (x) of scaled stimulus values onto the subsets S_i of S* must satisfy the axioms:

G1. For all i ≤ j ≤ k in (x), S_i ∩ S_k ⊆ S_j.

G2. For all i ≤ j ≤ k in (x), if S_i ∩ S_k ≠ ∅, then S_j ⊆ (S_i ∪ S_k).

G3. For all h ≤ i and j ≤ k in (x), if i - h = k - j, then N_hi = N_jk,

where N_hi = N(S_h ∩ S_i) and ∅ is the null set. The reasons for introducing the set (x) are twofold. First, for reasons of mathematical simplicity we find it advisable to restrict ourselves, at least for present purposes, to a finite set of Z values, and therefore to a finite collection of stimulus sets; the set (x) may simply be a set of Z values, or it may be a set of Z values rescaled by some transformation. Second, there is no reason to think that equal distances along physical dimensions will in general correspond to equal overlaps between stimulus sets. All that is required to make the theory workable is that for any given physical dimension, wavelength of light, frequency of a tone, or whatever, we can find experimentally a transformation x such that equal distances on the x scale do correspond to equal overlaps.

Axiom G1 states that if an element belongs to any two sets, it also belongs to all sets which fall between these two on the x scale; this property ensures that elements drop out of the sets in order as we move along the dimension. Axiom G2 states that if two sets have any common elements, then all of the elements of any set falling between them belong to one or the other (or both) of the given sets. Axiom G3 states the property which distinguishes a simple substitutive dimension from an additive, or intensity (in Stevens' terminology, prothetic), dimension. It should be noted that Axiom G3 can be satisfied only if the number of values in the range of x(Z) is no greater than N(S*) - N + 1, where N is the common size of the sets S_i; this restriction is necessary in order to obtain a one-to-one mapping of the x values into the subsets S_i of S*.

One advantage in having the axioms set forth explicitly is that it then becomes relatively easy to design experiments bearing upon various aspects of the model. Thus, to obtain evidence concerning the empirical tenability of Axiom G1, we might choose a response A_1 and a set (x) of stimuli, including a pair i and k such that Pr(A_1|i) = Pr(A_1|k) = 0, then train subjects with stimulus i only until Pr(A_1|i) = 1, and finally test with stimulus k. If Pr(A_1|k) is found to be greater than zero, it must be concluded, in terms of the model, that S_i ∩ S_k ≠ ∅; then it must be predicted that for every stimulus j such that i < j < k,

    Pr(A_1|j) ≥ Pr(A_1|k) .

Axiom G1 ensures that all of the elements of S_i ∩ S_k which are now conditioned to A_1 must, by virtue of belonging also to S_i, be included in S_j, possibly augmented by other conditioned elements of S_i which are not in S_k.

To deal similarly with Axiom G2, we proceed in the same way to locate two members i and k of a set (x) such that S_i ∩ S_k ≠ ∅; then we train subjects on both stimulus i and stimulus k until Pr(A_1|i) = Pr(A_1|k) = 1, the response A_1 being one which before this training had probability of less than unity to all stimuli. Now, by Axiom G2, if any stimulus j falls between i and k, i.e., i ≤ j ≤ k, the set S_j must be contained entirely in the union S_i ∪ S_k; consequently, we can predict that we will now find Pr(A_1|j) = 1 for any such stimulus j.

To evaluate Axiom G3 empirically, we require four stimuli such that h ≤ i, j ≤ k, and i - h = k - j. If the four stimuli are all different, we simply train subjects to some criterion on h and test generalization to i, then train subjects to an equal degree on j and test generalization to k; if the amount of generalization, as measured by the probability of the test response, is the same in the two cases, then the axiom is supported. In the special case when h = i and j = k, we would be testing the assertion that the sets associated with different values of x are of equal size; if they are, the amount of generalization should be the same in both directions for any pair of stimuli. To accomplish this test, we need only take any two neighboring values of x, say i and j, train subjects to some criterion on i and test on j, then reverse the procedure by training (different) subjects to the same criterion on j and testing on i.

Once we have introduced the notion of a dimension, it is natural to inquire whether the parameter w_ij, which represents the degree of communality between pairs of stimulus sets, might not be related in some simple way to a measure of distance along the dimension. With one qualification, which we will mention later, the quantity d_ij = 1 - w_ij could serve as a suitable measure of the distance between stimuli i and j. We can check to see whether the familiar axioms for a metric are satisfied:

1. d_ij ≥ 0 ;
2. d_ij = 0 if and only if i = j ;
3. d_ij = d_ji ;
4. d_ij + d_jk ≥ d_ik ,

where it is understood that i, j, and k are any members of the set (x) associated with a given dimension. The first three of these obviously hold, but the fourth requires a bit of analysis. To carry out a proof, we shall use the notation N_ij for the number of elements common to S_i and S_j, N_ijk for the number of elements common to all of S_i, S_j, and S_k, and so on. Let us assume that the stimuli fall in the order i < j < k on the dimension. Then, by Axiom G1, we know that N_ijk = N_ik. The difference between the two sides of the inequality we wish to establish can be expanded in terms of this notation as follows:

    d_ij + d_jk - d_ik = (1 - N_ij/N) + (1 - N_jk/N) - (1 - N_ik/N)
                       = (1/N)(N - N_ij - N_jk + N_ik) .

Since N_ijk = N_ik, the quantity N_ij + N_jk - N_ik cannot exceed N, the number of elements of S_j; hence the last expression on the right is nonnegative, which establishes the desired inequality. It is only in the special cases when S_i and S_k are either overlapping or adjacent that the equality d_ij + d_jk = d_ik holds, so that d is additive; this is the qualification mentioned earlier. It is possible to define an additive distance measure which is not subject to this restriction, but such extensions raise new problems and we shall not be able to pursue them here.

In concluding this section, we should like to emphasize one difference between the model for generalization sketched here and some of those already familiar in the literature (see, e.g., Hull, 1943; Spence, 1936). We do not postulate a particular form for generalization gradients of response strength or excitatory tendency. Rather, we introduce certain assumptions about the properties of the set of stimuli associated with a sensory dimension; then we take these, together with learning assumptions and information about reinforcement schedules, as a basis for deriving theoretical gradients of generalization for particular types of experiments. Under the special conditions assumed in the example considered above, the theory predicts that a family of linear gradients with very simple properties will be observed when response probability is plotted as a function of distance from the point of reinforcement. Predictions of this sort may reasonably be tested by means of experiments in which suitable measures are taken to meet the conditions assumed in the derivations (see, e.g., Carterette, 1961). But to deal with experiments involving different training conditions, or response measures other than relative frequencies, further theoretical analysis is called for, and one must be prepared to find substantial differences in the phenotypic properties of generalization gradients derived from the same basic theory for different experimental situations.
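A sequence of sets with the properties described by Axioms G1-G3 can be built by sliding a fixed-length window along a line of elements: each step deletes a constant number of elements and adds the same number of new ones. The sketch below (window size and step are arbitrary illustrative choices) checks the three axioms and the triangle inequality for d_ij = 1 - w_ij:

```python
# Sliding-window construction of a substitutive dimension: S_i contains
# the N consecutive elements starting at position step * i.
N, step, num_sets = 6, 2, 5
S = [set(range(step * i, step * i + N)) for i in range(num_sets)]

def d(i, j):
    # Distance measure d_ij = 1 - w_ij, with w_ij the proportion of overlap
    # (all sets here have the common size N, so d is symmetric).
    return 1 - len(S[i] & S[j]) / N

idx = range(num_sets)
for i in idx:
    for j in idx:
        for k in idx:
            if i <= j <= k:
                assert S[i] & S[k] <= S[j]              # Axiom G1
                if S[i] & S[k]:
                    assert S[j] <= S[i] | S[k]          # Axiom G2
            assert d(i, j) + d(j, k) >= d(i, k) - 1e-12  # metric axiom 4

# Axiom G3: equal separations on the x scale give equal overlaps.
assert len(S[0] & S[1]) == len(S[2] & S[3])
```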
5. COMPONENT AND LINEAR MODELS FOR SIMPLE LEARNING

In this section we combine, in a sense, the theories discussed in the preceding sections. We first considered a type of learning model in which the different possible samples of stimulation from trial to trial were assumed to be entirely distinct, and then turned to an analysis of generalization, or transfer, effects that could be measured on an isolated test trial following a series of learning trials. Prediction of these transfer effects depended on information concerning the state of the stimulus population just prior to the test trial but did not depend on information about the course of learning over preceding training trials. Until now it has been convenient, for expositional purposes, to treat the problems of learning and generalization separately. However, in many (perhaps most) learning situations it is not reasonable to assume that the samples, or patterns, of stimulation affecting the organism on different trials of a series are entirely disjoint; rather, they must overlap to various intermediate degrees, thus generating transfer effects throughout the learning series. In the "component models" of stimulus sampling theory, one simply takes the learning assumptions of the pattern model (Sec. 3) together with the sampling axioms and response rule of the generalization model (Sec. 4) to generate an account of learning for this more general case.

5.1 Component Models with Fixed Sample Size

As indicated earlier, the analysis of a simple learning experiment in terms of a component model is based on the representation of the stimulus situation as a set S of N stimulus elements from which the subject draws a sample on each trial.
At any time, each element in the set S is conditioned to exactly one of the r response alternatives A_1, ..., A_r; by the response axiom of Sec. 4, the probability of a response is equal to the proportion of elements in the trial sample conditioned to that response. At the termination of a trial, if reinforcing event E_i (i ≠ 0) occurs, then with probability c all elements in the trial sample become conditioned to response A_i; if E_0 occurs, the conditioned status of elements in the sample does not change. The conditioning parameter c plays the same role here as in the pattern model. It should be noted that in the early literature of stimulus sampling theory this parameter was usually assumed to be equal to unity.

Two general types of component models can be distinguished. For the fixed sample size model, we assume that the sample size is a fixed number s throughout any given experiment. For the independent sampling model, we assume that the elements of the stimulus set are sampled independently on each trial, each element having some fixed probability θ of being drawn. In this section we discuss the fixed sample size model and consider the case in which all possible samples of size s are sampled with equal probability.

Formulation for RTT Experiments. To illustrate the model, we first consider an experimental procedure in which a particular stimulus item is given a single reinforced trial followed by two consecutive nonreinforced test trials. The design may be conveniently symbolized RT_1T_2.
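The assumptions above translate directly into a trial-by-trial simulation. The sketch below implements the response axiom and the conditioning rule; the parameter values are illustrative, and reinforcement of A_1 on every trial is assumed purely for demonstration:

```python
import random

random.seed(1)
N, s, c = 12, 4, 0.3    # population size, fixed sample size, conditioning parameter
state = [2] * N         # element -> index of its conditioned response; start at A_2

def trial():
    """One trial on which response A_1 is reinforced (event E_1)."""
    sample = random.sample(range(N), s)                # all samples of size s equiprobable
    prob_a1 = sum(state[e] == 1 for e in sample) / s   # response axiom: sample proportion
    if random.random() < c:                            # with probability c the reinforcing
        for e in sample:                               # event conditions the whole sample
            state[e] = 1
    return prob_a1

first = trial()
counts = [sum(e == 1 for e in state)]
for _ in range(299):
    trial()
    counts.append(sum(e == 1 for e in state))

assert first == 0.0               # nothing is conditioned to A_1 before training
assert counts == sorted(counts)   # under E_1-only reinforcement, conditioning never reverses
```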
Procedures and results for a number of experiments using an RTT design have been reported elsewhere (Estes, 1960a, 1961b; Estes, Hopkins, and Crothers, 1960; Crothers, 1961). In terms of the fixed sample size model we can readily generate predictions for the probabilities, p_ij, of various combinations of response i on T_1 and response j on T_2. For simplicity, suppose one selects a situation in which the probability of a correct response is zero before the first reinforcement (and in which the likelihood of a subject's obtaining correct responses by guessing is negligible on all trials). If 0 denotes a correct response and 1 denotes an error, then

    p_00 = c (s/N)^2 ,
    p_01 = p_10 = c (s/N)(1 - s/N) ,
    p_11 = c (1 - s/N)^2 + (1 - c) .    (55)

To obtain these results, we note that the correct response can occur on either test trial only if conditioning occurs on the reinforced trial. On occasions when conditioning occurs, which has probability c, the whole sample of s elements becomes conditioned to the correct response, and the probability of this response on each of the test trials is s/N. On occasions when conditioning does not occur on the reinforced trial, probability of a correct response remains at zero over both test trials.
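As a check on the reasoning above, the sketch below (parameter values illustrative) simulates the reinforced trial and the two test samples, and compares the relative frequencies of the four outcome combinations with the theoretical p_ij:

```python
import random

random.seed(2)
N, s, c = 10, 4, 0.6   # illustrative values
trials = 100_000

counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
for _ in range(trials):
    # R trial: with probability c the whole sample becomes conditioned.
    conditioned = set(random.sample(range(N), s)) if random.random() < c else set()
    outcome = []
    for _test in (1, 2):
        sample = random.sample(range(N), s)
        p_correct = len(conditioned & set(sample)) / s     # response axiom
        outcome.append(0 if random.random() < p_correct else 1)  # 0 = correct
    counts[tuple(outcome)] += 1

q = s / N
theory = {(0, 0): c * q * q,
          (0, 1): c * q * (1 - q),
          (1, 0): c * q * (1 - q),
          (1, 1): c * (1 - q) ** 2 + (1 - c)}
for key in counts:
    assert abs(counts[key] / trials - theory[key]) < 0.01
```

The agreement reflects the fact that, given the conditioning outcome, the two test samples are drawn independently, so the joint probabilities factor as in the derivation.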
Note that when s = 1, this model is equivalent to the one-element model discussed in Sec. 2. If more than one reinforcement is given prior to T_1, the predictions are essentially unchanged. In general, for i preceding reinforcements, the expected proportion of elements conditioned to the correct response (i.e., the probability of a correct response) at the time of the first test is

    p_0 = 1 - (1 - cs/N)^i ,

and the probability of correct responses on both T_1 and T_2 is given by

    p_00 = Σ_{k=0}^{i} C(i,k) c^k (1 - c)^{i-k} [1 - (1 - s/N)^k]^2 .

To obtain this last expression, we note that a subject for whom k of the reinforcements have been effective will have probability [1 - (1 - s/N)^k] of making a correct response on each test, and the probability that exactly k of the i reinforcements are effective is C(i,k) c^k (1 - c)^{i-k}. If s = N, these expressions reduce to

    p_0 = p_00 = 1 - (1 - c)^i    and    p_11 = (1 - c)^i ,

with p_01 = p_10 = 0. This special case appears well suited to the interpretation of data obtained by G. H. Bower (personal communication) from a study in which the T_1T_2 procedure was applied following various numbers of presentations of word-word paired associates. For 32 subjects, each tested on 10 items, Bower reports observed proportions of p_00 = .894, p_10 = p_01 = .003, and p_11 = .100. When applied to other RTT experiments, however, this model has apparently not yielded consistently accurate predictions. The difficulty stems from the fact that our assumptions do not take account of the retention loss that is usually observed from T_1 to T_2 (see, e.g., Estes, 1961b). An extension of the model which is capable of handling retention decrement as well as the acquisition process will be discussed in Sec. 5.2 below.

For RTT experiments in which the probability of successful guessing is not negligible (as in paired-associate tasks involving a fixed list of responses which are known to the subject from the start), some additional considerations arise.
respect to empirical implications lies in the fact that with the latter set of assumptions. therefore hi I~\ is the probability s elements that represents 1 . exposure time on test trials must be taken into account. then the proportion of conditioned elements in the sample connected to the correct response determines its probability (e.. These assumptions seem in some respects more intuitively satisfactory Perhaps the most important difference with than those considered above. (1  1 2 c)(l . two conditioned to an incorrect response. just long enough to permit the subject to draw a single sample of stimulus elements). and E. 141elements. three conditioned to the correct response. . then the probabilities of correct and incorrect response combinations on T l and T 2 are P 00 = (1 .+ c m' 2 'l' r 1 2 = (1 .cp') .cp' the probability that a subject for whom the reinforced trial was effective nevertheless draws a sample containing no conditioned elements and makes an incorrect guess.) r + c(l. with probability 1 r of being correct. if the sample contains nine elements. whereas cp' is the probability that such a subject makes a correct response on either test trial.g. where The factor that the subject draws a sample containing none of the became conditioned on the reinforced trial. the subject guesses. and four unconditioned. If the stimulus exposure time is just long enough to permit a response (in terms of the theory.c) 1 (1 _ !) r r = + c cp r (1 _ cp') 2 . then the probability of a correct response is simply 3/5).c) .A. If the sample contains any conditioned elements.
The two sets of equations, 56 and 57, are formally identical and thus, like Equation 55, cannot be distinguished in application to RTT data. If exposure time is sufficiently long on the test trials, then we assume that the subject continues to draw successive random samples from S and makes a response only when he finally draws a sample containing at least one conditioned element. Thus, in cases in which the reinforcement has been effective on a previous trial (so that S contains a subset of s conditioned elements), the subject will eventually draw a sample containing one or more conditioned elements and will respond on the basis of these elements, thereby making a correct response with probability 1. Therefore, for the case of unlimited exposure time, φ' = 1 and Eq. 57 reduces to

(58)   P_00 = (1 − c)(1 − 1/r)^2
       P_10 = P_01 = (1 − c)(1/r)(1 − 1/r)
       P_11 = (1 − c)(1/r)^2 + c ,

which is identical with the corresponding model of Sec. 2.2. Although these equations may provide a satisfactory account of some sets of data, they have the limitation of not allowing adequately for the retention loss usually observed between tests (see, e.g., Estes, Hopkins, and Crothers, 1960); we shall return to this point in Sec. 5.2.

General Formulation. We turn now to the problem of deriving predictions from the fixed sample size model concerning the course of learning over an experiment consisting of a sequence of trials run
under some prescribed reinforcement schedule.

Transition Probabilities. We shall limit consideration to the case in which each element in S is conditioned to exactly one of the two response alternatives, A_1 or A_2. Let C_i (i = 0, 1, ..., N) denote the state in which i elements of the set S are conditioned to A_1, so that there are N + 1 conditioning states. As in the pattern model, the transition probabilities among conditioning states are functions of the reinforcement schedule and the set-theoretical parameters N and s. Following our approach in Sec. 3.1, we shall restrict the analysis to cases in which the probability of reinforcement depends at most upon the response on the given trial; we thereby guarantee that all elements in the transition matrix for conditioning states are constant over trials. Thus the sequence of conditioning states can again be conceived as a Markov chain.

Let s_{v,n} denote the event of drawing, on trial n, a sample with v elements conditioned to A_1. Then the probability of a one-step transition from state C_i to state C_j, for j > i, is given by

(59a)   q_{ij} = c Pr(E_1 | s_{v,n} C_i)(i choose v)(N − i choose s − v)/(N choose s) ,   v = s − (j − i) ,

where Pr(E_1 | s_{v,n} C_i) is the probability of an E_1 event given conditioning state C_i and a sample with v elements conditioned to A_1.
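As a numerical check on this transition rule (together with Eqs. 59b and 59c below), the following script, our own illustration with arbitrary parameter values, assembles the full matrix for the noncontingent schedule and verifies the mean learning curve of Eq. 61:

```python
import math
from fractions import Fraction

def hyper(i, v, N, s):
    # Probability that a sample of size s drawn from the N elements of S,
    # i of which are conditioned to A_1, contains exactly v such elements.
    return Fraction(math.comb(i, v) * math.comb(N - i, s - v), math.comb(N, s))

def transition_matrix(N, s, c, pi):
    # Eq. 59 specialized to the simple noncontingent schedule
    # (E_1 with probability pi, E_2 with probability 1 - pi).
    c, pi = Fraction(c), Fraction(pi)
    Q = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for v in range(s + 1):
            p = hyper(i, v, N, s)
            if p == 0:
                continue
            Q[i][i + s - v] += pi * c * p        # E_1 effective
            Q[i][i - v] += (1 - pi) * c * p      # E_2 effective
            Q[i][i] += (1 - c) * p               # reinforcement ineffective
    return Q

N, s, c, pi = 4, 2, Fraction(1, 2), Fraction(3, 4)
Q = transition_matrix(N, s, c, pi)
assert all(sum(row) == 1 for row in Q)

# Iterating the state distribution reproduces Eq. 61 exactly:
dist = [Fraction(1)] + [Fraction(0)] * N          # start in C_0, so p_1 = 0
for n in range(1, 6):
    mean_p = sum(dist[i] * Fraction(i, N) for i in range(N + 1))
    assert mean_p == pi * (1 - (1 - c * s / Fraction(N)) ** (n - 1))
    dist = [sum(dist[i] * Q[i][j] for i in range(N + 1)) for j in range(N + 1)]
```

With s = 1 the same routine yields the transition matrix of the N-element pattern model, in line with the correspondence noted in the text.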
To obtain Eq. 59a we note that an E_1 must occur, that the reinforcement must be effective, and that the subject must sample exactly v = s − (j − i) elements already conditioned to A_1, so that the s − v sampled elements not already conditioned to A_1 move to A_1. The probability of the latter event is the number of ways of drawing samples with v elements conditioned to A_1 divided by the total number of ways of drawing samples of size s. Similarly, for j < i,

(59b)   q_{ij} = c Pr(E_2 | s_{v,n} C_i)(i choose v)(N − i choose s − v)/(N choose s) ,   v = i − j ,

and

(59c)   q_{ii} = 1 − Σ_{j≠i} q_{ij} .

Although it is an obvious conclusion, it is important for the reader to realize that the pattern model discussed in Part 3 is identical to the fixed sample size model when s = 1. This correspondence between the two models is indicated by the fact that Eqs. 59 reduce to Eq. 23 when we let s = 1. For the simple noncontingent schedule, in which only the two events E_1 and E_2 occur (with probabilities π and 1 − π, respectively), Eqs. 59a to 59c simplify to
(60a)   q_{i,i+k} = cπ(i choose s − k)(N − i choose k)/(N choose s) ,   k > 0 ,
(60b)   q_{i,i−k} = c(1 − π)(i choose k)(N − i choose s − k)/(N choose s) ,   k > 0 ,
(60c)   q_{ii} = 1 − Σ_{j≠i} q_{ij} .

It is apparent that state C_N is an absorbing state when π = 1 and C_0 is an absorbing state when π = 0; otherwise all states are ergodic.

Mean Learning Curve. Following the same techniques used in connection with Eq. 27, we obtain for the component model in the simple noncontingent case

(61)   Pr(A_1,n) = π − (π − p_1)(1 − cs/N)^(n−1) .

This mean learning function traces out a smooth growth curve that can take any value between 0 and 1 on trial n if parameters are selected appropriately. However, it is important to note that for a given realization of the experiment the actual response probabilities for individual subjects (as opposed to expectations) can take on only the values 0, 1/N, 2/N, ..., (N−1)/N, 1, i.e., the values associated with the conditioning states. This stepwise aspect of the process is particularly important when one attempts to distinguish between this model and models that assume gradual continuous increments in the strength or probability of
a response over time (Hull, 1943; Bush and Mosteller, 1955; Estes and Suppes, 1959a). To illustrate this point we consider an experiment on avoidance learning reported by Theios (1961). The apparatus was a modified Miller-Mowrer electric shock box. The animal was always placed in the black compartment; shortly thereafter a buzzer and light came on as the door between the compartments was opened. The correct response (A_1) was to run into the other compartment within 3 seconds. If A_1 did not occur, the subject was given a high intensity shock (255 volts) until it escaped into the other compartment. After 20 seconds the subject was returned to the black compartment, and another trial was given. Fifty rats were used as subjects; each rat was run until it met a criterion of 20 consecutive successful avoidance responses.

Theios analyzes the situation in terms of a component model in which N = 2 and s = 1. Further, he assumes that Pr(A_1,1) = 0, so that on trial 1 the subject is in conditioning state C_0, and that π = 1. Employing Eq. 60 with π = 1, N = 2, and s = 1, we obtain the following transition matrix:

            C_2      C_1      C_0
    C_2  [   1        0        0   ]
    C_1  [  c/2    1 − c/2     0   ]
    C_0  [   0        c      1 − c ]

And the expected probability of an A_1 response on trial n is readily obtained by specialization of Eq. 61.
Applying this model, Theios estimates c = .43 and provides an impressive account of such statistics as total errors, trial number of last error, probability of no reversals, autocorrelation of errors with lags of 1, 2, 3, and 4 trials, mean number of runs, and many others. However, for our immediate purposes we are interested in only one feature of his data, namely, whether the underlying response probabilities are actually fixed at 0, 1/2, and 1 as specified by the model.

First, we note that it is not possible to establish the exact trial on which the subject moves from C_0 to C_1 or from C_1 to C_2. Nevertheless, if there are some trials between the first success (A_1 response) and the last error (A_2 response), we can be sure that the subject is in state C_1 on these trials. For if the subject has made an A_1 response, at least one of the two stimulus elements is conditioned to A_1; if on a later trial the subject makes an error, then up to that trial at least one of the elements is not conditioned to A_1. Since deconditioning does not occur in the present model, the subject must be in conditioning state C_1. Thus, according to the model, the sequence of responses after the first success and before the last error should form a sequence of Bernoulli trials with constant probability 1/2 of an A_1 response. Theios has applied several statistical tests to check this hypothesis, and none suggest that the assumption is incorrect. For example, the trials between the first success and last error were divided into blocks
of four trials, and the number of A_1 responses in each block was counted. The correspondence between the obtained frequencies for 0, 1, 2, 3, and 4 successes per block and the predicted binomial frequencies is excellent, as indicated by a goodness-of-fit test that yielded a value of 1.47 with 4 degrees of freedom. Theios has applied the same analysis to data from an experiment by Solomon and Wynne (1953) in which dogs were required to learn an avoidance response. The findings with regard to the binomial property on trials after the first success and before the last error are in agreement with his own data but suggest that the binomial parameter is other than 1/2. From a stimulus sampling viewpoint this observation would suggest that the two elements are not sampled with equal probabilities. For a detailed discussion of this Bernoulli stepwise aspect of certain stimulus sampling models, related statistical tests, and a review of relevant experimental data, the reader is referred to Suppes and Ginsberg (1962a).

The mean learning curve for the fixed sample size model given by Eq. 61 is identical to the corresponding equation for the pattern model, with the sampling ratio cs/N taking the role of c/N. However, we need not look far to find a difference in the predictions generated by the two models.
The asymptotic variance of the response probabilities for the component model is obtained by the same methods used in connection with Eq. 27. Writing α_{1,n} and α_{2,n} for the first and second raw moments of the distribution of response probabilities on trial n, we have, for the simple noncontingent case, the recursion

(62)   α_{2,n+1} = [1 − 2cs/N + cs(s − 1)/(N(N − 1))]α_{2,n}
                 + [2cπs/N − 2cπs²/N² + cs(N − s)/(N²(N − 1))]α_{1,n} + cπs²/N² .

Then, carrying out the summation and letting n → ∞, we obtain for the asymptotic variance σ²_∞ = α_{2,∞} − α²_{1,∞}

(63)   σ²_∞ = π(1 − π)(N + Ns − 2s)/(N(2N − s − 1)) .

This asymptotic variance of the response probabilities depends in relatively simple ways on s and N. If we hold N fixed and differentiate with respect to s, we find that σ²_∞ increases monotonically with s; in particular, for s > 1 this variance is larger than that of the pattern model with the same number of elements N. If we hold the sampling ratio s/N fixed and take the partial derivative with respect to N, we find σ²_∞ to
be a decreasing function of N; in particular, if N → ∞ in such a way that s/N = θ remains constant, then

(64)   σ²_∞ = π(1 − π)θ/(2 − θ) ,

which, as we will see later, is the asymptotic variance for the linear model (Estes and Suppes, 1959a). In contrast, for the pattern model the variance of the p values approaches 0 as N becomes large. We return to comparisons between the two models in Sec. 5.3.

Sequential Predictions. We now examine some sequential statistics for the fixed sample size model that later will help clarify relationships among the various stimulus sampling models. Consider, first, the probability of an A_1 response given that an E_1 occurred on the preceding trial. With our learning axioms in mind, and taking account of the conditioning states on trial n and also the sample on trial n, we may write

Pr(A_1,n+1 | E_1,n) = [1/Pr(E_1,n)] Σ_{i,k} Pr(A_1,n+1 E_1,n s_{i,n} C_{k,n}) ,

where, as before, s_{i,n} denotes the event of drawing on trial n a sample with i elements conditioned to A_1. Expanding the joint probability, we obtain
Pr(A_1,n+1 E_1,n) = Σ_{j,i,k} Pr(A_1,n+1 | C_{j,n+1}) Pr(C_{j,n+1} | E_1,n s_{i,n} C_{k,n}) Pr(E_1,n | s_{i,n} C_{k,n}) Pr(s_{i,n} | C_{k,n}) Pr(C_{k,n}) .

Our reinforcement procedures depend at most on the responses of the subject, and hence

Pr(E_1,n | s_{i,n} C_{k,n}) = Pr(E_1,n) .

Further, by the conditioning axioms,

Pr(C_{j,n+1} | E_1,n s_{i,n} C_{k,n}) = c       if j = k + s − i
                                     = 1 − c   if j = k
                                     = 0       otherwise .

That is, with probability c the s − i elements in the sample not originally conditioned to A_1 become conditioned to A_1 when an E_1 occurs, and the subject moves from state C_k to C_{k+s−i}. Also, as noted with regard to Eq. 59,

Pr(s_{i,n} | C_{k,n}) = (k choose i)(N − k choose s − i)/(N choose s) ,   and   Pr(A_1,n+1 | C_{j,n+1}) = j/N .

Substituting these results in our last expression and dividing by Pr(E_1,n) yields

Pr(A_1,n+1 | E_1,n) = Σ_{i,k} [(k + c(s − i))/N] Pr(s_{i,n} | C_{k,n}) Pr(C_{k,n}) .
We now need the fact that the first raw moment of the hypergeometric distribution is

Σ_i i Pr(s_{i,n} | C_{k,n}) = sk/N ,

permitting the simplification

Pr(A_1,n+1 | E_1,n) = Σ_k [cs/N + (k/N)(1 − cs/N)] Pr(C_{k,n}) ,

whence

(65a)   Pr(A_1,n+1 | E_1,n) = cs/N + (1 − cs/N) Pr(A_1,n) .

By the same method of proof we may show that

(65b)   Pr(A_1,n+1 | E_2,n) = (1 − cs/N) Pr(A_1,n) ,

where we have used the fact that, by definition,

(65c)   Pr(A_1,n) = Σ_k (k/N) Pr(C_{k,n}) .

Finally, as in previous cases (e.g., Eq. 31a), for comparison with other models we give the trigram probabilities Pr(A_{k,n+1} E_{j,n} A_{i,n}).
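Both Eq. 65a and the asymptotic moments of Eq. 63 can be verified by brute-force enumeration over conditioning states and samples. The script below is our own sketch, with arbitrary parameter values:

```python
from math import comb, isclose

def hyper(k, i, N, s):
    # Pr of i A_1-conditioned elements in a sample of size s from state C_k.
    return comb(k, i) * comb(N - k, s - i) / comb(N, s)

def step(dist, N, s, c, pi):
    # One trial of the fixed sample size model, noncontingent schedule.
    new = [0.0] * (N + 1)
    for k, pk in enumerate(dist):
        for i in range(s + 1):
            ps = hyper(k, i, N, s)
            if ps:
                new[k + s - i] += pk * ps * pi * c
                new[k - i] += pk * ps * (1 - pi) * c
                new[k] += pk * ps * (1 - c)
    return new

N, s, c, pi = 5, 2, 0.4, 0.7
dist = [0.0] * (N + 1)
dist[2] = 1.0                                  # arbitrary starting state

for _ in range(3):                             # a few preasymptotic trials
    p_n = sum(pk * k / N for k, pk in enumerate(dist))
    # Enumerate Pr(A_1 on n+1 | E_1 on n) directly:
    direct = sum(pk * hyper(k, i, N, s) *
                 (c * (k + s - i) / N + (1 - c) * k / N)
                 for k, pk in enumerate(dist) for i in range(s + 1))
    assert isclose(direct, c * s / N + (1 - c * s / N) * p_n)      # Eq. 65a
    dist = step(dist, N, s, c, pi)

for _ in range(400):                           # run to asymptote
    dist = step(dist, N, s, c, pi)
m1 = sum(pk * k / N for k, pk in enumerate(dist))
m2 = sum(pk * (k / N) ** 2 for k, pk in enumerate(dist))
sigma2 = pi * (1 - pi) * (N + N * s - 2 * s) / (N * (2 * N - s - 1))
assert isclose(m1, pi, abs_tol=1e-9)
assert isclose(m2 - m1 ** 2, sigma2, abs_tol=1e-9)                 # Eq. 63
```

As the formula of Eq. 63 indicates, the stationary variance does not involve c: the conditioning parameter governs only the rate of approach to asymptote, not the limiting distribution.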
We present these results only for the noncontingent situation in which Pr(E_1,n) = π and r = 2. Derivations of these probabilities are based on the same methods used in connection with Eq. 65 and yield, writing λ = c(s − 1)/(N − 1) and μ = c(N − s)/(N(N − 1)),

(66a)   Pr(A_1,n+1 E_1,n A_1,n) = π{(1 − λ)α_{2,n} + λα_{1,n}}
(66b)   Pr(A_1,n+1 E_1,n A_2,n) = π{cs/N + (1 − cs/N − λ)α_{1,n} − (1 − λ)α_{2,n}}
(66c)   Pr(A_2,n+1 E_1,n A_1,n) = π(1 − λ)(α_{1,n} − α_{2,n})
(66d)   Pr(A_2,n+1 E_1,n A_2,n) = π{1 − cs/N − (2 − cs/N − λ)α_{1,n} + (1 − λ)α_{2,n}}
(66e)   Pr(A_1,n+1 E_2,n A_1,n) = (1 − π){(1 − λ)α_{2,n} − μα_{1,n}}
(66f)   Pr(A_1,n+1 E_2,n A_2,n) = (1 − π)(1 − λ)(α_{1,n} − α_{2,n})
(66g)   Pr(A_2,n+1 E_2,n A_1,n) = (1 − π){(1 + μ)α_{1,n} − (1 − λ)α_{2,n}}
(66h)   Pr(A_2,n+1 E_2,n A_2,n) = (1 − π){1 − (2 − λ)α_{1,n} + (1 − λ)α_{2,n}} .
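Because these expressions are compact but easy to mistranscribe, it is worth checking them against a direct computation on the chain of conditioning states; the script below (our own, with arbitrary parameter values) compares Eq. 66a with an enumeration over states and samples and verifies that the eight expressions sum to one:

```python
from math import comb, isclose

def hyper(k, i, N, s):
    return comb(k, i) * comb(N - k, s - i) / comb(N, s)

def step(dist, N, s, c, pi):
    new = [0.0] * (N + 1)
    for k, pk in enumerate(dist):
        for i in range(s + 1):
            ps = hyper(k, i, N, s)
            if ps:
                new[k + s - i] += pk * ps * pi * c
                new[k - i] += pk * ps * (1 - pi) * c
                new[k] += pk * ps * (1 - c)
    return new

N, s, c, pi = 6, 3, 0.5, 0.6
lam = c * (s - 1) / (N - 1)
mu = c * (N - s) / (N * (N - 1))

dist = [0.0] * (N + 1)
dist[1] = 1.0
for _ in range(4):
    dist = step(dist, N, s, c, pi)            # a preasymptotic trial n

a1 = sum(pk * k / N for k, pk in enumerate(dist))
a2 = sum(pk * (k / N) ** 2 for k, pk in enumerate(dist))

# Joint event A_1 on n+1, E_1 on n, A_1 on n, enumerated directly; the
# response on trial n is A_1 with probability i/s given the sample.
direct = sum(pk * hyper(k, i, N, s) * (i / s) * pi *
             (c * (k + s - i) / N + (1 - c) * k / N)
             for k, pk in enumerate(dist) for i in range(s + 1))
assert isclose(direct, pi * (lam * a1 + (1 - lam) * a2))           # Eq. 66a

eqs = [pi * (lam * a1 + (1 - lam) * a2),
       pi * (c * s / N + (1 - c * s / N - lam) * a1 - (1 - lam) * a2),
       pi * (1 - lam) * (a1 - a2),
       pi * (1 - c * s / N - (2 - c * s / N - lam) * a1 + (1 - lam) * a2),
       (1 - pi) * ((1 - lam) * a2 - mu * a1),
       (1 - pi) * (1 - lam) * (a1 - a2),
       (1 - pi) * ((1 + mu) * a1 - (1 - lam) * a2),
       (1 - pi) * (1 - (2 - lam) * a1 + (1 - lam) * a2)]
assert isclose(sum(eqs), 1.0)
assert min(eqs) >= 0
```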
Application of these equations to the corresponding set of trigram proportions for a preasymptotic trial block is not particularly rewarding. The difficulty is that certain combinations of parameters, e.g., c(s − 1)/(N − 1) and cs/N, behave as units; as a result, the basic parameters c, s, and N cannot be estimated individually and, consequently, the predictions available from the simpler N-element pattern model via Eq. 32 cannot be improved upon by use of Eq. 66. At asymptote, however, the situation is somewhat different. By substituting the limiting values

α_{1,∞} = π

and, from Eq. 63,

α_{2,∞} = π[N − 2s + Ns + 2π(N − s)(N − 1)]/(N(2N − s − 1))

in Eq. 66, we can express the trigram probabilities, e.g., Pr(A_1,∞ E_1,∞ A_1,∞), in terms of the basic parameters of the model. The resulting expressions are somewhat cumbersome, and we shall not pursue this line of analysis further in the present article.
5.2 Component Models with Stimulus Fluctuation

In the preceding section, as in most of the literature on stimulus sampling models for learning, we restricted attention to the special case in which the stimulation effective on successive trials of an experiment may be considered to represent independent random samples from the population of elements available under the given experimental conditions. The concept of stimulus sampling in the model corresponds to the process of stimulation in the empirical situation. Thus sampling and resampling from a stimulus population must take time, and if the interval between trials is sufficiently short, there will not be time to draw a completely new sample. More generally, we would expect that the independence of successive samples would depend on the interval between trials. We should expect the correlation, or degree of overlap, between successive stimulus samples to vary inversely with the intertrial interval, running from perfect overlap in the limiting case (not necessarily empirically realizable) of a zero interval to independence at sufficiently long intervals. These notions have been embodied in the stimulus fluctuation model (Estes, 1955a, 1955b, 1959a). In this section we shall develop the assumption of stimulus fluctuation in connection with fixed sample size models; consequently, the expressions derived will differ in minor respects from those of the earlier presentations (cited above), which were not restricted to the case of fixed sample size.

Assumptions and Derivation of Retention Curves. Following the convention of previous articles on stimulus fluctuation models, we shall denote by S* the set of stimulus elements potentially available for
sampling under a given set of experimental conditions, by S the subset of elements available for sampling at any given time, and by S' the subset of elements that are temporarily unavailable (so that S* = S ∪ S'). The trial sample s is in turn a subset of S; in this presentation, however, we shall assume for simplicity that all of the temporarily available elements are sampled on each trial (i.e., s = S). We denote by N, N', and N* the numbers of elements in S, S', and S*, respectively.

The interchange between the stimulus sample and the remainder of the population is assumed to occur at a constant rate over time. Specifically, we assume that during an interval Δt which is just long enough to permit the interchange of a single element between s and S', there is probability g that such an interchange will occur, the parameter g being constant over time. We shall limit consideration to the special case in which all stimulus elements are equally likely to participate in an interchange. With this restriction, the fluctuation process can be characterized by the difference equation

(67)   f(t + 1) = [1 − g(1/N + 1/N')]f(t) + g/N' ,

where f(t) denotes the probability that any given element of S* is in s at time t. This recursion can be solved by standard methods.
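The difference equation (67) and its explicit solution (Eq. 68 below) can be checked against each other numerically; this sketch is our own, with arbitrary values of N, N', and g:

```python
def f_recursion(f0, N, Np, g, t):
    # Iterate the difference equation (67) for t steps.
    f = f0
    for _ in range(t):
        f = (1 - g * (1 / N + 1 / Np)) * f + g / Np
    return f

def f_closed(f0, N, Np, g, t):
    # Explicit solution (Eq. 68): f(t) = J - (J - f(0)) a**t,
    # with J = N/N* and a = 1 - g(1/N + 1/N').
    J = N / (N + Np)
    a = 1 - g * (1 / N + 1 / Np)
    return J - (J - f0) * a ** t

N, Np, g = 4, 8, 0.3
for f0 in (0.0, 1.0):
    for t in (0, 1, 5, 40):
        assert abs(f_recursion(f0, N, Np, g, t) -
                   f_closed(f0, N, Np, g, t)) < 1e-12

# Over a long rest interval f(t) approaches J = N/N* from either side:
assert abs(f_closed(1.0, N, Np, g, 500) - N / (N + Np)) < 1e-9
assert abs(f_closed(0.0, N, Np, g, 500) - N / (N + Np)) < 1e-9
```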
Solving, we obtain the explicit formula

(68)   f(t) = J − [J − f(0)]a^t ,

where

J = N/N*   and   a = 1 − g(1/N + 1/N') ,

J being the proportion of all elements which are in the sample. Equation 68 can now serve as the basis for deriving numerous expressions of experimental interest. Suppose, for example, that at the end of a conditioning (or extinction) period there were j_0 conditioned elements in S and k_0 conditioned elements in S', the momentary probability of a conditioned response thus being p_0 = j_0/N. To obtain an expression for the probability of a conditioned response after a rest interval of duration t, we proceed as follows. For each conditioned element in S at the beginning of the interval, we need only set f(0) = 1 in Equation 68 to obtain the probability that the element is in S at time t; similarly, for a conditioned element initially in S', we set f(0) = 0. Combining the two types, we obtain for the expected number of conditioned elements in S at time t

j_0[J + (1 − J)a^t] + k_0 J(1 − a^t) .
Dividing by N (and noting that JN* = N), we have, for the probability of a conditioned response at time t,

(69)   p(t) = p* − (p* − p_0)a^t ,

where p* and p_0 denote the proportion of conditioned elements in the total population S* and the initial proportion in S, respectively. If the rest interval begins following a conditioning period, we would ordinarily have p_0 > p*, in which case Equation 69 describes a decreasing function (forgetting, or spontaneous regression). If the rest interval begins following an extinction period, we would have p_0 < p*, in which case Equation 69 describes an increasing function (spontaneous recovery). The manner in which cases of spontaneous regression or recovery depend on the amount and spacing of previous acquisition or extinction has been discussed in detail elsewhere (Estes, 1955a).

Application to the RTT Experiment. We noted in the preceding section that the fixed sample size model could not provide a generally satisfactory account of RTT experiments because it did not allow for the retention loss usually observed between the first and second tests. It seems reasonable that this defect might be remedied by removing the restriction on independent sampling. To illustrate application of the
more general model with provision for stimulus fluctuation, we shall again consider the case of an RTT experiment in which the probability of a correct response is negligible prior to the reinforced trial (and also on later trials if learning has not occurred). Assuming now that N = 1, so that we are dealing with a generalized form of the pattern model, and letting t_1 and t_2 denote the intervals between R and T_1 and between T_1 and T_2, respectively, we may obtain the following basic expressions by setting f(0) equal to 1 or 0, as appropriate, in Equation 68. For the probability that the element sampled on R is sampled again on T_1,

f_1 = J + (1 − J)a^{t_1} ;

for the probability that the element sampled on T_1 is sampled again on T_2,

f_2 = J + (1 − J)a^{t_2} ;

and for the probability that an element not sampled on T_1 is sampled on T_2,

f_3 = J(1 − a^{t_2}) .

With these expressions in hand, we can write the probabilities of the four combinations of correct and incorrect responses on T_1 and T_2 in
terms of the conditioning parameter c and the parameters f_i:

(70)   P_11 = c f_1 f_2
       P_10 = c f_1 (1 − f_2)
       P_01 = c (1 − f_1) f_3
       P_00 = 1 − c + c(1 − f_1)(1 − f_3) ,

where, as before, the subscripts 1 and 0 denote correct responses and errors, respectively. As they stand, Eqs. 70 are not suitable for application to data, for there are too many parameters to be estimated. This difficulty could be surmounted by adding a third test trial, for the resulting eight observation equations would permit overdetermination of the four parameters. In the case of some published studies (e.g.,
Estes, 1961b) the data can be handled quite well on the assumption that f_1 is approximately unity, in which case Eqs. 70 reduce to

P_11 = c f_2 ,   P_10 = c(1 − f_2) ,   P_01 = 0 ,   P_00 = 1 − c .

In the general case of Eq. 70, some predictions can be made without information as to exact parameter values. It has been noted in published studies (Estes, Hopkins, and Crothers, 1960; Estes, 1961b) that the observed proportion P_10 is generally larger than P_01. Taking the difference between the theoretical expressions for these quantities, we have

P_10 − P_01 = c[f_1(1 − f_2) − (1 − f_1)f_3] = c(1 − J)a^{t_1}(1 − a^{t_2}) ,

which obviously must be equal to or greater than zero. The experiments cited above have in all cases had t_1 < t_2, and therefore f_1 > f_2.
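These relations are easy to confirm numerically; the following sketch (our own, with illustrative values of c, J, a, t_1, and t_2) computes Eq. 70 and the difference P_10 − P_01:

```python
def rtt_fluctuation(c, J, a, t1, t2):
    # Eq. 70, with the f's supplied by the fluctuation solution (Eq. 68).
    f1 = J + (1 - J) * a ** t1      # conditioned element sampled again at T1
    f2 = J + (1 - J) * a ** t2      # sampled at T2, given sampled at T1
    f3 = J * (1 - a ** t2)          # sampled at T2, given not sampled at T1
    P = {'11': c * f1 * f2,
         '10': c * f1 * (1 - f2),
         '01': c * (1 - f1) * f3,
         '00': 1 - c + c * (1 - f1) * (1 - f3)}
    return P, (f1, f2, f3)

c, J, a, t1, t2 = 0.6, 0.25, 0.9, 2, 5
P, (f1, f2, f3) = rtt_fluctuation(c, J, a, t1, t2)
assert abs(sum(P.values()) - 1) < 1e-12
diff = c * (1 - J) * a ** t1 * (1 - a ** t2)
assert abs((P['10'] - P['01']) - diff) < 1e-12
assert diff >= 0
assert f1 > f2                      # because t1 < t2 here
```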
Since f_2, which is directly estimated by the proportion of instances in which correct responses on T_1 are repeated on T_2, has ranged from about .6 to .9 in these experiments (and f_1 must be larger), it is clear that P_01, the probability of an incorrect response followed by a correct response, should be relatively small. This theoretical prediction accords well with observation. Numerous predictions can be generated concerning the effects of varying the durations of t_1 and t_2 on the parameters f_1, f_2, and f_3. The probability of repeating a correct response from T_1 to T_2, for example, should depend solely on f_2, decreasing as t_2 increases. The probability of a correct response on T_2 following an incorrect response on T_1 should depend most strongly on f_3, increasing as t_2 increases. The overall proportion of correct responses per test should, of course, decrease from T_1 to T_2 (although the difference between the proportions on T_1 and T_2 tends to zero as t_1 becomes large). Data relevant to these and other predictions are available in studies by Estes, Hopkins, and Crothers (1960), Peterson, Saltzman, Hillner, and Land (1962), and Witte (R. Witte, personal communication). The predictions concerning effects of variation of t_2 are well confirmed by these studies. Results bearing on predictions concerning variation in t_1 are not consistent over the set of experiments, possibly because of artifacts arising from item selection (discussed by Peterson et al., 1962).
Application to the Simple Noncontingent Case. We shall restrict consideration to the special case of N = 1; thus we shall be dealing with a variant of the pattern model in which the pattern sampled on any trial is the one most likely to be sampled on the next trial. No new concepts are required beyond those introduced in connection with the RTT experiment, but it will be convenient to denote by a single symbol, say g, the probability that the stimulus pattern sampled on trial n is exchanged for another pattern on trial n + 1. From Eq. 68,

g = (1 − J)(1 − a^t) ,

where t is now taken to denote the intertrial interval. In terms of the notation used above, we denote by u'_{m,n} the probability of the state of the organism in which m of the N* stimulus patterns are conditioned to the A_1 response and one of these is sampled, and by u''_{m,n} the probability that m patterns are conditioned to A_1 but a pattern conditioned to A_2 is sampled. Obviously,

p_n = Pr(A_1,n) = Σ_m u'_{m,n} .

Now we can write expressions for the trigram probabilities.
Consider the joint event of A_1 responses on trials n and n + 1 with an E_1 intervening. If a pattern conditioned to A_1 is sampled on trial n, then with probability 1 − g it is resampled on trial n + 1, and with probability g(m − 1)/N' it is exchanged for another pattern conditioned to A_1; in either case an A_1 response must occur on trial n + 1. Using the abbreviations

U_n = Σ_m u'_{m,n}(m − 1)/N'   and   V_n = Σ_m u''_{m,n} m/N' ,

the trigram probabilities can be written in relatively compact form:

(71)   Pr(A_1,n+1 E_1,n A_1,n) = π[(1 − g)p_n + gU_n]
       Pr(A_1,n+1 E_1,n A_2,n) = π[c(1 − g)(1 − p_n) + gV_n]
       Pr(A_2,n+1 E_1,n A_1,n) = πg(p_n − U_n)
       Pr(A_2,n+1 E_1,n A_2,n) = π[(1 − c + cg)(1 − p_n) − gV_n]
       Pr(A_1,n+1 E_2,n A_1,n) = (1 − π)[(1 − c)(1 − g)p_n + gU_n]
       Pr(A_1,n+1 E_2,n A_2,n) = (1 − π)gV_n
       Pr(A_2,n+1 E_2,n A_1,n) = (1 − π){[1 − (1 − c)(1 − g)]p_n − gU_n}
       Pr(A_2,n+1 E_2,n A_2,n) = (1 − π)[(1 − p_n) − gV_n] .
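Since the notation of Eq. 71 is easy to garble, it is worth checking the expressions against an exact computation on the underlying Markov chain over states (m, type of sampled pattern); the script and its parameter values are our own:

```python
from math import isclose

Nstar, c, g, pi = 5, 0.4, 0.3, 0.7
Np = Nstar - 1                       # the N' patterns not currently sampled

def evolve(m, b, e):
    # State: m patterns (of N*) conditioned to A_1; b = 1 if the sampled
    # pattern is conditioned to A_1, else 0.  Event e = 1 or 2.  Returns
    # the state distribution at the start of the next trial: conditioning
    # with probability c, then interchange with probability g.
    out = {}
    branches = [((m, b), 1.0)]
    if e == 1 and b == 0:
        branches = [((m + 1, 1), c), ((m, 0), 1 - c)]
    elif e == 2 and b == 1:
        branches = [((m - 1, 0), c), ((m, 1), 1 - c)]
    for (m2, b2), pc in branches:
        k = m2 - b2                  # A_1 patterns among the N' unsampled
        for b3, pb in ((1, g * k / Np + (1 - g) * b2),
                       (0, g * (Np - k) / Np + (1 - g) * (1 - b2))):
            out[(m2, b3)] = out.get((m2, b3), 0.0) + pc * pb
    return out

def trial(dist):
    new = {}
    for (m, b), p in dist.items():
        for e, pe in ((1, pi), (2, 1 - pi)):
            for st, q in evolve(m, b, e).items():
                new[st] = new.get(st, 0.0) + p * pe * q
    return new

dist = {(2, 1): 0.5, (2, 0): 0.5}
for _ in range(3):
    dist = trial(dist)

p_n = sum(p for (m, b), p in dist.items() if b == 1)
U_n = sum(p * (m - 1) / Np for (m, b), p in dist.items() if b == 1)
V_n = sum(p * m / Np for (m, b), p in dist.items() if b == 0)

def joint(e, b_now, b_next):
    pe = pi if e == 1 else 1 - pi
    return sum(p * pe * q
               for (m, b), p in dist.items() if b == b_now
               for (m2, b2), q in evolve(m, b_now, e).items() if b2 == b_next)

assert isclose(joint(1, 1, 1), pi * ((1 - g) * p_n + g * U_n))
assert isclose(joint(1, 0, 1), pi * (c * (1 - g) * (1 - p_n) + g * V_n))
assert isclose(joint(2, 0, 1), (1 - pi) * g * V_n)
assert isclose(joint(2, 1, 0),
               (1 - pi) * ((1 - (1 - c) * (1 - g)) * p_n - g * U_n))
```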
When the intertrial interval becomes large, the parameter g approaches 1 − 1/N*, and Equations 71 reduce to those of a pattern model with N* elements and independent sampling.

The chief difference between these expressions and the corresponding ones for the independent sampling models is that sequential effects now depend on the intertrial interval, the parameter g increasing as the interval increases. Consider, for example, the two of Equations 71 involving repetitions of the A_1 response. It will be noted that both of these expressions represent linear combinations of p_n and U_n, with the relative contribution of U_n increasing as g increases. Since U_n ≤ p_n, with equality obtaining only in the special cases where both are equal to unity or both are equal to zero, the probability of a repetition is inversely related to the intertrial interval; in particular, the probability that a correct A_1 or A_2 response will be repeated tends to unity in the limit as the intertrial interval goes to zero.

Summing the four of Equations 71 that terminate in A_1 on trial n + 1, we obtain a recursion for the probability of the A_1 response:

p_{n+1} = (1 − c)(1 − g)p_n + c(1 − g)π + g(U_n + V_n) .

Now, although a full proof would be quite involved, it is not hard to show heuristically that the asymptote of this process is independent of the intertrial interval. We note first that, asymptotically,
the probability that an element conditioned to A_1 constitutes the trial sample is simply equal to the proportion of such elements in the total population; that is, letting w_m denote the asymptotic probability that exactly m patterns are conditioned to A_1, we have u'_m = (m/N*)w_m and u''_m = [(N* − m)/N*]w_m. The substitution is possible in view of the intuitively evident fact that, asymptotically, the probability that any given pattern is the one sampled does not depend on its conditioning state. It follows that

U + V = Σ_m w_m[(m/N*)(m − 1)/N' + ((N* − m)/N*)(m/N')] = Σ_m w_m(m/N*) = p ,

the simplification in the last line having been effected by means of the identity N' = N* − 1. Substituting this relation into the recursion for p_n, we obtain, at asymptote,

p_{n+1} = (1 − c + cg)p_n + c(1 − g)π .
for any but very long intertrial intervals) the learning curve will rise more sharply on early trials than the corresponding curve for the independent sampling case. with the curve for the longer interval approaching asymptote more rapidly on later trials (Estes. intuitively obVious that for g < 1 1 N* (Le. an element is more likely to iI:le.phiJrter the intertrial interval.c + cg)poo + c(l . If the total nU1llber of errors expected during learning mUllt be independent of the intertrial interval.. .n: ~ 1. Poo 167Poo' we arrive at the tidy Setting outcome Pn+l and solving for Poo ~ whence O· . and E. the curves for longer and shorter inter vals must cross ultimately. This is so because only sampled elements can undergo conditioning. However. for each initially unconditioned element will continue to produce an error each time it is sampled until it is finally conditioned.. The recursion in expressing Pn can be solved. resampled th!'l. but the resulting formula n and the parameters is too cU1llberIt seems as a function of some to yield much useful information by visual inspection.A. 1955b). and the probability of any specified number of errors prior to conditioning depends only on the value of the .. and once sampled.g)n .
conditioning parameter c. Similarly, if after a conditioning session π is set equal to 0, the total number of conditioned responses during extinction is independent of the intertrial interval.

5.3 The Linear Model as a Limiting Case

For those experiments in which the available stimuli are the same on all trials, the possibility arises of using a model that suppresses the concept of stimuli. In such a "pure" reinforcement model the learning assumptions specify directly how response probability changes on a reinforced trial. By all odds the most popular models of this sort are those which assume the probability of a response on a given trial to be a linear function of the probability of that response on the previous trial. The so-called "linear models" received their first systematic treatment by Bush and Mosteller (1951a, 1955) and have been investigated and developed further by many others.11 We shall be concerned only with a certain class of such models based on a single learning parameter θ; a more extensive analysis of this class of linear models has been given by Estes and Suppes (1959a).

11 For a discussion of this general class of "incremental" models see the chapter by Sternberg in this volume.

The linear theory is formulated for the probability of a response on trial n + 1, given the entire preceding sequence of responses and
reinforcements.12 Let x_n be the sequence of responses and reinforcements of a given subject through trial n, that is, a sequence of length 2n with j's (where j = 1, ..., r) in the odd positions indicating responses and i's (where i = 0, 1, ..., r) in the even positions indicating reinforcements. The axioms of the linear model are as follows:

L1. If Pr(E_{i,n} A_{j,n} x_{n−1}) > 0, then
    Pr(A_{i,n+1} | E_{i,n} A_{j,n} x_{n−1}) = (1 − θ)Pr(A_{i,n} | x_{n−1}) + θ .

L2. If Pr(E_{k,n} A_{j,n} x_{n−1}) > 0 for k ≠ i and k ≠ 0, then
    Pr(A_{i,n+1} | E_{k,n} A_{j,n} x_{n−1}) = (1 − θ)Pr(A_{i,n} | x_{n−1}) .

L3. If Pr(E_{0,n} A_{j,n} x_{n−1}) > 0, then
    Pr(A_{i,n+1} | E_{0,n} A_{j,n} x_{n−1}) = Pr(A_{i,n} | x_{n−1}) .

12 In the language of stochastic processes we have a chain of infinite order.
The axioms may be written more compactly in terms of the probability p_{x,i,n} that a subject identified with sequence x makes an A_i response on trial n. By axiom L1, if the subject receives an E_i event on trial n, then (regardless of the response occurring on trial n)

p_{x,i,n+1} = (1 − θ)p_{x,i,n} + θ ;

the probability of A_i increases by a linear transform of the old value. By L2, if some reinforcing event E_k other than E_i occurs on trial n (k ≠ i and k ≠ 0), then

p_{x,i,n+1} = (1 − θ)p_{x,i,n} ;

the probability of A_i decreases by a linear transform of its old value. And by L3, if the subject receives an E_0 event on trial n, then

p_{x,i,n+1} = p_{x,i,n} ;

occurrence of the ("neutral") event E_0 leaves response probabilities unchanged.
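The three axioms translate directly into code. In this sketch (our own, with illustrative parameter values) the expected learning curve under a noncontingent schedule is checked against the closed form quoted below as Eq. 72:

```python
def linear_update(p, event, theta):
    # Axioms L1-L3 for the probability of response A_1 in the two-response
    # case: event 1 reinforces A_1 (L1), event 2 reinforces A_2 (L2),
    # event 0 is the neutral event E_0 (L3).
    if event == 1:
        return (1 - theta) * p + theta
    if event == 2:
        return (1 - theta) * p
    return p

def mean_curve(p1, pi, theta, trials):
    # Expected A_1 probability trial by trial under the simple
    # noncontingent schedule (E_1 with probability pi, E_2 otherwise).
    p, out = p1, [p1]
    for _ in range(trials - 1):
        p = pi * linear_update(p, 1, theta) + (1 - pi) * linear_update(p, 2, theta)
        out.append(p)
    return out

p1, pi, theta = 0.1, 0.75, 0.2
curve = mean_curve(p1, pi, theta, 10)
for n, p in enumerate(curve, start=1):                       # Eq. 72
    assert abs(p - (pi - (pi - p1) * (1 - theta) ** (n - 1))) < 1e-12
assert linear_update(0.5, 0, theta) == 0.5                   # L3
```

With θ = cs/N the curve coincides, trial by trial, with Eq. 61 for the fixed sample size component model.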
From a mathematical standpoint it is important to note that for the linear model the response probability associated with a particular subject is free to vary continuously over the entire interval from 0 to 1, since this probability undergoes linear transformations as a result of reinforcement. Consequently, if one wishes to interpret changes in response probability as transitions among states of a Markov process, one must deal with a continuous state space; the Markov interpretation is thus of little practical value for calculational purposes. In stimulus sampling models, response probability is defined in terms of the proportion of stimuli conditioned, and, since the set of stimuli is finite, so also is the set of values taken on by the response probability of any individual subject. It is this finite character of stimulus sampling models that makes possible the extremely useful interpretation of the models as finite Markov chains.

An inspection of the three axioms for the linear model indicates that they have the same general form as Equations 65, which describe changes in response probability for the fixed sample size component model; if we let θ = cs/N, then the two sets of rules are similar. As one might expect from this observation, many of the predictions generated by the two models are identical when θ = cs/N. For example, in the simple noncontingent situation the mean learning curve for the linear model is

(72)   Pr(A_1,n) = π − (π − p_1)(1 − θ)^(n−1) ,

which is the same as that of the component model (see Estes and Suppes, 1959a, for a derivation of results for the linear model). The derivation of
The last result suggests that the component model may converge to the linear process as N → ∞. This conjecture is substantially correct; it can be shown that, for any reinforcement schedule, both the fixed sample size model and the independent sampling model approach the linear model in the limit, for an extremely broad class of assumptions governing the sampling of elements. The derivation of the linear model from component models holds for any finite number r of responses and for every trial n, not simply at asymptote. The proof of this convergence theorem is lengthy and we shall not present it here; it depends on the fact that the variance of the sampling distribution for any statistic of the trial sample approaches zero as N becomes large. However, as one might expect, the two models are not identical in all respects, as is indicated by a comparison of the asymptotic variances of the response distributions. For the linear model

σ∞² = π(1 - π) θ/(2 - θ) ,

as contrasted to Equation 63 for the component model (see Estes and Suppes, 1959a, for a derivation of results for the linear model). However, as noted above in connection with Equation 63, in the limit (as N → ∞) the σ∞² for the component model equals the predicted value for the linear model. A proof of the convergence theorem is given by Estes and Suppes (1959b); Kemeny and Snell (1957) also have considered the problem, but their proof is restricted to the two-choice noncontingent situation at asymptote.
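The stated asymptotic variance can be checked numerically. The following Monte Carlo sketch is ours (the parameter values are arbitrary); it simulates the linear model under a noncontingent schedule and compares the sample mean and variance of p with π and π(1-π)θ/(2-θ):

```python
import random

def simulate_linear(pi, theta, trials, seed=0):
    """Simulate one subject's p-value sequence under a noncontingent
    schedule: E_1 with probability pi, otherwise E_2."""
    rng = random.Random(seed)
    p, ps = 0.5, []
    for _ in range(trials):
        if rng.random() < pi:
            p = (1 - theta) * p + theta   # Axiom L1: E_1 occurs
        else:
            p = (1 - theta) * p           # Axiom L2: E_2 occurs
        ps.append(p)
    return ps

pi, theta = 0.7, 0.3
ps = simulate_linear(pi, theta, 200_000)[1_000:]   # discard warm-up trials
mean = sum(ps) / len(ps)
var = sum((x - mean) ** 2 for x in ps) / len(ps)
print(mean, var, pi * (1 - pi) * theta / (2 - theta))
```

With these values the predicted asymptotic mean is π = .7 and the predicted variance is .21 × .3/1.7 ≈ .037; the simulated values agree to within sampling error.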
The same limiting result, of course, does not hold for the pattern model discussed in Sec. 3. In the pattern model only one element is sampled on each trial, and it is obvious that as N → ∞ the learning effect of this sampling scheme would diminish to zero.

Comparison of the Linear and Pattern Models. For experimental situations where both the linear model and the pattern model appear to be applicable, it is important to derive differential predictions from the two models which, on empirical grounds, will permit the researcher to choose between them. To this end we display a few predictions for the linear model applied to both the RTT situation and the simple two-response noncontingent situation; these results will then be compared with the corresponding equations for the pattern model.

For simplicity, let us assume that in the case of the RTT situation the likelihood of a correct response by guessing is negligible on all trials. Then, according to the linear model, on a trial when reinforcement is given the probability of the reinforced response changes in accordance with the equation p_{n+1} = (1 - θ)p_n + θ. In the present application the probability of a correct response on the first trial (the R trial) is zero, and hence the probability of a correct response on the first test trial is simply θ. No reinforcement is given on T1, and consequently the probability of a correct response does not change between T1 and T2.
Therefore P00, the probability of a correct response on both T1 and T2 (as defined in connection with Equation 55), is θ²; similarly, P01 = P10 = θ(1 - θ), and P11 = (1 - θ)². Some relevant data are presented in Table 6 (from Estes, 1961b). They represent joint response proportions for 40 subjects, each tested on 15 paired associate items of the type described in Sec. 2.1. In order to minimize the probability of correct responses occurring by guessing, these items were introduced (one per trial) into a larger list, the composition of which changed from trial to trial. A critical item introduced on trial n received one reinforcement (paired presentation of stimulus and response members) followed by a test (presentation of the stimulus member alone) on each of the two succeeding trials, following which it was dropped from the list; thus the RTT design applied to each item.

Insert Table 6 about here

From an inspection of the data column of Table 6 it is obvious that the simple linear model cannot handle these proportions. It suffices to note that the model requires P01 = P10, whereas the difference between these two entries in the data column is quite large. One might try to preserve the linear model by arguing that the pattern of observed results in Table 6 could have arisen as an artifact.
Table 6

Observed Joint Response Proportions for RTT Experiment and Predictions
from Linear Retention-Loss Model and Sampling Model

         Observed     Linear     Sampling
         Proportion   Model      Model

P00        .238        .238        .238
P01        .147        .238        .147
P10        .017        .018        .017
P11        .598        .506        .598
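The simple linear model's joint predictions for the RTT design can be written out directly. The sketch below is ours, with θ set from the observed T1 performance; it makes the forced equality P01 = P10 explicit:

```python
def rtt_linear(theta):
    """Joint outcome probabilities for the RTT design under the simple
    linear model (index 0 = correct, 1 = incorrect; first index is T1)."""
    return {
        "P00": theta * theta,              # correct on both tests
        "P01": theta * (1 - theta),        # correct T1, incorrect T2
        "P10": (1 - theta) * theta,        # incorrect T1, correct T2
        "P11": (1 - theta) * (1 - theta),  # incorrect on both
    }

p = rtt_linear(0.385)           # theta estimated from T1 performance
print(p["P01"], p["P10"])       # identical by construction
```

Whatever value of θ is chosen, the model yields P01 = P10, which is exactly the feature contradicted by the data column of Table 6.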
If, for example, there are differences in difficulty among items (or, equivalently, differences in learning rate among subjects), then the probability of a correct response on the test trials will vary from item to item. Since the proportion of correct responses on T2 is less than that on T1, there is evidently some retention loss; and P10 would clearly be estimated predominantly from items with small learning rates, since instances of incorrect response on T1 would predominantly represent smaller a values than would instances of correct response. On this account, one might expect the predicted proportion of correct responses following incorrect responses to be smaller than that allowed for under the "equal a" assumption, and therefore the linear model might not actually be incompatible with the data of Table 6. We can easily check the validity of such an argument. Suppose that a learning parameter a_i is associated with a proportion f_i of the items (or subjects). Then, in each case where a_i is applicable, the linear model gives P01 = a_i(1 - a_i), and averaging over the population,

P01 = Σ_i f_i a_i (1 - a_i) .

But a similar argument yields

P10 = Σ_i f_i (1 - a_i) a_i .

Since the expressions for P01 and P10 are equal for all distributions of the a_i, it is clear that individual differences in learning rates alone could not account for the observed results. A related hypothesis that might seem to merit consideration is that of individual differences in rates of forgetting; differences among subjects, or items, in susceptibility to this retention loss might be a source of bias in the data.
The hypothesis can be formulated in the linear model as follows. The probability of the correct response on T1 is equal to θ; if there is a retention loss, then the probability of a correct response on T2 will have declined to some value p < θ. If there are individual differences in amount of retention loss, then we should again categorize the population of subjects and items into subgroups, with a proportion f_i of the subjects characterized by retention parameter p_i. Theoretical expressions for the P_ij can be derived for such a population by the same method used in the preceding case. This time the expressions for P01 and P10 are different; the results are as follows:

P00 = θ Σ_i f_i p_i
P01 = θ Σ_i f_i (1 - p_i)
P10 = (1 - θ) Σ_i f_i p_i
P11 = (1 - θ) Σ_i f_i (1 - p_i) .

With a suitable choice of parameter values these expressions could accommodate the difference between the observed proportions P01 and P10; however, another difficulty remains. To obtain a near zero value for P10 would require either a value of θ near unity or a value of Σ_i f_i p_i near zero. The former would be incompatible with the observed proportion of .385 correct on T1, and the latter would be incompatible with the observed proportion of .255 correct on T2.
Thus we have no support for the hypothesis that individual differences in amount of retention loss might account for the pattern of empirical values. One can go on in a similar fashion and examine the results of supplementing the original linear model by hypotheses involving more complex combinations or interactions of possible sources of bias (see Estes, 1961b). For example, one might assume that there are large individual differences in both learning and retention parameters. But even with this latitude it is not easy to adjust the linear model to the RTT data. Suppose that we admit different learning parameters, θ1 and θ2, and different retention parameters, p1 and p2, the combination θ1, p1 obtaining for half the items and the combination θ2, p2 for the other half. Now the formulas become

P00 = (θ1 p1 + θ2 p2)/2
P01 = [θ1(1 - p1) + θ2(1 - p2)]/2
P10 = [(1 - θ1)p1 + (1 - θ2)p2]/2
P11 = [(1 - θ1)(1 - p1) + (1 - θ2)(1 - p2)]/2 .

From the data column of Table 6, the proportions of correct responses on the first and second test trials are, respectively, .385 and .255. Adding the first and second of the equations above yields the theoretical expression (θ1 + θ2)/2 for the former proportion, and adding the first and third yields (p1 + p2)/2 for the latter.
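The two-group formulas and the marginal identities just used can be verified mechanically. This sketch is ours; the particular parameter values are hypothetical, chosen only to illustrate that the T1 and T2 marginals reduce to (θ1+θ2)/2 and (p1+p2)/2:

```python
def rtt_two_group(t1, p1, t2, p2):
    """Joint outcome probabilities when (theta_1, p_1) governs half the
    items and (theta_2, p_2) the other half (index 0 = correct)."""
    return {
        "P00": (t1 * p1 + t2 * p2) / 2,
        "P01": (t1 * (1 - p1) + t2 * (1 - p2)) / 2,
        "P10": ((1 - t1) * p1 + (1 - t2) * p2) / 2,
        "P11": ((1 - t1) * (1 - p1) + (1 - t2) * (1 - p2)) / 2,
    }

pred = rtt_two_group(0.95, 0.502, 0.02, 0.5)   # hypothetical values
t1_correct = pred["P00"] + pred["P01"]          # = (theta_1 + theta_2)/2
t2_correct = pred["P00"] + pred["P10"]          # = (p_1 + p_2)/2
print(t1_correct, t2_correct)
```

The identities hold for any admissible parameter values, which is why the observed marginals translate directly into constraints on (θ1+θ2)/2 and (p1+p2)/2.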
Equating theoretical and observed values, we obtain the constraints

(θ1 + θ2)/2 = .385  and  (p1 + p2)/2 = .255 ,

which should be satisfied by the parameter values. If the proportion P00 in Table 6 is to be predicted correctly, we must have, further,

(θ1 p1 + θ2 p2)/2 = .238 ,

or, substituting from the two equations just above,

θ1 = (.083 + .77 p1)/(2 p1 - .51) .

For the right-hand side of this last equation to have a value between 0 and 1, p1 must be greater than .48. Now the admissible range of parameter values can be further reduced: since p2 cannot be negative, p1 cannot exceed .51, so we have the relatively narrow bounds .48 < p1 ≤ .51.
Using these bounds on p1, we find from the equation expressing θ1 as a function of p1 that θ1 must in turn satisfy .93 ≤ θ1 ≤ 1.0. But now the model is in trouble, for in order to satisfy also the constraint (θ1 + θ2)/2 = .385, θ2 would have to be negative (and the correct response probabilities for half of the items on T1 would also be negative). About the best we can do, without allowing "negative probabilities," is to use the limits we have obtained for θ1 and p1 and arbitrarily assign a zero or small positive value to the combination θ2 p2. Choosing θ1 = .95 and θ2 p2 = .01, we obtain the theoretical values listed for the linear model in Table 6. By introducing additional assumptions or additional parameters we could improve the fit of the linear model to these data, but there would seem to be little point in doing so; the refractoriness of the data to description by any reasonably simple form of the model suggests that perhaps the learning process is simply not well represented by the type of growth function embodied in the linear model. By contrast, these data can be quite readily handled by the stimulus fluctuation model developed in the preceding section. Using the estimates .39 and .61 for the parameters of Equations 70, we obtain the theoretical values listed under "Sampling Model" in Table 6.
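The bounds quoted above follow from the constraint equation for θ1; a quick check (our code, not the authors') evaluates it at the endpoints of the admissible p1 range and exhibits the negative θ2:

```python
def theta1(p1):
    """theta_1 implied by the constraints (theta_1 p_1 + theta_2 p_2)/2 = .238,
    (theta_1 + theta_2)/2 = .385, and (p_1 + p_2)/2 = .255."""
    return (0.083 + 0.77 * p1) / (2 * p1 - 0.51)

# The admissible range .48 < p_1 <= .51 maps onto .93 <= theta_1 <= 1.0:
print(theta1(0.51), theta1(0.482))

# The companion constraint theta_2 = .77 - theta_1 then goes negative:
print(0.77 - theta1(0.51))   # negative, hence the trouble
```

Evaluating at p1 = .51 gives θ1 ≈ .93 and at p1 ≈ .482 gives θ1 ≈ 1.0, so every admissible θ1 forces θ2 = .77 - θ1 below zero.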
One would not, of course, claim that the sampling model has been rigorously tested, since two parameters had to be estimated and there are only three degrees of freedom in this set of data. However, the model does seem more promising than any of the variants of the linear model that have been investigated. More stringent tests of the sampling model can readily be obtained by running similar experiments with longer sequences of test trials, since predictions concerning joint response proportions over blocks of three or more test trials can be generated without additional assumptions.

Additional Comparisons Between the Linear and Pattern Models. We now turn to a few comparisons between the linear model and the multielement pattern model for the simple noncontingent situation. First of all, we note that the mean learning curves for the two models (as given in Equation 37 and Equation 72) are identical if we let θ = c/N. However, the expressions for the variance of the asymptotic response distribution are different, and this difference is reflected in another prediction that provides a more direct experimental test of the two models. This concerns the asymptotic variance of the distribution of the number of A1 responses in a block of K trials, which we denote Var(K). For the linear model (cf. Estes and Suppes, 1959a) the expression for Var(K) involves π(1 - π) together with the parameter θ, whereas for the pattern model the corresponding expression involves π(1 - π) together with c and N.
Note that, for the case θ = c/N, the variance for the pattern model is larger than for the linear model; for other combinations of parameter values, however, the variance for the pattern model can be larger or smaller than for the linear model, depending on the particular values of θ, c, and N. In addition, we present certain asymptotic sequential predictions for the linear model in the noncontingent situation, namely,

lim_{n→∞} Pr(A_{1,n+1} | E_{1,n} A_{1,n}) = (1 - θ)a + θ
lim_{n→∞} Pr(A_{1,n+1} | E_{1,n} A_{2,n}) = (1 - θ)b + θ
lim_{n→∞} Pr(A_{1,n+1} | E_{2,n} A_{1,n}) = (1 - θ)a
lim_{n→∞} Pr(A_{1,n+1} | E_{2,n} A_{2,n}) = (1 - θ)b ,

where

a = [2π(1 - θ) + θ]/(2 - θ)  and  b = 2π(1 - θ)/(2 - θ) .

These predictions are to be compared with Eqs. 34 and 42 for the pattern model. In the case of the pattern model, Pr(A1 | E1A1) and Pr(A1 | E2A1) depend only on c and N, whereas Pr(A1 | E1A2) and Pr(A1 | E2A2) depend on π as well; for the linear model all four of these sequential probabilities depend on both π and θ. Finally, for detailed comparisons between the linear model and the pattern model in application to two-choice data, the reader is referred to Suppes and Atkinson (1960) and Estes and Suppes (1962).
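If a and b are read as the asymptotic conditional expectations E[p | A1] and E[p | A2] (our reading of the derivation), the displayed closed forms can be cross-checked against the asymptotic mean π and variance π(1-π)θ/(2-θ). A sketch, ours, with arbitrary parameter values:

```python
def seq_constants(pi, theta):
    """a = E[p | A_1 response], b = E[p | A_2 response] at asymptote."""
    a = (2 * pi * (1 - theta) + theta) / (2 - theta)
    b = 2 * pi * (1 - theta) / (2 - theta)
    return a, b

pi, theta = 0.7, 0.3
a, b = seq_constants(pi, theta)

# Cross-check via asymptotic moments: E[p] = pi, Var(p) = pi(1-pi)theta/(2-theta)
var = pi * (1 - pi) * theta / (2 - theta)
a_moment = (pi ** 2 + var) / pi               # E[p^2] / E[p]
b_moment = (pi * (1 - pi) - var) / (1 - pi)   # E[p(1-p)] / E[1-p]
print(a, a_moment, b, b_moment)
```

The two routes agree exactly, which is a useful internal-consistency check on the variance and sequential formulas taken together.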
5.4 Applications to Multiperson Interactions

In this section we apply the linear model to experimental situations involving multiperson interactions, in which the reinforcement for any given subject depends both on his response and on the responses of other subjects. Several recent investigations have provided evidence indicating the fruitfulness of this line of development. For example, Atkinson and Suppes (1958) have derived and tested predictions from linear models for behavior in two- and three-person games, and Bush and Mosteller (1955) have analyzed a study of imitative behavior in terms of their linear model; see also Burke (1959, 1960) and Estes (1957a). Suppes and Atkinson (1960) have provided a comparison between pattern models and linear models for multiperson experiments and have extended the analysis to situations involving communication between subjects, monetary payoff, social pressure, economic oligopolies, and related variables.

The simple two-person game has particular advantages for expository purposes, and we use this situation to illustrate the technique of extending the linear model to multiperson interactions. We consider a situation which, from the standpoint of game theory (see, e.g., Luce and Raiffa, 1957), may be characterized as a game in normal form with a finite number of strategies available to each player. Each play of the game constitutes a trial, and a player's choice of a strategy for a given trial corresponds to the selection of a response.
To avoid problems having to do with the measurement of utility (or, from the viewpoint of learning theory, problems of reward magnitude), we assume a unit reward that is assigned on an all-or-none basis. We designate the two players as A and B, and let A_i (i = 1, ..., r) and B_j (j = 1, ..., r') denote the responses available to the two players. Rules of the game require the two players to exhibit their choices simultaneously on all trials (as in a game of matching pennies), and each player is informed that, given the choice of the other player on the trial, there is exactly one choice leading to the unit reward. The set of reinforcement probabilities prescribed by the experimenter may be represented in a matrix (a_ij, b_ij) analogous to the "payoff matrix" familiar in game theory. The number a_ij represents the probability of Player A being correct on any trial of the experiment, given the response pair A_iB_j; similarly, b_ij is the probability of Player B being correct, given the response pair A_iB_j. For example, consider the matrix

             B1            B2

A1        (1, 0)       (1/2, 1/2)

A2      (1/2, 1/2)       (0, 1)

When both subjects make response 1, then only Player A receives reward; when both make response 2, then only Player B receives reward; and when either of the other possible response pairs occurs (i.e., A1B2 or A2B1), each player has probability 1/2 of receiving reward.
It should be emphasized that, although one usually thinks of one player winning and the other losing on any given play of a game, it is quite possible to permit both or neither of the players to be rewarded on any trial; in theory, this is not a necessary restriction on the model. However, to provide a relatively simple theoretical interpretation of reinforcing events, it is essential that on a nonrewarded trial the player be informed (or led to infer) that some other choice, had he made it under the same circumstances, would have been successful. We return to this point later. To simplify our analysis we consider the case in which each subject has only two response alternatives. We denote by E_i^(A) the event of reinforcing the A_i response for Player A, and by E_j^(B) the event of reinforcing the B_j response for Player B. Specifically, if Player A makes an A1 response and is rewarded, then an E_1^(A) occurs; if an A1 is made and no reward occurs, we assume that the other response is reinforced, i.e., an E_2^(A) occurs. The probabilities of occurrence of the reinforcing events are defined in terms of the payoff parameters as follows (for i ≠ i' and j ≠ j'):

Pr(E_i^(A) | A_{i,n} B_{j,n}) = a_ij
Pr(E_{i'}^(A) | A_{i,n} B_{j,n}) = 1 - a_ij
Pr(E_j^(B) | A_{i,n} B_{j,n}) = b_ij        (73)
Pr(E_{j'}^(B) | A_{i,n} B_{j,n}) = 1 - b_ij .
Finally, one last definition to simplify notation: we denote Player A's response probability on trial n by α_n, Player B's by β_n, and the joint probability of an A1 and B1 response by γ_n; that is,

α_n = Pr(A_{1,n}),  β_n = Pr(B_{1,n}),  γ_n = Pr(A_{1,n} B_{1,n}) .

We now derive a theorem that provides recursive expressions for α_n and β_n and that points up a property of the model that greatly complicates the mathematics, namely, that both α_{n+1} and β_{n+1} depend on the joint probability γ_n.

Theorem.

α_{n+1} = (1 - θ_A)α_n + θ_A(a α_n + b β_n + c γ_n + d)        (75a)

β_{n+1} = (1 - θ_B)β_n + θ_B(e β_n + f α_n + g γ_n + h) ,       (75b)

where θ_A and θ_B are the learning parameters for Players A and B, and the coefficients a, ..., h are the functions of the payoff parameters defined below in Equations 79.

Proof. It will suffice to derive the difference equation for α_{n+1}, since the derivation for β_{n+1} is identical. To begin with, from Axioms L1 and L2 we can easily show that the general form of a recursion for α_n is

α_{n+1} = (1 - θ_A)α_n + θ_A Pr(E_{1,n}^(A)) .        (76)

The term Pr(E_{1,n}^(A)) can then be expanded as follows:
Pr(E_{1,n}^(A)) = Σ_{i,j} Pr(E_{1,n}^(A) | A_{i,n} B_{j,n}) Pr(A_{i,n} B_{j,n}) ,

and by (73),

Pr(E_{1,n}^(A)) = a11 Pr(A_{1,n}B_{1,n}) + a12 Pr(A_{1,n}B_{2,n}) + (1 - a21) Pr(A_{2,n}B_{1,n}) + (1 - a22) Pr(A_{2,n}B_{2,n}) .

Next we observe that

Pr(A_{1,n}B_{2,n}) = Pr(A_{1,n}) - Pr(A_{1,n}B_{1,n}) = α_n - γ_n ,        (77a)

Pr(A_{2,n}B_{1,n}) = Pr(B_{1,n}) - Pr(A_{1,n}B_{1,n}) = β_n - γ_n ,        (77b)

and

Pr(A_{2,n}B_{2,n}) = Pr(B_{2,n}) - Pr(A_{1,n}B_{2,n}) = 1 - α_n - β_n + γ_n .        (77c)
Substituting into Equation 76 from Equations 77a, 77b, and 77c and simplifying by means of the definitions of β_n and γ_n, we obtain

α_{n+1} = (1 - θ_A)α_n + θ_A[a11 γ_n + a12(α_n - γ_n) + (1 - a21)(β_n - γ_n) + (1 - a22)(1 - α_n - β_n + γ_n)] ,

whence, collecting terms, we obtain the desired result, which completes the proof.

It has been shown by Lamperti and Suppes (1959) that the limits α, β, and γ of α_n, β_n, and γ_n exist. Substituting these limits into Equations 75a and 75b, we have two linear relations that are independent of θ_A and θ_B, namely,

α = a α + b β + c γ + d
β = e β + f α + g γ + h ,        (78)

where

a = a12 + a22 - 1                e = b21 + b22 - 1
b = a22 - a21                    f = b22 - b12
c = a11 - a12 + a21 - a22        g = b11 - b21 + b12 - b22        (79)
d = 1 - a22                      h = 1 - b22 .
By eliminating γ from Equations 78 we obtain the following linear relation in α and β:

[g(1 - a) + cf]α - [gb + c(1 - e)]β = gd - ch .        (80)

This relationship is one of the few quantitative results that can be directly computed for the linear model. It has, however, the advantageous feature that it is independent of the learning parameters θ_A and θ_B, and therefore may be compared directly with experimental data. Application of this result can be illustrated in terms of the game cited earlier, in which the payoff matrix takes the form

             B1            B2

A1        (1, 0)       (1/2, 1/2)

A2      (1/2, 1/2)       (0, 1)

From Equations 79 we obtain

a = -1/2    b = -1/2    c = 1     d = 1
e = 1/2     f = 1/2     g = -1    h = 0 ,

and Equation 80 becomes

α + β = 1 .
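The coefficient computation and the elimination can be mechanized. The sketch below is ours, using the coefficient definitions as reconstructed in Equations 79; it reproduces the listed values for the example game and the resulting relation α + β = 1:

```python
def coefficients(A, B):
    """Equations 79: coefficients of the limit relations (Eq. 78),
    computed from payoff matrices A = (a_ij), B = (b_ij), indexed [i-1][j-1]."""
    a = A[0][1] + A[1][1] - 1
    b = A[1][1] - A[1][0]
    c = A[0][0] - A[0][1] + A[1][0] - A[1][1]
    d = 1 - A[1][1]
    e = B[1][0] + B[1][1] - 1
    f = B[1][1] - B[0][1]
    g = B[0][0] - B[1][0] + B[0][1] - B[1][1]
    h = 1 - B[1][1]
    return a, b, c, d, e, f, g, h

# Example game: Player A rewarded when both choose response 1, Player B
# when both choose response 2; mixed pairs rewarded with probability 1/2.
A = [[1, 0.5], [0.5, 0]]
B = [[0, 0.5], [0.5, 1]]
a, b, c, d, e, f, g, h = coefficients(A, B)

# Equation 80: [g(1-a)+cf] alpha - [gb+c(1-e)] beta = gd - ch
coef_alpha = g * (1 - a) + c * f
coef_beta = -(g * b + c * (1 - e))
const = g * d - c * h
print(coef_alpha, coef_beta, const)   # all equal, i.e., alpha + beta = 1
```

Dividing the resulting relation through by its common coefficient gives α + β = 1, the parameter-free prediction discussed next.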
From this result we predict immediately that the long-run proportions of A1 and B1 responses will sum to unity. To derive a prediction for Player A we substitute the known values of the parameters into the first part of Eq. 78 to obtain

α = 1/2 + γ .

Unfortunately, we cannot compute γ, the asymptotic probability of the A1B1 response pair. However, since γ is positive and cannot be greater than 1/2, i.e., 0 < γ ≤ 1/2, we can set definite bounds on the long-run probability of an A1 response; and since β = 1 - α, Player B's asymptotic response probability is then bounded correspondingly. Thus we have the basis for a rather exacting experimental test, since these asymptotic predictions for both subjects are parameter-free; i.e., they do not depend on the θ values of either subject or on initial response probabilities. Of course, by imposing restrictions on the experimentally determined parameters, a variety of other results can be obtained. We limit ourselves to the consideration of one such case, namely, payoff parameters chosen so that the coefficients of γ_n vanish in the recursive equations 75a and 75b; i.e., c = g = 0.
With c = g = 0, Equations 75a and 75b become

α_{n+1} = a α_n + b β_n + d

β_{n+1} = e β_n + f α_n + h ,        (81)

where for brevity the coefficients have been redefined to absorb the learning parameters (e.g., the new a stands for 1 - θ_A + θ_A a, and similarly for the others). Solutions for this system are well known and can be obtained by a number of different techniques; for a detailed discussion of the problem of obtaining explicit expressions for α_n and β_n for arbitrary n, the reader is referred to an article by Burke (1960). We do know, however, that the limits α and β of α_n and β_n exist and are, in general, independent of both the initial conditions and θ_A and θ_B. By substituting α = α_{n+1} = α_n and β = β_{n+1} = β_n into the two recursions we obtain

α = [bh + d(1 - e)] / [(1 - a)(1 - e) - bf]

and

β = [fd + h(1 - a)] / [(1 - a)(1 - e) - bf] .

The fact that α and β are independent of θ_A and θ_B under the restrictions imposed on the parameters in no way implies that γ is also independent of these quantities. A great deal of experimental work has been conducted on this restricted problem and, in general, the correspondence between predicted and observed values has been very good; for an account of this work see Atkinson and Suppes (1958) and Suppes and Atkinson (1960). Equations 81 provide a very precise test of the model, and the necessary conditions for this test involve only experimentally manipulable parameters.
In conclusion we should mention that not all of the predictions presented in this section are identical to those that can be derived from the pattern model of Section 2; in general, only the grosser predictions, such as those for α_n and β_n, are the same for the two models.
6. DISCRIMINATION LEARNING

The distinction between simple learning and discrimination learning is somewhat arbitrary. By discrimination we refer, roughly speaking, to the process by which the subject learns to make one response to one of a pair of stimuli and a different response to the other. But there is an element of discrimination in any learning situation. Even in the simplest conditioning experiment, the subject learns to make a conditioned response only when the conditioned stimulus is presented, and therefore to do something else when that stimulus is absent. In the paired associate situation (referred to several times in previous sections) the subject learns to associate the appropriate member of a response set with each member of a set of stimuli, and therefore to "discriminate" the stimuli. The principal basis for differentiation between the two categories of learning seems to be that in the case of discrimination learning the similarity, or communality, between stimuli is a major independent variable, whereas in the case of simple learning stimulus similarity is an extraneous factor, to be minimized experimentally and neglected in theory so far as possible. One of the general strategic assumptions of the type of stimulus-response theory which has been associated with the development of stimulus sampling models is that discrimination learning involves simply a combination of processes each of which can be studied independently in simpler situations: the learning aspect in experiments on simple acquisition or extinction, and the stimulus relationships in experiments on stimulus generalization or transfer of training.
Thus there will be nothing new at the conceptual level in our treatment of discrimination. We propose only to show how the processes of association and generalization treated in preceding sections enter into discrimination learning. There is adequate scope for analysis of different types of discriminative situations, and this can be accomplished by formulating assumptions and deriving results of general interest for a few important cases; but since our main concern in this article is with methods rather than content, we shall not go far in this direction.

6.1 The Pattern Model for Discrimination Learning

As in the cases of simple acquisition and probability learning, it is sometimes useful in the treatment of discriminative situations to ignore generalization effects among the stimuli involved in an experiment and to regard each stimulus display as a unique pattern. The behavior elicited by a stimulus display will then depend only on the subject's reinforcement history with respect to that particular pattern. Two important variants of the model arise according as experimental arrangements do or do not ensure that the subject will sample the entire stimulus display presented on each trial.

Case 1. All cues presented are sampled on each trial. For a classical two-stimulus, two-response discrimination problem (e.g., the Lashley situation, with the rat being differentially rewarded for jumping to a black card and avoiding a grey card), our conceptualization requires a distinction among three types of cues.
We shall denote by S1 the set of component cues present only in the stimulus situation associated with reinforcement of response A1, by S2 the set of cues present only in the situation associated with reinforcement of response A2, and by Sc the set of cues present in both situations. In the example of the Lashley situation, A1 might be the response of jumping to the left-hand window, A2 the response of jumping to the right-hand window, S1 the stimulation present only on trials with black cards, S2 the stimulation present only on trials with grey cards, and Sc the stimulation common to both types of trials. We let N1, N2, and Nc stand for the number of cues in each of these subsets. In standard experiments the "cues" refer to experimentally manipulable aspects of the situation, such as tones, colors, objects, or the like, and it is reasonably well known just how many different combinations of these cues will be responded to by subjects as distinct patterns. In some instances, however, the experimenter may have no a priori knowledge as to the patterns distinguishable by the subject; in such instances the N_i may be treated as unknown parameters to be estimated from data, and the model may thus serve as a tool to aid in securing evidence as to the subject's perceptions of the physical situation. Suppose, now, that the experimenter's procedure is to present on some trials (T1 trials) a set of cues including m1 from S1 and mc from Sc, and on the remaining trials (T2 trials) a set including m2 from S2 and mc from Sc. Further, let the two types of trials occur with equal probabilities.
We can obtain an expression for the probability of a correct response on a T1 trial simply by appropriate substitution into Eq. 28, viz.,

Pr(A_{1,n} | T1) = 1 - [1 - Pr(A_{1,1} | T1)][1 - c/(2N*)]^{n-1} ,        (82)

where n is the ordinal number of the trial and N* = (N1 choose m1)(Nc choose mc) is the number of distinct patterns that can occur on T1 trials. It is apparent by inspection of Eq. 82 that (for the above specified experimental conditions) the pattern model predicts that probability of correct responding will go asymptotically to unity regardless of the numbers of relevant and irrelevant cues, provided only that neither m1 nor m2 is equal to zero. In the discrimination literature, cues in the sets S1 and S2 are commonly referred to as relevant, and those in Sc as irrelevant, since the former are correlated with reinforcing events whereas the latter are not. Rate of approach to asymptote on each type of trial is inversely related to the total number of patterns available for sampling. Therefore, other things equal, rate of learning is decreased (and total errors to criterion increased) by the addition of either relevant or irrelevant cues.
Case 2. Partial sampling of the cues presented on each trial. We consider now the situation which arises if the number of cues presented per trial is too large, or the exposure time too short, for the entire stimulus display to be sampled by the subject. Let us suppose for simplicity that there are only two stimulus displays: the display on T1 trials comprises the N1 cues of S1 together with the Nc cues of Sc, and the display on T2 trials comprises the N2 cues of S2 together with the Nc cues of Sc. For a given fixed exposure time we assume a fixed sample size, s, of cues drawn on each trial. In Case 2, two types of patterns need to be distinguished for each type of trial; we can limit consideration to T1 trials, since analogous arguments hold for T2. There may be some patterns (samples) including only cues from Sc, and learning with respect to these will be on a simple random reinforcement schedule; the remaining patterns include at least one cue from S1, with the remainder from Sc. The asymptote of discriminative performance will depend on the size of s relative to Nc. If s ≤ Nc, the entire sample can come from the set of irrelevant cues, and the asymptotic probability of a correct response will then be less than unity. The proportion of such patterns, w_c, is given by

w_c = (Nc choose s) / (N1 + Nc choose s) ,

which is equal to zero if s > Nc.
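The proportion w_c is a simple ratio of binomial coefficients; a sketch (ours, with hypothetical cue counts) makes the two regimes explicit:

```python
from math import comb

def w_c(N1, Nc, s):
    """Proportion of possible samples of fixed size s, drawn from a T1
    display of N1 + Nc cues, that contain only irrelevant cues from S_c."""
    if s > Nc:
        return 0.0          # sample too large to consist of S_c cues alone
    return comb(Nc, s) / comb(N1 + Nc, s)

print(w_c(4, 6, 3))   # some samples are all-irrelevant: C(6,3)/C(10,3)
print(w_c(4, 6, 7))   # s > Nc: every sample contains a relevant cue
```

With N1 = 4 relevant and Nc = 6 irrelevant cues and sample size 3, one sample in six contains no relevant cue; once s exceeds Nc, w_c vanishes and the asymptote returns to unity.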
If T1 and T2 trials have equal probabilities, then the probability, to be denoted V_n, that a pattern containing only cues from Sc will be conditioned to the A1 response on trial n can be obtained from Eq. 28, this time with π = 1/2:

V_n = 1/2 - (1/2 - V_1)(1 - c b_{1c})^{n-1} ,        (83)

where b_{1c} denotes the probability that the given pattern is sampled on a trial when it is available. The remaining patterns available on T1 trials all contain at least one cue from S1, and thus occur only on trials when the A1 response is reinforced. The probability, to be denoted U_n, that any one of these is conditioned to A1 on trial n may be similarly obtained by rewriting Eq. 28, with π = 1:

U_n = 1 - (1 - U_1)[1 - (1/2)c b_{1c}]^{n-1} ,        (84)

where the factor 1/2 enters because these patterns are available for sampling on only 1/2 of the trials. Now, to obtain the probability of an A1 response if a T1 display is presented on trial n, we need only combine Eqs. 83 and 84, weighting each by the probability of the appropriate type of pattern, viz.,
     Pr(A_1,n | T_1,n) = (1 - w_c)U_n + w_c V_n
                       = 1 - (1/2)w_c - (1 - w_c)(1 - U_1)(1 - (1/2)c b_1c)^(n-1)
                         - w_c(1/2 - V_1)(1 - (1/2)c b_1c)^(n-1),     (85)

which may be simplified, if U_1 = V_1 = 1/2, to

     Pr(A_1,n | T_1,n) = 1 - (1/2)w_c - (1/2)(1 - w_c)(1 - (1/2)c b_1c)^(n-1).     (85a)

The resulting expression for probability of a correct response has a number of interesting general properties. As anticipated, asymptotic performance varies inversely with w_c, the proportion of "irrelevant patterns." When w_c = 0, the asymptotic probability of a correct response is unity; when w_c = 1, the whole process reduces to simple random reinforcement. Between these extremes, the asymptote, 1 - (1/2)w_c, depends in a simple way on w_c, so that the terminal proportion of correct responses on either type of trial provides a simple estimate of this parameter from data. The slope parameter, c b_1c, could then be estimated from total errors over a series of trials. As in Case 1, the rate of approach to asymptote proves to depend only on the conditioning parameters and the total number of patterns available for sampling.
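The learning curve of Eq. 85a and its dependence on w_c can be illustrated numerically. The following sketch is not part of the original treatment, and the parameter values in it are arbitrary; it computes w_c from the cue counts and sample size (Eq. 83) and evaluates Eq. 85a:

```python
from math import comb

def w_c(N1, Nc, s):
    # Proportion of size-s samples from the T1 display (N1 + Nc cues) that
    # contain only irrelevant cues from S_c (Eq. 83); comb(Nc, s) is zero
    # when s > Nc, so w_c vanishes in that case, as required.
    return comb(Nc, s) / comb(N1 + Nc, s)

def p_correct(n, wc, cb):
    # Eq. 85a with U_1 = V_1 = 1/2; cb stands for the product c*b_1c.
    return 1 - wc / 2 - 0.5 * (1 - wc) * (1 - cb / 2) ** (n - 1)

# Illustrative values: 2 relevant cues, 4 common cues, sample size 3.
wc = w_c(2, 4, 3)                                   # 4/20 = 0.2
curve = [p_correct(n, wc, 0.1) for n in range(1, 501)]
# The curve rises from 1/2 toward the asymptote 1 - w_c/2 = 0.9.
```

If s exceeds N_c, the function returns w_c = 0 and the asymptote is unity, in agreement with Case 1.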
Thus it is a joint function of the total number of cues, N_1 + N_c, and of the sample size, s, but does not depend on the relative proportions of relevant and irrelevant cues. The last result may seem implausible, but it should be noted that it depends on the simplifying assumption of the pattern model that there are no transfer effects from learning on one pattern to performance on another pattern which has component cues in common with the first. The situation in this regard will be different for the "mixed model" to be discussed below.

6.2 A Mixed Model

The pattern model may provide a relatively complete account of discrimination data in situations involving only distinct, readily discriminable patterns of stimulation, as, for example, the "paired comparison" experiment discussed in Sec. 3.3 or the verbal discrimination experiment treated by Bower (1962). Also, this model may account for some aspects of the data (e.g., asymptotic performance level, trials to criterion) even in discrimination experiments where similarity, or communality, among stimuli is a major variable. But to account for other aspects of the data in cases of the latter type, it is necessary to deal with transfer effects throughout the course of learning. The approach to this problem which we now wish to consider employs no new conceptual apparatus, but simply a combination of ideas developed in preceding sections. In the mixed model, the conceptualization of the discriminative situation and the learning assumptions are exactly the same as those of the pattern model discussed in Sec. 6.1. The only change is in the
response rule, and it is altered in only one respect. As before, we assume that, once a stimulus pattern has become conditioned to a response, it will evoke that response on each subsequent occurrence (unless on some later trial the pattern becomes reconditioned to a different response, as may occur during reversal of a discrimination). The new feature concerns patterns which have not yet become conditioned to any of the response alternatives of the given experimental situation, but which have component cues in common with other patterns which have been so conditioned. Our assumption is simply that transfer occurs from a conditioned to an unconditioned pattern in accordance with the assumptions utilized in our earlier treatment of compounding and generalization (specifically, axiom C2, together with a modified version of C1(iii), of Sec. 4.1). Before the assumptions about transfer can be employed unambiguously in connection with the mixed model, the notion of the conditioned status of a component cue needs to be clarified. We shall say that a cue is conditioned to response A_i if it is a component of a stimulus pattern that has become conditioned to response A_i. If a cue belongs to two patterns, one of which is conditioned to response A_i and one to response A_j, then the conditioning status of the cue follows that of the more recently conditioned pattern. If a cue belongs to no conditioned pattern, then it is said to be in the unconditioned, or "guessing," state. Note that a pattern may be unconditioned even though all of its cues are conditioned. Suppose, for example, that a pattern
consisting of cues x, y, and z in a particular arrangement has never been presented during the first n trials of an experiment, but that each of the cues has appeared in other patterns, say wxy and wvz, which have been presented and conditioned. Then all of the cues of pattern xyz would be conditioned, but the pattern itself would still be in the unconditioned state. Consequently, if wxy had been conditioned to response A_1 and wvz to A_2, the probability of A_1 in the presence of xyz would be determined by transfer from the component cues. But if now response A_1 were reinforced in the presence of pattern xyz, its probability of evocation by that pattern would henceforth be unity. The only new complication arises if an unconditioned pattern includes some cues which are still in the unconditioned state. Several alternative ways of formulating the response rule for this case have some plausibility, and it is by no means sure that any one choice will prove to hold for all types of situations. We shall here limit consideration to the formulation suggested by a recent study of discrimination and transfer which has been analyzed in terms of the mixed model (Estes and Hopkins, 1961). The amended response rule for patterns including unconditioned cues is as follows in this formulation: Axiom C2 of Sec. 4.1 is reinterpreted so that, in a situation involving r response alternatives, if all cues in a pattern are unconditioned, the probability of any response is equal to 1/r;
more generally, if a pattern (sample) comprises m cues conditioned to response A_i, m' cues conditioned to other responses, and m'' unconditioned cues, then the probability that A_i will be evoked by this pattern is given by

     Pr(A_i) = (m + m''/r) / (m + m' + m'').

In other words, axiom C2 holds, but with each unconditioned cue contributing "weight" 1/r toward the evocation of each of the alternative responses. To illustrate these assumptions in operation, let us consider a simple classical discrimination experiment involving three cues, a, b, and c, and two responses, A_1 and A_2. The pattern ac is presented on half of the trials and bc on the other half, the two types of trials occurring in random sequence. We shall assume that A_1 is reinforced on trials when ac is presented, and A_2 on trials when bc is presented. We assume further that conditions are such as to ensure the subject's sampling both cues presented on each trial. The possible conditioning states of each pattern and the probability of response A_1 associated with each may now be tabulated as follows:
          States          Probability of A_1
       ac      bc           ac       bc
        1       2            1        0
        1       1            1        1
        2       2            0        0
        2       1            0        1
        0       1           3/4       1
        0       2           1/4       0
        1       0            1       3/4
        2       0            0       1/4
        0       0           1/2      1/2

where a 1, 2, or 0 in a States column indicates that the pattern is conditioned to A_1, conditioned to A_2, or unconditioned, respectively. For each pair of values under States, the associated probabilities of response A_1, computed according to the modified response rule, are given in the corresponding positions under Probability. To reduce algebraic complications, we shall carry out derivations for the special case in which the subject starts the experiment with both patterns unconditioned. Then, under the conditions of reinforcement specified above, only the states represented in the first, sixth, seventh, and ninth rows of the table are available to the subject, and for brevity we shall number these states 3, 2, 1, and 0, in the order just
listed. State 3: pattern ac conditioned to A_1 and pattern bc conditioned to A_2. State 2: pattern bc conditioned to A_2 and pattern ac unconditioned. State 1: pattern ac conditioned to A_1 and pattern bc unconditioned. State 0: both patterns ac and bc unconditioned. Now these states can be interpreted as the states of a Markov chain, since the probability of transition from any one of them to any other on a given trial is independent of the preceding history. The matrix of probabilities for one-step transitions among the four states takes the following form,

           3         2         1         0
     3     1         0         0         0
     2    c/2     1 - c/2      0         0
     1    c/2        0      1 - c/2      0          (86)
     0     0        c/2       c/2     1 - c

where the states are ordered 3, 2, 1, 0 from top to bottom and from left to right. State 3 (in which ac is conditioned to A_1 and bc to A_2) is an absorbing state, and the process must terminate in this state, with asymptotic probability of a correct response to each pattern equal to unity. This state can be reached only
from State 1 or State 2, in either case by a transition which has probability c/2. The probability, u_i,n, of being in state i on trial n can be derived quite easily for each state. The subject is assumed to start the experiment in State 0 and has probability c of leaving this state on each trial; hence

     u_0,n = (1 - c)^(n-1).

For, from State 0, the probability of the transition to State 1 on any trial is 1/2 (the probability that pattern ac is presented) times c (the probability that the reinforcing event produces conditioning); thus the corresponding entry in the bottom row is c/2, and the other cells are filled in similarly. To be in State 1 on trial n, the subject must have entered at the end of trial 1, which has probability c/2, and then remained for n - 2 trials, which has probability (1 - c/2)^(n-2); or have entered at the end of trial 2, which has probability (1 - c)(c/2), and then remained for n - 3 trials, which has probability (1 - c/2)^(n-3); and so on. Thus we can write a recursion,

     u_1,n = (1 - c/2) u_1,n-1 + (c/2) u_0,n-1,

which holds
for n >= 2. The right-hand side of this recursion can be summed to yield

     u_1,n = (1 - c/2)^(n-1) - (1 - c)^(n-1).

By an identical argument,

     u_2,n = (1 - c/2)^(n-1) - (1 - c)^(n-1),

and then by subtraction,

     u_3,n = 1 - u_0,n - u_1,n - u_2,n = 1 - 2(1 - c/2)^(n-1) + (1 - c)^(n-1).
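The closed-form state probabilities can be checked against direct iteration of the transition matrix in Eq. 86. A minimal sketch (the values of c and n are arbitrary):

```python
def state_probs(c, n):
    # One-step transition matrix of Eq. 86, states ordered 3, 2, 1, 0.
    P = [[1,    0,        0,        0    ],
         [c/2,  1 - c/2,  0,        0    ],
         [c/2,  0,        1 - c/2,  0    ],
         [0,    c/2,      c/2,      1 - c]]
    u = [0.0, 0.0, 0.0, 1.0]          # the subject starts in State 0
    for _ in range(n - 1):            # n - 1 steps reach trial n
        u = [sum(u[i] * P[i][j] for i in range(4)) for j in range(4)]
    return u                          # [u3, u2, u1, u0] on trial n

c, n = 0.3, 6
u3, u2, u1, u0 = state_probs(c, n)
x, y = (1 - c / 2) ** (n - 1), (1 - c) ** (n - 1)
# Closed forms derived in the text: u0 = y, u1 = u2 = x - y, u3 = 1 - 2x + y.
# Probability of a correct response to ac, weighting states by 1, 1/4, 1, 1/2:
p = u3 + 0.25 * u2 + u1 + 0.5 * u0
```

The value p computed here agrees with the closed-form learning curve derived next (Eq. 87).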
From the tabulation of states and response probabilities, we know that the probability of response A_1 to pattern ac is equal to 1, 1/4, 1, and 1/2 when the subject is in State 3, 2, 1, or 0, respectively. Consequently, the probability of an A_1 response to ac on trial n is obtained simply by summing these response probabilities, each weighted by the corresponding state probability, viz.,

     Pr(A_1,n | ac) = u_3,n + (1/4)u_2,n + u_1,n + (1/2)u_0,n
                    = 1 - (3/4)(1 - c/2)^(n-1) + (1/4)(1 - c)^(n-1).     (87)

By an identical argument, the same expression is obtained for the probability of an A_2 response to bc on trial n, and consequently Eq. 87 expresses also the probability, p_n, of a correct response on any trial, without regard to the stimulus pattern presented. A simple estimator of the conditioning parameter is now obtainable by summing the error probability over trials. Letting e denote the expected total errors during learning, we have

     e = sum over n of (1 - p_n) = 3/(2c) - 1/(4c) = 5/(4c).

An example of the sort of prediction involving a relatively direct assessment of transfer effects is the following. Suppose the first
stimulus pattern to appear is ac. The probability of a correct response to it is, by hypothesis, 1/2; and if there were no transfer between patterns, the probability of a correct response to bc on its first appearance should also be 1/2, regardless of the trial on which it first appeared. Under the assumptions of the mixed model, however, the probability of a correct response to bc, if it first appeared on trial 2, should be (1 - c)(1/2) + c(1/4) = 1/2 - c/4; if it first appeared on trial 3, it should be (1 - c)^2 (1/2) + [1 - (1 - c)^2](1/4); and so on, tending to 1/4 after a sufficiently long prior sequence of ac trials. Simply by inspection of the transition matrix, we can develop an interesting prediction concerning behavior during the presolution period of the experiment. By presolution period, we mean the sequence of trials prior to the last error for any given subject. We know that the subject cannot be in State 3 on any trial prior to the last error. On all trials of the presolution period, then, the probability of a correct response should be equal either to 1/2 (if no conditioning has occurred) or to 5/8 (if exactly one of the two stimulus patterns has been conditioned to its correct response).
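Two of the quantities just derived, the expected total errors e = 5/(4c) and the presolution bounds, can be verified numerically. A sketch, with an arbitrary value of c:

```python
def expected_total_errors(c, horizon=20000):
    # Sum of the per-trial error probabilities 1 - p_n, with p_n from Eq. 87;
    # the horizon is long enough that the geometric tails are negligible.
    return sum(0.75 * (1 - c / 2) ** (n - 1) - 0.25 * (1 - c) ** (n - 1)
               for n in range(1, horizon + 1))

c = 0.25
e = expected_total_errors(c)           # analytic value: 5/(4c) = 5.0

# Presolution accuracy: 1/2 in State 0; in State 1 (or 2) it is
# (1/2)(1) + (1/2)(1/4) = 5/8, so the proportion of correct responses over
# the presolution period should fall between these bounds.
lower, upper = 0.5, 0.5 * 1 + 0.5 * 0.25
```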
Thus the proportion of correct responses over the presolution trial sequence, which we may denote by P_ps, should fall in the interval 1/2 <= P_ps <= 5/8, and, in fact, the same bounds obtain for any subset of trials within the presolution sequence. Clearly, predictions from this model concerning presolution responding differ sharply from those derivable from any model that assumes a continuous increase in probability of correct responding during the presolution period; this model also differs, though not so sharply, from a pure "insight" model assuming no learning on presolution trials. So far as we know, no data relevant to these differential predictions are available in the literature (though similar predictions have been tested in somewhat different situations: Suppes and Ginsberg, 1962a; Theios, 1961). Now that the predictions are in hand, it seems likely that pertinent analyses will be forthcoming.

The development in this section was for the case in which there were only three cues, a, b, and c. For the more general case we could assume that there are N_a cues associated with stimulus a, N_b with stimulus b, and N_c with stimulus c. If we assume, as we have in this section, that experimental conditions are such as to ensure the subject's sampling all cues presented on each trial, then Eq. 87 may be rewritten as
     Pr(A_1,n | ac) = 1 - (1/2)(1 + w_1)(1 - c/2)^(n-1) + (1/2)w_1 (1 - c)^(n-1),

where w_1 = N_c/(N_a + N_c); the parameter w_1 (and, similarly, w_2 = N_c/(N_b + N_c)) is an index of similarity between the stimuli ac and bc. Further, the expected number of total errors,

     e = sum over n of {(1/2)[1 - Pr(A_1,n | ac)] + (1/2)[1 - Pr(A_2,n | bc)]},

increases as w approaches its maximum value of unity. Finally, the proportion of correct responses over the presolution trial sequence should fall in either the interval

     1/2 <= P_ps <= 1/2 + (1/4)(1 - w_2)

or the interval

     1/2 <= P_ps <= 1/2 + (1/4)(1 - w_1),

depending on whether ac or bc, respectively, is conditioned first.

6.3 Component Models

So long as the number of stimulus patterns involved in a discrimination experiment is relatively small, an analysis in terms of an appropriate case of the mixed model can be effected along the lines
2. if the number of patterns is large enough so that any particular pattern is unlikely to be sampled . However. and E.A. 209b indicated in Sec. But the number of cues need become only moder ately large in order to generate a number of patterns so great as to be unmanageable by these methods. 6:.
more than once during an experiment, the emendations of the response rule presented in Sec. 6.2 can be neglected and the process treated as a simple extension of the component model of Sec. 4. Suppose, for example, that a classical discrimination involved a set S_1 of cues available only on trials when A_1 is reinforced, a set S_2 of cues available only on trials when A_2 is reinforced, and a set S_c of cues common to the two types of trials. Suppose further that a constant fraction of each set presented is sampled by the subject on any trial, and that the numbers of cues in the various sets are large enough so that the number of possible trial samples is larger than the number of trials in the experiment. Then we may apply Eq. 53 of Sec. 5.1 to obtain approximate expressions for response probabilities. If the two types of trials occur with equal probabilities, then asymptotically all of the elements of S_1 and half of the elements of S_c (on the average) would be conditioned to response A_1, and therefore the probability of A_1 on a trial when S_1 was presented would be predicted to have a value which will, in general, be intermediate between 1/2 and unity. Functions for learning curves and other aspects of the data can be derived for various types of discrimination experiments from the assumptions of the component model. Numerous results of this sort have been
published (Atkinson, 1961a; Burke and Estes, 1957; Bush and Mosteller, 1951b; Estes, 1958; Estes, Burke, Atkinson, and Frankmann, 1957; Popper, 1959; Popper and Atkinson, 1958).

6.4 Analysis of a Signal Detection Experiment

Although thus far we have developed stimulus sampling models only in connection with simple associative learning and discrimination learning, it should be noted that such models may have much broader areas of application. On occasion one may even see possibilities of using the concepts of stimulus sampling and association to interpret experiments that, by conventional classifications, do not fall within the area of learning. In this section we examine such a case. The experiment to be considered fits one of the standard paradigms associated with studies of signal detection (see, e.g., Swets, Tanner, and Birdsall, 1961; Tanner and Swets, 1954). The subject's task in this experiment, like that of an observer monitoring a radar screen, is to detect the presence of a visual signal which may occur from time to time in one of several possible locations. Problems of interest in connection with theories of signal detection arise when the signals are faint enough so that the observer is unable to report them with complete accuracy on all occasions. One empirical relation that we would want to account for, in quantitative detail, is that between detection probabilities and the relative frequencies with which signals occur in different locations. Another is the improvement in detection rate that may
occur over a series of trials even when the observer receives no knowledge of results. A possible way of accounting for the "practice effect" is suggested by some rather obvious analogies between the detection experiment and the probability learning experiment considered earlier. We would expect that, when the subject actually detects a signal (in terms of stimulus sampling theory, samples the corresponding stimulus element), he will make the appropriate verbal report. Further, this detection of the signal may act as a reinforcing event, leading to conditioning of the verbal report to other cues in the situation which may have been available for sampling prior to the occurrence of the signal. If so, and if signals occur in some locations more often than in others, then on the basis of the theory developed in earlier sections we should predict that the subject will come to report the signal in the preferred location more frequently than in others on trials when he fails to detect a signal and is forced to respond to background cues. These notions will be made more explicit in connection with the following analysis of a visual recognition experiment reported by Kinchla (1962). Kinchla employed a forced-choice visual detection situation involving a series of over 900 discrete trials for each subject. Two areas were outlined on a uniformly illuminated milk glass screen. Each trial began with an auditory signal. During the auditory signal one of
the following events occurred:

(1) A fixed increment in radiant intensity occurred in area 1 (a T_1 type trial).

(2) A fixed increment in radiant intensity occurred in area 2 (a T_2 type trial).

(3) No change in the radiant character of either signal area occurred (a T_0 type trial).

Following the auditory signal, the subject was required to make either an A_1 or an A_2 response (i.e., select one of two keys placed below the signal areas) to indicate which area he believed had changed in brightness. Subjects were told that a change in illumination would occur in one of the two areas on each trial. The subject was given no information at the end of the trial as to whether or not his response was correct, and a short time later the next trial began. Thus, on a given trial one of three events (T_1, T_2, T_0) occurred, and the subject made either an A_1 or an A_2 response. For a fixed signal intensity the experimenter has the option of specifying a schedule for presenting the T_i events. Kinchla selected a simple probabilistic procedure in which Pr(T_i) = ξ_i, where ξ_1 + ξ_2 + ξ_0 = 1. Two groups of subjects were run. For Group I, ξ_1 = ξ_2 = .4 and ξ_0 = .2; for Group II, ξ_1 = ξ_0 = .2 and ξ_2 = .6. The purpose of Kinchla's study was to determine how these event schedules influenced the likelihood of correct detections.
The model that we will use to analyze the experiment combines two quite distinct processes: a simple perceptual process defined with regard to the signal events, and a learning process associated with background cues. The stimulus situation is conceptually represented in terms of two sensory elements, s_1 and s_2, corresponding to the two alternative signals, and a set S of elements associated with stimulus features common to all trials. On every trial the subject is assumed to sample a single element from the background set S, and he may or may not sample one of the sensory elements. The sampling of sensory elements depends on the trial type (T_1, T_2, T_0) and is described by a simple probabilistic model; the conditioning of elements in S changes from trial to trial via a learning process, which is assumed to be of the kind described by the multi-element pattern model presented in Sec. 3. Specifically, the assumptions of the model are embodied in the following statements:

1. If T_i (i = 1, 2) occurs, then sensory element s_i will be sampled with probability h; with probability 1 - h neither s_1 nor s_2 will be sampled. If T_0 occurs, then neither
s_1 nor s_2 will be sampled.

2. Exactly one element is sampled from S on every trial. Given the set S of N elements, the probability of sampling a particular element is 1/N.

3. If sensory element s_i (i = 1, 2) is sampled on trial n, then with probability c' the element sampled from S on that trial becomes conditioned to A_i at the end of trial n. If neither s_1 nor s_2 is sampled, then with probability c the element sampled from S becomes conditioned with equal likelihood to A_1 or to A_2 at the end of trial n.

4. If sensory element s_i is sampled, then response A_i will occur. If neither sensory element is sampled, then the response to which the sampled element from S is conditioned will occur.

If we let p_n denote the expected proportion of elements in S conditioned to A_1 at the start of trial n, then in terms of statements 1 and 4 above we can immediately write an expression for the likelihood of an A_i response given a T_j event, namely

     Pr(A_1,n | T_1,n) = h + (1 - h)p_n     (88a)
     Pr(A_2,n | T_2,n) = h + (1 - h)(1 - p_n)     (88b)
     Pr(A_1,n | T_0,n) = p_n     (88c)
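The sampling and conditioning assumptions lead to a simple recursion for the expected proportion p_n. The sketch below is an expected-value approximation, not part of the original derivation, and its parameter values are arbitrary; it iterates the mean change in p implied by one trial and illustrates that the stationary level depends on c and c' only through their ratio:

```python
def p_asymptote(xi1, xi2, h, c, c_prime, N=50, tol=1e-12):
    # Expected-value recursion: with probability xi_i * h the sampled
    # background element is conditioned toward A_i at rate c'; otherwise
    # (probability 1 - xi1*h - xi2*h) it is conditioned to A_1 or A_2
    # with equal likelihood at rate c. One of N elements is resampled
    # per trial, hence the 1/N factor.
    p = 0.5
    while True:
        bg = 1 - xi1 * h - xi2 * h
        delta = (xi1 * h * c_prime * (1 - p)
                 - xi2 * h * c_prime * p
                 + bg * c * (0.5 - p)) / N
        p += delta
        if abs(delta) < tol:
            return p

# Illustrative schedule: signals occur in area 2 three times as often.
p1 = p_asymptote(0.2, 0.6, 0.3, c=0.04, c_prime=0.112)
p2 = p_asymptote(0.2, 0.6, 0.3, c=0.01, c_prime=0.028)  # same ratio c'/c
```

Here p1 and p2 agree, since only the ratio c'/c matters at asymptote.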
The expression for p_n can be obtained from statements 2 and 3 by the same methods used throughout Sec. 3 of this chapter (for a derivation of this result see Atkinson, 1962a). Dividing the numerator and denominator of its asymptote by c yields

     p_inf = [ξ_1 h ψ + (1/2)(1 - ξ_1 h - ξ_2 h)] / [(ξ_1 + ξ_2) h ψ + (1 - ξ_1 h - ξ_2 h)],     (89)

where ψ = c'/c. Thus, the asymptotic expression for p_n does not depend on the absolute values of c' and c but only on their ratio. An inspection of Kinchla's data indicates that the curves Pr(A_i | T_j) are extremely stable over the last 400 or so trials of the experiment; consequently, we shall view this portion of the data as asymptotic. Table 7 presents the observed mean values of Pr(A_i | T_j) for the last 400 trials. The corresponding asymptotic expressions are specified in terms of Eqs. 88 and 89 and are simply

Insert Table 7 about here
Table 7

Predicted and Observed Asymptotic Response Probabilities for Visual Detection Experiment

                         Group I                  Group II
                  Observed   Predicted     Observed   Predicted
   Pr(A_1|T_1)      .645       .645          .558       .565
   Pr(A_2|T_2)      .645       .645          .720       .724
   Pr(A_1|T_0)      .494       .500          .388       .388
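The predicted entries of Table 7 follow from Eqs. 88 and 89 once h and ψ are fixed. A sketch reproducing the computation (h = .289 and ψ = 2.8 are the estimates discussed in the text):

```python
def p_inf(xi1, xi2, h, psi):
    # Eq. 89: asymptotic proportion of background elements conditioned to A1.
    bg = 1 - xi1 * h - xi2 * h
    return (xi1 * h * psi + 0.5 * bg) / ((xi1 + xi2) * h * psi + bg)

def predictions(xi1, xi2, h, psi):
    # Asymptotic forms of Eq. 88 evaluated at p_inf.
    p = p_inf(xi1, xi2, h, psi)
    return {"Pr(A1|T1)": h + (1 - h) * p,
            "Pr(A2|T2)": h + (1 - h) * (1 - p),
            "Pr(A1|T0)": p}

group1 = predictions(0.4, 0.4, h=0.289, psi=2.8)   # xi1 = xi2 = .4
group2 = predictions(0.2, 0.6, h=0.289, psi=2.8)   # xi2 = .6
```

For Group I the schedule is symmetric, so p_inf = 1/2 and the two detection probabilities coincide; for Group II the model predicts a bias toward A_2 on the no-signal (T_0) trials.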
     lim (n -> inf) Pr(A_1,n | T_1,n) = h + (1 - h)p_inf     (90a)
     lim (n -> inf) Pr(A_2,n | T_2,n) = h + (1 - h)(1 - p_inf)     (90b)
     lim (n -> inf) Pr(A_1,n | T_0,n) = p_inf     (90c)

In order to generate asymptotic predictions we need values for h and ψ. We first note by inspection of Eq. 89 that whenever ξ_1 = ξ_2 we have p_inf = 1/2. Hence, taking the observed asymptotic value of Pr(A_1|T_1) for Group I (i.e., .645) and setting it equal to h + (1 - h)p_inf with p_inf = 1/2 yields an estimate of h = .289. The background illumination and the increment in radiant intensity were the same for both experimental groups, and therefore we would require the estimate of h obtained from Group I to be applicable to Group II. In order to estimate ψ, we take the observed asymptotic value of Pr(A_1|T_0) for Group II and set it equal to the right-hand side of Eq. 89; solving for ψ, we obtain ψ = 2.8. Using these estimates of h and ψ, Eqs. 89 and 90 yield the asymptotic predictions given in Table 7. Over all, the equations give an excellent account of these particular response measures. However, a more crucial test of the model is provided by the sequential predictions that can be obtained. To indicate the nature of the analysis of the sequential data, consider the probability of an A_1 response on a T_1 trial, given the various trial types and responses that
00 = N(l. ) . 2 • Explicit expressions for these quantities can be derived from the axioms by the same methods used throughout this chapter.A. 2 and j = 0.h)5p .n where i =1. and Y = ~ + (l~)h.p )h)" '" '" NX + (N . to lim ~oo pr(A llT lAo T. sions for n To indicate their form. 7' = c' + (lc')h.n+.l)X + N (91e) (9lf) where)' 5' = c'h + (lc').Y) '" (N . l. and E.lIT l .h)5]p pr(Al[T1A1T l ) = + (1 .l)X N (91b) Pr(A [T A2 T2 ) l 1 '" + (1h)5'](1p) NY l)X "" + (N N (91c) (1. T.u+.n J. 5 = ~ h + (l~) . 218. ) l . X = h + (lh)p"" = h + (lh)(lp) • .n expressions for these quantities are as follows: [h + (1 .n+lA.X) h)'p + [h 2 + (N . N(l .l .n J.n+ l.l)X N (91a) pr(All Tl~Tl) (1h)5'(1p) . Pr(Al . theoretical expreswill be given and. 1.
The expressions in Eq. 91 are rather formidable, but numerical predictions can be easily calculated once values for the parameters have been obtained. It is interesting to note that the asymptotic expressions for lim Pr(A_i,n | T_j,n) depend only on h and ψ, whereas the quantities in Eq. 91 are functions of all four parameters N, c, c', and h. Further, certain relations among the sequential probabilities can be specified independently of the parameter values. As an example of such a relation, it can be shown that

     Pr(A_1 | T_1 A_1 T_0) > Pr(A_1 | T_1 A_2 T_0)

for any stimulus schedule and any set of parameter values; to see this, simply subtract Eq. 91f from Eq. 91e and note that the resulting difference in the conditional probabilities is positive. Comparable sets of equations can be written for Pr(A_2 | T_2 A_i T_j) and Pr(A_2 | T_0 A_i T_j).

Insert Table 8 about here

In Table 8 the observed values for these conditional probabilities are presented as reported by Kinchla. Estimates were computed for individual subjects using the data over the last 400 trials; the averages of these individual estimates are the quantities given in the table. Each entry is based on 24 subjects. In order to generate theoretical predictions for the observed entries in Table 8, values for N and either c or c' are needed in addition to the estimates of h and ψ. Of course, the estimate of ψ = c'/c already has been made for this set of data, and therefore it is only necessary to estimate N and either c or c'. We obtain our estimates of N and c by a least squares
Table 8

Predicted and Observed Asymptotic Sequential Response Probabilities in Visual Detection Experiment

[For Group I and Group II, the table gives observed and predicted values of Pr(A_1 | T_1 A_k T_l), Pr(A_2 | T_2 A_k T_l), and Pr(A_2 | T_0 A_k T_l) for each of the six combinations of preceding response A_k (k = 1, 2) and preceding trial type T_l (l = 1, 2, 0); the observed entries range from about .38 to .79.]
method; i.e., we select values of N and c (where c' = cψ) so that the sum of squared deviations between the 36 observed values in Table 8 and the corresponding theoretical quantities is minimized. Using this technique, the estimates of the parameters are as follows:

     h = .289,  c = .0357,  c' = .100.     (92)

The predictions corresponding to these parameter values are presented in Table 8. The theoretical quantities for Pr(A_1 | T_1 A_i T_j) are computed from Eq. 91; the theoretical expressions for Pr(A_2 | T_2 A_i T_j) and Pr(A_2 | T_0 A_i T_j) have not been presented here but are of the same general form as those given in Eq. 91. When one considers that only four of the possible 36 degrees of freedom represented in Table 8 have been utilized in estimating parameters, the close correspondence between theoretical and observed quantities may be interpreted as giving considerable support to the assumptions of the model. A great deal of research needs to be done to explore the consequences of this approach to signal detection. In terms of the experimental problem considered in this section, much progress can be made via differential tests among alternative formulations of the model. For example, we postulated a multi-element pattern model to describe the learning process associated with background stimuli; it would be important to determine whether other formulations of the learning process, such as those
developed in Sec. 5 or those proposed by Bush and Mosteller (1955), would provide as good or even better theoretical fits than the ones displayed in Tables 7 and 8. Also, it would be valuable to examine variations in the scheme for sampling sensory elements along lines developed by Luce (1959) and Restle (1961). Some theoretical work has been done by Atkinson (1961b) along the lines outlined in this section to account for the ROC (receiver-operating-characteristic) curves that are typically observed in detection studies and to specify the relation between forced-choice and yes-no experiments. However, this work is still quite tentative, and an evaluation of the approach will require extensive analyses of the detailed sequential properties of psychophysical data. More generally, further development of the theory is required before one could attempt to deal with the wide range of empirical phenomena encompassed in the approach to perception via decision theory proposed by Swets, Tanner, and Birdsall (1961) and others.

6.5 Multiple Process Models

Analyses of certain behavioral situations have proved to require formulations in terms of two or more distinguishable, though possibly interdependent, learning processes that proceed simultaneously. For some situations these separate processes may be directly observable; for other situations we may find it advantageous to postulate processes that are unobservable but which determine in some well-defined fashion the sequence of observable behaviors.
For example, in Restle's (1955) treatment of discrimination learning it is assumed that irrelevant stimuli may become "adapted" over a period of time and thus be rendered nonfunctional. Such an analysis entails a two-process system: one process has to do with the conditioning of stimuli to responses, whereas the other process prescribes both the conditions under which cues become irrelevant and the rate at which adaptation occurs. Another application of multiple process models arises with regard to discrimination problems in which either a covert or a directly observable orienting response is required. One process might describe how the stimuli presented to the subject become conditioned to discriminative responses. Another process might specify the acquisition and extinction of various orienting responses; these orienting responses would determine the specific subset of the environment that the subject would perceive on a given trial. For models dealing with this type of problem see Atkinson (1958), Bower (1959), and Wyckoff (1952). As another example, consider a two-process scheme developed by Atkinson (1960) to account for certain types of discrimination behavior. This model makes use of the distinction, developed in Secs. 3 and 4 of the present chapter, between component models and pattern models, and suggests that the subject may (at any instant in time) perceive the stimulus situation either as a unit pattern or as a collection of individual components. Thus, two perceptual states are defined, one in which the subject responds to the pattern of stimulation and one in
Each trial is specified in terms of the following classifications: Trial type. Each trial is either a T l or a T2 • The trial type is set by the experimenter and determines in part the stimulus event occurring on the trial. The example deals With a discrimination learning task investigated by Atkinson (1961a) tn which observing res:pcnses are categorized and di rectly measured. nevertheless. It is for this reason that we select a particularly simple example to illustrate the type of formulation that is possible. learning processes are also defined. 223Two which he responds to the separate components of the situation. Models of the sort described above are generally difficult to work with mathematically and consequently have had only limited development and analysis. the behavior of the subject is rigorously defined in terms of these hypothetical states. and the second process describes the conditions under which the subject shifts from one perceptual state to 'another. One process specifies how the patterns and components become conditioned to responses. The experimental situation consists of a sequence of discrete trials. and E. In this model neither the conditioning states nor the perceptual states are observable.A. ) The control of the second process is governed by the reinforcing schedule. and by similarity of the discriminanda. the subject's sequence of responses. .
Observing responses. On each trial the subject makes either an R1 or an R2.

Stimulus events, s1, sb, s2. On each trial, one and only one of these stimulus events (discriminative cues) occurs. On a T1 trial either s1 or sb can occur; on a T2 trial either s2 or sb can occur. The subscript b has been used to denote the stimulus event that may occur on both T1 and T2 trials; the subscripts 1 and 2 denote stimulus events unique to T1 and T2 trials, respectively.

Discriminative responses. To the onset of the stimulus event the subject responds with either an A1 or an A2.

Trial outcome. Each trial is terminated with the occurrence of either an O1 or an O2 event. An O1 indicates that A1 was the correct response for that trial, and an O2 indicates that A2 was correct.

The sequence of events on a trial is as follows: (1) the ready signal occurs and the subject responds with either the observing response R1 or R2; (2) following the observing response, one of the stimulus events is presented, the particular observing response determining in part the stimulus event for that trial; (3) the subject makes either an A1 or an A2 response to the presentation of the stimulus event; (4) the trial terminates with either an O1 or an O2 event.
To keep the analysis simple we consider an experimenter-controlled reinforcement schedule. A T1 type trial occurs with probability β and a T2 type trial with probability 1 − β. On a T1 trial an O1 occurs with probability π1 and an O2 with probability 1 − π1; on a T2 trial an O1 occurs with probability π2 and an O2 with probability 1 − π2. The particular stimulus event that the experimenter presents on any trial depends on the trial type (T1 or T2) and the subject's observing response (R1 or R2). Specifically: (i) if an R1 is made, then with probability α (a) the s1 event occurs on a T1 trial and (b) the s2 event occurs on a T2 trial, while with probability 1 − α the sb event occurs, regardless of the trial type; (ii) if an R2 is made, then with probability α the sb event occurs, regardless of the trial type, while with probability 1 − α the s1 event occurs on a T1 trial and the s2 event on a T2 trial.

To clarify this procedure, consider the case where α = 1, π1 = 1, and π2 = 0. If the subject is to be correct on every trial, he must make an A1 on a T1 type trial and an A2 on a T2 type trial. However, the subject can ascertain the trial type only by making the appropriate observing response: for α = 1 an R2 always leads to the presentation of sb, and the occurrence of sb gives no indication of the trial type; an R1 must be made in order to identify the trial type. Hence, for perfect responding the subject must make R1 with probability 1, regardless of the trial type, and then make A1 to s1 and A2 to s2. The purpose of the Atkinson study was to determine how variations in π1, π2, and α would affect both the observing responses and the discriminative responses.

Our analysis of this experimental procedure will be based on the axioms presented in Secs. 2 and 3. However, in order to apply the theory we must first identify the stimulus and reinforcing events in terms of the experimental operations. The identification we offer seems quite natural to us and is in accord with the formulations given in Secs. 2 and 3. We assume that associated with the ready signal is a set S_R of pattern elements; there are N′ such elements. Each element in S_R is conditioned to either R1 or R2, the observing responses. At the start of each trial (i.e., with the onset of the ready signal) an element is sampled from S_R, and the subject makes the observing response to which the element is conditioned. Similarly, associated with each stimulus event s_i (i = 1, 2, b) is a set S_i of pattern elements; elements in S_i are conditioned to either the A1 or the A2 discrimination response. There are N such elements in each set S_i.
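Since the schedule just described is entirely experimenter-controlled, it can be restated compactly as a pair of probability distributions. The sketch below is our own modern illustration (the original report, of course, contains no computer code, and the function and event names are invented), assuming the conventions of the text: trial types and observing responses are coded 1 and 2, and an O1 occurs with probability π1 on T1 trials and π2 on T2 trials.

```python
def stimulus_distribution(trial_type, observing_response, alpha):
    """Distribution of the stimulus event, given the trial type (1 or 2)
    and the observing response (1 or 2)."""
    informative = "s1" if trial_type == 1 else "s2"
    if observing_response == 1:
        # R1 yields the informative cue with probability alpha, sb otherwise.
        return {informative: alpha, "sb": 1.0 - alpha}
    # R2 yields the ambiguous cue sb with probability alpha.
    return {"sb": alpha, informative: 1.0 - alpha}


def outcome_distribution(trial_type, pi1, pi2):
    """Distribution of the trial outcome: an O1 occurs with probability
    pi1 on T1 trials and pi2 on T2 trials, independently of the response."""
    p = pi1 if trial_type == 1 else pi2
    return {"O1": p, "O2": 1.0 - p}
```

For the clarifying case α = 1, π1 = 1, π2 = 0, these distributions show at a glance that an R1 always produces the informative cue and that the outcome always agrees with the trial type, so that perfect responding is possible.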
For simplicity, we assume the sets S_1, S_b, and S_2 are pairwise disjoint. When the stimulus event s_i occurs, one element is randomly sampled from S_i, and the subject makes the discriminative response to which the element is conditioned. Thus we have two types of learning processes, one defined on the set S_R and the other defined on the sets S_1, S_b, and S_2. Once the stimulus and reinforcing events have been specified for these processes we can apply our axioms.

The interpretation of reinforcement for the discrimination process is identical to that given in Sec. 3: if an element is sampled from set S_i (i = 1, 2, b) and followed by an O_j outcome (j = 1, 2), then with probability c the element becomes conditioned to A_j, and with probability 1 − c the conditioning state of the sampled element remains unchanged.

The conditioning process for the S_R set is somewhat more complex in that the reinforcing events for the observing responses are assumed to be subject-controlled. Specifically, if an element sampled from S_R elicits an observing response that selects a stimulus event and the stimulus event, in turn, elicits a correct discrimination response (i.e., an A1O1 or A2O2 event occurs), then the element will remain conditioned to that observing response; however, if an A1O2 or A2O1 event occurs, then with probability c′ the element will become conditioned to the other observing response. Otherwise stated, if the observing response selects a stimulus event that gives rise to a correct discrimination response, the tendency to repeat that observing response is sustained; if the observing response selects a stimulus event that gives rise to an incorrect discrimination response (i.e., A1O2 or A2O1), then there will be a decrement in the tendency to repeat that observing response on the next trial.

To simplify the analysis we let N′ = N = 1; that is, we assume that there is one element in each of our stimulus sets, and consequently the single element is sampled with probability 1 whenever the set is available. With this restriction we may describe the conditioning state of a subject, at the start of each trial, by an ordered four-tuple <i j k l>: (1) the first member is 1 or 2 and indicates whether the single element of S_R is conditioned to R1 or R2; (2) the second member is 1 or 2 and indicates whether the single element of S_1 is conditioned to A1 or A2; (3) the third member is 1 or 2 and indicates whether the element of S_b is conditioned to A1 or A2; (4) the fourth member is 1 or 2 and indicates whether the element of S_2 is conditioned to A1 or A2. Thus, if the subject is in state <i j k l>, he will make the observing response R_i; to s1, sb, or s2 he will make discriminative response A_j, A_k, or A_l, respectively.
From our assumptions it follows that the sequence of random variables that take the subject states <i j k l> as values is a 16-state Markov chain. Figure 10 displays the possible transitions that can occur when the subject is in state <1122> on trial n.

Insert Fig. 10 about here

To clarify this tree, let us trace out the top branch. With probability β a T1 trial occurs; at the onset of the ready signal an R1 is elicited with probability 1 (since the element of S_R is conditioned to R1), and with probability α the s1 stimulus event is presented. The s1 event elicits an A1 (since the element of S_1 is conditioned to A1), and with probability π1 the O1 outcome occurs; the A1O1 combination is a correct response, and hence no change occurs in the conditioning state of any of the stimulus patterns. Now consider the next set of branches: with probability 1 − α the sb stimulus is presented and an A2 response occurs (since the element of S_b is conditioned to A2); with probability π1 the O1 event then follows, so that the response is incorrect, and hence with probability c the element of set S_b becomes conditioned to A1, while with independent probability c′ the element of set S_R becomes conditioned to the alternative observing response R2. From this tree we obtain the probabilities corresponding to the <1122> row of the transition matrix, namely, by summing over all branches that terminate in the same state. For example, the probability of going from <1122> to <2112> is simply βπ1(1 − α)cc′ + (1 − β)π2(1 − α)cc′. An inspection of the transition matrix yields some important results. For example, if α = 1, π1 = 1, and π2 = 0, then states <1112> and <1122> are absorbing.
Fig. 10. Branching process, starting in state <1122>, for a single trial in the two-process discrimination learning model.
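The branching process of Fig. 10 can be mechanized. The following sketch is our own modern illustration (not part of the original report): for each of the 16 conditioning states it enumerates the trial-type, stimulus-event, outcome, and conditioning branches of a single trial and accumulates the one-trial transition matrix, using c for the discrimination elements and c′ for the observing element, as in the text.

```python
from itertools import product

# Which component of the state <i j k l> governs the response to each cue.
CUE_POS = {"s1": 1, "sb": 2, "s2": 3}


def transition_matrix(alpha, pi1, pi2, beta, c, cprime):
    """One-trial transition probabilities over the 16 states <i j k l>."""
    states = list(product((1, 2), repeat=4))
    index = {s: m for m, s in enumerate(states)}
    P = [[0.0] * 16 for _ in range(16)]
    for s in states:
        i = s[0]                                              # observing response R_i
        for t, p_t in ((1, beta), (2, 1.0 - beta)):           # trial type
            informative = "s1" if t == 1 else "s2"
            if i == 1:   # R1: informative cue w.p. alpha, sb otherwise
                cues = ((informative, alpha), ("sb", 1.0 - alpha))
            else:        # R2: sb w.p. alpha, informative cue otherwise
                cues = (("sb", alpha), (informative, 1.0 - alpha))
            p_o1 = pi1 if t == 1 else pi2
            for cue, p_cue in cues:
                resp = s[CUE_POS[cue]]                        # elicited A-response
                for o, p_o in ((1, p_o1), (2, 1.0 - p_o1)):   # outcome O1 or O2
                    # Sampled discrimination element becomes conditioned to A_o w.p. c.
                    disc = ((o, c), (resp, 1.0 - c))
                    if resp == o:   # correct response: observing element unchanged
                        obs = ((i, 1.0),)
                    else:           # error: observing element flips w.p. c'
                        obs = ((3 - i, cprime), (i, 1.0 - cprime))
                    for d, p_d in disc:
                        for i_new, p_i in obs:
                            ns = list(s)
                            ns[0] = i_new
                            ns[CUE_POS[cue]] = d
                            P[index[s]][index[tuple(ns)]] += p_t * p_cue * p_o * p_d * p_i
    return P, index
```

Each row of the resulting matrix sums to one, the entry for <1122> → <2112> reproduces the probability βπ1(1 − α)cc′ + (1 − β)π2(1 − α)cc′ obtained from the tree, and setting α = 1, π1 = 1, π2 = 0 makes <1112> and <1122> absorbing, in agreement with the text.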
As before, let u^(n)_ijkl denote the probability of being in state <i j k l> on trial n, and, when the limit exists, let u_ijkl = lim(n→∞) u^(n)_ijkl. Experimentally, we shall be interested in evaluating the following theoretical predictions:

Pr(A1,n | T1,n) = [u^(n)_1111 + u^(n)_1112 + u^(n)_2111 + u^(n)_2112] + α[u^(n)_1121 + u^(n)_1122 + u^(n)_2211 + u^(n)_2212] + (1 − α)[u^(n)_1211 + u^(n)_1212 + u^(n)_2121 + u^(n)_2122]   (93a)

Pr(A1,n | T2,n) = [u^(n)_1111 + u^(n)_1211 + u^(n)_2111 + u^(n)_2211] + α[u^(n)_1121 + u^(n)_1221 + u^(n)_2112 + u^(n)_2212] + (1 − α)[u^(n)_1112 + u^(n)_1212 + u^(n)_2121 + u^(n)_2221]   (93b)

Pr(R1,n) = u^(n)_1111 + u^(n)_1112 + u^(n)_1121 + u^(n)_1122 + u^(n)_1211 + u^(n)_1212 + u^(n)_1221 + u^(n)_1222   (93c)

Pr(R1,n ∩ A1,n) = u^(n)_1111 + (1 − ½α)[u^(n)_1112 + u^(n)_1211] + α u^(n)_1121 + ½α[u^(n)_1122 + u^(n)_1221] + (1 − α) u^(n)_1212   (93d)

Pr(R2,n ∩ A1,n) = u^(n)_2111 + ½(1 + α)[u^(n)_2112 + u^(n)_2211] + (1 − α) u^(n)_2121 + ½(1 − α)[u^(n)_2122 + u^(n)_2221] + α u^(n)_2212   (93e)

(In writing Eqs. 93d and 93e we have set β = ½, the value used in the experiment considered below.)
The first equation gives the probability of an A1 response on T1 trials, and the second gives the corresponding probability on T2 trials; the third gives the probability of an R1 observing response. Finally, the last two equations present the probability of the joint occurrence of each observing response with an A1 response.

In the experiment reported by Atkinson (1961a) six groups were run with 40 subjects in each group. For all groups π1 = .9 and β = .5. The groups differed with respect to the values of π2 and α: for Groups I and IV, π2 = .9; for Groups II and V, π2 = .5; and for Groups III and VI, π2 = .1; further, for Groups I-III, α = 1.0, and for Groups IV-VI, α = .75. The design can be described by the following array:

             π2 = .9    π2 = .5    π2 = .1
  α = 1.0       I          II         III
  α = .75       IV         V          VI

Given these values of π1, π2, α, and β, our 16-state Markov chain is irreducible and aperiodic; hence the limit u_ijkl = lim(n→∞) u^(n)_ijkl exists and can be obtained by solving the appropriate set of 16 linear equations (see Eq. 16). Values for the u_ijkl were computed for each group and then combined by Eq. 93 to predict the response probabilities. The values predicted by the model are given in Table 9.

Insert Table 9 about here

By presenting a single value for each theoretical quantity in the table we imply that these predictions are independent of c and c′. Actually this is not always the case. However, for the schedules employed in this experiment the dependency of these asymptotic predictions on c and c′ is virtually negligible: for c = c′ ranging over the interval from .0001 to 1.0, the predicted values given in Table 9 are affected only in the third or fourth decimal place. It is for this reason that we present theoretical values to only two decimal places. In view of these comments it should be clear that the predictions in Table 9 are based solely on the experimental parameter values; consequently, differences between subjects (which might be represented by intersubject variability in c and c′) do not substantially affect these predictions.
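The combination of state probabilities by Eq. 93 can be checked mechanically. The sketch below is our own illustration (not part of the original report): each state <i j k l> is assigned the probability of an A1 response it implies on each trial type, and the five predicted quantities are then weighted sums over a state distribution u, with β = ½ as in the experiment. The identity Pr(R1 ∩ A1) + Pr(R2 ∩ A1) = βPr(A1|T1) + (1 − β)Pr(A1|T2), which holds for any distribution u, provides a convenient check on the algebra.

```python
from itertools import product


def prob_A1_given_trial(state, trial, alpha):
    """Probability of an A1 response in the given state on a T1 or T2 trial."""
    i, j, k, l = state
    informative = j if trial == 1 else l   # element governing s1 or s2
    if i == 1:   # R1: informative cue w.p. alpha, sb otherwise
        return alpha * (informative == 1) + (1 - alpha) * (k == 1)
    # R2: sb w.p. alpha, informative cue otherwise
    return alpha * (k == 1) + (1 - alpha) * (informative == 1)


def predictions(u, alpha, beta=0.5):
    """Combine state probabilities u[<i j k l>] into the quantities of Eq. 93."""
    pA1_T1 = sum(p * prob_A1_given_trial(s, 1, alpha) for s, p in u.items())
    pA1_T2 = sum(p * prob_A1_given_trial(s, 2, alpha) for s, p in u.items())
    pR1 = sum(p for s, p in u.items() if s[0] == 1)

    def joint(obs):
        # Joint probability of observing response R_obs and an A1 response.
        return sum(p * (beta * prob_A1_given_trial(s, 1, alpha)
                        + (1 - beta) * prob_A1_given_trial(s, 2, alpha))
                   for s, p in u.items() if s[0] == obs)

    return pA1_T1, pA1_T2, pR1, joint(1), joint(2)
```

With a uniform distribution over the 16 states each conditional response probability reduces to ½ and each joint probability to ¼, which is easily verified by hand from Eqs. 93a-93e.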
Table 9

Predicted and Observed Asymptotic Response Probabilities in Observing Response Experiment

[For each of Groups I-VI the table gives, for the quantities Pr(A1|T1), Pr(A1|T2), Pr(R1), Pr(R1 ∩ A1), and Pr(R2 ∩ A1), the predicted value together with the observed mean and standard deviation of the 40 individual-subject proportions.]
In the Atkinson study 400 trials were run, and the response proportions appear to have reached a fairly stable level over the last half of the experiment; consequently, the proportions computed over the final block of 160 trials were used as estimates of asymptotic quantities. Table 9 presents the mean and standard deviation of the 40 observed proportions obtained under each experimental condition. As can be seen, the agreement between theoretical and observed quantities is fairly good.

Despite the fact that these gross asymptotic predictions hold up quite well, it is obvious that some of the predictions from the model will not be confirmed. The difficulty with the one-element assumption is that the fundamental theory laid down by the axioms of Sec. 3 is completely deterministic in many respects when N = N′ = 1. For example, if an R1 occurs on trial n and is reinforced (i.e., followed by an A1O1 or A2O2 event), then R1 will recur with probability 1 on trial n + 1; namely,

Pr(R1,n+1 | O1,n ∩ A1,n ∩ R1,n) = 1 .

This prediction, of course, is a consequence of the assumption that we have but one element in set S_R, which necessarily is sampled on every trial. If we assume more than one element, the deterministic features of the model no longer hold, and such sequential statistics become functions of c, c′, N, and N′. Unfortunately, for elaborate experimental procedures of the sort described in this section, the multi-element case leads to complicated mathematical processes for which it is extremely difficult to carry out computations. Thus, the generality of the multi-element assumption may often be offset by the difficulty involved in making predictions. Naturally it is usually preferable to choose from the available models the one that best fits the data, but in the present state of psychological knowledge no single model is clearly superior to all others in every facet of analysis. The one-element assumption, despite some of its erroneous features, may prove to be a valuable instrument for the rapid exploration of a wide variety of complex phenomena. For most of the cases we have examined, the predicted mean response probabilities are usually independent of (or only slightly dependent on) the number of elements assumed. Thus the one-element assumption may be viewed as a simple device for computing the grosser predictions of the general theory. For exploratory work in complex situations, then, we recommend using the one-element model because of the greater difficulty of computations for the multi-element models.

In advocating this approach we are taking a methodological position with which some scientists do not agree. Our position is in contrast to one which asserts that a model should be discarded once it is clear that certain of its predictions are in error. We do not take it to be the principal goal (or even, in many cases, an important goal) of theory construction to provide models for particular experimental situations. The assumptions of stimulus sampling theory are intended to describe processes or relationships that are common to a wide variety of learning situations, but with no implication that behavior in these situations is a function solely of the variables represented in the theory. As we have attempted to illustrate by means of numerous examples, formulation of a model within this framework for a particular experiment is a matter of selecting the relevant assumptions, or axioms, of the general theory and interpreting these in terms of the conditions of the experiment. How much of the variance in a set of data can be accounted for by a model depends jointly on the adequacy of the theoretical assumptions and on the extent to which it has been possible to realize experimentally the boundary conditions envisaged in the theory, thereby minimizing the effects of variables not represented. In our view, a model, in application to a given experiment, is not to be classified as "correct" or "incorrect"; rather, the degree to which it accounts for the data may provide evidence tending either to support or to cast doubt on the theory from which the particular model was derived.
REFERENCES

Atkinson, R. C. Mathematical models in research on perception and learning. In M. Marx (Ed.), Psychological theory (2nd Edition). New York: Macmillan, 1962a, in press.

Atkinson, R. C. Choice behavior and monetary payoffs. In J. Criswell, H. Solomon, and P. Suppes (Eds.), Mathematical methods in small group processes. Stanford, California: Stanford University Press, 1962b, in press.

Atkinson, R. C. A variable sensitivity theory of signal detection. Technical Report No. 47, Psychology Series, Institute for Mathematical Studies in the Social Sciences, Stanford University, 1962c (to be published in Psychological Review).

Atkinson, R. C. The observing response in discrimination learning. J. exp. Psychol., 1961a, 62, 253-262.

Atkinson, R. C. A theory of stimulus discrimination learning. In K. J. Arrow, S. Karlin, and P. Suppes (Eds.), Mathematical methods in the social sciences. Stanford, California: Stanford University Press, 1960, 221-241.

Atkinson, R. C. A Markov model for discrimination learning. Psychometrika, 1958, 23, 309-322.

Atkinson, R. C. A stochastic model for rote serial learning. Psychometrika, 1957, 22, 87-96.

Atkinson, R. C., and Suppes, P. An analysis of two-person game situations in terms of statistical learning theory. J. exp. Psychol., 1958, 55, 369-378.

Billingsley, P. Statistical inference for Markov processes. Chicago: University of Chicago Press, 1961.

Bower, G. H. A model for response and training variables in paired-associate learning. Psychol. Rev., 1962, 69, 34-53.

Bower, G. H. Application of a model to paired-associate learning. Psychometrika, 1961, 26, 255-280.

Bower, G. H. Choice-point behavior. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford, California: Stanford University Press, 1959, 109-124.

Burke, C. J. Some two-person interactions. In K. J. Arrow, S. Karlin, and P. Suppes (Eds.), Mathematical methods in the social sciences. Stanford, California: Stanford University Press, 1960, 242-253.

Burke, C. J. Applications of a linear model to two-person interactions. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford, California: Stanford University Press, 1959, 180-203.

Burke, C. J., and Estes, W. K. A component model for stimulus variables in discrimination learning. Psychometrika, 1957, 22, 133-145.

Bush, R. R. A survey of mathematical learning theory. In R. D. Luce (Ed.), Developments in mathematical psychology. Glencoe, Illinois: The Free Press, 1960.

Bush, R. R., and Estes, W. K. (Eds.). Studies in mathematical learning theory. Stanford, California: Stanford University Press, 1959.

Bush, R. R., and Mosteller, F. Stochastic models for learning. New York: Wiley, 1955.

Bush, R. R., and Mosteller, F. A mathematical model for simple learning. Psychol. Rev., 1951a, 58, 313-323.

Bush, R. R., and Mosteller, F. A model for stimulus generalization and discrimination. Psychol. Rev., 1951b, 58, 413-423.

Bush, R. R., and Sternberg, S. A single-operator model. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford, California: Stanford University Press, 1959, 204-214.

Carterette, T. S. An application of stimulus sampling theory to summated generalization. J. exp. Psychol., 1961, 62, 448-455.

Crothers, E. J. All-or-none paired-associate learning with unit and compound responses. Unpublished doctoral dissertation, Indiana University, 1961.

Detambel, M. H. A test of a model for multiple-choice behavior. J. exp. Psychol., 1955, 49, 97-104.

Estes, W. K. Learning theory. Annual Review of Psychology, 1962, 13, 107-144.

Estes, W. K. New developments in statistical behavior theory: differential tests of axioms for associative learning. Psychometrika, 1961a, 26, 73-84.

Estes, W. K. Growth and function of mathematical models for learning. In Current trends in psychological theory. Pittsburgh: University of Pittsburgh Press, 1961b, 134-151.

Estes, W. K. Learning theory and the new "mental chemistry." Psychol. Rev., 1960a, 67, 207-223.

Estes, W. K. A random-walk model for choice behavior. In K. J. Arrow, S. Karlin, and P. Suppes (Eds.), Mathematical methods in the social sciences. Stanford, California: Stanford University Press, 1960b, 265-276.

Estes, W. K. Component and pattern models with Markovian interpretations. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford, California: Stanford University Press, 1959a, 9-52.

Estes, W. K. The statistical approach to learning theory. In S. Koch (Ed.), Psychology: a study of a science, Vol. 2. New York: McGraw-Hill, 1959b, 380-491.

Estes, W. K. Stimulus-response theory of drive. In M. R. Jones (Ed.), Nebraska symposium on motivation. Lincoln, Nebraska: University of Nebraska Press, 1958.

Estes, W. K. Theory of learning with constant, variable, or contingent probabilities of reinforcement. Psychometrika, 1957a, 22, 113-132.

Estes, W. K. Of models and men. Amer. Psychologist, 1957b, 12, 609-617.

Estes, W. K. Statistical theory of spontaneous recovery and regression. Psychol. Rev., 1955a, 62, 145-154.

Estes, W. K. Statistical theory of distributional phenomena in learning. Psychol. Rev., 1955b, 62, 369-377.

Estes, W. K. Toward a statistical theory of learning. Psychol. Rev., 1950, 57, 94-107.

Estes, W. K., and Burke, C. J. A theory of stimulus variability in learning. Psychol. Rev., 1953, 60, 276-286.

Estes, W. K., Burke, C. J., Atkinson, R. C., and Frankmann, J. P. Probabilistic discrimination learning. J. exp. Psychol., 1957, 54, 233-239.

Estes, W. K., and Hopkins, B. L. Acquisition and transfer in pattern-vs-component discrimination learning. J. exp. Psychol., 1961, 61, 322-328.

Estes, W. K., Hopkins, B. L., and Crothers, E. J. All-or-none and conservation effects in the learning and retention of paired associates. J. exp. Psychol., 1960, 60, 329-339.

Estes, W. K., and Straughan, J. H. Analysis of a verbal conditioning situation in terms of statistical learning theory. J. exp. Psychol., 1954, 47, 225-234.

Estes, W. K., and Suppes, P. Foundations of linear models. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford, California: Stanford University Press, 1959a, 137-179.

Estes, W. K., and Suppes, P. Foundations of statistical learning theory, II. The stimulus sampling model for simple learning. Technical Report No. 26, Psychology Series, Institute for Mathematical Studies in the Social Sciences, Stanford University, 1959b.

Estes, W. K., and Suppes, P. Foundations of statistical learning theory, III. Analysis of the noncontingent case. Technical Report, Psychology Series, Institute for Mathematical Studies in the Social Sciences, Stanford University, in preparation.

Feller, W. An introduction to probability theory and its applications (2nd Edition). New York: Wiley, 1957.

Friedman, M. P., Burke, C. J., Cole, M., Estes, W. K., and Millward, R. B. Extended training in a two-choice situation with noncontingent shifting reinforcement probabilities. Paper given at the First Meetings of the Psychonomic Society, 1960.

Gardner, R. A. Probability-learning with two and three choices. Amer. J. Psychol., 1957, 70, 174-185.

Goldberg, S. Introduction to difference equations. New York: Wiley, 1958.

Guttman, N., and Kalish, H. I. Discriminability and stimulus generalization. J. exp. Psychol., 1956, 51, 79-88.

Hull, C. L. Principles of behavior: an introduction to behavior theory. New York: Appleton-Century-Crofts, 1943.

Jarvik, M. E. Probability learning and a negative recency effect in the serial anticipation of alternating symbols. J. exp. Psychol., 1951, 41, 291-297.

Jordan, C. Calculus of finite differences. New York: Chelsea, 1950.

Kemeny, J. G., and Snell, J. L. Finite Markov chains. Princeton, New Jersey: Van Nostrand, 1960.

Kemeny, J. G., and Snell, J. L. Markov processes in learning theory. Psychometrika, 1957, 22, 221-230.

Kemeny, J. G., Snell, J. L., and Thompson, G. L. Introduction to finite mathematics. New York: Prentice-Hall, 1957.

Kinchla, R. A. Learned factors in visual discrimination. Unpublished doctoral dissertation, University of California, Los Angeles, 1962.

Lamperti, J., and Suppes, P. Chains of infinite order and their application to learning theory. Pacific J. Math., 1959, 9, 739-754.

Luce, R. D. Individual choice behavior. New York: Wiley, 1959.

Luce, R. D., and Raiffa, H. Games and decisions. New York: Wiley, 1957.

Nicks, D. C. Prediction of sequential two-choice decisions from event runs. J. exp. Psychol., 1959, 57, 105-114.

Peterson, L. R., Saltzman, D., Hillner, K., and Land, V. Recency and frequency in paired-associate learning. J. exp. Psychol., 1962, 63, 396-403.

Popper, J. Mediated generalization. In R. R. Bush and W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford, California: Stanford University Press, 1959, 94-108.

Popper, J., and Atkinson, R. C. Discrimination learning in a verbal conditioning situation. J. exp. Psychol., 1958, 56, 21-26.

Restle, F. A theory of discrimination learning. Psychol. Rev., 1955, 62, 11-19.

Restle, F. Psychology of judgment and choice. New York: Wiley, 1961.

Solomon, R. L., and Wynne, L. C. Traumatic avoidance learning: acquisition in normal dogs. Psychol. Monogr., 1953, 67, No. 4 (Whole No. 354).

Spence, K. W. The nature of discrimination learning in animals. Psychol. Rev., 1936, 43, 427-449.

Stevens, S. S. On the psychophysical law. Psychol. Rev., 1957, 64, 153-181.

Suppes, P., and Atkinson, R. C. Markov learning models for multiperson interactions. Stanford, California: Stanford University Press, 1960.

Suppes, P., and Ginsberg, R. A fundamental property of all-or-none models. Psychol. Rev., 1962a, in press.

Suppes, P., and Ginsberg, R. Application of a stimulus sampling model to children's concept formation of binary numbers with and without an overt correction response. J. exp. Psychol., 1962b, 63, 330-336.

Swets, J. A., Tanner, W. P., Jr., and Birdsall, T. G. Decision processes in perception. Psychol. Rev., 1961, 68, 301-340.

Tanner, W. P., Jr., and Swets, J. A. A decision-making theory of visual detection. Psychol. Rev., 1954, 61, 401-409.

Theios, J. A three-state Markov model for learning. Technical Report No. 40, Psychology Series, Institute for Mathematical Studies in the Social Sciences, Stanford University, 1961.

Wyckoff, L. B., Jr. The role of observing responses in discrimination learning. Psychol. Rev., 1952, 59, 431-442.