
Herbert A. Simon
William G. Chase

Skill in Chess

Experiments with chess-playing tasks and computer simulation of skilled performance throw light on some human perceptual and memory processes

Herbert A. Simon took his bachelor's and doctor's degrees at the University of Chicago, the latter in 1943. He has served on the faculties of the University of California, Berkeley, and Illinois Institute of Technology, and, since 1949, on the faculty of Carnegie-Mellon University, where he is Richard King Mellon Professor of Computer Science and Psychology. Beginning with an interest in decision-making in organizations, Professor Simon has been led during the past fifteen years into research on human performance in complex tasks, using the computer to simulate cognitive processes. With Allen Newell, he is coauthor of a recent book, Human Problem Solving. William G. Chase is an Associate Professor of Psychology at Carnegie-Mellon University, where he has served since receiving his Ph.D. from the University of Wisconsin in 1969. His research has concentrated on the elementary information processes underlying cognition. He has edited a recent book, Visual Information Processing. This research was supported by Public Health Service Research Grant MH-07722, from the National Institute of Mental Health. Address of both authors: Department of Psychology, Carnegie-Mellon University, Pittsburgh, PA 15213.

As genetics needs its model organisms, its Drosophila and Neurospora, so psychology needs standard task environments around which knowledge and understanding can cumulate. Chess has proved to be an excellent model environment for this purpose. About a decade ago in the pages of this journal, one of us, with Allen Newell, described the progress that had been made up to that time in using information-processing models and the techniques of computer simulation to explain human problem-solving processes (1). A part of our article was devoted to a theory of the processes that expert chess players use in discovering checkmating combinations (2), a theory that was subsequently developed further, embodied in a running computer program, MATER, and subjected to additional empirical testing (3).

The MATER theory is an application to the chess environment of a more general theory of problem solving that employs heuristic search as its core element (4). The MATER theory postulates that problem solving in the chess environment, as in other well-structured task environments, involves a highly selective heuristic search through a vast maze of possibilities. Normally, when a chess player is trying to select his next move, he is faced with an exponential explosion of alternatives. For example, suppose he considers only ten moves for the current position; each of these moves in turn breeds ten new moves, and so on. Searching to a depth of six plies (three moves by White and three by Black) will already have generated a search space with a million paths. Hence, if every legal move is considered (as would be the case in an exhaustive search), an enormous search space would be generated. Such a search is beyond the capacity of the human player, as well as present-day computers. Humans seldom search more than a hundred paths in choosing a move or finding a checkmate, and they seldom consider more than two or three possible moves per position.

The MATER theory postulates that humans don't consider moves at random. Rather, they use information from a position and apply some general rules (heuristics) to select a small subset of the legal moves for further consideration. For example, one powerful heuristic that MATER uses in finding checkmates is to examine first those moves that permit the opponent the fewest replies. A comparison of the MATER program with thinking-aloud protocols from human chess players confirms the importance of heuristic search as a basic underlying process.

While the MATER theory was successful in accounting for much of what was known about chess thinking in mating situations, some important empirical phenomena (some of them known when the theory was formulated, some of them discovered subsequently) eluded the theory's grasp. In this paper, after describing the phenomena, we should like to tell the story of a ten-year effort to account for the recalcitrant facts.

An important by-product of this effort has been to bring about a convergence of the theory of problem solving with theories that have been developed to explain quite different phenomena, which psychologists label "perception," "rote learning," and "memory." In the past, both theorizing and experimentation relating to these different kinds of tasks (problem solving, perceiving, learning by rote, and remembering) have tended to go their separate ways. In the course of our story we will see how these theories come together to explain chess skill; we will see the important constraint that a limited-capacity short-term memory imposes on problem solving in chess and how this limit can be bypassed by specific perceptual knowledge acquired through long experience, stored in long-term memory, and accessed by perceptual discrimination processes.
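
The branching arithmetic given earlier in this introduction (ten moves per position, searched to six plies) and the fewest-replies heuristic attributed to MATER can be illustrated with a minimal Python sketch. The function names, the `keep` parameter, and the reply counts in the example are hypothetical and chosen only for illustration; this is not the MATER program, whose code the article does not give.

```python
from typing import Callable, Iterable, List, TypeVar

Move = TypeVar("Move")


def count_paths(branching: int, plies: int) -> int:
    """Number of distinct move sequences in a uniform search tree."""
    return branching ** plies


def order_by_fewest_replies(
    moves: Iterable[Move],
    count_replies: Callable[[Move], int],
    keep: int = 3,
) -> List[Move]:
    """Keep the `keep` candidate moves that leave the opponent the fewest replies.

    `count_replies` is an assumed helper supplied by the caller; a real
    program would compute it from the rules of chess.
    """
    ranked = sorted(moves, key=count_replies)
    return ranked[:keep]


if __name__ == "__main__":
    # Ten moves per position, six plies: the article's "million paths".
    print(count_paths(10, 6))  # 1000000

    # Hypothetical candidate moves with made-up opponent reply counts.
    replies = {"Qh7+": 1, "Rxf7": 4, "Nf6+": 2, "a3": 30}
    print(order_by_fewest_replies(replies, lambda m: replies[m]))
    # ['Qh7+', 'Nf6+', 'Rxf7']
```

Sorting by reply count and truncating is only one way to express "examine first those moves that permit the opponent the fewest replies"; the point of the sketch is the contrast between exhaustive growth and a small, ordered candidate set.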
394 American Scientist, Volume 61
The phenomena

In Amsterdam, Adriaan de Groot, who was the first psychologist to carry out extensive experiments on problem solving using chess as the task, also initially formulated his theory in terms of heuristic search (5). His subjects ranged from quite ordinary players to some of the strongest chess grandmasters in the world, including several former world champions. He was puzzled by one thing: none of the statistics he computed to characterize his subjects' search processes (number of moves examined, depth of search, speed of search) distinguished the grandmasters from the ordinary players. He could only separate them by the fact that the grandmasters usually chose the strongest move in the position, while ordinary players often chose weaker moves. Why were the grandmasters able to do this? Wherein lay their chess skill?

The perceptual basis of chess mastery. One clue to this riddle came when de Groot repeated and extended an experiment that had been performed earlier in the USSR (6). He displayed a chess position to his subjects for a very brief period of time (2 to 10 seconds) and then asked them to reconstruct the position from memory. These positions were from actual master games, but games unknown to his subjects. The results were dramatic. Grandmasters and masters were able to reproduce, with almost perfect accuracy (about 93% correct), positions containing about 25 pieces. There was a quite sharp drop-off in performance somewhere near the boundary between players classified as masters, who did nearly as well as grandmasters, and players classified as experts, who did significantly worse (about 72%). Good amateurs (Class A players in the American rating scheme) could replace only about half the pieces in the same positions, and novice players (from our own experiments) could recall only about eight pieces (about 33%). There is a quite nice gradation on this perceptual task as a function of chess skill, and we have verified this in our own experiments (7).

We went one step further: we took the same pieces that were used in the previous experiment, but now constructed random positions with them. Under the same conditions, all players, from master to novice, recalled only about three or four pieces on the average, performing significantly more poorly here than the novice did on the real positions. (The same result was obtained by W. Lemmens and R. W. Jongman in the Amsterdam laboratory, but their data have never been published, 8.)

In sum, these experiments show that chess skill cannot be detected from the gross characteristics of the search processes of chess players but can be detected easily using a perceptual task with meaningful chess content. The experiment with random boards shows that the masters' superior performance in the meaningful task cannot be explained in terms of any general superiority in visual imagery. The perceptual skill is chess-specific. Moreover, a theory of problem solving in chess that does not include perceptual processes cannot be an adequate theory: it cannot explain the superior ability of the strong player to choose the right moves.

Eye movements at the chess board. The second set of phenomena we must consider are also perceptual, but of a more recent discovery. Explanations in terms of heuristic search postulate that problem solving, and cognition generally, is a serial, one-thing-at-a-time process. (We are oversimplifying matters to make the issue clear, but the oversimplification will suffice for the present.) Many psychologists have found this postulate implausible and have sought for evidence that the human organism engages in extensive parallel processing (9). The intuitive feeling that much information can be "acquired at a glance" argues for a parallel processor. Of course, the correctness of the intuition depends both on the amount of information that can actually be acquired and upon what is meant by a "glance." If a glance means a single eye fixation (lasting anywhere from a fifth of a second to a half-second or longer), then we know that there are high-speed serial processes (e.g. short-term memory search, visual scanning) that operate within this time range (10). Thus, it is certainly interesting and relevant to find out how the human eye extracts information from a complex visual display like a chess position and to see whether this extraction process is compatible with the assumptions of the heuristic search theories.

A pair of Russian psychologists, Tichomirov and Poznyanskaya, placed an expert before a chess position with instructions to find the best move, and they observed his eye movements during the first 5 seconds of the task (11). The eye movements were inconsistent with the hypothesis that the subject, during these 5 seconds, was searching through a tree of possible moves and their replies.

To describe further what Tichomirov and Poznyanskaya found, we must say a word about how the eye operates. The eye has a central region of high resolution, the fovea (about 1° in radius), surrounded by a periphery of decreasingly lower resolution. Most information about visual patterns is acquired while the fovea is fixated on them; and the eye moves abruptly, in so-called saccadic movements, from one point of fixation to the next. There are at most about four or five saccadic movements per second.

In Tichomirov and Poznyanskaya's record of the first 5 seconds of their subject's eye movements, there were about 20 fixations. Most of these centered on squares of the board occupied by pieces that any chess player would consider to be of importance to the position. There were few fixations at the edges or corners of the board or on empty squares. Moreover, a large number of the saccades moved from one piece to another, where the former piece stood in a "chess" relation (that is, an attack or defense relation) to the latter. For example, the eye would move frequently from a pawn to a Knight that attacked it, or to a Knight that defended it, or from a Queen to a pawn it attacked.

It is important to note that the saccadic movements were not random and, therefore, that some information must have been acquired peripherally about the target square before the saccade began. From other evidence, we know that a strong chess

1973 July-August 395


player can recognize a piece within a radius of 5° to 7° from his point of fixation; for eye-movement studies show that he can frequently replace such a piece correctly on a board when he has had no closer point of fixation to it (12).

Figure 1. In this middle game position, used by Tichomirov and Poznyanskaya in their eye movement experiments, Black is to play.

The Russian experiments are of interest for two reasons. First, while the saccadic eye movements themselves are serial, some parallel visual capacity appears to be operating, for, since the saccade is not random, information about the target square must be acquired peripherally. From what we know about search and scanning rates, it can be concluded that the processes of scanning the periphery for the next target square and preparing the next saccade must overlap in time with the processes of searching memory for the identity and function of a piece (or square) presently occupying the fovea. Visual scanning experiments show that an eye fixation does not allow enough time both to recognize a pattern in the fovea and to scan the visual periphery for a likely target for the next fixation unless the two processes overlap in time (13,14).

Even more important, the Russian experiments confirm the existence of an initial "perceptual phase," earlier hypothesized by de Groot, during which the players first learn the structural patterns of the pieces before they begin to look for a good move in the "search phase" of the problem-solving process. The experiments of Tichomirov and Poznyanskaya have been repeated and confirmed both in Amsterdam and in our own laboratory. How shall we extend the heuristic search theory of problem solving to accommodate them?

Explaining the eye movements

Among the ground rules that ought to be followed in building theories, one of the most important is the rule of parsimony. If, in order to explain each new phenomenon, we must invent a new mechanism, then we have lost the game. Theories, gradually modified and improved over time, are convincing only if the range of phenomena they explain grows more rapidly than the set of mechanisms they postulate.

In the present instance, there are two ways in which we may seek to preserve parsimony as we extend the theory. First, we may examine our existing theory to see whether the mechanisms already incorporated in it might be adequate if they were reorganized. Second, if we need additional mechanisms to explain some of the phenomena, then, instead of inventing them ad hoc, we may draw upon mechanisms already postulated or known in other parts of psychology, mechanisms whose existence already has empirical support. We will explore both of these routes for improving the theory while preserving parsimony.

Perceptual processes in MATER. Let us return to the MATER theory and see how much we must add to, or subtract from, it in order to account for the eye movement data. MATER, as noted earlier, is a program for discovering mating combinations by selective search. What is the basis for the selectivity? A fundamental idea imbedded in MATER is that forceful moves should be explored first, where a forceful move is one that accomplishes some significant chess function, like attacking or capturing a piece or restricting the movements of the opponent. Discovering the opportunities for forceful moves in any chess position involves perceiving the attack, defense, and threat relations that hold among pairs and clusters of pieces on the chessboard; it is basically a perceptual process.

Hence, if we examine MATER a level or two below the executive routine that organizes its search, we see that the program is composed chiefly of a collection of processes for noticing significant chess relations among pieces or squares. In the program as originally organized, these processes were enlisted in the service of the heuristic search for a mating combination. Are these noticing processes a sufficient
base on which to build a theory of the eye movements?

The PERCEIVER program. It proved surprisingly easy to simulate the eye movements. It was not difficult to replace MATER's executive program with a new program that used the same perceptual processes to guide the scanning of the board, and when this was done, a good correspondence was found between the squares fixated during the first 20 saccades by the human player and the squares fixated by the program (15).

The program, dubbed PERCEIVER, operates in a very simple manner. With the simulated fovea fixated on a square of the board, information is acquired peripherally about pieces standing on nearby squares that attack or defend the fixated square, or that are attacked or defended by the piece on that square. Attention is then assumed to switch to one of these nearby squares, and, unless it immediately returns to the square already fixated, causes a saccadic movement to the new square. With the fovea fixated on the new square, the process simply repeats. A moment's reflection will convince the reader that a process having this structure will cause a biased random walk of the fixation point around the board, returning most frequently to those regions where relations among pieces are densest and spending little time on the edges of the board.

Figure 1 is one of the positions used by Tichomirov and Poznyanskaya in their eye-movement experiments; Figure 2 is a record of the first 20 fixations of their expert in this position; and Figure 3 shows the first 15 fixations produced by PERCEIVER in the same position. Of interest is the fact that the PERCEIVER simulation, by means of its simple mechanism of attending to attack and defense relations, shows the same preoccupation with the important pieces as does the human expert.

Figure 2. Eye movements of an expert player are recorded for the first 5 seconds, by Tichomirov and Poznyanskaya. The 10 squares occupied by the most active pieces (see Fig. 1) are shaded.

There are three points we need to make about this simulation. First, no new mechanisms were invoked; it was sufficient to reorganize the lower-level perceptual mechanisms of MATER. The difference between the behavior of MATER and the behavior of PERCEIVER lies largely in a difference in goal or motivation at different stages in the problem solving process. The empirical data from human subjects indicate that initially the player sets himself (not necessarily consciously or deliberately, but perhaps habitually) the task of acquiring information about the chess-significant relations on the board (PERCEIVER). Having acquired this information, he turns to generating moves and exploring their consequences (MATER). There would be no great difficulty in revising MATER to conform to this pattern, with the perceptual, information-gathering phase preceding the cognitive, heuristic search phase. As a matter of fact, one earlier computer chess program, written by Newell, Shaw, and Simon in 1958, had much of this flavor (16), and another such program is now being constructed by Berliner (17).

Second, there is nothing a priori parallel about PERCEIVER; the simple rules that drive the simulated eye around the chessboard are, in fact, serially organized, and it is a simple matter to simulate them in real time on standard computers. Even if realistic time parameters, estimated from human performance, were assigned to the various processes of PERCEIVER, it is still not clear that anything resembling a parallel process would be necessary. This problem is related to the third point.

Third, there is one level of perceptual processing that is finessed and one level that is entirely missing in PERCEIVER. The part that is finessed is the mechanism that recognizes the chess pieces in the first place. What is more important, while PERCEIVER notices attacks and defenses, it has no processes for organizing and remembering this information once it is attended to. But, as we shall see, the organizing process itself drives the eye movements. It is quite plausible that these missing processes operate partly in parallel with the scanning processes of PERCEIVER.
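
The fixation loop described for PERCEIVER (fixate a square, notice related squares peripherally, shift to one of them, repeat) can be sketched in Python as a random walk over a relation graph. This is an illustrative simplification and not the PERCEIVER program: the relation pairs are invented, the attack and defense relations are supplied as data instead of being computed from a position, the next square is chosen uniformly at random, and the names `make_relations` and `simulate_fixations` are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

# Assumed representation: square -> squares whose pieces attack, defend,
# or are attacked/defended by the piece on that square.
Relations = Dict[str, Set[str]]


def make_relations(pairs: Iterable[Tuple[str, str]]) -> Relations:
    """Build a symmetric relation graph from (square, square) pairs."""
    relations: Relations = {}
    for a, b in pairs:
        relations.setdefault(a, set()).add(b)
        relations.setdefault(b, set()).add(a)
    return relations


def simulate_fixations(relations: Relations, start: str,
                       n_fixations: int, seed: int = 0) -> List[str]:
    """Walk the simulated fovea from square to related square."""
    rng = random.Random(seed)
    fixations = [start]
    current = start
    while len(fixations) < n_fixations:
        # Squares noticed peripherally from the current fixation.
        noticed = sorted(relations.get(current, ()))
        if not noticed:
            break  # nothing related nearby; this sketch simply stops
        current = rng.choice(noticed)  # attention shifts, then a saccade
        fixations.append(current)
    return fixations


if __name__ == "__main__":
    # Invented relations for illustration; a piece on a1 has none.
    relations = make_relations([
        ("g7", "f6"), ("g7", "h6"), ("f6", "e4"),
        ("e4", "d5"), ("d5", "f6"), ("e4", "c3"),
    ])
    path = simulate_fixations(relations, "e4", 20)
    print(path)
    print(Counter(path).most_common())
```

Because every step follows a relation, squares that take part in no relation (here a1, or any empty edge square) are never fixated, and squares with many relations tend to be revisited, which is the bias the text describes.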

Figure 3. The solid line represents eye movements and the broken lines represent relations noticed peripherally in this record of simulated eye movements during the period of initial orientation from the PERCEIVER program. The 10 squares occupied by the most active pieces (see Fig. 1) are shaded.

The board reconstruction experiment

Nothing in the perceptual mechanisms we have described so far will allow us to account for the spectacular skill of chess masters in reconstructing positions that they have seen for only a few seconds. Both MATER and PERCEIVER gloss over details of the process for recognizing a chess piece (noticing that it is a Bishop, say, rather than a pawn). Each piece is represented by a little bundle of features: its color, for example, and its type (King, Queen, etc.). The programs do not undertake to explain or simulate the feature extraction process, but simply assume that it is performed and that previous learning has stored in long-term memory the requisite information about the capabilities of the different kinds of pieces. More important, neither program contains any mechanisms for the recognition of meaningful, familiar patterns of pieces; neither program has a mechanism for the extensive storage in long-term memory of familiar patterns, nor indeed do they have a long-term memory of any complexity. But it is precisely this kind of pattern-recognition process that lies at the heart of the master's reconstructive ability.

Elementary perceiver and memorizer. Still retaining our respect for parsimony, we note that there already exists in psychology an information processing theory to explain how feature-bundles can become familiarized, associated with other information in long-term memory, and used as components in larger organizations of structures. This theory, called EPAM (Elementary Perceiver and Memorizer), was initially developed by Feigenbaum to explain some of the principal empirical findings about the rote learning of nonsense syllables in the standard serial anticipation and paired-associate paradigms (18).

Among the striking phenomena that had been observed in rote learning are: (1) a characteristic shape of the serial position curve (in serial anticipation learning), (2) a three-to-one (approximately) time advantage in learning meaningful over meaningless and familiar over unfamiliar syllables, (3) certain characteristic differences in learning times between similar and dissimilar stimulus and response items, and (4) certain conditions that determine whether rote learning will have an incremental or an all-at-once appearance. EPAM has been successful in accounting for all of these phenomena (19).

The program of EPAM, and hence the theory it embodies, is quite simple. EPAM learns by growing a discrimination net, a tree-like structure whose nodes contain tests that may be applied to objects that have been described as bundles of perceptual features. When a familiar object is perceived, it is recognized by being sorted through the EPAM net. At the terminal branches of the EPAM net are stored partial "images" (also in the form of feature bundles) of the objects sorted to the respective terminals, together with other information about the objects.

The EPAM theory also plays an important role in explaining the eye movements. Recall that in the previous section, PERCEIVER was found inadequate because it contained no mechanism for recognizing pieces and patterns of pieces. A more complete theory of eye movements would require that PERCEIVER have access to EPAM.

The processes of EPAM influence the eye movements via the way the discrimination net is searched. Figure 4 illustrates a small section of the net with two terminal nodes. Observe that the nodes contain questions about the contents of specific squares; depending upon what is found at a square, a decision is made concerning which square to query next. In short, the EPAM net is organized as a set of instructions, albeit abstract, for scanning the board for familiar patterns. These instructions must then be interpreted by the perceptual system (PERCEIVER) in order to extract the information, and eye movements may well be necessary to execute the instructions. For small clusters of pieces, some of these successive recognition steps may be executed in a single foveal fixation, without saccadic movement. Thus, eye movements may be of two kinds: (1) initial familiarization, in which simple chess functions (attack, defense) are noticed, and (2) recognition, in which complex patterns are scanned.

Figure 4. A portion of the EPAM net for chess shows the terminal nodes for two patterns: (1) three pawns on second rank, and (2) fianchettoed Bishop. At each node is shown the test executed there. For KR2?, for example, read: "What piece stands on the King's Rook Two Square?" The patterns at the terminal nodes are for illustrative purposes only: all the information needed to recognize the pattern is imbedded in the logic of the discrimination net. The terminal node has the internal name of the node, an abstract symbolic reference (internal address) that can be stored in short-term memory as a single chunk.

This explanation of the eye movements gains additional support from the work of Noton and Stark, who developed independently a similar theory (20). They proposed that people's memory of a picture will determine how that picture is subsequently scanned for recognition, and they presented evidence that, under the appropriate conditions, eye movements followed stereotypic "scanpaths" before the picture was recognized. EPAM makes this same strong assumption: that patterns are recognized by scanning the configuration for specific features in a particular order.

EPAM has a recursive structure. This means that any object, once familiarized and incorporated in the net, can itself serve as a perceptual feature of a more complex object. Thus, once the various types of chess pieces (Kings, pawns, Bishops) have become familiarized, these can become features of more complex configurations, say, a "fianchettoed castled Black King's position" (see Fig. 1 for this pattern in the upper-right part of the board). Once familiarized (and this particular pattern is known to every strong player), such a complex can, in turn, serve as a perceptual feature of a still more complex pattern, e.g. an entire chess position.

We have now illustrated the recursive structure of EPAM with a chess example, but the EPAM program was not constructed with this application in mind. In the context of rote verbal learning, the lowest-level features in EPAM are the geometrical and topological properties of English letters. With familiarization, the EPAM net expands to encompass the letters themselves, which then can be used as components (test nodes) of nonsense syllables. Familiarization of the syllables, in turn, makes these available as components of syllable pairs or lists, and so on. Thus, EPAM postulates a single learning process, identical with what we have been calling familiarization, and a single kind of output of that process, a new unit or chunk.

The EPAM theory implies that the length of time required for a learning task will be proportional to the number of new chunks that have to be familiarized in order to perform the task. This implication also fits the empirical evidence very well, the basic learning time being about 5 seconds per chunk (21).

Chunks and short-term memory. Finally, an additional mechanism, short-term memory, is needed in order to understand the reproduction experiment: a mechanism for holding all that information for the short period of time before it is recalled. George Miller, in order to account for the observed invariances in memory-span experiments, first postulated such a memory system with a constant capacity of about seven chunks (22). Miller showed that the well-known limit on the amount of information that can be held in short-term memory is not to be measured in bits, but in chunks: the capacity is about "seven, plus or minus two" familiar units of any kind. By acquiring new familiar units (e.g. octal digits) and learning to recode information in terms of those units (e.g. recoding from binary to octal), holding a constant number of chunks in short-term memory allows one to hold an increased number of bits (in the example, a gain of three to one). The chunk of EPAM theory has these same characteristics.

Since Miller's influential article was published, there has been a tremendous amount of research on short-term memory, and virtually every present-day theory about cognitive processes incorporates such a memory system. Much research on thinking and problem solving has shown that, outside of strategies, the only other human characteristic that consistently limits performance in a wide variety of tasks is the small capacity of short-term memory. And without a short-term memory, EPAM theory by itself does not account for the verbal learning phenomena mentioned earlier. Short-term memory, then, is one of the basic cognitive capacities. For our purposes, we assume that what gets stored in short-term memory are the internal names of chunks (e.g. "fianchettoed castled Black King's position"), which serve as memory addresses or retrieval cues for information about the chunks in long-term memory.

Let us return now to the chessboard construction phenomena. From Miller's chunking hypothesis, EPAM theory, and the limited capacity of short-term memory, we would predict that a chessboard can be reconstructed from information held in short-term memory if, and only if, it can be encoded in not more than about seven familiar perceptual chunks. If a single piece on a particular square constitutes a chunk for a subject, then he should be able to recall only about seven pieces. If he can recall the positions of more than twenty pieces, then it must be that each chunk consists, on average, of a configuration of about three pieces.
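
The pieces of the account above (a discrimination net whose test nodes query squares, terminal nodes that carry an internal chunk name, and a short-term memory that holds only about seven such names) can be put together in a small Python sketch. Everything concrete here is an assumption made for illustration: the `Node` layout, the algebraic square names and piece codes, the two toy patterns (loosely modeled on the two named in the Figure 4 caption), and the fixed capacity of exactly seven. It is not Feigenbaum's EPAM program.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Board = Dict[str, str]  # square -> piece code, e.g. {"g2": "wB"}
STM_CAPACITY = 7        # assumed fixed at Miller's central value


@dataclass
class Node:
    """Test node (asks what stands on `square`) or terminal node (`chunk`)."""
    square: Optional[str] = None
    branches: Dict[str, "Node"] = field(default_factory=dict)
    chunk: Optional[str] = None  # internal name stored at a terminal node


def recognize(net: Node, board: Board) -> Optional[str]:
    """Sort a board through the net; return a chunk name or None."""
    node = net
    while node.chunk is None:
        found = board.get(node.square, "empty")
        if found not in node.branches:
            return None  # unfamiliar configuration
        node = node.branches[found]
    return node.chunk


def reconstruct(stm: List[str], ltm: Dict[str, Board]) -> Board:
    """Rebuild piece placements from chunk names held in short-term memory."""
    board: Board = {}
    for name in stm[:STM_CAPACITY]:  # names beyond capacity are lost
        board.update(ltm[name])
    return board


# Toy long-term memory: chunk name -> the pieces it stands for.
LTM: Dict[str, Board] = {
    "three-pawns-second-rank": {"f2": "wP", "g2": "wP", "h2": "wP"},
    "fianchettoed-bishop": {"h2": "wP", "g2": "wB", "g3": "wP"},
}

# Toy net: query h2, then g2, then either f2 or g3.
NET = Node(square="h2", branches={
    "wP": Node(square="g2", branches={
        "wP": Node(square="f2", branches={
            "wP": Node(chunk="three-pawns-second-rank")}),
        "wB": Node(square="g3", branches={
            "wP": Node(chunk="fianchettoed-bishop")}),
    }),
})

if __name__ == "__main__":
    seen = {"h2": "wP", "g2": "wB", "g3": "wP"}
    name = recognize(NET, seen)
    print(name)                      # fianchettoed-bishop
    print(reconstruct([name], LTM))  # three pieces from one chunk name
    print(STM_CAPACITY * 1, STM_CAPACITY * 3)  # 7 21
```

The last line restates the article's arithmetic: seven one-piece chunks give about seven pieces, while seven chunks of about three pieces each give a little over twenty.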

We now have a proposed explanation for the remarkable ability of chess masters to reconstruct positions—an explanation that meets our requirements of parsimony. We have employed only mechanisms that are well rooted in other parts of psychological theory: (1) a limited-capacity short-term memory that can hold the names of only about seven chunks, (2) a vast repertoire of familiar patterns stored as chunks in long-term memory, and a recognition mechanism—the EPAM net—for getting at them, and (3) the related chunking process that builds these patterns and their retrieval mechanisms in the first place.

The next task is to find more direct ways to test the theory. Several routes are open: we can seek direct empirical evidence for the existence of these chunks and see if the memory span for chunks is of the order of seven; we can attempt to simulate the reproduction task using the mechanisms of the theory within a computer program; and we can calculate whether the hypothesis leads to reasonable estimates of the number of familiar chunks a chess master must have stored in long-term memory. We consider these in turn.

Empirical identification of chunks. The logic we used in isolating the chunks was to see if, during the reconstruction of a position, chunk boundaries could be identified by long pauses. Time measurements have been used for identifying chunks in other experimental tasks. McLean and Gregg, for example, had subjects memorize permutations of the alphabet (23). They then timed the intervals (latencies) between successive letters in the subjects' recitals of the lists. They obtained convincing evidence that the permuted alphabet was stored in memory, not as a single uniform list, but as a hierarchy of segments; the individual letter segments most frequently were three or four letters in length. Within-chunk latencies were much shorter than between-chunk latencies.

Adapting this technique to our task, we videotaped subjects reconstructing chess positions and measured the latencies in placing successive pieces. In order to estimate what interval would correspond to a chunk boundary, we performed a second experiment, in which the subject also reconstructed a chess position but with the original position in view. The two boards were so placed that the subject had to turn his head to look from the one to the other. We found that, when the subject placed two or more pieces on the board without turning his head, each latency was almost always under 2 seconds. We assumed that, under these speeded conditions, subjects load a single chunk into short-term memory when they view the board and then look directly over and recall that chunk. (It would be inefficient, under these conditions, to store more than one chunk, because they would then have to store the chunk names—there isn't enough room in short-term memory to store the structural information comprising more than one chunk—and then at recall use each chunk name in succession to retrieve the chunk from long-term memory—a time-consuming procedure.) We therefore assumed that, in the reconstruction task, a pause longer than 2 seconds indicated the retrieval of a chunk from long-term memory via the chunk name in short-term memory.

To check the plausibility of this 2-second criterion, we counted the number of chess relations that held between pairs of successively placed pieces. The relations counted were attacks, defenses, proximity, identity of type (e.g. both Rooks or pawns), and color. There was a strong negative correlation between numbers of relations and latency (see Fig. 5).

[Figure 5. Mean latencies between successively placed pieces in the reconstruction task are plotted as a function of the number of chess relations between the pieces. Horizontal axis: number of relations (0 to 4).]

Next, we compared the pattern of frequencies of the between-chunk relations (greater than 2 seconds) with the pattern of the within-chunk relations (less than 2 seconds) and both of these with the pattern that would have been observed had the pieces been replaced in random order. We made this comparison for both forms of the reconstruction experiment—from memory and in sight of the board (see Table 1). For the two forms of the experiment, the within-chunk relational patterns were highly correlated (Pearson correlation coefficient of .89), but these patterns were only slightly correlated with the corresponding between-chunk patterns (coefficients of .12, .18, .10, and .23) and not at all correlated with the random pattern (-.04 and -.03). On the other hand, the two between-chunk patterns were strongly correlated with each other (.91) and with the random pattern (.87 and .81). Thus, there is strong evidence that the 2-second criterion in fact marks chunk boundaries.

What was the nature of the chunks thus delineated? Most of them were local clusters of pieces in arrangements that recur with high frequency in actual chess positions. (The fianchettoed castled King's position mentioned earlier actually occurs in about ten percent of all recent games between grandmasters.) In the case of a subject who is a chess master, we were able to classify 75% of his chunks as highly stereotyped. Of the 77 chunks observed in his performance of the memory experiment, 47 were pawn chains, sometimes with a nearby supporting or blockading piece. Ten chunks were castled King's positions. Twenty-seven chunks were other clusters of pieces of the same color, and 19 of these were of common types: 9 consisted of pieces on their original squares in the back rank, and 9 of connected Rooks or connected Queen and Rook. These are configurations a chess master has seen thousands of times—as often as we have seen many of the familiar words in our reading vocabularies. There is as much reason to suppose in the one case as in the other that they are stored in his long-term memory and that he will usually recognize them when he sees them.
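The 2-second criterion for chunk boundaries described above can be illustrated with a minimal Python sketch. It is not the authors' analysis procedure: the function name `segment_into_chunks`, the piece labels, and the latency values are all hypothetical and made up for illustration. The only element taken from the text is the rule that a pause longer than 2 seconds between successive placements is read as the start of a new chunk.

```python
from typing import List, Sequence, Tuple

# (piece label, latency in seconds since the previous placement); labels are hypothetical
Placement = Tuple[str, float]


def segment_into_chunks(placements: Sequence[Placement],
                        boundary_s: float = 2.0) -> List[List[str]]:
    """Group successively placed pieces into chunks using a pause criterion."""
    chunks: List[List[str]] = []
    for piece, latency in placements:
        # The first piece always opens a chunk; afterwards, a pause longer
        # than the criterion is treated as retrieval of a new chunk.
        if not chunks or latency > boundary_s:
            chunks.append([piece])
        else:
            chunks[-1].append(piece)
    return chunks


# Invented latencies, for illustration only (not data from the experiments).
example = [("Kg1", 0.0), ("Pf2", 0.8), ("Pg3", 0.6), ("Bg2", 1.1),
           ("Ra1", 3.4), ("Pa2", 0.9), ("Qd1", 2.7)]
chunks = segment_into_chunks(example)
print(chunks)
```

With these invented latencies the seven placements fall into three chunks, of sizes four, two, and one. Counting chunks and averaging their sizes in this way is the kind of measurement the discussion of chunk number and chunk size relies on.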
400 American Scientist, Volume 61
Table 1. Intercorrelation matrix for the Sight-of-Board Constructions (1 and 3), Memory Constructions (2 and 4), and Hypothetical Random Constructions (5).

                            2      3      4      5
1. Within-chunk            .89    .12    .18   -.04
2. Less than 2 sec                .10    .23   -.03
3. Between-chunk                         .91    .81
4. Greater than 2 sec                           .87
5. Random

Thus far the empirical data support our theory, but we must mention one piece of evidence that is equivocal. If we accept the 2-second criterion for chunk boundaries, then we can measure directly the number of chunks our subjects are holding in short-term memory when they attempt to reconstruct the board. Our theory predicts that the number of chunks will be the same for strong and weak players, but that the average chunk size will vary by a factor of two or three with chess skill.

This prediction is not borne out fully. When we compare, for example, the data from the memory experiment for a chess master with the data for a Class A player, we find that the master recalled about twice as many pieces as the Class A player, but the former's chunks averaged only about 50% larger than the latter's, while the average number of chunks he recalled also averaged about 50% more. The average sizes of the first chunks recalled by master and Class A player were 3.8 and 2.6, respectively; the average numbers of chunks per position were 7.7 and 5.7, respectively. Now the latter numbers are of the right order of magnitude—not far from the memory span of seven—but the difference between them is not predicted by the theory. At the moment, we have no good explanation for the discrepancy, but have simply placed it as an item high on the research agenda. Our hunch is that a less simplistic model of the structure of chunks and their interrelations, or of the organization of chunks in short-term memory, will be needed to attain a better second approximation.

The MAPP simulation. A second approach to testing the theory of the chessboard reconstruction task was to build a computer program, MAPP, to simulate the observed phenomena (24). The general outlines of the program follow immediately from our description of the theory. The program contains a learning component to acquire and store in memory a large set of configurations of chess pieces and a performance component to carry out the board reconstruction task (Fig. 6).

Consider first the performance component. When a chess position is presented, the program must scan the board in some way in order to notice the pieces and their relations. The scanning program is a simplified version of PERCEIVER, hence can be viewed as a simulation of the eye movements and control of attention. When a piece is fixated (salient piece), an EPAM-like discrimination process seeks to recognize the cluster of pieces surrounding the fixated piece as a familiar chunk. If it is successful, the symbol designating this chunk is stored in short-term memory. This process is repeated at successive points of fixation until no more pieces become salient or short-term memory capacity is reached, whichever occurs first. Finally, in the reconstruction phase, the terminal information in the EPAM net is used to decode the symbols held in short-term memory into locational information for each of the pieces in a chunk and thus to reconstruct the position.

The learning component of MAPP is a simplified version of the portion of EPAM that grows or elaborates the discrimination net and stores information at its terminal nodes. The input to the program consists of many different configurations of pieces (of two to seven pieces each) that occur frequently as components of chess positions. If such a pattern has been familiarized previously, the program will simply recognize it; if it has not, it will discriminate it from patterns previously learned, will add tests to the EPAM net to implement the discrimination, will create a new terminal node to designate the new pattern, and will store information about the pattern at that node.

[Figure 6. A schematic representation of the principal components of MAPP shows the learning and performance processes used to reconstruct a chess position. Diagram labels: pattern of chess pieces; EPAM-like pattern learner; EPAM net; chess position; salient piece detector; salient piece; chunks in short-term memory; reconstructed chess position.]
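The performance component described above can be sketched in a few lines of Python. This is a loose illustration under stated assumptions, not the MAPP program itself: a plain dictionary of named patterns stands in for the EPAM discrimination net, the order of salient pieces is supplied by the caller rather than computed by a PERCEIVER-like scanner, and the names `reconstruct`, `known_patterns`, `salient_order`, and `stm_capacity` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Piece = Tuple[str, str]        # (piece symbol, square), e.g. ("K", "g1")
Pattern = FrozenSet[Piece]     # a familiar configuration of pieces


def reconstruct(position: Set[Piece],
                known_patterns: Dict[str, Pattern],
                salient_order: List[Piece],
                stm_capacity: int = 7) -> Set[Piece]:
    """Recognize chunks around salient pieces, then decode them into pieces."""
    stm: List[str] = []  # chunk names held in short-term memory
    for piece in salient_order:
        if len(stm) >= stm_capacity:
            break  # short-term memory is full
        for name, pattern in known_patterns.items():
            # A pattern is recognized if it contains the fixated piece and
            # all of its pieces are present on the board.
            if name not in stm and piece in pattern and pattern <= position:
                stm.append(name)
                break
    # Reconstruction phase: decode each stored chunk name into piece locations.
    recalled: Set[Piece] = set()
    for name in stm:
        recalled |= known_patterns[name]
    return recalled


position = {("K", "g1"), ("P", "f2"), ("P", "g3"), ("B", "g2"),
            ("R", "a1"), ("N", "c3")}
known = {"fianchetto": frozenset({("K", "g1"), ("P", "f2"),
                                  ("P", "g3"), ("B", "g2")})}
recalled = reconstruct(position, known, [("K", "g1"), ("R", "a1")])
print(sorted(recalled))
```

In this toy run only the four pieces of the one familiar pattern are recalled; the Rook and Knight are missed because no stored pattern covers them. Enlarging `known_patterns` is the sketch's counterpart of the learning component adding patterns to the net.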



Thus the MAPP program is a hybrid of a simplified PERCEIVER with a simplified EPAM; the finer details of those prior programs are not essential to demonstrating the phenomena. With a net of about 1,000 patterns, the performance of MAPP on the reconstruction task is about equal to that of a Class A player, twice as good as a beginner's, but only half as good as a master's. In a typical set of positions, MAPP recalled 51% of the pieces placed correctly by the master, but only 30% of the pieces missed by the master, indicating that its chunks were not dissimilar from the master's. Finally, the within-chunk chess relations of pieces recalled successively by MAPP were highly similar to those of the human subjects, while the between-chunk relations were close to the random pattern.

The chess master's vocabulary. We can extrapolate from the present performance of the MAPP program to estimate how large a vocabulary of chess patterns would have to be stored in the EPAM net to match the performance of the chess master. The distribution of different patterns by frequency is highly skewed, like the frequency distribution of words in natural language. Assuming that the patterns in the present MAPP net are those most frequently encountered in chess games, and assuming the same degree of skewness for chess patterns as for words, we can estimate that something of the order of 50,000 patterns would have to be stored to match the master's performance. Is this a plausible estimate from other viewpoints? We can check its plausibility in two ways.

First, there are no instant experts in chess—certainly no instant masters or grandmasters. There appears not to be on record any case (including Bobby Fischer) where a person has reached grandmaster level with less than about a decade's intense preoccupation with the game. We would estimate, very roughly, that a master has spent perhaps 10,000 to 50,000 hours staring at chess positions, and a Class A player 1,000 to 5,000 hours. For the master, these times are comparable to the times that highly literate people have spent in reading by the time they reach adulthood. Such people have reading vocabularies of 50,000 words or more. If a chunk is a chunk is a chunk as to learning time (as EPAM theory proposes), then we would expect the chess master to have a comparable chess vocabulary. Our estimate agrees well with that reached previously.

Finally, we may ask: given the variety of possible chess positions from well-played games, how big a vocabulary of patterns must we have so that each position could be represented by a distinct set of seven, or so, patterns? If $N$ is the number of possible positions, while $P$ is the number of patterns, then the requirement is $P^7 > N$. If $P \approx 50{,}000$, then $P^7$ is approximately $8 \times 10^{32}$. The latter number, in turn, is close to $6^{40}$. Now if we played chess games to a depth of 20 moves for each player and at each choice an average of 6 reasonable moves were available, approximately $6^{40}$ different games could be played. Since there are probably not, on the average, six reasonable moves at each choice point, 50,000 patterns should be more than enough to accommodate the positions that could be reached in such games. It should be emphasized that this estimate is very crude, since it does not take into account that some patterns are much more frequent than others. Nevertheless, it is reassuring that it gives results that are not inconsistent with those arrived at by other routes. Until we can get better data—possibly by expanding the EPAM net—it seems reasonable to assume that a chess master can recognize at least 50,000 different configurations of pieces on sight, and a grandmaster even more.

Familiarity breeds competence

If the MAPP theory provides an explanation—at least a first approximation—of the chess master's superior skill in quickly perceiving chess positions and then reconstructing them from memory, it leaves unexplained the link between this superiority and his chess-playing prowess. How does the theory solve the riddle with which we began—that the statistics of the master's search appear indistinguishable from the statistics of the weaker player's search?

Two facts that have not been much studied in the laboratory, but which are well known in chess circles, need to be mentioned. First, the master and grandmaster not only select good moves but they often—much oftener than weak players—notice these moves in the first few seconds after they look at a new position. Having noticed such a move, the master may continue to analyze the position for some minutes before he is satisfied that it is the best move—and sometimes his analysis will show that his first impulse was wrong. Nevertheless, his ability to notice moves "at a glance" is always astonishing to lesser players.

Second, although the average time per move in serious tournament chess is 3 to 4 minutes (which means that some moves are made rapidly, while others are brooded over for as much as half an hour), a master or grandmaster can beat players of inferior skill while taking only a few seconds per move and playing simultaneously against many players. His play in these games is not of the same quality as in his more deliberate tournament games, but it is strong enough to beat most experts and almost all players of lower class.

The most likely explanation of these facts is that the chess master is not only acquainted with tens of thousands of familiar patterns of pieces, but that with many of these patterns are associated plausible moves that take advantage of the features represented by the pattern (25). Many of the basic heuristics that guide the search for good moves are based on the presence of a pattern on the board. For example, every chess player of even moderate skill is familiar with the advice: "If there's an open file, put a Rook on it." He knows that the advice is not meant quite literally, that what is really meant is "consider putting a Rook on it." The pattern of an open file will trigger the heuristic and initiate a move in the heuristic search. Some patterns (perhaps many hundreds) may actually be associated with an algorithmic solution—traps and combinations that lead to the guaranteed
win of a piece, a checkmate, or whatnot—in which a series of moves may be played almost by rote.

Thus, we suggest that the key to understanding chess skill—and the solution to our riddle—lies in understanding these perceptual processes. The patterns that masters perceive will suggest good moves to them. The structure of the search process through possible moves will not be very different from that of weaker players; only the paths suggested by the patterns will be different.

Such a view of chess skill is quite amenable to theorizing in terms of production systems. By a production is meant a routine consisting of two parts: a condition part and an action part. The condition part tests the presence or absence of a specific (perceptual) feature (e.g. an open file); the action part, which is executed whenever the condition is satisfied (whenever the feature is recognized as being present), generates a chess move for consideration that is relevant to that specific feature (e.g. putting a Rook on the open file). A separate analysis routine can then carry out the tree search required for a final evaluation of proposed moves. The advantage of modeling human behavior with production systems is that such systems are very simple and rulelike, avoiding many of the inflexibilities of algorithmic programming languages. They can mimic learning by simply adding new productions (26), and they have the perceptual flavor we need to simulate the pattern-recognition processes in chess.

While the evidence is not yet in, it becomes increasingly plausible that the cognitive processes underlying skilled chess performance have some such organization as this. Such a scheme would account for the association of chess-playing skill with the ability to recognize numerous perceptual patterns on the board.

There is another question which we haven't addressed directly, but whose answer is implicit in what we have been saying. The question is: how does one become a master in the first place? The answer is practice—thousands of hours of practice. This is implicit in the EPAM theory; what is needed is to build up in long-term memory a vast repertoire of patterns and associated plausible moves. Early in practice, these move sequences are arrived at by slow, conscious heuristic search—"If I take that piece, then he takes this piece . . ."—but with practice, the initial condition is seen as a pattern, quickly and unconsciously, and the plausible move comes almost automatically. Such a learning process takes time—years—to build the thousands of familiar chunks needed for master-level chess.

Clearly, practice also interacts with talent, and certain combinations of basic cognitive capacities may have special relevance for chess. But there is no evidence that masters demonstrate more than above-average competence on basic intellectual factors; their talents are chess-specific (although World Champion caliber grandmasters may possess truly exceptional talents along certain dimensions). The acquisition of chess skill depends, in large part, on building up recognition memory for many familiar chess patterns.

We now have an account of perceptual skills in chess that is consistent with theories drawn from other parts of psychology. There is no lack of tasks for continuing research, and the environment of chess continues to be one of the most fruitful for cognitive studies.

References

1. Simon, H. A., and A. Newell. 1964. Information processing in computer and man. American Scientist 52:281-300.
2. Simon, H. A., and P. A. Simon. 1962. Trial and error search in solving difficult problems: Evidence from the game of chess. Behavioral Science 7:425-29.
3. Baylor, G. W., Jr., and H. A. Simon. 1966. A chess mating combinations program. AFIPS Conference Proceedings, 1966 Spring Joint Computer Conference 28:431-47. Washington, D.C.: Spartan Books.
4. Newell, A., and H. A. Simon. 1972. Human Problem Solving. Englewood Cliffs, New Jersey: Prentice-Hall.
5. deGroot, A. D. 1965. Het Denken van den Schaker. Trans. as Thought and Choice in Chess. The Hague: Mouton & Company.
6. Djakow, I. N., N. W. Petrowski, and P. A. Rudik. 1927. Psychologie des Schachspiel. Berlin: Walter de Gruyter.
7. Chase, W. G., and H. A. Simon. 1973. Perception in chess. Cognitive Psychology 4:55-81.
8. Jongman, R. W. 1968. Het Oog van de Meester. (Doctoral dissertation, University of Amsterdam.) Assen: Van Gorcum & Company.
9. Neisser, U. 1963. The imitation of man by machine. Science 139:193-97.
10. Sternberg, S. 1969. Memory-scanning: Mental processes revealed by reaction-time experiments. American Scientist 57:421-57.
11. Tichomirov, O. K., and E. D. Poznyanskaya. 1966. An investigation of visual search as a means of analyzing heuristics. Soviet Psychology 5:2-15. (Trans. from Voprosy Psikhologii 2(4):39-53).
12. Noordzij, P. 1967. Registratie van oogbewegingen bij schakers. Unpublished working paper, Psychology Laboratory of the University of Amsterdam.
13. Williams, L. G. 1966. The effect of target specification on objects fixated during visual search. Perception & Psychophysics 1:315-18.
14. Ellis, S. H., and W. G. Chase. 1971. Parallel processing in item recognition. Perception & Psychophysics 10:379-84.
15. Simon, H. A., and M. Barenfeld. 1969. Information-processing analysis of perceptual processes in problem solving. Psychological Review 76:473-83.
16. Newell, A., J. C. Shaw, and H. A. Simon. 1958c. Chess-playing programs and the problem of complexity. IBM Journal of Research and Development 2:320-35.
17. Private communication.
18. Feigenbaum, E. A. 1961. The simulation of verbal learning behavior. Proceedings of the Western Joint Computer Conference, 121-32. (Reprinted in Feigenbaum & Feldman, eds. 1963. Computers and Thought. New York: McGraw-Hill.)
19. Simon, H. A., and E. A. Feigenbaum. 1964. An information-processing theory of some effects of similarity, familiarization, and meaningfulness in verbal learning. Journal of Verbal Learning and Verbal Behavior 3:385-96.
20. Noton, D., and L. Stark. 1971. Scanpaths in eye movements during pattern perception. Science 171:308-11.
21. Simon, H. A. 1969. The Sciences of the Artificial. Cambridge: M.I.T. Press, pp. 35-38.
22. Miller, G. A. 1956. The magical number seven, plus or minus two. Psychological Review 63:81-97.
23. McLean, R. S., and L. W. Gregg. 1967. Effects of induced chunking on temporal aspects of serial recitation. Journal of Experimental Psychology 74(4):455-59.
24. Simon, H. A., and K. Gilmartin. 1973. A simulation of memory for chess positions. Cognitive Psychology (in press).
25. Chase, W. G., and H. A. Simon. 1973. The mind's eye in chess. In Visual Information Processing, ed. W. G. Chase. Proceedings of Eighth Annual Carnegie Psychology Symposium. New York: Academic Press.
26. Newell, A., and H. A. Simon. 1972. Human Problem Solving. Englewood Cliffs, New Jersey: Prentice-Hall.
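
Illustrative note. The production-system account given in the text (a condition part that tests for a perceptual feature, and an action part that proposes a move for later evaluation) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a program from the article: the `Production` class, the `propose_moves` function, and the dictionary used to describe board features are all hypothetical, and the only rule shown is the open-file example quoted in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Board = Dict[str, List[str]]   # hypothetical feature description of a position


@dataclass
class Production:
    name: str
    condition: Callable[[Board], bool]   # tests for a perceptual feature
    action: Callable[[Board], str]       # proposes a move for consideration


def propose_moves(board: Board, productions: List[Production]) -> List[str]:
    """Fire every production whose condition is satisfied on this board."""
    return [p.action(board) for p in productions if p.condition(board)]


# The open-file heuristic from the text, written as a single production.
rules = [
    Production(
        name="rook-on-open-file",
        condition=lambda b: bool(b.get("open_files")),
        action=lambda b: f"consider Rook to the {b['open_files'][0]}-file",
    )
]

proposals = propose_moves({"open_files": ["d"]}, rules)
print(proposals)
```

The proposals are only candidates; as the text says, a separate analysis routine would still have to search and evaluate them. Learning, in this picture, amounts to appending further `Production` entries to `rules`.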

