Story of a Research Program

John Sweller

Early Days

I was born in 1946 in Poland to parents who, apart from my older sister, were their families’ sole survivors of the Holocaust. Very few family members who lived outside of Poland survived. One of those was my mother’s sister, my aunt who lived in Adelaide, South Australia. She had become a dentist in Vienna and was fortunate that Nazi regulations did not permit her to practice her profession prior to the war. In 1938 she and her family left for Adelaide, Australia. My aunt was my mother’s only surviving relative, and since Adelaide was almost as far from Europe as my parents could find, that was where my parents, my sister, and I landed in 1948.

My parents’ native language was Polish rather than the commonly spoken Yiddish of Polish Jews, and so my first language, strictly speaking, was Polish. In practice, given my age, English supplanted Polish very rapidly and was my de facto first language. I certainly could understand and speak English prior to arriving at school. As is regrettably common among native English speakers, it became my only language.

At school, I began as a mediocre student who slowly deteriorated to the status of a very poor student by the time I arrived at the University of Adelaide. (Most Australian students attend their home university.) Initially, I enrolled in an undergraduate dentistry course but never managed to advance beyond the first year. While I am sure that was a relief to the Dental Faculty, it also should be a relief to Australian dental patients.

Given the physical proximity of the teeth and brain, I decided next to try my luck at psychology. It was a good choice because my grades immediately shot up from appalling back to mediocre, where they had been earlier in my academic career. I decided I wanted to be an academic.

That decision was not as silly as it sounded. While I was no better at sitting for exams or obtaining good grades on normal assignments than I had ever been, on some occasions we were given research assignments requiring us to devise psychological experiments. Leon Lack, then a tutor in the Department of Psychology, instituted these. At that time, the University of Adelaide’s Psychology Department was emphatically oriented towards experimental psychology. I seemed to have some skill at theorizing about psychological variables and devising experiments. It was the only academic skill at which I had ever rated myself as better than average. As I advanced through my undergraduate years, my grades gradually improved. While they never reached any stellar heights, they now were sufficient to allow me to enroll as a PhD student, a
degree that under the Australian system concentrated entirely on research with no coursework. I was finally in my milieu.

In 1970, under the capable oversight of my supervisor, Tony Winefield, I commenced my research as an experimental psychologist studying learning theory. At that time, Behaviorism was dying and the cognitive revolution was beginning. I began my work on animal learning but decided fairly rapidly that applying cognitive principles to animal learning might not be productive and so rapidly switched to human problem solving. I have conducted my research on learning and problem solving ever since, leading to my career as an educational psychologist.

[Photo: University of Adelaide Campus]

On completing my PhD at the University of Adelaide in 1972 my first academic position was as a lecturer in educational psychology in the teacher training program at the Tasmanian College of Advanced Education in Launceston. I was unused to living in a small town, unused to living alone rather than with my family and, of course, unused to teaching rather than solely carrying out research. I liked the college, and Launceston is an exceptionally attractive town, but I needed to leave. I stayed for a personally difficult year before moving to Sydney for an equivalent position at the University of New South Wales (UNSW), where I have remained ever since.

I was at home in Sydney. After three years, I married my wife, Susan, a 1956 refugee from Hungary, and we have two daughters, Naomi and Tamara. I was happy to live in Sydney permanently. At UNSW, I recommenced my research career, studying problem solving.

The Beginnings of Cognitive Load Theory

After several non-descript experiments, I saw some results that I thought might be important. I, along with research students Bob Mawer and Wally Howe, was running an experiment on problem solving, testing undergraduate students (Sweller, Mawer, & Howe, 1982). The problems required students to transform a given number into a goal number where the only two moves allowed were multiplying by 3 or subtracting 29. Each problem had only one possible solution and that solution required an alternation of multiplying by 3 and subtracting 29 a specific number of times. For example, a given and goal number might require a 2-step solution requiring a single sequence of: × 3, − 29 to transform the given number into the goal number. Other, more difficult problems would require the same sequence consisting of the same two steps repeated a variable number of times. For example, a 4-step problem always had the solution: × 3, − 29, × 3, − 29 while a 6-step problem required 3 iterations of × 3, − 29. Accordingly, all problems required alternation of the two operations a variable number of times.

My undergraduates found these problems relatively easy to solve with very few failures, but there was something strange about their solutions. While all problems had to be solved by this alternation sequence because the numbers were chosen to ensure that no other solution was possible, very few students discovered the rule, that is, the solution sequence of alternating the two possible moves. Whatever the problem solvers were doing to solve the problems, learning the
alternating solution sequence rule did not play a part.

Cognitive load theory probably can be traced back to that experiment. My objections to the many variations of discovery and problem-based learning also have a similar source. While the puzzle problem-solving task used had no direct educational relevance because such tasks do not form part of any curriculum, the results seemed to say something about how students learned and solved problems. It was obvious to me that if I had simply informed students to solve each problem by alternating the two moves until they reached solution, they would have immediately learned the rule and would have been able to solve any problem presented to them no matter how many moves were required for solution. Of course, since these were problem-solving experiments, I had not informed participants of the alternation rule and most failed to discover the rule for themselves.

It seemed plausible to me that the same processes might apply when students were asked to solve problems in an educational context. We give students problems to solve in subjects such as mathematics with the expectation that they would learn to solve such problems. If my experimental results were generalizable to educational problems, that expectation may not be realised. Perhaps we should be showing students how to solve problems rather than having them solve the problems themselves?

[Photo: University of New South Wales Campus]

Early Theoretical Issues

The next step was to test whether educational problems had the same characteristics as the puzzle problems that I had used. Despite being an obvious step, it was not one that I took. Before testing the hypothesis using educational problems, I decided to try to determine the cognitive mechanisms that caused my strange experimental results. Specifically, why could my participants easily solve their problems but seem to learn little from the exercise? I needed to determine their problem solving strategy and needed to analyse why that strategy was preventing learning. It took several years during the 1980s to identify the relevant cognitive structures and functions with much of the work continuing to use puzzle problems.

In the 1970s, the study of problem solving had expanded and made considerable progress. The seminal work was carried out by Newell and Simon at Carnegie-Mellon University in Pittsburgh (Newell & Simon, 1972). They had identified the characteristic strategy that humans use when problem solving: means-ends analysis. A means-ends strategy requires problem solvers to consider their current problem state, the goal state, extract differences between the two states, find a problem-solving operator that can reduce the differences, and repeat the process until the goal is reached. This strategy requires problem solvers to process in working memory all of the information concerning problem states and problem operations simultaneously. I figured that since it was well known that working memory is very limited in capacity and duration, it is likely that when a means-ends strategy was used, nothing else can be considered. The result is that problem solvers can successfully solve a problem but learn nothing from the exercise if no
information is transferred to long-term memory.

Cognitive Load Effects

This process seemed to explain why my problem solvers could solve their problems but not discover the rule that they had solved all of them by alternating the two possible moves. If students solving educational problems used the same procedures, then the use of problem solving in educational contexts should be questioned. The function of problem solving using means-ends analysis seemed to be to reach the goal of a problem, not to learn, where learning was defined as transferring knowledge to long-term memory.

Goal-Free Effect. The goal-free effect derived from this reasoning. It was the first cognitive load theory effect although in some senses, the theory derived from the effect rather than the effect from the theory. Here is the reasoning that was used. If working memory during problem solving was overloaded by attempts to reach the problem goal thus preventing learning, then eliminating the problem goal might allow working memory resources to be directed to learning useful move combinations rather than searching for a goal. Problem solvers could not reduce the distance between their current problem state and the goal using means-ends analysis if they did not have a specific goal state. Rather than asking learners to “find Angle X” in a geometry problem, it might be better to ask them to “find the value of as many angles as possible”.

You can reduce differences between where you are and where you are going if you know that where you are going is to find a value for Angle X. You cannot find such differences if you are merely attempting to find as many angle values as you can. For example, you can work backwards from a goal such as “find angle X”. You cannot work backwards from the goal “find the value of as many angles as you can.” It is a different type of goal that requires a different problem solving strategy.

The initial work on the goal-free effect used puzzle problems and was carried out with Marvin Levine during my first sabbatical at the State University of New York at Stony Brook. I discussed my ideas with him and we devised the first experiments on the goal-free effect. We established that transfer effects were substantially enhanced by the use of goal-free, puzzle problems (Sweller & Levine, 1982).

I had written to several people asking whether I could visit them during a sabbatical. Apart from Marvin Levine, none were very enthusiastic with most making it clear that a visit from me would be a nuisance. In contrast, Marvin was enthusiastic. Susan and I arrived on Long Island for a one-year stay. It was our first trip outside of Australia or New Zealand. Every weekend, we would catch the train to Manhattan, going to museums and attending concerts and plays. We wandered all over Manhattan. At that time, Manhattan had a reputation for being dangerous but all the criminal behavior must have been occurring behind us because although we were offered drugs on a regular basis, we never saw anything violent. We did avoid the subway so perhaps that saved us.

On returning to Sydney, the next step was to provide evidence that the goal-free effect applied to educational problems not just puzzle problems. Elizabeth Owen, Bob Mawer, and Mark Ward, the first two mathematics teachers and the third a physics teacher, demonstrated the effect using geometry and physics problems (Owen & Sweller, 1985; Sweller, Mawer, & Ward, 1983). That work provided us with the first cognitive load theory effect using instructional materials.

Worked Examples Effect. The worked example effect, according to which learners who study worked examples perform better
on test problems than learners who solve the same problems themselves, also derived from the reasoning that conventional problem solving interfered with learning because it concentrated on reaching a problem goal rather than transferring knowledge to long-term memory. Papers by Graham Cooper demonstrating the worked example effect using algebra problem solving were published (e.g. Cooper & Sweller, 1987). More recently, in her PhD, Arianne Rourke demonstrated the worked example effect using students learning designers’ styles (Rourke & Sweller, 2009). Juhani Tuovinen demonstrated an advantage of worked examples over discovery learning in his PhD that I supervised (Tuovinen & Sweller, 1999). Sunah Kyun, a PhD student from Korea supervised by Slava Kalyuga and me, extended the work on worked examples to learning about English literature for students studying English as a foreign language (Kyun, Kalyuga, & Sweller, 2013). We have had several PhD students working on cognitive load theory to teach English as a foreign language. Jase Moussa-Inaty, whose PhD was supervised with Paul Ayres (Moussa-Inaty, Ayres, & Sweller, 2012) and Yali Diao (Diao & Sweller, 2007) worked in this area studying native Arabic and native Chinese speakers, respectively, learning English.

Difficulties Convincing Researchers of the Problem with Problem Solving

In 1984, I spent a few months on a sabbatical at the Learning Research and Development Center (LRDC) in Pittsburgh that, along with Carnegie-Mellon University, was the center for research into problem solving. I tried, far too ambitiously, to convince people that learning via problem solving was a dead end. Predictably, I failed. At that time, writing computational models of cognitive processes was strongly emphasized and since Pittsburgh was the center for such activity I wrote and published a model, based on the goal-free effect, supporting the suggestion that problem solving imposed a heavy cognitive load that interfered with learning (Sweller, 1988).

[Photo: Learning Research and Development Center (LRDC) in Pittsburgh]

It was the worst possible time to be publishing papers calling into question the efficacy of using problem solving as a learning device. Our increased knowledge of problem solving, largely led by researchers in Pittsburgh, led many to suggest and most to accept that educational problem solving should be emphasized. For reasons that are unclear to me, randomized, controlled trials testing the effects of problem solving on learning were not used. The fact that evidence in support of the notion that problem solving was a relatively good way to learn was contradicted by the worked example effect was treated as an irrelevant detail. Most of the field leapt enthusiastically on the problem solving bandwagon. The research on worked examples was treated either with hostility or more commonly, ignored, a state of affairs that lasted for about two decades.

Further Instructional Effects of Cognitive Load Theory

In the meantime, ignoring the issues the field had with worked examples, we needed to extend our knowledge of the
worked example effect. The original experiments demonstrating that studying worked examples was better than solving problems had been carried out using algebra transformation problems such as, a + b = c, solve for a (Cooper & Sweller, 1987). The next and obvious step was to demonstrate that other areas such as geometry or physics problem solving also led to the worked example effect. We ran experiments comparing problem solving with studying worked examples using geometry or physics problems and found no statistically significant differences whatsoever. We were mystified. Why should studying algebra worked examples be superior to solving the equivalent problems but studying geometry or physics worked examples prove no better than solving the equivalent problems?

After several years we realized that the issue was not whether worked examples or problems were used but whether different instructional procedures increased or decreased working memory load. It was a lesson that we seemed to have to re-learn every few years. The issue could never be whether the use of worked examples was better than solving problems but rather, whether the particular worked examples used reduced unnecessary working memory load compared to solving problems. In the case of our algebra worked examples, the conventional format used to present a worked example did reduce unnecessary working memory load compared to solving a problem. In the case of geometry and physics worked examples, the conventional worked example format did not reduce working memory load compared to solving problems and so the worked examples were ineffective.

Split-Attention Effect. The issue with geometry or physics worked examples was split-attention. Learners studying worked examples in conventionally structured geometry or physics had to split their attention between multiple sources of information and then mentally integrate them. For example, if a geometry statement mentions Angle ABC, learners have to note the angle and find it on the diagram. Until the statement and the diagram have been mentally integrated, neither can make any sense. This activity has to occur in limited working memory and the sole reason it has to occur is because of the conventional format of geometry worked examples. If instead, the statements are placed on the diagram or have arrows indicating the relations between each statement and the diagram, the worked example is physically integrated and working memory resources do not have to be expended to integrate the two sources of information. Extraneous cognitive load is reduced and learning is facilitated. By eliminating split-attention in the worked examples, the same worked example effect that was obtained using algebra problems can be obtained using geometry or physics problems.

The issue did not arise in the case of algebra problems because while the conventional way of presenting algebra worked examples does not incorporate split-attention, the conventional way of presenting worked examples in geometry and physics does incorporate split-attention. A worked example in algebra consists only of one line followed by the next line with a transformation. Learners do not have to split their attention between the statements and a diagram or between different categories of statements as they do in geometry or physics. We had been fortunate that our first attempts to test the worked example effect happened to use an area that conventionally did not incorporate split-attention, otherwise we may never have discovered the worked example effect. The effect applies to the presentation of all information, not just worked examples. Rohani Tarmizi, a Malaysian mathematics teacher enrolled in a PhD, demonstrated the split-attention effect using geometry problems (Tarmizi & Sweller, 1988) while Mark Ward demonstrated the effect using physics problems as part of his PhD (Ward & Sweller, 1990). Paul Owens demonstrated similar effects in his PhD on music education (Owens & Sweller, 2008) as did Narciso Cerpa studying computer education
for his PhD (Cerpa, Chandler, & Sweller, 1996). Physically integrating disparate sources of information so that they no longer have to be mentally integrated reduces extraneous cognitive load and facilitates learning.

Modality Effect. The modality effect provided an alternative technique to physically integrating disparate sources of information. Seyed Mousavi, an Iranian PhD student supervised by colleague Renae Low and me, discovered the instructional modality effect (Mousavi, Low, & Sweller, 1995). When dealing with, for example, a diagram and text, instead of presenting the text in written form, it can be presented in spoken form. By using both auditory and visual channels, working memory capacity can be increased. Thus, the modality effect occurs when learning is facilitated by using both visual and auditory channels. It also is useful to provide markers on the visual information to indicate what the auditory information is referring to, as found by Hyunju Jeung from Korea in her PhD with Paul Chandler and me (Jeung, Chandler, & Sweller, 1997).

Transient Information Effect. When demonstrating the modality effect, care must be taken to ensure that spoken material is short and simple. Auditory material is transient with new information replacing old information, unlike written material which is permanent. Indeed, the permanence of written material and the transient nature of spoken material is presumably why we invented writing. Wayne Leahy, a previous PhD student of mine and now a colleague at Macquarie University, found that the modality effect is reversed when using long, complex spoken text (Leahy & Sweller, 2011). Any advantage of using both auditory and visual processors is negated by presenting lengthy, complex text in spoken form. Such text always should be presented in written form so that learners can easily return to given segments to ensure that they understand the text. Anna Wong and her PhD supervisor, Nadine Marcus, a previous PhD student of mine and now employed as an academic in the School of Computer Science at UNSW, found similar effects using animation that is transient compared to static graphics that are permanent (Wong, Leahy, Marcus, & Sweller, 2012). Samuel Ng, a PhD student from Singapore who Slava Kalyuga and I supervised also studied the effects of transience due to animation when teaching physics (Ng, Kalyuga, & Sweller, 2013). These findings led to the transient information effect.

The transient information effect is interesting in that it explained some failures to obtain the modality effect. The modality effect had been discovered in the mid-1990s. It was frequently replicated over the years by many researchers but there were some notable failures and even reversals of the effect with a few studies indicating single modality presentation was superior to dual modality presentation. Of course, like all cognitive load effects, the modality effect will only be obtained when the instructional procedure reduces extraneous working memory load. A long, complex oral statement, because it is transient, will increase rather than decrease working memory load compared to a written statement, leading to the transient information effect and a reversal of the modality effect. As always, restructuring information is only effective insofar as it reduces working memory load. Just as the worked example effect cannot be obtained using split-attention information, the modality effect cannot be obtained using lengthy, complex, transient, oral information.

The transient information effect also has implications for technology-mediated presentation of information. Frequently, we introduce new technology because we can. For example, we may use spoken rather than written information or animations rather than static graphics because technology now allows us to use these techniques more readily. Commonly, the cognitive consequences of new instructional procedures are far more important than the medium used.

Redundancy Effect. In the late 1980s, Paul Chandler turned up to enroll in a PhD
and commence the most effective research collaboration I have had. His work had an immense influence on cognitive load theory, first as a student and then as a colleague. Several cognitive load effects were discovered during our collaboration. His PhD investigated the redundancy effect (Chandler & Sweller, 1991). Providing learners with any unnecessary information requires them to process that information which can overload working memory. For example, one of my PhD students, Susannah Torcasio, found that providing beginning readers with pictures was redundant and interfered with learning to read (Torcasio & Sweller, 2010). Janette Bobis in her PhD found that including additional, complex diagrams during mathematics education could lead to the redundancy effect (Bobis, Sweller, & Cooper, 1994). Most people assume that providing learners with additional information is at worst, harmless and might be beneficial. Redundancy is anything but harmless. Providing unnecessary information can be a major reason for instructional failure.

As was the case for the split-attention effect, we found the redundancy effect by accident. In fact, the redundancy effect flowed from the split-attention effect. We had previously found that requiring learners to split their attention between a diagram and text interfered with learning compared to physically integrated diagrams and text. We erroneously assumed that all diagrams and text had the same properties. Instead, the logical relations between the two sources of information were critical. For the split-attention effect, the sources of information had to refer to each other and be unintelligible in isolation. For example, a geometry statement such as “Angle ABC = Angle XYZ” is likely to be unintelligible without reference to the diagram. In contrast, some statements simply reiterate information that can be seen just by looking at a diagram. A diagram that shows blood flowing from the left ventricle to the aorta does not need a statement saying “Blood flows from the left ventricle to the aorta”. Such statements belong to a different category to statements such as “Angle ABC equals Angle XYZ” which are essential to understand the diagram. Redundant statements are unnecessary and processing them leads to an extraneous cognitive load. Instead of integrating them with a diagram, they should be eliminated due to redundancy, leading to the redundancy effect.

Compound Cognitive Load Effects

Compound cognitive load effects are ones in which different effects interact. The manner in which they interact effectively provides limits to various cognitive load effects. All instructional effects have limits and those limits are just as important as the effects themselves.

Element Interactivity Effect. The first compound effect we found was the element interactivity effect. As indicated above, cognitive load theory effects depend on the imposition of a heavy working memory load by a task. Some instructional material, because of its composition, does not require extensive working memory resources. That material should not be expected to demonstrate any cognitive load effects. In order to determine whether information potentially might impose a heavy working memory load, we needed a measure of complexity. The measure devised was element interactivity. If elements of information interact, they must be processed simultaneously in working memory to be understood, imposing a heavy cognitive load. If they do not interact, they can be processed individually, with less cognitive load.

The concept of intrinsic cognitive load derived from this reasoning. It could be predicted that only when element
interactivity was high resulting in a high intrinsic cognitive load would cognitive load effects occur. Empirical work on the element interactivity effect confirmed this prediction. Unless learners find the information being processed complex and difficult to understand, cognitive load effects will not be obtained. Sharon Tindall-Ford, a PhD student of Paul Chandler and mine, was instrumental in this work, especially as it related to the modality effect (Tindall-Ford, Chandler, & Sweller, 1997). The element interactivity effect suggests that no cognitive load effect can be obtained if element interactivity is low. Cognitive load theory applies to complex material that is difficult to understand.

Expertise Reversal Effects. The element interactivity effect and another important cognitive load theory effect, the expertise reversal effect, are closely related because information that is high in element interactivity for a novice is likely to be low in element interactivity for an expert. Paul Chandler and I first discussed the expertise reversal effect while walking through a Sydney brewery. The brewery trained apprentices and we were there to discuss the possibility of running some training studies. Eventually, we did not run any experiments there and neither did we get any free beer. However, we needed to run some experiments testing the expertise reversal effect. Slava Kalyuga turned up to do a PhD and another great collaboration ensued. He also became a colleague in due course. Slava had almost completed a PhD in the Soviet Union before it collapsed along with Slava’s hopes of completing his degree. Instead, he was exiled into the gulag of cognitive load theory from which he has never managed to escape.

The expertise reversal effect occurs when Instructional Procedure A is superior to B for novices with the superiority decreasing and eventually disappearing or even reversing with increases in knowledge levels. For example, studying worked examples may be better than solving problems for novices but with increased expertise, solving problems may be better than studying worked examples. The reason is that while studying worked examples may be of assistance to novices, as expertise increases, worked examples may become redundant with learners needing to practice solving problems instead. Increased expertise reduces element interactivity and there is no need to reduce working memory load if few memory resources are needed during problem solving.

Alexander Yeung’s PhD that I supervised with Putai Jin was one of the first to demonstrate the expertise reversal effect (Yeung, Jin, & Sweller, 1998). He studied the effect of explanatory notes associated with text. Maria Pachman, a PhD student supervised by Slava and me, did work on the expertise reversal effect as it related to deliberate practice when studying mathematics (Pachman, Sweller, & Kalyuga, 2013). Kimberley Leslie, for her PhD supervised by Renae Low, Putai Jin, and me, also studied the expertise reversal effect when primary school students learn science concepts (Leslie, Low, Jin, & Sweller, 2012).

International Acceptance of Cognitive Load Theory

While cognitive load theory work was continuing in Sydney, similar research was beginning to gain some traction in the rest of the world, primarily Europe. It was largely ignored in Australia, where it continues to be ignored. The first large-scale interest in the theory occurred in Holland, led by Jeroen van Merriënboer and his brilliant PhD student, Fred Paas. They confirmed the worked example effect, and invented two new effects.

Completion and Variability Effects. The completion problem effect occurs when learners asked to complete the solution to a partially completed problem learn more rapidly than students asked to solve a problem without being shown any of the moves (Paas, 1992). The variability effect occurs when learners shown highly variable worked examples learn more than learners shown more similar worked examples (Paas & van Merrienboer, 1994). Yuan Gao, a PhD student who I co-supervised with
colleagues, Putai Jin and Renae Low,
recently generalized the variability effect to
learning to listen in a foreign language (Gao,
Low, Jin, & Sweller, 2013).
Measuring Cognitive Load. Jeroen and
Fred’s most important work at this time was
in devising a subjective rating scale to
measure cognitive load (Paas, 1992; Paas &
van Merriënboer, 1993). Subsequently, the
measurement of cognitive load became an
important sub-field in its own right with
Detlev Leutner and Roland Brünken in
Germany along with Jan Plass in New York
contributing heavily to the field and in the
process, transforming it (Brünken, Plass, &
Leutner, 2004).
Guidance Fading Effect. Alexander Renkl from Germany carried out ground-breaking research. He became the world’s leading expert on the worked example effect. His work on the guidance fading effect, according to which learning is facilitated by gradually fading the assistance given to learners as they gain expertise, has been critical (Renkl & Atkinson, 2003).

Domain-Specific vs. Generic Skills. In France, André Tricot advanced theoretical work on cognitive load theory and introduced it to the French-speaking world, along with Lucile Chanquoy (Chanquoy, Tricot, & Sweller, 2007). André and I collaborated on a paper concerned with the relative merits of emphasizing domain-specific as opposed to generic skills (Tricot & Sweller, 2014).

When it came to academic organizational flair, the Europeans, especially the Dutch, made me look like a rank amateur. They organized symposia on cognitive load theory, first at European conferences and then in the US and organized special issues of both American and European journals on the theory. Those special issues had a far greater impact on the field than individual papers. Alexander Renkl made contact with researchers at Carnegie-Mellon and that contact had more influence in acquainting Americans with the worked example effect than I had ever managed.

More Recent Cognitive Load Effects

The theory continued to generate new instructional effects that will be described briefly below.

Isolated Elements Effect. The isolated elements effect, according to which interacting elements of very complex, high element interactivity information that are initially presented as isolated elements improve learning compared to having them presented in their “natural”, integrated format, was demonstrated by Edwina Pollock, one of my PhD students (Pollock, Chandler, & Sweller, 2002). Paul Ayres devised important experiments on the isolated elements effect (Ayres, 2006). Paul Blayney, an accountancy academic at Sydney University who completed a PhD with me, demonstrated many of the conditions required for the isolated elements effect by relating it to the element interactivity and expertise reversal effects (Blayney, Kalyuga, & Sweller, 2010).

Imagination Effect. The imagination effect occurs when students asked to imagine concepts or procedures learn better than students only asked to study the same materials. Graham Cooper, Sharon Tindall-Ford, Paul Chandler and I carried out the initial work on this effect (Cooper, Tindall-Ford, Chandler, & Sweller, 2001). Subsequently, Paul Ginns, currently an
academic at Sydney University, further developed this work during his PhD with me (Ginns, Chandler, & Sweller, 2003), as did Wayne Leahy (Leahy & Sweller, 2005). Paul Ginns also was instrumental in publishing several meta-analyses of individual cognitive load effects (e.g. Ginns, 2005).

Collective Working Memory Effect. Femke Kirschner and Fred Paas introduced the collective working memory effect to deal with collaborative learning (Kirschner, F., Paas, F., & Kirschner, P., 2009). When learners with different knowledge bases collaborate on a task, each individual, of course, has a limited working memory but by collaborating, in effect they are pooling their working memories. Providing the costs of collaborating are less than the effective increase in working memory due to pooling, performance should be increased compared to individual learning. Endah Retnowati, a PhD student of Paul Ayres and mine from Indonesia, followed up on that work (Retnowati, Ayres, & Sweller, 2010).

Evolutionary Educational Psychology and Cognitive Load Theory

While the ultimate aim of cognitive load theory is to provide instructional effects leading to instructional recommendations, the theory itself continued to develop leading to new effects and controversies in the first 10 years of this century. Evolutionary psychology was beginning to gain a degree of prominence and some aspects of it were relevant to both the theoretical base of cognitive load theory and instructional design.

It became apparent that the information processing orientation used by cognitive load theory was analogous to the information processes that formed the base of evolution by natural selection. The suggestion that human cognition and evolution by natural selection were analogous was not new, with the ancestry of the analogy traceable back to Darwin. There seemed an obvious correspondence between: 1. information held in DNA and in long-term memory; 2. the transmission of information during reproduction and the transmission of information between communicating humans; and 3. random mutation and random generate and test during problem solving. On the other hand, working memory, so central to human cognition, did not seem to have an obvious analogous function or process in evolutionary biology.

My wife, Susan, who is a biologist, provided the necessary link (Sweller & Sweller, 2006). The epigenetic system plays the same role in evolutionary biology as working memory plays in human cognition. The epigenetic system uses the environment to determine biological structures and functions. For example, a person’s skin and liver cells have exactly the same DNA in their nuclei but have vastly different structures and functions. Those differences are due to the epigenetic, not the genetic system. Both working memory and the epigenetic system act as a link between the information store (long-term memory and a genome) and the external environment. The analogy between biological evolution and human cognition proved to have instructional implications, discussed below.

There is a second, very direct role played by evolution by natural selection when determining instructional procedures. David Geary provided the relevant theoretical constructs (Geary, 2012). He described two categories of knowledge: biologically primary knowledge that we have evolved to acquire and so learn effortlessly and unconsciously and biologically secondary knowledge that we need for cultural reasons. Examples of primary knowledge are learning to listen and speak a first language while virtually everything learned in educational institutions provides an example of secondary knowledge. We invented schools in order to provide biologically secondary knowledge.

I had been vaguely aware of Geary’s work but had not paid much attention to it because I assumed it was not relevant to my immediate research concerns. In 2006, Jerry Carlson and Joel Levin asked me to write a commentary (Sweller, 2007) on an article
written by David Geary (2007) to be published in an edited book. I agreed, which necessitated my reading his work with considerably more care and thought than I had hitherto managed. I was astonished.

What Geary was proposing had the potential to change our field. It provided a resolution to issues that had seemed intractable to me.

Instructional Consequences of Evolutionary Educational Psychology

This is the context for the issues Geary dealt with. For many years our field had been faced with arguments along the following lines. Look at the ease with which people learn outside of class and the difficulty they have learning in class. They can accomplish objectively complex tasks such as learning to listen and speak, to recognize faces, or to interact with each other, with consummate ease. In contrast, look at how relatively difficult it is for students to learn to read and write, learn mathematics or learn any of the other subjects taught in class. The key, the argument went, was to make learning in class more similar to learning outside of class. If we made learning in class similar to learning outside of class, it would be just as natural and easy.

How might we model learning in class on learning outside of class? The argument was obvious. We should allow learners to discover knowledge for themselves without explicit teaching. We should not present information to learners – it was called “knowledge transmission” – because that is an unnatural, perhaps impossible, way of learning. We cannot transmit knowledge to learners because they have to construct it themselves. All we can do is organize the conditions that will facilitate knowledge construction and then leave it to students to construct their version of reality themselves. The argument was plausible and swept the education world.

The argument had one flaw. It was impossible to develop a body of empirical literature supporting it using properly constructed, randomized, controlled trials altering one variable at a time. The worked example effect demonstrated clearly that showing learners how to do something was far better than having them work it out themselves. Of course, with the advantage of hindsight provided by Geary’s distinction between biologically primary and secondary knowledge, it is obvious where the problem lies. The difference in ease of learning between class-based and non-class-based topics had nothing to do with differences in how they were taught and everything to do with differences in the nature of the topics.

If class-based topics really could be learned as easily as non-class-based topics, we would never have bothered including them in a curriculum since they would be learned perfectly well without ever being mentioned in educational institutions. If children are not explicitly taught to read and write in school, most of them will not learn to read and write. In contrast, they will learn to listen and speak without ever going to school.

Explicit Instruction. Coinciding with these theoretical developments, Paul Kirschner, an American who had transformed himself into a Dutchman and whom I knew from meetings at cognitive load theory symposia, suggested we collaborate on writing a paper advocating the use of explicit instruction rather than the minimal guidance commonly promoted. We wrote some drafts of the paper and Paul suggested that we should send it to Dick Clark for advice before submitting it to a publisher. Dick made it clear he liked the paper very much and gave some excellent advice for further improvements. With more work it became clear that Dick’s advice was becoming too extensive for a mere acknowledgement and so he was
included as a co-author. Thus began an extensive collaboration between the three of us that continues to this day.

The Kirschner, Sweller, and Clark (2006) paper had an immediate impact, unlike the 10-20 year wait before my empirical papers were noticed. It was polarizing with many opinions either strongly positive or strongly negative. Some idea of the reactions can be found in the edited collection of Sig Tobias and Tom Duffy (Tobias & Duffy, 2009), a book that derived from the several symposia on the topic of Constructivism generated by the original paper. Whatever the long-term influence of this work, it is notable that the term “constructivism” seems to have largely disappeared from the current research literature. Whether that disappearance was due to our efforts or other factors, and whether some of the replacements are any better than constructivism, are debatable topics.

There is a strong confluence between the work on evolutionary educational psychology and the issues associated with explicit instruction and minimal guidance. Humans are amongst the very few species that provide and obtain extensive information from other members of the species. Providing and obtaining information is a biologically primary skill. We are very good at it. Given this biologically primary skill, the suggestion that we should not explicitly provide learners with information is bizarre. I hope that the minimal guidance movement is an aberration that does not return.

Domain-Specific Knowledge. There are other implications that flow from the evolutionary educational psychology base of cognitive load theory. In the last few decades there has been a considerable emphasis on the acquisition of generic, cognitive skills such as, in mathematics, general problem-solving skills rather than domain-specific skills. An example of a domain-specific skill in mathematics might be learning that when faced with a problem such as a/b = c, solve for a, one should multiply both sides by the denominator. Learning this procedure is important to solving a limited class of problems but useless when solving unrelated problems. While generic cognitive skills are far more important than domain-specific skills, because of that very importance, they are likely to be biologically primary and so do not need to be taught.

André Tricot and I suggested that at least one of the reasons for the success of cognitive load theory has been its emphasis on teaching domain-specific rather than generic skills (Tricot & Sweller, 2014). Our field’s emphasis on the teaching of generic cognitive skills may be entirely misplaced, explaining the lack of a body of evidence supporting the teaching of such skills.

While generic cognitive skills can be learned but not taught, they can be used in the acquisition of domain-specific skills that need to be both taught and learned (Paas & Sweller, 2012). Learners who know how to use a generic cognitive skill may not know that it can be usefully applied when dealing with particular, domain-specific content. Simply pointing out to students that they should use a generic, cognitive skill may be beneficial when learning domain-specific content. Recently, Amina Youssef-Shalala, a PhD student of Paul Ayres and mine, Carina Schubert, a visitor from Germany, and I found that it can help to tell students that, in the absence of domain-specific knowledge indicating how certain problems can be solved, they should use a strategy of randomly generating moves (Youssef-Shalala, Ayres, Schubert, & Sweller, 2014). For this strategy, students are told to make as many moves as they can without any reference to the goal of the problem. That strategy can be effective on transfer problems. While students did not need to be taught how to randomly generate moves because it is a biologically primary skill, they did need to be told to use the strategy in the
specific, biologically secondary domains that they were studying.

Some Current Work

Currently, while technically retired, I am continuing to conduct research and supervise research students. As one example, I have commenced collaboration with Tzu-Chien Liu from Taiwan who is carrying out important work on cognitive load theory. Several of my students and collaborators are working on relations between the worked example, element interactivity and generation effects and on element interactivity and the testing effect. While the concept of element interactivity has been around for 20 years and is central to cognitive load theory, cognitive load theorists have largely ignored it. Its centrality needs to be emphasized. As indicated above, cognitive load theory only applies to complex information that is high in element interactivity. It is not a theory of everything and cognitive load effects should not be expected using low element interactivity information.

Lessons Learned

There are several general lessons (not generic cognitive skills!) that I have learned over an almost half century of research. The main one is that age-old lesson that applies to many facets of life: if you are confident of your ideas, persist. In the case of research, ignore negative editorial decisions and astonishingly ignorant reviewers. I have had my fair share of both. But as a frequent reviewer, I learned long ago that my reviews are just as incompetent as the worst. We all try our best but we are attempting to judge work that we are clearly inept to judge. The competent judges of our work are several generations hence. We do not have sufficient knowledge to properly judge current work but, of course, despite our incompetence, there is no one else available. It will be up to future generations to determine the usefulness of our efforts.

Contrary to what we might expect of researchers allegedly devoted to new ideas and new knowledge, we are incredibly conservative. The closer our ideas are to the prevailing zeitgeist, the more acceptable they will be. Most research papers support the prevailing views, whatever those views might be. Therefore, do not hesitate to advance ideas conflicting with the current zeitgeist. They may be ignored for a while but, if they do have merit, they are very likely to be ultimately recognized.

It is also useful to follow up your ideas systematically and build a program of research studying one variable after another rather than floating from one area to another. That may take some stubbornness, especially since the merit of one’s work may not become clear for decades, but such a program of research has a better chance of obtaining eventual recognition than an identical number of unrelated studies.

I have, at times, advanced suggestions that many felt were outrageous. Some of those suggestions now seem to be considered self-evident by many in the field. How did I change people’s views? I do not think I did. Rather, people retired or died to be replaced by younger people who did not have to carry the burden of their own long history. Societal renewal and change over the generations is as much a part of the human condition as individual resistance to change, at least in some societies.
