
Journal of Experimental Psychology: Learning, Memory, and Cognition
1986, Vol. 12, No. 1, 42-53
Copyright 1986 by the American Psychological Association, Inc. 0278-7393/86/$00.75

Bizarre Imagery: The Misremembered Mnemonic


Neal E. A. Kroll, Eva M. Schepeler, and Karen T. Angin
University of California, Davis

Two experiments were performed to measure the effects of bizarre imagery and image interaction on
the brief and long-term memory of word pairs. Subjects in Experiment 1 performed an incidental
learning task and were tested with both free- and cued-recall tasks. Subjects in Experiment 2 performed
intentional learning tasks and were tested with cued-recall tests. Because performance in the delayed
tests of Experiment 1 was extremely poor, subjects in Experiment 2 were given a much more intensive
training procedure. In both experiments, bizarre imagery did not improve memory more than plausible,
interactive imagery. The degree of interaction in the image was a strong determinant of cued-recall
performance at both retention intervals. Most subjects in Experiment 2, questioned after their cued-recall
test, believed that they had remembered more bizarre than plausible pairs, even though this
was clearly not the case.

This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
This document is copyrighted by the American Psychological Association or one of its allied publishers.

Professional mnemonists (e.g., Furst, 1954; Lorayne & Lucas, 1974) have long advocated the use of
bizarre imagery to improve one's memory. O'Brien and Wolford (1982), among others, have suggested
that bizarre imagery should be effective; it should increase the distinctiveness of items, thereby
reducing interference effects. Experiments designed to demonstrate this advantage, however, have had
mixed results. Wollen, Weber, and Lowry (1972), for example, presented subjects with drawings of noun
pairs. These drawings depicted the objects named by the noun pairs either in bizarre or in plausible
ways, and, in addition, the objects were either shown interacting with one another or not. Although
the degree of interaction between the objects proved to be an important variable for memory,
bizarreness had no effect.

There were a number of reasons why this was not accepted as the final evaluation concerning the
effect of bizarreness on memory. For one thing, the subjects in the Wollen et al. experiment had the
pictures shown to them rather than forming their own idiosyncratic images. In an attempt to answer
this criticism, Nappe and Wollen (1973) presented noun pairs and required subjects to form their own
images. Although the bizarre images took much longer to form than did the plausible ones, the
plausible interactive images were remembered just as well as were the bizarre images. Similarly,
Collyer, Jonides, and Bevan (1972) presented subjects with noun-verb-noun triplets that suggested
either bizarre or plausible scenes, and they too found that image-formation times were longer for the
bizarre images, but the plausible images were, in fact, remembered better than the bizarre ones.

Hauck, Walsh, and Kroll (1976) thought that perhaps professional mnemonists preferred bizarre images
because they were required to remember many more items than the experimental subjects had been
required to learn. In an attempt to measure the effect of increasing interference (and increasing
practice), Hauck et al. required their subjects to form images to 48 noun pairs per day over 5 days.
The first member of the pairs remained the same from day to day, but the second member of each pair
was always different. The subjects formed bizarre images to half of the pairs in each list and
plausible interacting images to the other half of the pairs. At the end of the fifth day, subjects
were still taking longer to form the bizarre images than they were to form the plausible images, but
a surprise recall test at the end of the experiment for all five lists showed that there was no
memory advantage for the pairs with bizarre images.

More recently, experimenters have considered the possibility that, although plausible interactive
images may result in as good or better memory as bizarre images over brief retention intervals,
bizarre images might result in better long-term memories. Experimenters making this argument have a
tendency to reference Delin's 1968 experiment, which found that bizarre imagery resulted in better
memory after a 15-week retention interval, but tend not to reference Delin's 1969 experiment with an
equally long retention interval where he concluded: "Bizarreness, if it has any effect, reduces the
effectiveness of imagery mnemonics" (p. 88).

There have been a number of recent experiments which seem to demonstrate the effectiveness of bizarre
imagery over longer retention intervals. Andreoff and Yarmey (1976) reported finding that bizarre
imagery results in a greater improvement over plausible imagery after a 24-hr retention interval than
after an immediate retention interval. However, they also found a large advantage of bizarre over
plausible imagery even on the immediate test and, together with the wording of their instructions,
this suggests that subjects may have had more interaction in their bizarre imagery than in their
plausible imagery.

This research was funded by a University of California Faculty Research Grant to the first author.
The second author helped with the designs of the pilot and second experiments, helped to revise the
word pairs and sentences and to supervise the list presentation phase of these experiments, and
helped to write the manuscript.
The third author helped with the design of the first experiment, helped to prepare the word pairs and
sentences and to supervise the list presentation phase of this experiment, and commented on an early
draft of the manuscript.
Correspondence concerning this article should be addressed to Neal Kroll, Department of Psychology,
University of California, Davis, California 95616.


Webber and Marshall (1978) also argued in favor of a delayed bizarre imagery superiority. In an
intentional learning task with line drawings as stimuli, they found a cross-over interaction with
plausible imagery being superior to bizarre imagery at the 2-min retention interval, but bizarre
imagery being better at the 1-week retention interval. Unfortunately, they also neglected to control
or measure the degree of interaction present in their plausible and bizarre drawings. Webber and
Marshall were kind enough to send the first author reproductions of drawings which they said were
similar to those used in their experiment. The results from an informal testing suggested that the
bizarre drawings were more interactive than the plausible ones. Another study originating in
Marshall's laboratory (Marshall, Nau, & Chandler, 1980) also found an advantage of bizarre over
plausible imagery after a 1-week retention interval, but immediate retention was not measured and,
again, no measure of interaction was attempted.

Opposed to the findings just mentioned of a possible advantage of bizarre imagery at long retention
intervals, there are not only the negative findings of Delin (1969) after 15 weeks and Hauck et al.
(1976) after 5 days but also those of Bergfeld, Choate, and Kroll (1982) who, after a 24-hr retention
interval, found that recall memory was better for bizarre imagery pairs than for noninteractive
imagery pairs, but also found that the bizarre imagery pairs showed more forgetting than plausible
interactive imagery pairs. In addition, both the Hauck et al. (1976) and the Bergfeld et al. (1982)
experiments allowed at least some minimal test of the hypothesis that bizarre imagery might be
relatively more effective in situations involving more potential interference. That is, Hauck et al.
(1976) had subjects learn five response lists to a single stimulus list, and Bergfeld et al. (1982)
used several of the stimuli with two responses. Neither experiment found any relative advantage
accruing to bizarre imagery as a result of this increased degree of interference.

More recently, however, O'Brien and Wolford (1982) have published two experiments which found that
plausible images resulted in better memory at the immediate retention interval, whereas bizarre
images resulted in significantly better memory at a 7-day interval, and with the cross-over occurring
somewhere between 1 and 3 days.

There are several potential problems with these findings; for example, there was no attempt to
measure the degree of interaction, and the overall level of memory at the longer retention intervals
is so low and the recall difference between the conditions is so small (0.6 word for the plausible
and 1.6 words for the bizarre) that a few extraordinary word pairs could be accounting for the entire
effect. Nevertheless, these experiments probably represent the strongest case yet for a possible
improvement in long-term memory resulting from the use of bizarre imagery. The confirmation and
understanding of such a finding could have important implications for the understanding of memory
processes well beyond the study of this particular mnemonic technique. The reason for this is that it
would be one clear-cut case of a variable affecting memory per se; other variables originally thought
to influence memory now appear to affect primarily the rate of learning, including such variables as
depth of processing (Nelson & Vining, 1978) and imagery itself (Olton, 1969).

Another set of findings which seem to support the hypothesis that bizarre imagery results in longer
lasting recallable memories are those of Merry and Graham (1978) and Merry (1980). In these studies,
subjects were asked to rate sentences as plausible or bizarre. Better recall was found for the
bizarre sentences, and this facilitation from the degree of bizarreness appeared to be particularly
pronounced in a delayed test. However, these results held only with free-recall tasks and not with
cued-recall tasks. (The other experiments discussed earlier used cued-recall tests.) Although Wollen
and Cox (1981a) found similar results using Merry's procedure, Wollen and Cox (1981b) discovered a
confounding in this procedure and, when the confounding was removed, so was the free-recall advantage
for bizarre imagery. However, Wollen and Cox (1981b) tested only after a brief (1 min) retention
interval. Consequently, our first experiment was designed to compare the memories resulting from
bizarre and plausible interactive imagery after brief and after 7-day retention intervals with both
free- and cued-recall tasks.

Subjects in Experiment 1 performed an incidental learning task which consisted of rating their
imagery for bizarreness, vividness, and degree of interaction. The subjects in Experiment 2 also
rated their imagery; however, they also knew that their memory for the word pairs would be tested and
they received an intensive learning procedure to ensure a high level of long-term memory performance.

Experiment 1

Method

Subjects. Thirty-two students of introductory psychology or introductory rhetoric classes at the
University of California, Davis served as subjects in this experiment, 16 in each of two
retention-interval groups. They received extra-credit points in their class in return for their
participation.

Materials. The critical pairs consisted of 48 pairs of high imagery nouns. Four sentences were
developed for each of these pairs. In each of these sentences, the first member of the pair appeared
in a beginning noun phrase and the second in an ending noun phrase. When one of these sentences was
shown to a subject, these two nouns were presented in capital letters. The four sentences developed
for each of the critical pairs were of four different types. The four types, with examples for the
pair ANT-COMB, were (a) the plausible-short (P-S) sentence, which was plausible, very brief, and did
not suggest much of an interaction between the critical words, for example, "An ANT goes around a
COMB."; (b) the plausible-low (P-L) sentence, which used modifiers to suggest a complete image, but
like the P-S sentence, suggested a plausible image and a relatively low degree of interaction, for
example, "A large, black ANT slowly turns and detours around a plastic COMB"; (c) the plausible-high
(P-H) sentence, which also suggested a plausible (feasible) image, but included a relatively high
degree of interaction occurring between the two critical words, for example, "A large, black ANT
crawls in and out of the teeth of a plastic COMB"; and (d) the bizarre-high (B-H) sentence, which
also suggested an image with a high degree of interaction between the two critical words, but now the
image was bizarre (impossible, cartoon-like), for example, "A large, black ANT carefully fixes its
hair with a plastic COMB." The P-S sentence for each pair was always very brief; the other three
types of sentences were always considerably longer, but within any set, the three were of
approximately the same length.

During the learning phase of the experiment, any given subject saw only one sentence for any given
critical pair. In addition to these 48
critical pairs, eight other pairs were presented during the learning phase: two within each of the
four sentence types, four at the beginning of the learning list, and four at the end. These were used
to reduce primacy and recency effects on recall results. The subjects also saw two pairs before the
learning phase began (one in the instructions and one in the practice phase), both of which were
presented within P-H sentences. The entire list of words and sentences used in Experiment 1 are
presented in the Appendix.

Procedure and design. Subjects were tested individually and were assigned to either the 5-min or
1-week retention interval group according to a predetermined schedule. Regardless of the assignment,
the subjects were informed that the purpose of the experiment was to measure visual imagery, but they
were not informed that there would be a memory test. All subjects were told that they would have to
participate in a second session 1 week later but were told that this session was to test another
aspect of visual imagery. The subjects were then given a detailed set of instructions to read,
followed by a practice trial, and were strongly encouraged to ask questions or make comments while
reading the instructions or performing the practice trial. After the practice trial, the experimental
(incidental) learning phase began, and subjects were discouraged from commenting during this phase of
the experiment.

Each trial began with a sentence centered on a display screen of a personal computer. The subjects
had been instructed to form an image of the objects named by the two capitalized nouns, using the
sentence to help choose the image. The subjects were asked to form a strong, vivid image and to press
a switch as soon as the image was formed. The times taken to form these images were recorded to the
nearest millisecond. As soon as the subject pressed the switch, the sentence would disappear from the
screen and was replaced by the words "COMMON-BIZARRE IMAGERY RATING SCALE" on the same line, and,
beneath it, the words "1-Common. 2-Plausible. 3-Possible. 4-Unlikely. 5-Bizarre." The subjects then
responded on the basis of the bizarreness of the image formed. As soon as that response had been
made, a new rating scale appeared: "VIVIDNESS OF IMAGERY RATING SCALE," "1-No Image. 2-Faint.
3-Bright. 4-Vivid. 5-Very Vivid." Following the subject's rating of the vividness of image, a third
5-point rating scale appeared: "INTERACTION OF IMAGERY RATING SCALE," ranging from (1) little
interaction to (5) much interaction. Subjects were told that only the second scale asked if they
actually had an image, and they were told, "Although we hope that you are able to form an image for
each sentence, you can rate your ideas for bizarreness and for degree of interaction even if you did
not form a true image." After the third rating response, the screen would go blank for 5 s, then a
tone would sound, and the next sentence would be presented.

Each subject received 12 critical word pairs with each of the four types of sentences. Within a
retention-interval group, a set of 4 subjects received the 48 critical word pairs in the same order,
but across these 4 subjects, each word pair would be shown with all four of the different sentence
types. The next 4 subjects within a group would receive the 48 word pairs in a different order. Only
the first and last four (noncritical) sentences were the same for all subjects. The subjects in the
two groups were yoked so that the first subject in each group received all 48 critical pairs in the
same order with the same sentences.

When subjects completed the entire list of 56 sentences, they were sent to another office, where the
subjects in the 5-min retention group were given the memory tests, and the subjects in the 1-week
retention group were then given an appointment slip to return 1 week later for the second session.
All of the latter subjects returned for the memory test and all expressed surprise when given the
memory test.

The memory test phase of the experiment began with a free-recall test where the subjects were
requested to write down as many of the capitalized nouns as they could remember. They attempted to
write them down as pairs, but if they remembered single words (without the paired word) they were to
write these down as well. When a subject reported that it would be unlikely that any additional words
would be recalled, the subject was given a cued-recall test sheet which contained a list of 52 nouns
and was asked to try to recall the capitalized noun which had been paired with each of the cued
words. The first four cues were always from the first four (noncritical) words in the learning list.
The remaining 48 cues were from the critical pairs and were presented in a different order from that
shown during the learning phase. For each subject half of the cues were the first word of the pair
and half were the second word. When the subject had finished the cued-recall test, the experimenter
explained the purpose of the experiment and requested that the subject not inform other students
about the surprise recall tests. Given that there were no systematic differences either in the free
recall of the first versus the second word or in the use of the first versus the second word as a
cue, the results reported average over first and second word recall.

Results

Image formation: Formation times and ratings. The average formation time and the standard deviation
of the formation times were found for each subject. To reduce the effects of a few inordinately long
formation times, any formation time longer than two standard deviations above the average was
replaced by this maximum. The average times taken to form an image for each of the four sentence
types are presented in the first row of Table 1.

Table 1
Experiment 1: Image Formation Times (in Seconds), Average Ratings (5-Point Scale), Percentage of
Free Recall, and Percentage of Cued Recall as a Function of Sentence Type

                                          Sentence type
                           Plausible   Plausible   Plausible   Bizarre
Dependent measure          short       low         high        high

Image formation times         3.8         6.5         6.4         6.9
Average rating
  Bizarreness                 2.0         2.1         2.2         4.1
  Vividness                   4.1         3.8         3.8         3.7
  Interaction                 3.0         3.0         3.6         3.4
Free recall
  5-min retention            20.8        18.7        20.8        20.3
  1-week retention            4.7         5.2         3.9         5.5
Cued recall
  5-min retention            50.5        55.7        65.6        57.8
  1-week retention            6.3         3.7        10.9         6.8

The speed of reporting an image after a P-S sentence may reflect the relative shortness of P-S
sentences. The P-H and B-H sentences were of approximately the same length, however, and yet the
images following the B-H sentences took significantly longer to form, F(1, 30) = 6.404, p < .025.¹

¹ A reviewer noted that this reaction time is not necessarily measuring only the image formation
time. The long reaction times following bizarre-high (B-H) sentences may reflect a subject's playing
with the image. This is, of course, possible, but (a) subjects were instructed to press the switch as
soon as they had formed an image and (b) if the longer times following B-H sentences reflect
processing beyond initial formation, then these items should have a large memory advantage from that
fact alone. As we will see later no such advantage occurs.
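
The rule used in the Results for inordinately long formation times (find each subject's average and standard deviation, then replace any time more than two standard deviations above the average with that ceiling) can be sketched as follows. This is only an illustration, not the authors' procedure as implemented: the function name `cap_long_times` and the sample data are hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator) and a single pass per subject, neither of which the text specifies.

```python
from statistics import mean, stdev


def cap_long_times(times, n_sd=2.0):
    """Cap one subject's formation times at mean + n_sd * SD.

    Assumption: the SD is the sample SD (n - 1 denominator) computed once
    from the uncapped times; the article does not say which form was used.
    """
    ceiling = mean(times) + n_sd * stdev(times)
    # Times at or below the ceiling are kept; longer ones become the ceiling.
    return [min(t, ceiling) for t in times]


if __name__ == "__main__":
    # Hypothetical formation times (seconds) for one subject.
    example = [5.0] * 9 + [30.0]
    print(cap_long_times(example))
```

Only upper-tail values are altered, so unusually fast responses are left as recorded, which matches the one-sided wording of the rule.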

The average ratings for the four sentence types are presented in the second (plausible/bizarre),
third (vividness), and fourth (degree of interaction) rows of Table 1.

It would appear from the average ratings for plausible/bizarre images that the sentences were
effective in controlling this aspect of the images: Every subject gave a higher (more bizarre)
average rating to the images formed in response to the B-H sentences than to any of the others. There
was, however, also significant variation among the rated bizarreness of the images produced to the
remaining three sentence types, F(2, 62) = 3.930, p < .05.

There was significant variation among the sentence types for the vividness of the images,
F(3, 62) = 9.740, p < .001. However, we had expected the P-S sentences to produce the least vivid
images; instead, they produced the images with the highest vividness ratings. In fact, when the P-S
sentences were removed from the analysis, the variation in vividness ratings was no longer
significant, F(2, 62) = 1.960, p > .05. One possible interpretation of this is that the extra words
in the other sentence types, which were meant to increase the image content, decreased the
image-forming options.

The formation time data were also analyzed as a function of the rating given to the image. For this
analysis (and for the similar memory test analyses), sentences with ratings of either 1 or 2 were
classified as low and those with ratings of either 4 or 5 were classified as high, on each of the
three scales. There were 5 subjects who did not rate any of their images as 1 or 2 on the vividness
dimension, and for these subjects, sentences with vividness ratings of 3 were classified as low. The
average image-formation times for the sentences falling into the low and high categories of the three
rating scales are presented in the first row of Table 2.

Table 2
Experiment 1: Image Formation Times (in Seconds), Percentages of Free Recall, and Percentage of Cued
Recall as a Function of Subjects' Rating

                                              Rating scale
                           Bizarreness       Vividness         Interaction
                                          Degree of attribute
Dependent measure          Low     High      Low     High      Low     High

Image formation times       5.4     6.8       7.1     5.6       6.0     6.0
Free recall
  5-min retention          21.2    20.0      17.7    22.3      17.0    24.2
  1-week retention          5.5     4.6       4.2     4.6       2.9     5.5
Cued recall
  5-min retention          62.4    52.4      61.3    60.9      55.2    60.2
  1-week retention          9.3     5.7       2.7     7.9       4.1    11.4

Given that the subjects were controlling how the images were rated, unequal numbers of images fell
into the various categories, and some subjects did not report images in all eight cells;
consequently, separate analyses were performed on the image-formation times for each of the rating
scales. Images rated as bizarre were formed more slowly than those rated as plausible,
F(1, 30) = 74.12, p < .001, whereas those images rated highly vivid were formed more rapidly than
those rated low in vividness, F(1, 30) = 34.56, p < .001. The degree of interaction did not affect
image-formation time, F(1, 30) = 0.009.²

Free-recall performance. The average percentages of words recalled in the free-recall test are
presented in rows 5 (5-min retention interval) and 6 (1-week retention interval) of Table 1 as a
function of the type of sentence used to present the words to the subject. Analyses of the
free-recall performance as a function of sentence type found no significant differences at either of
the retention intervals nor between any set of sentence types.

Table 2 presents the average percentages of words recalled in the free-recall test as a function of
the rating level on each of the three rating scales. Three 2 × 2 (Rating Level × Retention Interval)
analyses were performed, one for each rating scale. The effect of retention interval was, of course,
significant in each, Fs(1, 30) = 51.40 (bizarreness), 26.67 (vividness), and 33.37 (interaction),
p < .001, but neither the effect of rating level, Fs(1, 30) = 0.352, 1.197, and 2.550, nor the
interaction between retention interval and rating level, Fs(1, 30) = 0.006, 0.965, and 0.820, reached
significance in any of the analyses. Thus, the free-recall performance did not appear to be affected
either by sentence type or by any of the image attributes measured by the rating scales. There was,
however, a suggestion of an effect of rated interaction, and it is possible that a more sensitive
experiment would find this effect to be significant.

Cued-recall performance. The average percentages of words recalled in the cued-recall test are
presented in rows 7 (5-min retention interval) and 8 (1-week retention interval) of Table 1 as a
function of the type of sentence used to present the words to the subjects. Two analyses were
performed, one at each retention interval. Both found significant variation in the percentage of
words recalled as a function of sentence type, F(3, 45) = 3.006 (5-min) and 2.909 (1-week), p < .05.
The main effect was broken down into three component parts of one degree of freedom each: (a)
interactive versus noninteractive, that is, B-H and P-H versus P-S and P-L; (b) B-H versus P-H; (c)
P-S versus P-L. Of these, only the first was significant, and that was significant at both the 5-min
and 1-week retention intervals, Fs(1, 45) = 5.650 and 4.873, p < .05. This first component accounted
for 62.6% of the between-conditions variance at the 5-min retention interval and 55.8% of the
between-conditions variance at the 1-week retention interval.

Table 2 presents the average percentages of words recalled in the cued-recall test as a function of
the rating level on each of the rating scales. Three 2 × 2 (Rating Level × Retention Interval)
analyses were performed, one for each rating scale. The effect of retention interval was, of course,
again significant in each, Fs(1, 30) = 92.00 (bizarreness), 102.93 (vividness), and 60.59
(interaction), p < .001. In addition, the effect of rating level was significant for bizarreness,
F(1, 30) = 11.55, p < .01, and for interaction, F(1, 30) = 5.032, p < .05, but not for vividness,
F(1, 30) = 0.198. The statistical interaction between retention interval and rating level did not
reach significance in any of the analyses, Fs(1, 30) = 2.619, 0.264, and 0.176.

² The extremely low F value reflects the fact that, in this analysis, the effects of both bizarreness
and vividness were appearing in the error term but were nearly balanced across the two levels of
interaction.
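
The breakdown of the cued-recall sentence-type effect into three single-degree-of-freedom components can be illustrated from the rounded cued-recall means in Table 1. The sketch below is a reconstruction under stated assumptions, not the authors' computation: it assumes equal numbers of observations per sentence type (so the per-cell n cancels in the ratio), the contrast weights are one natural coding of components (a) to (c), and all names (`CUED_RECALL`, `CONTRASTS`, `share_of_between_variance`) are hypothetical. Because the inputs are rounded table means, the result only approximates the reported 62.6% and 55.8%.

```python
# Cued-recall means (percent) from Table 1, ordered P-S, P-L, P-H, B-H.
CUED_RECALL = {
    "5-min": [50.5, 55.7, 65.6, 57.8],
    "1-week": [6.3, 3.7, 10.9, 6.8],
}

# Assumed contrast weights for the three components, in the same order.
CONTRASTS = {
    "interactive vs. noninteractive": [-1, -1, 1, 1],
    "B-H vs. P-H": [0, 0, -1, 1],
    "P-S vs. P-L": [1, -1, 0, 0],
}


def share_of_between_variance(means, weights):
    """Percentage of the between-conditions sum of squares due to one contrast."""
    grand = sum(means) / len(means)
    ss_between = sum((m - grand) ** 2 for m in means)
    psi = sum(w * m for w, m in zip(weights, means))
    ss_contrast = psi ** 2 / sum(w * w for w in weights)
    return 100.0 * ss_contrast / ss_between


def orthogonal(a, b):
    """True when two contrasts have a zero dot product (equal-n case)."""
    return sum(x * y for x, y in zip(a, b)) == 0


if __name__ == "__main__":
    for interval, means in CUED_RECALL.items():
        for name, weights in CONTRASTS.items():
            pct = share_of_between_variance(means, weights)
            print(f"{interval:7s} {name:32s} {pct:5.1f}%")
```

With these inputs the interactive-versus-noninteractive component comes to about 62.7% at 5 min and about 55.7% at 1 week, close to the published figures. Because the three contrasts are mutually orthogonal, their shares sum to 100% at each retention interval.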

Thus, the cued-recall performance appeared to be affected by mance in this situation, nor did it give any advantage to word
the degree of interaction, whether measured by sentence type or pairs presented with bizarre sentences.
by rating level, with high interaction leading to better cued recall. Rather than continuing with this experiment, it seemed more
Words linked by images that were rated as bizarre were signifi- advantageous to perform another experiment with an increased
cantly less likely to be recalled than those rated as plausible. The likelihood of a more reasonable recall performance at the end
effect of bizarreness, as measured by sentence type, was in the of a 1-week retention interval.
same direction but was not significant.
The results of Experiment 1, then, did not support the con- Experiment 2
clusions of O'Brien and Wolford (1982). Rather, as with earlier
experiments which controlled for degree of interaction present As in the pilot experiment, the subjects in this experiment
in the imagery (e.g., Bergfeld et al., 1982; Hauck et al., 1976; were told that they would be tested on their memory, and that
Nappe & Wbllen, 1973; Wollen et al., 1972), a high degree of this was the main point of the experiment. After receiving the
interaction in the suggested image proved to be the critical feature presentation list, they were given two memory tests during the
for improving memory. Increasing the degree of bizarreness re- same session: one cuing them with the first word of the pairs
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

and asking them for the second, and one cuing them with both
This document is copyrighted by the American Psychological Association or one of its allied publishers.

sulted in much longer image-formation times (see also Hauck et


al., 1976; Nappe & Wollen, 1973) but did not improve cued words of the pairs and asking them for the image that they had
recall. In fact, rather than finding that bizarre mental images formed. They were then told to return a week later to participate
aided cued recall more than plausible images after a I-week re- in another experiment, at which time they were given another
tention interval, the advantage of plausible images over bizarre (surprise) cued-recall test where they were given the first word
images—when both were interactive—was present over the entire of each pair and asked to recall the second.
interval.
One might complain that the subjects did not know that they were to be tested for their memory of the word pairs and that, therefore, the bizarre-image mnemonic did not have a "fair" chance. After all, the amount remembered after the 1-week interval (6.9% overall and only 11.4% even with images rated as high in interaction) was awfully low for a mnemonic technique. However, if this complaint is valid for our Experiment 1, it is also valid for the O'Brien and Wolford experiments. Their subjects were also uninformed about the memory component of the task and their cued-recall level was even lower than that found with our subjects (4.5% overall and 7% for the words presented with bizarre pictures).

A pilot experiment used materials similar to those of Experiment 1 (but with only B-H and P-H sentence types) to measure cued-recall performance after subjects had been informed of the memory component. In order to accomplish this, while at the same time keeping the likelihood of their practicing during the longer retention interval low, we tested them for half of the pairs immediately, told them that this was a "partial-report technique," and then asked them to return a week later to control "for the possible self-selection occurring among the 'other' subjects who are tested a week later." They were not told that they would be tested for the other half of the list. It was thought that this procedure would also give us a more sensitive within-subjects measure of a potential Bizarreness of Image × Retention Interval interaction.

This pilot experiment was terminated after 8 subjects when it was obvious that giving the subjects foreknowledge concerning the memory test did not improve their memory performance scores with respect to those of Experiment 1 subjects, although it seemed to have lengthened the amount of time they were taking with each pair. In fact, if anything, their performance on the memory tests seemed to be a trifle worse. During the second (delayed) test, the subjects complained that the images tested during the first (immediate) test kept interfering with their efforts to recall those needed for the second test. In any case, foreknowledge of a memory test did not improve memory perfor-

Method

Subjects. Twenty-four students of introductory psychology or introductory rhetoric classes at the University of California, Davis served as subjects in this experiment and received extra-credit points in their class in return for their participation.

Materials. With few exceptions, the word pairs were those used in Experiment 1, but now the word pairs were embedded within P-H and B-H phrases. For example, the phrases for the word pair ANT-COMB were the B-H phrase "a large, black ANT carefully fixing its hair with a plastic COMB" and the P-H phrase "a large, black ANT crawling in and out of the teeth of a plastic COMB." These phrases seemed to be more descriptive of an actual image than were the sentences.

Procedure and design. The subjects in this experiment were told that this was an experiment on memory and that, after receiving the presentation list, they would be given two memory tests: one (word-recall) cuing them with the first word of the pairs and asking them for the second, and one (image-recall) cuing them with both words of the pairs and asking them to report the image that they had formed. They were not told in advance which of the two types of tests they would receive first. (In fact, half of the subjects received one order, and the other half received the other order.)

During the presentation of the list, the word pair appeared in capital letters centered on the computer screen. Below the pair was the word imagine, and below this word was a phrase containing both words, again capitalized. They were told that "professional memory experts claim that one excellent way to remember that two things go together is to form a mental image of the two things interacting with each other." They were asked to attempt to form the image suggested by the phrase, but to pay particular attention to the capitalized nouns. They were warned that sometimes the noun itself would be capitalized, but the plural s might not be. For example, the bizarre phrase for the pair NEWSPAPER-SOLDIER was "Hundreds of NEWSPAPERS marching down the street like well-trained SOLDIERS." In these cases, they were to attempt to form the suggested image, but at the same time, to try to remember the words in the singular. The point of this was to assess the advice often given by professional mnemonists (e.g., Lorayne & Lucas, 1974, p. 16) to use exaggerated plurality to help form bizarre images.

They were instructed to form "a good, strong image as quickly as possible" and to press a lever as soon as they had done so. It was emphasized that, although they were being timed and that they should per-
form as rapidly as possible, it was most important that they form "good, strong images." After pressing the lever indicating that the image was formed, the screen went blank for 1 s and then presented the words INTERACTION OF IMAGERY RATING SCALE and, below that, the following: "1 = No Interaction, 2 = Some Interaction, and 3 = Much Interaction." Following their rating of the degree of interaction between the objects suggested by the capitalized words that was afforded by their image, the screen again went blank, this time for 2 s, followed by the presentation of the next word pair and the related phrase.

Following the presentation of the entire list of 56 pairs, subjects received the two cued-recall tests. In the word-recall test, the first word from a given pair would appear on the screen, followed by the question "What was the other word?" The subject then attempted to recall the other word of the pair. After the subject responded, the experimenter keyed in a code for the correctness of the response, which also signaled the computer to present the correct word pair for 2 s, followed by a blank screen for 1 s, followed by the next cue word. In order to be correct, the word had to be precisely as it was in the presentation list; that is, the response was scored as incorrect if a plural word was made singular or vice versa. Errors of this type were drawn to the subject's attention. The cue words were always the first member of the pair and the order was different than that in the original presentation.

In the image-recall test, both words of a given pair would appear, followed by the question "What was the image?" The subject then attempted to explain the image. They were told to be as complete as possible and were given the following example, which followed from the example given in the instructions prior to the original list presentation:

    BOOK RUG. What was the image? In this case the image phrase was "A dark orange BOOK sliding off the window sill onto the bright green RUG" and an adequate correct response would be "A BOOK sliding off a window sill onto a RUG" but the following response would not: "A BOOK fell onto a RUG."

Following the experimenter's recording of the correctness of the response, the original phrase seen with that word pair was presented for 2 s, followed by a 1-s blank screen, followed by the next word pair. To be correct, the response had to retain the plausibility-bizarreness characteristic suggested by the original phrase. Errors due to making a plausible into a bizarre phrase (which only occurred rarely) or due to making a bizarre into a plausible phrase (which occurred frequently) were drawn to the subject's attention. The order of the word pairs was different than that of the original presentation list and different than that of the word-recall test.

The ordering of the pairs in both presentation lists and recall tests changed after every 4 subjects. Within any 4 subjects with the same order, 2 received the word-recall test before the image-recall test, and across these 2 subjects, the type of phrase used with any particular pair was reversed. Subjects received half of the pairs with bizarre and half with plausible phrases and half of the primacy (first four) and half of the recency (last four) pairs were presented with each of the phrase types.

After finishing both of the recall tests, subjects were told that the first experiment was completed but were requested to return in 1 week to a different room to participate in another experiment. When they returned, they were given another (surprise) cued-recall test, consisting of a test sheet containing the first word of each pair, and asked to recall the second. When they could not recall any more words, they were asked to go back over the list and to write B or P after each cue word (whether or not they had recalled the associated word), indicating the type of image they had formed with that cue word during the original list presentation.

When they had finished, they were asked which pairs they thought they remembered better: those with bizarre images or those with plausible images. Almost all of the subjects reported remembering those with the bizarre phrases better, even though this was, in fact, not the case. When confronted with their actual performance, most supposed they had thought they had done better with the bizarre pairs because bizarre images were recalled more easily or more rapidly than were the plausible images. Consequently, the subjects in the second half of this experiment were asked to first rush through the cued-recall test sheet, writing down only those words that came easily and rapidly; then, when they began to require more time, apparently needing to really search their memories, they were asked to switch from writing their responses with a pencil to a red pen. In this way, we had hoped to separate their fast-and-easy responses from those requiring more time and effort.

Results

Image formation: Formation times and ratings. In this experiment, the untruncated image formation times were used to find the average formation time for images following bizarre phrases and for those following plausible phrases. These averages are presented in the first column of Table 3. As usual, subjects reported needing significantly more time to form bizarre images than to form plausible ones, F(1, 22) = 15.429, p < .001. These times are considerably longer than those obtained in Experiment 1, perhaps because of the additional emphasis that was given in the instructions of this experiment on the importance of remembering both the images and the words.

The average interaction ratings for the two phrase types are presented in the second column of Table 3. The advantage favoring the plausible images is significant, F(1, 22) = 9.181, p < .01.

Table 3
Experiment 2: Image Formation Times (in Seconds), Average Interaction Ratings, Percentage of Immediate Cued Word Recall, Percentage of Immediate Cued Image Recall, and Percentage of Delayed Cued Word Recall as a Function of Phrase Type

                                    Phrase type
Dependent measure               Plausible   Bizarre
Image formation times              11.5       12.7
Average interaction rating          2.5        2.2
Immediate word recall              87.1       78.0
Immediate image recall             75.0       70.5
Delayed word recall score
  Strict scoring                   68.2       65.4
  Lenient scoring                  71.4       70.4

Immediate cued-recall performance. The average percentages of words recalled in the immediate cued word-recall test are presented in column 3 of Table 3. There was not much difference between the word-recall performance of subjects having this as their first test (87.2 and 77.1 for plausible and bizarre phrases, respectively) and those having this as their second test (86.9 and 78.9).

The average percentages of images recalled in the immediate cued image-recall test are presented in column 4 of Table 3. There was a considerable difference between the groups, with those subjects receiving the image-recall test first (89.9 and 84.8) remembering considerably more of the images than did those subjects receiving the image-recall test second (60.1 and 56.2). When evaluating the scores on the immediate cued-recall tests,
one must keep in mind that the experimenter had to make instant decisions concerning the correctness of the responses that the subject was making verbally. The sense of the experimenter was that subjects tended to misremember bizarre phrases as plausible and tended to remember the plurality of the words in the phrases, rather than the way that the word itself had been presented. Both of these tendencies led to more errors for the word pairs presented within the bizarre phrases, F(1, 22) = 8.119, p < .01. The words were remembered better than the images, F(1, 22) = 13.575, p < .01, although this could be at least partially a function of a differential scoring criterion; that is, it is much easier to make a fast decision concerning the correctness of a word than of a phrase.

There was also a Group × Test interaction, resulting from the very low performance obtained on the image-recall test by the group receiving this test second, F(1, 22) = 31.958, p < .001. One possible reason for the Group × Test interaction is that in the image-recall test, subjects received both members of each word pair as the cue for the image. Thus, the group receiving the image-recall test first did well on the image test because they had it immediately, and did well on the word test because they had, in effect, received another word-pair practice trial before receiving the word-recall test. The group receiving the word-recall test first, on the other hand, still did well on the word-recall test because they received this test immediately, but did poorly on the image-recall test because the intervening word-recall test did not overtly provide additional practice with the images.

Delayed cued word-recall performance. The average percentages of words recalled in the delayed cued word-recall test are presented in the last two columns of Table 3. In the first of these columns are the results of a "strict" criterion where the word had to be essentially the same as the original word: Only the plurality of the word could be different. We did not score plurality because only bizarre phrases changed a word's plurality. (Contrary to the advice of professional mnemonists, this did not increase the likelihood of remembering the pair; it only increased the likelihood of misremembering the plurality.) The last column represents the results of a "lenient" criterion where words that suggested the original image were also scored as correct. For example, the bizarre phrase for the pair BICYCLE-SKYSCRAPER was "a bright, blue BICYCLE being ridden up the side of a tall concrete SKYSCRAPER." With the strict criterion, BUILDING was not scored as a correct response to BICYCLE, but was scored as correct with the lenient criterion. Neither the strict nor the lenient criterion resulted in a significant difference between word pairs presented with plausible versus bizarre phrases, ts(23) = 1.120 and 0.431.

One possibility is that the bizarre mnemonic is only effective for those with superior memories. To test this possibility, correlations were found between the total number of words remembered and the difference between the words remembered which had been presented with bizarre versus plausible phrases. With both strict and lenient criteria, these correlations were both insignificant and negative; that is, the more a subject remembered, the more likely the subject was to have remembered more plausible than bizarre word pairs, rs = -0.135 and -0.140, ts(22) = 0.64 and 0.67.

Subjects were also asked which they thought they remembered better, words presented within bizarre or within plausible phrases. Only 2 subjects thought they remembered those within plausible phrases better, and only 1 subject thought the two were remembered equally well. All others reported that they thought they had remembered the words within the bizarre phrases better; many thought they had done much better with the words in bizarre phrases. When these subjects were told that they had not remembered the bizarre pairs better and then asked why they had thought that they had, most had no idea, but several (6 of the 24) reported that the bizarre pairs came faster or easier (one said the bizarre image "popped" into her head) than did the plausible pairs.

In an attempt to verify this subjective feeling of the subjects, the last 6 of the subjects in each of the groups were asked to first go through the list as rapidly as possible, writing only those responses which came quickly and easily. Then their pencils were replaced by red pens and they were asked to try to recall as many more responses as possible. Under these conditions and using the strict criterion (again, with no penalty for misremembering the plurality of bizarre pairs), subjects remembered 14.8 plausible and 15.3 bizarre pairs on the first attempt and 4.83 plausible and 3.58 bizarre pairs on the second attempt. Although these scores are in the expected direction, they are neither large nor significant, ts(11) = 0.467 and 1.419.

Delayed cued word-recall performance was also evaluated as a joint function of phrase type (bizarre vs. plausible) and interaction rating. If a subject had at least four high interactive (3) and at least four low interactive (1) rated pairs within a given phrase type, medium interactive (2) rated pairs were not used in the analysis. If a subject had fewer than four of either the high or low interactive pairs, the medium interactive pairs were averaged into the small cell. The resulting delayed cued word-recall performance, using the strict criterion, is presented in Table 4. The difference in the recall of high and low interaction pairs is significant, F(1, 22) = 8.463, p < .01, but not the difference in recall of plausible and bizarre pairs, F(1, 22) = 0.173, nor the statistical interaction of phrase type and interaction rating, F(1, 22) = 0.923. There was also no significant effect of the group (order of immediate tests) nor any significant group interaction effect on the delayed cued-recall performance.

Table 4
Experiment 2: Percentage of Delayed Cued Word Recall as a Function of Phrase Type and Rating for Degree of Interaction

                 Interaction rating
Phrase type        Low      High
Plausible          57.5     71.1
Bizarre            62.1     68.1

Delayed memory for type of image. After the subjects had recalled as many of the words as they could, they were asked to go back through the test sheet and to indicate the type of imagery that they had used to learn a particular pair. They wrote B if they thought they had used bizarre imagery, P if they thought they had used plausible imagery, and left the spot blank if they
could not remember the type of imagery. They were asked to write a B or a P if they remembered the image at all, even if they did not remember what the image was, but not to make totally wild guesses.

The percentage of pairs that were marked with B or P or not marked, averaged over all 24 subjects, is presented in Table 5 as a function of the type of phrase that had been originally presented with the pair and as a function of whether or not the subject had remembered the second word of the pair.

Table 5
Experiment 2: Percentage of Pairs Thought to Have Been Learned With Plausible or With Bizarre Imagery After the Delayed Cued Word-Recall Test

                                              Response
Recall of word   Type of phrase   Plausible   Bizarre   No indication
Correct          Plausible           91.1        8.5         0.4
Correct          Bizarre             11.8       88.0         0.2
Incorrect        Plausible           25.8       27.7        46.5
Incorrect        Bizarre              8.6       54.7        36.6

When the subjects remembered the second word of the pair correctly (the first two rows of Table 5), they also tended to correctly classify the type of imagery suggested by the original phrase, and there did not appear to be any difference in their accuracy as a function of the type of phrase. This was tested by forming difference scores for each subject and comparing the advantage of P responses over B responses to cues learned with plausible phrases with the advantage of B over P responses to cues learned with bizarre phrases. Fourteen subjects had greater difference scores for the plausible than for the bizarre, 8 had greater difference scores for the bizarre, and 2 had equal difference scores. A two-tailed sign test performed on the 22 subjects without tied difference scores gave no evidence of a difference in the accuracy as a function of phrase type, p > .20.

When the subjects did not remember the second word (rows 3 and 4), their classification of the imagery type was much poorer. What makes this more interesting is the fact that the classification of the imagery resulting from the bizarre phrases is much more accurate than is that for the plausible phrases. With the cues learned with the plausible phrases, nearly half of the cues that did not lead to the correct response were not classified at all and the remaining cues that did not lead to the correct response were split nearly equally between the P and B classifications. Ten subjects made more P classification responses, 8 made more B responses, and 6 made an equal number of P and B responses, p > .80.

On the other hand, the cues learned with bizarre phrases were much more likely to be classified correctly (B) and much less likely to be classified incorrectly (P). Twenty-one subjects made more B responses, and only 1 made more P responses, p < .001. Comparing these difference scores, we find that 16 subjects had larger correct difference scores to the cues learned with the bizarre phrases, 3 had larger correct difference scores to those learned with plausible phrases, and 5 had equal difference scores for cues learned with the two types of phrases, p < .05. Thus, there is at least some evidence that subjects remember something about the bizarre images, even when they do not remember enough to help them recall the correct word.

General Discussion

Many psychologists seem to be strongly motivated to prove that bizarre imagery has a positive effect on memory. Recently, it appeared that the difficulty in finding such an effect was that the experiments had all been using cued-recall tests, but that bizarre imagery did not improve cued recall, only free recall (Merry, 1980; Merry & Graham, 1978; Wollen & Cox, 1981a). Wollen and Cox (1981b), however, found a confounding in the construction of the materials that had been used to demonstrate this effect and they showed that, in the absence of this confounding, there was no advantage of bizarre imagery on either cued or free recall at short retention intervals. Our Experiment 1 extends this lack of a benefit from bizarre imagery to free recall over a retention interval of 1 week.

McDaniel and Einstein (1986) have recently found what appears to be a free recall advantage following bizarre imagery in a within-list design. They speculate that our failure to find such an effect might be due to our orienting tasks. That is, they found that subjects recalled more bizarre sentences following a vividness rating task (but not after a bizarreness rating task) and concluded that the bizarreness rating task may lead the subjects into semantic as opposed to imagery processing. When evaluating this speculation, one should first note that although their subjects were instructed to rate the bizarreness of the sentences per se, our subjects were instructed to rate the bizarreness of their images. Perhaps this difference in instructions contributed to the variable results between the experiments. Second, our subjects were asked to indicate, with their vividness rating, whenever they had not formed an image. The number of trials so indicated constituted less than 1.5% of the trials, and the distribution of these no-image trials was nearly uniform over the P-L, P-H, and B-H sentence types. Thus, our subjects apparently did use imaginal processing.

Some have suggested that bizarre imagery has a benefit on cued recall after longer retention intervals (Andreoff & Yarmey, 1976; Marshall, Nau, & Chandler, 1980; O'Brien & Wolford, 1982; Webber & Marshall, 1978). For example, O'Brien and Wolford charted the time course of cued-recall performance and found an advantage of plausible imagery at brief intervals and an advantage of bizarre imagery at longer intervals, with the cross-over occurring between 1 and 3 days. This meant that the cross-over was occurring when there were approximately three plausible and three bizarre pairs (out of a list of 48 pairs) left in memory. Unfortunately, none of these experiments controlled for, or made any attempt to even measure, the degree of interaction, provided by the images, between the items in a pair. This is particularly strange given that the degree of interaction has been clearly demonstrated to have a potent effect on memory (e.g., Wollen, Weber, & Lowry, 1972). The present experiments demonstrated once again that the bizarreness of the image does not improve memory accuracy beyond that obtained from an interactive plausible image. This was true at both short and long
retention intervals, with both free and cued recall, and at several levels of memory performance. And again, the degree of interaction provided by the image proved to be extremely influential on cued recall. Its effect on free recall was much less. If one concedes that there must be some self-cuing within any free-recall task, it is quite conceivable that this small effect of interaction on free-recall performance was actually on the self cued-recall portion of the task.

Although it seems strange that papers are still being published that ignore what appears to be the most important variable (interaction), bizarre imagery must nevertheless have some effect on memory. Why else do professional mnemonists insist on its effectiveness, and why else do experimental psychologists continue to think they are finding something? One answer might be simply that, in the absence of controlled counterbalancing, it is often possible to make images interactive via bizarre imagery more easily than it is via plausible imagery. However, this argument does not explain why subjects take so much longer to form bizarre imagery than they do to form plausible imagery, even when they are instructed to make this plausible imagery interactive (e.g., Bergfeld et al., 1982; Hauck et al., 1976; Nappe & Wollen, 1973; and the present experiments).

Another possible reason that people think that bizarre imagery improves memory is that there are so many reasons why it should. For example, it should make the image more distinctive and, thus, less subject to interference. Also, given that it typically requires more time and effort to produce, the resulting memory trace should be stronger (e.g., Jacoby, Craik, & Begg, 1979; but see also Zacks, Hasher, Sanft, & Rose, 1983). In any case, the subjects in our second experiment, questioned after they had made their responses, almost invariably thought they had remembered more of the bizarre imagery pairs, even when they clearly had not. It does not seem likely that the plausible pairs were being remembered as bizarre, because (a) the immediate cued image-recall test had shown that subjects were very likely to remember the bizarre images as plausible, but very unlikely to remember the plausible images as bizarre and (b) subjects were at least equally likely to correctly classify the type of imagery after correctly recalling the pairs learned with plausible imagery as they were to correctly classify the type of imagery after correctly recalling the pairs learned with bizarre imagery.

Subjects reported that the recall of bizarre pairs seemed easier or faster than the recall of plausible pairs. Our attempt to measure this found that there was some tendency for subjects to remember more bizarre items the first time through the cue list, but this tendency was a relatively small one. It is possible that a few bizarre items are remembered extremely well and that these few extremely strong items stand out when one is making a metamemory assessment. This could easily explain the O'Brien and Wolford (1982) results, which suggested that the last few items in memory tended to be the bizarre pairs. But we did not find this in Experiment 1, where the number of items recalled was as low as those found by O'Brien and Wolford. We also found no tendency for the bizarre items to be recalled first in the free-recall test in Experiment 1.

We are left, then, with an anomaly. People (including mnemonists, experimental psychologists, and experimental subjects) believe that bizarre imagery helps their memories, but, once the degree of interaction is controlled, it does not; or, at very least, its effect on memory is much smaller than is the effect of interaction. It seems to us that the interesting phenomenon which needs to be studied is not the effect of bizarre imagery on memory, but rather its effect on metamemory. Why do we believe that it improves our memory? The only potential clue that we have been able to find is that subjects seem to remember that the image was bizarre, even when they cannot remember enough about this image to allow them to correctly recall the second member of the pair. Thus, subjects may be responding to this feeling-of-knowing the image when they say that they remembered more bizarre pairs. Perhaps this feeling-of-knowing is somehow related to the tip-of-the-tongue phenomena and, if so, perhaps it can be profitably studied with similar techniques (e.g., Brown & McNeil, 1966).

References

Andreoff, G. R., & Yarmey, A. D. (1976). Bizarre imagery and associative learning: A confirmation. Perceptual and Motor Skills, 43, 143-148.
Bergfeld, V. A., Choate, L. S., & Kroll, N. E. A. (1982). The effect of bizarre imagery on memory as a function of delay: Reconfirmation of interaction effect. Journal of Mental Imagery, 6, 141-158.
Brown, R. W., & McNeil, D. (1966). The "tip-of-the-tongue" phenomena. Journal of Verbal Learning and Verbal Behavior, 5, 325-337.
Collyer, S. C., Jonides, J., & Bevan, W. (1972). Images as memory aides: Is bizarreness helpful? American Journal of Psychology, 85, 31-38.
Delin, P. S. (1968). Success in recall as a function of success in implementation of mnemonic instructions. Psychonomic Science, 12, 153-154.
Delin, P. S. (1969). Learning and retention of English words with successive approximations to a complex mnemonic instruction. Psychonomic Science, 17, 87-89.
Furst, B. (1954). Stop forgetting. New York: Garden City Press.
Hauck, P. D., Walsh, C. C., & Kroll, N. E. A. (1976). Visual imagery mnemonics: Common vs. bizarre mental images. Bulletin of the Psychonomic Society, 7, 160-162.
Jacoby, L. L., Craik, F. I. M., & Begg, I. (1979). Effects of decision difficulty on recognition and recall. Journal of Verbal Learning and Verbal Behavior, 18, 585-600.
Lorayne, H., & Lucas, J. (1974). The memory book. New York: Ballantine.
Marshall, P. H., Nau, K., & Chandler, C. K. (1980). A functional analysis of common and bizarre visual mediators. Bulletin of the Psychonomic Society, 15, 375-377.
McDaniel, M. A., & Einstein, G. O. (1986). Bizarre imagery as an effective memory aid: The importance of distinctiveness. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 52-63.
Merry, R. (1980). Image bizarreness in incidental learning. Psychological Reports, 46, 427-430.
Merry, R., & Graham, N. C. (1978). Imagery bizarreness in children's recall of sentences. British Journal of Psychology, 69, 315-321.
Nappe, G. W., & Wollen, K. A. (1973). Effects of instructions to form common and bizarre mental images on retention. Journal of Experimental Psychology, 100, 6-8.
Nelson, T. O., & Vining, S. K. (1978). Effect of semantic versus structural processing on long-term retention. Journal of Experimental Psychology: Human Learning and Memory, 4, 198-209.
O'Brien, E. J., & Wolford, C. R. (1982). Effect of delay in testing on retention of plausible versus bizarre mental images. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 148-152.
Olton, R. M. (1969). The effect of a mnemonic upon the retention of paired-associate verbal material. Journal of Verbal Learning and Verbal Behavior, 8, 43-48.
Webber, S. M., & Marshall, P. H. (1978). Bizarreness effects in imagery as a function of processing level and delay. Journal of Mental Imagery, 2, 291-300.
Wollen, K. A., & Cox, S. D. (1981a). The bizarreness effect in a multitrial intentional learning task. Bulletin of the Psychonomic Society, 18, 296-298.
Wollen, K. A., & Cox, S. D. (1981b). Sentence cuing and the effectiveness of bizarre imagery. Journal of Experimental Psychology: Human Learning and Memory, 7, 386-392.
Wollen, K. A., Weber, A., & Lowry, D. H. (1972). Bizarreness versus interaction of mental images as determinants of learning. Cognitive Psychology, 3, 518-523.
Zacks, R. T., Hasher, L., Sanft, H., & Rose, K. C. (1983). Encoding effort and recall: A cautionary note. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 747-756.

Appendix

To-Be-Remembered Words (in Capitals) and Sentences Used as Stimuli in Experiment 1



Practice
The tall, bearded PROFESSOR wrote the equation on the messy, green CHALKBOARD.

Primacy Items
A CANDLE lies outside of the CHURCH.
A small, yellow CANARY, using its wings like arms, bends open a HAIRPIN.
Her ornate EARRING lies on top of a long, blue ENVELOPE.
The small, black HORSE stretches his neck out of his stall to eat the HAY.

Recency Items
The large, majestic EAGLE swoops down to land on the rotting, wooden FENCE.
The smoking VOLCANO can be seen clearly from the little log CABIN.
The BELL hangs over the SANDAL.
The fancy, oriental KITES are held by strings made of colorful CATERPILLARS.

Critical Items
Each subject saw each of the following 48 pairs with one of the four possible sentences. The sentences for each pair are presented in the order plausible-short, plausible-low, plausible-high, and bizarre-high.

AUTOMOBILE-UMBRELLA
An AUTOMOBILE is next to an UMBRELLA.
An old, green AUTOMOBILE is parked in the sun next to an open UMBRELLA.
An old, green AUTOMOBILE drives over and flattens an open UMBRELLA.
An old, green AUTOMOBILE protects itself in traffic with an open UMBRELLA.

REFRIGERATOR-TABLE
A REFRIGERATOR stands by a TABLE.
A white REFRIGERATOR stands in the corner by a thick, pine TABLE.
A white REFRIGERATOR tips over and breaks a thick, pine TABLE.
A white REFRIGERATOR dances merrily with a happy, pine TABLE.

FOX-BOWL
A FOX sleeps next to a BOWL.
A thin red FOX sleeps lazily next to a rain-filled sugar BOWL.
A thin red FOX licks cautiously at the edges of a rain-filled sugar BOWL.
A thin red FOX skates on the frozen surface of a rain-filled sugar BOWL.

BOULDER-FROG
A BOULDER shades the FROG.
A large, round BOULDER provides nice, cool shade for the old, tired FROG.
A large, round BOULDER rolls relentlessly over the unfortunate, old FROG.
A large, round BOULDER bends and groans under the weight of the old, tired FROG.

MAGAZINE-APPLE
The MAGAZINE covers an APPLE.
The new MAGAZINE partially covers a brown, dried-out APPLE.
The new MAGAZINE contains a picture of a brown, dried-out APPLE.
The new MAGAZINE is being read by a brown, dried-out APPLE.

BOTTLE-PIANO
A BOTTLE stands on a PIANO.
A large, cracked beer BOTTLE stands on the top of a dusty PIANO.
A large, cracked beer BOTTLE is spilling its contents on a dusty PIANO.
A large, cracked beer BOTTLE sings loudly as it plays a dusty PIANO.

OCEAN-TOWER
The OCEAN rages beneath the TOWER.
The stormy OCEAN rages noisily on the rocks beneath the tall, dark TOWER.
The stormy OCEAN crashes over the rocks, demolishing the tall, dark TOWER.
The stormy OCEAN angrily throws fish and rocks at the tall, dark TOWER.

AIRPLANE-TREE
The AIRPLANE flies over the TREE.
The large, silver AIRPLANE flies easily over the branches of the high TREE.
The large, silver AIRPLANE cuts through the top branches of the high TREE.
The large, silver AIRPLANE grows like a fruit from a branch of the high TREE.

SPOON-CUP
The SPOON rests against the CUP.
The monogrammed SPOON rests against the edge of the small, china CUP.
The monogrammed SPOON stirs the contents of the small, china CUP.
The monogrammed SPOON drinks the contents of the small, china CUP.

CAT-FIREPLACE
The CAT walks toward the FIREPLACE.
The long-haired CAT walks slowly toward the front of the brick FIREPLACE.
The long-haired CAT scratches her claws on the front of the brick FIREPLACE.
The long-haired CAT builds herself a fancy house with a brick FIREPLACE.

GIRL-FLAG
The GIRL stands near the FLAG.
The tall, thin GIRL stands near a large American FLAG.
The tall, thin GIRL proudly folds a large American FLAG.
The tall, thin GIRL is suddenly attacked by a large American FLAG.

BUTTERFLY-COIN
A BUTTERFLY flits past the COIN.
A swallow-tailed BUTTERFLY flits hurriedly past the shiny foreign COIN.
A swallow-tailed BUTTERFLY is engraved on the shiny foreign COIN.
A swallow-tailed BUTTERFLY slyly takes off with the shiny foreign COIN.

CHAIR-CRATE
A CRATE is next to a CHAIR.
A sturdy wooden CRATE is next to an old, uncomfortable CHAIR.
A sturdy wooden CRATE is used as an old, uncomfortable CHAIR.
A sturdy wooden CRATE relaxes in an old, uncomfortable CHAIR.

PAINTING-RABBIT
The PAINTING hangs over the RABBIT.
The modern oil PAINTING hangs over the cage of a fat, old RABBIT.
The modern oil PAINTING appears to depict a fat, old RABBIT.
The modern oil PAINTING is being painted by a fat, old RABBIT.

MOP-PAIL
The MOP is by the PAIL.
The long-handled sponge MOP is in the hall closet by the dented metal PAIL.
The long-handled sponge MOP is dipped in and out of the dented metal PAIL.
The long-handled sponge MOP walks slowly by, carrying a dented metal PAIL.

PENCIL-CALCULATOR
The PENCIL is along side of the CALCULATOR.
The yellow PENCIL is placed neatly along side of the tiny CALCULATOR.
The yellow PENCIL is used to push the small keys on the tiny CALCULATOR.
The yellow PENCIL tries to solve the problem with the tiny CALCULATOR.

POLICEMAN-WALL
The POLICEMAN rests against the WALL.
The slender rookie POLICEMAN rests against the tall, stone WALL.
The slender rookie POLICEMAN quickly climbs the tall, stone WALL.
The slender rookie POLICEMAN hungrily eats the tall, stone WALL.

TROUT-CANOE
The TROUT swims under the CANOE.
The brown TROUT swims under the birch-bark CANOE.
The brown TROUT is dropped into the birch-bark CANOE.
The brown TROUT paddles his own birch-bark CANOE.

FLASHLIGHT-VIOLIN
A FLASHLIGHT lies in front of the VIOLIN.
A new FLASHLIGHT lies in the dark closet in front of the old VIOLIN.
A new FLASHLIGHT is used in the dark closet to find the old VIOLIN.
A new FLASHLIGHT is used to pluck the strings of the old VIOLIN.

DOCTOR-TELEPHONE
The DOCTOR ignores the TELEPHONE.
The old, bearded DOCTOR ignores the ringing of the pale green TELEPHONE.
The old, bearded DOCTOR speaks urgently into the pale green TELEPHONE.
The old, bearded DOCTOR operates urgently on the pale green TELEPHONE.

NEWSPAPER-SOLDIER
The NEWSPAPER lies next to the SOLDIER.
The newly-arrived NEWSPAPER lies on the bench next to the anxious SOLDIER.
The newly-arrived NEWSPAPER is opened rapidly by the anxious SOLDIER.
The newly-arrived NEWSPAPER shouts contemptuously at the anxious SOLDIER.

DOG-FRISBEE
A DOG watches the FRISBEE.
A shaggy, brown DOG sits in the shade watching the tossed FRISBEE.
A shaggy, brown DOG jumps high in the air to catch the tossed FRISBEE.
A shaggy, brown DOG balances precariously on top of the tossed FRISBEE.

ANT-COMB
An ANT goes around a COMB.
A large, black ANT slowly turns and detours around a plastic COMB.
A large, black ANT crawls in and out of the teeth of a plastic COMB.
A large, black ANT carefully fixes its hair with a plastic COMB.

CARPENTER-SHED
The CARPENTER goes to the SHED.
The laughing CARPENTER carries his ladder back to the metal SHED.
The laughing CARPENTER climbs up his ladder to the roof of the metal SHED.
The laughing CARPENTER flies on his ladder to the roof of the metal SHED.

BALLOON-WATERFALL
A BALLOON sails above a WATERFALL.
A colorful hot-air BALLOON sails quietly, high above a tall, steep WATERFALL.
A colorful hot-air BALLOON sails through the spray of a tall, steep WATERFALL.
A colorful hot-air BALLOON provides the source of a tall, steep WATERFALL.

BASEBALL-WINDOW
A BASEBALL rolls beneath the WINDOW.
A tattered, stained BASEBALL rolls to a stop beneath the stained-glass WINDOW.
A tattered, stained BASEBALL crashes noisily through the stained-glass WINDOW.
A tattered, stained BASEBALL turns in flight, missing the stained-glass WINDOW.

TULIP-SHAMPOO
A TULIP lies next to the SHAMPOO.
A bright yellow TULIP lies on the shelf next to the bottle of creamy SHAMPOO.
A bright yellow TULIP is shown on the label of the bottle of creamy SHAMPOO.
A bright yellow TULIP washes its petals with a full bottle of creamy SHAMPOO.

COFFEE-SPONGE
The COFFEE is by the SPONGE.
The steaming-hot, spilled COFFEE forms a puddle near the dry, pink SPONGE.
The steaming-hot, spilled COFFEE soaks rapidly into the dry, pink SPONGE.
The steaming-hot, spilled COFFEE completely dissolves the dry, pink SPONGE.

FEET-GRASS
Her FEET are on the GRASS.
Her tired, swollen FEET relax as they stand on the cool, wet GRASS.
Her tired, swollen FEET leave their impressions on the cool, wet GRASS.
Her tired, swollen FEET float several inches above the cool, wet GRASS.

KEY-BOX
The KEY lies in the BOX.
The small, silver KEY lies in the center of the elaborately carved BOX.
The small, silver KEY sticks in the lock of the elaborately carved BOX.
The small, silver KEY fights to escape out of the elaborately carved BOX.

TRAIN-STRAWBERRIES
The TRAIN goes by the STRAWBERRIES.
The sleek new TRAIN speeds by a field filled with fresh, juicy STRAWBERRIES.
The sleek new TRAIN crushes all of the spilled fresh, juicy STRAWBERRIES.
The sleek new TRAIN is tragically derailed by the fresh, juicy STRAWBERRIES.

BALLERINA-CAKE
The BALLERINA buys a CAKE.
The beautiful, young BALLERINA buys herself a five-layer birthday CAKE.
A tacky, plastic BALLERINA is placed on top of a five-layer birthday CAKE.
The beautiful, young BALLERINA wears, as a hat, a five-layer birthday CAKE.

ROSE-TUXEDO
A ROSE lies next to the TUXEDO.
A tiny red ROSE lies on the dresser next to the elegant white TUXEDO.
A tiny red ROSE is placed in the lapel of the elegant white TUXEDO.
A tiny red ROSE looks sharp wearing its elegant white TUXEDO.

BICYCLE-SKYSCRAPER
A BICYCLE is next to the SKYSCRAPER.
A bright blue BICYCLE is ridden swiftly past the tall, concrete SKYSCRAPER.
A bright blue BICYCLE crashes noisily into the tall, concrete SKYSCRAPER.
A bright blue BICYCLE is ridden up the side of the tall, concrete SKYSCRAPER.

PLUMBER-FEATHER
The PLUMBER picks up a FEATHER.
The short, fat PLUMBER sings to himself as he walks by a fallen FEATHER.
The short, fat PLUMBER fixes the stopped drain by pulling out a FEATHER.
The short, fat PLUMBER fixes the stopped drain by tickling it with a FEATHER.

COCKROACH-STOVE
The COCKROACH runs toward the STOVE.
The large, reddish COCKROACH scurries across the floor toward the dirty STOVE.
The large, reddish COCKROACH, waving its feelers, climbs up the dirty STOVE.
The large, reddish COCKROACH, waving its feelers, carries off the dirty STOVE.

PINEAPPLE-KNIFE
The PINEAPPLE is near the KNIFE.
The large, ripe PINEAPPLE is on the counter near a sharp, stainless-steel KNIFE.
The large, ripe PINEAPPLE is being sliced with a sharp, stainless-steel KNIFE.
The large, ripe PINEAPPLE stabs itself with a sharp, stainless-steel KNIFE.

JACKET-BED
A JACKET is tossed on the BED.
A black, leather JACKET is casually tossed towards the king-sized BED.
A black, leather JACKET is mixed among the covers of the king-sized BED.
A black, leather JACKET crawls out from among the covers of the king-sized BED.

TUBA-ELEPHANT
A TUBA is played behind the ELEPHANT.
A shiny brass TUBA is played by the man just behind the large, grey ELEPHANT.
A shiny brass TUBA is played into the ear of the sleeping, large, grey ELEPHANT.
A shiny brass TUBA is played very loudly by the weaving, large, grey ELEPHANT.

EGGS-NAPKIN
The EGGS are next to the NAPKIN.
The soft, sloppy EGGS are placed next to the red and white checked NAPKIN.
The soft, sloppy EGGS are wiped up with the red and white checked NAPKIN.
The soft, sloppy EGGS complain about the cheap red and white checked NAPKIN.

SHIRTS-COCONUTS
The SHIRTS are sold near the COCONUTS.
The short-sleeved SHIRTS are sold under the decorative, large, shaggy COCONUTS.
The short-sleeved SHIRTS are patterned with decorative, large, shaggy COCONUTS.
The short-sleeved SHIRTS are worn by a group of angry, large, shaggy COCONUTS.

MICROPHONE-DRUM
The MICROPHONE is near the DRUM.
The singer's tall, thin MICROPHONE is on the stand near the big, bass DRUM.
The singer's tall, thin MICROPHONE falls over, puncturing the big, bass DRUM.
The singer's tall, thin MICROPHONE flails wildly, playing the big, bass DRUM.

STAR-OWL
The STAR shines on the OWL.
The bright, distant STAR provides light needed by the hunting horned OWL.
The bright, distant STAR is the "beak" in the constellation of the horned OWL.
The bright, distant STAR provides light needed by the hunting horned OWL.

QUEEN-LOCKET
The QUEEN wears a LOCKET.
The tall, richly-dressed QUEEN happily wears the dainty, golden LOCKET.
The tall, richly-dressed QUEEN angrily rips off the dainty, golden LOCKET.
The tall, richly-dressed QUEEN cries a tear that is a dainty, golden LOCKET.

SURFBOARD-DOOR
The SURFBOARD is near the DOOR.
The new, red SURFBOARD leans on the house next to the solid oak DOOR.
The new, red SURFBOARD is wedged under the knob of the solid oak DOOR.
The new, red SURFBOARD has, at its very center, a solid oak DOOR.

COW-BILLBOARD
The COW eats by the BILLBOARD.
The fat, sleepy COW slowly chews its cud in front of the large BILLBOARD.
The fat, sleepy COW is shown in the advertisement on the large BILLBOARD.
The fat, sleepy COW slowly climbs to the very top of the large BILLBOARD.

COOKIE-BROOM
The COOKIE is by the BROOM.
The broken peanut-butter COOKIE is lying to the left of the yellow, straw BROOM.
The broken peanut-butter COOKIE is being swept up with the yellow, straw BROOM.
The broken peanut-butter COOKIE complains angrily to the yellow, straw BROOM.

WAITER-NECKTIE
The WAITER wears a NECKTIE.
The lazy, arrogant WAITER wears a black suit and a narrow, white NECKTIE.
The lazy, arrogant WAITER admires himself as he ties a narrow, white NECKTIE.
The lazy, arrogant WAITER serves a plate containing a narrow, white NECKTIE.

Received January 17, 1985
Revision received June 20, 1985
