consequence that reliably follows its occurrence. When a behavior is strengthened, it is more
likely to occur again in the future.
Thorndike placed a hungry cat in a cage and put food outside of the cage where the cat could
see it. Thorndike rigged the cage so that a door would open if the cat hit a lever with its paw.
The cat was clawing and biting the bars of the cage, reaching its paws through the openings
between the bars, and trying to squeeze through the opening. Eventually, the cat accidentally hit
the lever, the door opened, and the cat got out of the cage and ate the food. Each time
Thorndike put the hungry cat inside the cage it took less time for the cat to hit the lever that
opened the door. Eventually, the cat hit the lever with its paw as soon as Thorndike put it in the
cage (Thorndike, 1911). Thorndike called this phenomenon the law of effect.
In this example, when the hungry cat was put back in the cage, the cat was more likely to hit the
lever because this behavior had resulted in an immediate consequence: escaping the cage and
getting food. Getting to the food was the consequence that reinforced (strengthened) the cat's
behavior of hitting the lever with a paw.
3. that results in the strengthening of the behavior. (The person is more likely to engage in the
behavior again in the future.)
We can determine that a behavior is strengthened when there is an increase in its frequency,
duration, intensity, or speed (decreased latency). A behavior that is strengthened through the
process of reinforcement is called an operant behavior. An operant behavior acts on the
environment to produce a consequence and, in turn, is controlled by, or occurs again in the
future as a result of, its immediate consequence. The consequence that strengthens an operant
behavior is called a reinforcer.
In the first example in Table 4-1, the child cried at night when her parents put her to bed. The
child's crying was an operant behavior. The reinforcer for her crying was the parents' attention.
Because crying at night resulted in this immediate consequence (reinforcer), the child's crying
was strengthened: She was more likely to cry at night in the future.
There are two types of reinforcement: positive reinforcement and negative reinforcement. It is
extremely important to remember that both positive reinforcement and negative reinforcement
are processes that strengthen a behavior, that is, they both increase the probability that the
behavior will occur in the future. Positive and negative reinforcement are distinguished only by
the nature of the consequence that follows the behavior.
2. is followed by the removal of a stimulus (an aversive stimulus) or a decrease in the intensity
of a stimulus,
A stimulus is an object or event that can be detected by one of the senses, and thus has the
potential to influence the person (stimuli is the plural form of the word stimulus). The object or
event may be a feature of the physical environment or the social environment (the behavior of
the person or of others).
In positive reinforcement, the stimulus that is presented or that appears after the behavior is
called a positive reinforcer.
Consider Example 8 in Table 4-1. The mother's behavior of buying her child candy results in
termination of the child's tantrum (an aversive stimulus is removed). As a result, the mother is
more likely to buy her child candy when he tantrums in a store. This is an example of negative
reinforcement. On the other hand, when the child tantrums, he gets candy (a positive reinforcer
is presented).
As a result, he is more likely to tantrum in the store. This is an example of positive
reinforcement.
As you have learned, reinforcement can involve the addition of a reinforcer (positive
reinforcement) or the removal of an aversive stimulus (negative reinforcement) following the
behavior. In both cases, the behavior is strengthened. For both positive and negative
reinforcement, the behavior may produce a consequence through the actions of another person
or through direct contact with the physical environment. When a behavior produces a reinforcing
consequence through the actions of another person, the process is social reinforcement. An
example of social positive reinforcement might involve asking your roommate to bring you the
bag of chips. An example of social negative reinforcement might involve asking your roommate
to turn down the TV when it is too loud. In both cases, the consequence of the behavior was
produced through the actions of another person. When the behavior produces a reinforcing
consequence through direct contact with the physical environment, the process is automatic
reinforcement. An example of automatic positive reinforcement would be if you went to the
kitchen and got the chips for yourself. An example of automatic negative reinforcement would
be if you got the remote and turned down the volume on the TV yourself. In both cases, the
reinforcing consequence was not produced by another person.
The distinction between escape and avoidance is shown in the following situation. A laboratory
rat is placed in an experimental chamber that has two sides separated by a barrier. The rat can
jump over the barrier to get from one side to the other. On the floor of the chamber is an electric
grid that can be used to deliver a shock to one side or the other. Whenever the shock is
presented on the right side of the chamber, the rat jumps to the left side, thus escaping from the
shock. Jumping to the left side of the chamber is escape behavior because the rat escapes from
an aversive stimulus (the shock). When the shock is applied to the left side, the rat jumps to the
right side. The rat learns this escape behavior rather quickly and jumps to the other side of the
chamber as soon as the shock is applied.
In the avoidance situation, a tone is presented just before the shock is applied. (Rats have
better hearing than vision.)
Reinforcement is a natural process that affects the behavior of humans and other animals.
Through the process of evolution, we have inherited certain biological characteristics that
contribute to our survival. One characteristic we have inherited is the ability to learn new
behaviors through reinforcement. In particular, certain stimuli are naturally reinforcing because
the ability of our behaviors to be reinforced by these stimuli has survival value (Cooper, Heron,
& Heward, 1987; 2007). For example, food, water, and sexual stimulation are natural positive
reinforcers because they contribute to survival of the individual and the species. Escape from
painful stimulation or extreme levels of stimulation (cold, heat, or other discomforting or aversive
stimulation) is naturally negatively reinforcing because escape from or avoidance of these
stimuli also contributes to survival. These natural reinforcers are called unconditioned
reinforcers because they function as reinforcers the first time they are presented to most
human beings; no prior experience with these stimuli is needed for them to function as
reinforcers. Unconditioned reinforcers sometimes are called primary reinforcers. These stimuli
are unconditioned reinforcers because they have biological importance (Cooper et al., 1987;
2007).
Another class of reinforcers is the conditioned reinforcers. A conditioned reinforcer (also called a
secondary reinforcer) is a stimulus that was once neutral (a neutral stimulus does not currently
function as a reinforcer; i.e., it does not influence the behavior that it follows) but became
established as a reinforcer by being paired with an unconditioned reinforcer or an already
established conditioned reinforcer. For example, a parent's attention is a conditioned reinforcer
for most children because attention is paired with the delivery of food, warmth, and other
reinforcers many times in the course of a young child's life. Money is perhaps the most
common conditioned reinforcer. Money is a conditioned reinforcer because it can buy (is paired
with) a wide variety of unconditioned and conditioned reinforcers throughout a person's life. If
you could no longer use money to buy anything, it would no longer be a conditioned reinforcer.
People would not work or engage in any behavior to get money if it could not be used to obtain
other reinforcers. This illustrates one important point about conditioned reinforcers: They
continue to be reinforcers only if they are at least occasionally paired with other reinforcers.
Nearly any stimulus may become a conditioned reinforcer if it is paired with an existing
reinforcer. For example, when trainers teach dolphins to perform tricks at aquatic parks, they
use a handheld clicker to reinforce the dolphin's behavior. Early in the training process, the
trainer delivers a fish as a reinforcer and pairs the sound of the clicker with the delivery of the
fish to eat. Eventually, the clicking sound itself becomes a conditioned reinforcer. After that,
the trainer occasionally pairs the sound with the unconditioned reinforcer (the fish) so that the
clicking sound continues to be a conditioned reinforcer (Pryor, 1985). A neutral stimulus such as
a plastic poker chip or a small square piece of colored cardboard can be used as a conditioned
reinforcer (or token) to modify human behavior in a token reinforcement program. In a token
reinforcement program, the token is presented to the person after a desirable behavior, and
later the person exchanges the token for other reinforcers (called backup reinforcers). Because
the tokens are paired with (exchanged for) the backup reinforcers, the tokens themselves
become reinforcers for the desirable behavior.
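The token-exchange logic described above can be sketched in a few lines of code. The class and method names below are invented for illustration (they do not come from the text); the sketch simply shows a token delivered after each desirable behavior and later exchanged for backup reinforcers.

```python
class TokenProgram:
    """Illustrative sketch of a token reinforcement program: a token (the
    conditioned reinforcer) is presented after each desirable behavior, and
    tokens are later exchanged for backup reinforcers."""

    def __init__(self, tokens_per_backup):
        self.tokens = 0
        self.tokens_per_backup = tokens_per_backup  # exchange rate

    def desirable_behavior(self):
        # The token is delivered immediately after the behavior.
        self.tokens += 1

    def exchange(self):
        # Pairing step: tokens are traded for backup reinforcers;
        # leftover tokens are kept for the next exchange.
        backups, self.tokens = divmod(self.tokens, self.tokens_per_backup)
        return backups

# Seven desirable behaviors, with three tokens needed per backup reinforcer.
program = TokenProgram(tokens_per_backup=3)
for _ in range(7):
    program.desirable_behavior()
backups_received = program.exchange()
```

Because the tokens are repeatedly paired with (exchanged for) the backup reinforcers, the tokens themselves come to function as reinforcers, just as the text describes.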
When a conditioned reinforcer is paired with a wide variety of other reinforcers, it is called a
generalized conditioned reinforcer. Money is a generalized conditioned reinforcer because it
is paired with (exchanged for) an almost unlimited variety of reinforcers. As a result, money is
a powerful reinforcer that is less likely to diminish in value (to become satiated) when it is
accumulated. That is, satiation (losing value as a reinforcer) is less likely to occur for
generalized reinforcers such as money.
Contingency
If a response is consistently followed by an immediate consequence, that consequence is
more likely to reinforce the response. When the response produces the consequence and the
consequence does not occur unless the response occurs first, we say that a contingency exists
between the response and the consequence. When a contingency exists, the consequence is
more likely to reinforce the response. Consider the example of turning the key in your car's
ignition to start the car. Every time you turn the key, the car starts. The behavior of turning the
key is reinforced by the engine starting. If the engine started only sometimes when you turned
the key, and if it started sometimes when you did not turn the key, the behavior of turning the
key in this particular car would not be strengthened very much. A person is more likely to
repeat a behavior when it results in a consistent reinforcing consequence. That is, a behavior is
strengthened when a reinforcer is contingent on the behavior (when the reinforcer occurs only if
the behavior occurs).
Motivating Operations
Some events can make a particular consequence more or less reinforcing at some times than at
other times. These antecedent events, called motivating operations (MOs), alter the value of a
reinforcer. There are two types of MOs: establishing operations and abolishing operations. An
establishing operation (EO) makes a reinforcer more potent (it establishes the effectiveness of
a reinforcer). An abolishing operation (AO) makes a reinforcer less potent (it abolishes or
decreases the effectiveness of a reinforcer). Motivating operations have two effects: a) they
alter the value of a reinforcer and b) they make the behavior that produces that reinforcer more
or less likely to occur at that time. An EO makes a reinforcer more potent and makes a behavior
that produces the reinforcer more likely. An AO makes a reinforcer less potent and makes a
behavior that produces that reinforcer less likely.
Let's consider some examples of establishing operations. Food is a more powerful reinforcer
for a person who hasn't eaten recently. Not having eaten in a while is an EO that makes food
more reinforcing at that time and makes the behavior of getting food more likely to occur.
Likewise, water is a more potent reinforcer for someone who has not had a drink all day or who
just ran 6 miles. Water or other beverages are more reinforcing when a person just ate a large
amount of salty popcorn than when a person did not. (That is why some bars give you free salty
popcorn.) In these examples, going without food or water (deprivation), running 6 miles, and
eating salty popcorn are events called establishing operations because they increase the
effectiveness of a reinforcer at a particular time or in a particular situation and make the
behavior that results in that reinforcer more likely to occur.
Deprivation is a type of establishing operation that increases the effectiveness of most
unconditioned reinforcers and some conditioned reinforcers. A particular reinforcer (such as
food or water) is more powerful if a person has gone without it for some time. For example,
attention may be a more powerful reinforcer for a child who has gone without attention for a
period of time. Similarly, although money is almost always a reinforcer, it may be a more
powerful reinforcer for someone who has gone without money (or enough money) for a period of
time. In addition, any circumstances in which a person needs more money (e.g., unexpected
doctor bills) make money a stronger reinforcer.
Individual Differences
Magnitude
The other characteristic of a stimulus that is related to its power as a reinforcer is its amount or
magnitude. Given the appropriate establishing operation, generally, the effectiveness of a
stimulus as a reinforcer is greater if the amount or magnitude of a stimulus is greater. This is
true for both positive and negative reinforcement. A larger positive reinforcer strengthens the
behavior that produces it to a greater extent than a smaller amount or magnitude of the same
reinforcer does. For example, a person would work longer and harder for a large amount of
money than for a small amount. Likewise, the termination of a more intense aversive stimulus
strengthens the behavior that terminates it more than the termination of a lower magnitude or
intensity of the same stimulus would. For example, a person would work harder or engage in
more behavior to decrease or eliminate an extremely painful stimulus than a mildly painful
stimulus. You would work a lot harder to escape from a burning building than you would to get
out of the hot sun.
SCHEDULES OF REINFORCEMENT
The schedule of reinforcement for a particular behavior specifies whether every response is
followed by a reinforcer or whether only some responses are followed by a reinforcer. A
continuous reinforcement schedule (CRF schedule) is one in which each occurrence of a
response is reinforced. In an intermittent reinforcement schedule, by contrast, each
occurrence of the response is not reinforced. Rather, responses are occasionally or
intermittently reinforced.
A CRF schedule is used when a person is learning a behavior or engaging in the behavior for
the first time. This is called acquisition: The person is acquiring a new behavior. Once the
person has acquired or learned the behavior, an intermittent reinforcement schedule is used so
that the person continues to engage in the behavior. This is called maintenance: The behavior
is maintained over time with the use of intermittent reinforcement. A supervisor could not stand
by Maria and praise her for every correct behavior every day that she works. Not only is this
impossible, but it is also unnecessary. Intermittent reinforcement is more effective than a CRF
schedule for maintaining a behavior.
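One way to see the contrast between continuous and intermittent reinforcement is to treat both as "reinforce every nth response," where CRF is the special case n = 1. The short sketch below is illustrative only (the function name is invented for this example):

```python
def reinforced(schedule_every, num_responses):
    """List which responses (numbered from 1) produce the reinforcer when
    every `schedule_every`-th response is reinforced. A CRF schedule is the
    special case schedule_every == 1; larger values are intermittent."""
    return [r for r in range(1, num_responses + 1) if r % schedule_every == 0]

crf = reinforced(1, 6)            # acquisition: every response is reinforced
intermittent = reinforced(3, 6)   # maintenance: only some responses are reinforced
```

Under CRF all six responses are reinforced; under the intermittent schedule only the third and sixth are, yet (as the text notes) this thinner schedule is the more effective one for maintaining a behavior over time.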
Fixed Ratio
In fixed ratio and variable ratio schedules of reinforcement, the delivery of the reinforcer is
based on the number of responses that occur. In a fixed ratio (FR) schedule, a specific or fixed
number of responses must occur before the reinforcer is delivered. That is, a reinforcer is
delivered after a certain number of responses. For example, in a fixed ratio 5 (FR 5) schedule,
the reinforcer follows every fifth response. In an FR schedule, the number of responses needed
before the reinforcer is delivered does not change. Ferster and Skinner (1957) found that
pigeons would engage in high rates of responding on FR schedules; however, there was often a
brief pause in responding after the delivery of the reinforcer. Ferster and Skinner investigated
FR schedules ranging from FR 2 to FR 400, in which 400 responses had to occur before the
reinforcer was delivered. Typically, the rate of responding is greater when more responses are
needed for reinforcement in an FR schedule.
FR schedules of reinforcement sometimes are used in academic or work settings to maintain
appropriate behavior. Consider the example of Paul, a 26-year-old adult with severe intellectual
disability who works in a factory packaging parts for shipment. As the parts come by on a
conveyor belt, Paul picks them up and puts them into boxes. Paul's supervisor delivers a token
(conditioned reinforcer) after every 20 parts that Paul packages. This is an example of an FR
20. At lunch and after work, Paul exchanges his tokens for backup reinforcers (e.g., snacks or
soft drinks). An FR schedule could be used in a school setting by giving students reinforcers
(such as stars, stickers, or good marks) for correctly completing a fixed number of problems or
other academic tasks. Piece-rate pay in a factory, in which workers get paid a specified amount
of money for a fixed number of responses (e.g., $5 for every 12 parts assembled), is also an
example of an FR schedule.
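The FR rule is mechanical enough to express in a few lines of code. The sketch below is illustrative only (the function names are invented for this example): it counts responses and delivers the reinforcer after every nth response, exactly as in an FR n schedule.

```python
def fixed_ratio(n):
    """Return a decision function for a fixed ratio (FR n) schedule:
    every nth response is reinforced, and the count then resets."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:       # the fixed number of responses has occurred
            count = 0
            return True      # reinforcer delivered
        return False         # no reinforcer for this response

    return respond

# FR 5: the reinforcer follows every fifth response.
fr5 = fixed_ratio(5)
outcomes = [fr5() for _ in range(10)]
```

Here only the fifth and tenth responses produce the reinforcer; the required count never changes, which is what makes the ratio "fixed." An FR 20 schedule, like the one used with Paul's tokens, works the same way with n = 20.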
Variable Ratio
Fixed Interval
With interval schedules (fixed interval, variable interval), a response is reinforced only if it occurs
after an interval of time has passed. It does not matter how many responses occur; as soon as
the specified interval of time has elapsed, the first response that occurs is reinforced. In a fixed
interval (FI) schedule, the interval of time is fixed, or stays the same each time. For example, in
a fixed interval 20-second (FI 20-second) schedule of reinforcement, the first response that
occurs after 20 seconds has elapsed results in the reinforcer. Responses that occur before the
20 seconds are not reinforced; they have no effect on the subsequent delivery of the reinforcer
(i.e., they don't make it come any sooner). Once the 20 seconds has elapsed, the reinforcer is
available, and the first response that occurs is reinforced. Then, 20 seconds later, the
reinforcer is available again, and the first response that occurs produces the reinforcer.
Variable Interval