Skinner - Operant Conditioning
by Saul McLeod, published 2007, updated 2015
By the 1920s, John B. Watson had left academic psychology and other behaviorists were
becoming influential, proposing new forms of learning other than classical conditioning.
Perhaps the most important of these was Burrhus Frederic Skinner, though for obvious
reasons he is more commonly known as B.F. Skinner.
Skinner's views were slightly less extreme than those of Watson (1913). Skinner believed that
we do have such a thing as a mind, but that it is simply more productive to study observable
behavior rather than internal mental events.
The work of Skinner was rooted in a view that classical conditioning was far too simplistic to be
a complete explanation of complex human behavior. He believed that the best way to
understand behavior is to look at the causes of an action and its consequences. He called this
approach operant conditioning.

Operant Conditioning deals with operants - intentional actions that have an effect on the
surrounding environment. Skinner set out to identify the processes which made certain operant
behaviours more or less likely to occur.

Skinner's theory of operant conditioning was based on the work of Thorndike (1905). Edward
Thorndike studied learning in animals using a puzzle box to propose the theory known as the
'Law of Effect'.

BF Skinner: Operant Conditioning


Skinner is regarded as the father of Operant Conditioning, but his work was based on
Thorndike’s law of effect. Skinner introduced a new term into the Law of Effect -
Reinforcement. Behavior which is reinforced tends to be repeated (i.e. strengthened); behavior
which is not reinforced tends to die out or be extinguished (i.e. weakened).

Skinner (1948) studied operant conditioning by conducting experiments using animals which
he placed in a 'Skinner Box' which was similar to Thorndike’s puzzle box.

B.F. Skinner (1938) coined the term operant conditioning; it means roughly the changing of
behavior by the use of reinforcement given after the desired response. Skinner
identified three types of responses, or operants, that can follow behavior.

• Neutral operants: responses from the environment that neither increase nor decrease
the probability of a behavior being repeated.

• Reinforcers: Responses from the environment that increase the probability of a behavior
being repeated. Reinforcers can be either positive or negative.

• Punishers: Responses from the environment that decrease the likelihood of a behavior
being repeated. Punishment weakens behavior.

We can all think of examples of how our own behavior has been affected by reinforcers and
punishers. As a child you probably tried out a number of behaviors and learned from their
consequences.

For example, if when you were younger you tried smoking at school, and the chief
consequence was that you got in with the crowd you always wanted to hang out with, you
would have been positively reinforced (i.e. rewarded) and would be likely to repeat the
behavior. If, however, the main consequence was that you were caught, caned, suspended
from school and your parents became involved you would most certainly have been punished,
and you would consequently be much less likely to smoke now.

Positive Reinforcement

Skinner showed how positive reinforcement worked by placing a hungry rat in his Skinner box.
The box contained a lever on the side and as the rat moved about the box it would accidentally
knock the lever. Immediately it did so a food pellet would drop into a container next to the
lever. The rats quickly learned to go straight to the lever after a few times of being put in the
box. The consequence of receiving food if they pressed the lever ensured that they would
repeat the action again and again.

Positive reinforcement strengthens a behavior by providing a consequence an individual finds


rewarding. For example, if your teacher gives you £5 each time you complete your homework
(i.e. a reward) you will be more likely to repeat this behavior in the future, thus strengthening
the behavior of completing your homework.

Negative Reinforcement
The removal of an unpleasant reinforcer can also strengthen behavior. This is known as
negative reinforcement because it is the removal of an adverse stimulus which is ‘rewarding’ to
the animal or person. Negative reinforcement strengthens behavior because it stops or
removes an unpleasant experience.

For example, if you do not complete your homework, you give your teacher £5. You will
complete your homework to avoid paying £5, thus strengthening the behavior of completing
your homework.

Skinner showed how negative reinforcement worked by placing a rat in his Skinner box and
then subjecting it to an unpleasant electric current which caused it some discomfort. As the rat
moved about the box it would accidentally knock the lever. Immediately it did so the electric
current would be switched off. The rats quickly learned to go straight to the lever after a few
times of being put in the box. The consequence of escaping the electric current ensured that
they would repeat the action again and again.

In fact Skinner even taught the rats to avoid the electric current by turning on a light just before
the electric current came on. The rats soon learned to press the lever when the light came on
because they knew that this would stop the electric current being switched on.

These two learned responses are known as Escape Learning and Avoidance Learning.

Punishment (weakens behavior)


Punishment is defined as the opposite of reinforcement since it is designed to weaken or
eliminate a response rather than increase it. It is an aversive event that decreases the
behavior that it follows.
Like reinforcement, punishment can work either by directly applying an unpleasant stimulus
like a shock after a response or by removing a potentially rewarding stimulus, for instance,
deducting someone’s pocket money to punish undesirable behavior.

Note: It is not always easy to distinguish between punishment and negative reinforcement.

There are many problems with using punishment, such as:

• Punished behavior is not forgotten, it's suppressed - behavior returns when punishment is
no longer present.

• Causes increased aggression - shows that aggression is a way to cope with problems.

• Creates fear that can generalize to undesirable behaviors, e.g., fear of school.

• Does not necessarily guide toward desired behavior - reinforcement tells you what to do,
punishment only tells you what not to do.

Schedules of Reinforcement
Imagine a rat in a “Skinner box”. In operant conditioning, if no food pellet is delivered
immediately after the lever is pressed, then after several attempts the rat stops pressing the
lever (how long would someone continue to go to work if their employer stopped paying
them?). The behavior has been extinguished.

Behaviorists discovered that different patterns (or schedules) of reinforcement had different
effects on the speed of learning and on extinction. Ferster and Skinner (1957) devised different
ways of delivering reinforcement, and found that this had effects on
1. The Response Rate - The rate at which the rat pressed the lever (i.e. how hard the rat
worked).

2. The Extinction Rate - The rate at which lever pressing dies out (i.e. how soon the rat
gave up).

Skinner found that the type of reinforcement which produces the slowest rate of extinction (i.e.
people will go on repeating the behavior for the longest time without reinforcement) is variable-
ratio reinforcement. The type of reinforcement which has the quickest rate of extinction is
continuous reinforcement.

(A) Continuous Reinforcement


An animal/human is positively reinforced every time a specific behaviour occurs, e.g. every
time a lever is pressed a pellet is delivered and then food delivery is shut off.

• Response rate is SLOW

• Extinction rate is FAST

(B) Fixed Ratio Reinforcement


Behavior is reinforced only after the behavior occurs a specified number of times, i.e. one
reinforcement is given after every so many correct responses, e.g. after every 5th response.
For example, a child receives a star for every five words spelt correctly.

• Response rate is FAST

• Extinction rate is MEDIUM


(C) Fixed Interval Reinforcement
One reinforcement is given after a fixed time interval, providing at least one correct response
has been made. An example is being paid by the hour. Another example: every 15
minutes (or half hour, hour, etc.) a pellet is delivered, providing at least one lever press has
been made; then food delivery is shut off.

• Response rate is MEDIUM

• Extinction rate is MEDIUM

(D) Variable Ratio Reinforcement


Behavior is reinforced after an unpredictable number of times. Examples include gambling and
fishing.

• Response rate is FAST

• Extinction rate is SLOW (very hard to extinguish because of the unpredictability)

(E) Variable Interval Reinforcement


Providing one correct response has been made, reinforcement is given after an unpredictable
amount of time has passed, e.g. on average every 5 minutes. An example is a self-employed
person being paid at unpredictable times.

• Response rate is FAST

• Extinction rate is SLOW
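The five schedules above can be viewed as rules deciding whether a given response earns a reinforcer. The following is a minimal illustrative sketch, not part of the original article; the specific numbers (every 5th press, 15 seconds) are arbitrary examples chosen to match the text:

```python
# Illustrative sketch: each schedule is a rule deciding whether a given
# lever press earns a pellet. The numbers here are arbitrary examples.
import random

random.seed(0)  # make the variable schedules reproducible

def reinforced(schedule, press_number, seconds_since_last_pellet):
    if schedule == "continuous":          # (A) every response is reinforced
        return True
    if schedule == "fixed_ratio":         # (B) every 5th response
        return press_number % 5 == 0
    if schedule == "fixed_interval":      # (C) first response after 15 s
        return seconds_since_last_pellet >= 15
    if schedule == "variable_ratio":      # (D) unpredictable, ~every 5th
        return random.random() < 1 / 5
    if schedule == "variable_interval":   # (E) unpredictable, ~every 15 s
        return random.random() < seconds_since_last_pellet / 75
    raise ValueError(schedule)

# 20 presses, pretending 3 seconds elapse before each press
for s in ("continuous", "fixed_ratio", "variable_ratio"):
    pellets = sum(reinforced(s, i, 3) for i in range(1, 21))
    print(s, pellets)  # continuous earns 20 pellets, fixed_ratio 4
```

Note how the variable schedules give the animal no way to predict the next pellet, which is exactly why behavior learned under them is so resistant to extinction.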

Behavior Shaping
A further important contribution made by Skinner (1951) is the notion of behaviour shaping
through successive approximation. Skinner argued that the principles of operant conditioning
can be used to produce extremely complex behaviour if rewards and punishments are
delivered in such a way as to move an organism closer and closer to the desired
behaviour each time.

In order to do this, the conditions (or contingencies) required to receive the reward should shift
each time the organism moves a step closer to the desired behaviour.

According to Skinner, most animal and human behaviour (including language) can be
explained as a product of this type of successive approximation.

Behavior Modification
Behavior modification is a set of therapies / techniques based on operant conditioning
(Skinner, 1938, 1953). The main principle comprises changing environmental events that are
related to a person's behavior. For example, the reinforcement of desired behaviors and
ignoring or punishing undesired ones.

This is not as simple as it sounds — always reinforcing desired behavior, for example, is
basically bribery.

There are different types of positive reinforcement. Primary reinforcement is when a reward
strengthens a behavior by itself. Secondary reinforcement is when something strengthens a
behavior because it leads to a primary reinforcer.

Examples of behavior modification therapy include token economy and behavior shaping.

Token Economy
Token economy is a system in which targeted behaviors are reinforced with tokens (secondary
reinforcers) and later exchanged for rewards (primary reinforcers).

Tokens can be in the form of fake money, buttons, poker chips, stickers, etc. While the
rewards can range anywhere from snacks to privileges or activities.

Token economy has been found to be very effective in managing psychiatric patients.
However, the patients can become over-reliant on the tokens, making it difficult for them to
adjust to society once they leave the prison, hospital, etc.

Teachers also use token economy at primary school by giving young children stickers to
reward good behavior.

Operant Conditioning in the Classroom


In the conventional learning situation operant conditioning applies largely to issues of class
and student management, rather than to learning content. It is very relevant to shaping skill
performance.

A simple way to shape behavior is to provide feedback on learner performance, e.g.
compliments, approval, encouragement, and affirmation. A variable-ratio schedule produces
the highest response rate for students learning a new task, whereby initially reinforcement
(e.g. praise) occurs at frequent intervals, and as the performance improves reinforcement
occurs less frequently, until eventually only exceptional outcomes are reinforced.

For example, if a teacher wanted to encourage students to answer questions in class they
should praise them for every attempt (regardless of whether their answer is correct). Gradually
the teacher will only praise the students when their answer is correct, and over time only
exceptional answers will be praised.

Unwanted behaviors, such as tardiness and dominating class discussion can be extinguished
through being ignored by the teacher (rather than being reinforced by having attention drawn
to them).

Knowledge of success is also important as it motivates future learning. However, it is important
to vary the type of reinforcement given, so that the behavior is maintained. This is not an easy
task, as the teacher may appear insincere if he/she thinks too much about the way to behave.

Operant Conditioning Summary


Looking at Skinner's classic studies on pigeons’ and rats’ behavior, we can identify some of the
major assumptions of the behaviorist approach.
• Psychology should be seen as a science, to be studied in a scientific manner. Skinner's
study of behavior in rats was conducted under carefully controlled laboratory conditions.
• Behaviorism is primarily concerned with observable behavior, as opposed to internal
events like thinking and emotion. Note that Skinner did not say that the rats learned to
press a lever because they wanted food. He instead concentrated on describing the easily
observed behavior that the rats acquired.

• The major influence on human behavior is learning from our environment. In the Skinner
study, because food followed a particular behavior the rats learned to repeat that behavior,
i.e. operant conditioning.

• There is little difference between the learning that takes place in humans and that in other
animals. Therefore research (e.g. operant conditioning) can be carried out on animals
(Rats / Pigeons) as well as on humans. Skinner proposed that the way humans learn
behavior is much the same as the way the rats learned to press a lever.

So, if your layperson's idea of psychology has always been of people in laboratories wearing
white coats and watching hapless rats try to negotiate mazes in order to get to their dinner,
then you are probably thinking of behavioral psychology.

Behaviorism and its offshoots tend to be among the most scientific of the psychological
perspectives. The emphasis of behavioral psychology is on how we learn to behave in certain
ways. We are all constantly learning new behaviors and how to modify our existing behavior.
Behavioral psychology is the psychological approach that focuses on how this learning takes
place.

Critical Evaluation
Operant conditioning can be used to explain a wide variety of behaviors, from the process of
learning, to addiction and language acquisition. It also has practical application (such as token
economy) which can be applied in classrooms, prisons and psychiatric hospitals.
However, operant conditioning fails to take into account the role of inherited and cognitive
factors in learning, and thus is an incomplete explanation of the learning process in humans
and animals.
For example, Kohler (1924) found that primates often seem to solve problems in a flash of
insight rather than by trial and error learning. Also, social learning theory (Bandura, 1977)
suggests that humans can learn automatically through observation rather than through
personal experience.
The use of animal research in operant conditioning studies also raises the issue of
extrapolation. Some psychologists argue we cannot generalize from studies on animals to
humans, as animal anatomy and physiology differ from humans', and animals cannot think
about their experiences and invoke reason, patience, memory or self-comfort.
References
Bandura, A. (1977). Social learning theory. Englewood Cliffs, NJ: Prentice Hall.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement.
Kohler, W. (1924). The mentality of apes. London: Routledge & Kegan Paul.
Skinner, B. F. (1938). The Behavior of organisms: An experimental analysis. New York: Appleton-Century.
Skinner, B. F. (1948). 'Superstition' in the pigeon. Journal of Experimental Psychology, 38, 168-172.
Skinner, B. F. (1951). How to teach animals. Scientific American, 185, 26-29.
Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.
Thorndike, E. L. (1905). The elements of psychology. New York: A. G. Seiler.
Watson, J. B. (1913). Psychology as the Behaviorist views it. Psychological Review, 20, 158–177.

Skinner was heavily influenced by the work of John B. Watson as well as early behaviorist pioneers Ivan
Pavlov and Edward Thorndike. He spent most of his professional life teaching at Harvard University (after
three years in the psychology department at Indiana University). He died in 1990 of leukemia, leaving
behind his wife, Yvonne Blue, and two daughters.

Burrhus Frederic Skinner was born March 20, 1904, in the small Pennsylvania town of
Susquehanna. His father was a lawyer, and his mother a strong and intelligent housewife. His
upbringing was old-fashioned and hard-working.

Skinner was an active, out-going boy who loved the outdoors and building things, and actually enjoyed
school. His life was not without its tragedies, however. In particular, his brother died at the age of 16 of a
cerebral aneurysm.

B. F. Skinner received his BA in English from Hamilton College in upstate New York. Ultimately, he
resigned himself to writing newspaper articles on labor problems, and lived for a while in Greenwich Village
in New York City as a “bohemian.” After some traveling, he decided to go back to school, this time at
Harvard. He got his masters in psychology in 1930 and his doctorate in 1931, and stayed there to do
research until 1936.

Also in that year, he moved to Minneapolis to teach at the University of Minnesota. There he met and soon
married Yvonne Blue. They had two daughters, the second of whom became famous as the first infant to be
raised in one of Skinner’s inventions, the air crib, which was nothing more than a combination crib and
playpen with glass sides and air conditioning.

In 1945, he became the chairman of the psychology department at Indiana University. In 1948, he was
invited to come to Harvard, where he remained for the rest of his life. He was a very active man, doing
research and guiding hundreds of doctoral candidates as well as writing many books. While not successful
as a writer of fiction and poetry, he became one of our best psychology writers; his books include Walden
Two, a fictional account of a community run by his behaviorist principles.

B. F. Skinner’s theory is based on operant conditioning. The organism is in the process of “operating” on
the environment, which in ordinary terms means it is bouncing around its world, doing what it does. During
this “operating,” the organism encounters a special kind of stimulus, called a reinforcing stimulus, or simply
a reinforcer. This special stimulus has the effect of increasing the operant -- that is, the behavior occurring
just before the reinforcer. This is operant conditioning: “the behavior is followed by a consequence, and the
nature of the consequence modifies the organism’s tendency to repeat the behavior in the future.”

The following has been adapted from the Wikipedia and Webspace websites.

Skinner conducted research on shaping behavior through positive and negative reinforcement and
demonstrated operant conditioning, a behavior modification technique which he developed in contrast with
classical conditioning. His idea of the behavior modification technique was to put the subject on a program
with steps: first, setting goals that specify how the subject's behavior is to change; next, designing a
program that will move the subject toward the desired state; and finally, implementing the program and
evaluating its effectiveness.

B.F. Skinner's Theory of Operant Conditioning

Place a rat in a special cage (called a “Skinner box”) that has a bar or pedal on one wall that, when
pressed, causes a little mechanism to release a food pellet into the cage. The rat is moving around the
cage when it accidentally presses the bar and, as a result of pressing the bar, a food pellet falls into the
cage. The operant is the behavior just prior to the reinforcer, which is the food pellet. In a relatively short
period of time the rat "learns" to press the bar whenever it wants food. This leads to one of the principles of
operant conditioning--A behavior followed by a reinforcing stimulus results in an increased probability of that
behavior occurring in the future.

If the rat presses the bar and continually does not get food, the behavior becomes extinguished. This leads
to another of the principles of operant conditioning--A behavior no longer followed by the reinforcing
stimulus results in a decreased probability of that behavior occurring in the future.

Now, if you were to turn the pellet machine back on, so that pressing the bar again provides the rat with
pellets, the behavior of bar-pushing will come right back into existence, much more quickly than it took for
the rat to learn the behavior the first time. This is because the return of the reinforcer takes place in the
context of a reinforcement history that goes all the way back to the very first time the rat was reinforced for
pushing on the bar. This leads to what are called the Schedules of Reinforcement.

Schedules of Reinforcement

Continuous reinforcement is the original scenario: Every time that the rat does the behavior (such as pedal-
pushing), he gets a food pellet.

The fixed ratio schedule was the first one Skinner discovered: If the rat presses the pedal three times, say,
he gets a goodie. Or five times. Or twenty times. Or “x” times. There is a fixed ratio between behaviors and
reinforcers: 3 to 1, 5 to 1, 20 to 1, etc.

The fixed interval schedule uses a timing device of some sort. If the rat presses the bar at least once during
a particular stretch of time (say 20 seconds), then he gets a goodie. If he fails to do so, he doesn’t get a
goodie. But even if he hits that bar a hundred times during that 20 seconds, he still only gets one goodie!
One strange thing that happens is that the rats tend to “pace” themselves: They slow down the rate of their
behavior right after the reinforcer, and speed up when the time for it gets close.

Skinner also looked at variable schedules. Variable ratio means you change the “x” each time -- first it takes
3 presses to get a goodie, then 10, then 1, then 7 and so on. Variable interval means you keep changing
the time period -- first 20 seconds, then 5, then 35, then 10 and so on. With the variable interval schedule,
the rats no longer “pace” themselves, because they can no longer establish a “rhythm” between behavior
and reward. Most importantly, these schedules are very resistant to extinction. It makes sense, if you think
about it. If you haven’t gotten a reinforcer for a while, well, it could just be that you are at a particularly “bad”
ratio or interval, just one more bar press, maybe this’ll be the one time you get reinforced.

Shaping

A question Skinner had to deal with was how we get to more complex sorts of behaviors. He responded
with the idea of shaping, or “the method of successive approximations.” Basically, it involves first reinforcing
a behavior only vaguely similar to the one desired. Once that is established, you look out for variations that
come a little closer to what you want, and so on, until you have the animal performing a behavior that would
never show up in ordinary life. Skinner and his students have been quite successful in teaching simple
animals to do some quite extraordinary things.
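The shaping procedure described above can be modeled as a simple loop. The sketch below is illustrative only (not Skinner's own formulation): the "desired behavior" is reduced to a target number, and only variations that come a little closer to it are "reinforced" and therefore kept, shifting the contingency step by step:

```python
# Illustrative sketch: shaping via successive approximation, modeled as
# a loop that only reinforces variations closer to the desired behavior.
import random

random.seed(1)

target = 100    # hypothetical "desired behavior", scored as a number
behavior = 0    # the organism's starting behavior

for trial in range(1000):
    # the organism emits a variation around its current behavior
    attempt = behavior + random.uniform(-10, 10)
    # reinforce (and so keep) only variations closer to the target;
    # the contingency shifts each time the organism takes a step closer
    if abs(attempt - target) < abs(behavior - target):
        behavior = attempt

print(round(behavior))  # ends up near 100
```

The key design point mirrors the text: no single variation resembles the final behavior, yet the shifting criterion ratchets the organism toward a behavior "that would never show up in ordinary life."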

Beyond fairly simple examples, shaping also accounts for the most complex of behaviors. We are gently
shaped by our environment to enjoy certain things.

Aversive stimuli

An aversive stimulus is the opposite of a reinforcing stimulus, something we might find unpleasant or
painful. This leads to another principle of operant conditioning--A behavior followed by an aversive stimulus
results in a decreased probability of the behavior occurring in the future.

This both defines an aversive stimulus and describes the form of conditioning known as punishment. If you
shock a rat for doing x, it’ll do a lot less of x. If you spank Johnny for throwing his toys he will probably throw
his toys less and less.

On the other hand, if you remove an already active aversive stimulus after a rat or Johnny performs a
certain behavior, you are doing negative reinforcement. If you turn off the electricity when the rat stands on
his hind legs, he’ll do a lot more standing. If you stop your perpetual nagging when I finally take out the
garbage, I’ll be more likely to take out the garbage. You could say it “feels so good” when the aversive
stimulus stops, that this serves as a reinforcer. Another operant conditioning principle--Behavior followed
by the removal of an aversive stimulus results in an increased probability of that behavior occurring in the
future.

Skinner did not advocate the use of punishment. His main focus was to target behavior and see that
consequences deliver responses. From his research came "shaping" (described above) which is described
as creating behaviors through reinforcing. He also came up with the example of a child's refusal to go to
school and that the focus should be on what is causing the child's refusal not necessarily the refusal itself.
His research suggested that punishment was an ineffective way of controlling behavior, leading generally to
short-term behavior change, but resulting mostly in the subject attempting to avoid the punishing stimulus
instead of avoiding the behavior that was causing punishment. A simple example of this, he believed, was
the failure of prison to eliminate criminal behavior. If prison (as a punishing stimulus) was effective at
altering behavior, there would be no criminality, since the risk of imprisonment for criminal conduct is well
established, Skinner deduced. However, he noted that individuals still commit offences, but attempt to avoid
discovery and therefore punishment. He noted that the punishing stimulus does not stop criminal behavior;
the criminal simply becomes more sophisticated at avoiding the punishment. Reinforcement, both positive
and negative (the latter of which is often confused with punishment), he believed, proved to be more
effective in bringing about lasting changes in behavior.

Behavior modification

Behavior modification -- often referred to as b-mod -- is the therapy technique based on Skinner’s work. It is
very straight-forward: Extinguish an undesirable behavior (by removing the reinforcer) and replace it with a
desirable behavior by reinforcement. It has been used on all sorts of psychological problems -- addictions,
neuroses, shyness, autism, even schizophrenia -- and works particularly well with children. There are
examples of psychotic patients on hospital back wards, who hadn’t communicated with others for years,
being conditioned to behave in fairly normal ways, such as eating with a knife and fork, taking care of
their own hygiene needs, dressing themselves, and so on.

There is an offshoot of b-mod called the token economy. This is used primarily in institutions such as
psychiatric hospitals, juvenile halls, and prisons. Certain rules are made explicit in the institution, and
behaving yourself appropriately is rewarded with tokens -- poker chips, tickets, funny money, recorded
notes, etc. Certain poor behavior is also often followed by a withdrawal of these tokens. The tokens can be
traded in for desirable things such as candy, cigarettes, games, movies, time out of the institution, and so
on. This has been found to be very effective in maintaining order in these often difficult institutions.

Walden Two

In 1948, Skinner published his actual ideas on child-rearing in Walden Two, a fictional account of a
behaviorist-created utopia in which carefree young parents stroll off to work or school while their little ones
enjoy all the comforts of community-run, behaviorist-approved daycare. Skinner's book Walden Two
presents a vision of a decentralized, localized society which applies a practical, scientific approach and
futuristically advanced behavioral expertise to peacefully deal with social problems. Skinner's utopia, like
every other utopia or dystopia, is both a thought experiment and a rhetorical work.

In 1971 he wrote Beyond Freedom and Dignity, which suggests that the concept of individual freedom is an
illusion. Skinner later sought to unite the reinforcement of individual behaviors, the natural selection of
species, and the development of cultures under the heading of The Selection by Consequences (1981), the
first of a series of articles in the journal Science.
