
Chapter 6: Learning “Operant Conditioning”

Operant Conditioning
A type of learning in which behavior is strengthened if followed by reinforcement or diminished if followed by punishment.
In rats:
• trial and error learning
• allows acquisition of motor programs that aren’t instinctive
• behavior shaped by rewards
• develops as a result of the association of reinforcement with a particular response
• on a proportion of occasions

Trial & Error --> Trial & Reward --> Operant Conditioning
Operant Response --> Reinforcement --> Learned Behavior

Classical vs. Operant
Both use acquisition, discrimination, spontaneous recovery (SR), generalization, and extinction.

Classical Conditioning is automatic (respondent behavior).
Ex.) Your dog gets sick and requires several painful trips to the vet. Now he hides every time he hears you rattle your keys. Automatic.

Operant Conditioning involves behavior that operates on the environment and produces consequences (operant behavior).
Ex.) Teacher comments on test.

Edward Thorndike

Law of Effect:
rewarded behavior is likely to recur.
Previous theories had emphasized practice or repetition; Thorndike gave equal consideration to the effects of reward or punishment, success or failure, and satisfaction or annoyance on the learner.

B.F. Skinner
Instead of the antecedents of behavior (what comes before), a new focus on the consequences of behavior. B.F. Skinner argued that classical conditioning (CC) did not explain complex behavior.
2 categories of consequences: Reinforcement & Punishment.
Reinforcement is designed to increase the probability that a behavior will occur again.
Punishment is designed to decrease the probability that a behavior will occur again.

Operant Conditioning Chamber

Positive reinforcement - when something is given (apply a desirable stimulus).
Negative reinforcement - when something is removed (remove an aversive stimulus).
Skinner - punishment should be judicious, immediate, consistent, & severe enough to actually be a punishment.

Shaping
A procedure in Operant Conditioning in which reinforcers guide behavior closer and closer towards a goal.

Reinforcer
Any event that STRENGTHENS the behavior it follows.

There are positive (+) and negative (–) reinforcers.

Positive Reinforcement
Strengthens a response by presenting a stimulus after a response. We may continue to go to work each day because we receive a paycheck on a weekly or monthly basis. If we receive awards for writing short stories, we may be more likely to increase the frequency of writing short stories. Receiving praise for our karaoke performances can increase how often we sing. These are all examples of positive reinforcement.

Negative Reinforcement
Strengthens a response by reducing or removing an aversive stimulus.
Example: Driving in heavy traffic is a negative condition for most of us. You leave home earlier than usual one morning, and don't run into heavy traffic. You leave home earlier again the next morning and again you avoid heavy traffic. Your behavior of leaving home earlier is strengthened by the consequence of the avoidance of heavy traffic.
The concept of Negative Reinforcement is difficult to learn because of the word negative. Negative Reinforcement is often confused with Punishment. They are very different, however. Negative Reinforcement strengthens a behavior because a negative condition is stopped or avoided as a consequence of the behavior. Punishment, on the other hand, weakens a behavior because a negative condition is introduced or experienced as a consequence of the behavior.
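The distinction above comes down to two yes/no questions: was something presented or removed, and did the behavior get stronger or weaker? A minimal sketch of that decision in Python follows. The function name `classify_consequence` and its parameters are made up for illustration. The labels "positive punishment" and "negative punishment" are the standard textbook terms for the two punishment cases and are not spelled out in these notes.

```python
def classify_consequence(stimulus_added: bool, behavior_strengthened: bool) -> str:
    """Name the type of consequence from two yes/no questions.

    stimulus_added: True if something is presented after the response,
        False if something is removed or avoided.
    behavior_strengthened: True if the response becomes more likely,
        False if it becomes less likely.
    """
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behavior_strengthened else "punishment"
    return f"{sign} {kind}"


# A paycheck is presented after working, and working continues.
print(classify_consequence(stimulus_added=True, behavior_strengthened=True))
# Heavy traffic is avoided by leaving early, and leaving early continues.
print(classify_consequence(stimulus_added=False, behavior_strengthened=True))
```

Note that "negative" describes the removal of a stimulus, not the direction of the behavior change, which is why negative reinforcement still strengthens behavior.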

Fixed-ratio Schedules
A schedule that reinforces a response only after a specified number of responses.
Examples in natural environments: Jobs that pay based on units delivered. Employees often find this schedule undesirable because it produces a rate of response that leaves them nervous and exhausted at the end of the day. They may feel pressured not to slow down or take rest breaks, since they feel that doing so will cost them money. This is an example of how a schedule can produce a high rate of response even though the response rate is aversive to the subject.
Examples in video games: Collecting tokens. Many games require the player to collect a fixed number of tokens to advance to the next level, obtain a new life point, or receive some other reinforcer. Attaining a new level in an RPG. Some RPGs clearly indicate how much experience is required to achieve the next level. A high degree of certainty as to the amount of work that will be required to achieve the next level puts the player on a fixed-ratio schedule.
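A fixed-ratio schedule can be sketched as a simple counter: every Nth response is reinforced. The Python below is only an illustration. The function name `fixed_ratio` and its parameters are hypothetical and do not come from any particular library.

```python
def fixed_ratio(num_responses: int, ratio: int) -> list:
    """Return one True/False per response: True if that response is reinforced.

    Under a fixed-ratio schedule, every `ratio`-th response is reinforced.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [n % ratio == 0 for n in range(1, num_responses + 1)]


# Assumed example: piecework that pays after every 5 units delivered.
outcomes = fixed_ratio(num_responses=10, ratio=5)
reinforced = [n for n, hit in enumerate(outcomes, start=1) if hit]
print(reinforced)  # responses 5 and 10 are reinforced
```

Because the payoff depends only on the count, responding faster brings reinforcement sooner, which fits the high response rates described above.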

Variable-ratio Schedule
A schedule of reinforcement that reinforces a response after an unpredictable number of responses.
Examples:
• Slot machines. Slot machines are programmed on a VR schedule. The gambler has no way of predicting how many times he must put a coin in the slot and pull the lever to hit a payoff, but the more times a coin is inserted, the greater the chance of a payout. People who play slot machines are often reluctant to leave them, especially when they have had a large number of unreinforced responses. They are concerned that someone else will win the moment they leave.
• Playing golf. It only takes a few good shots to encourage the player to keep playing or play again. The player is uncertain how good each shot will be, but the more often they play, the more likely they are to get a good shot.
• Door-to-door salesmen. It is uncertain how many houses they will have to visit to make a sale, but the more houses they try, the more likely it is that they will succeed.
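A variable-ratio schedule can be sketched the same way as a counter, except that the number of responses required changes unpredictably after each reinforcer. The Python below is a rough illustration, not a model of how any real slot machine is programmed. The function name `variable_ratio` is hypothetical, and drawing each requirement uniformly between 1 and `2 * mean_ratio - 1` is an assumption chosen only so that the average requirement equals `mean_ratio`.

```python
import random


def variable_ratio(num_responses: int, mean_ratio: int, seed=None) -> list:
    """Return one True/False per response: True if that response is reinforced.

    After each reinforcer, the number of responses required for the next one
    is redrawn at random (assumption: uniform on 1 .. 2 * mean_ratio - 1).
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)
    required = rng.randint(1, 2 * mean_ratio - 1)
    since_last = 0
    outcomes = []
    for _ in range(num_responses):
        since_last += 1
        if since_last >= required:
            outcomes.append(True)
            since_last = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
        else:
            outcomes.append(False)
    return outcomes


# Assumed example: 1000 lever pulls, paying off on average every 5 pulls.
pulls = variable_ratio(num_responses=1000, mean_ratio=5, seed=1)
print(sum(pulls), "payoffs in", len(pulls), "pulls")
```

The subject cannot tell which response will pay off, only that more responses mean more payoffs overall, which matches the gambler, golfer, and salesman examples.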

Fixed-interval Schedule
A schedule of reinforcement that reinforces a response only after a specified time has elapsed.

Example: I give Bart a Butterfinger every ten minutes after he moons someone.
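A fixed-interval schedule depends on the clock rather than on the response count. A minimal Python sketch follows, assuming the common laboratory arrangement in which the first response made after the interval has elapsed is reinforced and the timer then restarts. The function name `fixed_interval`, the starting time of 0, and the sample times are all made up for illustration.

```python
def fixed_interval(response_times, interval):
    """Return one True/False per response: True if that response is reinforced.

    response_times: times of each response, in ascending order (for example, minutes).
    interval: time that must pass since the last reinforcer (or since time 0)
        before a response can be reinforced again.
    """
    outcomes = []
    last_reinforced = 0.0  # assumption: the clock starts at time 0
    for t in response_times:
        if t - last_reinforced >= interval:
            outcomes.append(True)
            last_reinforced = t
        else:
            outcomes.append(False)
    return outcomes


# Assumed example: responses at these minutes, with a ten-minute interval.
print(fixed_interval([2, 9, 10, 11, 19, 25], interval=10))
# The responses at minutes 10 and 25 are reinforced; the others are not.
```

Responses made before the interval is up earn nothing, so extra responding early in the interval is wasted effort.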

Variable-interval Schedule
A schedule of reinforcement that reinforces a response at unpredictable time intervals.

Example: Pop quizzes.
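A variable-interval schedule can be sketched like a fixed-interval one, except that the waiting time is redrawn unpredictably after each reinforcer. The Python below is only an illustration. The function name `variable_interval` is hypothetical, and drawing each wait uniformly between 0 and `2 * mean_interval` is an assumption chosen so that the average wait equals `mean_interval`.

```python
import random


def variable_interval(response_times, mean_interval, seed=None):
    """Return one True/False per response: True if that response is reinforced.

    response_times: times of each response, in ascending order.
    After each reinforcer, the wait before the next one becomes available is
    redrawn at random (assumption: uniform on 0 .. 2 * mean_interval).
    """
    rng = random.Random(seed)
    outcomes = []
    last_reinforced = 0.0  # assumption: the clock starts at time 0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        if t - last_reinforced >= wait:
            outcomes.append(True)
            last_reinforced = t
            wait = rng.uniform(0, 2 * mean_interval)
        else:
            outcomes.append(False)
    return outcomes


# Assumed example: a student checks once per day for 100 days, and a pop quiz
# becomes available on average every 5 days.
days = list(range(1, 101))
quizzes = variable_interval(days, mean_interval=5, seed=3)
print(sum(quizzes), "reinforced checks out of", len(quizzes))
```

Since the next opportunity could come at any moment, steady responding is the only way to avoid missing it, which is the point of the pop quiz example.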

Punishment
An event that DECREASES the behavior that it follows.

Does punishment work?