Chapter 8: Conditioning and Learning

What is learning?
• Learning is a relatively permanent change in behavior due to experience. Learning resulting from conditioning depends on reinforcement. Reinforcement increases the probability that a particular response will occur.
  o Response: Any identifiable behavior.
    ▪ Antecedents: Events that precede a response.
    ▪ Consequences: Effects that follow a response.
  o Acquisition: The period in conditioning during which a response is reinforced.
• Classical, or respondent, conditioning and instrumental, or operant, conditioning are two basic types of learning.
• In classical conditioning, a previously neutral stimulus begins to elicit a response through association with another stimulus.
• In operant conditioning, the frequency and pattern of voluntary responses are altered by their consequences.

How does classical conditioning occur?
• Classical conditioning, studied by Pavlov, occurs when a neutral stimulus (NS) is associated with an unconditioned stimulus (US).
  o Classical Conditioning: A form of learning in which reflex responses are associated with new stimuli.
  o Neutral Stimulus: A stimulus that does not evoke a response.
  o Unconditioned Stimulus: A stimulus innately capable of eliciting a response.
• The US causes a reflex called the unconditioned response (UR). If the NS is consistently paired with the US, it becomes a conditioned stimulus (CS), capable of producing a response by itself. This response is a conditioned (learned) response (CR).
  o Reflex: An innate, automatic response to a stimulus; for example, an eye blink.
  o Unconditioned Response: An innate reflex response elicited by an unconditioned stimulus.
  o Conditioned Stimulus: A stimulus that evokes a response because it has been repeatedly paired with an unconditioned stimulus.
  o Conditioned Response: A learned response elicited by a conditioned stimulus.
• When the conditioned stimulus is followed by the unconditioned stimulus, conditioning is reinforced (strengthened).
  o Respondent Reinforcement: Reinforcement that occurs when an unconditioned stimulus closely follows a conditioned stimulus.

From an informational view, conditioning creates expectancies, which alter response patterns. In classical conditioning, the CS creates an expectancy that the US will follow.
  o Informational View: Perspective that explains learning in terms of information imparted by events in the environment.
  o Expectancy: An anticipation concerning future events or relationships.
Higher order conditioning occurs when a well-learned conditioned stimulus is used as if it were an unconditioned stimulus, bringing about further learning.
When the CS is repeatedly presented alone, conditioning is extinguished (weakened or inhibited). After extinction seems to be complete, a rest period may lead to the temporary reappearance of a conditioned response. This is called spontaneous recovery.
  o Extinction: The weakening of a conditioned response through removal of reinforcement.
Through stimulus generalization, stimuli similar to the conditioned stimulus will also produce a response. Generalization gives way to stimulus discrimination when an organism learns to respond to one stimulus, but not to similar stimuli.

Does conditioning affect emotions?
• Conditioning applies to visceral or emotional responses as well as simple reflexes. As a result, conditioned emotional responses (CERs) also occur.
  o Conditioned Emotional Response: An emotional response that has been linked to a previously nonemotional stimulus by classical conditioning.
• Irrational fears called phobias may be CERs. Conditioning of emotional responses can occur vicariously (secondhand) as well as directly.
  o Desensitization: Reducing fear or anxiety by repeatedly exposing a person to emotional stimuli while the person is deeply relaxed.
    ▪ Desensitization is used as a therapy to extinguish fears, anxieties, and phobias caused by CERs which, due to stimulus generalization, spread to other stimuli.
  o Vicarious Classical Conditioning: Classical conditioning brought about by observing another person react to a particular stimulus.

How does operant conditioning occur?
• Operant conditioning occurs when a voluntary action is followed by a reinforcer. Reinforcement in operant conditioning increases the frequency or probability of a response. This result is based on the law of effect.
  o Operant Reinforcer: Any event that reliably increases the probability or frequency of responses it follows.
    ▪ Operant reinforcement works best when it is response contingent. That is, it must be given only after a desired response has occurred.
  o Law of Effect: Responses that lead to desirable effects are repeated; those that produce undesirable results are not.

Complex operant responses can be taught by reinforcing successive approximations to a final desired response. This is called shaping. It is particularly useful in training animals.
  o Shaping: Gradually molding responses to a final desired pattern.
    ▪ Successive Approximations: A series of steps or ever-closer matches to a desired response.
If an operant response is not reinforced, it may extinguish (disappear). But after extinction seems complete, it may temporarily reappear (spontaneous recovery).
  o Operant Extinction: The weakening or disappearance of a nonreinforced operant response.
    ▪ Marked changes in behavior occur when reinforcement and extinction are combined. For example, parents often unknowingly reinforce children for negative attention seeking (using misbehavior to gain attention). Children are generally ignored when they are playing quietly. They get attention when they become louder and louder, misbehave, throw tantrums, and so on. The attention they get from their parents is often a scolding, but attention is a powerful reinforcer nonetheless.

Are there different kinds of operant reinforcement?
• In positive reinforcement, a reward or pleasant event follows a response. In negative reinforcement, responses that end discomfort tend to be repeated.
• Primary reinforcers are “natural,” physiologically based rewards. Intracranial stimulation of “pleasure centers” in the brain can also serve as a primary reinforcer.
  o Intracranial Stimulation: Direct electrical stimulation and activation of brain tissue.
• Secondary reinforcers are learned. They typically gain their reinforcing value by direct association with primary reinforcers. Tokens and money gain their reinforcing value in this way.
  o Social Reinforcer: Reinforcers, such as attention and approval, provided by other people.
  o Token Reinforcer: A tangible secondary reinforcer such as money, gold stars, poker chips, and the like.
• Feedback, or knowledge of results, aids learning and improves performance. It is most effective when it is immediate, detailed, and frequent.
  o Feedback: Information returned to a person about the effects a response has had.
    ▪ Programmed Instruction: Any learning format that presents information in small amounts, gives immediate practice, and provides continuous feedback to learners.
• Programmed instruction breaks learning into a series of small steps and provides immediate feedback. Computer-assisted instruction (CAI) does the same but has the added advantage of providing alternative exercises and information when needed. Four variations of CAI are drill and practice, instructional games, educational simulations, and interactive multimedia instruction.
  o Drill and Practice: A basic CAI format, typically consisting of questions and answers.
  o Instructional Games: Educational computer programs designed to resemble games to motivate learning.
  o Educational Simulations: Computer programs that simulate real-world settings or situations to promote learning.
  o Interactive Multimedia Instruction: Computerized instruction that combines text, sounds, videos, and interactive exercises.

How are we influenced by patterns of reward?
• Delay of reinforcement greatly reduces its effectiveness, but long chains of responses may be built up so that a single reinforcer maintains many responses.
  o Response Chaining: The assembly of separate responses into a series of actions that lead to reinforcement.
• Superstitious behaviors often become part of response chains because they appear to be associated with reinforcement.
  o Superstitious Behavior: A behavior repeated because it seems to produce reinforcement, even though it is actually unnecessary.
• Reward or reinforcement may be given continuously (after every response) or on a schedule of partial reinforcement. Partial reinforcement produces greater resistance to extinction.
  o Continuous Reinforcement: A reinforcer follows every correct response. This is fine for the lab, but it has little to do with the real world. Most of our responses are more inconsistently rewarded. In daily life, learning is usually based on partial reinforcement.
    ▪ Partial Reinforcement: A pattern in which only a portion of all responses will be reinforced.
      • Partial Reinforcement Effect: Responses acquired with partial reinforcement are more resistant to extinction.
  o Schedule of Reinforcement: A rule or plan for determining which responses will be reinforced.
• The four most basic schedules of reinforcement are fixed ratio, variable ratio, fixed interval, and variable interval. Each produces a distinct pattern of responding.
  o Fixed Ratio Schedule: A set number of correct responses must be made to get a reinforcer. For example, a reinforcer is given for every four correct responses.
  o Variable Ratio Schedule: A varied number of correct responses must be made to get a reinforcer. For example, a reinforcer is given after three to seven correct responses; the actual number changes randomly.
  o Fixed Interval Schedule: A reinforcer is given only when a correct response is made after a set amount of time has passed since the last reinforced response. Responses made during the time interval are not reinforced.
  o Variable Interval Schedule: A reinforcer is given for the first correct response made after a varied amount of time has passed since the last reinforced response. Responses made during the time interval are not reinforced.

Stimuli that precede a reinforced response tend to control the response on future occasions (stimulus control). Two aspects of stimulus control are generalization and discrimination. In generalization, an operant response tends to occur when stimuli similar to those preceding reinforcement are present. In discrimination, responses are given in the presence of discriminative stimuli associated with reinforcement (S+) and withheld in the presence of stimuli associated with nonreinforcement (S-).
  o Discriminative Stimuli: Stimuli that precede rewarded and nonrewarded responses in operant conditioning.
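
The four basic schedules of reinforcement defined in this section are, in effect, rules for deciding which responses earn a reinforcer, so they can be written out as small programs. The following is only an illustrative sketch, not a model of real behavior. The class names, the respond(t) method, and the choice to start each clock at time 0 are assumptions made for this example; the values 4 and 3 to 7 come from the examples in the definitions.

```python
import random


class FixedRatio:
    """Reinforce every n-th correct response."""

    def __init__(self, n):
        self.n = n
        self.count = 0  # correct responses since the last reinforcer

    def respond(self, t):
        # t (time of the response) is ignored: ratio schedules only count responses.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a number of correct responses that changes randomly."""

    def __init__(self, low, high, rng=None):
        self.low, self.high = low, high
        self.rng = rng or random.Random()
        self.count = 0
        self.required = self.rng.randint(low, high)  # inclusive on both ends

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self.rng.randint(self.low, self.high)
            return True
        return False


class FixedInterval:
    """Reinforce the first correct response made once a set time has passed
    since the last reinforced response."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # assumption: the clock starts at time 0

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False  # responses during the interval are not reinforced


class VariableInterval:
    """Like FixedInterval, but the required wait changes randomly."""

    def __init__(self, low, high, rng=None):
        self.low, self.high = low, high
        self.rng = rng or random.Random()
        self.last = 0.0  # assumption: the clock starts at time 0
        self.wait = self.rng.uniform(low, high)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self.rng.uniform(self.low, self.high)
            return True
        return False


# A reinforcer for every four correct responses (the fixed ratio example).
fr = FixedRatio(4)
print([fr.respond(t) for t in range(1, 9)])
# [False, False, False, True, False, False, False, True]
```

Note how the ratio schedules depend only on how many responses are made, while the interval schedules depend only on when a response is made. Responding faster earns more reinforcers on a ratio schedule but not on an interval schedule.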

What does punishment do to behavior?
• Punishment decreases responding. Punishment occurs when a response is followed by the onset of an aversive event or by the removal of a positive event (response cost).
  o Aversive Stimulus: A stimulus that is painful or uncomfortable.
  o Punisher: Any event that decreases the probability or frequency of responses it follows.
  o Response Cost: Removal of a positive reinforcer after a response is made.
• Punishment is most effective when it is immediate, consistent, and intense. Mild punishment tends to only temporarily suppress responses that are also reinforced or were acquired by reinforcement.
  o Positive Punishment: Something is added. Example: A child swears and is spanked.
  o Negative Punishment: Something is taken away. Example: A child steals and is grounded.
  o Severe Punishment: Intense punishment; punishment capable of suppressing a response for long periods.
  o Mild Punishment: Punishment that has a relatively weak effect, especially punishment that only temporarily slows responding.
• The undesirable side effects of punishment include the conditioning of fear to punishing agents and situations associated with punishment, the learning of escape and avoidance responses, and the encouragement of aggression.
  o Escape Learning: Learning to make a response in order to end an aversive stimulus.
  o Avoidance Learning: Learning to make a response in order to postpone or prevent discomfort.

Punishment is not very effective because, unlike conditioning, it may make the punished person or animal afraid of the repercussions that follow a certain action, but it will not make them dislike the action itself.

What is cognitive learning?
• Cognitive learning involves higher mental processes, such as understanding, knowledge, knowing, or anticipating. Even in relatively simple learning situations, animals and people seem to form cognitive maps (internal representations of relationships).
  o Cognitive Maps: Internal images or other mental representations of an area (maze, city, campus, and so forth) that underlie an ability to choose alternative paths to the same goal.
• In latent learning, learning remains hidden or unseen until a reward or incentive for performance is offered.
• Discovery learning emphasizes insight and understanding, in contrast to rote learning.
  o Rote Learning: Learning that takes place mechanically, through repetition and memorization, or by learning rules.

Does learning occur by imitation?
• Much human learning is achieved through observation, or modeling. Observational learning is influenced by the personal characteristics of the model and the success or failure of the model’s behavior. Studies have shown that aggression is readily learned and released by modeling.
  o Model: A person who serves as an example in observational learning.
• Television characters can act as powerful models for observational learning. Televised violence increases the likelihood of aggression by viewers.

How does conditioning apply to practical problems?
• Operant principles can be readily applied to manage behavior in everyday settings. When managing one’s own behavior, self-reinforcement, self-recording, feedback, and behavioral contracting are all helpful.
  o Premack Principle: Any high-frequency response can be used to reinforce a low-frequency response. For example, if you watch television every night and want to study more, make it a rule not to turn on the set until you have studied for an hour (or whatever length of time you choose). Then lengthen the requirement each week.
  o Self-Reinforcement: Rewarding oneself after accomplishing an established goal.
  o Self-Recording: Self-management based on keeping records of response frequencies.
  o Behavioral Contract: A formal agreement stating behaviors to be changed and consequences that apply.

Four strategies that can help change bad habits are reinforcing alternative responses, promoting extinction, breaking response chains, and avoiding antecedent cues.

How does biology influence learning?
• Many animals are born with innate behavior patterns far more complex than reflexes. These are organized into fixed action patterns (FAPs), which are stereotyped, species-specific behaviors.
  o Innate Behavior: Inborn, unlearned behavior.
  o Fixed Action Pattern (FAP): An instinctual chain of movements found in almost all members of a species.
  o Species-Specific Behavior: Behavior patterns that occur with little variation in almost all members of a species.
• Learning in animals is limited at times by various biological constraints and species-typical behaviors.
  o Biological Constraints: Biological limits on what an animal or person can easily learn.
  o Species-Typical Behavior: Behavior patterns that are typical of a species, but not automatic.
• According to prepared fear theory, some stimuli are especially effective conditioned stimuli.
  o Prepared Fear Theory: Holds that people and animals are prepared by evolution to readily learn fears of certain stimuli.
• Many responses are subject to instinctive drift in operant conditioning. Human learning is subtly influenced by many such biological potentials and limits.
  o Instinctive Drift: The tendency of learned responses to shift toward innate response patterns.

