How dogs learn (the science bit)

All sentient beings (dogs, rats, dolphins, humans, kangaroos, goldfish, etc) learn in the same way: through
operant conditioning, which is learning through consequences (how to gain things we enjoy or that we need
to survive and how to avoid things that might be dangerous or unpleasant).

They also learn through Classical (or Pavlovian) conditioning, which is a more subconscious way of learning,
creating negative and positive associations with previously neutral stimuli or events that affect our
emotions and behaviour.

We can use the operant learning system in dog training to create, shape and encourage behaviours we like.
We might want a behaviour to be offered more generally, to be offered automatically in a specific situation
in response to one or more environmental cues, or to be put on cue (sometimes called a command). Putting a
behaviour on cue means being able to prompt the animal in some way, with a verbal signal or hand gesture
for example, opening up a window of opportunity for them to perform the corresponding behaviour and earn a
reward... or avoid a punishment, if you are so inclined. We can also reduce behaviour that we do not want
to see.

What differs between species, and between individual animals within a species or sub-group, is what that
individual animal finds reinforcing (what they will work to earn, and what creates positive conditioned
emotional responses and excitement/eustress when that reinforcer is paired with a neutral stimulus)
and what they, personally, find punishing or aversive (what they will work to avoid and what creates negative
emotional responses and anxiety/distress).

Whenever learning through consequences is happening (whether you intended it or not!) the process of operant
conditioning is responsible, and the sequence of events that caused the learning to happen can always be
categorised in one of the four quadrants of operant conditioning.

Firstly, let's look at what the symbols used to signify these different quadrants mean, individually.


The + stands for positive. When discussing learning theory, positive does not always mean something good.
It is used in the mathematical sense: positive (+) means to add. When you see the + sign, it means
something is being added to the situation.


The - stands for negative. Again, this does not always mean something bad. Negative (-) means something
is being taken away. Something has been removed from the situation and that caused the animal to learn.


The P stands for Punishment.

When a situation has caused a decrease in behaviour (the likelihood that the animal will repeat that
behaviour has decreased, even if that was not your intention) then it is defined as punishment. Punishment
can be used knowingly in many ways, some are ethical, some are not.


The R stands for Reinforcement.

If the behaviour increases (the animal is more likely to perform that behaviour again) then the adding or
removing of the stimulus or event is reinforcing. Reinforcement can also be used in many ways; some are
ethical, some are not.

There are four ways that these symbols can be put together; these are the quadrants.
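
The two questions behind the symbols (was something added or removed, and did the behaviour increase or
decrease?) can be sketched as a small lookup. This is only an illustration of the definitions in this
handout; the names Quadrant and classify_quadrant are made up for the example and do not come from any
training tool or library.

```python
from enum import Enum


class Quadrant(Enum):
    """The four quadrants, labelled with the symbols used in the text."""
    POSITIVE_PUNISHMENT = "P+"
    NEGATIVE_PUNISHMENT = "P-"
    NEGATIVE_REINFORCEMENT = "R-"
    POSITIVE_REINFORCEMENT = "R+"


def classify_quadrant(stimulus_added: bool, behaviour_increased: bool) -> Quadrant:
    """Name the quadrant from two observations (hypothetical helper).

    stimulus_added:      True if something was added (+), False if removed (-).
    behaviour_increased: True if the behaviour became more likely (R),
                         False if it became less likely (P).
    """
    if behaviour_increased:
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# Something was added and the behaviour became more likely.
print(classify_quadrant(stimulus_added=True, behaviour_increased=True).value)

# Something was removed and the behaviour became less likely.
print(classify_quadrant(stimulus_added=False, behaviour_increased=False).value)
```

Note that the function never asks what the handler intended. As with the definitions above, the quadrant is
decided only by what was added or removed and what then happened to the behaviour.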

Positive punishment (P+)

So, because we have the + it means something was added. And the P means punishment, so the behaviour
reduced. That something (event or stimulus) must have been aversive (painful, frightening, intimidating,
uncomfortable, unpleasant) in order to decrease the likelihood of the behaviour being repeated (the dog must
want to avoid the chance of that consequence happening again, by not displaying the behaviour that
immediately preceded it).

Remember this when proponents of positive punishment try to tell you that shock, choke and prong collars,
or other types of physical or psychological punishment - kicking, tapping, sprays, shouting, staring down,
rattle bottles, etc - do not hurt or are not scary. This is not true.

In P+, the consequence MUST be aversive enough to override the dog's desire to display the behaviour that
the handler is targeting, which can be very strong, especially when we're talking about things like dogs who
want to chase sheep. They must want to avoid the punishment more than they want to display the targeted
behaviour... and we have selectively bred dogs for thousands and thousands of years to encourage their
natural, in-built drive to want to herd, chase, retrieve or kill animals.

P+ has to hurt, or at least be very uncomfortable or frightening, or it simply would not work. It is either
ineffective (sometimes provides a temporary fix, but the problem always returns or is replaced by
something worse), unethical or both.

An example of P+

A dog exhibits an undesired behaviour such as jumping up while greeting someone and the handler gives a
yank on a choke chain or shouts at the dog, causing it to cower.

The immediately preceding behaviour (jumping up) was punished with an aversive stimulus/event and will be
less likely to be repeated.

This may seem like a quick and easy fix to a behaviour problem, but three things make it clear why educated
professionals and knowledgeable, skilled amateurs avoid this quadrant! First, Classical conditioning is also
at work: a neutral stimulus (perhaps the person the dog was enthusiastically greeting) is paired with a
reinforcing or a punishing event/stimulus, creating positive or negative CERs (Conditioned Emotional
Responses). Second, negative CERs and perceived threats to the dog's safety (things that cause pain and
fear) create stress. Third, dogs cope with stress by exhibiting behaviours like barking, chewing, digging,
urinating, humping and aggression.

This quadrant is not only sloppy, creating more problems than it solves (eg; you stopped the dog from
jumping up but s/he now has a negative CER to people, may see them as a threat and something that
predicts pain and may even become aggressive towards them), but it is also considered to be unethical and
abusive, as there are other ways to easily and effectively manage and modify unwanted behaviour that do
not rely on the use of aversives and setting the dog up to fail.

Such as...

Negative Punishment (P-)

Removal (-) of a reinforcing stimulus to decrease the likelihood that a behaviour will be repeated (P). So
because something was removed and the behaviour decreased, that something must have been reinforcing
for (desired by) that particular animal for the event to have a punishing effect.

For example, a puppy is playing with his/her owner and accidentally bites their hand. The person immediately
turns away from the puppy and stops playing, removing their attention and stopping the game (that the
puppy finds rewarding) for a moment or two.

You can effectively and ethically use this type of punishment alongside management (setting up the
environment so that the dog cannot practice unwanted behaviour, an errorless learning approach) and DRI
(differential reinforcement of an incompatible behaviour: rewarding an alternative behaviour that gets the
same result, eg; praising and feeding the puppy for biting the tug toy, not your hands! Or teaching the puppy
to fetch a toy when s/he gets excited and wants to play, using that energy and enthusiasm instead of
battling against it).

Negative Reinforcement (R-)

Removing an aversive stimulus to increase a behaviour.

This one is the most difficult to get your head around and can be cruel to the animal being subjected to it.
Something is removed (-) and the behaviour preceding the removal of that stimulus is more likely to be
repeated (R).

Let's break that down.

In order for the removal (-) of something to reinforce (R, increase likelihood) any given behaviour, the dog
must be relieved that the thing was removed. The thing must be aversive and it must be fairly constant.

For example, when a dog is asked to sit, the handler pushes down on the dog's rear end or pulls the head up
with a lead high on the neck to force its bum to the floor. The dog will sit down to stop the pressure on its
rear or around the sensitive area at the top of the neck.

Or a person might put a prong collar (an unethical collar that has metal spikes inside that pinch the skin
when pressure is applied to the lead) on a dog so that the dog feels pinches on its skin when it pulls
forwards but they stop when the dog stops pulling.

The most upsetting example I have seen is of trainers (people who are training professionally; you do not
need to have any education or ethics to do this, as the industry is unregulated) teaching a recall by putting
a shock collar on a dog and holding the button down or repeatedly pressing it so that the dog is continually
shocked until it turns towards the person. Then the reward, for discovering (by trial and error) the action
that the handler wanted them to display, is to have relief from the shock.

Positive reinforcement (R+)

Adding (+) a reinforcing stimulus or event (reward) to increase the frequency of a behaviour being repeated.
For example, a dog sits and gets a treat. The dog is likely to repeat this behaviour and will offer it freely
(as long as s/he is confident to experiment and is not afraid of the consequences for getting it wrong, as
s/he may be if previously exposed to P+ / R-).

This is the most common way people train dogs now as it is safe, kind and easy to put into practice and
creates an enthusiastically cooperative dog!

When people refer to positive training or reward-based training they are referring to trainers who use
primarily positive reinforcement (alongside negative punishment), as these are the ethical quadrants.


Using food as a reward in the R+ quadrant is often confused with bribery.

Using bribes and using positive reinforcement are two completely different things but they can get confused if
you are not aware of the theory of learning and the order of the sequence of events that changes or shapes
behaviour.

If you ask your dog for a sit (a behaviour you have previously taught them) and, when their bottom hits the
floor, you produce food or throw a ball, etc, as a reward, that is R+. If you first showed them
the treat or ball and then asked for the sit, that is a bribe.
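
The difference is purely one of order, which a short sketch can make explicit. This is only an illustration
of the two cases described above; the function name and the event labels ("cue", "behaviour", "reward
produced") are made up for the example.

```python
def classify_reward_timing(sequence: list) -> str:
    """Label an ordered list of events as a bribe or as R+ (hypothetical helper).

    The list must contain the three labels "cue", "behaviour" and
    "reward produced", in the order in which they happened.
    """
    cue = sequence.index("cue")
    behaviour = sequence.index("behaviour")
    reward = sequence.index("reward produced")

    if reward < cue:
        # The treat or ball was shown first, then the sit was asked for.
        return "bribe"
    if reward > behaviour:
        # The reward only appeared once the bottom hit the floor.
        return "R+"
    # Any other ordering is not one of the two cases described in the text.
    return "not covered here"


print(classify_reward_timing(["cue", "behaviour", "reward produced"]))
print(classify_reward_timing(["reward produced", "cue", "behaviour"]))
```

The same three events appear in both calls; only the position of the reward changes the label.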

Bribes often have a sequence of 'bribe (shown) - reinforcing event/stimulus (bribe given) - aversive event'
that backfires easily.

An example of bribery:

A dog owner shows their dog a treat while the dog is running around the park and says "what's this?" (bribe
presented). The dog comes back, eats the treat (bribe given) and is immediately put on the lead and taken
home (aversive event). Very soon the dog decides to forgo the treat when it is offered because s/he has
learned that it precedes an unwanted event (putting lead on and going home!) and it is not worth it.

All dogs get wise to bribes eventually, so bribery is not an effective training method.

If you are thinking of attending a class or a course of one to one training with a professional trainer, it is a
good idea to attend a class without your dog first and to ask them what quadrants they use.

If they cannot tell you (they do not know what you are talking about or speak in pseudo-scientific language;
pack leader, energy, etc), if they use the unethical quadrants (usually because they are not skilled or
knowledgeable enough to use only the ethical ones and have gotten used to falling back on hurting or scaring
the dog to gain compliance when they get out of their depth) or if they tell you that they are using the ethical
quadrants (R+ / P-) but you actually think that what they are doing is functioning as positive punishment or
negative reinforcement (I have seen trainers state that they use 100% positive reinforcement when in fact
they use prong collars and choke chains which obviously use P+ / R-) then walk away! And quickly!

Classical (or Pavlovian) conditioning is the other way that sentient beings learn.

This kind of learning happens through associations, pairing or predicting. Our dogs learn that a lead (a
piece of rope or nylon that means nothing the first time they see it) predicts that they are going for a
walk. They enjoy their walks so they start to become happy and excited on seeing their lead, a previously
neutral stimulus that now has a positive conditioned emotional response (CER).

We also experience this effect. Anyone who has ever set a specific ring tone for a loved one and one for their
boss will quickly start to FEEL completely differently on hearing those two different ring sounds, that meant
nothing to them before.

Dogs (and probably all animals) attach much more easily to negative experiences. This is a survival
mechanism, as something that causes you pain or scares you may be a threat to your survival so you need to
avoid it. It takes far fewer pairings of a neutral stimulus with something aversive to create a negative CER
than it does to create a positive CER by pairing it with a reinforcing stimulus.

It is important to remember the saying "Pavlov (the man who discovered the effects of classical conditioning)
is always on your shoulder".

This means that you cannot employ operant conditioning (the quadrants) without taking classical conditioning
into account. Classical conditioning is always happening.

We use this to our advantage when we use clicker training (pairing the sound of the click with food so that
we can use the sound to mark, and then reward, very specific behaviours).

If a trainer or owner uses aversives the dog will be forming negative CERs with whatever happens to be
around at the time s/he is punished. For example they can start to feel scared and stressed (and more likely
to resort to aggression) about the environment (training hall, show ring, etc) it happens in, the equipment
used to deliver the punishment, the person delivering the punishment (owner or trainer), seemingly random
objects (I treated a dog who was fearful of bin bags because he had been corrected with a choke chain when he
pulled forward to sniff them, and had associated that pain with the bin bags) or even people or animals who
happen to be around when they are punished (strangers, other dogs, etc). This is very dangerous, as fear
creates aggression (as the dog tries to create space from the thing that it fears using increasingly extreme
behaviour).

When you are knowledgeable about the science of learning you can make informed decisions about what you
do and do not want your dog to be exposed to and advocate for them effectively.