
Cognitive Psychology

Perception:
The visual system

Dr Domenica Veniero
Learning objectives

This is the first part of a small series on visual perception

Learning objectives for this part:

• Describe the nature of light

• Label the anatomy of the eye

• Explain photosensitive cells and their connections


What is light?

The basis of visual perception is light.


Light is a wave (or stream of particles) of electromagnetic energy.
We are only sensitive to a very narrow range of wavelengths.

Most wavelengths are invisible to us.

Too long and it goes straight through us; too short and it is damaging to us!
What is light?

A quick definition: degrees of visual angle

In vision science we like to refer to size as a function of the size of the visual field.
Degrees of visual angle (or just deg) are a useful measure for this.

Watch the first 30s of this video:

https://youtu.be/MMiKyfd6hA0

Measuring the size of an object in deg means you do not need to know how far away it is to know how much of the visual field it will take up – super useful.
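
For reference, the underlying geometry is simple trigonometry (the formula is standard; the thumb example below is just illustrative):

```python
import math

# An object of physical size s viewed at distance d subtends a visual
# angle of 2*atan(s / (2*d)).
def visual_angle_deg(size, distance):
    """Visual angle in degrees; size and distance in the same units."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# A ~3.5cm-wide thumb held ~57cm from the eye spans ~3.5deg, and it keeps
# spanning ~3.5deg of the visual field whatever it is held in front of.
print(round(visual_angle_deg(3.5, 57.0), 2))  # -> 3.52
```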
Anatomy of the eye
[Labelled diagram of the eye:]
• CORNEA: transparent membrane allowing light to enter
• ANTERIOR CHAMBER: aqueous humor
• PUPIL
• IRIS: coloured opaque muscle that regulates light entry
• CILIARY BODY: changes lens shape
• LENS
• POSTERIOR CHAMBER: vitreous humor (+10mmHg)
• FOVEA
• OPTIC DISC: blind spot
• RETINA: photoreceptor layer
• CHOROID LAYER: light-absorbing pigment
• OPTIC NERVE: axons projecting to visual cortex
• SCLERA: for protection
The retina

Photosensitive cells called photoreceptors are scattered across the retina: rods and cones.

The distribution is not uniform: there are more cones in the fovea and more rods in the periphery.
The retinal mosaic

The optic nerve and retinal artery enter the eye above the retina, creating a physiological blind spot.

Demo 1:
1. Close left eye
2. Fixate on letter F
3. Alter distance from eye until middle dot vanishes (will be ~20cm from screen)

Demo 2:
1. Repeat steps 1 & 2 above
2. Find distance where the line appears continuous
Cells in the retina
CHOROID LAYER (absorbs light)

RODS:
• 120 million per eye
• Periphery
• Monochrome
• Low resolution
• Many to 1 with ganglion cells

CONES:
• 6 million per eye
• Fovea
• Colour
• High resolution
• 1 to 1 with ganglion cells

Horizontal cells (H): inhibit adjacent cells (lateral inhibition)

Bipolar cells (B): form modified ‘2nd image’

Amacrine cells (A): form modified ‘3rd image’

Retinal ganglion cells (RGCs): produce APs that project to CNS via optic nerve

More info: nba.uth.tmc.edu/neuroscience/s2/chapter14
The duplex theory

Goal: to catch photons and signal the presence of light


Problem: must be able to operate in all the different conditions we are likely to encounter

The luminance range runs from looking (almost) into bright sunlight down to a dark, starless night.
This range covers nine orders of magnitude – a billion-fold difference!

How?
• The duplex theory: we have two separate systems that deal with different light levels:
  • A cone-driven, photopic (light) system: high acuity, low sensitivity
  • A rod-driven, scotopic (dark) system: low acuity, high sensitivity, colour-blind
• Constriction/dilation of the pupil: 1–8mm, reduces/increases the light by 64x.
• Adaptation: the photoreceptors increase the amount of photosensitive protein they have, increasing sensitivity up to 1000x.
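
A quick sanity check of those numbers, assuming the light admitted scales with pupil area:

```python
import math

# Pupil diameters from the slide: 1mm constricted, 8mm dilated.
area = lambda d: math.pi * (d / 2) ** 2
print(area(8.0) / area(1.0))  # -> 64.0, the slide's 64x factor

# Pupil (64x) plus adaptation (1000x) gives ~64,000x - still far short of
# the billion-fold luminance range, hence the two (duplex) systems.
print(64 * 1000)  # -> 64000
```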
The eye: summary

Part 1 is over!

A quick summary:

• The human eye has evolved to be sensitive to a narrow section of the electromagnetic
spectrum
• We have 2 types of photosensitive cells:
• Rods: monochrome, peripheral, good in the dark
• 3 types of cone: give us colour sensitivity, foveal, good in bright light

• Photoreceptors pass their signals through a network of other cells that modify the image before
it projects to the brain through the optic nerve

Next time: Retinal Ganglion Cells (RGCs)


Cognitive Psychology

Perception:
The visual system

Dr Domenica Veniero
Learning objectives

This is the second part of a small series on visual perception

Learning objectives for this part:

• Describe a RGC and its associated receptive field

• Explain the principle and uses of centre-surround antagonism

• Explain how RGCs detect edges in their receptive field


Retinal ganglion cells (RGCs)

There are about 127 million photoreceptor cells, but their signals are processed by only about
1.25 million RGCs
Problem: How can RGCs collate this information in a way that retains the essential features of the
image?
It’s like summarising a 1,000 word essay into 10 words, without losing any essential information!

To answer this question, we must identify what type of visual stimulus the RGCs best respond to.
This is done using single cell recordings (usually in primates).

It’s a technique that allows us to measure the AP firing rate of an RGC in response to a specific
stimulus that is placed in a specific place in the visual field
Single cell recording

We place a tiny electrode next to the axon of an RGC.

The microelectrode records electrical changes in the axon.

The eye must be fixed, so the extra-ocular muscles are paralysed, and the head is fixed.

We can move the position of a light around until we begin to influence the RGC activity – this area of sensitivity is called the receptive field (RF) of the RGC.

Receptive fields of RGCs were first mapped in a cat – see Kuffler (1953) and Wiesel & Hubel (1960)
Receptive fields

Things you need to know about RGCs and their RFs:

1. RGCs’ AP firing pattern is influenced by the light in their RF.
2. They have a baseline firing rate, so the firing rate can both increase and decrease from its baseline.

[Figure: RGC firing rate over time around light onset (or offset)]
Receptive fields

3. The RGC receptive field has a centre region and a surround region. These regions show a centre-surround antagonism.
   • If a light is turned on in the centre region, the RGC firing rate will increase.
   • If a light is turned on in the surround region, the firing rate will decrease.
   • This is an ON-centre cell.
4. OFF RGCs have the opposite firing pattern to ON RGCs.


Receptive fields

Things you need to know about RGCs and their RFs:

1. RGCs’ AP firing pattern is influenced by the light in their RF.
2. They have a baseline firing rate, so the firing rate can both increase and decrease from its baseline – this is useful.
3. The RGC has a centre region and a surround region:
   • If a light is turned on in the centre region, the RGC firing rate will increase.
   • If a light is turned on in the surround region, the firing rate will decrease.
   This is called centre-surround antagonism.
4. There are both ON and OFF RGCs – the OFF RGCs have the opposite firing pattern to ON RGCs.
Receptive fields

Firing pattern increases or decreases depending on where in the RF the light hits:
• ON response: increase
• OFF response: decrease

ON and OFF RGCs have opposing response patterns.

Light anywhere outside the RF will not influence the firing activity.

[Figure: increased vs reduced neural firing for light falling on the centre or surround of an ON-centre and an OFF-centre RGC RF]
RF size

RFs are smallest in the fovea (0.01mm) and have a low neural convergence factor – they provide high spatial resolution.

10mm from the fovea, RF sizes have increased by a factor of 50. They collect info from a much larger area of the retina (high neural convergence factor) – they provide low spatial resolution (but good light sensitivity).

As well as eccentricity-dependent variation in RF size, we also see some local variation.
RF size

[Figure: RF sizes mapped across the visual field, relative to the central gaze position]


Centre-surround antagonism

Question: What purpose does this centre-surround organization serve?

Answer: It helps identify edges in images.


Objects can be distinguished from the background by sudden changes in the reflected light

Uniform field: no response (top left)

But if an edge is positioned


appropriately, there will be a
response from the cell regardless
of its orientation
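
A minimal sketch of why this works, modelling the RF as a difference of Gaussians (a standard idealisation of centre-surround antagonism, not taken from the slides; the widths and weights are illustrative, chosen so centre and surround cancel over a uniform field):

```python
import numpy as np

x = np.linspace(-3, 3, 6001)                    # 1-D position across the RF
dx = x[1] - x[0]
centre = np.exp(-x**2 / (2 * 0.3**2))           # narrow excitatory centre
surround = 0.3 * np.exp(-x**2 / (2 * 1.0**2))   # broad inhibitory surround
rf = centre - surround                          # ON-centre DoG profile

def response(luminance):
    """Net drive: the luminance profile weighted by the RF, integrated."""
    return np.sum(rf * luminance) * dx

print(response(np.ones_like(x)))          # uniform field: ~0, no response
print(response((np.abs(x) < 0.3) * 1.0))  # spot on the centre: > 0
print(response((x > 0.5) * 1.0))          # edge across the RF: != 0
```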
Centre-surround antagonism

Each RF will respond optimally to a bar of a particular width, depending on its size.

A small RF responds best to a small object (good spatial resolution); larger RFs respond better to larger objects.
Centre-surround antagonism

This allows RGCs to be sensitive to phase – the relative position of a grating in its cycle.

Other RGCs will be sensitive to 90 and 270deg phases, but not to 0 and 180deg like this one. So an array of RGCs will encode the whole range.
ON and OFF systems

So far we’ve discussed cells with an excitatory centre and an inhibitory surround – the ON system.

However, we also have a complementary array of RGCs with the opposite centre-surround arrangement – the OFF system.

There are approx. the same number of RGCs in each system, and they cover the same retinal areas.

They are maintained as separate systems throughout the visual system, at most stages of visual processing.

They allow the visual system to detect increments and decrements in light levels.

RGCs also have another division: M, P, and K RGCs. More on that later.
Centre-surround antagonism

A visible consequence of centre-surround antagonism: the Hermann Grid

Illusory spots will appear at the intersections of the grid – dark spots on the left and light spots on
the right. If you look directly at them, they disappear.
What are they?
Centre-surround antagonism

Consider the response of an ON-centre RF when an observer is looking at the fixation point.

[Figure: Hermann grid with two ON-centre RFs – one on a street between intersections (large response), one centred on an intersection (small response)]

The two pictured RFs are being excited in the centre to the same level, but the surrounds are getting different amounts of stimulation. The cell centred on the intersection will respond less, since the surround is balancing out the centre. Thus it responds less and the observer will perceive a relatively dark patch at that location.

If you look directly at a spot, smaller RFs are involved, which fit entirely within the intersection!
Subdivisions of RGC

There is another important way we divide RGCs by behaviour and structure: Magnocellular, Parvocellular, and Koniocellular.

Name                  M cells               P cells                K cells
Proportion            10%                   80%                    10%
RF size               Large                 Small                  Medium
Retinal position      Peripheral            Central                Peripheral
Axon diameter         Thick                 Thin                   Thin
Conduction velocity   Fast                  Slow                   Slow
Best objects          Large, low contrast   Small, high contrast   Large, low contrast
Colour sensitivity    No                    Yes                    Yes
Temporal response     Transient             Sustained              Sustained
Info extracted        Where is it?          What is it?            Where is it?
from image            Does it move?         What colour is it?     How bright is it?
RGCs: summary

Part 2 is over!

A quick summary:

• Retinal ganglion cells play a large role in compressing visual information before passing it on
• They operate with a centre-surround antagonistic mechanism
• The size of their receptive field depends (mostly) on eccentricity
• There is an ON and an OFF network of RGCs that work separately but in parallel
• There are subcategories of RGC: M, P, and K cells.

Next time: Visual processing in the Lateral Geniculate Nucleus


Cognitive Psychology

Perception:
The visual system

Dr Domenica Veniero
Learning objectives

This is the third part of a small series on visual perception

Learning objectives for this part:

• Describe the path that visual information takes away from the eye

• Explain the layout and layers of the LGN

• Describe the connections between the optic tract and the LGN
The visual pathway

• Retinal ganglion cells produce APs that project visual information through the optic nerve

• Projections from the contralateral hemifield switch sides at the optic chiasm

• The optic tract projects to the lateral geniculate nucleus (LGN) in the thalamus

• From there, optic radiations project to the visual cortex
The visual pathway

Question: Why do nerve fibres from each eye cross?

Answer: to allow for visual hemifield-specific processing in later visual areas

If the subject fixates on the text, then the thumb (to the left of fixation) will fall
on:
• The temporal retina of the right eye, and
• The nasal retina of the left eye

In the optic tract, the fibres from these areas converge and project to the same
hemisphere of the brain (in this case the right hemisphere)

Then the right hemisphere processes visual information from the left visual field, and the left hemisphere from the right visual field.
The LGN

The lateral geniculate nucleus is part of the thalamus, a hugely important relay station within the brain.

~80% of axons from RGCs project to the LGN; the rest go mostly to the superior colliculus (eye movements) and hypothalamus (circadian rhythm).

The LGN has layers:

Cells in L1 and L2 are larger than those in other layers and get input from M cells – the magnocellular layers.

Cells in L3–6 are smaller and receive input from P cells – the parvocellular layers.

Cells with input from K cells are sandwiched between the magnocellular and parvocellular layers.
The LGN: connections

The optic tract (after the optic chiasm) has nerve fibres from each eye, e.g. temporal retina from the right eye and nasal retina from the left.

Ipsilateral fibres (that have not crossed) input to the LGN in layers 2, 3, and 5.

Contralateral fibres (that have crossed) input to layers 1, 4, and 6.

Each hemisphere of the brain has an LGN, so each eye provides contralateral input to one LGN and ipsilateral input to the other.
The LGN: retinotopy

Each layer of the LGN is retinotopically organized: there is an orderly map of the retina.

Adjacent areas of the retina are represented in adjacent regions of each layer of the LGN.

The spatial relations of the retina are therefore preserved. This is called the retinotopic map.

Each of the six LGN layers has a retinotopic map, and the matching regions are stacked on top of each other. So, the same retinal region is represented in the same location within each layer.

LGN cells are therefore sensitive to specific regions of visual space, i.e., they have a receptive field too.
LGN cells

Cells in the LGN have receptive fields, like RGCs do

These RFs are also circular in shape, with a centre-surround configuration and two
opposing types (ON and OFF). Both are found in magnocellular and parvocellular layers.

The inhibitory influence of the surround is stronger than in RGC RFs – this amplifies
differences between neighbouring regions of the RGC RF

This subdivision of input to the LGN (M vs P) suggests a sub-division in visual function:

Acuity: LGN RF sizes vary in each layer, the smallest devoted to the fovea. The largest are in M layers, so the best spatial resolution is in P layers.

Temporal sensitivity: M layer cells respond well to rapid change in light intensity. P cells are much slower to respond. The M cells are much more sensitive to motion.

Colour: Nearly all P cells are colour sensitive. This colour-based centre-surround is the basis of colour opponency (e.g. red centre vs green surround). M cells respond to all colours and so are not colour sensitive.
The LGN: summary

Part 3 is over!

A quick summary:

• Visual information comes from each eye, and about 50% switches sides at the optic chiasm
• The LGN is an important relay station. It also has feedback loops from the visual cortex that
help it modulate signal quality
• The LGN is retinotopically organised
• The LGN segregates information in magno- and parvocellular systems for cortical processing

Next time: Visual processing in the cortex


Cognitive Psychology

Perception:
The visual system

Dr Domenica Veniero
Learning objectives

This is the fourth part of a small series on visual perception

Learning objectives for this part:

• Describe the structure of the visual cortex

• Explain how visual space is organized in V1

• Explain how orientation sensitivity works

• Describe some important types of cortical cells, and explain what they are sensitive to.
Structure of the visual cortex

The LGN projects to the primary visual cortex (V1)

• V1 has about 100 million cells per hemisphere.
• V1 is organised in layers.
• LGN input comes into V1 at layer 4:
  • Magnocellular into upper layer 4
  • Parvocellular into lower layer 4
• Then they connect to upper and lower layers
• K cells go straight to layers 1-3


Ocular dominance columns

Cells in layer 4 are driven by the input from one eye only.

If a particular block of cells receives input from the right eye, the cells above and below it will also receive input from the right eye.

But adjacent blocks of cells on either side will receive input from the opposite eye.

This creates a pattern of ocular dominance columns that penetrate perpendicular to the surface. They are organised in patterns visible in staining.

[Figure: reconstruction of ocular dominance columns from macaque V1]
Cortex retinotopy

Like the LGN, adjacent regions of the retina are mapped onto adjacent regions of the cortex –
the retinotopic map is maintained

However, the distribution of cells associated with each retinal region is distorted: 80% of cortical
cells are devoted to the central 10deg of the visual field

This disproportionate weighting of cortical power is referred to as cortical magnification

This mirrors how the vast majority of RGCs are devoted to the fovea.

Since foveal RFs are small relative to peripheral RFs, many more are required to cover the same area of visual space.
Functional properties of cortical cells
Cortical cells have a few similarities and a few differences when compared with the
RGCs and LGN cells from which they get their input

Similarities:
• They maintain the retinotopic map
• They aren’t particularly sensitive to the illumination level
• They respond best to abrupt changes in luminance (lines, bars)

Differences:
• Selectivity to orientation
• They are sensitive to size in a different way
• They can be binocular
• They are more sensitive to colour
• They are sensitive to direction of motion

I will go through the differences in turn…


Orientation selectivity

Unlike RGC and LGN cells, most cortical cells have a marked preference for particular orientations.

Cortical RFs are organised and shaped differently, so that they obtain a maximum response to a line of a specific orientation.

[Figure: circular centre-surround RFs in retina & LGN vs elongated, oriented RFs in cortex]
Orientation selectivity

Like with the ocular dominance columns, staining can tell us about the orientation preference of an
array of cortical cells

In this example, the stained cells are those that respond to vertically orientated lines

By doing this for all orientations, we can create a map of orientation preference, called pinwheels
Size and location sensitivity

Cortical cells come in different types, some of which are sensitive to location, and some to size

Simple cell: Optimum response to an appropriately oriented stimulus at a certain position within the RF. Phase sensitive.

Complex cell: Optimum response to an appropriately oriented stimulus placed anywhere within the RF. Phase insensitive.
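
One standard way to formalise this distinction is the "energy model" (a textbook idealisation, not from the slides): a simple cell computes a phase-sensitive weighted sum, while a complex cell pools a quadrature pair of such sums and so becomes phase insensitive. A rough sketch:

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 1001)
even = np.cos(2 * x) * np.exp(-x**2)   # oriented RF, even-symmetric phase
odd = np.sin(2 * x) * np.exp(-x**2)    # same RF, shifted 90deg in phase

def simple(stim):
    return max(0.0, float(np.dot(even, stim)))    # half-wave rectified sum

def complex_cell(stim):
    return float(np.dot(even, stim))**2 + float(np.dot(odd, stim))**2

for phase in (0.0, np.pi / 2):
    grating = np.cos(2 * x + phase)    # the same bars, shifted within the RF
    print(round(simple(grating), 1), round(complex_cell(grating), 1))
# Simple: strong at its preferred phase, silent when the bars shift 90deg.
# Complex: responds at roughly the same level for both phases.
```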
Size and location sensitivity

Cortical cells come in different types, some of which are sensitive to position, and some to size

Hypercomplex cell: Optimum response depends not only on orientation but also on contour
length. Maximum response occurs when the bar length matches the width of the receptive field.
This is “end stopping” or “length-width inhibition”.
Binocularity

Cells in V1 layer 4 are monocular.

However, the other layers to which they send signals are binocular – they can be driven by either eye.

If a visual stimulus is delivered to each eye in turn, the cell will respond; if the same stimulus is then delivered to both eyes, the response is more vigorous.

Binocular cells have two RFs – LE & RE. They are matched in type, and respond to similar preferred orientations, locations, and directions of motion. I.e. their response will be maximal when corresponding regions in each eye are stimulated by stimuli of similar size and orientation.
Colour selectivity

Colour sensitive cells are concentrated in the cortical blobs. Each blob is centred on an ocular dominance column.

Within a blob, cells will either have red/green opponency or blue/yellow opponency – these are not mixed in a single blob.

Blobs receive their input from lower layer 4, which gets its input from the parvocellular LGN layers.

These cells show no preference for a particular orientation, unlike most other V1 cells.
Direction selectivity

A large proportion of cortical cells display preferences for stimuli moving in a particular direction.

Motion sensitive cells usually respond only to one direction, i.e. they do not respond to motion in the opposite direction. This is direction selectivity.

Simple cells respond to slow motion; complex cells respond to faster motion.
Columns and hypercolumns

The visual cortex is composed of columns of cells. Each column consists of cells with the same orientation preference and the same ocular dominance preference.

A set of 18-20 columns (~1mm) traverses a complete range of orientations and ocular dominance. This collection of adjacent columns is referred to as a hypercolumn.

This structural arrangement is known as the ice cube model.

Each HC contains the neural machinery required to simultaneously analyse multiple attributes of an image (orientation, size, colour, direction of motion) falling on a localized region of the retina.

A foveal HC covers ~0.05deg of the visual field; at 10deg eccentricity, each one covers ~0.7deg. Neighbouring HCs look at neighbouring regions of the retina.
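
A back-of-envelope check using the slide's numbers (taking each hypercolumn to span roughly the 1mm quoted above):

```python
hc_mm = 1.0          # cortex spanned by one hypercolumn (~18-20 columns)
print(hc_mm / 0.05)  # fovea: ~20mm of cortex per deg of visual field
print(hc_mm / 0.7)   # 10deg eccentricity: ~1.4mm per deg
# Cortical magnification: central vision gets vastly more cortex per degree.
```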
V1: summary

Part 4 is over!

A quick summary:

• Visual information enters V1 at layer 4 (of 6)


• The majority of V1 is dedicated to central vision due to cortical magnification
• V1 cells can be simple, complex, or hypercomplex
• V1 cells share some attributes with LGN and retinal cells, but others are unique to
the cortex
• V1 cells are organised into columns and hypercolumns based on ocular
dominance and orientation

That’s it for session 1!


Next session: Visual perception: colour, objects, and faces
Cognitive Psychology

Perception:
The visual system

Dr Domenica Veniero
Learning objectives

This is the fifth part of a small series on visual perception

Learning objectives for this part:

• Describe how we see in colour

• Evaluate colour opponency and its uses

• Explain chromatic processing at each stage in the visual pathway

• Evaluate some defects of colour vision


Why is colour important?

Colour vision aids discrimination and detection

Important in many key tasks:


• When choosing what to eat
• Scene segmentation
• Visual memory
• Mating rituals
• Camouflage
What is colour?

Colour is a purely psychological phenomenon, i.e. it is entirely subjective.

Objects only appear coloured because they reflect different wavelengths of light from different parts of the visible spectrum.

Colour is therefore a property of our neural apparatus. For an object to appear coloured, we need to have the correct photoreceptors and neurons.

Colour perception arises from the ability of certain light rays to evoke a particular pattern of neural responses in the visual system.

Important colour terms:
Hue (H): the quality that distinguishes red from blue, i.e. the hues of the rainbow.
Brightness (V): the perceived intensity of light (sometimes lightness).
Saturation (S): characterizes a colour as pale or vibrant.

https://isle.hanover.edu/isle2/index.html?ch=Chapter08
Metamers
A metamer is a sensory stimulus that is perceptually identical to another stimulus, but
physically different.

For example, Newton demonstrated a light that appeared orange was indistinguishable from
a light produced by combining a red light and yellow light – they are colour metamers.

This suggests the visual system is producing identical neural responses to physically
different stimuli.

Conversely, if you can discriminate between two lights (they appear different), then the
neural representation of these stimuli must differ.

If you cannot tell visual stimuli apart, then the physical property that makes them different is
not being encoded by the visual system!

NB: This is not unique to the visual system! The way we encode mint is the same way we encode cold (temperature) – they are flavour metamers.
Colour coding in the retina

We know that the subjective experience of colour and the physical properties of the incident
light waves are connected in some way.

Moreover, physically different wavelengths can result in identical colour experiences.

So what is the connection between the physical stimulus and our perception of colour?

Since the photoreceptors are the first stage in the processing of visual information, it is likely
that the answer lies here. Furthermore, we already know rods are colourblind, so specifically
we need to look at cone cells and the properties of their photopigments.
The principle of univariance

Consider a hypothetical photoreceptor with a single photopigment.

The graph shows the proportion of light absorbed by the photoreceptor (expressed as a percent of the peak absorption), as a function of the wavelength (λ) of the incident light.
• At λA, it absorbs about 25% of the incident light
• At λB, it absorbs about 50% of the incident light

If the intensity of λA is the same as λB, then there will be a different response from the cell to the two lights.

But if the intensity of λA is about 2x the intensity of λB, then the response from the cell will be the same to both.

Therefore, any single photopigment is colour-blind, since an appropriate combination of λ and intensity can result in identical neural responses – this is the principle of univariance.
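
The same point as a minimal sketch (the 25%/50% absorption figures are the slide's; intensities are arbitrary units):

```python
# A photopigment's response depends only on the photons it catches,
# i.e. absorption(wavelength) x intensity.
def photon_catch(absorption, intensity):
    return absorption * intensity

print(photon_catch(0.25, 100))  # light A: 25.0, differs from light B...
print(photon_catch(0.50, 100))  # light B at equal intensity: 50.0
print(photon_catch(0.25, 200))  # light A at 2x intensity: 50.0 - now
                                # indistinguishable from B for this cell
```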
So how do we see more colours?

To differentiate between wavelengths and intensities, we therefore need a comparison of signals from two or more cone classes, each with a unique spectral sensitivity.

As a rule, wavelength discrimination improves with the number of cone classes:
• Some non-primate mammals that rely heavily on sound and smell only have 2 pigments – they are dichromats;
• Some birds that rely heavily on vision can have up to 5 pigments – they are pentachromats.

Humans are trichromats – we have 3 cone types:
• S cones (short λ, blue); peak absorption at 420nm
• M cones (medium λ, green); peak absorption at 530nm
• L cones (long λ, red); peak absorption at 565nm

The balance of neural activity from each of these receptors is sufficient to represent the vast array of natural colours we encounter.
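
A sketch of why a second (and third) cone class breaks univariance: each wavelength now yields a ratio of S:M:L responses that no intensity change can mimic. The Gaussian curves are a rough stand-in for real cone spectra; the peaks are the slide's, the 40nm bandwidth is an assumption:

```python
import numpy as np

peaks = {"S": 420.0, "M": 530.0, "L": 565.0}  # nm, from the slide
sigma = 40.0                                  # nm, assumed bandwidth

def cone_responses(wavelength, intensity=1.0):
    return {name: round(intensity * np.exp(-(wavelength - peak)**2
                                           / (2 * sigma**2)), 3)
            for name, peak in peaks.items()}

print(cone_responses(480.0))                 # bluish-green light
print(cone_responses(560.0, intensity=2.0))  # brighter yellowish light
# The S:M:L ratios differ, and scaling the intensity scales all three
# equally - so the pair of lights stays discriminable.
```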
Retinal topography: cone mosaic

This image is a recreation of the layout of cone cells on the retina, colour coded for pigment.

Interesting points:
1. There are far fewer S cones than M or L
2. There are no S cones in the fovea
3. They are randomly distributed, but clumping is common
4. The layout and relative proportions of cones are largely individual, e.g. some will have roughly equal amounts of L and M cones, while others will have an L:M ratio of 4:1
Retinal topography: cone mosaic

Topographical patterns of cone cells vary between people.

These are the first images of the 3 cone types in the living human retina (Roorda & Williams, 1999).

These images were created using a super high-performance optical system called a TSLO, and a retinal adaptation paradigm using high-intensity coloured lights.
Colour opponency

Opponent coding theory: Colours are grouped into opposing pairs (blue and yellow, red and green).

It’s easy to imagine a reddish yellow, but a reddish green?

This is evident from colour afterimages: adapting to one colour produces its opposite in the afterimage.

This is purely a result of our physiology; there is no link between opposing colours in the physical spectrum!

Demo this for yourself on the next slide.
Colour opponency
Fixate on the centre for ~30s.
Colour opponency
See how the image is in perfect colour?
Physiology of opponency

Parvocellular RGCs have chromatically opponent RFs.

So the centre may be excited by red light, while the inhibitory surround is excited by green light (top left).

There are both ON and OFF versions of this (top right).

This arrangement also exists for blue/yellow (middle row).

There are also cells that respond to red light switched on anywhere in the RF, or green light switched off anywhere in the RF (bottom left).
Colour tuning in the LGN

LGN layers 1 & 2 get their input from M RGCs: input for the achromatic luminance channel.
Layers 3-6 get theirs from P RGCs: input for the two chromatic channels.

At the LGN, nearly all cells prefer stimuli that are modulated along the cardinal directions of colour space, i.e. red-green/cyan (0–180 hue angle) or blue/purple-yellow (90–270).
Colour tuning in visual cortex

Unlike the LGN, cortical cells show preference for a wide range of hue angles, not just the cardinals.

Tuning width remains fairly consistent across cortical areas (V1, V2, V3).

Some cortical cells have double opponent RFs. The centre is excited by red and inhibited by green, while the surround is excited by green and inhibited by red.
Colour constancy

Colour constancy is the ability to assign a fixed colour to an object even though the actual spectral information entering the eye changes in different illumination conditions.

This figure has been illuminated with coloured light, greenish on the left and reddish on the right. Due to colour constancy, all 8 squares appear the same. Actually, columns 1 and 4 are the same. They appear slightly different because of the different backgrounds. This change in perceived colour is chromatic induction.

Squares in 1 & 3 look similar because the chromatic difference to the background is identical in these columns.

Double opponent colour cells can signal chromatic contrast differences and contribute to colour constancy and induction.
Colour vision disorders

Colour vision deficiency can be congenital or acquired.

Acquired CVD (cerebral achromatopsia) is typically due to damage to V4.

Congenital CVD is an X-linked recessive gene:
• XY chromosomes: 8% chance of colour blindness
• XX chromosomes: 0.5% chance of colour blindness

People affected in this way have normal cone numbers, but fewer photopigments available.

Congenital CVD usually affects M or L cones, rather than S cones:
• If M or L cones are missing, then green and red will be confused
• If S cones are missing, blue becomes hard to distinguish
Colour vision disorders

What does this look like?

Top left: normal
Top right: missing M cone pigment
Bottom left: missing L cone pigment
Bottom right: missing S cone pigment

Lacking a pigment makes you a dichromat (rather than a trichromat). It is more common to be an anomalous trichromat (all three are present, but one does not work optimally).
Colour vision disorders

How do we detect colour vision deficiency?

Ishihara colour plates:
• Each dot in these images is different only in hue – not luminance
• The combination of numbers that are visible to a patient is indicative of the type of deficiency
Colour perception: summary

Part 5 is over!

A quick summary:

• Colour perception aids visual judgements


• Colour is a purely psychological property that comes from the interaction of
wavelength and our visual machinery
• Cells in the LGN show chromatic opponency and are tuned along cardinal hue axes
• Cortical cells are tuned to a wide variety of hues and can show double opponent
responses
• An individual may be missing a photopigment (or one may be atypical), resulting in
colour vision deficiency

Next part: object perception


Cognitive Psychology

Perception:
The visual system

Dr Domenica Veniero
Learning objectives

This is the sixth part of a small series on visual perception

Learning objectives for this part:

• Evaluate the issues the visual system must overcome to perceive objects

• Explain the necessary components of a model of object recognition

• Critically evaluate existing models of object recognition


Many-to-one mapping

Problem: There are many separate objects that occupy the same cognitive category; this is many-to-one mapping.

Every single one of these is a rocking chair.
Every single one of these is a letter A.
Every single one of these is Michael Caine.

Goal: Our object recognition system must be able to deal with different representations of the same object.
Many-to-one mapping

This doesn’t only apply to structural differences, but also differences in:
• Pose
• Distance
• Lighting
• Position
• Viewpoint
• Etc.
Many-to-one mapping

We also need to be able to cope with a degraded image, such as:
• Occlusion
• Noise
• Distortion
• Filtering
Where does object recognition take place?

The two-stream model (Ungerleider & Mishkin, 1982)

• Post V1, information is transmitted via two pathways

• Ventral stream: the ‘what’ pathway
  • V1 > V2 > V4 > IT cortex
  • Associated with object recognition, memory

• Dorsal stream: the ‘where’ or ‘how’ pathway
  • V1 > V2 > V6/DM > V5/MT
  • Associated with motion, location, saccadic control

• More complex networks link to the frontal lobe and other areas
Object agnosia
Damage to the ventral stream can create deficiency in object recognition.

For example, Dr P from Oliver Sacks’ The Man Who Mistook His Wife for a Hat:

[Dr Sacks hands the patient a red rose]
Patient: “About six inches in length… A convoluted red form with a linear green attachment.”
Dr Sacks: “Yes, and what do you think it is?”
Patient: “It lacks the simple symmetry of the Platonic solids, although it may have a higher symmetry of its own. …I think this could be an inflorescence or flower.”
Dr: “Could be?”
Patient: “Could be.”
Dr: “Smell it…”
Patient: “Beautiful! An early rose. What a heavenly smell!”
~ Oliver Sacks, 1985

Primary visual functions, as well as verbal description and logical reasoning, are intact, but object recognition has been lost.


Models of object recognition

Several models have been created to attempt to explain how the visual system constructs a perception of a recognised object.

“Trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: It just cannot be done.” ~ David Marr, 1982

David Marr asserted that a model of object recognition necessarily includes a computational approach. He proclaimed that the model should have three levels of analysis:
• Computational: what problem is the system solving, and why?
• Algorithmic: what representations and processes are used?
• Implementational: how is this physically realised in neural machinery?

We will discuss four models of object recognition.


1: Template-matching models

The simplest model for object recognition:

You have a detector for (e.g.) the letter A. When an object that matches that template appears in the RF of this detector, it signals.

For this to work, we would need a detector for every possible orientation, scale, font – requiring an implausibly large brain!

The computer vision equivalent is the machines that read cheques; they work well, but the letters must be in exactly the expected location and orientation.
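
A toy version of that idea (the 3x3 "letters" and the shift test are invented for illustration):

```python
import numpy as np

templates = {
    "T": np.array([[1, 1, 1],
                   [0, 1, 0],
                   [0, 1, 0]]),
    "L": np.array([[1, 0, 0],
                   [1, 0, 0],
                   [1, 1, 1]]),
}

def match_score(image, letter):
    # Overlap between the image and the stored template.
    return int(np.sum(image * templates[letter]))

print(match_score(templates["T"], "T"))       # 5: letter exactly as expected

shifted = np.roll(templates["T"], 1, axis=1)  # same letter, one pixel over
print(match_score(shifted, "T"))              # 3: the score drops - the
# detector would need a separate template for every position, scale, font...
```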
2: Feature detection models

e.g. Selfridge’s Pandemonium model (1959)

A built-up template model. Selfridge described it in terms of demons with different jobs (sticking with the letter example):

1. The feature demons look at the image, and simply write down how many examples of their feature they see (e.g. tuned to horizontal lines)

2. The cognitive demons shout if they think that combination of features applies to their letter; the more confident they are, the louder they shout

3. The decision demon listens to the cognitive demons and decides who is shouting the loudest, providing that as the perceived letter
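
A toy sketch of the three demon layers (the features and expected tallies are invented for illustration):

```python
# 1. Feature demons: tally the simple features found in the image.
seen = {"horizontal": 1, "vertical": 1, "oblique": 0}

# 2. Cognitive demons: each letter's expected feature tally.
letters = {
    "T": {"horizontal": 1, "vertical": 1, "oblique": 0},
    "A": {"horizontal": 1, "vertical": 0, "oblique": 2},
}

def shout(expected):
    # The closer the observed features match expectation, the louder the shout.
    return -sum(abs(expected[f] - seen[f]) for f in expected)

# 3. Decision demon: reports whoever shouts loudest.
print(max(letters, key=lambda name: shout(letters[name])))  # -> "T"
```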
2: Feature detection models

e.g. Selfridge’s Pandemonium model (1959)

This model fits well with Hubel and Wiesel’s work, which suggests feature-detecting neurons.

Early implementations are crude:
• Feature demons are still using templates
• Still no information on configuration
• Still can’t distinguish between different versions of R

Modern view-dependent models are derived, in part, from models like this.
3: Structural description models

Marr & Nishihara (1978)


They believed the goal of the model is to describe the object unambiguously. Therefore the system must be invariant to transformations in viewpoint, illumination, etc. This means the system must know which properties are invariant under transformation, and how other properties might vary.

They developed criteria to consider for a good model of object recognition in high-level vision – Marr & Nishihara’s criteria:
1. Accessibility: the ease with which an object might be derived from the data (efficiency of the system)
2. Scope: the range of objects to which the models apply must be appropriate
3. Uniqueness: the same object should always result in the same unique description
4. Stability: the description should be stable to minor changes in the object
5. Sensitivity: the description should allow for discrimination between instances of the object
3: Structural description models

Marr & Nishihara (1978)


Other considerations:

1. Should the coordinate system be viewer-centred or object-centred?
Object-centred negates the problem of transformation variance.
[Figure: viewer-centred vs object-centred coordinate systems]

2. What are its primitives? I.e. the basic units of information in its representation.
These volumes only require axis and size info – this maintains specificity without requiring too much storage space.
3: Structural description models

Marr & Nishihara (1978)


Other considerations:
3. How is that information organised into an object description?

This is their proposed process by which the image becomes a set of volumes (the first stages viewer-centred, the final one object-centred):
1. Retinal image: intensity and wavelength of light at each point.
2. Zero crossings, blobs, edges, bars, ends, curves, boundaries.
3. Surfaces, with local orientations and discontinuities in depth.
4. Composed of 3D “primitive” volumes, organised hierarchically by scale.
3: Structural description models

Marr & Nishihara (1978)


Other considerations:
3. How is that information organised into an object description?

The object is described in terms of its axes and the volumes around them.

This description is modular and hierarchical. This means the object can be described at many scales, allowing for identity matching and discrimination.

Now we have the representation; we just need recognition.
3: Structural description models

Marr & Nishihara (1978)

Recognition: the “model store”

This is a more complicated process than simple matching. It allows for higher specificity and accuracy.

Even if your object perception doesn’t exactly match anything in your model store, you’ll:
• Find the closest match, and
• Have sufficient information on the object from the image and your memory to help you interact with it

[Figure: hierarchical model store – Cylinder; Limb / Quadruped / Biped / Bird; Thick limb / Thin limb; Cow / Giraffe; Human / Ape; Ostrich / Dove]
3: Structural description models

Biederman (1987)

Similar to Marr and Nishihara, but proposed a set of primitive volumes into which objects are decomposed (not just cylinders).

The volumes are called geons (geometric ions). They have specific properties that are retained in any 2D projection, such as:
• Collinearity
• Symmetry
• Parallelism
• Co-termination

[Figure: “Some non-accidental differences between a brick and a cylinder”]
3: Structural description models

Biederman (1987)

He estimated that there are ≤ 36 of these geons.

Therefore there are 36² = 1,296 pairs of geons, which can be attached in different ways and at different relative sizes. He proposed that there are ~75,000 possible 2-geon objects.
3: Structural description models

Biederman (1987)

He also gave experimental evidence for these geons in human object recognition.

The right column is missing key (2D) geons, while the centre column is missing an equal amount of other contour info.

Object recognition was much poorer in the missing-geon condition (solid lines), particularly when presented only for a short time.

Physiological evidence has also been presented from neuronal recordings (Tanaka 1991, 2003; Hung et al., 2012).
3: Structural description models

Assessment of structural description models

Pros:
• Invariance is well explained
• Recognition relies on description rather than matching
• Graded representations cope with discrimination and generalisation
• Evidence that structural information matters to humans and to neurons

Cons:
• Extracting model parameters can be hard in real images (e.g. occlusion)
• Structural description is difficult for some objects (e.g. crumpled paper, campfires)
• Driven by theoretical desirability rather than behavioural or physiological evidence
4: View-dependent models

Bülthoff & Edelman (1992); Riesenhuber & Poggio (1999)

• Brute-force association: “I recognise that as a horse because I have seen a horse on many occasions, and it looks like that.”
• Uses a viewer-based coordinate system
• The primitives are sub-regions of the image:
  • Not the whole image (as in early template-matching models)
  • “Abstract features” that might consist of lines, curves, texture, colour, shading etc.
  • Feature-sensitive units combine into each other in a weighted way, getting more complicated (similar to Pandemonium)
  • Size and position invariant (because IT neurons have big receptive fields)
  • These feed into view-tuned object recognition cells
• Recognition by matching input to the closest stored view


4: View-dependent models

Bülthoff & Edelman (1992); Riesenhuber & Poggio (1999)

The main difference is a weighted approach between layers, rather than winner takes all.
4: View-dependent models
Evidence for view-dependent models

Human object recognition is not perfectly viewpoint invariant.

The viewing sphere: participants practiced recognizing objects from specific viewpoints (shown as black spots), then were tested at novel viewpoints (Bülthoff & Edelman, 1992):
• Interpolation: between previous viewpoints. Easiest.
• Extrapolation: beyond previous viewpoints but in the same axis. Medium difficulty.
• Orthogonal axis: from a completely new viewpoint. Hardest.

This is behavioural; physiological evidence also exists: Logothetis et al. (1995); Riesenhuber & Poggio (1999).
4: View-dependent models

Assessment of view-dependent models

Pros:
• Straightforward
• Minimises transformations that must be performed
• Newer models are based directly on what we know of physiology
• Abstract features are recombinable
• Good behavioural, physiological, and simulation-based evidence

Cons:
• Humans often show quite good generalisation across viewpoint, even for novel objects
• Still more memory intensive than e.g. the geon model
Object recognition: summary

Part 6 is over!

A quick summary:

• Object recognition is a complex process


• It must overcome changes in simple parameters like rotation and lighting
• It involves cells along the ventral stream / visual perception pathway
• There are several proposed models of object recognition, each with advantages
and disadvantages
• Evidence to support these models has been derived from behaviour and physiology

Next part: face perception


Cognitive Psychology

Perception:
The visual system

Dr Domenica Veniero
Learning objectives

This is the final part of a small series on visual perception

Learning objectives for this part:

• Describe the FFA and explain how it is connected with face perception

• Explain some behavioural and physiological experiments in facial recognition

• Critically evaluate whether facial processing is a unique version of object recognition
Why is face perception interesting?

Faces are uniquely rich in information:


• Identity, familiarity, age, race, gender
• Gaze direction, attractiveness, mood, communication

There are practical applications to this research:


• The Home Office has funded research into this since the 1970s
• CCTV and the police
• Passports and customs/immigration
• Security of your devices
• Facebook photo tagging

The question: Is facial recognition in humans a unique example of expertise, or are there special mechanisms in place?
What sort of object is a face?

One with extremely similar distractors:


• So it is within-class, not between-class recognition

A very changeable one:


• Rigid transformations (head movement, viewpoint)
• Non-rigid (expressions, speech)
• Shape and texture (aging)
• Colour (emotion, health, temperature, tan)
We are very good at facial recognition

Even when extremely distorted, I’m sure you recognise these!


We are very good at facial recognition
We even see faces where there are none!
This is known as pareidolia. Studies suggest all you need is a symmetrical noise pattern
with a natural distribution of spatial frequencies (Paras & Webster, 2013)
Where are faces processed?

The FFA (fusiform face area) is a face selective region, as shown by contrast studies, e.g. Kanwisher et al. (1997).

They compared BOLD signal for faces vs other objects; highlighted regions had higher signals for faces.
Where are faces processed?

The FFA (fusiform face area) is a face selective region.

Evidence also comes from physiology (Desimone et al., 1984): neural signaling in monkey FFA was highest for faces (of the same species). When scrambled or partially obscured, the response went down.

Now watch the FFA disruption movie on moodle: a man is about to receive epilepsy surgery. Before he does, they do an experiment: does the face he is looking at appear to change if they interfere with FFA? Stimulating FFA resulted in distorted facial perceptions!
Perspectives on face recognition

There are two main ideas on facial perception:

1: Faces are special – the domain specificity hypothesis:
We are born with dedicated mechanisms for facial recognition, which operate differently to those that serve typical object recognition.

2: Faces are not special – the expertise hypothesis:
Face perception simply shows us how general object recognition mechanisms work for objects we are extremely well-practiced at observing.

This is still an ongoing debate in visual neuroscience. I will go over the evidence in support of each case.
Domain Specificity hypothesis

Evidence 1: Neonatal face discrimination

Is there an innate ability to recognize faces?

Newborn babies prefer to look at face-like patterns more than non-face-like patterns (Johnson et al., 1991)

…But this might be a broader preference for top-heavy patterns (Simion et al., 2001)

…But babies as young as 1-4 days old seem to be able to tell their mother’s face from that of a stranger (Field et al., 1984)

<Sidenote> How can we tell?
Using habituation: show the baby photos of its mother until it gets bored. See how long it looks at a new photo (of a stranger); if it looks for longer, it is inferred that the baby was more interested because the face was new.
</Sidenote>
Domain Specificity hypothesis

Evidence 2: Prosopagnosia

Some people cannot recognise (exclusively) faces. They often have different gaze patterns (Schwarzer et al., 2007).

Acquired: damage to occipito-temporal regions (e.g. stroke, TBI). Although very rarely isolated completely to faces.

Developmental: can be hereditary (Schmalz et al., 2008). Can be very isolated to faces.

Although some people are also very good at distinguishing between faces (super-recognisers), so perhaps there is just a natural spectrum of ability.
Domain Specificity hypothesis

Evidence 3: the Inversion Effect

3a: Bistable ambigram face drawings (British artist Rex Whistler, 1905-44).
You can see the sullen police officer, but can you see the inverted face?

It’s much easier to see the second face (the surprised conductor) when it’s the right way up.
Domain Specificity hypothesis

Evidence 3: the Inversion Effect


3b: Pareidolia is orientation specific

I’m sure you can see the face on the right, but probably not on the left.
Domain Specificity hypothesis

Evidence 3: the Inversion Effect

3c: the Thatcher effect (Thompson, 1980)

The right image probably looks a little weird. It’s way more horrifying when the right way up!

Named because the original example used Margaret Thatcher.

We are more attuned to faces that are the correct orientation. Is that unique to faces? Do we have unique expectations for faces (e.g. light comes from above)?
Domain Specificity hypothesis

Evidence 4: Sensitivity to facial configuration

The inversion effect disrupts configural information more than featural (LeGrand et al., 2001)

Procedure:
• Target face appears for 200ms
• Second face is shown
• Task: Are they the same face, or different?

Stimuli: in one set, the spacings between features (configuration) have been changed; in the other, the spacings are the same, but the features themselves (eyes, mouth) have been changed.
Domain Specificity hypothesis

Evidence 4: Sensitivity to facial configuration

The inversion effect disrupts configural information more than featural (LeGrand et al., 2001)

Results: upright faces – configural 80% correct, featural 81% correct; inverted faces – configural 63% correct, featural 80% correct. Inversion disrupts configural more than featural information.

This is evidence of holistic processing: the inability to attend to one part of the face.

Further evidence: change one part (the mouth) and the whole face looks different.
Domain Specificity hypothesis

Evidence 5: Part-whole effect

Sub-parts of faces are not independently recognizable (Tanaka & Farah, 1993)

Training phase:
Participants were given a face to remember, either whole or scrambled: “This is Larry. Remember him.”

Testing phase:
The participants were given a distinguishing task, where one thing (nose) had been changed.
Domain Specificity hypothesis

Evidence 5: Part-whole effect

Sub-parts of faces are not independently recognizable (Tanaka & Farah, 1993)

Results:
Participants trained on the whole face were better at identifying the whole face. Participants trained on scrambled faces were better at identifying individual parts.

This is evidence that when given the whole face to learn, it was processed holistically. When trained on the scrambled face, face-specific mechanisms were not activated and the component parts were processed individually.
Domain Specificity hypothesis

Evidence 6: Composite effect

We can’t help but see the whole face (Young, Hellawell & Hay, 1987)

Who does this look like to you? Bruce Willis?

[Figure: composite face built from halves of David Cameron and Tony Blair]


Domain Specificity hypothesis

Evidence 6: Composite effect

We can’t help but see the whole face (Young, Hellawell & Hay, 1987)

Measured RT for identifying the top and bottom faces, either aligned or misaligned, upright or inverted.

[Chart: time to name (0–1400ms), aligned vs misaligned composites, upright vs inverted faces]

The composite effect slows RT for aligned faces, but only when they are upright.

More evidence of compulsory holistic processing for upright faces.

Perhaps we can’t encode configural relationships in upside-down faces.
Expertise hypothesis

Evidence 1: the effect of (un)familiarity

We are so much better at identifying people we have already seen (Jenkins et al., 2011)

How many different women are in this collage?
Expertise hypothesis

Evidence 1: the effect of (un)familiarity

We are so much better at identifying people we have already seen (Jenkins et al., 2011)

The two women are Dutch celebrities, so are more familiar to Dutch participants than British participants.

[Chart: distributions of the number of different women reported by British vs Dutch participants]

So facial recognition is heavily dependent on familiarity – you have practiced identifying these particular faces.
Expertise hypothesis

Evidence 2: the “other race” effect

Sensitivity to differences between faces seems to require specific experience (Shepherd et al., 1974)

People are better at remembering, more accurate at matching, and can make finer discriminations amongst faces of their own race rather than another.

Asian children adopted into European Caucasian families show the same recognition pattern as native Caucasian people (Sangrigoli et al., 2005).

More evidence of the importance of practice and familiarity.
Expertise hypothesis

Evidence 3: object inversion in experts

Orientation is more critical in situations where the participant has extensive practice in making subtle object discrimination (Diamond & Carey, 1986)

Tested experts in dog breeds on subtle differences in pictures of dogs, upright and inverted: “Which of these dogs was in the study image?”

[Figure: study upright / study inverted; test upright / test inverted conditions]


Expertise hypothesis

Evidence 3: object inversion in experts

Orientation is more critical in situations where the participant has extensive practice in making subtle object discrimination (Diamond & Carey, 1986)

Tested experts in dog breeds on subtle differences in pictures of dogs, upright and inverted.

Both dog experts and normal people were worse at recognizing faces when they were upside down. But only dog experts were worse at recognising dogs when they were inverted.

So the inversion effect applies to all things we are good at recognising.
Expertise hypothesis

Evidence 4: the part-whole effect in objects

Parts are often recognised better in their original context, not just faces (Gauthier & Tarr, 2002)

• Trained participants to recognise Greebles
• Showed them a target Greeble and told them which part of it to attend to (e.g. ‘ears’), then showed that part either in isolation or in its trained configuration
• Was that part the same or different to that part of the target Greeble?
• Accuracy was much better when the parts were in situ
• So the part-whole effect exists for objects too
Expertise hypothesis

Evidence 5: FFA activation in car experts

The FFA may simply be an area for expertise:

• Looked at FFA activation as a response to faces, animals, cars, and planes
• Most voxels preferred faces
• But the amount that these voxels were activated by cars (vs. animals) was correlated with how expert the person was with cars
• So the FFA may aid perception of images in which we are experts
Perspectives on face recognition

Summary of the main arguments:

1: Faces are special – the domain specificity hypothesis:
• Facial recognition appears to be innate
• Some people are exclusively bad at recognising faces
• Inverting a face makes recognition more difficult
• Faces may have compulsory holistic processing – it is hard to selectively attend to specific parts

2: Faces are not special – the expertise hypothesis:
• Familiarity with faces is important, for individuals and races
• Experts experience object inversion effects
• Parts of an object are also recognised better in their original environment
• FFA may support general expertise, not just faces
Face recognition: summary

Part 7 is over!

A quick summary:

• Faces are of immense social value, but pose a difficult problem for visual
neuroscience
• There is strong evidence in support of the idea that we have unique mechanisms for
the processing of faces
• There is also strong evidence that these effects are simply a result of us being very
practiced at identifying and assessing subtleties in faces
• What do you think?

That’s it for my lectures. Thanks for listening!
