
What are the contributions of different visual cortical areas to analysing a visual stimulus?


The visual cortex is one of the most investigated, and most important, subjects in
work on the cortex, if not in the entire field of neuroscience. Up to half of the entire
cortex is concerned with analysis of vision, and our recognition of different areas has
grown at a staggering rate. One general feature seems to be that there is a clear
hierarchy between different areas, with clear pathways for processing of certain
aspects of the image. Much of our information comes from experiments on Macaque
monkeys. PET scans, as well as various clinical evidence, have shown that the
pathways in humans are largely similar.

The two main properties of any object seen in an image are what it is, and where it is.
These two needs are met by two major pathways in the visual cortex. To achieve this,
various other aspects of the object have to be established. For example, one would
want to know its direction and its distance for localisation. As for what the object is,
one would want to be able to see its shape (its edges) and its colour, for example.
Furthermore, one might want to see whether it was moving, and if so, in which
direction that was occurring. And when this is done, these various aspects have to be
incorporated into the single coherent image we take for granted whenever we open
our eyes.

The three major subcortical targets for the visual afferents are the pretectum, the
superior colliculus, and the lateral geniculate nucleus. These structures themselves do
not fall within the scope of this essay, but from here fibres project to the major
cortical areas, and in particular the lateral geniculate nucleus is the terminal through
which ninety percent of retinal axons synapse to supply the visual cortex.

In contrast to cells of the lateral geniculate nucleus, which as mentioned in the
previous essay are very similar to the retinal ganglion cells, the cells of the primary
visual cortex are the first cells to demonstrate significant functional differences. This
is found in Brodmann’s area 17. This area is also known as the striate cortex, since it
has a prominent stripe in layer 4: the stria of Gennari. Yet another name for it is V1.
Similarly, Brodmann 18 and 19 have been subdivided into V2 through 4 and V5
respectively, an example of how our recognition of the different areas has blossomed.

As with all neocortex, the primary visual cortex is made up of six layers. The major
input into the visual cortex is via layer 4. The major excitatory cell types are the
pyramidal and spiny stellate cells; the inhibitory cells are the GABAergic smooth
stellate cells. The functions of the cells were first established by Hubel and Wiesel.
The small spots of light that are so important in the higher levels of the retina are
not so important here.
Instead, cells respond to lines of certain direction (and sometimes length), colour, or
serve to combine input from both sides of the visual field.

The cells are divided into simple and complex. Simple cells respond best to a bar of
light with a defined orientation. They have excitatory and inhibitory regions, except
these are not shaped like an archery target (like the retinal ganglion and bipolar cells)
but consist of a central excitatory strip with antagonist surrounds. Indeed, the term
“simple” cell is applied to any whose receptive properties can be readily mapped out
using small spots of light. The complex cells do not have these excitatory and
inhibitory areas. They respond best to a bar or line of a specific orientation, anywhere
in their field. Indeed, they often respond better to a moving bar. The end-stopped
complex cells respond best to a bar in a particular orientation, which is of a particular
length. It is easy to imagine how the input of a number of lateral geniculate
nucleus cells (spots) might combine to give a cell excited by (spots in a) line.
Similarly, each complex cell receives input from a number of simple cells. The input
for complex cells typically comes from both eyes. Although most will have
input from areas corresponding to the same part of the visual field, others have pairs
of inputs which do not exactly correspond. Presumably this allows for stereoscopic
(binocular) vision, which is one of the important depth cues.
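
The way spot detectors might combine into a line detector, and simple cells into a complex cell, can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the difference-of-Gaussians "spot", the image size, the spacing of the spots and the use of a maximum to pool simple cells are all choices made here for clarity, not measurements or a claim about the actual wiring.

```python
import math

SIZE = 21  # side of the toy image in pixels (arbitrary assumption)


def spot(y, x, cy, cx, sigma_c=1.0, sigma_s=2.0):
    """Centre-surround 'spot' detector (difference of Gaussians) centred at (cy, cx)."""
    d2 = (y - cy) ** 2 + (x - cx) ** 2

    def gauss(sigma):
        return math.exp(-d2 / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

    return gauss(sigma_c) - gauss(sigma_s)


def simple_field(column):
    """Receptive field of a model simple cell: spots whose centres lie on one vertical line."""
    centres = range(4, SIZE - 4, 2)
    return [[sum(spot(y, x, cy, column) for cy in centres) for x in range(SIZE)]
            for y in range(SIZE)]


def simple_response(field, image):
    """Weighted sum of the image under the field, with negative drive cut to zero."""
    drive = sum(field[y][x] * image[y][x] for y in range(SIZE) for x in range(SIZE))
    return max(0.0, drive)


def complex_response(image, columns):
    """Model complex cell: the strongest of several simple cells at different positions."""
    return max(simple_response(simple_field(c), image) for c in columns)


def bar(position, vertical=True):
    """One-pixel-wide bar of light, vertical (at column) or horizontal (at row)."""
    return [[1.0 if (x if vertical else y) == position else 0.0 for x in range(SIZE)]
            for y in range(SIZE)]


if __name__ == "__main__":
    field = simple_field(10)
    print("simple cell, vertical bar on its axis:", simple_response(field, bar(10)))
    print("simple cell, horizontal bar:", simple_response(field, bar(10, vertical=False)))
    print("simple cell, vertical bar shifted:", simple_response(field, bar(13)))
    print("complex cell, vertical bar shifted:", complex_response(bar(13), range(6, 15)))
```

In this toy version the simple cell responds strongly only to a bar of the right orientation lying on its excitatory strip, whereas the complex cell keeps responding when the bar is moved sideways within its field.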

These cells are responding to just some of the information available to them. They are
concentrating on detecting edges: this contributes to recognition without localisation.
Functionally, this means that when we look at a large featureless object, such as a
wall, we are not “seeing” the uniformity of the wall directly, but are instead merely
detecting that each bit looks the same as the bit next to it, until we get to its edges.
Simple and complex cells in V1 receive input from both P and M pathways, although
the information from the two is at this stage kept separate. David Marr suggested that
the information at this stage could be used to form the “primal sketch” of an object, a
2D cartoon like representation of its contours.

The neurons responsible for these orientation specific responses in V1 are arranged
into columns. These are about 30 to 100 μm wide, and 2 mm deep. The fact that there
were cells in columns sharing the same receptive field, and responding to related
stimuli, was first discovered by mapping with microelectrodes. Also, radiolabelled
2-deoxyglucose differentiates between active and inactive cells, and application of this
to the visual cortex will, with suitable stimulation, show a line of stripes on X-ray film.
Cells in each column all recognise the same orientation of bar, and will share
information with each other. As one moves across the surface of the cortex, the
orientation changes gradually. After about 0.75 mm one comes back full circle to cells
responding to the original orientation. Thus there are a series of bands, running
perpendicularly to the columns across the cortex, within each of which there are cells
that respond to any orientation. Among these orientation columns are so-called ocular
dominance columns. These respond to one eye, as has been established by injecting
labelled amino acids into one eye and seeing where they end up, and have some
importance for binocular vision. Given this system of arrangement, columns that
between them respond to both eyes and all orientations (for a small, specific part of
the visual field) will lie very close to each other. Such a set is termed a hypercolumn.
Within each hypercolumn there are four blobs, seen when staining with cytochrome
oxidase. These imaginatively named areas sit in layers II and III of visual cortex,
respond to different colour stimuli, and their receptive fields do not have a specific
orientation. They are driven by one eye. There are also horizontal connections
between the different columns, due to pyramidal cell axons. These seem to allow cells
of different columns but with similar functions to communicate with each other.
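
The figures quoted above for the orientation columns can be combined into a rough estimate of how finely orientation is sampled. The short calculation below uses only the column width (30 to 100 μm) and the 0.75 mm cycle given in the text; the assumption that "full circle" means a rotation of 180 degrees (a bar looks the same after half a turn) is made here and is not stated in the essay.

```python
CYCLE_UM = 750.0                    # about 0.75 mm for one full cycle (from the text)
COLUMN_WIDTHS_UM = (30.0, 100.0)    # range of column widths (from the text)
FULL_CYCLE_DEG = 180.0              # assumption: a bar's orientation repeats every 180 degrees


def columns_per_cycle(width_um):
    """How many columns of this width fit into one orientation cycle."""
    return CYCLE_UM / width_um


def orientation_step_deg(width_um):
    """Average change in preferred orientation from one column to the next."""
    return FULL_CYCLE_DEG / columns_per_cycle(width_um)


if __name__ == "__main__":
    for width in COLUMN_WIDTHS_UM:
        print(width, "um columns:", columns_per_cycle(width), "per cycle,",
              orientation_step_deg(width), "degrees per column")
```

On these assumptions a cycle holds somewhere between about 7 and 25 columns, so neighbouring columns differ in preferred orientation by roughly 7 to 24 degrees.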

Thus cells in V1 are arranged in columns, which between them analyse orientation
and colour, and lay the foundations for binocular vision. So this area is important for
the initial analysis of the image. Information passes to Brodmann’s areas 18 and 19,
output to the cortex coming from pyramidal cells of all layers above 4C. From here,
there are two separate pathways. There are pathways to the posterior parietal cortex
(dorsal), which is responsible for complex localisation, and inferotemporal regions
(ventral), which is responsible for complex recognition. The functions of these two
areas were discovered by Ungerleider and Mishkin in a series of experiments where
they created cortical lesions in animals and analysed their behaviour. The P and M
pathways, whose functional characteristics were discussed in the previous essay,
remain distinct in V1. It used to be thought that the P pathway goes to the inferior
temporal cortex (recognition), with the M pathway running to the posterior parietal
cortex (localisation), although it now seems that there does not exist such a clear cut
distinction. Although there are cross connections between the two pathways,
particularly in V2 (see below) but also for example in terms of inputs to the blobs, it is
unclear whether these cross connections actually provide sensory input, or merely
modulation.

From V1 the output goes first to V2, which is a subdivision of area 18. Above, it was
explained that the blobs of the hypercolumns were recognised by staining with
cytochrome oxidase. This is not its only use. If V2 is stained with cytochrome
oxidase, there are parallel thin and thick stripes, separated by unstained inter-stripe
regions. The blobs project to the thin stained stripes, neurons in layer 4B project to
thick stained stripes, and the other cells of V1 project to the unstained areas. The cells
on the M pathway project from layer 4C to 4B in V1, and from there to the thick
stripes. Cells on the P pathway would seem to project to the blobs and then to the thin
stripes, although this is the subject of debate.

The M pathway runs from V1 to V2 to middle temporal (MT or V5) to medial
superior temporal (MST). These latter two areas are highly specialised for the analysis
of motion. MT is the most studied of the extrastriate visual cortical areas. The
recognition of motion is achieved by comparing the position of the object at various
points. Given that in principle any area of the visual cortex could perform this
function, it seems in some ways surprising that there exists this pathway: perhaps it is
due to its importance? In any case, experiments have shown that lights in separate
positions can be switched on and off to create the illusion of motion (like old
fashioned LED display boards, with special effects). If the same pathways were
responsible for position and motion, presumably this illusion would not be created.

MT is one of the best studied areas of visual cortex outside V1, and indeed has some
important similarities to V1. Its cells are also arranged in columns, and 95% of them
respond to motion. But now these cells respond to directional motion. Their response
is hardly altered by the shape or colour of the object. As one moves across the
columns, the direction of motion responded to gradually changes, just as in V1 the
orientation preferentially responded to changes. Admittedly, a significant proportion of
the cells in V1 also do this, but allow themselves to be fooled, with the optimal speed
for their response in many cases dependent on the shape of the object that is moving.
In V5, there are no such “problems”, as mentioned above. This again contributes to
the evidence that these cells represent the major motion sensing area of visual cortex.
Lesions of MT impair the response to motion, whereas in damage to the visual areas
of the temporal cortex, leaving the motion pathways intact, sometimes patients will be
clinically blind but can still respond to motion: the phenomenon of blindsight1. Most
of the cells here respond to a moving change in luminance, as would be expected from
a predominantly M cell driven pathway, but some also respond to a moving edge of
colour. Once again, this confounds the idea of strict separation of M and P pathways
at this stage.

1 Discussed in Phantoms in the Brain, VS Ramachandran.

These cells have quite large receptive fields, often ten times larger than those in V1.
Small receptive fields can lead to problems in detecting motion. As an example, take
the following, known as the “aperture problem”. If one sees a moving pattern of
parallel lines through a small aperture, one tends to see the lines as moving at right
angles to their own orientation, whatever the true motion of the pattern. So one
responds to the apparent motion of the individual lines, rather than the motion of the
piece of paper on which the lines are drawn. This is one reason why receptive fields
have to be large, to accommodate the complex moving patterns of the real world. But
obviously they will never be large enough to completely encompass a pattern. So the
idea of a two stage processing of motion has arisen, to explain this aperture paradox.
In the first stage, neurons that respond to a specific axis of orientation respond to
elements of the pattern that are moving in that direction. Then, the information from
these various neurons is analysed in the second stage. Indeed, in experiments by
Movshon, who recorded the response of MT to a moving plaid pattern, about 20% of
the cells seemed to respond to the direction of motion
of the entire plaid, rather than the individual lines of the plaid. This provides
experimental evidence for the two stage processing.
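
The logic of the two stages can be made concrete with a small calculation. In the sketch below, the first stage measures, for each set of parallel lines, only the speed at right angles to the lines (all that a small aperture reveals); the second stage combines two such measurements to recover the motion of the whole pattern. Solving the two constraints as simultaneous equations (often called "intersection of constraints") is an assumption made here for illustration; the essay does not say how the second stage actually performs the combination, and the function names are hypothetical.

```python
import math


def normal_speed(velocity, normal_deg):
    """Stage one: speed of a set of lines measured along their normal (at right angles to them).

    velocity is the true (vx, vy) of the pattern; normal_deg is the direction of the normal.
    """
    a = math.radians(normal_deg)
    return velocity[0] * math.cos(a) + velocity[1] * math.sin(a)


def pattern_velocity(normal_deg_1, speed_1, normal_deg_2, speed_2):
    """Stage two: recover (vx, vy) of the pattern from two stage-one measurements.

    Each measurement gives one equation, v . n = speed; two differently oriented
    sets of lines give two equations, solved here by Cramer's rule.
    """
    a1, a2 = math.radians(normal_deg_1), math.radians(normal_deg_2)
    n1x, n1y = math.cos(a1), math.sin(a1)
    n2x, n2y = math.cos(a2), math.sin(a2)
    det = n1x * n2y - n1y * n2x
    if abs(det) < 1e-12:
        raise ValueError("parallel lines: the aperture problem cannot be resolved")
    vx = (speed_1 * n2y - speed_2 * n1y) / det
    vy = (n1x * speed_2 - n2x * speed_1) / det
    return vx, vy


if __name__ == "__main__":
    true_velocity = (1.0, 0.0)  # a plaid drifting to the right
    s1 = normal_speed(true_velocity, 45.0)
    s2 = normal_speed(true_velocity, -45.0)
    print("component speeds:", s1, s2)
    print("recovered pattern velocity:", pattern_velocity(45.0, s1, -45.0, s2))
```

Each component on its own suggests motion along its own normal; only the combination of the two gives the direction in which the plaid as a whole is moving.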

Inherent in the idea of being able to detect motion is being able to tell how far away
an object is. This was one of the key features of an object’s image, highlighted at the
beginning of this essay. The various ways in which the eye performs this have been
understood since the Renaissance, when they were exploited to trick the eye for more
realistic art, and were even formally set out by Leonardo. In addition to the
stereoscopic vision mentioned earlier, which relies on two eyes and processing in V1,
there are a number of monocular cues. These include calibration against objects of
known size, occlusion (which objects block out other objects), linear perspective
(parallel lines meet at the vanishing point), size perspective (two similar objects
appearing at different sizes), change in textures at distance, shadows and illumination,
and parallax effects as we move our heads (which objects in the visual field then
appear to move and by how much).

The ventral cortical pathway runs from V1 to V2, through V4 and through to the
inferior temporal cortex (IT). As already stated, it detects object form. At all stages of
the pathway, connections tend to be reciprocal, so there is a substantial amount of
feedback. As with the dorsal pathway, there are substantial connections between the
sides of the brain, via the corpus callosum. As one moves along the pathway, and
reaches progressively higher levels, projections become more diffuse, in the sense that
there is no longer any sort of point to point representation of the image on the retina.
Intriguingly, at these higher levels there is increased expression of proteins that are
known to be involved with synaptic plasticity and long term memory, so perhaps our
memory of objects stems from here? Visual agnosia (Greek: “without
knowledge”) is a blanket term referring to a wide range of visual disorders
whose common theme is an inability to recognise objects, despite other lower order
visual functions remaining intact. These disorders have been linked to the inferior
temporal cortex. Interestingly, in patients with lesions of these areas, there is a
difference between whether the lesions are on the right or left hand sides. Lesions on
the right side result in visual agnosia, while lesions on the left normally allow patients
to retain the ability to recognise faces (the so called “granny cells” of V4 perhaps
respond to individual faces, and hopefully at least one of them to granny’s face2!).
Kluever and Bucy initially observed that removal of temporal lobes in monkeys
resulted in, among other problems, serious visual impairments. The area responsible
for this was eventually narrowed down to IT.

Cells in V2 respond to the same three things as V1, namely the colour, contour and
any ocular discrepancies. There are however also spot cells, which respond to both
colour and small dimensions of stimuli, which is perhaps symptomatic of the higher
order processing that is occurring. Cells in V4 are certainly responsive to colour (it
was previously thought that this was their sole function), but they also respond to both
the width and length of bars in specific orientations, and indeed some cells respond to
altogether more complicated stimuli such as irregular borders. Like in the MT of the
dorsal pathway, their receptive fields are considerably larger than cells below them.
They also have antagonistic surrounds: here they are known as “silent surrounds”.
Lesions of this area in the Macaques lead to a loss of ability to detect patterns and
shapes, but the effects on colour are quite subtle. Just as the eye must
distinguish between albedo and luminance for monochromatic light, the eye needs to
be able to detect colours regardless of the ambient light. This ability, termed colour
constancy, is lost when V4 is lesioned, at least in monkeys.
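
One simple way of seeing how colour could be judged independently of the ambient light is to compare each surface with the average of the scene around it. The sketch below does this with three made-up surfaces and two made-up illuminants; dividing each channel by its scene average (a von Kries style normalisation) is an assumption chosen for illustration, and nothing here is meant as a description of what V4 actually computes.

```python
def render(reflectances, illuminant):
    """Light reaching the eye: surface reflectance times illuminant, channel by channel."""
    return [tuple(r * i for r, i in zip(surface, illuminant)) for surface in reflectances]


def normalise(scene):
    """Express each surface relative to the average of the whole scene, per channel."""
    n = len(scene)
    means = [sum(pixel[c] for pixel in scene) / n for c in range(3)]
    return [tuple(pixel[c] / means[c] for c in range(3)) for pixel in scene]


if __name__ == "__main__":
    # Hypothetical (red, green, blue) reflectances of three surfaces.
    surfaces = [(0.8, 0.2, 0.2), (0.2, 0.7, 0.3), (0.5, 0.5, 0.5)]
    daylight = (1.0, 1.0, 1.0)   # hypothetical neutral illuminant
    tungsten = (1.5, 1.0, 0.6)   # hypothetical reddish illuminant
    print("raw, daylight:", render(surfaces, daylight)[0])
    print("raw, tungsten:", render(surfaces, tungsten)[0])
    print("normalised, daylight:", normalise(render(surfaces, daylight))[0])
    print("normalised, tungsten:", normalise(render(surfaces, tungsten))[0])
```

The raw signals from a given surface differ under the two lights, but the normalised values agree, because a change of illuminant scales every surface in the scene by the same factor and so cancels in the ratio.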

From here information passes to the inferior temporal cortex. This has been the focus
of a huge amount of work, since it seems that it is responsible for much of our visual
memory. A particularly interesting area is face recognition. The cells almost
invariably demonstrate very complex receptive properties, with large receptive fields,
sometimes encompassing the entire visual field. This is exactly what would be
expected from these very high level cells. Also, the receptive field almost always
includes the fovea, which again is what would be expected for the fine detail that IT
presumably responds to. The cells will tend to recognise their stimulus regardless of
where it is in the visual field: this is the property of “response invariance”. They also
start to display responses related not just to the sight of an object, but also to its
memory: for example, repeated exposure to the same object (face) might decrease
the response of the cells that recognise it.

Some cells of IT will respond to hands (and will fire regardless of what orientation the
hand is in). Others respond to faces, some preferentially when the face is full on,
others when it is in profile. Some respond to facial expressions. Others respond to the
distance between the eyes, and some seemingly to eye contact: it is no accident that
this is such an important social signal. At some point there is a question about what
exactly these cells are using to recognise their stimuli. It is easy enough to understand
how responses to dots of light can be built up from the individual photoreceptors,
lines from the dots and even moving bars from the lines. But surely there cannot be
receptors for everything we might encounter: such as pea receptors, receptors for the
letter A, and receptors for the letter A in a fancy font! So the cells must be stripping
away the unessential information, although Zigmond and Bloom state that “attempts
by various workers to determine whether simpler features within complex objects are
crucial for eliciting the response have shown that is true for only some [my italics] of
those TE [a region within IT] neurons that respond maximally to complex stimuli.”

2 An appealingly whimsical idea, which assumes that individual neurons are responding to entire faces.
Sadly, the evidence is not yet very strong for this idea. These neurons may only be sensitive to certain
distinguishing features of a face. This is discussed in more detail below.

Thinking about faces specifically, since this is where a lot of the research has been
done, there has been some debate about whether the cells that respond to faces are
responding to elements truly unique to faces, or whether they are responding to more
generalised cues (it has been suggested that at least part of the response can be
mimicked by two dots and a straight line for the mouth3). For the moment at least, it
seems as though they are responding to something unique to faces: certainly while
pictures produce similar responses in these cells, no other stimuli have yet been tested
which produce the same magnitude of response4. Given this, there is then the question
of how much can be abstracted from the work on face recognition to the rest of IT.
Are these cells a typical example of the type of recognition that is occurring, or are
they something that has evolved separately, relating to the immense importance of
facial recognition in human society?

The visual cortex is extensive, and has been very widely studied. Modern techniques
such as PET scanning are allowing us to move investigations from Macaque monkeys
at least partly into humans. And with fields such as facial recognition remaining so
intriguing, but so poorly understood, it is certain that visual cortex will continue to
excite and surprise us for many years to come.

3 Kandel and Schwartz
4 Although even if this is indeed the case, we will never know whether this is due to just a lack of
ingenuity on the part of the experimenters choosing the stimuli. A problem that also applies more
generally to defining the responses of many sensory systems.
