Sensation and Perception6
Given that cells with all the receptive field types shown in 3.6-3.7, each for a full range of orientations, are needed for each of the myriad retinal patches, one can see that the striate cortex is a very elaborate biological image processing device indeed. We will examine its structure and function in Ch 9.

[Figure: A patch of retinal receptors. Receptive fields of a sample of simple cells with differing preferred orientations but all receiving their inputs from the same patch of retinal receptors.]

Are Simple Cells Feature Detectors?

[Figure 3.11 Simple cell ambiguity. a: Preferred stimulus: very brisk response. b: Preferred stimulus orientation but low contrast: weaker response than in a. c: High contrast just-off-vertical edge stimulus: some -s showing more inhibition than in a, fewer +s showing less excitation than in a; balance of excitation and inhibition the same as in b.]

[Figure 3.12 Interpreting simple cell responses in context. Row of cells with differing preferred orientations: the most active cell (firing at, say, 20 impulses per second) and cells with very low activity (say, 0-5 impulses per second).]
Chapter 3
two cells are sufficient to correctly resolve this ambiguity; Ch 11.)

This orientation-contrast example illustrates very nicely the idea of regarding each simple cell as delivering image measurements for interpretation, and not as an "edge detector" pure and simple. Simple cells can't serve as feature detectors because their individual responses are far too ambiguous.

The example of ambiguity just discussed involves two stimulus dimensions, orientation and contrast. We examine in detail in Ch 11 how to resolve ambiguities of that sort. We now consider how to compute the orientation of a stimulus when its contrast is held constant. Our goal is to show that there is a much better way of using the responses of a range of cells with different preferred orientations than simply to seek the most active one.

Computing Stimulus Orientation

Single cell recordings show that there are simple cells for only 18-20 different preferred orientations within each patch of retina. Hence, if cell responses were interpreted as suggested in 3.12, with the most active cell signalling the orientation of the input feature, then we could only see 18-20 different orientations. This implies that we would be limited to discriminating between lines differing in orientation only if their orientations differed by about 10°. That is, the 180° of orientation would be shared between, say, 18 orientation detectors with the peaks of their tuning curves (3.13, 3.14) separated by 180°/18 = 10°. But our perceptual capabilities are much better than this: we can manage discriminations of less than 0.25°. Clearly, there is a need for some method of interpolation between neighboring orientation measurements. By this we mean there must be a way for the brain to estimate stimulus orientations lying in between the 18 preferred orientations encoded by simple cells.

For example, compare the two situations illustrated schematically in 3.13 and 3.14. The input feature in 3.13 is vertical and this causes a symmetric distribution of simple cell firing rates with peak firing shown for the vertically-tuned cell. Either side of this peak response, simple cells whose preferred stimuli are bars with orientations of 80° and 100° are shown firing quite briskly, but not as fast as the vertically (90°) tuned cell. Cells with optimal orientations of 70° and 110° also fire, but relatively weakly, and the responses from cells tuned to 60° and 120° are very small indeed. For cells with optimal orientations further away from vertical than about ±30° the firing rates quickly reduce to resting discharge levels. This symmetric pattern of firing rates centered on the vertically tuned cell can be regarded as a "signature" profile of activity for the feature representation vertical bar. It would be the signature noted by the interpretive mechanisms whose task it is to decide what the simple cell measurements convey about the input.

Now study 3.14, which illustrates firing rates that arise for an input bar whose orientation is 92°. As you can see, the distribution is slightly skewed, not symmetric. The vertically tuned cell is still firing fastest but it has almost been caught up by the 100° cell, which is now firing well above the level of the 80° cell. That wasn't the case for the 90° input. Equally, the 110° cell is firing more briskly than the 70° cell. This skewed activity profile is the signature profile for a 92° bar.

This is a much more sophisticated approach to the interpretation of simple cell measurements (outputs) than simply finding the most active neuron and assuming its preferred orientation is the orientation of the input feature. What we now have is a system in which simple cells can be said to sample the stimulus dimension of orientation, and infer what is going on in the image from the pattern of simple cell outputs.

One advantage of this scheme for finding edge orientation is that it is a very neat way of avoiding more simple cells than are necessary to do the job. This makes it economical in terms of number of cells required. Another clever brain trick.

Integrating Channel Outputs: Weighted Means

A channel is defined as a population of cells that all have the same preferred value of a particular stimulus property. For example, if all the cells in a given population have the same preferred orientation then together they constitute a single orientation channel. Note that the cells in such a channel are not necessarily located close to one another. In fact, they can be distributed widely across the striate cortex (Ch 9). The examples shown in 3.13 and 3.14 illustrate orientation channels, but show only one single cell taken from each channel with
Seeing with Receptive Fields
[Figures 3.13 and 3.14: a row of cells, their receptive fields, and the tuning of the cells, including the response to a vertical (90°) stimulus of just one of the differently tuned cells shown above.]
a particular orientation tuning (90°, 80°, 70°, and so on).

We now explore further the idea of interpreting simple cell outputs by giving an example of one particularly simple way of doing it. This is to regard all the activities of the cells as a mass of data and work out the weighted mean of all these data. Skip these details if you prefer and go to the next section.

To keep things arithmetically straightforward, let each impulse in a given time interval of, say, 1 second be regarded as one data item. Also, we will consider computing a weighted mean from just three cells, with preferred orientations of 80°, 90° and 100°. The basic idea is to let each cell contribute to the computation according to its firing rate (output) in comparison with the other cells.

Let's first take the situation in which these cells are responding to a vertical bar (90°), and let's suppose their firing rates are as shown in the second column of 3.15. The total number of impulses in 1 second is 35 + 50 + 35 = 120. We now weight the contribution of each cell taking into account how much each cell is firing in comparison with the other cells. Thus, we weight the output of the 80° cell by the fraction 35/120 = 0.29, the 90° cell by 50/120 = 0.42 and the 100° cell by 35/120 = 0.29. You can think of this weighting as reflecting how much influence is to be given to each of the cells in computing the stimulus orientation they are dealing with.

The final column of the table multiplies the preferred orientation of each cell by its weighted firing rate, and these products sum to give the weighted mean of 90°. This is exactly as it should be, of course, for a 90° input: this is what we want the feature code to be saying when this particular symmetrical "signature tune" is "playing" in the orientation channels.

But now consider 3.16, which shows asymmetrical firing rates in the same three channels for a stimulus just-off-vertical, 92°. The weighted mean is now 92°. This output can be regarded as the result of interpolating between the preferred orientations of the three orientation channels to find the orientation of the stimulus. Progress.

This strategy of using weighted outputs from a set of channels has the advantage that it averages out the effects of noise in responses. Obviously, this will be better if more channels are used, for example, the full range shown in 3.13 and 3.14. We say more about the problems of noise in Ch 5.

[This simple scheme for calculating a weighted mean is used as an example to explain the basic idea. However, it would need to be refined in a practical system, as it collapses arithmetically for any orientation coded as 0°. This problem can be fixed by expressing angles trigonometrically, but we will not go into detail here.]

But you might say: the brain doesn't have a calculator for doing arithmetic, so how might it use its neurons to implement this type of interpolation calculation? The general answer is: the brain can "do" arithmetic using the processes of excitation and inhibition in combination.

Suppose the brain did do something along the lines of a weighted mean calculation, and then used just one neuron to encode each discriminable orientation. Given that we can distinguish orientations as little as 0.25° apart then that would entail having a few hundred neurons to encode the orientation of every edge feature in each patch of retina. Each such neuron would then be said to be a local code for just one particular orientation.

Alternatively, perhaps the brain doesn't do a weighted means calculation to decide which one of a set of neurons should become activated as the code for a given orientation. Perhaps instead the patterns of simple cell responses shown in 3.13 and 3.14 are used as a population code for the feature representation vertical bar present.

After all, our simple weighted mean calculation has demonstrated that the population of simple cells taken together has the orientation of the bar encoded in its activity pattern. Perhaps this distributed representation is sufficient as it stands for the uses the brain has for orientation data. If so, why bother going a step further and making the bar orientation explicit in a local code? This question raises some fundamental issues in trying to understand seeing and the brain. We return to them in detail in later chapters.

Coarsely Tuned Channels Are a Good Idea

We said above that simple cells can be viewed as channels sampling the stimulus dimension of orientation. It turns out that the basic principle underlying this sort of sampling scheme applies generally in vision.
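As a rough illustration of the weighted-mean interpolation worked through earlier for the 80°, 90° and 100° channels, the arithmetic can be written as a few lines of Python. This is only a sketch of the calculation, not a claim about how neurons carry it out. The function names are made up for the example, the firing rates for the 92° case are hypothetical values chosen so that the weighted mean comes out at 92° (they are not the entries of 3.16), and the doubled-angle version is just one common way of expressing angles trigonometrically so that the calculation no longer breaks down near 0°.

```python
import math

def weighted_mean_orientation(preferred_deg, rates):
    """Plain weighted mean: each channel's preferred orientation is
    weighted by its share of the total firing rate."""
    total = sum(rates)
    if total == 0:
        raise ValueError("no impulses: orientation is undefined")
    return sum(p * r for p, r in zip(preferred_deg, rates)) / total

def axial_mean_orientation(preferred_deg, rates):
    """Trigonometric variant (an assumption, not taken from the text).
    Orientation repeats every 180 degrees, so angles are doubled before
    averaging as vectors and the result is halved afterwards."""
    x = sum(r * math.cos(math.radians(2 * p)) for p, r in zip(preferred_deg, rates))
    y = sum(r * math.sin(math.radians(2 * p)) for p, r in zip(preferred_deg, rates))
    return (math.degrees(math.atan2(y, x)) / 2) % 180

channels = [80, 90, 100]        # preferred orientations used in the text
vertical_rates = [35, 50, 35]   # firing rates quoted in the text for a 90 degree bar
tilted_rates = [22, 52, 46]     # hypothetical rates for a bar just off vertical

print(weighted_mean_orientation(channels, vertical_rates))  # 90.0
print(weighted_mean_orientation(channels, tilted_rates))    # 92.0

# Channels straddling 0/180 degrees: the plain mean is misleading here,
# while the doubled-angle mean stays at (or within rounding error of) 0/180.
wrapped_channels = [170, 0, 10]
print(weighted_mean_orientation(wrapped_channels, vertical_rates))  # 52.5
print(axial_mean_orientation(wrapped_channels, vertical_rates))
```

With the three channels used in the text, the plain weighted mean gives 90° for the symmetric profile and 92° for the skewed one. For channels that straddle 0°/180°, only the doubled-angle version gives a sensible answer, which is the kind of refinement the bracketed note in the text alludes to.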
data needs to be interpreted before a proper feature description can be asserted, or their responses have to be treated as a population code.

One method for reducing the effects of noise is to average responses from many cells. The underlying assumption here is that the noise in any given cell will be independent of that affecting other cells. This means that the noise variations will tend to cancel out when an average is taken.

This is one reason why it may be a good idea for the brain to consider the responses of entire populations of cells when attempting to recover the parameters of the stimulus that caused those cells to respond. Using averaging to get around the problem of noise is explored in detail in Ch 5 in connection with the task of edge detection from noisy images.

Problem of Parameter Resolution

A question that we discuss in Ch 11 is: how few channels are needed to resolve the ambiguity in simple cell responses? It turns out that, in principle, only two, very broadly tuned. However, if we had only two cells to span the entire range of 180° then each cell would be very insensitive to most orientation changes. This is because a large part of each cell's orientation range would be on regions of the tuning curve that change very slowly with changes in input orientation, so that changes in stimulus orientation would cause very little change in output, 3.17. The output of such a very broadly tuned cell changes very little unless retinal line orientation falls on the flanks of the tuning curve. These flanks are where the slope of the tuning curve is greatest, and therefore where the change in cell output per degree of change in line orientation is greatest.

One way to ensure that most retinal line orientations coincide with this "sensitive" part of a tuning curve is to use a large number of cells with tuning curves which, taken together, tile the space of all possible orientations to yield adequate resolution, as in 3.18. This represents another reason for using many cells to encode orientation, which is independent of the noise problem above.

[Figure 3.18 Using many simple cells with different preferred orientations to "tile" the full stimulus orientation range. Horizontal axis: orientation (degrees). Annotation: examples of regions of peak sensitivity to stimulus orientation changes because these are regions with steepest changes in channel outputs.]

A curious side effect of this way of extracting orientation is that it predicts peaks and troughs in the system's sensitivity to orientation. A peak of high sensitivity should be found in the region where the two response curves are changing sharply. Troughs should arise in the regions covered by the top of each cell's response curve. This is because at the tops the change in response as retinal orientation changes is not as great.

This is paradoxical. It would, at first sight, be natural to expect maximum sensitivity for orientations falling on the highest point of each cell's tuning curve. But on careful examination, each cell is most sensitive to changes in orientation about half way down from the top.

So it is reasonable to ask: does human vision show these predicted peaks and troughs in orientation sensitivity? The answer is yes. Regan and Beverley reported experiments in 1985 on human orientation discrimination which confirmed the prediction (see Regan, 2000).

Can Templates Ever Work as Recognizers?

We started this chapter by considering the use of bar templates to detect stimulus bars. We discovered that the problem of response ambiguity bedevils their use, but this led us to a general principle: ambiguities can be resolved by drawing inferences from the outputs of many templates (channels).

Even so, simple templates can be made to work well as pattern recognition devices in some special

number template has a much more complex pattern than the simple bar feature. Moreover, bank check numbers are made more readily distinguishable one from another by using specially designed numerals with lines of different thicknesses to facilitate recognition.

But that trick alone would not be enough to get such templates to serve as pattern recognizers. The crucial added ingredient is using a special check scanning device which prevents complications arising from large variations in the input images in terms of the brightness, contrast, shape, size, and orientation of numerals. This permits a template recognition system that works well for the task it tackles.
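The earlier claim that a cell is most sensitive to orientation changes on the flanks of its tuning curve, rather than at its peak, can be checked numerically. The sketch below assumes a Gaussian-shaped tuning curve with a hypothetical preferred orientation of 90°, a width parameter of 15° and a peak response of 100. None of these values, nor the Gaussian shape itself, comes from the text. The code searches for the offset from the preferred orientation at which the slope of the curve is steepest.

```python
import math

PREFERRED = 90.0   # hypothetical preferred orientation (degrees)
SIGMA = 15.0       # hypothetical tuning width (degrees)
PEAK = 100.0       # hypothetical peak response (arbitrary units)

def response(theta):
    """Gaussian tuning curve (an assumed shape, chosen for simplicity)."""
    return PEAK * math.exp(-((theta - PREFERRED) ** 2) / (2 * SIGMA ** 2))

def slope(theta, step=1e-3):
    """Change in response per degree, estimated by a central difference."""
    return (response(theta + step) - response(theta - step)) / (2 * step)

# Search offsets of 0 to 60 degrees from the preferred orientation,
# in steps of 0.1 degree, for the steepest part of the curve.
offsets = [i / 10 for i in range(601)]
steepest_offset = max(offsets, key=lambda d: abs(slope(PREFERRED + d)))
relative_height = response(PREFERRED + steepest_offset) / PEAK

print(f"slope at the peak: {slope(PREFERRED):.4f} per degree")
print(f"steepest slope at an offset of {steepest_offset:.1f} degrees")
print(f"response there is {relative_height:.2f} of the peak")
```

Under these assumptions the slope is essentially zero at the peak and steepest one width parameter away from it, where the response has fallen to roughly 60% of its maximum. That is broadly in line with the statement that each cell is most sensitive "about half way down from the top", although the exact height depends on the assumed curve shape.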
However, that task is a very simple one by the standards of biological vision systems. They have to cope with all manner of variations in the way objects appear in retinal images, variations over which they have no control. How human vision copes with some of these variations is an issue that we will address in later chapters, particularly Ch 8, Seeing Objects. But we pursue here the topic of template recognition a bit further by way of introducing some basic facts that illuminate why the seeing problem is so hard.

Templates and the Combinatorial Explosion

You might be wondering: could a template recognition system be made to work by having a range of different templates, each tuned to deal with one or other source of image variation?

The way we coped with the problem of variable image bar orientation illustrates this idea: we found it a good idea to have 18 or so differently oriented bar templates, and to use these coarsely coded measurements of orientation to work out bar orientation using weighted means. What if this approach was extended for other sources of image variation, such as color and size?

Each such variable, often called a parameter as already noted, would need its own set of coarsely coded templates, each one dealing with a limited range of the parameter's possible values.

The trouble with this idea is that it immediately hits a major snag, called the combinatorial explosion. If we need 18 templates for orientation then for each one of these we will need a suitable range of templates for size. Let's say for simplicity that this would also be 18. Hence, 18 x 18 = 324 templates would be needed to deal with all combinations of orientation and size.

Now consider adding another variable, such as contrast, and suppose that too needed 18 templates. This takes us to 18 x 18 x 18 = 5,832 templates for one numeral.

Well, the brain has a lot of neurons so perhaps that isn't so bad. But things rapidly get worse when other sources of image variation are brought into the equation.

Take object position on the retina for example. Again to keep things simple, let's suppose the parameter of vertical position needs 18 templates and similarly for the horizontal position parameter. We now have 18 x 18 x 18 x 18 x 18 = 1,889,568 templates.

And, once again, this is for just one object, for example just one of the numerals that our bank check number template recognizer would have to deal with if it was stripped of careful control of variations in the input image.

Imagine needing this huge number of templates for all the different objects that we so readily recognize: numerals, letters, birds, trees, chairs, people, and so on.

There is a general formula for working out the number of combinations of parameters involved in this combinatorial explosion: the total number equals N^k, where N is the number of templates per parameter (18 in our example), and the exponent k is the number of parameters. Don't worry if this exponential formula seems a bit opaque: if you want to know more about it then read Ch 11 where we discuss its implications at length.

The combinatorial explosion reveals just how hard the problem of vision is. Any attempt to solve it using simple-minded templates doesn't work: the brain just doesn't have enough neurons.

Binding Problem

You might think at this point this is silly; surely there is no need to have a cell for every combination of parameter values? Why not simply have one population of cells that encodes only one parameter exclusively, for all objects?

For example, one population could encode only color, and another could encode only orientation. Using this scheme we would require k populations to encode k parameters. If each population consisted of one million cells then no more than k million cells would be required in total. The brain may do something along these lines (as we will see), but this raises another fundamental vision problem: how do we attach parameter values to objects? This is known as the binding problem and we discuss it in Ch 11.

Back to the Jumping Spider

We can now see that a spider would have a very hard time using templates as a means of deciding the question: is the object over there mate or prey? Such a spider would also be subject to the combinatorial explosion (further details in Ch 11).
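The template counts worked out under "Templates and the Combinatorial Explosion" follow directly from the exponential formula, and the arithmetic is easy to reproduce. The short Python sketch below does so. The function names and the list of parameter labels are just choices made for this example, and the comparison column uses the simple count of k populations of N cells each that the binding-problem discussion implies, which is an assumption about how to tally that alternative rather than a figure from the text.

```python
def joint_templates(per_parameter, n_parameters):
    """One template for every combination of parameter values: N ** k."""
    return per_parameter ** n_parameters

def separate_populations(per_parameter, n_parameters):
    """One population per parameter, each covering N values: k * N."""
    return per_parameter * n_parameters

# Parameters in the order the text introduces them (labels are illustrative).
parameters = [
    "orientation",
    "size",
    "contrast",
    "vertical position",
    "horizontal position",
]
N = 18  # templates per parameter, as in the text's running example

for k, name in enumerate(parameters, start=1):
    joint = joint_templates(N, k)
    separate = separate_populations(N, k)
    print(f"k={k} (adding {name}): {joint:,} joint templates, "
          f"{separate} cells if each parameter has its own population")
```

The joint count grows by a factor of 18 with every added parameter, reaching 1,889,568 for five parameters, whereas the separate-population count grows by only 18 each time. The price of the cheaper scheme is the binding problem described in the text.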
However, it may be that the jumping spider has, as suggested earlier, evolved some special-purpose visual mechanisms that are quite unlike our own. Land has suggested that these spiders use scanning movements of their boomerang-shaped retinae to align them with the orientations of leg-like bars in the input image. This trick might allow the spider to avoid the 18 or so different orientation tuned channels that monkeys and humans seem to possess.

These retinal scanning movements might also solve the problem of variable object position in the image. They would do that if they ensured that the object's image falls on exactly the right spot for the spider's limited number of templates to be able to recognize Mate or Prey. Perhaps these and other special-purpose adaptations give the spider a working template-based recognition system.

The general idea here is that perhaps the spider has not evolved a general-purpose vision system, such as our own, but one specialized for its particular ecological niche. It can survive if it can capture insect prey and find mates. Perhaps it has a visual system set up to do those tasks and very little else. In this respect it may be a bit like the simple-minded but highly specialised bank check recognition system described above.

Concluding Remarks

This chapter has explored the task of building a bar detector using templates to discover what problems have to be overcome. Along the way we defined some essential technical terms, many to do with basic facts about the "hardware" of biological visual systems. A key concept has been that of a receptive field with excitatory and inhibitory regions.

Linked to this is the idea of receptive fields of various types, organised as channels analysing each patch of retina and providing measurements about a stimulus dimension (such as orientation) from which the brain can work out which features are present in the input image. In the next chapter, Ch 4, we use these ideas in showing how certain illusions called aftereffects have revealed orientation channels in the human visual system without need for invasive single cell recordings.

This chapter has also introduced a fundamental concept in vision research: the combinatorial explosion arising from the exponential N^k formula. We examine that problem in considerable depth in Ch 11 to explain in more detail why vision is such a hard problem.

Finally, armed with core ideas introduced in this chapter, we are ready to have a much closer look in Ch 5 at the task of edge detection. We consider there and in Ch 6 the tasks of how to recover feature properties other than orientation, such as bar width, and whether an edge is sharp or fuzzy.

Ch 5, Seeing Edges, will also remedy a nagging irritation that may have formed in your mind. We emphasized in Chs 1 and 2 the need to be very clear about the computational theory, algorithmic, and hardware levels of task analysis when studying vision. But we have blatantly ignored our own advice in this chapter. We have jumped straight into considering a particular sort of algorithm, applying templates, without any guidance from a computational theory as to the design of those templates. You might feel a bit cheated. But we have done this simply to introduce a wide range of basic concepts and terms that it is best to get out of the way first.

In any event, if you do feel a bit cheated then you have drawn the right conclusion because this chapter illustrates just how unsatisfactory it can be to start addressing an image processing task without a clear computational theory of that task. Template matching is a species of algorithm. The design and use of templates demands the clarity afforded by a decent theory of the task. We investigate in Ch 5 how we can get a much better understanding of what the receptive fields of simple cells are doing by thinking a lot harder about the task of feature detection. That will be seen to be the moral of the story of simple cells told here.

Further Reading

The main function of this early chapter has been to explore some core ideas needed to understand seeing, such as receptive fields, channels, coarse and fine coding. We do not recommend much further reading at this stage on this material. We will be making suggestions for further reading for later chapters that use the core concepts dealt with here. Hence, it is not suggested you consult the sources listed below at this juncture but we provide them so that you can follow them up if you wish.
Barlow HB (1972) Single units and sensation: A neuron doctrine for perceptual psychology, Perception 1 371-394. Comment: Classic paper that discusses the relationship between the firing of single neurons in sensory systems and subjectively experienced sensations. Commentaries celebrating this landmark paper are in Perception 38 795-807. It is probably best tackled after reading Ch 11.

See pp.116-120. Comment: An excellent book that we recommend strongly for readers who want to pursue various topics in human and animal vision at an advanced level, and specifically the work of Regan and Beverley on peaks and troughs in orientation sensitivity.