FlemingAnderson VisNeuro BookChapter

86 The Perceptual Organization of Depth
ROLAND FLEMING AND BARTON L. ANDERSON
The goal of depth perception is to identify the spatial layout of the objects and surfaces that constitute our surroundings. One important observation about the world around us that influences the way we see depth is that physical matter is not distributed randomly, with arbitrary depths at every location. On the contrary, the environment is generally well organized: the world consists mainly of tightly bound objects in a discernible layout. This order results from countless forces and processes (such as gravity and biological growth) which tend to organize matter into objects and place those objects in certain spatial relations. The central thesis of this chapter is that our perception of depth mirrors this organization. We argue that because the world consists of objects and surfaces, our perception of depth should likewise be represented in terms of the functionally valuable units of the environment, namely, surfaces and objects. As we shall see, this has profound consequences for the processing of depth information. In particular, there is more to depth perception than simply measuring the distance from the observer of every location in the visual field. Rather, the perception of depth is the active organization of depth estimates into meaningful bodies. Depth constrains the formation of perceptual units, and, reciprocally, the figural relations between depth measurements allow the visual system to parse its representation of depth into ecologically valuable structures.

There are many sources of information about depth, from “pictorial” perspective to motion parallax. An exhaustive review of all these sources of information is beyond the scope of this chapter (although see Bruce et al., 1996; Palmer, 1999, for introductory reviews). Instead, we discuss three key domains in which the visual system “organizes” our perception of depth into meaningful units to emphasize the intimate relationship between depth processing and perceptual unit formation.

In the first section, we discuss how the visual system infers the layout of surfaces from local measurements of depth. We will argue that local estimates of depth are ambiguous but that the geometry of occlusion critically constrains the legal interpretations. Occlusion occurs when one opaque object partly obscures the view of a more distant object, as happens frequently under normal viewing conditions. Occlusion is important because it occurs at object boundaries, and therefore the depth discontinuities introduced by occlusion provide ideal locations for the segmentation of depth into objects. Moreover, as we will show in the first section, the geometry of occlusion causes relatively near and relatively far depths to play different roles in the inference of surface structure.

In the second section we discuss the visual representation of environmental structures that are hidden from view. If the visual system is to organize depth into meaningful bodies, it must represent whole objects, not only those fragments that happen to be visible. In order to do this, the visual system must interpolate across gaps in the image to complete its representation of form. We argue that by considering the particular environmental conditions under which structures become invisible (specifically occlusion and camouflage), we can make predictions about the mechanisms underlying visual completion. We also discuss how visual completion influences the representation of depth.

Finally, we discuss what happens when the scene contains transparent surfaces, and thus multiple depths are visible along a single line of sight. We argue that this introduces a second segmentation problem in the perceptual organization of depth. The visual system not only needs to segment “perpendicular” to the image plane, such that neighboring locations are assigned to different objects; with transparency, the visual system also has to segment depth “parallel” to the image plane by separating a single image intensity into multiple depths, a process known as scission (Koffka, 1935). We discuss the conditions under which the visual system performs scission and how the ordering of the surfaces in depth is resolved.

We argue that the ambiguity of local depth measurements, the representation of missing structure, and the depiction of multiple depth planes are three of the major problems faced by a visual system if it is to organize depth into surfaces and objects. Through systematic explanations of example stimuli, we discuss some of the ways in which the visual system overcomes these problems.

Interpreting local depth measurements: the contrast depth asymmetry principle

In this section we discuss how occlusion constrains the interpretation of local depth estimates. Specifically, we show that occlusion enforces a crucial asymmetry between relatively near and relatively distant structures that can have profound implications for the representation of surface layout. Although the principles are discussed in terms of binocular disparity, the fundamental logic relates to the geometry of occlusion and therefore applies to any local estimate of depth.
B S   C P

Binocular stereopsis is the most thoroughly studied source
of information about depth. Binocular depth perception
relies on the fact that the two eyes receive slightly dif- Q'
ferent views of the same scene. Because of the horizontal P' Q' P'
parallax between the two views, a given feature in the (b) (c)
world often projects to two slightly different locations on
the two retinas (Fig. 86.1). These small differences in retinal
location, or binocular disparities, vary systematically with A B A B
dA
distance in depth from the point of convergence and can
thus be used to triangulate depth. For a thorough treatment d*
of stereopsis, see Howard and Rogers (1995) and Chapter
87.
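
As a rough illustration of how a disparity can be scaled to recover depth, the following Python sketch treats the simplest case of a fixation point and a target that both lie on the midline between the eyes. The function names and the 6.5 cm interocular separation are assumptions made for illustration and are not taken from the chapter. The disparity of the target is computed as the difference between the angle it subtends at the two eyes and the vergence angle at fixation.

```python
import math

# Assumed interocular separation in metres (illustrative value only).
INTEROCULAR_M = 0.065


def vergence_angle(distance_m, interocular_m=INTEROCULAR_M):
    """Angle (radians) subtended at the two eyes by a midline point."""
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))


def disparity(fixation_m, target_m, interocular_m=INTEROCULAR_M):
    """Angular disparity of a target relative to fixation.

    Positive values mean the target is nearer than fixation,
    negative values mean it is farther, and zero means equal distance.
    """
    return (vergence_angle(target_m, interocular_m)
            - vergence_angle(fixation_m, interocular_m))


def distance_from_disparity(fixation_m, disparity_rad,
                            interocular_m=INTEROCULAR_M):
    """Invert the geometry: scale a disparity by the vergence angle."""
    target_angle = vergence_angle(fixation_m, interocular_m) + disparity_rad
    return interocular_m / (2.0 * math.tan(target_angle / 2.0))


fixation = 1.0                      # fixating a point 1 m away
near_disparity = disparity(fixation, 0.8)
recovered = distance_from_disparity(fixation, near_disparity)
```

The same disparity corresponds to different depths at different fixation distances, which is why the sketch needs the vergence angle as well as the disparity itself.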
Figure 86.1. a, The two eyes converge by angle α on a point P. Therefore, by definition, P projects to the foveae of both eyes (P′). The Vieth-Müller circle is one of the geometrical horopters, that is, it traces a locus of points in space that project to equivalent retinal locations in the two eyes and thus carry no interocular disparity. Point Q is closer to the observer than P (as it falls inside the horopter). Therefore, it projects to different locations on the two retinas (Q′). The difference in the locations of Q′ is the binocular disparity, which can be scaled by the vergence angle, α, to derive depth. b, When the visual field contains many points, there is a potential ambiguity concerning which image features correspond in the two eyes. Correct matches yield correct depth estimates, such as dA. c, By contrast, false matches yield erroneous depth estimates. Here, the image of point A has been incorrectly matched with the image of point B, leading to an incorrect depth estimate, d*.

In order to determine the disparity of a feature in the world, the visual system must localize that feature in the two retinal images. Once it has identified matching image features, the difference in retinal location is the binocular disparity, which can then be scaled to estimate depth. The visual system must not measure the disparity between features that do not belong together; otherwise, it will derive spurious depth estimates (Fig. 86.1). Because of this, the accuracy of the matching process is critical to binocular depth perception. The problem of identifying matching features in the two eyes’ views (i.e., features that originate from a common source in the world) is known as the correspondence problem.

If the features that the visual system localizes in the two images are very simple, such as raw intensity values (or pixels), then in principle there could be many distracting features that do not in reality share a common origin in the world. Under these conditions the correspondence problem would be difficult, as the visual system would have to identify the one true match from among a large number of false targets.
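
The scale of the false-target problem can be made concrete with a small Python sketch. It is a toy, one-dimensional illustration: the feature names and image positions are invented, and the labels that identify the true matches are available only to us, not to the visual system.

```python
from itertools import product


def candidate_matches(left_image, right_image):
    """Pair every left-eye feature with every right-eye feature.

    Each image maps a feature label to a horizontal image position.
    Returns (left_label, right_label, disparity) for all pairings,
    where disparity is left position minus right position.
    """
    matches = []
    for (left_name, left_x), (right_name, right_x) in product(
            left_image.items(), right_image.items()):
        matches.append((left_name, right_name, left_x - right_x))
    return matches


# Hypothetical image positions (arbitrary units) of two scene points.
left_image = {"A": 1.0, "B": 3.0}
right_image = {"A": 0.5, "B": 2.0}

matches = candidate_matches(left_image, right_image)

# Pairings whose labels agree are true matches; the rest are false targets
# that would yield spurious disparities if they were accepted.
true_matches = [m for m in matches if m[0] == m[1]]
false_matches = [m for m in matches if m[0] != m[1]]
```

With n indistinguishable features in each image there are n × n candidate pairings but only n true matches, so the proportion of false targets grows quickly as the matched features become simpler and more numerous.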
However, there is considerable debate about what types of image features the visual system matches to determine disparity (Jones and Malik, 1992; Julesz, 1960, 1971; Marr and Poggio, 1976, 1979; Pollard et al., 1985; Prazdny, 1985; Sperling, 1970). Psychophysically, at least, it now seems unlikely that the visual system matches raw luminances. Rather, the visual system seems to match local contrast signals, that is, localizable variations in intensity, such as luminance edges (Anderson and Nakayama, 1994; Smallman and McKee, 1995). This seems an almost inevitable consequence of early visual processing, which maximizes sensitivity to contrasts rather than to absolute luminances (Cornsweet, 1970; Hartline, 1940; Ratliff, 1965; Wallach, 1948). By the time binocular information converges in V1, the visual field appears to be represented in terms of local measurements of oriented contrast energy (De Valois and De Valois, 1988; Hubel and Wiesel, 1962), and thus it is likely that these are the features from which disparity is computed.

If this is true, then the image features that carry disparity information are local contrasts such as luminance edges.
However, this poses a problem for the visual system, for in order to capture the functional units of the environment, the visual representation of depth should be tied to surfaces and objects, not to local image features. There is therefore a potential discrepancy between the image features that carry disparity information (i.e., local contrasts) and the perceptual structures to which depth is assigned (i.e., regions) in the ultimate representation of environmental layout. This discrepancy plays a critical role in the theoretical discussion that follows (see Anderson, in press).

A local image feature, such as an edge, has only one true match in the other eye’s image. Therefore, the edge carries only one disparity. However, depth is ultimately assigned to the two regions that meet to form the edge. This results in a problem: in order to represent surface structure, the visual system must assign depth to both sides of an edge, even though the edge carries only one disparity (Fig. 86.2). How does the visual system infer the depths of two regions from every local disparity signal? We will show that the geometry of occlusion imposes an inviolable constraint on the interpretation of local disparity-carrying features. To anticipate, we show that the simple fact that near surfaces can occlude more distant ones, but not vice versa, has profound consequences for the assignment of depth to whole regions.

Figure 86.2. a, The image of a square occluding a diamond. A receptive field of limited extent (the ellipse) captures only local information about the scene, here a vertical luminance edge. This local information is ambiguous, as many different scenes could have resulted in the same image feature. b, If disparity is calculated by matching local contrasts, then the edge carries only a single disparity. However, in this case, the light and dark sides of the edge result from two distinct objects, and therefore different depths have to be assigned to the two sides of the edge. [Panel labels: world; image.]

Asymmetries in Depth: A Demonstration

By way of motivation for the theoretical discussion that follows, consider Figure 86.3, which is based on a figure developed by Takeichi et al. (1992). The figure consists of a Kanizsa illusory triangle and three diamonds. When disparity places the diamonds closer to the observer than the triangle and inducers (by cross-fusing the stereopair on the left of Fig. 86.3), the diamonds appear to float independently in front of the background, and the Kanizsa triangle tends to be seen as a figure in front of the circular inducers; this percept is schematized in Figure 86.3B. The disparities in the display can be inverted simply by swapping the left and right eyes’ views, as can be seen by cross-fusing the stereopair on the right side of Figure 86.3. In this case, what was previously distant becomes near and vice versa, such that the diamonds are placed behind the plane of the inducers. In both versions of the display, the triangle itself carries no disparity relative to the circular inducers; only the disparity of the diamonds changes from near to far. This simple inversion leads to a change in surface representation that is more complex than a simple reversal in the depth ordering of the perceptual units (as schematized in Fig. 86.3B). When the diamonds recede, they drag their background back with them, such that the triangle appears as a hole through which the observer can see a white surface; the three black diamonds lie embedded in the more distant white surface. This recession of the background has a secondary effect of increasing the strength of the illusory contour (the border of the triangle).

Figure 86.3. Asymmetries in depth interpolation. a, When the left stereopair is cross-fused, the diamonds appear to float independently in front of the Kanizsa triangle, as schematized in b. When the disparity of the diamonds is inverted (by cross-fusing the right stereopair), the diamonds drag their background with them, creating the percept of a triangular hole, even though only the disparity of the diamonds has changed (c). This asymmetrical change in surface structure can be explained by the contrast depth asymmetry principle (see text). (Adapted from Takeichi et al., 1992.)

The important observations with regard to the theory are the following. First, when the diamonds are in front, they are freely floating and separate, while when they recede, they drag the background with them. Second, when the diamonds are forward, the Kanizsa triangle tends to be seen as a figure (rather than ground), but when the diamonds are more distant, the triangle is seen as a hole. And yet all that changed in the display was the disparity of the diamonds. Why does this simple reversal in depth lead to an asymmetric change in the surface representation? Why does the disparity of the diamonds influence the appearance of the triangle? These are the asymmetries of depth to which the following discussion pertains.

From Features to Surfaces: Interpreting Local Disparity Signals

Let us assume that the visual system has located a luminance edge and derived a disparity, d0, from that edge. What possible surface configurations are consistent with the local disparity measurement? Broadly, the legal interpretations fall into two classes, as shown in Figure 86.4. The first class consists of surface events in which both sides of the edge meet at the depth of the edge, d0. There are many surface events for which this is the case: reflectance edges, cast shadows, and creases in the surface, to name just three. When the feature originates from a continuous manifold, as in these cases, interpretation is simple, as both sides of the edge are assigned the same depth, d0.

Figure 86.4. A contour which carries a depth signal (e.g., disparity) is inherently ambiguous. Two main classes of world states could have given rise to the contour: the contour could have originated from a single continuous surface (e.g., a reflectance edge or cast shadow), or it could have originated from an occlusion event. In the occlusion case, the border ownership of the contour (i.e., which side is the occluder) is ambiguous. Nonetheless, in all configurations, both sides of the contour are constrained to be at least as far as the depth signal carried by the contour. This introduces a fundamental asymmetry in the role of near and far contours in determining surface structure (see text for details). (Adapted from Anderson, in press; see also Anderson et al., 2002.) [Panel labels: Local Image Data; matching, disparity computation; Possible depth interpretations: Continuous Surfaces, Occluding Surfaces.]

The second class of interpretations occurs when the edge corresponds to an object boundary and therefore represents a depth discontinuity (Fig. 86.4). In this case, one side of the edge lies at the depth of the occluding object, and the other side of the edge lies at the depth of the background. Therefore, the visual system must assign different depths to the two sides of the edge. How can the visual system assign two depths, when it is given only one disparity, d0? The answer is that it only assigns a unique depth to the occluding side. The critical insight is the following: the depth measurement acquired at an occluding edge only specifies the depth of the occluding surface. The visual system assigns depth d0 to the occluding surface. All that it knows about the other side is that it must be more distant than the occluding surface. If the more distant surface is untextured, then it could be at any depth behind the occluder and the local image data would remain the same. By contrast, if the depth of the occluding surface varies, the disparity carried by the object boundary must also change, because the occluding surface “owns” the contour (Koffka, 1935; Nakayama et al., 1989) and is therefore responsible for the disparity associated with the edge.

Although the visual system cannot uniquely derive the depth of the occluded side (i.e., the background) from the local disparity computation, there is one critical piece of information that it does have: the occluded side is more distant than the occluder. There is no way for an occluding object to be more distant than the background that it occludes. If the background is brought closer than the object, then the background becomes the occluding surface and carries the edge with it. In this way, occlusion introduces a fundamental asymmetry into the interpretation of disparity-carrying edges: the occluded side of the edge can be at any distance greater than d0, but neither side can be nearer than d0.

We can summarize the possible depth assignments (from the occlusion and nonocclusion classes just described) in the form of a constraint on the interpretation of local disparity-carrying contrasts, termed the contrast depth asymmetry principle (Anderson, in press; see also Anderson et al., 2002):

Both sides of an edge must be situated at a depth that is greater than or equal to the depth carried by that edge.
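
The constraint can be written down as a short Python sketch. This is our own minimal encoding, not an implementation from the chapter: depths are arbitrary numbers in which larger means farther from the observer, the function and region names are hypothetical, and each region is assumed to lie at a single depth. The first function sorts a depth assignment into the two legal classes described above; the second finds the nearest depth each region may occupy when the constraint is applied to all of its edges at once.

```python
def classify_edge(edge_depth, side_a_depth, side_b_depth):
    """Classify a depth assignment to the two sides of an edge.

    Larger numbers mean farther from the observer (an assumed convention).
    """
    if side_a_depth < edge_depth or side_b_depth < edge_depth:
        return "violates the principle"      # a side is nearer than the edge
    a_at_edge = side_a_depth == edge_depth
    b_at_edge = side_b_depth == edge_depth
    if a_at_edge and b_at_edge:
        return "continuous surface"          # e.g. reflectance edge, shadow
    if a_at_edge:
        return "occlusion: side A owns the edge"
    if b_at_edge:
        return "occlusion: side B owns the edge"
    return "unowned edge"                    # no surface lies at the edge depth


def region_lower_bounds(edges):
    """Nearest legal depth of each region, given (depth, region, region) edges."""
    bounds = {}
    for depth, region_a, region_b in edges:
        for region in (region_a, region_b):
            # A region must be at least as far as every edge that bounds it.
            bounds[region] = max(bounds.get(region, depth), depth)
    return bounds


# Toy version of the near-diamond display of Figure 86.3 (invented depths).
near_diamond_display = [
    (1.0, "diamond interior", "white background"),   # diamond edges are near
    (2.0, "inducer interior", "white background"),   # inducer edges are far
]
bounds = region_lower_bounds(near_diamond_display)
diamond_edge = classify_edge(1.0,
                             bounds["diamond interior"],
                             bounds["white background"])
```

In this toy display the white background is pushed back to the depth of the inducer edges, so the diamond edges can only be occluding edges owned by the diamond interiors. Treating the background as a single region is a simplification; a display in which one region must split into a near and a far surface would need a richer representation.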
  :     1287

Although this geometric fact is simple in form, it can have pronounced effects on the global interpretation of images when the constraint applies to all edges simultaneously. We will now run through an example to show how the principle can explain the asymmetric changes in perceived surface structure that occur when near and far disparities are inverted.

Application of the Contrast Depth Asymmetry Principle

In order to demonstrate the explanatory power of the contrast depth asymmetry principle (hereafter CDAP), we will now use it to account for the demonstration in Figure 86.3. Recall that when the diamonds carry near disparity, they float freely in front of the background, and the illusory triangle tends to be seen as figure. When the disparity is reversed, however, the diamonds drag the background back with them, and the triangle appears as a hole. This asymmetry in surface layout is depicted in Figure 86.3B.

Let us first consider the case in which the diamonds appear to float in front. The visual system has to interpret the disparity signals carried by the edges of the diamonds. The CDAP requires both sides of the diamonds’ edges (i.e., the black inside and the white outside of the diamonds) to be at least as distant as the edges. Now consider the “pacman” inducers, which are more distant than the diamonds. The constraint requires both sides of these edges to be at least as distant as their edges. This means that all of the black interior of the inducers must be at least this distant and, more importantly, all of the white background must be at least this distant, which is farther than the disparity of the diamonds. If all of the white background is farther than the diamonds, then the edges of the diamonds must be occluding edges, and the black interior of the diamonds must be an occluding surface. This explains why the diamonds are seen as independent occluders, floating in front of the large white background and black inducers: the edges of the “pacman” inducers drag the white background back, leaving the diamonds floating in front.

Now consider the case in which the diamonds are more distant than the inducers. Again, the CDAP requires both the inside and the outside of the diamonds to be at least as far back as their disparity dictates. This means that both the diamonds and their white background are dragged back to the more distant disparity. Now consider the “pacman” inducers, which carry a relatively near disparity. Because the white background behind the diamonds has been dragged back with the diamonds, the inducers and their white background must be occluding surfaces. This means that the background immediately surrounding the diamonds must be visible through a hole in the occluding surface. The edges of this hole are the illusory contours of the Kanizsa figure. Note again: the fact that both sides of every edge have to be at least as far as the edge leads to asymmetrical surface structures when disparities are inverted.

This is just one example that shows how the CDAP can account for asymmetrical effects of relatively near and relatively far disparities on perceived surface layout. Because the CDAP is derived from the geometry of occlusion, it can account for a very large number of displays and can be used to generate surprising new displays (Anderson, 1999; Anderson, in press).

Occlusion and camouflage: hallucinating the invisible

The central thesis of this chapter is that the visual system does not merely record depth at each location in the visual field; rather, it actively organizes its depth measurements into functionally valuable units. In the previous section, we discussed how occlusion plays a key role in this organization. In this section, we discuss how the visual system handles what is arguably the hardest problem posed by occlusion: the visual representation of structures that are hidden and are therefore completely invisible. If seeing depth is about representing the actual layout of objects in the environment, then all portions of the objects must be represented, even those that are hidden from view: hidden portions do not disappear from the environment just because they do not appear in the image. Therefore, the visual system has to go beyond local image data to construct representations of hidden structures. We will now discuss how the environmental conditions of occlusion and camouflage predict properties of the construction process.

Modal and Amodal Completion

We will consider two major ways in which parts of the scene can become invisible. The first is simple occlusion, in which an opaque object obscures part of a more distant object. When this happens, the occluded structures of the more distant object have no corresponding features in the image, and thus the visual system must somehow “reconstruct” the missing data. The second way that viewing conditions can lead to invisible structures is through camouflage. In camouflage it is the nearer, occluding surface that is rendered invisible because it happens to match the color of its background. Because the boundaries of the camouflaged object do not project any contrast, they have no corresponding features in the image, and thus the nearer object is effectively invisible. Under these circumstances, the visual system must actively “hallucinate” the invisible structures. In both cases, the visual system interpolates missing data, a process known as visual completion. This process is important to depth perception because it is one of the means by which the visual system organizes its depth measurements into meaningful bodies. We argue that depth perception and unit
formation are intimately intertwined, for depth constrains the perceptual units that are formed, and perceptual organization influences the interpretation of local depth measurements.

The phenomenal quality of completed structures differs, depending on whether it is near (camouflaged) or far (occluded) structures that are interpolated. In the case of camouflage, the interpolation leads to a distinct impression of a contour or surface across the region of missing data. This is referred to as modal completion (Michotte et al., 1964/1991) because the experience is of the same phenomenal modality as ordinary visual experience. An illusory contour, for example, is crisp and subjectively similar to a real contour, as can be seen in Figure 86.5A. In contrast, the sense of completion experienced with occluded structures is less distinct. The black form in Figure 86.5B tends to be seen as a single object, part of which is hidden, rather than as two distinct objects, whose boundaries coincide with the boundary of the gray occluder. There is a compelling sense that the two visible portions of the black form belong to the same object and that that object continues in the space behind the occluder. However, this impression, although visual in origin, is not of the same phenomenal mode as normal and modal contours and is therefore referred to as amodal completion (Michotte et al., 1964/1991). In general, the regions of the image which are visible, and lead to visual completion, are referred to as inducers.

Figure 86.5. a, Modal completion. Most observers report seeing a vivid white triangle in front of three discs and a black triangular outline. The contours of the white triangle are subjectively distinct, resembling real contours, even though there is no corresponding image contrast, and hence the triangle is illusory. b, Amodal completion. Most observers report seeing a single continuous black shape, part of which is hidden from view by the gray occluder, even though the parts that are hidden from view are, by definition, invisible. c, A self-splitting object. Even though the shape is uniformly black, it tends to be seen as two forms, one in front of the other. Which form tends to complete modally, and which amodally, depends in part on the distance that must be spanned by the completion (Petter’s law).

The Identity Hypothesis

There is a vast literature on visual completion; a thorough discussion of all the issues is beyond the scope of this chapter. One important issue that is discussed in greater detail in Chapter 75 is whether visual completion occurs relatively early or late in the putative processing hierarchy. However, the perceptual organization of depth has a direct bearing on another current debate, specifically the extent to which modal and amodal completion are the consequences of a single process. This issue is intimately bound to depth perception because it determines the extent to which depth processing and perceptual organization are independent.

The debate runs roughly as follows. On the one hand, a strong claim has been made that a single mechanism is responsible for both modal and amodal completion. According to this account, perceptual organization (including visual completion) produces perceptual units, and an independent process places those units in depth. The theory states that psychological differences between modal and amodal completion result from the final depth ordering of the completed forms (Kellman and Shipley, 1991; Kellman et al., 1998; Shipley and Kellman, 1992) rather than a difference between the completion processes themselves. This is known as the identity hypothesis. On the other hand, the two processes could be largely independent, subject to different constraints and subserved by distinct neural mechanisms. A strong form of this dual mechanism hypothesis would be that the two processes are fundamentally different in nature: for example, that modal completion is largely data driven, while amodal completion is essentially cognitive. To anticipate, although we do not subscribe to the strongest form of the dual mechanism hypothesis, we will provide evidence that modal and amodal completion follow different constraints and argue that they are subserved by distinct neural processes. Central to the arguments that we present are the geometric and photometric conditions under which occlusion and camouflage actually occur in the environment.

The principal evidence for the identity hypothesis has been that subjects perform similarly with modally and amodally completed figures in a variety of tasks. In one task, Shipley and Kellman (1992) varied the spatial alignment of the inducing elements in both modally and amodally completed squares. Such misalignment is known to weaken the sense of completion, as the completed boundary is forced to undergo an inflection. Subjects were asked to rate the subjective strength of visual completion as a function of the degree of misalignment for modal and amodal versions of the display. Shipley and Kellman (1992) found that ratings
declined at the same rate as a function of misalignment for both modal and amodal figures. This has been interpreted as evidence that a single mechanism is responsible for both forms of completion.

Using a more rigorous method, Ringach and Shapley (1996) performed a shape discrimination task with modal and amodal versions of a Kanizsa figure. By rotating the inducing elements, the vertical contours of the completed square can be made to bow out (creating a “Fat” Kanizsa) or curve in (creating a “Thin” Kanizsa). Subjects were asked to discriminate between Fat and Thin versions of the display while the angle through which the inducers were rotated was varied. Ringach and Shapley found that discrimination performance as a function of rotation was nearly identical for modal and amodal versions of the display, a finding which is consistent with the identity hypothesis.

One problem with this type of evidence is that it relies on negative results, that is, a failure to detect a difference, which could be due to the method rather than to a fundamental property of the system being studied. If positive evidence could be provided that modal and amodal completion are subject to different constraints, or result in different perceptual units, then the identity hypothesis would no longer be tenable.

There are two major reasons for believing that modal and amodal completion should be subject to different constraints, both of which are related to the environmental conditions under which occlusion and camouflage occur. First, occlusion occurs over greater distances across images because it only requires that one object be in front of another. Camouflage, on the other hand, requires a perfect match in color between the near surface and its background, and thus occurs less frequently in general. This difference is reflected in a constraint on the image distances over which modal and amodal completion occur, which was first documented by Petter (1956). Petter used a class of stimuli now known as spontaneously splitting objects (SSOs), which consist of a single homogeneously colored shape, such as the one shown in Figure 86.5C, that tends to be interpreted as two independent shapes, one behind the other. Which object is seen in front tends to oscillate with prolonged viewing. However, which shape is seen in front first, and which tends

A second reason for believing that modal and amodal completion are subject to different constraints relates to the color conditions that are required for occlusion and camouflage to occur. Again, occlusion can happen between objects of any color. The reflectance of the near object is unrelated to the fact that it hides the more distant one from view. This suggests that amodal completion should not be sensitive to the luminance relations between the image regions involved. Camouflage, by contrast, requires a perfect match in luminance between the near and far surfaces. This implies that modal completion should be sensitive to the luminance relations between the image regions involved.

Recent experimental work has shown that this luminance sensitivity can lead to large differences between modal and amodal displays (Anderson et al., 2002). Anderson et al. created displays consisting of two vertically separated circles filled with light and dark stripes, as shown in Figure 86.6. The binocular disparity of the circles was kept constant, but the disparity of the light-dark contours inside the circles was altered to place the stripes behind or in front of the circular boundaries. When the stripes were farther than the circles, the top and bottom stripes tended to complete amodally to form a single continuous dark and light surface, which appeared to be visible through two circular holes, as schematized in Figure 86.6D. This percept occurred irrespective of the luminance of the region surrounding the circles.

By contrast, when the disparity placed the contours in front of the circles, the dark and light stripes separated into different depth planes. The way in which the stripes separated from one another depended on the luminance of the surround. When the surround was the same color as the light stripes, the light stripes appeared to float in front and completed modally across the gap between the two circles. In this condition, the dark stripes completed amodally underneath the light stripes to form complete circles. This led to an impression of light vertical stripes in front of dark circles, as schematized in Figure 86.6E. However, when the surround was the same luminance as the dark stripes, the percept inverted, such that the dark stripes appeared to float in front of light discs. This demonstrates
to be seen in front for a greater proportion of the time, can a fundamental dependence on luminance that was not
be predicted rather well from the lengths of the contours present in the amodal version of the display. Furthermore,
that must be interpolated. Petter’s rule states that longer con- if the surround was an intermediate gray, then the display
tours tend to be completed amodally, while shorter contours was not consistent with camouflage, as neither the light
tend to be completed modally. Thus, which figure is seen in nor the dark stripes perfectly matched the luminance of the
front can be predicted from the length of the contours that background. Under these conditions, there was no modal
must be completed. If the two types of completion are completion across the gap, and the percept was difficult
subject to different constraints on the distances over which to interpret. This demonstrates that modal completion is
they occur, this opens the possibility that they are subserved sensitive to luminance relations, while amodal completion
by different mechanisms. is not.
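The two constraints described on this page, Petter's rule for contour length and the luminance match that camouflage requires, can be illustrated with a small sketch. It is a toy reading of the text, not a model taken from the cited studies: the function names, the figure and stripe labels, the luminance values on a 0 to 1 scale, and the `tolerance` parameter are all assumptions introduced for illustration.

```python
from typing import Dict, List, Optional


def petter_front_figure(gap_lengths: Dict[str, float]) -> Optional[str]:
    """Predict which figure tends to be seen in front under Petter's rule.

    gap_lengths maps a (hypothetical) figure name to the length of contour
    that must be interpolated to complete that figure. Shorter contours tend
    to be completed modally (in front), longer ones amodally (behind).
    """
    shortest = min(gap_lengths.values())
    winners = [name for name, gap in gap_lengths.items() if gap == shortest]
    # A tie gives no prediction under this simple reading of the rule.
    return winners[0] if len(winners) == 1 else None


def modally_completing_stripes(stripes: Dict[str, float],
                               surround: float,
                               tolerance: float = 0.0) -> List[str]:
    """Return the stripes that satisfy the camouflage condition.

    A stripe can complete modally across the gap only if its luminance
    matches the surround (within an assumed tolerance). Amodal completion
    has no such requirement, so it is not modeled here.
    """
    return [name for name, luminance in stripes.items()
            if abs(luminance - surround) <= tolerance]


if __name__ == "__main__":
    print(petter_front_figure({"bar": 2.0, "disc": 5.0}))
    stripes = {"light": 0.8, "dark": 0.2}
    for surround in (0.8, 0.2, 0.5):
        print(surround, modally_completing_stripes(stripes, surround))
```

With a light surround only the light stripes qualify, with a dark surround only the dark stripes, and with an intermediate gray surround the list is empty, which corresponds to the condition in which no modal completion across the gap was reported.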
1290 , ,   

Figure 86.6. Demonstration of dependence of modal completion on surround luminance. When the left stereopairs of a, b, and c are cross-fused, the stripes tend to complete amodally across the gap between the circular holes, creating the impression of a single striped surface (like wallpaper) viewed through two apertures, as depicted in d (amodal depth ordering). This occurs irrespective of the luminance of the surround. However, when the right stereopairs are cross-fused, thus inverting the disparity, only two stripes appear to complete modally, and which stripes complete depends critically on the surround luminance, as depicted in e (modal depth ordering). When the surround is dark, as in a, the dark stripes complete modally; when the surround is light, as in b, the light stripes complete modally; and when the surround is intermediate, no completion is visible. This demonstrates that modal completion is luminance dependent, while amodal completion is not. (Adapted from Anderson et al., 2002.)

Anderson et al. showed that this luminance sensitivity could affect performance on basic visual tasks such as vernier acuity. The stripes in the top and bottom circles can be horizontally offset (i.e., misaligned slightly) without destroying the sense of completion. Subjects were asked to report in which of two displays the contours were slightly misaligned. Both modal and amodal completion facilitate performance in this task. However, in the amodal case, performance was unaffected by the luminance of the surround, while in the modal case, performance was much worse when the luminance of the surround was an intermediate gray (the condition in which the stripes do not complete across the gap). Thus, modal and amodal completion are subject to different constraints, both on the distance over which they occur and on the luminance conditions that are required to induce them. This positive evidence for a difference between modal and amodal completion uses essentially the same types of task as the negative evidence that had previously been used to support the identity hypothesis.

Visual Completion and the Perceptual Organization of Depth

The geometric and photometric differences between modal and amodal completion are derived directly from the environmental conditions of occlusion and camouflage. Because occlusion and camouflage occur under different circumstances, they have different consequences for the organization of depth into meaningful bodies. In fact, the differences can be exploited to generate stimuli in which modal and amodal completion lead to different shapes. This is important, as it shows that unit formation is intimately bound to the placement of structures in depth.

The greater “promiscuity” of amodal completion is the key in the generation of these displays. Figure 86.7 is a recently developed stereoscopic variant of the Kanizsa configuration in which the inducing elements are rotated outward (Anderson et al., 2002). When the straight segments (the “mouths” of the “pacmen”) are placed in front of the circular portions of the inducers, the impression is of five independent illusory fragments that float in front of five black discs on a white background. However, when the two eyes’ views are interchanged, and thus the straight contours are placed behind the circular segments, the impression is rather dramatically altered. With the disparity inverted, the impression is of a single amodally completed, irregularly shaped, black figure on a white background, which is visible through five holes in a white surface (these percepts are schematized in Figs. 86.7B and 86.7C). Thus, the former case consists of a total of 11 surfaces (5 fragments + 5 discs + white background), while the latter case consists of 3 (1 white surface with 5 holes + 1 black shape + white background). Clearly, the placement in depth has a considerable effect on what perceptual units are formed.

Anderson et al. also provided evidence that differences between modal and amodal interpolation can lead to differences in the very shapes of completed contours themselves. When the left-hand stereopair in Figure 86.8A is uncross-fused, the resulting percept consists of six circular discs that are partly occluded by a jagged white surface on the right-hand side, as schematized in Figure 86.8B. However, when the disparities are inverted (by uncross-fusing the right pair of Fig. 86.8A), the modal completion across the regions between the four black blobs tends to take the form of a
  :     1291

continuous wavy contour that runs down the center of the display. This percept is schematized in Figure 86.8C. The importance of this demonstration is that it shows that modal and amodal completion can result not only in different surface structures but even in differently shaped contours. It is difficult to see what the concept of a single completion mechanism serves to explain if the two processes can result in different completed forms.

Figure 86.7. a, Relative depth alters perceptual organization. When the left stereopair is cross-fused, the figure tends to appear as five discs occluded by five distinct image fragments, as depicted in b; the transparency in b is included only so that both depth planes can be depicted simultaneously. When the depth ordering is reversed by cross-fusing the right stereopair, a single irregular black “star” appears to lie on a continuous white background, which is visible through five holes in a continuous overlying layer (c). In this depth ordering the black shape tends to appear as figure. (Adapted from Anderson et al., 2002.)

Figure 86.8. The serrated-edge illusion. When the left stereopair in a is uncross-fused, the resulting percept consists of six circular discs that are partly occluded by a jagged white surface on the right, as depicted in b (serrated edge near). When the right stereopair is uncross-fused, the modal completion of these four black blobs tends to take the form of a single wavy contour that runs vertically down the center of the display, as depicted in c (serrated edge far). Although other percepts are possible, this is an existence proof that depth inversion alone can alter the shape of modally and amodally completed contours. (Adapted from Anderson et al., 2002.)

Ultimately, the identity hypothesis is a claim about mechanism and can therefore be assessed physiologically. There is a considerable body of evidence for extrastriate units that are sensitive to illusory but not to amodally completed contours (see Chapter 76 for a review). A critical additional piece of evidence was provided recently by Sugita (1999), who found cells in V1 that respond to amodal completion across their receptive fields but not to modal completion. Cells responded weakly when presented with two unconnected edges; holes and occluding surfaces on their own; and stimuli in which two unconnected edges were separated by a hole. However, when the cells were presented with two edge fragments separated by an occluder (a stimulus that leads to amodal completion of the edge), the cells responded vigorously. This shows that at the earliest stages of cortical processing, there is a double dissociation between the representations of modal and amodal structures, a conclusion which supports the dual mechanism hypothesis.

Transparency, scission, and the representation of multiple depth planes

Transparency poses a particularly interesting problem in the perceptual organization of depth. With transparency, one object is visible through another, and thus two distinct depths lie along the same line of sight (Fig. 86.9). If the visual system is to represent depth in terms of the actual surfaces of the environment, it has to depict two distinct depths at a single location in the visual field. The process of projection compresses
1292 , ,   

the light arriving from the transparent surface and the light arriving from the more distant surface into a single image intensity on the retina. In order to represent both surfaces, the visual system has to separate a single luminance value into multiple contributions, a process known as scission (Koffka, 1935). We argue that scission is a type of perceptual segmentation, as it parses the representation of depth into distinct surfaces. However, rather than segmenting neighboring locations into distinct objects, scission separates depth into layers, or planes, and thus operates “parallel” to the image plane.

Figure 86.9. Perceptual transparency. The figure in a tends to be seen as a light gray transparent surface in front of a bipartite background, as depicted in b, and thus two distinct surfaces are visible along the same line of sight. Transparency is seen only when certain relations hold between the various regions of the display. In c, the central region is higher in contrast than its surround and thus is not seen as transparent. In d, the polarity of the contrasts is reversed, and again transparency is not seen. In e, the contour of the underlying layer is not continuous inside and outside the central region, eliminating the percept of transparency. In f, the contour of the overlying layer is not continuous, which also reduces the percept of transparency.

Scission presents the visual system with two principal problems. The first is to identify when a single luminance results from two distinct depths. The second is to assign surface properties correctly at the two depths. By studying when and how we see transparency, we can learn how the visual system scissions depth into layers.

Much of the seminal work on perceptual transparency was conducted by Metelli (1970, 1974a, 1974b; see also Metelli et al., 1985), who provided a quantitative analysis of the color mixing that occurs when one surface is visible through another. When a background is visible through a transparent sheet, only certain geometrical and luminance relations can hold between the various regions of the display (Fig. 86.9). From these relations Metelli derived constraints that determine whether a region will look transparent or not, and how opaque it will appear if it does look transparent. This is important, as it determines the conditions under which the visual system scissions a single image intensity into multiple layers, and thus how the visual system stratifies its representation of depth.

Broadly, the conditions required for perceptual scission fall into two classes. The first are the photometric conditions for transparency, which detail the relations between the light intensities of neighboring regions that are necessary for scission. The second set of conditions for perceptual scission are geometrical, or figural. Depth separates into layers only when these relations hold between the various regions of the display.

Photometric Conditions for Scission

Consider the display shown in Figure 86.9A, which tends to be seen as a bipartite background that is visible through a transparent filter. The vivid separation of the central region into two depths occurs only when certain luminance relations hold. Metelli derived two constraints on the photometric conditions required for perceptual scission.

The intuition behind the first constraint, which we refer to as the magnitude constraint, is that a transparent medium cannot increase the contrast of the structures visible through it. The consequence of this constraint is that the contrast of the central diamond must be lower than or equal to that of its surround in order for the diamond to appear transparent, as shown in Figure 86.9A. This constraint is important, as it restricts the conditions under which scission occurs: a region can scission only if its contrast is less than or equal to the contrast of its flanking regions. As can be seen in Figure 86.9C, infringement of this constraint with respect to the central diamond prevents the central region from undergoing scission. However, in this display, the constraint is satisfied for the region surrounding the diamond; thus, the display can be seen as a bipartite display viewed through a transparent filter with a diamond-shaped hole in the center.

The intuition behind the second luminance constraint, which we refer to as the polarity constraint, is that a transparent medium cannot alter the contrast polarity of the
  :     1293

structures visible through it. Put another way, if a dark-light edge passes underneath a transparent medium, the dark side will remain darker than the light side, no matter what the absolute luminances are. As can be seen in Figure 86.9D, infringement of this constraint prevents perceptual scission, demonstrating that the visual system respects this optical outcome of transparency. This constraint is particularly important in determining the depth ordering in transparent displays.

Figure 86.10. The polarity constraint means that transparency manifests itself in distinctive local ordinal relations in luminance. The only difference between the three figures is the luminance of the region of overlap. In a, the region is dark and the image is bistable, as either square can be seen in front. When this occurs, a line that passes progressively from brighter to darker regions creates a Z shape. In b, the overlap is intermediate, such that the line that joins regions of decreasing brightness is C-shaped. When this happens, exactly one of the surfaces appears transparent. In c, the overlap is light, creating a crisscross pattern. In this case, neither square appears transparent, as the polarity constraint is infringed for both squares. (Adapted from Beck and Ivry, 1988.)

The polarity constraint enforces certain restrictions on the ordinal relationships between the luminances of neighboring regions. This means that, in principle, we can classify the locations where neighboring regions meet to determine whether scission is or is not possible in each region. This provides the visual system with a local signature of transparency. Beck and Ivry (1988) noted that if one draws a series of lines running progressively from the brightest to the darkest regions, there are three possible shapes that result, as shown in Figure 86.10. The only difference between the three figures is the luminance of the region of overlap between the two squares. In the first instance (Fig. 86.10A), the image is bistable, as either square can be seen as a transparent overlay. In these circumstances, the lines linking regions of increasing luminance form a Z configuration. When the lines form a C shape (Fig. 86.10B), only one of the squares is seen as transparent, and when the lines crisscross (Fig. 86.10C), the polarity constraint is infringed for all regions and neither square scissions. Adelson and Anandan (1990) provided a similar taxonomy based on the number of polarity reversals. A number of lightness illusions demonstrate that scission can be predicted from the class of X-junctions in the display and that these X-junctions can have powerful effects on many qualities of our experience (see, e.g., Adelson, 1993, 1999).

The magnitude and polarity constraints can be unified as a single rule that describes a powerful local cue to scission. Anderson (1997) phrased the rule as follows: “When two aligned contours undergo a discontinuous change in contrast magnitude, but preserve contrast polarity, the lower-contrast region is decomposed into two causal layers.” There are two valuable consequences of this rule. The first is that it unifies the two Metelli constraints. The second is that it provides a local signature of transparency that can be applied to any meeting of contours. This includes those T-junctions that are in fact degenerate X-junctions, that is, those in which two neighboring regions happen to have exactly the same luminance. Anderson also demonstrated that a number of traditional lightness phenomena, including White’s effect and its variants, and neon color spreading, can be accounted for as cases of scission rather than as the consequence of traditional “contrast” or “assimilation” processes.

Having identified that a location contains two surfaces, the visual system has to partition the luminance at that location between the two depths. How much of the light is due to reflectance of the underlying surface, and how much is due to the properties of the overlying layer? The opacity of the overlying layer determines how the luminance is divided between the two depths. Metelli’s model makes explicit predictions about the perceived opacity and lightness of the transparent layer. The equations predict that two surfaces with identical transmittance should look equally opaque irrespective of their lightness. However, Metelli himself noted that dark filters tend to look more transparent than light filters with the same transmittance. Why does the visual system confuse lightness and transmittance in partitioning luminance between two depths?

In a series of matching experiments, Singh and Anderson (2002) recently resolved this issue. Subjects adjusted the opacity of one filter until it matched the perceived opacity of another filter with a different lightness. Singh and Anderson found that perceived transmittance is predicted almost perfectly by the ratio of Michelson contrasts inside and outside the transparent region, even though
1294 , ,   

such a measure is actually inconsistent with the optics of transparency. As discussed above, there is a general consensus that early visual processing tends to optimize sensitivity to contrast rather than absolute luminance. Hence, in assigning transmittance, the visual system appears to use the readily available contrast measurements, even though they are not strictly accurate measurements of opacity.

Figural Conditions for Scission

In addition to the luminance conditions, certain geometrical relations must hold between the various regions of the display in order for depth stratification to occur (Kanizsa, 1995/1979; Metelli, 1974a). These figural conditions fall into two broad classes. The first class requires good continuation of the underlying layer. Specifically, the contours that are in plain view should be continuous with the contours viewed through the region of presumed transparency. As can be seen in Figure 86.9E, infringement of this condition interrupts the percept of transparency. The second figural condition requires good continuation of the transparent layer. Figure 86.9F shows that infringement of this condition weakens or eliminates the percept of transparency.

There are conditions in which the figural cues to transparency are so strong that they can override the luminance cues. Beck and Ivry (1988) showed subjects displays like the one shown in Figure 86.10C, in which the region of overlap between the two figures is the wrong contrast polarity for either figure to be seen as transparent. Despite this, naive subjects did occasionally report seeing such figures as transparent, demonstrating that the sense of figural overlap is a central aspect of the percept of transparency. Certainly most observers are willing to agree that the region of overlap in Figure 86.10C appears to belong to two figures simultaneously, an impression that can be enhanced with stereo and relative motion. However, it should be noted that the gray of the overlap region does not appear to scission into two distinct sources, at least not in the same way as the overlap of a normal transparency display does (as in Figs. 86.10A and 86.10B). This leads to the possibility of two distinct neural processes in the perception of transparency. One is driven by relatively local cues and leads to phenomenal color scission. The other is driven by more global geometrical relations and leads to stratification in depth. Under normal conditions of transparency, the two processes operate in concert to produce the full impression of transparency. However, using carefully designed cue-conflict stimuli, such as those used by Beck and Ivry, these two factors in the representation of transparent surfaces can be distinguished. An open question, however, is how these processes are instantiated neurally. All we can conclude is that the representation of depth is much more sophisticated than a mere two-dimensional map of depth values.

Scission and the Perceptual Organization of Depth

Scission can have pronounced effects on perceptual organization. For example, Stoner et al. (1990) demonstrated that perceived transparency can alter the integration of motion signals into coherent moving objects. When a plaid is drifted at constant velocity across the visual field, it is typically seen as a single coherent pattern that moves at the velocity of the intersections between the two component gratings. However, with prolonged viewing, the plaid appears to separate into two component gratings that slide across each other, each of which appears to move in the direction perpendicular to its orientation. When the plaid is coherent, it appears to occupy a single depth plane, but when it separates into its components, the gratings tend to appear at different depths.

Stoner et al. varied the intensity of the intersections of the plaids and measured the proportion of time for which the plaid was seen as coherent. They found that when the color of the intersection was consistent with one grating being seen through the other (i.e., when the junctions were consistent with transparency), the proportion of the time for which the plaid appeared to separate into gratings was greatly increased. By contrast, when the color of the intersections infringed the polarity constraint, such that neither grating could be seen as transparent, the pattern tended to be seen as a coherent plaid rather than undergoing scission into distinct layers. This demonstrates that scission has important consequences for the representation of visual structure. When an image region scissions, the effects can spread to regions distant from the local cues to scission.

Scission acts as a nexus between depth and other visual attributes. Scission of depth can cause regions to change in apparent lightness; conversely, changes in luminance can cause changes in depth stratification. Figure 86.11 demonstrates this close relationship between luminance, scission, and the perceptual organization of depth. Three circular patches of a random texture were placed on a uniform background. Critical to the demonstration is that disparity is introduced between the circular boundaries and the texture inside the circles. When the disparity places the texture behind the circular boundaries, the circles appear as holes through which the texture is visible. The texture tends to appear as a single plane with continuously, stochastically varying lightness. However, when the disparity places the texture in front of the circular boundaries, the percept changes considerably. The texture separates into two distinct layers: a near layer made up of clouds with spatially varying transmittance and a far layer that is visible through the
  :     1295

clouds, which consists of uniform discs on a uniform background (see Anderson, 1999; Anderson, in press).

Another interesting property of this display is that the lightness and spatial structure of the clouds and discs reverse completely when the luminance of the surround varies. In Figure 86.11, the top and bottom displays are identical except for the lightness of the surround. When the surround is dark, the texture scissions into dark, smoke-like clouds in front of white discs. However, when the surround is white, it is the light portions of the texture that move forward, floating like mist in front of dark discs. One final observation about the display is that when the texture carries near disparity, and thus undergoes scission, the clouds that float in front tend to complete modally across the gaps in between the discs. This is in part due to the fact that the conditions for camouflage are satisfied, as discussed in the second section.

Figure 86.11. Scission and the perceptual organization of depth. The top and bottom figures are identical apart from the brightness of the surround. When the right stereopair is cross-fused, the figure appears as a single textured plane that is visible through three circular holes. This is seen irrespective of the luminance of the surround. However, when the disparity is reversed (by cross-fusing the left stereopair), the texture appears to separate into two depth planes. The near layer contains near clouds that vary spatially in thickness or opacity. Through these clouds can be seen three more distant discs, which appear more or less uniform in lightness. For this depth ordering, the structure completely reverses with a change in the luminance. In the top case, the dark portions of the texture form the clouds; in the bottom case, the light portions of the texture form the clouds. Scission makes these percepts possible by allowing the visual system to separate the intermediate grays into two distinct contributions. (Adapted from Anderson, 1999.)

When the depth is reversed in the display, two asymmetries occur. The first is geometrical in that it alters the structure of the depths in the scene. In the near case the texture scissions into two layers, while in the far case the texture appears relatively uniform in depth by comparison. The second asymmetry that occurs with depth inversion is photometric in that it is driven by the luminance of the surround and determines the lightness of the cloud and discs. When the texture is distant, the percept changes very little with changes to the luminance of the surround; by contrast, when the texture is near, the luminance of the surround critically determines how the scission occurs, as well as the lightness of the cloud and discs. In what follows, we will use the CDAP discussed in the first section and the concept of scission to explain these asymmetries. For a more thorough discussion, see Anderson (in press).

Let us first consider the case in which the texture carries far disparity relative to the circular boundaries. Because the texture is continuously varying in luminance, it carries localizable disparity signals at almost every location. Put another way, if disparity is carried by contrast, as argued in the first section, then patterns that are richly structured bear the densest distribution of disparities. Recall that the CDAP requires both sides of every contrast to be at least as distant as the disparity carried by the contrast. This means that when the texture is given far disparity (or, more precisely, when the contrasts of the texture are given far disparity), both the light and dark matter in the texture recede to this depth. In turn, the depth placement of the texture uniquely determines the border ownership of the boundaries of the discs, which carry relatively near disparity. If the insides of the discs (i.e., the texture) carry far disparity, then the outsides (i.e., the region surrounding the discs) must be at the depth carried by the circular boundary. Thus, the circles are seen as holes in the surrounding surface; it is through these holes that the texture is visible.

The situation is more complex when the depth is reversed, that is, when the contrasts of the texture are nearer than the contrast of the circular boundaries. Crucial to the following argument is that it is contrasts that carry disparity, while it is the light and dark regions that make up the contrasts to which depth is assigned. First, let us consider the circular boundary between the surround and the texture. When the surround is light, it is the dark portions of the texture (inside the circles) that contrast with the surround. Thus, the disparity of the circular boundary is carried by the contrast between the light matter of the surround and the dark portion of the textured disc. The CDAP requires both of these regions to be at least as distant as the disparity carried by the boundary. This means that the light surround is dragged back to this depth, and the dark matter of the texture is also dragged back to this depth. Now consider the contrasts between the dark and light portions within the texture. These contrasts carry relatively near disparity. But the contrast between the dark matter and the surround has already constrained the dark matter to be at least as distant
1296 , ,   

as the circular boundary. This means that it must be the light matter of the texture that is responsible for the near disparity of the texture, that is, the light matter is a near surface that partly obscures the dark matter. This explains why the texture splits into two depths: the dark matter is dragged back by forming a contrast that carries far disparity (i.e., the boundary of the disc) and the light matter floats in front, as its boundaries with the dark matter carry near disparity.

The final logical step in the explanation involves scission. The texture does not consist of only two luminances, but of a continuous range of luminances from light to dark. How can we explain the appearance of the intermediate luminances in the texture? Scission makes it possible to separate the intermediate luminances into two distinct components: dark “stuff” and light “stuff,” which have been compressed into a single luminance by the process of projection onto the retina. These two components lie in different depth planes. Put another way, scission allows the visual system to interpret the gray regions as dark matter viewed through light matter. The critical insight is that it is the dark stuff in the texture that forms the contrast with the surround. Therefore, all of the dark stuff belongs to the more distant depth, including the dark stuff within the grays. All of the remaining lightness in the grays belongs to the transparent clouds that float in front of the discs. In this way, the intermediate luminances are interpreted as varying degrees of transmittance of the overlying layer. The lighter the gray, the thicker the cloud; the darker the gray, the sparser. This explains why the disc appears as a uniform dark disc: all of the dark is “sucked out” of intermediate regions and is dragged back to form the disc. The “leftover” lightness is attributed to the transparent clouds.

The whole argument reverses when we change the surround from light to dark. When the surround is dark, it is the light portions of the texture that contrast with the surround, and therefore, it is the light portions of the texture that are dragged back. The near disparity of the texture must therefore be due to the dark regions, and thus dark clouds are seen to float in front of white discs. Again, as it is the whiteness of the texture that is dragged back, all of the whiteness in the intermediate luminances is attributed to the more distant discs. The remaining darkness in the grays is attributed to the dark clouds that float in front. In this way, changing the luminance of the surround changes which contrasts carry the disparities and thus which regions are dragged back by virtue of the CDAP. Scission enables the visual system to separate luminances into multiple contributions and thus segment the intermediate grays into two distinct depth planes.

This demonstration and others like it are important, as they show how multiple processes interface to determine our percepts of depth and material quality. It is through the CDAP and scission that the visual system interprets local variations in luminance as meaningful surfaces located in depth. Depth stratification complements traditional segmentation as an important process through which the visual system organizes its representation of depth into ecologically valid structures.

Conclusions

It is common to think that depth perception involves little more than determining the depth at each location in the visual field. We have argued, to the contrary, that the visual system mirrors the structural organization of the environment by tying its representation of depth to surfaces and objects. Thus, depth perception is an active process of perceptual organization, as well as a passive process of acquiring depth estimates. We have argued that luminance, disparity, and contrast are some of the basic image features that carry local information about depth, while scission, visual completion, and the CDAP are some of the means by which depth is organized into surfaces.

In the first section, we introduced the CDAP and argued that

1. Disparity is carried by local contrasts (e.g., luminance edges) but depth is assigned to the regions that meet to form the contrasts.
2. Occlusion introduces a critical constraint on the interpretation of local disparity signals, the CDAP. This constraint requires that either both sides of a contrast lie at the depth specified by the contrast, or one side lies at that depth and the other belongs to a more distant, occluded surface. In the latter case, the disparity determines the depth of the occluding side.
3. The CDAP imposes a fundamental asymmetry between near and far structures. When simultaneously applied to all edges in a display, the CDAP can explain a number of asymmetrical changes in perceived surface layout that occur with simple inversion of the disparity field.

In the second section, we discussed how the visual system deals with structures that are invisible because they are hidden by occlusion or camouflaged against their background. We argued that

1. The visual system has to actively complete the missing data if it is to segment depth accurately into objects.
2. Consideration of the environmental conditions of occlusion and camouflage predicts (a) that modal completion is sensitive to luminance, while amodal completion is
not, and (b) that modal completion tends to occur over shorter distances than amodal completion.
3. As predicted from the environmental differences, distinct mechanisms are responsible for the two types of completion. The differences can be used to generate displays in which the completed forms differ when the disparity field is inverted.

Finally, in the third section, we discussed how scission allows the visual system to represent two depths along the same line of sight and thus organize depth into layers. We argued that

1. Certain luminance and figural relations must obtain in order for a region to undergo scission.
2. Scission can have pronounced effects on perceptual organization in regions distant from the local signatures of transparency.

REFERENCES

Adelson, E. H., 1993. Perceptual organization and the judgment of brightness, Science, 262:2042–2044.
Adelson, E. H., 1999. Lightness perception and lightness illusions, in The New Cognitive Neurosciences (M. Gazzaniga ed.-in-chief), Cambridge, MA: MIT Press.
Adelson, E. H., and P. Anandan, 1990. Ordinal characteristics of transparency, Boston: AAAI-90 Workshop on Qualitative Vision, July 29.
Anderson, B. L., 1997. A theory of illusory lightness and transparency in monocular and binocular images: the role of contour junctions, Perception, 26:419–453.
Anderson, B. L., 1999. Stereoscopic surface perception, Neuron, 24:919–928.
Anderson, B. L., in press. The role of occlusion in the perception of depth, lightness, and opacity, Psychological Review, (in press).
Anderson, B. L., and K. Nakayama, 1994. Towards a general theory of stereopsis: binocular matching, occluding contours and fusion, Psychol. Rev., 101:414–445.
Anderson, B. L., M. Singh, and R. W. Fleming, 2002. The interpolation of object and surface structure, Cogn. Psychol., 44:148–190.
Beck, J., and R. Ivry, 1988. On the role of figural organization in perceptual transparency, Percept. Psychophys., 44:585–594.
Bruce, V., P. R. Green, and M. A. Georgeson, 1996. Visual Perception, 3rd ed., Hove, East Sussex, UK: Psychology Press.
Cornsweet, T. N., 1970. Visual Perception, New York: Academic Press.
DeValois, R. L., and K. K. DeValois, 1988. Spatial Vision, New York: Oxford University Press.
Hartline, H. K., 1940. The receptive fields of optic nerve fibres, Am. J. Physiol., 130:690–699.
Howard, I. P., and B. J. Rogers, 1995. Binocular Vision and Stereopsis, New York: Oxford University Press.
Hubel, D. H., and T. N. Wiesel, 1962. Receptive fields, binocular interaction and functional architecture of monkey striate cortex, J. Physiol., 160:106–154.
Jones, J., and J. Malik, 1992. A computational framework for determining stereo correspondence from a set of linear spatial filters, Image Vis. Comput., 10:699–708.
Julesz, B., 1960. Binocular depth perception of computer generated patterns, Bell Syst. Tech. J., 39:1125–1162.
Julesz, B., 1971. Foundations of Cyclopean Perception, Chicago: University of Chicago Press.
Kellman, P. J., and T. F. Shipley, 1991. A theory of visual interpolation in object perception, Cogn. Psychol., 23:141–221.
Kellman, P. J., C. Yin, and T. F. Shipley, 1998. A common mechanism for illusory and occluded object completion, J. Exp. Psychol. Hum. Percept. Perform., 24:859–869.
Kanizsa, G., 1955/1979. Organization in Vision, New York: Praeger.
Koffka, K., 1935. Principles of Gestalt Psychology, Cleveland: Harcourt, Brace and World.
Marr, D., and T. Poggio, 1976. Cooperative computation of stereo disparity, Science, 194:283–287.
Marr, D., and T. Poggio, 1979. A computational theory of human stereo vision, Proc. R. Soc. Lond. B, 204:301–328.
Metelli, F., 1970. An algebraic development of the theory of perceptual transparency, Ergonomics, 13:59–66.
Metelli, F., 1974a. The perception of transparency, Sci. Am., 230:90–98.
Metelli, F., 1974b. Achromatic color conditions in the perception of transparency, in Perception: Essays in Honor of J. J. Gibson (R. B. MacLeod and H. L. Pick, eds.), Ithaca, NY: Cornell University Press.
Metelli, F., O. da Pos, and A. Cavedon, 1985. Balanced and unbalanced, complete and partial transparency, Percept. Psychophys., 38:354–366.
Michotte, A., G. Thines, and G. Crabbe, 1964/1991. Amodal completion of perceptual structures, in Michotte’s Experimental Phenomenology of Perception (G. Thines, A. Costall, and G. Butterworth, eds.), Hillsdale, NJ: Erlbaum, pp. 140–167.
Nakayama, K., S. Shimojo, and G. H. Silverman, 1989. Stereoscopic depth. Its relation to image segmentation, grouping, and the recognition of occluded objects, Perception, 18:55–68.
Palmer, S. E., 1999. Vision Science, Cambridge, MA: MIT Press.
Petter, G., 1956. Nuove ricerche sperimentali sulla totalizzazione percettiva, Riv. Psicol., 50:213–227.
Pollard, S. B., J. E. W. Mayhew, and J. P. Frisby, 1985. A stereo correspondence algorithm using a disparity gradient limit, Perception, 14:449–470.
Prazdny, K., 1985. Detection of binocular disparities, Biol. Cybern., 52:93–99.
Ratliff, F., 1965. Mach Bands: Quantitative Studies on Neural Networks in the Retina, San Francisco: Holden-Day.
Ringach, D. L., and R. Shapley, 1996. Spatial and temporal properties of illusory contours and amodal boundary completion, Vis. Res., 36:3037–3050.
Shipley, T. F., and P. J. Kellman, 1992. Perception of partly occluded objects and illusory figures: evidence for an identity hypothesis, J. Exp. Psychol. Hum. Percept. Perform., 18:106–120.
Singh, M., and B. L. Anderson, 2002. Toward a perceptual theory of transparency, Psychol. Rev., 109(3):492–519.
Smallman, H. S., and S. P. McKee, 1995. A contrast ratio constraint on stereo matching, Proc. R. Soc. Lond. B, 260:265–271.
Sperling, G., 1970. Binocular vision: a physiological and neural theory, Am. J. Psychol., 83:461–534.
Stoner, G. R., T. D. Albright, and V. S. Ramachandran, 1990. Transparency and coherence in human motion perception, Nature, 344:153–155.
Sugita, Y., 1999. Grouping of image fragments in primary visual cortex, Nature, 401:269–272.
Takeichi, H., T. Watanabe, and S. Shimojo, 1992. Illusory occluding contours and surface formation by depth propagation, Perception, 21:177–184.
Wallach, H., 1948. Brightness constancy and the nature of achromatic colors, J. Exp. Psychol., 38:310–324.
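
As an illustration only, the scission argument from the third section can be sketched numerically. The sketch assumes a simple linear mixing model in which each texture luminance is a blend of a far disc and a near, partly transparent cloud. This mixing model, the names `Layers`, `scission`, and `composite`, and the 0-to-1 luminance scale are assumptions of the example, not part of the chapter. The disc is given the texture extreme that contrasts with the surround (the matter that the CDAP drags back), and whatever remains is attributed to the cloud as opacity.

```python
from typing import NamedTuple


class Layers(NamedTuple):
    """Hypothetical two-layer reading of one texture luminance."""
    disc: float           # luminance assigned to the far disc
    cloud: float          # luminance assigned to the near cloud
    cloud_opacity: float  # 0 = no cloud here, 1 = fully opaque cloud


def scission(gray: float, surround_is_light: bool,
             darkest: float = 0.0, lightest: float = 1.0) -> Layers:
    """Split a texture luminance into a far disc and a near cloud.

    Assumed model: gray = opacity * cloud + (1 - opacity) * disc.
    """
    if darkest >= lightest:
        raise ValueError("darkest must be below lightest")
    if not (darkest <= gray <= lightest):
        raise ValueError("gray must lie within the texture range")

    if surround_is_light:
        # Dark texture matter contrasts with the light surround, so the
        # dark matter is dragged back and forms the disc.
        disc, cloud = darkest, lightest
    else:
        # Light texture matter contrasts with the dark surround, so the
        # light matter is dragged back and forms the disc.
        disc, cloud = lightest, darkest

    # Whatever the disc cannot account for is attributed to the cloud.
    opacity = (gray - disc) / (cloud - disc)
    return Layers(disc, cloud, opacity)


def composite(layers: Layers) -> float:
    """Recombine the two layers into the single projected luminance."""
    a = layers.cloud_opacity
    return a * layers.cloud + (1.0 - a) * layers.disc


if __name__ == "__main__":
    for light_surround in (True, False):
        print(light_surround, scission(0.75, light_surround))
```

Under these assumptions, a gray of 0.75 on a light surround is read as a dark disc seen through a light cloud of opacity 0.75, whereas the same gray on a dark surround is read as a white disc behind a dark cloud of opacity 0.25. In both cases recombining the layers returns the original luminance.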
  :     1299
