
Journal of Vision (2010) 10(6):4, 1–22
http://www.journalofvision.org/content/10/6/4

Visual detection of symmetry of 3D shapes


Tadamasa Sawada
Department of Psychological Science, Purdue University, West Lafayette, IN, USA

This study tested perception of symmetry of 3D shapes from single 2D images. In Experiment 1, performance in
discrimination between symmetric and asymmetric 3D shapes from single 2D line drawings was tested. In Experiment 2,
performance in discrimination between different degrees of asymmetry of 3D shapes from single 2D line drawings was
tested. The results showed that human performance in the discrimination was reliable. Based on these results, a
computational model that performs the discrimination from single 2D images is presented. The model first recovers the 3D
shape using a priori constraints: 3D symmetry, maximal 3D compactness, minimum surface area, and maximal planarity of
contours. Then the model evaluates the degree of symmetry of the 3D shape. The model provided a good fit to the subjects'
data.
Keywords: symmetry, a priori constraints, maximum 3D compactness, minimum surface area, planarity of contours,
3D shape recovery, kinetic depth effect
Citation: Sawada, T. (2010). Visual detection of symmetry of 3D shapes. Journal of Vision, 10(6):4, 1–22,
http://www.journalofvision.org/content/10/6/4, doi:10.1167/10.6.4.

Introduction

The following three types of symmetry are directly related to shape perception: mirror, rotational, and translational (Figure 1) (Mach, 1906/1959). Each type of symmetry is formally defined as invariance under some transformation: reflection, rotation, or translation. For example, in the case of mirror symmetry, one symmetric half of the object can be thought of as a mirror reflection of the other half. In two dimensions (2D), an axis of symmetry plays the role of a mirror. In three dimensions (3D), a plane of symmetry plays the role of a mirror.

Systematic study of perception of symmetry started with Mach (1906/1959) and was followed by the Gestalt psychologists (Koffka, 1935; Wertheimer, 1923). It has been shown that mirror symmetry is detected by the human visual system more reliably than other types of symmetry (for reviews, see Wagemans, 1997; Zabrodsky, 1990). This preference for mirror symmetry seems reasonable because many natural and man-made objects in the real world are mirror symmetric or at least approximately mirror symmetric. It follows that detecting and recognizing 3D mirror-symmetric objects is important. In this paper, we will use "symmetry" to mean "mirror symmetry."

In the past, visual perception of symmetry has been studied primarily using symmetric retinal images. The human visual system can reliably detect symmetry on the retina, especially when the axis of symmetry is vertical and at the center of the retina (e.g., Barlow & Reeves, 1979; Jenkins, 1983; Julesz, 1971). However, the retinal image of a symmetric shape "out there" is symmetric only for a small set of viewing directions.

Consider first the case of a retinal image of a 2D (planar) symmetric shape. The line segments connecting pairs of symmetric points of a 2D symmetric shape are called symmetry line segments. Symmetry line segments are parallel to one another, and their midpoints are collinear. Furthermore, the line connecting the midpoints is orthogonal to the symmetry line segments and it coincides with the symmetry axis of the 2D shape. If a planar symmetric shape is slanted relative to the observer, its retinal image is asymmetric and is called 2D skewed symmetry (Kanade, 1981). In an orthographic image, the projections of symmetry line segments are parallel to one another and their midpoints are collinear. However, the line connecting the midpoints is not orthogonal to the projections of the parallel line segments. In a perspective image, the projections of symmetry line segments are not parallel and their midpoints are not collinear (Sawada & Pizlo, 2008a; Wagemans, van Gool, & d'Ydewalle, 1992). Specifically, the lines representing the projections of symmetry line segments intersect at a single point called the vanishing point. It was shown that performance in detection of 2D symmetric shapes based on a single retinal image that itself was skewed symmetric is reliable but is worse than in detection of symmetry on the retina (Sawada & Pizlo, 2008a; Wagemans, van Gool, & d'Ydewalle, 1991; Wagemans et al., 1992). Skewed symmetry in prior experiments was produced by using perspective, orthographic, or projective images. It is perspective projection that correctly simulates the rules of geometrical optics (Pizlo, Rosenfeld, & Weiss, 1997a, 1997b). A number of studies demonstrated that the visual system uses the rules of perspective projection in shape perception (Kaiser, 1967; Pizlo & Salach-Golyska, 1995; Wagemans, Lamote, & van Gool, 1997; Yang & Kubovy, 1999).

doi:10.1167/10.6.4   Received September 12, 2009; published June 4, 2010   ISSN 1534-7362 © ARVO

Downloaded from jov.arvojournals.org on 04/04/2023


Figure 1. Three types of symmetry related to shape perception (after Mach, 1906/1959). (a) Mirror symmetry is invariant with respect to reflection. A reflection transformation of a shape about its axis of symmetry results in a shape congruent with the original shape. (b) Rotational symmetry is invariant with respect to rotation. (c) Translational symmetry is invariant with respect to translation.

An orthographic projection is an approximation to a perspective projection. This approximation is good when the range in depth of the simulated object is small compared to the viewing distance. A projective transformation on the retina is produced when a perspective image on a computer screen is viewed from a wrong vantage point (Pirenne, 1970; Pizlo, 2008). This fact is related to the well-known theorem of projective geometry that a composition of two perspective projections is itself a projective rather than perspective projection. Interestingly, the detection of symmetry based on a skewed symmetric retinal image is slightly more reliable with orthographic than perspective images (Sawada & Pizlo, 2008a). The detection of symmetry seems least reliable with projective images (see examples in Figure 2). This result suggests that the visual system uses the rules of orthographic rather than perspective projection in detection of symmetry from skewed symmetric images. This makes sense considering that orthographic projection is computationally simpler than perspective projection and under normal viewing conditions is likely to provide a good enough approximation. Next, we explain the properties of 3D symmetry. Perception of symmetric 3D shapes and the detection of 3D symmetry from a single 2D retinal image received very little, if any, attention in the past.

Figure 2. A symmetric 2D shape (a) and its transformations when it is slanted relative to the observer: (b) perspective transformation and (c) orthographic transformation. Slant is 70 degrees and tilt is 85 degrees in both these transformations. When the reader's line of sight is orthogonal to this page at the cross F and the viewing distance is equal to the distance between F and D, the retinal images of both (b) and (d) are perspective images of the shape in (a) with slant 70 and tilt 85 degrees. When the reader's line of sight is orthogonal to this page at any other point than F, the retinal image of (d) is a projective image of the shape in (a).

Consider a symmetric 3D shape, like that in Figure 3. The 2D retinal image of a symmetric 3D shape is itself symmetric only if the line of sight lies on the symmetry plane (Figure 3a). For other viewing orientations, the 2D retinal image is asymmetric and will be called 3D skewed



symmetry (Figure 3b). The line segments, which connect pairs of 3D symmetric points, are called 3D symmetry line segments. They all are parallel to the normal of the common symmetry plane, and they are bisected by this plane. It follows that the 3D symmetry line segments are parallel to one another and their midpoints are coplanar. Unlike the 2D case, the 3D symmetry line segments are not coplanar and their midpoints are not collinear. Next, consider a 2D orthographic image of a symmetric 3D shape. The projections of 3D symmetry line segments (called henceforth 2D symmetry line segments) are parallel to one another. Unlike the 2D case, the midpoints of 2D symmetry line segments are not collinear. In a perspective image, the 2D symmetry line segments are not parallel: Their extrapolations intersect at the vanishing point.

Figure 3. A symmetric polyhedron, its symmetry plane and symmetry line segments. The 3D symmetry line segments are parallel to one another, and their projections (2D symmetry line segments) are also parallel to one another under an orthographic projection. The 3D symmetry line segments are parallel to the normal of the symmetry plane and they are bisected by the symmetry plane.

Despite the fact that the 2D retinal image of a symmetric 3D shape is asymmetric, symmetry of the 3D shape is a useful constraint in 3D shape recovery (Pizlo, 2008; Pizlo, Sawada, Li, Kropatsch, & Steinman, 2010). It has been shown that symmetry of a 3D shape facilitates human performance of 3D shape recognition (Chan, Stevenson, Li, & Pizlo, 2006; Liu & Kersten, 2003; Pizlo & Stevenson, 1999; van Lier & Wagemans, 1999; Vetter, Poggio, & Bülthoff, 1994). These prior studies showed that the human observer can reliably recognize the same symmetric 3D shape from different viewpoints. Note that the 2D retinal image of a 3D shape changes when the viewpoint changes. The reliable performance of 3D shape recognition requires veridical perception of the 3D shape from each 2D image. Li, Pizlo, and Steinman (2009) proposed a model that can recover a veridical 3D shape from a single 2D image of a symmetric 3D shape. Their model uses symmetry of the 3D shape as an implicit constraint; i.e., the recovered 3D shape is always symmetric. They showed that the 3D shape recovered by the model is very similar to a 3D shape perceived by the observer from the same 2D image. However, a question arises as to how the human visual system knows whether the 2D image was produced by a 3D symmetric shape. Either the symmetry of the 3D shape is determined before the 3D shape is recovered, or the 3D shape recovery maximizes the 3D symmetry (rather than assuming that the shape is symmetric) and the symmetry of the 3D shape is a byproduct of the shape recovery process. Clearly, detecting symmetry of a 3D shape is important for a human observer, and our everyday life experience suggests that we can accomplish this task very well. However, there has been no systematic study testing human performance in detecting symmetry of a 3D shape from a single 2D image.

It is trivially true that any 2D image of a symmetric 3D shape is consistent with infinitely many asymmetric 3D shapes. In order to see this, imagine displacing one point of a symmetric 3D shape along the line connecting this point with its image. The new 3D shape will be asymmetric, but its 2D image will not change. It is less trivial, but also true, that there are asymmetric 3D shapes such that each of their 2D images is consistent with infinitely many symmetric 3D shapes. In this study, we show that the human observer can reliably discriminate between symmetric and asymmetric 3D shapes from single 2D images. The human performance cannot be predicted from the features of the 2D retinal image, even though the 2D retinal image is the only visual data provided to the observer. Clearly, the human visual system "goes beyond the information given" (Bartlett, 1932): It uses a priori constraints that determine the likelihood of the 3D interpretations of a given 2D image. We present a new computational model that explains the nature of these constraints. Performance of the model in a 3D symmetry discrimination task is very similar to the performance of human subjects.

Experiment 1: Discrimination between symmetric and asymmetric 3D shapes from single 2D images and from kinetic depth effect

Methods

Subjects

Three subjects (including the author) were tested. All subjects had prior experience as subjects in psychophysical experiments. TS received extensive practice before being



tested. TS and ZP knew the purpose of the experiment. RV was naïve about the purpose. All subjects had normal or corrected-to-normal vision.

Apparatus

The stimuli were shown on an LCD monitor with 1280 × 1024 resolution and 60 Hz refresh rate. The subject viewed the monitor with the right eye from a distance of 40 cm in a dark room. The subject wore an eye patch on his left eye. The subject's head was supported by a chin-forehead rest. The subject's line of sight was orthogonal to the monitor.

Stimuli

The 2D orthographic images (line drawings) of abstract symmetric and asymmetric polyhedra were used (Figure 4). Note that the 2D image was always asymmetric, regardless of whether the image was produced by a symmetric or an asymmetric 3D shape. The subject was asked to judge the symmetry of the 3D shape, not the symmetry of the 2D image. The abstract shapes were used to avoid confounding effects of familiarity (Chan et al., 2006; Pizlo & Stevenson, 1999). Each symmetric polyhedron had 16 vertices. The positions of the vertices were generated randomly in 3D under the following restrictions: the vertices formed eight symmetric pairs, the faces of the polyhedron were planar, and the polyhedron consisted of two small boxes and one big box. The symmetric polyhedron had 28 straight edges and 14 planar faces, which were quadrangles. Eight out of the 28 edges connected the symmetric pairs of vertices. These eight edges were 3D symmetry line segments.1

An asymmetric polyhedron was generated by distorting the symmetric one. There were three types of asymmetric polyhedra corresponding to three experimental conditions. In Condition-R (random distortion), the faces of an asymmetric polyhedron were not planar and the line segments which were 3D symmetry line segments in its original symmetric polyhedron were not parallel (Figure 4b). In Condition-N (non-parallel 3D symmetry line segments), the faces of an asymmetric polyhedron were planar but its "3D symmetry line segments" were not parallel (Figure 4c). In Condition-P (parallel 3D symmetry line segments), the faces of an asymmetric polyhedron were planar and its "3D symmetry line segments" were parallel (Figure 4d). This condition is especially interesting because every orthographic image of an asymmetric object of this type is consistent with a symmetric 3D interpretation. This means that every image in this condition, regardless of whether the image was produced by a symmetric or an asymmetric 3D shape, was consistent with both a symmetric and an asymmetric 3D interpretation. Will the subject be able to perform above chance level in this condition?

The distortion that was used to produce an asymmetric polyhedron in each condition was made by displacing eight out of its 16 vertices. In Condition-N, the vertices were displaced along the directions of the edges connecting the vertices. In Condition-P, the vertices were displaced along the "3D symmetry line segments." In Conditions-N and -P, two of the eight vertices were displaced first, and then the other six vertices were displaced so that the planarity of the faces was preserved. The amount of the distortion was controlled by restricting the range of the random displacement of the first two vertices in Conditions-N and -P and of all eight vertices in Condition-R.

We also tested the effect of the kinetic depth effect on the discrimination. Note that a single 2D image is ambiguous but three or more 2D images are sufficient for reconstructing a unique 3D shape (Ullman, 1979). In each trial of this condition (called kinetic), the 3D polyhedron rotated around the vertical axis. The magnitude of rotation was 10 degrees. During the duration of the trial (0.5 sec), the subject was presented with 30 images.

Figure 4. Orthographic images of a symmetric polyhedron (a) and of three types of asymmetric polyhedra generated by distorting a symmetric polyhedron. (a) A symmetric polyhedron. The faces of the symmetric polyhedron were planar. The 3D symmetry line segments were always parallel. (b) An asymmetric polyhedron in Condition-R. The faces were not planar and the "3D symmetry line segments" were not parallel. (c) An asymmetric polyhedron in Condition-N. The faces were planar but the "3D symmetry line segments" were not parallel. (d) An asymmetric polyhedron in Condition-P. The faces were planar and the "3D symmetry line segments" were parallel. The amount of the distortion for generating these asymmetric polyhedra (b–d) was the largest (L4).
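The Condition-P manipulation can be sketched in a few lines. This toy example uses hypothetical coordinates, draws displacement magnitudes from the L1 range used in the experiment, and omits the planarity bookkeeping applied to the remaining six vertices; it only shows why displacing vertices along the symmetry-line direction destroys mirror symmetry while leaving the "3D symmetry line segments" parallel:

```python
import random

# A toy "symmetric polyhedron": vertex pairs mirror-symmetric about the x = 0
# plane, so every 3D symmetry line segment points along the x-axis (the
# normal of the symmetry plane).
random.seed(1)
pairs = [((-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
         ((-1.5, 1.0, 0.5), (1.5, 1.0, 0.5)),
         ((-0.8, 2.0, 1.2), (0.8, 2.0, 1.2))]

def seg_dir(p, q):
    """Direction of segment pq, normalized by its largest component."""
    d = (q[0] - p[0], q[1] - p[1], q[2] - p[2])
    n = max(abs(c) for c in d)
    return tuple(c / n for c in d)

# Condition-P-style distortion: displace one endpoint of each pair ALONG the
# symmetry-line direction (here the x-axis), by an amount in the L1 range.
distorted = []
for p, q in pairs:
    shift = random.uniform(0.07, 0.14) * random.choice((-1, 1))
    distorted.append(((p[0] + shift, p[1], p[2]), q))

# The "3D symmetry line segments" of the distorted shape are still parallel...
dirs = [seg_dir(p, q) for p, q in distorted]
assert all(d == (1.0, 0.0, 0.0) for d in dirs)

# ...but the shape is no longer mirror-symmetric: the midpoints of the
# segments no longer lie on a common plane x = 0.
mids_x = [(p[0] + q[0]) / 2 for p, q in distorted]
assert any(abs(m) > 1e-6 for m in mids_x)
```

Because the displacements are parallel to the segments themselves, the parallelism invariant of 3D mirror symmetry survives in every orthographic image of such a shape, which is what makes Condition-P images formally ambiguous.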



An orthographic projection was used to produce the 2D images of the polyhedra. Hidden contours were removed. Each polyhedron was randomly oriented in 3D space with the following two restrictions: at least one vertex of each pair of symmetric vertices, and at least ten vertices forming five symmetric pairs, had to be visible. These constraints allow the computational model to recover the entire polyhedron, both the visible and the hidden parts (see Appendix A). The polyhedra were drawn in white on a dark background with high contrast. The width of the contour was 2 pixels (0.6 mm). The image of the polyhedron subtended 16.5 degrees (11.5 cm) on average.

Procedure

The method of signal detection was used. Each session consisted of 200 trials: 100 trials with an image of a symmetric polyhedron and 100 trials with an image of an asymmetric polyhedron, presented in a random order. There were 24 experimental conditions: three types of an asymmetric polyhedron (Condition-R vs. -N vs. -P) × four levels of distortion for generating the asymmetric polyhedra (L1–L4) × two types of display (static vs. kinetic). The levels of distortion corresponded to the extent by which the vertices had been moved: 0.07–0.14 cm (L1), 0.14–0.28 cm (L2), 0.28–0.56 cm (L3), and 0.56–1.12 cm (L4). Each session tested a single condition. Before each session, the subject was informed about the condition. Each session started with a block of 20 practice trials. The subject ran two sessions for each condition. The order of sessions was randomized.

Each trial began with a fixation cross. After pressing the mouse button, the fixation cross disappeared and the stimulus was shown for 500 ms. The subject's task was to respond whether or not the presented polyhedron was symmetric. After each trial, feedback about the accuracy of the response was given. The performance of the subject was evaluated by the discriminability measure d′ used in signal detection theory and its standard error. Higher performance corresponds to higher values of d′; d′ = 0 represents chance performance and d′ = ∞ represents perfect performance. d′ was computed for each session. The standard error was computed from the two values of d′.

Results and discussion

Results of individual subjects and the averaged results are shown in Figure 5. The ordinate shows d′ and the abscissa shows the level of distortion of the asymmetric polyhedra.2 The results of static and kinetic conditions are plotted separately. The three curves indicate the three asymmetry types: Condition-R (circles), -N (triangles), and -P (squares). The results were analyzed using a three-way ANOVA within-subjects design: conditions (R vs. N vs. P) × levels of distortion for generating asymmetric polyhedra (L1–L4) × type of display (static vs. kinetic).

As expected, discrimination was easier when the distortion was greater (F(3, 46) = 249.94, p < 0.001) and when the presentation was kinetic (F(1, 46) = 24.42, p < 0.001). Performance was significantly higher than chance level for all asymmetry types even in the static L1 condition, in which the distortion of asymmetric shapes was the smallest (R: t(5) = 5.10, p < 0.005; N: t(5) = 3.44, p < 0.05; P: t(5) = 3.29, p < 0.05). This means that the subjects could reliably discriminate between symmetric and asymmetric 3D polyhedra even when only one 2D image was provided. The main effect of the asymmetry type was also significant (F(2, 46) = 105.18, p < 0.001). An interaction between the asymmetry type and the level of distortion was also significant (F(6, 46) = 5.89, p < 0.001), but this interaction was most likely due to a floor effect. An a posteriori test (Tukey HSD) showed that the difference between Conditions-N and -P was significant in L4 (p < 0.005) but was not significant in the L1, L2, and L3 conditions (p > 0.05). The other interactions were not significant (p > 0.05). Performance was the best in Condition-R and the worst in Condition-P. This was expected. The most interesting result, however, is that the subject was able to perform above chance level in static Condition-P. Recall that in this condition, each image of an asymmetric polyhedron is consistent with a symmetric 3D interpretation. It follows that in all trials in this condition, the 2D image was consistent with both a symmetric and an asymmetric 3D interpretation regardless of whether the 2D image was produced by a symmetric or an asymmetric 3D shape. So, mathematically, the images in this condition were completely ambiguous. How could the subject make the discrimination? It seems that the only way to make the discrimination reliably is to evaluate the likelihood of the symmetric and asymmetric 3D interpretations. The computational model described later in this paper does this by recovering a 3D shape by means of maximizing a cost function that includes four a priori constraints. Once the 3D shape is recovered, its 3D symmetry is evaluated.

Finally, our results show that although performance in the kinetic condition was higher than that in the static condition, the improvement was rather small. On average, d′ improved by only 0.26. These results suggest that the kinetic depth effect is of secondary importance in 3D symmetry perception. Pictorial information provided by a single 2D image plays a major role.

Experiment 2: Discrimination between degrees of asymmetry of 3D shapes from single 2D images

Results of Experiment 1 show that the subject can reliably discriminate between symmetric and asymmetric 3D shapes even from single 2D images. The discrimination




was easier when the asymmetric shape was distorted more. This suggests that 3D asymmetry is a continuous
perceptual feature and that the degree of asymmetry can
be perceptually judged. This has already been demon-
strated in the case of 2D shapes (Barlow & Reeves,
1979; Tjan & Liu, 2005; see also Zabrodsky & Algom,
1994; Zimmer, 1984). Barlow and Reeves (1979) showed
that the subjects could discriminate more asymmetric
from less asymmetric 2D dot patterns on the retina. The
experiment presented here is a 3D version of Barlow and
Reeves’ experiment. The main difference is that here 3D
rather than 2D stimuli were used, and the stimuli were
shapes rather than dots.
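A continuous measure of 2D asymmetry of the kind needed for such a judgment can be defined in many ways. The sketch below is only an illustration (the dot coordinates are hypothetical, and the measure is not the one used by Barlow and Reeves): it reflects a dot pattern about the vertical axis and sums each dot's distance to the nearest reflected dot, so a perfectly mirror-symmetric pattern scores zero and the score grows with distortion.

```python
import math

def asymmetry_2d(dots):
    """Toy asymmetry measure for a 2D dot pattern: reflect the pattern about
    the vertical axis and sum, over dots, the distance to the nearest
    reflected dot. Zero for a perfectly mirror-symmetric pattern."""
    reflected = [(-x, y) for x, y in dots]
    return sum(min(math.dist(p, q) for q in reflected) for p in dots)

symmetric = [(-1.0, 0.0), (1.0, 0.0), (-0.5, 1.0), (0.5, 1.0)]
print(asymmetry_2d(symmetric))         # → 0.0

jittered = [(-1.0, 0.0), (1.1, 0.0), (-0.5, 1.0), (0.5, 1.2)]
print(asymmetry_2d(jittered) > 0)      # → True
```

A graded score of this sort is what makes "more asymmetric vs. less asymmetric" a well-defined discrimination task.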

Methods

The experimental method was the same as in Experiment 1 except as indicated below. All polyhedra used in
Experiment 2 were generated the same way as the
asymmetric polyhedra in Experiment 1. Recall that the
asymmetric polyhedra in Experiment 1 were generated
from symmetric ones by applying four levels of distortion
(L1–L4). The level of the distortion represents the degree
of asymmetry; the more a symmetric polyhedron was distorted, the more asymmetric it was. Only static 2D images of the
polyhedra were used in Experiment 2.
Each session consisted of 200 trials with distorted
symmetric polyhedra: half of the polyhedra were more
asymmetric than the other half. The subject’s task was to
respond whether the presented polyhedron was more or less
asymmetric. Before each session, the subject ran a block
of 20 practice trials. The more asymmetric polyhedra
always came from level L4 of distortion. There were
9 experimental conditions: asymmetry type (Condition-R
vs. -N vs. -P) × three levels of distortion for generating the
less asymmetric polyhedra (L1–L3). Each condition was
replicated twice.
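Performance in Experiment 2, as in Experiment 1, was summarized with the signal-detection measure d′ described under Procedure above. A minimal sketch of the computation, with hypothetical response counts:

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Signal-detection d' = z(hit rate) - z(false-alarm rate).
    (Rates of exactly 0 or 1 would need a correction before z-transforming.)"""
    z = NormalDist().inv_cdf
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    return z(hit_rate) - z(fa_rate)

# Hypothetical session of 200 trials (100 targets, 100 non-targets):
# target responses to 80 of the targets and 30 of the non-targets.
print(round(d_prime(80, 20, 30, 70), 3))   # → 1.366
print(d_prime(50, 50, 50, 50))             # chance performance → 0.0
```

Here d′ = 0 corresponds to chance and larger values to better discrimination, matching the convention stated in Experiment 1.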

Figure 5. Results from Experiment 1. The ordinate shows d′, and the abscissa shows levels of distortion of asymmetric polyhedra. The three curves indicate the three types of asymmetric polyhedra. Results from the static condition are plotted on the left and those from the kinetic condition are plotted on the right. (a) Results of individual subjects. Error bars represent the standard errors calculated from two sessions for each condition. (b) Averaged results from all three subjects. Error bars represent the standard errors calculated from three subjects.

Results and discussion

Results of individual subjects and the averaged results are shown in Figure 6. The ordinate shows d′, and the abscissa shows levels of distortion of the less asymmetric




polyhedra. The three curves indicate the asymmetry types. The results were analyzed using a two-way ANOVA within-subjects design.

As expected, discrimination was easier when the difference between the two degrees of the distortion was greater (F(2, 16) = 99.99, p < 0.001). Performance was significantly higher than chance level for all asymmetry types even in the L3 condition, in which the difference was the smallest (R: t(5) = 14.25, p < 0.001; N: t(5) = 10.07, p < 0.05; P: t(5) = 18.21, p < 0.05). This means that the subjects could reliably discriminate between two degrees of asymmetry of 3D shapes from single 2D images. The main effect of the asymmetry type was also significant (F(2, 16) = 26.29, p < 0.001). An interaction between these two factors was not significant (F(4, 16) = 1.76, p = 0.187). Performance was the best in Condition-R and the worst in Condition-P. As in Experiment 1, the subject was able to perform above chance level in Condition-P. These results show that the human visual system can measure the degree of asymmetry of a 3D shape and use this measure in discriminating between more and less asymmetric 3D shapes. The nature of the perceptual metric for asymmetry will be discussed in the next section.

Figure 6. Results from Experiment 2. The ordinate shows d′, and the abscissa shows levels of distortion of asymmetric polyhedra. The three curves indicate the three types of asymmetric polyhedra. (a) Results of individual subjects. Error bars represent the standard errors calculated from two sessions for each condition. (b) Averaged results from all three subjects. Error bars represent the standard errors calculated from three subjects.

One subject (TS) ran a control experiment in the presence of the kinetic depth effect. The results are shown in Figure 7. On average, d′ improved by only 0.19. As in Experiment 1, the kinetic depth effect did not substantially improve the performance.

Figure 7. Results of TS in the kinetic condition of Experiment 2 (the panel on the right). Results of TS in the static condition (Experiment 2) are also plotted on the left. Error bars represent the standard errors calculated from two sessions for each condition.

Computational model of discrimination between symmetric and asymmetric 3D shapes from single 2D images

The 2D images of both the symmetric and asymmetric 3D shapes are asymmetric (except for degenerate views, which were excluded from our psychophysical and simulation experiments). Hence, discrimination between the symmetric and asymmetric 3D shapes could not be based on 2D symmetry of their 2D images (see Appendix B for the simulation results supporting this claim). Recall that the 3D symmetry line segments of a symmetric 3D shape are all parallel to one another and the same is true in any 2D orthographic projection of the 3D shape. This is the only invariant of 3D mirror symmetry. Consider an asymmetric 3D polyhedron with parallel line segments (Figure 4d). The parallelism of these line segments is also preserved in a 2D orthographic image. It is easy to show



(see below) that any image of this asymmetric polyhedron is consistent, in principle, with a symmetric 3D interpretation. Interestingly, human subjects can discriminate between images of a symmetric 3D shape and an asymmetric 3D shape with parallel line segments (Condition-P in Experiment 1). They could also judge the degree of asymmetry of a 3D shape with parallel line segments from a single 2D image (Experiment 2). These results suggest that the parallelism of 2D symmetry line segments, which is the only invariant of 3D mirror symmetry under orthographic projection, is not sufficient to explain human performance. Human subjects must use other properties of the 3D shape in their judgments. The model described below simulates the discrimination between symmetric and asymmetric 3D shapes by first recovering the 3D shape from a single 2D image. The recovery is based on four a priori constraints: symmetry of the 3D shape, planarity of contours, compactness of the 3D shape, and the 3D shape's surface area. All these constraints are used in a cost function whose maximum determines the perceived 3D shape.

Measure of asymmetry of a 3D shape in the 2D image

The model consists of two main stages. In the first stage, a 3D polyhedron is recovered from a single 2D image. This stage is an elaboration of models proposed in our prior studies (Li et al., 2009; Sawada & Pizlo, 2008b). In the second stage, the asymmetry of the recovered polyhedron is measured and compared to a criterion in order to decide whether or not the recovered 3D shape is symmetric. A 2D orthographic image of a 3D polyhedron is used as input. Note that the model does not perform figure-ground organization of the image; the given 2D image is already organized. Specifically, the model is provided with the visible contours of the polyhedron and with the information about which contours form the faces of the polyhedron. For the case of a symmetric polyhedron, the model is also given the information about which vertices form symmetric pairs and are connected by 2D symmetry line segments. Recall that the asymmetric polyhedra used in the psychophysical experiments were generated by distorting symmetric ones. Hence, the asymmetric polyhedra were approximately symmetric, and the model was given the information about which pairs of vertices are approximately symmetric and which line segments are approximately 2D symmetry line segments. Performance of the model was evaluated using the same 3D simulated shapes and their 2D images that were used in Experiments 1 and 2.

The 3D shape recovery from a single 2D image is an ill-posed problem; the problem is underconstrained, and a priori constraints must be used in order to restrict the family of possible 3D shapes (Pizlo, 2008; Poggio, Torre, & Koch, 1985). The 3D shape recovery itself is performed, in the new model, in two steps. First, a symmetric or an approximately symmetric 3D shape is recovered by using 3D symmetry and planarity of contours as implicit constraints (assumptions). The maximal 3D compactness and minimum surface area are treated as explicit constraints and used in a cost function, which is minimized by performing a one-parameter search. In the second step, all four constraints are used as explicit constraints in a cost function and the 3D shape is deformed by performing a search in a multi-parameter space so as to minimize the function. The first step, which could, in principle, be skipped, provides a good starting point for the multi-parameter optimization that is likely to have multiple local minima. The first step is described next. It has already been shown that each of these four a priori constraints plays an important role in recovering a veridical 3D shape from a single 2D image (Li et al., 2009; Pizlo et al., 2010). Furthermore, it has been shown that a model of 3D shape recovery based on these constraints recovers shapes that are very similar to the shapes recovered by the subjects (Li et al., 2009). Therefore, the present study is not trying to verify the psychological plausibility of these constraints. The main purpose of this study is to formulate a model of 3D symmetry discrimination. In these prior studies, 3D symmetry and planarity of contours were used as implicit constraints. As a result, a recovered 3D shape was symmetric and had planar faces. This is also the case in the first step of the model proposed in this paper.

Recovering a symmetric or an approximately symmetric 3D shape from a single 2D image

The 3D shape is recovered using an algorithm described in our prior studies (Li et al., 2009; Sawada & Pizlo, 2008b; see also Vetter & Poggio, 1994). The technical explanation of this algorithm is described in Appendix A. Here, I will present the main features of the algorithm. Recall that the information about which vertices of the polyhedron form possible symmetric pairs is given. Assume that the line segments connecting these pairs of vertices are parallel to one another. Parallelism of symmetry line segments is an invariant of a symmetric 3D shape under orthographic projection, and an image with this invariant is consistent with a symmetric 3D shape. If the line segments are only approximately parallel, they are corrected by moving their endpoints so that they become parallel and consistent with a symmetric 3D shape (Figures 8a and 8b; Sawada & Pizlo, 2008b; Zabrodsky & Weinshall, 1997). The amount of the correction is minimal in the least square sense. From the
means that there are always infinitely many 3D shapes corrected image, which now is an image of a symmetric
that could have produced the 2D image. To recover a 3D shape, a “virtual image” of the same 3D shape is
unique 3D shape, a priori constraints must be used in computed (Vetter & Poggio, 1994). The virtual image of a

Downloaded from jov.arvojournals.org on 04/04/2023


Journal of Vision (2010) 10(6):4, 1–22 Sawada 9

symmetric 3D shape is computed by reflecting the original 2D image with respect to an arbitrary line (Figure 8c). It is important to point out that the virtual image is computed from the given real 2D image without knowing the 3D shape. Under an orthographic projection, these two images determine the 3D shape "out there" up to one unknown parameter. This parameter represents the 3D orientation of the symmetry plane and the aspect ratio of the 3D shape. In other words, a 2D orthographic image of a symmetric 3D shape determines a one-parameter family of symmetric 3D shapes.

Figure 8. Process of recovery of a symmetric polyhedron. (a) An original image given to the model. (b) A corrected image; the vertices are moved so that the 2D symmetry line segments (dashed contours) become parallel. (c) The virtual image generated by reflecting the real (corrected) image. (d) Only symmetric pairs of vertices that are both visible, along with the edges connecting them, are shown. These vertices can be recovered from the real (corrected) image (b) and the virtual image (c). (e) A visible vertex (black circle) whose symmetric counterpart is occluded. This visible vertex can be recovered by applying the constraint of planarity to the contour enclosing the face (the shaded face). (f) An occluded vertex (open circle) whose counterpart (black circle) is visible and recovered in (e). This occluded vertex can be recovered by reflecting the counterpart with respect to the symmetry plane of the polyhedron. (g) The recovered polyhedron is uncorrected by moving the vertices so that the polyhedron is consistent with the original 2D image.

The process described above can be applied to symmetric pairs whose vertices are both visible (Figure 8d). If one of the vertices of a symmetric pair is occluded, these vertices can be recovered using two constraints: planarity of the contours which include the visible vertex, and symmetry of the 3D shape (Li et al., 2009; Mitsumoto, Tamura, Okazaki, Kajimi, & Fukui, 1992). The planarity constraint is applied to the contours enclosing the faces of the polyhedron (Figure 8e). If at least three vertices of a planar contour have already been recovered based on the symmetry constraint, the orientation of the plane containing this contour is known. Then, the coordinates of any other visible vertex lying on this plane can be computed as well. The occluded counterpart of this vertex is recovered by reflecting the visible vertex with respect to the symmetry plane of the 3D shape (Figure 8f).

The symmetry and planarity constraints determine a one-parameter family of 3D shapes. A unique 3D shape is selected from this family as the shape that maximizes a weighted combination of 3D compactness and surface area (Li et al., 2009). Compactness of the 3D shape is defined as:

$$\mathrm{compactness}(H) = 36\pi \cdot \frac{V_{3D}(H)^2}{S_{3D}(H)^3}, \tag{1}$$

where H represents the 3D shape, and V_3D(H) and S_3D(H) are the volume and the surface area of the 3D shape H (Hildebrandt & Tromba, 1996). Maximum 3D compactness, in conjunction with the 3D symmetry constraint, gives the object its volume. Minimum surface area is defined as the minimum of the total surface area of the object. It is equivalent to the maximum of the reciprocal of the total surface area. This constraint tends to decrease the thickness
of the 3D shape along the depth direction. It makes the 2D image more stable in the presence of small 3D rotations, and thus the recovered 3D shape is more likely; a small change of the viewing direction does not change the 2D image substantially (Li et al., 2009). In order to use the surface area of the 3D shape as a unit-free parameter, the reciprocal of the total surface area of the 3D shape was multiplied by the surface area of the projection of the 3D shape onto the 2D image:

$$\mathrm{surface}(H) = \frac{S_{2D}(H)}{S_{3D}(H)}, \tag{2}$$

where S_3D(H) and S_2D(H) are the surface area of the 3D shape H and the surface area of the projection of H onto the 2D image. The symmetric 3D shape which maximizes compactness(H) + surface(H) is chosen from the family of symmetric 3D shapes.

The recovered symmetric shape is consistent with the original image only if the 2D symmetry line segments are exactly parallel to one another in the original 2D image. If they are only approximately parallel, the 2D image has been corrected (see above). As a result, the recovered symmetric 3D shape is not consistent with the original 2D image. In order to make it consistent, the 3D shape has to be uncorrected (distorted). The uncorrection is done by moving the endpoints of the 3D symmetry line segments of the recovered symmetric shape so that the resulting asymmetric shape becomes consistent with the original image (Figure 8g; Sawada & Pizlo, 2008b). The amount of the uncorrection is minimal in the least-squares sense.

Second stage of the 3D shape recovery

The 3D shape recovered in the first step is deformed in the second stage until it maximizes a cost function with four constraints: symmetry of the 3D shape, planarity of contours, maximum 3D compactness, and minimum surface area. Symmetry is defined here as the negative of the asymmetry of the 3D shape:

$$\mathrm{symmetry}(H) = -\mathrm{asymmetry}(H) = -\frac{\sum_a \left|\alpha_a - \alpha_{\mathrm{counterpart}(a)}\right|}{\pi \cdot n_a}, \tag{3}$$

where α_a and α_counterpart(a) are the corresponding 2D angles of the contours, and n_a is the number of 2D angles of the polyhedron H. If the recovered shape is perfectly symmetric, its two halves are identical and symmetry(H) is zero; otherwise, it is smaller than zero. The planarity constraint enforces the contours enclosing the faces of the 3D shape to be planar (Leclerc & Fischler, 1992; see also Hong, Ma, & Yu, 2004; Sawada, Li, & Pizlo, 2010). In a planar n_p-gon, the sum of all interior angles is equal to (n_p − 2)π. When a convex polygon is not planar, the sum of its angles is smaller.³ Hence, the departure from planarity can be measured by computing the difference between the sum of the angles and (n_p − 2)π. Planarity of each face is defined as the negative of its departure from planarity. Planarity of the faces of the 3D shape is defined here as an average of planarity over all faces:

$$\mathrm{planarity}(H) = -\frac{\sum_f \left|(n_p - 2)\pi - \sum_{a=1}^{n_p} \alpha_{f,a}\right|}{\pi \cdot n_f}, \tag{4}$$

where f represents a polygonal face of the 3D shape, α_{f,a} is the a-th internal angle of face f, and n_f is the number of faces of the polyhedron H. For the other two constraints, maximum 3D compactness and minimum surface, Equations 1 and 2 are used. The following equation is used as the overall objective function:

$$E(H) = \mathrm{compactness}(H) + \mathrm{surface}(H) + \mathrm{symmetry}(H) + \mathrm{planarity}(H). \tag{5}$$

The absolute value of each component in the cost function ranges between 0 and 1. The compactness and surface constraints are positive (or zero), while symmetry and planarity are negative (or zero). All components of the cost function are weighted equally. Other coefficients have been tried, but the simplest form with equal coefficients seemed to work best. The model searches for the 3D shape which maximizes the cost function E(H). The dimensionality of the search is established as follows: 1 (the orientation of the symmetry plane) + 8 (the depth positions of the "3D symmetry line segments") + n_o (the number of occluded vertices). Note that the orientation of the symmetry plane is characterized by two parameters: slant and tilt. Tilt is estimated as the average orientation of the 2D symmetry line segments (Zabrodsky & Weinshall, 1997). Hence, only slant is used in the search. All 3D symmetry line segments are restricted to be parallel to the normal of the symmetry plane if the 2D symmetry line segments are parallel to one another in the image. Otherwise, the 3D symmetry line segments are restricted to be as parallel as possible. The positions of the occluded vertices are restricted to lie along the 3D symmetry lines emanating from their symmetric counterparts. The model uses a steepest-descent method for finding the minimum of −E(H) (which is equivalent to the maximum of E(H)).

Once the 3D shape is recovered, the model evaluates its asymmetry using Equation 3. The range of this measure is from 0 to 1; the bigger this value, the more asymmetric the polyhedron. The decision as to whether or not the polyhedron is classified as symmetric is based on a criterion whose value is chosen so as to maximize the fit of the model to the subjects' results.
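The component measures in Equations 1–5 can be sketched in a few lines of code. The data layouts (corresponding-angle pairs for Equation 3, per-face lists of interior angles for Equation 4) and the function names are illustrative assumptions, not the paper's implementation:

```python
import math

def compactness(volume_3d, surface_3d):
    # Equation 1: 36*pi * V^2 / S^3; equals 1 for a sphere, less for other shapes.
    return 36.0 * math.pi * volume_3d ** 2 / surface_3d ** 3

def surface(surface_2d, surface_3d):
    # Equation 2: projected image area over total 3D surface area (unit-free).
    return surface_2d / surface_3d

def symmetry(angle_pairs):
    # Equation 3: negated sum of absolute differences between corresponding
    # 2D angles of the two halves, normalized by pi times the number of angles.
    return -sum(abs(a - b) for a, b in angle_pairs) / (math.pi * len(angle_pairs))

def planarity(faces):
    # Equation 4: a planar n-gon's interior angles sum to (n - 2)*pi; the
    # negated average departure from that sum penalizes non-planar faces.
    total = sum(abs((len(f) - 2) * math.pi - sum(f)) for f in faces)
    return -total / (math.pi * len(faces))

def cost_E(volume_3d, surface_3d, surface_2d, angle_pairs, faces):
    # Equation 5: equally weighted sum of the four components.
    return (compactness(volume_3d, surface_3d) + surface(surface_2d, surface_3d)
            + symmetry(angle_pairs) + planarity(faces))

# A unit cube seen face-on: V = 1, S_3D = 6, projected area S_2D = 1.
# Matching angles and planar square faces make both penalty terms zero.
square = [math.pi / 2] * 4
score = cost_E(1.0, 6.0, 1.0, [(1.0, 1.0)] * 4, [square] * 6)
print(round(score, 4))  # 0.6903 (pi/6 from compactness plus 1/6 from surface)
```

For a perfectly symmetric, planar-faced shape the two penalty terms vanish, so the optimization is driven entirely by the compactness and surface terms, consistent with the role the text assigns to them.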
Model fitting

The model of detecting 3D symmetry was applied to the 2D images that were used in Experiment 1 (static condition) and Experiment 2. For each image, the model recovered a 3D shape and computed its asymmetry. The discriminability measure d′ was computed for each session. The criterion for the model's classification of the recovered 3D shape as "asymmetric" was chosen so as to minimize the sum of squared differences between the d′ of the model and that of the subject in each experiment. Hence, there was one free parameter for 12 data points in Experiment 1 and one free parameter for 9 data points in Experiment 2.

The visual noise was emulated by randomly perturbing the orientations and lengths of the 2D symmetry line segments. The noise of the orientation of the line segments was approximated by a Gaussian probability density function with a zero mean. The standard deviation of this distribution was computed based on orientation discrimination thresholds (Mäkelä, Whitaker, & Rovamo, 1993). The noise of the length of the line segments was approximated by a Gaussian probability density function with a zero mean (Chan et al., 2006; Levi & Klein, 1990). The standard deviation was 3% of the eccentricity of the endpoints (Chan et al., 2006; Watt, 1987). The eccentricity was computed assuming that the eye was fixated at the center of the image.

Results of the model superimposed on the average results of the subjects in Experiment 1 are shown in Figure 9 (top panel). The ordinate shows d′ and the abscissa shows levels of distortion of the asymmetric polyhedra. The results show that the model can account for human performance quite well. Specifically, the model's performance improves with the level of the distortion. The performance is the best with the asymmetric polyhedra in Condition-R and is the worst with the asymmetric polyhedra in Condition-P. The same trends are also observed in the results of the human subjects. A similarly good fit was found for the results in Experiment 2 (the bottom panel). It is worth pointing out that the estimated response bias of the model was very similar to that of the human subjects in each session. On average, the rate of "symmetric shape" responses of the model, when the 3D shape was symmetric, was 80%, whereas that of the human subjects was 79%. The judgments of both the model and the human subjects in the psychophysical experiments were slightly biased toward "symmetric shape" responses.

Figure 9. Results of the model in the simulation experiment. The model was applied to the images used in the psychophysical experiments. The ordinate shows d′, and the abscissa shows levels of distortion of asymmetric polyhedra. Results from different types of asymmetric polyhedra are plotted in separate graphs. The results of the model (solid symbols) are superimposed on the averaged results of the subjects in the psychophysical experiments. Error bars represent the standard errors calculated from two sessions for each condition.

Again, Condition-P is of special interest here. It can be seen that the performance of the model is above chance level and quite similar to that of the subjects. As pointed out above, the discrimination in Condition-P cannot be easily performed based on 2D properties of the image because each image in this condition is geometrically consistent with both symmetric and asymmetric 3D
interpretations. The model proposed here first recovers a 3D shape which satisfies a priori constraints and then measures the asymmetry of the recovered shape. As a result, the maximal 3D compactness, maximal planarity of contours, and minimum surface area constraints contribute to perception of symmetry of a 3D shape. Even though these constraints are not directly related to 3D symmetry, they increase the likelihood of the recovered 3D shape.

Two alternative models, suggested by the reviewers, were also tested and compared to the psychophysical results. First, the discrimination between symmetric and asymmetric 3D shapes can potentially be performed based on the asymmetry of their 2D images (Appendix B). Specifically, it is reasonable to expect that the degree of asymmetry of a 2D image of a 3D asymmetric shape is higher than that of a 3D symmetric shape. Second, a 2D image of an asymmetric 3D shape in Condition-P can be a degenerate (accidental) view of its symmetric 3D interpretation (Appendix C). If the 2D image is a degenerate view of a 3D shape, a small change of the viewing orientation of the 3D shape will cause a large change of its 2D image. It is known that the human visual system tends to avoid unstable interpretations, although this observation has never been directly tested (Freeman, 1994; Mach, 1906/1959). These two models were examined in simulation experiments (see Appendices B and C for more details). These experiments show that the asymmetry of the 2D image and the generic view assumption cannot explain our psychophysical results. Reliable discrimination between symmetric and asymmetric 3D shapes seems to require recovery of a 3D shape using a priori constraints, as in the model presented above.

Summary and discussion

This paper presents the first ever psychophysical study of discrimination between symmetric and asymmetric 3D shapes from a single 2D image. As pointed out in the Introduction, perception of symmetry has been discussed in the literature for about 100 years. These previous studies concentrated on the case of symmetry on the retinal image (see Figure 1). These studies were then extended to the case of 2D skewed symmetry, following Kanade's (1981) observation that a single 2D orthographic image of a 2D mirror-symmetric figure is easily recognized by a human observer as such (see Figure 2). Once a 2D skewed symmetry is identified, the 3D orientation of the 2D symmetric figure can be computed (Kanade, 1981; Saunders & Knill, 2001; Wagemans, 1992, 1993). This way, 2D skewed symmetry provided a tool for reconstructing 3D visible surfaces of an object from a single 2D image.

The ability to tell the difference between symmetric and asymmetric 3D shapes is obviously important, so why has no one studied it? The answer is that the problem of discrimination between symmetric and asymmetric 3D shapes does not naturally fit in existing paradigms of 3D shape perception. A 2D image of a symmetric 3D object is itself symmetric only when the line of sight lies on the symmetry plane of the object. For all other viewing directions, the image of a symmetric object is not symmetric. Hence, 3D symmetry of a 3D object cannot be judged based on 2D symmetry of the 2D image of the 3D object (Appendix B). There is one invariant of 3D symmetry under orthographic projection: the symmetry line segments are parallel to one another in every orthographic image of a mirror-symmetric shape. This invariant could be used to discriminate between symmetric and asymmetric 3D shapes, and we showed that it is actually used by the human visual system (Sawada & Pizlo, 2008a). However, this invariant is not sufficient to explain human performance. In one of the conditions (Condition-P), this invariant was made ineffective. This was accomplished by generating asymmetric 3D shapes from symmetric ones in such a way that the line segments connecting the corresponding vertices of the symmetric shape remained parallel. We verified that any 2D orthographic image of such an asymmetric shape is consistent with infinitely many symmetric 3D interpretations. It follows that the performance of the subjects in discrimination between symmetric and asymmetric 3D shapes in this condition would have been quite poor if the performance were based on the 2D image properties (Appendix B). Performance could obviously improve if the 3D shapes were familiar. However, the shapes in our study were novel. Despite the geometrical ambiguity of a single 2D image, the subjects were able to reliably discriminate between symmetric and asymmetric unfamiliar 3D shapes. It follows that conventional approaches to 3D shape perception, in which the subject's percept is derived directly from the 2D image, such as ideal observer or pattern matching, cannot be applied to the questions studied here. Our results clearly suggest that the visual system recovers a 3D shape from a 2D image and then evaluates whether the 3D shape is symmetric or asymmetric. Most prior 3D shape reconstruction approaches started with reconstructing visible surfaces of 3D objects (the so-called 2.5D sketch; Marr, 1982). However, the visible surfaces of a 3D symmetric object are themselves almost never symmetric, for the same reason why the 2D images are not symmetric. It follows that the task of discriminating between symmetric and asymmetric novel 3D shapes cannot be solved by such models. As a result, the question as to how such discrimination might be performed has never come up in the context of these models. Clearly, this question is new; it is irrelevant in existing paradigms, and it became important only after we demonstrated that the percept of a 3D shape is based on recovering this shape by applying a 3D symmetry constraint (Pizlo, 2008; Pizlo et al., 2010).

Consider an ecological justification for the 3D symmetry and maximal 3D compactness constraints. Note that
almost every animal in nature is symmetric. It has been argued that symmetry of the 3D shape of the animal improves its motor functions (Steiner, 1979). At the same time, multiple objects rarely form a 3D symmetric configuration (see Figure 10a). It follows that 3D symmetry is a property of the 3D shape of a single object. A 3D shape of a living organism has a continuous volume. Note that maximizing compactness of the 3D shape is equivalent to minimizing the surface area of the shape for a given volume. Minimizing the surface area for a given volume means that the interface (surface of the skin) between the inside and outside of the object is minimized. In such a case, the external interference with the object is minimized. It makes the object physically stable in the thermodynamic and mechanical sense. These facts suggest that maximizing 3D symmetry and 3D compactness of a recovered 3D shape makes the 3D shape more likely as a 3D shape of a single natural object. This nature of 3D shapes is reflected in human shape perception. Prior studies showed that symmetry of a 3D shape facilitates recognition of the 3D shape (Chan et al., 2006; Liu & Kersten, 2003; Pizlo & Stevenson, 1999; van Lier & Wagemans, 1999; Vetter et al., 1994). McBeath, Schiano, and Tversky (1997) showed that randomly generated 2D images tend to be interpreted by human subjects as silhouettes of symmetric 3D objects. A number of prior studies showed that a perceived 2D shape is biased toward symmetry (Csathó, van der Vloed, & van der Helm, 2004; Freyd & Tversky, 1984; King, Meyer, Tangney, & Biederman, 1976). The same bias seems to work in 3D shape perception, as well (Kontsevich, 1996). In contrast to symmetry, there is not much evidence for the role of 3D compactness in 3D shape perception. 3D compactness was used as an a priori constraint in a model for 3D shape recovery proposed by Li et al. (2009; see also Li, 2009; Pizlo, 2008; Pizlo et al., 2010). Similarly, it was shown that perception of slant of a planar shape in a 3D scene can be explained by maximizing the 2D compactness of the planar interpretation (Brady & Yuille, 1988; Sawada & Pizlo, 2008a).

Figure 10. Symmetric shapes composed of multiple objects. (a) Multiple natural organisms rarely compose a single 3D symmetric configuration in nature. Image from istockphoto.com. (b) In the man-made world, there are also 3D symmetric configurations composed of multiple objects. Image from ashinari.com.

The model proposed in this paper used an "organized" 2D image of a 3D shape for its input. Specifically, information about symmetric correspondences of contours and vertices of the 3D shape was given to the model. How can the human visual system derive this information from the 2D projection of the 3D shape? If the 3D shape is exactly symmetric, the 2D symmetry line segments are all parallel to one another. However, in real images of real objects, the symmetry line segments will never be exactly parallel. Therefore, the parallelism of these line segments cannot be the only, or even the main, feature analyzed. It was suggested that the topological structure of the image should be analyzed, as well, before the 3D symmetry constraint is applied (Pizlo et al., 2010; Sawada et al., in preparation). Studying human performance in organizing the 2D image information and detecting symmetric correspondences will be addressed in our future work.

Appendix A

An algorithm recovering an approximately symmetric polyhedron from a single 2D orthographic image

Let z = 0 be the image plane and the x- and y-axes of the 3D Cartesian coordinate system be the 2D coordinate system on the image plane. Consider a 3D polyhedron that is mirror symmetric with respect to a plane S in 3D space. We begin with the analysis of those symmetric pairs of vertices that are both visible. The 3D symmetry line segments connecting symmetric pairs in 3D space are parallel to one another. The 2D symmetry line segments, which are the projections of the 3D symmetry line segments in the 2D image, are also parallel to one another under an orthographic projection. Let's set the direction of the x-axis so that the x-axis is parallel to the 2D symmetry line segments (note that this does not restrict the generality). If the polyhedron is approximately symmetric and the 3D and 2D symmetry line segments are only approximately parallel, the x-axis is set to a direction that is as parallel to the 2D symmetry line segments as possible in the least-squares sense:

$$\min \sum_i \left( y'_i - y'_{\mathrm{counterpart}(i)} \right)^2, \tag{A1}$$
where p'_i = [x'_i, y'_i]^t and p'_counterpart(i) = [x'_counterpart(i), y'_counterpart(i)]^t are the 2D orthographic projections of vertices i and its symmetric counterpart(i).

Correction of the 2D image

The 2D symmetry line segments are changed (corrected) to be parallel by displacing their endpoints (Zabrodsky & Weinshall, 1997):

$$p_i = \begin{bmatrix} x_i \\ y_i \end{bmatrix} = \begin{bmatrix} x'_i \\ \dfrac{y'_i + y'_{\mathrm{counterpart}(i)}}{2} \end{bmatrix}, \quad p_{\mathrm{counterpart}(i)} = \begin{bmatrix} x_{\mathrm{counterpart}(i)} \\ y_{\mathrm{counterpart}(i)} \end{bmatrix} = \begin{bmatrix} x'_{\mathrm{counterpart}(i)} \\ \dfrac{y'_i + y'_{\mathrm{counterpart}(i)}}{2} \end{bmatrix}, \tag{A2}$$

where p_i = [x_i, y_i]^t and p_counterpart(i) = [x_counterpart(i), y_counterpart(i)]^t are the projections of the vertices i and counterpart(i) after the correction. The corrected image is consistent with an orthographic image of a perfectly symmetric 3D shape. If the 2D symmetry line segments are all exactly parallel in the 2D image, p'_i = p_i and p'_counterpart(i) = p_counterpart(i).

Figure A1. A real (corrected) image and a virtual image of a symmetric 3D shape. The virtual image is generated by reflecting the corrected image. p_i and p_counterpart(i) are 2D orthographic projections of vertices i and its symmetric counterpart(i) in the corrected image. q_i and q_counterpart(i) in the virtual image are produced by computing a reflection of p_i and p_counterpart(i), respectively. Note that the virtual image is a valid 2D image of the same 3D shape after a 3D rigid rotation. p_i maps to q_counterpart(i) and p_counterpart(i) maps to q_i as a result of the 3D rotation.

Producing a virtual image

The method proposed by Vetter and Poggio (1994) is applied to the corrected image (Figure A1). The virtual image of the symmetric 3D shape is generated by reflecting the corrected image with respect to the y-axis:

$$q_i = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} p_i = \begin{bmatrix} -x_i \\ y_i \end{bmatrix}, \quad q_{\mathrm{counterpart}(i)} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} p_{\mathrm{counterpart}(i)} = \begin{bmatrix} -x_{\mathrm{counterpart}(i)} \\ y_i \end{bmatrix}, \tag{A3}$$

where q_i and q_counterpart(i) are the reflections of p_i and p_counterpart(i) in the virtual image. The virtual image is a valid 2D image of the same 3D shape after the 3D shape has been rotated around the y-axis. Let the 3D coordinates of the symmetric pair of vertices i and counterpart(i) at the orientation for which the real image was obtained be v_i = [x_i, y_i, z_i]^t and v_counterpart(i) = [x_counterpart(i), y_i, z_counterpart(i)]^t. Note that the x- and y-values of v_i and v_counterpart(i) are identical to those of p_i and p_counterpart(i) under the orthographic projection. In the same way, let the 3D coordinates of the symmetric pair of vertices i and counterpart(i) at the orientation for which the virtual image was obtained be u_i = [−x_i, y_i, z_i]^t and u_counterpart(i) = [−x_counterpart(i), y_i, z_counterpart(i)]^t. Then, the vertex u_counterpart(i) that corresponds to v_i after the 3D rigid rotation can be written as follows:

$$\begin{bmatrix} -x_{\mathrm{counterpart}(i)} \\ y_i \\ z_{\mathrm{counterpart}(i)} \end{bmatrix} = R_{3D} \begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}, \tag{A4}$$

where R_3D is a 3 × 3 rotation matrix. The R_3D in Equation A4 represents the 3D rigid rotation around the y-axis:

$$R_{3D} = R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}, \tag{A5}$$

where θ is the angle of the rotation around the y-axis. Note that the correspondence of points in the real and virtual images produced by the 2D reflection (i.e., p_i maps to q_i and p_counterpart(i) maps to q_counterpart(i)) is "opposite" to the correspondence produced by the 3D rotation (i.e., p_i maps to q_counterpart(i) and p_counterpart(i) maps to q_i) (see Figure A1).

Recovering the one-parameter family of symmetric polyhedra

From the first row of Equation A4, we obtain:

$$-x_{\mathrm{counterpart}(i)} = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}^t \begin{bmatrix} x_i \\ z_i \end{bmatrix}. \tag{A6}$$
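The relation in Equation A4 and the depth that its first row yields can be checked numerically. The helper names and the sample numbers below are illustrative assumptions, not code from the paper:

```python
import math

def make_counterpart_image_x(v, theta):
    # Equation A4: rotate v = (x, y, z) by theta about the y-axis; the first
    # row gives the image x-coordinate of the symmetric counterpart.
    x, _, z = v
    return -(x * math.cos(theta) + z * math.sin(theta))

def recover_depth(x_i, x_counterpart, theta):
    # Solving the first row of Equation A4 for the depth z_i of vertex i,
    # using only the two image x-coordinates and a candidate angle theta.
    return (x_i * math.cos(theta) + x_counterpart) / (-math.sin(theta))

theta_true = 1.1            # rotation angle used to generate the pair
v_i = (0.7, 0.3, -0.4)      # a vertex with known depth z_i = -0.4
x_c = make_counterpart_image_x(v_i, theta_true)

# With the generating angle, the true depth is recovered (up to rounding).
print(abs(recover_depth(v_i[0], x_c, theta_true) - v_i[2]) < 1e-9)  # True
# Any other angle yields a different, but still consistent, depth.
print(recover_depth(v_i[0], x_c, 0.8) != v_i[2])  # True
```

This illustrates why a single orthographic image determines only a one-parameter family of symmetric shapes: every admissible value of the angle produces a consistent depth assignment.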
From Equation A6, z_i can be computed:

$$z_i = \frac{x_i \cos\theta + x_{\mathrm{counterpart}(i)}}{-\sin\theta}. \tag{A7}$$

Therefore, the vertex i of the symmetric 3D shape can be written as follows:

$$v_i = \begin{bmatrix} x_i \\ y_i \\ \dfrac{x_i \cos\theta + x_{\mathrm{counterpart}(i)}}{-\sin\theta} \end{bmatrix}. \tag{A8}$$

From Equation A8, it can be seen that v_i depends on θ. It means that the symmetric 3D shapes that are consistent with the corrected image can be represented by a one-parameter family. The family is controlled by θ, which is the angle of rotation around the y-axis.

The 3D shapes and orientations of the symmetric polyhedra

First, consider the orientation of the 3D symmetry line segments and the symmetry plane S of the symmetric 3D shape. The 3D symmetry line segments are parallel to the normal of the symmetry plane S, and the midpoints of the 3D symmetry line segments are on S. The midpoint of a 3D symmetry line segment connecting the vertices i and counterpart(i) is

$$m_i = \frac{v_i + v_{\mathrm{counterpart}(i)}}{2} = \begin{bmatrix} \dfrac{x_i + x_{\mathrm{counterpart}(i)}}{2} \\ y_i \\ -\dfrac{(x_i + x_{\mathrm{counterpart}(i)})(1 + \cos\theta)}{2\sin\theta} \end{bmatrix}. \tag{A9}$$

From Equation A9, we can write an equation for S:

$$z = -x\,\frac{1 + \cos\theta}{\sin\theta} = -x\tan(\pi/2 - \theta/2). \tag{A10}$$

Equation A10 shows that S is a plane whose normal is perpendicular to the y-axis and that the y-axis lies on S. The normal of S can be written as follows:

$$n_s = \begin{bmatrix} \tan(\pi/2 - \theta/2) \\ 0 \\ 1 \end{bmatrix}. \tag{A11}$$

The slant σ_s of S is defined as the angle between n_s and the z-axis, and it can be computed as follows:

$$\cos(\sigma_s) = \frac{[\,0\ \ 0\ \ 1\,]\, n_s}{\lVert n_s \rVert} = \frac{1}{\sqrt{\tan^2(\pi/2 - \theta/2) + 1}} = \cos(\pi/2 - \theta/2), \qquad \sigma_s = \pi/2 - \theta/2. \tag{A12}$$

Recall that the 3D symmetry line segments are parallel to n_s. Equation A12 shows that θ, which is the parameter specifying the family of 3D shapes, is directly related to the orientation of the 3D symmetry line segments and the symmetry plane of the 3D shape.

Next, consider the 3D shape itself. Let the intersection line between S and the zx-plane be s_zx. Let the height of the object be its length along the y-axis, the width be its length along n_s, and the depth be its length along s_zx. The y-axis, n_s, and s_zx are perpendicular to one another. Let the width W of the 3D shape be measured as the length of the longest 3D symmetry line segment. Consider the 3D symmetry line segment connecting a symmetric pair of the vertices i and counterpart(i). The length l_i(θ) of this 3D symmetry line segment can be computed as follows:

$$l_i(\theta) = \lVert v_i - v_{\mathrm{counterpart}(i)} \rVert = \left\lVert \begin{bmatrix} x_i - x_{\mathrm{counterpart}(i)} \\ 0 \\ \dfrac{(x_i - x_{\mathrm{counterpart}(i)})(1 - \cos\theta)}{\sin\theta} \end{bmatrix} \right\rVert = \frac{\sqrt{2}}{\sqrt{1 + \cos\theta}}\,\lvert x_i - x_{\mathrm{counterpart}(i)} \rvert. \tag{A13}$$

Equation A13 shows that the length of each 3D symmetry line segment is scaled by the same function of θ. It means that the 3D shape is proportionally stretched along the direction of the 3D symmetry line segments by the same factor, which is a function of θ. From Equation A13 we have:

$$W(\theta) = \frac{W(\pi/2)}{\sqrt{1 + \cos\theta}}, \tag{A14}$$

where W(θ) is the width of the 3D shape, and W(π/2) is the width of the shape when θ = π/2.

The depth of the 3D shape can be measured by measuring the longest distance between two vertices
along s_zx, which is on the zx-plane. Note that the 3D symmetry line segments are parallel to n_s and that n_s is perpendicular to s_zx. The depth of the 3D shape can be measured by computing the longest distance between two midpoints of the 3D symmetry line segments. The midpoints are coplanar on S, and n_s is perpendicular to the y-axis. Hence, the distance d_ij between two midpoints along s_zx can be computed as follows:

  m_i = [ (x_i + x_counterpart(i)) / 2,
          y_i,
          -(x_i + x_counterpart(i))(1 + cos E) / (2 sin E) ]^t,

  m_j = [ (x_j + x_counterpart(j)) / 2,
          y_j,
          -(x_j + x_counterpart(j))(1 + cos E) / (2 sin E) ]^t,

  d_ij(E) = sqrt( ( (x_j + x_counterpart(j))/2 - (x_i + x_counterpart(i))/2 )^2
                  ( ((1 + cos E) / sin E)^2 + 1 ) )
          = sqrt(2) |x_j + x_counterpart(j) - x_i - x_counterpart(i)| / ( 2 sqrt(1 - cos E) ),   (A15)

where m_i is the midpoint of the 3D symmetry line segment connecting i and counterpart(i), m_j is the midpoint of the 3D symmetry line segment connecting j and counterpart(j), and d_ij(E) is the distance between m_i and m_j. Equation A15 shows that the distances between m_i and m_j are scaled as a function of E. It means that the 3D shape is proportionally scaled along the depth axis as a function of E. From Equation A15 we have:

  D(E) = D(pi/2) / sqrt(1 - cos E),   (A16)

where D(pi/2) is the depth of the 3D shape when E = pi/2. From Equation A8, we see that the y-values of the vertices are independent of E. It means that the height of the 3D shape is constant (and thus is independent of E).
From these observations, the 3D shape that belongs to the one-parameter family can be uniquely characterized by its aspect ratio, which is the width divided by the depth of the 3D shape. This aspect ratio can be computed as follows:

  AR(E) = W(E) / D(E) = AR(pi/2) sqrt(1 - cos E) / sqrt(1 + cos E),   (A17)

where AR(pi/2) is the aspect ratio of the 3D shape for E = pi/2. Equations A11, A12, and A17 show that E, which is the parameter characterizing the family, specifies the orientation of the symmetry plane, the orientation of the 3D symmetry line segments, and the aspect ratio of the 3D shape.

Applying the planarity constraint to recover hidden vertices

If all vertices are visible, this step is skipped. Symmetric pairs in which one vertex is visible and the other is hidden are recovered by applying two constraints: a planarity constraint for the visible vertex and a symmetry constraint for the hidden counterpart (Li et al., 2009; Mitsumoto et al., 1992). In order to use a planarity constraint, at least three vertices of the face on which the visible vertex is located have to be recovered first. Assume that the face is planar and that the orientation of the face is known. The z-value of the visible vertex is obtained by computing the intersection of this face and the projection line emanating from the image of this vertex. The hidden counterpart is recovered by reflecting the visible vertex with respect to the known symmetry plane of the 3D shape.

Undoing the 2D correction in 3D space

When the projected symmetry lines are all parallel in the real 2D image, this step is skipped. When they are not parallel, the recovered symmetric 3D shape must be distorted so that its image agrees with the given real 2D image:

  v'_i = v_i + Delta_3D,   (A18)

where Delta_3D is a 3D distortion and v'_i is the position of the vertex i after the distortion. Let the 3D coordinates of Delta_3D be [Delta_x, Delta_y, Delta_z]^t. From Equation A2, Delta_3D = [0, y'_i - y_i, Delta_z]^t, where Delta_z can be arbitrary. The magnitude of the 3D distortion Delta_3D is minimized when Delta_z = 0. Hence, the minimally distorted symmetric shape that is consistent with the real 2D image can be written as follows:

  v'_i = v_i + min(Delta_3D) = [ x_i, y'_i, -(x_i cos E + x_counterpart(i)) / sin E ]^t.   (A19)

Note that the transformation of the x and y coordinates in Equation A19 is an inverse transformation of that in Equation A2.
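The one-parameter family derived above can be checked numerically. The sketch below is illustrative Python, not the author's code; the helper `symmetric_pair` and the test values are assumptions introduced here. It builds a symmetric vertex pair for a given E and verifies the relations in Equations A10, A13, and A17:

```python
import math

def symmetric_pair(x_i, x_c, y, E):
    """A symmetric pair of vertices consistent with Equations A9-A19:
    z = -(x*cos(E) + x_counterpart)/sin(E), so the segment joining the
    pair is parallel to n_s and its midpoint lies on the symmetry plane S."""
    z_i = -(x_i * math.cos(E) + x_c) / math.sin(E)
    z_c = -(x_c * math.cos(E) + x_i) / math.sin(E)
    return (x_i, y, z_i), (x_c, y, z_c)

E = 1.2  # an arbitrary member of the one-parameter family
v_i, v_c = symmetric_pair(2.0, 0.5, 1.0, E)

# The midpoint lies on S: z = -x(1 + cos E)/sin E (Equation A10).
mx, mz = (v_i[0] + v_c[0]) / 2, (v_i[2] + v_c[2]) / 2
assert abs(mz + mx * (1 + math.cos(E)) / math.sin(E)) < 1e-9

# The segment length follows Equation A13:
# l(E) = sqrt(2)|x_i - x_counterpart(i)| / sqrt(1 + cos E).
assert abs(math.dist(v_i, v_c)
           - math.sqrt(2) * abs(2.0 - 0.5) / math.sqrt(1 + math.cos(E))) < 1e-9

# The aspect ratio scales as in Equation A17, which simplifies to
# AR(E) = AR(pi/2) * tan(E/2).
ar_scale = math.sqrt(1 - math.cos(E)) / math.sqrt(1 + math.cos(E))
assert abs(ar_scale - math.tan(E / 2)) < 1e-9
```

The assertions confirm that the family stretches the shape along n_s and compresses it along the depth direction as E varies, while the height stays fixed.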



Appendix B

Measuring 2D symmetry and 2D skewed symmetry of an image of a 3D shape

A 2D image of a symmetric 3D shape is symmetric only if the line of sight lies on the symmetry plane of the 3D shape (Figure 3a). At the same time, a 2D image of an asymmetric 3D shape is almost never symmetric. It is reasonable to expect that the degree of asymmetry of a 2D image is higher for asymmetric than for symmetric 3D shapes. As a result, it might have been possible for the subjects to discriminate between symmetric and asymmetric 3D shapes based on the 2D asymmetry of their images. This hypothesis was tested in a simulation experiment.
The 2D symmetry of an image of a 3D shape is defined here as a negative of its 2D asymmetry:

  symmetry2D(H) = -asymmetry2D(H)
                = - [ Sum_a |alpha'_a - alpha'_counterpart(a)| * visible(a) * visible(counterpart(a)) ]
                    / [ Sum_a visible(a) * visible(counterpart(a)) ],

  visible(a) = 1 if the 2D angle a is visible in a given 2D image,
               0 if the 2D angle a is invisible in a given 2D image,   (B1)

where alpha'_a and alpha'_counterpart(a) are the projections of the corresponding 2D angles of contours of the polyhedron H. In order to measure the 2D symmetry of the image, only the visible line segments in the image were considered (Figure B1). If the 2D image of the 3D shape is perfectly symmetric, its two halves are identical and symmetry2D(H) is zero; otherwise, it is smaller than zero.
The model used in this simulation experiment discriminated between symmetric and asymmetric 3D shapes based on the 2D symmetry of the images as defined in Equation B1. Before computing the 2D symmetry of the images, image noise was added by randomly perturbing the orientations and lengths of the 2D symmetry line segments (see Model Fitting for more details). The discriminability measure d' was computed for each session based on the measured 2D symmetry of the images used in the session. The criterion for classification of 2D symmetry was chosen so as to minimize the sum of squared differences between the d' of the model and that of the subject in each experiment.
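Equation B1 amounts to a visibility-weighted mean of angle mismatches. A minimal sketch of it in Python (the function and the toy data are illustrative, not the paper's implementation):

```python
def symmetry_2d(angles, counterpart, visible):
    """2D symmetry of an image (Equation B1): the negative of the mean
    absolute difference between corresponding projected 2D angles,
    computed only over pairs whose angles are both visible."""
    num = den = 0.0
    for a, alpha in enumerate(angles):
        c = counterpart[a]
        w = visible[a] * visible[c]  # 1 only when both angles are visible
        num += abs(alpha - angles[c]) * w
        den += w
    return -num / den

# A perfectly symmetric image scores 0; any mismatch scores below 0.
angles = [30.0, 30.0, 60.0, 58.0]   # projected 2D angles (degrees)
counterpart = [1, 0, 3, 2]          # angle 0 <-> 1, angle 2 <-> 3
print(symmetry_2d(angles, counterpart, [1, 1, 1, 1]))  # -> -1.0
```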

Figure B1. Results of the model in the simulation experiment. The model was applied to the images used in the psychophysical experiments. The ordinate shows d', and the abscissa shows levels of distortion of asymmetric polyhedra. Results from different types of asymmetric polyhedra are plotted in separate graphs. The results of the model (solid symbols) are superimposed on the averaged results of the subjects in the psychophysical experiments. Error bars represent the standard errors calculated from two sessions for each condition.
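The caption and text report the discriminability measure d'. The paper does not restate how d' was computed, but under the standard equal-variance signal-detection definition it is the difference of z-transformed hit and false-alarm rates; the following is a sketch under that assumption:

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Signal-detection discriminability: d' = z(H) - z(F).
    Assumed standard definition; the paper does not spell out
    its computation in this appendix."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Equal hit and false-alarm rates give chance-level discriminability.
assert abs(d_prime(0.5, 0.5)) < 1e-9
assert 1.97 < d_prime(0.84, 0.16) < 2.01
```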


Results of the model superimposed on the averaged results of the subjects are shown in Figure B1. The ordinate shows d' and the abscissa shows levels of distortion of the asymmetric polyhedra. The model's performance shows a similar effect of the level of distortion to that observed in human performance. However, the overall performance of the model is quite close to chance level in most conditions. This contrasts with the performance of the subjects, which was well above chance level in most conditions. Furthermore, the response bias of the model that maximized the fit to the subjects' results was very different from that of the human subjects. On average, the proportion of "symmetric shape" responses of the model, when the 3D shape was symmetric, was 13%. This means that the model almost never responded "symmetric" when the 3D shape was symmetric. This contrasts markedly with human responses, where the proportion of "symmetric" responses for symmetric shapes was equal to 79%. This extreme response bias of the model would not lead to successful performance in everyday life, where most objects are symmetric and should be perceived as such. Changing the model's response bias so that it is identical to the subjects' response bias would make the model's discriminability even worse (close to chance level in all conditions). All these results show that discrimination of the 2D symmetry of images of 3D shapes cannot account for the human performance in discrimination between symmetric and asymmetric 3D shapes.

Appendix C

Evaluating accidentalness of a 3D interpretation

Consider a 2D orthographic projection of a symmetric 3D shape. The 3D symmetry line segments of the symmetric 3D shape are all parallel to one another, and their projections are also parallel to one another in the 2D orthographic image. Next, consider a 2D orthographic projection of an asymmetric 3D polyhedron with parallel line segments (like the asymmetric shapes in our P condition; Figure 4d). The parallelism of these line segments is also preserved in the 2D orthographic image. Any image of this asymmetric polyhedron is consistent, in principle, with a symmetric 3D interpretation. Interestingly, human subjects can discriminate quite reliably between images of a symmetric 3D shape and an asymmetric 3D shape with parallel line segments (Condition-P in Experiment 1). They could also reliably judge the degree of asymmetry of a 3D shape with parallel line segments from a single 2D image (Experiment 2). Our model can produce equally good discriminations by using four constraints to recover the 3D shapes before evaluating their symmetry. The constraints of planarity and compactness can override the symmetry constraint, resulting in an asymmetric 3D shape in cases where a symmetric 3D interpretation is geometrically possible.
It has been known that some possible 3D interpretations of a 2D image are not perceived by human observers if these interpretations are degenerate (accidental), i.e., not likely (Freeman, 1994; Mach, 1906/1959). A 2D image of a 3D scene is a degenerate view if the 2D image is unstable, that is, if a small change of the viewing orientation causes a large difference in the 2D image (Freeman, 1994; Weinshall & Werman, 1997). A simple example described by Mach (1906/1959) used a straight-line segment on the retina with two possible interpretations: a straight-line segment out there, or a circle presented at a slant of 90 degrees. This example illustrates the relation between the stability of the 2D image and the generic viewpoint of the perceptual interpretation.
A question arises as to whether the generic viewpoint of the recovered 3D shape, itself, can account for human performance more directly. There are at least two different ways to use generic viewpoint in a model simulating human performance in discrimination between symmetric and asymmetric 3D shapes. First, generic viewpoint can be used as a criterion for recovering 3D shapes. However, if generic viewpoint is used as the only criterion for the 3D shape recovery (i.e., as the only element in the cost function), the results of the recovery will be trivial. The most stable 3D interpretation of a line drawing is always the line drawing itself: the backprojection of the line drawing to a frontoparallel plane (Weinshall & Werman, 1997). A flat shape on a frontoparallel plane will lead to the smallest sensitivity of its 2D image to small 3D rotations. Second, generic viewpoint can be used as a criterion of symmetry discrimination after recovering a symmetric 3D shape from each 2D image. This is the method used here. The model takes a 2D image of a symmetric or asymmetric 3D shape and produces a symmetric 3D interpretation. We expect that a symmetric 3D interpretation of a 2D image produced by an asymmetric 3D shape will lead to a less stable 2D image compared to an asymmetric 3D interpretation. Hence, we compute the stability of the 2D image and compare it to a criterion. A 2D image that leads to a symmetric 3D interpretation for which the image is stable is classified as an image of a symmetric 3D shape. A 2D image that leads to a symmetric 3D interpretation for which the image is unstable is classified as an image of an asymmetric 3D shape. The simulation experiment described below tests whether human performance in 3D symmetry discrimination can be accounted for by this model.
A quantitative measure of the sensitivity of a 2D image of a 3D shape is formulated first. Consider a 2D image of a 3D shape and another 2D image of the same 3D shape viewed from a different viewing direction. A change of the viewing direction can be represented by a 3D rotation of the line of sight around the center of the 3D object. The


3D rotation of the line of sight is characterized by two angles: slant (A) and tilt (C). Slant is the angle between the first line of sight and the rotated line of sight. Hence, slant specifies the amount of the rotation. Tilt is the angle between the projection of the rotated line of sight onto the original image and the x-axis of the original image. Tilt specifies the axis of rotation, around which the line of sight is rotated. Hence, the 2D image of the 3D shape after the rotation of the line of sight can be written as image(H, A, C). The image(H, 0, C) represents the original image. A difference between these two images is computed here as a sum of absolute differences between the projections of 2D angles in image(H, 0, C) and image(H, A, C):

  difference(image(H, 0, C), image(H, A, C)) = [ Sum_a |alpha'_a(0, C) - alpha'_a(A, C)| ] / n_a,   (C1)

where n_a is the number of the 2D angles of the polyhedron H, and alpha'_a(0, C) and alpha'_a(A, C) are the projections of a 2D angle of the contours of H in image(H, 0, C) and image(H, A, C). Note that a small A causes a large difference between the images if image(H, 0, C) is unstable. The stability of image(H, 0, C) is defined here as a negative of its instability, which is a sum of the differences defined in Equation C1 computed for a small change of slant (here, 1 degree) and over all tilts:

  stability(image(H, 0, C)) = -instability(image(H, 0, C))
                            = - Integral_0^2pi [ difference(image(H, 0, C), image(H, Delta_A, C)) / Delta_A ] dC.   (C2)

The stability defined by Equation C2 varies a lot across 3D shapes and their 2D images. Therefore, we normalize the stability before we compare it to a criterion (this normalization substantially improves the performance of the model):

  normalized stability(image(H, 0, 0)) = [ stability(image(H, 0, 0)) - MinStability(H) ]
                                         / [ MaxStability(H) - MinStability(H) ],   (C3)

where MaxStability(H) and MinStability(H) are the maximum and minimum stability of images of the polyhedron H among images of 2562 different viewing orientations of H.
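Equations C1 and C3 can be read as the following sketch (illustrative Python; the names and toy data are assumptions, and the integral over tilt in Equation C2 is not shown):

```python
def image_difference(angles_0, angles_s):
    """Equation C1: mean absolute difference between corresponding
    projected 2D angles of image(H, 0, C) and image(H, A, C)."""
    return sum(abs(a0 - a1) for a0, a1 in zip(angles_0, angles_s)) / len(angles_0)

def normalized_stability(stability, stabilities_all_views):
    """Equation C3: rescale a stability value using the minimum and maximum
    stability over the sampled viewing orientations of the polyhedron H."""
    lo, hi = min(stabilities_all_views), max(stabilities_all_views)
    return (stability - lo) / (hi - lo)

print(image_difference([30.0, 60.0], [31.0, 58.0]))    # -> 1.5
print(normalized_stability(-2.0, [-5.0, -2.0, -1.0]))  # -> 0.75
```

An unstable (near-degenerate) view yields a large image_difference for a small slant change, and hence a low normalized stability.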

Figure C1. Results of the model in the simulation experiment. The model was applied to the images used in the psychophysical experiments. The ordinate shows d', and the abscissa shows levels of distortion of asymmetric polyhedra. Results from different types of asymmetric polyhedra are plotted in separate graphs. The results of the model (solid symbols) are superimposed on the averaged results of the subjects in the psychophysical experiments. Error bars represent the standard errors calculated from two sessions for each condition.


The 2562 viewing orientations were derived by connecting the center of a viewing sphere and 2562 points that are almost uniformly distributed on the surface of the sphere (see Ballard & Brown, 1982, pp. 492–493).
The model in this simulation experiment discriminated between symmetric and asymmetric 3D shapes based on the normalized stability of 3D symmetric interpretations of 2D images. The model was applied to the 2D images used in Experiment 1 (static condition) and Experiment 2. The normalized stability of each 3D shape was computed based on a symmetric 3D recovery from the 2D image (see the section Recovering a symmetric or an approximately symmetric 3D shape from a single 2D image and Appendix A). Specifically, for each 2D image, regardless of whether it was an image of a symmetric or asymmetric 3D shape, a symmetric 3D shape was recovered and used to compute the stabilities of the 2D images. Before recovering the symmetric 3D shape, image noise was added by randomly perturbing the orientations and lengths of the 2D symmetry line segments in order to emulate the visual noise (see Model Fitting for more details). The discriminability measure d' was computed for each session based on the normalized stabilities. The criterion for the symmetric versus asymmetric response of the model was chosen so as to minimize the sum of squared differences between the d' of the model and that of the subject in each experiment.
Results of the model superimposed on the averaged results of the subjects are shown in Figure C1. The ordinate shows d' and the abscissa shows levels of distortion of the asymmetric polyhedra. The model's performance shows similar trends to the results of the subjects. However, the overall performance of the model was substantially poorer than the human performance. Furthermore, the estimated response bias of the model was very different from that of the human subjects in each session. On average, the rate of "symmetric shape" responses of the model, when the 3D shape was symmetric, was 15%, whereas that of the human subjects was 79%. Again, this response bias of the model would not lead to successful performance in everyday life, where most objects are symmetric and should be perceived as such. Changing the model's response bias so that it is identical to the subjects' response bias would make the model's discriminability close to chance level. These results clearly show that it is unlikely that the stability of symmetric 3D interpretations of 2D images is the primary factor in the discrimination between symmetric and asymmetric 3D shapes from single 2D images. A priori constraints, such as 3D compactness, seem to be critical in this task. It is important to point out, however, that the generic viewpoint model implemented here is not necessarily the only, or even the best, way to do it. Therefore, this simulation experiment should not be treated as an ultimate test rejecting the generic viewpoint approach to the problem of discriminating between 3D symmetric and asymmetric shapes.

Acknowledgments

The author thanks Zyg Pizlo for helpful comments and suggestions. The author also thanks Johan Wagemans and the other anonymous reviewer for useful suggestions, in particular, for suggesting the additional tests described in Appendices B and C. This project was supported by the National Science Foundation, the Department of Energy, and the Air Force Office of Scientific Research.

Commercial relationships: none.
Corresponding author: Tadamasa Sawada.
Email: sawada@psych.purdue.edu.
Address: 703 3rd Street, West Lafayette, IN 47907-2081, USA.

Footnotes

1. The method of generating the symmetric polyhedra was almost the same as that in Chan et al. (2006), Li et al. (2009), and Pizlo and Stevenson (1999). The only difference is that the symmetric polyhedron used in these prior studies had coplanar bottom faces, but that used in this study did not.
2. We also plotted the results using the overall proportion correct, rather than d', as a dependent variable. These two dependent variables led to the same conclusions.
3. Note that all faces in our polyhedra are convex. If faces are not convex, more general non-planarity measures are available.

References

Ballard, D. H., & Brown, C. M. (1982). Computer vision. Englewood Cliffs, NJ: Prentice-Hall.
Barlow, H. B., & Reeves, B. C. (1979). The versatility and absolute efficiency of detecting mirror symmetry in random dot displays. Vision Research, 19, 783–793. [PubMed]
Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. Cambridge: Cambridge University Press.
Brady, M., & Yuille, A. (1988). Inferring 3D orientation from 2D contour (an experimental principle). In W. Richards (Ed.), Natural computation (pp. 99–106). Cambridge: MIT Press.
Chan, M. W., Stevenson, A. K., Li, Y., & Pizlo, Z. (2006). Binocular shape constancy from novel views: The role of a priori constraints. Perception & Psychophysics, 68, 1124–1139. [PubMed]


Csathó, A., van der Vloed, G., & van der Helm, P. A. (2004). The force of symmetry revisited: Symmetry-to-noise ratios regulate (a)symmetry effects. Acta Psychologica, 117, 233–250. [PubMed]
Freeman, W. T. (1994). The generic viewpoint assumption in a framework for visual perception. Nature, 368, 542–545. [PubMed]
Freyd, J., & Tversky, B. (1984). Force of symmetry in form perception. American Journal of Psychology, 97, 109–126. [PubMed]
Hildebrandt, S., & Tromba, A. (1996). The parsimonious universe. New York: Springer.
Hong, W., Ma, Y., & Yu, Y. (2004, May). Reconstruction of 3-D deformed symmetric curves from perspective images without discrete features. Proceedings of the European Conference on Computer Vision, Prague.
Jenkins, B. (1983). Component processes in the perception of bilaterally symmetric dot textures. Perception & Psychophysics, 34, 433–440. [PubMed]
Julesz, B. (1971). Foundations of cyclopean perception. Chicago: University of Chicago Press.
Kaiser, P. K. (1967). Perceived shape and its dependency on perceived slant. Journal of Experimental Psychology, 75, 345–353. [PubMed]
Kanade, T. (1981). Recovery of the three-dimensional shape of an object from a single view. Artificial Intelligence, 17, 409–460.
King, M., Meyer, G. E., Tangney, J., & Biederman, I. (1976). Shape constancy and a perceptual bias towards symmetry. Perception & Psychophysics, 19, 129–136.
Koffka, K. (1935). Principles of Gestalt psychology. New York: Harcourt, Brace, & World.
Kontsevich, L. L. (1996). Symmetry as a depth cue. In C. W. Tyler (Ed.), Human symmetry perception and its computational analysis (pp. 331–347). Utrecht, Netherlands: VSP BV.
Leclerc, Y. G., & Fischler, M. A. (1992). An optimization-based approach to the interpretation of single line drawings as 3D wire frames. International Journal of Computer Vision, 9, 113–136.
Levi, D. M., & Klein, S. A. (1990). The role of separation and eccentricity in encoding position. Vision Research, 30, 557–585. [PubMed]
Li, Y. (2009). Perception of parallelepipeds: Perkins's law. Perception, 38, 1767–1781. [PubMed]
Li, Y., Pizlo, Z., & Steinman, R. M. (2009). A computational model that recovers the 3D shape of an object from a single 2D retinal representation. Vision Research, 49, 979–991. [PubMed]
Liu, Z., & Kersten, D. (2003). Three-dimensional symmetric shapes are discriminated more efficiently than asymmetric ones. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 20, 1331–1340. [PubMed]
Mach, E. (1906/1959). The analysis of sensations and the relation of the physical to the psychical. New York: Dover.
Mäkelä, P., Whitaker, D., & Rovamo, J. (1993). Modelling of orientation discrimination across the visual field. Vision Research, 33, 723–730. [PubMed]
Marr, D. (1982). Vision. San Francisco: Freeman.
McBeath, M. K., Schiano, D. J., & Tversky, B. (1997). Three-dimensional bilateral symmetry bias in judgments of figural identity and orientation. Psychological Science, 8, 217–223.
Mitsumoto, H., Tamura, S., Okazaki, K., Kajimi, N., & Fukui, Y. (1992). 3-D reconstruction using mirror images based on a plane symmetry recovering method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14, 941–946.
Pirenne, M. H. (1970). Optics, painting, & photography. Cambridge: Cambridge University Press.
Pizlo, Z. (2008). 3D shape: Its unique place in visual perception. Cambridge: MIT Press.
Pizlo, Z., Rosenfeld, A., & Weiss, I. (1997a). The geometry of visual space: About the incompatibility between science and mathematics. Computer Vision and Image Understanding, 65, 425–433.
Pizlo, Z., Rosenfeld, A., & Weiss, I. (1997b). Visual space: Mathematics, engineering, and science. Computer Vision and Image Understanding, 65, 450–454.
Pizlo, Z., & Salach-Golyska, M. (1995). 3-D shape perception. Perception & Psychophysics, 57, 692–714. [PubMed]
Pizlo, Z., Sawada, T., Li, Y., Kropatsch, W., & Steinman, R. M. (2010). New approach to the perception of 3D shape based on veridicality, complexity, symmetry and volume. Vision Research, 50, 1–11. [PubMed]
Pizlo, Z., & Stevenson, A. K. (1999). Shape constancy from novel views. Perception & Psychophysics, 61, 1299–1307. [PubMed]
Poggio, T., Torre, V., & Koch, C. (1985). Computational vision and regularization theory. Nature, 317, 314–319. [PubMed]
Saunders, J. A., & Knill, D. C. (2001). Perception of 3D surface orientation from skew symmetry. Vision Research, 41, 3163–3183. [PubMed]
Sawada, T., Li, Y., & Pizlo, Z. (in preparation). Any 2D image is consistent with 3D symmetric interpretations.


Sawada, T., & Pizlo, Z. (2008a). Detection of skewed symmetry. Journal of Vision, 8(5):14, 1–18, http://www.journalofvision.org/content/8/5/14, doi:10.1167/8.5.14. [PubMed] [Article]
Sawada, T., & Pizlo, Z. (2008b). Detecting mirror-symmetry of a volumetric shape from its single 2D image. Proceedings of the Workshop on Perceptual Organization in Computer Vision, IEEE International Conference on Computer Vision and Pattern Recognition, Anchorage, AK, June 23, 2008.
Steiner, G. (1979). Spiegelsymmetrie der Tierkörper. Naturwissenschaftliche Rundschau, 32, 481–485.
Tjan, B. S., & Liu, Z. (2005). Symmetry impedes symmetry discrimination. Journal of Vision, 5(10):10, 888–900, http://www.journalofvision.org/content/5/10/10, doi:10.1167/5.10.10. [PubMed] [Article]
Ullman, S. (1979). The interpretation of visual motion. Cambridge: MIT Press.
van Lier, R., & Wagemans, J. (1999). From images to objects: Global and local completions of self-occluded parts. Journal of Experimental Psychology: Human Perception and Performance, 25, 1721–1741.
Vetter, T., & Poggio, T. (1994). Symmetric 3D objects are an easy case for 2D object recognition. Spatial Vision, 8, 443–453. [PubMed]
Vetter, T., Poggio, T., & Bülthoff, H. H. (1994). The importance of symmetry and virtual views in three-dimensional object recognition. Current Biology, 4, 18–23. [PubMed]
Wagemans, J. (1992). Perceptual use of nonaccidental properties. Canadian Journal of Psychology, 46, 236–279. [PubMed]
Wagemans, J. (1993). Skewed symmetry: A nonaccidental property used to perceive visual forms. Journal of Experimental Psychology: Human Perception and Performance, 19, 364–380.
Wagemans, J. (1997). Characteristics and models of human symmetry detection. Trends in Cognitive Sciences, 1, 346–352.
Wagemans, J., van Gool, L., & d'Ydewalle, G. (1991). Detection of symmetry in tachistoscopically presented dot patterns: Effects of multiple axes and skewing. Perception & Psychophysics, 50, 413–427. [PubMed]
Wagemans, J., van Gool, L., & d'Ydewalle, G. (1992). Orientation effects and component processes in symmetry detection. Quarterly Journal of Experimental Psychology, 44A, 475–508.
Wagemans, J., Lamote, C., & van Gool, L. (1997). Shape equivalence under perspective and projective transformations. Psychonomic Bulletin & Review, 4, 248–253.
Watt, R. J. (1987). Scanning from coarse to fine spatial scales in the human visual system after the onset of a stimulus. Journal of the Optical Society of America A, Optics and Image Science, 4, 2006–2021. [PubMed]
Weinshall, D., & Werman, M. (1997). On view likelihood and stability. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 97–108.
Wertheimer, M. (1923). Investigations on the gestalt of shape. Psychologische Forschung, 4, 301–350.
Yang, T., & Kubovy, M. (1999). Weakening the robustness of perspective: Evidence for a modified theory of compensation in picture perception. Perception & Psychophysics, 61, 456–467. [PubMed]
Zabrodsky, H. (1990). Symmetry: A review (Tech. Rep. No. 90-16). CS Department, The Hebrew University of Jerusalem.
Zabrodsky, H., & Algom, D. (1994). Continuous symmetry: A model for human figural perception. Spatial Vision, 8, 455–467. [PubMed]
Zabrodsky, H., & Weinshall, D. (1997). Using bilateral symmetry to improve 3D reconstruction from image sequences. Computer Vision and Image Understanding, 67, 48–57.
Zimmer, A. C. (1984). Foundations for the measurement of phenomenal symmetry. Gestalt Theory, 6, 118–157.
