
Journal of Visual Languages and Computing (1999) 10, 19—35

Article No. jvlc.1998.0112, available online at http://www.idealibrary.com

Manipulating Objects in Virtual Worlds: Categorization and Empirical Evaluation of Interaction Techniques*

IVAN POUPYREV† AND TADAO ICHIKAWA


Information Systems Laboratory, Hiroshima University, Japan

The acceptance of virtual environment (VE) technology requires scrupulous optimization of the most basic interactions, in order to maximize user performance and provide efficient and enjoyable virtual interfaces. Motivated by an insufficient understanding of the human factors implications in the design of interaction techniques for object manipulation in virtual worlds, this paper presents the results of a formal study that evaluated two basic interaction metaphors for virtual manipulation, the virtual pointer and the virtual hand, in object selection and positioning tasks. In this work, we survey and categorize current virtual manipulation techniques according to their basic design metaphors, conduct experimental studies of the most basic techniques, and derive guidelines to aid designers in the practical development of VE applications.
© 1999 Academic Press

1. Introduction
MANIPULATION OF OBJECTS is one of the most fundamental interactions between humans and their environment, whether in the physical or virtual world. In immersive virtual environments (VEs), object manipulation can occur both as a sole activity, capturing the whole attention of the user, and as a component of a complex interaction sequence. Hence, a significant amount of research on designing interaction techniques that provide effective means for selecting and manipulating objects in VEs has been carried out in recent years. This has resulted in a wide array of techniques, such as the laser pointer and flashlight [1], World-In-Miniature [2], Go-Go [3], HOMER [4], image plane [5], scaled-world grab [6], and many others.
As the field of virtual reality has matured, the lack of comprehensive research evaluating the human factors of 3D manipulation techniques and their design implications has become apparent [7]. The sheer variety of techniques itself presents a challenge for developers. How do all these techniques relate to each other? Which interaction techniques should be chosen for particular tasks? Which among the parameters of techniques, tasks, and environments should be considered in designing VE interfaces?

* Some of the results reported in this paper were presented at the ACM Symposium on Virtual Reality Software and Technology '97 and the EUROGRAPHICS '98 conference.
† Corresponding author: MIC Lab, ATR, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan.

1045-926X/99/010019+17 $30.00/0 © 1999 Academic Press


These questions cannot be answered only on the basis of our everyday experience with manipulating physical objects. Unlike early VE techniques that attempted to directly simulate real-world manipulation [8, 9], the majority of current techniques allow the manipulation of virtual objects in entirely new ways that are not possible in the physical world. For example, the ray-casting technique allows users to pick up objects by pointing at them; the Go-Go technique [3] provides a virtual 'rubber arm' that allows easy reach of objects far away. Since each of these techniques departs from real-world manipulation to a greater or lesser degree, there is little previous experience that would help designers predict user performance and choose appropriate techniques. For example, it is not intuitively clear which would be more effective for object selection within a VE: pointing using ray-casting, or reaching with the aforementioned 'rubber arm' technique. Therefore, systematic analysis and experimental evaluation are crucial for understanding the usability characteristics of techniques and for providing developers with guidelines that allow informed design decisions.
In this paper, we present the results of a study evaluating the human factors characteristics of interaction techniques for object manipulation in VEs. The specific objectives are: (1) to survey and categorize current manipulation techniques within a unified taxonomy; (2) to utilize this taxonomy in evaluating some of the basic techniques for object manipulation in VEs; and (3) to derive general guidelines to aid designers in the practical development of VE applications. We limited our investigation to traditional fully immersive VEs, where the user's head and hand positions are tracked using six-degrees-of-freedom (6DOF) sensors and the user is immersed via a head-mounted display (HMD). The paper starts with a discussion of background and related work, and proceeds to survey current techniques, which are then categorized within a simple taxonomy. We then describe the experimental procedures of the study that evaluated some of the techniques. Finally, we report the results and conclude with a discussion of design implications.

2. Background and Related Work


Research efforts investigating human manipulation have a long history. The driving force behind them has been the desire to maximize human performance by improving the design of control and interaction devices, hand tools, physical work layouts, and work methods [10]. As with manipulation in the physical world, object manipulation within a VE is affected by user characteristics, manipulation task conditions, and properties of the environment [7]. In addition, user performance is strongly affected by the characteristics of interface components, in particular by the properties of input and output devices and interaction techniques [7, 11].
Figure 1. Interaction in virtual environments

Figure 1 places the major components of VE interfaces into perspective by extending the classic human factors model of human-machine interfaces introduced by Taylor [12] and later adopted for human-computer interaction [11, 13]. The user interacts with VE applications in a closed-loop system, applying motor stimuli to input devices and receiving sensory feedback through display devices. Interaction techniques map the user input captured by the devices, such as the positions of body parts, voice commands, and hand gestures, into the corresponding control actions and commands. The VE system responds by changing the state of the VE, i.e. by modifying the shape, position, color, and other properties of various entities. Display devices provide sensory feedback to users by stimulating their visual, auditory, and other perceptual systems, thus closing the interaction loop.
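This closed loop can be summarized in a short sketch. The Python below is purely illustrative of the model in Figure 1; every name in it (devices, InteractionTechnique, and so on) is hypothetical rather than part of any particular VR toolkit.

```python
class InteractionTechnique:
    """Maps raw user input (e.g. 6DOF tracker samples, button states)
    to control actions and commands."""
    def map_input(self, tracker_sample, buttons):
        raise NotImplementedError

def interaction_loop(technique, ve_state, devices, display):
    # Closed-loop interaction, as in Figure 1: motor input -> technique
    # mapping -> VE state change -> sensory feedback to the user.
    while not ve_state.done:
        sample, buttons = devices.read()               # user's motor input
        action = technique.map_input(sample, buttons)  # the interaction technique
        ve_state.apply(action)                         # VE responds by changing state
        display.render(ve_state)                       # feedback closes the loop
```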
Naturally, the performance characteristics of input and display devices in
various conditions of spatial manipulation have been subject to significant
scrutiny. Starting with the pioneering work by Ware that assessed the applicability of 3D
input devices in various manipulation tasks [8], many studies investigated the effect of
input and display devices and their properties on user manipulation performance
[14-17].
While devices have been subject to careful and thorough evaluation, the effect of
mappings, i.e. interaction techniques, on user manipulation performance remains largely
unknown. Some of the earliest work concerned with the design of manipulation techniques, rather than input devices, consisted of surveys by Hinckley [19] and Mine [20]. They summarized existing techniques, and identified problems and possible solutions. An informal usability study by Bowman and Hodges [4] was one of the early attempts to systematically evaluate 3D manipulation techniques as a distinctive component of VE interfaces. Although no quantitative data were collected, their study provided
some useful preliminary observations. For example, Bowman and Hodges observed
that ray-casting techniques were more effective for object selection tasks, while arm-
extension techniques were superior for manipulation. Recently, the number of
studies investigating 3D manipulation techniques has steadily increased [6, 21]. Our
work contributes to this growing body of research by systematically evaluating 3D
object manipulation techniques in various conditions of object selection and positioning
tasks.

3. Survey and Taxonomy of Techniques for Immersive Manipulation


Due to the wide variety of object manipulation techniques, a straightforward evaluation is difficult. Even for the same technique, user performance depends on subtle implementation details, and studies of a particular implementation may not readily generalize to other implementations of the technique. On the other hand, many techniques clearly relate to each other and share common properties. For example, there are more similarities between the ray-casting and flashlight techniques than there are between ray-casting and techniques that use non-linear mappings to extend the user's area of reach (as in Go-Go [3]). While an evaluation of ray-casting might provide insight into similar techniques, such as flashlight, it probably would not help in understanding techniques like Go-Go. A taxonomy that categorizes techniques according to their common properties is therefore crucial for understanding the relations between techniques and for directing their design and evaluation.

3.1. Brief Survey of Techniques for Object Manipulation


Interaction techniques for immersive manipulation should provide the means to accomplish at least one of the three basic manipulation tasks: object selection, object positioning, and object orientation [22, 23]. The classical approach provides the user with a 'virtual' hand: a 3D cursor, often shaped like a human hand, whose movements correspond to the movements of the tracker worn on the hand or held by the user [Figure 2(a)]. To select an object, the user simply intersects the virtual hand with the target and presses a trigger to pick it up. The object is then attached to the virtual hand and, consequently, can be translated and rotated within the VE. The virtual hand technique is rather intuitive since it simulates real-world interaction with objects. A problem, however, is that only objects within the area of reach can be picked up.
A number of techniques have been suggested to overcome this problem. The Go-Go technique [3], for example, extends the reaching distance by applying a non-linear mapping to the user's hand extension [Figure 2(b)]. While the user's real hand is within some threshold distance D, the mapping is linear and the movements of the virtual hand correspond to the real hand movements. When the user extends the hand farther than D, the mapping becomes non-linear and the virtual arm 'grows' [Figure 2(b) and (c)]. Different mapping functions can be used to achieve different control-display gains between the real and virtual hands (the fast Go-Go and stretch Go-Go techniques [4]).
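Because the mapping is stated precisely in [3], it is easy to sketch. The following minimal Python illustration measures hand extension from a point at the user's chest; the threshold D and coefficient k are illustrative values, not the settings used in the original work.

```python
import numpy as np

def go_go_virtual_hand(real_hand_pos, chest_pos, D=0.5, k=1.0):
    """Go-Go mapping [3]: one-to-one within threshold distance D,
    quadratic 'arm growth' beyond it."""
    offset = real_hand_pos - chest_pos
    r_real = float(np.linalg.norm(offset))
    if r_real == 0.0:
        return chest_pos.copy()
    if r_real < D:
        r_virtual = r_real                          # linear region: hands coincide
    else:
        r_virtual = r_real + k * (r_real - D) ** 2  # non-linear region: arm 'grows'
    return chest_pos + offset * (r_virtual / r_real)
```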
The other common way to select and manipulate objects in VEs is to point at them using a virtual ray emanating from the virtual hand. When the virtual ray intersects an object, the object can be picked up and manipulated [4, 20] [Figure 2(d)]. A problem reported with ray-casting is the difficulty of selecting small objects and those at a distance, partly due to tracker noise [1, 6, 24]. Several variations of this technique have been developed to overcome this problem. For example, the spotlight technique provides a conic selection volume, so that objects falling within the cone can be easily selected [1, 25]. However, when more than one object falls into the spotlight, further disambiguation of the target object is required. The aperture technique [24] is a modification of the spotlight technique that allows interactive control of the selection volume size. The conic pointer's direction is defined by the location of the user's eye, which is estimated from the tracked head location, and the location of a hand sensor represented as an aperture cursor within the VE [Figure 2(e)]. The user can control the size of the selection volume simply by bringing the hand sensor closer or moving it farther away. The Image Plane family of interaction techniques [5] allows the user to manipulate objects by interacting with their 2D projections on an image plane in front of the user. For example, in the Sticky Finger technique [Figure 2(f)], the user selects an object by simply touching its projection. The object underneath the user's finger is selected by casting a vector from the user's eye-point through the finger and finding an object intersecting this vector.

Figure 2. Interaction techniques for VE object manipulation: classical virtual hand (a), Go-Go (b, c), ray-casting (d), aperture (e), image plane (f) and world-in-miniature (g)
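Both the plain ray and the eye-through-finger variant reduce to the same intersection test, differing only in where the ray originates. The sketch below is a hypothetical Python illustration, using bounding spheres as a stand-in for real scene geometry.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class SceneObject:
    center: np.ndarray  # bounding-sphere approximation of the object
    radius: float

def pick_along_ray(origin, direction, objects):
    """Return the nearest object whose bounding sphere the ray hits, or None.
    Ray-casting: origin and direction come from the hand tracker.
    Sticky Finger: origin is the eye-point, direction runs through the finger."""
    d = direction / np.linalg.norm(direction)
    best, best_t = None, np.inf
    for obj in objects:
        oc = obj.center - origin
        t = float(np.dot(oc, d))               # closest approach along the ray
        if t < 0.0:
            continue                           # object lies behind the origin
        miss2 = float(np.dot(oc, oc)) - t * t  # squared ray-to-center distance
        if miss2 <= obj.radius ** 2 and t < best_t:
            best, best_t = obj, t
    return best
```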
Another approach to expanding the user's ability to access and manipulate virtual objects is to let the user manipulate the relative scale of the virtual world. One of the earliest techniques using this approach was implemented within the 3DM immersive modeler [9], where users could 'grow' or 'shrink' themselves to more easily manipulate objects of different sizes. The automatic world scaling technique [6] allows the user to scale parts of the VE containing remote objects and bring them within reach; the environment scales back after the manipulation is finished. Another interesting technique is the world-in-miniature (WIM) technique [2], which provides the user with a miniature hand-held model of the VE [Figure 2(g)]. The user can then indirectly manipulate virtual objects by interacting with their representations in the WIM.
Since all of the techniques discussed above have their strengths and weaknesses, there have been a number of attempts to integrate them, combining their best features. The Virtual Tricorder [25] combines ray-casting for object selection and manipulation with techniques for viewpoint navigation and level-of-detail control within one universal tool. Another example is HOMER, which combines ray-casting and the virtual hand: after the user selects an object by ray-casting, the virtual hand instantly snaps to the selected object, allowing manipulation [4]. The virtual hand returns to its normal position after the manipulation is completed. Cohen developed a similar technique earlier for spatial sound manipulation [26].
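The HOMER-style hand-off amounts to a small piece of state logic around any ray-based selection routine, such as the pick sketched above. The Python below is a hypothetical sketch, not the implementation of [4].

```python
class HomerManipulator:
    """Select by ray, snap the virtual hand to the object for direct
    manipulation, and return the hand to its normal position on release."""
    def __init__(self, select_fn):
        self.select_fn = select_fn  # any (origin, direction, scene) -> object routine
        self.grabbed = None
        self.saved_pos = None

    def grab(self, virtual_hand, ray_origin, ray_dir, scene):
        obj = self.select_fn(ray_origin, ray_dir, scene)
        if obj is not None:
            self.saved_pos = virtual_hand.position  # remember the normal pose
            virtual_hand.position = obj.center      # hand snaps to the object
            self.grabbed = obj

    def release(self, virtual_hand):
        if self.grabbed is not None:
            virtual_hand.position = self.saved_pos  # hand returns afterwards
            self.grabbed = None
```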

3.2. Taxonomy
An analysis of current VE manipulation techniques suggests that most are based on
a few interaction metaphors or their combinations. Each of these basic metaphors
forms the fundamental mental model of a technique—a perceptual manifestation of
what users can do, how they can do it (affordances), and what they cannot do
(constraints) using the technique [27]. Particular techniques are implementations of
basic metaphors, which are often extended to overcome some of their constraints.
These improvements, in turn, can often result in new constraints. For example, the
flashlight technique enhances a virtual pointer metaphor by using a conic pointer to ease
the selection of small objects [1]. However, this enhancement results in an ambiguity if
several objects fall into the conic pointer [24].
In Figure 3, we present a simple taxonomy of current VE manipulation techniques
that categorizes them according to their basic interaction metaphors into exocentric and
egocentric techniques. These two terms originated in studies of cockpit displays [28], and
are used now to distinguish between two fundamental styles of interaction within VEs.
In exocentric interaction, also known as the God's eye viewpoint, users interact with VEs from the outside (the outside-in, world-referenced display [28]); examples are the
world-in-miniature and world scaling techniques. In egocentric interaction, which is the
most common in immersive VEs, the user interacts from inside the environment, i.e. the
VE embeds the user [28]. There are currently two basic metaphors for egocentric
manipulation: virtual hand and virtual pointer. With the techniques based on the virtual
hand metaphor, users can reach and grab objects by ‘touching’ and ‘picking’ them with
a virtual hand. The major design factor that distinguishes these techniques is the mapping between the real and virtual hands' positions and orientations. For example, the 'classical' virtual hand technique provides a one-to-one mapping, while the Go-Go technique employs a non-linear function.

Figure 3. Taxonomy of VE manipulation techniques
With the techniques based on the virtual pointer metaphor, the user interacts with
objects by pointing at them. When the vector emanating from the virtual pointer
intersects with an object, the object can be picked up and manipulated [20]. The major
design factors that distinguish techniques are virtual pointer direction and shape, and
methods of disambiguating the target object. In the simplest case, the direction of the virtual pointer is defined by the orientation of the virtual hand, the pointer's shape is a 'laser ray', and no disambiguation is provided [Figure 2(d)]. The aperture technique [24], on the other hand, defines the pointer direction by the positions of the user's dominant eye and hand-held tracker, and the shape of the pointer is a cone.
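To make the contrast with the simple laser ray concrete, the following hypothetical Python sketch tests whether an object's center falls inside such a conic pointer; the geometry is in the spirit of the aperture technique [24], not a faithful reimplementation.

```python
import numpy as np

def in_conic_pointer(eye_pos, hand_pos, obj_center, aperture_radius):
    """Cone axis runs from the dominant eye through the hand-held cursor;
    bringing the hand closer to the eye widens the cone, and vice versa."""
    axis = hand_pos - eye_pos
    hand_dist = float(np.linalg.norm(axis))
    axis = axis / hand_dist
    half_angle = np.arctan2(aperture_radius, hand_dist)  # interactive volume control
    to_obj = obj_center - eye_pos
    cos_to_obj = float(np.dot(to_obj, axis)) / float(np.linalg.norm(to_obj))
    return cos_to_obj >= np.cos(half_angle)
```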
The suggested taxonomy identifies only the most basic metaphors for virtual object manipulation. Techniques based on these metaphors can be further subdivided to reflect the particular design aspects of each technique, or combined to form new integrated techniques, such as the Virtual Tricorder or HOMER [4, 25].

4. Empirical Evaluation of Manipulation Techniques


To compare the strengths and weaknesses of immersive object manipulation techniques,
we conducted experimental studies of the basic metaphors for egocentric object
manipulation—virtual hand and virtual pointer—in 3D selection and positioning tasks.
Our studies focused on egocentric techniques; although exocentric techniques are also
important, their evaluation is outside the scope of this work.

4.1. Evaluated Interaction Techniques


For this study, we elected to evaluate techniques that implement the studied metaphors, virtual hand and virtual pointer, as closely as possible, with little or no enhancement. As a result, we could limit the number of techniques studied and generalize some of the implications of the experiments to other techniques based on these metaphors.
We used a simple ray-casting technique to evaluate the pointing metaphor, with the pointer direction defined by the position and orientation of the virtual hand. A short segment of the ray was attached to the virtual hand to provide visual feedback [Figure 4(a)]. To select an object, the subject pointed at it and pressed a mouse button. Since the 'classical' virtual hand allows object manipulation only within reaching distance, we evaluated two variations of the virtual hand metaphor: the 'classical' virtual hand and Go-Go. The subject was presented with a virtual hand whose position and orientation were controlled by a 6DOF tracker. To select an object, the user intersected the virtual hand with the object and pressed the button [Figure 4(b)]. A one-to-one mapping was used for the 'classical' virtual hand and a non-linear mapping for the Go-Go technique.
All techniques were evaluated under two conditions: with and without visual feedback. When visual feedback was provided, the stimulus changed color when the user correctly pointed at it or 'touched' it with the virtual hand.
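In implementation terms this condition is a simple highlighting rule. A minimal sketch follows; the stimulus attributes and the highlight color are hypothetical, and is_targeted would come from an intersection or pointing test such as those sketched earlier.

```python
HIGHLIGHT_COLOR = (1.0, 1.0, 0.0)  # illustrative; the paper does not specify a color

def update_feedback(stimulus, is_targeted, feedback_enabled):
    """Change the stimulus color while it is correctly pointed at or
    'touched'; restore the base color otherwise."""
    stimulus.color = (HIGHLIGHT_COLOR if feedback_enabled and is_targeted
                      else stimulus.base_color)
```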

4.2. Materials and Equipment


The experiments were conducted within the framework of the VR Manipulation Assessment Testbed (VRMAT) [23], a tool that facilitates the rapid design and implementation of immersive manipulation studies. The testbed was implemented using a custom VR software toolkit and an SGI Onyx RE2 workstation equipped with a Virtual Research VR4 stereoscopic head-mounted display (HMD) and Polhemus Fastrak 6DOF trackers. A mouse was used as the button device. The frame update rate was controlled at 15 Hz.

4.3. Participants
Two groups of subjects were recruited from the laboratory subject pool: ten males and three females for the selection task experiments, and eight males and four females for the positioning task experiments. Subjects ranged in age from 19 to 32; all were right-handed, as determined by the Edinburgh inventory. To reduce variability in subject performance, we chose subjects who had moderate prior experience with virtual reality.

4.4. Experimental Tasks


Participants were immersed in a VE consisting of a large checked ground plane and a virtual representation of their hand (Figure 4). They wore a 6DOF tracking sensor on
their dominant hand, and held a button device (for selecting and picking objects) in the
other hand. To reduce the number of variables affecting the subjects’ performance, we
restricted their physical movement by placing them on a platform approximately 1.5 m
in diameter.
Figure 4. Experimental tasks: object selection (a) and repositioning (b)

The selection task required participants to select a solitary test object (the stimulus) using the technique under investigation. All stimuli were presented in the user's field of view as simple shapes such as spheres, cubes, and cylinders [Figure 4(a)]. The positioning task required the subject to pick a test object from an initial position and move it to a final position specified by a terminal object of different color [Figure 4(b)]. Both objects were cylinders of equal radii, and subjects were asked to align the manipulated cylinder with the terminal object. Each time the test object was released, the VRMAT calculated the positioning error as follows:

E_H = \frac{\sqrt{(x_0 - x_t)^2 + (y_0 - y_t)^2}}{D_s} \times 100\%,
E_V = \frac{\sqrt{(y_0 - y_t)^2 + (z_0 - z_t)^2}}{H_s} \times 100\%    (1)
where E_H and E_V are the percentages of horizontal and vertical displacement of the test object relative to the terminal object; x_t, y_t, z_t and x_0, y_0, z_0 are the coordinates of the test and terminal objects, respectively; and D_s and H_s are the diameter and height, equal for both objects. When both E_H and E_V became less than the specified threshold (the required accuracy), the trial was completed and both objects disappeared. The next trial was then presented.
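In code form, Eq. (1) and the completion test look as follows; this is a direct transcription, with variable names matching the equation.

```python
import math

def positioning_errors(test, terminal, Ds, Hs):
    """Eq. (1): horizontal and vertical displacement of the test object
    relative to the terminal, as percentages of the shared diameter/height."""
    xt, yt, zt = test      # test-object coordinates
    x0, y0, z0 = terminal  # terminal-object coordinates
    EH = math.sqrt((x0 - xt) ** 2 + (y0 - yt) ** 2) / Ds * 100.0
    EV = math.sqrt((y0 - yt) ** 2 + (z0 - zt) ** 2) / Hs * 100.0
    return EH, EV

def trial_complete(EH, EV, required_accuracy):
    # The trial ends once both errors fall below the required accuracy.
    return EH < required_accuracy and EV < required_accuracy
```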
The independent variables for the selection task were distance to the object, object size, interaction technique, and visual feedback. Similar to Kennedy's classical study of an operator's reaching envelope [29], the objects' positions were defined by the distance d and directions a, b from the user to the test object (Figure 5). We measured distance d in virtual cubits, a unit of distance introduced in the VRMAT and equal to the length of the user's maximum reach [23]. The main advantage of virtual cubits is the ease of generalizing experimental results: an object located at a distance of one virtual cubit lies on the boundary of the user's reach for any user and any VE, independent of the software and hardware used in the implementation. Virtual cubits can also eliminate bias due to subjects' anthropometric differences.
Figure 5. System of measurement used in the experiments

The stimuli size was defined as visual size: the vertical and horizontal angles that the object occupies in the user's field of view (Figure 5). In order to maintain the visual size specified a priori by the experimenter, the actual geometric size of the stimuli was recalculated before each trial, depending on the distance to the object and the length of the virtual cubit, which was user dependent. Thus, the visual size of the stimuli did not change with distance, so we could separate the influence of distance and object size on user performance, as well as generalize the experimental results beyond our test VE.
The main independent variables of interest for the positioning task were initial and
final distances to stimuli, required accuracy of positioning, and interaction technique. Both initial
and final distances were defined in virtual cubits; the required accuracy was defined
according to Eq. (1).

4.5. Performance Criteria


Completion time was used as the primary performance criterion. For the selection task, this was the time from the moment the stimulus appeared until the moment it was successfully selected by the subject. For the positioning task, completion time was measured from the moment the subject picked up the stimulus until the moment it was positioned with the required accuracy. The interaction sequence for the positioning task consisted of multiple adjusting repositioning movements, i.e. the subject would pick up, reposition, and release an object several times until it was positioned with the required accuracy. Accordingly, we also measured the number of iterations it took to complete the task, as well as the 'net' time, i.e. the completion time excluding the time required for each selection between repositioning movements.
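All three criteria can be derived from a trial's event log. The sketch below assumes a hypothetical log format of time-ordered (timestamp, kind) pairs; the paper does not describe the VRMAT's internal representation.

```python
def positioning_measures(events):
    """Completion time, 'net' time, and number of iterations for one trial.
    `events` holds (timestamp, kind) pairs, kind in {'pick', 'release', 'done'}."""
    picks = [t for t, k in events if k == 'pick']
    t_done = next(t for t, k in events if k == 'done')
    completion_time = t_done - picks[0]
    # 'Net' time excludes the selection time between repositioning moves,
    # i.e. the gaps from each release to the following pick.
    gaps = sum(t2 - t1 for (t1, k1), (t2, k2) in zip(events, events[1:])
               if k1 == 'release' and k2 == 'pick')
    return completion_time, completion_time - gaps, len(picks)
```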

4.6. Procedure
A balanced within-subject (repeated measures) design was used. After donning the
HMD, subjects were asked to momentarily extend their tracked hand to its full natural
reach for ‘virtual cubit’ calibration. The environment then was recalibrated according to
the length of the virtual cubit. Following an explanation of the interaction techniques
and tasks, subjects had on average 3 min to practice them. Each subject completed 18
experimental sessions for selection tasks: three sessions with and three sessions without
visual feedback for each technique. There were 15 trials in each session: manipulating
three object sizes (4, 6 and 9 degrees of visual field) and five distances (0.7, 1, 2, 4 and
6 virtual cubits).

For the object repositioning tasks, we evaluated techniques under two conditions:
(1) repositioning at constant distances from the user and (2) repositioning within the
area of reach. Subjects completed three sessions for each technique with six trials in each
session: four trials for repositioning at constant distances (0.7, 2.2, 3.5, and 6 virtual
cubits) and two trials for repositioning within the area of reach with moderate distance
changes (from 0.7 to 1 and from 1 to 0.7 virtual cubits). The required accuracy for all trials was set at 20%.

The order of trials within sessions was randomized, and trials were presented one after the other with a 4 s delay between them until the session was completed. The order of interaction techniques was also randomized. In addition to the on-line performance data, an informal questionnaire was administered after the experiments.

4.7. Experimental Results


4.7.1. Selection Task
A repeated measures multiple-way ANOVA was performed with completion time as the dependent variable; distance, size, visual feedback, and interaction technique were independent variables. A significant main effect was found for distance [F(4, 48) = 54.23, p < 0.0001], object size [F(2, 24) = 92.25, p < 0.0001], and visual feedback [F(1, 12) = 15.4, p < 0.002]. In addition, significant interactions were found between technique and object size [F(2, 24) = 47.95, p < 0.0001], technique and distance [F(4, 48) = 6.9, p < 0.0001], and technique and visual feedback [F(1, 12) = 8.19, p < 0.01]. These interactions suggest that neither the virtual hand nor the virtual pointer was universally preferable: their relative weaknesses and strengths depended on the particular conditions of the selection task.
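For readers who wish to reproduce this style of analysis, the sketch below shows an equivalent repeated measures ANOVA using a modern library (statsmodels' AnovaRM), which is not the software used in the original study; the file and column names are hypothetical, and the data frame must contain one aggregated completion time per subject and condition cell.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

df = pd.read_csv("selection_trials.csv")  # hypothetical per-trial data
result = AnovaRM(df, depvar="time", subject="subject",
                 within=["technique", "distance", "size", "feedback"],
                 aggregate_func="mean").fit()
print(result)  # F and p values for main effects and interactions
```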
Figure 6(a) summarizes the effects of distance and size on object-selection performance for the Go-Go and ray-casting techniques without visual feedback. For both techniques, as object size decreases or distance increases, the target object becomes increasingly harder to 'hit'. This observation was expected and appears to represent a Fitts' Law phenomenon [30]. A significant interaction between size and distance was also found for both techniques [F(8, 96) = 4.9, p < 0.0001 and F(8, 96) = 5.9, p < 0.0001, respectively]. This interaction suggests that the effect of distance is stronger under conditions that require more accurate selection, i.e. the selection of smaller objects.
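For reference, selection difficulty in such analyses is usually summarized by Fitts' index of difficulty; the sketch below uses the later Shannon formulation rather than the 1954 original [30], purely to make the size/distance trade-off concrete.

```python
import math

def fitts_index_of_difficulty(distance, width):
    """Index of difficulty (bits): grows with distance to the target and
    shrinks with target width, matching the trends in Figure 6(a)."""
    return math.log2(distance / width + 1.0)
```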
A comparison of the techniques showed that within the area of reach, both techniques exhibited comparable performance for all object sizes, with slightly better performance for ray-casting. However, with increased distance, the performance of ray-casting rapidly degraded, especially when high selection accuracy was required, i.e. the selection of small objects [Figure 6(a)]. In these conditions, the Go-Go technique performed significantly better [F(1, 12) = 9.13, p < 0.01], while both techniques had comparable performance in the selection of large objects, where no significant difference was found [F(1, 12) = 2.47, p < 0.142].
Visual feedback significantly improved ray-casting performance [Figure 6(b)]: ANOVA revealed a significant effect due to visual feedback [F(1, 12) = 18.3, p < 0.001], as well as an interaction between visual feedback and distance [F(4, 48) = 8.41, p < 0.0001]. Separate analyses of local and at-a-distance selection revealed that while visual feedback significantly improved performance at-a-distance [F(1, 12) = 16.1, p < 0.002], its effect was not significant for local selection [F(1, 12) = 2.789, p < 0.12]. Surprisingly, although visual feedback seemed to improve the performance of the Go-Go technique [Figure 6(d)], the effect was not statistically significant [F(2, 24) = 1.89, p < 0.17]. A comparison of techniques shows that ray-casting enhanced with visual feedback results in better performance than Go-Go [Figure 6(c)], except for conditions where high accuracy is required, i.e. the selection of small objects. Under those conditions, the Go-Go technique still outperformed ray-casting [F(1, 12) = 8.96, p < 0.01].

Finally, ANOVA did not reveal any significant difference between the ray-casting, Go-Go, and 'classical' virtual hand techniques for object selection within the user's reach [F(2, 24) = 2.25, p < 0.13], where all techniques showed similar performance. Visual feedback did not result in significant improvements for the classical virtual hand [F(1, 12) = 1.55, p < 0.24].

Figure 6. Selection time for ray-casting and Go-Go interaction techniques in object selection

4.7.2. Positioning Task


A repeated measures multiple-way ANOVA was conducted with distance and interaction technique as independent variables, and absolute positioning time, 'net' positioning time (i.e. with selection time subtracted), and the number of iterations as dependent variables. A significant effect due to distance was found for all dependent variables for repositioning at constant distances from the user [F(3, 33) = 48.5, p < 0.0001; F(3, 33) = 39.22, p < 0.0001; and F(3, 33) = 25.83, p < 0.0001, respectively; Figure 7]. While no significant effect due to technique was found for absolute and net positioning times [F(1, 11) = 0.132, p < 0.72 and F(1, 11) = 0.747, p < 0.41], a significant technique effect was found for the number of iterations [F(3, 33) = 5.47, p < 0.039]. The distance-by-distance analysis showed that with increased distance, ray-casting allowed object repositioning with fewer movements than Go-Go. For example, it required 1.8 fewer movements than Go-Go for repositioning at a distance of 6 virtual cubits.

A comparison of the ray-casting, Go-Go, and classical virtual hand techniques for object repositioning within the user's reach showed that all techniques resulted in similar performance when a moderate change of distance was required [absolute time: F(2, 22) = 2.9, p < 0.08; net time: F(2, 22) = 1.36, p < 0.28]. For repositioning at a constant distance (0.8 virtual cubits), the 'classical' virtual hand and ray-casting outperformed Go-Go [absolute time: F(2, 22) = 13.759, p < 0.0001; net time: F(2, 22) = 8.8, p < 0.002]. Under these conditions, the classical virtual hand was 22% faster and ray-casting 15% faster than Go-Go. In general, ray-casting performed much better when the task did not involve a change of distance [absolute time: F(2, 22) = 17.77, p < 0.0001]; Go-Go, on the other hand, demonstrated the same performance under all conditions [absolute time: F(2, 22) = 0.95, p < 0.4].

Figure 7. 'Net' time, absolute time (a, b) and number of iterations (c)

4.7.3. Subjects' Comments


While none of the subjects had difficulties in using any of the techniques, Go-Go was rated as the most intuitive and enjoyable, with ray-casting second. This finding replicates previous results [4]. Three subjects, however, preferred the classical virtual hand, reporting that it provided a more familiar mapping. All subjects were dissatisfied with the decreased ray-casting performance in the selection of small objects at far distances, and most participants commented on the performance improvement when ray-casting was enhanced with visual feedback. The subjects also stated that one of the main difficulties in positioning objects at a distance was the limited visual cues, rather than shortcomings in the techniques themselves: they simply could not see whether the object was positioned correctly.

5. Discussion
These experiments demonstrate that there is no single 'best' interaction metaphor among those studied: their strengths and weaknesses can be compared only in relation to the particular conditions of the spatial manipulation task. Here we discuss some of the issues that arose from the studies.

5.1. Virtual Pointer vs. Virtual Hand in Object Selection Task


The techniques' performance depended on the conditions of the selection task. Within the area of the user's reach, all techniques demonstrated similar performance, with ray-casting exhibiting slightly better scores, especially when accurate selection was not required. Therefore, in applications where both local and remote selection are required, the classical virtual hand can be replaced by the Go-Go or ray-casting techniques without degrading user performance in local selection conditions. The Go-Go interaction technique can also be considered a generalization of the classical virtual hand technique for selection at-a-distance.

While both the ray-casting and Go-Go techniques allowed effective selection at-a-distance, Go-Go resulted in notably better performance when accurate selection was required. Ray-casting, on the other hand, was more efficient when high selection accuracy was not needed. The introduction of visual feedback significantly improved the accuracy of ray-casting and is an important enhancement to this technique. However, even with visual feedback, Go-Go was still faster for accurate selection [Figure 6(c)]. The choice of technique for selection at-a-distance depends, therefore, on the accuracy of selection required in a particular application.

5.2. Virtual Pointer vs. Virtual Hand in Object Positioning Task


It is difficult to compare the virtual pointer and virtual hand metaphors here, since it is next to impossible to reposition an object using ray-casting if a change of distance is required (unless the virtual pointer is extended with some mechanism to manipulate its length, as in Bowman and Hodges [4] and Mine et al. [6]). Repositioning with a change of distance is natural and intuitive using the Go-Go technique. A virtual pointer, however, is very effective for repositioning at constant distances from the user and within the user's reaching area. Under such conditions, ray-casting resulted in similar or better performance than Go-Go [Figure 7(a) and (b)]. Furthermore, ray-casting resulted in better performance, in terms of the number of iterative movements, as the distance to the object increased [Figure 7(c)].

5.3. Visual Feedback


Enhancing interaction techniques with visual feedback does not always improve user performance in object selection. Visual feedback did considerably improve ray-casting performance for selection at-a-distance, making the selection of small objects easier. The effect of visual feedback, however, was not significant within the area of the user's reach. Furthermore, there was no significant effect of visual feedback for the Go-Go technique, which was somewhat surprising. Previous evaluations of Go-Go indicated that because of the non-linear mapping used in the technique, an increase in object distance might lead to 'overshooting' targets. As a result, we expected that visual feedback might improve Go-Go performance at greater distances, but this did not happen. One possible explanation is that with techniques based on the virtual hand metaphor, the user can naturally see when the virtual hand intersects the object. Therefore, visual feedback is an inherent part of the metaphor, and adding 'more' visual feedback does not necessarily result in performance improvements. A second explanation is that because object sizes are defined by visual angles, moving objects farther from the user increases their actual geometric size [23]. This leads to an increase in the stimuli volume which, in turn, counterbalances the effect of the overshoot. This situation, in fact, is very natural for VEs: in order for an object to be visible at large distances, its geometric size should be quite large. An important exception is the selection of flat objects. Therefore, the depth of the stimuli is another important variable that should be considered in technique evaluation and interface design.

5.4. Metaphor Affordances and Constraints: 2D vs. 3D Manipulation


Our findings suggest that the basic affordances and constraints of the virtual pointer and
the virtual hand metaphors are defined by the number of degrees of freedom that can be
effectively controlled. While the virtual hand allowed effective manipulation of all three
object coordinates in selection and positioning tasks (d, a, and b; Figure 5), the virtual pointer allowed effective control of only two of them, the a and b object directions; it was much less effective in manipulating the third coordinate, distance d. Indeed, as the stimuli distance increased, ray-casting performance degraded significantly more than Go-Go performance. While ray-casting was very inefficient for positioning that
required a change of distance, it was very effective for repositioning at a constant
distance, when only directions to the test object were manipulated. Thus, the virtual
hand and virtual pointer can be categorized as 3D and 2D manipulation metaphors,
respectively.
Certainly, the virtual pointer metaphor can be enhanced to allow more direct control of distance. For example, Bowman and Hodges [4] extended ray-casting with a 'fishing reel' that allowed changes of the virtual pointer's length. However, the user performance implications of such extensions are not clear. Will the reeling mechanism provide performance comparable to Go-Go in repositioning tasks? Will it improve ray-casting performance in the selection of small objects? Is it possible that enhancing the metaphor will degrade user performance in some task conditions? These questions are subject to future human factors evaluations.
We should also note that our categorization of techniques as 2D or 3D is valid only for the studied task conditions or their close approximations. In other conditions, the virtual pointer may behave as a 3D technique (for example, when the selection of occluded objects is required [32]).

5.5. World-Centered vs. User-Centered Design of Virtual Interaction


Prior research and development of user interfaces for VEs has been geared towards developing effective interaction techniques that allow the user to interact efficiently under any given task conditions. However, our findings suggest that even for basic tasks, such as object selection and positioning, the development of a single universally efficient technique is difficult, if not impossible. Thus, instead of developing new interaction techniques, developers can take another route: designing VEs that allow for optimal performance using existing techniques. We call these two approaches, respectively, world-centered and user-centered VE interface design. The categorization of VE design methods as user- or world-centered is, certainly, a generalization. Practical VE system development should probably use methods and principles based on both approaches, depending on the purpose of the application.

6. Conclusion
The growing acceptance of VE technology will require significantly more attention to VE interaction to maximize user performance. This study systematically explores one of the most important aspects of immersive interfaces: interaction metaphors for immersive object selection and positioning. The paper presented an original taxonomy of interaction techniques for immersive manipulation, described the methodological framework used in the experiments, reported experimental results, and drew some general design implications for the development of VE interfaces.

The research reported here, however, is just a small step towards understanding the human factors behind manipulation in VEs. Future studies should further investigate the design aspects of particular techniques and their influence on user performance. For example, it is necessary to assess their usability under other manipulation task conditions, investigate combinations of manipulation and navigation techniques, and explore possible ways to integrate various techniques into seamless and intuitive interaction dialogues.

Acknowledgments
This research was partially sponsored by the Air Force Office of Scientific Research
(contract *92-NL-225) and a grant from the HIT Lab Virtual Worlds Consortium. The
authors want to especially thank Jennifer Feyma for her help with experiments. We
would also like to thank Edward Miller, Jerry Prothero, Hunter Hoffman, Doug
Bowman, Prof. Hirakawa and all the subjects who participated in the experiments.

References
1. J. Liang (1994) JDCAD: a highly interactive 3D modeling system. Computers and Graphics 18, 499-506.
2. R. Stoakley, M. Conway & R. Pausch (1995) Virtual reality on a WIM: interactive worlds in miniature. In: Proceedings of CHI '95, pp. 265-272.
3. I. Poupyrev, M. Billinghurst, S. Weghorst & T. Ichikawa (1996) Go-Go interaction technique: non-linear mapping for direct manipulation in VR. In: Proceedings of UIST '96, pp. 79-80.
4. D. Bowman & L. Hodges (1997) An evaluation of techniques for grabbing and manipulating remote objects in immersive virtual environments. In: Proceedings of the Symposium on Interactive 3D Graphics, pp. 35-38.
5. J. Pierce, A. Forsberg, M. Conway, S. Hong, R. Zeleznik & M. Mine (1997) Image plane interaction techniques in 3D immersive environments. In: Proceedings of the Symposium on Interactive 3D Graphics, pp. 39-43.
6. M. Mine, F. Brooks & C. Sequin (1997) Moving objects in space: exploiting proprioception in virtual-environment interaction. In: Proceedings of SIGGRAPH '97, pp. 19-26.
7. K. Stanney (1995) Realizing the full potential of virtual reality: human factors issues that could stand in the way. In: Proceedings of VRAIS '95, pp. 28-34.
8. C. Ware (1990) Using hand position for virtual object placement. The Visual Computer 5, 245-253.
9. J. Butterworth, A. Davidson, S. Hench & T. Olano (1992) 3DM: a three dimensional modeler using a head-mounted display. In: Proceedings of the Symposium on Interactive 3D Graphics, pp. 135-138.
10. E. McCormick (1970) Human Factors Engineering, 3rd edn. McGraw-Hill, New York, 639 pp.
11. I. MacKenzie (1995) Input devices and interaction techniques for advanced computing. In: Virtual Environments and Advanced Interface Design (W. Barfield & T. Furness III, eds). Oxford Univ. Press, Oxford, pp. 437-470.
12. F. Taylor (1957) Psychology and design of machines. The American Psychologist 12, 249-258.
13. J. N. Latta & D. J. Oberg (1994) A conceptual virtual reality model. IEEE Computer Graphics & Applications 14, 23-29.
14. S. Zhai & P. Milgram (1993) Human performance evaluation of manipulation schemes in virtual environments. In: Proceedings of VRAIS '93, pp. 155-161.
15. B. Watson, V. Spaulding, N. Walker & W. Ribarsky (1996) Evaluation of the effects of frame time variation on VR task performance. In: Proceedings of VRAIS '96, pp. 38-52.
16. E. Spain & K. Holzhauzen (1991) Stereoscopic versus orthogonal view displays for performance of a remote manipulation task. In: Proceedings of Stereoscopic Displays and Applications II, pp. 103-110.
17. J. Boritz & K. Booth (1997) A study of interactive 3D point location in a computer simulated virtual environment. In: Proceedings of VRST '97, pp. 181-187.
18. S. Zhai, W. Buxton & P. Milgram (1994) The 'Silk cursor': investigating transparency for 3D target acquisition. In: Proceedings of CHI '94, pp. 459-464.
19. K. Hinckley, R. Pausch, J. Goble & N. Kassell (1994) A survey of design issues in spatial input. In: Proceedings of UIST '94, pp. 213-222.
20. M. Mine (1995) Virtual environment interaction techniques. Technical Report TR95-018, UNC Chapel Hill CS Department.
21. K. Hinckley, J. Tullio, R. Pausch, D. Proffitt & N. Kassell (1997) Usability analysis of 3D rotation techniques. In: Proceedings of UIST '97, pp. 1-10.
22. S. Grissom & G. Perlman (1995) StEP(3D): a standardized evaluation plan for three-dimensional interaction techniques. International Journal of Human-Computer Studies 43, 15-41.
23. I. Poupyrev, S. Weghorst, M. Billinghurst & T. Ichikawa (1997) A framework and testbed for studying manipulation techniques for immersive VR. In: Proceedings of VRST '97, pp. 21-28.
24. A. Forsberg, K. Herndon & R. Zeleznik (1996) Aperture based selection for immersive virtual environments. In: Proceedings of UIST '96, pp. 95-96.
25. M. Wloka & E. Greenfield (1995) The virtual tricorder: a uniform interface for virtual reality. In: Proceedings of UIST '95, pp. 39-40.
26. M. Cohen (1993) Throwing, pitching and catching sound: audio windowing models and modes. International Journal of Man-Machine Studies 39, 269-304.
27. T. Erickson (1990) Working with interface metaphors. In: The Art of Human-Computer Interface Design (B. Laurel, ed.). Addison-Wesley, Reading, MA, pp. 65-73.
28. C. D. Wickens & P. Baker (1995) Cognitive issues in virtual reality. In: Virtual Environments and Advanced Interface Design (T. A. Furness & W. Barfield, eds). Oxford Univ. Press, New York, pp. 514-542.
29. K. Kennedy (1964) Reach capability of the USAF population: Phase 1. The outer boundaries of grasping-reach envelopes for the short-sleeved, seated operator. Technical Report TDR-64-56, USAF, AMRL.
30. P. Fitts (1954) The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology 47, 381-391.
31. J. Foley, V. Wallace & P. Chan (1984) The human factors of computer graphics interaction techniques. IEEE Computer Graphics & Applications 4, 13-48.
32. C. Shaw (1998) Personal communication. May 28.
