Manipulating Objects in Virtual Worlds
1. Introduction
MANIPULATION OF OBJECTS is one of the most fundamental interactions between humans
and their environment, whether in the physical or virtual world. In immersive virtual
environments (VEs), object manipulation can occur both as a sole activity, capturing the
whole attention of the user, and as a component of a complex interaction sequence.
Hence, a significant amount of research on designing interaction techniques that
provide effective means for selecting and manipulating objects in VEs has been carried
out in recent years. This has resulted in a wide array of techniques, such as laser pointer
and flashlight [1], World-In-Miniature [2], Go-Go [3], HOMER [4], image plane [5],
and scaled-world grab [6], among many others.
As the field of virtual reality has matured, the lack of comprehensive research
evaluating the human factors of 3D manipulation techniques and their design implica-
tions has become apparent [7]. The sheer variety of techniques itself presents a challenge
for developers. How do all these techniques relate to each other? Which interaction
techniques should be chosen for particular tasks? Which among the many available
techniques are the most effective?
Input devices translate the user's actions, such as body movements
and hand gestures, into the corresponding control actions and commands. The VE
system responds by changing the state of VE, i.e., by modifying the shape, position,
color, and other properties of various entities. Display devices provide sensory feedback
to users by stimulating their visual, auditory, and other perceptual systems, thus closing
the interaction loop.
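Schematically, this loop can be sketched as follows; this is a minimal illustration, and the device, system, and display interfaces are placeholders rather than an actual API:

```python
def interaction_loop(input_devices, ve_system, displays):
    """One pass of the human-VE interaction loop described above."""
    while ve_system.running:
        # Input devices translate user actions into control commands.
        commands = [device.read_and_map() for device in input_devices]
        # The VE system changes entity state (shape, position, color, ...).
        for command in commands:
            ve_system.apply(command)
        # Displays stimulate the user's perceptual systems, closing the loop.
        for display in displays:
            display.present(ve_system.state)
```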
Naturally, the performance characteristics of input and display devices in
various conditions of spatial manipulation have been subject to significant
scrutiny. Starting with the pioneering work by Ware that assessed the applicability of 3D
input devices in various manipulation tasks [8], many studies investigated the effect of
input and display devices and their properties on user manipulation performance
[14–17].
While devices have been subject to careful and thorough evaluation, the effect of
mappings, i.e. interaction techniques, on user manipulation performance remains largely
unknown. Some of the earliest work concerned with the design of manipulation
techniques, rather than input devices, came in surveys by Hinckley et al. [19] and Mine
[20], which summarized existing techniques and identified problems and possible
solutions. An informal usability study by Bowman and Hodges [4] was one of the early
attempts to systematically evaluate 3D manipulation techniques as a distinctive compon-
ent of VE interfaces. Although no quantitative data was collected, their study provided
some useful preliminary observations. For example, Bowman and Hodges observed
that ray-casting techniques were more effective for object selection tasks, while arm-
extension techniques were superior for manipulation. Recently, the number of
studies investigating 3D manipulation techniques has steadily increased [6, 21]. Our
work contributes to this growing body of research by systematically evaluating 3D
object manipulation techniques in various conditions of object selection and positioning
tasks.
For example, there are more similarities between ray-casting and flashlight techniques than
there are between ray-casting and techniques that use non-linear mappings to extend the
user’s area of reach (as in Go-Go [3]). While an evaluation of ray casting might provide
insight into similar techniques, such as flashlight, it probably would not help in
understanding techniques like Go-Go. A taxonomy of techniques that categorizes them
according to their common properties is crucial for understanding the relations between
techniques and for directing their design and evaluation.
Figure 2. Interaction techniques for VE object manipulation: classical virtual hand (a), Go-Go (b, c),
ray-casting (d), aperture (e), image plane (f) and world-in-miniature (g)
The image plane techniques [5] allow the user to select a remote
object by simply touching its projection. The object underneath the user's finger is
selected by casting a vector from the user’s eye-point through the finger and finding an
object intersecting with this vector.
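A minimal sketch of this eye-through-finger test follows, approximating objects by bounding spheres; the function name and the sphere representation are illustrative, not part of the original technique:

```python
import numpy as np

def image_plane_pick(eye, finger, objects):
    """Cast a ray from the eye-point through the fingertip and return the
    object whose bounding sphere the ray intersects, nearest by the
    closest-approach distance. `objects` is a list of
    (name, center, radius) tuples with numpy-array centers."""
    ray = (finger - eye) / np.linalg.norm(finger - eye)
    hit, hit_distance = None, np.inf
    for name, center, radius in objects:
        to_center = center - eye
        t = np.dot(to_center, ray)          # closest approach along the ray
        if t < 0:
            continue                        # the object is behind the eye
        if np.linalg.norm(to_center - t * ray) <= radius and t < hit_distance:
            hit, hit_distance = name, t
    return hit
```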
Another approach to expanding the user's ability to access and manipulate virtual
objects allows the user to manipulate the relative scale of the virtual world. One of the
earliest techniques using this approach was implemented within a 3DM immersive
modeler [9], where users could ‘grow’ or ‘shrink’ themselves to more easily manipulate
objects of different sizes. The automatic world scaling technique [6] allows the user to
scale and bring parts of the VE containing remote objects within the user’s reach. The
environment scales back after the manipulation is finished. Another interesting
technique is the world-in-miniature (WIM) technique [2], which provides the user with
a miniature hand-held model of the VE [Figure 2(g)]. The user can then indirectly
manipulate virtual objects by interacting with their representations in the WIM.
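Both world scaling and the WIM rest on the same underlying mapping between a scaled copy of the environment and the full-scale world. A minimal sketch of that mapping follows; the function names and the uniform-scaling assumption are ours, not the cited papers':

```python
import numpy as np

def to_miniature(p_world, anchor, scale):
    """Map a full-scale world position into a copy of the world scaled by
    `scale` about `anchor` (the eye-point for world scaling, the hand-held
    model's origin for a WIM). Positions are 3-vectors."""
    p_world, anchor = np.asarray(p_world, float), np.asarray(anchor, float)
    return anchor + scale * (p_world - anchor)

def to_world(p_miniature, anchor, scale):
    """Inverse mapping: moving a proxy in the miniature moves the real
    object; when manipulation ends, the environment 'scales back' simply
    by rendering the world at scale 1 again."""
    p_miniature, anchor = np.asarray(p_miniature, float), np.asarray(anchor, float)
    return anchor + (p_miniature - anchor) / scale
```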
Since all of the techniques discussed above have their strengths and weaknesses, there
have been a number of attempts to integrate them, combining their best features. Virtual
Tricorder [25] combines ray casting for object selection and manipulation with tech-
niques for viewpoint navigation and level-of-detail control within one universal tool.
Another example is HOMER, which combines ray-casting and virtual hand: after the
user selects an object by ray-casting, the virtual hand instantly snaps to the selected
object, allowing direct manipulation [4]. The virtual hand returns to its normal position after
the manipulation is completed. Cohen developed a similar technique earlier for spatial
sound manipulation [26].
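A minimal state sketch of such an integration follows (HOMER-style). The class and its members are illustrative, and the motion rescaling that HOMER applies during manipulation is omitted for brevity:

```python
class HybridManipulator:
    """Select remotely by ray-casting, then manipulate with a virtual hand
    snapped to the object; the hand reverts on release."""

    def __init__(self, pick_with_ray):
        self.pick_with_ray = pick_with_ray   # any ray-casting routine
        self.grabbed = None

    def select(self, ray_origin, ray_direction):
        self.grabbed = self.pick_with_ray(ray_origin, ray_direction)
        if self.grabbed is not None:
            # the virtual hand instantly snaps to the selected object
            self.virtual_hand_pos = self.grabbed.position
        return self.grabbed

    def update(self, tracked_hand_pos):
        if self.grabbed is None:
            # normal virtual hand: one-to-one with the tracked hand
            self.virtual_hand_pos = tracked_hand_pos
        else:
            # while grabbed, the object follows the hand's motion
            # (HOMER additionally rescales this motion; omitted here)
            self.virtual_hand_pos = tracked_hand_pos
            self.grabbed.position = self.virtual_hand_pos

    def release(self):
        self.grabbed = None   # hand returns to the tracked position
```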
3.2. Taxonomy
An analysis of current VE manipulation techniques suggests that most are based on
a few interaction metaphors or their combinations. Each of these basic metaphors
forms the fundamental mental model of a technique—a perceptual manifestation of
what users can do, how they can do it (affordances), and what they cannot do
(constraints) using the technique [27]. Particular techniques are implementations of
basic metaphors, which are often extended to overcome some of their constraints.
These improvements, in turn, can often result in new constraints. For example, the
flashlight technique enhances a virtual pointer metaphor by using a conic pointer to ease
the selection of small objects [1]. However, this enhancement results in an ambiguity if
several objects fall into the conic pointer [24].
In Figure 3, we present a simple taxonomy of current VE manipulation techniques
that categorizes them according to their basic interaction metaphors into exocentric and
egocentric techniques. These two terms originated in studies of cockpit displays [28] and
are now used to distinguish between two fundamental styles of interaction within VEs.
In exocentric interaction, also known as the God’s eye viewpoint, users interact with
VEs from the outside (the outside-in world referenced display [28]); examples are the
world-in-miniature and world scaling techniques. In egocentric interaction, which is the
most common in immersive VEs, the user interacts from inside the environment, i.e. the
VE embeds the user [28]. There are currently two basic metaphors for egocentric
manipulation: virtual hand and virtual pointer. With the techniques based on the virtual
hand metaphor, users can reach and grab objects by ‘touching’ and ‘picking’ them with
a virtual hand. The major design factor that distinguishes techniques is the mapping
between the real and virtual hand’s positions and orientations. For example, a ‘classical’
virtual hand technique provides one-to-one mapping, while the Go-Go technique
employs a non-linear function.
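To illustrate the distinction, a minimal sketch of the two mappings follows. The quadratic form and its threshold follow the published Go-Go formulation [3]; the constant values and names here are illustrative:

```python
def classical_virtual_hand(r_real):
    """One-to-one mapping: the virtual hand mirrors the real hand exactly."""
    return r_real

def go_go(r_real, d=0.4, k=0.5):
    """Go-Go mapping [3]: linear within distance d of the body, then the
    virtual arm grows quadratically, extending the area of reach.
    r_real is the real hand's distance from the body; d (the linear zone,
    roughly 2/3 of arm length in the original) and k are tunable constants."""
    if r_real < d:
        return r_real
    return r_real + k * (r_real - d) ** 2
```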
With the techniques based on the virtual pointer metaphor, the user interacts with
objects by pointing at them. When the vector emanating from the virtual pointer
intersects with an object, the object can be picked up and manipulated [20]. The major
design factors that distinguish techniques are virtual pointer direction and shape, and
methods of disambiguating the target object. In the simplest case, the direction of the
virtual pointer is defined by the orientation of the virtual hand, the pointer’s shape is
a ‘laser ray’, and no disambiguation is provided [Figure 2(d)]. The aperture technique [24],
on the other hand, defines the pointer direction by the positions of the user's dominant eye
and a hand-held tracker, and the shape of the pointer is a cone.
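A sketch of a conic pointer test with nearest-to-axis disambiguation follows; the thin ‘laser ray’ case is simply the limit of a very small cone angle. The function and variable names are ours, and objects are reduced to center points:

```python
import numpy as np

def pick_with_cone(origin, direction, objects, half_angle_deg=5.0):
    """Return the object inside the selection cone that lies closest to the
    cone's axis, or None if no object falls inside the cone.
    `objects` maps names to 3D center points (numpy arrays)."""
    axis = direction / np.linalg.norm(direction)
    best, best_angle = None, np.radians(half_angle_deg)
    for name, center in objects.items():
        offset = center - origin
        distance = np.linalg.norm(offset)
        if distance == 0.0:
            continue
        angle = np.arccos(np.clip(np.dot(offset / distance, axis), -1.0, 1.0))
        if angle <= best_angle:          # inside the cone, nearest to axis so far
            best, best_angle = name, angle
    return best
```

For plain ray-casting, `origin` and `direction` would come from the virtual hand; for the aperture technique, `origin` would be the dominant-eye position and `direction` the vector from the eye through the hand-held tracker.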
This suggested taxonomy identifies only the most basic metaphors for virtual object
manipulation. Techniques based on these metaphors can be further subdivided to reflect
the particular design aspects of each technique, or combined to form new
integrated techniques such as the Virtual Tricorder or HOMER [4, 25].
4.3. Participants
Two groups of subjects were recruited from the laboratory subject pool: ten males and
three females for the selection task experiments, and eight males and four females for the
positioning task experiments. Subjects ranged in age from 19 to 32; all were right-handed,
as determined by the Edinburgh inventory. To reduce the variability in subject
performance, we chose subjects who had moderate prior experience with virtual reality.
The positioning task required the subject to pick a test object from an initial position and move it to a final
position specified by a terminal object of a different color [Figure 4(b)]. Both objects
were cylinders of equal radii, and subjects were asked to align the manipulated cylinder
with the terminal object. Each time the test object was released, the VRMAT calculated
the positioning error according to Eq. (1).
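A natural formulation of such an error measure, offered here only as an illustrative sketch (the symbols and the exact form are assumptions, not a quotation of Eq. (1)), normalizes the offset between the manipulated and terminal cylinders by the object size \(s\), so that the required accuracy can be stated as a fraction of object size:

\[
E \;=\; \frac{\lVert \mathbf{p}_{\mathrm{object}} - \mathbf{p}_{\mathrm{terminal}} \rVert}{s},
\qquad \text{trial satisfied when } E \le E_{\mathrm{required}}.
\]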
Object sizes were defined by visual angle and therefore did not vary visually
with the distance change, so we could separate the influence of distance and object size
on user performance as well as generalize the experimental results beyond our test VE.
The main independent variables of interest for the positioning task were initial and
final distances to stimuli, required accuracy of positioning, and interaction technique. Both initial
and final distances were defined in virtual cubits; the required accuracy was defined
according to Eq. (1).
4.6. Procedure
A balanced within-subject (repeated measures) design was used. After donning the
HMD, subjects were asked to momentarily extend their tracked hand to its full natural
reach for ‘virtual cubit’ calibration. The environment was then rescaled according to
the length of the virtual cubit. Following an explanation of the interaction techniques
and tasks, subjects had on average 3 min to practice them. Each subject completed 18
experimental sessions for selection tasks: three sessions with and three sessions without
visual feedback for each technique. There were 15 trials in each session, combining
three object sizes (4, 6 and 9 degrees of visual angle) and five distances (0.7, 1, 2, 4 and
6 virtual cubits).
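The cubit calibration amounts to measuring the user's reach once and expressing every experimental distance in that unit; a minimal sketch, with illustrative names, follows:

```python
import numpy as np

def calibrate_virtual_cubit(body_pos, extended_hand_pos):
    """One virtual cubit = the user's full natural reach, measured while the
    tracked hand is momentarily held fully extended."""
    return np.linalg.norm(np.asarray(extended_hand_pos) - np.asarray(body_pos))

def scale_to_user(distance_in_cubits, cubit_length):
    """Rescale a condition distance (e.g. 0.7, 1, 2, 4 or 6 cubits) to this
    user's reach, making conditions comparable across subjects."""
    return distance_in_cubits * cubit_length
```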
For the object repositioning tasks, we evaluated techniques under two conditions:
(1) repositioning at constant distances from the user and (2) repositioning within the
area of reach. Subjects completed three sessions for each technique with six trials in each
session: four trials for repositioning at constant distances (0.7, 2.2, 3.5, and 6 virtual
cubits) and two trials for repositioning within the area of reach with moderate distance
changes (from 0.7 to 1 and from 1 to 0.7 virtual cubits). In all trials the required
accuracy of positioning was set to 20%.
The order of trials within sessions was randomized, and trials were presented one
after the other with a 4 s delay between them until the session was completed. The
order of the interaction techniques was also randomized. In addition to the on-line
performance data, an informal questionnaire was administered after the experiments.
Figure 6. Selection time for ray-casting and Go-Go interaction techniques in object selection
Figure 7. ‘Net’ time and absolute time (a, b), and number of iterations (c)
Although no significant technique effect was found for the ‘net’ and absolute
times [F(1, 11) = 0.132, p < 0.72 and F(1, 11) = 0.747, p < 0.41], a significant tech-
nique effect was found for the number of iterations [F(3, 33) = 5.47, p < 0.039]. The
distance-by-distance analysis showed that with increased distance ray-casting allows
object repositioning with less movement than Go-Go: for example, it required 1.8 fewer
movements than Go-Go for repositioning at a distance of 6 virtual cubits.
A comparison of the ray-casting, Go-Go, and classical virtual hand techniques for
object repositioning within the user's reach showed that all techniques result in similar
performance when a moderate change of distance is required [absolute time:
F(2, 22) = 2.9, p < 0.08; net time: F(2, 22) = 1.36, p < 0.28]. For repositioning at
a constant distance (0.8 virtual cubits), ‘classical’ virtual hand and ray-casting outper-
formed Go-Go [absolute time: F(2, 22) = 13.759, p < 0.0001; net time: F(2, 22) = 8.8,
p < 0.002]. Under these conditions, classical virtual hand was 22% faster and ray-casting
15% faster than Go-Go. In general, ray-casting performed much better when
the task did not involve a change of distance [absolute time: F(2, 22) = 17.77,
p < 0.0001]; Go-Go, on the other hand, demonstrated the same performance under all
conditions [absolute time: F(2, 22) = 0.95, p < 0.4].
5. Discussion
These experiments demonstrate that there is no one ‘best’ interaction metaphor among
those studied: their strengths and weaknesses can be compared only in relation to the
particular conditions of the spatial manipulation. Here we discuss some of the issues that
arose from the studies.
With virtual hand techniques, the user can naturally see when the virtual hand intersects the object. Therefore, visual
feedback is an inherent part of the metaphor and adding ‘more’ visual feedback does not
necessarily result in performance improvements. A second explanation is that because
object sizes are defined by visual angles, moving objects further from the user increases
their actual geometrical size [23]. This leads to an increase in stimulus volume which,
in turn, counterbalances the effect of the overshoot. This situation, in fact, is very
natural for VEs: in order for the object to be visible at large distances, its geometrical
size should be quite large. An important exception is the selection of flat objects.
Therefore, the depth of the stimuli is another important variable that should be
considered in technique evaluation and interface design.
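To make the geometry behind this observation explicit: an object that subtends a visual angle \(\theta\) at distance \(d\) has linear size

\[
s = 2d \tan(\theta/2),
\]

so holding \(\theta\) fixed while increasing \(d\) grows the object's geometrical size in direct proportion to distance; a 4° target at 6 virtual cubits, for instance, is six times larger than the same target at 1 cubit.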
6. Conclusion
The growing acceptance of VE technology will require significantly more attention to
VE interaction to maximize user performance. This study systematically explores one of
the most important aspects of immersive interfaces—interaction metaphors for immer-
sive object selection and positioning. The paper presented an original taxonomy of
interaction techniques for immersive manipulation, described the methodological frame-
work used in the experiments, reported experimental results, and drew some general
design implications for the development of VE interfaces.
The research reported here, however, is just a small step towards understanding
human factors behind manipulation in VEs. Future studies should further investigate
the design aspects of particular techniques and their influence on user performance. For
example, it is necessary to assess their usability in other conditions of manipulation
tasks, investigate combinations of manipulation and navigation techniques, and explore
possible ways to integrate various techniques into seamless and intuitive interaction
dialogues.
Acknowledgments
This research was partially sponsored by the Air Force Office of Scientific Research
(contract 92-NL-225) and a grant from the HIT Lab Virtual Worlds Consortium. The
authors want to especially thank Jennifer Feyma for her help with experiments. We
would also like to thank Edward Miller, Jerry Prothero, Hunter Hoffman, Doug
Bowman, Prof. Hirakawa and all the subjects who participated in the experiments.
References
1. J. Liang & M. Green (1994) JDCAD: a highly interactive 3D modeling system. Computers and
Graphics 18, 499–506.
2. R. Stoakley, M. Conway & R. Pausch (1995) Virtual reality on a WIM: interactive worlds in
miniature. In: Proceedings of CHI’95, pp. 265–272.
3. I. Poupyrev, M. Billinghurst, S. Weghorst & T. Ichikawa (1996) The Go-Go interaction tech-
nique: non-linear mapping for direct manipulation in VR. In: Proceedings of UIST’96,
pp. 79–80.
4. D. Bowman & L. Hodges (1997) An evaluation of techniques for grabbing and manipulating
remote objects in immersive virtual environments. In: Proceedings of the Symposium on Interactive
3D Graphics, pp. 35–38.
5. J. Pierce, A. Forsberg, M. Conway, S. Hong, R. Zeleznik & M. Mine (1997) Image plane
interaction techniques in 3D immersive environments. In: Proceedings of the Symposium on
Interactive 3D Graphics, pp. 39–43.
6. M. Mine, F. Brooks & C. Sequin (1997) Moving objects in space: exploiting proprioception
in virtual-environment interaction. In: Proceedings of SIGGRAPH’97, pp. 19–26.
7. K. Stanney (1995) Realizing the full potential of virtual reality: human factors issues that
could stand in the way. In: Proceedings of VRAIS’95, pp. 28–34.
8. C. Ware (1990) Using hand position for virtual object placement. The Visual Computer 5, 245–253.
9. J. Butterworth, A. Davidson, S. Hench & T. Olano (1992) 3DM: a three-dimensional
modeler using a head-mounted display. In: Proceedings of the Symposium on Interactive 3D Graphics,
pp. 135–138.
10. E. McCormick (1970) Human Factors Engineering, 3rd edn. McGraw-Hill, New York, 639 pp.
11. I. MacKenzie (1995) Input devices and interaction techniques for advanced computing.
In: Virtual Environments and Advanced Interface Design (W. Barfield & T. Furness III, eds).
Oxford University Press, Oxford, pp. 437–470.
12. F. Taylor (1957) Psychology and the design of machines. The American Psychologist 12, 249–258.
13. J. N. Latta & D. J. Oberg (1994) A conceptual virtual reality model. IEEE Computer Graphics
& Applications 14, 23–29.
14. S. Zhai & P. Milgram (1993) Human performance evaluation of manipulation schemes in
virtual environments. In: Proceedings of VRAIS’93, pp. 155–161.
15. B. Watson, V. Spaulding, N. Walker & W. Ribarsky (1996) Evaluation of the effects of frame
time variation on VR task performance. In: Proceedings of VRAIS’96, pp. 38–52.
16. E. Spain & K. Holzhauzen (1991) Stereoscopic versus orthogonal view displays for
performance of a remote manipulation task. In: Proceedings of Stereoscopic Displays and
Applications II, pp. 103–110.
17. J. Boritz & K. Booth (1997) A study of interactive 3D point location in a computer simulated
virtual environment. In: Proceedings of VRST’97, pp. 181–187.
18. S. Zhai, W. Buxton & P. Milgram (1994) The ‘Silk cursor’: investigating transparency for 3D
target acquisition. In: Proceedings of CHI’94, pp. 459–464.
19. K. Hinckley, R. Pausch, J. Goble & N. Kassell (1994) A survey of design issues in spatial
input. In: Proceedings of UIST’94, pp. 213–222.
20. M. Mine (1995) Virtual environment interaction techniques. Technical Report TR95-018, UNC
Chapel Hill CS Department.
21. K. Hinckley, J. Tullio, R. Pausch, D. Proffitt & N. Kassell (1997) Usability analysis of 3D
rotation techniques. In: Proceedings of UIST’97, pp. 1–10.
22. S. Grissom & G. Perlman (1995) StEP(3D): a standardized evaluation plan for three-
dimensional interaction techniques. International Journal of Human-Computer Studies 43, 15–41.
23. I. Poupyrev, S. Weghorst, M. Billinghurst & T. Ichikawa (1997) A framework and testbed for
studying manipulation techniques for immersive VR. In: Proceedings of VRST’97, pp. 21–28.
24. A. Forsberg, K. Herndon & R. Zeleznik (1996) Aperture based selection for immersive
virtual environments. In: Proceedings of UIST’96, pp. 95–96.
25. M. Wloka & E. Greenfield (1995) The virtual tricorder: a uniform interface for virtual reality.
In: Proceedings of UIST’95, pp. 39–40.
26. M. Cohen (1993) Throwing, pitching and catching sound: audio windowing models and
modes. International Journal of Man-Machine Studies 39, 269–304.
27. T. Erickson (1990) Working with interface metaphors. In: The Art of Human–Computer Interface
Design (B. Laurel, ed.). Addison-Wesley, Reading, MA, pp. 65–73.
28. C. D. Wickens & P. Baker (1995) Cognitive issues in virtual reality. In: Virtual Environments
and Advanced Interface Design (W. Barfield & T. Furness III, eds). Oxford University Press,
New York, pp. 514–542.
29. K. Kennedy (1964) Reach capability of the USAF population: Phase 1. The outer boundaries
of grasping-reach envelopes for the short-sleeved, seated operator. Technical Report TDR
64-56, USAF, AMRL.
30. P. Fitts (1954) The information capacity of the human motor system in controlling the
amplitude of movement. Journal of Experimental Psychology 47, 381–391.
31. J. Foley, V. Wallace & P. Chan (1984) The human factors of computer graphics interaction
techniques. IEEE Computer Graphics & Applications 4, 13–48.
32. C. Shaw (1998) Personal communication, May 28.