
Visual Informatics 5 (2021) 56–66


Natural multimodal interaction in immersive flow visualization


Chengyu Su a,b, Chao Yang b, Yonghui Chen a, Fupan Wang a, Fang Wang b, Yadong Wu c, Xiaorong Zhang a,*

a School of Computer Science and Technology, Southwest University of Science and Technology, China
b Computational Aerodynamics Institute, China Aerodynamics Research and Development Center, China
c Sichuan University of Science & Engineering, China

Article history: Received 12 October 2021; Received in revised form 3 December 2021; Accepted 7 December 2021; Available online 14 December 2021

Keywords: Flow visualization; Virtual reality; Multimodal interaction; Human–computer interaction

Abstract: In immersive flow visualization based on virtual reality, how to meet the needs of complex professional flow visualization analysis through natural human–computer interaction is a pressing problem. To achieve natural and efficient human–computer interaction, we analyze the interaction requirements of flow visualization and study the characteristics of four human–computer interaction channels: hand, head, eye and voice. We give some multimodal interaction design suggestions and then propose three multimodal interaction methods: head & hand, head & hand & eye, and head & hand & eye & voice. The freedom of gestures, the stability of the head, the convenience of the eyes and the rapid retrieval of voice are used to improve the accuracy and efficiency of interaction, and the interaction load is balanced across modalities to reduce fatigue. The evaluation shows that our multimodal interaction has higher accuracy, faster time efficiency and much lower fatigue than traditional joystick interaction.

© 2021 The Authors. Published by Elsevier B.V. on behalf of Zhejiang University and Zhejiang University Press Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Flow visualization is an important part of scientific visualization. With the help of computer graphics theory and technology, the data describing the movement of particles in the flow field are presented directly in the form of graphics. Through the interactive graphics system, valuable information is extracted from a large amount of complex data to help users analyze and understand the complex fluid mechanisms and the physical phenomena of flow (Chen et al., 2013). The combination of virtual reality technology and flow visualization expands the means of human–computer interaction, which can effectively improve the efficiency and quality of analyzing and understanding computational fluid dynamics (CFD) numerical simulation data (Li et al., 2013). Compared with the traditional two-dimensional (2D) display, the 3D graphics generated by virtual reality technology can truly represent the 3D spatial characteristics of flow data, which brings many benefits to visual analysis and interaction. The multimodal human–computer interaction in virtual reality technology makes the interaction between users and the computer environment more natural and harmonious through intuitive stereo graphics, three-dimensional (3D) interaction, stereo hearing and interactive feedback. Users can focus their attention on observing the essential phenomena and characteristics contained in CFD numerical simulation data by avoiding the distraction of the computer interface.

In previous studies, we have shown that the efficiency of 3D interaction in visualization based on virtual reality is better than that of traditional 2D mouse interaction (Xu et al., 2021), but there are still some problems in the natural interaction of immersive flow visualization. The previous research only compared the interaction between gesture and mouse, so the research on interaction channels is relatively insufficient. In another previous study, gesture interaction was found to be more natural and comfortable than joystick interaction (Lei et al., 2019), but its interaction efficiency is not as good as that of the joystick. Therefore, the improvement of interaction efficiency needs to be studied. Besides, the interaction mode in the previous research has the problem of fatigue caused by long-term gesture interaction, so it is necessary to explore a more natural interaction mode.

In immersive flow visualization, the complexity of flow data and the professionalism of flow visualization analysis produce complex and diverse interaction requirements. The mature WIMP interaction inherited from 2D desktop interaction is no longer suitable for immersive 3D environments. The post-WIMP interface, represented by the 3D user interface and multimodal interaction, has become a research hotspot (Zhang et al., 2016). Multimodal interaction makes full use of human sensory channels to improve the naturalness and efficiency of interaction. However, there is little research on multimodal interaction in immersive flow visualization.

* Corresponding author. E-mail address: zhangxiaorong@swust.edu.cn (X. Zhang).

https://doi.org/10.1016/j.visinf.2021.12.005

In this paper, the strategy of multimodal interaction is proposed, which combines different interactions in a serial or parallel way to make use of their complementarity. The multimodal interaction method is used to optimize the interaction of immersive flow visualization, improve the accuracy of interaction and reduce interaction fatigue. The main contributions of this paper are:

• We provide a reference for the design of the immersive visualization interaction paradigm, especially for multimodal interaction in immersive flow visualization.
• We propose three multimodal interaction methods to improve the interaction accuracy and efficiency, balance the workload of the interaction channels, and reduce user fatigue.

2. Related work

As an important research branch of scientific visualization, flow visualization has a long history of development, but the application of virtual reality technology in flow visualization started in the 1990s.

2.1. Immersive flow visualization

Bryson and Levit (1991) accomplished the application of scientific visualization in the virtual wind tunnel in 1991. The goal of the system is to effectively visualize the 2D flow field. It allows users to inject particles into the precalculated flow field and observe their trajectories. The system ensures interactivity by distributed computing (Bryson and Gerald-Yamasaki, 1992). Subsequently, the aerospace research centers in France, Germany, Spain, and Japan developed their own virtual wind tunnels (Li et al., 2013). LaViola (2000) developed a multimodal scientific visualization prototype system called MSVT, which allows users to observe the results of 2D scientific visualization by gesture and voice interaction. They optimized the implementation of multimodal interaction through user experiments and evaluation.

With the continuous development of head-mounted virtual reality devices, more convenient virtual reality helmets, such as the HTC Vive and Oculus Rift, have emerged. Research on immersive flow visualization based on consumer virtual reality devices has gradually developed. Wernert et al. (2012) studied the integration of the visualization tool VTK and the virtual reality environment. They demonstrated two new methods to simplify the integration of the immersive interface and visual rendering, and introduced some functions for rapid update and efficient interaction. Paeres et al. (2021) demonstrated a virtual wind tunnel using virtual reality technology as a scientific visualization tool, which enables users to observe complex turbulence in an immersive environment.

2.2. Multimodal interaction

Multimodal interaction is interaction combining two or more input channels in a system. It makes human–computer interaction more natural and effective through the use of different human sensory channels, which has the following advantages: reducing coupling, reducing errors, increasing flexibility, controlling intellectual resources, and reducing user cognitive load (Zhang et al., 2016). The first research on multimodal interaction for graphic display is Bolt's Put-That-There (Bolt, 1980), which integrates voice input and tracker-based pointing input, so that users can create and edit 2D graphic elements in front of a rear projection screen.

The combination of gesture and voice is the most intuitive way of multimodal interaction, so it is also the most discussed. Lucente et al. (1998) utilized speech recognition and video-based hand tracking input to enable users to operate large objects in front of a 2D display, such as selecting, dragging and zooming. The QuickSet system proposed by Cohen et al. (1997) is a collaborative multimodal system using wireless handheld devices. The system analyzes voice and stylus input in real time. They cooperated with the U.S. Naval Research Laboratory to build the 2D version of QuickSet and integrated it into a virtual battlefield planning platform.

Pfeuffer et al. (2017) proposed an interaction technique combining gesture and gaze, in which objects are selected by gaze and manipulated by gestures. With this technique, users can interact well with both near and far interactive objects. Wang et al. (2019) proposed a remote collaboration system based on augmented reality. Users can complete tasks by gestures and head pointing. Results show that head and hand collaborative interaction improves the interactive experience. Some researchers consider applying more input channels to virtual reality. Koons et al. (1998) implemented a map interaction system using interaction technologies such as 3D pointing gestures, speech and eye-tracking. Problems of concurrent multimodality were also discussed. Oviatt et al. (2000) studied a multimodal 3D virtual vehicle auxiliary maintenance system based on virtual reality. The input channels include an avatar based on body tracking, gesture recognition and voice input. The system implements the fusion of concurrent input channels. Kok and Van Liere (2007) developed a set of interfaces named VR-VTK, which are used to implement 3D display and multimodal interaction in VTK. The interfaces make use of head tracking to control the camera, pedals to grasp, and voice input for program commands and system control. Besides, they also deeply studied problems related to 3D spatial interaction, such as complex interaction methods and depth-enhanced perception.

In general, there are two main shortcomings in the current studies: first, user interaction techniques are not well defined for exploring flow visualizations due to hardware limitations; second, only limited interactions are provided, which cannot satisfy the requirements for complex flow data exploration.

3. Multimodal interaction design

There are three problems that need to be solved in multimodal interaction: task requirements, multimodal interaction support technology and the multimodal interaction fusion method.

The general interaction tasks in a virtual reality environment include navigation/roaming, selection/operation and system control. The interaction tasks in flow visualization are more detailed and complex. Requirement analysis is the premise of establishing the mapping relationship among interaction tasks, input channels and interaction technologies, which is also the basis of virtual reality multimodal interaction design.

The supporting technology of multimodal interaction is single-channel interaction. There are many devices available in the virtual reality environment, so it is necessary to select and manage these devices and their interaction channels properly. The analysis of the characteristics of every single channel also provides a decision-making basis for the multimodal fusion method. Multimodal interaction is optimized by the complementarity of different channels and the adaptability of channels to tasks. Therefore, the fusion of different interaction channels is the core problem of multimodal interaction. Appropriate fusion methods will ensure the availability of multimodal interaction and even improve the efficiency of interaction.

3.1. Requirement analysis

The interaction tasks in human–computer interaction can be divided into navigation/roaming, selection/operation and system control. To analyze flow visualization better, it is necessary
to analyze the requirements of specific interaction tasks in flow visualization. By studying the workflow of current mature flow visualization software such as Ensight and Paraview, and through communication with domain experts, we define five kinds of flow visualization interaction requirements. In order to facilitate the study of different interaction methods, we divide the interaction requirements into two categories: 2D interaction and 3D interaction. As shown in Table 1, 2D interaction involves three interaction requirements: data reading, algorithm management and algorithm parameter configuration; 3D interaction involves two interaction requirements: spatial parameter configuration and 3D geometric transformation.

Table 1. Interaction requirements of flow visualization.
| Category | Requirement | Details |
| 2D | Data reading | Read the numerical simulation calculation data file from hard disk. |
| 2D | Algorithm management | Add, select and delete visualization algorithms reasonably. |
| 2D | Algorithm parameter configuration | Configure or change the types and values of visualization algorithm parameters. |
| 3D | Spatial parameter configuration | Determine the scope or boundary of the visualization algorithm. |
| 3D | 3D geometric transformation | Move, zoom and rotate the visualization graphics. |

In 2D interaction, data reading needs to support a variety of formats of numerical simulation calculation files. The common visualization data file formats are Tecplot, Plot3D, VTK, CGNS, etc. In addition, it also needs the ability to calculate various vectors in the grid data attributes for further analysis. In the analysis of flow visualization, multiple visualization algorithms are involved, which need to be managed correctly to avoid confusion and illegal operations. The commonly used visualization algorithms are streamline, clipping, isosurface, and so on. Each algorithm has corresponding parameter configuration requirements. For example, in the streamline algorithm, the drawing direction, seed area and seed number need to be configured; in the scalar coloring algorithm, we need to configure the scalar type, scalar range, color distribution and color order.

In 3D interaction, widgets are interactive components created to determine some 3D spatial information of the visualization algorithm. For example, a linear widget is used to control the seed point area of the streamline. 3D geometric transformation is used to control the perspective of the 3D scene, including navigation, zooming and rotation, which is convenient for users to observe the visualization results from multiple angles.

3.2. Channel analysis

Considering the common interaction channels and the availability of interaction devices, this paper discusses the combination of five interaction channels: joystick, hand, head, eye and voice. Referring to existing research and papers, we summarize the characteristics of each channel as shown in Table 2. The joystick is used as a traditional interaction mode for comparison, so we focus on the analysis of the other interaction channels.

Table 2. Characteristics of channels.
| Channel | Advantages | Disadvantages |
| Joystick | Accurate and easy, feedback through vibration (Wang et al., 2020). | Fixed shape, destroys immersion (Wang et al., 2020); fatigue (Boring et al., 2009). |
| Hand | Low equipment requirements and cost, high degree of freedom, natural (Yang et al., 2019). | Affected by the environment, limited tracking range, not accurate (Yang et al., 2019). |
| Head | Stable and accurate, higher attention and interest (Sidenmark and Gellersen, 2019). | Rotates frequently in large-FOV scenes (Blattgerste et al., 2018); slower than the eye (Bizzi, 1974); Midas problem (Drewes, 2010). |
| Eye | Faster and lower cost than the head (Blattgerste et al., 2018); reduces fatigue (Drewes, 2010). | Not stable (Blattgerste et al., 2018); eye calibration required; Midas problem (Drewes, 2010). |
| Voice | Suitable for non-graphic commands (Billinghurst et al., 2018); efficient and precise input of text (Harris, 2005). | Commands need to be memorized; recognition delay; start and end need to be determined (Harris, 2005). |

Hand. No matter what the implementation method is, there is a fatigue problem with long-time use. The reason may be that keeping the hands raised needs the support of the whole arm muscles, which consumes more physical strength. Considering the cost and interaction efficiency, vision-based gesture interaction is a better interaction method for the hand channel.

Gaze. Both head gaze and eye gaze have the Midas problem. The reason may be that the visual channel can only provide directional information but cannot quickly reconfirm the interaction. Comparing the two gaze interaction methods, the head is more stable but more tiring than the eye, and the eyeball is faster and more convenient but more unstable. The head and eye cooperate with each other when observing from a large angle of view.

Voice. It has the ability of accurate text input, but the input contents should not be too many, otherwise it will bring great recognition delay and user memory pressure. The beginning and end of speech recognition need the cooperation of other channels.

Based on the information above, we put forward some preliminary ideas for multimodal interaction design:
(1) The introduction of gaze (head or eye) interaction can reduce the burden of hand interaction. The direction provided by visual interaction can reduce the frequency of gesture interaction, which helps to alleviate the problem of hand fatigue.
(2) Introducing other channels for confirmation is a better way to solve the Midas problem of gaze interaction, such as head gaze selection + gesture confirmation. In addition, the combination of eye gaze and head gaze can also solve the problem without the hands, which can further reduce the use of the hands.
(3) Voice interaction is suitable for short and precise interaction or non-graphic commands, especially for retrieval interaction. When there are many similar commands, it can directly hit the target, reducing manual search time. In order to avoid the delay of voice wake-up, the start and end of voice interaction can be controlled by other interaction channels (such as gestures).

3.3. Multimodal interaction method

According to the characteristics of the interaction operations and interaction channels, we propose three multimodal interaction methods.

Dual-channel interaction: head & hand. Gesture interaction is more flexible and natural than joystick interaction, so gesture interaction is used as the main interaction channel. However, gesture interaction is not accurate enough. In previous studies, it was found that the instability of gestures brings some negative effects when making more accurate selection interactions, which reduces the accuracy and efficiency of interaction. In addition,
if all operations are completed by the gesture interaction channel only, it will bring a heavy interaction burden and fatigue to users. The head gaze has a relatively stable direction and a low motion burden, so the head gaze interaction channel is added to provide direction selection information. Users click and confirm with gestures while aiming with the head (see Fig. 1.a).

Three-channel interaction: head & eye & hand. Selection and click interaction is a common interaction in flow visualization. In dual-channel interaction, head gaze is used as the selection, while gesture is used as the click. This modality provides a more accurate and stable choice. However, the hand still needs to be held up frequently during the whole interaction process. Besides, long-term use may lead to heavy interaction fatigue. In order to reduce interaction fatigue, it is necessary to reduce the rate of gesture interaction. Therefore, the eye gaze interaction channel is added to complete the selection and click interaction through the cooperation of head and eye gaze. In this way, the purpose of reducing gesture interaction fatigue can be achieved.

Users aim with the eyeball and use head gaze aiming to confirm (see Fig. 1.b). Only when the UI interface is called out by gestures will eye aiming and head aiming start. The green translucent aperture indicates the general scope of the user's eye gaze. The red cursor indicates the precise position of the user's head gaze. When the red cursor enters the green aperture, the angle between the head gaze direction and the eye gaze direction is less than 8.5 degrees. When the coincidence time exceeds 0.5 s, it is recorded as a click operation, and the cursor position of the head gaze is then transmitted to the UI interface for processing as the click position.

Four-channel interaction: head & eye & hand & voice. The functions involved in flow visualization are quite complex. Whether two-dimensional icons or three-dimensional models are used as metaphors, the number of metaphors will increase with the increase of functions. When the number of metaphors is large, users will spend more time on searching. At the same time, more icons or models in the virtual reality space will also cause serious visual occlusion and tedious interaction. Therefore, based on the three-channel interaction method, we add the voice interaction channel to control and manage the flow visualization algorithms by making use of the characteristic that voice is suitable for fast retrieval. Compared with a wake-up word, gesture interaction can wake up and stop the voice interaction more quickly.

Users make a voice gesture to turn on the voice recognition function (see Fig. 1.c). For the duration of the voice gesture, the microphone is turned on to record the user's voice. When the voice gesture ends, the voice recording ends too. The recorded voice is sent to the recognizer for language recognition. After a short time, the corresponding interaction events are triggered according to the recognized voice. Besides, the system sound can also be used as an output channel to give users interactive feedback. When the graphic feedback is not obvious, the headset sound can tell the user whether the operation is correct or wrong.

3.4. Interaction event

In light of the interactive information from the channels, we can trigger the corresponding interaction event. An interaction event contains a series of related operations to finish a function. In the requirement analysis section, we summarized the interaction requirements of flow field visualization, but that is a user-oriented classification rather than a function-oriented one, so we reclassify all interaction operations into three categories: UI interaction, scene interaction and widget interaction (see Table 3). In the immersive environment, 3D interaction needs can be completed through a 2D interface, which can be called UI interaction. It consists mainly of interface control, selection and click operations. The spatial parameter configuration requirements are completed by the interaction with widgets, which can be called widget interaction. The 3D geometric transformation in the 3D interaction needs is completed by transforming the scene. We call the 3D geometric transformation scene interaction, which involves the translation, scaling and rotation of the scene.

Table 3. Interactive channel assignment.
| Interaction event | Dual-channel | Three-channel | Four-channel | Fusion mode |
| UI select | Head gaze | Head gaze | Head gaze/voice | Parallel |
| UI click | Gesture | Eye gaze | Eye gaze/voice | Parallel |
| UI switch | Gesture | Gesture | Voice/gesture | Parallel |
| Scene interaction | Gesture | Gesture | Gesture | Serial |
| Widget interaction | Gesture | Gesture | Gesture | Serial |

4. Implementation

In order to establish the mapping relationship between the interaction tasks and the interaction channels, we propose a multimodal immersive flow visualization system framework. The overall framework is shown in Fig. 2.

4.1. Hardware

The hardware devices of the interaction channels are as follows. VR headset: HTC Vive Pro Eye. Eye tracker: HTC Vive Pro Eye tracker. Hand tracker: Leap Motion. Voice: microphone in the VR headset. Joystick: HTC VR joystick. The specific layout of each input device is shown in Fig. 3.

4.2. Single interactive channel

4.2.1. Head

In head gaze interaction, the VR headset provides the head orientation. We get the orientation of the headset, which is also the head direction vector, through the OpenVR SDK. The vector is used for collision detection with the visualization results or for intersection calculation with the user interface (UI) to obtain the head gaze object or the UI coordinates.

The user's head direction vector is R(r1, r2, r3) and the starting point of the gaze ray is A(a1, a2, a3). The origin of the UV coordinates of the UI interface is O(o1, o2, o3). The normal vector of the UI interface is W(w1, w2, w3). The two axes of the UI interface are U(u1, u2, u3) and V(v1, v2, v3). The parametric equation of the user gaze vector R is:

x = a1 + r1 * t; y = a2 + r2 * t; z = a3 + r3 * t    (1)

The equation of the UI interface is:

w1 * (x − o1) + w2 * (y − o2) + w3 * (z − o3) = 0    (2)

Fig. 1. (a) head & hand: Aim through the head and confirm by gestures. (b) head & eye: Aim through the head (red spot) and confirm by head eye coincidence (the
green circle is the eye gaze area). (c) hand & voice: Select by voice command while start and end with gestures.
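The head & eye confirmation shown in Fig. 1.b (and specified in Section 3.3: gaze directions within 8.5 degrees for 0.5 s) can be expressed as a small per-frame check. The sketch below is only an illustration under those two stated thresholds; the class and function names are our own assumptions, not the authors' implementation.

```python
import math
import time

COINCIDENCE_ANGLE_DEG = 8.5   # head/eye gaze angle threshold from Section 3.3
DWELL_TIME_S = 0.5            # how long the two directions must coincide

def angle_between(v1, v2):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))

class HeadEyeClickDetector:
    """Registers a 'click' when head gaze and eye gaze coincide long enough."""
    def __init__(self):
        self._coincide_since = None

    def update(self, head_dir, eye_dir, now=None):
        """Call once per frame; returns True on the frame a click is triggered."""
        now = time.monotonic() if now is None else now
        if angle_between(head_dir, eye_dir) < COINCIDENCE_ANGLE_DEG:
            if self._coincide_since is None:
                self._coincide_since = now
            elif now - self._coincide_since >= DWELL_TIME_S:
                self._coincide_since = None   # reset so one dwell yields one click
                return True
        else:
            self._coincide_since = None
        return False
```

On a triggered click, the head-gaze cursor position would then be forwarded to the UI as the click position, as described in Section 3.3.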

Fig. 2. Multimodal immersive flow visualization system framework.

Fig. 3. Multimodal interaction hardware device layout.

According to (1) and (2), the intermediate variable t can be solved as:

t = [(o1 − a1) * w1 + (o2 − a2) * w2 + (o3 − a3) * w3] / (w1 * r1 + w2 * r2 + w3 * r3)    (3)

Substituting (3) into (1), we can get the intersection coordinates as B(x, y, z). By a transformation of the 3D coordinate system, the intersection B is changed from world coordinates to UI coordinates, which is C(u, v, 0). This coordinate C can be used to trigger interface interaction.
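A minimal numerical sketch of Eqs. (1)–(3): intersect the head-gaze ray with the UI plane and express the hit point in UI (u, v) coordinates. Variable names follow the paper's notation; the code and the sample values at the end are illustrative assumptions, not the system's implementation (which works through the OpenVR SDK).

```python
import numpy as np

def head_gaze_ui_hit(A, R, O, W, U, V):
    """Intersect the gaze ray X = A + R*t with the UI plane through O with
    normal W (Eqs. (1)-(3)), then express the hit point in UI (u, v) coords.
    A: gaze origin, R: gaze direction, O: UI origin, W: UI normal,
    U, V: the two in-plane axes of the UI (assumed unit-length and orthogonal).
    Returns (u, v) or None if the ray is parallel to the UI plane."""
    A, R, O, W, U, V = map(np.asarray, (A, R, O, W, U, V))
    denom = np.dot(W, R)                  # sum_i w_i * r_i
    if abs(denom) < 1e-9:                 # gaze ray parallel to the UI plane
        return None
    t = np.dot(W, O - A) / denom          # Eq. (3)
    B = A + R * t                         # Eq. (1): world-space hit point
    d = B - O                             # change of basis to UI coordinates C(u, v, 0)
    return float(np.dot(d, U)), float(np.dot(d, V))

# Hypothetical example: UI panel in the x-y plane, 1 m in front of the user.
print(head_gaze_ui_hit(A=[0, 1.6, 0], R=[0.1, 0.0, 1.0],
                       O=[0, 1.0, 1.0], W=[0, 0, -1],
                       U=[1, 0, 0], V=[0, 1, 0]))   # -> (0.1, 0.6)
```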
4.2.2. Eye

Firstly, the eye gaze vector output by the eye tracker is transformed into world coordinates, then a calibration procedure is carried out to ensure the accuracy of the gaze direction, and then a de-dithering procedure is carried out to ensure gaze stability. Finally, the eye gaze is combined with the head gaze to complete the interaction.

Before using eye gaze interaction, users need to conduct an initial eye calibration to ensure that the measured gaze direction is the same as the actual gaze direction. Because of the dual-channel interaction between the head and the eyeball, we can propose a combination of fast initial calibration and dynamic implicit calibration. Compared with the common 5-point or 9-point calibration, 1-point calibration can be applied to the initial calibration, even if the calibration accuracy of this method is low. With the use of eye gaze interaction, the calibration results can be continuously corrected through dynamic implicit calibration, as shown in Fig. 4. In this way, even if there is a deviation in eye tracking during the interaction, the usability of eye-tracking interaction can be guaranteed through continuous correction, without a special and complex recalibration.
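One possible reading of this dynamic implicit calibration (cf. Fig. 4) as code: whenever the system judges that head gaze and eye gaze are meant to coincide, the offset v_os between the two directions is folded into a running calibration offset that corrects later eye-gaze samples. The blending rule and factor below are assumptions for illustration; the paper does not specify the exact update.

```python
import numpy as np

class ImplicitEyeCalibrator:
    """Dynamic implicit calibration sketch: maintain a running offset that is
    applied to every measured eye-gaze direction and refined on the fly."""

    def __init__(self, blend=0.2):
        self.offset = np.zeros(3)   # current calibration offset
        self.blend = blend          # strength of each new correction (assumed value)

    def correct(self, eye_dir):
        """Apply the current calibration offset to a measured eye direction."""
        v = np.asarray(eye_dir, dtype=float) + self.offset
        return v / np.linalg.norm(v)

    def observe_coincidence(self, head_dir, eye_dir):
        """Called when the user's intention is judged to be 'both gazes on the
        same target' (e.g. during a dwell click): recompute the offset v_os
        between the two direction vectors and blend it into the calibration."""
        v_os = np.asarray(head_dir, dtype=float) - np.asarray(eye_dir, dtype=float)
        self.offset = (1.0 - self.blend) * self.offset + self.blend * v_os
```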

Fig. 4. When the system determines that the user's intention is that the two gaze directions coincide, the offset v_os between the two direction vectors is calculated, and the calibration offset is then recalculated from this offset.

Because of the eyeball itself and the eye-tracking equipment, there may be jitter when the user looks at a fixed target, so it is necessary to reduce it for the sake of the user experience. We use a dynamic exponential smoothing method, which is very suitable for repeated switching between high-speed scanning and low-speed fixation. The dynamic exponential smoothing formula is v_t = a_t * v_0 + (1 − a_t) * v_{t−1}, where v_t is the final fixation coordinate value at time t, v_0 is the newly measured fixation coordinate value at time t, v_{t−1} is the fixation coordinate value at time t−1, and a_t is the smoothing coefficient at time t, which can be defined according to the specific application scenario.
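A minimal sketch of this dynamic exponential smoothing is given below. How a_t is chosen per frame is application-specific; the speed-based rule used here (large a_t while scanning fast, small a_t while fixating) is only an assumption consistent with the description above.

```python
import numpy as np

def smooth_gaze(raw_points, a_min=0.2, a_max=0.9, speed_threshold=0.05):
    """Dynamic exponential smoothing: v_t = a_t * v_0 + (1 - a_t) * v_{t-1},
    where v_0 is the raw sample at time t and v_{t-1} the previous smoothed
    value. a_t is raised during fast scanning (follow the eye quickly) and
    lowered during fixation (suppress jitter). Thresholds are illustrative."""
    smoothed = []
    prev = None
    for raw in np.asarray(raw_points, dtype=float):
        if prev is None:
            prev = raw                                   # first sample: no history yet
        speed = np.linalg.norm(raw - prev)               # proxy for scanning speed
        a_t = a_max if speed > speed_threshold else a_min
        prev = a_t * raw + (1.0 - a_t) * prev            # the smoothing formula
        smoothed.append(prev)
    return np.array(smoothed)
```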
ization algorithm pipeline tree (see Fig. 6), which represents the
4.2.3. Voice relationship between various algorithms through tree nodes to
The speech recognition in this paper is based on the offline facilitate the management of visualization algorithms. The system
command recognition technology developed by iFLYTEK. The directly displays the visualization algorithm network structure
grammar of speech recognition defines the set of command words on the 3D interactive panel because the immersive 3D display
supported by speech recognition. We use BNF to describe the environment has no space constraints.
grammar of speech recognition. After the construction of the The original numerical simulation data will be displayed on
grammar dictionary, it will be compiled into recognition net- the top layer as the root node of a tree. If other data needs to
work and sent to speech recognizer. The speech information is be read in, it will be displayed on the same layer as the root
input through a microphone. The speech recognizer extracts the node of another tree. Each tree does not interfere with each
characteristic information of the input speech and performs path other logically, which is similar to the forest, but the visualization
matching on the recognition network. Finally, the content of the results controlled by the pipeline tree are displayed in the same
user’s speech is recognized, and then the corresponding interac- VR environment, so as to achieve the purpose of superposition
tion function is triggered by the content of speech recognition to and composite display. When the user analyzes the flow field,
complete the speech interaction function. the system needs to call the visualization algorithm as a filter to
We build a visualization algorithm dictionary in the form of emphasize the characteristics (such as specific scalar or vector)
‘‘operation + object’’. The operation includes five kinds: ‘‘add’’, that the user wants to analyze. At this time, a new node will be
‘‘delete’’, ‘‘show’’, ‘‘hide’’ and ‘‘edit’’. The object includes six kinds added, which is called node addition. The system takes the data
of visualization algorithms: ‘‘streamline’’, ‘‘cut’’, ‘‘isosurface’’, stored by the parent node as the input, and the new data gener-
‘‘shock wave’’, ‘‘eddy’’, ‘‘coloring’’, and ‘‘data’’ for visualization ated after the algorithm filter will be displayed in the lower layer
data management. The voice command can manage the nodes of as the child nodes of the parent node. The pipeline between the
the visualization pipeline tree without explicit display of the tree, two nodes represents a reasonable relationship between them.
which saves the interactive operation steps and the display space. However, the unreasonable node addition will fail. The new data
In addition, we also add a sound output system to give feed- includes not only the visual data set, but also the interactive scene
back through headset playing system sound when users interact, data, such as the location of the current interactive widgets or the
especially when users make illegal operations or the change of configuration of filter parameters. When the user wants to change
operation is not obvious. the visual result of a node, that is, node edition, the system will
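The "operation + object" dictionary can be made concrete with a small sketch. The dispatch function and the pipeline_tree.apply call below are hypothetical stand-ins (the real system compiles a BNF grammar for the iFLYTEK recognizer); the sketch only illustrates how a recognized command maps to a pipeline-tree operation.

```python
# Illustrative reconstruction of the "operation + object" command dictionary
# from Section 4.2.3; the actual BNF grammar is not reproduced here.
OPERATIONS = ["add", "delete", "show", "hide", "edit"]
OBJECTS = ["streamline", "cut", "isosurface", "shock wave", "eddy", "coloring", "data"]

COMMANDS = {f"{op} {obj}": (op, obj) for op in OPERATIONS for obj in OBJECTS}

def dispatch(recognized_text, pipeline_tree):
    """Map a recognized utterance to an interaction event on the pipeline tree.
    'pipeline_tree' and its apply() method are hypothetical stand-ins for the
    system's node management."""
    command = COMMANDS.get(recognized_text.strip().lower())
    if command is None:
        return False                        # unknown utterance: give audio feedback instead
    operation, target = command
    pipeline_tree.apply(operation, target)  # e.g. ("add", "streamline") adds a child node
    return True
```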
read the data stored in the node and the user can edit and modify
4.2.4. Hand the node by modifying the filter parameters. When the user wants
In this paper, Leap Motion controller is used to collect user to delete a node, which is called node deletion, the system will
gesture information in real-time. We use the position information delete all its child nodes in the order of depth-first search and
of 25 hand joints obtained by hand tracker to build a hand then delete the node itself.
model. The virtual hand model can be obtained by connecting We refer to VTK’s visualization algorithm library and manage
each joint point in a reasonable topological order. According to the configuration through the visualization algorithm pipeline
the requirements of gesture interaction, we define five kinds of tree, which can support many kinds of composite flow field vi-
gestures: call menu, call widget, click/pinch, grab and voice (see sualization algorithms. Some typical flow visualization scenarios
Fig. 5). Each gesture has its own feature algorithm to recognize. and results are shown in the Fig. 7.
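As an illustration of the per-gesture feature rules just described (and in the caption of Fig. 5), a device-agnostic sketch is given below. The specific features, thresholds and the mapping from features to the five gestures are our assumptions for illustration, not the values used with the Leap Motion tracker.

```python
from dataclasses import dataclass

@dataclass
class HandFeatures:
    """Per-frame hand features assumed to be derived from the tracked joints."""
    extended_fingers: int      # number of fingers judged as extended
    grab_strength: float       # 0.0 = open hand, 1.0 = fist
    palm_up: bool              # palm normal pointing roughly upward

def classify_gesture(f: HandFeatures) -> str:
    """Map simple hand features to the five gestures of Section 4.2.4.
    The rules and thresholds below are illustrative assumptions only."""
    if f.extended_fingers == 5 and f.palm_up:
        return "call_menu"
    if f.extended_fingers == 5:
        return "call_widget"
    if f.grab_strength > 0.8:
        return "grab"
    if f.extended_fingers == 2:
        return "voice"          # the gesture that opens/closes the microphone
    if f.extended_fingers == 1:
        return "click_pinch"
    return "none"

# A three-frame window (Section 4.2.4) can then turn the per-frame label into
# start / moving / end states, e.g. a gesture "starts" once the same label has
# been seen in the three most recent frames.
```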
4.2.5. Joystick

The interactive information of the joystick is provided by the joystick controller, including the position, direction and button information of the joystick. We define the function of each key according to the requirements of the operations. The information of all joystick controllers can be obtained through the OpenVR SDK.

4.3. Visualization algorithm pipeline tree

As described in Section 3.1, visualization algorithm management is an important requirement in the application of flow field visualization. When exploring data, users will use a variety of visualization algorithms to analyze the data, which takes the original data as the root node and produces a complex tree-shaped visualization topology network. In most systems based on a 2D interactive environment, a simple tree list or a hidden pipeline is used to represent the topological relationship between algorithms to save screen space, and a menu is used to control the algorithm nodes. When the algorithm network topology is complex, it is difficult to intuitively display the network topology to users, resulting in users spending a lot of time looking for the target visualization algorithm node.

In order to solve the visualization algorithm management problem in a VR environment, we propose the concept of the visualization algorithm pipeline tree (see Fig. 6), which represents the relationship between the various algorithms through tree nodes to facilitate the management of visualization algorithms. The system directly displays the visualization algorithm network structure on the 3D interactive panel, because the immersive 3D display environment has no space constraints.

The original numerical simulation data is displayed on the top layer as the root node of a tree. If other data needs to be read in, it is displayed on the same layer as the root node of another tree. Each tree does not interfere with the others logically, which is similar to a forest, but the visualization results controlled by the pipeline tree are displayed in the same VR environment, so as to achieve the purpose of superposition and composite display. When the user analyzes the flow field, the system needs to call a visualization algorithm as a filter to emphasize the characteristics (such as a specific scalar or vector) that the user wants to analyze. At this time, a new node is added, which is called node addition. The system takes the data stored by the parent node as the input, and the new data generated after the algorithm filter is displayed in the lower layer as a child node of the parent node. The pipeline between the two nodes represents a reasonable relationship between them. However, an unreasonable node addition will fail. The new data includes not only the visual data set, but also the interactive scene data, such as the location of the current interactive widgets or the configuration of the filter parameters. When the user wants to change the visual result of a node, that is, node edition, the system reads the data stored in the node and the user can edit and modify the node by modifying the filter parameters. When the user wants to delete a node, which is called node deletion, the system deletes all its child nodes in the order of depth-first search and then deletes the node itself.

We refer to VTK's visualization algorithm library and manage the configuration through the visualization algorithm pipeline tree, which can support many kinds of composite flow field visualization algorithms. Some typical flow visualization scenarios and results are shown in Fig. 7.
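To make the node addition, edition and deletion of Section 4.3 concrete, here is a minimal sketch of a pipeline-tree node. Class and method names are illustrative assumptions rather than the system's actual API, and re-executing the filters and updating the rendering are not modeled.

```python
class PipelineNode:
    """One node of the visualization algorithm pipeline tree (Section 4.3).
    A node stores the filter that produced it and its parameters; children
    take the node's output data as their input."""
    def __init__(self, name, filter_params=None, parent=None):
        self.name = name
        self.filter_params = dict(filter_params or {})
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def add_child(self, name, filter_params=None):
        """Node addition: attach a new algorithm node below this node."""
        return PipelineNode(name, filter_params, parent=self)

    def edit(self, **new_params):
        """Node edition: change filter parameters; the subtree would then be
        re-executed by the rendering side (not modeled here)."""
        self.filter_params.update(new_params)

    def delete(self):
        """Node deletion: remove all children depth-first, then this node."""
        for child in list(self.children):
            child.delete()
        if self.parent is not None:
            self.parent.children.remove(self)

# Hypothetical usage mirroring voice commands such as "add streamline":
root = PipelineNode("yf-17 data")
stream = root.add_child("streamline", {"seeds": 20, "direction": "both"})
stream.edit(seeds=50)
stream.delete()
```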
5. Performance evaluation

We conducted two evaluation experiments to evaluate the usability and efficiency of multimodal interaction.

Fig. 5. According to the characteristics of the fingers and joints, such as the number of extended fingers, the degree of grasp and the normal direction of the palm, a recognition algorithm is developed for each gesture.

Fig. 6. The visualization algorithms are managed by a tree structure. All visualization results are mapped into the same VR environment to realize composite visual analysis, and each node can independently control the corresponding visualization results.

5.1. Experiment design

In Experiment 1, the task is digital input. The reason why we choose digital input as the interactive task is that a large number of parameter configurations are involved in flow visualization. Even though we have done a lot of three-dimensional natural interaction design, there are still many mathematical parameters that need to be configured through the interface, at least for the time being. As a high-density interactive interface, digital input can test the accuracy of interaction under extreme conditions. Users input the same numbers through the three kinds of multimodal interaction. The single-channel joystick interaction is a controlled experiment for analysis.

During the experiment, we will collect the error rate of user input to analyze the accuracy of the various multi-channel interaction methods. After the experiment, we will ask users about their subjective feelings regarding learning difficulty and interaction fatigue.

Experiment 2 simulates the process of flow visualization. The goal of the experiment is streamline analysis, and it is completed by the three kinds of multimodal interaction. The single-channel joystick interaction is also a controlled experiment. During the experiment, we will record the subtask completion time, total task completion time and interaction channel occupation time of each multimodal interaction method. By observing the completion efficiency of the subtasks and the total task, we analyze the characteristics of each interaction method in different types of interaction tasks, so as to find potential optimization directions and shortcomings. We analyze the effect of the multimodal interaction methods on task load balancing and on reducing interaction fatigue by counting the interaction channel occupation time. After the experiment, we will collect users' subjective feelings about multimodal interaction. We hope to expand the cognition of multimodal interaction through the user survey, so the content of the questionnaire includes, but is not limited to, the advantages and disadvantages of each interaction method.

We choose public data as the analysis data. The data is numerical simulation data of the US Air Force fighter YF-17. The interaction operations of the experiments include data reading, visualization algorithm configuration, algorithm parameter configuration, spatial parameter configuration and 3D geometric transformation, including all the interaction requirements of flow visualization.

Fig. 7. The upper figure is a combined visualization result of streamlines generated by a rectangular seed area and multiple slices, which are colored with different scalars respectively. The lower figure is a combined result of a velocity streamline and multiple velocity isosurfaces, in which the isosurfaces are displayed with boundaries and translucency for observation.

Fig. 8. a: Digital input. b: Data reading. c: Streamline adding. d: Streamline generating. e: Streamline coloring. f: Streamline recycling.

Fifteen participants were recruited in this experiment, including 10 males and 5 females, with ages between 23 and 36 years (mean: 28.13, SD: 3.54). Thirteen participants have a computer science background and have been engaged in research on flow visualization, and the other 2 participants do not have a relevant application background.

5.2. Procedure

Experiment 1: Digital input task. Participants first call out the virtual number keyboard in the virtual space. Then they input ten numbers of a given sequence on the keyboard (see Fig. 8.a). The numbers gradually increase from single digits to four digits, with a total of 27 digits. The frequency of each digit is between 2 and 3 times to ensure that the frequency of each digit is basically the same. Each participant completes the task with the four interaction modes. In order to avoid mutual interference, there is a time interval between each interaction mode.

Experiment 2: Streamline visualization task. It can be divided into five subtasks: data reading, streamline adding, streamline generating, streamline coloring and streamline recycling (see Fig. 8.b–f).

The data reading task is to read the numerical calculation data for visual rendering. The streamline adding task first configures the velocity vector data from the file for streamline drawing, then selects the streamline algorithm in the visualization pipeline tree and adds it as a child node of the data node. The streamline generating task first configures the streamline algorithm parameters, setting the generating direction of the streamline as bidirectional, the shape of the streamline generating area as a line segment and the number of streamline seed points as 20. Then, the spatial parameters of the algorithm are configured with the linear widget. The streamline generating area is placed at the leading edge of the wing of the aircraft model. An appropriate viewing angle is obtained by changing the scene through navigation and zooming. Streamline coloring is based on the velocity of the streamline. The range of coloring is from the minimum to the maximum of the velocity, and the coloring method is continuous coloring. In the streamline recycling task, the streamline node in the visualization pipeline tree is selected, and then the delete button in the function menu is selected to delete the generated streamlines and return to the original visualization state.

5.3. Results

The results of Experiment 1 include the input error rate and participant feedback. The results of Experiment 2 include the completion time of each subtask, the total task completion time, the occupation time of each interaction channel and participant feedback.

Fig. 9. The input error rates of the four interaction modes in the digital input task.

Experiment 1: Digital input task. Fig. 9 shows the input error rates of the four interaction modes in the digital input task. It should be noted that the experimental interaction method of four-channel interaction is the same as that of three-channel interaction, because voice interaction does not have good recognition for digital input. Both numbers with too similar or too short pronunciations and the user's accent greatly reduce the accuracy of voice recognition, resulting in unstable experimental results. Therefore, voice interaction is abandoned in this test. It can be seen from the figure that the error rate of single-channel joystick interaction is the highest, the three-channel and four-channel interactions are in the middle, and the dual-channel interaction is the lowest.

After the completion of the task, we collected some views of the users on the experiment. In terms of learning difficulty, 15 of the 15 people thought that the learning difficulty of the joystick was the lowest, 10 people thought that the three-channel interaction was the hardest to learn, and the other 5 people thought that the two-channel interaction was the most difficult to learn. In terms of fatigue, 8 out of 15 people thought that the fatigue of three-channel interaction was the lowest, and the other 7 people thought that the fatigue of the joystick was the lowest. The feedback on the four-channel interaction mode is relatively complex. When the recognition is successful, all users have a high evaluation of it: they all thought that it is simple and not tiring; but when the recognition fails and they need to correct pronunciation and accent repeatedly, they all thought that this mode is not good enough.

Experiment 2: Streamline visualization task. Fig. 10 shows the average completion time of the five subtasks in the flow visualization task under the four interaction modes. The completion time of each subtask in the single-channel interaction is marked as a comparison. There are several results that need to be explained. The completion time of three-channel and four-channel interaction is significantly longer in subtask 2, streamline adding. The completion time of single-channel interaction in subtask 3, streamline generating, is significantly longer than that of all other modes. In subtask 5, streamline recycling, the completion time of four-channel interaction is significantly shorter.

Fig. 10. The average completion time of subtasks in the flow visualization task by multimodal interaction.

Fig. 11 shows the total completion time and the channel occupancy time of the four interaction modes in the flow visualization task. The first column shows the total completion time for each interaction. In terms of total completion time, four-channel interaction takes the shortest time, dual-channel interaction ranks second, single-channel interaction ranks third, and three-channel interaction has the longest completion time. The following four columns show the occupancy time of the four sub-interaction channels (hand, head, eye and voice) in each multimodal interaction.

Fig. 11. Total completion and channel occupancy time of multimodal interaction in the streamline visualization task.

After the completion of the task, we collected the participants' views about the experiment. Among the 15 participants, 14 of them were not satisfied with the performance of the joystick interaction in the streamline generating subtask (subtask 3). They thought that the joystick obscured the interactive object, which had a negative impact on picking the control and judging the position of the control. However, all 15 participants were satisfied with the performance of the joystick in the other tasks. 13 of the 15 people were satisfied with gesture interaction. They thought that gestures were more flexible than the joystick. 2 people were not satisfied with the accuracy of gesture interaction; they thought that gesture picking and control was not good enough. Furthermore, 8 people were not satisfied with the call gesture in subtask 3, streamline generating. They thought that the widget could not be called conveniently and needed 2 or even 3 attempts to succeed, while the joystick could succeed immediately. Among the 15 people, 13 were satisfied with the voice interaction, and 2 were not satisfied with the recognition accuracy of voice interaction; they thought that speech recognition requires two or even three attempts to be correct. 13 of the 15 people were satisfied with the eye interaction, but 2 people were not satisfied with the accuracy of eye tracking. They found that sometimes the eye gaze and the head gaze should coincide, in other words should be in the same direction, but in reality this was not the case. 14 of the 15 were satisfied with the head interaction, and one was not satisfied with the weight of the VR headset. For the choice of the optimal interaction, 9 of the 15 people chose four-channel interaction as the optimal choice, one chose three-channel interaction, three chose dual-channel interaction, and one chose joystick interaction.

6. Discussion

We summarize the experimental results and then analyze some interesting phenomena found in the results. We hope to find valuable information to provide suggestions for better multimodal interaction methods.

Joystick interaction. As can be seen from Fig. 9, the error rate of joystick interaction is the highest, which actually slightly exceeded our expectations. Therefore, we analyzed the causes of this phenomenon and obtained the following two possible reasons: (1) The trigger of the joystick has a key travel range, so there is still a distance from starting to press the trigger to completely pressing the trigger. When the key is worn, the key travel is longer and feels "soft" when it is pressed, so it is not easy to detect the feedback when it is completely pressed, which leads to a "false press". This is a hardware problem that cannot be optimized. (2) When the participants press the trigger button, they need to pull the index finger back slightly. The muscle force makes the whole joystick swing slightly with the movement of the hand. When the trigger button is pressed, the direction of the joystick changes slightly from the user's initial aiming direction. If the user's initial aiming position is at the edge of the target button, it is possible to click other buttons by mistake when the final click is made. This problem can be alleviated by optimizing the interface layout and increasing the distance between the buttons. According to the questionnaire survey in Experiment 1, the learning difficulty of joystick interaction is the lowest, because the fixed shape and clear keys are very friendly to new users. From the data of Experiment 2, the joystick interaction takes the longest time to complete subtask 3, which contains a large number of three-dimensional interactive operations, so we know that it has the lowest efficiency in three-dimensional spatial interaction. The reason may be that the fixed shape gives the interaction two disadvantages: visual occlusion and inflexibility. Generally speaking, as an interactive method with good efficiency and low cost, joystick interaction is still an excellent interactive method at the current stage, but its disadvantages caused by mechanical structure wear and fixed shape are difficult to optimize, so its application in natural interaction is limited.

Dual-channel interaction: head & hand. It can be seen from Fig. 9 that the error rate of dual-channel interaction is the lowest, which is due to the cooperation and complementarity of the two channels. The head aiming is stable, which ensures the accuracy of aiming, and the naturalness of gesture interaction gives users real and
accurate feedback. Besides, the questionnaire of Experiment 1 shows that the learning difficulty of dual-channel interaction is the second highest. Participants think that the learning difficulty lies in maintaining a relatively standard gesture. Because of the small recognition range of the gesture controller and the occlusion problem of recognition, the range and accuracy of gesture recognition are not good enough. Nevertheless, in Experiment 2, the total completion time of the dual-channel interaction method is still the second fastest. It can be seen from Fig. 10 that its efficiency in the other tasks is almost the same as that of joystick interaction, but it has flexible gestures, which makes its efficiency in the streamline generating subtask better than that of the joystick. In the questionnaire survey, the overall evaluation of the dual-channel interaction method is relatively average. The participants do not have too many negative comments, except for dissatisfaction with the tracking quality of gestures and the design of some gestures. In general, dual-channel interaction is a good interaction method. Its performance is comparable to traditional joystick interaction. It has great advantages in three-dimensional interaction, but it also needs to be optimized in terms of interaction fatigue.

Three-channel interaction: head & hand & eye. We can see from Experiment 1 that the error rate of the three-channel interaction is in the middle, which mainly comes from the Midas problem. Although restrictions are added, there is still a small probability that the user will trigger the click operation unconsciously. By analyzing the questionnaire data of Experiment 1, we find that the proportion of participants who think that the three-channel interaction is the most difficult to learn is the highest. The reason may be that head-eye collaborative interaction is rare and the eye tracking is not accurate and stable enough. Nevertheless, a significant benefit of this method is that it reduces fatigue. The proportion of users who think this method is the easiest is the highest. It should be emphasized that those users who do not think the method is easy may mind the fatigue caused by the weight of the VR headset rather than the method itself. In order to avoid false touches and improve accuracy, the imposed restrictions reduce the interaction efficiency, so the total task completion time of three-channel interaction is the longest, especially in subtask 2 with many interface interactions. However, there is still an interesting phenomenon. Although the interface interaction efficiency of this method is low, the completion time in subtasks 4 and 5 is actually less than that of dual-channel interaction. We find that these two tasks require fewer parameters, so the interaction interface is relatively small, and there are few, scattered buttons, which is very conducive to head-eye collaborative interaction. It can be seen from Fig. 11 that after the introduction of the eye interaction channel, the occupation time of the hand channel finally decreased significantly, but the cost is that the interaction efficiency decreases slightly. Through the experiment and analysis, we can find that three-channel interaction has a good performance in interaction load balancing, which alleviates interaction fatigue. However, the interaction efficiency decreased a little and the learning cost increased slightly. Therefore, three-channel interaction needs to be improved in efficiency. It is still an efficient interaction method in appropriate application scenarios.

Four-channel interaction: head & hand & eye & voice. On the basis of three-channel interaction, voice interaction is added to form four-channel interaction. In Experiment 1, we tried to input numbers by voice interaction, but found it difficult to complete the whole experiment. The pronunciation of similar numbers and the accent of the participants have a great impact on speech recognition. The participants in Experiment 1 also expressed the same view, that is, the experience is good when recognition is successful, but poor when it fails. It can be seen from Fig. 10 that a large amount of interaction time is saved by voice interaction in subtask 5, which proves that we have achieved the design purpose of saving interface interaction and function search time through voice interaction. As can be seen from Fig. 11, the total interaction completion time of four-channel interaction is the least, and the occupation time of each interaction channel is more balanced. Therefore, the application of voice interaction optimizes multimodal interaction. Overall, four-channel interaction is the best interaction method at present. It has certain advantages in interaction efficiency and interaction fatigue. Although the advantages are not obvious, there is huge space for optimization. The more interaction channels there are, the higher the requirements for every single interaction channel. It should be noted that pronunciation and accent should be fully considered in the design of voice instructions.

7. Conclusion

In this paper, we study the application of multimodal interaction in flow visualization. We analyze the interaction requirements in flow visualization. Then the advantages and disadvantages of the gesture, head, eye and voice channels are summarized from the literature. Based on the principle of multimodal complementarity, we propose three multimodal interactions: head & hand, head & eye & hand, and head & eye & hand & voice. The parallel cooperative interaction methods of head & hand, head & eye, and hand & voice are described in detail. We also designed an immersive flow visualization system with multimodal interaction. The evaluation shows that natural multimodal interaction can improve the user's interaction experience by improving the interaction accuracy, accelerating the interaction, and dispersing and reducing the interaction fatigue. Our future research includes: (1) Optimization of each sub-channel, improving the range and accuracy of gesture and eye tracking. (2) Application expansion of each sub-channel; we plan to expand the tools and widgets based on gesture interaction and study more eye-interaction application methods.

Ethical Approval

All procedures followed were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. All participants provided written informed consent prior to enrolment in the user study.

CRediT authorship contribution statement

Chengyu Su: Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft. Chao Yang: Conceptualization, Writing – review & editing. Yonghui Chen: Conceptualization, Writing – review & editing. Fupan Wang: Resources, Project administration. Fang Wang: Resources, Supervision, Funding acquisition. Yadong Wu: Conceptualization, Funding acquisition. Xiaorong Zhang: Resources, Supervision, Project administration.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (No. 61872304, No. 61802320), the State Key Laboratory of Aerodynamics (SKLA20200203) and the National Numerical Windtunnel Project (NNW2019ZT6-A17).

Appendix A. Supplementary data

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.visinf.2021.12.005.

References

Billinghurst, M., Cordeil, M., Bezerianos, A., Margolis, T., 2018. Collaborative immersive analytics. In: Immersive Analytics. pp. 221–257.
Bizzi, E., 1974. The coordination of eye-head movements. Sci. Am. 231 (4), 100–109.
Blattgerste, J., Renner, P., Pfeiffer, T., 2018. Advantages of eye-gaze over head-gaze-based selection in virtual and augmented reality under varying field of views. In: Proceedings of the Workshop on Communication by Gaze Interaction. pp. 1–9.
Bolt, R.A., 1980. Put-that-there: Voice and gesture at the graphics interface. ACM SIGGRAPH Comput. Graph. 262–270.
Boring, S., Jurmu, M., Butz, A., 2009. Scroll, tilt or move it: Using mobile phones to continuously control pointers on large public displays. In: Proceedings of the 21st Annual Conference of the Australian Computer-Human Interaction Special Interest Group: Design: Open 24/7. pp. 161–168.
Bryson, S., Gerald-Yamasaki, M., 1992. The distributed virtual windtunnel. In: Proceedings of the 1992 ACM/IEEE Conference on Supercomputing. pp. 275–284.
Bryson, S., Levit, C., 1991. The virtual windtunnel: An environment for the exploration of three-dimensional unsteady flows. In: Proceedings of the 2nd Conference on Visualization.
Chen, W., Shen, Z., Tao, Y., 2013. Data Visualization. Electronic Industry Press, pp. 29–33.
Cohen, P.R., Johnston, M., McGee, D., Oviatt, S., Pittman, J., Smith, I., Chen, L., Clow, J., 1997. QuickSet: Multimodal interaction for distributed applications. In: Proceedings of the Fifth ACM International Conference on Multimedia. Association for Computing Machinery, pp. 31–40.
Drewes, H., 2010. Eye Gaze Tracking for Human Computer Interaction (Ph.D. thesis). LMU.
Harris, R.A., 2005. Chapter 1 - Introduction. In: Harris, R.A. (Ed.), Voice Interaction Design. Morgan Kaufmann, San Francisco, pp. 3–31.
Kok, A.J., Van Liere, R., 2007. A multimodal virtual reality interface for 3D interaction with VTK. Knowl. Inf. Syst. 13, 197–219.
Koons, D.B., Sparrell, C.J., Thorisson, K.R., 1998. Integrating simultaneous input from speech, gaze, and hand gestures. In: Readings in Intelligent User Interfaces. pp. 53–64.
LaViola, J., 2000. MSVT: A virtual reality-based multimodal scientific visualization tool. In: Proceedings of the Third IASTED International Conference on Computer Graphics and Imaging. pp. 1–7.
Lei, J., Wang, S., Zhu, D., Wu, Y., 2019. Non-contact gesture interaction method for immersive medical visualization based on a cursor model. J. Comput.-Aided Des. Comput. Graph. 31 (2), 208–217.
Li, S., Cai, X., Wang, W., 2013. Large-Scale Flow Field Scientific Visualization. National Defense Industry Press, pp. 16–18.
Lucente, M., Zwart, G., George, A., 1998. Visualization space: A testbed for deviceless multimodal user interface. In: Proc. AAAI Intelligent Environments Symposium. pp. 87–92.
Oviatt, S., Cohen, P., Wu, L., Duncan, L., Suhm, B., Bers, J., Holzman, T., Winograd, T., Landay, J., Larson, J., et al., 2000. Designing the user interface for multimodal speech and pen-based gesture applications: State-of-the-art systems and future research directions. Hum.-Comput. Interact. 15, 263–322.
Paeres, D., Santiago, J., Lagares, C.J., Rivera, W., Craig, A.B., Araya, G., 2021. Design of a virtual wind tunnel for CFD visualization. In: AIAA Scitech 2021 Forum.
Pfeuffer, K., Mayer, B., Mardanbegi, D., Gellersen, H., 2017. Gaze + pinch interaction in virtual reality. In: Proceedings of the 5th Symposium on Spatial User Interaction. Association for Computing Machinery, pp. 99–108.
Sidenmark, L., Gellersen, H., 2019. Eye&Head: Synergetic eye and head movement for gaze pointing and selection. In: Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology. pp. 1161–1174.
Wang, P., Zhang, S., Bai, X., Billinghurst, M., Zhang, L., Wang, S., Han, D., Lv, H., Yan, Y., 2019. A gesture- and head-based multimodal interaction platform for MR remote collaboration. Int. J. Adv. Manuf. Technol. 105 (7), 3031–3043.
Wang, S., Zhu, D., Yu, H., Wu, Y., 2020. Immersive WYSIWYG (what you see is what you get) volume visualization. In: 2020 IEEE Pacific Visualization Symposium (PacificVis). pp. 166–170.
Wernert, E.A., Sherman, W.R., O'Leary, P., Whiting, E., 2012. A common path forward for the immersive visualization community. In: IEEE VR 2012.
Xu, S., Zhao, D., Su, C., 2021. Immersive virtual reality interactive system for flow field visualization. J. Syst. Simul. 1–13.
Yang, L., Huang, J., Feng, T., Hong-An, W., Guo-Zhong, D., 2019. Gesture interaction in virtual reality. Virtual Real. Intell. Hardw. 1, 84–112.
Zhang, F., Dai, G., Peng, X., 2016. A survey on human–computer interaction in virtual reality. Sci. Sinica Informationis 46 (12), 1711–1736.
