
Virtual Reality

https://doi.org/10.1007/s10055-020-00468-0

ORIGINAL ARTICLE

RUN: rational ubiquitous navigation, a model for automated navigation and searching in virtual environments

Muhammad Raees¹ · Sehat Ullah¹

¹ Department of Computer Science and IT, University of Malakand, Chakdara, Pakistan
Correspondence: Muhammad Raees, visitrais@yahoo.com; Sehat Ullah, sehatullah@hotmail.com

Received: 10 February 2018 / Accepted: 8 September 2020


© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract
By now, the realm of virtual reality is abuzz with visuals of a quality high enough to simulate a real-world scene. The use of intelligence in virtual reality systems, however, is a milestone yet to be achieved for seamless realism in a virtual environment. This paper presents a model, rational ubiquitous navigation (RUN), to improve the believability of a virtual environment. The model intends to augment the maturity of a virtual agent by inculcating in it a human-like learning capability. A novel approach for automated navigation and searching is proposed by incorporating machine learning in virtual reality. An intelligent virtual agent learns objects of interest along with the paths followed for navigation. A mental map is molded dynamically as a user navigates in the environment. The map is then followed by the agent during self-directed navigation to access any known object. After reaching the location where an object of interest resides, the required object is selected on the basis of its front-facet feature. The model is implemented in a case-study project, learn objects on path (LOOP). Twelve users evaluated the model in the immersive maze-like environment of LOOP. Results of the evaluation confirm the applicability of the model in various cross-modality applications.

Keywords  Machine learning in VR · Automated navigation · Object-based searching · Intelligent virtual reality systems

1 Introduction

Virtual reality (VR) is a synthetic, immersive and interactive 3D environment. The effectiveness of a virtual environment (VE) depends on the degree of immersion it provides (Sheridan 2000; Matsas et al. 2017) and the level of naturalism it retains for interaction (Hale and Stanney 2014). With widespread technological advancement, VR systems have moved beyond the simple audio-visual effects that were once considered sufficient to simulate the real world (Dunagan and Jake 2004). The use of artificial intelligence (AI) is becoming indispensable in the domain of VR to make a virtual world indistinguishable from the real one (Lugrin et al. 2005). With machine learning (ML) classifiers, intelligence-based interaction can be performed quite intuitively, which raises the realism of a VR system. Besides a humanoid body and virtual senses, an intelligent virtual agent (IVA) needs the ability to learn and respond dynamically (Aylett and Cavazza 2001). The intelligence of an IVA depends on its learning and reasoning abilities while interacting with other IVAs and responding to a VE. By embedding AI algorithms in VR, an intelligent virtual reality system (IVRS) is designed in which an IVA mimics the learning and reasoning capabilities of a human. In an IVRS, intelligence-based navigation is often required so that an IVA follows an optimal path to reach a destination point (usually a 3D object).

This paper proposes RUN, a novel model that enables an IVA to perform automated navigation and searching for a desired place/object inside a VE. In the learning phase, an IVA learns the name, location and facet of the objects traced while exploring a VE. A mental map (M) of the VE, representing the paths that lead to different virtual objects/places, is constructed dynamically. In the application phase, an IVA performs self-directed navigation by using its virtual brain to search out a desired known object. The search process is performed in two steps. In the first step, the IVA follows the map M of the environment to perform automated navigation to the location where the object resides. As a single virtual frame may contain more than one virtual object, selection of the required object is performed in the second step, where the K-nearest neighbors (KNN) classifier selects objects on the basis of the front-facet feature. In real-world scenarios, the name of an object/place is important for recognizing and remembering it. To simulate this in VR, an explicit entry of the object name is required once an object is discovered. Similarly, by feeding a name, the IVA initiates autonomous navigation to find a discovered object in the application phase. The proposed model is implemented in a case-study project, learn objects on path (LOOP). Using the LOOP project, the model is evaluated in terms of accuracy and applicability in an IVRS.

The paper is organized into five sections. Related work is discussed in Sect. 2. Section 3 elaborates the proposed RUN model. Section 4 covers implementation and evaluation details. The last section, Sect. 5, presents the conclusion and future work.

2 Literature review

A key aspect behind the design of an immersive VR system is to mimic the real world closely enough to be indistinguishable from it. Intelligence is pivotal in simulating real-world interaction in a VE (Gobbetti et al. 1998). Using ML algorithms, intelligence can be incorporated in a VR system. However, in most systems, avatars in a VE perform simple predefined interaction (Downie et al. 2001). An avatar is the manifestation of self inside a VE (Peterson 2005), whereas an IVA is an avatar that may intelligently respond to dynamic queries about the contents of a VE (Aylett and Cavazza 2001). An IVA should be intelligent enough to learn like a human and recall whatever interaction it once performed.

An IVRS profoundly depends on the learning capability of an IVA (Bates 1994). However, scarce research has been dedicated to enhancing the capabilities of IVAs. The work of Luck and Ruth (2000) proposes ML classifiers to keep avatars from colliding with walls and other virtual objects. The approach introduces a platform to perceptively control the actions of an IVA; however, the proposed system lacks automation of interaction. The nomadic anatomy application of Senger (2005) records the interaction of trainees. The system stores interactions performed on a particular voxel, and automated navigation is supported only within a local geometry. In the offline navigation support approach (Van et al. 2001), an agent guides visitors in the designed environment and supports dialogue-based interaction. In such systems, the agent lacks the ability to learn by itself. Due to the absence of a learning factor, explicit specification of the paths is required for navigation.

Rivas et al. (2015) proposed an image-based algorithm to guide a robot's locomotion by using various markers. The system works within a specific lighting condition and depends on the shape and size of the markers. In another path planning approach for navigation (Li and Ting 2000), a suitable path is opted from a number of possible paths, but a user has to specify check points before starting navigation. The path finding model of Badler et al. (1996) suggests intelligence-based animation in a VR setup. Some state-of-the-art research works have successfully mingled the algorithms of AI with VR. The behavioral animation model (Conde et al. 2003) uses ML algorithms to enforce rules inside VR, such as keeping distance from neighbors and gaining a specific velocity. Machine vision has also been utilized in human–computer interfacing, where the position, pose and actions of users are used for interaction (Hämäläinen and Johanna 2002). The KNN model of Cai et al. (2010) categorizes objects or places into different classes using group analysis and pattern recognition.

3 RUN: the proposed model

The RUN model intends to instill in an IVA the ability to learn navigational experience. Besides making a map of different paths, the agent learns objects during navigation. To introduce a discovered object to an IVA, a user needs only to name the object after clicking over it. Like the real eyes-and-brain metaphor, the IVA discovers an object of interest (OOI) with its virtual eye and stores the names and details of OOIs in its virtual brain. At the exploration of the VE, the mental map M of paths is formed dynamically. The IVA follows M, in the application phase, for autonomous navigation. Two well-known ML classifiers, support vector machine (SVM) (Pontil and Verri 1998) and KNN (Guo et al. 2003), are used to learn the details of the traced OOIs. The SVM classifier is trained by the name and 3D position of a discovered OOI. The KNN classifier is used to learn the facet features of the OOI. A snapshot from the LOOP project is illustrated in Fig. 1, where the IVA learns the details of a selected OOI.

At the completion of the learning phase, the IVA may locate a discovered object by keeping track of the data stored in its virtual brain. In the same phase, the IVA is triggered by the search engine (SE) module to use its brain data (M and the classifiers' results) to spontaneously navigate inside the VE. At an appropriate frame, a required object is searched out on the basis of the front-facet feature.

3.1 Learning phase

IVRS research is on the rise, incorporating intelligence into VR systems via cutting-edge ML algorithms. The believability of a synthetic character is incomplete if it lacks the capability to learn by itself (Aylett and Cavazza 2001).
Navigation is the act of traveling inside a VE, and it is primarily required to access a virtual object in a VE. Automated navigation is important in a large/complex VE to avoid the possibilities of disorientation and getting lost. The RUN model enables an IVA to learn the navigational experience of a VR user and use it in discovering various objects at run time. During the learning phase, paths with their turning points, and objects with their front-facet features and positions, are learnt. A map (M), with routes as edges and turning points as nodes, is formed during explicit navigation. A dynamic data structure, the distance vector (DV), is used to form M as a user navigates in the VE. At the selection of an OOI, the IVA captures the details of the OOI: its image (OOIImg) and the object location (OL). At the click over an OOI, the rendered scene is packed into memory as a frame image ($\psi$) by using the glReadPixels routine (Elhassan 2005). Only the 3D objects are considered in scanning $\psi$, excluding the background. To obtain the corresponding contours image ($\Omega$), adaptive thresholding (Ayas et al. 2018) is performed over the image $\psi$ with rows $r = \{0, 1, \ldots, n\}$ and columns $c = \{0, 1, 2, \ldots, n\}$. A frame pixel $Fp(p_x, p_y)$ at row $p_x$ and column $p_y$ in $\psi$ is replaced by a thresholded pixel $Tp(p_x, p_y)$ in $\Omega$ if its intensity is less than a constant $T$. The whole or inner part (in the case of a complex object) of an OOI is traced by a convex hull CH (Gang and Nengxiong 2015). The CH of the set of pixels $\chi = \{x_1, x_2, \ldots, x_n\}$ of the OOI is given as

$$\mathrm{CH} = \left\{ \sum_{s=1}^{|\chi|} w_s x_s \;\middle|\; \forall s: w_s \ge 0 \,\wedge\, \sum_{s=1}^{|\chi|} w_s = 1 \right\}$$

where the $w_s$ are weight constants.

Details about the OOI (OOIImg and OL) are then fed into the virtual brain of the IVA. For text-based reference, the system prompts for a string input to be assigned as a name to the object. The system assigns a unique label on the fly to represent the feature set of the OOI for the classifiers. Information about all the followed paths and the discovered OOIs is retained inside the virtual brain. Learning about an OOI is accomplished inside the brain by a twofold process using the two ML algorithms, SVM and KNN. The SVM classifier is trained by OLs, whereas the KNN classifier is trained by the front-facet images of the OOIs. The object name (ON) data structure keeps track of the names of the discovered objects. A schematic of the model is shown in Fig. 2.

Fig. 1 The IVA observing the selection of an OOI

Fig. 2 Schematic of the proposed system (the virtual eye detects an OOI; the virtual brain stores the mental map, the DV, the SVM and KNN classifiers, OL, OOIImg and the user's name input)

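The frame-capture step described above can be sketched as follows; this is our minimal illustration, not the published LOOP code. It assumes an active OpenGL context with a viewport of width × height, reads the back buffer with glReadPixels, and wraps the pixels in an OpenCV Mat that serves as the frame image $\psi$. The helper name captureFrame is hypothetical.

```cpp
#include <opencv2/opencv.hpp>
#include <GL/gl.h>

// Hypothetical helper: grab the rendered scene as the frame image (psi).
// Assumes a current OpenGL context whose viewport is width x height.
cv::Mat captureFrame(int width, int height)
{
    cv::Mat frame(height, width, CV_8UC3);
    glPixelStorei(GL_PACK_ALIGNMENT, 1);   // rows tightly packed in memory
    glReadPixels(0, 0, width, height,
                 GL_BGR, GL_UNSIGNED_BYTE, frame.data);
    cv::flip(frame, frame, 0);             // OpenGL rows start at the bottom
    return frame;
}
```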

3.1.1 Mental map

The map M is a graph of nodes $v_i \in N$ and edges $e_i \in E$ representing the routes followed during navigation. M is updated each time a turn is made by an arrow key during navigation. Before switching to the application phase, the central voxel of the last selected object is taken as the end vertex.

$$M = (N, E) \tag{1}$$

$$N = \{v_1, \ldots, v_n\} \tag{2}$$

A node represents the direction of the following edge, whereas an edge represents the distance/length between nodes $v_x$ and $v_{x+1}$. If $d$ represents the Euclidean distance between any two vertices $v_i$ and $v_{i+1}$, then

$$E = \{d_1(v_1, v_2), \ldots, d_{n-1}(v_{n-1}, v_n)\} \tag{3}$$

The three possible values of a node are S, L and R, where 'S' represents a straight, 'L' a left and 'R' a right move. With each turn, the graph is updated by fixing the distance $d$ of the previous edge $e_p \in E$ and adding a new node $v_x \in N$. Practically, the map is realized by the DV structure. The DV links two dynamic arrays, 'distance' and 'vector,' as shown in Fig. 3. While initiating a move in a particular direction (straight, right or left), a character constant 'S', 'R' or 'L', respectively, is appended to the vector array. The map M formed by going S → R → S → L → S in the grid of voxels of Fig. 4 is shown in Fig. 5. Considering the whole virtual scene as a grid of voxels, $V = \{\omega_1, \omega_2, \ldots, \omega_n \mid \omega_i \in \mathbb{R}^3\}$, at the time of making a turn the distance $d$ between the starting voxel ($\omega_s$) and the end voxel ($\omega_e$) is dynamically calculated as

$$d(\omega_s, \omega_e) = \lVert \omega_s - \omega_e \rVert \tag{4}$$

Fig. 3 The distance vector (DV) data structure

Fig. 4 The followed path in the grid of voxels

Fig. 5 The map (M) of the followed path

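A minimal C++ sketch of how the DV of Fig. 3 might be realized is given below. The struct and member names (DistanceVector, onTurn) are ours, not from the LOOP source: each turn closes the previous edge with its accumulated Euclidean length (Eq. 4) and appends a new direction node, mirroring Eqs. (1)–(3).

```cpp
#include <vector>
#include <cmath>

// Hypothetical realization of the distance vector (DV) of Fig. 3:
// two parallel dynamic arrays, one with edge lengths and one with
// node directions ('S' straight, 'L' left, 'R' right).
struct DistanceVector {
    std::vector<double> distance;  // d of each completed edge, Eq. (3)
    std::vector<char>   vector;    // direction stored at each node

    // Called when the user turns: 'start' is the voxel where the
    // current edge began, 'turn' the voxel where the turn is made.
    void onTurn(char direction, const double start[3], const double turn[3])
    {
        double dx = start[0] - turn[0];
        double dy = start[1] - turn[1];
        double dz = start[2] - turn[2];
        distance.push_back(std::sqrt(dx * dx + dy * dy + dz * dz)); // Eq. (4)
        vector.push_back(direction);   // new node: 'S', 'L' or 'R'
    }
};
```

Walking the two arrays in step then reproduces the path S → R → S → L → S of Figs. 4 and 5.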
3.1.2 Extraction of OOI

During navigation, explicit selection is initiated by a click over an OOI, see Fig. 6. Extraction of the front facet of the OOI (OOIImg) from the scene is performed by our previously designed algorithm (Raees et al. 2016). After the identification of the boundaries of the object by CH, the algorithm selects contour pixels from top-left to bottom-right. The rendered virtual scene at the time of selecting the OOI is taken as a Mat object via the glReadPixels routine. Adaptive thresholding (Gonzalez and Richard 2002) is performed with a threshold intensity level T.

Fig. 6 The rendered frame at the time of selecting an OOI


The constant T is calculated as the mean of the intensities of the background and foreground pixels:

$$\mu_B = \frac{\sum_{r=1}^{n} \sum_{c=1}^{n} \{Fp(r, c) \in \mathrm{Background}\}}{r \times c} \tag{5}$$

$$\mu_F = \frac{\sum_{r=1}^{n} \sum_{c=1}^{n} \{Fp(r, c) \in \mathrm{Foreground}\}}{r \times c} \tag{6}$$

$$T = \frac{\mu_B + \mu_F}{2} \tag{7}$$

The $\Omega$ image is obtained as

$$\Omega(x, y) = \begin{cases} 0 & \psi(x, y) < T \\ 1 & \text{otherwise} \end{cases}$$

From the $\Omega$ of the scene, OOIImg with rows $m$ and columns $n$ is extracted as

$$\mathrm{OOI_{Img}} = \left( \bigcup_{r=Bm+5}^{Tm-5} (\Omega),\; \bigcup_{c=Bm+5}^{Lm-5} (\Omega) \right) \tag{8}$$

where Tm, Lm, Rm and Bm represent the top-most, left-most, right-most and bottom-most white pixels of the object's contours. In order to fully grasp the image, five extra pixels are extracted at each boundary. The entire process of extracting OOIImg from a frame image ($\psi$) is shown in Fig. 7.
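Equations (5)–(8) can be sketched with OpenCV as below. This is our illustrative reading, not the published implementation: $\mu_B$ and $\mu_F$ are approximated by splitting the pixels around the global mean, and the 5-pixel margin of Eq. (8) is realized as a padded bounding rectangle of the white contour pixels.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Sketch of Eqs. (5)-(8): estimate T from the background/foreground
// means, binarize psi into Omega, and crop the OOI with a 5-pixel margin.
cv::Mat extractOOIImg(const cv::Mat& psiGray)   // single-channel frame image
{
    // Rough background/foreground split around the global mean (Eqs. 5-6).
    double globalMean = cv::mean(psiGray)[0];
    cv::Mat fgMask = psiGray >= globalMean;
    double muF = cv::mean(psiGray, fgMask)[0];
    double muB = cv::mean(psiGray, ~fgMask)[0];
    double T = (muB + muF) / 2.0;               // Eq. (7)

    cv::Mat omega;
    cv::threshold(psiGray, omega, T, 255, cv::THRESH_BINARY); // Omega image

    // Bounding box of all white pixels (Tm, Lm, Rm, Bm), widened by 5 px.
    std::vector<cv::Point> white;
    cv::findNonZero(omega, white);
    if (white.empty()) return cv::Mat();
    cv::Rect box = cv::boundingRect(white);
    box -= cv::Point(5, 5);
    box += cv::Size(10, 10);
    box &= cv::Rect(0, 0, omega.cols, omega.rows);  // clamp to the frame
    return omega(box).clone();                      // OOI_Img, Eq. (8)
}
```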
3.1.3 The ON data structure

Akin to the use of names in the real world, a unique name is required to learn, recall and reference a discovered OOI. With the ON data structure, the name information about the OOIs is stored at run time. Once an object is discovered by the IVA, the system queries for a string input. In the data structure, the string input is saved under the name attribute (Object_Name). To uniquely represent each entry of ON, an index (from the whole numbers) is assigned dynamically. The data structure is updated each time an OOI is traced. In the virtual brain, the binary contour image (OOIImg) representing the front facet of an OOI is saved with the same input name. The ON structure is shown below.

Index | Object_Name
0     | Input-String1
…     | …
N     | Input-Stringn

3.1.4 Identification and classification algorithms

To learn and recall objects, the virtual brain makes use of the two well-known classifiers, KNN and SVM. The KNN classifier is used to recognize an OOI based on its front-facet shape and is therefore trained by OOIImg. The SVM classifier deals with 3D location and is trained by the OLs. One unique label per OOI is assigned to represent the image class in KNN and its respective 3D position in the SVM.

3.1.4.1 The SVM classifier

SVM is an efficient ML classifier (Hu et al. 2016) which learns sets of prototypes based on a separating hyperplane. The classifier needs to be trained with features $x_i \in \mathbb{R}^d$ and class labels $y_i \in Y = \{1, 2, \ldots, n\}$. After the completion of training, the classifier predicts a class from a set $S = \{(x_1, y_1), \ldots, (x_n, y_n)\}$. On the basis of the obtained features, the classifier builds an optimal hyperplane so as to predict a class label $y_i \mid y_i \in \{+1, -1\}$ for $i = \{1, 2, \ldots, n\}$. By design, the classifier predicts a unique class label $y_x$ if most of the unknown features belonging to $y_x$ lie on one side of the hyperplane (Lu and Qihao 2007). The classifier computes an inner product space $X \subseteq \mathbb{R}^d$ for $x_i \in X$ and $y_i \in Y = \{1, 2, \ldots, n\}$ for $S$ of cardinality $n$. If $H$ is the prototype space and $x_i \in X$ an input instance, then the scoring function $f$ for SVM is given as

$$f: X \times H \to \mathbb{R}$$

Predicting a class label $y_x$ for the features of an input instance $x_x$, the classification function $\omega: \rho \to \mathbb{R}$ is given as

$$y_x = \underset{x_x \in \rho}{\arg\max}\; \omega\!\left( f(x_x, H_x) \right) \tag{9}$$

where $\rho$ is the set of vector indexes.

Fig. 7 The process of extraction: a the frame image $\psi$ is thresholded to get b the $\Omega$ image; c the CH is computed over it to get d the OOIImg

In the case of multiclass SVM, where the cardinality $n > 2$, searching in the prototype matrix $H \in \mathbb{R}^{|\rho| \times d}$ is made for an instance of the feature vector $x_i \in \rho$. A class label $y_x$ is predicted if $\omega(x_x)$ results in a positive prototype $\rho$ for the class $y_x \mid \rho = \{x_x \in \rho : y_{ix} = 1\}$, where $1 \le i \le n$.

As the RUN model deals with the learning of multiple 3D objects, multiclass SVM is used to learn a variable number of OOIs. Therefore, the set of classes is given as

$$Y = \{1, 2, \ldots, \mathrm{OOI}_\xi\} \tag{10}$$

where $\mathrm{OOI}_\xi$ is the total number of traced 3D objects. To train the classifier, object names are used as class labels and 3D positions as features. The dataset $D$ of the classifier associates the discovered object $\mathrm{OOI}_i$ with its location $\mathrm{OL}_i$, where $\mathrm{OL}_i \in \mathbb{R}^3$ for $i = \{1, 2, \ldots, \mathrm{OOI}_\xi\}$:

$$D = \left\{ (\mathrm{OL}_i, \mathrm{OOI}_i) \mid \mathrm{OL}_i \in \mathbb{R}^3 \right\}_{i=1}^{\mathrm{OOI}_\xi} \tag{11}$$

A total of $\mathrm{OOI}_\xi(\mathrm{OOI}_\xi - 1)/2$ hyperplanes are set for $\mathrm{OOI}_\xi$ classes to separate each class from the rest. Each hyperplane of points $x$ satisfies the equation

$$w \cdot x + b = 0 \tag{12}$$

where $w$ is the weight vector and $b$ the intercept (bias) of the hyperplane. For an input instance of voxel $\varphi_i$, the decision function $f$ is given as

$$f(\varphi_i) = \arg\max_i \left[ w_i \cdot \varphi_i + b_i \right] \tag{13}$$

where $i = \{1, 2, \ldots, \mathrm{OOI}_\xi\}$.
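Since LOOP already links against OpenCV, the location classifier could be realized with cv::ml::SVM roughly as follows. The sketch is ours: each discovered OOI contributes one training row (its OL) with its dynamically assigned label as the class, following Eqs. (10)–(13). The kernel choice and function names are assumptions, not details from the paper.

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <vector>

// Train the position classifier of Sect. 3.1.4.1: one row per discovered
// OOI (its 3D location OL_i), one integer class label per OOI.
cv::Ptr<cv::ml::SVM> trainLocationSVM(const std::vector<cv::Point3f>& locations,
                                      const std::vector<int>& labels)
{
    cv::Mat samples((int)locations.size(), 3, CV_32F);
    for (int i = 0; i < (int)locations.size(); ++i) {
        samples.at<float>(i, 0) = locations[i].x;
        samples.at<float>(i, 1) = locations[i].y;
        samples.at<float>(i, 2) = locations[i].z;
    }
    cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    svm->setType(cv::ml::SVM::C_SVC);     // multiclass, one-vs-one hyperplanes
    svm->setKernel(cv::ml::SVM::LINEAR);  // assumed kernel
    svm->train(samples, cv::ml::ROW_SAMPLE, cv::Mat(labels, true));
    return svm;
}

// Predict which known object a queried position belongs to (Eq. 13).
int predictObjectAt(const cv::Ptr<cv::ml::SVM>& svm, const cv::Point3f& p)
{
    cv::Mat query = (cv::Mat_<float>(1, 3) << p.x, p.y, p.z);
    return (int)svm->predict(query);
}
```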
3.1.4.2 The KNN classifier

K-nearest neighbors is a simple learning algorithm with a high accuracy rate for image-based analysis (Gonzalez and Richard 2002; Roberts et al. 2007). As the KNN classifier suits pattern recognition (Franti et al. 2006) and image analysis (Stork et al. 2012) well, it is followed in the proposed model for facet recognition. With KNN, if $\beta$ represents the set of features and $\gamma$ the target class, then for an unknown $\beta_x$ KNN follows the function $h$ to conditionally predict $y$:

$$h(\beta_x): \beta \to y$$

For a set of classes $C = \{c_1(x_1), c_2(x_2), \ldots, c_n(x_n)\}$ representing the trained dataset, the solution for an instance $\beta_x$ is computed by the KNN algorithm as

$$h(\beta_x) = \underset{c \in C}{\arg\max} \left[ \{ y \mid y \in \mathrm{KNN}(\beta_x),\, h(y) = c \} \right] \tag{14}$$

In the proposed model, the feature data points are the binary tuples (BT) representing the pixels of the OOIimg. Let $C$ be the collection of a total of $\xi$ binary images representing the discovered OOIs; then

$$C = \{\mathrm{OOI_{Img1}}, \mathrm{OOI_{Img2}}, \ldots, \mathrm{OOI_{Img\xi}}\} \tag{15}$$

The features of an OOIImg are represented in the form of a fixed-size BT. A BT with rows $r$ and columns $c$ represents the pixel values of an OOIimg. The BT of an OOI is stored in a classification file (CF) as given below:

$$CF = \begin{pmatrix} BT_1(0, 0) & \ldots & BT_1(0, c) \\ BT_1(1, 0) & \ldots & BT_1(1, c) \\ \vdots & \ddots & \vdots \\ BT_1(r, 0) & \ldots & BT_1(r, c) \end{pmatrix} \tag{16}$$

To query an image $\mathbb{Q}$, $\{BT_{xi}\}_{i=1}^{\xi} \in \mathbb{Q}$, the KNN classifier searches in the set $\mathbb{P}$, $\{BT_i\}_{i=1}^{\xi} \in \mathbb{P}$. The CF having the closest resemblance with the query image is returned using the following formula:

$$d(\mathbb{Q}, \mathbb{P}) = \sqrt{\sum_{i=1}^{\xi} \left( \mathbb{Q}_i - \mathbb{P}_i \right)^2} \tag{17}$$
be the collection of a total of ξ binary images representing
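The facet classifier can likewise be sketched with cv::ml::KNearest. The binary tuples of Eq. (16) become rows of flattened, fixed-size OOIImg crops, and findNearest implements the nearest-CF comparison of Eq. (17). The 32 × 32 BT size and K = 1 are our assumptions.

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <vector>

const int BT_SIDE = 32;   // assumed fixed BT size: 32 x 32 binary pixels

// Flatten one binary OOI_Img into a single float row (one BT of Eq. 16).
static cv::Mat toRow(const cv::Mat& ooiImg)
{
    cv::Mat resized, row;
    cv::resize(ooiImg, resized, cv::Size(BT_SIDE, BT_SIDE));
    resized.convertTo(row, CV_32F, 1.0 / 255.0);
    return row.reshape(1, 1);   // 1 x (BT_SIDE * BT_SIDE)
}

// Train the facet classifier on the stored front-facet images.
cv::Ptr<cv::ml::KNearest> trainFacetKNN(const std::vector<cv::Mat>& ooiImgs,
                                        const std::vector<int>& labels)
{
    cv::Mat samples;
    for (const cv::Mat& img : ooiImgs)
        samples.push_back(toRow(img));
    cv::Ptr<cv::ml::KNearest> knn = cv::ml::KNearest::create();
    knn->setDefaultK(1);   // nearest stored facet, as in Eq. (17)
    knn->train(samples, cv::ml::ROW_SAMPLE, cv::Mat(labels, true));
    return knn;
}

// Classify one candidate region cut from the rendered frame.
int classifyFacet(const cv::Ptr<cv::ml::KNearest>& knn, const cv::Mat& candidate)
{
    cv::Mat result;
    knn->findNearest(toRow(candidate), 1, result);
    return (int)result.at<float>(0, 0);
}
```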
the discovered OOIs, then


3.2 Application phase

The twofold classification is used to reduce the chances of falsehood in searching out an OOI in the VE. After the exploration of the VE, a user needs only to input the name of any known object to initiate auto-navigation and locate the object. With the SE module, the string input (name) is recursively compared with the Object_Name entries in the ON data structure, see Fig. 8. On a successful match, the name entry of the desired object is fed to the SVM classifier. In the case of a failed match, the process is repeated by incrementing the index to check the next entry.

The designed VE is deemed a voxel grid across the x-, y- and z-axes, as shown in Fig. 9. To reach the location (OL) of an object, the DV structure is followed for self-directed navigation. The current entries of the distance and vector arrays are followed to travel along a path and to take turns, respectively. The coordinates of the virtual camera, CC(x, y, z), are changed according to the vector entries $v_i \in V$ to travel the distance $d_i \in D$.

The self-directed navigation of the IVA is stopped whenever the camera coordinates CC(x, y, z) match the OL of the target object. The algorithm followed for automated navigation is shown in Fig. 10. To locate a required object in the rendered viewport, classification on the basis of front-facet features is performed by the KNN classifier. The OOIImg from the virtual brain is checked against the contours image ($\Omega$) of the last rendered frame. The section of $\Omega$ having maximum similarity with the OOIImg is thus selected.

Fig. 8 Graphic representation of the search engine (SE) module (the input name is matched against the ON entries; the SVM supplies OL and the KNN supplies OOIImg)

Fig. 9 Grid view of a portion of the VE

Fig. 10 Flowchart of the navigation and panning algorithm (navigate about ±z and pan about ±x until CC(x, y, z) = OL(x, y, z))

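Read as code, the flowchart of Fig. 10 amounts to the following one-step update (our naming; the paper gives only the diagram). The camera CC chases the target OL voxel by voxel, navigating about ±z and panning about ±x, and stops when the coordinates match.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Hypothetical per-frame step distilled from Fig. 10: returns true once
// the camera has reached the object location and navigation stops.
bool navigationStep(Vec3& cc, const Vec3& ol, double step = 1.0)
{
    if (std::fabs(cc.x - ol.x) < step && std::fabs(cc.z - ol.z) < step)
        return true;                      // CC(x,y,z) == OL(x,y,z): stop

    if      (ol.z > cc.z) cc.z += step;   // navigate about +z
    else if (ol.z < cc.z) cc.z -= step;   // navigate about -z
    else if (ol.x > cc.x) cc.x += step;   // pan about +x
    else                  cc.x -= step;   // pan about -x
    return false;
}
```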
4 Implementation and evaluation

The RUN model is implemented in a project, LOOP. The LOOP system is designed in Visual Studio (VS-2015) using the libraries of OpenCV and OpenGL. A Core i5 laptop with a 2.7 GHz processor and 4 GB RAM was used for the implementation and evaluation of the system. Twelve participants (10 males and 2 females, mean age 34.9, standard deviation 5.9, range 18) evaluated the case-study application. All the participants were postgraduate students of the Department of Computer Science, University of Malakand. Having prior experience in VR technology, they voluntarily participated in the evaluation session. All the experiments were performed in the university IT laboratory. Before starting the evaluation session, participants were introduced to the system. Moreover, the participants performed pretrials as well.

4.1 Interface of the project

The LOOP system offers an immersive interface where the user's position is represented by an IVA. The interface supports keyboard events to activate different actions. With the press event of the arrow keys (up, right and left), the avatar takes a turn accordingly in the VE. Though the avatar can move freely in the environment, there are three tracks following which a user may explore the entire VE. The three tracks, straight, left and right, meet at a termination point represented by a 'stop' sign board, see Fig. 11. In the learning phase, a static view of the VE containing various 3D objects is displayed. With the press of the up arrow key, the z-coordinate of the virtual camera is decreased to give a perception of forward navigation. By pressing the left or right (L/R) arrow keys, the avatar takes an L/R turn. Besides rotating the avatar toward L/R, the eye coordinates of the virtual camera are altered to give the feeling of panning (Chang and Michael 2017). With each press event of the arrow keys, one step of navigation or panning is performed. Along with a long beep of 600 ms, text like 'left turn' or 'right turn' is also displayed to inform the user about an action. A 3D object is selected as an OOI when a user clicks over it. However, to accomplish the training of the OOI, the user needs to enter a string input (name) for the object. The length of the object name is kept to eight characters. The string may contain numbers, alphabets or symbols.
By hitting the Enter key, the input box for name feeding is cleared. The ON data structure is updated with the new entry, while the features, OL and OOIimg, for the selected OOI are extracted for training. On hitting the Escape key, the system switches to the application phase. At the top left, the label of the input box changes to 'Enter name of Object to navigate to.' On pressing the Enter key, the input string (name) is incrementally compared with the entries under the Name field in the ON data structure. The SE module pursues OL and OOIimg by calling the SVM and KNN classifiers. Next, the map M is followed for automated navigation. At the attainment of the object's location (OL) in the VE, navigation is stopped. At successful classification by KNN, the required OOI is highlighted by a rectangle around its edges. The state machine of the LOOP project is shown in Fig. 12.

Fig. 11 Routes of the VE (Rout-1, Rout-2 and Rout-3) meeting at the sign board

Fig. 12 State machine of the LOOP project

4.2 Evaluation tasks

The 3D VE of the LOOP application contains fifteen objects of varying surface attributes, like solid-filled, wire-filled and textural facets, as shown in Fig. 13. Participants were asked to select and name any five 3D objects in the learning phase. In the designed environment, there are exactly four objects on each track (straight, left and right). Therefore, to select five objects, each participant had to take at least one turn during exploration. In the application phase, the system was examined by randomly feeding names of the known objects. As retraining is required only once (in the learning phase), users availed the option of pressing the Escape key to repeat the process of searching for different OOIs.

Fig. 13 3D environment of the LOOP application

4.3 Accuracy assessment

With the case-study project, the model was evaluated a total of sixty times. Each time, the classifiers were trained and tested with different combinations of the 3D objects. The overall accuracy rate for automated navigation and searching, as shown in Table 1, was 90.8%. Incorrect navigation or selection (false positive and false negative) (Najadat et al. 2019) was counted as false detection.

Table 1 Statistics of the evaluation

Interaction                          | Total | Correct | False | % Accuracy
Automated navigation leading to OL   | 60    | 57      | 3     | 95
Object selection based on OOI_Img    | 60    | 52      | 8     | 86.6
Overall accuracy                     |       |         |       | 90.8

For an unknown object/incorrect name, the system remains in passive mode. In the passive mode, no action is performed in the VE; however, the user is informed by a displayed message about entering a valid name. To properly evaluate the use of the SVM and KNN classifiers in the VR setup, the outcomes of the classifiers are analyzed separately by the confusion matrices, see Figs. 14 and 15.
Three 3D positions were wrongly classified by the SVM: false negatives (FN = 3). Some objects in the environment have similar facet features; therefore, the KNN classifier falsely identified eight objects: false positives (FP = 8). The calculated average precision ratio (APR), average recall ratio (ARR) and average cross-validation ratio (ACVR) for evaluating the classifier response (Caceres 2014) are shown in Fig. 16.

Fig. 14 Confusion matrix of the SVM classifier

                          Predictive classification
                          1     0     Total
Actual classification  1  57    3     60
                       0  0     0     0
Total                     57    3     60

Fig. 15 Confusion matrix of the KNN classifier

                          Predictive classification
                          1     0     Total
Actual classification  1  52    0     52
                       0  8     0     8
Total                     60    0     60

Fig. 16 Accuracy results of the classifiers (APR, ARR and ACVR; ACVR = 0.90)
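For reference, the figures in Table 1 and Fig. 16 follow directly from these confusion counts; the small helper below (our code, using the standard definitions) reproduces the arithmetic: SVM recall = 57/60 = 0.95, KNN precision = 52/60 ≈ 0.867, and overall accuracy = (57 + 52)/120 ≈ 0.908.

```cpp
#include <cstdio>

struct Metrics { double precision, recall, accuracy; };

// Standard binary confusion-matrix metrics.
Metrics evaluate(int tp, int fp, int fn, int tn)
{
    Metrics m;
    m.precision = double(tp) / (tp + fp);
    m.recall    = double(tp) / (tp + fn);
    m.accuracy  = double(tp + tn) / (tp + fp + fn + tn);
    return m;
}

int main()
{
    Metrics svm = evaluate(57, 0, 3, 0);   // Fig. 14: 3 false negatives
    Metrics knn = evaluate(52, 8, 0, 0);   // Fig. 15: 8 false positives
    std::printf("SVM recall %.3f, KNN precision %.3f\n",
                svm.recall, knn.precision);
    return 0;
}
```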
4.3.1 Subjective analysis

After the evaluation session, a three-factor measuring questionnaire was presented to the participants. The factors assessed were ease of use, suitability in IVRS and naturalism. The users' responses about the three factors are shown in Fig. 17. The post-assessment questionnaire is shown in Table 2.

Fig. 17 Result of the subjective evaluation

5 Conclusion and future work

AI aids in augmenting 3D interaction (Dobrzański and Rafal 2016) in a VE. With this research work, we propose a model to enhance interactivity with an IVRS. The proposed model intends to enhance the ability of an IVA to learn navigational experience from a VR user. As a human learns in exploring an unknown place, the IVA learns different objects along with different tracks. Once an object is discovered, a name entry is fed, which is used by the IVA as a reference to trace the object. In the application phase, the IVA accesses any of the known objects upon input of the desired object's name.

Following a mental map of the scene, the IVA performs auto-navigation to the position of the desired object. The model was implemented and evaluated to testify to its applicability in VR systems. The satisfactory accuracy of the LOOP project confirms the wider applicability of the model. For a novice user, the system ensures a feasible view and visit of objects with less chance of disorientation. The model introduces a path-finding approach that can be followed to simplify navigational tasks in complex VEs, such as virtual representations of the brain, the structure of DNA, or galaxies. Moreover, the model can be extended for the automation of other 3D tasks inside a VE.

Although the model is appropriate for autonomous navigation, the front facets of objects should be distinguishable. The accuracy of object selection depletes with an increase of similar objects in a single rendered frame. Moreover, the system does not support jumping or flying of the IVA. Additional research is required to overcome the said challenges. In future, we are determined to enhance the model for the emerging augmented and mixed VR setups. The authors plan to enhance the model so that the knowledge of an IVA can be shared with other IVAs. With this, an IVA will be able to intelligently respond to any query about a 3D VE.


Table 2 The questionnaire for measuring the three factors (responses on a five-point scale: Strongly agree / Agree / Indifferent / Disagree / Strongly disagree)

1. As a whole, the system was easy to use.
2. The model is suitable to be followed in intelligence-based VR applications.
3. The system mimics the natural (real-world) experience of humans in learning places/objects during exploration.

Compliance with ethical standards

Conflict of interest  The authors affirm that they have no conflict of interest.

References

Ayas S, Dogan H, Gedikli E, Ekinci M (2018) A novel approach for bi-level segmentation of tuberculosis bacilli based on meta-heuristic algorithms. Adv Electr Comput Eng 18(1):113–121. https://doi.org/10.4316/aece.2018.01014
Aylett R, Cavazza M (2001) Intelligent virtual environments—a state-of-the-art report. In: Eurographics conference, Manchester, UK
Badler B, Webber W, Becket C, Geib M, Moore C, Pelachaud B, Reich MS (1996) Planning for animation. In: Magnenat-Thalmann N, Thalmann D (eds) Interactive computer animation. Prentice-Hall, New Jersey, pp 235–262
Bates J (1994) The role of emotion in believable agents. Commun ACM 37(7):122–125. https://doi.org/10.1145/176789.176803
Caceres CA (2014) Machine learning techniques for gesture recognition. Dissertation, Virginia Tech
Cai Y, Ji D, Cai D (2010) A KNN research paper classification method based on shared nearest neighbor. In: NTCIR, pp 336–340
Chang H, Michael F (2017) Panning and zooming high-resolution panoramas in virtual reality devices. In: Proceedings of the 30th annual ACM symposium on user interface software and technology, pp 279–288. https://doi.org/10.1145/3126594.3126617
Conde T, Tambellini W, Thalmann D (2003) Behavioural animation of autonomous virtual agents helped by reinforcement learning. In: International workshop on intelligent virtual agents. Springer, Berlin, pp 175–180. https://doi.org/10.1007/978-3-540-39396-2_28
Dobrzański LA, Rafał H (2016) Artificial intelligence and virtual environment application for materials design methodology. Arch Mater Sci Eng 45(2):69–94
Downie RB, Damian M, Yuri I, Bruce B (2001) Creature smarts: the art and architecture of a virtual brain. In: Proceedings of game developers conference
Dunagan, Jake F (2004) Neuro-futures: the brain, politics and power. J Fut Stud 9(2):1–18
Elhassan I (2005) Fast texture downloads and readbacks using pixel buffer objects in OpenGL. nVidia technical brief. nVidia
Franti P, Olli V, Ville H (2006) Fast agglomerative clustering using a k-nearest neighbor graph. IEEE Trans Pattern Anal Mach Intell 28(11):1875–1881. https://doi.org/10.1109/tpami.2006.227
Gang ME, Nengxiong XU (2015) Cudapre3d: an alternative preprocessing algorithm for accelerating 3D convex hull computation on the GPU. Adv Electr Comput Eng. https://doi.org/10.4316/aece.2015.02005
Gobbetti E, Riccardo S (1998) Virtual reality: past, present, and future. In: Virtual environments in clinical psychology and neuroscience: methods and techniques in advanced patient-therapist interaction
Gonzalez RC, Richard EW (2002) Thresholding. In: Digital image processing, pp 595–611
Guo G, Hui W, David B, Yaxin B, Kieran G (2003) KNN model-based approach in classification. In: OTM confederated international conferences on the move to meaningful internet systems. Springer, Berlin, Heidelberg, pp 986–996. https://doi.org/10.1007/978-3-540-39964-3_62
Hale KS, Stanney KM (2014) Handbook of virtual environments: design, implementation, and applications. CRC Press, Boca Raton. https://doi.org/10.1201/b17360
Hämäläinen P, Johanna H (2002) A computer vision and hearing based user interface for a computer game for children. In: ERCIM workshop on user interfaces for all. Springer, Berlin, Heidelberg, pp 299–318. https://doi.org/10.1007/3-540-36572-9_24
Hu L-Y, Min-Wei H, Shih-Wen K, Chih-Fong T (2016) The distance function effect on k-nearest neighbor classification for medical datasets. SpringerPlus 5(1):1304. https://doi.org/10.1186/s40064-016-2941-7
Li TY, Ting HK (2000) An intelligent user interface with motion planning for 3D navigation. In: Virtual reality proceedings, pp 177–184. https://doi.org/10.1109/vr.2000.840496
Lu D, Qihao W (2007) A survey of image classification methods and techniques for improving classification performance. Int J Remote Sens 28(5):823–870. https://doi.org/10.1080/01431160600746456
Luck M, Ruth A (2000) Applying artificial intelligence to virtual reality: intelligent virtual environments. Appl Artif Intell 14(1):3–32. https://doi.org/10.1080/088395100117142
Lugrin J, Marc C, Mark P, Sean C (2005) AI-mediated interaction in virtual reality art. In: International conference on intelligent technologies for interactive entertainment. Springer, Berlin, Heidelberg, pp 74–83. https://doi.org/10.1007/11590323_8
Matsas E, Vosniakos GC, Batras D (2017) Effectiveness and acceptability of a virtual environment for assessing human–robot collaboration in manufacturing. Int J Adv Manuf Technol 92(9–12):3903–3917. https://doi.org/10.1007/s00170-017-0428-5
Najadat HM, Alshboul AA, Alabed AF (2019) Arabic handwritten characters recognition using convolutional neural network. In: 10th international conference on information and communication systems (ICICS), pp 147–151. https://doi.org/10.1109/iacs.2019.8809122
Peterson M (2005) Learning interaction in an avatar-based virtual environment: a preliminary study. PacCALL J 1(1):29–40
Pontil M, Verri A (1998) Support vector machines for 3D object recognition. IEEE Trans Pattern Anal Mach Intell 20(6):637–646. https://doi.org/10.1109/34.683777
Raees M, Ullah S, Rahman SU, Rabbi I (2016) Image based recognition of Pakistan sign language. J Eng Res 4(1):22–41. https://doi.org/10.7603/s40632-016-0002-6
Rivas E, Koutarou K, Kazuhisa M, Genci C (2015) Image-based navigation for the snoweater robot using a low-resolution USB camera. Robotics 4(2):120–140. https://doi.org/10.3390/robotics4020120
Roberts A, McMillan L, Wang W, Parker J, Rusyn I et al (2007) Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows. Bioinformatics 23(13):i401–i407. https://doi.org/10.1093/bioinformatics/btm220
Senger S (2005) Visualizing volumetric data sets using a wireless handheld computer. Studies in health technology and informatics, pp 447–450
Sheridan TB (2000) Interaction, imagination and immersion: some research needs. In: Proceedings of the ACM symposium on virtual reality software and technology. ACM, pp 1–7. https://doi.org/10.1145/502391.502392
Stork DG, Duda RO, Hart PE (2012) Pattern classification. Wiley, New Jersey. https://doi.org/10.1007/s00357-007-0015-9
Van L, Jeroen, Anton N (2001) A dialogue agent for navigation support in virtual reality. In: CHI'01 extended abstracts on human factors in computing systems, pp 117–118. https://doi.org/10.1145/634133.634138

Publisher's Note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.