
Accessible and Assistive ICT

VERITAS
Virtual and Augmented Environments and Realistic User
Interactions To achieve Embedded Accessibility DesignS
247765

Testing and Validation Refinement of the interface tool set

Deliverable No. D2.8.3


SubProject No.: SP2
SubProject Title: Innovative VR models, tools and simulation environments
Workpackage No.: WP2.8
Workpackage Title: Multimodal interfaces
Activity No.: A2.8.5
Activity Title: Iterative testing and optimization of the multimodal interface tool set
Authors P. Moschonas, A. Tsakiris, G. Stavropoulos, N.
Kaklanis, S. Segouli, I. Paliokas, D. Tzovaras
(CERTH/ITI), E. Gaitanidou (CERTH/HIT), T. Grill
(UoS), G. Fico, C. Daz (UPM).

Status F (Final)

Dissemination Level Pu (Public)

File Name: D2.8.3 Testing and Validation Refinement of the interface tool set.doc

Project start date and duration: 01 January 2010, 48 Months
VERITAS D2.8.3 PU Grant Agreement # 247765

Version History Table


Version no.   Dates and comments

1 First version created (October 2012).

2 Added the Heuristic Evaluation results (November 2012).

3 Added the User Study results (early December 2012).

4 Draft version created and sent for peer review (December 2012).

5 Final version submitted (December 2012).

December 2012 iii CERTH/ITI



Table of Contents
Version History Table
Table of Contents
List of Figures
List of Tables
List of Abbreviations
List of Abbreviations for VERITAS tools
Executive Summary
Document Overview
1 Introduction
1.1 VerMIM and the VERITAS framework
2 Acceptability and Usability Indicators
2.1 Generic User Interfaces Indicators
2.1.1 User Performance
2.1.2 User Satisfaction
2.2 Indicators for Multimodal User Interfaces Addressed to People with Special Needs
2.2.1 Speech- and Audio-based Interaction
2.2.2 Tactile and Haptic Interaction
2.2.3 Eye- and Head-tracking Controlled Interaction
2.2.4 Hand and Gesture Recognition Interaction
2.2.5 Vestibular Interaction
2.2.6 Brain Controlled Interaction
2.2.7 Visual Interaction
2.2.8 Special Issues concerning the Multimodal Interfaces Indicators
3 In Depth Analysis of the Usability and Acceptability Indicators
3.1 Recommendations for Modalities
3.2 Recommendations for Combined and Single Usability and Acceptability Indicators
3.2.1 Using target goals
3.2.2 Using percentages
3.2.3 Using z-scores
3.2.4 Using SUM: Single Usability Metric
3.2.5 Summary
3.3 VERITAS Task Examples
4 Heuristic Evaluation of the VERITAS Multimodal Interfaces Toolset
4.1 Introduction
4.2 Process of the heuristic evaluation
4.2.1 Process
4.2.2 Material
4.2.3 Scenarios
4.2.4 Expert specifications
4.2.5 Evaluation Process
4.2.6 Analysis
4.2.7 Modality blind user
4.2.8 Modality Myopia
4.2.9 Modality Motor impaired User
4.3 Conclusion
5 Testing and Validation of the VERITAS Multimodal Interfaces Toolset: User Study
5.1 VerMIM Evaluator: Integrating VerMIM with the Simulation Platform
5.1.1 Data flow and Connection with Simulation Platform
5.2 Users & Tests Specifications and Scenario Description
5.2.1 Users Specifications
5.2.2 Tests Specifications
5.2.3 Scenario Description
5.3 Test Results
5.3.1 Quantitative metric results
5.3.2 Qualitative metric results
5.4 Conclusions of the User Study
6 Refinement of the Multimodal Interfaces Tool-set and its Limitations
6.1 VerMIM Refinement
6.1.1 Addition of the Head Tracking Tool
6.1.2 Improvements in Speech Recognition system
6.1.3 Adding Natural Output of Speech Synthesis
6.1.4 VerMIM as standalone GUI application
6.2 VerMIM Limitations
7 Conclusions
References
8 Appendix
8.1 TaskSheet Heuristic Evaluation: Multimodal Interfaces Manager (VerMIM Evaluator)
8.1.1 A. Scenario: Normal users without any disabilities
8.1.2 B. Scenario: User with severe visual impairments - blind users
8.1.3 C. Scenario: Users with mild visual impairments - Myopia
8.1.4 D. Scenario: Motion impaired users - upper limb paralysis


List of Figures
Figure 1: The VerMIM connections with the rest of the VERITAS tools. As depicted, the VerMIM communicates with several modality sub-modules (listed in the right rectangle column).
Figure 2: Example of a speech and audio-based device to interact hands-free while on the go [25].
Figure 3: Example of haptic interface, an exoskeleton [13].
Figure 4: Example of an eye- and head-controlled interface [12].
Figure 5: Sign language recognition using Kinect [35].
Figure 6: Wii balance board used as assistive technology [31].
Figure 7: Matt Nagle, the first person to ever be implanted with a BrainGate [27].
Figure 8: Evaluation process of the heuristic evaluation
Figure 9: The expert RB running through Scenario 3, in which a mild visual impairment is simulated. He is interacting with the computer using the Novint Falcon haptic device and gets haptic feedback.
Figure 10: Expert running through scenario 4, in which he wants to perceive how a user with motor impairments (upper limb paralysis) would interact with the SmartHome tool. He is wearing glasses on which an infrared lamp is fixed. Behind the laptop screen the WiiMote detects the movement of the infrared lamp and, as a consequence, the head movement. By moving his head he is able to move the cursor on the screen.
Figure 11: Relation between number of evaluators and problems identified according to Molich and Nielsen (1991)
Figure 12: Percentage of agreement to the questions in the first category of the evaluation guidelines: VerMIM.
Figure 13: Percentage of agreement in the VerMIM GUI Design category.
Figure 14: Screenshot of the VerMIM Evaluation tool.
Figure 15: Percentage of agreement for the modality Myopia.
Figure 16: Percentage of agreement for the Modality motor impairments.
Figure 17: The main screen of the Smart Home Application interface, which was used as the base for the user interaction scenario steps.
Figure 18: The VerMIM Evaluator tool that was used for the user test recordings and the management of the simulation platform, responsible for simulating the impairments to the subjects.
Figure 19: The VerSim-GUI, which communicates with the VerMIM Evaluator and is responsible for simulating the various impairments to the test-users.
Figure 20: The VerMIM Evaluator report dialog that is displayed after each test-simulation session. Durations, errors and velocity of the mouse pointer (per scenario task) are depicted. The user is also able to save these statistics, along with other metrics, to a file for further processing.
Figure 21: The testing procedure data flow. The integration with the VerSim tool is necessary in order to perform the simulation of the impairment in the testing environment. VerMIM and VerMIM Evaluator exchange several data during the simulation session, such as current device state, task completion checks, etc.
Figure 22: The Smart Home Application that was used for the scenario. Here the interface is depicted unfiltered, just as it was used in the Normal and Motor Impairment sessions.
Figure 23: The Smart Home application as it appears after the simulation of the myopia impairment that was applied in the mild vision impairment case.
Figure 24: The severe glaucoma vision impairment case; most of the visual field is occluded by blind spot areas. In such cases the virtual user is considered as almost blind.
Figure 25: The test-user using the head tracking device; the user was instructed not to use his hands; thus any interaction with the application was based on head motion (via the infrared LED glasses) combined with voice commands (captured by the microphone).
Figure 26: The total duration of each test. The session percentages (relative to the total user test time) are also depicted. It is clear that a great amount of time was consumed by the 4th session, i.e. the head tracking for the Motor Impairment test.
Figure 27: Average session duration (indicated by the number in seconds on top of each bar). The red lines indicate the standard deviation of the duration distribution.
Figure 28: The duration overhead as a percentage relative to the Normal session. The overhead of the haptic (Vision Mild session) is relatively small compared to the rest, especially when compared to the usage of the head tracker (Motor session).
Figure 29: Distribution of the user errors per session. The majority (12 out of 13) of the users performed the tests making an almost negligible number of errors.
Figure 30: The average number of user errors per session; even the Motor session, which involved the head tracker, achieves a mean of only 2.0 errors. The standard deviation is indicated with the red line segments.
Figure 31: The normalized distance (as a percentage of the total point distance travelled through the tests). The results indicate that the distances travelled are comparable through the usage of different modalities.
Figure 32: The distance average overhead (compared to the Normal case) of the Vision-Mild and Motor sessions. As shown, the average overhead is small.
Figure 33: The pointer velocity of each user for the Normal, Vision-Mild and Motor sessions.
Figure 34: The pair of glasses with the attached LED transmitter that was used as the tracking device. The depicted system can be considered low cost, as its total cost is less than five Euros.
Figure 35: The VerMIM haptics testing panel.
Figure 36: The VerMIM magnifying glass testing panel.
Figure 37: The VerMIM speech recognition testing panel.
Figure 38: The VerMIM sign language synthesis testing panel.
Figure 39: The VerMIM calibration and test panel of the Head tracking module.
Figure 40: The VerMIM speech synthesizer testing panel.

List of Tables
Table 1: Usability and Acceptability Indicators
Table 2: Most commonly used questionnaires regarding usability, usefulness, satisfaction and ease of use.
Table 3: Usability and acceptability indicator recommendations considering modalities
Table 4: Examples of usability and acceptability indicators for VERITAS project tasks
Table 5: Categories of our evaluation guidelines, their descriptions and subcategories
Table 6: The four scenarios with the exact task descriptions. These instructions were handed to our experts printed out on instruction sheets.
Table 7: Specifications of our chosen experts: education, specialisation, research focus
Table 8: User specifications table.
Table 9: Multimodal experience of the users before the test.
Table 10: Test session types; each session is a different combination of a VUM and a set of activated modality tools.
Table 11: The scenario followed at each test session.
Table 12: The system usability questionnaire. The scale is from 1 to 5, where 5 indicates strong agreement with the statement. The number of test subjects is reported in each cell (along with its translation to a percentage).
Table 13: Technology acceptance model questionnaire for the VerMIM tools. Each statement answer is scaled from 1 (favourable opinion of the system) to 7 (unfavourable opinion of the system).


List of Abbreviations
Abbreviation Explanation

API Application Programming Interface


Ax.y.z VERITAS Activity
DLL Dynamic Link Library
Dx.y.z VERITAS Deliverable
GUI Graphical User Interface
GVUM Generic Virtual User Model
LGPL GNU Lesser General Public License

ODE Open Dynamics Engine


OGRE Open Source 3D Graphics Engine
O/S Operating System
SP Sub-project
UI User Interface
WP Work-package

VR Virtual Reality
VUM Virtual User Model


List of Abbreviations for VERITAS tools


Abbreviation Explanation

VerMIM Veritas Multimodal Interfaces Manager

MOUSI Veritas User Model Ontology to UsiXML Converter

VerGen Veritas User Model Generator

VerMP Veritas Model Platform

VerSEd-3D Veritas 3D Simulation Editor

VerSEd-GUI Veritas GUI Simulation Editor

VerAE Veritas Avatar Editor

VerSim-3D Veritas 3D Core Simulation Viewer

VerSim-GUI Veritas GUI Core Simulation Viewer

IVerSim-3D Veritas 3D Immersive Simulation Viewer

VerIM Veritas Interaction Manager


Executive Summary
The purpose of this document is to describe and justify the process followed for
the testing and evaluation of the VERITAS Multimodal Interfaces Tools, which
were developed as part of the work defined in WP2.8. Before consulting this
document, the reader is advised to read about the implemented tools and their
integration in deliverables D2.8.1 [1] and D2.8.2 [2].
The document's objective is threefold: a) to present a list of accessibility and
usability indicators that are commonly established as suitable for the evaluation
of Multimodal Interfaces systems; b) to select which of these indicators can be
applied to the evaluation of our Multimodal Interfaces system, which is intended
to assist the interactions of elderly or impaired users and users with special
needs; and c) most importantly, to define proper scenarios and test the VERITAS
multimodal toolset for its acceptability and usability.
Including the accessibility and usability indicators and factors in this document
was necessary for properly setting the basis of the evaluation that was followed.
Matters such as which indicator is suitable for which modality have to be
presented first, to give the reader a complete bibliographic view of the field.
The selection of the proper indicators, suitable for testing the interactions
with impaired users, was also a necessary step in adapting the evaluation process
to the special needs of VERITAS.
During the evaluation process, several problems were encountered. The most
difficult one was the definition of a proper scenario for the testing sequence,
because the simulated user group contains mostly impaired users, including models
with severe impairments (almost blind users, users who cannot use their limbs).
This added several constraints to the scenario: it had to use as many modalities
as possible, so as to assist a large number of people with totally different
impairment types (from the vision, hearing, cognitive and motor domains), and,
within all these parameters, it had to stay as realistic as possible.
Another problem was that pre-existing applications could not be used without
alteration or integration with the VERITAS Multimodal Interfaces Manager (in
short: VerMIM). So either a new application framework had to be created as a
test bed for the toolset, or a special low-level integration had to be made. As
shown in the respective section, the test bed method was followed and applied to
a closed-source Smart Home controller application.
Proper user selection was key to the evaluation process for the testing pilots.
Both expert and non-expert users were included in the tests in order to provide
an all-round evaluation of the toolkit. Two VerMIM evaluations therefore took
place: a) a heuristic evaluation by experts and b) a typical-user study. Both
processes are fully described in this document and their results are reported.
First, the heuristic evaluation took place, in which experts in the area of
multimodal user interfaces identified several problems. Their recommendations
were taken into account and several fixes were applied to the multimodal
interfaces toolset. This resulted in a heavy refinement of the toolkit. Several
of the VerMIM components were improved and the VerMIM was re-tested in a user
study with thirteen test subjects. The user study produced favourable and
positive comments, and it can be stated that the tool has been successfully
integrated with the Simulation platform to provide a holistic multimodal way of
interaction for users having virtual impairments.


Document Overview
The document is split into seven Sections. Section 1 is the introduction to this
document; it defines its purpose and positions the VerMIM relative to the rest of
the VERITAS simulation tools.
Section 2 presents a list of generic acceptability and usability indicators for
user interface systems. Including the generic indicators in this document is a
necessary step towards understanding how a system is evaluated. The indicators in
Section 2 are generic and independent of the activated modalities.
Section 3 moves one step further by taking the acceptability and usability
indicators and assigning them to the needs of the different modalities and to the
different target groups. Each indicator is analysed and matched to one or more
modalities. The questionnaire types used for such evaluations are also discussed
in this section.
Section 4 contains the heuristic evaluation of the VerMIM. This evaluation was
conducted by experts in the field and provides a thorough review of the VerMIM
and its integration with the VERITAS framework in real application scenarios.
This was the first evaluation that took place, and important feedback was
received.
Section 5 contains the user study that was performed as part of the VERITAS
Multimodal Interfaces Manager (VerMIM) testing procedure. In this section, the
user study parameters, the subjects' properties and the application scenario are
presented, together with the quantitative and qualitative results.
Section 6 covers the toolkit refinement actions that were necessary in order to
successfully perform the user study tests, i.e. mostly for the second evaluation
process. The limitations of the VerMIM system are also discussed in this section.
Finally, the conclusions drawn from the VerMIM framework tests are discussed in
Section 7.


1 Introduction
This document is the third and final VERITAS deliverable of WP2.8. It is
inextricably tied to deliverables D2.8.1 [1] and D2.8.2 [2] and continues their
work by describing the testing procedure followed to evaluate the performance and
acceptance of the Multimodal Interfaces Toolset.
The Multimodal Interfaces toolset and its basic tool, the VERITAS Multimodal
Interfaces Manager (VerMIM¹ for short), have a two-fold purpose:
a) To integrate into one entity a list of multimodal tools (or modules) that
help an impaired (virtual) user, or a (virtual) user with restricted
capabilities, to interact with the application under development. This
entity is called VerMIM; it has been properly integrated with the rest of
the VERITAS simulation platform and provides solutions where the typical
unimodal ways of interaction fail.
b) To automatically select the proper modality tools that have to be
activated in order to run specific scenarios in which a virtual or real
impaired user interacts with applications. This procedure is called the
modality compensation process [2] and takes into account the Virtual
User Models (VUM) created in SP1 as well as the Multimodal Interaction
models created as part of A2.8.2.
Following an iterative approach, two main test sessions were planned and
executed to validate the developed Multimodal Interaction Tools and to verify the
effectiveness of the overall VERITAS framework in testing multimodal
interfaces. The first session was a heuristic evaluation by experts in the
multimodal interaction field and the second was a typical user study. The
design of the VerMIM toolset has been refined and optimized according to the
outcomes of these tests.

1.1 VerMIM and the VERITAS framework


The connection of the VerMIM with the rest of the VERITAS tools is briefly
described in this subsection. As depicted in Figure 1, VerMIM is the basic
architectural component of the Multimodal Interfaces toolset. VerMIM receives
the Virtual User and Simulation Models information and selects the appropriate
Multimodal Interfaces Models when needed. It is also responsible for providing
alternative interfaces to the rest of the VERITAS SP2 tools that perform the
impairment simulation.
The Multimodal Interfaces Manager receives three kinds of input data, all of
which are described in UsiXML formatted files. These are:
• The Virtual User Model (VUM) files: these files contain information that
  describes the disabilities and capabilities of the virtual user. The
  disability model data are needed to find which of the user's modalities are
  affected, in order to perform the modality compensation and replacement
  process or to select the appropriate multimodal interfaces.
• The Simulation Model files: these files describe in an abstract manner
  the sequence of the tasks of an application scenario. The simulation
  model scenario results in one task sequence that is specific to each
  application area. The VerMIM uses this file to analyse the modalities
  required in each task and then, using the Multimodal Interfaces Models,
  produces alternative task sequences that depend on alternative user
  modalities.
• The Multimodal Interface Models: these UsiXML files are used to
  produce alternative task sequences to the default sequence described in
  the Simulation Model. Every Multimodal Interface Model contains
  alternative task paths of a simple task. Moreover, it contains information
  about the required modalities of each task.
As also depicted in Figure 1, the VerMIM manages eight modules, each one
applied to a different modality domain and used in different impairment
situations.

¹ VerMIM and Multimodal Interfaces Toolset Manager are used interchangeably in
this document unless stated otherwise.
[Figure 1 (block diagram). Veritas Simulation Component: Core Simulation
Platform (VerSEd-3D/GUI, VerSim-3D/GUI) and Immersive Simulation Platform
(IVerSim-3D, VerIM). Inputs to the Multimodal Interfaces Manager (VerMIM):
Virtual User Model (UsiXML), Simulation Model (UsiXML), Multimodal Interfaces
Models Repository. Modules managed by VerMIM: Speech Recognition, Haptics,
Speech Synthesis, Sign Language Synthesis, Symbolic, Screen Reader, Screen
Magnifier, Head Tracking.]

Figure 1: The VerMIM connections with the rest of the VERITAS tools. As depicted,
the VerMIM communicates with several modality sub-modules (listed in the right
rectangle column).
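
The selection of an alternative task path based on the affected modalities can be illustrated with a minimal sketch. It does not reproduce the VerMIM implementation or the UsiXML file format: the function `select_task_path`, the modality names and the example task paths are all hypothetical, and plain Python sets stand in for the information that VerMIM would read from the Virtual User Model and the Multimodal Interface Models.

```python
def select_task_path(alternative_paths, affected_modalities):
    """Return the first path whose required modalities are all unaffected.

    alternative_paths: list of (path_name, set_of_required_modalities),
    ordered by preference, with the default path first.
    affected_modalities: set of modalities the (virtual) user cannot rely on.
    """
    for name, required in alternative_paths:
        if required.isdisjoint(affected_modalities):
            return name
    return None  # no alternative avoids the affected modalities


# Hypothetical alternative paths for one simple task ("switch on the light").
paths = [
    ("click button with mouse", {"vision", "motor"}),
    ("head tracking with on-screen cursor", {"vision", "head motion"}),
    ("voice command with spoken feedback", {"speech", "hearing"}),
]

# Hypothetical sets of affected modalities, as they might be derived from a VUM.
blind_user = {"vision"}
upper_limb_paralysis = {"motor"}

choice_blind = select_task_path(paths, blind_user)
choice_motor = select_task_path(paths, upper_limb_paralysis)
print(choice_blind)
print(choice_motor)
```

Ordering the paths by preference keeps the default sequence whenever the user can perform it, and falls back to an alternative only when a required modality is affected.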


2 Acceptability and Usability Indicators


The indicators of usability and acceptability are independent of UI modalities
and can be applied as they are to different target groups. Design guidelines for
specific target groups (e.g. elderly people or people with a specific disability)
are usually based on experience and are not standardized. Usability and
acceptability can be boiled down to two main indicators: performance and
satisfaction.

2.1 Generic User Interfaces Indicators


The generic user interface indicators can be split into two main categories: a)
user performance and b) user satisfaction. The first category includes indicators
that show how fast, accurate and efficient a user interface is, while the second
category indicates how easy, useful and satisfactory the UI is. Both categories
have to be studied in order to provide a holistic review of the UI indicators.

2.1.1 User Performance


There are five basic indicators measuring user performance:
1. Task success
2. Task completion time
3. Errors/accuracy
4. Efficiency
5. Learnability
These user performance indicators are going to be analysed in the following
paragraphs.

2.1.1.1 Task success


Success in completing a task is a very common usability indicator, which can be
calculated easily and for a wide variety of interfaces, including multimodal
interfaces. In order to measure success, tasks need to be well worded and
to have a concretely defined end state. Success criteria should be defined
carefully. Binary success (pass or fail) is the easiest way to collect data and
compare it. However, it is also possible to define levels of success. Levels of
success can reflect the degree to which a task was completed, how much of a
struggle it was to complete the task, or optional ways to accomplish a task
(e.g., success in using one modality but not the other). A common approach is to
use three levels of success: complete success, partial success and complete
failure. Each of these levels can have sublevels, e.g. complete success with
assistance and complete success without assistance. Success levels need to be
defined beforehand.
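As an illustration of coding success in levels, the following minimal Python
sketch tallies the share of trials at each of the three levels named above.
The enum and function names are hypothetical and chosen only for this example;
sublevels (e.g. with or without assistance) could be added in the same way.

```python
from collections import Counter
from enum import Enum


class Success(Enum):
    """Three success levels, defined before the test starts."""
    COMPLETE = "complete success"
    PARTIAL = "partial success"
    FAILURE = "complete failure"


def success_distribution(outcomes):
    """Return the fraction of trials observed at each success level."""
    if not outcomes:
        raise ValueError("at least one trial outcome is required")
    counts = Counter(outcomes)
    return {level: counts[level] / len(outcomes) for level in Success}


# Hypothetical outcomes of four trials of the same task.
outcomes = [Success.COMPLETE, Success.COMPLETE, Success.PARTIAL, Success.FAILURE]
distribution = success_distribution(outcomes)
```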

2.1.1.2 Task completion time


The time it takes a user to complete a task is an excellent measure of
performance. Usually, the faster a user completes a task, the better the
usability of the interface that was used to complete it. In order to
measure task completion time it is necessary to define a start and an end state
for the task; the time elapsed between the start and end states is the task
completion time. In many cases it is important to define a threshold within
which a task should be completed, as one does not want a user to try for too
long to complete a task.

2.1.1.3 Errors and accuracy


Errors are related to usability issues: incorrect or inaccurate actions
may lead to failure. Identifying and classifying errors and inaccuracies can
help in understanding underlying usability issues. Errors are part of user
performance, for example the number of errors made during the interaction and
completion of a task. What counts as an error or as inaccurate is task and
situation dependent (e.g., an incorrect or poor action by the user). Obvious
errors are those preventing the user from completing a task. As with the other
performance factors, it is necessary to define beforehand what an error is and
which errors are relevant in a given situation, for a given task and for a given
user. The frequency of errors is a factor revealing usability issues.

2.1.1.4 Efficiency
One could use task completion time (2.1.1.2) to measure efficiency, i.e., the
amount of effort required to complete a task. However, there is another, more
suitable way to measure efficiency. Typically, the number of actions and steps a
participant needs to complete a task is used to measure the amount of effort. In
order to measure cognitive and physical effort, it is important to identify the
actions to be measured and the start and end states of the task that is to be
completed; only successful tasks should be taken into account.
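The effort measure described above can be sketched in a few lines of Python.
The example averages the number of actions over successful trials only; the
(action_count, succeeded) pair layout and the function name are assumptions
made for this illustration.

```python
from statistics import mean


def mean_actions_on_success(trials):
    """Average number of actions per trial, counting successful trials only.

    `trials` is a list of (action_count, succeeded) pairs.
    """
    counts = [actions for actions, succeeded in trials if succeeded]
    if not counts:
        return None  # no successful trial, so effort is undefined
    return mean(counts)


# Hypothetical log: the failed trial (20 actions) is ignored.
trials = [(12, True), (20, False), (8, True), (10, True)]
effort = mean_actions_on_success(trials)
```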

2.1.1.5 Learnability
New products require some amount of learning. Learning happens as experience
increases; however, it can be time consuming. Learnability can be measured by
looking at how much time it takes to become proficient with an interface (e.g.,
at completing a task with the interface). Learnability is very important for
interfaces and tasks that are supposed to be used regularly over a long period.
For example, when something is learned it can become a habit, and habits usually
require less conscious interaction and thus less cognitive effort. For
multimodal interfaces this can be crucial. For example, if a person is
proficient in using a joystick, this person will be able to combine joystick
interaction with an additional interaction modality. In order to measure
learnability, data has to be collected multiple times. The expected frequency of
use should serve as a basis for how often data should be collected. Learnability
can be measured by comparing the performance data (e.g. efficiency, errors, task
completion time and task success) of repeated measurements.
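One simple way to compare repeated measurements is to express the change
between the first and the last session as a fraction of the first. The Python
sketch below does this for task completion times; the function name and the
sample values are hypothetical, and other performance data (errors,
efficiency) could be compared in the same manner.

```python
def relative_improvement(session_times):
    """Fractional reduction in completion time from first to last session."""
    if len(session_times) < 2:
        raise ValueError("at least two sessions are required")
    first, last = session_times[0], session_times[-1]
    return (first - last) / first


# Hypothetical mean completion times (seconds) over four sessions.
times = [80.0, 62.0, 50.0, 44.0]
gain = relative_improvement(times)  # (80 - 44) / 80
```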

2.1.2 User Satisfaction


Satisfaction is a self-reported indicator. Common satisfaction indicators and
corresponding questionnaires are:
1. ease of use
2. expectation measure
3. system usability scale (SUS)
4. user interface satisfaction (QUIS)
5. usefulness, satisfaction and ease of use (USE)


There are many more factors that can be measured through self-reporting and
may be relevant for the usability of and user experience with an interface, e.g.
level of perceived pain, trust, aesthetics, emotions, fatigue, stress, workload,
etc.

2.1.2.1 Ease of use


Asking users to report the ease of use is probably the most common approach
to measuring how easy or difficult a task was. A common choice is a
traditional Likert scale (e.g. 1 = strongly disagree, 3 = neither agree nor
disagree, 5 = strongly agree) applied to the statement "This task was easy to
complete".

2.1.2.2 Expectation measure


Another way to measure satisfaction is to assess subjective reactions [3], that
is, how easy or difficult the task was in comparison to what was expected by the
user.

2.1.2.3 System usability scale (SUS)


The System Usability Scale consists of 10 items for which users rate their level
of agreement on a 5-point scale [8]. A technique to combine the ratings into one
value is also presented.
More specifically, the SUS is a simple, ten-item attitude scale giving a global
view of subjective assessments of usability. Usability can be assessed only by
taking into account the context of use of the system, i.e., who is using the
system, what they are using it for, and the environment in which they are using
it. Furthermore, measurements of usability have several different aspects:
Effectiveness: can users successfully achieve their objectives,
Efficiency: how much effort and resource is expended in achieving those
objectives,
Satisfaction: whether the experience with the system was satisfactory.
Measures of effectiveness and efficiency are also context specific.
Effectiveness in using a system for controlling a continuous industrial process
would generally be measured in very different terms from, say, effectiveness in
using a text editor. It can be argued that, given a sufficiently high-level
definition of subjective assessments of usability, comparisons can be made
between systems.
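The technique for combining the ten ratings into one value can be sketched as
follows. The rule used here is the standard SUS scoring rule as generally
published (odd-numbered items contribute the rating minus 1, even-numbered
items contribute 5 minus the rating, and the sum is multiplied by 2.5); it is
stated from the SUS literature rather than from this deliverable, and the
function name is hypothetical.

```python
def sus_score(ratings):
    """Combine ten SUS item ratings (each 1-5) into a single 0-100 score."""
    if len(ratings) != 10 or any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("expected ten ratings, each between 1 and 5")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            total += rating - 1  # odd items are positively worded
        else:
            total += 5 - rating  # even items are negatively worded
    return total * 2.5


# A respondent who answers 3 ("neutral") to every item scores 50.
neutral = sus_score([3] * 10)
```

Note that the result is a score on a 0 to 100 scale, not a percentage of
users or of tasks.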

2.1.2.4 Questionnaire for User interface satisfaction (QUIS)


The QUIS consists of 27 items divided into five categories: overall reaction to
the software, screen, terminology/system information, learning and system
capabilities [10]. The QUIS was designed to assess users' subjective
satisfaction with specific aspects of the human-computer interface. The QUIS
team successfully addressed the reliability and validity problems found in other
satisfaction measures, creating a measure that is highly reliable across many
types of interfaces.

2.1.2.5 Usefulness, satisfaction and ease of use (USE)


The USE questionnaire consists of 30 items divided into four categories:
usefulness, satisfaction, ease of use and ease of learning [19]. USE stands for
Usefulness, Satisfaction, and Ease of use. These are the three dimensions that
emerged most strongly in the early development of the USE Questionnaire. For
many applications, usability appears to consist of Usefulness and Ease of Use,
and Usefulness and Ease of Use are correlated. Each factor in turn drives user
satisfaction and frequency of use. Users appear to have a good sense of what
is usable and what is not, and can apply their internal metrics across domains.

2.2 Indicators for Multimodal User Interfaces Addressed to People with Special Needs

In the following section, relevant interaction modalities are listed and the
usability and acceptability indicators are discussed with regard to these
modalities and to people with special needs. Subsequently, issues related to
mixing and combining modalities are summarized.

2.2.1 Speech- and Audio-based Interaction


Speech is a familiar interaction modality that is known from interpersonal
communication and is based on language. Speech-based input is a promising
interaction modality for people who have limited control over their hands and
arms or who need to interact hands-free [25]. Figure 2 presents an example of a
wearable device designed for mobile workers. Several studies have investigated
speech as an input and output modality. For example, Manaris et al. provided
access to all of the functionalities of a keyboard based on speech alone [20].
Issues in the usability and acceptability of speech-based interaction relate to
speech rate and prosody; prosody refers to the rise and fall of speech during
communication. The perceived loudness of a sound is individual; however, it can
be approximated by sound pressure levels (dB). Frequency levels are an
additional issue and are strongly related to aging.

Figure 2: Example of a speech and audio-based device to interact hands-free
while on the go [25].

The main indicator for the usability and acceptability of speech-based input is
the error rate. Error rates are highly related to the task that needs to be
completed. For example, Karat et al. found that speech-based navigation and
error correction are tasks that can be problematic when composing text documents
[15]. Speech is an interaction modality that relies heavily on context
information. Speech-based interaction is fundamentally complex; however, users
are trained in using speech for communication in interpersonal dialogs. There
are some differences in using speech for human-computer interaction. When
interacting with a computer, a user needs to be aware of how and when speech
recognition is activated and deactivated (e.g., push-to-talk). Furthermore,
users need to be aware of what they can say if the vocabulary and grammar are
restricted (which is why many interfaces include a "What can I say?" prompt).
Both speaker-independent and speaker-dependent speech recognition software is
available. While speaker-dependent software has a higher recognition rate and
lower error rates, it needs a training phase to adapt to the speaker.
Speaker-independent solutions have lower recognition rates; however, for some
situations and applications a training phase is not possible (e.g. multi-user
scenarios). When using a speaker-independent system, usability and user
acceptance can be improved by providing context-aware grammars. Furthermore,
appropriate acoustic models can often be set; e.g., for a car environment, there
are acoustic models that can recognize typical background noises and improve the
separation of the user's speech from the background noise.
In general, choosing appropriate computerized voices for speech-based
interaction tasks is difficult. Older adults in particular prefer
non-computerized voices or computerized voices that effectively mimic the
prosody of human speech. A common design guideline is to avoid high-pitched
female voices and to use a voice that is in the middle range, i.e., between 300
and 2500 Hz, and 10 dB louder than the background noise. Since the perceived
loudness of a sound is individual, allowing easy control over the volume is an
additional design guideline for speech- and audio-based interfaces. Error rates
are the standard indicator for analyzing the usability and acceptability of
speech-based input.

2.2.2 Tactile and Haptic Interaction


Tactile and haptic interaction, such as entering text via joysticks, trackballs,
touchpads or even exoskeletons (Figure 3), can take into account the needs of
user groups with motor and vision impairments (e.g. people with spinal cord
injuries, cerebral palsy, Parkinson's disease and muscular dystrophy). However,
since different impairments affect people's abilities in different ways (e.g.,
degeneration of voluntary muscle control, reduced strength, limited endurance,
reduced flexibility and slowed movements), designers have to consider the
specific impairments that affect users' abilities [22][26]. For example, people
with muscular dystrophy (MD) very often have accurate fine motor skills,
although they tend to be slow and prone to fatigue. In contrast, people with
cerebral palsy (CP) may have trouble acquiring targets due to tremors that often
intensify while reaching for the targets. People with Parkinson's disease have
similar issues with fine motor control. Designers have to incorporate tolerance
for accidental interaction when designing tactile and haptic interfaces for
people with special needs. Furthermore, designers have to consider that
designs (graphical and physical) may cause or exacerbate pain, discomfort, or
fatigue.


Figure 3: Example of haptic interface, an exoskeleton [13].

Tactile input, for example grasping, pulling or pushing a button, involves
finger, hand and whole-body movement, depending on the situation. From applying
the right pressure while holding a mouse to using a touch screen, motor control
is a key factor in tactile and haptic input. For many people with disabilities
motor control is an issue. The basic and most relevant indicators for the
assessment of usability and acceptability with respect to motor control are
response time (speed) and accuracy of movement (i.e., error rates due to
unintended interaction or misrecognition of performed gestures). Typical
indicators for user acceptance are the levels of discomfort, pain, frustration
and fatigue. Many tactile and haptic input methods are novel to many users; it
is therefore suggested to consider learning rates, e.g., based on the power law
of learning [9][21]. Such performance models allow the future performance of
users with the evaluated designs to be predicted. Related work on measuring the
user experience of people with special needs with tactile and haptic interaction
designs suggests that small changes can make big differences, e.g., from
bevelling edges to adjusting timeouts by milliseconds.
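To illustrate how a learning-rate model can predict future performance, the
Python sketch below fits a power law to completion times from successive
trials. It assumes the commonly used form T_n = T_1 * n^(-alpha), where T_n is
the time on the n-th trial; this form, the log-log least-squares fit and all
names are assumptions of the sketch, not details given in this document.

```python
import math


def fit_power_law(times):
    """Least-squares fit of T_n = T_1 * n**(-alpha) on log-log axes.

    Returns (t1, alpha). Requires at least two positive trial times.
    """
    if len(times) < 2:
        raise ValueError("at least two trials are required")
    xs = [math.log(n) for n in range(1, len(times) + 1)]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    t1 = math.exp(mean_y - slope * mean_x)
    return t1, -slope


def predict_time(t1, alpha, trial_number):
    """Predicted completion time on a given (1-based) trial."""
    return t1 * trial_number ** (-alpha)


# Synthetic data that follows the assumed law exactly (T_1 = 10 s, alpha = 0.5).
observed = [10.0 * n ** -0.5 for n in range(1, 7)]
t1, alpha = fit_power_law(observed)
predicted_trial_25 = predict_time(t1, alpha, 25)
```

With real measurements the fit will not be exact, so predictions should be
treated as rough estimates.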

2.2.3 Eye- and Head-tracking Controlled Interaction


Eye- and head-tracking controlled interaction (Figure 4) is an option for people
with spinal cord injuries or multiple sclerosis who have very limited control
over their hands but retain control over their heads and eyes [18]. In order to
provide usable and acceptable head-controlled interfaces, several challenges
have to be addressed; e.g., head-controlled interfaces very often have to be
calibrated to the individual user's characteristics. Designers have to keep in
mind that head-mounted interaction, such as text input, is slow and that
extended use can cause fatigue in the neck and related muscles. Eye-controlled
interfaces have been successfully used for target selection and navigation tasks
[28]. Related work suggests that designing for eye-controlled interfaces
requires new and specialized interaction solutions. A good indicator for the
usability of eye-control solutions is accuracy (e.g. in target selection tasks).


Figure 4: Example of an eye- and head-controlled interface [12].

2.2.4 Hand and Gesture Recognition Interaction


According to Kurtenbach and Hulteen [17], "A gesture is a motion of the body that
contains information. Waving goodbye is a gesture. Pressing a key on a
keyboard is not a gesture because the motion of a finger on its way to hitting a
key is neither observed nor significant. All that matters is which key was
pressed". This mode of interaction is used every day by thousands of people
with hearing disabilities who use a sign language to communicate. This is a
clear application of gesture recognition interaction, but real-time machine
recognition of sign language requires cameras with sufficient resolution, as
well as powerful processing equipment and algorithms to recognize the movement
of both the hands and the arms relative to the body.
One of the most challenging tasks in sign language recognition is the
analysis of spatial information containing the entities created during the sign
language discourse. This is needed to reduce the ambiguity of words that are
often difficult to translate [29]. Recent studies demonstrate the viability of
using a Kinect-based system for sign language recognition, like the one shown
in Figure 5 [35]. Using this kind of system would lower the cost of a sign
language translation system.

Figure 5: Sign language recognition using Kinect [35].

In addition, gesture recognition systems can be used not only by users who
employ a sign language to communicate but also to provide people with
physical disabilities with a new way of interacting with computers [16]. Users
must learn the gestures to interact with the systems. These gestures must fit
their skills and be easy to learn and repeat. Thus one of the indicators of
usability is the ease of learning and repeating the gesture, taking into account
that the user might become exhausted. Another indicator of the usability and
acceptability of the gesture interaction modality is the success rate in
recognition. If the system correctly recognizes 99% of the gestures, it will
have greater acceptance by the user than if it correctly recognizes only 50% of
the gestures made.

2.2.5 Vestibular Interaction


The vestibular sense, on which vestibular interaction relies, gives us the sense
of our position in space relative to gravity and provides information about
changes in the momentum of our body. The organ that performs this function is
part of the inner ear, and it is common for people with hearing loss to have
problems with balance [11]. This modality is used both for the input and output
of information. The most common systems using vestibular interaction are
simulators: vehicle, aircraft or even amusement rides. These systems are
intended to provide the user with a sense of movement: the user is subjected to
accelerations and position changes that simulate what they would
experience in reality.

Figure 6: Wii balance board used as assistive technology [31].


Vestibular interaction as an input modality has been developed using
accelerometer sensors, which detect changes in the relative position of the
user, or with balance-board type devices with pressure sensors. This type of
system, which was initially used for leisure and entertainment, has proven to be
a successful tool for the rehabilitation of users with different disabilities
[30], like the one shown in Figure 6. An indicator of usability when using this
type of modality is accuracy (e.g. when selecting a target in a Wii game
using the balance board).

2.2.6 Brain Controlled Interaction


One of the latest interaction methods being researched is the machine-mind
interaction modality. This mode of interaction, based on a brain-computer
interface, usually focuses on providing control and communication functions to
users with severe motor disabilities due to spinal cord injury or to
degenerative diseases such as amyotrophic lateral sclerosis, among others
[32]. Several studies at the Wadsworth Center BCI Laboratory in Albany, New
York, show that people with motor disabilities can control the amplitude of μ
and β parameters in the EEG rhythms over sensorimotor cortex and that these
rhythms can be used to control a cursor on a computer screen in one or two
dimensions [34].

Figure 7: Matt Nagle, the first person to ever be implanted with a BrainGate [27].
One of the biggest challenges facing this type of interaction is to eliminate
the need for surgery to implant the sensors that acquire the electrical signals
emitted by the brain (Figure 7). This need means that the method is considered a
last resort when other modalities of interaction are not possible. Apart from
requiring an invasive interface, the interface needs to be calibrated and the
user trained in order to be able to coordinate vision with the focus needed to
move the cursor with enough accuracy. The accuracy of cursor movement is
the main indicator of success in using this type of interaction modality.

2.2.7 Visual Interaction


Vision is undoubtedly the most widely used modality for information input
and often serves to support other modalities such as gestures or brain-computer
interaction. There are numerous studies, guidelines and standards
that aim to improve the accessibility of interfaces that are based primarily on
the visual modality and thus to improve and evaluate their usability. The best
known of these are the Web Content Accessibility Guidelines (WCAG) 2.0 [4],
which describe the indicators of web accessibility. The main indicators that
enable us to assess the usability of a graphical user interface are: contrast
(which allows users to identify shapes, objects and text), colour (e.g. avoiding
confusion in people with colour blindness), the size of fonts, shapes and
objects, the layout of graphical objects and the visual context (for example,
whether there is enough light to see the object). There are some useful tools
for assessing these indicators, such as the Colour Contrast Analyser 2.2 [33],
used primarily for checking foreground and background colour combinations to
determine whether they provide good colour visibility.
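The contrast indicator can be computed directly. The Python sketch below
follows the contrast ratio definition of WCAG 2.0 (relative luminance of
sRGB colours, ratio (L1 + 0.05) / (L2 + 0.05)); the function names are
hypothetical, and the sketch is not a description of how the Colour Contrast
Analyser itself is implemented.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.0)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colours, from 1 (equal) up to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
```

WCAG 2.0 level AA asks for a ratio of at least 4.5:1 for normal-sized text,
so a check such as `contrast_ratio(fg, bg) >= 4.5` can serve as a pass/fail
indicator.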

2.2.8 Special Issues concerning the Multimodal Interfaces Indicators


Allowing multiple modalities to accomplish a task can help users with special
needs. Some issues related to habituation and learnability should be considered
when using multiple modalities.

2.2.8.1 Vision in multimodal interfaces


Although information presentation is multimodal, vision is a fundamental way in
which information is presented to users. In vision-based interaction, colour,
size, layout and visual context are important indicators for usability and
acceptability. Not being able to see important aspects of the screen can
cause frustration. Many interfaces and modalities implicitly require vision
capabilities for input; i.e., vision is in many situations necessary to locate
interactive devices and understand their meaning. For example, when speaking
into a microphone, a sighted person will locate the microphone using vision.
The same applies to using a labelled keyboard.

2.2.8.2 Mixing modalities for context support


One successful way to design usable and acceptable multimodal interfaces is to
provide context information through additional modalities. For example, in
vision- and speech-based multimodal interaction, it is common to use additional
and complementary visual information to ease the load imposed by auditory
information.

2.2.8.3 Habituation and learnability issues


Having multiple modalities to complete a task provides a user with options;
however, allowing multiple modalities to complete the same task can prevent or
complicate the development of a habit and influence learnability. Developing
habits can improve the usability and acceptability of an interface, because
interaction becomes automated and less cognitive effort is needed; users gain
more resources to focus on other activities (e.g., surrounding context).


3 In Depth Analysis of the Usability and Acceptability Indicators

This section presents indicators for usability and acceptability. In the
previous sections the interaction modalities were listed with regard to users
with special needs. The indicators are applicable to users with special needs
and to multimodal interfaces. Indicators are grouped into performance and user
satisfaction. User satisfaction is mostly measured through self-reporting, while
performance measures can be acquired automatically. Although identifying
usability issues is mostly considered purely qualitative, metrics can be used to
identify usability issues automatically. Table 1 presents the indicators for
performance that can be measured automatically.
Table 1: Usability and Acceptability Indicators

Name of Indicator: Summary

Task success: Task success is a strong and clear indicator for the usability
    and acceptability of an interface. It is usually easy to measure if an end
    state for the task is clearly defined.
    Binary coding (i.e., success and failure) is easy to measure; however,
    depending on the task it can be useful to use more levels of success (i.e.,
    success with assistance, success without assistance, failure).
    The usability and acceptability of an interface can be good in the long
    term even if users fail to accomplish a task in the first trials but learn
    to use the interface efficiently later.
    Task success is relevant for all kinds of tasks. For tasks that have to be
    completed often, learnability is also very important. For tasks that have
    to be completed rarely, task success should have a high priority.

Task completion time: The start and end states have to be defined clearly. The
    faster a task is completed, the better the usability and acceptability
    usually are. However, there are exceptions; e.g., enjoyment in using an
    interface might cause longer task completion times.
    If the task under consideration is time critical, task completion time
    should have high priority for the usability and acceptability of an
    interface.

Errors/accuracy: Errors strongly relate to usability and acceptability issues.
    It is important to clearly define beforehand what counts as an error. In
    addition, errors can be weighted (e.g., an error that causes failure in
    completing a task, or an error that causes additional time to complete a
    task successfully).
    Depending on the task (e.g., playing a game, making an emergency call) and
    the interface's error tolerance, errors can have high or low priority.

Efficiency: Efficiency and task completion time are strongly related; however,
    the amount of effort (e.g. the number of activities performed to achieve
    task success) is also a measure of efficiency.
    The number of actions that a user has to perform in order to complete a
    task is an indicator for efficiency and thus for usability and
    acceptability.
    Efficiency/effort is a fitting measure for interfaces that are used to
    complete complex tasks that can be broken down into subtasks.

Learnability: Learnability is more cumbersome to measure. Repeated
    measurements have to be taken and compared. Defining the right amount of
    time between measurements is important.
    Learnability is an important indicator for usability and acceptability if
    the interface is supposed to be used regularly (e.g., on a daily or weekly
    basis).

Automatically measuring user satisfaction is difficult, since user satisfaction
is usually a self-reported metric. However, one easy way of identifying issues
automatically is to require users to provide verbatim comments after completing
a task or when an error occurs.
In any case, the two questionnaires most widely used by the research community
to assess usability, usefulness, satisfaction and ease of use are the SUS and
USE questionnaires, mainly because of their generic measuring approach. They
are described in Table 2.
Table 2: Most widely used questionnaires regarding usability, usefulness,
satisfaction and ease of use.

Self-reported metric for usability: The best and easiest way to assess
    usability using questionnaires is the SUS questionnaire. It consists of
    only 10 items and has been used broadly with success in the past.

Usefulness, satisfaction and ease of use (USE): The USE questionnaire is a
    good way to assess user satisfaction.

3.1 Recommendations for Modalities


Some indicators are more relevant for specific interaction modalities. In Table 3
modalities are listed and the indicators that are more important for the modality
are prioritized.


Table 3: Usability and acceptability indicator recommendations considering
modalities

Name of modality: Prioritization of indicators

Speech and audio: In addition to task success, error rates are the standard
    indicator for analyzing the usability and acceptability of speech-based
    input.
    For speech-based interfaces, combine SUS ratings with performance ratings
    (task success, task completion time, error rate, efficiency) and give error
    rates double weight.

Tactile and haptic: In addition to task success, efficiency (i.e., task
    completion time and effort) is a standard indicator for the usability and
    acceptability of tactile and haptic interfaces.
    For new kinds of tactile and haptic interfaces learnability is also a very
    important indicator.

Eye- and head-controlled: If users are not used to interacting through an eye-
    and head-controlled modality, learnability is important. For experienced
    users efficiency and task success are the most important indicators.

Vestibular: If users are not used to vestibular interaction, learnability is
    important. For experienced users efficiency and task success are the most
    important indicators.

3.2 Recommendations for Combined and Single Usability and Acceptability Indicators

Very often metrics from more than one usability and acceptability indicator can
be collected (e.g., task completion time and error rates); however, the overall
scores are of greater interest. There are multiple ways to combine metrics in
order to get an overall view.
3.2.1 Using target goals
One way to combine metrics is to compare collected data to a target goal; for
example, the target goal is achieved if the task completion time is less than 50
seconds and fewer than 3 errors are made; otherwise it is not achieved.
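The target-goal check from this example can be written as a single predicate.
The Python sketch below uses the thresholds quoted above as defaults; the
function and parameter names are hypothetical.

```python
def meets_target(completion_time_s, errors, max_time_s=50.0, max_errors=3):
    """True if the trial took less than max_time_s and had fewer than max_errors."""
    return completion_time_s < max_time_s and errors < max_errors


# Hypothetical trials as (completion time in seconds, number of errors).
trials = [(42.0, 2), (42.0, 3), (55.0, 0)]
achieved = [meets_target(t, e) for t, e in trials]
share_achieved = sum(achieved) / len(achieved)
```

The share of trials that achieve the goal can then be reported as the
combined score.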
3.2.2 Using percentages
If it is difficult or not possible to set appropriate target goals, another way
to combine metrics is to use percentages; that is, the different scales of the
different indicators are converted into percentages. One guideline for dealing
with time data is to use the fastest time in the trials (or the time an expert
achieved with the system) as the best possible value and to express the other
times as percentages in relation to it (i.e., by dividing the best/shortest time
by the observed times). Once each scale is expressed as a percentage, it is
possible to use the overall average as a combined and single usability
indicator. Like time data, error data can be tricky to transform into
percentages; for example, if the desired minimum is 0 errors and no predefined
maximum number of errors exists. In that case it is possible to take the maximum
number of errors a user ever produced; error counts can be transformed into
percentages by dividing the observed number of errors by the maximum number of
errors ever observed and subtracting the result from 1. Depending on the task
and the modality, it can be reasonable to weight the data from the indicators
unequally; for example, error rates could be weighted twice as heavily as task
completion time for speech interfaces.
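The percentage transformations and the optional weighting can be sketched as
follows in Python. The function names and sample values are hypothetical; the
treatment of the case where no errors were ever observed (scored as 100%) is
an assumption of this sketch.

```python
def time_percent(observed_s, best_s):
    """Best (shortest) time divided by the observed time, as a percentage."""
    return 100.0 * best_s / observed_s


def error_percent(errors, max_errors):
    """1 minus (observed errors / maximum errors ever observed), as a percentage."""
    if max_errors == 0:
        return 100.0  # assumption: no errors ever observed counts as 100%
    return 100.0 * (1 - errors / max_errors)


def combined_percent(scores, weights=None):
    """Weighted mean of percentage scores (equal weights by default)."""
    if weights is None:
        weights = [1] * len(scores)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


# Hypothetical participant: 50 s against a best time of 40 s, 1 error of max 4.
t = time_percent(50.0, 40.0)
e = error_percent(1, 4)
equal = combined_percent([t, e])
speech = combined_percent([t, e], weights=[1, 2])  # errors weighted double
```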
3.2.3 Using z-scores
z-scores are a technique to transform scores from different scales (e.g.,
metrics from different indicators) so that they can be combined into one metric.
z-scores are based on the normal distribution and indicate how many standard
deviations a given score is away from the mean value.
The formula to transform scores from indicators into a z-score is as follows:

$$z = \frac{x - \mu}{\sigma} \qquad (1)$$

where $x$ is the value to be transformed, $\mu$ is the mean, and $\sigma$ is the
standard deviation.
One should keep in mind that, in order to use z-scores, the mean and standard
deviation need to be known (i.e., approximated from multiple observations).
As with percentages, z-scores can be averaged to obtain a single combined
value; however, this combined value cannot be treated as an overall usability
score. z-scores should be used in iterative tests to compare different sets of
data, for example one iteration of a design with another, or data from one user
group with data from another user group.
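As a minimal sketch, the z-score transformation (value minus mean, divided by the standard deviation) and the averaging of z-scores could look as follows in Python. The sample data are hypothetical, and using the sample standard deviation (`statistics.stdev`) rather than the population standard deviation is an assumption made here. Because a shorter time is better, the time z-scores are negated before averaging so that higher combined values always mean better performance.

```python
from statistics import mean, stdev


def z_scores(values):
    """Transform raw values into z-scores: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


# Hypothetical data for three participants
times = [40.0, 50.0, 60.0]         # seconds, lower is better
satisfaction = [5.0, 3.0, 4.0]     # rating, higher is better

z_time = z_scores(times)
z_sat = z_scores(satisfaction)

# Negate the time z-scores so that "higher is better" holds on both scales,
# then average the two z-scores of each participant.
combined = [(-zt + zs) / 2 for zt, zs in zip(z_time, z_sat)]
print(combined)
```

The combined values are only meaningful relative to each other within the same data set, which is why they suit comparisons between iterations or user groups rather than an absolute usability score.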
3.2.4 Using SUM: Single Usability Metric
SUM is a single usability metric that standardizes four usability metrics: a)
task success; b) task completion time; c) error rate; and d) post-task
satisfaction rating [23]. Jeff Sauro provides a SUM score calculator on his
website [24].
3.2.5 Summary
Using target goals is perhaps the best way to assess the usability of a system.
It is important to define the target goals beforehand. Goals should be task
specific and clearly defined; for example:
- at least 95% of the target users will be able to complete the task successfully
- the average SUS rating will be at least 70%
- the average number of errors will be at most 3

If for any reason defining goals becomes difficult, it is recommended to compare
the performance metrics to the metrics achieved by an expert, or to the average
value obtained from testing with multiple experts.
In addition to the methods described above for combining different values
into a single usability score, there are also explorative approaches (i.e., using
a graphical presentation); for example, usability scorecards present the results
of multiple metrics graphically in a summary chart.
3.3 VERITAS Task Examples
Usability and acceptability indicators and their measurement, i.e., performance
and satisfaction (task success, task completion time, error/accuracy,
learnability, frustration, trust, etc.), are in general mostly task independent.
Nevertheless, the following table describes the indicators for a few exemplary
VERITAS project tasks.
Table 4: Examples of usability and acceptability indicators for VERITAS project tasks
Task Comment on indicators
Walk: Indicators for the usability of an interface to complete the task "walk"
are:
- Task success: The user has arrived at the end state (i.e., the user has
  arrived at the target location).
- Task completion time: Speed (e.g., meters per minute) of moving from start
  location A to target location B.
- Errors/accuracy: For example, the number of times the user left a predefined
  path.
- Efficiency: Number of user actions with the interface (e.g., steps, button
  presses, navigation commands) needed to successfully arrive at target
  location B.
- Learnability: Analyze a specific performance metric, e.g., task completion
  time, number of errors, or number of actions. If the user shows no
  improvement after repeating the task x times, the user has learned as much as
  he or she can and there is not much room for improvement. The difference
  between the first and the last trial indicates how much learning must occur
  until maximum performance is reached.
- Self-reports: Usability score (SUS), user satisfaction (USE).
See: Indicators for the usability of an interface to complete the task "see"
are:
- Task success: The user has arrived at the end state (i.e., the user has
  identified x items, such as written words and images, within a given time).
- Task completion time: Speed (e.g., number of recognized items per minute) of
  identifying items.
- Errors/accuracy: Number of items the user could not identify.
- Efficiency: Number of user actions with the interface (e.g., steps, button
  presses, navigation commands) needed to successfully identify items.
- Learnability: Analyze a specific performance metric, e.g., task completion
  time, number of errors, or number of actions. If the user shows no
  improvement after repeating the task x times, the user has learned as much as
  he or she can and there is not much room for improvement. The difference
  between the first and the last trial indicates how much learning must occur
  until maximum performance is reached.
- Self-reports: Usability score (SUS), user satisfaction (USE).
Hear: Indicators for the usability of an interface to complete the task "hear"
are similar to those of the task "see":
- Task success: The user has arrived at the end state (i.e., the user has
  identified x items, such as spoken words and sounds, within a given time).
- Task completion time: Speed (e.g., number of recognized items per minute) of
  identifying items.
- Errors/accuracy: For example, the number of items the user could not
  identify.
- Efficiency: Number of user actions with the interface (e.g., steps, button
  presses, navigation commands) needed to successfully identify items.
- Learnability: Analyze a specific performance metric, e.g., task completion
  time, number of errors, or number of actions. If the user shows no
  improvement after repeating the task x times, the user has learned as much as
  he or she can and there is not much room for improvement. The difference
  between the first and the last trial indicates how much learning must occur
  until maximum performance is reached.
- Self-reports: Usability score (SUS), user satisfaction (USE).
Grasp door handle: Indicators for the usability of an interface to complete the
task "grasp door handle" are:
- Task success: The user has arrived at the end state (i.e., one action away
  from using the door handle).
- Task completion time: Time (e.g., in seconds) until the end state is reached.
- Errors/accuracy: For example, the number of failed attempts to reach the end
  state in a given time (e.g., 1 minute).
- Efficiency: Number of user actions with the interface (e.g., steps, button
  presses, navigation commands) needed to successfully reach the end state.
- Learnability: Analyze a specific performance metric, e.g., task completion
  time, number of errors, or number of actions. If the user shows no
  improvement after repeating the task x times, the user has learned as much as
  he or she can and there is not much room for improvement. The difference
  between the first and the last trial indicates how much learning must occur
  until maximum performance is reached.
- Self-reports: Usability score (SUS), user satisfaction (USE).
Pull door handle: Indicators for the usability of an interface to complete the
task "pull door handle" are similar to those of the task "grasp door handle":
- Task success: The user has arrived at the end state (i.e., the user has moved
  the door to a predefined position or angle).
- Task completion time: Time (e.g., in seconds) until the end state is reached.
- Errors/accuracy: For example, the number of failed attempts to reach the end
  state in a given time (e.g., 1 minute).
- Efficiency: Number of user actions with the interface (e.g., steps, button
  presses, navigation commands) needed to successfully reach the end state.
- Learnability: Analyze a specific performance metric, e.g., task completion
  time, number of errors, or number of actions. If the user shows no
  improvement after repeating the task x times, the user has learned as much as
  he or she can and there is not much room for improvement. The difference
  between the first and the last trial indicates how much learning must occur
  until maximum performance is reached.
- Self-reports: Usability score (SUS), user satisfaction (USE).
Sit car seat: Indicators for the usability of an interface to complete the task
"sit car seat" are:
- Task success: The user has arrived at the end state (i.e., the user is seated
  in the car seat).
- Task completion time: Time (e.g., in seconds) until the end state is reached.
- Errors/accuracy: For example, the number of failed attempts to reach the end
  state in a given time (e.g., 3 minutes).
- Efficiency: Number of user actions with the interface (e.g., steps, button
  presses, navigation commands) needed to successfully reach the end state.
- Learnability: Analyze a specific performance metric, e.g., task completion
  time, number of errors, or number of actions. If the user shows no
  improvement after repeating the task x times, the user has learned as much as
  he or she can and there is not much room for improvement. The difference
  between the first and the last trial indicates how much learning must occur
  until maximum performance is reached.
- Self-reports: Usability score (SUS), user satisfaction (USE).
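
The learnability indicator described in the table above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the sample times and the 5% relative-improvement threshold used to decide that the user "shows no improvement" are hypothetical and not defined by the VERITAS project.

```python
def learning_gain(trial_values):
    """Difference between the first and the last trial of a metric."""
    return trial_values[0] - trial_values[-1]


def plateau_trial(trial_values, tolerance=0.05):
    """Return the 1-based trial number at which improvement levelled off.

    A plateau is assumed when the relative improvement from one trial to the
    next falls below `tolerance` (a hypothetical threshold). Returns None if
    the metric is still improving in the last trial.
    """
    for i in range(1, len(trial_values)):
        prev, cur = trial_values[i - 1], trial_values[i]
        if prev > 0 and (prev - cur) / prev < tolerance:
            return i  # trial i (1-based) was the last with a real improvement
    return None


# Hypothetical task completion times (seconds) over five repetitions
times = [60.0, 45.0, 38.0, 37.0, 37.0]

gain = learning_gain(times)
plateau = plateau_trial(times)
print(gain, plateau)
```

The same functions can be applied to the number of errors or the number of actions, since all three metrics decrease as the user learns.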


4 Heuristic Evaluation of the VERITAS Multimodal Interfaces Toolset
This section presents the heuristic evaluation of the VERITAS Multimodal
Interfaces Manager Tool (VerMIM), which was conducted by five Human Computer
Interaction and Usability experts. The evaluation focuses on the combination of
different user models, represented through handicaps, with the particular
interaction modality chosen by the VerMIM tool. This is the first evaluation in
the iterative testing of the VerMIM (the second evaluation is the user study,
which is described in Section 5).
In the following, the methodology of heuristic evaluations is defined first.
Then the purpose of the expert study is stated and an overview of the
evaluation process is given. Finally, the results of all expert answers are
analysed and discussed. We conclude with a summary of the findings which,
together with the findings of the user tests, should provide a roadmap for
future improvement of the VerMIM evaluator.
It should be noted that in the heuristic evaluation the experts' opinions were
much stricter than those in the user study of Section 5 (in which non-expert
users took part). As will be shown, the evaluation process detected many flaws,
which proved beneficial, as it helped considerably in the refinement of the
VerMIM product.

4.1 Introduction
A heuristic evaluation is used to reveal usability problems in computer software
and focuses on the design of user interfaces (UI). It covers usability
parameters targeting the graphical design as well as the interaction design
itself; examples of such parameters are design consistency and intuitiveness.
The evaluation is done in a structured way by following a number of heuristics
based on the 10 usability heuristics of Nielsen (1994). Such heuristics are
mainly design and interaction principles that are used to capture the usability
and accessibility of a system.
To evaluate the VERITAS tools we selected five experts in Human Computer
Interaction. We designed different scenarios and adopted Nielsen's heuristics,
which are defined as follows:
1. Visibility of system status
2. Match between system and the real world
3. User control and freedom
4. Consistency and standards
5. Error prevention
6. Recognition rather than recall
7. Flexibility and efficiency of use
8. Aesthetic and minimalist design
9. Help users recognize, diagnose, and recover from errors
10. Help and documentation.
For the present evaluation we adapted the heuristics so that they are
appropriate to the VERITAS Multimodal Interaction Manager (VerMIM) and added
some categories of our own.
The purpose of the evaluation was, on the one hand, to detect usability problems
with the VerMIM evaluator and its user interface and, on the other hand, to
identify problems that relate to the usage of the VERITAS user models (VUMs) in
combination with the multimodal interaction tools, which are used to simulate a
specific handicap reflected in the VUM.

4.2 Process of the heuristic evaluation


In this subsection the process steps of the heuristic evaluation are defined.

4.2.1 Process
The evaluation process used for the heuristic evaluation of the multimodal
toolset is depicted in Figure 8. Conducting this process requires a number of
materials, including the setup of the system itself as well as the heuristics
that have been defined for the particular evaluation.

[Flow diagram with four consecutive steps:]
1. Welcome: welcoming the user; explaining what will be done in the evaluation
   session; signing the consent form for recording.
2. Introduction to the system: the system is briefly explained to the expert;
   the task sheet introducing the expert to the system is handed out.
3. Heuristic-based evaluation: the expert evaluates the system based on the
   predefined heuristics.
4. Closing: goodbye.

Figure 8 - Evaluation process of the heuristic evaluation

4.2.2 Material
In the following paragraphs, the material used for the heuristic evaluation of the
VerMIM Evaluator tool will be described.


4.2.2.1 Matter of evaluation: VerMIM


The evaluated software is exactly the same as used in the user study described
in Section 5. The evaluated VERITAS tools are the VerMIM with its evaluator
tool and the VerSim-GUI. For more detailed information see Section 5.

4.2.2.2 Required hardware


In order to run the VerMIM software, a PC running Microsoft Windows XP or
Windows 7 was required. The recommended hardware was a PC with an Intel Core i5
processor, Bluetooth, and a microphone.
We used a Lenovo ThinkPad with Windows 7 as the operating system in order to be
able to run the VerMIM toolset smoothly.

4.2.2.3 Required (assistive) tools


For the current evaluation three different assistive tools were used in
combination with the previously described hardware.
A Novint Falcon haptic device was used, which provides haptic feedback
(vibration) and makes it possible to navigate a cursor. The haptic feedback was
used to indicate interactive zones (e.g., buttons). This makes sense mainly for
vision-impaired users.
A head tracker was built from a WiiMote and an infrared emitter. The WiiMote was
installed on a tripod behind the laptop screen. The infrared emitter was a
diffused infrared LED that we mounted on ordinary protective goggles. This had
the advantage that the experts could wear their own glasses under the
head-tracker goggles.
For speech recognition and synthesis we used the microphone and speakers of the
Lenovo ThinkPad.
Depending on the VERITAS user model, different combinations of these multimodal
interaction tools were simulated.

4.2.2.4 Consent form


First, the experts had to sign a consent form in which they agreed to audio and
video recordings of the evaluation and to the publication of the whole
material. For further evaluation purposes we decided to make an audio recording
and take pictures during the evaluations.

4.2.2.5 Instructions
The experts received information about the evaluation itself and three task
sheets (see Appendix 8.1) explaining the workflow that the expert should follow
in order to get an overview of the VERITAS multimodal interaction tool set.
The first sheet contained general information about the evaluation and its
context, about the different disabilities simulated by the tool, and first
instructions.
The other sheets (sheets 2-4) gave the expert step-by-step descriptions of the
different scenarios. The interaction scenarios relate to different types of
handicaps and thus to the VERITAS user models that are applied using the VerMIM
tool.


The scenario description for the severe visual impairment condition (almost
blind) additionally contained the exact voice commands for the speech
recognition.

4.2.2.6 Evaluation guidelines


The evaluation guidelines (heuristics) to be applied by the experts were
compiled in an Excel sheet. Consulting this sheet, the experts had to evaluate
the VerMIM evaluator. The guidelines are divided into five categories, which are
specified in Table 5.
Table 5: Categories of our evaluation guidelines, their descriptions and subcategories
Name of category: VerMIM general questions
Description: This category addresses general questions concerning the first
impressions about the User Interface, the functionality and the validity and
correctness of the different user models.
Subcategories: None

Name of category: VerMIM GUI Design
Description: The GUI design category tries to evaluate continuity and intuitive
interactions.
Subcategories: None

Name of category: Modality: almost blind user / Visual impairment (severe)
Description: In this category the interactions of almost blind users via speech
condition and recognition are addressed.
Subcategories: VerMIM Evaluator Tool; VerMIM Simulation; System/Modality; Other
remarks

Name of category: Modality: Myopia / Visual impairment (mild)
Description: The Myopia category addresses the usability of the software for
visually impaired people with a low visual acuity via a haptic device.
Subcategories: VerMIM Evaluator Tool; Simulation; System/Modality; Other remarks

Name of category: Modality: upper limb paralysis / Motion impairment
Description: This category contains heuristics referring to the usability of the
program for people with motion impairments, e.g. upper limb paralysis, who use a
head tracker to interact with the system.
Subcategories: VerMIM Evaluator Tool; Simulation; System/Modality; Other remarks

4.2.2.7 Further Material


In order to be able to validate the experts' verbal comments later, we made
audio recordings of the evaluation sessions. For this purpose we used a
MacBook Pro and QuickTime Player 10.2, which was pre-installed on the MacBook.


The Excel sheet with the evaluation guidelines was also filled in on the MacBook
so that the collected data were immediately available in electronic form for
further analysis.

4.2.3 Scenarios
The four scenarios created for the heuristic evaluation are the same as those
used in the user study. Each consists of the interaction steps required to
accomplish the scenario in the particular evaluation session.
Table 6 gives an overview of the steps required and links the particular tasks
to the specific commands. The commands are the same for all scenarios; the
experts carry them out using the interaction devices appropriate to the selected
VERITAS user model.

Table 6: The exact task descriptions of the four scenarios. The experts received
these instructions printed on instruction sheets.
Instruction for the experts | Voice command
Change language to English | English language
Go to control room | Go to control room
Open television control | Control television
Turn on television | Turn on television
Increase volume | Increase volume
Go back to control | Go back
Open blinds control | Control blinds
Close blinds | Close blinds
Go back to control | Go back
Go back to Intro | Go back
Open settings | Go to settings
Activate outdoor control lights | Activate automatic entrance lights
Go to intro | Go back

4.2.4 Expert specifications


A total of five experts evaluated the VerMIM Toolset. They are all experts in
Human Computer Interaction and Usability and researchers at the ICT&S
Center Salzburg. Two of them are female and three male. Table 7 gives an
overview of their background and research focus.

Table 7: Specifications of our chosen experts: Education, specialisation, research focus

Education: Master degree in Psychology; Bachelor degree in Applied Computer
Science
Specialisation: Gaze-assisted Interaction, Usability Evaluation and Eye Tracking
Research focus: Usability evaluation; Gaze-assisted Interaction; Eye tracking

Education: Master degree in Communication studies
Specialisation: Focusing on the spatial aspects of ICT usage in the home context
for ICT adoption
Research focus: Social aspects in human computer interaction; User-centered
design and evaluation of secure and trustworthy composite services; Research
methods on user experience and user acceptance

Education: Bachelor degree in applied computer science; Pursuing master degree
Specialisation: User Interface Design; Ambient Interfaces
Research focus: Human Robot Interaction; Ambient Intelligence

Education: Master degree in psychology
Specialisation: Implicit measures of attitude
Research focus: Contextual Interfaces; Measurement of user emotion; Human Robot
Interaction; Psychological theory in HRI

Education: Bachelor degree in Applied Computer Science; Pursuing his master
degree
Specialisation: Software engineering, tools and prototyping for HCI research
Research focus: Software engineering, tools and prototyping for HCI research;
Experience sampling method in the automotive and mobile context

Our goal was to recruit HCI and usability experts with different backgrounds in
order to cover different usability aspects. The criteria were at least a
bachelor degree in Computer Science or a master degree in Psychology,
Communication Studies or another relevant field of study, as well as several
years of experience in the field of human computer interaction. Further
requirements were employment as a researcher at the ICT&S Center and experience
with heuristic evaluations. Furthermore, we aimed for a balanced gender ratio in
the evaluation.


4.2.5 Evaluation Process


During the whole evaluation the coordinator and an assistant took notes about
the experts' questions and any problems that occurred. The coordinator was
responsible for guiding the expert if he or she got lost or anything went wrong.
Additionally, an audio recording was made of the whole evaluation session.
Before performing the tasks, the experts received a sheet with information about
the toolset and the Smart Home Application, an explanation of the three
different disabilities, and a list of the three available methods of interacting
with the system: head tracker, Novint Falcon haptic device, and speech
recognition and production.
After they had confirmed that they had understood everything and had no more
questions, they were given the first task, which was to interact freely with the
SmartHome Application for five minutes in order to become familiar with the tool
and its functions. Although the SmartHome Application itself was not under
evaluation, this task was important because the VerMIM Evaluator integrates this
external application in the described evaluation scenario.
Then they had to run through the four scenarios. All scenarios were described
step by step on sheets given to the experts, with the note that the SmartHome
Application would only react to these defined steps (see appendix). First, they
had to interact with the system as a normal user without any disabilities, with
the mouse as input device. In the second scenario the VerMIM evaluator simulated
an almost blind user and the experts had to use speech recognition and
production. In the third scenario a mild visual impairment (myopia) was
simulated and the experts had to use the Novint Falcon haptic device. In the
last scenario they had to interact with the system as a motion-impaired user
with upper limb paralysis, using a head tracker and speech as interaction tools.
Here the evaluation coordinator had to give further instructions, because it was
unclear to the experts that they had to put on the glasses with the infrared
LEDs in order to use the head tracker. It was also important to mention that the
experts should wear their normal glasses while using the head tracker.
Furthermore, the hint was given that the position of the chair should not be
changed after the head tracker had been calibrated. For each scenario the VerMIM
evaluator records any wrong clicks or voice commands, as well as the time needed
by the expert to perform each task.


Figure 9: The expert RB running through Scenario 3, in which a mild visual
impairment is simulated. He is interacting with the computer using the Novint
Falcon haptic device and receives haptic feedback.

After running through the four scenarios the experts were instructed to evaluate
the system based on the heuristics collected in an Excel spreadsheet. During the
evaluation they were free to re-evaluate and verify different functionality and
steps using the VerMIM Evaluator. They were also told that in case of unclear
situations or open questions they were free to ask the researcher leading the
heuristic evaluation. The qualitative information obtained from the notes taken
during the evaluation and from the completed Excel sheet formed the basis of our
analysis.


Figure 10: Expert running through scenario 4, in which he experiences how a user
with motor impairments (upper limb paralysis) would interact with the SmartHome
tool. He is wearing glasses on which an infrared lamp is fixed. Behind the
laptop screen the WiiMote detects the movement of the infrared lamp and,
consequently, the head movement. By moving his head he is able to move the
cursor on the screen.

4.2.6 Analysis
An expert evaluation with five experts was conducted. As shown in Figure 11,
such an evaluation uncovers approximately 75% of the usability problems, which
is satisfactory for identifying issues in formative evaluations.

Figure 11 - Relation between number of evaluators and problems identified according to
Molich and Nielsen (1991)


After getting familiar with the system, the experts had the task of evaluating
the system based on the heuristics (Figure 12). The percentage scale refers to
the degree to which the experts agreed that the system fulfils the particular
heuristic: 100% is the best value, 0% the worst. For better understanding, we
coloured the percentages. The "Criteria" contain the particular evaluation
guideline (heuristic) that the evaluator should address during the heuristic
evaluation.
The evaluation guidelines are divided into the following categories:
- VerMIM: contains general questions that relate to the first impression,
  completeness of functionality, applicability, etc. of the VerMIM tool.
- VerMIM GUI Design: contains questions regarding the graphical user interface
  design of the VerMIM GUI. This category mainly relates to issues like
  intuitiveness, consistency, improvement potential, etc.
- Modality Blind User: describes the combination of the selected user model and
  the appropriate multimodal simulation of the handicap and the interaction
  capabilities defined.
- Modality Myopia: describes the combination of the selected user model and the
  appropriate multimodal simulation of the handicap and the interaction
  capabilities defined.
- Modality Motion impaired: describes the combination of the selected user model
  and the appropriate multimodal simulation of the handicap and the interaction
  capabilities defined.
The results were collected according to these categories and the particular
heuristics defined.

4.2.6.1 VerMIM
This category addresses general questions concerning the first impressions
about the User Interface, the functionality and the validity and correctness of the
different user models. In Figure 12 the results of this category are shown.


Figure 12: Percentage of agreement to the questions in the first category of the
evaluation guidelines: VerMIM.
The first impressions of the VerMIM Tool varied. One expert thinks that it is
good and easy to understand. Another is of the opinion that it is rather
technical: the terminal window in the background gives the engineer information
about the system he probably shouldn't have (or need); additionally, it is
unclear what the vision-impaired condition means. Nevertheless, the overall
impression is positive, as the experts rated the VerMIM tool with 83% on a
percentage scale.

Appropriateness for simulating multimodal interaction

Each modality has its strengths and weaknesses in different
contexts/interactions. It is not clear what the categories mean. Different
modalities might be useful for different interactions. Maybe other settings are
possible.
Simulating the same interaction with different modalities enables one to
compare the modalities. If the focus is on comparing different modalities for
the same interaction, the tool is very appropriate.
The experts rated the VerMIM tool with 68.8% on a percentage scale; the lower
percentage mostly reflects the restrictions of the particular scenario
implemented.

The combination of the simulation of a potential impairment and the provided modalities

The handling of the haptic device was not as ergonomic as needed; however, the
haptic feedback itself worked without problems. Sometimes the Wii remote did not
react appropriately to the head movement, so it was not possible to move the
cursor as the experts wanted to. Altogether, the linkage between the simulated
modalities and the particular impairment was judged positively, although
improvement potential was identified, especially regarding specific interaction
devices (mainly the Novint Falcon). The final rating: 63.8%.

There were some problems while using VerMIM

The program crashed once and lost all log data. Having the terminal window in
the background makes the program appear fragile. The head tracker did not work
well, so it was not possible to finish the task. Another problem was the speech
input: the program understood spoken input, but not in the fourth scene, when
the participants wore the glasses. The mouse moved out of the application
border. One has to click a button on the Novint Falcon haptic device to get
acoustic information. The device only vibrates when the cursor is on the correct
button, which is rather irritating and not intuitive. Another problem could be
the integration into other applications.
It would be good to have audio feedback, especially in the Myopia Mild Scene.
Audio and visual feedback would be helpful for orientation purposes. It would be
good if the Novint Falcon haptic device vibrated when the cursor is on any
button and additionally gave acoustic feedback when it is on the correct one.
Even though there are problems, the experts would use the VerMIM to experience
how a handicapped user feels and behaves, because it is good to see the
interface from the perspective of the user. The modalities shown seem to work
for that.
Another reason is the correctness of the particular user models. The visual
impairment is easily simulated and worked around by using speech control.
Every other scenario, such as body impairment, is hard to simulate. It would be
good if the screen were black in the blind scenario. It would be helpful if the
user models were more intuitive, because the experts did not understand how to
start the scenarios. Although such issues were identified, in most cases the
application worked as expected, which results in a rating of 85%.

4.2.6.2 VerMIM GUI Design


The GUI design category tries to evaluate continuity and intuitive interactions.
There are also questions about the quality of the GUI elements and the
understanding of the Evaluator. The results are depicted in Figure 13.
In the beginning it was not clear which scenarios were eligible. The text field
"virtual user model" shows the same information as the "virtual use type". The
text field "virtual user model" is not disabled, and without a manual it is not
clear what "ignore strict scenario steps and enable free navigation" exactly
means, or what "sign language synthesis" exactly refers to. The function naming
was not self-explanatory, so the experts usually had to ask the adviser. The
design is typical Windows (Figure 14), not very appealing and old-school.
However, the user interface is mostly consistent and thus quick to learn.


Figure 13: Percentage of agreement in the VerMIM GUI Design category.

Suggestions for improving the design of the VerMIM


Most of the users tried to select the multimodal interaction tools
themselves. It could make sense to allow such modifications in order to
explore the different ways in which handicapped users interact using different
interaction tools.
The VerMIM tool should provide the users with concrete instructions about the
particular scenarios that are simulated. Further improvements may include:
- The possibility to select the user's language (user of the VerMIM tool).
- Providing specific handicaps and/or making it possible to create specific
  impairments by choosing handicaps on various dimensions.
- The design could be improved, e.g. by showing the information on what to
  select in a step-by-step manner.
- The design of the VerMIM tool could be more up-to-date (modern).


Figure 14: Screenshot of the VerMIM Evaluation tool.

Appropriateness of the GUI for the particular task of simulating the effects of
a user's handicap:
Evaluating a single screen of a piece of software will not yield interesting
results, and the experts only ever saw the start screen. The tool can simulate
well, but in the case of vision-impaired usage it is not clear how to interact
with the VerMIM tool itself. The experts are not sure whether the simulation of
the motor impairment is appropriate. Multiple interaction possibilities for
this case would be desirable. Also, the head tracker did not always work as
intended. The general impression of the design of the VerMIM tool is acceptable,
although some improvements could be identified. This results in an overall
rating of 64.8%, which is acceptable.

4.2.7 Modality blind user


In this category the interactions of almost blind users via speech production
and recognition are addressed.

4.2.7.1 VerMIM Evaluator Tool

General problems with the VerMIM Evaluator Tool


According to the experts, the descriptions of the disabilities on the first
screen of the VerMIM Evaluator Tool are not comprehensible for a user without a
medical background, as they consist only of medical terms. Furthermore, it was
noted that information is missing about the problems that persons with the
impairments have to face, at what age the impairment usually appears and what
social stigmata arise together with the impairments.


Another criticized aspect was that the speech production and recognition stand
alone and are not combined with other interaction methods that are appropriate
for almost blind users.
One expert noted that it would be difficult to make the blind users familiar with
the structure and general commands to interact with the app.

Positive aspects
Generally the experts agreed with speech recognition and production as
appropriate interaction tools for almost blind users.

Suggestions for improvement


One suggestion for improvement was that a description of a person with this
impairment would be more illustrative. According to two of the experts, the
software would profit from studies carried out with blind users, who are the
real experts in this field. It could be interesting to investigate how the
users use their remaining skills to compensate for their disability.
Furthermore, they suggested combining speech production and recognition with
other appropriate interaction methods for almost blind users, for example
Braille typesetting or screen flashing in different colours.

4.2.7.2 Simulation

Criticism
The experts did not find the simulation very useful, as it simulates the
near-blindness only on the screen, while the keyboard, mouse and other
surroundings can still be seen very well. However, full impairment simulation
with this kind of immersion would need a special head-mounted display in order
to occlude what the user sees. The experts agreed that on-screen simulation
has potential for vision deficiencies.

Positive aspects
All of the experts agreed that the response times of the simulation are
appropriate to the task with a percentage of 96%.

Suggestions for improvement


According to one of the experts, it would have been more realistic just to close
the eyes, as the keyboard can still be seen with normal acuity when using the
simulation.

4.2.7.3 System/Modality

Criticism
One of the main problems detected by the experts is that the acoustic feedback
tells the user the current location in the system but not which actions are
available next. The question is how a blind user can know the next
possibilities. Furthermore, the wording of the speech control commands and the
audio feedback could be improved in terms of consistency.


The commands do not feel very familiar to the experts. They are too technical
and short, and they do not reflect natural interaction. The experts noted that
without the list of available commands they would have been lost in the system.

Positive aspects
The experts liked that the system responded to their actions, which provided
them with feedback about the success of what they did: when they gave a
command, the system confirmed that it had recognized and processed it.
Furthermore, the voice commands were judged as very concrete. The third
positive aspect they found is that the response time of the VerMIM is very
fast, so the menus can be navigated quickly.

Suggestions for improvement


The criticized unfamiliar speech commands could be improved by a more
natural way of talking. For example, instead of first saying "television
control" and then "turn on television", it would be easier, faster and more
familiar to say "I want you to turn on the TV", as Apple's Siri speech
recognition does.
Another important point of criticism was that the system does not provide
acoustic information about the next possible options. To improve this, it was
suggested that the system give this information audibly on demand, for
example when the user says "state".

4.2.8 Modality Myopia


The Myopia category addresses the usability of the software for visually
impaired people with low visual acuity via a haptic device. In the German
language the term "Myopia" is not very common, so the adviser had to explain it
first. Several problems occurred with the haptic device. The results are
depicted in Figure 15.

Definition of Myopia
"Nearsightedness, or myopia, as it is medically termed, is a vision condition in
which close objects are seen clearly, but objects farther away appear blurred.
Nearsightedness occurs if the eyeball is too long or the cornea, the clear front
cover of the eye, has too much curvature. As a result, the light entering the eye
isn't focused correctly and distant objects look blurred" (Association, 2013).

Understanding of the disability


Myopia is a medical term. All of the experts asked for the German word for
it, because it is not common. The word "Myopia" alone does not convey what
problems persons with this impairment have to face, which kinds of accidents
typically happen, at what age the impairment usually appears, what social
stigmata arise together with the impairment, etc. A description of a person
with this impairment might be more informative; a list of medical terms is not
helpful at all. The field of ametropia is so wide that it is difficult to
design one scenario that is adequate for all of its forms.


Figure 15: Percentage of agreement for the modality Myopia.

Working of the Interaction Tool


The Interaction Tool does not work properly. It would be better to have a cursor
with a magnifying glass; for example, Apple offers accessibility helpers with
this functionality in Mac OS.
Using the haptic device was very stressful; it did not work properly and was
not ergonomic. The experts did not know how to touch and use it. After the
explanation they were able to, but the haptic device was too cumbersome.


Suggestion for improvement


In the experts' opinion, it is better not to use the haptic device for the
simulation of myopia. People with myopia mostly use magnifiers, which allow
them to focus on the content of the screen. This should be considered in this
modality. Another point is the light: how well a person with myopia can see
sometimes depends on the surrounding light conditions, e.g. vision is better
during a bright day than when trying to read something by lamp light at night.
The haptic device does not fully reflect this issue.

4.2.9 Modality Motor impaired User


This category contains heuristics referring to the usability of the program for
people with motion impairments, e.g. upper limb paralysis, who use a head
tracker to interact with the system. The results are depicted in Figure 16.

Problems
Sometimes the head tracker did not work properly. Some of the experts did not
know how to work with the head tracker without any instructions. The speech
control did not always react, which is why some of them were not able to
finish the scenario.

Suggestions for Improvement


- There should be a calibration procedure that has an obvious beginning and an
  obvious end. It is not clear enough whether the calibration was successful
  or how it works. An extra manual should be given.
- The experts had the feeling that they had to move the complete upper
  part of their body instead of only the head. A combination of gaze-based and
  speech interaction is probably more appropriate; however, the cost of
  such a system would be considerably higher.
- It would be useful to be able to start with Speech Control.

Positive Aspects
The idea of the scenario is a good one. The experts got the impression that it is
easy to change the speed of the mouse according to what is needed for the
task. The use of head tracking for people who have severe motion impairments
and can only move their heads was much appreciated. Such technologies enable
these people to interact with systems, although with lower performance,
accuracy, etc.


Figure 16: Percentage of agreement for the Modality motor impairments.

4.3 Conclusion
The VerMIM tool is a first approach to providing the designer and developer
with means of simulating a handicapped user's behaviour. The heuristic
evaluation tries to identify problems with a system in order to improve it. Thus a
number of problems that could be addressed in future iterations of the tool were
identified. The results presented in this section were mainly qualitative, in
terms of applicability, usability and design.
Although some issues were identified regarding the VerMIM user interface, as
well as the appropriate presentation of the particular VERITAS user model, the
reaction of the experts to the VerMIM tool was positive. The tool was found
easy to use and it does not require extensive training.
Improvement potential regarding the combination of the selected modalities and
the chosen VERITAS user model was identified mainly in terms of flexibility.
This means that designers would like to try out different combinations of
multimodal interaction tools with different handicaps. Such flexibility would
not only simulate specific handicaps and workflows for the designer but also
provide a tool that supports the designer during the design process. In
general, the experts were impressed by the potential of the tool and agreed that
they would use it in their daily work if specific design issues appear.
Moreover, the experts were very strict in their judgement and managed to detect
many system flaws. This had a positive impact on the improvement of the VerMIM
tool, as it helped considerably in its refinement process.


5 Testing and Validation of the VERITAS Multimodal Interfaces Toolset: User Study
In this chapter, the procedure for testing the VERITAS Multimodal Interfaces
Manager tool (VerMIM) with non-expert test-subjects is described. A short
explanation of the integration of the VerMIM with the Simulation Platform, and
of the tool resulting from this activity, is given first.
For testing purposes, we used a scenario based on the interaction of the
test-user with a Smart Home interface application, which runs on a desktop PC
and is responsible for controlling various smart devices. The Smart Home
Application is depicted in Figure 17. The scenario sequence is almost identical
to the one followed in the expert heuristic evaluation of Section 4. However,
for completeness' sake, these scenario steps are also described. Finally, the
experimental results of the recording sessions, as well as any qualitative
metric results, are discussed in this chapter.

Figure 17: The main screen of the Smart Home Application interface, which was used
as the base for the user interaction scenario steps.

5.1 VerMIM Evaluator: Integrating VerMIM with the Simulation Platform
The integration of the VerMIM with the Simulation Platform is described
thoroughly in the VERITAS Deliverable D2.8.2 [2]. As stated in that document,
the multimodal interfaces toolset, and thus the VerMIM tool, cannot be run as a
stand-alone executable. The VerMIM tool is offered as an API in a dynamic link
library (DLL), so it must be integrated into an external application. For this
reason, a new tool has been created to allow the communication of the VerMIM
with the simulation platform and the tested application (i.e. the Smart Home
application). The name of this new tool is VerMIM Evaluator.

The VerMIM Evaluator tool was used as the testing tool that validated the
performance of the Multimodal Interfaces toolset. A screenshot of the main
screen of this tool is depicted in Figure 18.

Figure 18: The VerMIM Evaluator tool that was used for the user test recordings and the
management of the simulation platform, responsible for simulating the impairments to
the subjects.
The VerMIM Evaluator tool is responsible for the following actions:
1. To load the external application (in our case the Smart Home Controller)
and to manage the test-user's interactions with it via the several
multimodal tools.
2. To observe the test-user's actions and manage the scenario that has to be
followed. It should be noted that the scenario is described using the Task
Model structure, which has been used throughout as a scenario task-base
for the rest of the VERITAS tools.
3. To enable the communication with the VERITAS Simulation tools, either
GUI or 3D, which in turn enable the simulation of the various
impairments that are necessary for the tests. In our case the VerMIM
Evaluator communicates with the VerSim-GUI tool [38][37] (Figure 19),
because the Smart Home Controller is a 2D application which runs on a
desktop PC.


4. To select the Virtual User Model that the real test-user will be simulated
as.

Figure 19: The VerSim-GUI, which is communicating with the VerMIM Evaluator and is
responsible for simulating the various impairments to the test-users.

5. To activate and calibrate the corresponding multimodal interfaces tools
that will be used during the simulation session. The tool selection is
based on the Modality Compensation process, which is thoroughly
described in [2], Section 3. Different impairments (de)activate different
tools. For example, the speech recognition tool will be activated for users
with severe vision impairments, such as users with severe cataract or
glaucoma.
6. Finally, to record the user's actions, such as click events and voice
commands, and to provide the tester with a report of the durations and errors
that occurred during the simulation session. Such a scenario report is
depicted in Figure 20.
Besides the above functionality, the Evaluator tool also provides the
user-tester with some extra functions, which are:
a) Setting the ID of the user who is going to be tested; this is used only for
the storage of the log file and is not necessary for performing the testing
procedure.
b) An option to ignore the scenario steps and allow interaction with the
underlying application, e.g. the Smart Home Application. In this case the
user is free to navigate through the application without any scenario
restriction. This option can be used to familiarize the user with the
application while simulating the impairments.


Figure 20: The VerMIM Evaluator report dialog that is displayed after each test-simulation
session. Durations, errors and velocity of the mouse pointer (per scenario task) are
depicted. The user is also able to save these statistics, along with other metrics, to a file
for further processing.
Before describing the test configurations and the scenario steps, it is useful to
present the architecture of the integration of the VerMIM with its Evaluator tool
and with the Simulation Tool (VerSim-GUI).

5.1.1 Data flow and Connection with Simulation Platform


The data flow of the testing and validation process is depicted in Figure 21. As
depicted, three main VERITAS tools are involved in each testing procedure: a)
the VerMIM, b) the VerMIM Evaluator and c) the VerSim-GUI.


Figure 21: The testing procedure data flow. The integration with the VerSim tool is
necessary in order to perform the simulation of the impairment in the testing
environment. VerMIM and VerMIM Evaluator exchange several kinds of data during the
simulation session, such as current device state, task completion checks, etc.

The data flow starts with the Test-Coordinator, who is responsible for
the configuration of the test. The Coordinator selects the impairment category
from the VerMIM Evaluator interface, performs the calibration of any activated
device (if such action is needed) and then initiates the testing procedure. The
Coordinator is also responsible for guiding the test-user if she/he is lost or
anything goes wrong.
Moreover, the VerMIM Evaluator is responsible for loading the scenario to be
used in the testing procedure. As already mentioned, the scenario is stored in a
format which is similar to a task model.
Just after the Coordinator initiates the session, the VerMIM Evaluator reads the
Virtual User Model file which the Coordinator has selected and sends two data
signals:
1. The first signal is sent to the VerMIM, where the suitable external
devices have to be initiated. Which devices are activated and which are
not is defined by the modality compensation process, described
thoroughly in the Deliverable D2.8.2 [2]. Briefly, this procedure
involves a) parsing the VUM file; b) identifying the respective ICF
codes which apply to the specific impairment and c) matching each ICF
code to a modality tool.
2. The second signal is sent to the Simulation tool, in our case the
VerSim-GUI tool. With this procedure the path of the
Virtual User Model file is passed to the VerSim-GUI, which then starts
the simulation of the impairments described in it. It should be noted
that the VerSim-GUI already has the simulation platform integrated in it,
in order to perform the interactive visual, hearing and motor impairment
simulation.
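
The three steps of the modality compensation process named in the first signal
(parsing the VUM file, identifying ICF codes, matching codes to modality tools)
can be illustrated with a minimal sketch. It is not the VerMIM implementation:
the XML layout, the element and attribute names, the selected ICF codes and the
tool names are all assumptions made for illustration only; the actual mapping
is defined in D2.8.2 [2].

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup from ICF code to modality tools. Both the codes chosen
# and the tool names are illustrative assumptions, not the D2.8.2 mapping.
ICF_TO_TOOLS = {
    "b210": {"speech_recognition", "speech_synthesis"},  # seeing functions
    "b730": {"head_tracker", "speech_recognition"},      # muscle power functions
}


def parse_icf_codes(vum_xml):
    """Steps a) and b): read ICF codes from a simplified, made-up VUM layout."""
    root = ET.fromstring(vum_xml)
    return [node.get("icf") for node in root.iter("disability") if node.get("icf")]


def select_modality_tools(icf_codes):
    """Step c): collect the tools mapped to each code; unknown codes are skipped."""
    tools = set()
    for code in icf_codes:
        tools |= ICF_TO_TOOLS.get(code, set())
    return tools


if __name__ == "__main__":
    vum = '<vum><disability icf="b210"/></vum>'
    print(sorted(select_modality_tools(parse_icf_codes(vum))))
```

A table-driven lookup of this kind keeps the compensation rules in data, so
that a different impairment activates a different tool set without code changes.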
The VerMIM Evaluator constantly checks the states of both the VerMIM and the
VerSim and, if any error occurs, reports the corresponding message to the
test coordinator.
After the initialisation of both the Multimodal Interfaces Tools and the Simulation
Environment, the Test-User may start performing the pre-defined scenario
steps, described in the loaded task-model. During the session the VerMIM
Evaluator records any wrong clicks or voice commands, as well as the time
needed by the test-user for performing each task.
It should be noted that during the testing session both the VerMIM
and the Evaluator windows are hidden, so that the user interacts with the tested
application with the virtual impairments activated, without his or her attention
being divided by unneeded interfaces.
After the scenario steps are finished, the VerMIM Evaluator signals all
other tools to stop and displays the report window to the Coordinator. Using the
report window dialog, the Coordinator can save the recorded user data for the
evaluation of the toolkit.
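
The per-task recording and the final report can be pictured with the following
sketch. It only illustrates the idea of logging a duration and an error count
per scenario task and summing them for the report; the class and field names
are hypothetical and do not reflect the actual VerMIM Evaluator log format.

```python
from dataclasses import dataclass, field


@dataclass
class TaskRecord:
    """One scenario task: its name, duration in seconds and error count."""
    name: str
    duration_s: float
    errors: int


@dataclass
class SessionLog:
    """Hypothetical per-session log kept while the test-user works."""
    user_id: str
    tasks: list = field(default_factory=list)

    def add(self, name, started_at, finished_at, errors):
        # Store the elapsed time rather than the raw timestamps.
        self.tasks.append(TaskRecord(name, finished_at - started_at, errors))

    def total_duration(self):
        return sum(task.duration_s for task in self.tasks)

    def total_errors(self):
        return sum(task.errors for task in self.tasks)

    def report_lines(self):
        # One line per task, followed by the session totals.
        lines = [f"{t.name}: {t.duration_s:.1f} s, {t.errors} error(s)" for t in self.tasks]
        lines.append(f"TOTAL: {self.total_duration():.1f} s, {self.total_errors()} error(s)")
        return lines


if __name__ == "__main__":
    log = SessionLog("user01")
    log.add("Go to Control Room", 0.0, 2.5, 1)
    log.add("Turn on television", 2.5, 4.0, 0)
    print("\n".join(log.report_lines()))
```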
Finally, it should be mentioned that the integrated system of the
VerMIM, VerMIM Evaluator and VerSim-GUI uses the external application (i.e.
the Smart Home Application) just as it is, without any alterations. That means
that all events are handled by the Evaluator tool and, when needed, are
passed to the application running below it. This procedure ensures two
things:


- First, the source code of the application below is not needed, as all of
  the extra multimodal functionality is added by the VerMIM. As a result, the
  VerMIM Evaluator can be used with a very wide range of computer
  programs.
- Secondly, the sophisticated task management allows the test-coordinator
  to handle the interaction events with precision: each event is either a)
  passed below to the application (normal test behaviour) or b)
  consumed by the VerMIM Evaluator when the user has performed
  an action that is outside of the scenario sequence (strict test behaviour).
As will be described later, both normal and strict test behaviours were used.
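
The difference between the normal and the strict test behaviour can be sketched
as a small decision function. This is an illustration under stated assumptions:
the names are hypothetical, and counting an out-of-scenario event as an error
in both behaviours is an assumption, not something the Evaluator is documented
to do.

```python
from enum import Enum


class Behaviour(Enum):
    NORMAL = "normal"  # every event is passed to the application below
    STRICT = "strict"  # out-of-scenario events are consumed by the Evaluator


def dispatch(event_target, expected_target, behaviour):
    """Return (forward_to_application, counts_as_error) for one user event."""
    outside_scenario = event_target != expected_target
    # In strict behaviour an out-of-scenario event never reaches the application.
    forward = behaviour is Behaviour.NORMAL or not outside_scenario
    return forward, outside_scenario


if __name__ == "__main__":
    print(dispatch("blinds_button", "tv_button", Behaviour.STRICT))
```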

5.2 Users & Tests Specifications and Scenario Description
In this subsection the specification of the user testing procedure is described,
answering questions such as which kinds of users participated, which modalities
were tested and which impairments were simulated (and why). Moreover, this
subsection includes the scenario description and the definition of the
user-actions that had to be taken at each step.
The main objective of the tests was to assess how the system behaves
in terms of acceptability, usability and effectiveness for typical users. All
the tests in this Section were conducted by inexperienced users (in
contrast to the experts that conducted the tests of Section 4).

5.2.1 Users Specifications


A total of thirteen (13) users were tested. The average age of the test-users is
about 29 and a half years (average: 29.46; standard deviation: 3.38), with the
minimum age being 25 and the maximum 35. The test-users are either
developers or researchers (or both). Table 8 summarises the users'
specifications.

Table 8: Users specifications table.

User Attribute                   Users Distribution

Age                              Average 29.46 (STD: 3.38, MIN: 25, MAX: 35).
Occupation field                 Developers: 4 (30.77%), R&D: 7 (53.85%),
                                 PhD students: 2 (15.38%).
Nationality and origin country   Greek, Greece (100%).
English language knowledge       High: 13 (100%).
Education                        University Studies (100%).
Left/Right hander                Left: 2 (15.38%), Right: 11 (84.62%).
Visual aids                      Glasses: 2 (15.38%), Contact lenses: 1 (7.7%).
Colour blindness                 Red-Green/Deuteranopia: 2 (15.38%)
PC & Notebook Usage              Daily: 13 (100%)
Smartphone Usage                 Often: 10 (77%), Rarely: 3 (23%)
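
The percentages in Table 8 are shares of the thirteen test-users. As a quick
arithmetic check, they can be recomputed from the counts with a few lines of
code; the helper name share is an illustrative choice.

```python
TOTAL_USERS = 13  # number of test-users reported in this section


def share(count, total=TOTAL_USERS):
    """Percentage of users in a group, rounded to two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    # Occupation field counts taken from Table 8.
    for label, count in [("Developers", 4), ("R&D", 7), ("PhD students", 2)]:
        print(label, share(count))
```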

Before performing the tests, the users were asked whether they had any
experience in using haptic devices or had been involved in any recording that
included a head tracker. Although most of the test-users are experienced
programmers who have developed some kind of graphical user interface at least
once, their answers showed that none of them had used a haptic device or a
head tracker for any kind of interaction. In fact, many of them asked what a
haptic device is and why it is used. Moreover, only three (3) users have
expertise in the multimodal interaction field: two users have been involved in
the development of a body tracker (using the Microsoft Kinect device) and
another has been involved in the development of a speech recognition system
(Table 9). The fact that none of the users had any experience in using a haptic
device or a head tracker must be taken into account in the evaluation of the
VERITAS multimodal toolkit.

Table 9: Multimodal experience of the users before the test.

Multimodal Experience            Users Distribution

Any experience with a            None of the users had ever used a haptic device.
haptic device.

Any experience with a            None of the users had ever used a specialized
head tracking device.            head tracker. However, 2 users had experience of
                                 using full-body tracking.

Any experience with a            All users have used a speech recognition system
voice recognition system.        at least once. One user had even been involved
                                 in the development of such a system.

5.2.2 Tests Specifications


Each test-user was instructed to perform the same scenario (as described
in 5.2.3) four times. Each time, the test-user was placed into a different virtual
user model's place, so that each time a different impairment and a different set of
modalities were activated. The configuration of each test session is given in
Table 10.


Four testing sessions were performed by each test-subject. VUM definitions
based on vision and motor impairments were used. Hearing impairment VUM
models were not used, as the Smart Home Application did not have any sound
feedback.
In the normal type session, the VerMIM tools were inactive, as the user
interacted with the application using only the mouse. This session served
as a performance baseline for the other three testing sessions. The
testing sessions took place in the order in which they are listed in Table 10.
As will be presented later, this order is also the order of difficulty of
the testing sessions.
Before the recordings, each user had at least 3 minutes to freely interact
with the application using the mouse in order to become familiar with what each
control/button does. The users were also instructed to perform the scenario
steps as fast as they could while making as few wrong actions as possible.
The users were also given a short period to adapt to the
new devices, e.g. the haptic device and the head tracking pair of glasses. For the
latter, a subject who wore glasses was instructed not to take them off and just to
wear the tracking glasses over them.

Table 10: Test session types; each session is a different combination of a VUM and set of
activated modality tools.

Session 1: Normal (Figure 22)
Virtual User Model: Normal
- fully capable model.
Activated Modality Tools & Interaction Method: None
- interaction is performed using mouse and keyboard.

Session 2: Mild Vision Impairment (Figure 23)
Virtual User Model: Mild myopia profile
- visual acuity is set to 0.9 (where 1.0 indicates perfect vision).
Activated Modality Tools & Interaction Method: Haptics, Screen Reader & Speech
Synthesis
- the user uses the haptic device to move the pointer and interact with the
  GUI components.
- the haptic device vibrates when the pointer is over an active GUI component.
- in such cases, if the user presses a special button on the haptic device, a
  short description is played back via the headphones, using the screen
  reading and speech synthesis tools.

Session 3: Severe Vision Impairment (Figure 24)
Virtual User Model: Severe glaucoma profile
- at least 85% occlusion of the visual field by blind spots.
- visual acuity set to 0.3.
- this type of impairment indicates an almost blind virtual user.
Activated Modality Tools & Interaction Method: Speech Recognition & Speech
Synthesis
- the user uses her/his voice to interact with the system.
- if the VerMIM system recognises a valid voice command, it sends the event to
  the application and replies to the user with a short event description,
  using the speech synthesis modality.
- if the VerMIM system fails to recognise a command or the command is wrong,
  it replies to the user with the phrase "Invalid command".

Session 4: Severe Motor Impairment (Figure 22)
Virtual User Model: Severe motor profile
- upper limb paralysis due to an advanced stage of Parkinson's disease.
- the virtual user is able to use only his/her head and voice for interaction
  (Figure 25).
Activated Modality Tools & Interaction Method: Head Tracker & Speech
Recognition
- the user wears a pair of glasses, on which an infrared LED transmitter is
  attached.
- the user moves the mouse pointer by moving/rotating her/his head.
- for any interaction with the simulated environment, the user has to speak
  the corresponding mouse command to the microphone, e.g. "click" for left
  click.
- the test-users were instructed not to use their hands, as the mouse and
  keyboard in this case were de-activated from the VerMIM Evaluator.

Before the recording process, each subject was given a set of demographic
questions, the answers to which are summarised in Table 8 and Table 9. After the
final session, a System Usability Scale questionnaire was filled in by the
subject in order to provide qualitative metrics and feedback for the test.
Additionally, the users answered a list of six questions concerning the
technology acceptance model integrated into the VerMIM tools. The answers to
these questionnaires, along with the quantitative metrics recorded during each
session, are presented and discussed in subsection 5.3.


Figure 22: The Smart Home Application that was used for the scenario. Here the interface
is depicted unfiltered, just as it was used in the Normal and Motor Impairment
sessions.

Figure 23: The Smart Home application as it appears after the simulation of the myopia
impairment, that was applied in the mild vision impairment case.


Figure 24: The severe glaucoma vision impairment case; most of the visual field is
occluded by blind spot areas. In such cases the virtual user is considered as almost
blind.

Figure 25: The test-user using the head tracking device; the user was instructed not to
use his hands; thus any interaction with the application was based on head motion (via
the infrared led glasses) combined with voice commands (captured by the microphone).


5.2.3 Scenario Description


The scenario that was performed by the users at each session included
thirteen (13) simple interaction tasks (Table 11). All four sessions involved
interaction with the Smart Home Application. Before the first session the subject
had at least three (3) minutes to interact with the application's GUI, in
order to feel comfortable with it. The test-users were asked to perform the
scenario tasks as fast as possible without making any wrong clicks or saying
invalid commands.
Table 11: The scenario followed at each test-session.

User Action

Cases with GUI interaction:
- Normal (input: mouse)
- Mild vision impairment (input: haptic device)
- Motor Impairment (input: head tracker, voice recognition for the "click"
  command)
Interaction type: interaction with a GUI component; either click on the GUI or,
for the head tracking, saying the word "click" when the pointer is over the GUI
component.

Case with voice commands:
- Severe vision impairment (voice recognition)
Interaction type: voice command; the user had to speak a specific voice command.

Task  GUI interaction                       Voice command

1     Change current language to English   English Language
2     Go to Control Room                   Go to Control Room
3     Open Television Control              Control Television
4     Turn on television                   Turn on television
5     Increase Volume                      Increase Volume
6     Close Television Control             Go back
7     Open Blinds Control                  Control Blinds
8     Close Blinds                         Close Blinds
9     Close Blinds Control                 Go back
10    Go Back to main screen               Go back
11    Go to Settings                       Go to Settings
12    Activate Automatic Entrance Lights   Activate Entrance Lights
13    Go Back to main screen               Go back

A paper sheet listing the scenario tasks was given to the test-subjects and
remained in front of them throughout the test sequence, to remind them of what
to do next in case they had forgotten. The reader should note that this
scenario does not aim to assess the accessibility of the Smart Home interface,
but to measure how easily and efficiently a user can interact with it while the
virtual impairment symptoms are applied.

5.3 Test Results


In this subsection the quantitative and qualitative results will be presented and
discussed.

5.3.1 Quantitative metric results


Four kinds of quantitative metrics were measured:
1. Duration: the time needed to accomplish the full test session. It is
worth noting that all users were able to perform all the steps of the
task sequence of each session in less than three minutes.
2. Errors: either wrongly clicked components or invalid voice commands.
Any click outside the component defined by the current task was
considered invalid. An invalid voice command is either a command that
the recognizer cannot understand or an understandable command that is
irrelevant to the current task action. For the special case of the head
tracker, these two kinds of errors are summed.
3. Cursor traveled distance: the distance traveled by the mouse cursor
(in pixels; see footnote 2). This metric is not used in the Vision Severe
Impairment session, where only the user's voice is used as input. The
mouse cursor is manipulated with the mouse only in the first session;
after that, the haptic device and the head tracker are used to move it.
4. Cursor velocity: how fast the mouse cursor is moved on the screen,
measured in pixels/sec.
An analysis of the results for these metrics is presented in the next paragraphs.

2 In the test setup a 17-inch monitor with a desktop resolution of 1280x1024
was used, meaning that approximately 38 pixels correspond to a distance of 1 cm.
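
The conversion quoted in the footnote can be checked with a short calculation.
The following is a minimal sketch, assuming square pixels and that the 17-inch
figure is the screen diagonal; the function names are illustrative and are not
part of the VerMIM toolset.

```python
import math

CM_PER_INCH = 2.54


def pixels_per_cm(diagonal_inches, width_px, height_px):
    """Pixel density of a screen, assuming square pixels."""
    # Length of the screen diagonal expressed in pixels.
    diagonal_px = math.hypot(width_px, height_px)
    # Pixels per inch along the diagonal, then converted to pixels per cm.
    return diagonal_px / diagonal_inches / CM_PER_INCH


def pixels_to_cm(pixels, density_px_per_cm):
    """Convert an on-screen distance in pixels to centimetres."""
    return pixels / density_px_per_cm


if __name__ == "__main__":
    density = pixels_per_cm(17, 1280, 1024)
    print(f"{density:.2f} pixels per cm")
    print(f"1000 pixels = {pixels_to_cm(1000, density):.1f} cm")
```

With the test setup of the user study, the computed density rounds to the 38
pixels per centimetre stated above.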


5.3.1.1 Duration
The test session durations are depicted in Figure 26. All the users succeeded
in performing the whole scenario in less than three minutes.
As depicted in Figure 27, the average duration increases as the simulated
impairment becomes more severe. Relative to the Normal session, the overhead
grows from 48% in the Mild Vision impairment case, to 226% in the Severe Vision
impairment case, and to 412% in the Motor impairment case (Figure 28).

[Figure 26 chart: "User Test Durations". Horizontal stacked bars, one per User
ID (1 to 13); x-axis: Seconds (0 to 350); series: Normal, Vision Mild, Vision
Severe, Motor. The labels on the bars give each session's share of the user's
total test time:]

User ID  Normal  Vision Mild  Vision Severe  Motor
1        11%     16%          26%            47%
2        8%      14%          27%            51%
3        11%     13%          32%            44%
4        12%     17%          25%            46%
5        7%      12%          30%            51%
6        12%     16%          30%            42%
7        9%      14%          34%            43%
8        8%      10%          38%            45%
9        11%     12%          32%            45%
10       9%      15%          31%            44%
11       7%      15%          29%            48%
12       9%      11%          30%            49%
13       7%      12%          31%            51%

Figure 26: The total durations of each test. The session percentages (relative to each
user's total test time) are also depicted. It is clear that a large amount of time was
consumed by the 4th session, i.e. the head tracking for the Motor Impairment test.

The fact that the head tracker session lasted on average more than four times
longer than the Normal session (a 412% overhead) indicates, at first sight,
that the users had difficulties in using the head tracker efficiently. However,
none of them had any prior experience with such a device and, moreover, the
maximum speed of the mouse pointer was restricted in order to make its use more
comfortable.


Figure 27: Average session duration (indicated by the number in seconds on top of each
bar). The red lines indicate the standard deviation of the duration distribution.

[Figure 28 chart: "Average Duration Overhead (as Percentage to Normal Session
Duration)". Bars: Vision Mild 48.57%, Vision Severe 226.83%, Motor 412.00%.]

Figure 28: The duration overhead as a percentage relative to the Normal session. The
overhead of the haptic device (Vision Mild session) is relatively small compared to the
rest, especially when compared to the usage of the head tracker (Motor session).
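
The overhead figures can be related to the per-user data with a small
calculation. The sketch below assumes that the overhead is defined as
(session - normal) / normal. The absolute session durations are not listed in
the text, so it uses the rounded session shares printed on the bars of Figure
26 as a stand-in; because those shares are rounded and each user's total time
differs, the output only approximates the values of Figure 28.

```python
# Session shares (%) of each user's total test time, as labelled in Figure 26.
# Columns: Normal, Vision Mild, Vision Severe, Motor. Rows: users 1 to 13.
SHARES = [
    (11, 16, 26, 47), (8, 14, 27, 51), (11, 13, 32, 44), (12, 17, 25, 46),
    (7, 12, 30, 51), (12, 16, 30, 42), (9, 14, 34, 43), (8, 10, 38, 45),
    (11, 12, 32, 45), (9, 15, 31, 44), (7, 15, 29, 48), (9, 11, 30, 49),
    (7, 12, 31, 51),
]
SESSIONS = ("Normal", "Vision Mild", "Vision Severe", "Motor")


def mean_share(column):
    """Mean share of one session (column index) over all users."""
    return sum(row[column] for row in SHARES) / len(SHARES)


def overhead_percent(session_value, normal_value):
    """Extra cost of a session relative to the Normal session, in percent."""
    return 100.0 * (session_value - normal_value) / normal_value


normal = mean_share(0)
# Approximate overhead of each impaired session relative to Normal.
OVERHEADS = {
    name: overhead_percent(mean_share(index), normal)
    for index, name in enumerate(SESSIONS)
    if index > 0
}

if __name__ == "__main__":
    for name, value in OVERHEADS.items():
        print(f"{name}: {value:.1f}% overhead (approximation)")
```

The approximation gives roughly 46%, 226% and 401% for the three impaired
sessions, which is in line with the 48.57%, 226.83% and 412.00% reported in
Figure 28.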

5.3.1.2 Errors
The error distribution for each session is depicted in Figure 29. As presented
in Figure 30, the errors in the Normal case are almost nonexistent: 0.38 errors
per user on average, resulting in an accuracy of 97% (13 total tasks - 0.38
errors = 12.62 correct actions). This suggests that the graphical user
interface of the smart home application is very well designed. The same figure
makes it clear that the voice recognition (Vision-Severe case) achieves
slightly better accuracy than the haptic controller (Vision-Mild case),
probably because the users had more previous experience with speech recognition
systems (Table 9). Even so, both vision cases achieve very small error rates,
with accuracies of 92% and 93% for the mild and severe vision impairments
respectively.
For the head tracker the accuracy falls to 85%. Two factors put this figure in
context:
a) The overwhelming majority of the users (12 out of 13) performed the
head tracker session making fewer than 3 errors; considering only
these users, the accuracy rises to 91%. The 13th user can therefore be
considered the worst case of this scenario.
b) The tracker device, i.e. the glasses with the infrared LED, is still in
a prototype phase, which explains its lower accuracy compared to the
market-ready haptic device and voice recogniser.
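
The accuracy figures quoted above follow from the mean number of errors and the
thirteen scenario tasks. A minimal sketch of that arithmetic, assuming the
accuracy is defined as (tasks - mean errors) / tasks as in the worked example
for the Normal case (the function name is illustrative):

```python
TOTAL_TASKS = 13  # number of tasks in the scenario (Table 11)


def accuracy_percent(mean_errors, total_tasks=TOTAL_TASKS):
    """Share of correct actions, given the mean number of errors per user."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    return 100.0 * (total_tasks - mean_errors) / total_tasks


if __name__ == "__main__":
    # Mean errors reported in the text: 0.38 (Normal) and 2.0 (Motor).
    print(f"Normal: {accuracy_percent(0.38):.1f}%")
    print(f"Motor:  {accuracy_percent(2.0):.1f}%")
```

For the two means stated in the text, this gives about 97.1% for the Normal
session and about 84.6% for the Motor session, matching the rounded 97% and 85%.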

Figure 29: Distribution of the user errors per session. The majority (12 out of 13) of the
users performed the tests making an almost negligible amount of errors.


Figure 30: The average number of user errors per session; even the Motor session,
which involved the head tracker, has a mean of only 2.0 errors. The standard
deviation is indicated by the red line segments.

5.3.1.3 Cursor traveled Distance


Another metric that was measured was the distance (in pixels) traveled by the
pointer. This metric is valuable for showing the effectiveness of each tool in
moving the arrow pointer. It has no meaning in the severe vision impairment
case, as the manipulation was performed using only voice commands. The results
of the remaining sessions are depicted in Figure 31 and Figure 32.
In the first of the two figures, the normalized pointer distance (as a
percentage of the total distance travelled) indicates that navigating the
cursor with the different modality devices produces almost equal results. More
precisely, as depicted in Figure 32, the overhead relative to the Normal case
is 22.19% in the Vision-Mild case, while in the Motor impairment case it is
just 13.22%. This indicates that, using special devices other than the mouse
(haptic device and glasses), the users managed to navigate the mouse cursor
over a comparable screen distance.
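
The report does not describe how the distance and velocity metrics were
computed. One plausible way, shown here purely as an assumption, is to sample
the cursor position at regular intervals, sum the straight-line distances
between consecutive samples, and divide by the elapsed time. The sample format
and function names below are hypothetical.

```python
import math


def travelled_distance(samples):
    """Total cursor path length in pixels.

    `samples` is a chronological list of (time_seconds, x_px, y_px) tuples.
    """
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(samples, samples[1:])
    )


def mean_velocity(samples):
    """Mean cursor velocity in pixels/sec over the whole recording."""
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return travelled_distance(samples) / elapsed


if __name__ == "__main__":
    # Illustrative samples only, not data from the user study.
    recording = [(0.0, 0, 0), (1.0, 3, 4), (2.0, 3, 4), (4.0, 6, 8)]
    print(travelled_distance(recording), "pixels")
    print(mean_velocity(recording), "pixels/sec")
```

Note that pauses (samples with no movement) add time but no distance, so they
lower the mean velocity without affecting the travelled distance.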


[Figure 31 chart: "Normalized Pointer Distance Traveled". Stacked bars (0% to
100%), one per User ID (1 to 13); y-axis: Pointer Distance (Pixels); series:
Normal, Vision Mild, Motor.]

Figure 31: The normalized distance (as a percentage of the total pointer distance
travelled through the tests). The results indicate that the distances travelled are
comparable across the different modalities.

[Figure 32 chart: "Overhead in Distance". y-axis: Distance Overhead; x-axis:
Session. Bars: Vision Mild 22.19%, Motor 13.22%.]

Figure 32: The average distance overhead (compared to the Normal case) of the
Vision-Mild and Motor sessions. As shown, the average overhead is small.

5.3.1.4 Cursor Speed


The cursor speed results are depicted in Figure 33. As shown, the mean velocity
of the mouse cursor is comparable to that of the haptic device. However, the
cursor driven by the head tracking glasses is a lot slower. This can be
explained if the following are taken into account:
a) The haptic device and the mouse are both manual devices, i.e. the user
uses the same part of the body to manipulate them, so the two
distributions are comparable.
b) The users were totally inexperienced in using only the motion of their
heads to navigate the mouse cursor. As a result, the maximum speed
that the cursor could achieve was restricted: the VerMIM had a
calibration dialog where each user could select the maximum cursor
speed, and high speeds were not preferred because they made the
cursor uncontrollable.
[Figure 33 chart: "Pointer Velocity". y-axis: Pointer Velocity (pixels/sec), 0
to 400; x-axis: User ID (1 to 13); series: Normal, Vision Mild and Motor, each
with its mean.]

Figure 33: The pointer velocity of each user of the Normal, Vision-Mild and Motor
sessions.

5.3.2 Qualitative metric results


After the four recording sessions the users were asked to complete two
questionnaires in order to capture a list of qualitative metrics: the first
questionnaire concerns the usability of the VerMIM system and the second was
used to estimate the acceptability of its technology.
The system usability questionnaire consists of ten statements; for each one
the user has to choose a number from 1 (if she/he strongly disagrees with the
statement) to 5 (if she/he strongly agrees with it). The answers given by the
users are presented in Table 12. The majority responded favourably to the
VerMIM system, as most of them:
- would use the system frequently (statement #1),
- did not find it complex (#2) or cumbersome (#8),
- thought it was easy to use (#3), without any need for a technical person
(#4),
- found the various system functions well integrated (#5) and consistent
(#6),
- would imagine that most people would learn to use it quickly (#7),
- felt confident in using it (#9), and
- didn't have to learn much about the system before they could get going
with it (#10).


Table 12: The system usability questionnaire. The scale is from 1 to 5, where 5
indicates strong agreement with the statement. Each cell reports the number of
test-subjects who chose that answer (along with the corresponding percentage).

                                             Strongly                                     Strongly
                                             disagree                                     agree
Statement                                    1          2          3          4          5
-------------------------------------------  ---------  ---------  ---------  ---------  ---------
1. I think that I would like to use this     0 (0%)     0 (0%)     4 (30.8%)  6 (46.2%)  3 (23.1%)
   system frequently.
2. I found the system unnecessarily          7 (53.8%)  4 (30.8%)  1 (7.7%)   1 (7.7%)   0 (0%)
   complex.
3. I thought the system was easy to use.     0 (0%)     0 (0%)     3 (23.1%)  5 (38.5%)  5 (38.5%)
4. I think that I would need the support     7 (53.8%)  3 (23.1%)  3 (23.1%)  0 (0%)     0 (0%)
   of a technical person to be able to use
   this system.
5. I found the various functions in this     0 (0%)     0 (0%)     0 (0%)     8 (61.5%)  5 (38.5%)
   system were well integrated.
6. I thought there was too much              6 (46.2%)  6 (46.2%)  1 (7.7%)   0 (0%)     0 (0%)
   inconsistency in this system.
7. I would imagine that most people would    0 (0%)     0 (0%)     0 (0%)     4 (30.8%)  9 (69.2%)
   learn to use this system very quickly.
8. I found the system very cumbersome to     6 (46.2%)  6 (46.2%)  1 (7.7%)   0 (0%)     0 (0%)
   use.
9. I felt very confident using the system.   0 (0%)     0 (0%)     4 (30.8%)  4 (30.8%)  5 (38.5%)
10. I needed to learn a lot of things        6 (46.2%)  4 (30.8%)  3 (23.1%)  0 (0%)     0 (0%)
    before I could get going with this
    system.
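
The ten statements of Table 12 are those of the System Usability Scale (SUS)
described in reference [8]. The report does not state an aggregate score; the
sketch below shows how one could be derived from the answer counts, assuming
the standard SUS scoring (odd statements contribute answer - 1, even statements
contribute 5 - answer, and the sum is multiplied by 2.5). Because this scoring
is linear, the mean score over all users can be computed from the per-statement
means without knowing the individual answer sheets. The names used are
illustrative.

```python
# Answer counts from Table 12: statement number -> counts for answers 1..5.
TABLE_12 = {
    1: (0, 0, 4, 6, 3),
    2: (7, 4, 1, 1, 0),
    3: (0, 0, 3, 5, 5),
    4: (7, 3, 3, 0, 0),
    5: (0, 0, 0, 8, 5),
    6: (6, 6, 1, 0, 0),
    7: (0, 0, 0, 4, 9),
    8: (6, 6, 1, 0, 0),
    9: (0, 0, 4, 4, 5),
    10: (6, 4, 3, 0, 0),
}


def mean_response(counts):
    """Mean answer (1 to 5) for one statement, given its answer counts."""
    respondents = sum(counts)
    weighted = sum(answer * n for answer, n in enumerate(counts, start=1))
    return weighted / respondents


def mean_sus_score(table):
    """Mean SUS score (0 to 100) over all respondents (standard scoring)."""
    total = 0.0
    for statement, counts in table.items():
        mean = mean_response(counts)
        # Odd statements are positively worded, even ones negatively worded.
        total += (mean - 1) if statement % 2 == 1 else (5 - mean)
    return 2.5 * total


if __name__ == "__main__":
    for statement, counts in TABLE_12.items():
        print(f"#{statement}: mean answer {mean_response(counts):.2f}")
    print(f"Mean SUS score: {mean_sus_score(TABLE_12):.1f}")
```

The per-statement means also make the trend described in the text easy to
check: positively worded statements average close to 4 or above, negatively
worded ones below 2.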

The answers to the technology acceptance model questionnaire are included in
Table 13. Most of the answers conveyed positive feedback from the users:
- The VerMIM and its tools are considered by the test-users to be a good
idea (answers #3 and #6) and easy to use (#1).
- As most of the test subjects are developers, statement #2 is crucial for
VERITAS; the majority of the users replied that using the VERITAS
multimodal tools would improve their performance in design and
development tasks.
- Neutral-positive and positive answers were gathered for the statement
about increased effectiveness through using the VerMIM (#5), as well as
for the statement about using it whenever available (#4).

Table 13: Technology acceptance model questionnaire for the VerMIM tools. Each
statement answer is scaled from 1 (favourable opinion of the system) to 7 (unfavourable
opinion of the system). Each cell reports the number of answers and the
corresponding percentage; "-" denotes no answers.

                                          Extremely                                                          Extremely
                                          likely                                                             unlikely
                                          (like)                                                             (dislike)
Statement                                 1          2          3          4          5          6          7
----------------------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
1. I would find using VerMIM tools easy.  5 (38.5%)  5 (38.5%)  3 (23.1%)  -          -          -          -
2. Using VerMIM tools would improve my    1 (7.7%)   6 (46.2%)  3 (23.1%)  1 (7.7%)   2 (15.4%)  -          -
   performance in the design and
   development tasks.
3. I find using the VerMIM tools a good   8 (61.5%)  3 (23.1%)  2 (15.4%)  -          -          -          -
   idea.
4. I intend to use VerMIM tools           1 (7.7%)   5 (38.5%)  4 (30.8%)  2 (15.4%)  1 (7.7%)   -          -
   whenever available.
5. Using VerMIM tools would enhance my    1 (7.7%)   5 (38.5%)  4 (30.8%)  2 (15.4%)  1 (7.7%)   -          -
   effectiveness.
6. I like or dislike the idea of          6 (46.2%)  7 (53.8%)  -          -          -          -          -
   VERITAS Multimodal interfaces tools.

5.4 Conclusions of the User Study


The user tests have shown that the VerMIM can be a valuable asset that
provides efficient multimodal tools. The integration of the VerMIM system with
the VerSim-GUI created a holistic approach of both:
- simulating the various impairments and placing the test-user in the
position of a virtual impaired person, and
- providing suitable multimodal tools with which to cope with this virtual
situation.
Typical multimodal tools such as voice recognition and speech synthesis have
been successfully combined, in a consistent environment, with newer
technologies such as haptics and head trackers. As the results have shown, the
system was easily adopted by the users and all the VerMIM tools were
successfully used to complete all the scenario tasks: not a single user failed
to complete the scenario.
Navigation using the haptic device or the head tracker made a very good
impression on the majority of the users, which is very positive feedback
considering that it was the first time they had controlled such devices. The
manipulation of the GUI elements using these new-to-them devices did not prove
difficult at all, and the transition from mouse and keyboard progressed in a
consistent way.
As most of the users are developers, they found integrating the VerMIM tools
into their applications a very good idea. They were also positive that tools
such as the VerMIM and the VERITAS simulation platform would increase their
effectiveness in constructing tools intended for impaired people.
In general, the conclusions of the performed user study showed that:
- The VerMIM modality compensation process worked successfully, as the
tools selected for each virtual impairment proved suitable (scenario
success rate: 100%).
- Although the users made some wrong actions, the overall interaction with
the application has been validated as successful, as the accuracy and
effectiveness of the tools were high.
- Increasing the severity of the impairment has a negative impact on the
scenario duration, as more sophisticated tools need to be used.
- The consistency of the tools and their integration with a closed-source
third-party Smart Home application were rated by the users as very good
to excellent. Taking into account that most of the tested users are
developers, this is a very important fact.
- Almost all of the users considered the system efficient, easy to use and
a good idea, and they would accept it as a potential candidate for their
future interaction needs, either as designers or as beneficiaries.
Finally, it must be stated that the experiment was planned in such a way that
the users advanced from mild impairment sessions to sessions with more severe
impairments. As the quantitative results have shown, this had a corresponding
impact on the durations of the sessions, which went from lower to higher.
However, it also helped the users to put themselves progressively in the shoes
of the impaired person. Most of the test-users found it a very good idea to
increase the difficulty steadily as the simulated impairment became more
severe. Most of the users were also enthusiastic about manipulating the Smart
Home user interface using only their heads.


6 Refinement of the Multimodal Interfaces Toolset and its Limitations
This section presents the refinement process of the Multimodal Interfaces
Toolset (or VerMIM from now on). Moreover, any system limitations will be
presented in the second half of this section.

6.1 VerMIM Refinement


The refinement of the toolkit has been a continuous process that lasted from
the start of VERITAS WP2.8 until the time of writing. Several VerMIM
components had to be added or changed in order to provide the users with a
stable release version that can stand up to the standards of the rest of the
VERITAS tools. Most of the refinement took place after the expert evaluation
and before the user study tests.
Any problems identified in the heuristic evaluation had to be fixed and, as
shown by the user study which took place afterwards, the VerMIM refinement can
be considered successful.
The following things had to be added to the VerMIM environment and then
further improved:
a) Inclusion and development of a modality tool for users who cannot use
their hands to interact with the tested application. This tool was
added before the heuristic evaluation tests, and its functionality was
greatly improved following the feedback received there.
b) Refinement of the speech recognition system that runs under the
VerMIM suite. The first unofficial tests of the VerMIM voice recognition
produced poor accuracy rates and this had to be corrected.
c) Improvement of the speech synthesis procedure, as the produced voice
sounded unnatural.
d) Provision of a way to test each of the provided multimodal tools on its
own; previously, a tool could only be tested after first integrating it
with another standalone application.
The following paragraphs describe the solution to each of the above problems.

6.1.1 Addition of the Head Tracking Tool


A way of interacting with an application without using the hands had to be
included in the VerMIM framework, especially to cope with severe motor
impairments which prevent the use of the hands. Voice recognition could provide
the beneficiary with a suitable tool for this purpose. However, this kind of
solution requires the developer/designer to write many lines of code in order
to respond to each voice command separately. A more elegant solution was
needed, one that would be independent of the application being run and tested
by the VERITAS framework. The solution was found in the development of a head
tracking system.


The head-tracking system is based on an infrared LED transmitter and a
corresponding receiver, which receives the position of the signal source; the
VerMIM then translates this position into 2D motion. In our experiments, the
Nintendo Wii-Remote was used as the receiver because of its low price and high
availability in the market. The LED transmitter is attached to a low-cost pair
of safety glasses. The glasses are depicted in Figure 34.

Figure 34: The pair of glasses with the attached LED transmitter that was used as the
tracking device. The depicted system can be considered low cost, as its total cost is
less than five Euros.
The head-tracking device is used to move the mouse cursor. In cooperation with
a simple voice recognition system, which can recognise simple commands such as
"left click", "right click", etc., it can be used to manipulate the mouse
cursor and its behaviour independently of any application running on the
user's desktop PC. This solution is a global approach and does not need any
extra development for applications that are intended for users with amputated
hands or severe motor impairments.
The refinement process of this tool included several improvements:
- Improvements in the calibration GUI, via which the user is able to
configure and test the tool before using it in the multimodal scenario.
- Better navigational capabilities. At first only 4-directional mouse
navigation was offered by the system. The refinement process added
two new modes: 8-directional and free navigation.
- Several optimizations concerning the cooperation of the LED transmitter
and the Wii-Remote receiver, in order to provide a wider field of view
for the users; the heuristic evaluation had shown that the signal was
lost when the user turned her/his head through large angles.
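
To illustrate the three navigation modes and the speed limit selected in the
calibration dialog, the following sketch maps the offset of the tracked LED
from a neutral position to a cursor velocity. It is not the VerMIM
implementation, which is not described at this level of detail in this
document: the function name, the parameters (gain, dead zone) and the snapping
rule are all assumptions made for illustration.

```python
import math


def cursor_velocity(dx, dy, mode="free", gain=1.0, max_speed=100.0,
                    dead_zone=0.05):
    """Map a head offset (dx, dy) to a cursor velocity (vx, vy) in pixels/sec.

    mode: "4-dir" snaps the direction to multiples of 90 degrees,
          "8-dir" to multiples of 45 degrees, "free" keeps it unchanged.
    """
    magnitude = math.hypot(dx, dy)
    if magnitude < dead_zone:
        # Small offsets are ignored so that the cursor can rest.
        return (0.0, 0.0)
    angle = math.atan2(dy, dx)
    if mode == "4-dir":
        step = math.pi / 2
        angle = round(angle / step) * step
    elif mode == "8-dir":
        step = math.pi / 4
        angle = round(angle / step) * step
    elif mode != "free":
        raise ValueError(f"unknown navigation mode: {mode}")
    # The speed grows with the offset but never exceeds the calibrated maximum.
    speed = min(gain * magnitude, max_speed)
    return (speed * math.cos(angle), speed * math.sin(angle))


if __name__ == "__main__":
    for mode in ("4-dir", "8-dir", "free"):
        print(mode, cursor_velocity(1.0, 0.2, mode, gain=10.0))
```

Restricting the direction makes the cursor easier to steer for an
inexperienced user at the cost of longer paths, while the `max_speed` clamp
reflects the preference for lower speeds reported in the user study.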

6.1.2 Improvements in Speech Recognition system


When the speech recognition system was first used, it produced poor accuracy
results. Current state-of-the-art speech recognition systems provide high
accuracy; however, their code is not available to the public, because they are
closed-source APIs. Such systems could not be integrated, as the VerMIM tool,
being part of the VERITAS framework, had to be an open-source project. The
open-source speech recognition system best suited to the VerMIM purposes was
the Sphinx API [1]. However, after several tests, even that tool could not
achieve accuracies higher than 70% in realistic scenarios.
The solution to that problem was found to be the definition of grammar rules
that are specific to the tested application. Support for parsing such grammar
rules for the Sphinx API has been integrated into the VerMIM. As the user study
showed, the speech recognition accuracy climbed to more than 90%, which was
considered more than enough by the test-subjects.
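
As an illustration of an application-specific grammar, the sketch below builds
a grammar that accepts only the voice commands of the Smart Home scenario
(Table 11). It assumes the JSGF format, which CMU Sphinx can load; this
document does not state which grammar format the VerMIM actually uses, and the
function and grammar names are hypothetical.

```python
def build_jsgf(grammar_name, commands):
    """Return a JSGF grammar whose single public rule accepts any one of
    the given commands (case-insensitive, duplicates removed)."""
    phrases = []
    for command in commands:
        phrase = command.strip().lower()
        if phrase and phrase not in phrases:
            phrases.append(phrase)
    if not phrases:
        raise ValueError("at least one command is required")
    alternatives = " | ".join(phrases)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <command> = {alternatives};\n"
    )


# Voice commands of the scenario, as listed in Table 11.
SCENARIO_COMMANDS = [
    "English Language", "Go to Control Room", "Control Television",
    "Turn on television", "Increase Volume", "Go back", "Control Blinds",
    "Close Blinds", "Go back", "Go back", "Go to Settings",
    "Activate Entrance Lights", "Go back",
]

if __name__ == "__main__":
    print(build_jsgf("smarthome", SCENARIO_COMMANDS))
```

Restricting the recognizer to a handful of phrases, instead of a general
language model, removes most of the candidate word sequences that could be
confused with a command, which is consistent with the accuracy improvement
described above.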

6.1.3 Adding Natural Output of Speech Synthesis


Initial testing of the VerMIM speech synthesis module showed that the produced
speech was poor, and most of the test-users reported it as being of bad
quality. This had to be corrected before the user study took place.
The solution was to search for open-source voices that could be integrated
into the VerMIM framework without many modifications. After a thorough search,
open-source male and female voices were found and added to the speech
synthesis module. The new voices sounded natural and, as shown in the
experimental results, were found pleasant by the test-subjects.

6.1.4 VerMIM as standalone GUI application


It was necessary to provide the developer with a test bed for each multimodal
tool, in order to see for herself/himself what each tool could achieve.
Previously, the developer had to integrate each of the tools programmatically
into her/his application and test it from there. This was not an elegant way
to show the capabilities of the VerMIM tools.
For that reason, a stand-alone version of the VerMIM toolset has been
constructed, which allows the users, either developers or beneficiaries:
- to see the various VerMIM components and how each one could be used,
- to experiment with each of the tools,
- to see the capabilities of each one and decide whether it is sufficient
or not.
This stand-alone version integrates the VerMIM toolset library and provides
the users with several graphical user interfaces for interacting with its
tools (Figure 35 to Figure 40).


Figure 35: The VerMIM haptics testing panel.

Figure 36: The VerMIM magnifying glass testing panel.


Figure 37: The VerMIM speech recognition testing panel.

Figure 38: The VerMIM sign language synthesis testing panel.


Figure 39: The VerMIM calibration and test panel of the Head tracking module.

Figure 40: The VerMIM speech synthesizer testing panel.


6.2 VerMIM Limitations


The VerMIM is not a tool without flaws. It is a newly developed toolset in
which several other open-source APIs have been integrated in order to be
offered to the developer/designer as a holistic approach. In its current state
there are several limitations regarding its modules.
One basic limitation is that the various modalities need a very fast computer
in order to reach their potential. Several optimisations have to be made for
this tool to be able to run properly on mobile computers. For example, the
speech recognition module takes more time to produce its output than the
faster closed-source state-of-the-art systems, and this delay could be
misinterpreted by the test user and may lead to voice recognition errors.
Another limitation of the system is that new technologies are used. Many of
them have problems with the installation of their drivers (e.g. for the
haptic device), especially on operating systems other than Microsoft Windows.
This could lead to problems when migrating to other O/S platforms.
Finally, a limitation of the VerMIM is that it doesn't include a tool for
interaction with touch-screens. However, in desktop applications such a tool
would offer little extra assistance to vision, motor and hearing impaired
users compared to the haptic input device, which additionally provides
advanced force feedback to them.


7 Conclusions
The results described in this document have shown that the Multimodal
Interfaces Toolset is a valuable addition to the rest of the VERITAS tools,
providing the impaired user with a holistic approach to multimodal interaction
that can cope successfully with her/his special needs. This deliverable starts
by analysing user interface indicators of user performance and satisfaction.
It then continues by matching each of these indicators with one or more
modalities and by describing how people with special needs could interact with
systems using those modalities.
Two test sessions were arranged and performed to evaluate the Multimodal
Interfaces toolset. The first test session involved a heuristic evaluation of
the system, performed by experts in the Multimodal Interaction field. Despite
the fact that the experts judged the whole system very strictly, the majority
of them were impressed by the potential of the tool and agreed that they would
use it in their daily work if specific design issues appeared. Moreover, it was
found that the tool is easy to use and does not require extensive training.
The issues regarding the VerMIM user interface and the improper functionality
of the devices were taken into account in the refinement process and were
fixed for the second evaluation.
The second evaluation was a user study based on performance and satisfaction
indicators. In the study, thirteen users, mostly developers and researchers,
tried the VerMIM and managed to control successfully (100% rate) a smart home
application they were seeing for the first time, under circumstances affected
by mild and severe impairments. This can be considered an impressive result,
taking into account that most of the users were interacting with haptic
devices or head tracking devices for the first time. The general conclusion of
the user study tests strongly indicates that, by using the integrated product
of the VerMIM with the VERITAS Simulation platform, the developer has a great
asset when designing an application whose users include people with
disabilities.
Between and after the two testing sessions, several components of the VerMIM
were altered and improved through a refinement process driven by the comments
of the various test-users. New voices have been added to the speech
synthesizers, a new grammar-based system has been integrated into the voice
recognition module for higher accuracy, and a stand-alone application with
panels for configuring and testing each tool has been implemented.
Finally, a word must be said about the head tracking system that has been
added to the VerMIM modules. This low-cost tracker was made especially for the
VerMIM test requirements, in order to simulate situations where the subject
cannot use her/his hands. The results have shown that the majority of the
users who interacted with the device were very satisfied with navigating
through the Smart Home application using only their heads.
With the above in mind, the VerMIM tool may be considered the VERITAS
multimodal-interaction-based solution for designers who need to put themselves
in the shoes of impaired people in order to provide better tools to real
impaired users.


References
[1] Panagiotis Moschonas, Dimitrios Tzovaras, George Ghinea, VERITAS
Deliverable D2.8.1 First prototypes of the multimodal interface tool set,
December 2011.

[2] Panagiotis Moschonas, Athanasios Tsakiris, Georgios Stavropoulos,


Sofia Segouli, Ioannis Paliokas, Dimitrios Tzovaras, VERITAS
Deliverable D2.8.2 Integration of Multimodal Interfaces into the
VERITAS Simulation and Testing Framework, June 2012.

[3] Albert, W.S. and Dixon, E. (2003). Is this what you expected? The use
of expectation measures in usability testing. Proc. Usability
Professionals Association, 12thAnnual Conference, 10th paper.

[4] Ben Caldwell, Michael Cooper, Loretta Guarino Reid, Gregg


Vanderheiden. Web Content Accessibility Guidelines (WCAG) 2.0. W3C
Recommendation 11 December 2008

[5] Bolt, R. A. (1980) Put-That-There: Voice and Gesture at the Graphics


Interface. In: Proceedings of the 7th International Conference on
Computer Graphics and Interactive Techniques. Seattle, USA, pp 262-
270.

[6] Bolt, R. E., Herranz, E. (1992) Two-handed gesture in multi-modal


natural dialog. In: Mackinlay, J. & Green, M. (eds) Symposium on User
Interface Software and Technology (UIST92). ACM Press, New York,
Unite States, pp 7-14.

[7] Buxton, W., Myers, B.A. (1986) A study in two-handed input. In: Mantei
M, Orbeton P (eds) ACM Conference on Human Factors in Computing
Systems (CHI86). ACM Press, Boston, Massachusetts, United States,
pp 321-326.

[8] Brooke, J., 1996, SUS: A Quick and Dirty Usability Scale. In: P.W.
Jordan, B. Thomas, B.A. Weerdmeester & I.L. McClelland (Eds.),
Usability Evaluation in Industry. London: Taylor & Francis, 189-194.

[9] Card, S.K., English, W.K., & Burr, B. J. (1978). Evaluation of mouse,
rate-controlled isometric joystick, step keys and text keys for text
selection on a CRT. Ergonomics, 21(8), 601-613.

[10] Chin, J., Diehl, V., Norman, K. (1988). Development of an instrument
measuring user satisfaction of the human-computer interface. In Proc.
CHI '88 Human Factors in Comp. Systems Conf., ACM Press, 213-218.

[11] Cushing SL, Papsin BC, Rutka JA, James AL, Gordon KA. 2008.
Evidence of vestibular and balance dysfunction in children with
profound sensorineural hearing loss using cochlear implants.
Laryngoscope. 2008 Oct;118(10):1814-23.

[12] Finke, A., H. Koesling, & H. Ritter. 2010. Multi-modal human-computer
interfaces for handicapped user groups: Integrating mind and gaze. In
Proc. Ubicomp 2010, Copenhagen, Denmark, 1-4.

[13] Frisoli A., Rocchi F., Marcheschi S., Dettori A., Salsedo F. &
Bergamasco M. (2005). A new force-feedback arm exoskeleton for
haptic interaction in virtual environments. Proceedings of the First
Eurohaptics Conference and Symposium on Haptic Interfaces for
Virtual Environment and Teleoperator Systems, March 2005, Pisa, Italy,
195-201.

[14] Kabbash, P., Buxton, W., Sellen, A. (1994) Two-handed input in a
compound task. In: Plaisant C (ed) ACM Conference on Human Factors
in Computing Systems (CHI '94). ACM Press, Boston, Massachusetts,
United States, pp 417-423.

[15] Karat, C.-M., Halverson, C., Horn, D., & Karat, J. (1999). Patterns of
entry and correction in large vocabulary continuous speech
recognition systems. Proceedings of the International Conference for
Computer-Human Interaction (CHI '99), 568-575. New York: ACM
Press.

[16] Kavakli, Manolya 2008. Gesture recognition in virtual reality.
International Journal of Arts and Technology 2008 - Vol. 1, No. 2,
pp. 215-229.

[17] Kurtenbach, G. & Hulteen, E. (1990). Gestures in Human-Computer
Communications. In B. Laurel (Ed.) The Art of Human Computer
Interface Design. Addison-Wesley, 309-317.

[18] LoPresti EF, Brienza DM. Adaptive software for head-operated
computer controls. IEEE Trans Neural Syst Rehabil Eng.
2004;12:102-11.

[19] Lund, A. M. (2001). Measuring usability with the USE questionnaire.
Usability Interface: Usability SIG Newsletter, October.
http://www.stcsig.org/usability/newsletter/0110_measuring_with_use.html.

[20] Manaris, B.; Macgyvers, V.; Lagoudakis, M. A Listening Keyboard for
Users with Motor Impairments: A Usability Study. Speech Technol.
2002, 5, 371-388.

[21] MacKenzie, I.S. and S.X. Zhang. The design and evaluation of a high-
performance soft keyboard. Proc. CHI'99, p. 25-31.

[22] Myers, B.A., Wobbrock, J.O., Yang, S., Yeung, B., Nichols, J., Miller, R.
Using handhelds to help people with motor impairments. Proc. ASSETS
'02. ACM Press, 2002, 89-96.

[23] Sauro, J. & Kindlund, E. A Method to Standardize Usability Metrics into
a Single Score, Proc. CHI 2005, ACM Press (2005), 401-409.

[24] Jeff Sauro, SUM: Single Usability Metric, April 17, 2005,
http://www.measuringusability.com/SUM/index.htm.

[25] Sawhney, N. and Schmandt, C. Nomadic Radio: speech and audio
interaction for contextual messaging in nomadic environments. ACM
Transactions on Computer-Human Interaction 7, 3 (2000), 353-383.

[26] Sears, A. and Young, M. Physical disabilities and computing
technologies: an analysis of impairments. In The Human-Computer
Interaction Handbook, Lawrence Erlbaum Associates (2003), 482-503.

[27] M Serruya, A Caplan, M Saleh, D Morris, J Donoghue, The BrainGate
pilot trial: Building and testing a novel direct neural output for patients
with severe motor impairment, Soc. for Neuroscience Abstr., 2004.

[28] Sibert, L., and Jacob R. 2000. Evaluation of eye gaze interaction. In
Proceedings of the SIGCHI conference on Human factors in computing
systems, 281-288.

[29] Signspeak. Scientific understanding and vision-based
technological development for continuous sign language recognition
and translation www.signspeak.eu FP7-ICT-2007-3-231424 -
Annual Public Report -

[30] Shih, Ching-Hsiang, Shih, Ching-Tien, Chu, Chiung-Ling. 2010.
Assisting people with multiple disabilities actively correct abnormal
standing posture with a Nintendo Wii Balance Board through controlling
environmental stimulation. Research in Developmental Disabilities,
2010, Elsevier.

[31] Madeline E. Smith, Carole Dennis, Sharon Stansfield, Hélène Larin,
Infants' Control of a Robotic Mobility Device, RESNA Annual
Conference, 2010.

[32] Vaughan TM, McFarland DJ, Schalk G, Sarnacki WA, Krusienski DJ,
Sellers EW, Wolpaw JR. 2006. The Wadsworth BCI Research and
Development Program: at home with BCI. IEEE Trans Neural Syst
Rehabil Eng. Jun;14(2):229-33.

[33] VisionAustralia 2012. The Colour Contrast Analyser 2.2.
http://www.visionaustralia.org.au/

[34] Wolpaw J. R. and McFarland D. J. 2004. Control of a two-dimensional
movement signal by a non-invasive brain-computer interface in
humans. Proc. Natl. Acad. Sci. USA, vol. 101, pp. 17849-17854.

[35] Zafrulla, Zahoor and Brashear, Helene and Starner, Thad and Hamilton,
Harley and Presti, Peter. 2011. American sign language recognition
with the kinect. Proceedings of the 13th international conference on
multimodal interfaces. ICMI '11

[36] Zhai, S., Barton, A.S., Selker, T. (1997) Improving browsing
performance: a study of four input devices for scrolling and pointing
tasks. In: Howard S, Hammond J & Lindgaard G (eds) The IFIP
Conference on Human-Computer Interaction (INTERACT '97).
Chapman & Hall, Sydney, Australia, pp 286-292.

[37] Panagiotis Moschonas, Athanasios Tsakiris, Ioannis Paliokas,
Georgios Stavropoulos, Dimitrios Tzovaras, VERITAS Deliverable
D2.1.4 Integrated Core Simulation Platform and Exportable Toolbox,
June 2012.

[38] PERCRO, CERTH, VRMMP, VERITAS Deliverable D2.1.3 Interaction,
Virtual User and Simulation Adaptor, December 2011.

8 Appendix

8.1 TaskSheet Heuristic Evaluation: Multimodal Interfaces Manager (VerMIM Evaluator)
You have been recruited as an expert in Human-Computer Interaction
and Usability to evaluate a multimodal interaction toolset that was
developed for users with different disabilities. The design of the toolset will be
refined and optimized according to the outcomes of this evaluation.
As an example program that can be used within the multimodal toolset
framework, we programmed the SmartHome Application. It should be easy to
use via the toolset for people with the following disabilities:
• Severe visual impairments (blindness)
(lacking visual perception due to physiological or neurological factors)
• Myopia, a mild visual impairment
(Myopia, commonly known as being nearsighted (American English) or
shortsighted (British English), is a condition of the eye in which incoming light
does not focus directly on the retina but in front of it. As a result, a distant
object appears out of focus, while a close object appears in focus.)
• Motion impairment (upper limb paralysis)
(Paralysis is the loss of muscle function in one or more muscles. It can be
accompanied by a loss of feeling (sensory loss) in the affected area if there is
sensory damage as well as motor damage.)
The available interaction tools are a head tracker, speech synthesis and haptic
interaction (Novint Falcon haptic device). Please note that the SmartHome
Application is not the focus of this evaluation; it is only an example.
First, take 5 minutes to familiarize yourself with the program. Then please
run through the following 4 settings and carry out the instructions. After that,
please evaluate the usability and accessibility of the different modalities for
disabled people.
Your first task is to interact with the SmartHome App as a user without
any disabilities, so that you become familiar with the SmartHome program.
Feel free to explore how to interact with the application.

8.1.1 A. Scenario: Normal users without any disabilities


Note that the SmartHome App will only react to the tasks defined below.
1. Use the Multimodal Interface Manager (VerMIM) and indicate that you do
not have a disability.
2. Use the interaction tool given by VerMIM to execute the following tasks.
3. Change language to English
4. Go to control room.
4.1. Open television controls
4.2. Turn on the television
4.3. Increase Volume
4.4. Go back to control
5. Open blinds control
5.1. Close blinds
5.2. Go back to control
6. Go back to Intro
7. Open settings
8. Activate outdoor Control lights
9. Go to Intro

A window with the statistics of your interaction will now open. Please save the
log file (file name: your user ID_scene1).

8.1.2 B. Scenario: Users with severe visual impairments - blind users

Note that the SmartHome App will only react to the tasks defined below.

1. Use the Multimodal Interface Manager (VerMIM) to enter your simulated
disability, which is blindness in this scenario.
2. Use the interaction tool given by VerMIM to execute the following tasks.
While doing this, imagine that you are blind. Quoted text denotes the
corresponding voice commands.
3. Change language to English.
   "English language"
4. Go to control room.
   "Go to control room"
4.1. Open television controls
   "Control television"
4.2. Turn on the television
   "Turn on television"
4.3. Increase Volume
   "Increase Volume"
4.4. Go back to control
   "Go back"
5. Open blinds control
   "Control blinds"
5.1. Close blinds
   "Close blinds"
5.2. Go back to control
   "Go back"
6. Go back to main screen
   "Go back"
7. Go to settings
   "Go to settings"
8. Activate automatic entrance lights
   "Activate entrance lights"
9. Go to Intro
   "Go back"

A window with the statistics of your interaction will now open. Please save the
log file (file name: your user ID_scene2).
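
For readers who want to check a recorded session against this scenario, the
step list can be written down as data. The following is only a minimal sketch:
the step texts and voice commands are copied from the task sheet above, while
the names SCENARIO_B, first_deviation and log_file_name are hypothetical
helpers and not part of VerMIM or the SmartHome Application.

```python
# Hypothetical encoding of Scenario B: (task, expected voice command).
# Only the strings come from the task sheet; the structure is an assumption.
SCENARIO_B = [
    ("Change language to English", "English language"),
    ("Go to control room", "Go to control room"),
    ("Open television controls", "Control television"),
    ("Turn on the television", "Turn on television"),
    ("Increase Volume", "Increase Volume"),
    ("Go back to control", "Go back"),
    ("Open blinds control", "Control blinds"),
    ("Close blinds", "Close blinds"),
    ("Go back to control", "Go back"),
    ("Go back to main screen", "Go back"),
    ("Go to settings", "Go to settings"),
    ("Activate automatic entrance lights", "Activate entrance lights"),
    ("Go to Intro", "Go back"),
]


def first_deviation(spoken):
    """Return the index of the first step that is missing or differs, else None."""
    for index, (_task, expected) in enumerate(SCENARIO_B):
        if index >= len(spoken):
            return index  # the session ended before this step
        if spoken[index].strip().lower() != expected.lower():
            return index  # a different command was spoken at this step
    return None


def log_file_name(user_id, scene):
    """Build a log name following the 'your user ID_sceneN' pattern."""
    return f"{user_id}_scene{scene}"
```

Comparing commands case-insensitively is a design choice of this sketch; the
task sheet does not state how strictly the speech interface matches them.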

8.1.3 C. Scenario: Users with mild visual impairments - Myopia

Note that the SmartHome App will only react to the tasks defined below.

1. Use the Multimodal Interface Manager (VerMIM) to enter your simulated
disability, which is Myopia in this scenario.
2. Use the interaction tool given by VerMIM to execute the following tasks.
While doing this, imagine that you can see but cannot read the interface
because of your impairment.
3. Change language to English
4. Go to control room.
4.1. Open television controls
4.2. Turn on the television
4.3. Increase Volume
4.4. Go back to control
5. Open blinds control
5.1. Close blinds
5.2. Go back to control
6. Go back to Intro
7. Open settings
8. Activate outdoor Control lights
9. Go to Intro

A window with the statistics of your interaction will now open. Please save the
log file (file name: your user ID_scene3).

8.1.4 D. Scenario: Motion impaired users - upper limb paralysis

Note that the SmartHome App will only react to the tasks defined below.

1. Use the Multimodal Interface Manager (VerMIM) to enter your simulated
disability, which is upper limb paralysis in this scenario.
2. Use the interaction tool given by VerMIM to execute the following tasks.
While doing this, imagine that you cannot move your upper body because of
your impairment.
3. Calibrate the headtracker. Please take some time to get a feeling for the
headtracker and calibrate for at least 30 seconds. Please do not change your
position or move your chair after calibrating. Note that the headtracker works
better if you move your whole head, as if you were turning it to look around.
4. Change language to English
5. Go to control room.
5.1. Open television controls
5.2. Turn on the television
5.3. Increase Volume
5.4. Go back to control
6. Open blinds control
6.1. Close blinds
6.2. Go back to control
7. Go back to Intro
8. Open settings
9. Activate outdoor Control lights
10. Go to Intro

A window with the statistics of your interaction will now open. Please save the
log file (file name: your user ID_scene4).
