
22/02/2022

1. Introduction to
Human-Computer
Interaction
(HCI)

Human–Computer Interaction


• Definition given by ACM*
(Association for Computing Machinery):

“Human-computer interaction is a discipline
concerned with the design, evaluation and
implementation of interactive computing
systems for human use and with the study
of major phenomena surrounding them.”

* ACM SIGCHI Curricula for Human-Computer Interaction



Computers
• Classical situation: a person using an
interactive graphics program on a workstation
• BUT machines are nowadays pervasive:
– Embedded computational machines, such as
parts of spacecraft cockpits or microwave ovens
– Ubiquitous computing, Wearable computing,
smartphones, tablets
– Virtual realities
– Mixed realities

Computers
• Keywords: computing, interaction
=> active machines
• E.g., the relationship between humans and
hammers is not part of HCI!
• Human factors studies the human aspects of
all designed devices, but not the mechanisms
of these devices.
• Human-Computer Interaction studies both the
mechanism side and the human side, but of a
narrower class of devices.

Humans
• Classical situation: a person using an
interactive graphics program on a workstation
• BUT complex social organizations can also be
considered:
– Group of humans (e.g., collaborating for work, for
leisure, etc., or competing): Computer Supported
Cooperative Work (CSCW)
– Large organizations (e.g., companies, institutions)

=> interfaces for distributed systems, computer-aided
communications between humans

Multidisciplinary nature of HCI


• HCI in the large is an interdisciplinary area.
• On the machine side: software engineering,
programming languages, development
environments, multimedia systems, computer
graphics, computer vision, sound and music
computing, …
• On the human side: psychology, anthropology,
sociology, theories of human cognition,
emotion, and perception, linguistics, social
sciences, industrial design, arts, humanities …

Historical roots
• Vannevar Bush’s Memex (1945)
– A proto-hypertext system: an electromechanical device for
reading a large self-contained library, and for adding or
following (collaboratively) associative trails of links and notes.
– Visions of future HCI, e.g., head-mounted cameras, voice
recognition, speech synthesis (the Memex desk).

Historical roots
• Man-machine symbiosis (Licklider, 1960)
• Augmentation of human intellect,
oN-Line System (NLS) (Engelbart, 1963) ->
“Augmenting human intellect”: precursor of
CSCW, pointing devices, windows, RPC
• These works provided the conceptual
framework and the lines of development for a
number of important building blocks for HCI:
e.g., the mouse, bitmapped displays, personal
computers, windows, the desktop metaphor,
point-and-click editors.

Historical roots
• Ivan Sutherland’s Sketchpad (1963)
uses a CRT and pen devices, is the first example of a
Graphical User Interface (GUI), and is the ancestor of modern
computer-aided drafting (CAD)

Historical roots
• Dynabook (Kay and Goldberg, 1977)
The first prototype of what is now known as a laptop computer or a
tablet PC, aimed at giving children access to digital media; the
prototype embodied learning theories from developmental psychology.


Historical roots
• Xerox PARC (Palo Alto Research Center)

Xerox Star (1981)


Early windows GUIs

Xerox Alto (1973)


Early prototypes of
mouse and GUI


HCI becomes a discipline


• This happened in the 80s, also because of the
widespread dissemination and adoption of
Personal Computers.
• Some important dates:
– 1981-83: IBM PC; 1984: Apple Macintosh
– Since 1983: ACM CHI Conference
– Since 1984: IFIP INTERACT Conference
– Since 1985: British Computer Society HCI Conference
– Since 1985: International Conference on Human-
Computer Interaction


Goals of HCI
• Basic goal: to improve the interactions
between users and computers by making
computers more usable and receptive to
the user’s needs.


HCI and Interaction Design


• What are the main differences between Interaction Design (ID)
and Human–Computer Interaction (HCI)?
• ID has cast its net much wider, being concerned with the theory,
research, and practice of designing user experiences for all manner of
technologies, systems, and products, whereas
• HCI has traditionally had a narrower focus, being “concerned with the
design, evaluation, and implementation of interactive computing
systems for human use and with the study of major phenomena
surrounding them” (ACM SIGCHI, 1992, p. 6).
Schaffer (2009): focusing more on the user experience and less on
usability. Many websites are designed to persuade or influence rather
than enable users to perform their tasks in an efficient manner.
Example: many online shopping sites are in the business of selling services and products,
where a core strategy is to entice people to buy what they might not have thought they
needed. Online shopping experiences are increasingly about persuading people to buy
rather than being designed to make shopping easy. This involves designing for persuasion,
emotion, and trust – which may or may not be compatible with usability goals.


We address topics and questions about the what, why,
and how of interaction design. These include:
• Why some interfaces are good and others are poor
• Whether people can really multitask
• How technology is transforming the way people communicate with one another
• What users’ needs are and how we can design for them
• How interfaces can be designed to change people's behavior
• How to choose between the many different kinds of interactions that are now
available (e.g. talking, touching, wearing)
• What it means to design truly accessible interfaces
• The pros and cons of carrying out studies in the lab versus in the wild
• When to use qualitative versus quantitative methods
• How to construct informed consent forms
• How the detail of interview questions affects the conclusions that can safely be
drawn
• How to move from a set of scenarios, personas, and use cases to initial low-fidelity
prototypes
• How to represent the results of data analysis clearly
• Why it is that what people say can be different from what they do
• The ethics of monitoring and recording people's activities
• What are Agile UX and Lean UX and how do they relate to interaction design?


Goals of HCI
• Methodologies for designing interfaces
– i.e., given a task and a class of users, design the best possible
interface within given constraints
• Methods for implementing interfaces
– e.g., software toolkits and libraries; efficient algorithms
• Techniques for evaluating and comparing
interfaces
• Developing new interfaces and interaction
techniques
• Developing descriptive and predictive
models and theories of interaction

HCI Curricula (ACM)


Example 1

Source: Interface Hall of Shame


Example 1
• Problems:
– Inconsistency with prior experience and other applications
(horizontal scrollbar)
– the horizontal scrollbar is an affordance for continuous scrolling,
not for discrete selection
– no shortcuts for frequent users
– The help text also has usability problems: “Press OKAY”? Where
is that? And why does the message have a ragged left margin?
– The presence of a help text is indicative of the presence of
usability bugs.
– A usable interface should not need many explanations!
– Usability bugs should be fixed along the design process and not
patched when the product is delivered!


Example 1
• The example redesigned

Source: Interface Hall of Shame


Example 2

Source: Interface Hall of Shame


Example 2

• Problems:
– The date and time look like editable fields (affordance), but you
cannot edit them with the keyboard.
– The dialog box displays the time differently, using 12-hour time
(7:17 pm) where the original dialog used 24-hour time (consistency).
– The third representation (analog clock) further increases
confusion.
– The way of changing the time using the two mouse buttons is
unfamiliar.


Example 3

Look at the context menu and at the sub-menu


Example 3
• Context menus popping up on right-click
are useful (you don’t have to go up to the
menu bar)
• But the appearing and disappearing of
sub-menus should be carefully managed:
– Sub-menus that appear and disappear too quickly make the
interface difficult to use
– If they are too slow, they make the user uncomfortable

• Design of user interfaces must carefully
consider human capabilities (=> need to
study perception and cognition).

Example 4

What happens when you select a tool in the palette?


What happens when you press the Caps Lock key?


Example 4
• “A human-machine interface is modal with respect
to a given gesture when (1) the current state of the
interface is not the user’s locus of attention and (2)
the interface will execute one among several
different responses to the gesture, depending on
the system’s current state”
(J. Raskin, The Humane Interface, 2000)
• Thus, a mode is a distinct setting within a computer
program or any physical machine interface, in
which the same user input produces perceivably
different results than it would in other settings.


Example 4
• Modes often lead to errors known as mode errors.
• A mode error can be quite disorienting as the user
copes with the violation of their expectations.
• An interface that uses no modes is known as a
modeless interface.
• An interface is not modal as long as the user is
aware of its current state.
• The best way to avoid mode errors is to build an
accurate mental model of the system for the users
which allows them to predict the mode accurately.
• Quasimodes are modes that are kept in place
only through some constant action by the user.
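Raskin’s definition can be made concrete with a small sketch (a hypothetical editor class, not from the slides): the same keypress yields different results depending on a hidden mode, which is exactly the setup for a mode error.

```python
# Illustrative sketch: the same gesture produces different results
# depending on a hidden mode; a mode error occurs when that mode is
# not the user's locus of attention. Class and key names are invented.

class Editor:
    def __init__(self):
        self.mode = "insert"  # hidden state: "insert" or "command"

    def press(self, key):
        # Identical gesture, mode-dependent interpretation.
        if self.mode == "insert":
            return f"typed '{key}'"
        return f"ran command '{key}'"

e = Editor()
print(e.press("d"))   # typed 'd'
e.mode = "command"    # the mode changes, perhaps without the user noticing
print(e.press("d"))   # ran command 'd': same gesture, different response
```

A quasimode avoids this by tying the state to a sustained user action (like holding Shift), keeping it in the user’s locus of attention.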

Example 5

The subjective quality of an interface and its
expressive, emotional aspects matter too.


Example 6

Source: www.baddesigns.com

Controls and labels look the same!


Designing good interfaces is not restricted to software
interfaces!


Example 7: Tabbed Browsing

Very useful in web browsers: tabbed browsing neatly solves
a scaling problem in the Windows taskbar.
Without tabs, the many top-level Internet Explorer windows
collapse into a single taskbar button with a pop-up menu:
less visible and less efficient to search.


Example 7: Tabbed Browsing

Tabs are widely used (e.g. EyesWeb)



…Example 7 – Tabbed Browsing

Multiple windows are grouped into a single top-level
window and accessed by a row of tabs.
Tabbed browsing allows grouping related tasks under
a single button in the task bar.
Tabbed browsing better supports task analysis and
shortcuts for task-oriented users (Mozilla).


…Example 7 – Tabbed Browsing


Problem: you can’t have more than 5-10 tabs without shrinking
their labels so much that they become unreadable.
Are multiple rows of tabs a good solution?

Hall of Shame: multiple rows of tabs (example: MS Word 6)


…Example 7 – Tabbed Browsing


-> Multiple rows of tabs
Clicking a tab in a back row (e.g., “Spelling”) has to move the
whole row forward in order to maintain the tabbing metaphor.
This is disorienting: (1) the tab you clicked on has leaped out
from under the mouse; (2) the other tabs you visited before are
now in totally different places.
Plausible solutions: color-coding each row of tabs; moving the
front rows of tabs below the page; using animations.
All solutions might reduce disorientation, but they add visual
complexity, greater demand on screen real estate, or having
to move the page contents in addition to the tabs.
No solution prevents the basic problem:
the tabs jumping around.


…Example 7 – Tabbed Browsing

Eclipse 3.0 shows a few tabs; the rest are found in a
pulldown menu at the right end of the tab bar.
Features: incremental search (typing into the first line of the menu
narrows the menu to tabs with matching titles, but this feature does not
communicate its presence very well); use of boldface to distinguish
between visible and hidden tabs (was that a good decision?).


…Example 7 – Tabbed Browsing


The solution proposed in Eclipse 3.0:
The problem with this pulldown menu is that it
completely disregards the natural, spatial mapping
that tabs provide:
• Menu order is unrelated to the order of visible tabs;
• Menu is alphabetical, but tabs are listed in order of
recent use: if you choose a hidden tab, it replaces the
least recently used visible tab.
LRU is a great policy for caches.
Is it appropriate for frequently accessed menus? No,
because it interferes with users’ spatial memory.
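The replacement policy described above can be sketched as follows (a hypothetical simplification, not Eclipse’s actual code; tab names are illustrative): visible tabs form an LRU set, and choosing a hidden tab evicts the least recently used visible one.

```python
from collections import OrderedDict

# Hypothetical sketch of the LRU tab policy described above: visible
# tabs form an LRU set; choosing a hidden tab evicts the least
# recently used visible tab. Tab names are illustrative.

def choose_tab(visible, chosen):
    """visible: OrderedDict of tab names, least recently used first."""
    if chosen in visible:
        visible.move_to_end(chosen)   # mark as most recently used
    else:
        visible.popitem(last=False)   # evict the least recently used tab
        visible[chosen] = None        # the hidden tab becomes visible
    return list(visible)

tabs = OrderedDict.fromkeys(["A", "B", "C"])
print(choose_tab(tabs, "B"))  # ['A', 'C', 'B']: order already churns
print(choose_tab(tabs, "X"))  # ['C', 'B', 'X']: 'A' silently evicted
```

Every selection can reorder or replace a visible tab, which is precisely what defeats the user’s spatial memory.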


Example 8
How to design an intuitive, usable phone answering machine?


Example 8
Marble Answering Machine (Bishop, 1995)
The Marble Answering Machine (by
Durell Bishop, a student at the Royal
College of Art) is a prototype
telephone answering machine.
Incoming voice messages are
represented by marbles, which the
user can grasp and drop to play the
message or dial the caller
automatically. It shows that
computing doesn’t have to take place
at a desk: it can be integrated into
everyday objects. The Marble
Answering Machine demonstrates
the great potential of making digital
information graspable.

– Tangible interface
– Based on the use of everyday objects
– Easy and intuitive to use
– Requires one-step actions to perform core tasks

Example 9
Bancomat (cash machine) (our student Andrea Germinario)

– Tangible interface
– Inconsistency of instructions


The user interface…


… is important
• It strongly affects perception of software and
can make the difference for selling a software
product!
… is critical
• The lightest consequence of a bad user
interface is wasting the user’s time; the worst
is causing a disaster!
… is hard to design and to build
• Specialists are needed for design (as in
software engineering); user interface takes a
lot of software development effort.

Interaction design

(Diagram: interface design focuses on the user interface between
user and system; interaction design focuses on the interaction
itself between user and system)


Interaction design
• Preece, Sharp, and Rogers, 2015:
Designing interactive products to support the way
people communicate and interact in their everyday
and working lives
• Winograd, 1997:
The design of spaces for human communication
and interaction
• Wikipedia
The discipline of defining the behavior of products
and systems that a user can interact with


Goals of interaction design


(Sharp, Rogers and Preece, 2007)

• Develop usable products
– Easy to learn, effective to use, and
providing an enjoyable experience
• Involve users in the design process
• HCI is one of the multidisciplinary fields
that “do” interaction design.


Usability

“The extent to which a product can be used by
specified users to achieve specified goals with
effectiveness, efficiency and satisfaction in a
specified context of use.”

ISO 9241-11: Guidance on Usability (1998)


Usability

“The extent to which a product can be used by
specified users to achieve specified goals with
effectiveness, efficiency and satisfaction in a
specified context of use.”

E.g., novice users, expert
users, users with disabilities,
elderly users


Usability

“The extent to which a product can be used by
specified users to achieve specified goals with
effectiveness, efficiency and satisfaction in a
specified context of use.”

This means that users’ needs
and goals should be studied
and defined carefully!


Usability

“The extent to which a product can be used by
specified users to achieve specified goals with
effectiveness, efficiency and satisfaction in a
specified context of use.”

E.g., at home, at work, while
driving, on mobiles, in public
spaces => context-aware apps


Usability

• Usability is therefore a relative concept that
depends on three independent variables:
users, goals, and contexts



Usability goals/dimensions
• Effectiveness
– Does the system allow the user to fully
reach the goal? How accurate is it?
• Efficiency
– How many resources have to be spent to
obtain the result? Is the system fast to
use, once learned?
• Satisfaction
– Is the system enjoyable to use? Do
users appreciate it?


Usability goals/dimensions
• When does the system need to be
effective, efficient, and satisfactory? The
first time it is used? After some training?
What happens in case of errors?
• Learnability
– How easy is the system to learn?
• Memorability
– Is the use of the system easy to remember?
• Safety
– Is the system safe? Are errors few and
recoverable?
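As a rough sketch (the session data and field names below are made up for illustration), the three core dimensions can be turned into simple metrics from user-test sessions:

```python
# Rough illustrative sketch: computing effectiveness, efficiency, and
# satisfaction metrics from hypothetical user-test sessions.

sessions = [
    {"completed": True,  "time_s": 42, "satisfaction": 4},
    {"completed": True,  "time_s": 55, "satisfaction": 3},
    {"completed": False, "time_s": 90, "satisfaction": 2},
]

n_done = sum(s["completed"] for s in sessions)
effectiveness = n_done / len(sessions)            # task completion rate
efficiency = sum(s["time_s"] for s in sessions
                 if s["completed"]) / n_done      # mean time of successes
satisfaction = sum(s["satisfaction"] for s in sessions) / len(sessions)

print(round(effectiveness, 2))  # 0.67: 2 of 3 tasks completed
print(efficiency)               # 48.5: seconds per completed task
print(satisfaction)             # 3.0: mean rating on a 1-5 scale
```

Real evaluations use richer measures (error counts, learning curves, standardized questionnaires), but the pattern is the same: each goal maps to something observable.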

Usability Goals vary in importance


• Depend on the user:
– Novice users need Learnability
– Infrequent users need Memorability
– Experts need Efficiency
• But no user is uniformly novice or expert:
– Domain experience
– Application experience
– Application’s feature experience


Usability is only one attribute of a system

• Software designers worry about:
– Functionality; Performance; Cost; Security;
Usability; Size; Reliability; Standards
• Many design decisions involve trade-offs
among different attributes

• In this course we focus on Usability


Learnability
• Some systems are designed so that
novice users can quickly and easily learn
how to use them.
• Other systems are designed for expert
users, and only expert users can use
them effectively.
• Still other systems have alternative
modes (for novices and for experts).
• But no user is uniformly novice or expert!


Learning curve


The Ten-Minute Rule

• Novice users should be able to learn how
to use a system in under 10 minutes
(Nelson, 1980).
• It is a useful rule of thumb for evaluating
many systems.
• It can be inappropriate for complex
systems, for systems providing diverse
functionality, or needing high levels of skill.


Memorability

• It is particularly important for systems
that are not often used, but that require
high efficiency and safety in their use.
• When reactions must be fast, users have
no time to check manuals: they must
remember how to use the system.
• For example, systems used in
emergency procedures (e.g., fire alarms).


Learnability and memorability


• Every system should allow the novice user
to learn how to use its basic functions
without the need of any training or manual.
• Every system should be easy enough to
remember so that a user who seldom uses it
can use it again after some time without
checking the manual.
• Lengthy manuals and long learning times for
basic features are often signs of bad design!

Safety
• Protecting the user from dangerous conditions
and undesirable situations:
– Preventing the user from making serious errors
(e.g., do not place the quit or delete command right
next to the save command in a menu!)
– Providing users with ways to recover from errors
(e.g., undo facilities, confirmatory dialog
boxes).
• Safe interactive systems engender confidence
and allow exploration of functionalities.


User experience goals


• User experience: what the interaction with the
system feels like to the users.
• User experience goals include a system being:
enjoyable, engaging, pleasurable, exciting, entertaining,
helpful, motivating, emotionally fulfilling, supportive of
creativity, aesthetically pleasing, rewarding, fun,
provocative, surprising, enhancing sociability, challenging

• ... and not boring, annoying, or frustrating.


• Quality of Service → Quality of Experience

Usability goals vs.
user experience goals
• User experience goals are more subjective
• Usability goals are more objective (=> metrics)
• Sometimes usability and user experience goals
conflict: for example, a system that is not easy to use
(e.g., a game) can be more challenging and hence
more interesting than an easy-to-use one!
• Trade-offs are often needed between the two kinds
of goals
– e.g. can a product be both fun and safe?
– Not all the usability and the user experience goals are
applicable to every system. For example, games must be
fun, process control systems must be safe.


Readings
• Curricula for Human-Computer Interaction, ACM
Special Interest Group on Computer-Human
Interaction, Curriculum Development Group,
http://sigchi.org/cdg/index.html, last updated 2008-04-11.
• Shneiderman, B. (2009). Designing the User
Interface. Addison-Wesley, Chapter 1.
• Preece, J., Rogers, Y., Sharp, H. (2019). Interaction
Design. Wiley, 5th Ed.
• MIT course on HCI
• Web: Interface Hall of Shame, baddesigns.com, …


2. The Human:
Sensory channels,
Perception, Cognition,
Emotion


HCI Curricula (ACM)


The Human

• Information i/o …
• visual, auditory, haptic, movement
• Information stored in memory:
– sensory, short-term, long-term
• Information processed and applied:
– reasoning, problem solving, skill, error
• Emotion influences human capabilities
• Each person is different


Human processor model

Card, Moran, Newell, The psychology of human-computer interaction, 1983




Perception

• The process of attaining awareness or
understanding of sensory information.
• That is, the process of transforming sensory
information into higher-level representations
which can be used in associative processes
(memory access) and cognitive processes
such as reasoning.


Perception
• One of the oldest fields in psychology
– E.g., the Weber–Fechner law (19th century) on
the logarithmic relationship between physical
magnitudes of stimuli and perceived intensity
• What one perceives is a result of interplays
between past experiences, including culture,
and the interpretation of the perceived.
• If a percept does not have support in any of
these perceptual bases, it is unlikely to rise
above perceptual thresholds.
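The Weber–Fechner relationship mentioned above can be sketched numerically (the constants k and s0 below are arbitrary, for illustration only): equal stimulus ratios produce equal increments in perceived intensity.

```python
import math

# Illustrative Weber-Fechner sketch: perceived intensity grows with
# the logarithm of the physical stimulus magnitude. The constants k
# and s0 are arbitrary choices, not empirical values.

def perceived(stimulus, k=1.0, s0=1.0):
    return k * math.log(stimulus / s0)

# Doubling the stimulus adds a constant amount to the percept,
# regardless of the starting level:
print(round(perceived(2) - perceived(1), 3))      # 0.693
print(round(perceived(200) - perceived(100), 3))  # 0.693
```

This is why, for example, loudness is measured on a logarithmic (decibel) scale.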


Sensory modalities
• Law of specific nerve energies: we are
aware not of objects themselves but of
signals about them transmitted through our
nerves; there are different kinds of
nerves, each nerve having its own “specific
nerve energy” (Müller, 1830).
• Müller adopted the five primary senses that
Aristotle had recognized: seeing, hearing,
touch, smell, taste.
• The specific nerve energy represented the
sensory modality that each type of nerve
transmitted.

Sensory modalities
• Sensory modality: the sensory channel
through which information is perceived; it
refers to the type of communication channel
used for transferring or acquiring information.
• There are specific receptor cells, tuned to be
sensitive to different forms of physical energy
in the environment
• Mode: a state determining how information is
interpreted to extract or transfer its meaning.


Sensory modalities

An overview of input channels at the neurophysiological level



Sensory modalities

• The different sensory modalities are not
processed in isolation.
• Multimodal areas exist in cortical and sub-
cortical areas.
• The integration of the different channels is
essential, among other things, for allowing
the brain to reconstruct internal body
models and internal representations.


Terminology:
Proximal and Distal Stimuli

• distal (~distant) stimuli or objects: objects
and events out in the world
• proximal (~close) stimuli: patterns of stimuli
from these objects and events that actually
reach your senses (eyes, ears, etc.)


Proximal and Distal Stimuli


• Most of the time, perception reflects the properties of the distal objects
and events very accurately, much more accurately than you might
expect from the apparently limited, varying, unstable pattern of
proximal stimulation the brain/mind gets.
• The problem of perception is to understand how the mind/brain
extracts accurate stable perceptions of objects and events from such
apparently limited, inadequate information.
• In vision, light rays from distal objects form a sharply focused array on
the retina at the back of the eye. But this array continually varies as the
eyes move, as the observer gets different views of the same object, as
the amount of light varies, etc. Although this proximal stimulus array is
what actually triggers the neural signals to the brain, we are quite
unaware of it, or pay little attention to it, most of the time. Instead we
are aware of and respond to the distal objects that the proximal
stimulus represents. This is completely reasonable: the distal object is
what is important.


Terminology:
Multimedia Vs. Multimodal
• Medium = information carrier (e.g.,: press,
video, audio, graphic video terminals, e-mails, ...)
• Multimedia system: a system that can
gather information from and produce
information in more than one medium.
• Codes (encoding) = conventions used for
representing information
• E.g., the press medium may contain codes such as
pictures, diagrams, and text.

Multimedia vs. Multimodal


• Multimedia systems and multimodal
systems both make use of many media and
many communication channels.
• Moreover, a multimodal system aims at
modeling information content: it aims at
understanding and processing the
underlying meaning of information.
• A multimedia system focuses on the
medium, i.e., on technology rather than on
users, and therefore operates at the
application level.


The typical architecture of a
multimedia system

(Diagram: layered architecture with Syntax, Semantics, Pragmatics,
and Application levels)


Architecture of a
multimodal system

(Diagram: layered architecture with Syntax, Semantics, Pragmatics,
and Application levels)


Multimodal Machine Learning


• Our experience of the world is multimodal: we see objects,
hear sounds, feel texture, smell odors, and taste flavors.
• Modality refers to the way in which something happens or is
experienced and a problem is characterized as multimodal
when it includes multiple such modalities.
• In order for Artificial Intelligence to make progress in
understanding the world around us, it needs to be able to
interpret such multimodal signals together:
• Multimodal machine learning aims to build models that can
process and relate information from multiple modalities. It is
a multi-disciplinary field of increasing importance and with
extraordinary potential. Challenges include representation,
translation, alignment, fusion, and co-learning.
• T. Baltrušaitis, C. Ahuja, L.-P. Morency (2017). Multimodal
Machine Learning: A Survey and Taxonomy.
https://arxiv.org/pdf/1705.09406.pdf


Sensory modalities
• Major modalities for Human-Computer
Interaction and multimedia:
– Vision
– Hearing
– Somatic senses
• Some of the sensory modalities do not have a
cortical representation (sense of balance,
chemical senses) or have only a very reduced
one (taste) and do not give rise to
“conscious perception”: thus we cannot
speak, for them, of “perceptual channels”.


Smell
• The senses of smell and taste have not (yet)
been widely used in HCI and multimedia.
• However, the sense of smell in particular has
great potential, e.g., in human-machine
interaction with mobile robots.
• In nature odors are not only important from
the point of view of “chemical” analysis, but
also from the navigation point of view, for
“marking” the territory and setting “landmarks”
which are of great help in path planning.


Smell
• Biological memory is linked to odors, most
probably because the phylogenetically oldest
systems of territory representation are based
on the chemical senses.
• Spatial memory is probably related to the
hippo-campus, a cortical area in the immediate
neighborhood of the olfactory cortex.
• There have been experiments in robotic path
planning that follow the gradient of an odor
(e.g., Russell, 2000).


Smell
• It seems reasonable to consider odor as a
creative bidirectional channel of
communication between man and a moving
machine.
• Unlike sound, olfactory marks have a physical
persistence, like visual traces, but are
invisible themselves and may thus be used in
parallel to visible traces.
• Human chemical senses such as taste and
smell create particularly salient memories.

Vision
• “vision: [...] 2b (1): mode of seeing or
conceiving; [...] 3a: the act or power of
seeing: SIGHT; 3b: the special sense by
which the qualities of an object [...]
constituting its appearance are perceived and
which is mediated by the eye; [...]”
• Vision plays the most important role as input
modality for information processing.
• There is a large body of experimental
knowledge about the properties of vision
(even if we cannot fully understand it yet).


Vision

Two stages in vision

• physical reception of stimulus

• processing and interpretation of stimulus


The Eye - physical reception


• mechanism for receiving light and
transforming it into electrical energy
• light reflects from objects
• images are focused upside-down on the
retina
• the retina contains rods for low-light vision
and cones for colour vision
• retinal ganglion cells detect pattern and
movement


The eye

• Eyes are organs that detect light, and send
signals along the optic nerve to the visual
and other areas of the brain.


Receptors

The retina contains two major types of
light-sensitive photoreceptor cells used
for vision: the rods and the cones.


Receptors
• Rods cannot distinguish colors, but are
responsible for low-light black-and-white vision
• Rods work well in dim light as they contain a
pigment, visual purple, which is sensitive at
low light intensity, but saturates at higher
intensities.
• Rods are distributed throughout the retina but
there are none at the fovea and none at the
blind spot. Rod density is greater in the
peripheral retina than in the central retina.


Receptors
• Cones are responsible for color vision.
• They require brighter light than rods.
• There are three types of cones, maximally
sensitive to long-wavelength, medium-
wavelength, and short-wavelength light (often
referred to as red, green, and blue).
• The color seen is the combined effect of stimuli
to, and responses from, these cone cells.
• Cones are mostly concentrated in and near the
fovea. Only a few are present at the sides.


Receptors

• Objects are seen most sharply in focus when
their images fall on the fovea, as when one
looks at an object directly.
• Cone cells and rods are connected through
intermediate cells in the retina to nerve fibers
of the optic nerve.
• When rods and cones are stimulated by light,
the nerves send impulses through these
fibers to the brain.


Interpreting the signal (1/3)


• Size and depth
– visual angle indicates how much of the view an
object occupies (relates to size and distance
from the eye)
– visual acuity is the ability to perceive detail (limited)
– familiar objects are perceived as constant in size
(in spite of changes in visual angle when far away)
– cues like overlapping help perception of size and
depth

Interpreting the signal (2/3)


• Brightness
– subjective reaction to levels of light
– affected by luminance of object
– measured by just-noticeable difference
– visual acuity increases with luminance

• Colour
– made up of hue, intensity, saturation
– cones sensitive to colour wavelengths
– blue acuity is lowest
– 8% of males and 1% of females are colour blind


Interpreting the signal (3/3)

• The visual system compensates for:
– movement
– changes in luminance
• Context is used to resolve ambiguity
• Optical illusions sometimes occur due to
overcompensation


Visual acuity

• Visual acuity measures how much an eye can


differentiate one object from another in terms
of visual angles.
• It is often measured in cycles per degree
(CPD), i.e., as an angular resolution.
• For a human eye with excellent acuity, the
maximum theoretical resolution would be 50
CPD. A rat can resolve only about 1 to 2 CPD.
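A quick back-of-the-envelope conversion (illustrative only, not from the slides): one cycle of a grating at a given CPD spans 1/CPD degrees, and the finest resolvable line is half a cycle:

```python
def min_detail_arcmin(cpd: float) -> float:
    """Smallest resolvable detail, in arc-minutes, for an acuity
    expressed in cycles per degree (one detail = half a cycle)."""
    return (1.0 / cpd) / 2.0 * 60.0

human = min_detail_arcmin(50)   # 0.6 arcmin for excellent human acuity
rat = min_detail_arcmin(1.5)    # 20 arcmin for a rat
```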


Eye movement
• Saccades: very fast (even more than 400°/s)
eye movements with very short time duration
(20~50 ms), aiming at moving the fovea so
that small parts of a scene can be sensed with
greater resolution.
• Fixation: maintaining the visual gaze on a
single location. Humans typically alternate
saccades and visual fixations. During fixation
(~60-700 ms) visual information is captured.
• Scanpath: 2D trajectory followed by eyes
while exploring a scene (alternation of
saccades and fixations, duration: ~230 ms).
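Eye-tracking software commonly segments raw gaze samples into fixations and saccades. The sketch below is a simplified dispersion-threshold (I-DT-style) detector; the sample format, thresholds, and function names are invented for this example:

```python
def dispersion(window):
    """Spread of a window of (t_ms, x, y) samples: x-range + y-range."""
    xs = [p[1] for p in window]
    ys = [p[2] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def detect_fixations(samples, max_dispersion=1.0, min_duration_ms=60):
    """Group consecutive gaze samples into fixations: grow a window
    while its dispersion stays small, and keep it as a fixation if it
    lasts long enough. Returns (start_ms, end_ms, cx, cy) tuples."""
    fixations, i, n = [], 0, len(samples)
    while i < n:
        j = i
        while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion:
            j += 1
        if samples[j][0] - samples[i][0] >= min_duration_ms:
            xs = [p[1] for p in samples[i:j + 1]]
            ys = [p[2] for p in samples[i:j + 1]]
            fixations.append((samples[i][0], samples[j][0],
                              sum(xs) / len(xs), sum(ys) / len(ys)))
            i = j + 1  # skip past the fixation
        else:
            i += 1     # sample belongs to a saccade
    return fixations

# Two stable gaze clusters separated by a rapid jump (a saccade):
gaze = [(0, 10.0, 10.0), (20, 10.1, 10.0), (40, 10.0, 10.1),
        (60, 10.1, 10.1), (80, 10.0, 10.0), (100, 15.0, 8.0),
        (120, 20.0, 5.0), (140, 20.1, 5.0), (160, 20.0, 5.1),
        (180, 20.0, 5.0)]
fixes = detect_fixations(gaze)   # two fixations detected
```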

Eye movement

[Figure: a scanpath: two fixations connected by a saccade]


Eye tracking
• The process of measuring either the point of
gaze (“where we are looking”) or the motion of
an eye relative to the head.
• Eye tracker: device for
measuring eye positions
and movements.
• Video-based eye
trackers: an infrared
light is used to create
a corneal reflection
(CR), which is tracked
with a video-camera.


Analysis of eye movements


Analysis of eye movements


Analysis of eye movements


Eye movements and tasks

Yarbus, A. L., 1967. Eye Movements and Vision. Plenum. New York.


Vision

• From the point of view of applications the


properties of vision which are basic and
important are:
– light intensity response
– color response
– temporal response
– spatial responses


Light and color response


• Humans are sensitive to a very narrow band
of frequencies within the enormous range of
frequencies of the electromagnetic spectrum.
• This narrow band of frequencies is referred to
as the visible light spectrum.
• Each specific wavelength within the spectrum
corresponds to a specific color.
• The long wavelength end of the spectrum
corresponds to light perceived to be red;
the short wavelength end to light perceived to
be violet.

Light and color response

[Figure: cone sensitivity curves]


Temporal response
• It is responsible for many effects like
perception of light intensity changes, and
rendering of motion.
• The response can also be very high in
specific conditions but in practice it can be
considered to be limited to a maximum of
100 Hz for very good motion rendering and
few tens of Hz for light intensity change.
• This is dependent on the type of visual
stimulation, distance, lighting conditions.


Spatial response
• Spatial response deals with the problems of
visual resolution, width of the field of view,
and spatial vision.
• Resolution: while in normal scenes the
resolution of details need not be very
high, in specific situations the eye is very
sensitive to resolution: this is the reason
why magazine printing might require a
hundred times higher resolution than TV.


Spatial response
• Width of the field of view: while the central
visual field which brings most information is
essential, there is a much wider peripheral
vision system which has to be activated in
order to increase the perceptual involvement
(cinema vs. TV effect).
• On top of this there is a sophisticated spatial
vision system which is partially based on
binocular vision and partially on spatial
feature extraction from monocular images.


Optical illusions

• An optical illusion (also called a visual


illusion) is characterized by visually perceived
images that differ from objective reality.
• The information gathered by the eye is
processed in the brain to give a percept that
does not tally with a physical measurement of
the stimulus source.


Optical illusions

• There are three main types:


– literal optical illusions that create images that are
different from the objects that make them;
– physiological optical illusions that are the effects
on the eyes and brain of excessive stimulation of
a specific type (brightness, tilt, color, movement);
– cognitive optical illusions where the eye and brain
make unconscious inferences.


Physiological illusions
• Stimuli have individual dedicated neural paths in
the early stages of visual processing; repetitive
stimulation of only one or a few channels causes a
physiological imbalance that alters perception.
• Mach bands is an illusion that is best explained
using a biological approach: in the receptive field
of the retina, light and dark receptors compete with
one another to become active.
• Once a receptor is active, it inhibits adjacent
receptors (lateral inhibition).
• We therefore see bands of increased brightness
at the edge of a color difference.
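The lateral-inhibition account of Mach bands can be illustrated with a toy one-dimensional simulation. This is a didactic sketch, not a physiological model; the inhibition factor `k` and the step values are arbitrary:

```python
def lateral_inhibition(intensities, k=0.2):
    """Each receptor responds with its own input minus a fraction k of
    its two neighbours' inputs (boundaries clamped). Near a step edge
    this produces an undershoot on the dark side and an overshoot on
    the bright side: the Mach bands."""
    n = len(intensities)
    out = []
    for i in range(n):
        left = intensities[max(i - 1, 0)]
        right = intensities[min(i + 1, n - 1)]
        out.append(intensities[i] - k * (left + right))
    return out

step = [1.0] * 5 + [2.0] * 5        # dark region, then bright region
response = lateral_inhibition(step)
# response[4] dips below its dark neighbours and response[5] overshoots
# its bright neighbours: perceived bands at the edge.
```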

Mach Bands


The squares A and B in the illusion are the same
color (or shade), although they seem to be different.


Context can create optical illusions:
the Ponzo and Müller-Lyer illusions

The Ponzo illusion: do the two rectangles
have the same dimensions?

The Müller-Lyer illusion: which is the longest line?


Illusions in text reading tasks

Reading test: read quickly the following text:


The quick brown
fox jumps over the
the slow dog

Most subjects do not notice the repetition of the article "the".

The visual system tries to compensate (and sometimes
over-compensates), aiming at perceiving
the surrounding world correctly.

Similar phenomena apply to hearing.



Reading
Several stages:
- the visual pattern of the word on the page is perceived
- decoded using internal representation of language
- interpreted using knowledge of syntax, semantics, pragmatics

Reading involves saccades and fixations; perception occurs
during fixations (94% of total time).
The eye moves forward and backward over the text (regressions):
the more complex the text, the higher the number of regressions.
Adults read about 250 words per minute.
Word shape is important to recognition: a familiar word is
recognized by its shape, so it may require the same time as
recognizing a single character. This means that
modifications of the shape of familiar words (e.g., all
characters upper case) are detrimental to fast and precise
reading.
Negative contrast improves reading from a computer screen.

Hearing

• “hearing: 1: to perceive or apprehend by the


ear; [...] 1: to have the capacity of
apprehending sound; [...] 1: the process,
function, or power of perceiving sound; specif:
the special sense by which noises and tones
are received as stimuli; [...]”


Hearing
• Provides information about the surrounding environment:
distances, directions, identification of objects and of the physical matter
and actions of objects (e.g., the sound of "friction", of "rolling", of "walking
steps" on different materials such as snow, sand, or stone).
• Physical apparatus:
– outer ear – protects the inner ear and amplifies sound
– middle ear – transmits sound waves as
vibrations to inner ear
– inner ear – chemical transmitters are released
and cause impulses in auditory nerve
• Perceived and objective qualities of sound:
– pitch – sound frequency
– loudness – amplitude
– timbre – type or quality


The external stimulus: Sound

• Sound:
– changes in air pressure (vibrations)
• Sounds are produced:
– physical objects vibrate according to their
properties and thus cause movements of the air
• Sounds are perceived:
– air vibrations are picked up by eardrums, and
later in the inner ear are transformed to nerve
impulses


Sound

[Figure: waveform of a piano note, amplitude vs. time]

The sound of a note on a pianoforte: the initial wave (left) is the sum of the periodic
sound of the note and the noise of the hammer hitting the strings; after a while (right)
the noise fades away and only the periodic component of the sound remains.

Sound

[Figure: a vibrating sound source produces pressure waves in the air that reach the ear]


The ear


The ear
• The ear has external, middle, and inner portions. The outer
ear is called the pinna and is made of ridged cartilage covered
by skin. Sound funnels through the pinna into the external
auditory canal, a short tube that ends at the eardrum
(tympanic membrane).
• Sound causes the eardrum and its tiny attached bones in the
middle portion of the ear to vibrate, and the vibrations are
conducted to the nearby cochlea. The spiral-shaped cochlea
is part of the inner ear; it transforms sound into nerve
impulses that travel to the brain.
• The fluid-filled semicircular canals (labyrinth) attach to the
cochlea and nerves in the inner ear. They send information on
balance and head position to the brain. The eustachian
(auditory) tube drains fluid from the middle ear into the throat
(pharynx) behind the nose.


Hearing

• Major attributes of hearing:


– Loudness
– Pitch
– Timbre
– Spatial attributes


Sound Intensity
• Sound pressure or acoustic pressure is the local pressure deviation
from the ambient (average, or equilibrium) atmospheric pressure,
caused by a sound wave.
• In air, sound pressure can be measured using a microphone, and in
water with a hydrophone.
• In a sound wave, the complementary variable to sound pressure is the
particle velocity. Together they determine the sound intensity of the
wave:
• Sound intensity, denoted I and measured in W·m−2, is given by:
I=pv
where:
– p is the sound pressure, measured in Pa;
– v is the particle velocity, measured in m·s−1.
• The distance law of sound pressure p for a spherical sound wave at a
distance r from a point sound source is given by:
p ∝ 1/r
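These two relations translate directly into code. The sketch below is only illustrative; the function names are invented here:

```python
def sound_intensity(p: float, v: float) -> float:
    """Sound intensity I = p * v, in W/m^2, from sound pressure p (Pa)
    and particle velocity v (m/s)."""
    return p * v

def pressure_at_distance(p_ref: float, r_ref: float, r: float) -> float:
    """Distance law for a point source: p is proportional to 1/r, so
    pressure scales with the ratio of the distances."""
    return p_ref * (r_ref / r)

# Doubling the distance halves the sound pressure:
half = pressure_at_distance(1.0, 1.0, 2.0)   # 0.5 Pa
```

Since SPL is 20·log10 of a pressure ratio, halving the pressure corresponds to a drop of about 6 dB per doubling of distance.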


Loudness
• Perceived sound intensity
• Used to arrange sounds on a scale from quiet to loud
• Perceived physical quality:
– measured as dB SPL (sound pressure level):
SPL = 20 log10(p/p0)
where
• p is the root mean square sound pressure;
• p0 is the commonly used reference sound pressure in air (20 μPa RMS) which is
usually considered the threshold of human hearing (roughly the sound of a
mosquito flying 3 m away). Sound level measurements are made relative to this
level (Standard ANSI S1.1-1994)
• The lower limit of audibility is defined as SPL of 0 dB,
– 10 dB SPL – barely audible
– 60 dB SPL – moderately loud
– The upper limit is not as clearly defined. Approx 120 dB SPL – threshold of pain
• In general: twice as loud is ~6 dB of difference
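The SPL formula above translates directly to code (an illustrative sketch):

```python
import math

P0 = 20e-6  # reference sound pressure in air: 20 uPa RMS

def spl_db(p_rms: float) -> float:
    """Sound pressure level in dB: SPL = 20 * log10(p / p0)."""
    return 20 * math.log10(p_rms / P0)

# The reference pressure itself is 0 dB SPL by definition; a pressure
# 1000 times larger gives 60 dB SPL (moderately loud).
quiet = spl_db(20e-6)   # 0.0
loud = spl_db(20e-3)    # 60.0
```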


Loudness and Frequency:


Equal-loudness contour
• Loudness is related to physical sound level, to frequency and to other
characteristics including distance and taste.
• Human hearing does not have a flat spectral sensitivity (frequency
response): sensitivity varies with both frequency and amplitude.
• Because the frequency response of human hearing changes with
amplitude, three weightings have been established for measuring sound
pressure: A, B and C.
– A-weighting: sound pressures levels up to 55 dB (dBA or LA)
– B-weighting: sound pressures levels between 55 and 85 dB (dBB or LB)
– C-weighting: sound pressure levels above 85 dB (dBC or LC)
• Some sound measuring instruments use the letter "Z" as an indication of
linear SPL.
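The A-weighting correction has a standard analytic form (from the sound-level-meter standard IEC 61672); the sketch below implements it purely for illustration:

```python
import math

def a_weighting_db(f: float) -> float:
    """A-weighting correction, in dB, at frequency f (Hz), using the
    standard analytic approximation; normalized to ~0 dB at 1 kHz."""
    f2 = f * f
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * math.log10(ra) + 2.00

# Low frequencies are heavily discounted relative to 1 kHz:
# a_weighting_db(100) is roughly -19 dB, a_weighting_db(1000) ~ 0 dB.
```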


Equal Loudness Contours


• Equal-loudness contour = a measure of sound pressure (dB SPL),
over the frequency spectrum, for which a listener perceives a
constant loudness when presented with pure steady tones.
• Unit of measurement for loudness levels: phon
– two sine waves of differing frequencies are said to have equal-loudness level
measured in phons if they are perceived as equally loud by the average young
person without significant hearing impairment.
– The phon is a unit of loudness level for pure tones: the number of phon of a
sound is the dB SPL of a sound at a frequency of 1 kHz that sounds just as loud.
– Its purpose is to compensate for the effect of frequency on the perceived
loudness of tones.
• Equal-loudness contours are often referred to as "Fletcher-Munson"
curves, after the earliest researchers, but those studies have been
superseded and incorporated into newer standards.
• The definitive curves are those defined in the international
standard ISO 226:2003 which are based on a review of several
modern determinations made in various countries.


Equal Loudness Contours

Original Fletcher-Munson curves


Loudness

[Figure: typical loudness levels, from a ticking clock and conversation
up to a reaping machine, a car horn, a rock concert, and a gunshot
(pain threshold)]


Pitch
• Perceptual property:
– Property that can be used to arrange sounds
on a scale from low to high
– equal to the frequency of a sine wave that a
listener judges to have the same pitch as
the sound
• Sounds may:
– clearly evoke a sensation of pitch: harmonic
sounds (tones)
– evoke no sensation of pitch
– or fall in between


Pitch
• Range: 20 Hz – 20 kHz
• We perceive pitch on a logarithmic scale
with the base of 2:
– sound 2x as high – frequency 2x as high
– sound 3x as high – frequency 4x as high
– sound 4x as high – frequency 8x as high
– ...
• In Western music, an octave is divided into
12 semitones with frequencies a factor of 2^(1/12) apart
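The equal-tempered semitone ratio can be checked numerically (an illustrative sketch):

```python
def semitone_frequency(base_hz: float, semitones: int) -> float:
    """Equal temperament: each semitone multiplies the frequency by
    2**(1/12), so 12 semitones give exactly one octave (2x)."""
    return base_hz * 2 ** (semitones / 12)

a4 = 440.0
octave_up = semitone_frequency(a4, 12)   # 880.0 Hz
c5 = semitone_frequency(a4, 3)           # ~523.25 Hz
```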


Frequency spectrum of
sounds
• The frequency spectrum of harmonic
sounds contains one or more stable
frequency components, called partials
[Figure: amplitude vs. frequency spectrum with partials at 220, 440, 880, and 1760 Hz]


Frequency spectrum of
sounds
• The frequency spectrum of non-harmonic
sounds is usually noisy – there are no
prominent partials
[Figure: noisy amplitude vs. frequency spectrum (0-4000 Hz) with no prominent partials]

Pitch and partials

• Pitch of a tone is related to frequencies


and distribution of partials:
– sinewave:

[Figure: spectrum of a sine wave: a single partial]


Pitch and partials

Piano A3 (220 Hz): all first 7 partials present
[Figure: partials at 220, 440, 660, 880, ..., 1760 Hz]

Clarinet A3 (220 Hz): missing even partials
[Figure: partials at 220, 660, 1100, 1540 Hz]


Pitch and partials


Piano A1 (55 Hz): missing 1st partial (fundamental frequency)
[Figure: partials at 110, 165, 220, 330, 660, 880 Hz]


Auditory illusions

– Shepard scale

– Risset scale

Studies on psychoacoustics by J.C. Risset at Bell Labs, 1969


Timbre

• Quality of sound enabling us to discern its origin


• Many parameters affect perception of timbre:
– the number, amplitudes and distribution of partials
– development of partials through time
– vibrato (modulation of frequency, 2-5 Hz)
– tremolo (modulation of amplitude, 2-5 Hz)
– sound onset (slow/fast attack of sound), and in
general the dynamic shape of ADSR: attack,
decay, sustain, release.


Spatial attributes:
distance and direction
• The perception of the direction of a sound source
depends on
– the differences in the signals between the two ears
(interaural cues):
• interaural level difference (ILD)
• interaural time difference (ITD)
– the spectral shape at each ear (monaural cues).
• Interaural and monaural cues are produced by
reflections, diffractions and damping caused by the
body, head, and pinna.
• The transfer function from a position in space to a
position in the ear canal is called head-related
transfer function (HRTF).
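As an illustration of an interaural cue, the interaural time difference can be approximated with the classic spherical-head (Woodworth) formula. The head radius and the formula are a textbook simplification, not something defined in these slides:

```python
import math

def itd_seconds(azimuth_deg: float, head_radius_m: float = 0.0875,
                c: float = 343.0) -> float:
    """Interaural time difference for a distant source at a given
    azimuth (0 = straight ahead), via the spherical-head model:
    ITD = (a / c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))

# Straight ahead the delay is zero; a source at 90 degrees to the
# side arrives at the far ear roughly 0.65 ms later.
```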

Spatial attributes:
distance and direction

• The perception of distance is influenced by changes


in the timbre and distance dependencies in the
HRTF.
• In echoic environments the time delay and
directions of direct sound and reflections affect the
perceived distance:
– Perception of physical characteristics (e.g., size) of a room
through delays and reverberation (e.g., by congenitally blind persons)


Music Information Retrieval


• We easily perform many complex processes:
– recognize and discriminate sounds
– isolate (hear) sounds from a complex mixture of other sounds or
noises
– hear melodies, memorize melodies, recognize melodies, recall
melodies, reinterpret melodies
– recognize and anticipate organization of music
– follow tempo, beats; foot-tapping
– perceive emotions
Mobile apps are available to search the internet for songs and
music content: e.g., Shazam, SoundHound
Multimedia information retrieval in music and audiovisual
industry: how to retrieve, access, manipulate, add


Interactive Sonification
http://www.interactive-sonification.org/
• Sound is intrinsically related to movement: acoustic
waves are generated by a movement producing an
excitatory pattern.
• Music alludes to movement: scientific literature
studying “what movements are evoked by music?”
• Industry applications: for example
– Sonification of human movement or of scientific data
(counterpart of visual displays / data visualization)
– Therapy and rehabilitation
– Wellness and fitness (listening to music while doing
sport decreases oxygen consumption by about 5%; music is
forbidden in some sport competitions as "technological
doping")

Other industry applications


• Apps for mobiles
• Videogames
• Education
• …

smcnetwork.org
– European Roadmap on the ACM discipline of
“Sound and Music Computing”


Somatic senses
• “somatic: 1: of, relating to, or affecting the body [...];
2. of or relating to the wall of the body”
• The main keywords related to somatic senses are
tactile and haptic which are both related to the
sense of touch.
• Concerning this sense, there is a lot more than only
touch itself.
• Researchers distinguish five “senses of skin”: the
sense of pressure, of touch, of vibration, of cold,
and of warmth.


Somatic senses
• Provide important feedback about environment.
• A key sense for someone who is visually impaired.
• Stimulus received via receptors in the skin:
– Thermoreceptors (heat and cold)
– Nociceptors (pain)
– Mechanoreceptors (pressure)
• Some are instantaneous, some continuous
• Some areas more sensitive than others e.g. fingers.
• Kinesthesis: awareness of body position
– affects comfort and performance.


Touch
• Skin is the largest sense organ in the
body. It covers about 2 m² (Quilliam, 1978).
• Mechanoreceptors are sensitive to
pressure or deformation.
• The concentration of mechanoreceptors
is not uniform.


Sensory homunculus



Two-points threshold

How far apart do two separate points need to be
before they are perceived as two?


The hand
• Spatial acuity about 1 mm (Loomis, 1979)
• Temporal acuity 700 Hz vibrations (1.4 ms
intervals) (Verrillo 1963)
• The output response of each receptor
decreases over time (called stimulation
adaptation) for a given input stimulus.


Multisensory integration
• Hand poorer than eye and better than
ear in spatial details.
• Hand better than eye and poorer than
ear in temporal details.

• Hands, eyes, and ears complement each


other in the human sensory system
• A premise to multimodal integration


Haptics
• The word haptic, from the Greek ἁπτικός
(haptikos), means pertaining to the sense
of touch
• Haptic technology refers to technology
that interfaces to the user via the sense of
touch by applying forces, vibrations,
and/or motions to the user.
• By touching an object, what information is
it possible to obtain?


Haptics


Exploratory procedures

Lederman and Klatzky, 1993



Exploratory procedures

Okamura, Turner, and Cutkosky, 1997


Braille Bar

Using touch to replace vision (e.g., for blind people)


Kinesthesis
• Kinesthesis (perception of body movements):
the perception that enables a person to
perceive movements of their own body.
• It is based on the fact that movements are
reported to the brain (feedback) through signals such as:
– angles of the joints
– activities of muscles
– head movements (vestibular organ in the inner ear)
– position of the skin, relative to the touched surface
– movements of the person within the environment
(visual kinesthesis)

Paradigms of perception

• How is sensorial information transformed to


higher-level representations?
• Several theories and paradigms have been
proposed. We shortly cover the following:
– Behaviorism
– Gestalt psychology
– Recent developments in Cognitive Sciences


Behaviourism
• I.P. Pavlov → J.B. Watson (1913) →
B.F. Skinner
• Denies consciousness.
• Chains of conditioned reflexes would explain
all learned behavior, even language.
• Method: measure innate reflex in babies and
strengths of drives for rewards.
• A sort of “atomism” to explain complex
behavior from simple components.


... Behaviourism

• Problem solving is by trial-and-error -


without insight into the nature of a problem.
• Experiments with cats and dogs (e.g.
salivation-food -> salivation-bell).
• Perception and behavior supposed to be
controlled quite directly by stimuli.
• Psychology can become a perfectly
predictive science.


Gestalt Psychology
• A very different rival school, founded in
Germany in the 1920s, then in USA.
• Emphasis on dynamics and “holism”.
• Gestalt: a grouping of elements such
that the whole is greater than the sum of
its parts.
• Analysis into perceptual components is
not supposed to be possible


Law of prägnanz
• The fundamental principle of gestalt
perception is the law of prägnanz (German
for conciseness) which says that we tend to
order our experience in a manner that is
regular, orderly, symmetric, and simple.
• Gestalt psychologists attempt to discover
refinements of the law of prägnanz, and this
involves writing down laws which
hypothetically allow us to predict the
interpretation of sensation, what are often
called “gestalt laws of organisation”.

Gestalt Laws of Organisation


• Importance to understand human perception
(vision and hearing)
• Gestalt theory has inspired Artificial
Intelligence researchers to build computer
programs able to recognize patterns and
objects
• Visual and auditory perceptions are more
than the sum of the stimuli, and are
organised according to various laws


Gestalt Laws of Organisation


• Law of Closure - tendency for a roughly circular pattern of dots to be seen as
“belonging” to and forming an object. The mind may experience elements it does
not perceive through sensation, in order to complete a regular figure.
• Law of Common fate - parts moving together, as leaves of a tree, seen as
parts of an object: elements with the same moving direction are perceived as a unit.
• Law of Similarity - The mind groups similar elements into collective entities.
This similarity might depend on relationships of form, color, size, or brightness.
• Law of Proximity - Spatial or temporal proximity of elements may induce the
mind to perceive a collective or totality.
• Law of Symmetry (Figure ground relationships) - Symmetrical images are
perceived collectively, even in spite of distance.
• Law of Continuity - The mind continues visual, auditory, and kinetic patterns.

• Laws of organization were supposed to be inherited, but as they


correspond to common features of almost all objects, learning could be
involved.

Key principles of Gestalt theory

Reification is the constructive or generative aspect of perception: the
experienced percept contains more explicit spatial information than the
sensory stimulus on which it is based. A triangle will be perceived in
picture A, although no triangle has actually been drawn. In B and D the
eye recognizes disparate shapes as "belonging" to a single shape; in C a
complete 3D shape is seen, where in actuality no such thing is drawn.
Reification is explained by studies on illusory contours, which are
treated by the visual system as "real" contours.

Emergence is demonstrated by the perception of the Dog Picture, which
depicts a Dalmatian dog sniffing the ground in the shade of overhanging
trees. The dog is not recognized by first identifying its parts (feet,
ears, nose, tail, etc.) and then inferring the dog from those component
parts. Instead, the dog is perceived as a whole, all at once. However,
this is a description of what occurs in vision and not an explanation.
Gestalt theory does not explain how the percept of a dog emerges.


Key principles of Gestalt theory

Other examples of the principle of Emergence


Key principles of Gestalt theory


Invariance: property of perception whereby simple
geometrical objects are recognized independent of
rotation, translation, and scale; as well as several
other variations such as elastic deformations,
different lighting, and different component features.
For example, the objects in A in the figure are all immediately
recognized as the same basic shape, and are immediately distinguishable
from the forms in B. They are even recognized despite perspective and
elastic deformations as in C, and when depicted using different graphic
elements (D).

Multistability (or multistable perception) is the tendency of ambiguous
perceptual experiences to pop back and forth unstably between two or
more alternative interpretations. This is seen for example in the
"Necker cube" and in "Rubin's Figure" / Vase illusion shown above. Other
examples include the "three-pronged widget", artist M. C. Escher's
artwork, and the appearance of flashing marquee lights moving first one
direction and then suddenly the other. Again, Gestalt does not explain
how images appear multistable, only that they do.


… Gestalt Principles

• Emergence, reification, multistability,


and invariance are not separable
modules to be modeled individually, but
they are different aspects of a single
unified dynamic mechanism.


Gestalt laws are used


in user interface design
The laws of similarity and proximity can, for example, be
used as guides for placing radio button widgets.
They may also be used in designing interfaces for more
intuitive human use. Examples include the design and
layout of a desktop's shortcuts in rows and columns.

Gestalt psychology is important for
computer vision:
it aims to make computers "see" the same things as humans do.


Cognitive Psychology

• In the last decades, Cognitive Sciences and Neurosciences


elaborated a number of approaches to explain perception and
cognition, resulting in more elaborated frameworks with
respect to the model by Card/Moran/Newell.

• A.Gaggioli. G.Riva, L.Milani, E.Mazzoni (2013) “Networked


Flow – Towards an Understanding of Creative Networks”,
Springer, Chapter 2.


Cognitive Psychology:
the Symbolic Approach

• Symbolic (traditional) approach (Johnson-Laird 1988;
Newell and Simon 1972)
• Deny that perception and behavior are controlled by stimuli
• Importance of general background knowledge and more-or-
less logical thought processes
• How far cognitive problem solving apply to perception is
controversial
• The notion of representing by the brain
• Mental models (Johnson-Laird)


Cognitive Psychology:
the Symbolic Approach
• Symbolic processors as model of the mind: by using
symbolic language it is possible to represent a subject’s
complete knowledge (an explicit representation of
knowledge). From this knowledge base, it is then possible to
draw the conclusions necessary to make the agent act in an
“intelligent” way.
• In this way, the structural characteristics of human cognitive
processes are largely independent from the type of hardware
(the brain, the human body) on which they operate, just as a
piece of software is independent from the type of computer
on which it is installed: the same piece of software can be
used on very different computers.
• Artificial Intelligence started from these assumptions.


Cognitive Psychology:
Situated Cognition
• Recent discoveries in neuroscience led to the redefinition of
the concept of cognition.
• An early attempt at this redefinition was made within the
Situated Cognition movement: in the majority of situations,
learning is not the result of an individual process, but of
social interaction (Lave and Wenger 2006): members of a
community, by means of common experience, come to share
a culture, a language and a way to express themselves: a
community of customs.
• This process is only possible if all the subjects share a
common ground, a range of beliefs, expectations, and
collective knowledge. This common heritage is continually
updated through a process defined as grounding, the
process of collaboratively establishing common ground
during communication.

Cognitive Psychology:
Embodied Cognition
• A second attempt came as the result of the Embodied
Cognition movement: corporeity – the sum of organism’s
motor-sensory skills which allow it to successfully interact in
its environment – as being necessary for its development of
social and cognitive processes.
• Enaction, Enactive Systems (Varela 1991): knowledge
defined as “capacity towards interactive action”, resulting
from the interaction which occurs in real time between a
corporeal organism and its environment directed toward an
objective.
• Knowledge is necessarily situated and embodied: it
requires continual external feedback in order to coordinate
perception and action.


Cognitive Psychology:
Common Coding Theory
• Common Coding theory: based on the discovery of two
types of bimodal neurons, in which sensory faculties are
linked to motor faculties (Rizzolatti et al):
• Canonical neurons: activated when a subject sees an object
with which it can potentially interact;
• Mirror neurons: activated when the subject sees another
individual performing the same action.
• Perceptual representations (action perceived) and motor
representations (actions to be performed) are based on the
same motor code.
• Embodied simulation: internal representations of corporeal
objects associated with given actions and sensations are
generated within the subject, as if she were performing a
similar action or experiencing similar emotions or sensations.


Mirror and Canonical Neurons


In the mid-1990s, scientists studying Area F5 in the ventral premotor cortex of monkeys found
that certain neurons in this area sent out action potentials not only when the monkeys were
moving their hands or mouths, but also when they were simply watching another animal or a
human being who was making such a gesture.
These neurons were dubbed mirror neurons because of the way that a visually observed
movement seemed to be reflected in the motor representation of the same movement in the
observer.
In addition to mirror neurons, which are activated both when you perform an action yourself and
when you see someone else performing it, another kind of neurons, called canonical neurons,
become activated when you merely see an object that can be grasped by the prehensile
movement of the hand whose movements they encode—as if your brain were foreseeing a
possible interaction with this object and preparing itself accordingly.
What these two types of neurons have in common is that they are both activated by an action
regardless of whether you are carrying that action out, anticipating carrying it out, or watching
someone else carrying it out. Because mirror neurons thus help us foresee the consequences of
our own actions, some have argued that these neurons may be the cellular substrate for our
ability also to understand the meaning of other people's actions.
This understanding of other people's actions is the foundation for all social relations, and
especially for communication between individuals. The discovery of mirror neurons may thus be
particularly useful for explaining how we can imagine other people's intentions and state of mind.
Lastly, the fact that Area F5 in monkeys is regarded as the homologue for Broca's area in
humans suggests that mirror neurons also are involved in human communication.


Cognitive Psychology:
Common Coding Theory
• Example: the sight of a red apple is believed to activate a simulation of
the motor functions necessary to pick it up, while the sight of a person who
reaches out to pick up the apple is believed to activate a motor simulation
which allows the subject to understand this person’s intention.
• A subject’s knowledge of objects and space is pragmatic
knowledge (Rizzolatti and Sinigaglia 2006):
• Objects are conceptualized through a process of simulation,
like points of virtual action defined by the intentions directed
toward them.
• Space is defined by the “system of relationships which such
virtual actions utilize, and which are limited by various parts of
the body”.


Cognitive Psychology:
Common Coding Theory
• Multisensory integration of bodily inputs is a key mechanism
underlying the experience of oneself within a body, which is
perceived as one’s own (body ownership), which occupies a
specific location in space (self-location), and from which the
external world is perceived (first person-perspective), i.e., the
different components of what has been called bodily self-
consciousness.
• The manipulation of bodily inputs has been used to induce the
feeling that an artificial or virtual body is one’s own and to
generate the sensation of being located within a virtual
environment. These findings thus highlight the particularly
relevant role of bodily inputs for virtual reality (VR)


Cognitive Psychology:
Common Coding Theory
• Multisensory integration of bodily-relevant inputs naturally
happens within a limited space immediately surrounding the
body, where external stimuli can make direct contact with the
body: the peripersonal space (PPS). The PPS indexes the self-
space and represents the space wherein the individual
interacts with external stimuli.
• Evolutionarily, until very recently, all direct body-objects
interactions have been experienced within a physical PPS.
• However, as human interactions increasingly occur not only
within the real world, but also within virtual or mixed realities, it
is interesting to study and characterize how PPS is represented
in VR, delineating interpersonal space in virtual and real
environments.


Cognitive Psychology:
Common Coding Theory
• Bimodal neurons combine sensory input from two different
modalities (in the rubber hand illusion, light and touch). A
unimodal neuron responds to only one sense.
• In bimodal neurons, activation is influenced by intention: the
visual information about an object is transformed into the motor
functions required to interact with it.
• Canonical neurons permit an immediate and intuitive
(prereflexive) understanding of the opportunities for interaction
that various objects may offer: in the case of the handle of a
coffee cup, the possibility of being taken hold of, if the
subject wants to drink.
• But how can we define “intuitive”?


Cognitive Psychology:
Kahneman theory

• One of the crucial elements of this definition is the concept of
intuition. Daniel Kahneman (2002) argues that our cognitive
system is based on two systems, intuition and reasoning:
• System 1 (Intuition): generates impressions of a perceived
and considered object’s characteristics. These impressions,
rapid and simple from a computational point of view, are
involuntary and often unconscious;
• System 2 (Reasoning): generates judgments that are slow,
serial, costly from a computational point of view, and always
explicit and intentional.


Cognitive Psychology:
Kahneman theory
• The existence of two separate cognitive systems is made
evident by the distinction between being able to do something,
and knowing something.
• On the one hand, we are able to control complex dynamic
systems without being capable of explaining the rules which
enable us to do so (Intuition), e.g. ski, ride a bike, play a
musical instrument.
• On the other hand, we can describe the rules which permit a
system to function (Reasoning) without being able to put them
into practice.
– Example: reading the highway code and knowing all the info needed to
drive a car does not guarantee that you will pass your driving test.
• The ability to understand a subject’s intentions is an intuitive
process of which the subject is unaware (Riva and Mantovani 2012).

Kahneman theory:
Thinking Fast and Slow
The Farmer or Librarian example:
• As you consider the next question, please assume that
Steve was selected at random from a representative
sample.
• An individual has been described by a neighbor as
follows: “Steve is very shy and withdrawn, invariably
helpful but with little interest in people or in the world of
reality. A meek and tidy soul, he has a need for order and
structure, and a passion for detail.”
• Is Steve more likely to be a librarian or a farmer? …


Kahneman theory:
Thinking Fast and Slow
…The Farmer or Librarian example:
• According to Kahneman, most people assume Steve is a
librarian (due to prevailing Fast thinking/Intuition).
• That answer is wrong, because it depends on occupational
stereotypes while ignoring “equally relevant statistical
considerations.”
• “Did it occur to you that there are more than 20 male farmers
for each male librarian in the United States? Because there
are so many more farmers, it is almost certain that more ‘meek
and tidy’ souls will be found on tractors than at library
information desks.”
• This question is supposed to illustrate the shallowness of our
intuitions about probability.
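The base-rate logic behind the example can be made explicit with a small Bayes' rule computation. The 20:1 farmer-to-librarian ratio comes from Kahneman's text above; the likelihoods for the "meek and tidy" description are purely illustrative assumptions.

```python
# Hypothetical Bayes' rule sketch of the librarian/farmer question.
# The 20:1 base rate is from the text; the likelihoods are invented
# for illustration only.

def posterior_librarian(prior_librarian, p_desc_given_lib, p_desc_given_farm):
    """P(librarian | description) over the two hypotheses."""
    prior_farmer = 1.0 - prior_librarian
    num = prior_librarian * p_desc_given_lib
    return num / (num + prior_farmer * p_desc_given_farm)

# 1 librarian per 20 farmers -> P(librarian) = 1/21.
# Assume the description fits 80% of librarians but only 10% of farmers.
p = posterior_librarian(1 / 21, 0.8, 0.1)
print(round(p, 3))  # 0.286: even a diagnostic description leaves farmer more likely
```

System 1 answers from the description alone; the arithmetic shows why the base rate should dominate.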


Affordance: perception and interaction


Affordance: perception and interaction
• “the perceived and actual
properties of a thing” (Gibson,
Norman)
• Affordance: The perception of
an object helps determine
how we interact with the
object
• Affordances are rarely innate:
they are learned from
experience


Affordance: perception and interaction
• Some affordances are obvious, most are
learned:
– Glass can be seen through (innate)
– Glass breaks easily (learned)
• Some affordances constrain an action
– Floppy disk
• Rectangular: can’t insert sideways
• Tabs on the disk prevent it from being
fully inserted backwards


Perceived and Actual Affordance


• Perceived affordance can differ from real or actual
affordance:
• A facsimile of a chair made of papier-mache has a
perceived affordance for sitting, but it does not actually
afford sitting !
• A fire hydrant has no perceived affordance for sitting: it
lacks a flat, human-width horizontal surface, but it actually
does afford sitting, albeit uncomfortably
– Chair: real affordance
• Affords sitting
• Affords standing for changing a lightbulb
• Affords keeping a door open
– Chair: false affordance
• Used to throw at somebody

Why it is hard to exploit affordance in design
• The parts of a user interface should agree in
perceived and actual affordances:
– See examples at the beginning of the course: bad use of a
scroll bar (example 1), time setting (example 2)
• Devices and software are more complex
than door knobs or chairs.
• Controls function differently depending on
their states (modes)
– Microwave buttons (time, power, etc)
– Imagine your chair going into these modes!


Affordance in screen-based interfaces


• Designer only has control over perceived affordances
– Display screen, pointing device, selection buttons, keyboard
– These afford touching, pointing, looking, clicking on every
pixel of the display.
• Most of this affordance is of no value: if the display is
not touch-sensitive, even though the screen affords
touching, touching has no effect.
– does a graphical object on the screen afford clicking? yes,
but:
• does the user perceive this affordance?
• does the user recognize that clicking on the icon is a
meaningful, useful action?


Affordance
• Some objects lack affordance:


Affordance
Compromise between affordance and aesthetics


Embodied Cognition:
Enactive knowledge
• Three kinds of knowledge:
– Symbolic
– Iconic
– Enactive


Symbolic knowledge
• Abstract sequences of reasoning, text, logics, mathematics
• A printed or written form of knowledge that makes use of text and vocabulary-
signs to represent operations, processes, elements or relations, …
• Typical forms are: procedural, declarative, episodic
• Typical use: languages


Iconic knowledge


Enactive knowledge
• The word ‘Enactive’ has been
attributed to the psychologist
Jerome Bruner.
• “… The third type of knowledge is
enactive. It is inherently tied to
actions, and it is the craftsperson’s
way of knowing. It is the most
intuitive and so the easiest to learn.”
• According to Varela’s model of
“enactive cognition” (Varela et al.
1991), enactive knowledge is
primarily ‘‘knowledge for action”,
and conversely, action is always
necessary to acquire knowledge.

Malcolm McCullough, “Abstracting Craft”, MIT Press


Enactive knowledge
• Enactive knowledge is constructed on motor skills, such as
manipulating objects, riding a bicycle, playing a musical
instrument, etc.
• Enactive representations are acquired by doing.
• Enactive knowledge is not far from our everyday life:
o SPORT
o MUSIC
o ART
o DANCING
o CRAFTING
o WORK
o PLAYING
• As a consequence of the above assumptions, physical
embodiment is a necessary condition for the acquisition of
enactive knowledge.
Elena Pasquinelli, 2004


Example: improve movement skill by interactive sonification
• Enactive interfaces for sensory
substitution / supplementation
• Sound as conveyer of information for
human action:
– To reinforce feedback
obtained from other sensory channels
– To substitute feedback
from missing sensory channels


Sensory substitution and supplementation

• Exploring novel therapy and rehabilitation interaction
modalities for disabled people.
• Developing and experimenting techniques and
systems for compensating sensorial and
motoric disabilities.
• Sport training, edutainment, (serious) games …
• Techniques:
 acoustic interfaces
 movement interactive sonification


Experiment: the balance task


Question: Is it possible to enhance the execution of
a motor task by providing appropriate auditory
feedback to movement in real time?
The experiment: to “measure” the balance of subjects on a
basculating system under several auditory conditions.
The user is required to keep the basculating platform as
horizontal as possible by balancing his/her own weight.


Sonification vs visualization of movement
• Visual feedback might be used instead of
audio, like the large mirrors in gym
sessions, but
• Temporal resolution of auditory perception
is about two orders of magnitude better than
that of vision
• Audio feedback does not require eye
contact on a screen: audio is naturally
immersive


Experiment: the balance task

[Block diagram: the system provides audio feedback to the user; the user's motoric performance is sensed by the system, closing the loop]


Apparatus
Custom-built platform, mechanical frame:
– one degree of freedom (roll)
– angular range: ±13° (right/left)
– optical sensor (f = 50 Hz)
– accelerometer (f = 160 Hz)
– compression springs


Auditory feedback

• Interactive Sonification of Movement:
Parameter Mapping Technique
– x: roll value
– y: loudness and frequency (amplitude modulation)
– f: linear mapping function, y = f(x)


Auditory feedback
The idea: to provide users with a “metronome”- like
feedback related to their motion
• How ?
Working on rhythm and lateralization of sound.
• Why ?
Rhythm: a feature easily perceived and tightly
linked to movement
- the more persistent the rhythm becomes, the more the user
is induced to move quickly
Lateralization: it seems natural to be attracted by
the location of a sound source.


Auditory feedback

• The more the subject moves incorrectly, the
faster the rhythm becomes.
For example, the more the subject makes the platform move
away from the zero-inclination condition, the faster the rhythm
becomes. The equilibrium condition corresponds to the rhythm of
slow human breathing, and the rate increases in proportion to inclination.

• The sound is mapped in the opposite
direction with respect to the subject’s movement.
For example, if the subject loses balance on the right, the sound
will correspondingly move to the left.


Auditory feedback: which sounds?
White noise modulated in amplitude
(multiplied) by a sinusoidal wave.
Why the choice of white noise?
(hint: equal distribution of stimuli over the cochlear hair cells)

Sonification provided by headphones.


Interactive system (sensor input processing and
auditory feedback) developed using the EyesWeb
software platform for the development of multimodal
apps (free download from www.casapaganini.org)
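As a rough numeric sketch of the mapping described above (not the original EyesWeb implementation), the feedback can be modeled as white noise amplitude-modulated by a sinusoid whose rate grows linearly with inclination, panned opposite to the tilt. The sample rate, breathing-like base rate, and gain constants are illustrative assumptions.

```python
import numpy as np

SR = 16000          # audio sample rate in Hz (assumed)
BASE_RATE = 0.25    # modulation rate at equilibrium, ~slow breathing (Hz)
RATE_GAIN = 0.5     # extra modulation Hz per degree of inclination (assumed)

def sonify(roll_deg, duration=1.0):
    """Stereo buffer for one roll reading: AM white noise, panned
    opposite to the tilt direction (tilt right -> sound on the left)."""
    t = np.arange(int(SR * duration)) / SR
    rate = BASE_RATE + RATE_GAIN * abs(roll_deg)        # linear mapping f
    noise = np.random.uniform(-1.0, 1.0, t.size)        # white-noise carrier
    envelope = 0.5 * (1.0 + np.sin(2 * np.pi * rate * t))
    mono = noise * envelope                             # amplitude modulation
    pan = np.clip(roll_deg / 13.0, -1.0, 1.0)           # ±13° is the full range
    left, right = mono * (1 + pan) / 2, mono * (1 - pan) / 2
    return np.column_stack([left, right])

buf = sonify(roll_deg=6.5)  # tilted right: the left channel carries more energy
```

In the real system the mapping runs on each new sensor sample with ~20 ms latency; here one buffer per reading is generated for clarity.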


Cross-modal mapping
• Scientific literature on the correspondences
between sensory modalities, e.g.
• Spence, C. (2011). Crossmodal correspondences: A
tutorial review. Attention, Perception, &
Psychophysics, 73(4), 971-995.
• Spence, C. (2015). Cross-modal perceptual
organization. The Oxford handbook of perceptual
organization, 649-664.
• T.Hermann, The sonification handbook, 2011
https://pdfs.semanticscholar.org/d86f/4cfc71da6fb6dd606
b94bed4cad61d33ea05.pdf


Auditory feedback

Real-time feedback
Input-output time latency: ~20 ms


The experiment
• 23 participants divided into 4 groups
• Each group performs the balance task under
different conditions of auditory feedback:
– Group A: full auditory feedback
– Group B: rhythmic component only
– Group C: lateralization component only
– Group D: no auditory feedback.
• Recorded data:
– Angular position (sampling rate 25 Hz)
– Angular acceleration (sampling rate 160 Hz)


Discussion


Discussion
It seems that auditory feedback increases reactivity
(fast responses) and equilibrium:
• The 5 best subjects (out of 23) had auditory
feedback;
• Auditory feedback based on lateralization seems
more effective than feedback based on rhythm;
• Subjects do not seem to be aware of the effect of
the feedback.


Experiment 2: different targets
• Testing the ability of a standing person to
control medio-lateral orientation relative to
referents in an acoustic space.
• Reaching targets in acoustic space
• Decoupling of acoustic upright
from gravito-inertial upright


Experimental Conditions
• Participants (N=10): 5 males, 5 females
– mean age: 27.1 y
– mean height: 170.4 cm
– mean weight: 61.6 kg
– no balance or hearing deficits
• Control of stance in audio-reaching task
– 7 silent targets:
• 0° ± 1.2°, 3° ± 1.2° right/left, 6.5° ± 1.2° right/left, 9° ± 1.2° right/left
– Training and Test sessions
• 21 trials per session – 3 trials per silent target
• random presentation of the silent targets


Results and Discussion (1/4)
• Statistical analysis: descriptive statistics
– No evident trial effect
– Oscillations within the 95% confidence interval for each target


Results and Discussion (2/4)
• Statistical analysis: inferential statistics
– 2-way ANOVA on mean achieved orientations
• Factors: Target, Trials
• Statistical significance: p < .05
– No significant main effect of Trials
– No significant Trials × Target interaction


Results and Discussion (3/4)
• Inferential statistics
– Significant main effect of Target: F(6,189) = 2616.34, p < .01
– Simultaneous control of orientation and of upright stance


Results and Discussion (4/4)
• Successful body orientations relative to
referents in acoustic space:
– with minimal practice
– also with referents diverging from the direction of
balance


Summary: Enaction, Cognition, and Embodiment
• ‘It is only by having a sense of common ground
between cognitive science and human experience
that our understanding of cognition can be more
complete and reach a satisfying level’
(Varela, 1991).
• Cognitivist accounts of mind focus on the
description of an individual’s cognitive
architecture in order to explore thought as mental
representation.
• The embodied view takes a different stance.


Embodied Cognition
• Embodied cognition studies focus on human
action as revealing of cognition, emphasizing the
inter-related roles of environment and the body in
shaping mental process and experience.

• Perception is perceptually guided action,
emergent only through histories of structural
coupling between agents, and between agent
and environment.


Memory
• Memory as an organism’s mental ability to
store, retain and recall information.
• Three types of memory (questionable…):
– Sensory memory (Short-term sensory store)
– Short-term memory (STM) and
Working memory
– Long-term memory (LTM)
• Embodiment and memory


Human processor model

Card, Moran, Newell, The psychology of human-computer interaction, 1983



Memory process
• From an information processing perspective
there are three main stages in the formation
and retrieval of memory:
– Encoding or registration (receiving, processing
and combining of received information)
– Storage (creation of a permanent record of the
encoded information)
– Retrieval or recall (calling back the stored
information in response to some cue for use in
a process or activity)


The multi-store model

(Atkinson & Shiffrin, 1968)


• The model has been criticized for being too simplistic:
− Long-term memory is made up of multiple subcomponents, such as
episodic, semantic, and procedural memory.
− We are capable of remembering without rehearsal.
− Short-term memory can be broken up into different units such as
visual information and acoustic information.
− Sensory store is split up into several parts e.g., taste, vision, hearing.


Sensory memory

• The first experiments on it were conducted
by George Sperling (1960).
• It is a kind of temporary buffer associated
with the sense organs.
• It contains unprocessed sensory information.
• It does not require attention.
• It is outside of conscious control.
• Very short persistence:
~0.5 s visual (iconic), ~2 s auditory (echoic)


Sensory memory

• Buffers for stimuli received through the senses
– iconic memory: visual stimuli
– echoic memory: aural stimuli
– haptic memory: tactile stimuli
• Examples
– “sparkler” trail
– stereo sound
– Example in Eyesweb
• Continuously overwritten


Short-term memory
• Used as temporary memory by cognitive processes
• Rapid access: ~70ms
• Rapid decay: from ~200ms to 15-30s
• Capacity very limited: the span of short-term
memory is 7±2 items (G. Miller, 1956).
• Persistence increases with rehearsals: but this
requires attention.
• New inputs remove old content: interference
(that’s why we should not use the cell phone while
driving!)
• Memory capacity can be increased through a
process called chunking: chunk the information into
meaningful groups, e.g. phone numbers
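The chunking idea can be illustrated in a few lines of code: the same ten digits exceed the 7±2 span when treated as individual items, but fit comfortably once grouped (the phone number and group sizes below are made up for illustration).

```python
def chunk(digits, sizes):
    """Split a digit string into consecutive chunks of the given sizes."""
    out, i = [], 0
    for size in sizes:
        out.append(digits[i:i + size])
        i += size
    return out

number = "0107365000"                # 10 separate items: above the 7±2 span
groups = chunk(number, [3, 3, 4])    # ['010', '736', '5000']: only 3 chunks
print(len(number), len(groups))      # 10 items vs 3 chunks
```

This is why phone numbers are conventionally printed in groups rather than as an unbroken digit string.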

Working memory
• Short-term memory: theory-neutral short term
storage of information.
• Working memory: a theoretical framework that
refers to structures and processes used for
temporarily storing and manipulating information.
• As such, working memory might also be referred
to as working attention.
• Several theories exist as to both the theoretical
structure of working memory as well as to the
specific parts of the brain responsible for it.


A model for working memory

Baddeley and Hitch, 1974; and
A. Baddeley, “The episodic buffer: a new component of working memory?”,
Trends Cogn. Sci., vol. 4, no. 11, pp. 417–423, 2000.

A model for working memory
• Phonological Loop: stores verbal-acoustic information such as the
sounds of spoken language.
• Visuospatial Sketchpad: stores visual and spatial information.
• Episodic Buffer: passive storage for integrating phonological, visual,
and spatial information with time sequencing, for example, the memory
of a story. In addition, the Episodic Buffer links information in the
WM and the LTM.
• Central Executive: acts as a supervisory system and controls the flow
of information, directing attention to relevant information and
suppressing irrelevant information. This flow control is essential for
preventing WM overflow.

Short-term memory
and interaction design
• Do not overload the short term memory of the user:
ask the user to remember only few (7±2) significant
or familiar items.
• While the user is engaged in other cognitive
activities, minimize the use of his/her short-term
memory.
• Anxiety dramatically reduces short-term memory
performance: avoid stress for the user.
• Allow the user to clean the short-term memory
=> simple, well defined tasks in sequence,
rather than in parallel => use of narration techniques
from humanistic theories (e.g., theatre, literature)


Example 1
Microsoft Word 97
After closing the window, it is almost
impossible for the user to remember
what he/she has to do!

Example 2
The help window goes
behind the application
window: impossible to
remember what to do.
Fixed in version 2007 :-)


Example 3

Impossible
to remember which
characters are allowed


Example 4
A good solution


Long-term memory
• Long access time
• Long persistence: it can remember certain information for
almost a lifetime; huge capacity;
• It encodes information semantically for storage.
• Several subsystems:
– Declarative Memory refers to all memories that are consciously
available. Two major subdivisions: Episodic Memory refers to
memory for specific events in time; Semantic Memory refers to
knowledge about the external world, such as the function of a pencil.
– Procedural Memory refers to the use of objects or movements of the
body, such as how exactly to use a pencil or ride a bicycle.
– Emotional Memory: the memory for events that evoke a particularly
strong emotion; it involves both declarative and procedural memory
processes and elicits a powerful, unconscious physiological reaction.


Long-term memory
• Semantic memory structure
– provides access to information
– represents relationships between bits of information
– supports inference

• Model: semantic network
– inheritance – child nodes inherit properties of parent nodes
– relationships between bits of information explicit
– supports inference through inheritance


Long-term memory
Semantic networks are a possible computation model for long-term memory.
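A minimal sketch of such a model, with hypothetical toy data: child nodes inherit the properties of their parents along IS-A links, which is the inference-through-inheritance mechanism listed above.

```python
# Toy semantic network: properties are inherited along IS-A links.
# The nodes and properties are invented for illustration.
NETWORK = {
    "animal": {"isa": None,     "props": {"breathes": True}},
    "dog":    {"isa": "animal", "props": {"legs": 4, "sound": "bark"}},
    "collie": {"isa": "dog",    "props": {"colour": "sable"}},
}

def lookup(node, prop):
    """Walk up the IS-A chain until the property is found (inheritance)."""
    while node is not None:
        props = NETWORK[node]["props"]
        if prop in props:
            return props[prop]
        node = NETWORK[node]["isa"]
    return None

print(lookup("collie", "legs"))      # 4, inherited from "dog"
print(lookup("collie", "breathes"))  # True, inherited from "animal"
```

The inference step is exactly the upward walk: nothing about legs is stored on "collie", yet the question is answered.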


Models of LTM - Frames
• Information organized in data structures
• Slots in the structure instantiated with values for each instance
of data
• Type–subtype (IS-A, inheritance) relationships:
DOG
  Fixed
    legs: 4
  Default
    diet: carnivorous
    sound: bark
  Variable
    size:
    colour:

Long-term memory: Storage
• rehearsal
– information moves from STM to LTM
• total time hypothesis (Ebbinghaus)
– amount retained proportional to rehearsal time
• distribution of practice effect (Baddeley)
– amount retained is optimized by spreading learning over time
• structure, meaning and familiarity
– make information easier to remember: it is easier to remember a
list of words representing objects than a list of words representing
concepts, because objects can be more easily visualized mentally.


Long-term memory: Forgetting
Two theories:
decay
– information is lost gradually but very slowly: logarithmic law of
decay: initially faster loss of information, then slower
– Jost’s Law: if two LTM traces are equally strong at a given
moment, the older one will remain longer in LTM.
interference
– new information replaces old: retroactive interference
Example: it is difficult to remember the old phone number when you
learn the new one.
– old may interfere with new: proactive inhibition
Example: driving your car involuntarily to your old home address.
Memory is selective and affected by emotion: we can
subconsciously ‘choose’, e.g., to forget negative information and
keep positive.
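The decay account and Jost's law can be given a toy numeric illustration. The retention function below is invented for the sketch (its log form and constants are not fitted to any data): loss is fast at first and then slows, and of two equally strong traces the older one decays more slowly.

```python
import math

def retention(t_hours, age_hours=0.0):
    """Fraction retained t_hours after the last encoding; a larger
    age_hours gives a slower decay, in the spirit of Jost's law.
    Purely illustrative constants, not a fitted model."""
    rate = 1.0 / (1.0 + age_hours)          # older trace -> smaller decay rate
    return 1.0 / (1.0 + rate * math.log1p(t_hours))

fresh = retention(24)                  # trace learned just now, 24 h later
old = retention(24, age_hours=240)     # equally strong 10-day-old trace
print(fresh < old)  # True: the older trace is retained better (Jost's law)
```

The logarithmic term also reproduces the qualitative shape of the decay law: the drop between hour 0 and hour 1 is larger than between hour 23 and hour 24.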


Long-term memory: Retrieval
recall
– information reproduced from memory; can be assisted by cues,
e.g. categories, imagery
recognition
– information gives the knowledge that it has been seen before
– less complex than recall: the information itself is the cue

Attention
• The cognitive process of selectively concentrating
on one aspect of the environment while ignoring
other things.
• Example: listening carefully to what someone is
saying while ignoring other conversations in a
room (cocktail party effect)
• It is influenced by exogenous (external stimuli)
and endogenous factors (motivations, mental
models).
• Endogenous factors are deemed more relevant.


Example
You have 3 seconds for counting the
number of green squares in the next slide.



How many red squares?


Attention and interaction design
• How and where to direct the user’s attention
during interaction?
• How to keep the user’s attention on the
desired items?
• How to avoid interferences that distract
attention from the relevant items?


Example 1


Example 2


Example 3

(MAC OS 8)


Example 4
A spot-light is used to direct the user’s attention
(Khan et al., CHI 2005)


Example 5

The active
window is
emphasized


Example 6
Interfaces for Museum visitors:
How to convey the desired content to
visitors ?
How to adapt the interface to different
typologies of visitors ?

«Viaggiatori di sguardo», Palazzo Ducale, Genova 2009-2011.
Casa Paganini - InfoMus

Example 6 (video demo)

«Viaggiatori di sguardo», Palazzo Ducale, Genova 2009-2011. Casa Paganini – InfoMus


https://www.youtube.com/watch?v=oIkm6CTiUH8

Example 7
Interaction design for children:
- Serious games to capture children’s attention and increase
learning in engaging, entertaining applications
- Therapy and rehabilitation in videogame-like environments
- RGB-D sensors
- Analysis of movement features: e.g. trajectories, velocities,
energy, contraction/expansion


Example 7: Serious games for children


EyesWeb platform
configured for the design
of serious games for
children.

Adopted at Gaslini
Children Hospital (Genoa):

Joint Laboratory
DIBRIS-Unige and
Gaslini Hospital:
ARIEL - Augmented
Rehabilitation in
Interactive/multimodal
Environment Lab


Emotion
• Neuroscientific evidence:
– Emotion is related to activity in brain areas that direct our
attention, motivate our behavior, and determine the
significance of what is going on around us.
– Emotion is investigated by neural mapping in the limbic
system (amygdala).
– Music, as the human language for expressing emotion par
excellence, is adopted in experiments to understand
emotion and brain functions, and to develop computational
models and affective computing systems.


Affective Computing and HCI
• Research has revealed the powerful role that emotion and
emotion expression play in shaping human social interaction.
• Computer interaction can exploit (and indeed must address)
social interaction.
• Emotional displays convey considerable information about the
mental state of an individual.
• But: are emotional displays true emotions or simply
communicative conventions/simulated emotions? In both
cases, pragmatically, they are useful in HCI.


Affective Computing and HCI
• From emotional displays, observers can form interpretations of
a person’s
– beliefs (e.g., frowning at an assertion may indicate disagreement)
– desires (e.g., joy gives information that a person values an outcome)
– intentions/action tendencies (e.g., fear suggests flight)
• Emotional displays may also provide information about the
underlying dimensions along which people appraise the
emotional significance of events:
– Valence, Intensity, Certainty, Expectedness, Blameworthiness, etc.


Affective Computing and HCI


• Emotion is a powerful signal that can also be a means of social
control: emotional displays seem to function to elicit particular
social responses from other social individuals (“social
imperatives”, Frijda 1987).
• The responding individual may not even be consciously aware
of the manipulation.
• For example:
– anger seems to be a mechanism for coercing actions in others and
enforcing social norms,
– displays of guilt can elicit reconciliation after some transgression,
– distress can be seen as a way to recruit social support, and displays of
joy or pity are a way of signaling such support to others.


Affective Computing and HCI


• Emotions provide a wide array of functions in social
interactions.
• Emotion displays seem to exert control indirectly, by inducing
emotional states in others and thereby influencing an
observer’s behavior:
– Emotional Contagion: lead individuals to “catch” the emotions of those
around them
– Pygmalion Effect (or Self-Fulfilling prophecy): our positive or negative
expectations about an individual, even if expressed non-verbally, can
influence them to meet such expectations


Affective Computing and HCI
• To the extent that these functions can be realized in artificial
systems, they could play a powerful role in facilitating
interactions between computer systems and human users.
• This has inspired several trends in HCI. For example:
– Deduce the user’s emotional state based on their actions
– Various systems attempt to recognize the behavioural manifestations
of a user’s emotion, including:
• Facial expression
• Vocal expression
• Full-body behavior
• Physiological indicators
• Another trend in HCI is the use of emotions and emotional
displays in virtual characters that interact with the user (ECA –
Embodied Conversational Agents);
• Creative and Cultural Sector, incl. the videogame industry


References
• K. R. Scherer, T. Bänziger, E. B. Roesch (2010) Blueprint for
Affective Computing – A Sourcebook. Oxford University Press.
• Kleinsmith, A., & Bianchi-Berthouze, N. (2013). Affective body
expression perception and recognition: A survey. IEEE
Transactions on Affective Computing, 4(1), 15-33.
• A. Damasio (1994) Descartes’ Error: Emotion, Reason, and the
Human Brain. ( “L’errore di Cartesio – Emozione, ragione e cervello umano”, Adelphi)
• http://emotion-research.net/ Research community emerged
from the EU Network of Excellence HUMAINE (2004-2007)
• Scientific journals:
– IEEE Transactions on Affective Computing
– ACM Transactions on Interactive Intelligent Systems
– International Journal of Human-Computer Studies
– IEEE Transactions on Human-Machine Systems


Emotion in user interfaces
• Affect: the biological response to physical stimuli
• Affect influences how we respond to situations
– positive -> creative problem solving
– negative -> narrow thinking
• “Negative affect can make it harder to do even
easy tasks; positive affect can make it easier to do
difficult tasks” (Donald Norman)


Emotion in user interfaces


• Emotion plays a fundamental role in HCI:
– stress will increase the difficulty of problem solving
• Goal: to minimize level of stress in users
– relaxed users will be more forgiving of
shortcomings in design
• Emotion as a «motivator» to increase effectiveness and
usability
– aesthetically pleasing and rewarding interfaces will
increase positive affect
• e.g. Aesthetical resonance in therapy and rehabilitation
applications


Computational Approaches to Emotion
– Affective Computing (US)
(Picard, 1995)
– KANSEI Information Processing (Japan)
(Hashimoto, 1997)
– A third European route: the EU-IST Project
MEGA, 2000-2003: www.megaproject.org


Theories of Emotion
• Component Process Model (Scherer, 2004): emotion is
defined as a sequence of state changes in five subsystems:
– Cognitive processes, i.e., appraisal processes. Emotional
responses are a consequence of a subjective evaluation of
events with respect to their relevance for individuals.
– Physiological arousal: changes activated by the autonomic
nervous system, such as an increase or decrease in heart
rate, breath rate.
– Motor expression: behavioral responses such as facial and
vocal expressions, body gesture and posture.
– Action tendency: behavior preparation consequent to the
elicitation of emotion.
– Subjective feeling: the result of all the changes in the
components during an emotional process.
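The five subsystems above can be sketched as a simple data structure. This is only an illustration: the field names, types, and example values are assumptions made here, not part of Scherer's model.

```python
# Minimal sketch of an emotion episode with Scherer's five components.
# All field names and the example values are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class EmotionEpisode:
    appraisal: str          # cognitive evaluation of the event's relevance
    arousal: float          # physiological activation (e.g., heart-rate change)
    expression: str         # motor expression (face, voice, posture)
    action_tendency: str    # behavior preparation (approach, avoid, ...)
    feeling: str            # subjective feeling integrating the other components

fear = EmotionEpisode(
    appraisal="event judged threatening",
    arousal=0.8,
    expression="widened eyes, raised voice pitch",
    action_tendency="avoid",
    feeling="fear",
)
print(fear.action_tendency)   # "avoid"
```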
Dimensional models of emotion

• For both theoretical and practical reasons, researchers define emotions according to one or more dimensions. Wilhelm Max Wundt, a father of modern psychology, proposed in 1897 that emotions can be described by three dimensions: "pleasurable versus unpleasurable", "arousing or subduing", and "strain or relaxation".
• In 1954 Harold Schlosberg named three dimensions of emotion: "pleasantness–unpleasantness", "attention–rejection", and "level of activation".
• Dimensional models of emotion attempt to conceptualize human emotions by defining where they lie in two or three dimensions.
• Most dimensional models incorporate valence and arousal or intensity dimensions.
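As a minimal sketch of how a two-dimensional valence–arousal model can be used computationally, the snippet below places a few emotion labels in the plane and classifies a new point by nearest neighbour. The coordinates are invented for illustration; they are not Russell's published positions.

```python
# Hypothetical valence-arousal coordinates, each in [-1, 1].
import math

EMOTIONS = {
    "happy":   ( 0.8,  0.5),
    "angry":   (-0.7,  0.7),
    "sad":     (-0.7, -0.4),
    "relaxed": ( 0.6, -0.5),
}

def nearest_emotion(valence: float, arousal: float) -> str:
    """Return the labelled emotion closest to a point in the plane."""
    return min(EMOTIONS, key=lambda e: math.dist((valence, arousal), EMOTIONS[e]))

print(nearest_emotion(0.7, 0.6))    # a pleasant, aroused state -> "happy"
print(nearest_emotion(-0.6, -0.5))  # an unpleasant, low-arousal state -> "sad"
```

Real affective-computing systems estimate the (valence, arousal) point from sensor data (face, voice, physiology) rather than receiving it directly; the mapping from point to label works the same way.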
Dimensional Models of Emotion

The circumplex model (Russell, 1980)
Dimensional models of emotion

• Dimensional models of emotion suggest that a common and interconnected neurophysiological system is responsible for all affective states.
• These models contrast with theories of basic emotion, which propose that different emotions arise from separate neural systems.
• Several dimensional models of emotion have been developed, though only a few remain dominant and widely accepted.
• The most prominent are the circumplex model, the vector model, the Positive Activation – Negative Activation (PANA) model, and the PAD model.
Dimensional models of emotion

PAD emotional state model (A. Mehrabian, J. Russell)
Three numerical dimensions: Pleasure, Arousal, and Dominance.
• Pleasure-Displeasure Scale: measures how pleasant an emotion may be. For instance, both anger and fear are unpleasant emotions and score high on the displeasure scale; joy is a pleasant emotion.
• Arousal-Nonarousal Scale: measures the intensity of the emotion. For instance, while both anger and rage are unpleasant emotions, rage has a higher intensity, or a higher arousal state. Boredom, which is also an unpleasant state, has a low arousal value.
• Dominance-Submissiveness Scale: represents the controlling and dominant nature of the emotion. For instance, while both fear and anger are unpleasant emotions, anger is a dominant emotion, while fear is a submissive emotion.
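The examples on this slide can be encoded as PAD triples. The numeric values below are assumptions chosen only to be consistent with the slide's qualitative statements; they are not Mehrabian and Russell's published data.

```python
# Illustrative (Pleasure, Arousal, Dominance) triples, each in [-1, 1].
PAD = {
    "joy":     ( 0.8,  0.5,  0.4),
    "anger":   (-0.6,  0.6,  0.3),   # unpleasant, aroused, dominant
    "rage":    (-0.6,  0.9,  0.4),   # like anger, but higher arousal
    "fear":    (-0.6,  0.6, -0.4),   # unpleasant, aroused, submissive
    "boredom": (-0.4, -0.6, -0.2),   # unpleasant, low arousal
}

def compare(a: str, b: str, dim: int) -> float:
    """Difference between two emotions on one PAD dimension (0=P, 1=A, 2=D)."""
    return PAD[a][dim] - PAD[b][dim]

# Anger and fear share pleasure and arousal but differ in dominance:
print(compare("anger", "fear", 2) > 0)   # True: anger is the dominant one
print(compare("rage", "anger", 1) > 0)   # True: rage has higher arousal
```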
Models of Emotion: Discrete

OCC model (Ortony, Clore, & Collins, 1988)
Models of emotion

Ego-Nos space (Camurri and Ferrentino, ACM Multimedia Systems Journal, 1999)
Emotional agents

Emotional agent (Camurri and Coglio, IEEE Multimedia Journal, 1998)
Emotional Conversational Agents (ECA)

• ECAs support interaction by means of a speaking virtual human interface in multimedia applications.
• They increase interaction and communication by exploiting the multimodal characteristics of humans: users listen to the agent's speaking voice (vocal expression, prosody), see its face (lips), and see its (virtual or physical) body movements and expression.
• Adopted in specific application domains, e.g., virtual call centers, virtual sellers.
• The videogame and movie industries contribute to ECA research.
• Example: Samsung Neon novel technology
Example of museum robot: how affective computing can enhance Human-Robot Interaction

Suzuki, Camurri, Hashimoto, and Ferrentino, 1998
EU-ICT Project ASC-INCLUSION

Serious games for teaching autistic children to recognize and express emotions, based on automated analysis of non-verbal, full-body expressive gesture.

EyesWeb application to model and recognize emotions

S. Piana et al. (2016) "Adaptive body gesture representation for automatic emotion recognition", ACM Transactions on Interactive Intelligent Systems, 6(1).
KANSEI Information Processing

• Affective computing approaches grounded in Japanese culture
• KANSEI is related to emotions and personality: it is not (only) emotion.
• Rather, KANSEI can be considered as:
‒ An evaluation function: a human ability that allows one to perform a qualitative evaluation of the world (e.g., "That picture is beautiful", "I like it").
‒ Another level of understanding: when a human processes a signal, she performs an analysis at different levels: physical level, logical level, and KANSEI level (Hashimoto, 1997).
Levels of understanding
• Physical level: evaluation of the basic physical components of the perceived signal (e.g., loudness of a sound, brightness of an image).
• Logical level: combining perceived physical information using rules and logic in order to extract meaning.
• KANSEI level: extraction of higher-level information. The KANSEI level interacts with both the logical and physical levels in order to provide better understanding and perform problem-solving tasks in a non-logical way.
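The three levels can be sketched as a processing pipeline on a sound signal. This is a toy illustration: the threshold, labels, and the reduction of the KANSEI level to a lookup are assumptions made here, not Hashimoto's formulation.

```python
# Toy three-stage analysis of a sound, mirroring the three levels above.
def physical_level(samples):
    """Physical level: a basic signal property (RMS loudness)."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def logical_level(loudness):
    """Logical level: rule-based interpretation of the measurement."""
    return "loud" if loudness > 0.5 else "quiet"

def kansei_level(category):
    """KANSEI level: a qualitative, subjective judgement (crudely simulated)."""
    return "exciting" if category == "loud" else "calming"

samples = [0.9, -0.8, 0.85, -0.9]          # invented waveform samples
print(kansei_level(logical_level(physical_level(samples))))   # "exciting"
```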
Social Signal Processing

• From the individual (emotion) to joint actions in a group
• Analysis of the behavior and of the interaction within and among groups of humans
• Focus on social emotions and kansei:
– Empathy: entrainment, synchronization, emotional contagion
– Co-creation: group creativity
– Functional roles: dominance, leadership
– Salient behaviour
• Strategic role in emerging embodied social media
Entrainment
• Refers generally to spatiotemporal coordination between two or more individuals, often in response to a rhythmic signal.
• In music and dance: rhythmic synchronization as a product of individuals' interaction (Clayton et al., 2005) – that is, rhythmic coordination with simple (e.g., 1:1 in-phase or anti-phase) or more complex (e.g., 2:3 or 3:4 polyrhythmic) phase relations.
• Entrainment is characterized by two intertwined components:
– Temporal component: observed at hierarchical levels of metrical periodicity in the body and brain.
– Affective component: mutual sharing of an affective state between individuals. Affective entrainment involves the formation of interpersonal bonds and is related to the pleasure in moving the body to music and being in time with others (e.g., "groove").
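The temporal component can be quantified with a simple phase-locking measure. The sketch below uses the circular mean resultant length of one tapper's phases relative to another's beats; the tap times are invented, and this particular measure is a common choice in the entrainment literature rather than a method from these slides.

```python
# Sketch: how well do B's taps phase-lock to A's beats?
import math

def relative_phases(beats_a, taps_b):
    """Phase (0..1) of each B tap within A's concurrent beat interval."""
    phases = []
    for t in taps_b:
        for t0, t1 in zip(beats_a, beats_a[1:]):
            if t0 <= t < t1:
                phases.append((t - t0) / (t1 - t0))
    return phases

def synchrony(phases):
    """Circular mean resultant length: 1 = perfect phase locking, 0 = none."""
    n = len(phases)
    x = sum(math.cos(2 * math.pi * p) for p in phases) / n
    y = sum(math.sin(2 * math.pi * p) for p in phases) / n
    return math.hypot(x, y)

beats_a = [0.0, 0.5, 1.0, 1.5, 2.0]     # A taps a steady beat (seconds)
taps_b = [0.02, 0.51, 1.03, 1.49]       # B taps nearly in-phase (1:1)
print(round(synchrony(relative_phases(beats_a, taps_b)), 2))   # 0.98
```

A 1:1 anti-phase relation would give phases near 0.5 and still yield a synchrony close to 1; distinguishing in-phase from anti-phase requires also looking at the circular mean phase, not just the resultant length.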
Emotions, Kansei, and Social Signal processing in interface design

• Role in the design process:
– Managing levels of stress in the interaction with a GUI
– Measuring and assessing Quality of Experience and user satisfaction
– Novel approaches to evaluate and validate GUIs
– GUI evaluation techniques based on non-verbal emotion cues measured on users
– Interfaces for group work
– Supporting the design of multimodal interfaces, natural interfaces, wearables, etc.
– Supporting a growing number of novel applications for the creative industry, social inclusion, therapy and rehabilitation, etc.
Emotions, Kansei, and Social Signal processing in interface design

• International research projects at Casa Paganini-InfoMus:
– EU ICT FET Project SIEMPRE (2010-2013)
– EU Horizon 2020 ICT Project DANCE (2015-2017)
– EU Horizon 2020 ICT Project Wholodance (2016-2018)
– EU Horizon 2020 ICT Project TELMI (2016-2018)
– EU Horizon 2020 ICT Project WeDraw (2017-2019)
– EU Horizon 2020 FET PROACTIVE Project EnTimeMent (2019-2022): http://entimement.dibris.unige.it