Landmarks
GIScience for Intelligent Services
Kai-Florian Richter • Stephan Winter
Kai-Florian Richter
Department of Geography
University of Zurich
Zurich, Switzerland

Stephan Winter
Department of Infrastructure Engineering
University of Melbourne
Parkville, VIC, Australia
Preface
context. They include services such as simple search, webmapping, mobile location-
based services, mobile guides, car navigation services, public transport planners,
and emergency call centers, as they involve more and more natural language
interaction.
This book will address this gap. It reports on the latest research on landmarks in
geographic environments and practical applications of this research in information
service provision. It covers a spectrum of disciplinary fields encompassed by what
has been called cognitive computation by some or spatial cognitive engineering
by others. The disciplines involved are from life sciences (neurosciences), social
sciences (psychology, cognitive science, linguistics), engineering (artificial intelligence,
information systems), physical and mathematical sciences (geomatics), and
humanities (geography, philosophy).
The reader can expect from this book a broad scope covering perceptual and
cognitive aspects of natural and artificial cognitive systems, conceptual aspects
of trying to define and formally model these insights, computational aspects with
respect to identifying or selecting landmarks for various purposes, and communication
aspects of human-computer interaction for spatial information provision. The
origin of the book goes back to our own, originally separate work on computational
issues of landmarks, which started about a decade ago, perhaps kicked off by one
particular, frequently cited paper [4]. Of course, landmarks were by then already a
well-studied topic in spatial cognition research (e.g., [2, 3, 5]). A rich body of work
had been developed, which we will present in this book in a systematic manner,
including outlining the still open questions. Portions of the presented ideas led also
to the world-first commercial navigation service using landmarks selected based on
cognitive principles [1].
Accordingly, the purpose of this book is to provide a review of this line of
research, structured into cognitive, conceptual, computational, and communication
aspects. This is particularly valuable because it represents a synopsis of research in
different disciplines and thus not only addresses a breadth of topics but also bridges
different traditions of thinking. It is also timely since the research in these
four areas has reached levels that allow, for the first time, a consistent synopsis.
The intended audience of this publication certainly includes graduate and postgraduate
students, who will profit from a compact reader summarizing and synthesizing a
large number of research papers, but beyond that also the interested public: from
the enthusiastic geeks maintaining crowd-sourced datasets like OpenStreetMap,
to the early adopters of novel tools such as navigation services, to the curious
people interested in why certain things are so hard for computers . . . such as
thinking about our environments like we humans do.
References
1. Duckham, M., Winter, S., Robinson, M.: Including landmarks in routing instructions. J.
Location-Based Serv. 4(1), 28–52 (2010)
2. Lynch, K.: The Image of the City. The MIT Press, Cambridge (1960)
3. Presson, C.C., Montello, D.R.: Points of reference in spatial cognition: stalking the elusive
landmark. Br. J. Dev. Psychol. 6, 378–381 (1988)
4. Raubal, M., Winter, S.: Enriching wayfinding instructions with local landmarks. In: Egenhofer,
M.J., Mark, D.M. (eds.) Geographic Information Science. Lecture Notes in Computer Science,
vol. 2478, pp. 243–259. Springer, Berlin (2002)
5. Sorrows, M.E., Hirtle, S.C.: The nature of landmarks for real and electronic spaces. In: Freksa,
C., Mark, D.M. (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol.
1661, pp. 37–50. Springer, Berlin (1999)
6. Turing, A.M.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)
7. Winter, S., Wu, Y.: Intelligent spatial communication. In: Navratil, G. (ed.) Research Trends in
Geographic Information Science, pp. 235–250. Springer, Berlin (2009)
Acknowledgements
This book presents the work of many people to whom we are indebted for their
inspirations, their collaborations at some stage of our pathways, or for challenging
our ideas in discussions or in anonymous reviews.
Our own work was also supported by a number of research funding agencies,
including the Australian Research Council (Talking about Place, LP100200199), the
Go8/DAAD exchange program (Cognitive Engineering for Navigation Assistance),
the Institute for a Broadband-Enabled Society (Crowd-Sourcing Human Knowledge
on Spatial Semantics of Placenames), the German Research Foundation DFG
(Transregional Collaborative Research Center SFB/TR 8 Spatial Cognition), the
Australian Academy of Science (Smart Interaction for Wayfinding Support), and
the Swiss National Foundation (Computational Methods for Extracting Landmark
Semantics).
¹ http://www.yokohama-landmark.jp/web/english/, last visited 23/12/2013.
² http://www.geonames.org/maps/google_49.672_-96.822.html, last visited 23/12/2013.
³ http://en.wikipedia.org/wiki/Landmark, last visited 23/12/2013.
of landmarks. But then this definition is also somewhat unbalanced, first speaking
of ‘anything’ in the intensional part, and then mentioning only examples taken from
the built environment in the ostensive part.
Let us have a look into another dictionary. Merriam-Webster distinguishes three
meanings: “(1) an object or structure on land that is easy to see and recognize, (2) a
building or place that was important in history, or (3) a very important event or
achievement.”⁴
• Meaning 1 is more or less covering the previous definition, although it is
questionable why landmarks should be restricted to land. Landmarks arguably
exist, for example, also on the ice cap of the North Pole, indoors, and even at
sea: Navigators have used Polaris as a landmark for centuries. But importantly
Meaning 1 refers again to cognitive processes based on embodied experience,
which is what we will build upon later. Let us ignore that the wording requires
‘seeing’ in addition to ‘recognizing’. If seeing were required, blind people
would have no concept of landmarks, and soundscapes would have no landmarks,
neither of which is true. However, the embodied experience postulated above is
expressed here in the reference to locatable objects or structures.
⁴ http://www.merriam-webster.com/dictionary/landmark, last visited 23/12/2013.
1 What are Landmarks, and Why Are They Important
1.1 What Landmarks Are
Rosch [38] argues that two principles drive the formation of categories in the
mind. One is cognitive economy, calling for grouping similar things together and
giving them a name. The other principle is rather a recognition that the continuous
world outside of the body is actually structured and forms natural discontinuities.
Categories are then, economically, formed by objects that have many attributes in
common and share few attributes with members of other categories [39].
This idea has to be taken with a grain of salt, though. Wittgenstein had famously
interjected that at least several natural categories appear to have no common
attributes, or none that are not shared with members of other categories; he
used the category of games (board games, card games, Olympic Games) for
this argument [50]. Instead of commonalities Wittgenstein argues for a family
resemblance, which is only a matter of similarity. As a consequence, Rosch [38]
suggests applying a measure of category resemblance derived by Tversky from
similarities: “Category resemblance is a linear combination of the measures of the
common and the distinctive feature of all pairs of objects in that category” ([47],
p. 348). Thus, category resemblance describes the inner coherence of a category,
without requiring a catalog of shared attributes. Imagine a space spanned by some
conceptual properties. In this space each object forms a node. Convex categories can
be formed by clustering these nodes such that if x and y are members of a category
and z is between x and y, then z is a member of the category as well [16]. These
conceptual spaces form a mathematical basis to express family resemblance.
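The notion of category resemblance quoted above can be illustrated with a minimal sketch. It assumes that each object is described by a plain set of features, that the "measure" of a feature set is simply its size, and that common features count positively while distinctive features count negatively. The weights, the function name and the toy feature sets are all hypothetical and chosen only for illustration; they are not taken from Tversky or Rosch.

```python
from itertools import combinations


def category_resemblance(members, w_common=1.0, w_distinct=0.5):
    """Weighted sum over all pairs: common features minus distinctive features.

    `members` maps object names to sets of features. The weights are
    illustrative assumptions, not values from the literature.
    """
    total = 0.0
    for a, b in combinations(members.values(), 2):
        common = len(a & b)        # features shared by both objects
        distinctive = len(a ^ b)   # features held by exactly one of the two
        total += w_common * common - w_distinct * distinctive
    return total


# Toy, made-up feature sets.
games = {
    "chess": {"rules", "competition", "board"},
    "poker": {"rules", "competition", "cards"},
    "solitaire": {"rules", "cards"},
}
mixed = {
    "chess": {"rules", "competition", "board"},
    "oak": {"plant", "tall"},
    "river": {"water", "flows"},
}

print(category_resemblance(games))  # 2.0
print(category_resemblance(mixed))  # -7.0
```

Note that no single feature has to be shared by all members for the score to be high; overlapping pairwise similarities are enough, which is the point of family resemblance.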
Extending such a notion of a category, prototype theory calls in more central
entities—prototypes—and accepts graded membership to a category [27, 37, 38].
For example, probably most people would agree that the Eiffel Tower is a landmark,
and in surveys the Eiffel Tower is always highly ranked (e.g., Fig. 1.3), which makes
the Eiffel Tower a prototype in the category of landmarks. But how many people
would agree that the blue house at the street corner is a landmark, the ATM in
the mall, the T-intersection where one has to turn right, or the only tree far and wide
marking the entrance to the farm? These are perhaps more ambiguous entities in the
category.
⁵ http://www.openstreetmap.org, last visited 3/1/2014.
By the way, landmarks are not special in this respect. Any classification of
objects in geographic space is to some extent arbitrary, and has its prototypes and
its boundary cases. For example, we may have a clear understanding of what a
building is, a mountain, or a road. They are everyday terms, and basic categories in
the context of geographic information [39]. They are standard elements in spatial
databases: A typical GIS contains database layers of (representations of) buildings
and roads, and gazetteers of geographic names contain representations of mountains.
According to prototype theory each of these categories will have typical examples
as well as less clearly assigned entities. When it comes to these boundary cases in
categorizations, people start to disagree or become uncertain. Is a single garage, a
shed, or a kennel still a building? Is a laneway, a mall, or a trek still a road? Is a hill
a mountain? Is Uluru a hill or a mountain? These questions are important for map
makers and database administrators alike since they determine the cleanliness of
their products. A producer does not want to have a product with information many
people disagree about. These questions also determine whether different map
sources or databases can be compared. Map updating, for example, should not merge
datasets of varying semantics. It has even been shown that these classifications can
vary across languages and cultures [29, 30]. Thus, there is no definite answer to the
question which object is a landmark and which is not. Landmarks are countable but
are not finite.
There are other reasons adding evidence to this conclusion. For example, the
world is constantly changing, and over time it can change whether a geographic
object is a landmark or not. The first skyscraper in Chicago was a landmark, at
least for some time, until it became one of many and others were more outstanding.
Accordingly, classifications of objects can be made only for a certain time. But even
at a snapshot in time we have already seen it is impossible to provide a complete
list. Now we know why. It is because of graded membership. Additionally, this grade
of membership in the landmark category also depends on the context. For example,
a city looks very different by night than by day, and landmarks in night scenes
may be quite unimpressive in daylight. Or consider a café that became special for a
couple because it is the place where they first met. It is one of the locations in the
city they refer to when they explain other locations to each other. For the two of them
it has become a landmark they share even if this is one of many cafés in that street.
For others, especially those who never visited the café, it is not. The New York City
Apple Store (Fig. 1.4a) may have the meaning of a landmark for people of certain
interests and a particular age, but not for others. The Sydney Opera House (Fig. 1.4b)
may be a prime example of a globally recognized landmark, but its cleaners may see
this labyrinth of a complex building differently, at another spatial granularity, and
perhaps with other landmarks within for their own orientation and communication
purposes (‘the box office’, ‘the stage’, ‘Bistro Mozart’). Thus, being a landmark
is not a global characteristic of an object, but a function of parameters such as
the individual that perceives and memorizes an environment, the communication
situation, the decision at hand, and the time. The latter argument means that even
prototypes cannot be considered prototypes in all cases, since there might be no
such thing as an object that is always, i.e., in any context, a well-suited landmark.
Fig. 1.4 Landmarks: (a) New York City Apple Store; (b) Sydney Opera House
Even the extensional approach behind Fig. 1.3 follows the intention of this
definition. In order to illustrate the effect, consider the following experiment:
Take the six landmarks ranked in the figure and order them geographically
on a blank sheet of paper. You will find arranging them not too difficult.
What you have actually achieved is an externalization of your mental spatial
representations. The same experiment can be easily translated into the context
of your own hometown. Imagine a number of prominent objects in your
hometown, and ask your friends to arrange these objects for you spatially.
Again, they will be able to produce a sketch reflecting satisfactorily the layout
of your hometown.
In a very similar experiment by Lynch [28] people were asked to sketch their
hometown. Comparing a large number of sketches, Lynch found commonalities
in the structure of the sketches. One common element in these sketches he called
landmarks,⁶ or, in his understanding, identifiable objects which serve as external
reference points.
Our definition of landmarks is purely functional: being a landmark is a role
that objects from any category can play. It emphasizes that landmarks are mental
constructs. In line with Meanings 1 and 2 from above, it covers objects that
stand out in an environment such that they have made (or can make) an impression
on a person’s mind. This experience is not limited to visuo-spatial properties of the
object itself, nor to current properties or to shared experiences. Counterexamples
would be “the café where we met”, “the park where we had this unforgettable
picnic” or “the intersection where I was nearly killed by a car”. It is also not limited
to human-made things, as the definition by Wikipedia above might suggest; it can
equally well be a natural object such as the Matterhorn (Fig. 1.5), a widely visible
mountain in the Swiss Alps with a characteristic shape, and hence frequently used
for localization and orientation. But in their function landmarks must have certain
properties, most importantly: being recognizable in the environment.
⁶ We will later argue that all elements Lynch has identified can be considered as landmarks.
However, the chosen definition is not unproblematic. For a scientist the definition
remains questionable just because what happens in human minds is not directly
observable or accessible. So if we agree that there are mental representations of
what is in the world outside of our bodies, then indirect methods must be accepted
to reconstruct the elements and functioning of these representations. These indirect
methods are the toolboxes of cognitive science, with a strand in the neurosciences,
studying the structure and work of the senses and the brain, and another strand in
cognitive psychology, studying responses or behaviours of people, such as learning,
memorizing, language, or spatial behavior. This cognitive perspective on landmarks
could easily fill a book by itself. Therefore, we will keep the discussion of the
cognitive aspects concise, and instead focus on how the understanding of the role
and working of landmarks in spatial cognition and communication can inform
the design of smarter systems interacting with people.
It must also be said that this definition is in conflict with other common
understandings of landmarks. For example, Couclelis et al. have called anchor
points what we have called landmarks [4]:
“Much of the work in spatial cognition has focused on the concept of imageability. In brief,
this assumes that there will be elements in any given environment (natural or built) which
by virtue of their distinctive objects (for example, form, color, size, visual uniqueness),
or by virtue of some symbolic meaning attached to them (places of historic importance,
of religious or socio-cultural significance, etc.), stand out from among the other things in
the environment. Because such elements are outstanding, literally, they are likely to be
perceived, remembered, and used as reference points by a large number of people in that
environment. This is the notion of landmark as popularized by Lynch’s seminal work on
the ‘Image of the City’. Anchor-points (anchors for short) are closely related to landmarks,
both concepts being defined as cognitively salient cues in the environment. However, as
represented in the literature, landmarks tend to be collectively as well as individually
experienced as such, whereas anchors refer to individual cognitive maps. Although one
would expect to find several local landmarks among the anchors in a person’s cognitive map,
many anchors (such as the location of home and work) would be too personal to have any
significance for other, unrelated individuals. Further, landmarks are primarily treated as part
of a person’s factual knowledge of space, whereas anchor-points are supposed to perform in
addition active cognitive functions, such as helping organize spatial knowledge, facilitating
navigational tasks, helping estimate distances and directions, etc. Finally, landmarks are
concrete, visual cues, whereas anchor-points may be more abstract elements that need not
even be point-like (e.g., a river or a whole city in a cognitive map at the regional level)”
(p. 102).
In fact, our definition does not make such a distinction between anchor points and
landmarks. Instead, we argue that, in principle:
• any property standing out in an environment can be shared with another person,
and
• that any sharing of experiences is limited to (larger or smaller) groups.
For example, ‘my home’ is shared with a few people, and ‘Eiffel Tower’ is shared
with many people. But even the Eiffel Tower is not known by the universe of all
living people. Furthermore, experience can be lived and communicated. Landmarks
can be learned from text, from maps, or from conversation (e.g., somebody may
tell me: “At that intersection I was robbed”, which attaches an outstanding memory
to this object). Thus, all that is required in a communication situation is taking a
perspective. As people do adapt themselves to their communication partner in their
choice of landmarks, so must the machine. This involves a capacity for context-
aware computing [6]. Different groups (e.g., family, colleagues, people living in the
eastern suburbs, tourists) share different sets of anchor points, or can expect that
certain landmarks can be experienced by particular individuals.
Relating the notion of landmarks to embodied experience has some tradition in
research in spatial cognition. In environmental psychology, for example, Siegel and
White wrote, “landmarks are unique configurations of perceptual events (patterns).
They identify a specific geographic location” ([42], p. 23), and similar words can
be found elsewhere (e.g., [8]). According to their distinctive experience, landmarks
should be “the most easily recalled attributes of a region” [41]. Sadalla et al. then go
on to “explore the function of landmarks as spatial reference points, points that serve
as the basis for the spatial location of other (nonreference) points” (ibid.). Similarly
the definition has been supported by neuroscience, which has shown that objects
relevant for navigation and orientation do not only engage object recognition in the
brain but also areas associated with spatial memory (e.g., [21, 24]).
Presson and Montello use this definition as well (“objects that are relatively
better known and define the location of other points”) in their discussion of the
nature of landmarks [35]. In particular they point out that landmarks, in order to
be able to define the location of other points (objects in our terminology), must
be distinct from these other elements in spatial memory, and central to the nature
and organization of mental spatial representations. We will come back later to a
discussion that landmarks may be stronger or weaker in their distinct experience,
and that stronger ones may be used as reference points to locate weaker ones.
In this regard, only objects that are located by reference to “better known” objects
are not landmarks. Presson and Montello continue, referring to the observation of
asymmetric distance estimates between reference and non-reference points made by
Sadalla et al.: “The relation of reference to non-reference points is assumed to be
asymmetric although the notion that there are a few elements to which many others
are spatially related does not require this to be so. Non-reference points are more
likely to be defined in terms of their relation to reference points than vice versa,
and the judged distance from a reference point to a non-reference point may not be
equal to the same distance judged in the reverse direction”. Landmarks, to be used as
points of reference, must be objects that are relatively better known than the other
objects in their neighborhood. In this regard, Appleyard [1], the urban designer,
had already postulated: “We have to go beyond Lynch’s identification of known
urban element types. We must determine the reasons why these elements are known
which means discovering the attributes that capture attention and hold a place in the
inhabitant’s mental representation of his city” (p. 131). But generally in urban design
and architecture the notion of landmarks is restricted to the built environment, or
more precisely buildings, and falls short for the purpose of capturing human
mental spatial representations or, correspondingly, intelligent spatial systems.
On 28 June 2013, Kate Schneider, travel editor of News Ltd., wrote: “Kings
Park War Memorial in Perth. Fremantle prison. Melbourne’s Block Arcade.
What do these places have in common? Well, they’ve all made a list of the
nation’s best landmarks by travel website TripAdvisor. Yep, really. According
to the site, they are among the top spots to go in Australia for “enriching and
entertaining experiences”. The list was based on millions of reviews submitted
by travellers on the site over the past year. It also includes iconic Aussie
attractions such as the Sydney Opera House and the Sydney Harbour Bridge,
mixed in with grim locations such as the Port Arthur Historic Site. The list
had us wondering, is this the best [Australia has] to offer the world?”
http://www.news.com.au/travel/australia/australia8217s-top-10-landmarks-named-in-tripadvisor-list/story-e6frfq89-1226670801261#ixzz2XjmExZCp, last visited 3/1/2014.
Machine learning algorithms [33], one of the core tools of artificial intelligence [40],
can use such lists as training data and then try to identify more
landmarks. The task of machine learning (or any learning, for that matter) is not
trivial: If the examples in Fig. 1.3 above are landmarks, does then the Alcazar in
Spain count as a landmark as well? In order to evaluate such a question, an algorithm
(as well as a human mind) would collect characteristics of the given landmarks and
search for patterns of similarity for conceptual resemblance.
Lists, or datasets of candidates, however, do not resolve the challenges discussed
before. These lists require meta-level descriptions of the context of their validity,
a link between ranking and grading, local structures by spatial relationships, and
they require continuing maintenance due to changes in the world. Most critically,
however, the relationships between the elements of the dataset and the definition
of landmarks must be established methodologically. So just as cognitive science
is challenged in identifying the nature and role of landmarks, artificial intelligence
is challenged as well. Generally, spatial databases are fundamentally different from,
and incompatible with, mental spatial representations. Analysis in spatial databases
is based on geometric representations and geometric algorithms. For example,
routing in navigation systems will apply an exact shortest path algorithm, such
as Dijkstra’s [7]. People’s mental spatial representations do not allow this kind of
reasoning. Their mental representations are good at (fast) guesses, but not at exact
computations [18, 22].
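To make the contrast with human reasoning concrete, the following is a minimal sketch of the kind of exact shortest path computation a routing engine performs, here Dijkstra's algorithm over a small weighted graph. The street network, its node names and the edge lengths in meters are invented for illustration, and production navigation systems typically use considerably more elaborate variants.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (cost, path) of a shortest path, or (inf, []) if unreachable.

    `graph` maps each node to a dict {neighbor: non-negative edge weight}.
    """
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry, a shorter route was already settled
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(queue, (nd, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical street network; edge weights are lengths in meters.
streets = {
    "home": {"corner": 120.0, "park": 300.0},
    "corner": {"library": 200.0, "park": 100.0},
    "park": {"library": 150.0},
    "library": {},
}

print(dijkstra(streets, "home", "library"))
# (320.0, ['home', 'corner', 'library'])
```

The result is exact and metric, which is precisely what mental spatial representations do not deliver.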
Consequently, their language is quite different from that of autonomous systems such as
robots. While a robot is happy with instructions such as “Move in direction of 354.6
degrees for 35.7 meters, then turn by 32.6 degrees”, a human would find this hard to
carry out, let alone manage the cognitive load of memorizing all these numbers
(imagine a full route description of this sort). People, in contrast, communicate in
ways the machine has difficulty interpreting. For a machine, understanding “To the
café at the library” requires a number of tasks, such as identifying the library in a
spatial dataset by interpreting the conversational context, and then disambiguating
the café at the library from all other cafés. The latter involves searching current
business directories, an ontological matching of all businesses that can count as
a café, and last but not least an interpretation of the preposition at, a qualitative
spatial relationship that is vague and underspecified. Judging from the structure of
this phrase, where the café is located relative to the library, the library appears to
be a landmark in the sense of the definition above. Thus, the capacity to interpret
landmarks is essential for smart human-computer interaction.
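The interpretation steps just described can be sketched in a few lines, under strong simplifying assumptions. The toy dataset, the set of categories that "count as a café", and especially the crisp distance threshold standing in for the vague preposition "at" are all hypothetical. A real system would need conversational context to pick the library and a far richer model of "at".

```python
from math import hypot

# Assumed stand-in for ontological matching: categories that count as a café.
CAFE_LIKE = {"cafe", "coffee_shop", "espresso_bar"}

# Hypothetical spatial dataset; coordinates are projected meters.
FEATURES = [
    {"name": "City Library", "category": "library", "xy": (0.0, 0.0)},
    {"name": "Book Nook Cafe", "category": "cafe", "xy": (15.0, 5.0)},
    {"name": "Station Espresso", "category": "espresso_bar", "xy": (400.0, 120.0)},
    {"name": "Corner Bakery", "category": "bakery", "xy": (10.0, 0.0)},
]


def resolve_at(target_categories, anchor_category, features, at_radius=50.0):
    """Resolve 'the <target> at the <anchor>' to a feature name, or None.

    `at_radius` is a crude, assumed reading of 'at' as 'within N meters'.
    """
    anchors = [f for f in features if f["category"] == anchor_category]
    if len(anchors) != 1:
        return None  # anchor (the landmark) is missing or ambiguous
    ax, ay = anchors[0]["xy"]
    candidates = []
    for f in features:
        if f["category"] in target_categories:
            d = hypot(f["xy"][0] - ax, f["xy"][1] - ay)
            if d <= at_radius:
                candidates.append((d, f["name"]))
    # Disambiguate by choosing the candidate closest to the landmark.
    return min(candidates)[1] if candidates else None


print(resolve_at(CAFE_LIKE, "library", FEATURES))  # Book Nook Cafe
```

The library plays the role of the landmark here: the café is found only by its relation to it.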
And once the machine succeeds with the interpretation of the phrase, and
computes a route for this person, this person expects human-like directions, for
example, “Turn right at the traffic lights”. The construction of these directions in
natural language is as difficult for a machine as the prior challenge of interpreting
natural language. It requires again a number of tasks, such as applying a smart
1.2 Related Concepts
From the definition and discussion above it becomes clearer that landmarks emerge
in the process of perceiving, learning and memorizing an environment in a particular
context, and that these memorized landmarks will be picked up in spatial reasoning
or communication processes. For objects in the environment to acquire landmark
quality these objects must somehow stand out. Furthermore, to contribute to the
embodied experience of the environment in which a person moves these objects
must be related to the human body and human senses. Using a classification relative
to embodied experience introduced by Montello [34], these objects have to be
identifiable objects in vista space, environmental space, or geographic space. Vista
space is the space covering all objects that can be seen from a single viewpoint,
and with the naked eye. Examples are a room, an open plaza, or any other
vistas in streetscape. Environmental space is the space learned by locomotion—the
movement of the body coordinated to the proximal surrounds—and the integration
of this embodied experience. Examples are buildings that are learned only by
walking through, or city districts that are learned by walking or driving around.
All body senses, including sight (vistas), contribute to an integrated, coherent
mental representation of these spaces. Geographic space is the space larger than
environmental space such that it can be learned only from symbolic representations
such as maps. Examples are countries or even larger cities, which cannot be explored
completely on foot, by car or by another form of locomotion.
With these integrated layers of human experience and mental representations,
landmarks will be found at each level. Orientation is helped by outstanding objects
in vista space (“the keys are on the table”), in environmental space (“the library
is around the corner”), as well as in geographic space (“Cologne must be in this
direction”). Having this more graphic image of landmarks at hand, it becomes
easier to draw boundaries around the concept of landmarks, and to discuss how
other things relate to landmarks. This way, objects that stand out on other than
vista, environmental or geographic scale are not considered to form landmarks.
1.2.1 Places
“What then after all is place? The answer to this question may be elucidated as follows
[. . . ]:
• Place is what contains that of which it is the place.
• Place is not part of the thing.
• The immediate place of a thing is neither less nor greater than the thing.
• Place can be left behind by the thing and is separable.
• In addition: All place admits of the distinction of up and down, and each of the bodies is
naturally carried to its appropriate place and rests there, and this makes the place either
up or down.”
The last item seems to indicate that places have gravity, an understanding shared
by contemporary geography [3]. In the Aristotelian notion of place objects have
their unique immediate place, or footprint: “Hence we conclude that the innermost
motionless boundary of what contains is place” ([2], IV). Two phenomena cannot
have the same immediate place, and one phenomenon cannot be at two different
immediate places at the same time. Beyond the immediate place a phenomenon is
simultaneously nested in an unlimited number of other places. For example, while
I am sitting at my desk, I am at the same time in my office, in the department, et
cetera.
However, such a philosophical perspective does not yet consider the more
experiential perspective of geography, with its notion of a sense of place (e.g., [31,
36, 45]). A geographer’s approach might be to call any meaningful spatial configuration
of shared affordances to the human body a place. Such a definition shows
some similarities to landmarks, for example, both requiring perceptual wholeness
and cognitive salience, and both being context-dependent. But it also explains the
difference: Landmarks act as anchor points, and hence are conceptually abstracted
to nodes where no internal structure is required. Their purpose is fast reasoning or
efficient communication. Place, in contrast, captures the meaning and affordance of
a scene, and hence, is rich in structure and complex to communicate. With regard
to the latter, and especially focusing on the context dependency, Freksa and Winter
have pointed to the principle of sufficiency ([49], p. 32):
“Cognition of and communication about place in spatial environments is a matter of
sufficiency. Sufficiency can be captured by contrast sets, that is, by specifying the meaning
of a place in a given context by explicating the contrast to other places. People conceptualize
a portion of an environment as a place if their embodied experience of this portion
shows a wholeness against the background, i.e., if it has some contrasting properties to
its environment or to other places. Referring to such human embodied experience, our
arguments and examples will focus on geographic places, i.e., on places in our physical
environment of vista, environmental, or geographic scale as defined by Montello [34].”
“As interrelated processes, cognition and language make use of places. For example,
spatial reasoning happens on qualitative spatial relationships between places [12], and
everyday language refers to named and unnamed places (e.g., on Federation Square, at
the road intersection) and the relationships between them (e.g., at Birrarung Marr near
Federation Square). Sketches, as non-metric graphical externalizations of cognitive or
verbal representations, also reflect configuration knowledge of places and their relation-
ships. And yet, despite recent progress in neuroscience and cognitive science our knowledge
about cognitive representations and reasoning is not sufficient to formally characterize
the entities, relations, and operations that would enable us to build a system that reflects
the computational processes of spatial representations in our mind.”
Some geographic information systems contain point of interest databases, and may
also use this terminology, sometimes abbreviated to POI, in their user
interface. For example, car navigation services and public transport trip planners
let users specify their destination by selecting a point of interest from their
database, and mobile location-based services show points of interest on their
you-are-here maps. A point of interest is simply a point (typically a point on a
map, or a GPS coordinate in some spatial reference system) that somebody with some
authority has declared to be interesting. A point of interest typically comes with a
name, carrying the semantics of what can be found at the location characterized by
this point (Fig. 1.6).
Yet what can be considered to be interesting depends on the particular context
a person may find themselves in. As a car driver, they will find gas stations,
parking garages, and speed cameras interesting. As a tourist they will find museums,
churches, restaurants and also hotels interesting. As a public transport user they
will find stations, event locations, or public institutions interesting. We have become
used to specialized services catering for these markets, such as car navigation systems
or tourist guides. However, what about generic information services? Once we are
able to take our car navigation system out of the car and use it for pedestrian
navigation, these systems need mechanisms to determine the context of a
particular user, or query, to come up with a relevant result. But then, an economic
geography researcher will perhaps find places of residence and workplaces interesting,
which are typically lacking in point of interest databases. Concerned parents want
to know where their children are, and points representing these locations are also
lacking in point of interest databases. Or a serendipitous armchair traveler might
enjoy discovering locations of objects of unexpected categories, which are even
harder for a system to predict than profiled preferences. Thus, prefabricated and stored
points of interest come along with some paternalism: Something at some location
has been deemed to be interesting by somebody in a specific context.
Are points of interest landmarks? As indicated, current systems choose particular
categories of points of interest for particular contexts, for example, car driving
or using public transport. They do consider a potential service for the user, such
as navigating by a given means. This can include communicating destinations,
presenting waypoints that may service the current mode of travelling, or presenting
choices of routes for a given mode. Since points of interest are service-oriented,
they are also attractive advertising tools. For example, particular brands may want
to ensure that they are present on car drivers’ maps. From their perspective it
might be even more attractive to block other brands from being listed. But despite
the commercial flavor and potentially compromised selection of points of interest
there is another substantial argument why points of interest are not landmarks: The
selection of points of interest does not consider the appearance to the human senses,
and more generally, does not aim to support human orientation and wayfinding.
Some of the point of interest categories cannot be expected to be easily recognized
from outside, or to stand out in their neighborhood. For example, the POI category
of medical doctors may produce POIs on a map, but they will actually be hard to
find in the environment for a car driver, with practices being unremarkable from
the outside (in accordance with the codes of the profession). They also can be
located in hidden places such as in malls, or on upper floors of buildings.
But if the doctor’s practice does not stand out in its neighborhood, it will not
structure mental representations because people cannot experience it. Other point
of interest categories may typically be highly visible, for example restaurants.
However, that does not mean automatically that the entities of this category stand
out in their environment. For example, China Town may show a strip of Chinese
restaurants. A POI service will unashamedly show the points of interest in their
high local density. But just because of that reason no single restaurant stands out
in its neighborhood. These restaurants are unsuited to form a point of reference in
mental representations, and a decision point “At the China restaurant turn left” does
not work either. The aggregate China Town, however, may form a landmark in some
context.
This argument does not answer the opposite question either. Are landmarks
points of interest? Some navigation systems, especially those addressing tourists,
like to suggest so. In their context, the landmarks shown are recommendations to visit
these places because they are famous, prominent, en vogue, of historical interest,
or of cultural interest. If the purpose of the service is limited to recommendations
only, neither the compilation of these landmarks for a POI database nor their
presentation on a map aim to support human orientation. Nonetheless, whatever
the purpose of the presentation these landmarks may actually help the tourist with
global orientation after all. Accordingly, these landmarks are at most a subset of the
landmarks studied in this book. They are those landmarks somebody has selected
to make recommendations for tourists, i.e., those that serve also an interest different
from spatial orientation and reasoning.
A special category of points of interest are those collected by a machine for
an individual. Let us call them favorite places. Favorite places are a product of
machine learning: They can be derived from the individual’s prior search, movement
behavior, their social network’s behavior, or from the behavior of a group of people
with similar profile. “My home” or “my favorite coffee place” can form landmarks
in my mental spatial representation. These are individual landmarks. The challenge
in communication is that these landmarks may not be shared, such that a place
description “let us meet at my favorite coffee place” may not work, depending on
the recipient’s intimacy with my life. Landmarks imply a shared understanding, and
thus, databases should store geographic objects that, since they can be experienced
by all people, have chances to structure many people’s mental representations.
1.2.3 Icons
Another concept to distinguish from landmarks is that of icons. Icons have strong
semantics in semiotics: their image stands for something else. If the icon, an image,
stands for a geographic entity there are some parallels to landmarks. Consider, for
example, the Eiffel Tower, which is an icon of Paris, if not of France. Typically, being
an icon of a geographic entity requires a containment relationship. The Eiffel Tower
is an icon of Paris because it is in (or part of) Paris, even relatively central, in
addition to being highly visible, standing out with a unique, unambiguous shape,
and carrying a strong emotional attachment of locals and visitors alike. Hence, icons
refer to geographic objects that are landmarks—the geographic objects stand out,
are known, are referred to—but not every landmark has an iconic significance. For
example, Federation Square in Melbourne is a landmark—probably every person in
Melbourne knows it, and it is frequently used in route descriptions—but since it has
no clear image it is not an icon.
1.3 Why Landmarks Challenge Intelligent Systems 19
It appears that landmarks are serving so well in structuring the spatial domain
that the concept has been mapped into other domains successfully. One of these
mappings happens when geographic reality—the physical world—is mapped to
the metaphysical world. Then, for example, heaven, paradise or cloud-cuckoo
land become landmarks (orientation points, places to be), which fulfil the
definition above with the only exception that the space can no longer be experienced
with human senses. Hence we do not consider landmarks in metaphysical space.
These mappings into other domains have been mentioned already. Merriam-Webster’s
definition (Meaning 3) contained “very important events or achievements”,
which would be in the social domain and also in the temporal domain, with
regard to providing structure in time. The landmark victory, a significant event in
history, structures the memory of human experience both as a sudden shift of power
relations, and with it into a before and after the victory.
robot finds useful are not necessarily the landmarks a person would identify and use
in an environment, and the spatial representations are different. Now, the imitation
game only requires that a person and a machine can communicate, or more precisely
that a machine can communicate on the person’s terms, independent of its internal
structures.
Accordingly, John McCarthy more recently explained the term artificial intelligence
in the following dialog (footnote 7):
Q. What is artificial intelligence?
A. It is the science and engineering of making intelligent machines, especially intelligent
computer programs. It is related to the similar task of using computers to understand
human intelligence, but AI does not have to confine itself to methods that are biologically
observable.
Q. Yes, but what is intelligence?
A. Intelligence is the computational part of the ability to achieve goals in the world. Varying
kinds and degrees of intelligence occur in people, many animals and some machines.
Q. Isn’t there a solid definition of intelligence that doesn’t depend on relating it to human
intelligence?
A. Not yet. The problem is that we cannot yet characterize in general what kinds of
computational procedures we want to call intelligent. We understand some of the
mechanisms of intelligence and not others.
More recently, critique of the imitation game has also been expressed, for example
by French [13, 14]. His critique is not about the validity of Turing’s argument, but
about the relevance of the test. He argues that perfect imitation would include making
the same mistakes. This is a critical argument for the domain of human spatial cognition,
where it is well documented that people make systematic errors of judgement
(some of them are discussed later in this book). Assuming a player of the imitation
game has read about the weaknesses of human spatial judgement, she could try to
take advantage of this in a Turing test and may succeed in identifying the communication
partner as a machine because it does not show these weaknesses. However,
theoretically, the computer could know this as well and skew its (undistorted)
results accordingly, even randomly, to perfectly mislead the player. But what would
be the purpose of building such a machine? Wouldn’t it be more useful, French
asks, to let the machine produce undistorted results and communicate them in
human terms to a ‘user’, that is, a person in a concrete decision-making situation?
Wouldn’t that be smart?
People not only make systematic errors of judgement in their spatial reasoning,
they also vary in their spatial (communication) skills (e.g., [5]). Along
the same line of argument, it appears smart to build only machines with the best
(human) reasoning and communication skills. As French writes: “The way forward
in AI does not lie in an attempt to flawlessly simulate human cognition, but in
trying to design computers capable of developing their own abilities to understand
the world and in interacting with these machines in a meaningful manner” ([14],
p. 77). Instead, French postulates that, in order to achieve artificial intelligence,
Footnote 7: http://www-formal.stanford.edu/jmc/whatisai/whatisai.html, last visited 3/1/2014.
In some respects, understanding and answering the request is not too difficult
a task for an intelligent system either. An intelligent system is superior to humans
in computing routes. It computes faster, processes more data, and produces more
accurate results. An intelligent system can, for example, guarantee to compute the
quickest route, and perhaps even include data about current traffic conditions,
something our person in the example above cannot do. But to do so the system
must know start and destination. If the intelligent system runs on the user’s
smartphone, determining the start is possible using the sensors on board. But to determine
the destination the system has to resolve the word ‘airport’, which is surprisingly
tricky.
Obviously, ‘airport’ is ambiguous. The system knows many airports. Which one
is the one the user has in mind? Choosing the most prominent one (as search engines
are good at) would guide everybody to Atlanta, currently the world’s busiest airport
by passenger numbers. But this is probably not what the user had in mind. Choosing
the one nearest to the user would be a better guess, but in the concrete example this
would lead to a local airfield, not the international airport. Choosing the one most
frequently visited by this person (another option for machine learning) would also be
inappropriate. No, in this particular case it is the one from which the inquirer wants
to depart on a flight a couple of hours from now. An intelligent system could determine
that the function of an airport is to provide air travel, that air travellers need valid
tickets bought in advance, and thus, could check whether this user has such a ticket
to identify the airport. This is not only quite a complex reasoning chain, it also
requires assessing all the various suggestions and rejecting the less likely ones.
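The reasoning chain above can be illustrated with a small sketch. It is not a method proposed in this book. The class `Airport`, its fields, the scoring formula, and all numbers are hypothetical and chosen only to show how contextual evidence (a ticket for an imminent departure) can override the generic heuristics of prominence, proximity, and visit frequency.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Airport:
    name: str
    prominence: float   # hypothetical global prominence score in [0, 1]
    distance_km: float  # distance from the user's current position
    visits: int         # how often this user has been there before


def resolve_airport(candidates: List[Airport],
                    ticketed_departure: Optional[str] = None) -> Airport:
    """Pick the airport the user most plausibly means by 'airport'."""
    # Strongest evidence first: a valid ticket for a flight departing soon.
    if ticketed_departure is not None:
        for airport in candidates:
            if airport.name == ticketed_departure:
                return airport

    # Fallback: combine the generic heuristics (an arbitrary, illustrative mix).
    def score(a: Airport) -> float:
        nearness = 1.0 / (1.0 + a.distance_km)
        familiarity = a.visits / (1.0 + a.visits)
        return a.prominence + nearness + familiarity

    return max(candidates, key=score)


candidates = [
    Airport("Atlanta", prominence=1.0, distance_km=15000.0, visits=0),
    Airport("Local airfield", prominence=0.1, distance_km=5.0, visits=0),
    Airport("International airport", prominence=0.6, distance_km=25.0, visits=0),
]

without_context = resolve_airport(candidates)
with_ticket = resolve_airport(candidates, ticketed_departure="International airport")
print(without_context.name, "|", with_ticket.name)
```

With these made-up numbers the generic heuristics alone select Atlanta, whereas the ticket evidence selects the international airport, mirroring the argument in the text.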
After a route has been selected by the system, it has to communicate it in a way
easy to understand and memorize by the user. Assume that the system is aware of
the benefits of communicating by landmarks, then it has a challenge in selecting
landmarks for this purpose. While a person need not think twice when picking the
‘hospital’, an intelligent system knows thousands of objects along the route, of a
variety of types and spatial granularities, from suburbs to ATMs, garbage bins and
light poles. What is a good landmark? The one in the most outstanding color? A fire
hydrant; not a good choice for a car driver, and too many of them along the route.
The largest one? The best known one? Probably one of the objects at decision points,
but which one? And is the landmark unambiguous (“the hospital”) or ambiguous
(“a hospital”)? Is the landmark known to the inquirer, or at least recognizable in
its identity, such that the system can use its name (“Royal Melbourne Hospital”)?
If so, does the inquirer also know how to find this landmark such that the first
part of the route can be folded into a simple instruction: “You know how to find
Royal Melbourne Hospital?” Klippel calls this spatial chunking [23]. We will come
back to this later. Or can the inquirer at least recognize the type, as a hospital can
typically be identified in its function from the outside by signs and certain functions
at ground level? If not, should the landmark be described by its appearance: “A tall
building on your right that is actually a hospital”? That is to say, an intelligent system
should integrate knowledge about the context of the inquiry, knowledge about the
appearance of objects, analytical skills with respect to the route, and knowledge
about the familiarity of the inquirer with the environment.
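The cascade of questions about how to refer to a chosen landmark (by name, by type, or by appearance) can be sketched as a simple decision procedure. This is only an illustration under strong assumptions. The class `Landmark`, the function `refer_to`, and the boolean inputs are hypothetical, and a real system would have to estimate the inquirer's familiarity rather than be told it.

```python
from dataclasses import dataclass


@dataclass
class Landmark:
    name: str           # proper name, e.g. "Royal Melbourne Hospital"
    type_label: str     # category, e.g. "hospital"
    appearance: str     # visual description, e.g. "tall building"
    type_visible: bool  # can the type be recognized from the outside?


def refer_to(landmark: Landmark, known_to_user: bool, same_type_on_route: int) -> str:
    """Choose a referring expression for a landmark already selected."""
    if known_to_user:
        # The inquirer can identify the object: use its name.
        return landmark.name
    if landmark.type_visible:
        # Unambiguous along the route: definite article, otherwise indefinite.
        # (Simplification: ignores 'an' before vowel sounds.)
        article = "the" if same_type_on_route == 1 else "a"
        return f"{article} {landmark.type_label}"
    # Neither name nor type helps: describe the appearance.
    return f"a {landmark.appearance} that is actually a {landmark.type_label}"


rmh = Landmark("Royal Melbourne Hospital", "hospital", "tall building", True)
print(refer_to(rmh, known_to_user=True, same_type_on_route=1))
print(refer_to(rmh, known_to_user=False, same_type_on_route=2))
```

The sketch deliberately leaves out the harder step discussed above, namely deciding which of the thousands of objects along the route to pick in the first place.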
1.4 Summary
References
1. Appleyard, D.: Why buildings are known. Environ. Behav. 1(2), 131–156 (1969)
2. Aristotle: Physics. eBooks@Adelaide. The University of Adelaide, Adelaide (350BC)
3. Couclelis, H.: Aristotelian spatial dynamics in the age of GIS. In: Egenhofer, M.J., Golledge,
R.G. (eds.) Spatial and Temporal Reasoning in Geographic Information Systems, pp. 109–118.
Oxford University Press, New York (1998)
4. Couclelis, H., Golledge, R.G., Gale, N., Tobler, W.: Exploring the anchorpoint hypothesis of
spatial cognition. J. Environ. Psychol. 7(2), 99–122 (1987)
5. Daniel, M.P., Tom, A., Manghi, E., Denis, M.: Testing the value of route directions through
navigational performance. Spatial Cognit. Comput. 3(4), 269–289 (2003)
6. Dey, A.K.: Understanding and using context. Pers. Ubiquit. Comput. 5(1), 4–7 (2001)
7. Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1, 269–271
(1959)
8. Downs, R.M., Stea, D.: Image and Environment. Aldine Publishing Company, Chicago (1973)
9. Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. The MIT Press, Cambridge
(1998)
10. Frank, A.U.: The rationality of epistemology and the rationality of ontology. In: Smith, B.,
Brogaard, B. (eds.) Rationality and Irrationality. Hölder-Pichler-Tempsky, Vienna (2000)
11. Frank, A.U., Raubal, M.: Formal specification of image schemata: a step towards interoper-
ability in geographic information systems. Spatial Cognit. Comput. 1(1), 67–101 (1999)
12. Freksa, C.: Qualitative spatial reasoning. In: Mark, D.M., Frank, A.U. (eds.) Cognitive
and Linguistic Aspects of Geographic Space. NATO ASI Series D: Behavioural and Social
Sciences, pp. 361–372. Kluwer, Dordrecht (1991)
13. French, R.M.: Subcognition and the limits of the Turing test. Mind 99(393), 53–65 (1990)
14. French, R.M.: Moving beyond the Turing test. Comm. ACM 55(12), 74–77 (2012)
15. Galton, A.: Fields and objects in space, time, and space-time. Spatial Cognit. Comput. 4(1),
39–68 (2004)
16. Gärdenfors, P.: Conceptual Spaces. The MIT Press, Cambridge (2000)
17. Gärling, T., Böök, A., Lindberg, E.: Adults’ memory representations of the spatial properties
of their everyday physical environment. In: Cohen, R. (ed.) The Development of Spatial
Cognition, pp. 141–184. Lawrence Erlbaum Associates, Hillsdale (1985)
18. Gigerenzer, G., Goldstein, D.G.: Reasoning the fast and frugal way: models of bounded
rationality. Psychol. Rev. 103(4), 650–669 (1996)
19. Goodchild, M.F.: Formalizing place in geographical information systems. In: Burton, L.M.,
Kemp, S.P., Leung, M.C., Matthews, S.A., Takeuchi, D.T. (eds.) Communities, Neighborhoods,
and Health: Expanding the Boundaries of Place, pp. 21–35. Springer, New York (2011)
20. Grice, P.: Logic and conversation. Syntax Semantics 3, 41–58 (1975)
21. Han, X., Byrne, P., Kahana, M.J., Becker, S.: When do objects become landmarks? A VR study
of the effect of task relevance on spatial memory. PLoS ONE 7(5), e35940 (2012)
22. Kahneman, D.: Thinking, Fast and Slow. Farrar, Straus and Giroux, New York (2011)
23. Klippel, A., Hansen, S., Richter, K.F., Winter, S.: Urban granularities: a data structure for
cognitively ergonomic route directions. GeoInformatica 13(2), 223–247 (2009)
24. Knauff, M.: Space to Reason: A Spatial Theory of Human Thought. MIT Press, Cambridge
(2013)
25. Kuhn, W.: Modeling the semantics of geographic categories through conceptual integration.
In: Egenhofer, M.J., Mark, D.M. (eds.) Geographic Information Science. Lecture Notes in
Computer Science, vol. 2478, pp. 108–118. Springer, Berlin (2002)
26. Lakoff, G., Johnson, M.: Metaphors We Live By. The University of Chicago Press, Chicago
(1980)
27. Lakoff, G.: Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. The
University of Chicago Press, Chicago (1987)
28. Lynch, K.: The Image of the City. The MIT Press, Cambridge (1960)
29. Mark, D.M., Smith, B., Tversky, B.: Ontology and geographic objects: an empirical study of
cognitive categorization. In: Freksa, C., Mark, D.M. (eds.) Spatial Information Theory. Lecture
Notes in Computer Science, vol. 1661, pp. 283–298. Springer, Berlin (1999)
30. Mark, D.M., Turk, A.G.: Landscape categories in Yindjibarndi. In: Kuhn, W., Worboys, M.F.,
Timpf, S. (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 2825,
pp. 28–45. Springer, Berlin (2003)
31. Massey, D.: The conceptualization of place. In: Massey, D., Jess, P. (eds.) A Place in the
World?, vol. 4, pp. 45–77. Oxford University Press, Oxford (1995)
32. Miller, G.A.: WordNet: a lexical database for English. Comm. ACM 38(11), 39–41 (1995)
33. Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. MIT Press,
Cambridge (2012)
34. Montello, D.R.: Scale and multiple psychologies of space. In: Frank, A.U., Campari, I.
(eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 716, pp. 312–321.
Springer, Berlin (1993)
35. Presson, C.C., Montello, D.R.: Points of reference in spatial cognition: stalking the elusive
landmark. Br. J. Dev. Psychol. 6, 378–381 (1988)
36. Relph, E.C.: Place and Placelessness. Pion Ltd., London (1976)
37. Rosch, E.: Natural categories. Cognit. Psychol. 4(3), 328–350 (1973)
38. Rosch, E.: Principles of categorization. In: Rosch, E., Lloyd, B.B. (eds.) Cognition and
Categorization, pp. 27–48. Lawrence Erlbaum Associates, Hillsdale (1978)
39. Rosch, E., Mervis, C.B., Gray, W.D., Johnson, D.M., Boyes-Braem, P.: Basic objects in natural
categories. Cognit. Psychol. 8(3), 382–439 (1976)
40. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach, 2nd edn. Pearson
Education, London (2003)
41. Sadalla, E.K., Burroughs, J., Staplin, L.J.: Reference points in spatial cognition. J. Exp.
Psychol. Hum. Learn. Memory 6(5), 516–528 (1980)
42. Siegel, A.W., White, S.H.: The development of spatial representations of large-scale environ-
ments. In: Reese, H. (ed.) Advances in Child Development and Behaviour, pp. 9–55. Academic,
New York (1975)
43. Sorrows, M.E., Hirtle, S.C.: The nature of landmarks for real and electronic spaces. In: Freksa,
C., Mark, D.M. (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol.
1661, pp. 37–50. Springer, Berlin (1999)
44. Thrun, S., Burgard, W., Fox, D.: Probabilistic Robotics. MIT Press, Cambridge (2005)
45. Tuan, Y.F.: Space and Place: The Perspective of Experience. University of Minnesota Press,
Minneapolis (1977)
46. Turing, A.M.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)
47. Tversky, A.: Features of similarity. Psychol. Rev. 84(4), 327–352 (1977)
48. Weiser, M., Brown, J.S.: The coming age of calm technology. In: Denning, P.J., Metcalfe, R.M.
(eds.) Beyond Calculation: The Next Fifty Years of Computing. Springer, New York (1997)
49. Winter, S., Freksa, C.: Approaching the notion of place by contrast. J. Spatial Inform. Sci.
2012(5), 31–50 (2012)
50. Wittgenstein, L.: Philosophical Investigations, 2nd edn. Basil Blackwell, Oxford (1963)
Chapter 2
Landmarks: A Thought Experiment
2.1 Experiment
Imagine an environment with no structure at all, like in Genesis 1.2: “The earth
was without form, and void; and darkness was upon the face of the deep”. The only
structure given is a flat infinite surface orthogonal to gravity. Walkers are presented
with a monochrome empty plane up to the horizon, under a white sky of diffuse
light. There is no further structure in this environment, and no hint for direction
other than the vertical axis imposed by gravity. There is not even a shadow to support
a sense of direction. All that the walkers experience in this environment is their
own locomotion, and thus path integration. Their body will tell them from which
location they originated. Therefore they can always point in the direction of this
location and guess the covered distance, a mental ability called homing [14]. In
their desire to establish and maintain orientation in this empty environment the
only location to relate to is this point of origin. Let us call it home. Home is the
only place there is. Thus home becomes a reference point for the exploration of the
environment: A landmark.
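Path integration and homing can be made concrete with a little vector arithmetic. The following is a minimal sketch, assuming the walker's locomotion is available as a list of steps, each with a heading and a length. The function names, and the fixed frame aligned with the walker's initial heading, are assumptions for illustration and not a model of the underlying cognitive process.

```python
import math


def integrate_path(steps):
    """Sum step vectors; steps is an iterable of (heading_rad, length).

    Headings are measured counter-clockwise in a frame fixed at departure.
    Returns the current position (x, y) relative to home at (0, 0).
    """
    x = y = 0.0
    for heading, length in steps:
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y


def homing_vector(position):
    """Direction (radians, in the same frame) and distance back to home."""
    x, y = position
    return math.atan2(-y, -x), math.hypot(x, y)


# Three units straight ahead, then four units after a left turn of 90 degrees.
walk = [(0.0, 3.0), (math.pi / 2, 4.0)]
position = integrate_path(walk)
bearing, distance = homing_vector(position)
print(position, bearing, distance)
```

Without external cues, every error in sensing a heading or a step length accumulates in the sum, which is one way to see why homing degrades over longer walks.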
If this environment had force fields that differ by location, then sensing
the force differentials could support the sense of direction. For example, if walkers
were equipped with a magnetic sense, or a compass as an external device,
their mental effort to maintain their sense of direction would be supported
considerably [21]. Similarly, if the plane were tilted relative to the direction of gravity,
their sense of gravity would add observations to path integration.
Now imagine that a walker, after roaming around to discover the environment,
stumbles upon a coin on the ground. This walker, picking up the coin, might feel
lucky. At last she has an experience in an otherwise uneventful environment. This
experience is linked to a location. She will remember the event, and for a while also
its location: Another landmark.
When she returns home she wants to report to her friends where she has found the
coin. How can she do this given that there are no external cues in the environment?
All she can refer to are directions and distances related to her body. She might say:
“Over there [pointing], perhaps 20 steps from here”, the direction physically linking
to her body, and the distance in a quantity relating to an internalized measure that
can be realized by walking. The reproducibility of measures proportional to the body
or body mechanics will help her even in communicating with friends, assuming
they have a body like hers. Scheider et al., also concerned with grounding human
experiences of space, wrote ([18], p. 76):
Humans perceive length and direction of steps, because (in a literal sense) they are able to
repeat steps of equal length and of equal direction. And thereby, we assume, they are able
to observe and measure lengths of arbitrary things in this environment.
They then went on to develop a theory of steps between foci of attention. In our
thought experiment we stick with the embodied experience, memory and commu-
nication of the walker (coming back to formal models in Chap. 4). In this regard it
is interesting to see how instead of words the walker used pointing to communicate
the direction. Finding words of similar accuracy in this environment would have
been quite difficult. The distance, however, was expressed in a quantitative manner.
A qualitative description might have come to mind more easily: “Over there
[pointing], not too far from here”. The qualitative description can be generated with
less cognitive effort (we will come back to ‘quality over quantity’ in Sect. 3.3.2.1),
but its realization might be more uncertain. In addition, the interpretation of a
qualitative term is context-dependent; “not too far” means different things when
talking about a car ride or an exploration of the immediate neighborhood, and in this
empty environment there is not much shared experience between the walker and her
friends that would establish context.
For her friends, the realization of her place descriptions is subject to uncertainty.
While an instruction “20 steps” may produce less uncertainty than “not too far”,
it also takes more cognitive effort to realize by requiring counting. And since the
coin was picked up by the walker, the landmark experience of her friends is only a
mediated one.
Human bodies vary. For example, using foot lengths or step lengths as
measures depends on an individual’s body and can be reproduced by another
person only with some uncertainty. Hence, when quantities need to be accurate
some agreement is required on an absolute measure, which is a measure
independent of any individual body. Most standardized unit measures are
anthropomorphic, based either on average body dimensions (e.g., the length
of a foot), or on human body movement (e.g., the length of a step). Even
the meter, today’s standard unit according to the International System of
Units (SI, from Système International d’Unités), was originally defined as a fraction
of the length of a great circle of the Earth, chosen such that the unit stands in some
relation to the scale of the human body. In order to be reproducible by everybody,
an absolute distance measure requires individuals to learn how their own body
properties (e.g., step length) compare to the standard. In other words,
absolute measures require an additional layer of cognitive effort in realization.
This expression could refer to the direction in front of herself (“put yourself in my
position; straight in front of me”), or could refer to the direction in front
of the friend (“from your position walk straight”). For the former interpretation, the
recipient must transform the instruction by a mental rotation from the orientation
of the speaker to their own orientation. For the latter interpretation, the speaker
must do the mental rotation before speaking. Both are practical only when speaker
and recipient are meeting face-to-face. If they are communicating over a distance
(e.g., telephone) or asynchronously (e.g., email) the communication of body-pose
related directions requires links to external cues. Since they do not exist in this
environment such a communication is simply impossible [10]. Pointing, however,
conveys the intended interpretation because it happens within the shared space [7].
What applies to the communication of direction—the ambiguity of the reference
system, and the need for a mental transformation between reference systems either
by the speaker or the recipient—applies also to the communication of distance.
If the distance from the speaker is “about 20 steps” this information may need to be
updated by the recipient according to their different positions and step sizes, and if
the speaker actually means “about 20 steps in front of you” this interpretation needs
to be conveyed as well. These mental transformations—rotations and translations—
require spatial skills people have only to varying degrees [1, 19].
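The mental transformations mentioned here, a rotation and a translation between body-centered reference systems plus a rescaling between step lengths, correspond to elementary geometry. The sketch below is an illustration under stated assumptions: both poses are known in a common planar frame, which is exactly what the walkers lack without face-to-face contact. The `Pose` class, the function name, and the step lengths in metres are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    x: float            # position in a shared planar frame (metres)
    y: float
    heading: float      # facing direction, radians counter-clockwise
    step_length: float  # individual step length (metres)


def speaker_to_recipient(speaker: Pose, recipient: Pose,
                         rel_bearing: float, steps: float):
    """Re-express 'rel_bearing, steps' from the speaker's body frame
    in the recipient's body frame. Returns (rel_bearing, steps)."""
    # Speaker's egocentric description -> location in the shared frame.
    dist = steps * speaker.step_length
    angle = speaker.heading + rel_bearing
    tx = speaker.x + dist * math.cos(angle)
    ty = speaker.y + dist * math.sin(angle)
    # Translation to the recipient's position, rotation to their heading.
    dx, dy = tx - recipient.x, ty - recipient.y
    rel = math.atan2(dy, dx) - recipient.heading
    rel = math.atan2(math.sin(rel), math.cos(rel))  # wrap to [-pi, pi]
    # Rescale the distance to the recipient's own step length.
    return rel, math.hypot(dx, dy) / recipient.step_length


speaker = Pose(0.0, 0.0, heading=0.0, step_length=0.8)
friend = Pose(2.0, 0.0, heading=math.pi, step_length=0.7)  # facing the speaker
rel, n_steps = speaker_to_recipient(speaker, friend, rel_bearing=0.0, steps=20)
print(rel, n_steps)
```

In this configuration "20 steps straight ahead" for the speaker becomes "about 20 steps directly behind you" for the friend, because the friend stands closer to the target but takes shorter steps.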
Stripped of any external cues within the environment, the walker will find it hard
to describe accurately the location where she found the coin. The more time passes,
or the more other walks she will have made since then, the less will she be able to
reenact the locomotion experience. Constant updating of multiple vectors (home and
all discoveries made over time) will become an overload, and the walker will give
up maintaining those vectors felt no longer to be essential (the homing vector last).
She may not necessarily forget the event itself, but she may forfeit her ability to
describe its location. For a while though, the event provided a second landmark for
the walker. In the real world we have similar experiences. “Let’s meet at the café
where we first met” works in communication because this café has attached
emotional value, is remembered for the meeting and for its location, and thus the
location is describable and can be found again.
From here on our thought experiment splits for a while into three parallel streams.
One continues with constructing a memorable space (Sect. 2.1.3.1), another intro-
duces a global frame of reference (Sect. 2.1.3.2), and a third one defines an arbitrary
frame of reference (Sect. 2.1.3.3). Each of them ends up with a network structuring
the environment, although motivated by different principles. The lines will be
reunited in Sect. 2.1.3.4.
Let us assume the walker decides to mark the location where she found the coin with
some chalk on the ground. She also draws a line on the ground from that location to
home. Home is another mark on the ground. These externalized landmarks can now
be found even with fading memories of path integration. The experience of walking
can be repeated, can be communicated more easily to others, and can be shared by others.
Further landmark experiences in the environment can be added over time,
and connected to the existing ones. What forms over time is a travel network
between landmarks. The intersections in this network were originally destinations,
or locations of particular (shared) memories or stories. But over time also the edges
in the network get some prominence, since they are commonly experienced by some
embodied locomotion. As the dependent elements (e.g., “the route from home to the
place where we have found the coin”) their prominence may be lower, but we will
argue later in the book that they will also share some landmarkness. For example,
the walker and her friends can give names to edges (or sequences of edges in this
regard). If over time the stories of the original landmark experiences fade away the
prominence of the named edges may even get stronger (e.g., “the coin route”).
Whatever elements are the primary anchors, either the connected landmark
locations or the dual view of the edges between landmarks, this network enables
spatial tasks such as orientation and wayfinding. Using landmarks, orientation would
be maintained either with local landmarks (“I am at the location where the coin was
found”) or with global landmarks (“I am three intersections from home”), and route
planning would be about an appropriate sequence of landmarks (e.g., “From home
to the location where the coin was found, and then right”). Using edges, orientation
would be maintained also either locally (“I am on Coin Route”) or globally (“Coin
Route must be in this direction”), and route planning would be about an appropriate
sequence of edges (e.g., “Point Route, then turn right into Serendipity Street”).
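The two views of route planning, as a sequence of landmarks or as a sequence of named edges, can be illustrated with a minimal sketch. The network below is hypothetical: the landmark names, the edge names other than those quoted in the text, and the helper functions are assumptions made purely for illustration.

```python
from collections import deque

# Hypothetical travel network: nodes are landmark locations, and every edge
# carries a name given by the walker and her friends (names are invented).
EDGES = {
    ("home", "coin"): "Coin Route",
    ("coin", "well"): "Serendipity Street",
    ("home", "tree"): "Tree Lane",
    ("tree", "well"): "Long Walk",
}

def neighbours(node):
    """Yield (neighbour, edge_name) pairs of an undirected network."""
    for (a, b), name in EDGES.items():
        if a == node:
            yield b, name
        elif b == node:
            yield a, name

def plan_route(start, goal):
    """Breadth-first search: the landmark sequence with the fewest edges."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt, _ in neighbours(node):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    if goal not in previous:
        return None  # goal is not connected to start
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]

def edge_names(path):
    """The dual description: the same route as a sequence of named edges."""
    return [EDGES.get((a, b)) or EDGES[(b, a)] for a, b in zip(path, path[1:])]

route = plan_route("home", "well")   # landmark view
streets = edge_names(route)          # edge view
```

Both descriptions denote the same route; which one is verbalized depends on whether the landmarks or the named edges are the more prominent anchors.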
Alternatively, let us change the settings of the experiment slightly. Instead of diffuse
light imagine there is a point-like light source, mounted several times above body
height. Call it the sun although it will not move in this experiment. This sun
can be observed from any location, and since it is the only marked point in the
environment—a singularity in an otherwise homogeneous empty space—it will
attract attention from walkers. It also provides a reference direction for orientation
and communication. Instead of using solely their locomotion-based orientation,
walkers can now refer to the sun: “Walk towards the sun”. There is now even a
singular location on the ground where the body casts no shadow, which is where the sun
is in the zenith. This point can be found by any walker. It is an embodied experience
but also a characteristic of the environment, and thus independent from previous
locomotion. Therefore it can be used as a common, or shared reference point.
Everybody can find it with no further instructions about its distance or direction.
32 2 Landmarks: A Thought Experiment
With respect to this reference point distances can be estimated: “Closer to the sun”
means a location where the walker has a shorter shadow (an embodied experience),
“near the sun” may have some context-dependent meaning related to shadow length,
and even quantities can be given, such as “within 10 step lengths from the pole under
the sun”.
Since in our thought experiment the height of the sun is related to human body
dimensions, and constant, the walker and her friends could develop over time
also a sense of distance from the pole by observing the angle of the sun above
the horizon, or the length of their body’s shadow, instead of estimating steps
from path integration. With a constant height of the sun $c$ and body height $a$,
the shadow length $b$ is proportional to the current distance from the pole $d - b$
by the similarity of triangles (Fig. 2.1):
\[ \frac{a}{b} = \frac{c}{d} \]
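As a worked illustration of this relation, the following sketch computes the walker's distance from the pole from an observed shadow length. It reads $d$ as the distance from the pole to the tip of the shadow, so that the walker stands at $d - b$; this reading, the function name, and the example values are assumptions.

```python
def distance_from_pole(shadow_length, body_height, sun_height):
    """Distance of the walker from the pole, by similar triangles.

    Assumes a / b = c / d, with d measured from the pole to the tip of the
    shadow, so that the walker stands at d - b = b * (c - a) / a.
    """
    if not 0 < body_height < sun_height:
        raise ValueError("the sun must be higher than the walker")
    if shadow_length < 0:
        raise ValueError("shadow length cannot be negative")
    tip_distance = sun_height * shadow_length / body_height  # d = c * b / a
    return tip_distance - shadow_length                      # d - b

# Example with assumed values: a = 1.8, c = 18 (ten body heights), b = 2.
example = distance_from_pole(2.0, 1.8, 18.0)
```

Under these assumptions a shadow of length zero corresponds to standing at the pole, and the distance grows linearly with the shadow length.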
the location of the discovery by chalk on the ground. By doing so she defines a prime
meridian (Fig. 2.2). Let us call the direction of the prime meridian North. With a
marked direction and the memory for the distance the walker has now a reproducible
characterization of the location of the discovery, one that does not change and is not
dependent on her actual pose or location. Compared to path integration, requiring
constant updating of all related locations, this is quite a relief for her memory. If she
wants to find the location again she only has to come back to the pole (a landmark),
find the prime meridian (another landmark), and remember the distance. If she
wants to tell friends she can now text: “Go to the pole, find the prime meridian,
and walk about 20 steps”, and neither of them has to be at the pole at the time of
communication. Furthermore, future other discoveries can be linked to the pole and
prime meridian as well. The pole becomes the datum point of a global reference
coordinate system.
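A minimal sketch of such a pose-independent description follows. It assumes a hypothetical convention in which bearings are measured clockwise from the prime meridian (North) and distances are counted in steps; the function name and the second discovery are invented for illustration.

```python
import math

def polar_to_plane(distance_steps, bearing_deg=0.0):
    """Convert (distance, bearing clockwise from North) to (east, north)."""
    bearing = math.radians(bearing_deg)
    east = distance_steps * math.sin(bearing)
    north = distance_steps * math.cos(bearing)
    return east, north

# The coin lies on the prime meridian, about 20 steps from the pole.
coin = polar_to_plane(20)
# A hypothetical later discovery: 10 steps from the pole at bearing 90 degrees.
other = polar_to_plane(10, 90)
# Both are anchored to the same datum, so they can be related to each other
# without anybody standing at either location.
separation = math.dist(coin, other)
```

Neither value depends on where the walker currently stands or which way she faces, which is the relief for memory described above.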
So, similarly to the world constructed from landmark experiences alone
(Sect. 2.1.3.1), a global reference coordinate system relieves the walker from constantly
updating internal representations, and takes over the anchoring of other locations.
Additionally we have gained a (polar) network structure by marking salient
locations with reference to the datum—the pole and the prime meridian. The
city of Karlsruhe in Germany, for example, shows such a radial network of lateral
circles and meridians, laid out from the palace in the center (Fig. 2.3). Alternatively,
the walker could lay out a rectangular network by constructing parallels to the prime
meridian and then perpendiculars to the meridian. A rectangular network has the
advantage of allowing constant block sizes, where the radial network has constantly
increasing blocks with the distance from the pole. A rectangular grid is a street
network pattern chosen in many European settlements in the new colonies of the
eighteenth and nineteenth century, such as North America and Australia.
Fig. 2.3 The regular design of the city of Karlsruhe, Germany. Map copyright OpenStreetMap
contributors, used under CC BY-SA 2.0
The walker decides to move from now on only along the drawn lines, and calls
them streets. In contrast to Sect. 2.1.3.1 these streets are constructed from abstract
principles, and so far only the datum has produced a landmark experience.
In another alternative, let us assume the walker does not care whether the environ-
ment provides any cues for structure. Instead, the walker decides to draw freely a
network of lines on the ground. The walker might be guided by cognitive efficiency,
as too sparse lines on the ground would cause temptation to find shortcuts (adding to
the network), and too dense lines would reduce the imageability of the network [12].
As in the other two lines of thought these streets would help structuring the
environment, but would not be based on any prior landmark experiences. Most cities
have neither a circular nor a rectangular network structure.
We are now ready to reunite the three alternative lines of our thought experiment.
Whatever network the walker chooses, the result is a structure of nodes (say, street
intersections) and edges between nodes (say, street segments between intersections).
Such a structure is a graph [4, 9, 15, 22]. This graph, since it was drawn by the
walker on the ground, is also planar (each intersection of edges is a node) and
embedded (each node is at a particular location on the plane).
Each network structures the environment independent from prior landmark
experiences. It has distinguished locations, the intersections, that are easy to
perceive with the body senses as locations of choice. They are memorable, and
hence landmarks by themselves. Since the walker could draw only a network of
limited extent the individual intersections are countable and finite. Instead of polar
coordinates within the polar reference system locations can now be described using
this discrete network, e.g., “at the corner of”. Distances can now be measured
in numbers of intersections. Directions are discrete as well. In the radial and
the rectangular network only right angles exist, and even in the free-form street
networks intersections offer a very limited number of possible directions to take.
In graph theory, this property is characterized by the degree of nodes: The degree
of a node is the number of its incident edges. Radial and grid networks have only
nodes of degree 4, except the pole in the radial network and the outer boundary
of the networks. Free-form street networks can also have nodes of degree 1 (dead-
ends), degree 3 (e.g., T-intersections) and degrees higher than 4 (more complex
intersections). However, due to physical space constraints this number cannot be
arbitrarily large. From a cognitive perspective the limited number of directions
is again a relief. The walker no longer needs to constantly control direction
and integrate steps. Instead, the walker only counts the passed intersections
and memorizes discrete turn choices. These typically low counts do not strain
numerical cognition and short term memory [6, 13].
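The notion of node degree can be made concrete with a small sketch of a rectangular network, represented as an embedded graph whose nodes are identified by their grid positions. The grid size and the function names are illustrative assumptions.

```python
def grid_network(columns, rows):
    """Adjacency sets of a rectangular street grid; nodes are (x, y) positions."""
    adjacency = {(x, y): set() for x in range(columns) for y in range(rows)}
    for (x, y) in adjacency:
        # Connect each intersection to its eastern and northern neighbour.
        for neighbour in ((x + 1, y), (x, y + 1)):
            if neighbour in adjacency:
                adjacency[(x, y)].add(neighbour)
                adjacency[neighbour].add((x, y))
    return adjacency

def degree(adjacency, node):
    """Number of edges incident to the node."""
    return len(adjacency[node])

grid = grid_network(5, 5)
interior = degree(grid, (2, 2))  # regular intersection
border = degree(grid, (0, 2))    # node on the outer boundary
corner = degree(grid, (0, 0))    # corner of the outer boundary
```

Interior intersections have degree 4, while nodes on the outer boundary have fewer incident edges, which is the exception noted in the text.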
Networks may appear relatively plain, but then there are also individual differ-
ences coming out of node degrees or node (or edge) centrality. For example, the
pole in a radial network has a node degree standing out in the otherwise regular
structure, and it is also the node of highest betweenness centrality in the network—
betweenness centrality of a node in a graph is a measure of how many shortest
paths pass through that node [8]. The latter means that statistically it will be experienced more
often by walkers than other nodes. These reasons add to the pole’s experiential
features. The pole is a stronger landmark than the other intersections, and due to
its uniqueness in this respect it is a global landmark. One function of a global
landmark is supporting global orientation and wayfinding. For example, even if a
walker somewhere in the radial network feels temporarily disoriented, some simple
heuristics will lead her back to the pole. She will follow the next meridian, i.e.,
the straight streets towards the sun, and will reach the pole. In other network forms
local variations may produce more subtle differences between nodes, but centers or
bottlenecks will stand out as well. A regular node, however, is a local landmark.
It helps in locating events and in referring to these events, serving as a local anchor point.
For example, if the coin had been found along an edge, not at a node, any of
the following would be a natural reference: “Near corner West fourth St and Prime
Avenue”, or “In West fourth St, 30 steps from Prime Avenue”, the latter implying
an intersection of the two streets.
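The claim that the pole stands out by its betweenness centrality can be checked on a small example. The sketch below builds a hypothetical radial network (six meridians, two rings, every edge counted as one hop) and computes betweenness with Brandes' accumulation scheme, which is a standard algorithm and not taken from the text; the network size and all names are assumptions.

```python
from collections import deque

def radial_network(meridians, rings):
    """Pole in the centre, concentric rings, straight meridians."""
    adjacency = {"pole": set()}

    def link(a, b):
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    for m in range(meridians):
        link("pole", (1, m))
        for r in range(1, rings + 1):
            link((r, m), (r, (m + 1) % meridians))  # along the ring
            if r < rings:
                link((r, m), (r + 1, m))            # along the meridian
    return adjacency

def betweenness(adjacency):
    """Betweenness centrality for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        order, dist = [], {source: 0}
        parents = {v: [] for v in adjacency}
        paths = {v: 0 for v in adjacency}
        paths[source] = 1
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in parents[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every unordered pair was visited from both of its endpoints.
    return {v: s / 2 for v, s in score.items()}

scores = betweenness(radial_network(meridians=6, rings=2))
most_central = max(scores, key=scores.get)
```

In this small example the pole obtains the highest score. With geometric edge lengths instead of hop counts the numbers would differ, so the sketch illustrates the measure rather than any particular city.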
Now let us imagine the cells enclosed by network edges and nodes are filled with
white blocks, larger than the human body. These blocks limit the sight of walkers.
Since they are plain white without further texture or structure they are nearly
indistinguishable for the human senses. There is no particular bodily experience
attached to the encounter of any individual block except the shape of the cell they
occupy.
These blocks get more importance in our experiment if we give them individual
faces or meanings. Individual blocks can be given a special shape, or a special
color. A block can be labelled “supermarket”, and another one “café”. These blocks
stand out from the other ones that were left unchanged plain white. They provide
a special experience for passersby. Walkers will memorize these experiences, and
attach them to the locations where they make these experiences. As long as these
labelled objects are globally unique and only few they can have global landmark
characteristics. “In the direction of the supermarket” provides global orientation
if everybody knows the supermarket. They can also take the function of local
landmarks. People can refer to them when describing local events (“in front of the
supermarket”, “three blocks from the supermarket”), either trusting that the recipient
has already encountered the supermarket before, or will easily identify
it when passing by. “At the supermarket” is an even more efficient description than
“at the third intersection” because it requires only one object recognition task,
and no counting.
However, landmarks do not have to be globally unique. There might be a second
supermarket in this environment, perhaps even of the same brand, as it happens out
there in any real city. The two supermarkets are still differentiable from the rest of
the environment, but an instruction such as “At the supermarket” must be considered
ambiguous now. There are three common cognitive mechanisms that are used for
disambiguation of local landmarks in a communication context:
1. Nearness: “At the supermarket” is disambiguated by choosing the nearest
individual as a default. Behind this cognitive heuristic is also embodied experience
since the cost of interaction with the environment increases with the
cost of travel: the nearest supermarket is the easiest one to reach. The argument
is additionally supported by the first law of geography as stated by Tobler:
“Everything is related to everything else, but near things are more related than
distant things” ([20], p. 236).
2. Order: “Go straight and at the supermarket turn left” cannot be solved on the
assumption that this next supermarket in a particular direction is also the nearest
one. Instead a principle of order comes into play. The next supermarket is the
first encountered in a particular search. By the same ordering principle one could
equally well say: “At the second supermarket turn left”.
3. Hierarchical priming: While “at the supermarket” is ambiguous, “at the super-
market on Prime Avenue” is less likely ambiguous (depending on whether the
street name is a differentiator). This hierarchical localization [16,17] works recur-
sively if needed, i.e., in cases where Prime Avenue either is not unambiguous or
not prominent enough. “At the supermarket on Prime Avenue, in the Southern
sector [of the environment]” adds a further level in this way. There is strong evidence that
spatial mental representations are hierarchically organized.
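The three mechanisms listed above can be sketched as three small resolution functions. The toy world below is an assumption made for illustration: landmarks are placed along a single axis of travel, and the `Landmark` type, its fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Landmark:
    category: str    # e.g. "supermarket"
    street: str      # containing street, used for hierarchical priming
    position: float  # position along the walker's axis of travel, in steps

def candidates(landmarks, category):
    return [lm for lm in landmarks if lm.category == category]

def by_nearness(landmarks, category, here):
    """Mechanism 1: default to the nearest individual."""
    return min(candidates(landmarks, category),
               key=lambda lm: abs(lm.position - here))

def by_order(landmarks, category, here, n=1):
    """Mechanism 2: the n-th individual encountered when moving forward."""
    ahead = sorted((lm for lm in candidates(landmarks, category)
                    if lm.position > here), key=lambda lm: lm.position)
    return ahead[n - 1]  # raises IndexError if fewer than n lie ahead

def by_hierarchy(landmarks, category, street):
    """Mechanism 3: restrict the candidates to a containing street."""
    return [lm for lm in candidates(landmarks, category) if lm.street == street]

world = [
    Landmark("supermarket", "Prime Avenue", -10.0),       # just behind the walker
    Landmark("supermarket", "Serendipity Street", 30.0),
    Landmark("supermarket", "Serendipity Street", 80.0),
    Landmark("cafe", "Prime Avenue", 5.0),
]
nearest = by_nearness(world, "supermarket", here=0.0)
first_ahead = by_order(world, "supermarket", here=0.0)
on_prime = by_hierarchy(world, "supermarket", "Prime Avenue")
```

In this sample the nearest supermarket lies behind the walker, so nearness and order select different individuals, which is why “Go straight and at the supermarket turn left” cannot rely on nearness alone.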
Hierarchical priming happens also through salience hierarchies of landmarks. For
example, a hierarchical description (or thought) “I found the coin in the entrance
of the supermarket, not far from the ATM” refers to the supermarket as a global
landmark, and then specializes further by another reference to a local landmark,
an ATM. Here the hierarchical priming is required for disambiguation between
the many ATMs in an environment. This way, hierarchies help to break down
large environments into manageable regions of influence. Any object or event in
the environment can be related to landmarks in a variety of ways, qualitatively
and quantitatively. Besides something being “near the supermarket” (distance,
defining the region of influence), its location can also be characterized as “on the
way from supermarket to home” (orientation), as “on the right when travelling
from supermarket to home” (direction), as “between the supermarket and home”
(projection), or as “in the quarter of the supermarket” (topology). Quantitative
characterizations are possible as well, such as “30 m from the supermarket”. Also
the prominence of a landmark can prime the memory for a particular street segment,
as in “I found a coin in the street where the supermarket is”.
2.2 Summary
A thought experiment has illustrated the fundamental role of landmarks for struc-
turing mental representations of an environment. In the constructive approach of
the experiment we have assumed an environment that provides salient experiences
linked to locations, and expected that these experiences form anchor points in mem-
ory, and also configurations that allow relative orientation. We have in particular
learned that landmark configurations are sufficient for spatial cognitive tasks such
as orientation and wayfinding. The existence of a global frame of reference, which
is essential in any technical system, from spatial information systems to robots, is
not essential for human spatial problem solving as long as the environment provides
configurations of landmarks allowing relative orientation. We have also learned that
experiences at particular locations form landmarks, and these experiences can also
be experiences of the structure (nodes) or dimensions (edges) of the structure of an
environment.
The fact that our environment resembled an urban environment does not matter.
The same insights about landmarks could have been made in a process introducing
landmarks in a landscape. Spatial cognition certainly developed in natural environ-
ments, but its mechanisms apply equally in human-made environments.
Landmarks’ primary role appears to be structuring a mental representation of
an environment by forming anchors for relational links. Verbalized, these anchors
and their links appear to convert into relational descriptions. Evidence for this
assumption will be presented in the next chapter. Without landmarks both tasks,
forming a mental representation of an environment and, correspondingly, being able
to communicate about locations in the environment, appear to be significantly more
complex, relying on locomotion and path integration only.
References
1. Battista, C., Peters, M.: Ecological aspects of mental rotation around the vertical and horizontal
axis. J. Indiv. Differ. 31(2), 110–113 (2010)
2. Both, A., Duckham, M., Kuhn, W.: Spatiotemporal Braitenberg vehicles. In: Krumm, J.,
Kröger, P., Widmayer, P. (eds.) 21st ACM SIGSPATIAL International Conference on Advances
in Geographic Information Systems. ACM Press, Orlando (2013)
3. Braitenberg, V.: Vehicles: Experiments in Synthetic Psychology. The MIT Press, Cambridge
(1984)
4. Christofides, N.: Graph Theory: An Algorithmic Approach. Academic, London (1975)
5. Couclelis, H.: Of mice and men: what rodent populations can teach us about complex spatial
dynamics. Environ. Plann. A 20(1), 99–109 (1988)
6. Dehaene, S.: The Number Sense. Oxford University Press, New York (1997)
7. Emmorey, K., Reilly, J.S. (eds.): Language, Gesture, and Space. Lawrence Erlbaum Associates,
Inc., Hillsdale (1995)
8. Freeman, L.C.: A set of measures of centrality based on betweenness. Sociometry 40(1), 35–41
(1977)
9. Harary, F.: Graph Theory. Addison-Wesley, Reading (1969)
10. Janelle, D.G.: Impact of information technologies. In: Hanson, S., Giuliano, G. (eds.) The
Geography of Urban Transportation, pp. 86–112. Guilford Press, New York (2004)
11. Klatzky, R.L., Loomis, J.M., Beall, A.C., Chance, S.S., Golledge, R.G.: Spatial updating of
self-position and orientation during real, imagined, and virtual locomotion. Psychol. Sci. 9(4),
293–298 (1998)
12. Lynch, K.: The Image of the City. The MIT Press, Cambridge (1960)
13. Miller, G.A.: The magical number seven, plus or minus two: some limits on our capacity for
processing information. Psychol. Rev. 63, 81–97 (1956)
14. Mittelstaedt, M.L., Mittelstaedt, H.: Homing by path integration in a mammal. Naturwis-
senschaften 67(11), 566–567 (1980)
15. Ore, Ø.: Graphs and their uses. In: New Mathematical Library, vol. 34. The Mathematical
Association of America, Washington (1990)
16. Plumert, J.M., Spalding, T.L., Nichols-Whitehead, P.: Preferences for ascending and descend-
ing hierarchical organization in spatial communication. Mem. Cogn. 29(2), 274–284 (2001)
17. Richter, D., Vasardani, M., Stirling, L., Richter, K.F., Winter, S.: Zooming in–zooming out:
hierarchies in place descriptions. In: Krisp, J.M. (ed.) Progress in Location-Based Services,
Lecture Notes in Geoinformation and Cartography. Springer, Berlin (2013)
18. Scheider, S., Janowicz, K., Kuhn, W.: Grounding geographic categories in the meaningful
environment. In: Stewart Hornsby, K., Claramunt, C., Denis, M., Ligozat, G. (eds.) Spatial
Information Theory. Lecture Notes in Computer Science, vol. 5756, pp. 69–87. Springer, Berlin
(2009)
19. Shepard, R.N., Metzler, J.: Mental rotation of three-dimensional objects. Science 171(3972),
701–703 (1971)
20. Tobler, W.: A computer movie simulating urban growth in the Detroit region. Econ. Geogr.
46(2), 234–240 (1970)
21. Turner, C.H.: The Homing of Ants: An Experimental Study of Ant Behavior. University of
Chicago, Chicago (1907)
22. Wilson, R.J., Watkins, J.J.: Graphs: An Introductory Approach. Wiley, New York (1990)
Chapter 3
Cognitive Aspects: How People Perceive,
Memorize, Think and Talk About Landmarks
Abstract This chapter deals with the human mind and its representation of
geographic space, particularly with the role of landmarks in these representations.
The scientific disciplines that are called upon to illuminate this area are neu-
roscience, cognitive science, and linguistics. This broad range of disciplines is
necessary, because the structure of spatial representations in the human brain and
the behaviour of these representations in spatial tasks are not directly accessible,
and thus, indirect approaches have to be pursued. Direct observations of brain cells
are invasive, and thus typically applied to animals only. To what extent observations
from animals can be transferred to explain human spatial cognition is a matter
for investigation in its own right. However, indirect methods such as functional
magnetic resonance imaging shed some light on brain activity. Cognitive scientists,
being interested in intelligence and behaviour rather than actual cell structures,
live with a similar challenge. They observe the human mind indirectly by devising
experiments on human memory, reasoning, and behaviour. Linguists add studies of
human spatial communication, which should also allow indirect conclusions about
mental representations.
The first sign, a potential isolation by brain damage, assumes that particular areas
in the brain are responsible for particular tasks (or abilities). While these associ-
ations are generally observable for some spatial tasks—see for example [181]—
looking closer into spatial abilities this argument might be difficult to prove.
Many spatial abilities combine activities in different brain regions, not to mention
the plasticity of the brain. One reason may be that spatial skills are themselves
quite diverse, and hence not necessarily located in a single brain region. Take
for example the range between visual-spatial abilities (e.g., object recognition),
spatial memory abilities (e.g., object localization), or the cognitive elements of
sensorimotor abilities (e.g., self-localization). In their own way, however, each of
these abilities is interacting with landmarks.
Despite the breadth of spatial abilities, applying these criteria and further
evidence, Gardner [60] identified a small set of ‘intelligences’, and among these a
spatial intelligence. In some sense spatial abilities are fulfilling the other two signs.
They are problem-solving skills, for example, recognizing or imagining an object
from an unfamiliar angle, coordinating body movement or solving more complex
tasks like wayfinding. Spatial abilities support everyday activities, such as finding
home again, but also professional skills such as playing the piano (sensorimotor),
imagining a DNA sequence (visual-spatial), or navigating an airplane (a combina-
tion). Some people feel good at spatial intelligence, and others will have difficulties
with, for example, reading a map, or explaining a route. Central to Gardner’s
argument is that these abilities cannot be fully replaced (or explained) by other
capacities, for example verbal or logical-mathematical intelligence. Others have
even pointed out that spatial abilities help reasoning in non-spatial domains [154].
Nevertheless, spatial intelligence remains built on a variety of abilities. These
spatial abilities can even be differently pronounced in individuals. For example,
some may be acute in visual perception, but bad in self-localization. The individual
performance in these skills can be tested, and these tests are quite popular.
In this book we do not use the term spatial intelligence, but its more established
synonym spatial cognition. However, while the composite of spatial abilities
defines spatial cognition, confusingly the scientific study of these abilities is also
called spatial cognition. The Handbook of Spatial Cognition, for example, states:
“Spatial cognition is a branch of cognitive science” [231]. In effect it should
be called spatial cognitive science. Anyway, relating back to the superordinate
category, cognitive science, may not immediately help clarifying when eminent
researchers say:
Cognitive science is not yet established as a mature science [. . . ] it is really more of
a loose affiliation of disciplines than a discipline of its own. Interestingly, an important
pole is occupied by Artificial Intelligence [. . . ] other affiliated disciplines are generally
taken to consist of linguistics, neuroscience, psychology, sometimes anthropology, and the
philosophy of mind ([227], p. 4).
of knowledge and beliefs about spatial properties of objects and events in the
world” [148] can be approached from a neuroscience perspective (which regions
in the brain, and which cell types, store and process spatial knowledge and are
active in the processing of visual or sensorimotor stimulations), from a psycho-
logical perspective (how do people behave in orientation and wayfinding tasks,
and what do their abilities reveal about cognitive capacities), from a linguistic
perspective (how people communicate (and hence think) about space), from an
anthropological perspective (for example, why Inuit are able to find home in their
monotonous environment; see footnote 1), from a philosophy of mind perspective (for example,
recognizing that landmarks form a graded category), and, last but not least, from
the perspective of artificial intelligence. Artificial intelligence has two interests
in spatial cognitive science, aligned with what Searle has called strong AI and
weak AI [193]. One searches for computational models of human spatial cognitive
abilities (imitating human thinking), the other one searches for the spatially
intelligent machine, which is a machine capable of interacting with humans on spatial
problem solving (simulating human thinking).
Spatial cognitive science supports strongly our position that landmarks are
embodied experiences that shape mental spatial representations. Montello states:
Cognition is about knowledge: its acquisition, storage and retrieval, manipulation, and use
by humans, non-human animals, and intelligent machines. Broadly construed, cognitive
systems include sensation and perception, thinking, imagery, memory, learning, language,
reasoning, and problem-solving. In humans, cognitive structures and processes are part of
the mind, which emerges from a brain and nervous system inside of a body that exists in
a social and physical world. Spatial properties include location, size, distance, direction,
separation and connection, shape, pattern, and movement ([148], p. 14771).
Similarly, Varela et al. argue that cognition is embodied: “[We] emphasize the
growing conviction that cognition is not the representation of a pregiven world by
a pregiven mind but is rather the enactment of a world and a mind on the basis of
a history of the variety of actions that a being in the world performs” ([227], p. 9),
a view that is shared also by others [113].
1 Inuit demonstrate significantly higher levels of visual memory [101].
44 3 How People Perceive, Memorize, Think and Talk About Landmarks
Spatial cognition is the ability of living beings to perceive, memorize, utilize and
convey properties about their spatial environment. According to what we just have
said, we should rather speak of a combination of abilities. But factorizing these
abilities has proven tricky. McGee, for example, identified an ability for
spatial visualization—the ability to mentally manipulate, rotate or twist objects—
and an ability for spatial orientation—the ability to imagine an object from different
perspectives [140]. These two abilities were already part of the broader Guilford–
Zimmerman aptitude survey ([73], parts V and VI). Later, Carroll identified five
major spatial abilities, differently factorized, but in essence adding dynamic spatial
abilities of estimating speed and predicting movements [21]. While these abilities
could be identified and tested in paper-and-pencil tests, i.e., in small-scale space,
finding environmental spatial cognitive abilities such as wayfinding or learning the
layout of environments requires field experiments in environmental space [80, 198].
As environmental spatial abilities are also amalgams, Allen undertook an attempt
to break down the spatial abilities that service wayfinding [3, 5]. He especially
distinguished between a family of abilities dedicated to object identification, object
localization and traveler orientation, and made finer distinctions within these
families depending on whether the objects are static or mobile, and whether the
observers are stationary or moving.
A recent review of spatial abilities and their individual performance measure-
ments has been provided by Hegarty and Waller [78]. Individual performance
variations in spatial abilities have triggered questions whether these abilities can
be strengthened by training, and whether they are gender dependent.
The issue of gender and spatial abilities is hotly debated in colloquial contexts,
but also in science. See for example Silverman and Eals’ theory that sex differences
exist and are grounded in evolutionary division of labor [199], or Dabbs
et al. studying the frequently reported female advantage in landmark-based
and egocentric orientation and male advantage in cardinal/Euclidean
orientation [34]. However, without going into detail here, across a large range
of contributions the reported research results are contradictory. More fundamentally,
the underlying assumptions are questionable since experiments usually
do not distinguish between genetic disposition and socialization. With
the high plasticity of the brain it may even be impossible to resolve this latter
point in principle.
3.1 Spatial Cognition 45
Refraining from siding with any particular factorization scheme, let us highlight
just a few spatial abilities which we will relate to later.
People find it easy to visually imagine intimately familiar spaces, even without
visual stimulation. For example, people typically can answer a question such as
“Imagine entering your living room—now what’s to your left?” They even can
manipulate these imaginations. Figure 3.1 shows a (picture of a) sheet of paper with
a dotted line printed in the center. A test for the ability of mentally manipulating this
object is: “Imagine folding it and viewing it from another angle. What will it look
like?”
People can form mental three-dimensional images of depicted objects, and rotate
these images mentally. In their famous experiment Shepard and Metzler asked
participants whether two figures are the depictions of the same object (Fig. 3.3).
The time they take for deciding is proportional to the angular rotational difference,
which is evidence for an actual mental rotation [197]. Based on Shepard and
Metzler’s stimuli, Vandenberg and Kuse later developed a pencil-and-paper test that
has become a widely accepted standard test for mental rotation [225]. A prominent
application of mental rotation is in map reading, where conventionally maps are
oriented north-up, independent from the current orientation of the map reader. Wall-
mounted maps are an obvious case for this argument since they need also to be
mentally rotated to the horizontal plane [150]. Mental rotation must also be at
work when people take (or change) perspectives. Tversky and Hard, for example,
have shown that people switch with ease between egocentric and another person’s
perspectives in their verbal descriptions [222].
Fig. 3.2 In order to interpret the picture visual-spatial abilities use perspective and experience
People are able to locate themselves during locomotion automatically and continuously,
even without landmarks as external references [50]. For example, this ability
enables people to always point in the direction of the location they started from,
independent of the route taken. Pointing gestures and tracking of path completion
have been applied in studying human path integration capacity [125, 126]. Path
integration is an ability essential for survival, not only for humans. Desert mice,
for example, have a capability for homing [145]. Desert ants travel tortuous
routes outbound from home when foraging for food, but then return home along straight
routes [152, 234], demonstrating their capacity for path integration. Honeybees communicate
directions and distances of food sources, information also gained from path
integration [58]. For people, it guarantees finding home again even in the dark, or
with a visual impairment. Since path integration works without external reference
points, it has been linked to the sensorimotor system and vestibular organs, and
to particular cell types in different regions of the brain. Path integration is
closely related to our sense of place and sense of direction [79]. Path integration
accumulates uncertainty, and hence it is typically combined with spatial
updating.
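As an illustration only, path integration can be sketched as dead reckoning: each movement step updates an estimate of position and heading, from which a straight homing vector can be derived at any time. The sketch below is a simplified geometric model, not a model taken from the cited studies. The function names integrate_path and homing_vector, and the representation of a step as a (turn, distance) pair, are assumptions made for this example.

```python
import math


def integrate_path(steps):
    """Accumulate (turn_deg, distance) steps into a position and heading.

    The walker starts at the origin, heading 0 degrees (along the +x axis).
    Positive turns are counterclockwise. This is a hypothetical toy model.
    """
    x = y = 0.0
    heading = 0.0
    for turn_deg, distance in steps:
        heading = (heading + turn_deg) % 360.0
        x += distance * math.cos(math.radians(heading))
        y += distance * math.sin(math.radians(heading))
    return x, y, heading


def homing_vector(x, y, heading):
    """Return (turn_deg, distance) that leads straight back to the start."""
    distance = math.hypot(x, y)
    bearing_home = math.degrees(math.atan2(-y, -x))
    # Normalize the required turn to the interval [-180, 180).
    turn = (bearing_home - heading + 180.0) % 360.0 - 180.0
    return turn, distance


# Outbound trip: 3 units straight ahead, then turn left 90 degrees, 4 units.
x, y, heading = integrate_path([(0, 3), (90, 4)])
turn, distance = homing_vector(x, y, heading)
print(f"turn {turn:.1f} degrees, then walk {distance:.1f} units")
```

In this sketch the estimate is exact. Adding a small random error to every turn and distance would show how the uncertainty of the homing vector grows with the length of the route, which is why external cues are needed to correct the estimate.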
In order to be able to act, people know the position of surrounding objects relative to
their body. They are spatially aware, and this awareness is provided by their senses.
Since people act in space and time, their own pose or location changes constantly.
Similarly, objects in the environment can change pose or location. Only continuous
updating of the spatial mental representation of these relationships helps with
survival. Wolbers et al. studied spatial updating from a neuroscience perspective,
i.e., how the brain keeps track [240], and Kelly et al. looked at spatial updating from
a cognitive perspective, i.e., which cues in the environment are used [97, 98].
How to get from here to there is certainly a fundamental planning ability of any
animal, humans included. Montello and Raubal ranked wayfinding as the most
important function of spatial cognition [151]. Aspects of wayfinding have already
been mentioned, such as the homing ability based on path integration. Let us search
for a more systematic understanding of this task of spatial cognition.
Montello [149] identified wayfinding as a component of navigation. He defines
navigation as goal-directed movement of one’s self through an environment to find
a distal destination. However, splitting navigation into components does not happen
along clear lines. While Montello proposes to split into locomotion and wayfinding
(p. 258), Waller and Nadel see wayfinding as an agglomerate of a suite of cognitive
abilities such as place memory, imagery and planning [231]. Wang and Spelke [233]
suggest isolating a third component in between locomotion and wayfinding, spatial
orientation. Let us take a closer look at these three components of navigation.
In this book we apply Montello’s notion of wayfinding, which has also been
adopted elsewhere (e.g., [12, 61, 66, 77]). It involves several abilities:
Wayfinding comprises the tactical and strategic part of solving the problem to
find a distal destination.
representations [89]. Also the degree of detail in maps has an impact, as more
abstract representations support the construction of a mental spatial representation
better than, say, satellite imagery [162, 186].
Spatial orientation is interlinked with locomotion (as spatial updating),
utilizing the sensorimotor system, but is also interlinked with wayfinding through
cognitive, sometimes even conscious, effort. Thus spatial orientation also involves
a bundle of spatial abilities. While spatial orientation may focus more on the basic
ability of imagery, wayfinding may focus more on the basic ability of planning.
Maintaining the spatial orientation of the body to the outside world is a crucial
condition for wayfinding and locomotion. Gärling et al. [62] differentiate several
levels of orientation functions:
We define environmental orientation as the ability to perceive one’s position relative to
points or systems of reference in the environment. These points or systems of reference
may be perceptually available, but, [. . . ] this is not a necessary condition. [. . . ] A hierarchy
of orientation functions may be tentatively proposed. Body orientation is defined as the
perception of the body axes relative to the line of gravity and the limbs relative to
the body axes. At the next level, orientation is maintained in the environment relative
to perceptually available reference systems. Visual direction and position constancy as
well as auditory localization are important here. At the highest level of the hierarchy,
orientation is maintained relative to points and systems of reference that may not be
perceptually available. Geographic orientation could be considered a special case. Other
indirect information sources, such as maps, signs, and the sun, are available (p. 165).
The sensorimotor component of this task is controlled by the nervous system. The
immediate orientation of the body is observed by the visual and auditory senses,
head orientation and motion are registered by the vestibular (equilibrium) organ,
and an awareness of the position of the parts of the body and their movements
is maintained by the proprioceptive sense. An integration of these sensory inputs
allows for coordinated actions of the body, for example, tracking targets, or
controlling posture, gait and other movements.
The cognitive component of this task requires a mental spatial representation of
the environment. Cognition is constantly trying to maintain a subjective sense of
orientation. This sense of orientation might be satisfied with the ability to locate
oneself in one’s mental representation of the environment. This explanation would
be roughly equivalent to orientation of robots, establishing or maintaining a cor-
respondence between sensor observations and the robot’s internal representation of
the world. For people, however, this definition is problematic since human sensing is
subjective and biased by attention. Furthermore, their mental spatial representation
is not directly accessible for observation, and neither are the individual’s self-
localization in this representation, the correctness of the self-localization, or the
The fundamental role of landmarks for orientation and wayfinding stems from a
strong correspondence between an experience captured in (spatial) memory and
a location in the physical environment. While we have defined landmarks as the
reference points of mental spatial representations, their corresponding physical
objects can be called landmarks only with regard to their potential to produce an
experience that will be captured in spatial memory.
Siegel and White have looked at the development of a mental spatial representation
of an environmental space by somebody unfamiliar with this environment [198].
Siegel and White studied the first-hand experience of the environment by loco-
motion. However, environments can also be learned from secondary sources such
as maps, sketches, verbal route descriptions or tourist guides. Thus, the following
sequence of learning, described by Siegel and White, is not a step-by-step process
but rather a complex continuous process [147].
During locomotion, any outstanding experience along the route will trigger a
memory, located in space roughly by path integration. Siegel and White call this type
of knowledge in the emerging mental spatial representation landmark knowledge.
Landmark knowledge can also be mediated. The armchair traveller to Paris, reading
about the highlights of Paris in a tourist guide and locating them on a map, will have
a similar experience.
The path integration between landmark experiences provides connections called
by Siegel and White route knowledge and elsewhere also procedural knowledge.
The mode of locomotion, effort, and intensity of experience all impact on the
experience of distance. Contributing to the sense of distance are motor sense, visual
and auditory sense, but also the memory load along the route, that is, the
number of objects or events experienced along the route, the degree of unfamiliarity
with an environment, the mental preoccupation, and many more factors. Those who
learn from secondary sources infer route knowledge from reading a tour description
or reading a map [69].
A third tier of knowledge, called survey knowledge by Siegel and White, and
also called configurational knowledge, emerges from integrating routes over time
into a network-like representation that can be analyzed for directions and distances.
Route segments can be freely recombined for wayfinding and orientation. However,
people seem to have different cognitive preferences for representing an environment,
or, for short, cognitive styles [163]. They are more landmark-focused, route-focused,
or survey-focused. Their cognitive styles are correlated with spatial abilities [164].
According to Ishikawa and Montello some people do not even develop a survey
representation, irrespective of the frequency of exposure to an environment or the
activity in the environment [87].
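The step from route knowledge to survey knowledge can be illustrated with a small graph sketch. It assumes, purely for illustration, that each learned route is a sequence of landmark names. The place names and the functions build_network and find_route are hypothetical, and breadth-first search merely stands in for whatever process people actually use to recombine route segments.

```python
from collections import deque


def build_network(routes):
    """Merge separately learned routes into one undirected landmark network."""
    graph = {}
    for route in routes:
        for a, b in zip(route, route[1:]):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def find_route(graph, start, goal):
    """Breadth-first search for a landmark sequence with the fewest segments."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no connection known yet


# Two routes learned on different occasions share the landmark "church".
routes = [
    ["home", "bakery", "church", "station"],
    ["school", "church", "park"],
]
network = build_network(routes)
print(find_route(network, "bakery", "park"))
```

The route from the bakery to the park was never travelled as a whole; it becomes available only because both learned routes are anchored at a shared landmark.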
Differing classifications were used as well. Piaget and Inhelder argued for a
figurative knowledge—visual imagery of objects and configuration of objects—
and operative knowledge—the ability to manipulate the visual imagery [166].
More recently, Gardner has suggested distinguishing between relatively static
and relatively active forms of spatial knowledge [60]. Declarative knowledge has
been contrasted with procedural knowledge by Mandler [131]. Since declarative
spatial knowledge lends itself to visual imagination it is similar to figurative or
configurational knowledge.
Whichever classification is used, landmarks form the glue. Equipped with a
representation of such knowledge, orientation and wayfinding become possible. For
example, Allen [3] has proposed a framework for examining the cognitive abilities
involved in wayfinding. The framework consists of wayfinding tasks on the one hand
and means to accomplish these tasks on the other hand. In all of his identified tasks
The human senses mediate between the physical environment and the representation
of this environment in the mind. Wikipedia, for instance, states:² “A mental
representation is the mental imagery of things that are not currently seen or sensed
by the sense organs”. But a prior perception by sense organs has led to the mental
representation in the first place, and accordingly, current perceptions interact
with the mental representation. Spatial mental representations, we would assume
by now, are based on the experience of distance and direction from locomotion, and
on experiences of objects or events that are location-specific.
Kahneman [94], simplifying, distinguishes two ways of thinking, which he calls for
brevity System 1 and System 2. System 1 is the one below conscious thinking. It is
subconscious, emotional and automatic, and hence, fast compared to the other way
of thinking. As many human skills become internalized, below conscious thinking,
they fall into System 1. For example, locomotion (once the baby has learned
to walk), path integration and mental rotation belong to System 1. Even acquisition
of information triggered by external stimuli can happen subconsciously [121], in
interaction with System 1. System 2 does conscious reasoning, and hence, is the
slow system. For example, dialog about directions involves System 2. People are
able to reflect on how they argue while they argue about the route they choose in a
given situation.
O’Regan has developed theories of how people can form an image of the exterior
environment of the body from perception (sensors) and feelings (responses to sensor
readings) [161]. The question whether we can trust our perceptions is of course an
old one. René Descartes, out of his methodological skepticism, already formulated
the famous quote cogito ergo sum (originally je pense, donc je suis, [41]), putting
the human ability to reason above his doubts about frequently deceptive perceptual
sensations. The deceptiveness of perception still concerns research. Optical illusions
demonstrate that our visual system already interprets retinal images before the conscious
mind gets access. Purves and Lotto suggest that the visual system interprets
stimuli based on experience, that is, on hypotheses that have been learned because they
proved true in many cases before [171]. Such a pre-processing is a prime example
of System 1 thinking. One hypothesis applied by the visual system is the grey
world assumption, the assumption that the average reflectance of an environment
is grey. While this assumption in many cases provides useful interpretations, it can
be misled by influences of illumination and reflection. Thus, Descartes’ skepticism
is appropriate, and O’Regan’s quest is not trivial.
² http://en.wikipedia.org/wiki/Representation_(psychology), last visited 3/1/2014.
O’Regan claims that “an organism consciously feels something when it has at
least a rudimentary self which has access consciousness to the fact that it is engaged
in a particular type of sensorimotor interaction with the world, namely one that
possesses richness, bodiliness, insubordinateness, and grabbiness” (p. 120). These
four principles are:
Richness The reality is richer than memories and imaginings.
Bodiliness Vision and other senses have bodiliness, as any movement of the body
produces an immediate change in the visual (or other sensory) input.
Insubordinateness The sensory input is not totally controlled, or can change even
if the body does not move because the external reality changes.
Grabbiness Senses like the human visual system are alert to any changes in the
sensory input.
The experience of landmarks is an involuntary act, one that does not involve
conscious decision or choice. In this sense, landmarks must have properties that
correspond to the grabbiness of human senses. They can have other, less
attention-grabbing properties as well, because of the richness of reality. In addition, a human
will experience that this particular sensory input is related to a particular location in
the environment instead of a particular body movement. It can only be repeated by
reaching the same location. This aspect relates to O’Regan’s insubordinateness.
Accordingly, in Kahneman’s categories, learning an environment by landmarks
involves System 1. The mind develops a representation of the environment that
is independent of intellectual rigour or effort, and merely based on embodied
experience (and thus, of course, depending on attention). The mental representation,
however, is accessible to both System 1 and System 2. People find their way in a
known environment and orient themselves in the environment without an explicit
involvement of System 2. During wayfinding, the interplay between expected
sensory input from certain locations and actual sensory input confirming these
expectations provides a feeling of being oriented. System 2, however, can access
the mental spatial representation to explain, for example, the current orientation with
respect to a few learned landmarks. Similarly, although the route of a daily commute
is travelled without conscious thinking, it takes System 2 to explain this route to
another person.
3.2 The Role of Perception 55
But what is it about some objects and events that grabs this attention? While in
principle the world is a continuous phenomenon, the human mind classifies visual
and other perceptions of its environment and identifies discrete objects in this
continuous environment. Visual perception alone already has self-organizing and
holistic processing built in. This was discovered and described by Gestalt theory
long before a neuroscientific understanding could be developed ([48, 200, 236, 237],
but also [138]). Ehrenfels characterized the fundamental problem of Gestalt psy-
chology with these words:
Here we confront an important problem [. . . ] of what precisely the given presentational
formations (spatial shapes and melodies) in themselves are. Is a melody (i) a mere sum of
elements, or (ii) something novel in relation to this sum, something that certainly goes hand
in hand with, but is distinguishable from, the sum of elements? ([48], p. 250, translation
from German following [200]).
Gestalt theory studied the basic laws of visual or aural perception. However, the laws
can also be observed in thought processes, memories, and the understanding of time.
It started from the observation that perception easily separates figures from ground
by some fundamental rules of configurational qualities. These Gestalt rules are:
Proximity Perceived phenomena near to each other are more likely perceived as
Gestalt.
Similarity Perceived phenomena similar to each other are more likely perceived
as Gestalt.
Simplicity Simpler explanations for a configuration are more likely to be selected
by perception.
Continuity Configurations in a linear order are more likely perceived as belonging
together.
Closure Closed configurations are more likely to be perceived as Gestalt.
Joint destiny Perceived phenomena that move in the same direction are more
likely perceived to belong together.
Phenomena—objects or events—experienced as landmarks must have strong
(visual, aural, or other, but for humans predominantly visual) figure qualities.
Considering these Gestalt rules, they must have strong local contrast by being
dissimilar or far from the rest. To find out more about salient phenomena in mental
spatial representations, Appleyard asked people in a survey for the buildings in
their home town they could remember best [8]. The collected buildings were
characterized by Appleyard with respect to a set of properties. The correlation of
properties with the number of nominations was used to identify the significant
properties. He found:
Form properties Inflow of people, contour, size, form, and visual attributes of the
facade.
Visibility properties Frequency of visibility, prominence of the view point, and
nearness of the building to its view points.
Semantic properties Intensity of use of the building (traffic), or uniqueness of use.
Appleyard observed that the more a building stands out the more likely it is that
it comes to mind in such a survey. He also found that the correlation between
the number of nominations and the significant properties shows both in local
comparisons (i.e., for a neighborhood) and in global comparisons (i.e., for
the city), which relates to our prior distinction of local and global landmarks.
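The kind of analysis described above can be sketched as a correlation between a building property and the number of nominations. The numbers below are invented toy values, not Appleyard's data. The Pearson coefficient is an assumption about the statistic, chosen here only because it is a common measure of linear correlation, and the property names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented toy survey: five buildings, each with a nomination count and
# two hypothetical property ratings on a 1-5 scale.
nominations = [10, 8, 6, 4, 2]
ratings = {
    "size": [5, 4, 3, 2, 1],
    "age": [3, 1, 4, 1, 5],
}

for name, values in ratings.items():
    r = pearson(values, nominations)
    print(f"{name}: r = {r:.2f}")
```

A property whose ratings rise and fall with the nomination counts would be a candidate for a significant property in this sense. With so few toy values, the coefficients carry no statistical weight.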
More recent investigations to characterize the properties of landmarks were made
by Sorrows and Hirtle [201] and Burnett et al. [19]. They also identified properties
such as uniqueness, distinguished location, visibility, and semantic salience.
Another obvious property should be permanence (e.g., [20], p. 67). Studies with
children have shown that they also choose non-permanent objects, while adults do
not [6, 29]. Sorrows and Hirtle [201] distinguished in particular:
the air travel network. People experience them mostly from the inside, where they
more often than not look interchangeable. For example, Singapore Changi Airport
is a node in this network experienced by millions of travelers per month. A travel
agent telling her customer “At Changi, change to the flight to Frankfurt” may help
with orientation and wayfinding. But usually it does not evoke or relate to the visual
image of the terminal buildings. Train stations have a similar prominence in spite of
low visual imaginability.
Similarly, other categories of objects or events can have this landmarkness.
For example, in natural environments a prominent representative for landmarks is
landform. Prototype landform objects are mountains, saddles or rivers, which all
can be characterized by their visual, semantic or structural qualities. Mathematically
they form singularities in the terrain height. Even slope alone is already observed
as a peculiarity in learning, memorizing and communicating environments and
routes. In orienteering a route description can be: “Walk uphill”. The salience of
slope is visual as well as structural (in terms of resistance, or increased physical
effort to overcome this part of the route). In Lynch’s categorization slope would
form a gradual, but conquerable barrier. However, little literature exists on natural
landmarks. For example, Brosset et al. report that landform is the second largest
category of landmarks in route directions in natural environments [15]. In urban
environments landform has not been systematically investigated. This may be
caused by the availability of local, fine-grained objects to choose from (landform
is of comparably coarse grain), or also by the fact that salient patterns of landform
(e.g., steep slopes) are relatively rare in urban environments.
For urban environments Ishikawa and Nakamura have tested which categories of
objects people use as landmarks [88], albeit in a narrow context. In their experiment
people walked unfamiliar routes in an urban environment, and were asked to
nominate the “objects that they thought were helpful as clues for navigation”, or
more precisely, “to imagine a situation where they use those landmarks to explain
the routes to someone who visited the place for the first time” (p. 8). Participants
were also asked to give reasons for their selections. Despite this narrow context,
Ishikawa and Nakamura could show that people picked various objects such as
buildings, signage, street furniture or trees. Also, since this was a reporting task
to an experimenter, not an instruction task to a wayfinder, participants with a better
sense of direction tended to select fewer landmarks. They also studied the physical
properties of selected buildings, finding that facade area, color saturation, and age
affected the selection, but that these parameters can differ between persons and
routes. Denis et al. [40] also looked at object categories in verbal route descriptions
of residents of Venice. They found that streets, bridges and squares were mentioned
far more frequently than buildings. Lynch’s collected sketches of some North
American cities [128] would allow a similar quantitative analysis, which has not
been done yet.
However, in each of these studies the environment itself (such as Venice with its
bridges) as well as the assumed activity or purpose provide a specific context. Both
context factors, environment and activity or purpose, influence the choice of object
(or event) categories. Hence, the findings cannot be compared or averaged. For
example, the literature typically focuses on urban environments and wayfinding
by walkers, which explains the prominence of object categories such as buildings or
streets. In such a context, objects (or events) at this level of spatial granularity allow
anchoring the decisions of a person with low ambiguity: “At the corner store turn
left” works well, while “on [the hill] Montmartre turn left” does not.
Hence, the guideline for object or event categories being perceived or used as
landmarks is actually the current context, or focus of the observer. The attention and
intention of the observer regulate the affordance of objects.
Generally, perceptions of any sort can lead to memorable events (“the desert today
was superb”). Memory, or a mental representation, stores primarily the properties
of the experience. Landmark experience, however, is different. Objects or events
are perceived as landmarks in relationship to their location. Thus, a mental spatial
representation stores the experience together with its location. Furthermore, since
the object or event was perceived in the context of moving in the environment,
location is in the first instance related to the pose and heading of the ego, in a
second instance related to what has been experienced before (the own trajectory),
and thirdly to the larger knowledge acquired so far about the environment. This
means that location is stored primarily in relation to other (known) objects or events.
This section approaches landmarks in mental spatial representations from two
angles, brain and mind. While brain is frequently associated with matter, and mind
with cognitive abilities, consciousness and personality, the distinction is not
clear-cut. Neuroscience explains gray matter by function as well, and, as already
mentioned, cognitive abilities are not necessarily all conscious. However, the
observational approaches of the sciences are orthogonal. Neuroscience studies the
brain from the perspective of cells and synapses, enzymes, and receptors at micro-
level, and roles and activities in brain regions at macro-level. It applies invasive and
non-invasive methods, both stationary due to limitations by technical equipment.
Cognitive science, in contrast, studies cognitive capacities through observing people
(or animals) behaving in situ or in response to external stimulation.
3.3 Mental Spatial Representations 61
The hippocampus is the region in the brain that has a central role in encoding and
retrieving information for behavior guided by memory, not only for spatial behavior
but for a range of different behaviors. However, just as the hippocampus is not solely
occupied with spatial memory and abilities it is also not the only region in the brain
involved in spatial memory and abilities. This makes mental spatial representations
and reasoning one of the more complex areas for neuroscience to understand.
Adding to the challenge, neuroscience can only rely on indirect (non-invasive)
observations on humans. Recent technology such as functional magnetic resonance
imaging (fMRI) observes oxygen levels in blood, which may be correlated with
cell activity but must not be equated with it. Patients with brain lesions provide
further, but also indirect insight into the working of the brain [36, 181]. Whether
cell activity of rodents, observed by invasive procedures, can be taken to explain
human capacities is open for debate. Furthermore, several types of brain cells seem
to be involved in mental spatial representations. With one of them, grid cells, having
been discovered less than a decade ago [59], it may be no surprise that there is still
uncertainty and speculation about the mechanics of mental spatial representations
[46, 174].
Very early, and based on behavioral experiments with rats, Tolman argued
for a direct spatial representation in animals’ long-term memory:
We assert that the central office itself is far more like a map control room than it is like an
old-fashioned telephone exchange. The stimuli, which are allowed in, are not connected by
just simple one-to-one switches to the outgoing responses. Rather, the incoming impulses
are usually worked over and elaborated in the central control room into a tentative,
cognitive-like map of the environment ([216], p. 192).
Some 30 years later O’Keefe discovered place cells in the hippocampus of rodents.
These cells fire when a rat returns to a place it had visited before [158]. The book he
wrote with Nadel [159] still suggested the existence of a cognitive map. It was too
tempting to believe that the brain stores the experiences at particular locations in a
map-like fashion. By now, however, it has become clear that place cells are not the
only cells involved in mental spatial representations (e.g., [173]), and that spatial
abilities are formed by complex interactions between various regions in the brain.
Place cells, or fields of place cells, show firing patterns depending on location.
But how does a rodent’s brain know? In principle there are two ways for an animal to
locate itself: one by path integration, i.e., sensorimotor stimulations, and the other
by perception of external cues. Since place cells fire even when an animal moves in
the dark, current thinking is that hippocampal representations are primarily driven
by path integration. Since path integration accumulates uncertainty over time,
external cues may help to stabilize the localization, but are clearly second order.
In contrast, head direction cells, monitoring the direction of the face, have been
shown to be sensitive to visual external cues [209, 210]. This is no surprise since
stationary visual external cues provide a stable reference frame for turning the head.
However, head direction cells also operate in the absence of light, supported by
the vestibular system, but then, similar to place cells, with drift over time. Head
direction cells are found in many regions of the brain, including the postsubicular
cortex.
Grid cells [59] seem to code location in a regular tessellation, similar to an
internalized coordinate system. Since the tessellation covers space, firing patterns
are sufficient to decode location in space [74]. Grid cells are found in the medial
entorhinal cortex, an informational input into the hippocampus. There must be a
mapping of locations identified by firing of the grid cells and locations identified
primarily from path integration in the place cells, and further associations between
place and events to form memories (ibid.).
Another pathway into the hippocampus deals with object recognition establishing
some contextual information about location. With contextual, especially visual
information provided, the brain is capable of allocentric spatial reasoning, probably
in the posterior parahippocampal cortex. Ekstrom et al. [49], for example, found
cells in the hippocampus, parahippocampal cortex and other areas that were
responding to location, but in addition also cells that fired depending on the visual
external cues the person viewed:
We present evidence for a neural code of human spatial navigation based on cells that
respond at specific spatial locations and cells that respond to views of landmarks. The for-
mer are present primarily in the hippocampus, and the latter in the parahippocampal region.
Cells throughout the frontal and temporal lobes responded to the subjects’ navigational
goals and to conjunctions of place, goal and view (p. 184).
If grid cells form an internalized coordinate system, their resolution and number
become interesting. The resolution determines the smallest variation in location
that can be distinguished in firing patterns, and their number determines the size of
the environment that can be represented. Just as a technical equivalent: If the surface
of the Earth were to be represented in a tessellation of square elements of 1 m edge
length, it would need $40{,}000{,}000^2$, or $1.6 \times 10^{15}$, elements. It is estimated that the human
brain overall has about 100 billion neurons, thus, human spatial memory must be
differently organized to remember things such as where the keys have been left or
how to travel to Sydney. It needs hierarchic representations. In fact grid cells show
properties that establish multi-resolution memory. First, different areas of similar-
sized grid cells represent the same environment, but with a random offset of their
grids. By nesting, even a small number of neurons can represent a fine level of
granularity [139]. Secondly, the scale of grid cells varies along the entorhinal cortex.
For rats, grid cells have been found representing distances of about 25 cm in their
dorsal-most sites to about 3 m at the ventral-most sites [16]. And finally, it has been
shown that when certain channels are knocked out, the brain produces a coarser-scale
spatial memory [65].
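The arithmetic behind this comparison, and the intuition for why combining several grids helps, can be sketched in a few lines. The first function reproduces the count of 1 m cells given above. The second is only a toy model and an assumption of this sketch, not the mechanism reported in [139]: it treats each grid module as a one-dimensional periodic code with one unit per phase, so that modules with different periods jointly separate many more positions than they have units. All names are hypothetical.

```python
from math import lcm

EARTH_EXTENT_M = 40_000_000          # rounded extent used in the text
NEURONS_IN_BRAIN = 100_000_000_000   # about 100 billion, as estimated in the text


def flat_cell_count(extent_m: int, cell_edge_m: int = 1) -> int:
    """Number of square cells in a flat extent-by-extent tessellation."""
    per_side = extent_m // cell_edge_m
    return per_side * per_side


def distinguishable_positions(periods: list[int]) -> int:
    """Toy model: positions along a line that a set of periodic modules can
    tell apart before the combined phase pattern repeats."""
    return lcm(*periods)


cells = flat_cell_count(EARTH_EXTENT_M)
cells_per_neuron = cells // NEURONS_IN_BRAIN

# Three toy modules with 3, 4 and 5 units (12 units in total).
toy_periods = [3, 4, 5]
toy_units = sum(toy_periods)
toy_positions = distinguishable_positions(toy_periods)

print(f"flat tessellation: {cells:.1e} cells, {cells_per_neuron} per neuron")
print(f"toy nesting: {toy_units} units separate {toy_positions} positions")
```

A flat one-cell-per-location code is out of reach by several orders of magnitude, whereas even this crude combination of periodic codes separates more positions than it has units.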
In addition to a representation of location the brain shows two additional types of
spatial memory. One is episodic. Sequences of place cells form a route memory that
can be imagined and mentally travelled at ground perspective. Burgess et al. [18]
put it more precisely:
3.3 Mental Spatial Representations 63
While processing of spatial scenes involves the parahippocampus, the right hippocampus
appears particularly involved in memory for locations within an environment, with the left
hippocampus more involved in context-dependent episodic or autobiographical memory
(p. 625).
The other one is for survey-like, from-above visual imagination and manipulation
of environments. Shelton and Gabrieli [196] have observed people viewing an
environment in each of the two perspectives with fMRI. When comparing the
brain activation during route and survey encoding they found that both types of
information recruited a common network of brain areas, but while survey encoding
recruited a subset of areas recruited by route encoding, route encoding, in contrast,
recruited regions that were not activated by survey encoding.
Similarly, routine behavior in well learned environments may be stored as
schema knowledge outside the hippocampus [219]. This means, people in spatial
decision making situations may follow schemas (mental shortcuts) rather than
analysing their mental spatial representations.
In all likelihood a person will not experience every square meter of the Earth
in their lifetime. But there will be a difference between the traveling range of
the peasant in the Middle Ages, who rarely left the district, and a global nomad
of the twenty-first century. The global nomad at least will appreciate schema
knowledge after experiencing that all airports look alike. Relph even called them
‘no-places’ [175]. On the other hand, Maguire et al. found that spatial memory
shows plasticity in response to environmental demands [130]. They compared the
brains of London taxi drivers with control participants who did not drive taxis.
It turned out that the posterior hippocampi of taxi drivers were significantly larger,
and this observation correlated with the amount of their professional experience.
The involvement of the parahippocampal region in both spatial memory and
object or scene recognition is particularly interesting in the context of landmarks
as used for visual navigation. Janzen and Turennout [92] have investigated combi-
nations of skills:
Human adults viewed a route through a virtual museum with objects placed at intersec-
tions (decision points) or at simple turns (non-decision points). Event-related functional
magnetic resonance imaging (fMRI) data were acquired during subsequent recognition
of the objects in isolation. Neural activity in the parahippocampal gyrus reflected the
navigational relevance of an object’s location in the museum. Parahippocampal responses
were selectively increased for objects that occurred at decision points, independent of
attentional demands. This increase occurred for forgotten as well as remembered objects,
showing implicit retrieval of navigational information (p. 673).
We will come back to this encoding of navigational relevance; in Sect. 3.2 we have
called it structural salience. In addition, Janzen and Turennout later demonstrated
that good navigators even show a consolidation effect in their spatial memory. Their
activity in the hippocampus increases when recognizing objects along routes learned
a while ago, compared to routes traveled just now [91]. Furthermore, Maguire
and colleagues have investigated whether a specific human navigation system
exists [129, 203]. Their experiment was based on fMRI while tracking participants
in virtual environments. They could identify three specific brain regions supporting
64 3 How People Perceive, Memorize, Think and Talk About Landmarks
navigation, which together seem to code the proximity and direction to the goal.
The three regions were the medial prefrontal cortex and the right entorhinal region,
where neuronal activity correlated positively with goal proximity in one and
negatively in the other, and the bilateral posterior parietal cortex, where
activity was correlated with the egocentric direction to goals.
With such a differentiated picture it makes sense to assume that different
navigation tasks access different spatial abilities. Hartley et al. observed brain
activity in people finding their way in an unfamiliar (virtual) environment and
in people following a familiar route in another (virtual) environment [75]. The
wayfinders showed higher activities in the anterior hippocampus, while the route
followers showed higher activities in the head of the caudate. The former is consistent
with the earlier assumption that the hippocampus is involved in place learning and
provides a response to location within a spatial representation. The latter is consistent
with an assumption of the caudate supporting action-based representations and
providing a fast response to actions instead of locations. Their observations suggest
that (at least) two representations are available for navigation: an action-based
one, which is more efficient in learned environments, and a location-based one, which
is more efficient in unknown environments.
A review of behavioral and neuroscientific findings in rodents and humans
by Chan et al. brought up that environmental objects can act as landmarks for
navigation in different ways [23]. They proposed a taxonomy for conceptualizing
object location information during navigation. This taxonomy consists of:
• Objects as visual beacons for navigation indicating a nearby en-route target
location.
• Objects used as associative cues indicating a nearby location with an associated
navigational action.
• Objects as visual cues to maintain or regain orientation.
• Objects used as an environmental reference frame for navigation, which are
geometric properties of larger objects that can provide a frame for organizing
spatial information, such as alignment operations (for example, rats seem to
prefer geometric cues over object cues for orientation [25]).
The distinction between smaller objects—visual beacons or associative cues—and
larger objects, or rather object geometries such as walls, becomes even more
relevant with research testing whether one of them is preferred. Hartley et al. [76],
for example, have geometrically altered the boundaries of a (virtual) environment
between two visits of participants. The first time participants encountered a cue
object in a simple rectangular enclosure and a distant visual cue for orientation.
The second time, after a brief break outside of the environment, participants were
brought back and asked to mark the place where the cue had been. On some trials
the geometry (size, aspect ratio) of the arena was varied between presentation and
testing. Hartley et al. report:
Responses tended to lie somewhere between a location that maintained fixed distances from
nearby walls and a location that maintained fixed ratios of the distances between opposing
walls. The former were more common after expansions and for cued locations nearer to
the edge while the latter were more common after contractions and for locations nearer
to the center. The spatial distributions of responses predicted by various simple geometric
models were compared to the data. The best fitting model was one derived from the response
properties of place cells in the rat hippocampus, which matches the proximities $1/(d + c)$
of the cue to the four walls of the arena, where d is the distance to a wall and c is a global
constant.
Also, the geometry of the arena seemed to have served as a weak cue for orientation.
Overall people seemed to combine two strategies: matching the distant visual cue
for orientation, and representing proximity to the walls of the arena consistent with
path integration.
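The proximity-matching idea quoted above can be illustrated with a small numerical sketch. It is a simplification under stated assumptions, not the model as fitted by Hartley et al.: the arena is an axis-aligned rectangle, the constant c is set arbitrarily to 1, the response is found by a brute-force grid search, and the function names are hypothetical. The sketch stores the four proximities $1/(d + c)$ of the cue in the original arena and then looks for the location in the altered arena whose proximities match them best.

```python
def wall_proximities(x, y, width, height, c=1.0):
    """Proximities 1/(d + c) of the point (x, y) to the west, east, south
    and north walls of a width-by-height rectangular arena."""
    distances = (x, width - x, y, height - y)
    return tuple(1.0 / (d + c) for d in distances)


def best_matching_location(cue, old_size, new_size, c=1.0, step=0.05):
    """Grid-search the altered arena for the location whose wall proximities
    are closest (least squares) to those remembered for the cue."""
    target = wall_proximities(cue[0], cue[1], old_size[0], old_size[1], c)
    width, height = new_size
    best, best_error = None, float("inf")
    for i in range(int(round(width / step)) + 1):
        for j in range(int(round(height / step)) + 1):
            x, y = i * step, j * step
            candidate = wall_proximities(x, y, width, height, c)
            error = sum((a - b) ** 2 for a, b in zip(candidate, target))
            if error < best_error:
                best, best_error = (x, y), error
    return best


# Unchanged arena: the remembered location is recovered.
same = best_matching_location((1.0, 2.0), (4.0, 4.0), (4.0, 4.0))

# Arena stretched from 4 m to 8 m along x, cue originally 0.5 m from the west wall.
stretched = best_matching_location((0.5, 2.0), (4.0, 4.0), (8.0, 4.0))
print(same, stretched)
```

Because near walls dominate the proximities, a cue close to the edge of an expanded arena stays near its original distance from that wall in this sketch, which is in line with the tendency reported in the quotation. Keeping a fixed distance would place it at x = 0.5, and keeping a fixed ratio at x = 1.0.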
Knauff added to this discussion [104] by arguing for a stronger separation of
mental visual imagery and spatial reasoning. He refers to the long tradition in
cognitive science of thinking of a pathway for visual object identification—the
processing of visual properties such as shape, texture and color—and a pathway for
recognizing where objects are in space. He also cites Landau and Jackendoff [114],
who observed evidence in language encoding for a non-linguistic, cognitive dispar-
ity in the representation of what and where. Similarly, one could cite the visual sense
with its separate predispositions for the detection of location and the recognition of
objects [13, 138, 191]. Here Knauff suggests that “human reasoning is based on
spatial representations that are more abstract than visual images and more concrete
than propositional representations” (p. 16), claiming that spatial representations
are not restricted to a certain format, and integrate different types of information
while avoiding excessive visual detail.
Now, taking Knauff’s suggestion seriously, a particular role for landmarks opens
up. Since landmarks have both properties, the visual imagery of object identity and
the anchoring to location, landmarks do form the bridge between mental visual
imagery and mental spatial representations. This location-object binding will be
useful in both the subconscious, automatic cognitive processes (System 1), such as
self-localization, and conscious cognitive processes (System 2), such as searching
for a landmark provided in a verbal route description.
Now let us focus on the cognitive capacity to form and maintain representations of
the spatial environment, and to recall from these representations for reasoning or
sharing knowledge with others. We will see that landmarks play a central role in all
these processes. We will concentrate on cognitive studies; Sect. 3.4 will be dedicated
to sharing knowledge with others.
Gopnik suggests a developmental process based on theory theory (in other contexts
theory theory is also called folk theories). An infant, reaching out, collects embodied
experiences and tests with these experiences its first, primitive theories about space.
Causality, or at least probability, learned from repetition will lead to revisions and
the development of increasingly mature theories [70, 71, 165]. Theory building
mechanisms can be specialization by adding a constraint to a theory limiting it to
special sorts or cases, generalization by removing such a constraint if a theory has
been found to be a special case of a more general theory, and dynamic weighting
between theories to assign importance to favored theories [224]. Based on these
mechanisms the infant will learn, for example, that some things are out of reach,
or that some things can be stacked, or that all the perceptual stimuli caused by
a car form a whole, and will move together when the car moves. Thus, theory
theory is about concept formation [132], and generally related to space (anything
Mental spatial representations, just as external ones, are based on spatial frames
of reference. The role of a frame of reference is facilitating an unambiguous way
to locate things in a space. A frame of reference is established by its datum.
Mathematically the datum comprises an origin location and direction (short: origin)
from which distance and direction measurements are made.
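A minimal sketch of such a datum, assuming a flat two-dimensional space with x pointing east and y pointing north, and directions measured in degrees clockwise from the datum's reference direction. The class and function names are hypothetical and serve only to illustrate the definition.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    """Origin location plus reference direction of a frame of reference."""
    x: float
    y: float
    heading_deg: float  # reference direction, degrees clockwise from north


def locate(datum: Datum, target_x: float, target_y: float):
    """Return (distance, direction) of a target as measured from the datum.

    The direction is in degrees clockwise from the datum's reference
    direction: 0 is straight ahead, 90 is to the right.
    """
    dx, dy = target_x - datum.x, target_y - datum.y
    distance = math.hypot(dx, dy)
    bearing_from_north = math.degrees(math.atan2(dx, dy))
    direction = (bearing_from_north - datum.heading_deg) % 360.0
    return distance, direction


# The same target seen from two data that share an origin but not a direction.
facing_north = locate(Datum(0.0, 0.0, 0.0), 10.0, 0.0)
facing_east = locate(Datum(0.0, 0.0, 90.0), 10.0, 0.0)
print(facing_north, facing_east)
```

The two calls differ only in the reference direction of the datum, which is enough to turn "to the right" into "straight ahead" for the same target.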
For a cognizing individual the datum can be an oriented object in the real
world, defining location and direction. Two kinds of objects come to mind: one's
own body (an egocentric perspective), or another oriented object (an allocentric
perspective) [17, 99, 106]. Taking my body as origin I can visualize that “the
library is right of me, not far”, which relates the library to my body, or describes
its location with respect to my body’s location and heading. Using an egocentric
frame of reference has some challenges, despite being the frame of reference babies
develop first. One challenge lies in the ambiguity about the orientation of the body,
where chest, head and viewing direction can be, and frequently are, not aligned.
If the body is moving, then the heading direction can be non-aligned with any of
the prior ones. Another challenge, and a more significant one, lies in the body’s
mobility, which requires constant updating of a representation of all the relationships
of the body with the objects in the environment. This updating is actually much
more costly than maintaining an allocentric, stable representation of the location
of stationary objects and only updating the body’s location and orientation in
this representation. I can mentally visualize: “The car is parked in front of the
church”, which has no reference to my body at all, but instead refers to the inherent
orientation of the church, which has a front side by design. When I move or turn,
the relationship between car and church remains stable.
3 The term perspective taking is used here deliberately in the most general sense of focusing on a
situation, selecting relevant objects and relations, and then choosing a geometric perspective. In
contrast, linguistics understands only the latter as perspective taking [118].
Updating my location in an allocentric frame of reference can happen in
two ways. One was mentioned before, path integration, sometimes called dead-
reckoning. This updating process through path integration is independent of the
configuration of objects around. Path integration provides an update only with
respect to an initial location. Loomis et al. investigated the ability of path integration
by homing experiments with either blind or blindfolded humans [125]. Similarly
to technical dead-reckoning systems, which are equipped with inertial sensors, path
integration quickly accumulates error.
The other updating mechanism is piloting: orientation by recognizing landmarks
in the world and establishing one's own body's relationship to these landmarks in
their known configurations. Piloting, therefore, has no accumulation of errors. But
it relies on the presence of recognizable landmarks in the proximity. Unavoidably,
piloting has to deal with gaps between experiencing these landmarks, and path
integration can help bridge these gaps.
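The interplay of the two updating mechanisms can be illustrated with a deliberately simple simulation. Every modelling choice here is an assumption of the sketch rather than a finding from the literature: the walker moves due east in unit steps, the believed heading drifts by a constant 0.5 degrees per step as a stand-in for sensor error, and recognizing a landmark at a known location resets the estimate completely. The function name is hypothetical.

```python
import math


def position_errors(steps=200, drift_deg_per_step=0.5, fix_every=None):
    """Walk due east in unit steps and return the position error after each step.

    Path integration: the estimate is advanced along the believed heading,
    which drifts a little further from the true heading with every step.
    Piloting: if fix_every is set, a landmark is recognized every fix_every
    steps and the estimate is reset to the true position and heading.
    """
    true_x = 0.0
    est_x = est_y = 0.0
    heading_error = 0.0
    errors = []
    for step in range(1, steps + 1):
        true_x += 1.0
        heading_error += math.radians(drift_deg_per_step)
        est_x += math.cos(heading_error)
        est_y += math.sin(heading_error)
        errors.append(math.hypot(est_x - true_x, est_y))
        if fix_every and step % fix_every == 0:
            est_x, est_y, heading_error = true_x, 0.0, 0.0
    return errors


integration_only = position_errors()
with_piloting = position_errors(fix_every=20)
print(f"final error, path integration only: {integration_only[-1]:.1f}")
print(f"largest error with a landmark every 20 steps: {max(with_piloting):.1f}")
```

Without landmarks the error keeps growing with the distance travelled. With occasional landmarks it stays bounded by whatever accumulates in the gap between two of them.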
The thought experiment (Chap. 2) had used ‘home’ as an allocentric first point
of reference, later replaced by a pole together with a reference direction (see also
Fig. 3.7). But with more and more landmarks in the experiment’s environment a
single datum point can be traded for configurations of landmarks, at least for local
Fig. 3.7 A datum point in the real world (CC-BY-SA 3.0 by Cosmo1976, modified)
3.3.2.3 Hierarchies
They no longer made a direct claim that this product is a single integrated
picture. Later Tversky made a point that the term map can only have metaphorical
meaning for an actually more complex mental spatial representation, one that has
also some properties of a collage structured by some spatial models of relations
between objects [221]. For good reasons we therefore stick in this book with
the neutral term mental spatial representation. There is no single, homogeneous
pictorial ‘mental map’, or synonymously, ‘cognitive map’.
Mental spatial representations have one property that can only be explained in a
relational way. They are hierarchic. Hierarchic structures have notable advantages.
One advantage is that a search for individual entities is significantly faster when
applying hierarchical heuristics compared to exhaustive search. Readers familiar
with spatial database structures see parallels to indices and hierarchical data
structures [185]. The other advantage is that different spatial tasks are carried
out more efficiently at different levels of a hierarchy. For example, wayfinding
requires access to the street network of an environment, while coordination of
locomotion requires a far more detailed representation of a significantly smaller
environment [215]. Selective access to mental spatial representations is therefore
essential for cognitive efficiency.
Typically, each coarser level of a hierarchy is derived by abstraction from the
next finer level [84]. This is not the case for mental spatial representations. For
example, geographic space, in contrast to environmental space, can only be learned
from secondary sources such as maps. The mental spatial representation of, say, a
country’s boundaries, or its neighbors, is not derived from extensive locomotion.
Distant (global) landmarks can be learned from the distance, without travelling or
knowing more detail about them. Hence, mental spatial representations are not filled
at every level. Even the notion of ‘levels’ is questionable, but useful to imagine
configurations of landmarks in a vector space. At a coarse level of detail these
landmarks can be the countries of Europe, with Austria to the east of Switzerland.
At a fine level of detail these landmarks can be the furniture in my living room, with
the table in front of the couch. Loosely, we expect similar things to be collected in a
level of a hierarchy.
Levels are connected by links for reasoning purposes. These links establish
relationships, and depending on the type of relationship different hierarchies can
be constructed. Spatial hierarchies cater for both the selection of objects that
are retrieved from memory and the level of detail with which they are
retrieved. Readers familiar with cartography may be reminded of zooming. Zooming
through map series is supported by cartographic generalization techniques such
as simplification, selection and aggregation. In fact, mental spatial representations
maintain complementary hierarchies based on these three operations: a hierarchy
by detail based on simplification, a hierarchy by salience based on selection, and
a hierarchy by granularity based on aggregation. Let us have a closer look at these hierarchies and their
empirical evidence.
Hierarchies by Detail
Landmark hierarchies by detail are rather subtle. They relate to the nature of
landmarks, which can stand for a location as well as an object. In the former
function they provide an anchor in a mental spatial representation. Similarly,
Lynch [128] characterized landmarks as external reference points. In the latter
function they provide memory for the shape and visual image of an object, for tasks
such as object recognition. Landmarks seem to be able to bridge between the most
abstracted form of an object as a purely spatial reference, and more detail of a visual
image. For reasons of cognitive efficiency the mind may only recall what is just
needed. For example, wayfinding according to a plan “at the corner store turn left”
first requires finding a store, which can be based on a fast low-detail schema;
only when a store has been identified in the environment is a detailed visual image
from memory required for object disambiguation.
Hierarchies by Salience
We have already encountered global landmarks and local landmarks: landmarks that stand
out in a larger environment—such as the Eiffel Tower in Paris—and landmarks
that stand out only locally—such as a corner store along a street, or ATMs in the
entrance area of a mall. This coarse classification hints at asymmetric relationships.
For example, “the ATM at Eiffel Tower” may be accepted, but “Eiffel Tower at the
ATM” is not. One is more salient than the other, even dominant.
Thus, a configuration of landmarks (a ‘layer’ in a salience hierarchy) comprises
similar objects, but similarity is based here on salience, not size or level of detail.
Selection and weighting, we have argued above, are linked to the embodied or
mediated experience of the environment. We mentioned already the following
factors contributing to salience: The figure-ground contrast, the relevance of the
location, and the frequency of encounter or prominence. The more salient landmarks
are the more easily recalled objects of an environment [8]. If some objects are
more easily recalled into working memory they must also be earlier available for
processing. Besides priming the mental spatial representation, they also serve as
available reference locations for other objects nearby (“at the Eiffel Tower”).
The first evidence for such reference points in mental spatial representations was
provided by Sadalla et al. [184]. They asked participants for distance judgements
between pairs of reference- and non-reference points. In their words:
Since a reference point is regarded as a place that defines the position of other adjacent
places, it follows that other places should more easily be seen in spatial relation to a
reference point than vice versa [. . . ]: Adjacent places should be judged closer to reference
points than are reference points to adjacent places (p. 517).
One of their tests, for example, demonstrated different judgements for questions
like “Is the newsagent’s close to the hospital?” and “Is the hospital close to the
newsagent’s?”, with the hospital assumed to be a reference object. Also, distance
judgements to reference points are made faster than to non-reference points, and
direction judgements to other reference points are made faster when at a reference
point. Adding to these observations, Allen [2] found that distance estimates across
object clusters around reference points were judged to be consistently longer than
distances of the same length within such clusters.
Couclelis et al. [31] then concentrate on the link of objects with their reference
object, i.e., the link of a location characterized by a landmark and its reference
region. They call it the tectonic plate hypothesis. According to the tectonic plate
Hierarchies by Granularity
We are all familiar with containment hierarchies. It is the way we have learned
geography. Continents are aggregations of countries, and countries of states or
counties. Accordingly, western postal addresses locate an addressee on street level,
then on city level, then on country level. Such a hierarchical structure is established
by part-of relationships from an entity at one level to an entity at the next coarser
level of granularity. Salt Lake City is part of Utah, and Utah contains Salt Lake City.
Mental spatial representations reflect these hierarchies by containment relationships.
Developmental psychology has already documented the fundamental role of topol-
ogy for cognitive spatial abilities (Sect. 3.3.2.1). Other literature identified a number
of classifications of spatial granularity that are motivated by human conceptions of
space (for reviews see [56, 177]).
A hierarchy by granularity, based on aggregation and abstraction, is independent
of hierarchies by detail or by salience. Two objects of the same level
of granularity can have very different salience. For example, the Guggenheim
Museum in New York is more salient than the buildings next to it along Fifth
Avenue, which are nondescript apartment buildings. Even more, an object can
be more salient than its container object, i.e., salience does not accumulate with
aggregation and abstraction automatically. For example, more people will confirm
having experienced (physically or mediated) the Champs-Élysées than the eighth
arrondissement of Paris, of which the Champs-Élysées is part.
The classical experiment providing evidence for this type of hierarchy was
conducted by Stevens and Coupe [206]. They asked participants to estimate the
direction between pairs of cities that are located in different states or countries. For
example, they were asked to estimate the direction from San Diego (California) to
Reno (Nevada). Since the state of Nevada is east of the state of California, most
participants made biased direction estimates to the east. But the direction from San
Diego (California) to Reno (Nevada) is actually north-northwest (Fig. 3.8). Across
multiple pairs of cities, the consistent distortions in the participants' estimates
can only be explained by assuming hierarchical
reasoning.
Stevens and Coupe also point to the efficiency gains of hierarchical reasoning. If a
relationship can be stored at a coarser level of granularity then all the relationships
between pairs of objects contained in both aggregates do not need to be stored. The
adopted strategy in the particular example seems to be:
• San Diego is in California, Reno is in Nevada (lifting the problem to a coarser
level of granularity).
• Nevada is East of California (reasoning about directions at this coarser level of
granularity).
• Ergo, Reno must be East of San Diego (projecting the reasoning result to the finer
level of granularity).
Consider exhaustive search in contrast. An exhaustive search for the direction from
San Diego to Reno would require activating in working memory the vector space
of cities in the western part of the USA, and then determining by vector addition
the direction between the two named cities. No doubt this exhaustive search is
possible for the mind. A person living on the west coast of the USA may have learned
the configuration of cities from travelling around extensively or from studying maps.
But the cognitive load is significant, compared to lifting the problem up to the
coarser level of granularity.
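The contrast between the two strategies can be made concrete in a short sketch. The lookup tables are hypothetical stand-ins for what a person might have stored, the city coordinates are approximate values supplied for this illustration, and the bearing is computed with the standard great-circle formula for the initial bearing, which is a choice of this sketch rather than anything claimed about the mind.

```python
import math

# Coarse level: containment plus one stored relation between the containers.
CITY_STATE = {"San Diego": "California", "Reno": "Nevada"}
STATE_DIRECTION = {("California", "Nevada"): "east"}

# Fine level: approximate (latitude, longitude) in degrees, assumed values.
CITY_COORDS = {"San Diego": (32.72, -117.16), "Reno": (39.53, -119.81)}

COMPASS_16 = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
              "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def hierarchical_direction(a: str, b: str) -> str:
    """Lift both cities to their states and reuse the stored state relation."""
    return STATE_DIRECTION[(CITY_STATE[a], CITY_STATE[b])]


def initial_bearing(a: str, b: str) -> float:
    """Initial great-circle bearing from a to b, degrees clockwise from north."""
    lat1, lon1 = map(math.radians, CITY_COORDS[a])
    lat2, lon2 = map(math.radians, CITY_COORDS[b])
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def compass_point(bearing_deg: float) -> str:
    """Name of the nearest of the 16 compass points."""
    return COMPASS_16[round(bearing_deg / 22.5) % 16]


heuristic = hierarchical_direction("San Diego", "Reno")
bearing = initial_bearing("San Diego", "Reno")
print(f"hierarchical answer: {heuristic}")
print(f"computed from coordinates: {compass_point(bearing)}")
```

The hierarchical answer needs only two containment lookups and one stored relation, whereas the coordinate-based answer needs the position of every city that might be asked about. The price of the cheaper strategy is the systematic error observed in the experiment.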
But the experiment also demonstrates the price to pay. A heuristic is a mental
shortcut. It has become part of System 1 because it provides satisfactory solutions
in general, whilst accepting that it does not always guarantee the correct (or optimal)
solution [64, 94, 220].
Inspired by the above experiment, Hirtle and Jonides [83] hypothesized that
mental spatial representations of inhomogeneous distributions of objects are also
hierarchic. Clusters exist in the real world. For example, buildings are dense in cities
and scarce in rural areas, cities are dense in populated countries and absent at sea,
and so on. People judge distances between these clusters differently than within
clusters, and thus they form tree-like mental hierarchical structures from
these clusters, even in the absence of boundaries or barriers as in Stevens and Coupe’s
experiment.
Moreover, McNamara [141] studied how people learn environments and then
make judgements from memory. He presented his participants either a physical
environment or the same environment represented on a map. Both presentations
had a hierarchic structure. Objects in the physical environment were laid out in
regions marked on the floor, and correspondingly the map space was partitioned
into regions. After a learning phase participants had to engage in three tasks: item
recognition, direction judgements, and Euclidean distance estimation. The results
from all three tasks were sensitive to whether objects were in the same region or in
different regions, which is a clear indication of a hierarchically structured mental
spatial representation of a learned environment.
Thus, there is strong evidence for a hierarchic organization of mental spatial
representations by granularity, more precisely in containment hierarchies.
3.4 Externalization of Mental Spatial Representations 77
also a remarkable illustration how complex verbal dialog can be, which is just
conveying the position of a line in a diagram. Some of this complexity actually arises
from the differences between the two communicating partners’ knowledge of the
environment, which requires negotiations and adjustments of knowledge.
The Map Task. In 1991 the Human Communication Research Centre at the
Universities of Edinburgh and Glasgow designed an experiment to collect
a corpus of 128 dialogues about maps. In their experiment two participants
would “sit opposite one another, and each has a map which the other cannot
see. One speaker—designated the instruction giver—has a route marked on
her map; the other speaker—the instruction follower—has no route. The
speakers are told that their goal is to reproduce the Instruction Giver’s route on
the Instruction Follower’s map. The maps are not identical and the speakers
are told this explicitly at the beginning of their first session. It is, however, up
to them to discover how the two maps differ.” The maps were designed as line
drawings with landmarks attached. The landmarks were labelled with names.
A variable in the test was the degree of agreement between the maps, where
differences between the maps could consist of absence of some landmarks
and name changes of landmarks, which also was used to produce ambiguities
by multiple occurrences of a name.
instruction giver can leave a map on the table to be picked up by the instruction
taker at a later point in time. In synchronous, non-co-located communication, an
instruction giver can use a map when talking to an instruction taker—the map
task situation from above. And in asynchronous, non-co-located communication an
instruction giver can draw a route on a map and send it by mail to the instruction
taker. But an equally perfect example, using another communication mode, is
pointing directions. In face-to-face situations a pointing gesture in a direction can be
sufficient, while in asynchronous, co-located communication a street sign pointing
in a direction does provide the equivalent. The street sign has been left behind by
somebody, the road authority, for the purpose of guiding any passer-by. In contrast,
a mobile location-based service can show an arrow on the screen of the smartphone
that is reflecting an up-to-date knowledge of a central (non-co-located) database.
Its frame of reference is established locally from the positioning and the compass
sensors of the smartphone. The smartphone can also access websites that may
present sketches indicating directions by arrows, which establishes an asynchronous
and non-co-located form of communication. The shown sketches have to provide
also a frame of reference, for example, a landmark with its landmark-centered
reference system.
In this context most revealing about mental spatial representations is the message
produced, which is focusing on the producer only. Collecting gestures, utterances
or observing other communication behavior of an individual should make it possible to
reconstruct aspects of their mental spatial representations. However, before we
review the corresponding literature in the following sections let us reflect on this
methodology and its implications. Two issues in particular shall be discussed: the
levels in the externalization process, and the reading process of the scientist.
The former relates to the construction and use of mental spatial representations.
As discussed above, mental spatial representations of individuals are constantly
evolving by their embodied or mediated experiences. Mental spatial representations
have been understood so far as being in long-term memory (see Sect. 3.3.1).
Particular tasks at hand invoke portions of the long-term memory into working
memory, the mental visual imagery available for problem solving which also applies
some spatial abilities. Then selected elements of the working memory (the portion
that has been found relevant for the solution of the problem) have to be mapped into
language (for this triad see, for example, [39]), such as signs or gestures, sketches
or words. Each of these communication modes is flexible, such that there is no one-
to-one correspondence of working memory and expression. Consider for example
the large variety of ways of describing one and the same route, or one and the same
place [187]. In a communication process only one of the many possible expressions
will be realized. The point is that an expression in a language is a very indirect clue
on mental spatial representations.
3.4 Externalization of Mental Spatial Representations 81
one is more informative (even a categorical “Turn right at the Catholic church”
would have been), but the additional amount of information is irrelevant in this
context, and hence, neglected by many speakers. The additional information is only
relevant—and then not neglected by speakers—in contexts where a reference to
the base category would be ambiguous; say, where St. Francis would not be the first
church encountered and finer distinctions have to be made.
In the following two subsections we will focus more narrowly on sketches and
spoken or written verbal instructions. We will leave aside other forms of externalizations,
although they could also provide insights into mental spatial representations
in general, and landmarks in particular. Maps, for example, are rarely the output of
a single person’s mental spatial representation, but are rich in documenting collective
human landmark experience in an environment. Purely symbolic representations
such as arrows externalize aspects of mental spatial representations and are
studied in this regard (e.g., [102]); however, they convey little about landmarks.
Pointing, as we have mentioned before in the context of path integration, provides
an important insight into mental spatial representations; however, it is also poor
with respect to landmark knowledge. We also leave aside more literary descriptions,
for example those that try to capture the atmosphere or essence of an environment.
The following two subsections aim in particular to extract these kinds of
knowledge about mental spatial representations:
• Elements of mental spatial representations.
We are asking what the atoms of mental spatial representations are. Lynch,
for example, identified elements of graphical representations of cities [128]. But
Lynch’s work was limited to one particular context, or level of spatial granularity.
More work is needed in various contexts, and should include research on qualities
of landmarks, or on landmarkness, since we learned above already that all of
Lynch’s elements have landmarkness.
• Structures of mental spatial representations.
We are asking how these atoms are connected. Here the hierarchies by
granularity (containment) and by salience (use of anchor points) play a role. But
also research in identifying fundamental qualitative spatial relationships belongs
here. Mark and Egenhofer, for example, have studied the cognitive meaning of
spatial predicates in spoken languages by presenting participants with labelled sketch
maps and asking whether they agreed with the label [135].
• Individual landmarks.
We are asking which objects in a specific environment have formed the
experience and ensuing mental spatial representation of an individual. Lynch’s
sketches do not only reveal elements (categories of types), they also reveal
concrete instances of these elements. In a similar vein, collecting landmarks
from corpora of place descriptions [39] or from information retrieval approaches
(e.g., [24, 176]) should allow a mapping of individual, collective, and context-
aware mental spatial representations of particular environments, despite our
awareness that completeness can never be reached.
84 3 How People Perceive, Memorize, Think and Talk About Landmarks
3.4.2 Sketches
Strikingly these elements can be grouped into two groups [217]. Paths, nodes and
districts facilitate or afford movement of people. The second group is formed by
elements that inhibit movement, namely edges and landmarks. These elements
either increase the integration of the environment and contribute to its cohesion
and homogeneity, for example, paths connecting parts of the city, nodes connecting
paths, and districts formed by perceiving groups of nodes, paths and landmarks as
regions of homogeneous character. Or they increase its heterogeneity by fragmen-
tation and differentiation, for example, edges separating or delineating districts, and
landmarks representing islands of distinction in a city.
Thus Lynch’s classification, despite its impact on our understanding of mental
spatial representations, seems to be focused on the context of wayfinding and
orientation at urban scale. Stevens [205] studied playful behaviors in urban contexts
and replaced some of Lynch’s elements by elements of more private nature. He
introduces thresholds and props. Thresholds are the locations where paths cross
boundaries, such as the entrance to the train station, or the stairs up to the
library, where juveniles or street artists find niches. Props are elements of ‘urban
detail’, small and easily overlooked but perhaps producing a private experience for
some, such as public artworks, signs, trees, street furniture, or doorknobs. Such
observations clearly support the expectation that elements are context-dependent,
and vary over scales of granularity.
Another extension of Lynch’s elements is also relevant for our exploration. In
Lynch’s classification, a railway embankment is an edge, since pedestrians and
motorists have to travel detours to find crossings. But switching the context, and
considering public transport users instead, a train line becomes a path. This means
that Lynch’s elements, when applied to objects in the city, depend on a person’s
perspective. This perspective relates the person’s mobility characteristics with the
environmental affordances [217].
A special affordance in this regard is accessibility, since already Lynch had
distinguished between accessible elements (path, node, districts) and inaccessible
elements (edges and landmarks). Following logical deduction one additional ele-
ment has to be added [217]. Lynch’s accessible elements are zero-dimensional
(node), one-dimensional (path), and two-dimensional (district), while his inac-
cessible elements are zero-dimensional (landmarks as reference points) and one-
dimensional (edges). There must exist a two-dimensional inaccessible element, a
restricted district, such as barracks, waste land or gaps that are inaccessible for
(civilian) pedestrians, motorists, or other mobility modes. The reason why neither
Lynch nor anybody else has postulated it before probably lies in the sketches
themselves. Restricted areas do not get sketched. They are the blank spaces on
sketches. However, keeping in mind that sketches are drawn for a purpose, if the
person switches their perspective the blank spaces in one context can become visible
in another context.
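The deduction of the sixth element can be retraced with a small lookup table. The following Python lines are only an illustrative sketch: the dictionary layout and the helper name missing_cells are our own assumptions, not part of Lynch's work or of [217]. The sketch crosses dimension with accessibility and reports the combination for which no element has been named.

```python
from itertools import product

# Lynch's five elements keyed by (dimension, accessible), as listed in the text.
LYNCH = {
    (0, True): "node",
    (1, True): "path",
    (2, True): "district",
    (0, False): "landmark",
    (1, False): "edge",
}

def missing_cells(elements, dimensions=(0, 1, 2)):
    """Return the (dimension, accessible) combinations that have no element."""
    return [cell for cell in product(dimensions, (True, False))
            if cell not in elements]

gaps = missing_cells(LYNCH)

# Fill the gap with the element proposed in the text (hypothetical label).
extended = {**LYNCH, **{cell: "restricted district" for cell in gaps}}
```

The only empty cell is the two-dimensional inaccessible one, which corresponds to the restricted district discussed above.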
Thus, Lynch’s elements reflect the sketching person. They represent the relevant
objects of a mental spatial representation for wayfinding and orientation assuming
People also externalize their mental spatial representations when they give verbal,
i.e., spoken or written, descriptions of locations, configurations, or directions to
a recipient. As in sketches, landmarks are reference points for anchoring verbal
descriptions in location.
A fundamental observation about landmarks in verbal descriptions has been
made by Landau and Jackendoff [114]. They observed that the located objects
(locata) appear to be encoded in language with more detailed geometric properties,
such as their axis, their volume, surfaces and parts. In contrast, reference
objects (relata) are encoded only with coarse geometric properties, primarily the
main axes. Their observation is consistent with our expectation that the relata
are shared knowledge. They do not require description except what is needed
to establish the frame of reference for the locatum, i.e., information about their
orientation. In addition, they found a preference for prepositions expressing qualitative
spatial relationships, especially topology, distance and direction. They conclude:
“The striking differences in the way language encodes objects [locata] versus places
[relata] lead us to [postulate] a non-linguistic disparity between the representations
of what and where”, a disparity that, as discussed above, has also been detected
in the visual apparatus [13, 191], as well as in the neuronal basis of the mental spatial
representation [104]. In a cross-linguistic comparison between English, Japanese
and Korean, Munnich et al. added that spatial properties show sufficient similarity
between languages to assume a common cognitive basis [153].
These verbal descriptions cover the same communication purposes as the
sketches discussed above, but use a different language, namely a non-visual and
linear language. Verbal descriptions are sequential in utterance and understanding.
Because routes themselves are linear, this linear structure has less impact on route
descriptions, and most research on verbal descriptions so far has looked into
route descriptions. Couclelis [30], for example, pointed out the potential to learn
about spatial cognition through the investigation of verbal route descriptions. She
wrote:
Route directions are readily available, natural protocols reflecting the direction givers’ cog-
nitive representations of certain critical aspects of their environment. Still, the relationship
between the spoken words and the underlying cognitive structures is far from transparent.
Responding effectively to a non-trivial request for route directions is a complex task during
which different aspects of spatial cognition come into play at different stages. A number of
questions can be raised about that process (p. 133).
“Information is only useful pragmatically when it influences a decision”, and
thus, pragmatic information content can only be determined for a particular
user in a particular decision making situation. “All messages which lead to the
same actions have the same information content, which is the minimum [size
of the instruction] to determine the action [for this user]. If two users differ in
the action they consider, their [situations] differ and therefore the information
they deduce from the information content of the same message is different”
(p. 47).
Based on these two principles it can be stated that all route descriptions enabling
an instruction follower to realize a certain route are equivalent from a pragmatic
perspective. This means:
• There will be longer (richer) and shorter (leaner) route descriptions enabling
an instruction follower to reach a target. The maxim of brevity would prefer the
shorter ones. In the longer ones the maxim of relevance will identify irrelevant
or redundant references.
• There will be route descriptions that will fail to guide a particular instruction
follower to a target. These descriptions violate the maxim of relevance by
omitting relevant information.
Thus, a different instruction follower may require a different message. However, one
important conclusion follows from these rather abstract considerations. Of all prag-
matically equivalent descriptions guiding to the target the shorter descriptions may
be the better route descriptions. A tangible reason for this assumption is the limited
capacity of short term memory [32, 110, 144]. Thus, in accord with other forms of
externalizations, route descriptions reveal more about the structure of mental spatial
representations in connection with strategies of spatial and communication abilities,
and less about the content of mental spatial representations.
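The argument of pragmatic equivalence can be made concrete with a toy model. In the following Python sketch, a route description is reduced to a list of landmark references, and a follower is reduced to the set of references they need in order to reach the target. Both reductions, as well as the names succeeds and best_description, are simplifying assumptions made for illustration only and are not a model proposed in the cited literature.

```python
def succeeds(description, required):
    """A description guides this follower if it contains every reference they need."""
    return required <= set(description)

def best_description(descriptions, required):
    """Among the successful (pragmatically equivalent) descriptions, pick the shortest."""
    equivalent = [d for d in descriptions if succeeds(d, required)]
    return min(equivalent, key=len) if equivalent else None

descriptions = [
    ["church", "bridge", "bakery", "fountain"],  # richer, with redundant references
    ["church", "bridge"],                        # leaner
    ["church"],                                  # leanest
]

local = {"church"}              # hypothetical follower familiar with the area
visitor = {"church", "bridge"}  # hypothetical follower who needs more guidance

for_local = best_description(descriptions, local)
for_visitor = best_description(descriptions, visitor)
```

In this sketch the leanest description suffices for the local follower but fails for the visitor, for whom the two-reference description is the shortest successful one. This mirrors the observation that a different instruction follower may require a different message.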
The assumption of a preference for shorter descriptions has actually been
confirmed in independent research offering further insight into the internal structure
and content of good route descriptions (e.g., [38–40, 127]). This cognitive and
linguistic research again has to rely on indirect observations since there are no
formal criteria for judging the quality of route descriptions other than whether the
instruction taker has reached the target. Indirect ways of observation are:
• A purely descriptive approach of linguistic structure. The result is a character-
ization of a route description rather than an assessment. It permits at least a
qualitative comparison between route descriptions for the same route.
• Ratings of human route descriptions by local experts. In principle this method
can be applied in situ, e.g., after route following, or in a survey relying on the
mental spatial representations of the raters.
• Navigational performance by instruction followers unfamiliar with the environ-
ment. In principle this method can collect whether followers succeed, but in
addition can also survey how comfortable they felt.
• Comparison of human route descriptions with some algebraically produced route
descriptions. Algebraic approaches are suited to produce minimal instructions
according to some model, but since the model can mismatch with a context
there is no guarantee for producing successful descriptions, let alone shortest
successful ones. Hence, the algebraically produced route descriptions generally
have to be tested in a control experiment as well.
One of the first investigations of this kind was Wunderlich and Reinelt’s [241]
linguistic study of forms of speech in route descriptions. They worked from a
corpus of route descriptions to identify four phases in the full discourse: an opening
(“Excuse me, can you tell me . . . ”), the route instructions themselves (the path to
be followed), an optional securing phase (ensuring that the message has been
conveyed), and a closure (“Thank you”). In the route instructions they found patterns
identifying landmarks as intermediate destinations and locations of reorientation.
This structure was close to a formal model proposed before by Kuipers [109].
Later Streeter et al. [207] applied a pure navigational performance test to com-
pare customized route maps with verbal route descriptions. They demonstrated that
car drivers following the verbal route descriptions, which provided one instruction
per turn, drove fewer miles, took less time, and showed about 70 % fewer errors
than the drivers relying on the route map. If a picture tells more than 1,000 words
then less is obviously more. Correspondingly, drivers who had both verbal route
descriptions and the route maps available performed badly as well.
Then Denis and collaborators [37–39] collected campus route descriptions from
students. They applied all four approaches in their study of these route descriptions.
In later work [40] Denis et al. repeated the experiment in the real world, collecting
route descriptions from citizens of the city of Venice, and confirmed the prior
findings.
First, they characterized the collected route descriptions by criteria such as
actions specified and localized by a landmark, actions specified without a reference
to a landmark, references to a landmark without any specified action, descriptions
of a landmark, and comments. Results documented that people with higher
visuo-spatial imagination also use more landmarks in their descriptions, which
supports our claim that landmarks bridge between visual memory and spatial
memory.
Secondly, these instructions were rated. The rating was performed by local
experts and non-experts.
In addition, and related to the fourth approach, they came up with a construction
of minimal descriptions, which they call skeletal descriptions. For the construction
they used the elements identified in the student generated route descriptions, thus
maintaining the perspective taking of the speakers whilst concentrating on the
smallest common denominator in the descriptions.
Finally, they tested navigational performance of instruction followers
equipped with good, poor, or skeletal route descriptions.
Three of their observations are critical for us:
• The construction of skeletal descriptions confirmed that “landmarks and their
associated actions were key components of [good] route description” ([39],
p. 409). Also, references to landmarks are unevenly distributed along the route.
They tend to concentrate at points where orientation decisions are to be made.
• Rating of the original route descriptions is highly correlated with their similarity
to the skeletal description.
• Skeletal descriptions received scores similar to those of good descriptions,
despite being, on average, shorter by design.
Independently, Lovelace et al. [127] collected a corpus of route descriptions for a
particular route, and then searched for shared characteristics such as numbers of
segments or turns, or numbers of references to landmarks used. Their classification
schema was inspired by that of Denis et al. but further refined by grouping landmarks into
those at decision points and those not at decision points. While Denis et al. report
a tendency for landmark references to concentrate at decision points, Lovelace
et al. find about half of the landmark references not at decision points. Considering
that both types of landmark references serve different purposes—the first one is
anchoring an action of orientation decision making, while the second one is not
linked to any decision and thus rather of calming or confirming nature—the different
observations between the studies can relate to different contexts. Environments with
longer route segments suggest intermediate confirmatory comments, especially for
instruction followers unfamiliar with the environment.
This thought matches another observation of Lovelace et al. They asked people
familiar and people unfamiliar with the environment to rate the collected route
descriptions. Again in contrast to the findings of Denis et al. of high ratings and high
navigational performance with shorter route descriptions, Lovelace et al. report
a preference for richer route descriptions (inter-rater correlations showed that
subjective ratings were reliable and consistent across individuals). This preference
for richer descriptions indicates again that within their context intermediate confir-
matory references to landmarks were advisable.
Allen [4] wrote about the findings of Denis et al. on the essence of good route
descriptions: “The next step in this strategy may involve a formal description of
the structure and components of these skeletal descriptions, which consist of a
combination of directives and descriptives as described previously” (p. 335). This
is, of course, what the rest of this book is about. But before we move on let us
have a look at Allen’s own work. He demonstrated that route descriptions are better
remembered and lead to higher navigational performance when their production
builds in a few psycho-linguistic principles:
• The principle of spatio-temporal order: The spatial and temporal order of
localizations in route descriptions should be consistent with the order in which
these locations are experienced when traveling along the described route.
• The principle of referential determinacy: References to landmarks should be made
at points where decisions about orientation have to be made, and should link the
proper action with the experience of the landmark.
• The principle of mutual knowledge: Delimiters describing topological, direc-
tional and distance relations in route descriptions are chosen according to
the communication context, i.e., appropriate for the environment and for the
instruction follower.
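The first of these principles lends itself to a simple check. The following Python sketch assumes that a route is given as the list of landmarks in the order in which they are encountered, with each landmark occurring once, and that a description is given as the list of landmarks in the order in which they are mentioned. The function name respects_order and this representation are our own illustrative assumptions, not Allen's formalization.

```python
def respects_order(mentioned, route):
    """True if landmarks are mentioned in the order they are encountered on the route.

    Landmarks that do not lie on the route are ignored.
    """
    positions = [route.index(lm) for lm in mentioned if lm in route]
    return positions == sorted(positions)

route = ["library", "café", "church"]

consistent = respects_order(["library", "church"], route)
inconsistent = respects_order(["church", "library"], route)
```

A description that mentions the church before the library, although the library is passed first, would violate the principle of spatio-temporal order in this sketch.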
However, all research cited so far studied route descriptions in a homogeneous
context, which typically assumes a person unfamiliar with the environment, and
in a mono-modal movement, either walking or driving a car. They emphasize the
role of decision points (for orientation) along otherwise linear route elements. One
could then expect Allen’s principles to be satisfied by route descriptions of single
granularity, which means one instruction per decision point. However, even in
these circumstances human route descriptions are not necessarily of a constant
granularity. Instead, elements are often grouped together (a process in memory and
language sometimes called chunking [35, 103]). For example, an instruction “at the
third intersection turn right” applies numerical chunking, and “follow the signs to
Place descriptions are so common that we do not think much about them. A person
tells her partner where the keys have been left, or where they should meet in the
evening. They call a local emergency number when they have witnessed an accident
and explain where this has happened. They write captions revealing locations when
uploading holiday pictures to a social networking site. Social conventions even
create new forms of place descriptions, such as checking in on one social networking
site, or hashtagging a location on another site. All these conversations are performed
with the intention to help the recipient identify or find these places. More
complex place descriptions can describe whole configurations of objects. These
descriptions are intended to help the recipient to form a mental image of an
environment. For example, a person moving into a new apartment may send a letter
to an old friend describing this apartment as a configuration of rooms. Or a second
year student may explain the configuration of buildings on campus to a fresh first
year student.
Both kinds of place descriptions are challenged by the linear structure of
language. A description of a location has to refer to objects in the environment
that are in some two-dimensional relationship with the location to be specified (“the
café in Richmond”), or a three-dimensional relationship (“the key is on the living
room table, below the newspaper”), or even a relationship that realizes a temporal
dimension (“in front of the place where the café has been”). Here we can already
recognize that all references to relata are references to landmarks, and are assumed to
be known or recognizable by the recipient.
We call the located object the locatum. This object is located in relationship to
one or several known objects, the relata. The relata form the frame of reference to
locate the locatum. The speaker must assume knowledge of the relata to be shared
by the recipient, such that the recipient can re-establish the frame of reference in
their mind.
For example, in “Cartier is at place Vendôme” the jeweller’s shop Cartier is the
locatum and place Vendôme is the relatum. Variations of this form exist, of course.
For example, I may describe my location with “at home”, which at the surface
omits the locatum. Also, the schema describes only binary spatial relationships
between one locatum and one relatum, but there are also ternary (or n-ary) spatial
relationships between one locatum and two or more relata. “The fountain is between
church and city hall” is an example of a ternary relationship.
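The schema of locatum, spatial relationship and relata can be written down as a small data structure. The following Python sketch is one possible encoding under our own assumptions: the class name PlaceDescription and its field names are hypothetical, and an omitted locatum is represented by None.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class PlaceDescription:
    locatum: Optional[str]    # None when the surface form omits it ("at home")
    relation: str             # qualitative spatial relationship, e.g. "at", "between"
    relata: Tuple[str, ...]   # one or more reference objects (landmarks)

    @property
    def arity(self) -> int:
        """Number of objects related: the (possibly implicit) locatum plus all relata."""
        return 1 + len(self.relata)

cartier = PlaceDescription("Cartier", "at", ("place Vendôme",))
home = PlaceDescription(None, "at", ("home",))
fountain = PlaceDescription("fountain", "between", ("church", "city hall"))
```

In this encoding the first two descriptions are binary and the third is ternary. The relata are stored as a tuple so that n-ary relationships need no separate type.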
Human place descriptions almost exclusively use qualitative spatial or temporal
relationships to link a locatum with its relatum. While graphic languages (maps,
sketches) still convey some geometric meaning, verbal place descriptions typically
express order (“behind the library”) and even comparison (“a larger building”),
but rarely metric information. Typically it is “I am close to the intersection” rather
than “I am 30 m from the intersection”, and “to the right” instead of “at 85°”.
Several linguistic strategies have been identified helping with the linearization
challenge of place descriptions (e.g., [37]). One of them is a deliberate choice of
the speaker of a survey or a route perspective. In a survey perspective references
are put in the sequence in which a sketch would be drawn (“find the café north
of the library”). In a route perspective the linear sequence follows from mentally
following a route (“find the café after passing the library”). In both cases landmarks
(the library) help with localization. Hidden behind this observation is already a
glimpse of an alternative linguistic strategy for linearization, which is zooming
through hierarchies [169, 170, 194]. “The café is behind the library” is linking a
less salient object (café) with a more salient object (library), i.e., zooming out on
the salience hierarchy.
Both hierarchies, by salience as well as by spatial granularity, are suited to zoom
in or zoom out. A sentence like “Cartier is in Paris, at place Vendôme” is zooming
in through a hierarchy of spatial granularity, from the coarser Paris to the finer place
Vendôme. The interpretation is actually complex:
1. Cartier is in Paris.
2. Cartier is at place Vendôme.
3. Context suggests that this is the place Vendôme in Paris.
The sentence could also have been “Cartier is at place Vendôme in Paris”, zooming
out. Grammatically, however, the interpretation of relata in this sentence remains
ambiguous. It can be read in either of two ways:
1. Cartier is at place Vendôme, which is in Paris.
2. Cartier is in Paris, and also at place Vendôme.
In the former case the locatum is related to one relatum only, which in turn is related to
another relatum. In the latter case both reference objects are related to the locatum.
In the former case the spatial relationship between place Vendôme and Paris is given,
in the latter it must be inferred. Similarly, anaphora can introduce ambiguity. The
utterance “Cartier is at place Vendôme; it is in Paris” requires resolving whether
“it” refers to Cartier or place Vendôme. Western postal addresses are such place
descriptions, applying the hierarchical pattern: “Cartier is at place Vendôme, and
place Vendôme is in Paris”. The relatum at each level is located (or disambiguated for
localization) by the next coarser level.
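The two readings of “Cartier is at place Vendôme in Paris” differ only in which pairs of objects are explicitly related. A minimal Python sketch, with relations encoded as (subject, relation, object) triples and with the function names chained and flat chosen by us for illustration, makes the difference visible.

```python
def chained(locatum, fine, coarse):
    """Reading 1: the locatum relates to the finer relatum, which relates to the coarser one."""
    return {(locatum, "at", fine), (fine, "in", coarse)}

def flat(locatum, fine, coarse):
    """Reading 2: both relata relate directly to the locatum."""
    return {(locatum, "at", fine), (locatum, "in", coarse)}

r1 = chained("Cartier", "place Vendôme", "Paris")
r2 = flat("Cartier", "place Vendôme", "Paris")

shared = r1 & r2            # what both readings assert
given_only_in_r1 = r1 - r2  # given in reading 1, to be inferred in reading 2
```

Both readings share the triple locating Cartier at place Vendôme. The relationship between place Vendôme and Paris appears only in the chained reading, which is also the pattern that postal addresses apply.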
But why did the instruction giver find it necessary to add in Paris as a coarser
relatum to an already finer localization? After all, adding relata in a zoom-out
fashion cannot contribute to improving the localization precision like a zoom-in
structure does. However, it can improve the accuracy. An instruction giver does not
stop after at place Vendôme because she or he considers the utterance insufficient
for one of two reasons. One is that the instruction giver is not sure whether this finer
relatum is salient enough. In this case the better known Paris can trigger the memory
for a less known place Vendôme via a salience hierarchy. The other possible reason
is that the finer relatum is ambiguous. In this case Paris is used to disambiguate the
place Vendôme in Paris from any other place Vendôme elsewhere.
It has not yet been investigated how the two hierarchical organization
principles—cognitive salience and spatial granularity—are applied together in
language production [178]. For example, “Cartier is next to the Ritz” would have
been another valid place description, this time linking the location of the jeweller to
the presumably more well-known location of the Hotel Ritz. Both are institutions
characterized at the same level of spatial granularity, street address level. However,
one is considered more salient than the other, or used as an anchor point to locate
the other. Perhaps the speaker knows that the recipient has stayed at the Ritz
before. Whether in a particular communication situation the
speaker prefers to provide a hierarchical instruction by granularity or salience, is
part of the flexibility of language.
Mental spatial representations are not only formed by being exposed to an environ-
ment, but also by being exposed to secondary sources. All external representations
discussed above contribute to the formation of a mental spatial representation,
including printed maps or street signs.
The reading process of this mediated, second-hand information has many
parallels with the reading process during the first-hand, in-situ experience of the
environment. Neuroscientific research even supports the view that reading is an embodied
process, in the sense that comprehension involves simulations of an embodied
experience involving a reactivation of the reader’s perceptual, motor and affective
knowledge [26]. Thus, the in-situ experience is a reading process as well, by com-
prehension of a scene consisting of perception of properties and affordance, and an
effort of the mind to make sense of it. Mediated information affords to be integrated
3.5 Summary
Fig. 3.9 Types of spatial and temporal knowledge involved in orientation and wayfinding
References
22. Casakin, H., Barkowsky, T., Klippel, A., Freksa, C.: Schematic maps as wayfinding aids. In:
Freksa, C., Habel, C., Brauer, W., Wender, K.F. (eds.) Spatial Cognition II, Lecture Notes in
Artificial Intelligence, vol. 1849, pp. 54–71. Springer, Berlin (2000)
23. Chan, E., Baumann, O., Bellgrove, M.A., Mattingley, J.B.: From objects to landmarks: the
function of visual location information in spatial navigation. Front. Psychol. 3, 304 (11 pages)
(2012)
24. Chen, W.C., Battestini, A., Gelfand, N., Setlur, V.: Visual summaries of popular landmarks
from community photo collections. In: Xu, C., Steinbach, E., El Saddik, A., Zhou, M. (eds.)
17th ACM International Conference on Multimedia, pp. 789–792. ACM, Beijing (2009)
25. Cheng, K.: A purely geometric module in the rat’s spatial representation. Cognition 23(2),
149–178 (1986)
26. Chow, H.M., Mar, R.A., Xu, Y., Liu, S., Wagage, S., Braun, A.R.: Embodied comprehension
of stories: interactions between language regions and modality-specific neural systems.
J. Cogn. Neurosci. 26(2), 279–295 (2014)
27. Clark, A.: Microcognition: Philosophy, Cognitive Science and Parallel Distributed Process-
ing. MIT Press, Cambridge (1989)
28. Cohen, R. (ed.): The Development of Spatial Cognition. Lawrence Erlbaum Associates,
Hillsdale (1985)
29. Cornell, E.H., Heth, C.D., Alberts, D.M.: Place recognition and way finding by children and
adults. Mem. Cogn. 22(6), 633–643 (1994)
30. Couclelis, H.: Verbal directions for way-finding: space, cognition, and language. In: Portugali,
J. (ed.) The Construction of Cognitive Maps. GeoJournal Library, vol. 32, pp. 133–153.
Kluwer, Dordrecht (1996)
31. Couclelis, H., Golledge, R.G., Gale, N., Tobler, W.: Exploring the anchorpoint hypothesis of
spatial cognition. J. Environ. Psychol. 7(2), 99–122 (1987)
32. Cowan, N.: The magical number 4 in short-term memory: a reconsideration of mental storage
capacity. Behav. Brain Sci. 24(1), 87–114 (2001)
33. Cuayáhuitl, H., Dethlefs, N., Richter, K.F., Tenbrink, T., Bateman, J.: A dialogue system for
indoor way-finding using text-based natural language. Int. J. Comput. Linguist. Appl. 1(1–2),
285–304 (2010)
34. Dabbs, J.M., Chang, E.L., Strong, R.A., Milun, R.: Spatial ability, navigation strategy, and
geographic knowledge among men and women. Evol. Hum. Behav. 19(2), 89–98 (1998)
35. Dallal, N.L., Meck, W.H.: Hierarchical structures: chunking by food type facilitates spatial
memory. J. Exp. Psychol. Anim. Behav. Process. 16(1), 69–84 (1990)
36. Damasio, A.: Self Comes to Mind: Constructing the Conscious Brain. Vintage Books, New
York (2010)
37. Daniel, M.P., Carite, L., Denis, M.: Modes of linearization in the description of spatial
configurations. In: Portugali, J. (ed.) The Construction of Cognitive Maps. GeoJournal
Library, vol. 32, pp. 297–318. Kluwer Academic, Dordrecht (1996)
38. Daniel, M.P., Tom, A., Manghi, E., Denis, M.: Testing the value of route directions through
navigational performance. Spatial Cognit. Comput. 3(4), 269–289 (2003)
39. Denis, M.: The description of routes: a cognitive approach to the production of spatial
discourse. Curr. Psychol. Cognit. 16(4), 409–458 (1997)
40. Denis, M., Pazzaglia, F., Cornoldi, C., Bertolo, L.: Spatial discourse and navigation: an
analysis of route directions in the city of Venice. Appl. Cognit. Psychol. 13(2), 145–174
(1999)
41. Descartes, R.: Discours de la methode pour bien conduire sa raison, et chercher la verite dans
les sciences. Plus La dioptrique. Les meteores. Et La geometrie. Qui sont des essais de cete
methode. Ian Maire, Leyden (1637)
42. Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1, 269–271
(1959)
43. Downs, R.M., Stea, D.: Image and Environment. Aldine Publishing Company, Chicago (1973)
44. Downs, R.M., Stea, D.: Maps in Minds: Reflections on Cognitive Mapping. Harper and Row,
New York (1977)
45. Duckham, M., Kulik, L.: “Simplest” paths: automated route selection for navigation. In:
Kuhn, W., Worboys, M.F., Timpf, S. (eds.) Spatial Information Theory, Lecture Notes in
Computer Science, vol. 2825, pp. 169–185. Springer, Berlin (2003)
46. Dudchenko, P.A.: Why People Get Lost: The Psychology and Neuroscience of Spatial
Cognition. Oxford University Press, Oxford (2010)
47. Eco, U., Rorty, R., Culler, J., Brooke-Rose, C.: Interpretation and Overinterpretation. Cambridge
University Press, Cambridge, UK (1992)
48. Ehrenfels, C.v.: Über Gestaltqualitäten. Vierteljahresschrift für wissenschaftliche Philosophie
14, 249–292 (1890)
49. Ekstrom, A.D., Kahana, M.J., Caplan, J.B., Fields, T.A., Isham, E.A., Newman, E.L., Fried, I.:
Cellular networks underlying human spatial navigation. Nature 425(6954), 184–188 (2003)
50. Etienne, A.S., Jeffery, K.J.: Path integration in mammals. Hippocampus 14(2), 180–192
(2004)
51. Evans, G.W., Brennan, P.L., Skorpanich, M.A., Held, D.: Cognitive mapping and elderly
adults: verbal and location memory for urban landmarks. J. Gerontology 39(4), 452–457
(1984)
52. Forbus, K.D., Usher, J., Lovett, A., Lockwood, K., Wetzel, J.: Cogsketch: sketch understand-
ing for cognitive science research and for education. Top. Cognit. Sci. 3(4), 648–666 (2011)
53. van Fraassen, B.C.: Literate experience: the [de-, re-] interpretation of nature. Versus
85/86/87, 331–358 (2000)
54. Frank, A.U.: Pragmatic information content: how to measure the information in a route
description. In: Duckham, M., Goodchild, M.F., Worboys, M. (eds.) Foundations in Geo-
graphic Information Science, pp. 47–68. Taylor & Francis, London (2003)
55. French, R.M.: Moving beyond the Turing test. Commun. ACM 55(12), 74–77 (2012)
56. Freundschuh, S.M., Egenhofer, M.J.: Human conceptions of spaces: implications for geo-
graphic information systems. Trans. GIS 2(4), 361–375 (1997)
57. Frisch, K.v.: The Dancing Bees: An Account of the Life and Senses of the Honey Bee.
Methuen, London (1954)
58. Frisch, K.v.: The Dance Language and Orientation of Bees. Harvard University Press,
Cambridge (1993)
59. Fyhn, M., Molden, S., Witter, M.P., Moser, E.I., Moser, M.B.: Spatial representation in the
entorhinal cortex. Science 305(5688), 1258–1264 (2004)
60. Gardner, H.: Frames of Mind: The Theory of Multiple Intelligences, 3rd edn. Basic
Books, New York (2011)
61. Gärling, T., Lindberg, E., Böök, A.: Cognitive mapping of large-scale environments: the
interrelationship of action plans, acquisition, and orientation. Environ. Behav. 16(1), 3–34
(1984)
62. Gärling, T., Böök, A., Lindberg, E.: Adults’ memory representations of the spatial properties
of their everyday physical environment. In: Cohen, R. (ed.) The Development of Spatial
Cognition, pp. 141–184. Lawrence Erlbaum Associates, Hillsdale (1985)
63. Gibson, J.J.: The Ecological Approach to Visual Perception. Houghton Mifflin Company,
Boston (1979)
64. Gigerenzer, G., Todd, P.M., ABC Research Group (eds.): Simple Heuristics That Make Us Smart.
Evolution and Cognition. Oxford University Press, New York (1999)
65. Giocomo, L.M., Moser, M.B., Moser, E.I.: Computational models of grid cells. Neuron 71(4),
589–603 (2011)
66. Golledge, R.G.: Place recognition and wayfinding: making sense of space. Geoforum 23(2),
199–214 (1992)
67. Golledge, R.G.: Human wayfinding and cognitive maps. In: Golledge, R.G. (ed.) Wayfinding
Behavior, pp. 5–45. The Johns Hopkins University Press, Baltimore (1999)
68. Golledge, R.G. (ed.): Wayfinding Behavior: Cognitive Mapping and Other Spatial Processes.
The Johns Hopkins University Press, Baltimore (1999)
69. Golledge, R.G., Stimson, R.J.: Spatial Behavior: A Geographic Perspective. The Guilford
Press, New York (1997)
70. Gopnik, A.: The theory theory as an alternative to the innateness hypothesis. In: Antony, L.M.,
Hornstein, N. (eds.) Chomsky and His Critics, pp. 238–254. Blackwell Publishing Ltd, New
York (2003)
71. Gopnik, A.: Causality. In: Zelazo, P.D. (ed.) The Oxford Handbook of Developmental
Psychology, vol. 1. Oxford University Press, Oxford (2013)
72. Gould, J.L., Able, K.P.: Human homing: an elusive phenomenon. Science 212(4498),
1061–1063 (1981)
73. Guilford, J.P., Zimmerman, W.S.: The Guilford-Zimmerman aptitude survey. J. Appl. Psy-
chol. 32(1), 24–34 (1948)
74. Hafting, T., Fyhn, M., Molden, S., Moser, M.B., Moser, E.I.: Microstructure of a spatial map
in the entorhinal cortex. Nature 436(7052), 801–806 (2005)
75. Hartley, T., Maguire, E.A., Spiers, H.J., Burgess, N.: The well-worn route and the path less
traveled: distinct neural bases of route following and wayfinding in humans. Neuron 37(5),
877–888 (2003)
76. Hartley, T., Trinkler, I., Burgess, N.: Geometric determinants of human spatial memory.
Cognition 94(1), 39–75 (2004)
77. Heft, H.: Way-finding as the perception of information over time. Popul. Environ. 6(3),
133–150 (1983)
78. Hegarty, M., Waller, D.: Individual differences in spatial abilities. In: Shah, P., Miyake,
A. (eds.) The Cambridge Handbook of Visuospatial Thinking, pp. 121–169. Cambridge
University Press, Cambridge (2005)
79. Hegarty, M., Richardson, A.E., Montello, D.R., Lovelace, K.L., Subbiah, I.: Development of
a self-report measure of environmental spatial ability. Intelligence 30(5), 425–447 (2002)
80. Hegarty, M., Montello, D.R., Richardson, A.E., Ishikawa, T., Lovelace, K.L.: Spatial abilities
at different scales: individual differences in aptitude-test performance and spatial-layout
learning. Intelligence 34(2), 151–176 (2006)
81. Herskovits, A.: Language and Spatial Cognition. Cambridge University Press, Cambridge
(1986)
82. Heye, C., Rüetschi, U.J., Timpf, S.: Komplexität von Routen in öffentlichen Verkehrssys-
temen. In: Strobl, J., Blaschke, T., Griesebner, G. (eds.) Angewandte Geographische
Informationsverarbeitung XV, pp. 159–168. Wichmann, Heidelberg (2003)
83. Hirtle, S.C., Jonides, J.: Evidence of hierarchies in cognitive maps. Mem. Cognit. 13(3), 208–
217 (1985)
84. Hobbs, J.R.: Granularity. In: Joshi, A.K. (ed.) Proceedings of the 9th International Joint
Conference on Artificial Intelligence, pp. 432–435. Morgan Kaufmann, Los Angeles (1985)
85. Hochmair, H.: Investigating the effectiveness of the least-angle strategy for wayfinding in
unknown street networks. Environ. Plann. B Plann. Des. 32(5), 673–691 (2005)
86. Hochmair, H., Frank, A.U.: Influence of estimation errors on wayfinding-decisions in
unknown street networks: analyzing the least-angle strategy. Spat. Cognit. Comput. 2(4), 283–
313 (2002)
87. Ishikawa, T., Montello, D.R.: Spatial knowledge acquisition from direct experience in
the environment: individual differences in the development of metric knowledge and the
integration of separately learned places. Cognit. Psychol. 52(2), 93–129 (2006)
88. Ishikawa, T., Nakamura, U.: Landmark selection in the environment: relationships with object
characteristics and sense of direction. Spat. Cognit. Comput. 12(1), 1–22 (2012)
89. Ishikawa, T., Fujiwara, H., Imai, O., Okabe, A.: Wayfinding with a GPS-based mobile
navigation system: a comparison with maps and direct experience. J. Environ. Psychol. 28,
74–82 (2008)
90. Janelle, D.G.: Impact of information technologies. In: Hanson, S., Giuliano, G. (eds.) The
Geography of Urban Transportation, pp. 86–112. Guilford Press, New York (2004)
91. Janzen, G., Jansen, C., Turennout, M.v.: Memory consolidation of landmarks in good
navigators. Hippocampus 18(1), 40–47 (2008)
92. Janzen, G., Turennout, M.v.: Selective neural representation of objects relevant for navigation.
Nat. Neurosci. 7(6), 673–677 (2004)
93. Johnson-Laird, P.N.: Mental Models: Towards a Cognitive Science of Language, Inference
and Consciousness. Cambridge University Press, Cambridge, UK (1983)
94. Kahneman, D.: Thinking, Fast and Slow. Farrar, Straus and Giroux, New York (2011)
95. Kaplan, S., Kaplan, R.: Cognition and Environment: Functioning in an Uncertain World.
Praeger, New York (1983)
96. Keeton, W.T.: Magnets interfere with pigeon homing. Proc. Natl. Acad. Sci. 68(1), 102–106
(1971)
97. Kelly, J.W., McNamara, T.P., Bodenheimer, B., Carr, T.H., Rieser, J.J.: The shape of human
navigation: how environmental geometry is used in maintenance of spatial orientation.
Cognition 109(2), 281–286 (2008)
98. Kelly, J.W., McNamara, T.P., Bodenheimer, B., Carr, T.H., Rieser, J.J.: Individual differences
in using geometric and featural cues to maintain spatial orientation: cue quantity and cue
ambiguity are more important than cue type. Psychonomic Bull. Rev. 16(1), 176–181 (2009)
99. Klatzky, R.L.: Allocentric and egocentric spatial representations: definitions, distinctions, and
interconnections. In: Freksa, C., Habel, C., Wender, K.F. (eds.) Spatial Cognition, Lecture
Notes in Artificial Intelligence, vol. 1404, pp. 1–17. Springer, Berlin (1998)
100. Klein, W.: Wegauskünfte. Zeitschrift für Literaturwissenschaft und Linguistik 33, 9–57
(1979)
101. Kleinfeld, J.: Visual memory in village Eskimo and urban Caucasian children. Arctic 24(2),
132–138 (1971)
102. Klippel, A., Montello, D.R.: Linguistic and non-linguistic turn direction concepts. In: Winter,
S., Duckham, M., Kulik, L., Kuipers, B. (eds.) Spatial Information Theory, Lecture Notes in
Computer Science, vol. 4736, pp. 354–372. Springer, Berlin (2007)
103. Klippel, A., Tappe, H., Habel, C.: Pictorial representations of routes: chunking route segments
during comprehension. In: Freksa, C., Brauer, W., Habel, C., Wender, K.F. (eds.) Spatial
Cognition III, Lecture Notes in Artificial Intelligence, vol. 2685, pp. 11–33. Springer, Berlin
(2003)
104. Knauff, M.: Space to Reason: A Spatial Theory of Human Thought. MIT Press, Cambridge
(2013)
105. Knauff, M., Ragni, M.: Cross-cultural preferences in spatial reasoning. J. Cognit. Cult. 11(1),
1–21 (2011)
106. Kozhevnikov, M., Hegarty, M.: A dissociation between object manipulation spatial ability and
spatial orientation ability. Mem. Cognit. 29(5), 745–756 (2001)
107. Kuhn, W.: Ontology of landscape in language. In: Mark, D.M., Turk, A.G., Burenhult, N.,
Stea, D. (eds.) Landscape in Language: Transdisciplinary Perspectives. Culture and Language
Use, vol. 4, pp. 369–379. John Benjamins Publishing Company, Philadelphia (2011)
108. Kuhn, W., Frank, A.U.: A formalization of metaphors and image-schemas in user interfaces.
In: Mark, D.M., Frank, A.U. (eds.) Cognitive and Linguistic Aspects of Geographic Space.
NATO ASI Series D: Behavioural and Social Sciences, vol. 63, pp. 419–434. Kluwer
Academic Publishers, Dordrecht (1991)
109. Kuipers, B.J.: Modeling spatial knowledge. Cognit. Sci. 2(2), 129–153 (1978)
110. Kuipers, B.J.: On representing commonsense knowledge. In: Findler, N.V. (ed.) Associative
Networks: Representation and Use of Knowledge by Computers, pp. 393–408. Academic,
New York (1979)
111. Lakoff, G.: Women, Fire, and Dangerous Things: What Categories Reveal About the Mind.
The University of Chicago Press, Chicago (1987)
112. Lakoff, G., Johnson, M.: Metaphors We Live By. The University of Chicago Press, Chicago
(1980)
113. Lakoff, G., Johnson, M.: Philosophy in the Flesh: The Embodied Mind and Its Challenge to
Western Thought. Basic Books, New York (1999)
114. Landau, B., Jackendoff, R.: “What” and “where” in spatial language and spatial cognition.
Behav. Brain Sci. 16(2), 217–238 (1993)
115. Larkin, J.H., Simon, H.A.: Why a diagram is (sometimes) worth ten thousand words. Cognit.
Sci. 11(1), 65–100 (1987)
116. Lee, P.U., Tversky, B.: Interplay between visual and spatial: the effect of landmark descriptions
on comprehension of route/survey spatial descriptions. Spat. Cognit. Comput. 5(2–3),
163–185 (2005)
117. Lee, P.U., Tappe, H., Klippel, A.: Acquisition of landmark knowledge from static and dynamic
presentation of route maps. In: 24th Annual Meeting of the Cognitive Science Society. George
Mason University, Fairfax, Virginia (2002)
118. Levelt, W.J.M.: Perspective taking and ellipsis in spatial descriptions. In: Bloom, P., Peterson,
M.A., Nadel, L., Garrett, M.F. (eds.) Language and Space, pp. 77–108. The MIT Press,
Cambridge (1996)
119. Levine, M., Jankovic, I.N., Palij, M.: Principles of spatial problem solving. J. Exp. Psychol.
Gen. 111(2), 157–175 (1982)
120. Levinson, S.C.: Space in Language and Cognition. Cambridge University Press, Cambridge
(2003)
121. Lewicki, P., Hill, T., Czyzewska, M.: Nonconscious acquisition of information. Am. Psychol.
47(6), 796–801 (1992)
122. Liben, L.S., Myers, L.J.: Developmental changes in children’s understanding of maps:
What, when, and how? In: Plumert, J.M., Spencer, J.P. (eds.) The Emerging Spatial Mind,
pp. 193–218. Oxford University Press, Oxford (2007)
123. Likert, R.: A technique for the measurement of attitudes. Archives Psychol. 140, 1–55 (1932)
124. Lloyd, R., Patton, D., Cammack, R.: Basic-level geographic categories. Prof. Geogr. 48(2),
181–194 (1996)
125. Loomis, J.M., Klatzky, R.L., Golledge, R.G., Cicinelli, J.G., Pellegrino, J.W., Fry, P.A.:
Nonvisual navigation by blind and sighted: assessment of path integration ability. J. Exp.
Psychol. Gen. 122(1), 73–91 (1993)
126. Loomis, J.M., Klatzky, R.L., Golledge, R.G., Philbeck, J.W.: Human navigation by path
integration. In: Golledge, R.G. (ed.) Wayfinding Behavior, pp. 125–151. The Johns Hopkins
University Press, Baltimore (1999)
127. Lovelace, K.L., Hegarty, M., Montello, D.R.: Elements of good route directions in familiar
and unfamiliar environments. In: Freksa, C., Mark, D.M. (eds.) Spatial Information Theory,
Lecture Notes in Computer Science, vol. 1661, pp. 65–82. Springer, Berlin (1999)
128. Lynch, K.: The Image of the City. The MIT Press, Cambridge (1960)
129. Maguire, E.A., Burgess, N., Donnett, J.G., Frackowiak, R.S.J., Frith, C.D., O’Keefe, J.:
Knowing where and getting there: a human navigation network. Science 280(5365), 921–924
(1998)
130. Maguire, E.A., Gadian, D.G., Johnsrude, I.S., Good, C.D., Ashburner, J., Frackowiak, R.S.J.,
Frith, C.D.: Navigation-related structural change in the hippocampi of taxi drivers. Proc. Natl.
Acad. Sci. 97(8), 4398–4403 (2000)
131. Mandler, J.M.: Representation. In: Damon, W. (ed.) Handbook of Child Psychology, vol. 2,
pp. 255–308. Wiley, Hoboken (1998)
132. Mandler, J.M.: On the spatial foundations of the conceptual system and its enrichment.
Cognit. Sci. 36(3), 421–451 (2012)
133. Mark, D.M.: Finding simple routes: “ease of description” as an objective function in
automated route selection. In: Second Symposium on Artificial Intelligence Applications, pp.
577–581. IEEE, Miami Beach (1985)
134. Mark, D.M.: Toward a theoretical framework for geographic entity types. In: Frank, A.U.,
Campari, I. (eds.) Spatial Information Theory, Lecture Notes in Computer Science, vol. 716,
pp. 270–283. Springer, Berlin (1993)
135. Mark, D.M., Egenhofer, M.J.: Calibrating the meanings of spatial predicates from natural
language: Line-region relations. In: Waugh, T.C., Healey, R.G. (eds.) Advances in GIS
Research: Proceedings of 6th International Symposium on Spatial Data Handling, pp.
538–553. Edinburgh (1994)
136. Mark, D.M., Turk, A.G.: Landscape categories in Yindjibarndi. In: Kuhn, W., Worboys, M.F.,
Timpf, S. (eds.) Spatial Information Theory, Lecture Notes in Computer Science, vol. 2825,
pp. 28–45. Springer, Berlin (2003)
137. Mark, D.M., Turk, A.G., Stea, D.: Progress on Yindjibarndi ethnophysiography. In: Winter,
S., Duckham, M., Kulik, L., Kuipers, B. (eds.) Spatial Information Theory, Lecture Notes in
Computer Science, vol. 4736, pp. 1–19. Springer, Berlin (2007)
138. Marr, D.: Vision. W. H. Freeman and Company, New York (1982)
139. Mathis, A., Herz, A.V.M., Stemmler, M.B.: Resolution of nested neuronal representations can
be exponential in the number of neurons. Phys. Rev. Lett. 109(1), 018103 (2012)
140. McGee, M.G.: Human Spatial Abilities: Sources of Sex Differences. Praeger, New York
(1979)
141. McNamara, T.P.: Mental representations of spatial relations. Cognit. Psychol. 18(1), 87–121
(1986)
142. Michon, P.E., Denis, M.: When and why are visual landmarks used in giving directions? In:
Montello, D.R. (ed.) Spatial Information Theory, Lecture Notes in Computer Science, vol.
2205, pp. 292–305. Springer, Berlin (2001)
143. Milgram, S., Jodelet, D.: Psychological maps of Paris. In: Proshansky, H.M., Ittelson, W.H.,
Rivlin, L. (eds.) Environmental Psychology: People and Their Physical Settings, 2nd edn., pp.
104–124. Holt, Rinehart and Winston, New York (1976)
144. Miller, G.A.: The magical number seven, plus or minus two: some limits on our capacity for
processing information. Psychol. Rev. 63, 81–97 (1956)
145. Mittelstaedt, M.L., Mittelstaedt, H.: Homing by path integration in a mammal. Naturwis-
senschaften 67(11), 566–567 (1980)
146. Montello, D.R.: Scale and multiple psychologies of space. In: Frank, A.U., Campari, I. (eds.)
Spatial Information Theory, Lecture Notes in Computer Science, vol. 716, pp. 312–321.
Springer, Berlin (1993)
147. Montello, D.R.: A new framework for understanding the acquisition of spatial knowledge in
large-scale environments. In: Egenhofer, M.J., Golledge, R.G. (eds.) Spatial and Temporal
Reasoning in Geographic Information Systems, chap. 11, pp. 143–154. Oxford University
Press, New York (1998)
148. Montello, D.R.: Spatial cognition. In: Smelser, N.J., Baltes, P.B. (eds.) International Ency-
clopedia of the Social and Behavioral Sciences, pp. 14,771–14,775. Pergamon Press, Oxford
(2001)
149. Montello, D.R.: Navigation. In: Shah, P., Miyake, A. (eds.) The Cambridge Handbook of Visuospatial
Thinking, pp. 257–294. Cambridge University Press, Cambridge (2005)
150. Montello, D.R.: You are where? The function and frustration of you-are-here (YAH) maps.
Spat. Cognit. Comput. 10(2), 94–104 (2010)
151. Montello, D.R., Raubal, M.: Functions and applications of spatial cognition. In: Waller,
D., Nadel, L. (eds.) Handbook of Spatial Cognition, pp. 249–264. American
Psychological Association, Washington, DC (2013)
152. Müller, M., Wehner, R.: Path integration in desert ants, Cataglyphis fortis. Proc. Natl. Acad.
Sci. 85(14), 5287–5290 (1988)
153. Munnich, E., Landau, B., Dosher, B.A.: Spatial language and spatial representation: a cross-
linguistic comparison. Cognition 81(3), 171–208 (2001)
154. Newcombe, N., Frick, A.: Early education for spatial intelligence: why, what, and how. Mind
Brain Educ. 4(3), 102–111 (2010)
155. Newcombe, N., Huttenlocher, J.: Development of spatial cognition. In: Damon, W., Lerner,
R.M. (eds.) Handbook of Child Psychology: Theoretical Models of Human Development,
vol. 2, 6th edn., pp. 734–776. Wiley, Hoboken (2006)
156. Newcombe, N., Uttal, D.H., Sauter, M.: Spatial development. In: Zelazo, P.D. (ed.) The
Oxford Handbook of Developmental Psychology, vol. 1. Oxford University Press, Oxford
(2013)
157. Ogden, C.K., Richards, I.A.: The Meaning of Meaning: A Study of the Influence of Language
Upon Thought and of the Science of Symbolism. Routledge & Kegan Paul, London (1923)
158. O’Keefe, J., Dostrovsky, J.: The hippocampus as a spatial map: preliminary evidence from
unit activity in the freely-moving rat. Brain Res. 34(1), 171–175 (1971)
159. O’Keefe, J., Nadel, L.: The Hippocampus as a Cognitive Map.
Clarendon Press, Oxford (1978)
160. Olson, D.L., Bialystok, E.: Spatial Cognition. Child Psychology. Lawrence Erlbaum Asso-
ciates, Hillsdale (1983)
161. O’Regan, J.K.: Why Red Doesn’t Sound Like a Bell: Understanding the Feel of Conscious-
ness. Oxford University Press, New York (2011)
162. Parush, A., Berman, D.: Navigation and orientation in 3D user interfaces: the impact of
navigation aids and landmarks. Int. J. Hum. Comput. Stud. 61(3), 375–395 (2004)
163. Pazzaglia, F., De Beni, R.: Strategies of processing spatial information in survey and
landmark-centred individuals. Eur. J. Cognit. Psychol. 13(4), 493–508 (2001)
164. Pazzaglia, F., Meneghetti, C.: Spatial text processing in relation to spatial abilities and spatial
styles. J. Cognit. Psychol. 24(8), 972–980 (2012)
165. Piaget, J.: Studies in Reflecting Abstraction. Psychology Press, London, UK (2000)
166. Piaget, J., Inhelder, B.: The Child’s Conception of Space. Routledge & Kegan Paul, London
(1956)
167. Pick, H., Acredolo, L. (eds.): Spatial Orientation: Theory, Research, and Application. Plenum
Press, New York (1983)
168. Pinker, S.: The Language Instinct: How the Mind Creates Language, 2nd edn. Harper
Perennial Modern Classics, New York (2007)
169. Plumert, J.M., Carswell, C., DeVet, K., Ihrig, D.: The content and organization of communi-
cation about object locations. J. Mem. Lang. 34, 477–498 (1995)
170. Plumert, J.M., Spalding, T.L., Nichols-Whitehead, P.: Preferences for ascending and descend-
ing hierarchical organization in spatial communication. Mem. Cognit. 29(2), 274–284 (2001)
171. Purves, D., Lotto, R.B.: Why We See What We Do Redux. Sinauer Associates, Inc.,
Sunderland (2011)
172. Raubal, M., Egenhofer, M.: Comparing the complexity of wayfinding tasks in built environ-
ments. Environ. Plann. B 25(6), 895–913 (1998)
173. Redish, A.D.: Beyond the Cognitive Map: From Place Cells to Episodic Memory. The MIT
Press, Cambridge (1999)
174. Redish, A.D., Ekstrom, A.D.: Hippocampus and related areas: what the place cell literature
tells us about cognitive maps in rats and humans. In: Waller, D., Nadel, L. (eds.) Handbook of
Spatial Cognition, pp. 15–34. American Psychological Association, Washington, DC (2013)
175. Relph, E.C.: Place and Placelessness. Pion Ltd., London (1976)
176. Richter, K.F., Winter, S.: Harvesting user-generated content for semantic spatial information:
the case of landmarks in OpenStreetMap. In: Hock, B. (ed.) Proceedings of the Surveying
and Spatial Sciences Biennial Conference 2011, pp. 75–86. Surveying and Spatial Sciences
Institute, Wellington (2011)
177. Richter, D., Richter, K.F., Winter, S.: The impact of classification approaches on the detection
of hierarchies in place descriptions. In: Vandenbroucke, D., Bucher, B., Crompvoets, J. (eds.)
Geographic Information Science at the Heart of Europe, Lecture Notes in Geoinformation
and Cartography, pp. 191–206. Springer, Berlin (2013)
178. Richter, D., Vasardani, M., Stirling, L., Richter, K.F., Winter, S.: Zooming in–zooming out:
hierarchies in place descriptions. In: Krisp, J.M. (ed.) Progress in Location-Based Services,
Lecture Notes in Geoinformation and Cartography. Springer, Berlin (2013)
179. Rieser, J.J.: Spatial orientation of six-month-old infants. Child Dev. 50(4), 1078–1087 (1979)
180. Rosch, E., Mervis, C.B., Gray, W.D., Johnson, D.M., Boyes-Braem, P.: Basic objects in
natural categories. Cognit. Psychol. 8(3), 382–439 (1976)
181. Sacks, O.: The Man Who Mistook His Wife for a Hat. Picador, London (1985)
182. Sadalla, E.K., Magel, S.: The perception of traversed distance. Environ. Behav. 12(1), 65–79
(1980)
183. Sadalla, E.K., Staplin, L.J.: An information storage model for distance cognition. Environ.
Behav. 12(2), 183–193 (1980)
184. Sadalla, E.K., Burroughs, J., Staplin, L.J.: Reference points in spatial cognition. J. Exp.
Psychol. Hum. Learn. Mem. 6(5), 516–528 (1980)
185. Samet, H.: The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading
(1990)
186. Sanchez, C.A., Branaghan, R.J.: The interaction of map resolution and spatial abilities on
map learning. Int. J. Hum. Comput. Stud. 67(5), 475–481 (2009)
187. Schegloff, E.A.: Notes on a conversational practice: formulating place. In: Sudnow, D. (ed.)
Studies in Social Interaction, vol. 75, pp. 75–119. MacMillan, New York (1972)
188. Schelling, T.C.: The Strategy of Conflict. Harvard University Press, Cambridge (1960)
189. Schmid, F., Richter, K.F., Peters, D.: Route aware maps: multigranular wayfinding assistance.
Spat. Cognit. Comput. 10(2), 184–206 (2010)
190. Schmid, F., Kuntzsch, C., Winter, S., Kazerani, A., Preisig, B.: Situated local and global
orientation in mobile you-are-here maps. In: de Sa, M., Carrico, L., Correia, N. (eds.) 12th
International Conference on Human Computer Interaction with Mobile Devices and Services
(MobileHCI), pp. 83–92. ACM Press, Lisbon (2010)
191. Schneider, G.E.: Two visual systems. Science 163(3870), 895–902 (1969)
192. Searle, J.R.: Speech Acts. Cambridge University Press, Cambridge (1969)
193. Searle, J.R.: Minds, brains, and programs. Behav. Brain Sci. 3, 417–424 (1980)
194. Shanon, B.: Where questions. In: 17th Annual Meeting of the Association for Computational
Linguistics. ACL, University of California at San Diego, La Jolla (1979)
195. Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of
Illinois Press, Chicago (1949)
196. Shelton, A.L., Gabrieli, J.D.E.: Neural correlates of encoding space from route and survey
perspectives. J. Neurosci. 22(7), 2711–2717 (2002)
197. Shepard, R.N., Metzler, J.: Mental rotation of three-dimensional objects. Science 171(3972),
701–703 (1971)
198. Siegel, A.W., White, S.H.: The development of spatial representations of large-scale envi-
ronments. In: Reese, H. (ed.) Advances in Child Development and Behaviour, pp. 9–55.
Academic, New York (1975)
199. Silverman, I., Eals, M.: Sex differences in spatial abilities: evolutionary theory and data. In:
Barkow, J.H., Cosmides, L., Tooby, J. (eds.) The Adapted Mind: Evolutionary Psychology
and the Generation of Culture, pp. 533–549. Oxford University Press, New York (1992)
200. Smith, B. (ed.): Foundations of Gestalt Theory. Philosophia Resources Library. Philosophia
Verlag, Munich (1988)
201. Sorrows, M.E., Hirtle, S.C.: The nature of landmarks for real and electronic spaces. In: Freksa,
C., Mark, D.M. (eds.) Spatial Information Theory, Lecture Notes in Computer Science, vol.
1661, pp. 37–50. Springer, Berlin (1999)
202. Sperber, D., Wilson, D.: Relevance: Communication and Cognition. Basil Blackwell, Oxford
(1986)
203. Spiers, H.J., Maguire, E.A.: A navigational guidance system in the human brain. Hippocam-
pus 17(8), 618–626 (2007)
204. Steck, S.D., Mallot, H.A.: The role of global and local landmarks in virtual environment
navigation. Presence 9(1), 69–83 (2000)
205. Stevens, Q.: The shape of urban experience: a reevaluation of Lynch’s five elements. Environ.
Plann. B Plann. Des. 33(6), 803–823 (2006)
206. Stevens, A., Coupe, P.: Distortions in judged spatial relations. Cognit. Psychol. 10(4),
422–437 (1978)
207. Streeter, L.A., Vitello, D., Wonsiewicz, S.A.: How to tell people where to go: comparing
navigational aids. Int. J. Man Mach. Stud. 22(5), 549–562 (1985)
208. Talmy, L.: How language structures space. In: Pick, H., Acredolo, L. (eds.) Spatial Orientation:
Theory, Research, and Application, pp. 225–282. Plenum Press, New York (1983)
209. Taube, J.S., Muller, R.U., Ranck, J.B.: Head-direction cells recorded from the postsubiculum
in freely moving rats. I. Description and quantitative analysis. J. Neurosci. 10(2), 420–435
(1990)
210. Taube, J.S., Muller, R.U., Ranck, J.B.: Head-direction cells recorded from the postsubiculum
in freely moving rats. II. Effects of environmental manipulations. J. Neurosci. 10(2), 436–447
(1990)
211. Taylor, H.A., Tversky, B.: Descriptions and depictions of environments. Mem. Cognit. 20(5),
483–496 (1992)
212. Taylor, H.A., Tversky, B.: Spatial mental models derived from survey and route descriptions.
J. Mem. Lang. 31(2), 261–292 (1992)
213. Tenbrink, T., Winter, S.: Variable granularity in route directions. Spat. Cognit. Comput. 9(1),
64–93 (2009)
214. Thorndyke, P.W.: Distance estimation from cognitive maps. Cognit. Psychol. 13(4), 526–550
(1981)
215. Timpf, S., Frank, A.U.: Using hierarchical spatial data structures for hierarchical spatial
reasoning. In: Hirtle, S.C., Frank, A.U. (eds.) Spatial Information Theory, Lecture Notes in
Computer Science, vol. 1329, pp. 69–83. Springer, Berlin (1997)
216. Tolman, E.C.: Cognitive maps in rats and men. Psychol. Rev. 55(4), 189–208 (1948)
217. Tomko, M., Winter, S.: Describing the functional spatial structure of urban environments.
Comput. Environ. Urban Syst. 41, 177–187 (2013)
218. Trowbridge, C.C.: On fundamental methods of orientation and “imaginary maps”. Science
38(990), 888–897 (1913)
219. Tse, D., Langston, R.F., Kakeyama, M., Bethus, I., Spooner, P.A., Wood, E.R., Witter, M.P.,
Morris, R.G.M.: Schemas and memory consolidation. Science 316(5821), 76–82 (2007)
220. Tversky, A., Kahneman, D.: Judgement under uncertainty: heuristics and biases. Science
185(4157), 1124–1131 (1974)
221. Tversky, B.: Cognitive maps, cognitive collages, and spatial mental models. In: Frank, A.U.,
Campari, I. (eds.) Spatial Information Theory, Lecture Notes in Computer Science, vol. 716,
pp. 14–24. Springer, Heidelberg (1993)
222. Tversky, B., Hard, B.M.: Embodied and disembodied cognition: spatial perspective-taking.
Cognition 110(1), 124–129 (2009)
223. Tversky, B., Lee, P.U.: How space structures language. In: Freksa, C., Habel, C., Wender,
K.F. (eds.) Spatial Cognition, Lecture Notes in Artificial Intelligence, vol. 1404, pp. 157–175.
Springer, Berlin (1998)
224. Twaroch, F.: Sandbox geography. Ph.D. thesis, Technical University Vienna (2007)
225. Vandenberg, S.G., Kuse, A.R.: Mental rotations, a group test of three-dimensional spatial
visualization. Percept. Mot. Skills 47(2), 599–604 (1978)
226. Vanetti, E.J., Allen, G.L.: Communicating environmental knowledge: the impact of verbal
and spatial abilities on the production and comprehension of route directions. Environ. Behav.
20(6), 667–682 (1988)
227. Varela, F.J., Thompson, E., Rosch, E.: The Embodied Mind: Cognitive Science and the
Human Experience. The MIT Press, Cambridge (1991)
228. Vasardani, M., Timpf, S., Winter, S., Tomko, M.: From descriptions to depictions: a concep-
tual framework. In: Tenbrink, T., Stell, J., Galton, A., Wood, Z. (eds.) Spatial Information
Theory, Lecture Notes in Computer Science, vol. 8116, pp. 299–319. Springer, Cham (2013)
229. Vygotsky, L.S.: Thought and Language. MIT Press, Cambridge (1986)
230. Walker, M.M., Dennis, T.E., Kirschvink, J.L.: The magnetic sense and its use in long-distance
navigation by animals. Curr. Opin. Neurobiol. 12(6), 735–744 (2002)
231. Waller, D., Nadel, L. (eds.): Handbook of Spatial Cognition. American Psychological
Association, Washington, DC (2013)
232. Wang, J., Schwering, A.: The accuracy of sketched spatial relations: how cognitive errors
influence sketch representation. In: Tenbrink, T., Winter, S. (eds.) Proceedings of the Inter-
national Workshop Presenting Spatial Information: Granularity, Relevance, and Integration,
pp. 40–47. SFB/TR8 and University of Melbourne, Melbourne, Australia (2009)
233. Wang, R.F., Spelke, E.S.: Updating egocentric representations in human navigation. Cogni-
tion 77(3), 215–250 (2000)
234. Wehner, R.: Desert ant navigation: how miniature brains solve complex tasks. J. Comp.
Physiol. 189(8), 579–588 (2003)
235. Weissensteiner, E., Winter, S.: Landmarks in the communication of route instructions. In:
Egenhofer, M., Freksa, C., Miller, H.J. (eds.) Geographic Information Science, Lecture Notes
in Computer Science, vol. 3234, pp. 313–326. Springer, Berlin (2004)
236. Wertheimer, M.: Über Gestalttheorie. Philosophische Zeitschrift für Forschung und
Aussprache 1, 39–60 (1925)
237. Westheimer, G.: Gestalt theory reconfigured: Max Wertheimer’s anticipation of recent
developments in visual neuroscience. Perception 28(1), 5–15 (1999)
238. Winter, S., Freksa, C.: Approaching the notion of place by contrast. J. Spat. Inf. Sci. 2012(5),
31–50 (2012)
239. Wittgenstein, L.: Philosophical Investigations, 2nd edn. Basil Blackwell, Oxford (1963)
240. Wolbers, T., Hegarty, M., Büchel, C., Loomis, J.M.: Spatial updating: how the brain keeps
track of changing object locations during observer motion. Nat. Neurosci. 11(10), 1223–1230
(2008)
241. Wunderlich, D., Reinelt, R.: How to get there from here. In: Jarvella, R.J., Klein, W. (eds.)
Speech, Place, and Action, pp. 183–201. Wiley, Chichester (1982)
242. van der Zee, E., Slack, J. (eds.): Representing Direction in Language and Space. Oxford
University Press, Oxford (2003)
Chapter 4
Conceptual Aspects: How Landmarks
Can Be Described in Data Models
Abstract Landmarks seem to be at odds with current spatial data models. We have
argued that landmarks are mental concepts having a fundamental role in forming
the spatial reference frame for mental spatial representations. But landmarks are not
a fundamental category in current geographic information modelling. For example,
among Kuhn’s list of core concepts of spatial information [36] one finds location
and objects as separate concepts, which appears to be incompatible with our
cognitively motivated view of landmarks as concepts that are providing just that: the
link between recognizable objects and location anchoring. This chapter sets out to
fill this gap. In order to bridge between the cognitive concept and a formal, machine-readable
description of the semantics of landmarks we resort to ontologies. In this
formal conceptualization landmarks will be specified intensionally, as a function,
or role, of entities representing geographic objects, an approach fully aligned with
our intensional definition in Sect. 1.1. The intensional specification will cater for
a quantitative landmarkness, which is also compatible with the graded notion of
categories. Finally, landmarkness will be modelled with dynamic variability to cater
for context.
K.-F. Richter and S. Winter, Landmarks: GIScience for Intelligent Services, 109
DOI 10.1007/978-3-319-05732-3__4, © Springer International Publishing Switzerland 2014
¹ https://en.wikipedia.org/wiki/World_Geodetic_System, last visited 3/1/2014.
4.1 The Purpose of Modelling Landmarks 111
4.2.1 Properties
Before the model (a formal, explicit specification in a machine-readable format) can
be presented and discussed, let us consider the properties of objects² that a landmark
model must be able to reflect.
If landmarks are not a type, but a property, then any entity in the spatial database
(any instance of any type) will have to have this property to some degree, or to some
level of agreement. Considerations for this landmarkness are:
• There will be objects represented in the database that everybody will experience
as so outstanding in the environment that this experience is linked with the
location and stored in mental spatial representations.
• There will also be objects represented in the database that have meaning only
for some people. They have landmarkness mostly for semantic reasons, such as
my home. The spatial database may or may not know about the semantics, just
like other people may know or not know where I live or work. In the mobile
² Smith encourages ontology engineers to give up the fuzzy term concept (or conceptualization).
Instead, “ontologies [...] should be understood as having as their subject matter not concepts, but
rather the universals and particulars which exist in reality and are captured in scientific laws” ([52],
p. 73).
4.2 Towards a Landmark Model 113
A simple example of an algebra is the algebra of natural numbers, let us say
including 0. The operation of addition of two individuals a and b, (+) a b,
is fully specified by the following axioms, where i denotes the increment:
neutral element: (+) a 0 = a
associativity: (+) ((+) a b) c = (+) a ((+) b c)
commutativity: (+) a b = (+) b a
increment: i (0) = 1 and (+) a (i (b)) = i ((+) a b).
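As a minimal sketch of how such an algebra reads in Haskell, the natural numbers can be generated from zero and the increment, and addition can be defined by the neutral element and increment axioms alone. The names Nat, I, add, fromInt and toInt are chosen here for illustration only.

```haskell
-- Natural numbers including 0, generated by Zero and the increment I.
data Nat = Zero | I Nat
  deriving (Eq, Show)

-- Addition by recursion on the second argument:
--   neutral element: add a Zero  = a
--   increment:       add a (I b) = I (add a b)
add :: Nat -> Nat -> Nat
add a Zero  = a
add a (I b) = I (add a b)

-- Hypothetical helpers to move between Nat and Int for inspection.
fromInt :: Int -> Nat
fromInt n
  | n <= 0    = Zero
  | otherwise = I (fromInt (n - 1))

toInt :: Nat -> Int
toInt Zero  = 0
toInt (I n) = 1 + toInt n
```

In this sketch associativity and commutativity are not written down as definitions; they hold for add and can be shown by induction over the second argument.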
In our context an algebraic specification will cater for the type landmark, derived
from a property of landmarkness, and we will sketch and discuss operations on
landmarkness and their behavior. Perhaps we should mention here that this formal
model is independent from the language of the conversation, i.e., graphical or verbal,
in the same way as the algebra of natural numbers is independent of a notation in
Arabic or Roman numerals.
The formal model will be written here in the syntax of the functional programming
language Haskell.³ We do not explain the details of the language here, but
even the unfamiliar reader should get the idea of the model from this approach.
Haskell provides an elegant way of formal modelling. It is fully typed but supports
polymorphism, e.g., talking about landmarkness of any type of entity in the spatial
database. Since it does not only specify operations (interfaces) but also their
semantics, Haskell, or more generally, functional programming languages are ideal
tools for specification and rapid prototyping, and have been successfully applied in
the geographic information domain for a while [5, 13–15, 58, 65].
For a start let us collect some data types that capture properties we have discussed
before. Landmarkness is a property of any entity to a degree, thus we first define a
data type representing landmarkness that we will attach later to all entities:
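A minimal sketch of such a data type, assuming the name Lns used in the following text and a plain type synonym with the value range stated only in a comment. The helper strongerOf is hypothetical and only illustrates the inherited order relation.

```haskell
-- Landmarkness as a degree.
-- Assumed range: 0 <= Lns <= 1 (stated in this comment only, not enforced by the type).
type Lns = Double

-- Because Lns is a synonym for Double, the order relation comes for free.
-- strongerOf is a hypothetical helper, not part of the model.
strongerOf :: Lns -> Lns -> Lns
strongerOf = max
```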
³ http://www.haskell.org/, last visited 3/1/2014.
So far, the data type is only a synonym for a double precision floating point
number. This means it inherits already some semantics from the type system of
Haskell. For example, addition, multiplication and subtraction are defined. This is
clearly more than we want, as we have not indicated any use for these operations
in the discussion so far. But we might want to have some of its properties, for
example its quantitativeness, compared to a Boolean variable, to express a degree of
landmarkness, and the order relation to select the entity with stronger landmarkness.
As the Haskell comment, rather than the type, indicates, we assume values are
limited to 0 ≤ Lns ≤ 1. This assumption needs to be formally specified
elsewhere, which we skip here. But let us discuss what this voluntary limitation
means:
• Entities representing objects that have no landmarkness have a property of type
Lns of value 0, rather than no property of type Lns. This means that, semantically,
Lns is a fuzzy membership value⁴ [69]. The semantics of a fuzzy membership
value is a degree of membership to a vaguely defined category. Accordingly, an
Lns property of value 0 expresses no degree of membership at all to the category
landmark.
• If Lns is a fuzzy membership value to the category landmark, its maximum is 1,
indicating a central element, or prototype, of the category landmark. Generally,
a larger value indicates a more central element in the category.
Especially the first condition helps to distinguish between objects of no landmarkness
and objects of unknown landmarkness. Objects of unknown landmarkness have
no landmarkness attribute in the spatial database (an empty list).
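The range assumption and the distinction between no landmarkness and unknown landmarkness can be made explicit in code. The following is only a sketch under our own assumptions: mkLns, Status and status are hypothetical names, and the formal specification of the range is still left open in the text.

```haskell
type Lns = Double  -- assumed range 0 <= Lns <= 1

-- Hypothetical smart constructor: accepts only values within [0, 1].
mkLns :: Double -> Maybe Lns
mkLns x
  | x >= 0 && x <= 1 = Just x
  | otherwise        = Nothing

-- Three situations the model has to keep apart.
data Status = Unknown | NoLandmarkness | Graded Lns
  deriving (Eq, Show)

-- An empty list means unknown; a stored value 0 means no landmarkness.
status :: [Lns] -> Status
status [] = Unknown
status (v : _)
  | v > 0     = Graded v
  | otherwise = NoLandmarkness
```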
We will specify fuzzy membership functions—functions assigning fuzzy mem-
bership values to entities [69]—in the following chapter. However, from our
discussion of spatial cognition we have already learned that there is no global
measure of landmarkness. The landmarkness of an object is context-dependent,
and context is an open complex system including the person, the situation, and the
task at hand. Hence, a fuzzy membership function, as well as a data model storing
landmarkness, requires integration with another property, the context under which
an object has landmarkness. As two very prominent ambassadors of embodied cog-
nition, the two neurobiologists Maturana and Varela have both confirmed the view
expressed in the previous chapter that cognitive representations are not objective
⁴ Can a landmarkness value be alternatively interpreted in a probabilistic manner? Probabilities
cover the same range of values, but have a significantly different meaning. From a probabilistic
perspective, one could argue, the value of type Lns represents the frequency with which a reference
to this object is chosen in verbal route descriptions. After the discussion above, we should add that
it is chosen in verbal route descriptions in a given communication context. For example, Denis’
skeletal descriptions came out as the descriptions containing the references everybody used in a
given context. References in these descriptions would be most likely landmarks (as well as by
broadest agreement, i.e., in the fuzzy membership sense). Hence, probabilistic interpretations
are possible, and some of the computation methods in the next chapter may actually apply a
probabilistic interpretation.
In Haskell, the vertical bar means an exclusive disjunction. The context identifier
can take one of these alternative values, and thus, can describe landmarkness of
an object either for a pedestrian, or for a motorist, or for Tom, an individual. This
simplistic model will allow at least a context-dependent storage and reasoning about
landmarkness. A smarter machine would replace this model by a model with more
elements or more (hierarchical) structure.
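A context identifier of this kind could be declared as follows. This is a sketch: the type name Cid is taken from the text, while the constructor names and the derived classes are our assumptions.

```haskell
-- Context identifier: exactly one of the alternatives separated by the vertical bar.
data Cid = Pedestrian | Motorist | Tom
  deriving (Eq, Show, Enum, Bounded)

-- All contexts this simplistic model knows about (hypothetical helper).
allContexts :: [Cid]
allContexts = [minBound .. maxBound]
```

Adding a context for users of public transport, for example, would mean adding a further constructor to Cid.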
With the help of a third data type the model will be enabled to refer to an object
in a flexible, context-dependent manner:
With these three types we can introduce a complex data type describing the
context-dependent landmarkness property of an entity in a spatial database:
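A sketch of how the three types might combine, with the earlier types repeated so that the fragment stands alone. The names Lns, Cid, Ls and lval appear in the text; the referring type Ref and the field names cid and ref are assumptions made for illustration.

```haskell
type Lns = Double                        -- assumed range 0 <= Lns <= 1
data Cid = Pedestrian | Motorist | Tom   -- context identifier
  deriving (Eq, Show)
type Ref = String                        -- hypothetical: referring expression for the object

-- Context-dependent landmarkness of an entity.
data Ls = Ls
  { cid  :: Cid   -- the context in which this value holds
  , lval :: Lns   -- the degree of landmarkness in that context
  , ref  :: Ref   -- how to refer to the object in that context
  } deriving (Eq, Show)
```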
Note that any entity in a spatial database can have several of these properties,
but only one for each Cid. This means entities have lists of Ls with Cid as unique
key:
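Under the same assumptions as before, an entity with such a list might look as follows. The field l follows the name used in the later discussion of operations; the field name and the check uniqueCids are hypothetical, and all other attributes of an entity are collapsed into a single name.

```haskell
import Data.List (nub)

type Lns = Double
data Cid = Pedestrian | Motorist | Tom deriving (Eq, Show)
data Ls = Ls { cid :: Cid, lval :: Lns, ref :: String } deriving (Eq, Show)

-- A generalized entity: name stands in for type, identifier, geometry and
-- thematic descriptors; l is the list of landmarknesses in different contexts.
data Entity = Entity
  { name :: String
  , l    :: [Ls]
  } deriving (Eq, Show)

-- Hypothetical check that Cid is a unique key within a list of Ls.
uniqueCids :: [Ls] -> Bool
uniqueCids ls = length (nub cs) == length cs
  where
    cs = map cid ls
```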
Entities can have any structure. Typically it will be a type from some database
taxonomy, an identifier, an official name, a geometric description as a point,
polyline, polygon, volume or a set of these, and further thematic descriptors. For
readability we have generalized this structure in the code and concentrated on the
relevant bit. All we did was add one property, a list of landmarknesses in
different contexts. Here is an example of what we can represent now:
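An instance along these lines might read as follows. The two referring expressions are those discussed in the text; the landmarkness values 0.8 and 0.9 and the entity name are made up for illustration, and the type definitions are repeated so that the sketch stands alone.

```haskell
type Lns = Double
data Cid = Pedestrian | Motorist | Tom deriving (Eq, Show)
data Ls = Ls { cid :: Cid, lval :: Lns, ref :: String } deriving (Eq, Show)
data Entity = Entity { name :: String, l :: [Ls] } deriving (Eq, Show)

-- Illustrative values only: the numbers are assumptions, not measured data.
stFrancis :: Entity
stFrancis = Entity
  { name = "St Francis Church"
  , l    = [ Ls Motorist   0.8 "the church"
           , Ls Pedestrian 0.9 "St Francis"
           ]
  }
```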
This database entity is a good landmark for motorists, for whom “the church”
should be an appropriate reference in verbal communication. It will also be a good
landmark, despite its spatial extension, for pedestrians, for whom it is easy to
identify the building in situ as “St Francis”. The database has not yet stored any
information about the suitability of referring to this building in communication with
Tom personally, but of course Tom can be identified as a pedestrian. Also the model
has not yet been equipped with an ability to store information about the suitability
of this reference for users of public transport. This would require an extension of
Cid by another context identifier.
In a similar vein we add to the abstract class of Entities to specify operations on
these properties:
The first operation copies all content from the entity and appends the new
landmarkness at the end of the existing list of landmarknesses. This operation works
even if l was previously an empty list. The second operation is only an alias for the
observer that had been introduced with the data type Entity. The third one goes
through all elements of the list of landmarknesses of this entity, and when it finds
an element for the context Cid it puts it into the output list. Note that Cid should be
a unique key in an entity’s list, such that the result is either a list of one element or
an empty list. The fourth operation, hasLns, takes the output of getLns (i.e., either
an empty list or a list of one element) and tests whether there is a context-related
landmarkness. Only if lval > 0 does it return True; in both alternative cases it returns
False: either there is no landmarkness, lval = 0, or the landmarkness is unknown and
getLns returns an empty list.
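The four operations described above can be sketched as a type class with one instance. The names Entities, getLns, hasLns, lval and l come from the text; addLns, getAllLns, the argument order and the simplified types are our assumptions.

```haskell
type Lns = Double
data Cid = Pedestrian | Motorist | Tom deriving (Eq, Show)
data Ls = Ls { cid :: Cid, lval :: Lns, ref :: String } deriving (Eq, Show)
data Entity = Entity { name :: String, l :: [Ls] } deriving (Eq, Show)

class Entities e where
  addLns    :: Ls -> e -> e        -- append a landmarkness to the entity's list
  getAllLns :: e -> [Ls]           -- alias for the observer
  getLns    :: Cid -> e -> [Ls]    -- landmarkness for one context, if stored
  hasLns    :: Cid -> e -> Bool    -- is there positive landmarkness in this context?

instance Entities Entity where
  -- Copies the entity and appends at the end; uniqueness of Cid is not enforced here.
  addLns x e = e { l = l e ++ [x] }
  getAllLns  = l
  getLns c e = [ x | x <- l e, cid x == c ]
  -- True only for exactly one stored value with lval > 0; an empty result
  -- (unknown) or a value of 0 (no landmarkness) both give False.
  hasLns c e = case getLns c e of
    [x] -> lval x > 0
    _   -> False
```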
Also, with access to the geometric description of entities in the spatial database,
an operation can be defined to select all context-related landmarks within an area of
interest, for example within a buffer zone around a route.
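Such a selection can be sketched for the simplest case, in which entities are points and the route is a polyline. Everything here is an assumption made for illustration: the point geometry, the Euclidean buffer, and the names pos, withinBuffer and landmarksAlong; hasLns is simplified to a plain function.

```haskell
type Lns = Double
data Cid = Pedestrian | Motorist | Tom deriving (Eq, Show)
data Ls = Ls { cid :: Cid, lval :: Lns } deriving (Eq, Show)

type Point = (Double, Double)

data Entity = Entity
  { name :: String
  , pos  :: Point    -- hypothetical point geometry
  , l    :: [Ls]
  } deriving (Eq, Show)

hasLns :: Cid -> Entity -> Bool
hasLns c e = any (\x -> cid x == c && lval x > 0) (l e)

dist :: Point -> Point -> Double
dist (x1, y1) (x2, y2) = sqrt (dx * dx + dy * dy)
  where
    dx = x1 - x2
    dy = y1 - y2

-- Euclidean distance from a point to a line segment.
distToSegment :: Point -> (Point, Point) -> Double
distToSegment p@(px, py) (a@(ax, ay), (bx, by))
  | len2 == 0 = dist p a
  | otherwise = dist p (ax + t * dx, ay + t * dy)
  where
    dx   = bx - ax
    dy   = by - ay
    len2 = dx * dx + dy * dy
    t    = max 0 (min 1 (((px - ax) * dx + (py - ay) * dy) / len2))

-- True if p lies within distance r of some segment of the route.
-- A route with fewer than two points has no segments and yields False.
withinBuffer :: Double -> [Point] -> Point -> Bool
withinBuffer r route p =
  any (\s -> distToSegment p s <= r) (zip route (drop 1 route))

-- All entities with landmarkness in context c inside the buffer zone.
landmarksAlong :: Cid -> Double -> [Point] -> [Entity] -> [Entity]
landmarksAlong c r route =
  filter (\e -> hasLns c e && withinBuffer r route (pos e))
```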
With some more effort such an operator can even be extended for lifting
the landmarkness property from entities to types. For example, if all entities
of a type turn out to have some landmarkness then the type could be said to
have landmarkness. Accordingly, one of the fuzzy membership functions we will
introduce in the next section is based on type landmarkness, generalizing instance
properties of landmarkness.
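One possible reading of this lifting, given as a sketch only: a type has landmarkness in a context if the database contains at least one entity of that type and every such entity has positive landmarkness in that context. The field etype and the function typeHasLns are hypothetical names.

```haskell
type Lns = Double
data Cid = Pedestrian | Motorist | Tom deriving (Eq, Show)
data Ls = Ls { cid :: Cid, lval :: Lns } deriving (Eq, Show)

data Entity = Entity
  { etype :: String   -- hypothetical: the entity's type in the database taxonomy
  , l     :: [Ls]
  } deriving (Eq, Show)

hasLns :: Cid -> Entity -> Bool
hasLns c e = any (\x -> cid x == c && lval x > 0) (l e)

-- Lifts landmarkness from instances to a type for context c.
typeHasLns :: Cid -> String -> [Entity] -> Bool
typeHasLns c t es = not (null members) && all (hasLns c) members
  where
    members = filter (\e -> etype e == t) es
```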
Admittedly some properties of landmarks that we had discussed in the previous
chapter are not yet captured in this formal model: hierarchies of salience, hierarchies
of spatial granularities, and qualitative spatial relations in some flexible spatial
frames of reference. The reason why these (necessary) properties are not put into the
model here is that they are expected to appear from the general structure of a spatial
database in their current forms. Entities in spatial databases have a type, a thematic
description, and a geometric description in a spatial frame of reference [40]. The
database itself comes with a taxonomy. Especially for integration of different
databases, where taxonomies typically clash, taxonomies can be created in an ad-
hoc manner, for example based on similarity computed from affordance [29, 30].
Taxonomies of entity types provide a specialization hierarchy, or an is-a hierarchy.
Specialization, especially when described by affordance, will be a key of context-
dependent choice of landmarks, in addition to our formal model’s capacity to
assign different landmarkness measures to each entity depending on context.
Some methods of landmark identification are solely type-based (see Chap. 5).
Hierarchies of spatial granularity (part-of or containment hierarchies) are expected
to result from the geometric description of the entities. And qualitative spatial
relations between landmarks, or relata, are intertwined with the selection process of
landmarks (Chap. 6), which use either the geometric descriptions of entities again,
or use one of the qualitative spatial calculi [62]. Thus, the presented model should
be sufficient so far.
The extension does not change the principles of the model, only the data string
in entities gets longer:
train station this particular route will have decision points, and requires references
to anchor the orientation decisions to these locations. Hence, selection requires an
integration of stored and ad-hoc criteria.
The rest of the book, however, uses a (semi-formal) procedural syntax for
specifying algorithms instead of Haskell. The majority of the literature we are
compiling and reviewing in the following chapters used procedural syntax already,
and while we might edit and homogenize individual styles we preserve to a large
extent the original thoughts and approaches.
As we have seen before each of Lynch’s elements can have some landmarkness,
which varies according to context. Thus, computational models promising to extract
elements of the city must be highly relevant for modelling landmarks.
A notable formal spatial modelling stream has grown out of urban research.
Space Syntax [27] is a way to express the configuration of space by visual accessi-
bility. Graphs constructed from axial lines, which are the longest lines of visibility
in an environment, allow a quantitative analysis of configuration properties. Even
some aspects of cognition are implicit in space syntax analysis since configuration,
and especially the legibility of a configuration is correlated with movement behavior
and hence likely to be correlated with mental spatial representations [43]. Despite
expected correlations, however, space syntax as such lacks a deeper cognitive
grounding.
Nevertheless, measures of space syntax have been used to redefine a subset of
Lynch’s elements. Some approaches focus on exploring axial lines and isovists in
a map of the city [8], other approaches on isovists in a three-dimensional urban
digital model [42]. An isovist [2] is the set of all points visible from a given vantage
point in space and with respect to a particular environment. Since the isovist is a
property of a location the vantage point can be characterized by a measure, e.g.,
the size of the surface that is visible. Morello and Ratti argue that characterizing
location by visibility is about legibility of urban space, Lynch’s original interest.
Their redefinition of several of Lynch’s elements in terms of visibility allows these
elements to be captured in a formal, algorithmic way.
Also a link between scaling properties of the city’s street network structure and
the salient elements of the city form has been suggested [31]. Scaling properties
refer here to the typical long tail distribution of elements in the street network by
their spatial granularity. For example, a city’s street network has a few major roads
crossing the city, and many smaller streets refining the network locally. Traffic,
or human movement, is similarly distributed with arterial roads being regularly
congested, and lighter traffic in the more narrow streets. Clearly there is a link
between scaling and hierarchical mental spatial representations. Timpf et al. [55]
have even suggested a formal (conceptual) model of human route planning based
on such a hierarchy in the street network. They distinguish a planning process
estimations of distances and directions are better for nearby objects [39]. Also,
wayfinding and other spatial activities seem to be easier for parts of the environment
that are nearby and familiar: “For an individual there is not an equal probability of
behavior occurring in all sections of the environment. There is a spatial bias for
behavior close to his residence owing to least effort considerations” ([6], p. 368).
That things near to each other are more correlated than distant things was an
observation made originally by Tobler for geographic space [56], and later also
confirmed as a mental default assumption [41]. As a design choice Voronoi diagrams
provide an unambiguous set of relationships⁵ and an efficient reasoning on a small
number of links between likely correlated objects.
A number of properties follow from this design choice:
• Objects in one configuration can belong to different object classes (e.g., the
variety of objects within a city), but they must be spatially separable (non-
overlapping), i.e., from the same level in a hierarchy of spatial granularities. For
example, one configuration can consist of cities (in the context of representing the
geographic space of a country), and another configuration can consist of salient
buildings within a city (in the context of representing the geographic space of the
city), but a configuration cannot contain a city and a building of that city. The city
is an agglomerate of buildings (hence, the building is contained by the city), and
accordingly, city and buildings belong to different levels of spatial granularity.
• The objects of a configuration form a context via contrast [63]. The spatial
meaning of a locative expression can be specified by its current contrast set. For
example, the locative expression “[at / pass] the church” can have a meaning in
a configuration of {(this) church, city hall, museum, cinema}, and it would have
another spatial meaning in {(this) church, the other church north of it, and the
third church south of it}.
The Voronoi diagram always associates the nearest landmark with a localization task.
“I am near the church” describes my location anchored by the church (in some
appropriate contrast set). “I am in Zurich” anchors my location at another level of
granularity and also evokes a different contrast set.
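Since a location falls into the Voronoi cell of its nearest site, localization against a contrast set reduces to a nearest-neighbour query and does not require the diagram to be constructed. The following sketch makes that explicit; the station names are those of the inner city train stations of Melbourne, while the coordinates are made up in an arbitrary local frame and the names Landmark, nearest and stations are hypothetical.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Point = (Double, Double)

data Landmark = Landmark
  { label :: String
  , site  :: Point
  } deriving (Eq, Show)

-- Squared Euclidean distance; sufficient for comparing nearness.
sqDist :: Point -> Point -> Double
sqDist (x1, y1) (x2, y2) = dx * dx + dy * dy
  where
    dx = x1 - x2
    dy = y1 - y2

-- The landmark whose Voronoi cell contains p, i.e. the nearest one
-- in the current contrast set; Nothing for an empty contrast set.
nearest :: [Landmark] -> Point -> Maybe Landmark
nearest [] _ = Nothing
nearest ls p = Just (minimumBy (comparing (sqDist p . site)) ls)

-- Contrast set with invented coordinates (not real positions).
stations :: [Landmark]
stations =
  [ Landmark "Southern Cross"    (0, 0)
  , Landmark "Flagstaff"         (1, 2)
  , Landmark "Melbourne Central" (3, 2)
  , Landmark "Parliament"        (5, 1)
  , Landmark "Flinders Street"   (3, 0)
  ]
```

Changing the contrast set, for example from stations to cities, changes the answer for the same location, which is the granularity effect described above.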
Choosing Voronoi diagrams is of course a simplifying heuristic, which brings in
uncertainty at two levels:
• The application of a Voronoi diagram is a (heuristic) simplification of a more
complex geographic reality. For example, a configuration of all European capitals
provides a stable Voronoi diagram to locate other objects in this reference frame.
However, the Voronoi diagram is only a poor reflection of the nations’ boundaries
and hence may produce non-preferred nearness relationships.
⁵ Mathematically there are exceptions: relationships become ambiguous when nearest neighbors
are arranged on a line or in a rectangle. But practically this nearly never happens in geographic
space.
The hierarchy by spatial granularity is typically part of the data structures of a spatial
database. Such hierarchic structures exist for gazetteers, which can link each entity
to the larger object it belongs to [26], as well as geometric spatial databases, which
provide the topological structures between layers of object categories in order to rea-
son for containment. Here, we refer to models that actually make use of these hier-
archic structures in either generating or understanding place descriptions [57, 68].
Fig. 4.1 The Voronoi diagram of the set of Melbourne’s inner city train stations Southern Cross,
Flagstaff, Melbourne Central, Parliament, and Flinders Street. Map copyright OpenStreetMap
contributors, used under CC BY-SA 2.0
most central station in the public transport network is Flinders Street Station, so
in the process of establishing a next level of the salience hierarchy the only local
maximum, Flinders Street Station, gets lifted. Thus the final salience hierarchy
consists of a root node (Flinders Street Station), and a second level, containing
all stations. This hierarchy reflects that in some communication contexts “the train
station” refers to Flinders Street Station, city-wide. In other communication contexts
the contrast set may consist of all five stations, and a finer distinction has to be made.
These salience hierarchies are also suited to map hierarchical cognitive reasoning
across hierarchies of spatial granularity (Sect. 4.2.5.2). In order to illustrate this, let
us consider a hierarchical verbal place description. “I am at Melbourne Central,
Swanston Street exit” is a description zooming in, using landmarks from different
levels of spatial granularity as relata. In Voronoi Diagrams this zooming behavior
can be reflected. The first relatum, Melbourne Central may be understood in
the context of a train traveller, i.e., applying the contrast set of Fig. 4.1. The
corresponding Voronoi cell was perceived by the speaker to be too coarse for this
particular communication purpose, and a refinement was sought. Figure 4.2 shows
a local refinement of the original Voronoi diagram built from the salient elements of
Melbourne Central—the new contrast set. Without a commitment to a geometrically
exact position the speaker conveys a location closer to the Swanston Street exit than
to any other salient element of the train station.
Hence, spatial entities bearing landmarkness information and being explicitly or
implicitly organized in hierarchical data structures, can reflect hierarchical cognitive
reasoning.
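The zooming behavior can be sketched as a repeated nearest-neighbour choice, descending from one contrast set into the refined contrast set of the chosen element. This is an illustration under our own assumptions: the type Place, the function zoomIn, all coordinates and the second exit are invented.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Point = (Double, Double)

-- A place with its own finer-grained contrast set of salient parts.
data Place = Place
  { label :: String
  , site  :: Point
  , parts :: [Place]
  }

sqDist :: Point -> Point -> Double
sqDist (x1, y1) (x2, y2) = dx * dx + dy * dy
  where
    dx = x1 - x2
    dy = y1 - y2

-- Hierarchical place description for p: pick the nearest element of the
-- current contrast set, then refine within the parts of that element.
zoomIn :: [Place] -> Point -> [String]
zoomIn [] _ = []
zoomIn ps p = label n : zoomIn (parts n) p
  where
    n = minimumBy (comparing (sqDist p . site)) ps

-- Invented coordinates; the second exit is purely hypothetical.
exampleStations :: [Place]
exampleStations =
  [ Place "Melbourne Central" (3, 2)
      [ Place "Swanston Street exit" (3.2, 2.0) []
      , Place "other exit"           (2.8, 2.0) []
      ]
  , Place "Flinders Street" (3, 0) []
  ]
```

With these made-up values, zoomIn exampleStations (3.15, 1.9) first selects Melbourne Central and then, within its parts, the Swanston Street exit.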
Fig. 4.2 A switch in spatial granularity from stations to exits (or other elements) of stations. Map
copyright OpenStreetMap contributors, used under CC BY-SA 2.0
Our conceptual model requires data models capable of representing the landmarkness
of entities of any spatial granularity while representing this landmarkness in a
context-dependent way. In this section we will review existing data models for their
compliance, and discuss shortfalls in particular cases.
⁶ http://www.w3.org/2010/POI/, http://openpoi.ogcnetwork.net/, last visited 3/1/2014.
4.3 Landmarks in Geographic Data Structures 129
even for a layer of user generated POIs. Storing these categories as layers enables
the user to switch them on or off. For the service itself all objects in a layer are
equal, and switching on or off by users means the service itself has no concept
of relevance, or mechanism to capture and consider a communication context. The
original statement “We’ve found it super useful for checking out what’s nearby a
hotel we’ll be staying at, orienting ourselves, getting the feel for a neighborhood,
or just browsing around for fun”⁷ makes clear that the purpose of POI is quite
different from landmarks, and far from the intelligent communication capabilities
of the machine we are after. This approach to POI is also more prone to commercial
interference such as sponsoring.
⁷ http://google-latlong.blogspot.com.au/2009/08/i-didnt-know-that-was-there.html, last visited 3/1/2014.
⁸ http://www.openstreetmap.org, last visited 3/1/2014.
The closest contender for a suitable data model is probably an inclusion of landmarks
in OpenLS [24].
OpenLS is an international standard by ISO and OGC.¹⁰ OpenLS describes an
open platform for location-based services. Among its core services is a route service.
OpenLS specifies primarily the interaction between client and server and the format
in which the transferred data is encoded. The role of a route service in OpenLS
is to determine a travel route between two points and to collect the navigation
information required to communicate this route.
For the interaction between client and server OpenLS comes with an XML
schema for location services, called XLS. XLS is an exchange data model for route
services, that is, for structuring and encoding requests and responses regarding
route descriptions. XLS only caters for POI, and does not yet cater for an abstract
data type landmark. However, XLS explicitly offers an optional attribute, which can
serve different purposes, among others also describing a landmark. This attribute
allows for providing the name of the landmark and its location with respect to the
side of the route. Other information about the landmark cannot be encoded in this
attribute, such that information necessary for the cognitive ergonomic use of this
landmark cannot be encoded and is lost.
⁹ http://wiki.openstreetmap.org/wiki/Map_Features, last visited 3/1/2014.
¹⁰ http://www.opengeospatial.org/standards/ols, last visited 3/1/2014.
Fig. 4.3 The top levels of a landmark taxonomy along routes (from [24])
4.4 Summary
In this section we have presented a formal model suitable for developing entities in
spatial databases that can represent some landmarkness of the represented objects
in certain contexts. We have also discussed variations of this model, including
structured context models, vectors characterizing an object’s landmarkness in
multiple dimensions, and the landmarkness of types.
What we have not yet discussed is the computational semantics of landmarkness
values. Formally, they represent fuzzy membership values to a vaguely defined
category landmark (in a certain context), but questions whether this landmarkness
comes out of people’s rating, data mining in harvested user generated data, or expert
assignment, and whether it represents an object’s prominence, its cognitive salience,
or its centrality in an environment have been left out so far. They will be discussed
in the next chapter.
References
17. Ghasemi, M., Richter, K.F., Winter, S.: Landmarks in OSM. In: 5th Annual International
OpenStreetMap Conference. Denver (2011)
18. Goodchild, M.: Citizens as sensors: The world of volunteered geography. GeoJournal 69(4),
211–221 (2007)
19. Grenon, P., Smith, B.: Snap and span: Towards dynamic spatial ontology. Spat. Cogn. Comput.
4(1), 69–104 (2004)
20. Gruber, T.R.: Toward principles for the design of ontologies used for knowledge sharing. Int.
J. Hum. Comput. Stud. 43(5–6), 907–928 (1995)
21. Guarino, N., Oberle, D., Staab, S.: What is an ontology? In: Staab, S., Studer, R. (eds.)
Handbook on Ontologies, 2nd edn., pp. 1–17. Springer, Dordrecht (2009)
22. Haklay, M.: How good is volunteered geographic information? A comparative study of
OpenStreetMap and Ordnance Survey datasets. Environ. Plan. B 37(4), 682–703 (2010)
23. Haklay, M., Weber, P.: OpenStreetMap: User-generated street maps. Pervasive Comput. 7(4),
12–18 (2008)
24. Hansen, S., Richter, K.F., Klippel, A.: Landmarks in OpenLS - a data structure for cognitive
ergonomic route directions. In: Raubal, M., Miller, H., Frank, A.U., Goodchild, M.F. (eds.)
Geographic Information Science. Lecture Notes in Computer Science, vol. 4197, pp. 128–144.
Springer, Berlin (2006)
25. Hausdorff, F.: Grundzüge der Mengenlehre. Veit & Company, Leipzig (1914)
26. Hill, L.L.: Georeferencing: The Geographic Associations of Information. Digital Libraries and
Electronic Publishing. MIT Press, Cambridge (2006)
27. Hillier, B., Hanson, J.: The Social Logic of Space. Cambridge University Press, Cambridge
(1984)
28. Janelle, D.G.: Impact of information technologies. In: Hanson, S., Giuliano, G. (eds.) The
Geography of Urban Transportation, pp. 86–112. Guilford Press, New York (2004)
29. Janowicz, K., Keßler, C.: The role of ontology in improving gazetteer interaction. Int. J. Geogr.
Inf. Sci. 22(10), 1129–1157 (2008)
30. Janowicz, K., Raubal, M.: Affordance-based similarity measurement for entity types. In:
Winter, S., Duckham, M., Kulik, L., Kuipers, B. (eds.) Spatial Information Theory. Lecture
Notes in Computer Science, vol. 4736, pp. 133–151. Springer, Berlin (2007)
31. Jiang, B.: Computing the image of the city. In: Campagna, M., De Montis, A., Isola, F., Lai, S.,
Pira, C., Zoppi C. (eds.) Proceedings of the 7th International Conference on Informatics and
Urban and Regional Planning, pp. 111–121. Cagliari, Italy (2012)
32. Jordan, T., Raubal, M., Gartrell, B., Egenhofer, M.J.: An affordance-based model of place
in GIS. In: Poiker, T.K., Chrisman, N. (eds.) 8th International Symposium on Spatial Data
Handling, pp. 98–109. IGU, Vancouver, Canada (1998)
33. Klippel, A., Winter, S.: Structural salience of landmarks for route directions. In: Cohn, A.G.,
Mark, D.M. (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 3693,
pp. 347–362. Springer, Berlin (2005)
34. Klippel, A., Hansen, S., Richter, K.F., Winter, S.: Urban granularities - a data structure for
cognitively ergonomic route directions. GeoInformatica 13(2), 223–247 (2009)
35. Krumm, J., Davies, N., Narayanaswami, C.: User generated content. Pervasive Comput. 7(4),
10–11 (2008)
36. Kuhn, W.: Core concepts of spatial information for transdisciplinary research. Int. J. Geogr.
Inf. Sci. 26(12), 2267–2276 (2012)
37. Maling, D.H.: Coordinate systems and map projections for GIS. In: D. Maguire, M. Goodchild,
D. Rhind (eds.) Geographical Information Systems: Principles and Applications, pp. 135–146.
Longmans Publishing Co. (1991)
38. Maturana, H.R.: Neurophysiology of cognition. In: Garvin, P. (ed.) Cognition: A Multiple View,
pp. 3–23. Spartan Books, New York (1970)
39. McNamara, T.P.: Mental representations of spatial relations. Cogn. Psychol. 18(1), 87–121
(1986)
40. Molenaar, M.: An Introduction to the Theory of Spatial Object Modelling for GIS. Research
Monographs in Geographic Information Systems. Taylor and Francis, London (1998)
41. Montello, D.R., Fabrikant, S.I., Ruocco, M., Middleton, R.S.: Testing the first law of cognitive
geography on point-display spatializations. In: Kuhn, W., Worboys, M.F., Timpf, S. (eds.)
Spatial Information Theory. Lecture Notes in Computer Science, vol. 2825, pp. 316–331.
Springer, Berlin (2003)
42. Morello, E., Ratti, C.: A digital image of the city: 3D isovists in Lynch’s urban analysis.
Environ. Plan. B 36(5), 837–853 (2009)
43. Penn, A.: Space syntax and spatial cognition: Or why the axial line? Environ. Behav. 35(1),
30–65 (2003)
44. Purves, R.S.: Methods, examples, and pitfalls in the exploitation of the geospatial web.
In: Hesse-Biber, S.N. (ed.) The Handbook of Emergent Technologies in Social Research,
pp. 592–622. Oxford University Press, New York (2011)
45. Raper, J.: Geographic relevance. J. Doc. 63(6), 836–852 (2007)
46. Raubal, M., Winter, S.: Enriching wayfinding instructions with local landmarks. In: Egenhofer,
M.J., Mark, D.M. (eds.) Geographic Information Science, Lecture Notes in Computer Science,
vol. 2478, pp. 243–259. Springer, Berlin (2002)
47. Renz, J.: Qualitative Spatial Reasoning with Topological Information, Lecture Notes in
Computer Science, vol. 2293. Springer, Berlin (2002)
48. Richter, K.F., Tomko, M., Winter, S.: A dialog-driven process of generating route directions.
Comput. Environ. Urban Syst. 32(3) (2008)
49. Richter, K.F., Winter, S.: Harvesting user-generated content for semantic spatial information:
The case of landmarks in OpenStreetMap. In: Hock, B. (ed.) Proceedings of the Surveying
and Spatial Sciences Biennial Conference 2011, pp. 75–86. Surveying and Spatial Sciences
Institute, Wellington (2011)
50. Sadalla, E.K., Burroughs, J., Staplin, L.J.: Reference points in spatial cognition. J. Exp.
Psychol. Hum. Learn. Mem. 6(5), 516–528 (1980)
51. Scheider, S., Kuhn, W.: Affordance-based categorization of road network data using a grounded
theory of channel networks. Int. J. Geogr. Inf. Sci. 24(8), 1249–1267 (2010)
52. Smith, B.: Beyond concepts: Ontology as reality representation. In: Varzi, A.C., Vieu, L.
(eds.) Third International Conference on Formal Ontology and Information Systems (FOIS),
pp. 73–84. IOS Press, Turin (2004)
53. Sorrows, M.E., Hirtle, S.C.: The nature of landmarks for real and electronic spaces. In:
Freksa, C., Mark, D.M. (eds.) Spatial Information Theory. Lecture Notes in Computer Science,
vol. 1661, pp. 37–50. Springer, Berlin (1999)
54. Studer, R., Benjamins, V.R., Fensel, D.: Knowledge engineering: Principles and methods. Data
Knowl. Eng. 25(1–2), 161–197 (1998)
55. Timpf, S., Volta, G.S., Pollock, D.W., Frank, A.U., Egenhofer, M.J.: A conceptual model of
wayfinding using multiple levels of abstraction. In: Frank, A.U., Campari, I., Formentini, U.
(eds.) Theories and Methods of Spatio-Temporal Reasoning in Geographic Space. Lecture
Notes in Computer Science, vol. 639, pp. 348–367. Springer, Berlin (1992)
56. Tobler, W.: A computer movie simulating urban growth in the Detroit region. Econ. Geogr.
46(2), 234–240 (1970)
57. Tomko, M., Winter, S.: Pragmatic construction of destination descriptions for urban environ-
ments. Spat. Cogn. Comput. 9(1), 1–29 (2009)
58. Tomko, M., Winter, S.: Describing the functional spatial structure of urban environments.
Comput. Environ. Urban Syst. 41(September), 177–187 (2013)
59. Varela, F.J.: Whence perceptual meaning? A cartography of current ideas. In: Varela, F.J.,
Dupuy, J.P. (eds.) Understanding Origins, Boston Studies in the Philosophy and History of
Science, vol. 130, pp. 235–263. Springer, Amsterdam (1992)
60. Voronoi, G.: Nouvelles applications des paramètres continus à la théorie des formes quadra-
tiques (first part). Journal für die Reine und Angewandte Mathematik (Crelle’s Journal)
1908(133), 97–178 (1908)
61. Voronoi, G.: Nouvelles applications des paramètres continus à la théorie des formes quadra-
tiques (second part). Journal für die Reine und Angewandte Mathematik (Crelle’s Journal)
1909(134), 67–182 (1909)
62. Wallgrün, J.O., Frommberger, L., Wolter, D., Dylla, F., Freksa, C.: A toolbox for qualitative
spatial representation and reasoning. In: Barkowsky, T., Knauff, M., Ligozat, G., Montello,
D.R. (eds.) Spatial Cognition V. Lecture Notes in Artificial Intelligence, vol. 4387, pp. 39–58.
Springer, Berlin (2007)
63. Winter, S., Freksa, C.: Approaching the notion of place by contrast. J. Spat. Inf. Sci. 2012(5),
31–50 (2012)
64. Winter, S., Kuhn, W., Krüger, A.: Does place have a place in geographic information science?
Spat. Cogn. Comput. 9(3), 171–173 (2009)
65. Winter, S., Nittel, S.: Formal information modeling for standardisation in the spatial domain.
Int. J. Geogr. Inf. Sci. 17(8), 721–741 (2003)
66. Winter, S., Tomko, M., Elias, B., Sester, M.: Landmark hierarchies in context. Environ. Plann.
B Plann. Des. 35(3), 381–398 (2008)
67. Winter, S., Truelove, M.: Talking about place where it matters. In: Raubal, M., Mark, D.M.,
Frank, A.U. (eds.) Cognitive and Linguistic Aspects of Geographic Space: New Perspectives
on Geographic Information Research. Lecture Notes in Geoinformation and Cartography,
pp. 121–139. Springer, Berlin (2013)
68. Wu, Y., Winter, S.: Interpreting destination descriptions in a cognitive way. In: Hois, J., Ross,
R., Kelleher, J., Bateman, J. (eds.) Workshop on Computational Models for Spatial Language
Interpretation and Generation (CoSLI-2), vol. 759, pp. 8–15. CEUR-WS.org, Boston (2011)
69. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)
70. Zheng, Y., Zhou, X. (eds.): Computing with Spatial Trajectories. Springer, New York (2011)
Chapter 5
Computational Aspects: How Landmarks
Can Be Observed, Stored, and Analysed
The previous chapters have established the importance of landmarks for our
understanding of an environment. We have also highlighted how this impacts on our
way of communicating. Furthermore, Chap. 4 has shown that it is possible to capture
at least the principal aspects of landmarks and landmarkness in a formal way,
making them accessible to computers. This chapter will discuss how computers may
be able to populate these formal specifications. This will encompass approaches to
determining whether and how a geographic object sticks out from the background,
as well as how to select the most appropriate landmark in a given situation. Both are
important aspects of integrating landmarks into geospatial services; however, they
are often treated separately.
To provide an initial idea of the differences between both aspects, consider
the situation depicted in Fig. 5.1. In the situation on the left, some algorithm
determined for each geographic object whether it fulfills the criteria of landmarkness
as discussed in the last chapter. In particular, such an algorithm identifies those
objects that are salient in their local surroundings. Panel (a) of the figure shows,
among the objects contained in the grey areas, all those that the algorithm considered
to be salient. This results in a set of landmark candidates, which form the input for other
algorithms that select the most relevant landmark for a given situation. For example,
K.-F. Richter and S. Winter, Landmarks: GIScience for Intelligent Services, 137
DOI 10.1007/978-3-319-05732-3__5, © Springer International Publishing Switzerland 2014
Fig. 5.1 Computing a landmark: (a) identifying geographic objects that may serve as a landmark
in principle; (b) selecting the most suited landmark for a specific situation
in Fig. 5.1b, in order to describe the marketplace when coming from the south,
selecting a landmark candidate that is visible early on is a sensible choice; here, this
may be the church.
We term these steps landmark identification and landmark integration, respec-
tively. They are important steps in computing a landmark. However, they are often
performed by different algorithms and research addresses either one or the other.
Accordingly, the next section will present approaches to identifying landmarks.
Section 5.3 then will illustrate approaches to integrating landmarks. Section 5.4
will compare the approaches to identification and integration of landmarks, which
will lead to a criticism of existing approaches, presented in Sect. 5.5. That
section will also argue for some extensions and alternative approaches that
circumvent some of the inherent drawbacks of the current approaches
to computing landmarks. These drawbacks seem to be largely
responsible for why landmarks are not used (more) in today’s location-based
services.
5.2 Landmark Identification
The aim of landmark identification is to find all geographic objects in a given region
that may serve as a landmark in principle. To achieve this, the salience of each object
for people needs to be determined. Salience can either be computed or inferred from
how these objects are referred to in some data source. In other words, algorithms for
landmark identification either use attribute data of geographic objects (Sect. 5.2.1)
or they mine other document sources to determine an object’s salience by the way
(and number of times) documents refer to that object (Sect. 5.2.2).
The first approach to the automatic identification of landmarks has been presented
by Raubal and Winter [35]. Their approach reflects the three landmark characteris-
tics of Sorrows and Hirtle [45] discussed in Chaps. 3 and 4. Raubal and Winter’s
approach aims at capturing perceptual and cognitive aspects of geographic objects,
which are then used to calculate landmark salience. Their formal model of landmark
salience is based on the concept of attractiveness, which reflects the ‘landmarkness’
of an object, i.e., how strong a landmark candidate it is.
In accordance with Sorrows and Hirtle’s conceptual classification of landmark
aspects, Raubal and Winter define three different kinds of attractiveness: visual,
semantic, and structural. Geographic objects are visually attractive if they are in
sharp contrast to their surroundings or have a prominent spatial location. The formal
model for landmark salience includes four measures of visual attractiveness:
• Façade area: If the façade area of an object is significantly larger or smaller
than those of the surrounding objects, this object becomes highly noticeable. In
the original model, façade area is simply measured as the product of width and
height, assuming rectangular buildings; however, this can be extended to account
for more irregular shapes as well.
• Shape: Unusual shapes, especially among more regular, box-like objects, are
highly remarkable. Indeed, architects use this to make buildings stand out. The
model distinguishes two aspects of shape, the shape factor and the deviation.
The shape factor is simply the ratio of height to width. The deviation is
the ratio of the area of the minimum-bounding rectangle (mbr) of the object’s
façade and its façade area. Again, more complicated measures could be defined
if needed.
• Color: For humans, color is a clear indicator of visual attractiveness. For example,
a red building will be highly visible among a row of grey buildings. In the model,
color is measured as a decimal value derived from the RGB color space.
• Visibility: Like color, visibility is highly important for visual attractiveness. The
model assumes a two-dimensional visibility, defined by recognizability within
a buffer zone. Visibility is measured as the fraction (over its total size) of a
building’s front (its façade) in this buffer zone.
Table 5.1 Properties and measurement of visual, semantic, and structural attraction
Measure                Property               Measurement                                                Attractiveness measure
Visual attraction      Façade area            $p_{vf} = \int x \mid x \in \text{façade}$                 $s_{vis} = (p_{vf} + p_{vsf} + p_{vsd} + p_{vc} + p_{vv})/5$
                       Shape                  $p_{vsf} = \text{height}/\text{width}$
                                              $p_{vsd} = (\text{area}_{mbr} - \alpha)/\text{area}_{mbr}$
                       Color                  $p_{vc} = [R, G, B]$
                       Visibility             $p_{vv} = \sum x \mid x \text{ visible}$
Semantic attraction    Cultural and historic  $p_{sec} = [0, 1]$                                         $s_{sem} = (p_{sec} + p_{sem})/2$
                       Explicit mark          $p_{sem} = [0, 1]$
Structural attraction  Nodes                  $p_{stn} = i + o$                                          $s_{str} = (p_{stn} + p_{stb})/2$
                       Boundaries             $p_{stb} = \text{cell size} \cdot \text{form factor}$
Adapted from [35]
Fig. 5.2 Example of a wayfinding situation where a reference to a landmark may be used to
describe the required action; adapted from [35]
And since salience reflects local differences between geographic objects, i.e., one
object being noticeably different in one or several of the salience properties, the
final step is to calculate these differences, for example, by employing maximum or
minimum operations, which will result in the geographic object with the highest (or
lowest) salience value in a given local configuration of objects.
Let’s look at an example (from Raubal and Winter [35]) to see how this formal
model may be applied to determining landmark salience. Consider the situation
depicted in Fig. 5.2.
There are three buildings that are of interest at the marked intersection: Café
Aida, Bank Austria and the Haas building. Table 5.2 shows the values of each
property for the Haas building. The table also indicates which of the properties
are significant and, thus, need to be taken into account for calculating landmark
salience. Calculation is done using Eq. (5.1). For the other two buildings, deter-
mining their landmark salience works accordingly, which leaves us with a salience
score of 1.8 for the Haas building, 1.2 for Bank Austria, and 0.9 for Café Aida.
Applying a maximum operation results in the Haas building as being the most
salient geographic object at the intersection. It seems to be the most recognizable
one in terms of landmark identification. We will see below that this does not
automatically mean it is always the best landmark to use when communicating about
this intersection.
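The two operations used in this example, averaging property values into an attraction measure (as in Table 5.1) and applying a maximum operation over the candidates, can be sketched as follows. This is a minimal illustration only: the function names and the semantic property values are hypothetical, and the three salience scores are simply the ones quoted above rather than recomputed from raw data.

```python
from statistics import mean

def attraction(properties):
    """Mean of the property values of one attraction type (cf. Table 5.1)."""
    return mean(properties.values())

def most_salient(scores):
    """Maximum operation: the object with the highest salience score."""
    return max(scores, key=scores.get)

# Salience scores quoted in the text for the three buildings.
scores = {"Haas building": 1.8, "Bank Austria": 1.2, "Café Aida": 0.9}
print(most_salient(scores))

# Hypothetical semantic property values in [0, 1] for a single façade.
semantic = {"cultural_historic": 1.0, "explicit_mark": 0.0}
print(attraction(semantic))
```

Replacing `max` with `min` would give the object with the lowest value, which corresponds to the minimum operation mentioned above.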
Raubal and Winter’s model [35] has seen several extensions over the years.
Emphasizing the aspect of structural attraction, advance visibility [52] is one such
extension. The underlying assumption of advance visibility is that landmarks that
are identifiable early on along a route are more useful than those that can only be
spotted at the very last moment. Figure 5.3 shows an example of a landmark with
low and high advance visibility.
Fig. 5.3 A landmark with low (left) and high (right) advance visibility; modified from [52]
Formally, advance visibility v is a measure of route coverage c and orientation
o of the landmark in question. Route coverage reflects how much of a landmark is
visible along the entering route segment when approaching the intersection where
it is located. The route coverage measure is calculated as the ratio between the part
covered by the landmark and the total length of the route segment, which is specified
by start and end point $p_s$, $p_e$. Orientation $o$ measures the orientation of a landmark
towards the route. A landmark that is oriented in moving direction is easier to spot
than one that requires considerable head movement. Formally, orientation is defined
as the difference between façade orientation $d_f$ and route segment orientation $d_r$,
both in terms of cardinal directions.
$$v = c \cdot o \qquad (5.2)$$
$$c = \frac{|\overline{p_i p_e}|}{|\overline{p_s p_e}|}$$
$$o = \frac{|d_f - d_r|}{180}$$
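As a rough illustration, Eq. (5.2) and its two components can be written down directly. The sketch below makes several assumptions that are not stated in the text: points are planar (x, y) tuples, directions are given in degrees such that their difference lies between 0 and 180, and $p_i$ is taken to be the point on the route segment from which the landmark becomes visible. All function names are hypothetical.

```python
import math

def route_coverage(p_s, p_e, p_i):
    """c = |p_i p_e| / |p_s p_e| for planar points given as (x, y) tuples."""
    return math.dist(p_i, p_e) / math.dist(p_s, p_e)

def orientation(d_f, d_r):
    """o = |d_f - d_r| / 180, with both directions given in degrees.

    Assumes the inputs are chosen so that the difference lies in [0, 180].
    """
    return abs(d_f - d_r) / 180.0

def advance_visibility(p_s, p_e, p_i, d_f, d_r):
    """v = c * o, following Eq. (5.2)."""
    return route_coverage(p_s, p_e, p_i) * orientation(d_f, d_r)

# Hypothetical segment of 100 m; the landmark is visible for the last 75 m
# and its façade direction differs from the route direction by 180 degrees.
v = advance_visibility((0, 0), (100, 0), (25, 0), d_f=180, d_r=0)
print(v)
```

A landmark visible along the whole segment ($p_i = p_s$) yields $c = 1$, so that advance visibility is then determined by orientation alone.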
Fig. 5.4 The route investigated in [31]. White dots indicate differing selections between partici-
pants and model, black dots coinciding selection. Modified from [31]
Let us return to the example of Fig. 5.2. If also accounting for advance visibility,
it turns out that for the given direction of travel, the Bank Austria building has
a higher salience value than the Haas building. Coverage for Bank Austria is
$c = 1$, and the orientation is also close to 1, while the Haas building is oriented
nearly completely away from the route. In principle, advance visibility needs to be
calculated for every landmark from every possible direction of approach to cover
every possible wayfinding situation.
Klippel and Winter [26] presented a further extension of the salience model
that expands the aspect of structural salience. It accounts for the location of
landmarks along the route and the kind of wayfinding action that needs to
be performed. Since this is route-specific and not a general property of a
landmark candidate, their approach will be further discussed in Sect. 5.3.
Raubal and Winter’s formal model has been empirically evaluated by Nothegger,
Winter, and Raubal [31] (see also [53]). They implemented (partially manual)
measures for façade area and shape, color, visibility, and semantic attraction of
a building. Data for one route through Vienna’s first district has been collected,
combining several data sources (again, a partially manual process); see Fig. 5.4.
Using this data, for each intersection along the route the most salient landmark
was calculated. These were then compared with the results of a human subject
study, where participants had been asked to select the most prominent façade for
each intersection while viewing a 360° panorama photograph of that intersection.
In seven of the nine intersections, automatic selection corresponded to human
selection, showing the power of this formal model for landmark salience—or more
precisely façade salience.
The same authors also discussed approaches of adapting the parameters of the
model to specific contexts [53]. Again using an empirical study, they established
weighting factors for different aspects of the model when encountering façades
during day or night. People were asked to select the most prominent façade at an
intersection from a photograph (showing either a day or a night shot), and then
to rank different aspects according to their importance for making this selection.
Table 5.3 lists these factors. While visibility is important both during the day and at
night, shape does not seem to be used at all at night. Instead a façade’s area becomes
much more important. The same holds for marks, especially if they are illuminated.
Elias [11] identified this challenge of data collection and parameter adaptation as
the weak spot of Raubal and Winter’s salience model. She proposed to use existing
topographic or cadastral data sets and to run machine learning approaches to identify
potential landmark candidates. The attributes in these data sets can be used as feature
vectors describing the different geographic objects. These feature vectors may be
fed into classification or clustering algorithms, which will then identify ‘outliers’,
i.e., geographic objects that are not easily joined with other objects. Arguably, these
outliers stand out from their surrounding environment and, thus, can be considered
to be landmarks following Presson and Montello’s definition [32].
Depending on the chosen classification algorithm, using this approach may
require normalizing the data first. Attributes may need to be preprocessed such
that they are all on the same scale and use the same measurement type (ordinal,
nominal, etc.). This is to ensure that no single attribute dominates the discrimination
between objects. Elias focused on buildings as landmark candidates and proposed
the attributes listed in Table 5.4 for use in the classification. As can be seen,
these attributes either refer to land use or are derived from the geometry of buildings,
in other words information that can be expected from a cadastral data set. This
makes data collection easier; however, it also means that visual attractiveness can at
best only be implicitly inferred from non-visual attributes.
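The effect of normalization on outlier detection can be illustrated with a small sketch. Note that this is neither ID3 nor Cobweb: it deliberately swaps in a much simpler technique (z-score normalization followed by a distance-from-the-mean score) only to show why attributes should be brought to a common scale before objects are compared. The buildings, their attribute values, and all function names are hypothetical.

```python
from statistics import mean, pstdev

def z_normalize(rows):
    """Rescale every numeric attribute (column) to zero mean and unit variance."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # guard against zero spread
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]

def outlier_scores(rows):
    """Euclidean distance of each normalized object from the mean (the origin)."""
    return [sum(v * v for v in row) ** 0.5 for row in z_normalize(rows)]

# Hypothetical buildings: (size in m^2, elongation, distance to road in m).
buildings = {
    "house A": (120.0, 1.2, 5.0),
    "house B": (130.0, 1.1, 6.0),
    "house C": (125.0, 1.3, 5.5),
    "church": (900.0, 2.5, 20.0),
}
scores = dict(zip(buildings, outlier_scores(list(buildings.values()))))
candidate = max(scores, key=scores.get)
print(candidate)
```

Without the normalization step, the building size (in the hundreds) would dominate the distance, and the other two attributes would hardly contribute.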
From the many possible approaches, Elias chose to test her approach using
ID3 [33], a supervised classification algorithm, and Cobweb [13], a hierarchical
clustering algorithm. While both approaches seem promising, ID3 has the advantage
of not only identifying salient buildings, but also making attributes contributing to
a building’s salience explicit. Thus, it would be easier to generate guidelines for
landmark identification from ID3 than Cobweb.
Table 5.4 Attributes of buildings used in the classification of a landmark candidate (from [11])
Attribute                                   Description
Building use                                Public, residential, Outbuilding, . . .
Building label                              Name or function of building
Size of building                            Length × width in m²
Elongation                                  Ratio length/width
Number of corners                           Counting corners (normally 4 to 6)
Single building                             All alone, single in a row, one Neighbor, . . .
Building moved away from road               Closest distance to road in m
Ratio of building area to parcel area       Building ground area / parcel area
Density of buildings (direct neighborhood)  Number of buildings / (100 m × 100 m), in 1/m²
Density of buildings (district)             Number of buildings / (500 m × 500 m), in 1/m²
Orientation to road                         Along (length towards road), across (width), angular, building at corner (in grad)
Orientation to north                        Angle building length to north in rad
Orientation to neighbor                     Difference angle to neighbor in rad
Perpendicular angle in building             Deviation of angles to normal in rad
Parcel land use                             Public, residential, commerce and service, industrial, . . .
Number of buildings on parcel               Counting buildings
Special building objects on parcel          Number of car ports, winter gardens, . . .
Neighbor land parcel use                    0 or 1 (Boolean)
Form of parcel area                         Number of corners, number of neighbors
The approaches discussed so far determine local landmarks, i.e., they allow
identifying geographic objects that stand out from their immediate surroundings.
However, they do not establish relationships between these landmarks and they
do not rank which landmark is the dominating one for a given area, i.e., which
best represents this area in any references made to it. Linking the conceptual ideas
of Raubal and Winter’s formal model [35] with Elias’ data mining approach to
determining local landmarks [11], Winter et al. [55] addressed exactly this problem.
Their approach allows for generating a leveled hierarchy of local landmarks.
Algorithm 5.1 illustrates their approach. Fundamentally, it is based on partitioning
an environment using Voronoi diagrams [1] (Sect. 4.2.5). On the lowest level,
the most salient landmark in each local neighborhood—say an intersection—is
identified using an unsupervised adaptation of the ID3 algorithm (Line 1). These
most salient landmarks form the next higher level in the hierarchy. Step 1 of the
algorithm gets repeated until there is only one landmark left, i.e., the Voronoi
diagram only consists of a single cell.
5 Compute the Voronoi partition and the Delaunay triangulation of $L_{i+1}$ and go back to
step 1.
6 else
7 Stop.
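The general idea of repeatedly promoting locally dominant landmarks until a single one remains can be sketched as below. This is a loose illustration and not a reproduction of Algorithm 5.1: salience values are assumed to be given rather than derived with an adaptation of ID3, and neighborhoods are approximated by the k nearest landmarks instead of Voronoi/Delaunay adjacency. All names and values are hypothetical.

```python
import math

def build_hierarchy(landmarks, k=2):
    """landmarks: dict name -> ((x, y), salience). Returns the list of levels."""
    levels = [sorted(landmarks)]
    current = dict(landmarks)
    while len(current) > 1:
        promoted = {}
        for name, (pos, sal) in current.items():
            # Neighbors: the k nearest other landmarks (a stand-in for
            # adjacency in the Voronoi/Delaunay structure).
            others = sorted(
                (n for n in current if n != name),
                key=lambda n: math.dist(pos, current[n][0]),
            )[:k]
            # Promote a landmark if no neighbor outranks it (ties broken by name).
            if all((sal, name) > (current[n][1], n) for n in others):
                promoted[name] = (pos, sal)
        current = promoted
        levels.append(sorted(current))
    return levels

# Two hypothetical groups of landmarks along a street.
landmarks = {
    "kiosk": ((0, 0), 0.2), "cafe": ((1, 0), 0.5), "church": ((3, 0), 0.9),
    "bank": ((20, 0), 0.6), "tower": ((21, 0), 0.7), "shop": ((23, 0), 0.3),
}
levels = build_hierarchy(landmarks)
for i, level in enumerate(levels):
    print(i, level)
```

Each pass removes at least the weakest remaining landmark and keeps the strongest one, so the loop ends with exactly one landmark at the top level.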
Fig. 5.5 A toy example for hierarchical clustering of POIs. Initially, every POI is its own
cluster (a). In a second step, the closest two POIs form a new cluster. The new cluster center—
the centroid—is shown in black, the original POIs in dark grey (b). This process is repeated until
there is only a single cluster left (c, d)
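The clustering process described in the caption of Fig. 5.5 can be sketched as a simple agglomerative procedure. The sketch assumes planar POI coordinates and measures the distance between clusters by the distance of their centroids; function names and the sample coordinates are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y) points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def agglomerate(pois):
    """Merge the two clusters with the closest centroids until one remains.

    Returns the sequence of merges as (members_a, members_b) tuples.
    """
    clusters = [[p] for p in pois]  # initially, every POI is its own cluster
    merges = []
    while len(clusters) > 1:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: math.dist(centroid(clusters[ij[0]]),
                                                   centroid(clusters[ij[1]])))
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges

merges = agglomerate([(0, 0), (1, 0), (5, 0), (9, 0)])
for a, b in merges:
    print(a, "+", b)
```

The recorded merge sequence is what defines the hierarchy: early merges correspond to fine-grained clusters, later ones to coarser clusters.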
former by the latter gives a number that indicates how unique a given n-gram is for a
cluster because higher numbers indicate that this n-gram appears frequently within
a cluster, but only sparsely outside it. In addition, Mummidi and Krumm also define
a minimum size for a valid cluster, and measure ‘term purity’, i.e., the fraction of
descriptions within a cluster that contain a given n-gram.
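Term purity as defined above is straightforward to compute. The following sketch assumes that n-grams are sequences of whitespace-separated, lower-cased words; this tokenization, the function names, and the sample descriptions are assumptions made for illustration.

```python
def ngrams(text, n):
    """Set of word n-grams of a description (lower-cased, whitespace-split)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def term_purity(descriptions, ngram):
    """Fraction of descriptions in a cluster that contain the given n-gram."""
    n = len(ngram.split())
    hits = sum(1 for d in descriptions if ngram.lower() in ngrams(d, n))
    return hits / len(descriptions)

# Hypothetical descriptions attached to the POIs of one cluster.
cluster = ["Space Needle", "the space needle", "Seattle Center", "Space Needle view"]
purity = term_purity(cluster, "space needle")
print(purity)
```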
Others have used tags associated with Flickr photographs to find appropriate
labels for places or to delineate city cores [18, 34], using similar ideas, but different
techniques to those just presented.
Next, we will have a look at approaches that aim at identifying prototypical
and/or prominent views for specific locations from photograph collections. For
example, Kennedy and Naaman [23] as well as Zheng et al. [58] use unsupervised
learning techniques to find canonical (prototypical) views for given clusters of
photographs that (likely) show the same geographic object. They use Flickr¹ or
Picasa² and Panoramio³ as data sources, respectively, exploiting tags, geographic
position, and the images themselves.
¹ http://www.flickr.com, last visited 8/1/2014
² https://picasaweb.google.com, last visited 8/1/2014
³ http://www.panoramio.com, last visited 8/1/2014
Table 5.5 The five most prominent landmarks of the world and for some well-known cities,
according to the approach by [5]
          1st                  2nd              3rd             4th                  5th
World     Eiffel               Trafalgarsquare  Tatemodern      Bigben               Notredame
New York  Empirestatebuilding  Timessquare      Rockefeller     Grandcentralstation  Applestore
London    Trafalgarsquare      Tatemodern       Bigben          londoneye            Piccadilycircus
Rome      Colosseum            Vaticano         Pantheon        fontanaditrevi       Basilica
Berlin    Brandenburgertor     Reichstag        Potsdamerplatz  Berlinerdom          Tvtower
Photos taken with digital cameras may contain much more information than
just the image itself. The Exchangeable image file format (Exif) standard
specifies the format for images, as well as for sound and tags, used by digital
cameras, including those in smart phones. This information encompasses the
camera type and basic information about how the image was taken, namely,
exposure, aperture and focal length, but also the date it was taken, the (GPS)
position it was taken at (if the camera allows for this), the photo’s resolution,
the color space, the ISO setting, aspect ratio, the software used in the camera,
and much more. In addition, today’s photo websites store information about
who has uploaded a photo (the user, either by name or ID), when it was
uploaded, and which digital album it is contained in. Furthermore, these
sites allow users to describe their photos by comments and/or tags. Tags are
freeform keywords associated with an element—here a photo. Typically, users
are free to choose whatever content they like in their tags.
Crandall et al. [5] may have presented the most advanced attempt to mine such
digital photograph collections. Their approach can deal with millions of photos. The
authors are able to identify the most representative images for the most prominent
landmarks of the largest US or European cities, and to track photographers through
a city, for example. This approach exploits the so-called scale of observation, i.e.,
the fact that on different levels of scale different effects are observable. When observing
on the scale of countries, cities will appear as clusters; when observing on the scale
of a single city, points of interest and tourist attractions will emerge.
Using an estimate of the scale, latitude / longitude values of the photos’ locations
are treated as points in a plane and clustered by the mean shift technique. This
technique yields peaks in the distribution of photos. The magnitude of these peaks
represents the number of different photographers that took a photo at this location.
Table 5.5 shows some results for different cities. The labels are automatically
generated by picking the top tag within a cluster according to TF-IDF (term
frequency-inverse document frequency).
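The clustering step can be illustrated with a bare-bones mean shift procedure. The sketch below is an assumption-laden simplification, not the implementation of Crandall et al.: it uses a flat kernel, a fixed number of iterations, a hand-picked bandwidth standing in for the scale estimate, and it counts distinct photographers per peak. All names and sample data are hypothetical.

```python
import math

def mean_shift(points, bandwidth, iterations=50):
    """Move every point to the mean of the points within `bandwidth` (flat kernel)."""
    modes = list(points)
    for _ in range(iterations):
        shifted = []
        for m in modes:
            near = [p for p in points if math.dist(m, p) <= bandwidth]
            shifted.append((sum(x for x, _ in near) / len(near),
                            sum(y for _, y in near) / len(near)))
        modes = shifted
    return modes

def peaks(photos, bandwidth):
    """Group converged modes and count distinct photographers per peak."""
    modes = mean_shift([pos for pos, _ in photos], bandwidth)
    found = []  # list of (peak position, set of photographers)
    for mode, (_, user) in zip(modes, photos):
        for centre, users in found:
            if math.dist(centre, mode) < bandwidth / 2:
                users.add(user)
                break
        else:
            found.append((mode, {user}))
    return sorted(((c, len(u)) for c, u in found), key=lambda cu: -cu[1])

# Hypothetical photos: (position treated as a point in the plane, photographer).
photos = [((0.0, 0.0), "u1"), ((0.1, 0.0), "u2"), ((0.0, 0.1), "u1"),
          ((5.0, 5.0), "u3"), ((5.1, 5.0), "u3")]
result = peaks(photos, bandwidth=1.0)
print(result)
```

Counting photographers rather than photos keeps a single prolific user from creating a peak alone, which matches the magnitude measure described above.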
Fig. 5.6 The canonical views for several of Europe’s most photographed cities; from
http://www.cs.cornell.edu/w8/~crandall/maps/map-europe.png (last visited 15/11/2013). Figure used with
permission by the author
For the identified landmarks, Crandall et al. look for canonical views, i.e., typical
photographs. This combines clustering with graph construction. Images are tested
for similarity using SIFT (Scale Invariant Feature Transform; see [28]) interest
points. A graph is constructed with each photo as a node and each edge between
a pair of nodes is weighted with the similarity between the two photos. The graph
is partitioned to find clusters of similar photos. For each cluster the node (photo)
with the largest weighted degree is chosen as the canonical view for this cluster.
Figure 5.6 shows an example of canonical views for some of Europe’s most popular
cities in the form of a map.
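The final selection step, choosing the photo with the largest weighted degree, can be sketched as follows. The sketch assumes that the clusters and pairwise similarities are already available and that only edges within a cluster count towards the degree; the similarity values and names are hypothetical and do not come from an actual SIFT comparison.

```python
def canonical_view(cluster, similarity):
    """Photo with the largest weighted degree among the photos of one cluster.

    `similarity` maps frozenset({photo_a, photo_b}) to an edge weight.
    """
    def weighted_degree(photo):
        return sum(similarity.get(frozenset((photo, other)), 0.0)
                   for other in cluster if other != photo)
    return max(cluster, key=weighted_degree)

# Hypothetical similarity graph over three photos of the same landmark.
similarity = {
    frozenset(("a", "b")): 0.9,
    frozenset(("b", "c")): 0.8,
    frozenset(("a", "c")): 0.1,
}
best = canonical_view(["a", "b", "c"], similarity)
print(best)
```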
Obviously, all the approaches presented in this section can only find what is in
the sources. Their results depend on the available data, i.e., on what people upload to
the Internet. Thus, these approaches are very good at picking up the most prominent
locations in a country or city, where there will be many uploads (often from tourists),
but will fail to pick up local landmarks in some residential neighborhoods because
very little, if any, data is published about them. This issue will be further
discussed at the end of this chapter in Sect. 5.5.
5.3 Landmark Integration
When a set of landmark candidates exists, these can be used in a range of services,
some of which we will discuss in the next chapter. Most of the time, this will
require selecting one, or very few, landmarks from the set of all candidates.
This is necessary because not all landmark candidates will be relevant for a given
situation or task. And often it will be the case that the landmark that gets selected
is not the most outstanding of the whole set, according to the salience measure
applied. Rather, it will be the most relevant landmark. Accordingly, this section
covers several approaches to determining relevance of a landmark candidate for
specific tasks and situations, predominantly in navigation scenarios.
Klippel and Winter [26], for example, extended the salience model by Raubal
and Winter [35] in order to select a landmark at an intersection that is most suitable
to initiate a turning action (e.g., ‘turn left’ or ‘veer right’). Figure 5.7 illustrates
possible locations of landmarks with respect to turning at an intersection. Not all
landmarks are equally suitable for identifying the required turn.
The location of a landmark relative to a turn at an intersection is an aspect
of structural salience. The extended model uses advance visibility, as it has
been discussed in the previous section, but additionally takes into account the
configuration of the street network and the route along this network. This results in
a distribution of salience values for different landmark candidates, which allows for
a selection of the most suited, i.e., most relevant landmark for a specific wayfinding
situation. Usually, landmarks located at an intersection are structurally more salient
than those between intersections along a street segment, and those passed before a
turning action are more salient than those after a turn. Figure 5.8 illustrates a street
intersection and the different salience values of the buildings located there. You can
clearly observe the emergence of a salience distribution, with the façade facing the
ingoing street segment of the building located before the intersection being the most
salient one in this case.
Fig. 5.7 Possible locations of landmarks with respect to turning actions; modified from [26]
Fig. 5.8 (a) Turning right at an intersection with three buildings as landmark candidates (only
relevant façades are considered); (b) the resulting overall salience values. Adapted from [26]
This is an important insight. In wayfinding, relevance of a landmark depends on
its location relative to the street network. Consequently, it is important to be
able to compute this location. In principle, this is a geometric problem, which
could be solved by calculating various distances and angles for all possible different
configurations. Such geometric solutions are complicated by the fact that landmarks
may not only be represented as points, but may also be linear or area-like [17].
This will have an influence on the geometric operations required to determine their
relative location with respect to the street network.
Richter [36, 40] presented an alternative approach by using qualitative descrip-
tions of a landmark’s location. These descriptions reflect how landmarks are referred
to in human descriptions (see Chap. 6), employing concepts, such as ‘before’ or
‘after’ a turn. In a nutshell, this approach exploits (circular) ordering information
to determine a landmark’s relative location, which works for different landmark
geometries.
This approach—as many others—models a street network as a graph-like repre-
sentation, where nodes represent intersections and edges represent street segments.
5.3 Landmark Integration 153
Fig. 5.9 A decision point with two functionally relevant branches, the incoming and outgoing
route-segment. From [36], modified
Fig. 5.10 Circular order of objects at an intersection. Starting with A, the order is
A < D < C < church < B < A (from [36], modified)
Fig. 5.11 Three functionally different locations of a point landmark relative to a decision point:
(a) turning action after passing the landmark; (b) turning action before passing the landmark; (c)
landmark not at a functionally relevant branch (from [36], modified)
landmark may be next to the incoming route segment, next to the outgoing route
segment, or next to any of the other branches of the intersection. Accordingly, the
task now is to determine this next to relationship.
A next to relationship corresponds to a neighborhood relation between a
landmark and a branch, which gets us back to the circular ordering. By introducing
a virtual branch that connects the landmark with the decision point we include the
landmark into the branches’ circular ordering. With that, we can determine whether
the virtual branch is a neighbor to one of the functionally relevant branches, simply
by checking whether one succeeds the other in the ordering. If the landmark is a
direct successor or predecessor of the incoming route segment, the turning action
is performed after the landmark is passed (denoted with lm<). Accordingly, if
the landmark is a neighbor to the outgoing route segment, the turning action is
performed before passing the landmark (lm>). If the landmark is not a direct
neighbor of either of the functionally relevant route segments, all we can say is
that a turning action is performed at the decision point with this particular landmark
(denoted by lm).
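The neighborhood test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from [36]: branches and the landmark are given as planar coordinates, the circular order is obtained by sorting bearings around the decision point, and the names `bearing` and `landmark_relation` are hypothetical. When the landmark neighbors both functionally relevant branches, the sketch arbitrarily gives the incoming branch precedence.

```python
import math

def bearing(origin, target):
    """Angle of target as seen from origin, in radians within [0, 2*pi)."""
    return math.atan2(target[1] - origin[1], target[0] - origin[0]) % (2 * math.pi)

def landmark_relation(decision_point, branches, incoming, outgoing, landmark):
    """Classify a point landmark as 'lm<', 'lm>' or 'lm'.

    branches maps a branch name to a point on that branch; incoming and
    outgoing name the two functionally relevant branches (assumed input format).
    """
    # Virtual branch connecting the decision point with the landmark.
    items = dict(branches)
    items["_landmark"] = landmark
    # Circular order: sort all branches by bearing around the decision point.
    order = sorted(items, key=lambda name: bearing(decision_point, items[name]))
    pos = order.index("_landmark")
    neighbours = {order[pos - 1], order[(pos + 1) % len(order)]}
    if incoming in neighbours:
        return "lm<"  # turning action after the landmark is passed
    if outgoing in neighbours:
        return "lm>"  # turning action before passing the landmark
    return "lm"       # landmark is merely at the decision point

# Four-way intersection, approached from the south, turning right (east).
branches = {"south": (0, -1), "east": (1, 0), "north": (0, 1), "west": (-1, 0)}
print(landmark_relation((0, 0), branches, "south", "east", (-1, -1)))
```

Because only the ordering is used, the same test applies regardless of how far the landmark is from the decision point.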
One further restriction is illustrated in Fig. 5.12. A landmark may be next to
the incoming route segment, but still only be passed after the turning action or
not directly passed at all (e.g., at a T-intersection). Therefore, we need to further
restrict the neighborhood region for the incoming and outgoing route segments, such that
only the area before or after the turning action, respectively, is considered. Again,
this is solved by introducing virtual branches demarcating these regions. These
virtual branches start at the decision point and are perpendicular to the incoming
(or outgoing) route segment. We call the area next to the incoming route segment
before-region, and the area next to the outgoing route segment after-region [40].
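One way to picture the effect of the perpendicular virtual branches is a simple half-plane test, sketched here in Python. This is an assumed geometric shortcut rather than the ordering-based procedure of [40]: a landmark is taken to lie on a branch's side of the perpendicular through the decision point when the dot product of the two direction vectors is positive. The function name `in_region` is hypothetical, and the test would be combined with the neighborhood check rather than replace it.

```python
def in_region(decision_point, branch_point, landmark):
    """True if the landmark lies on the branch's side of the perpendicular
    virtual branches through the decision point (assumed half-plane test)."""
    bx = branch_point[0] - decision_point[0]
    by = branch_point[1] - decision_point[1]
    lx = landmark[0] - decision_point[0]
    ly = landmark[1] - decision_point[1]
    # Positive dot product: the angle between branch and landmark is below 90 degrees.
    return bx * lx + by * ly > 0

# T-intersection approached from the west; incoming branch point is (-1, 0).
print(in_region((0, 0), (-1, 0), (-1, 1)))  # landmark beside the incoming segment
print(in_region((0, 0), (-1, 0), (1, 1)))   # landmark beyond the decision point
```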
Using this combination of virtual branches and reasoning with ordering informa-
tion we can determine where a point landmark is located relative to the route and,
consequently, what its functional role is in terms of the wayfinding process (see
Chap. 6). However, as stated above, landmarks are not always represented as points.
They may also be extended objects, i.e., a line or polygon. Richter’s approach can
be extended to handle these objects as well in a straightforward manner [36].
Fig. 5.12 Virtual branches demarcating before- and after-region are needed to correctly distin-
guish passing a landmark before or after the turning action. (a) An example case where a landmark
is next to the incoming route segment, but actually not directly passed during the turning action;
(b) the before-region for this situation; (c) the after-region
Fig. 5.13 Example of the relation of an extended landmark to the route. For each coordinate,
its relation to the route is determined. Since different relations hold, the overall relation is lm
(from [36], modified)
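The combination step for extended landmarks can be sketched as follows. The caption only states that differing per-coordinate relations yield the overall relation lm; that identical relations carry over unchanged is an assumption of this sketch, and the function name `overall_relation` is hypothetical.

```python
def overall_relation(relations):
    """Combine the per-coordinate relations of an extended landmark."""
    distinct = set(relations)
    if len(distinct) == 1:
        # Assumption: all coordinates agree, so their relation is kept.
        return distinct.pop()
    # Different relations hold, so only the general relation remains.
    return "lm"

print(overall_relation(["lm", "lm", "lm<", "lm<"]))
```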
cussed in the next chapter. Landmark integration is based on work by Williams [50].
The approach is not well documented, but seems to use the location of a landmark
and the travel direction as parameters to determine whether a landmark is relevant
for describing a given route.
As a final thought in this section, let us step away from looking at each individual
geographic object in determining its relevance for the given context, and consider a
simplified, more general approach. Recently, Duckham et al. [9] explored categories
of features instead of individual properties to determine suitability as a landmark.
They established a rank order of different categories, such as restaurants, gas
stations, or schools, which is based on nine different aspects (e.g., physical size,
proximity to road, visibility, or permanence). For each aspect default assumptions
are made for every category. The rank order was established by a panel of experts,
who also decided on how many instances of a given category are likely to be typical.
The resulting scoring system for each aspect is shown in Table 5.6.
The overall suitability (or relevance) score is defined as the linear sum of all
nine suitability aspects. This sum is normalized in the range $[0, 1]$, with 1 being
most suitable and 0 being least suitable (Eq. 5.3). Normalizing the weighting makes
the score comparable between different settings and also independent from the
absolute numerical values used in the expert rating. The score value $\mathrm{score}_f(c)$ is
the suitability score from Table 5.6 for a landmark category $c \in C$ with respect to
the suitability aspect $f$. $F$ is the set of all nine suitability aspects.
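\[
w(c) = \frac{\sum_{f \in F} \mathrm{score}_f(c) - \min\left(\left\{ \sum_{f \in F} \mathrm{score}_f(c') \mid c' \in C \right\}\right)}{\max\left(\left\{ \sum_{f \in F} \mathrm{score}_f(c') \mid c' \in C \right\}\right)} \tag{5.3}
\]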
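A minimal Python sketch of Eq. 5.3 may help to follow the computation. The category names, aspect names, and score values below are invented for illustration and are not taken from Table 5.6; the function name `suitability_weights` is hypothetical, and the sketch assumes that at least one category has a positive total score.

```python
def suitability_weights(scores):
    """scores maps category -> {aspect: score}; returns category -> w(c)."""
    # Linear sum over all suitability aspects for each category.
    totals = {c: sum(aspects.values()) for c, aspects in scores.items()}
    lowest, highest = min(totals.values()), max(totals.values())
    # Subtract the smallest total and divide by the largest total, as in Eq. 5.3.
    return {c: (total - lowest) / highest for c, total in totals.items()}

# Hypothetical scores with two aspects instead of nine.
example = {
    "church": {"size": 8, "visibility": 8},
    "restaurant": {"size": 4, "visibility": 6},
    "school": {"size": 2, "visibility": 2},
}
print(suitability_weights(example))
```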
4 http://www.whereis.com.au, last visited 8/1/2014
candidates (points of interest) along the route. At each intersection with at least one
landmark candidate, the one with the highest weight gets selected—if two or more
landmarks have the same weight, an arbitrary decision is made.
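The selection rule just described amounts to a maximum search per intersection. The following Python sketch assumes a hypothetical input format (a mapping from intersections to candidate records with a `category` field, plus a table of category weights); none of these names come from the original system.

```python
def select_landmarks(candidates_by_intersection, weight):
    """Pick the highest-weighted landmark candidate at each intersection."""
    selected = {}
    for intersection, candidates in candidates_by_intersection.items():
        if candidates:  # only intersections with at least one candidate
            # max() returns the first maximum it meets, which serves as
            # the arbitrary decision between equally weighted candidates.
            selected[intersection] = max(
                candidates, key=lambda c: weight[c["category"]]
            )
    return selected

weight = {"church": 0.75, "restaurant": 0.375}  # hypothetical category weights
candidates = {
    "n1": [{"name": "Luigi's", "category": "restaurant"},
           {"name": "St Mary's", "category": "church"}],
    "n2": [],
}
print(select_landmarks(candidates, weight))
```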
Selection processes for landmark candidates have also been integrated into routing
algorithms. These routing algorithms aim at a route that is easy to follow, as opposed
to the shortest or fastest route that standard routing algorithms deliver. As previously
discussed and further elaborated in the next chapter, (references to) landmarks make
it easier for people to find their way.
These routing algorithms typically use the Dijkstra shortest path algorithm, but
instead of simply using geometric distances, weights are based on cognitive criteria
that determine the ease of finding the way. In the following, two such approaches
are presented in some more detail: the landmark spider by Caduff and Timpf [4] and
simplest instructions by Richter and Duckham [39].
Caduff and Timpf’s algorithm aims at guiding wayfinders past landmarks at
decision points. For each decision point, it determines the most relevant landmark,
using its relevance as weighting parameter. More formally, the weight of an edge is
the weighted sum of a landmark’s distance to a decision point, the orientation of a
traveler with respect to the landmark, and the landmark’s salience (Eq. 5.4).
The general idea of this weighting function is illustrated in Fig. 5.14. Closer
landmarks are taken to be more relevant than landmarks farther away from a decision
point [49]. Similar to the concept of advance visibility, landmarks in movement
direction are easier to spot, and thus more relevant than those located behind a
traveler.
However, since the Dijkstra algorithm uses a minimum-weight approach, weights
have to be converted such that highly relevant landmarks get a very low weight,
while less relevant landmarks receive a higher weight. That is, weights here serve
as a penalty; good landmarks mean low penalty. The corresponding algorithm is listed
in Algorithm 5.2.
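To make the penalty idea concrete, here is a hedged Python sketch. It is not Algorithm 5.2 and does not reproduce Eq. 5.4: the form of the weighted sum, the scaling of all inputs to [0, 1], the equal default coefficients, and the names `edge_penalty` and `dijkstra` are assumptions made for illustration only.

```python
import heapq

def edge_penalty(distance, orientation, salience, weights=(1.0, 1.0, 1.0)):
    """Penalty of an edge; all inputs are assumed to be scaled to [0, 1].

    distance: landmark's distance to the decision point (0 = at the point)
    orientation: angular offset from the travel direction (0 = straight ahead)
    salience: salience of the landmark (1 = highly salient)
    """
    a, b, c = weights
    # Relevance is high for close, well-oriented, salient landmarks.
    relevance = a * (1 - distance) + b * (1 - orientation) + c * salience
    # Dijkstra minimises, so invert: good landmarks give a low penalty.
    return (a + b + c) - relevance

def dijkstra(graph, source):
    """graph maps node -> {neighbour: penalty}; returns lowest total penalty per node."""
    cost = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        current, node = heapq.heappop(queue)
        if current > cost.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, penalty in graph.get(node, {}).items():
            candidate = current + penalty
            if candidate < cost.get(neighbour, float("inf")):
                cost[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return cost

graph = {
    "A": {"B": edge_penalty(0.1, 0.0, 0.9), "C": edge_penalty(0.8, 0.5, 0.2)},
    "B": {"D": edge_penalty(0.2, 0.1, 0.8)},
    "C": {"D": edge_penalty(0.1, 0.0, 1.0)},
}
print(dijkstra(graph, "A"))
```

In this toy network the route via B accumulates less penalty than the route via C, because both of its decision points have close, salient landmarks.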
Richter and Duckham’s [39] algorithm is more complex, but essentially does
something similar. The algorithm is based on Richter’s work on context-specific
route directions [37] (see next Chapter), which not only integrate landmarks into
the instructions, but also minimize the number of instructions. The overall aim is to
generate instructions that are memorizable and easy to follow. This is reflected in
the algorithm.
This algorithm operates on the complete line graph [8, 51]. The complete line
graph reflects movement options in a graph, i.e., it captures all possibilities to turn
from one edge onto another. Formally, this is the graph $G' = (E', \mathcal{E})$. $E'$ is the set
of edges in $G$, where the direction of edges is ignored (i.e., $(v_i, v_j) = (v_j, v_i)$ in
$E'$). $\mathcal{E}$ is the set of pairs of edges in $E$ that share their middle vertex, i.e.,
$\mathcal{E} = \{((v_i, v_j), (v_j, v_k)) \in E \times E\}$. A pair of adjacent edges in $E$ represents a decision
an agent can take to move from one edge to the next. Figure 5.15 illustrates these
definitions further.
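The construction of the complete line graph can be sketched in Python as follows. The representation is an assumption of this sketch: edges are stored as `frozenset` objects so that direction is ignored, and a decision is an unordered pair of edges that share a vertex. The function name `complete_line_graph` is hypothetical.

```python
from itertools import combinations

def complete_line_graph(edges):
    """Return (E', pairs): undirected edges and all pairs sharing a vertex."""
    # Ignore direction: (vi, vj) and (vj, vi) collapse into one edge.
    undirected = {frozenset(e) for e in edges}
    pairs = set()
    for e1, e2 in combinations(undirected, 2):
        if e1 & e2:  # the two edges share their middle vertex
            pairs.add(frozenset((e1, e2)))
    return undirected, pairs

edges = [("a", "b"), ("b", "a"), ("b", "c"), ("b", "d"), ("d", "e")]
line_vertices, decisions = complete_line_graph(edges)
print(len(line_vertices), len(decisions))
```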
Richter and Duckham’s algorithm models the instructions required to describe
a route as a set I of instruction labels. Instructions are associated with decisions
(the pairs of adjacent edges). These instructions describe what to do in order to
move from one edge to another (e.g., ‘turn right at this intersection’). Specifically
these instructions may contain references to landmarks (‘turn right at the church’).
There may be more than one instruction associated with a pair of edges, reflecting
alternative ways of describing the decision. And, likewise, the same instruction may
be associated with several pairs of adjacent edges—there may be many intersections
in the network where a right turn is possible. However, the algorithm assumes that
there is no ambiguity in instructions, i.e., ‘turn right’ may not describe two or more
decisions from the same edge.
Fig. 5.15 The complete line graph and how it applies to the labeling and decision function in
the simplest instruction algorithm. (a) An example street network; the solid nodes represent the
intersections (vertices), the hollow nodes are the vertices of the line graph (the edges in the original
graph). (b) The actual complete line graph; the original graph is shown in light grey for reference.
(c) Instruction labeling for some of the edges as seen from the solid vertex. Applying the decision
function $d(\text{solid node}, s)$ when at this solid vertex would result in the vertex just above it
Formally, instructions are covered by the labeling function $l : \mathcal{E} \to 2^I$ and the
decision function $d : E \times I \to E \cup \{\emptyset\}$. For a given edge $e$ and an instruction $i$,
$d(e, i) = e'$ gives the edge $e'$ that results from executing the decision $i$ at $e$, which
may be the empty set, indicating that instruction $i$ is not possible to execute at $e$.
Each instruction has a cost associated with it, represented by the weighting function
$w : I \to \mathbb{R}^+$. Costs reflect the cognitive effort to execute an instruction, i.e., how
difficult it is to understand and to identify what to do in the actual environment.
Figure 5.15 shows an example that illustrates these concepts.
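The three functions can be mimicked with plain dictionaries, as in the following Python sketch. The edge names, instruction strings, and cost values are invented for illustration; `decide` and `cheapest_instruction` are hypothetical helper names, and `None` stands in for the empty set.

```python
# Hypothetical toy data: edges are named e1..e4, instructions are strings.
labels = {  # l: pair of adjacent edges -> set of instructions
    ("e1", "e2"): {"turn left", "turn left at the church"},
    ("e1", "e3"): {"go straight"},
    ("e1", "e4"): {"turn right"},
}
weights = {  # w: instruction -> positive cost (assumed values)
    "turn left": 2.0,
    "turn left at the church": 1.0,
    "go straight": 1.0,
    "turn right": 2.0,
}

def decide(edge, instruction):
    """d(e, i): the edge reached by executing instruction at edge, or None."""
    for (start, target), instructions in labels.items():
        if start == edge and instruction in instructions:
            return target  # unique, since instructions are assumed unambiguous
    return None  # stands in for the empty set

def cheapest_instruction(edge, target):
    """Pick the lowest-cost label for the decision (edge, target)."""
    return min(labels[(edge, target)], key=weights.__getitem__)

print(decide("e1", "turn right"))
print(cheapest_instruction("e1", "e2"))
```

With these assumed costs, the landmark-based instruction is preferred for the left turn because it is cheaper to execute.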
As a further feature of this algorithm, which makes it complex in the end,
instructions may be ‘spread forward through the graph’. This reflects the fact that
often a single instruction may cover several intersections, such as in ‘turn left
at the third intersection’ where implicitly the instruction tells a wayfinder to go
straight at intersection one and two [25]. The cognitive and communication benefits
of combining instructions this way will be further discussed in Chap. 6. Here,
it is only important to note that there are limits to combining instructions (e.g.,
nobody would instruct someone to ‘turn left at the 47th intersection’). Therefore, in
spreading instructions through the graph, a distinction is needed between an edge
being reachable and being chunkable. An edge $e_t$ is reachable from an edge $e_s$ with
an instruction $i$ if there exists a path from $e_s$ to $e_t$ that can be encoded as a sequence
of executions of instruction $i$. The same edge is chunkable from $e_s$ if the sequence
of instruction $i$ required to reach $e_t$ from $e_s$ is valid according to the combination
rules. This is checked by the validity function $v : E \times E \to \{\mathit{true}, \mathit{false}\}$.
An instruction gets spread forward as long as there are edges reachable with it.
A cost update is then only performed for those edges that are chunkable. This way
Fig. 5.16 Spreading instructions forward in a graph; edges are labeled with their associated
instruction sets. (a) Reachable vertices (r) from s as here the instructions match. Chunkable vertices
are denoted by c; here instructions adhere to the combination rules. (b) Chunkable edges can be
reached in a single step, which corresponds to dynamically introducing new edges between the first
vertex and the chunkable vertex. Costs for reaching these edges are equivalent to reaching the first
edge (denoted by the different $w_s$)
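The distinction between reachable and chunkable edges can be illustrated with a short Python sketch. It is not Algorithm 5.3: the decision function and the validity function are passed in as callables, the combination rule used in the example (at most three repetitions) is an invented placeholder, and the name `spread` is hypothetical.

```python
def spread(start, instruction, decide, is_valid):
    """Follow one instruction repeatedly from start.

    decide(e, i) returns the next edge or None; is_valid(start, e) plays
    the role of the validity function v. Returns (reachable, chunkable).
    """
    reachable, chunkable = [], []
    seen = {start}
    edge = decide(start, instruction)
    while edge is not None and edge not in seen:
        seen.add(edge)
        reachable.append(edge)       # the instruction still matches here
        if is_valid(start, edge):
            chunkable.append(edge)   # only these edges receive a cost update
        edge = decide(edge, instruction)
    return reachable, chunkable

# Toy chain e0 -> e1 -> e2 -> e3 -> e4, all connected by 'go straight'.
next_edge = {"e0": "e1", "e1": "e2", "e2": "e3", "e3": "e4"}
decide = lambda e, i: next_edge.get(e) if i == "go straight" else None
# Hypothetical combination rule: chunk at most three repetitions.
is_valid = lambda s, t: int(t[1:]) - int(s[1:]) <= 3
print(spread("e0", "go straight", decide, is_valid))
```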
Algorithms 5.2 and 5.3 establish the costs to reach specific edges (or vertices)
in the underlying graph. They do not actually return a path to this destination edge.
This requires another reconstruction algorithm, which starts from the destination
and walks backwards through the graph until it reaches the origin, assembling
the path along the way. Algorithm 5.4 exemplifies such an approach for the
simplest instructions path. Using an algebraic language notation, where E and I
form alphabets, the algorithm constructs a path by iteratively prepending letters
to an initially empty word. Retrieval of the instructions that need to be followed
simply requires direct backtracking through the predecessor list (line 7), while
reconstructing the edges to follow requires an additional loop to retrieve those edges
between subsumed instructions (lines 8–15).
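The backward walk can be sketched in Python as follows. This is a simplified illustration, not Algorithm 5.4: it assumes that the predecessor list already stores, for each edge, the previous edge, the instruction used, and the edges subsumed by a chunked instruction, so no additional retrieval loop is needed. The function name `reconstruct` and the data layout are hypothetical.

```python
def reconstruct(predecessor, origin, destination):
    """Walk back from destination to origin, prepending as we go.

    predecessor maps an edge to (previous_edge, instruction, subsumed), where
    subsumed lists the edges passed between previous_edge and the edge.
    """
    edges, instructions = [destination], []
    current = destination
    while current != origin:
        previous, instruction, subsumed = predecessor[current]
        instructions.insert(0, instruction)   # prepend the instruction
        edges[0:0] = [previous, *subsumed]    # prepend edges, including chunked ones
        current = previous
    return edges, instructions

predecessor = {
    "e4": ("e1", "turn left at the third intersection", ["e2", "e3"]),
    "e1": ("e0", "turn right", []),
}
print(reconstruct(predecessor, "e0", "e4"))
```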
5.4 A Comparison of Landmark Identification and Landmark Integration. . .
Richter [38], attempting to identify why landmarks have not been taken up in
commercial navigation services so far, categorized several of the approaches
presented in Sects. 5.2 and 5.3 with respect to different aspects. This categorization
is summarized in Table 5.7. Most of the data mining approaches of Sect. 5.2.2 are
not discussed here since they do not explicitly target landmarks of the scale and kind
addressed in this book.
Table 5.7 Matrix of approaches to landmark identification and integration; modified from [38]
Identification
Columns: Structure (Data Source: GIS, WWW, Abstract; Geometry: Point, Polygon; Individual; Category) and Function (Data Source: GIS, WWW, Abstract; Geometry: Point, Polyline; Individual; Category)
Rows: Elias [11]; Raubal & Winter [35]; Tezuka & Tanaka [47]; Winter [52]; Winter et al. [55]
Integration
Columns: Structure (Data Source: GIS, WWW, Abstract; Geometry: Point, Polygon; Individual; Category) and Function (Data Source: GIS, WWW, Abstract; Geometry: Point, Polygon; Individual; Category)
Rows: Caduff & Timpf [4]; Dale et al. [6]; Duckham et al. [9]
(continued)
5.5.2 Taxonomies
Given the previous discussion, using types rather than individuals seems to be the
more promising approach. Here, properties of individual geographic objects do not
matter as they are inferred by some heuristics. These heuristics make use of a general
assessment of a specific type’s suitability as a landmark. For example, it may be
argued that in general a pub is more salient than a doctor’s clinic. Consequently,
much less data would be required to make such an approach work. In fact, a POI
database that categorizes its entries according to type would suffice. This is the
approach taken by Duckham et al. [9], which has been discussed earlier in this
chapter.
Götze and Boye [16] recently suggested using a machine learning approach to
determine suitable landmark candidates. In their approach, landmark candidates
are described using a feature vector that may contain data, such as distance from
the route, but also categorical information, for example, whether the candidate
is a restaurant. Their approach determines user preferences for landmarks from
route directions generated by the users themselves. Depending on which candidates
appear in these descriptions, preferences for specific kinds of landmarks are learned
by the system and offered in system-generated descriptions in the future. This takes
away the necessity for a detailed data collection for every landmark candidate since
the feature vector uses fairly simple attributes. Still, these vectors need to be filled
and candidates need to be collected for the approach to work, and most importantly,
users need to be motivated to describe routes themselves.
Moreover, as argued above, existing POI databases usually exhibit a sparse and
uneven distribution of POIs across the space they cover. For example, Richter [38]
has shown for the WhereIs routing service that, even assuming an even distribution across
Australia (which is clearly not the case), there would only be one POI object
every 45 km². Thus, new ways of collecting a sufficient number of landmark
candidates with sufficient detail are clearly required. These will be discussed in the final
section of this chapter.
important and we will simply stick with user-generated content. Essentially, the idea
is to access users’ knowledge about an environment in order to collect and update
the required information; the users serve as ‘database’ [41].
Holone et al. [19], for example, suggested a system that allows users to mark route
segments as bad or inaccessible for wheelchair users or for people with similar
movement restrictions, even if only temporary (e.g., pushing a baby stroller).
Karimi et al. [21, 22] discussed SoNavNet, a social navigation network, where people
can provide and request recommendations for POIs and routes to these POIs. While
neither of these approaches specifically targets landmarks, both exploit users and
their willingness to contribute in order to provide better navigation services.
One way to motivate users to contribute the kind of data a service provider is
looking for is to set up an entertaining incentive, such as a mobile game, in a way that
the sought-after data becomes a by-product of that game [2, 54]. Bell et al. [2] designed
such a game, called EyeSpy, which collects photographs of city locations that
support navigation. In that game, players take photographs within a city environment
and/or produce text tags describing the environment. These photographs are geo-
referenced by using a phone’s WiFi or GPS sensor readings. Other players then
need to confirm these photographs by moving close to the location where they were
taken and confirming that they have found what is depicted on the photographs.
Players get points for performing such confirmations, but they also get points based
on the popularity of a photograph, i.e., how often it gets confirmed by other players.
The idea behind this setup is that photographs become more popular if they are easy
to find and recognize. Therefore, in order to get more points, players will aim at
submitting photographs they believe to be easily recognizable. Such photographs
will also be easily recognizable in other contexts, for example, when provided
as navigation assistance. Thus, the useful by-products of EyeSpy are photographs
of a city environment that make specific locations more recognizable, i.e., have a
landmark character.
Richter and Winter [14, 42] proposed mechanisms to collect landmark data in
OpenStreetMap. OpenStreetMap (OSM) is a topographic data set of the world
compiled from user-generated content. The OpenLandmarks application would
allow users to mark existing objects within the OSM data as landmarks. A first
prototype has been developed (Fig. 5.17) and used in some early experimentation.
Currently, the system’s interface is map-based. If users identify a building as a
potential landmark while walking through a (city) environment, they may request
all buildings in their surroundings that are actually represented as polygons in OSM
to be highlighted as potential landmark candidates. Users may then select the one
they have identified on the map and describe its landmark characteristics either by
name (e.g., ‘Flinders St Station’) or by description (‘big yellow train station’).
It is envisioned to provide an improved version of such a tool to the OpenStreetMap
community (or in fact anyone interested in contributing) and this way
to build up a database of user-generated landmark candidates over time, similar
to what has been achieved with OpenStreetMap. While we believe that employing
mechanisms of user-generated content is the only realistic way of ever getting a
sufficiently detailed and up-to-date data set of landmark candidates, there are many
challenges attached to such an approach. These will be further discussed in the
conclusions of this book in Chap. 7.
5.6 Summary
integration of the most relevant candidates into the offered service. We discussed
several approaches that allow for identifying landmark candidates, either using
geographic data or less structured data (texts, photographs) as a source. We then
presented approaches that given a set of landmark candidates are able to select those
candidates that are best suited for a specific situation.
We have seen that these two steps are not well integrated in today’s existing
approaches and that there are further, more serious issues that prevent landmarks
from being widely used in (commercial) applications, most importantly the huge
effort of collecting all the data needed for a useful landmark identification. This led
us to an outlook on some alternative approaches to acquiring this data, namely using
types instead of individuals, or tapping into the power of user-generated content.
The next chapter will now connect human and machine by discussing how
landmarks may enrich the interaction between them—in both directions.
References
13. Fisher, D.H.: Knowledge acquisition via incremental conceptual clustering. Mach. Learn. 2(2),
139–172 (1987)
14. Ghasemi, M., Richter, K.F., Winter, S.: Landmarks in OSM. In: 5th Annual International
OpenStreetMap Conference. Denver, CO (2011)
15. Goodchild, M.: Citizens as sensors: The world of volunteered geography. GeoJournal 69(4),
211–221 (2007)
16. Götze, J., Boye, J.: Deriving salience models from human route directions. In: Proceedings
of CoSLI-3 Workshop on Computational Models of Spatial Language Interpretation and
Generation. Potsdam, Germany (2013)
17. Hansen, S., Richter, K.F., Klippel, A.: Landmarks in OpenLS - a data structure for cognitive
ergonomic route directions. In: Raubal, M., Miller, H., Frank, A.U., Goodchild, M.F. (eds.)
Geographic information science. Lecture Notes in Computer Science, vol. 4197, pp. 128–144.
Springer, Berlin (2006)
18. Hollenstein, L., Purves, R.S.: Exploring place through user-generated content: Using Flickr to
describe city cores. J. Spatial Inform. Sci. 1(1), 21–48 (2010)
19. Holone, H., Misund, G., Holmstedt, H.: Users are doing it for themselves: Pedestrian
navigation with user generated content. In: Proceedings of the 2007 International Conference
on Next Generation Mobile Applications, Services and Technologies, IEEE Computer Society,
pp. 91–99. Washington (2007)
20. Jones, C.B., Purves, R.S.: Geographical information retrieval. Int. J. Geogr. Inform. Sci. 22(3),
219–228 (2008)
21. Karimi, H.A., Benner, J.G., Anwar, M.: A model for navigation experience sharing through
social navigation networks (SoNavNets). In: International Workshop on Issues and Challenges
in Social Computing (WICSOC2011). Las Vegas, NV (2011)
22. Karimi, H.A., Zimmerman, B., Ozcelik, A., Roongpiboonsopit, D.: SoNavNet: A framework
for social navigation networks. In: Proceedings of the International Workshop on Location
Based Social Networks (LBSN’09). ACM Press, New York (2009)
23. Kennedy, L.S., Naaman, M.: Generating diverse and representative image search results for
landmarks. In: Proceedings of the 17th International Conference on World Wide Web, WWW,
pp. 297–306. ACM, New York (2008)
24. Klippel, A.: Wayfinding choremes. In: Kuhn, W., Worboys, M., Timpf, S. (eds.) Spatial
Information Theory. Lecture Notes in Computer Science, vol. 2825, pp. 320–334. Springer,
Berlin (2003)
25. Klippel, A., Tappe, H., Habel, C.: Pictorial representations of routes: Chunking route segments
during comprehension. In: Freksa, C., Brauer, W., Habel, C., Wender, K.F. (eds.) Spatial
Cognition III. Lecture Notes in Artificial Intelligence, vol. 2685, pp. 11–33. Springer, Berlin
(2003)
26. Klippel, A., Winter, S.: Structural salience of landmarks for route directions. In: A.G. Cohn,
D.M. Mark (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 3693,
pp. 347–362. Springer, Berlin (2005)
27. Krumm, J., Davies, N., Narayanaswami, C.: User-generated content. Pervasive Comput. 7(4),
10–11 (2008)
28. Lowe, D.: Object recognition from local scale-invariant features. In: Proceedings of the Seventh
IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157 (1999)
29. Lynch, K.: The Image of the City. The MIT Press, Cambridge, MA (1960)
30. Mummidi, L., Krumm, J.: Discovering points of interest from users’ map annotations.
GeoJournal 72(3), 215–227 (2008)
31. Nothegger, C., Winter, S., Raubal, M.: Selection of salient features for route directions. Spatial
Cognit. Comput. 4(2), 113–136 (2004)
32. Presson, C.C., Montello, D.R.: Points of reference in spatial cognition: Stalking the elusive
landmark. Br. J. Dev. Psychol. 6, 378–381 (1988)
33. Quinlan, J.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
34. Rattenbury, T., Naaman, M.: Methods for extracting place semantics from Flickr tags. ACM
Trans. Web 3(1), 1:1–1:30 (2009)
35. Raubal, M., Winter, S.: Enriching wayfinding instructions with local landmarks. In: Egenhofer,
M.J., Mark, D.M. (eds.) Geographic Information Science. Lecture Notes in Computer Science,
vol. 2478, pp. 243–259. Springer, Berlin (2002)
36. Richter, K.F.: A uniform handling of different landmark types in route directions. In: Winter, S.,
Duckham, M., Kulik, L., Kuipers, B. (eds.) Spatial Information Theory. Lecture Notes in
Computer Science, vol. 4736, pp. 373–389. Springer, Berlin (2007)
37. Richter, K.F.: Context-Specific Route Directions - Generation of Cognitively Motivated
Wayfinding Instructions, vol. DisKi 314 / SFB/TR 8 Monographs Volume 3. IOS Press,
Amsterdam, The Netherlands (2008)
38. Richter, K.F.: Prospects and challenges of landmarks in navigation services. In: Raubal, M.,
Mark, D.M. (eds.) Cognitive and Linguistic Aspects of Geographic Space–New Perspectives
on Geographic Information Research, Lecture Notes in Geoinformation and Cartography,
pp. 83–97. Springer, Berlin (2013)
39. Richter, K.F., Duckham, M.: Simplest instructions: Finding easy-to-describe routes for naviga-
tion. In: Cova, T.J., Miller, H.J., Beard, K., Frank, A.U., Goodchild, M.F. (eds.) Geographic
Information Science. Lecture Notes in Computer Science, vol. 5266, pp. 274–289. Springer,
Berlin (2008)
40. Richter, K.F., Klippel, A.: Before or after: Prepositions in spatially constrained systems. In:
Barkowsky, T., Knauff, M., Ligozat, G., Montello, D.R. (eds.) Spatial Cognition V. Lecture
Notes in Artificial Intelligence, vol. 4387, pp. 453–469. Springer, Berlin (2007)
41. Richter, K.F., Winter, S.: Citizens as database: Conscious ubiquity in data collection. In:
Pfoser, D., Tao, Y., Mouratidis, K., Nascimento, M., Mokbel, M., Shekhar, S., Huang,
Y. (eds.) Advances in Spatial and Temporal Databases. Lecture Notes in Computer Science,
vol. 6849, pp. 445–448. Springer, Berlin (2011)
42. Richter, K.F., Winter, S.: Harvesting user-generated content for semantic spatial information:
The case of landmarks in OpenStreetMap. In: Hock, B. (ed.) Proceedings of the Surveying
and Spatial Sciences Biennial Conference 2011, pp. 75–86. Surveying and Spatial Sciences
Institute, Wellington, NZ (2011)
43. Sadeghian, P., Kantardzic, M.: The new generation of automatic landmark detection systems:
Challenges and guidelines. Spatial Cognit. Comput. 8(3), 252–287 (2008)
44. Schlieder, C.: Reasoning about ordering. In: Frank, A.U., Kuhn, W. (eds.) Spatial Information
Theory. Lecture Notes in Computer Science, vol. 988, pp. 341–349. Springer, Berlin (1995)
45. Sorrows, M.E., Hirtle, S.C.: The nature of landmarks for real and electronic spaces. In:
Freksa, C., Mark, D.M. (eds.) Spatial Information Theory. Lecture Notes in Computer Science,
vol. 1661, pp. 37–50. Springer, Berlin (1999)
46. Surowiecki, J.: The Wisdom of Crowds. Doubleday, New York (2004)
47. Tezuka, T., Tanaka, K.: Landmark extraction: A web mining approach. In: Cohn, A.G., Mark,
D.M. (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 3693,
pp. 379–396 (2005)
48. Tomko, M., Winter, S.: Describing the functional spatial structure of urban environments.
Comput. Environ. Urban Syst. 41, 177–187 (2013)
49. Waller, D., Loomis, J.M., Golledge, R.G., Beall, A.C.: Place learning in humans: The role of
distance and direction information. Spatial Cognit. Comput. 2(4), 333–354 (2000)
50. Williams, S.: Generating pitch accents in a concept-to-speech system using a knowledge base.
In: Proceedings of the 5th International Conference on Spoken Language Processing. Sydney,
Australia (1998)
51. Winter, S.: Modeling costs of turns in route planning. GeoInformatica 6(4), 345–361 (2002)
52. Winter, S.: Route adaptive selection of salient features. In: Kuhn, W., Worboys, M., Timpf, S.
(eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 2685, pp. 349–361.
Springer, Berlin (2003)
53. Winter, S., Raubal, M., Nothegger, C.: Focalizing measures of salience for wayfinding. In:
Meng, L., Zipf, A., Reichenbacher, T. (eds.) Map-based Mobile Services: Theories, Methods
and Implementations, pp. 127–142. Springer, Berlin (2005)
54. Winter, S., Richter, K.F., Baldwin, T., Cavedon, L., Stirling, L., Duckham, M., Kealy, A.,
Rajabifard, A.: Location-based mobile games for spatial knowledge acquisition. In:
Janowicz, K., Raubal, M., Krüger, A., Keßler, C. (eds.) Cognitive Engineering for Mobile
GIS (2011). Workshop at COSIT’11
55. Winter, S., Tomko, M., Elias, B., Sester, M.: Landmark hierarchies in context. Environ. Plann.
B: Plann. Des. 35(3), 381–398 (2008)
56. Zhang, X., Mitra, P., Klippel, A., MacEachren, A.: Automatic extraction of destinations, origins
and route parts from human generated route directions. In: Fabrikant, S., Reichenbacher,
T., van Kreveld, M., Schlieder, C. (eds.) Geographic Information Science. Lecture Notes in
Computer Science, vol. 6292, pp. 279–294. Springer, Berlin (2010)
57. Zhang, X., Mitra, P., Klippel, A., MacEachren, A.M.: Identifying destinations automatically
from human generated route directions. In: Proceedings of the 19th ACM SIGSPATIAL
International Conference on Advances in Geographic Information Systems, GIS, pp. 373–376.
ACM, New York (2011)
58. Zheng, Y.T., Zhao, M., Song, Y., Adam, H., Buddemeier, U., Bissacco, A., Brucher, F., Chua,
T.S., Neven, H.: Tour the world: Building a web-scale landmark recognition engine. In: IEEE
Conference on Computer Vision and Pattern Recognition, CVPR, pp. 1085–1092 (2009)
Chapter 6
Communication Aspects: How Landmarks
Enrich the Communication Between Human
and Machine
Throughout this book, we have established that landmarks are a key construct
for humans to make sense of the environment they live in. Landmarks structure
our mental representation of space and they are an important element of any
spatial communication, be it verbal or graphical. Accordingly, producing and
understanding references to landmarks comes naturally to us. For a computer, this
is far less straightforward; in fact, it is a rather hard problem as we have already
argued in the introduction. Nonetheless, because it is so natural for us, enabling
computers to use landmark references in their communication as well will lead to
more natural, easier, and more successful human-computer interaction.
In Sect. 1.3, we have exemplified what such a communication takes. This chapter
will look at some approaches that aim at enabling it. Following Turing’s ideas of a
machine communicating with people without being identified as a machine [62], or
more specifically, doing so in a spatial context [66], the aim here is not to copy the
human cognitive processes and facilities. Rather, it is sufficient that the surface—
the interface—of the computer’s internal processes matches with human expectation
and concepts. Further, it does not make sense to recreate human imperfection, biases,
and mistakes in the human-computer interaction [31]. Rather, the machine should
always pick what is (objectively) correct and best, a perfect super-human so to
speak.
K.-F. Richter and S. Winter, Landmarks: GIScience for Intelligent Services,
DOI 10.1007/978-3-319-05732-3_6, © Springer International Publishing Switzerland 2014
In the remainder of this chapter, we will first address how machines may
produce references to landmarks (in Sect. 6.2) because this appears to be easier than
understanding references to landmarks, which we will discuss in Sect. 6.3. Finally,
in Sect. 6.4, we will point to some studies that have investigated the usability and
usefulness of introducing landmarks into human-computer interaction.
For a machine, it is easier to successfully produce landmarks than to understand
them for computational and cognitive reasons. Computationally, a machine may
rely on the data structures and algorithms presented in Chaps. 4 and 5—given that
sufficient data is available. As we have discussed, this is hardly ever uniformly the
case for any given environment (see also the final discussion in the next chapter).
Accordingly, some fallback strategies in communication are required in case no
landmark is available.
These computational methods ensure that the machine will pick landmark
references that are salient and relevant. The pick may not necessarily be actually
optimal though, since data may not be complete and any computational approach
out of necessity makes some simplifying assumptions. However, generally this will
not be an issue, since in contrast to machines people are very good at adapting
to their communication partner. Simply by mentioning a geographic object in the
communication, the communication partner will pay attention to this object, making
it more relevant and, thus, increasing its salience [57]. The object becomes a
landmark by virtue of being mentioned in the communication. This will compensate
for the machine potentially not picking the most suitable landmark reference. People
will still be able to recognize and understand the chosen one in most cases. So, in
short, in producing landmark references, machines can rely on a well-defined, but
likely incomplete, set of landmark candidates to choose from—determined by the
underlying data sets and algorithms—and on the cognitive facilities of the human
communication partner, which enable them to flexibly adapt to the chosen landmark
references.
On the other hand, this human flexibility and ability to adapt is the reason why
understanding landmark references is so difficult for a computer. Even though we
can assume a limited context in which interaction happens, namely negotiating some
spatial descriptions (e.g., where something is located or how to get to some place),
the variety of geographic objects people may select from and the variety in the ways
they may describe this selection is immensely wide. This means that the computer
needs much greater flexibility compared to the case of producing landmarks, and
machines are not very good at adaptation. Thus, either the vocabulary needs to be
restricted for successful understanding of landmark references—which takes away
many advantages of landmark-based communication—or a sophisticated mecha-
nism for resolving (arbitrary) landmark references is required, including detailed
data about the environment and advanced parsing functionality. Alternatively,
machine learning mechanisms may be employed to learn from communication
with a human partner; however, to date we are not aware of any approach that
actually follows that path.
6.1 Landmarks in Human-Computer Interaction
In short, it is much harder for the computer to infer what kind of landmark
reference a human interaction partner will produce than for the human to under-
stand machine generated landmark references. This makes understanding landmark
references the more difficult challenge.
First, let us have a look at a sample dialog requesting directions to a specific location
between two people. We have already discussed the principles underlying such
dialogs in Sect. 3.4 of this book.
Suppose you are in Melbourne, Australia, for the first time. You have been
wandering around the Central Business District (CBD) for a while, have ended up
in front of the exhibition centre just south of the Yarra River, and now you want
to go back to your hotel, which happens to be at the eastern end of Little Bourke
Street (one of the streets of the CBD, and the eastern end being part of Chinatown).
Luckily, there just happens to be one of the friendly guides provided by the City
of Melbourne for tourist support, hanging around here. So you decide to approach
them and ask for directions. The dialog may go something like this:
You: Excuse me?
Guide: Yes?
You: How do I get to Chinatown?
Guide: Chinatown? You mean Little Bourke Street?
You: Yes.
Guide: Ok. You cross the street here (points east) and walk along the river past
the casino. Then you cross another big street—Queens Bridge. Just behind that
street there is a footbridge crossing the river. Take this and continue along the
river on the other bank until you get to Flinders Street Station. Do you know
that?
You: Nod, because you have passed it earlier today and remember it from the
travel book.
Guide: Good, you can go either through the tunnel underneath the station or get
onto the bridge just next to the station and then turn left away from the river. This
will get you onto Swanston Street, which you will probably know as well. Just
walk up that street for a bit and Little Bourke Street will be on your right. You
can easily spot it by its Chinese gate at the entrance to the street.
You: Oh, that sounds easy. I just walk along the river here, cross it and then go
up Swanston Street. Thank you very much!
Guide: Yes, exactly. Have a nice day!
You may have experienced many similar dialogs in your life. As discussed
in Sect. 3.4, we can observe several phases in this example dialog, namely an
initiation phase (‘Excuse me?’), the route directions themselves (‘You cross the
street here…’), a securing phase (‘Oh, that sounds easy’), and a closure phase
(‘Have a nice day’) [1, 15, 18, 49]. In each phase, each communication partner has
different tasks, and both need to produce and understand references to landmarks.
In human-computer interaction there is, of course, no need for social
conventions—the computer will always behave the same way. In the following
we will assume that the human user requests some spatial information (location of
a geographic object, or route directions) from the computer. We also assume that
the context is defined, i.e., we do not assume a general purpose query machine, but
some service dedicated to spatial communication, for example, a navigation service.
The requirements for a computer to produce landmark references have been
discussed throughout Chaps. 4 and 5. They can be summarized as follows:
1. A data set (or the combination of several data sets) about a geographic
environment that contains data which makes geographic objects discriminable from each
other and allows access to (some of) their properties (such as location, size, or
type).
2. A data structure that allows for capturing the landmarkness of geographic objects
and for integrating this information into other required data (e.g., a path network
for routing purposes).
3. A mechanism to establish an object’s suitability in principle as a landmark, i.e., its
salience.
4. A mechanism to establish an object’s suitability in a specific given situation, i.e., its
relevance.
In addition, landmark production needs the actual communication mechanism(s),
i.e., some way to communicate the chosen landmark reference(s) to the
human user in verbal or graphical form in the right context.
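Requirements 3 and 4, together with the fallback strategy mentioned earlier, can be illustrated with a minimal Python sketch. The class `Candidate`, the function `pick_landmark`, the product scoring rule, and the threshold value are all assumptions made for illustration; they do not reproduce any of the models from Chaps. 4 and 5.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    salience: float   # suitability in principle, assumed to lie in [0, 1]
    relevance: float  # suitability in the current situation, assumed in [0, 1]


def pick_landmark(candidates: list[Candidate],
                  threshold: float = 0.3) -> Optional[Candidate]:
    """Return the best-scoring candidate, or None if none is good enough.

    Multiplying salience and relevance is an illustrative scoring rule,
    and the threshold is an arbitrary assumption.
    """
    scored = [(c.salience * c.relevance, c) for c in candidates]
    usable = [(score, c) for score, c in scored if score >= threshold]
    if not usable:
        return None  # caller has to fall back to a landmark-free description
    return max(usable, key=lambda pair: pair[0])[1]
```

When `pick_landmark` returns `None`, a service would have to fall back to a description without a landmark, for example one based on street names only.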
In short, the production of landmark references by a computational system
seems to be a reasonable expectation. However, people may have differing previous
knowledge about an environment and, thus, require different detail in the spatial
descriptions. Or they may have some clarification questions (in a securing phase)
and the system would need to understand and come up with alternative or more
detailed descriptions.
Understanding landmark references is the greater challenge compared to produc-
ing them. Accordingly, the requirements for successfully understanding landmarks
may be greater as well. At least they appear to be less well defined. The minimal
requirements are:
5. A mechanism to parse human queries (verbal or graphical) into a form that the
computer can process.
6. A data repository that allows for a matching of landmark references to geographic
objects—something like georeferencing [36], only that the references can be
expected to be less well specified than in an authoritative gazetteer.
7. Some (spatial) reasoning mechanisms to correctly relate all references in the
query, which then allows the system to come up with a reasonable answer, bringing us
back to producing landmark references.
6.2 Producing Landmarks
In their initial paper on landmark identification, Raubal and Winter [54] have
already suggested a formal notation for including landmarks into route directions.
Approaches for producing landmarks are to this day predominantly developed for
enriching route directions with landmarks. Most notably, the Australian routing
service WhereIs augments its route directions with landmark references that are
derived from an external collection of POIs (see Chap. 5).
As a little exercise, you may try to recreate the route described in
the dialog above yourself. The best route we could achieve on WhereIs contains some
unnecessary turns and loops and, thus, results in 22 instruction steps. This is
clearly much more complicated than the human dialog. For that reason, we
will not use this particular route for the following example.
Commercial routing services usually generate their instructions using some pre-
defined templates with little linguistic variation. The same holds for WhereIs,
which uses a template of the form “<TURN ACTION onto STREET NAME at
POI NAME>” to incorporate landmark references into the generated instructions.
For example, when calculating directions from the Melbourne Exhibition Centre
(on Normanby Rd, Southbank) to The University of Melbourne (156–292 Grattan
St, Parkville),¹ five (out of the 11) instructions contain a reference to a POI
landmark: “At the roundabout, take the 1st exit onto Peel St, West Melbourne at
Queen Victoria Market”; “Veer left onto Elisabeth St, Melbourne at Public Bar
Hotel”; “At the roundabout take the 3rd exit onto Elisabeth St, Melbourne at Dental
Hospital”; “Turn right onto Grattan St, Melbourne at Royal Melbourne Hospital”;
“Turn left onto UNNAMED road after Barry St, Carlton at The University of
Melbourne”. As you can see, these instructions utilize large, prominent buildings as
landmarks, which are easy for wayfinders to identify. At the same time the generated
instructions are very monotonous, especially given that those instructions without
landmark references (not printed here) look very much the same without the “at”
part.
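The template mechanism described above can be sketched in a few lines of Python. Only the slot structure (turn action, street name, optional POI name) is taken from the text; the function name `render_instruction` and the way a missing POI is handled are assumptions, and the sketch does not claim to mirror how WhereIs is actually implemented.

```python
from typing import Optional


def render_instruction(turn_action: str, street_name: str,
                       poi_name: Optional[str] = None) -> str:
    """Fill the '<TURN ACTION onto STREET NAME at POI NAME>' template.

    If no POI is available, the 'at ...' part is simply left out.
    """
    instruction = f"{turn_action} onto {street_name}"
    if poi_name:
        instruction += f" at {poi_name}"
    return instruction


print(render_instruction("Turn right", "Grattan St, Melbourne",
                         "Royal Melbourne Hospital"))
```

Because every instruction passes through the same template, the output varies only in its slot values, which is exactly the monotony noted above.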
In research, there have been several attempts at breaking up these monotonous
directions with more variation, using methods from natural language generation
(NLG). Dale et al. [17], for example, argued for a more natural sentence structure
that gets away from the one-sentence-per-step messages routing services produce,
and instead makes use of more complex clause structures that cover related
information in a single sentence. Their CORAL system generates such instructions; its
architecture is shown in Fig. 6.1. This kind of architecture is quite typical for
NLG-motivated route planners and reflects to a degree the theoretical models for giving
route directions discussed in Sect. 3.4.
[Fig. 6.1: architecture of the CORAL system, a pipeline of Text Planner → Message Sequence → Microplanner → Sentence Plans → Surface Realizer → Text]
¹ Like most routing services, WhereIs requires address information to actually calculate a route.
CORAL starts out with a route through a network representation of the environ-
ment, very similar to what is used in any routing service. It then moves through
three phases to arrive at the verbal messages communicated to the user. The
first phase, the text planner, determines what information about the route needs
to be conveyed. This information is represented using three different message
types: POINT messages refer to landmarks (e.g., ‘turn left at the gas station’);
DIRECTION messages correspond to turns made at decision points (e.g., ‘turn
right’); PATH messages describe continuous movement along parts of the road
network (e.g., ‘follow the road’). The text planner phase results in an alternating
sequence of POINT, DIRECTION, and PATH messages, finishing with a POINT
message that describes the destination.
This text plan is the input for the micro-planning phase, which decides how to
combine the individual messages into clause structures and also how to refer to
each of the referenced elements. The former is an aggregation task, where two
(or more) messages are merged. Most typically, this is a PATH + POINT (e.g.,
‘follow the road until the roundabout’) or POINT + DIRECTION (e.g., ‘turn left at
the gas station’) combination, but other combinations are possible as well. The latter
part of the micro-planning phase is generating referring expressions, including
selection of the reference noun (or pronoun, or other construct) used to describe
an intended object, such as an intersection or landmark. Following the principles
of relevance [33], the CORAL system tries to be as specific as needed, adhering to
the three principles of sensitivity, adequacy, and efficiency. It uses as a
reference, in turn, either a landmark that is at or close to an intersection, the
type of intersection (e.g., T-intersection or roundabout), the name of the immediately
preceding intersection, or the name of the intersecting street.
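The aggregation step can be made concrete with a small Python sketch. It only covers the two typical combinations named above (PATH + POINT and POINT + DIRECTION). The `Message` class, the pre-rendered `text` field, the connectives "until" and "at", and the greedy left-to-right merging are assumptions for illustration and are not taken from the CORAL implementation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    kind: str  # "POINT", "DIRECTION", or "PATH"
    text: str  # hypothetical pre-rendered phrase for this message


def aggregate(plan: list[Message]) -> list[str]:
    """Greedily merge adjacent messages into single clauses."""
    clauses = []
    i = 0
    while i < len(plan):
        current = plan[i]
        following = plan[i + 1] if i + 1 < len(plan) else None
        if following and current.kind == "PATH" and following.kind == "POINT":
            clauses.append(f"{current.text} until {following.text}")
            i += 2
        elif following and current.kind == "POINT" and following.kind == "DIRECTION":
            clauses.append(f"{following.text} at {current.text}")
            i += 2
        else:
            clauses.append(current.text)  # no partner: keep as its own clause
            i += 1
    return clauses


plan = [
    Message("PATH", "follow the road"),
    Message("POINT", "the roundabout"),
    Message("POINT", "the gas station"),
    Message("DIRECTION", "turn left"),
]
print(aggregate(plan))
```

A real microplanner would additionally choose referring expressions for each element; here the phrases are fixed in advance to keep the sketch short.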
Finally, the surface realizer phase maps the semantic specifications into actual
sentences, i.e., into (grammatically correct) natural language. An example output of
the CORAL system is shown in Table 6.1; this example is taken from [17].
As you can see, these instructions have some more linguistic variation than the
WhereIs directions discussed above (on p. 179). For example, turns are described by
stating ‘take a right’ or ‘turn to the left’. Curiously enough however, the particular
example provided by Dale et al. does not contain any landmark references, with the
exception of the structural landmark ‘the end of the road’ [42].
Another system that aims for variation in the generated route directions is an
information kiosk developed at the University of Bremen, Germany [16]. While set
in an indoor scenario, it makes explicit use of landmark references and is based on
some of the methods discussed in Chap. 5. There is no reason in principle why it would
not work in an outdoor setting as well. Figure 6.2 shows an overview of the pipeline
architecture employed in the kiosk system. For now we are only interested in the
bottom part, i.e., the pathway starting at the ‘spatial data’ box and leading to the
‘user’ box. We will discuss the top part in the next section.
Except for being a dialog system, i.e., enabling some synchronous, co-presence
user interaction, the system’s approach to producing route directions is similar
to the CORAL system. The kiosk system starts by calculating a route between
the kiosk itself, which always is the origin since users are standing in front of
the computer running the kiosk software, and the requested destination. Next, the
instructions required to successfully navigate this route are determined. To this
end, a computational process called GUARD (Generation of Unambiguous, Adapted
Route Directions) [57] is employed. GUARD unambiguously describes a specific
[Fig. 6.2: pipeline architecture of the kiosk system. Box and arrow labels: user, user response, Language Understanding, user dialog act, Knowledge base, Spatial Data, User GUI, system dialog state, Dialog management, Route generation, Language generation, system response, system dialog act]
Fig. 6.3 Overview of GUARD, the generation process for route directions in the kiosk system
(from [57], modified)
² We will not explain context-free grammar any further in this book other than saying that it is a
concept of formal language theory. A grammar essentially is a mapping from some nonterminal
symbols (e.g., the semantic representation of a left turn) to some (string of) terminal symbols
(e.g., the words ‘left’, or ‘go left’); the grammar is context-free if a specific nonterminal symbol
V always maps to the same string of terminal symbols w, regardless of its surrounding symbols
(the context). For more details on formal languages refer to a textbook on theoretical computer
science or theoretical linguistics, for example, Hopcroft, J.E., Motwani, R., Ullman, J.D. (2006).
Introduction to Automata Theory, Languages, and Computation (3rd ed.). Addison-Wesley.
Table 6.2 Route directions generated by the system of Dethlefs et al., taken
from [20]
Turn-by-turn directions
Follow Swan Street to Richmond South Post Office for 52 m. Take a left after the
Richmond South Post Office, and take a right at Woodlawn Street. At this point,
go past a car park on the left. Cross the next junction at Normanby Place, and
cross the next junction at Moorhouse Road. This will then lead onto Wellington
Parade. Go straight over junction. Cross the next junction, when you get to Vale
Street. Go into East Melbourne. Take a slight left after the East Melbourne Post
Office. At this point, go past traffic lights on the right. Head straight over
junction, and take a turn after the traffic lights. Take a left before the Barbecue,
and after the Barbecue, take a slight right. At this point, pass the Richmond
Cricket Ground on the left and continue for 62 m
Destination descriptions
Head to the East Melbourne Post Office that is on Wellington Parade. Take
Rotherwood Street and then go over Wellington Parade. You will come to the
Richmond Cricket Ground
You can observe that there is a lot of linguistic variation in the turn-by-turn
directions. They are also much more verbose than those of the CORAL or kiosk
systems. Indeed, people had some difficulties realizing that these route directions
were machine-generated (see Sect. 6.4 for further discussion). You can also see
that the system may refer to slightly odd landmarks, such as a barbecue. Public
barbecues are placed in many parks in Australia, but they are significantly smaller
than, say, buildings, making them rather unlikely candidates for landmark references
in route directions for longer routes.
This last example, and also the example of the CORAL system at the beginning,
highlight again that computational systems depend on data sets that contain a
sufficient number of potential landmark candidates in order to produce useful
references to landmarks sufficiently often.
A simple way of dealing with this data issue is to restrict the system to a specific
location. The SpaceBook project, for example, develops a tourist information
system with pedestrian navigation functionality for the city of Edinburgh [38]. The
system integrates navigation instructions and the provision of tourist information
about relevant POIs into a single dialog. Navigation instructions incorporate
references to landmarks—both the tourist POIs as well as other salient buildings,
such as restaurants. The data is taken from a city model spatial database that contains
information about thousands of objects in Edinburgh (according to the authors).
The data has been compiled from existing sources, such as OpenStreetMap, Google
Place,³ and the Gazetteer for Scotland.⁴ More details about SpaceBook will be
presented in the next sections of this chapter.
³ https://www.google.com/business/placesforbusiness/, last visited 8/1/2014.
⁴ http://www.scottish-places.info/, last visited 8/1/2014.
Now let us turn to producing landmarks in a graphical form. Unfortunately, there has
not been much work done on a dedicated production of graphical landmark displays.
Commercial (in-car) navigation services often depict POIs on their maps, or at least
offer the option to do so. However, as argued before in this book, they cannot really
function as landmarks. Curiously, while WhereIs produces references to landmarks
in its verbal instructions, the accompanying route map does not highlight these
landmarks. The Bremen kiosk system in one of its iterations depicted landmarks as
part of a multi-modal presentation of route information. But this has been a rather
ad-hoc solution basically labeling those polygons that represent objects mentioned
in the description, while not labeling other object polygons. This research has
never been published or pursued any further. There has been some preliminary
unpublished work in relation to the OpenLandmarks idea [70], where both the
location of a referenced landmark on a moving-dot map as well as a photograph of
that landmark were shown as part of multi-modal instructions (see Fig. 6.4). Again,
this is work in its earliest stages.
Elias et al. [26] discussed adequate depictions of landmarks on maps. This
is a general discussion that is not tied to a specific landmark production system.
Their analysis is from a cartographic perspective and provides some guidelines for
including landmarks on a map. In their analysis, they distinguish different ways
of referring to a building, namely either using a shop’s name (McDonalds, H&M,
Starbucks, etc.) or its type (gas station, pharmacy, bakery, etc.), or the function of
a building (school, church, etc.) or their visual properties (the big red building, the
small wooden shed, etc.). Table 6.3 shows the results of their analysis using several
levels of abstraction (from image to textual description) illustrated in Fig. 6.5.
Fig. 6.5 Levels of abstraction in depicting landmarks on a map (after [26]). The photograph in the
first panel is CC-BY-2.0 by Flickr user Michiel2005, modified
The matrix in Table 6.3 highlights the most suitable way of graphically present-
ing a landmark. According to Elias et al. [26], a textual description would always be
possible, but is not always adequate. For shops of well known brands, the brand’s
icon may be the ideal representation as it will be easily identifiable for most people.
For shop types specifically designed icons or symbols may be used (e.g., a stylized
bank or gas station). Buildings with specific function are often large and may be
at (structurally) prominent locations, thus, showing them in some detail—or at
least their outline—helps in identifying them in the real world. Visual aspects are
best represented directly, i.e., by using a (photographic) image or a drawing that
highlights this visual property. To convey proper names of buildings, a textual label
has to be used.
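The guidelines in the paragraph above amount to a simple lookup from the kind of reference to a preferred depiction. The following Python sketch encodes only what that paragraph states; the dictionary keys and the function name `choose_depiction` are hypothetical labels introduced here, not categories or terms defined by Elias et al.

```python
# Preferred depiction per kind of landmark reference, as summarized in the text.
# The key names are illustrative labels, not terminology from the source.
PREFERRED_DEPICTION = {
    "brand_shop": "brand icon",
    "shop_type": "specifically designed icon or symbol",
    "building_function": "detailed drawing or outline",
    "visual_property": "photographic image or drawing",
    "proper_name": "textual label",
}


def choose_depiction(reference_kind: str) -> str:
    """Return the preferred depiction, defaulting to a textual description.

    The default reflects the statement that a textual description is always
    possible, even if it is not always adequate.
    """
    return PREFERRED_DEPICTION.get(reference_kind, "textual description")


print(choose_depiction("brand_shop"))
```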
There has been some research on using photographs for augmenting wayfinding
assistance, which implies the use of landmark-based navigation concepts even
though this is not explicitly discussed. For example, Hirtle and Sorrows [37]
presented a library finder for the University of Pittsburgh campus. This system
uses a hierarchical approach. It first presents an overview of the campus with the
library indicated and then offers more detailed information on how to get there
(to the building and then within the building). Each of these hierarchy levels is
⁵ http://store.ovi.com/content/178183, last visited 8/1/2014.
Then it is important to understand which specific bar and cinema are referred to.⁶
This requires also an understanding of the context of the utterance [53].
WhereIs solves this problem by forcing users to resolve ambiguities themselves.
It is possible to enter POI names as origin and destination (in the ‘what’ text
box); however, the user needs to additionally provide the suburb of the POI (in
the ‘where’ text box). Then WhereIs suggests a list of potential street addresses that
the input may refer to and makes the user select the appropriate one. This ensures
that WhereIs gets the context right, but on the other hand forces users to provide
information they possibly do not know.
Such context restrictions are typical for computational services. The above
argument is not a specific criticism of WhereIs; other well-known (web-based)
navigation services behave similarly. For example, Winter and Truelove [67] have
shown that while Google Maps allows for more free-form input compared to
WhereIs, interpretation of this input is often inadequate and requires users to
either adapt their input to the system interpretation or to put significant effort
in interpreting the results themselves. For example, Google Maps currently still
interprets any spatial relation as ‘near’.
Although Winter and Truelove’s analysis was drawn from the interpretation of
place descriptions, which are structurally different from landmark references as
shown before, many findings also hold for understanding landmark references. Also
in the context of place description research, Vasardani et al. [64] highlighted the
following issues:
• Official, authoritative gazetteers do not usually include unofficial and vernacular
place names, or temporary (replacements of) place names, such as event names
that are sometimes used synonymously for the location they are happening at.
This is in contrast to the popular use of vernacular place names, in particular
in familiar environments. Sometimes people may not even be aware of official
names. Thus, the restriction to official place names restricts a system’s under-
standing of references to places and in turn restricts interaction possibilities with
the users.
• A sensible interpretation of spatial relations is important to understand place ref-
erences correctly. Clearly, taking every relation to mean ‘near’ is inappropriate.
Formal models for a range of spatial relations have already been investigated in
the literature (see Chap. 4). In principle, these models enable a more adequate
interpretation of place references; however, the interpretation of spatial relations
is context-dependent [69]. Also, people’s cognitive concepts of a relation may
differ from what is defined in a formal model [40, 52], leading to diverging
interpretations of a place description.
• Places often are extended geographic regions with indeterminate boundaries.
This indeterminacy is hard to capture satisfactorily in computational models,
despite several attempts [6, 12, 47]. Again, context and individual differences
aggravate the issue.
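The first of these issues, the gap between official and vernacular place names, can be illustrated with a minimal Python sketch. The two dictionaries are hypothetical stand-ins for an authoritative gazetteer and a collection of vernacular names; the single vernacular entry mirrors the sample dialog in Sect. 6.1, where ‘Chinatown’ is clarified as Little Bourke Street. The function name `resolve_place` is likewise an assumption.

```python
from typing import Optional

# Hypothetical entries; a real system would load these from its data sources.
OFFICIAL_NAMES = {
    "little bourke street": "Little Bourke Street",
    "flinders street station": "Flinders Street Station",
}
VERNACULAR_NAMES = {
    "chinatown": "Little Bourke Street",
}


def resolve_place(reference: str) -> Optional[str]:
    """Map a place reference to an official name, or None if unknown."""
    key = " ".join(reference.lower().split())  # normalize case and whitespace
    if key in OFFICIAL_NAMES:
        return OFFICIAL_NAMES[key]
    # Without this second lookup, vernacular references would stay unresolved.
    return VERNACULAR_NAMES.get(key)


print(resolve_place("Chinatown"))
```

The sketch deliberately ignores spatial relations and indeterminate boundaries, which are the harder parts of the problem.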
⁶ Even though in this example it may be sufficient to know which cinema is meant, assuming there
is only one bar in it.
The latter two issues are important aspects of understanding locative expressions,
as has been discussed in Chap. 4. The first relates back to the context and content
restrictions. Such restrictions are unavoidable as no computational system will ever
understand every possible landmark reference (globally, i.e., around the
world, or even locally, e.g., within a city). This is a big difference to a human
communication partner who will have no problems dealing with vernacular or
unofficial references, or with references to vaguely defined geographic objects,
though there might be occasional misunderstandings. Not being able at all to deal
with such indeterminacy is a major restriction for computational systems.
Coming back to the systems already presented in Sect. 6.2.1, for some systems
solving these issues is still relatively easy. The kiosk system [16], for example, is
stationary and placed inside a building. Accordingly, meaningful dialogs can only
be expected about this specific building. And for a single building, even if it is large,
it is quite reasonable to manually collect all relevant data needed for landmark-based
communication. Most importantly, such data will comprise rooms (their
number, function and occupation), some infrastructure, such as coffee machines or
printers, and some landmark objects, for example, poster walls or furniture placed
in corridors (cf. also [58]). This data will then form the system’s knowledge base.
The kiosk system uses it to both produce and understand landmark references (see
Fig. 6.2 on p. 182).
The kiosk’s dialog manager uses different dialog states (Table 6.4) and a
deterministic dialog policy defined in Eq. (6.1).⁷ The sequences in Eq. (6.1) (e.g.,
1000000) are to be read against Table 6.4. Digits in the sequence represent the
domain value of each state, in the order the states appear in the table. That is,
1000000 corresponds to a state where a greeting has happened and the system
awaits a user request.
Essentially, the dialog manager proceeds through a dialog according to the
previously presented communication models. It reacts to utterances by users, which
accordingly have to be parsed in order for the system to provide useful replies.
The kiosk system relies on textual input, so typed utterances are being parsed. To
this end, the OpenCCG parser is used [14], producing a structured representation
of the utterances’ semantics, which is matched against the knowledge base. In case
the parser fails, keyword spotting is used to look for names of locations or people,
which may help in guessing the user request.
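The two-stage strategy of parsing first and falling back to keyword spotting can be sketched as follows. This Python sketch does not use OpenCCG; the function `parse` is a deliberately simple stand-in that only accepts requests of the form "where is …", and the knowledge-base entries are hypothetical. The returned status labels follow the UserUtterance values of Table 6.4.

```python
import re
from typing import Optional

# Hypothetical knowledge-base entries (names of locations or objects).
KNOWN_NAMES = ["coffee machine", "printer", "room 101"]


def parse(utterance: str) -> Optional[str]:
    """Stand-in for a full parser: accepts only 'where is <name>' requests."""
    match = re.fullmatch(r"where is (?:the )?(.+?)\??", utterance.strip().lower())
    if match and match.group(1) in KNOWN_NAMES:
        return match.group(1)
    return None


def spot_keyword(utterance: str) -> Optional[str]:
    """Fallback: look for any known name anywhere in the utterance."""
    text = utterance.lower()
    for name in KNOWN_NAMES:
        if name in text:
            return name
    return None


def understand(utterance: str) -> tuple[str, Optional[str]]:
    """Return (status, target); status is 'parsed', 'spotted', or 'unparsed'."""
    target = parse(utterance)
    if target is not None:
        return "parsed", target
    target = spot_keyword(utterance)
    if target is not None:
        return "spotted", target
    return "unparsed", None


print(understand("Where is the printer?"))
print(understand("I am looking for a coffee machine somewhere"))
```

A spotted keyword only identifies a likely target, so a dialog manager would typically follow up with a clarification or confirmation rather than answer directly.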
⁷ In fact, the dialog manager uses a Markov Decision Process (MDP) model, but this is not really
important here. For more details on MDP, look up a textbook on Artificial Intelligence, for
example, S. Russell & P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. Prentice
Hall, Englewood Cliffs, NJ.
6.3 Understanding Landmarks
Table 6.4 Dialog states used in the kiosk dialog manager, after [16]
State              Domain value
Salutation         0 = null; 1 = greeting; 2 = closing
Origin             0 = unknown; 1 = requested; 2 = known
Destination        0 = unknown; 1 = requested; 2 = known
NumTuples          0 = null; 1 = one; 2 = more-than-one
Instructions       0 = unknown; 1 = known; 2 = provided
UserUtterance      0 = unknown; 1 = parsed; 2 = unparsed; 3 = spotted
MoreInstructions   0 = null; 1 = empty; 2 = yes; 3 = no
\[
p(s) =
\begin{cases}
\text{opening} & \text{if } s \in \{0000000\} \\
\text{request} & \text{if } s \in \{1000000, 1000012\} \\
\text{other\_request} & \text{if } s \in \{1220210, 1220220\} \\
\text{query\_route} & \text{if } s \in \{1220210, 1220220, 1110030, 1210030, 1220030, 1211030, 1221030\} \\
\text{present\_info} & \text{if } s \in \{1221110, 1221130\} \\
\text{clarify} & \text{if } s \in \{1112100, 1112030, 1212030, 1222211, 1222231\} \\
\text{apologize} & \text{if } s \in \{1110020, 1210020, 1220220, 1210210\} \\
\text{confirm} & \text{if } s \in \{1112010, 1112030\} \\
\text{closing} & \text{if } s \in \{1 \quad 3\} \\
\text{wait} & \text{otherwise}
\end{cases}
\tag{6.1}
\]
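Read as a lookup, the policy in Eq. (6.1) maps a seven-digit state string to a dialog act. The following Python sketch encodes Table 6.4 and the state sets of Eq. (6.1) as given. Two points are assumptions of the sketch rather than statements from the source: because some states occur in more than one set, the first matching rule in the listed order is taken, and the closing rule is left out. The names `STATE_VARIABLES`, `RULES`, `decode`, and `policy` are illustrative.

```python
# State variables and their domain values, in the order of Table 6.4.
STATE_VARIABLES = {
    "Salutation": ["null", "greeting", "closing"],
    "Origin": ["unknown", "requested", "known"],
    "Destination": ["unknown", "requested", "known"],
    "NumTuples": ["null", "one", "more-than-one"],
    "Instructions": ["unknown", "known", "provided"],
    "UserUtterance": ["unknown", "parsed", "unparsed", "spotted"],
    "MoreInstructions": ["null", "empty", "yes", "no"],
}

# State sets copied from Eq. (6.1); order matters because of overlaps.
RULES = [
    ("opening", {"0000000"}),
    ("request", {"1000000", "1000012"}),
    ("other_request", {"1220210", "1220220"}),
    ("query_route", {"1220210", "1220220", "1110030", "1210030",
                     "1220030", "1211030", "1221030"}),
    ("present_info", {"1221110", "1221130"}),
    ("clarify", {"1112100", "1112030", "1212030", "1222211", "1222231"}),
    ("apologize", {"1110020", "1210020", "1220220", "1210210"}),
    ("confirm", {"1112010", "1112030"}),
]


def decode(state: str) -> dict[str, str]:
    """Translate a seven-digit state string into named domain values."""
    return {name: values[int(digit)]
            for (name, values), digit in zip(STATE_VARIABLES.items(), state)}


def policy(state: str) -> str:
    """Return the dialog act for a state; first matching rule wins (assumption)."""
    for action, states in RULES:
        if state in states:
            return action
    return "wait"


print(decode("1000000")["Salutation"], policy("1000000"))
```

Decoding 1000000 yields a greeting in the Salutation slot and default values elsewhere, which matches the reading given in the text: the greeting has happened and the system requests input from the user.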
The SpaceBook system uses commercial software for parsing natural language
input as well as for speech generation. An interaction manager keeps track of the
geographic context and the dialog history to resolve anaphoric references,8 among
others. Questions related to the navigation process are answered by computations
within the system (e.g., distances between locations). Touristic questions are
answered by a textual lookup from the Gazetteer of Scotland, Wikipedia9 and
WordNet.10 A ranking mechanism, which was trained using machine learning
techniques, ranks candidate answers and the top candidate is presented to the user.
Since answers often involve fairly long text, they are provided piecewise and users
can interrupt their presentation anytime. Understanding what kind of question is
being asked is again achieved through machine learning, i.e., by training another
classifier using an annotated question corpus.
Not many interfaces exist allowing for spatial input in graphical form, let alone
graphical input related to landmarks. This may change in the (near) future with
the advent of smartphones with their advanced gesture input recognition. There has
been a long line of research on so-called query by sketch [23], however. Query by
sketch permits users to sketch (draw) a spatial query to request some information
from a service. Such sketches are first and foremost a sequence of pixels, which
may be grouped to lines (a geometric operation). These lines may then be further
grouped to represent some geographic objects (a semantic operation, even though
it will incorporate geometric sub-operations), such as streets, buildings, or labels.
Figure 6.6 shows a fictitious sample sketch map that is quite typical for how people
produce such maps in a communication act.
As discussed by Forbus et al. [28], it takes considerable effort for a computational
system to understand sketches. The authors characterize the sketching ability of a
system as well as of its users along four dimensions: visual understanding, conceptual
understanding, language understanding, and drawing skills.
Visual understanding refers to the ability to make sense of the ‘ink’ used
while sketching, i.e., how different line strokes form and how these line strokes
in turn may form more complex objects (so-called ‘glyphs’). Geometric operations
as mentioned above fall into this kind of understanding. Conceptual understanding
helps translate these objects into meaningful elements (such as streets or
8
Anaphoric references use some kind of deictic reference, usually a pronoun, to refer back to an
item mentioned before. For example, in ‘St Peter is a large cathedral in Rome; it is home to the
Pope’ ‘it’ is an anaphoric reference to St Peter.
9
http://www.wikipedia.org, last visited 8/1/2014.
10
http://wordnet.princeton.edu/, last visited 8/1/2014.
Fig. 6.7 Different geometric operations in processing a sketch, after [7]. (a) Buffer zones used to
determine whether a stroke forms part of an object; (b) closing gaps within an object; (c) removing
overshoots for polygonal objects
to close gaps, for lines interruptions in their flow may be closed (see Fig. 6.7b).
If objects form a closed loop, i.e., represent a region, any overshooting lines are
removed (see Fig. 6.7c).
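The gap-closing step mentioned above can be illustrated with a small sketch. It makes strong simplifying assumptions that are not taken from [7]: strokes are lists of points in drawing order, only the end of one stroke and the start of the next stroke are compared, and a single distance tolerance (playing the role of a buffer zone) decides whether a gap is closed. The function name `close_gaps` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def close_gaps(strokes: List[Stroke], tolerance: float) -> List[Stroke]:
    """Merge consecutive strokes whose facing end points are close together.

    A stroke is appended to the previous (possibly already merged) stroke
    if its first point lies within `tolerance` of that stroke's last point.
    """
    merged: List[Stroke] = []
    for stroke in strokes:
        if not stroke:
            continue  # ignore empty strokes
        if merged and math.dist(merged[-1][-1], stroke[0]) <= tolerance:
            merged[-1] = merged[-1] + list(stroke)
        else:
            merged.append(list(stroke))
    return merged


if __name__ == "__main__":
    strokes = [
        [(0.0, 0.0), (1.0, 0.0)],
        [(1.1, 0.0), (2.0, 0.0)],  # small gap to the previous stroke
        [(5.0, 5.0), (6.0, 5.0)],  # far away, stays separate
    ]
    for line in close_gaps(strokes, tolerance=0.2):
        print(line)
```

A more faithful implementation would also have to consider strokes drawn in a different order or direction, and would remove overshoots once a closed loop has been detected.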
Furthermore, visual understanding has to determine how the various objects
relate to each other, i.e., to understand spatial relationships in the sketch. It is widely
accepted that qualitative spatial relations are best suited to this end [13, 23, 29, 65].
Egenhofer [23] suggested using five types of binary spatial relations: coarse
topological relations (using the nine-intersection model [24]), detailed topological
relations, metrical refinements (in line with [25]), coarse cardinal directions [30],
and detailed cardinal directions. Using these relations results in fairly complex,
comprehensive sketch descriptions that are also flexible enough to allow for relaxing
relations in an ordered manner. This is important for querying spatial data based on
sketches where users ask for configurations that reflect the provided sketch in a
qualitative way, i.e., describe scenes that are similar to the one sketched.
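As an illustration of what a qualitative relation between two sketched objects may look like, the following sketch derives a cardinal direction from two representative points (for example, centroids). The cone-based partition with four sectors for the coarse level and eight for the detailed level is an assumption made for this example; it is not necessarily the scheme used in [23] or [30]. The function name and the label tuples are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

# Sector labels in counter-clockwise order, starting at east (angle 0).
COARSE = ("east", "north", "west", "south")
DETAILED = ("east", "northeast", "north", "northwest",
            "west", "southwest", "south", "southeast")


def cardinal_direction(reference: Point, target: Point,
                       labels: Sequence[str] = COARSE) -> str:
    """Return the sector label of `target` as seen from `reference`.

    The plane is divided into len(labels) equal cones, each centered
    on the direction that its label names.
    """
    dx = target[0] - reference[0]
    dy = target[1] - reference[1]
    if dx == 0 and dy == 0:
        return "same"
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    width = 360.0 / len(labels)
    index = int(((angle + width / 2.0) % 360.0) // width)
    return labels[index]


if __name__ == "__main__":
    church, station = (0.0, 0.0), (3.0, 4.0)
    print(cardinal_direction(church, station))            # coarse level
    print(cardinal_direction(church, station, DETAILED))  # detailed level
```

Relaxing a query in an ordered manner could then mean, for instance, falling back from the detailed to the coarse label when no configuration matches the detailed one.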
After establishing what objects there are in a sketch and how they relate to each
other, the next step is to understand what these objects represent. This conceptual
understanding is much harder than visual understanding, which should not be hard
to guess given the previous discussions around geometric and semantic information
in this book.
The simplest way of handling conceptual understanding is to offer a fixed
vocabulary. Such a fixed set of objects and relationships between these objects
restricts what users may express in their sketch. In general, it helps understanding
6.4 Evaluating Landmark-Based Human-Computer Interaction 195
11
http://www.cyc.com/platform/opencyc, last visited 8/1/2014.
196 6 How Landmarks Enrich the Communication Between Human and Machine
these arguments. We will discuss usability studies and other forms of evaluation
that explicitly looked at the benefits of incorporating landmarks. A word of warning
in the beginning though: not all is well yet in the realm of landmark-based services.
The following studies focus on aspects of landmark production. They are inter-
ested in how well (or how much better) users understand a service’s communication
output if this service employs landmarks. Most of them look at verbal production,
but there is some work done on graphical landmark production as well.
Let us start with a requirements analysis. Burnett and his group have done a range
of such studies as well as some work on human factors in landmark use (e.g., [8,51])
in the context of in-car and pedestrian navigation systems.
In the 1990s, studies compared having a passenger provide instructions with
using a car navigation system. A passenger who has detailed route knowledge
and provides clear and timely instructions arguably presents the ideal situation.
Compared to this, drivers using a navigation system made more navigation errors,
took longer to complete a route, spent less time looking at the road or in the mirrors,
rated their mental workload to be higher, and were rated by an expert to have lower
quality of driving [9,27]. Based on these studies and other findings, Burnett [8] listed
several reasons why navigation systems should include references to landmarks
reflecting the arguments we have made: (1) Landmarks are consistent with basic
human navigation strategies; (2) landmarks are valued by drivers; (3) landmarks are
effective and efficient in navigation tasks; (4) landmarks increase user satisfaction.
Argument (1) has been widely discussed in the book. Regarding Argument (2),
for example, a survey of 1,158 UK drivers found that landmarks are the
second most popular information type (after left-right information) that participants
would want from their passengers helping in navigation [11]. Burnett’s group found
that participants either identified landmarks as crucial information in navigation
situations or produced landmark references to support others in navigation tasks,
depending on the condition tested and consistent with what we have discussed
earlier [10, 51]. Arguments (3) and (4) will be discussed in more detail in the
following—that is what this section is all about.
In an evaluation study testing pedestrian navigation instructions [59], participants
either received instructions relying on distances and street names, or enhanced
instructions with landmark references (similar to those of WhereIs). Ross et al.
found that participants receiving enhanced instructions were significantly more
confident in making the right decisions at decision points and, indeed, also made
fewer navigation errors compared to those receiving the basic instructions. Their
study clearly makes a case for the inclusion of landmarks.
The Kiosk system [16] proved to be successful in guiding wayfinders as well.
As already mentioned in Sect. 6.3.1, the rate of parsed utterances is rather low, at
only about 17 %. However, the keyword spotter compensates for this and detects
useful keywords in 80 % of the utterances (leaving only 3 % of unparsed utterances).
In the Kiosk system evaluation, 26 participants, who were mostly unfamiliar
with the environment, had to find their way to six different locations in a university
building after negotiating with the kiosk how to get there. Given the high detection
rate for keywords, only short dialogs were needed for users to receive the required
information from the kiosk system [19]. This is an indicator for efficient dialog
handling. More importantly, the overall user satisfaction was 90 %. The reason may
be that almost 90 % of the participants found the test location eventually, and about
80 % of those with no or only small wayfinding problems (i.e., only minor confusion
along the way, but without really taking wrong turns) [19].
Dethlefs et al. [20] evaluated their approach in a computer-based survey without
actual wayfinding. Participants had to rate turn-by-turn route directions and destina-
tion descriptions generated by their approach, descriptions produced by a human
direction giver, and by Google Maps (no destination descriptions from Google
Maps; no commercial system is capable of producing them). Participants were
asked to determine which of the presented instructions were automatically generated
by a computer and which instructions appeared to be most useful. Instructions
generated by Google Maps were clearly identified to be computer-based (94 % of
participants). For the descriptions generated by Dethlefs et al.’s approach, only
36 % were classified as computer-generated, thus, 64 % of the participants took
them as being from a human communication partner. This applies to turn-by-
turn directions. Forty-two percent of the destination descriptions were correctly
identified as computer-generated, but also 34 % of the human directions were falsely
taken to be from a machine. In conclusion, this approach produces instructions that
appear to be more natural than those generated by Google Maps, and may, thus, be
better suited for human-computer interaction.
However, in terms of usefulness ratings, the results tell a different story. While
destination descriptions by Dethlefs et al.’s approach are rated by both familiar
and unfamiliar participants to be more useful than the human-generated ones (53
vs. 46 %; 65 vs. 33 %, respectively) turn-by-turn directions are only perceived as
most useful by 7 % of the familiar participants. Human directions were seen to be
most useful here, just ahead of those by Google Maps (48 vs. 42 %). Unfamiliar
users are more positive; 37 % prefer Dethlefs et al.’s directions. Familiar users
may reject them because the instructions tend to be verbose and, as discussed in
Sect. 6.2.1, may include odd landmark references. The significant caveat with this
study is that participants did not actually have to find their way. The study only tested
for naturalness (where their approach is strong) and user preference (where it is
popular with unfamiliar participants, but not so much with familiar ones). No actual
performance data were collected; therefore, no statement can be made about which
kind of directions is actually better.
The SpaceBook system [38] was evaluated in a setup where participants had to
perform eight tasks in two runs. These included both navigation and tourist infor-
mation tasks. The system was tested against a baseline system relying on standard
smartphone applications. Participants rated both systems equally successful in terms
of task completion. However, the baseline system had a better task completion rate
in most navigation tasks, whereas the SpaceBook system performed better in the
tourist information tasks. This is also reflected in user preferences. The baseline system
is preferred for navigation tasks, the SpaceBook system for tourist information tasks.
With the SpaceBook system, users had major issues with navigation because the
system did not provide any graphical information, i.e., no map or directional arrows.
Further, latency and positional errors of the GPS produced some directions too late
or at the wrong locations, which harmed navigation ease and success.
The correct timing of verbal instructions for GPS-based navigation systems
was addressed by Rehrl et al. [55]. They tested instructions using either metric
information or landmark references to indicate where to turn. Participants navigated
test routes in the city of Salzburg wearing headphones that canceled most of
the street noise. An instructor shadowing the participants triggered instructions at
pre-defined locations to avoid any timing issues that may result from poor GPS
positioning. Similar to the SpaceBook project, these instructions were voice-only;
no graphical route information was provided to the participants.
Overall, the study showed that the correct timing of unambiguous instructions
is most crucial for successful wayfinding. Landmark references help remove
ambiguity. While the type of instructions had no effect on walking time, participants
made notably fewer navigation errors when using the landmark-based instructions
compared to those receiving metric references. Basically, participants did not use
metric information at all, but simply waited for the next instruction to come, whereas
landmarks helped them to identify correct turns.
The system of Hile et al. [34] (see Sect. 6.2.2) combines photographic views
along a route with verbal instructions. In their evaluation, participants could switch
between a photo view and a map view. Most used both in their navigation, with
the photo view mostly applied in critical, ambiguous situations. The participants
perceived a range of the presented photographs as confusing, because in the real
world trees blocked the view to the landmark depicted on the photo and referenced
in the verbal instructions, and because photos did not coincide with the participants’
perspective. In the latter case, participants had to mentally transform the perspective
seen on the photograph to the perspective they had while moving, which is a
cognitively demanding task. This clearly indicates that while graphical landmark
references, and especially photographs, can be a powerful support for human users
of a service, a careful selection of these photographs is crucial. Hile et al. addressed
some of the issues identified by the study participants in some follow-up work [35].
Beeharee and Steed [4] got similar results from their user evaluation. Participants
were clearly faster with photo-augmented instructions compared to only map views
because they could use the photographs to disambiguate situations and to confirm
that they were on the right track. But photos not taken directly from the participants’
perspective, or in a different season, or different lighting conditions may confuse
(some) users as again there is a perceptual mismatch that requires cognitive effort
to resolve.
Wither et al. [68] compared their panorama view based navigation service (with
and without enlarged business signs) to a ‘traditional’ map based navigation mode
(again with or without enlarged business signs). They found no difference in
navigation performance between the different modes. However, contrary to expectations,
participants spent more time looking at the panorama view than at the map. Initially,
Wither et al. hypothesized that panoramas were easier to match to the real world,
since they essentially reproduce the perspective in the real world directly. But it
seems that both increased visual complexity of panorama images and people’s prior
experience of using map based navigation counter this expectation. The enlarged
signs had no effect on navigation performance, but were rated as a useful feature.
To sum up, essentially all studies presented in this section clearly demonstrate
the advantages spatial information services gain from incorporating landmarks. But
they also show that there is still a range of design challenges, many due to data
issues (discussed in Sect. 5.5.1), issues of selecting truly relevant and identifiable
landmarks (as exemplified in the study of Hile et al. [34]), i.e., of getting the context
right (Chap. 4), and issues of finding a common ground in describing landmarks (as
discussed in the beginning of Sect. 6.3.1).
6.5 Summary
In this chapter, we looked at landmarks in the interplay between human users and
spatial information services. We highlighted some of the requirements for machine
spatial communication. We then discussed in detail what it takes for a service
to produce and understand landmark references, both verbally and graphically. In
particular, we argued why it is easier to produce landmarks than to understand them.
As in previous chapters, we also looked at some examples from research that aim
at either of these tasks. Finally, we discussed a range of studies that demonstrate
the power of landmark-based communication in human-computer interaction. This
chapter now concludes our argument for why landmarks are a crucial element for
truly intelligent spatial information systems. The results of the evaluation studies
provide convincing evidence for our hypothesis. However, the studies also illustrate
that there is still work to do.
References
1. Allen, G.L.: From knowledge to words to wayfinding: issues in the production and comprehen-
sion of route directions. In: Hirtle, S.C., Frank, A.U. (eds.) Spatial Information Theory. Lecture
Notes in Computer Science, vol. 1329, pp. 363–372. Springer, Berlin (1997)
2. Bateman, J.A., Hois, J., Ross, R., Tenbrink, T.: A linguistic ontology of space for natural
language processing. Artif. Intell. 174(14), 1027–1071 (2010)
3. Beeharee, A., Steed, A.: Filtering location-based information using visibility. In: Strang, T.,
Linnhoff-Popien, C. (eds.) Location- and Context-Awareness. Lecture Notes in Computer
Science, vol. 3479, pp. 306–315. Springer, Berlin (2005)
4. Beeharee, A.K., Steed, A.: A natural wayfinding exploiting photos in pedestrian navigation
systems. In: Proceedings of the 8th Conference on Human-Computer Interaction with Mobile
Devices and Services, MobileHCI ’06, pp. 81–88. ACM, New York (2006)
5. Belz, A.: Automatic generation of weather forecast texts using comprehensive probabilistic
generation-space models. Nat. Lang. Eng. 14(4), 431–455 (2008)
6. Bittner, T., Stell, J.G.: Stratified rough sets and vagueness. In: Kuhn, W., Worboys, M.F.,
Timpf, S. (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 2825,
pp. 270–286. Springer, Berlin (2003)
7. Blaser, A.D., Egenhofer, M.J.: A visual tool for querying geographic databases. In: AVI ’00:
Proceedings of the Working Conference on Advanced Visual Interfaces, pp. 211–216. ACM,
New York (2000)
8. Burnett, G.: Turn right at the traffic lights: the requirement for landmarks in vehicle navigation
systems. J. Navigation 53(3), 499–510 (2000)
9. Burnett, G., Joyner, S.M.: An assessment of moving map and symbol-based route guidance sys-
tems. In: Noy, Y.I. (ed.) Ergonomics and Safety of Intelligent Driver Interfaces, pp. 115–136.
Lawrence Erlbaum Associates, Mahwah (1997)
10. Burnett, G., Smith, D., May, A.: Supporting the navigation task: characteristics of ‘good’
landmarks. In: Hanson, M.A. (ed.) Contemporary Ergonomics 2001, pp. 441–446. Taylor and
Francis, London (2001)
11. Burns, P.C.: Navigation and the older driver. Unpublished Ph.D. Thesis, Loughborough
University (1997)
12. Burrough, P.A., Frank, A.U. (eds.): Geographic Objects with Indeterminate Boundaries. Taylor
and Francis, London (1996)
13. Chipofya, M., Wang, J., Schwering, A.: Towards cognitively plausible spatial representations
for sketch map alignment. In: Egenhofer, M.J., Giudice, N., Moratz, R., Worboys, M.F. (eds.)
Spatial Information Theory. Lecture Notes in Computer Science, vol. 6899. Springer, Berlin
(2011)
14. Clark, S., Hockenmaier, J., Steedman, M.: Building deep dependency structures with a wide-
coverage CCG parser. In: Proceedings of the 40th Annual Meeting of the Association for
Computational Linguistics, ACL ’02, pp. 327–334. Association for Computational Linguistics,
Stroudsburg (2002)
15. Couclelis, H.: Verbal directions for way-finding: space, cognition, and language. In: Portugali,
J. (ed.) The Construction of Cognitive Maps. GeoJournal Library, vol. 32, pp. 133–153.
Kluwer, Dordrecht (1996)
16. Cuayáhuitl, H., Dethlefs, N., Richter, K.F., Tenbrink, T., Bateman, J.: A dialogue system for
indoor way-finding using text-based natural language. Int. J. Comput. Ling. Appl. 1(1–2),
285–304 (2010)
17. Dale, R., Geldof, S., Prost, J.P.: Using natural language generation in automatic route
description. J. Res. Pract. Inform. Tech. 37(1), 89–105 (2005)
18. Denis, M.: The description of routes: a cognitive approach to the production of spatial
discourse. Curr. Psychol. Cognit. 16(4), 409–458 (1997)
19. Dethlefs, N., Cuayáhuitl, H., Richter, K.F., Andonova, E., Tenbrink, T., Bateman, J.: Evaluating
task success in a dialogue system for indoor navigation. In: Lupkowski, P., Purve, M. (eds.)
Aspects of Semantics and Pragmatics of Dialogue. SemDial 2010, pp. 143–146. Polish Society
for Cognitive Science, Poznań (2010)
20. Dethlefs, N., Wu, Y., Kazerani, A., Winter, S.: Generation of adaptive route descriptions in
urban environments. Spatial Cognition & Computation 11(2), 153–177 (2011)
21. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to
represent a digitized line or its caricature. Cartographica Int. J. Geogr. Inform. Geovisualization
10(2), 112–122 (1973)
22. Duckham, M., Winter, S., Robinson, M.: Including landmarks in routing instructions. J.
Location-Based Serv. 4(1), 28–52 (2010)
23. Egenhofer, M.J.: Query processing in spatial-query-by-sketch. J. Vis. Lang. Comput. 8,
403–424 (1997)
24. Egenhofer, M.J., Herring, J.R.: A mathematical framework for the definition of topological
relationships. In: Brassel, K., Kishimoto, H. (eds.) 4th International Symposium on Spatial
Data Handling, pp. 803–813. International Geographical Union, Zürich (1990)
25. Egenhofer, M.J., Mark, D.M.: Naive geography. In: Frank, A.U., Kuhn, W. (eds.) Spatial
Information Theory. Lecture Notes in Computer Science, vol. 998, pp. 1–15. Springer, Berlin
(1995)
26. Elias, B., Paelke, V., Kuhnt, S.: Concepts for the cartographic visualization of landmarks. In:
Gartner, G. (ed.) Location Based Services and Telecartography: Proceedings of the Symposium
2005, Geowissenschaftliche Mitteilungen, pp. 1149–155. TU Vienna, Vienna (2005)
27. Fastenmeier, W., Haller, R., Lerner, G.: A preliminary safety evaluation of route guidance com-
paring different MMI concepts. In: Proceedings of the First World Congress on Applications of
Transport Telematics and Intelligent Vehicle Highway Systems, vol. 4, pp. 1750–1756. Artech
House, Boston (1994)
28. Forbus, K.D., Ferguson, R.W., Usher, J.M.: Towards a computational model of sketching.
In: Proceedings of the 6th International Conference on Intelligent User Interfaces, IUI ’01,
pp. 77–83. ACM, New York (2001)
29. Forbus, K.D., Usher, J., Lovett, A., Lockwood, K., Wetzel, J.: Cogsketch: sketch understanding
for cognitive science research and for education. Top. Cognit. Sci. 3(4), 648–666 (2011)
30. Frank, A.U.: Qualitative spatial reasoning about distances and directions in geographic space.
J. Vis. Lang. Comput. 3, 343–371 (1992)
31. French, R.M.: Moving beyond the Turing test. Comm. ACM 55(12), 74–77 (2012)
32. Goodman, J., Gray, P., Khammampad, K., Brewster, S.: Using landmarks to support older
people in navigation. In: Brewster, S., Dunlop, M. (eds.) Mobile Human-Computer Interaction:
MobileHCI 2004. Lecture Notes in Computer Science, vol. 3160, pp. 38–48. Springer, Berlin
(2004)
33. Grice, P.: Logic and conversation. Syntax and Semantics 3, 41–58 (1975)
34. Hile, H., Vedantham, R., Cuellar, G., Liu, A., Gelfand, N., Grzeszczuk, R., Borriello, G.:
Landmark-based pedestrian navigation from collections of geotagged photos. In: Proceedings
of the 7th International Conference on Mobile and Ubiquitous Multimedia, MUM ’08,
pp. 145–152. ACM, New York (2008)
35. Hile, H., Grzeszczuk, R., Liu, A., Vedantham, R., Košecka, J., Borriello, G.: Landmark-based
pedestrian navigation with enhanced spatial reasoning. In: Tokuda, H., Beigl, M., Friday, A.,
Brush, A., Tobe, Y. (eds.) Pervasive Computing. Lecture Notes in Computer Science, vol. 5538,
pp. 59–76. Springer, Berlin (2009)
36. Hill, L.L.: Georeferencing: The Geographic Associations of Information. Digital Libraries and
Electronic Publishing. MIT Press, Cambridge (2006)
37. Hirtle, S.C., Sorrows, M.E.: Designing a multi-modal tool for locating buildings on a college
campus. J. Environ. Psychol. 18(3), 265–276 (1998)
38. Janarthanam, S., Lemon, O., Bartie, P., Dalmas, T., Dickinson, A., Liu, X., Mackaness, W.,
Webber, B.: Evaluating a city exploration dialogue system combining question-answering
and pedestrian navigation. In: Proceedings of the 51st Annual Meeting of the Association of
Computational Linguistics, pp. 1660–1668. Sofia, Bulgaria (2013)
39. Jurafsky, D., Martin, J.H.: Speech and Language Processing: An Introduction to Natural
Language Processing, Computational Linguistics, and Speech Recognition, 2nd edn. Pearson
Prentice Hall, Upper Saddle River (2008)
40. Klippel, A., Montello, D.R.: Linguistic and non-linguistic turn direction concepts. In: Winter,
S., Duckham, M., Kulik, L., Kuipers, B. (eds.) Spatial Information Theory. Lecture Notes in
Computer Science, vol. 4736, pp. 354–372. Springer, Berlin (2007)
41. Klippel, A., Tappe, H., Habel, C.: Pictorial representations of routes: chunking route segments
during comprehension. In: Freksa, C., Brauer, W., Habel, C., Wender, K.F. (eds.) Spatial
Cognition III. Lecture Notes in Artificial Intelligence, vol. 2685, pp. 11–33. Springer, Berlin
(2003)
42. Klippel, A., Richter, K.F., Hansen, S.: Structural salience as a landmark. In: Workshop Mobile
Maps 2005. Salzburg, Austria (2005)
43. Klippel, A., Hansen, S., Richter, K.F., Winter, S.: Urban granularities: a data structure for
cognitively ergonomic route directions. GeoInformatica 13(2), 223–247 (2009)
44. Kolbe, T.H.: Augmented videos and panoramas for pedestrian navigation. In: Gartner, G.
(ed.) Proceedings of the 2nd Symposium on Location Based Services and TeleCartography,
Geowissenschaftliche Mitteilungen. TU Vienna, Vienna (2004)
45. Kopczynski, M.: Efficient spatial queries with sketches. In: Proceedings of the ISPRS Technical
Commission II Symposium, pp. 19–24, Vienna, Austria (2006)
46. Kopczynski, M., Sester, M.: Graph based methods for localisation by a sketch. In: Proceedings
of the 22nd International Cartographic Conference (ICC2005). La Coruna, Spain (2005)
47. Kulik, L.: A geometric theory of vague boundaries based on supervaluation. In: Montello, D.R.
(ed.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 2205, pp. 44–59.
Springer, Berlin (2001)
48. Lee, Y., Kwong, A., Pun, L., Mack, A.: Multi-media map for visual navigation. J. Geospatial
Eng. 3(2), 87–96 (2001)
49. Lovelace, K.L., Hegarty, M., Montello, D.R.: Elements of good route directions in familiar
and unfamiliar environments. In: Freksa, C., Mark, D.M. (eds.) Spatial Information Theory.
Lecture Notes in Computer Science, vol. 1661, pp. 65–82. Springer, Berlin (1999)
50. Lynch, K.: The Image of the City. The MIT Press, Cambridge (1960)
51. May, A.J., Ross, T., Bayer, S.H., Tarkiainen, M.J.: Pedestrian navigation aids: Information
requirements and design implications. Pers. Ubiquit. Comput. 7(6), 331–338 (2003)
52. Montello, D.R., Frank, A.U.: Modeling directional knowledge and reasoning in environmental
space: testing qualitative metrics. In: Portugali, J. (ed.) The Construction of Cognitive Maps.
GeoJournal Library, vol. 32, pp. 321–344. Kluwer, Dordrecht (1996)
53. Porzel, R., Gurevych, I., Malaka, R.: In context: Integrating domain- and situation-specific
knowledge. In: Wahlster, W. (ed.) SmartKom: Foundations of Multimodal Dialogue Systems,
pp. 269–284. Springer, Berlin (2006)
54. Raubal, M., Winter, S.: Enriching wayfinding instructions with local landmarks. In: Egenhofer,
M.J., Mark, D.M. (eds.) Geographic Information Science. Lecture Notes in Computer Science,
vol. 2478, pp. 243–259. Springer, Berlin (2002)
55. Rehrl, K., Häusler, E., Leitinger, S.: Comparing the effectiveness of GPS-enhanced voice
guidance for pedestrians with metric- and landmark-based instruction sets. In: Fabrikant, S.,
Reichenbacher, T., van Kreveld, M., Schlieder, C. (eds.) Geographic Information Science.
Lecture Notes in Computer Science, vol. 6292, pp. 189–203. Springer, Berlin (2010)
56. Richter, K.F.: A uniform handling of different landmark types in route directions. In: Winter,
S., Duckham, M., Kulik, L., Kuipers, B. (eds.) Spatial Information Theory. Lecture Notes in
Computer Science, vol. 4736, pp. 373–389. Springer, Berlin (2007)
57. Richter, K.F.: Context-Specific Route Directions: Generation of Cognitively Motivated
Wayfinding Instructions, vol. DisKi 314 / SFB/TR 8 Monographs Volume 3. IOS Press,
Amsterdam (2008)
58. Richter, K.F., Winter, S., Santosa, S.: Hierarchical representations of indoor spaces. Environ.
Plann. B Plann. Des. 38(6), 1052–1070 (2011)
59. Ross, T., May, A., Thompson, S.: The use of landmarks in pedestrian navigation instructions
and the effects of context. In: Brewster, S., Dunlop, M. (eds.) Mobile Human-Computer Interaction:
MobileHCI 2004. Lecture Notes in Computer Science, vol. 3160, pp. 300–304. Springer, Berlin
(2004)
60. Takacs, G., Chandrasekhar, V., Gelfand, N., Xiong, Y., Chen, W.C., Bismpigiannis, T.,
Grzeszczuk, R., Pulli, K., Girod, B.: Outdoors augmented reality on mobile phone using loxel-
based visual feature organization. In: Proceedings of the 1st ACM International Conference on
Multimedia Information Retrieval, MIR ’08, pp. 427–434. ACM, New York (2008)
61. Tomko, M., Winter, S., Claramunt, C.: Experiential hierarchies of streets. Comput. Environ.
Urban Syst. 32(1), 41–52 (2008)
62. Turing, A.M.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)
63. Tversky, B., Lee, P.U.: Pictorial and verbal tools for conveying routes. In: Freksa, C., Mark,
D.M. (eds.) Spatial Information Theory. Lecture Notes in Computer Science, vol. 1661,
pp. 51–64. Springer, Berlin (1999)
64. Vasardani, M., Winter, S., Richter, K.F.: Locating place names from place descriptions. Int. J.
Geogr. Inform. Sci. 27(12), 2509–2532 (2013)
65. Wallgrün, J.O., Wolter, D., Richter, K.F.: Qualitative matching of spatial information. In:
Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic
Information Systems, GIS ’10, pp. 300–309. ACM, New York (2010)
66. Winter, S.: Spatial intelligence: ready for a challenge? Spatial Cognit. Comput. 9(2), 138–151
(2009)
67. Winter, S., Truelove, M.: Talking about place where it matters. In: Raubal, M., Mark, D.M.,
Frank, A.U. (eds.) Cognitive and Linguistic Aspects of Geographic Space: New Perspectives
on Geographic Information Research. Lecture Notes in Geoinformation and Cartography,
pp. 121–139. Springer, Berlin (2013)
68. Wither, J., Au, C.E., Rischpater, R., Grzeszczuk, R.: Moving beyond the map: Automated
landmark based pedestrian guidance using street level panoramas. In: Proceedings of the 15th
International Conference on Human-Computer Interaction with Mobile Devices and Services,
MobileHCI, pp. 203–212. ACM, New York (2013)
69. Worboys, M., Duckham, M., Kulik, L.: Commonsense notions of proximity and direction in
environmental space. Spatial Cognit. Comput. 4(4), 285–312 (2004)
70. Zhang, Q.: Multi-modal landmark-integrated route instructions. Unpublished Masters Thesis,
Department of Infrastructure Engineering, The University of Melbourne (2012)
Chapter 7
Conclusions: What Is Known and What Is Still
Challenging About Landmarks
Abstract This chapter concludes the book. It briefly summarizes what we have
discussed in the previous six chapters and then looks ahead. In particular, we
contemplate what it takes for a geospatial system to be intelligent, and what is still
missing in order to build such systems. Overall, we believe that we
have provided an appreciation and better understanding of both the challenges and
potential of landmarks in intelligent geospatial systems.
This book set out to summarize the current state of knowledge about landmarks,
largely based on cognitive research, and to show how this knowledge makes geospatial
systems more capable of interacting with human beings.
We started with defining what a landmark actually is, and very soon realized
that this is not a trivial task. We believe that it is crucial to relate the concept of
landmarks directly to people’s embodied experience and their cognitive processing
of their living environments and, thus, define landmarks to be geographic objects
that structure human mental representations of space. They serve as anchor points
in our mental representations; their internal structure is not important in that respect.
This sets landmarks apart from other geographic concepts that seem similar, such as
places or points of interest.
We have seen that there is a large body of research in the cognitive and
neurosciences backing this definition. Landmarks serve as anchor and reference
points in our mental representations and communication, which makes them crucial
in our understanding of our world. They do this because landmarks stick out. Each
landmark has some aspects that make it grab our attention. These aspects depend
on the context in which we encounter a potential landmark.
If we want computers to use landmarks in their interaction with us, we need
to somehow make the elusive concept of a landmark known to them. We have
exemplified this by presenting an admittedly simple algebraic formalization, but
K.-F. Richter and S. Winter, Landmarks: GIScience for Intelligent Services, 205
DOI 10.1007/978-3-319-05732-3_7, © Springer International Publishing Switzerland 2014
Remember HAL, the artificial intelligence that controls the spacecraft and interacts
with the astronauts in Stanley Kubrick’s film 2001: A Space Odyssey? HAL had
remarkable communication skills, including learning mechanisms, some
of them so creepy as to create the suspense in the movie. For example, note this
competence of introspection: “I know I’ve made some very poor decisions recently,
but I can give you my complete assurance that my work will be back to normal. I’ve
still got the greatest enthusiasm and confidence in the mission. And I want to help
you.” That was way back in 1968. And it was fiction.
How would HAL have communicated with the astronauts on spatial tasks,
such as orientation and wayfinding? To limit context and expectations, let us stick
with terrestrial environments (and abandon the creepiness of HAL). How would an
intelligent system guide us through traffic or to our hotel, or remind us where we
have a meeting tomorrow or where we parked our car last night?
We believe such systems should support human interaction with the environment,
but not control this interaction. After all, Alan Turing set as the gauge for an
intelligent system a machine that is capable of communicating like a person, not
one that functions like a person [17]. Not more and not less. But we have also
pointed out several times in this book that the goal for intelligent geospatial systems
needs to be to also make up for human mistakes and cognitive limitations. Interacting
with humans in a way that is meaningful to them seems to be a reasonable
expectation for an intelligent geospatial system. It is also a more inspiring
expectation than the pure imitation of people’s sometimes unreliable spatial skills and
spatial communication behavior.
At this stage we should clarify our own aspirations in using computers to
understand human cognition and in using them to achieve artificial cognition. In line
with French’s call for meaningful interaction [7], this book does seek to enable
the computer to understand human spatial expressions, and to generate spatial
expressions that can easily be digested by people. The suggested and reviewed
formal models do aim to enter into a meaningful dialog with a person, and hence,
need to sufficiently understand human spatial cognition. But they do not aspire to
explain and map processes of human spatial cognition and communication behavior
on representations in a computer. This book is not about cognitive modeling.
In other words, we do not see a need for computational processes replicating
the neurological basis of spatial representations in the brain in order to achieve a
meaningful dialog supporting a person in spatial decision making. But we do see a
need for models of landmarks that capture the nature of landmarks and are able to
relate to human embodied experiences.
1
http://21robot.org/about/?lang=english, last visited 3/1/2014.
world assumption we have discussed in Chap. 5 (on p. 139). We have seen that
finding outstanding geographic objects may be supported by data mining methods
(Sect. 5.2.2), and reasoning about object types (Sects. 5.3 and 5.5.2).
A combination of these approaches will likely get us quite far in identifying
landmarks in densely populated areas, in particular in city centers and commercial
districts. However, given the uneven distribution of POI data and the lack of business
signage in residential suburbs, there are also large areas of our environment where
we would like an intelligent system to refer to landmarks, but it fails to do so for lack
of data. This is where user-generated content steps in. Implementing a platform that
allows users to contribute landmark information about their neighborhood would
be an ideal supplement to the machine learning techniques used for an automatic
identification of landmarks. Such a platform must be easy to use and offer quick
rewards to the contributor, but it also needs to ensure that the contributed data can
be applied in landmark identification approaches in a straightforward manner [12].
We see the OpenLandmarks platform as a first step in the right direction (see
Sect. 5.5.3). The database that such a user-generated content platform feeds into may be
pre-filled with all landmark candidates we can get from the data mining methods
discussed before. This would serve two purposes: first, contributors are not faced
with a blank canvas, but can already see what their contributions will (are supposed
to) look like. Second, the end result will be an integrated database that combines
user-generated content and automatically extracted landmarks. This will also allow
for correction mechanisms. Users may change information on landmarks the system
identified, and likewise the system may be able to flag improbable landmark
candidates submitted by a user.
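The integration of mined and contributed landmarks, including correction in both directions, can be sketched as a small data structure. This is a minimal illustration under our own assumptions, not the OpenLandmarks implementation: the names `LandmarkCandidate` and `LandmarkStore`, the fields `source` and `flagged`, and the plausibility rule (a contribution outside a fixed service area is held back for review) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LandmarkCandidate:
    """One landmark candidate; all field names are illustrative assumptions."""
    name: str
    lat: float
    lon: float
    source: str            # "mined" (automatic extraction) or "user"
    flagged: bool = False  # set when the system finds the entry improbable


@dataclass
class LandmarkStore:
    """Integrated store for mined and user-contributed candidates."""
    # Hypothetical service area as (min_lat, min_lon, max_lat, max_lon).
    bounds: tuple
    candidates: dict = field(default_factory=dict)
    review: list = field(default_factory=list)

    def prefill(self, mined):
        """Seed the store so contributors do not face a blank canvas."""
        for candidate in mined:
            self.candidates[candidate.name] = candidate

    def contribute(self, candidate):
        """Accept a user entry, or hold it back if its location is improbable."""
        min_lat, min_lon, max_lat, max_lon = self.bounds
        inside = (min_lat <= candidate.lat <= max_lat
                  and min_lon <= candidate.lon <= max_lon)
        candidate.flagged = not inside
        if inside:
            # A user entry replaces a mined entry of the same name.
            self.candidates[candidate.name] = candidate
        else:
            # The system flags the entry instead of storing it.
            self.review.append(candidate)
        return candidate
```

A real plausibility check would need far richer evidence than a bounding box; the point of the sketch is only that user corrections and system flags operate on one shared set of candidates.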
It may well be the case that the success of such a user-generated content
approach to collecting landmark information depends on the developed
platform having a monopoly in collecting such data. Wikipedia would not
be so successful if there were six or seven other online encyclopedias of
similar status competing for contributors. The same holds for OpenStreetMap.
OSM data would be much less useful if every country (or even only every
continent) had its own platform for collecting topographic data, each with
slightly different mechanisms and data structures. In fact, a recent change of
the OSM license, which also affects the kind of data you can use as basis for
your contributions, has led several OSM contributors to fork OSM data into a
new data set where the old license still holds—this move has been particularly
strong in Australia. The consequences of this fork are still to be seen, but some
fragmentation and inconsistencies between the two data sets are most likely
to occur.
services (for a review of early work see [2]). The often cited definition by Dey [4]
takes context to be “any information that can be used to characterize the situation
of entities that are considered relevant to the interaction between a user and an
application, including the user and the application themselves.” This definition is
certainly correct, but it is as difficult to operationalize as some of the
definitions of landmark that we looked at in the beginning of this book.
Thus, rather than compiling and parametrizing a list of factors that may describe
the current context, which will never be complete, we propose a process-oriented
view [6, 14].
This view on context is centered on the task at hand and focuses on the processes
that occur between the involved entities. In the case of intelligent geospatial systems
these entities are the system (S), the user (U), and the environment (E) that user and
system act in and communicate about. Figure 7.1 provides a conceptual overview
of such a context model. The model assumes goal-directed behavior by both user
and system, and accordingly the task determines the processes performed by both
of them. It also accounts for the influences different environments may have on
task performance. Crucially, both user and system may need to adapt to the current
context, i.e., processes of adaptation are a fundamental property of this model.
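To make the process-oriented view more tangible, the following sketch represents a context by its task and the three entities, and derives presentation settings from them. It is an illustration under our own assumptions only: the class `Context`, the function `adapt_instruction_style`, the category labels, and the adaptation rules are hypothetical placeholders and are not taken from the model in [6, 14].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Process-oriented context: the task plus the three involved entities."""
    task: str         # T, e.g. "wayfinding" or "emergency"
    user: str         # U, e.g. "unfamiliar" or "familiar"
    environment: str  # E, e.g. "urban-day" or "urban-night"
    system: str       # S, e.g. "phone" or "car"


def adapt_instruction_style(ctx):
    """Return presentation settings adapted to the current context.

    The rules below are made-up placeholders. They only show that the
    adaptation is driven by the task and refined by U, E and S.
    """
    style = {"detail": "full", "modality": "visual+verbal", "landmarks": "any"}
    if ctx.task == "emergency" or ctx.user == "familiar":
        style["detail"] = "minimal"
    if ctx.environment.endswith("night"):
        style["landmarks"] = "illuminated"
    if ctx.system == "car":
        style["modality"] = "verbal"
    return style
```

Such a rule table is exactly the kind of parametrized factor list that can never be complete; it is shown here only to fix ideas about which entities take part in the adaptation.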
Accordingly, we suggest the following road ahead. The basic first step is to con-
tinue and refine the automatic production of skeletal route and place descriptions [3].
Or, more precisely—because skeletal descriptions are not formally defined—the
aim is to produce descriptions of sufficient pragmatic information content that use a
minimal number of references, i.e., descriptions that are relevant and short.
We have already come a long way here if you think of the substantial preliminary
work both in verbal and graphical communication we discussed in this book
(e.g., [1, 8, 11, 16]). This work largely assumes an unfamiliar communication partner
(the U in Fig. 7.1), who is the easiest to model. Other factors of the communication
context have not yet been fully addressed, for example, an adaptation to a particular
environment or the time of the day (E).
Another change to the environment, and consequently to the task (T), would be an
adaptation of skeletal descriptions to emergency situations. Again, this is a change
to context that is relatively easy to capture and model. It demands particularly clear
and unambiguous communication. When people are under stress, there is no time
for thinking about what the system may have actually meant.
Adapting descriptions to the prior knowledge of the user would be a desirable
achievement. Such adaptation is fundamental for pragmatic information content,
i.e., communicating only what is relevant. It would prevent your car navigation
system from telling you how to get from your home to the highway in all detail—
something you have likely done hundreds of times before. There is preliminary work
done in that direction as well, most of which assumes that you own the system
that communicates with you. These systems infer your knowledge by tracking your
movements (e.g., [9, 15]). More challenging, but also more interesting, is a scenario
where the system has to figure out the user’s prior knowledge by dialog [13]. Such a
scenario requires flexibility and adaptation, and also strategies of negotiation. This
step gets us finally to (landmark) understanding, which we have argued to be more
difficult than (landmark) producing.
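The tracking-based variant of this adaptation can be sketched in a few lines. The sketch assumes, hypothetically, that a route is a list of segments with one instruction each and that the system has counted how often the user has traveled each segment; the function name `adapt_to_prior_knowledge` and the threshold `familiar_after` are our own illustrative choices, not the method of [9, 15].

```python
def adapt_to_prior_knowledge(route, visit_counts, familiar_after=10):
    """Collapse runs of familiar segments into one summary instruction.

    route: list of (segment_id, instruction) pairs in travel order.
    visit_counts: segment_id -> number of past traversals (e.g. from tracking).
    familiar_after: hypothetical threshold for treating a segment as known.
    """
    adapted = []
    run = []  # instructions of the current run of familiar segments

    def flush():
        # Replace a whole familiar run by a single coarse instruction.
        if run:
            adapted.append(f"Go as usual until: {run[-1]}")
            run.clear()

    for segment_id, instruction in route:
        if visit_counts.get(segment_id, 0) >= familiar_after:
            run.append(instruction)
        else:
            flush()
            adapted.append(instruction)
    flush()
    return adapted
```

For a trip from home, the familiar segments up to the highway would collapse into one instruction, while unfamiliar segments keep their full instructions. The dialog-based scenario is harder precisely because no such counts are available and the threshold test has to be replaced by negotiation with the user.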
These dialog-based systems will need to understand place descriptions by people
in situ, using location information and keeping track of what has been
mentioned before in the current dialog. But it will require more than this. The system
will have to be capable of mapping the user’s understanding of an environment to
its own representation. This means that the systems’ internal representations will
need to be created based on the defining elements of human spatial representation—
landmarks.
In the long run, these developments will lead to intelligent geospatial systems:
systems as flexible and adaptive as a human communication partner, with reactive and
proactive behavior and constructive communication. Landmarks are the elements
that can tie it all together by structuring both the human’s and the system’s representation
of space.
References
1. Agrawala, M., Stolte, C.: Rendering effective route maps: Improving usability through
generalization. In: SIGGRAPH 2001, pp. 241–250. ACM, Los Angeles (2001)
2. Chen, G., Kotz, D.: A survey of context-aware mobile computing research. Technical Report
TR2000-381. Dartmouth College, Hanover (2000)
3. Denis, M.: The description of routes: a cognitive approach to the production of spatial
discourse. Curr. Psychol. Cognit. 16(4), 409–458 (1997)
4. Dey, A.K.: Understanding and using context. Pers. Ubiquit. Comput. 5(1), 4–7 (2001)
5. Duckham, M., Winter, S., Robinson, M.: Including landmarks in routing instructions. J.
Location-Based Serv. 4(1), 28–52 (2010)
6. Freksa, C., Klippel, A., Winter, S.: A cognitive perspective on spatial context. In: Cohn,
A., Freksa, C., Nebel, B. (eds.) Spatial Cognition: Specialization and Integration. Dagstuhl
Seminar Proceedings, vol. 05491. Dagstuhl, Germany (2007)
7. French, R.M.: Moving beyond the Turing test. Comm. ACM 55(12), 74–77 (2012)
8. Kopf, J., Agrawala, M., Bargeron, D., Cohen, M.F.: Automatic generation of destination maps.
In: SIGGRAPH Asia, pp. 158:1–158:12. ACM, New York (2010)
9. Patel, K., Chen, M.Y., Smith, I., Landay, J.A.: Personalizing routes. In: UIST ’06: Pro-
ceedings of the 19th Annual ACM Symposium on User Interface Software and Technology,
pp. 187–190. ACM Press, New York (2006)
10. Rauh, R., Hagen, C., Knauff, M., Kuss, T., Schlieder, C., Strube, G.: Preferred and alternative
mental models in spatial reasoning. Spatial Cognit. Comput. 5(2–3), 239–269 (2005)
11. Richter, K.F.: Context-Specific Route Directions: Generation of Cognitively Motivated
Wayfinding Instructions, vol. DisKi 314 / SFB/TR 8 Monographs Volume 3. IOS Press,
Amsterdam (2008)
12. Richter, K.F., Winter, S.: Citizens as database: conscious ubiquity in data collection. In: Pfoser,
D., Tao, Y., Mouratidis, K., Nascimento, M., Mokbel, M., Shekhar, S., Huang, Y. (eds.)
Advances in Spatial and Temporal Databases. Lecture Notes in Computer Science, vol. 6849,
pp. 445–448. Springer, Berlin (2011)
13. Richter, K.F., Tomko, M., Winter, S.: A dialog-driven process of generating route directions.
Comput. Environ. Urban Syst. 32(3), 233–245 (2008)
14. Richter, K.F., Dara-Abrams, D., Raubal, M.: Navigating and learning with location based
services: a user-centric design. In: Gartner, G., Li, Y. (eds.) Proceedings of the 7th International
Symposium on LBS and Telecartography, pp. 261–276 (2010)
15. Schmid, F.: Knowledge based wayfinding maps for small display cartography. J.
Location-Based Serv. 2(1), 57–83 (2008)
16. Tomko, M., Winter, S.: Pragmatic construction of destination descriptions for urban environ-
ments. Spatial Cognit. Comput. 9(1), 1–29 (2009)
17. Turing, A.M.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)
18. Winter, S., Freksa, C.: Approaching the notion of place by contrast. J. Spatial Inform. Sci.
2012(5), 31–50 (2012)
19. Winter, S., Truelove, M.: Talking about place where it matters. In: Raubal, M., Mark, D.M.,
Frank, A.U. (eds.) Cognitive and Linguistic Aspects of Geographic Space: New Perspectives
on Geographic Information Research. Lecture Notes in Geoinformation and Cartography,
pp. 121–139. Springer, Berlin (2013)