Automatic Annotation of Geographic Maps

Mirko Horstmann1, Wilko Heuten2, Andrea Miene1, and Susanne Boll3
1 Center for Computing Technologies (TZI), Universität Bremen, Postfach 330440, 28334 Bremen, Germany {mir, miene}
2 OFFIS, Escherweg 2, 26121 Oldenburg, Germany
3 University of Oldenburg, Department of Computing Science, Escherweg 2, 26121 Oldenburg, Germany

Abstract. In this paper, we describe an approach to generate semantic descriptions of entities in city maps so that they can be presented through accessible interfaces. The solution we present processes bitmap images containing city map excerpts. Regions of interest in these images are extracted automatically based on colour information, and their geometric properties are subsequently determined. The result of this process is a structured description of these regions based on the Geography Markup Language (GML), an XML-based format for the description of GIS data. This description can later serve as an input to innovative presentations of spatial structures using haptic and auditory interfaces.



Many of our daily tasks require knowledge about the layout and organisational structure of physical environments. These tasks include navigation and orientation as well as communication about geographic locations and are most often supported by maps. A map can be seen as a two-dimensional representation of a real world environment with a reduced amount of information that is created for a specific context and goal. Information stored in maps can be used to build a mental model of the physical space and to understand geographical entities and relationships. Maps use graphical representations to visualise an area's spatial layout and the semantic entities it contains such as parks, gardens, buildings and streets. Most of the existing map material is stored and managed in Geographic Information Systems (GIS) by the publisher and typically includes the modelling of the geographic area and the different elements and layers that belong to it. However, in many cases an end user only receives a bitmap of the map in which all semantic entities only exist implicitly, encoded as coloured pixels. This almost complete loss of semantic information excludes many people from using geographic maps as an orientation and exploration support: visually impaired and blind people cannot see the layout of the map, illiterate people do not have access to the text included in maps, and people with motor deficiencies cannot point to tiny elements on the map as there is no option to zoom into them in a semantic fisheye fashion. Unfortunately, visually impaired people rely even more on the information stored in maps than sighted people because building a detailed mental model is required as a preparation for tasks like navigation and orientation in unfamiliar environments, as is shown in [6].

A number of techniques can be employed to make maps accessible, which use specialised, non-visual presentations to convey information. The most widely used method is the tactile printout of a map, which can be produced on swell paper or, in a more complex process, with thermoforming. However, tactile diagrams suffer from their relatively low resolution and the limited ability of the finger tips to recognise fine structure. Patterns that symbolise different areas on a map are therefore limited to a few distinguishable textures. Furthermore, lines of Braille text are usually 6 mm high and cannot be reduced in size, and thereby further clutter the tactile image. The result of these problems is that existing maps usually have to be completely redesigned in order to present them in a tactile format. Nevertheless, tactile maps are an important aid for blind people to make themselves familiar with new environments. Projects like TACIS [1] or, more recently, BATS [2] have tried to overcome the limitations of tactile maps with combined tactile and auditory output. Like most approaches, they rely on the conversion of existing material, which often means that a laborious manual process must be applied if this material is not already in a structured format that includes semantic descriptions. Contrary to what one might think, there is no fully automatic process for such a conversion. The TeDUB [3] system therefore aimed to semi-automatically interpret simple diagrams for an accessible presentation using image and knowledge processing techniques.

Our approach to the problem is a software system that extracts semantic entities from maps provided as bitmap images. The software first identifies coherent regions of similar colour and classifies them as one of several known types. Next, nearby regions are grouped to form single entities. For each of these, a structured representation of its shape and its type is generated in the standardised Geography Markup Language (GML). This description scheme then forms the semantic annotation of the map, which can be used in various ways, e.g. for non-visual representations like haptic, tactile or auditory display as well as any multimodal combination. Our proposed solution focuses on city maps. However, most ideas can be applied to other map types. The proposed use-case is that of exploration of a given area for an overview, rather than for exact navigation. With this information the user can build a mental model and become familiar with a physical environment. This first implementation is therefore meant as an alternative to tactile orientation maps. Maps may later enter the system through a scanner interface for printed material or as bitmap images from web pages.

ICCHP 2006

2 Requirements for the software

In this section, we identify the requirements for a system that extracts semantic information from existing city maps automatically and provides this information in a format that can be used by other systems for a non-visual representation for blind and visually impaired or otherwise print impaired people. The analysis addresses three topics: the kind of information that has to be conveyed to the user, the requirements for existing maps from which to extract the information, and the interchange format for the information.

2.1 Entities of City Maps

There are no known open standards for what objects a city map should consist of. Furthermore, publishers tend to establish their own standards. Therefore we have compared several city maps of various online and print publishers and have found the following set of typical objects:

– Parks and gardens
– Water (lakes, rivers, seas)
– Streets of various types
– Squares and Places
– Quarters and other organisational structures
– Monuments, sights, points of general interest (hotels, shops, ...)
– Public buildings (churches, schools, town hall, ...)
– Public transportation information (stations, routes, ...)
– Bridges and tunnels
– Additional objects to help interpreting the map (keys, scale, north arrow, ...)

Most of these items are associated with additional attributes like their names or one-way directions for the traffic. The city map conveys the shapes of certain object types. Moreover, a city map implies the location of objects relative to other objects based on a mapping from the real world. With the help of a scale a user can measure distances and determine sizes of objects.

In order to become familiar with a city and to get an initial overview, large objects, their shape and location as well as distances between objects are more important than small streets and single buildings. Furthermore, too much information at the same time will make it more difficult for the user to build a mental model of the depicted city. Therefore, the presentation should only include larger objects or groups of smaller objects of the same type which are geographically close to each other. For example, single houses should be grouped into blocks, and several blocks into residential areas. In order to reduce the amount of information which is presented at the same time, functions that appear in current map viewers are also useful for non-visual representations. These include filtering objects, changing the level of detail, zooming and panning. These have to be implemented in the specific viewer and are not covered in this paper, although the underlying format used to communicate the data must support these functions (see Section 2.3).
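One way a viewer could reduce the amount of presented information is to drop features below a minimum size. The following is a minimal sketch in Python and not part of the system described in this paper: the feature dictionaries, the `ring` key and the `min_area` parameter are hypothetical names chosen for illustration, and polygon area (computed with the shoelace formula) is only one possible measure of how large an object is.

```python
def polygon_area(ring):
    """Area of a closed ring of (x, y) vertices (first vertex == last vertex),
    computed with the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2


def filter_by_size(features, min_area):
    """Keep only features whose outline covers at least min_area square units.

    Each feature is assumed to be a dict with a "ring" entry (hypothetical
    layout used only in this sketch).
    """
    return [f for f in features if polygon_area(f["ring"]) >= min_area]
```

A viewer built along these lines could raise `min_area` for an overview and lower it again when the user zooms in.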

2.2 Maps for Semantic Extraction

Maps usually come in two formats: bitmap images and vector graphics (GIS data). Although vector graphics are more amenable to the task of extracting the necessary spatial information about areas of interest, it is often the case that bitmap versions are more easily available (e.g. on web pages or as scanned images from printed maps), whereas vector versions usually have to be obtained commercially and are then rather restricted regarding their use and redistribution.

2.3 General Requirements for Modelling Semantic Annotations of City Maps

It is important that the format for storing the extracted semantic information is open, easy to read and easy to distribute. In order to share semantic information between other maps and publishers, the semantic information should be stored separately from the map itself. Therefore a standardised modelling language for geographic entities is strongly recommended. The format should be powerful enough to describe the entities listed in Section 2.1 and their attributes. Furthermore a publisher should be able to extend the description for individual needs. Keeping the later non-visual representation in mind, the description of the geographic objects should not be in a visual format, e.g. pixels, but rather in a vector format, which can be used for haptic and auditory rendering.

3 Automatic Extraction of Semantic Information from City Maps

There are various types of maps which code geographic information in different ways. Nevertheless, sighted people are able to identify the most important things at a glance. In this paper we focus on city maps with a typical set of colours to distinguish different entities like watercourses, parks or buildings, and our approach makes use of this special kind of colour code for an automatic extraction of entities. After a pre-processing step that removes text and noise from the image, image regions with a specific colour (one that is within a given interval of values for the separate colour channels) are detected. This results in a set of image masks, each of which marks all regions of one respective kind of entity. Nearby regions are then grouped to form areas with the respective entities.

Since we do not consider textual information printed on the map in this first prototype, our first processing step is to remove text through morphologic operations. An example is shown in Fig. 1, where black text is removed through a morphologic closing operation. This approach is simple but suitable to remove text on most maps. If a map includes text in different colours, our more advanced approaches for text/background separation could later be employed (see e.g. [5, 4]). To reduce noise the image is smoothed with a median filter.

The next step is to find image regions with a specific colour which represent certain objects. The colour of each requested object is specified by colour intervals given by a minimum and maximum value for each of the red, green and blue colour channels. To segment the image regions belonging to a known entity, the image is binarised using the colour intervals specified for that entity.
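The pre-processing and colour segmentation steps can be sketched as follows. The prototype uses OpenCV; this stand-alone Python version uses only the standard library, fixes a 3x3 neighbourhood and represents an image as a list of rows of (r, g, b) tuples. These choices, the function names and the example colour interval are assumptions made for illustration and should not be read as the authors' implementation.

```python
from statistics import median


def filter3x3(img, reduce_fn):
    """Apply reduce_fn per colour channel over each pixel's 3x3 neighbourhood.

    img is a list of rows of (r, g, b) tuples; border pixels are handled by
    clamping neighbour coordinates to the image.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            row.append(tuple(reduce_fn(px[c] for px in window) for c in range(3)))
        out.append(row)
    return out


def remove_dark_text(img):
    """Morphological closing (dilation, then erosion): thin dark strokes such
    as black lettering are replaced by the surrounding brighter colour."""
    return filter3x3(filter3x3(img, max), min)


def median_smooth(img):
    """Median filter to suppress isolated noise pixels."""
    return filter3x3(img, median)


def colour_mask(img, lo, hi):
    """Binarise: True where every channel lies within [lo, hi] for that channel."""
    return [[all(lo[c] <= px[c] <= hi[c] for c in range(3)) for px in row]
            for row in img]


# Hypothetical interval for "water"; real values depend on the map's colours.
WATER_LO, WATER_HI = (130, 170, 210), (170, 210, 250)
```

Calling `colour_mask(median_smooth(remove_dark_text(img)), WATER_LO, WATER_HI)` yields one binary mask per entity type when repeated with the interval of each type. A 3x3 window only removes strokes one or two pixels wide; thicker lettering would need a larger structuring element.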

Fig. 1. City map example “Britzer Garten” (original, left) and after removing textual information (right). © Falk Verlag, http://www.falk.de

Fig. 2. City map example “Britzer Garten” and segmentation results for watercourses. © Falk Verlag, http://www.falk.de

The resulting image mask shows pixels that are within the given colour intervals and therefore represent the requested objects. Small areas which are irrelevant for further interpretation are removed from the binary image by morphologic operations. Fig. 2 shows an example for the segmentation of watercourses.

Often, objects are split into several image regions by other objects. A typical example is a park which is split into several green image regions by paths or roads crossing through it. Such regions have to be grouped together and treated as one object during further interpretation. To group these regions together, a threshold is specified up to which distance regions of the same colour are clustered together. The clustering step also allows us to group together several buildings to a building complex. Fig. 3 shows how the clustering process is influenced by the distance threshold. A cluster of regions is then represented by its convex hull, which is later described as a polygon.

Fig. 3. Clustering of watercourses with a distance threshold of 15 (left) and 60 (right). © Falk Verlag, http://www.

4 GML for Modelling Semantic Information on City Maps

The Geography Markup Language (GML), an initiative by the Open Geospatial Consortium (OGC), enables the specification of two- and three-dimensional geographical objects (also referred to as features). It provides only a general framework for describing geographic features (e.g., it provides a “polygon” element rather than “lake”, “forest” or “building” elements). It is the application developer's task to specify his or her own application schema. By using and extending the GML we can describe and model the extracted semantic information discussed in Section 2.1 and ensure that the format is open and standardised and therefore readable by everyone who wants to convey geographical information to blind and visually impaired users.

Section 2.1 lists a set of geographic entities that we need to describe, a “collection of features” in the GML nomenclature. GML provides abstract types for features as well as collections. In order to use them, they must be extended to form concrete types. For our schema, we have therefore defined two element types “GeographicFeatureCollectionType” and “GeographicFeatureType” based on the mentioned abstract types. From these types, we instantiate two elements, “FeatureCollection” and “Feature”, where the goal of the FeatureCollection element is to hold all Feature elements:

    <xs:element name="Feature"
                type="en:GeographicFeatureType"
                substitutionGroup="gml:_Feature" />
    <!-- root element -->
    <xs:element name="FeatureCollection"
                type="en:GeographicFeatureCollectionType"
                substitutionGroup="gml:_FeatureCollection" />

Our features are either buildings, parks, lakes, sights, or squares. Therefore, we have added a required attribute to the “GeographicFeatureType”, which forces the GML instance author to categorise each feature accordingly:

    <xs:simpleType name="GeographicDescriptionType">
      <xs:restriction base="xs:string">
        <xs:enumeration value="Building"/>
        <xs:enumeration value="Park"/>
        <xs:enumeration value="Lake"/>
        <xs:enumeration value="Sight"/>
        <xs:enumeration value="Square"/>
        ...
      </xs:restriction>
    </xs:simpleType>

Therefore, by using this schema, we are able to add new geographic elements easily.

There are several standard object properties that we can assign to each feature. For example, we can assign a name and a description to a feature. We can also specify the bounding box of the feature if we wish; however, this is not mandatory. For our purpose, the really important information is the definition of the features as polygons and their vertex coordinates as well as their “featureType”.

The GML location element allows us to specify the location of a feature as a polygon element. Features can also be points, multi-point lines, curves, as well as more general objects. In the gml:Polygon element, we can specify the exterior using the “gml:exterior” element. A polygon exterior consists of an exterior linear ring. The coordinate pairs of this ring are the vertex coordinates of the polygon, whereby the vertex coordinates are given with the last pair of coordinates being the same as the first. Thus, there must be at least three vertices. The coordinate pairs are also referred to as control points of the linear ring. We can specify these control points with the “gml:pos” element, whereby we enter the coordinate pair separated by a whitespace. For example, the description of a building would look like this:

    <en:Feature featureType="Building">
      <gml:location>
        <gml:Polygon>
          <gml:exterior>
            <gml:LinearRing>
              <gml:pos>619 209</gml:pos>
              <gml:pos>643 125</gml:pos>
              <gml:pos>706 84</gml:pos>
              <gml:pos>716 99</gml:pos>
              <gml:pos>677 228</gml:pos>
              <gml:pos>619 209</gml:pos>
            </gml:LinearRing>
          </gml:exterior>
        </gml:Polygon>
      </gml:location>
    </en:Feature>

In this case, the coordinate pairs are simply the pixel coordinates of the bitmap image. However, GML provides support for various coordinate systems, so that instead of these pixel coordinates we could later use Gauss Krüger coordinates.
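To connect the clustering step with the GML encoding, the sketch below groups regions by a distance threshold, computes the convex hull of a cluster and writes it as a closed linear ring. It is an illustration under assumptions: the distance between two regions is taken as the smallest pixel-to-pixel distance, the hull is computed with Andrew's monotone chain algorithm, and the namespace URI bound to the `en` prefix is a placeholder, since the text does not state how the prototype handles any of these.

```python
import xml.etree.ElementTree as ET
from math import hypot

GML_NS = "http://www.opengis.net/gml"   # GML namespace of the OGC
EN_NS = "http://example.org/en"         # placeholder URI for the "en" prefix


def cluster_regions(regions, threshold):
    """Merge regions (lists of (x, y) pixels) whose closest pixels are at most
    `threshold` apart; returns one combined pixel list per cluster."""
    parent = list(range(len(regions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            gap = min(hypot(ax - bx, ay - by)
                      for ax, ay in regions[i] for bx, by in regions[j])
            if gap <= threshold:
                parent[find(i)] = find(j)
    clusters = {}
    for i, region in enumerate(regions):
        clusters.setdefault(find(i), []).extend(region)
    return list(clusters.values())


def convex_hull(points):
    """Convex hull vertices in order around the hull (monotone chain);
    collinear points on the boundary are dropped."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def feature_to_gml(hull, feature_type):
    """Serialise hull vertices as an en:Feature with a closed gml:LinearRing."""
    if len(hull) < 3:
        raise ValueError("a polygon needs at least three vertices")
    ET.register_namespace("gml", GML_NS)
    ET.register_namespace("en", EN_NS)
    feature = ET.Element(f"{{{EN_NS}}}Feature", {"featureType": feature_type})
    node = feature
    for tag in ("location", "Polygon", "exterior", "LinearRing"):
        node = ET.SubElement(node, f"{{{GML_NS}}}{tag}")
    for x, y in hull + hull[:1]:            # repeat first vertex to close the ring
        ET.SubElement(node, f"{{{GML_NS}}}pos").text = f"{x} {y}"
    return ET.tostring(feature, encoding="unicode")
```

The pairwise distance computation is quadratic in the number of pixels, so a practical version would compare region outlines or bounding boxes rather than every pixel.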

5 Results and Future Work

In this paper we present a promising approach for improving the accessibility of maps. We show that automatic methods can be applied to extract geographic information from maps, which leads to a semantic annotation. We are using the open standard GML to describe the extracted semantic information in a structure which can be transformed to auditory or haptic presentations. Our solution demonstrates a high potential in helping people with special needs to access maps in bitmap format, which is the most used format on the Web.

Our future work will concentrate on two aspects: First, the exploitation of other image features like text or symbols to extract additional kinds of information. This will not only extend the amount of useful information extracted from city maps but will also allow us to investigate the usefulness of our approach for other kinds of maps that do not rely as much on colour codes. Second, a closer integration of the methods into an existing workflow. Experience in image processing shows that more complex information cannot always be expected to be extracted automatically, so a semi-automatic process should be the aim. This will allow a content creator to make use of the methods in the larger scope of accessible map creation.

Acknowledgement

This paper is supported by the European Community's Sixth Framework Programme, FP6-2003-IST-2-004778. The prototype uses OpenCV, the Open Source Computer Vision Library (http://www. ... index.htm). We would like to thank Alexander Köhn for his support during the development of the software prototype.

References

1. Gallagher, B., Frasch, W.: Tactile acoustic computer interaction system (TACIS): A new type of graphic access for the blind. In Technology for Inclusive Design and Equality Improving the Quality of Life for the European Citizen, Proceedings of the 3rd TIDE Congress, Helsinki, June 1998.
2. Parente, P., Bishop, G.: BATS: The Blind Audio Tactile Mapping System. In Proceedings of the ACM Southeast Conference (ACMSE ’03), Savannah GA, March 2003.
3. Horstmann, M., Lorenz, M., Watkowski, A., Ioannidis, G., Herzog, O., King, A., Evans, D. G., Hagen, C., Schlieder, C., Burn, A.-M., King, N., Petrie, H., Dijkstra, S., Crombie, D.: Automated interpretation and accessible presentation of technical diagrams for blind people. In New Review of Hypermedia and Multimedia, Volume 10(2), December 2004, pp. 141–163.
4. Miene, A., Hermes, T., Ioannidis, G.: Extracting Textual Inserts from Digital Videos. In Proc. of the Sixth International Conference on Document Analysis and Recognition (ICDAR’01), pp. 1079–1083, Seattle, Washington, USA, September 10–13, 2001. IEEE Computer Society, 2001.
5. Becker: Automatische Extraktion von Szenentext. Diploma thesis, Universität Bremen, 2005.
6. Espinosa, M. A., Ungar, S., Ochaíta, E., Blades, M., Spencer, C.: Comparing methods for introducing blind and visually impaired people to unfamiliar urban environments. In Journal of Environmental Psychology, 18, 1998, pp. 277-287.