
Dieter Pfoser

Ki-Joune Li (Eds.)

Web and Wireless


LNCS 8470

Geographical
Information Systems
13th International Symposium, W2GIS 2014
Seoul, South Korea, May 29–30, 2014
Proceedings

Lecture Notes in Computer Science 8470
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbruecken, Germany
Dieter Pfoser Ki-Joune Li (Eds.)

Web and Wireless


Geographical
Information Systems
13th International Symposium, W2GIS 2014
Seoul, South Korea, May 29-30, 2014
Proceedings

Volume Editors
Dieter Pfoser
George Mason University
Department of Geography
and Geoinformation Science
Fairfax, VA, USA
E-mail: dpfoser@gmu.edu
Ki-Joune Li
Pusan National University
Department of Computer Science
and Engineering
Pusan, South Korea
E-mail: lik@pnu.edu

ISSN 0302-9743 e-ISSN 1611-3349


ISBN 978-3-642-55333-2 e-ISBN 978-3-642-55334-9
DOI 10.1007/978-3-642-55334-9
Springer Heidelberg New York Dordrecht London

Library of Congress Control Number: 2014937289

LNCS Sublibrary: SL 3 – Information Systems and Applications, incl. Internet/Web and HCI
© Springer-Verlag Berlin Heidelberg 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and
executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication
or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location,
in its current version, and permission for use must always be obtained from Springer. Permissions for use
may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution
under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication,
neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or
omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface

These proceedings contain the papers selected for presentation at the 13th In-
ternational Symposium on Web and Wireless GIS held during May 29–30, 2014.
This symposium was intended to provide a forum for reviewing advances in
both the theoretical and the technical developments in the Web and wireless
GIS area. Compared to other academic events on GIS, this series of symposia
focuses on Web and wireless aspects. The first symposium was held in Kyoto
in 2001. The locations have since then been alternating between Asia, Europe,
and North America and this year’s W2GIS symposium was held in Seoul, South
Korea.
In all, 22 submissions were received from Europe, Asia, North America, and
the Middle East. Even though the number of submissions was slightly
smaller than in the previous years, the quality of the papers was very high.
Through a rigorous review process with three reviewers per paper, 12 papers were
selected for presentation at the symposium and publication in the proceedings.
The selected papers cover several interesting topics including parallel processing
of geo-spatial data, the geo-social net and geo-referenced multimedia, geo-sensor
networks, indoor GIS, and Web and wireless GIS applications. All topics reflect
recent progress in the domain of Web and wireless GIS.
Distinguished keynote addresses were given by Dr. Erik Hoel from ESRI, Prof.
Cyrus Shahabi from USC, and Dr. Sang-joon Park from ETRI. Dr. Hoel provided
an overview of green field research topics from an industrial perspective. Prof.
Shahabi explained the basic concepts and challenges of GeoCrowd. Dr. Park
gave an explanation of indoor positioning technologies based on his research and
development experience at ETRI over the past ten years.
We wish to thank the authors for their high-quality contributions and the
Program Committee for their thorough and timely reviews. We also would like
to thank the sponsors and Springer LNCS for their support of the symposium.
Finally, our thanks go also to the Steering Committee for providing continuous
advice.

May 2014 Ki-Joune Li


Dieter Pfoser
W2GIS 2014 Symposium Committee

Symposium Chair
Ki-Joune Li Pusan National University, South Korea
D. Pfoser George Mason University, USA

Steering Committee
M. Bertolotto University College Dublin, Ireland
J.D. Carswell Dublin Institute of Technology, Ireland
C. Claramunt Naval Academy Research Institute, France
M. Egenhofer NCGIA, USA
K.J. Li Pusan National University, South Korea
S. Liang University of Calgary, Canada
K. Sumiya University of Hyogo, Japan
T. Tezuka University of Tsukuba, Japan
C. Vangenot University of Geneva, Switzerland

Program Committee
M. Arikawa University of Tokyo, Japan
S. Bell University of Saskatchewan, Canada
A. Bouju La Rochelle University, France
T. Brinkhoff Jade University Oldenburg, Germany
E. Camossi European Commission, Joint Research Centre,
Ispra, Italy
T.-Y. Chou Feng Chia University, Taiwan
R. De By ITC, The Netherlands
S. Di Martino University of Naples Federico II, Italy
M. Duckham University of Melbourne, Australia
P. Froehlich Telecommunications Research Center Vienna,
Austria
J. Gensel Laboratoire d’Informatique de Grenoble,
France
Y. Ishikawa Nagoya University, Japan
B. Jiang University of Gävle, Sweden
H.K. Kang KRIHS, South Korea
H. Karimi University of Pittsburgh, USA
Y. Kidawara National Institute of Information and Communications
Technology, Japan
M.S. Kim ETRI, South Korea

K.S. Kim National Institute of Information and Communications
Technology, Japan
D. Kitayama Kogakuin University, Japan
B. Köbben ITC - University of Twente, The Netherlands
Y.J. Kwon Korea Aerospace University, South Korea
D.L. Lee HKUST, Hong Kong
R. Lee National Institute of Information and Communications
Technology, Japan
S. Li Ryerson University, Canada
H. Lu Aalborg University, Denmark
M.R. Luaces University of A Coruña, Spain
H. Martin Laboratoire d’Informatique de Grenoble,
France
P. Muro-Medrano University of Zaragoza, Spain
K. Patroumpas National Technical University of Athens,
Greece
M. Petit Matiasat System R&D, France
C. Ray Naval Academy Research Institute, France
K.F. Richter University of Zurich, Switzerland
M. Schneider University of Florida, USA
S. Shekhar University of Minnesota, USA
M. Tomko University of Zurich, Switzerland
G. Tortora University of Salerno, Italy
T. Ushiama Kyushu University, Japan
A. Voisard Freie Universität Berlin and Fraunhofer,
Germany
X. Wang University of Calgary, Canada
S. Winter University of Melbourne, Australia
H. Wu Wuhan University, China
P. Yang George Mason University, USA

Local Arrangements
B.G. Kim Pusan National University, South Korea
J.H. Ham Pusan National University, South Korea

Sponsors
Pusan National University, South Korea
Korea Spatial Information Society, South Korea
Korea Agency for Infrastructure Technology Advancement, South Korea
Loc&All Ltd., South Korea
Table of Contents

Session 1: Communication and Parallel Processing


for Geospatial Data
On Parallelizing Large Spatial Queries Using Map-Reduce . . . . . . . . . . . . . 1
Umesh Bellur

Feathered Tiles with Uniform Payload Size for Progressive Transmission


of Vector Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Andrew Dufilie and Georges Grinstein

Session 2: Geo-Social Net, Crowdsourcing, and


Trajectory
Trajectory Aggregation for a Routable Map . . . . . . . . . . . . . . . . . . . . . . . . . 36
Sebastian Müller, Paras Mehta, and Agnès Voisard

A Study of Users’ Movements Based on Check-In Data in


Location-Based Social Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Jinzhou Cao, Qingwu Hu, and Qingquan Li

Key Frame Selection Algorithms for Automatic Generation of


Panoramic Images from Crowdsourced Geo-tagged Videos . . . . . . . . . . . . . 67
Seon Ho Kim, Ying Lu, Junyuan Shi, Abdullah Alfarrarjeh,
Cyrus Shahabi, Guanfeng Wang, and Roger Zimmermann

Session 3: Geo-Sensor Network


ReSDaP: A Real-Time Data Provision System Architecture for Sensor
Webs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Huan Li, Hong Fan, Huayi Wu, Hao Feng, and Pengpeng Li

GeosensorBase: Integrating and Managing Huge Number of


Heterogeneous Sensors Using Sensor Adaptors and Extended SQL
Querying . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Min Soo Kim, Chung Ho Lee, In Sung Jang, and Ki-Joune Li

Session 4: Applications of W2GIS


ForestMaps: A Computational Model and Visualization for Forest
Utilization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Hannah Bast, Jonas Sternisko, and Sabine Storandt

Isibat: A Web and Wireless Application for Collecting Urban Data


about Seismic Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Paule-Annick Davoine, Jérôme Gensel, Philippe Gueguen, and
Laurent Poulenard

Session 5: Indoor GIS


A Journey from IFC Files to Indoor Navigation . . . . . . . . . . . . . . . . . . . . . . 148
Mikkel Boysen, Christian de Haas, Hua Lu, and Xike Xie

Using Cameras to Improve Wi-Fi Based Indoor Positioning . . . . . . . . . . . . 166


Laura Radaelli, Yael Moses, and Christian S. Jensen

Integrating IndoorGML and CityGML for Indoor Space . . . . . . . . . . . . . . . 184


Joon-Seok Kim, Sung-Jae Yoo, and Ki-Joune Li

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197


On Parallelizing Large Spatial Queries
Using Map-Reduce

Umesh Bellur

GISE Lab, Department of Computer Science


Indian Institute of Technology Bombay, Powai, Mumbai 400076, India
umesh@cse.iitb.ac.in

Abstract. Vector spatial data types such as lines, polygons, or regions
usually comprise hundreds of thousands of latitude-longitude pairs
to accurately represent the geometry of spatial features such as towns,
rivers or villages. This leads to spatial data operations being computa-
tionally and memory intensive. A solution to deal with this is to dis-
tribute the operations amongst multiple computational nodes. Parallel
spatial databases attempt to do this but at very small scales (of the
order of 10s of nodes at most). Another approach would be to use dis-
tributed approaches such as Map-Reduce since spatial data operations
map well to this paradigm. It affords us the advantage of being able to
harness commodity hardware operating in a shared nothing mode while
at the same time lending robustness to the computation since parts of
the computation can be restarted on failure. In this paper, we present
HadoopDB - a combination of Hadoop and Postgres spatial to efficiently
handle computations on large spatial data sets. In HadoopDB, Hadoop
serves as a means of coordinating amongst various computational nodes
each of which performs the spatial query on a part of the data set. The
Reduce stage helps collate the result data to yield the result of the origi-
nal query. We present performance results to show that common spatial
queries yield a speedup that is nearly linear in the number of Hadoop
processes deployed.

Keywords: MapReduce, Hadoop, postGIS, Spatial Data, HadoopDB.

1 Introduction
A geographic information system (GIS) is one that captures, stores, analyzes,
manages, and presents spatial data along with relevant non-spatial information.
A GIS forms the core of many applications in areas as varied as agriculture to
consumer applications such as location based services. Today, many computer
applications, directly or indirectly, are based on carrying out spatial analysis
at the back-end. Spatial analysis involves spatial operations performed
on spatial data. We represent spatial features such as roads, towns, and cities
as vector data. Vector data is a collection of latitude-longitude pairs, called
Geospatial points, structured into a format so as to represent the geometry of
spatial features. An example would be the use of vectored polygons to represent


city or state boundaries. For example, to represent the road network of the state
of Arizona in the USA, we require approximately ten million points, each of which
is a coordinate involving a latitude and longitude. The number of geospatial
coordinates required to represent the geometry of real world objects varies from
a few hundred to tens of thousands. Spatial operations, such as the overlap test
(checking whether two areas overlap), are performed on sets of vector spatial
data. These operations are generally implementations of geometric algorithms.
Because of the enormous number of points required to represent a single spatial
object and the complexity of geometric algorithms, carrying out spatial
computations on real-world data sets is resource-intensive. A Core Duo machine
with 2 GB of RAM reaches about 75-85% CPU utilization for spatial join
queries. We therefore consider spatial operations to be a potential candidate for
parallelism.
Parallel spatial DBMSs such as Oracle spatial are being widely used for
carrying out parallel computation of spatial data across a cluster of machines.
Parallel DBMS designs have been optimized to yield high performance but do
not score well in terms of scalability. Asterdata (www.asterdata.com), a parallel
database known for some of the best scalability in the parallel database
community, scales to around 330-350 nodes. In parallel DBMSs, the intermediate
results of a query are pipelined to the next query operator or another sub-query
without being written to disk. If any sub-query fails, the intermediate results
processed so far are lost and the entire query has to be restarted. Not writing
intermediate data to disk results in high performance but prevents parallel
DBMSs from exhibiting good fault tolerance. As the size of a cluster of
commodity machines increases, the probability of node or task failure also
increases, and failures are likely to become frequent if the parallel DBMS
cluster size grows to the order of a few hundred nodes. This would result in a
significant degradation in the performance of parallel DBMSs. Thus, poor fault
tolerance puts an upper bound on the cluster size of parallel DBMSs (up to a few
tens of nodes), as a result of which parallel DBMSs have limited scalability.
MapReduce [1], on the other hand, provides a framework for processing large
volumes of data, of the order of hundreds of terabytes. The scalability and fault
tolerance features of MapReduce enable us to use a large number of commodity
machines for carrying out data intensive computations cheaply. The Map-Reduce
parallel programming model does not require the programmer to understand
and control the parallelism inherent in the operation of the paradigm.
In this paper we present the design of a shared nothing, data distributed,
spatial query processing system that we term HadoopDB. We employ the Hadoop
MapReduce libraries to process spatial data extracted from a spatial DB such
as postGIS. We have written a query converter that takes an SQL-like query at
the front end and automatically turns it into a MapReduce job that uses data
from a set of postGIS instances in the back end in which the spatial data set to
be operated on is distributed. We show that we can achieve near linear speed
up with the number of map jobs deployed on commodity hardware, thus proving
the feasibility of this approach for processing large spatial data sets.
The rest of this paper is organized as follows. We first present a brief back-
ground of MapReduce and qualitatively compare parallel spatial DBs with
MapReduce in Section 2. We then look at related efforts in Section 3. In Section
4, we present an overview of the HadoopDB architecture with a description of query
execution steps and the scheme of vector data distribution over HadoopDB clus-
ter nodes. In Section 5, we present the set of benchmarks used to evaluate our
system and experimental results of these benchmarks. We conclude the paper
with a brief summary and directions for future work.

2 MapReduce Vs Parallel Spatial RDBMS

2.1 The Concept of MapReduce

A typical MapReduce job requires the programmer to provide the problem logic
of two functions: Map and Reduce. The Map function partitions the input data
to be processed, preferably into disjoint sets. Each set is then passed to the
Reduce function for further processing. Key-value pairs form the basic data
structure in MapReduce. The input to the Map function is the key-value pair
(k1, v1), the key k1 being the byte offset of a record within the input file and the
value v1 being the record line. The Map function outputs a set of intermediate
key-value pairs, [(k2, v2)]. The MapReduce library implements the shuffle phase,
which lies between the Map and Reduce phases. The shuffle phase rearranges the
intermediate Map output and aggregates all the values associated with the same
key to form a (key, list(values)) pair, which forms the input to the reduce
phase to follow. The last phase is the Reduce phase which processes the list of
values associated with the same key. An identical Reducer function executes in
parallel on worker nodes. The output of the Reducers is the final output that is
written back onto disk.
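
To make this key-value flow concrete, a minimal Hadoop Map and Reduce pair in Java might look as follows; the record layout (a comma-separated county id, road id, and road length per text line) and the class names are illustrative assumptions, not part of the system described in this paper.

import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map: (k1 = byte offset, v1 = record line) -> [(k2 = county id, v2 = road length)]
public class RoadLengthMapper
        extends Mapper<LongWritable, Text, Text, DoubleWritable> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] fields = line.toString().split(",");   // countyId,roadId,length (assumed layout)
        context.write(new Text(fields[0]),
                      new DoubleWritable(Double.parseDouble(fields[2])));
    }
}

// Reduce: (k2, list(v2)) -> total road length per county, written back to HDFS
class RoadLengthReducer
        extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
    @Override
    protected void reduce(Text county, Iterable<DoubleWritable> lengths, Context context)
            throws IOException, InterruptedException {
        double sum = 0;
        for (DoubleWritable l : lengths) sum += l.get();  // aggregate all values sharing the key
        context.write(county, new DoubleWritable(sum));
    }
}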
The Apache Hadoop [2] software library is a framework that allows for the dis-
tributed processing of large data sets across clusters of computers using MapRe-
duce. It is designed to scale up from single servers to thousands of machines, each
offering local computation and storage. Rather than rely on hardware to deliver
high-availability, the library itself is designed to detect and handle failures at
the application layer, so delivering a highly-available service on top of a cluster
of computers, each of which may be prone to failures. Hadoop Distributed File
System (HDFS) is the primary storage system used by Hadoop applications.
HDFS creates multiple replicas of data blocks and distributes them on compute
nodes throughout a cluster to enable availability and reliability.

2.2 MapReduce Vs Parallel Spatial DBMS

Processing a large amount of spatial data has become a critical issue in recent
times. Parallel DBMS technology has been widely used for processing
larger volumes of vector data, but with the ever-increasing need to process larger
and larger spatial data sets, parallel DBMS is no longer a desirable technology
for this purpose. We now look at a quick comparison of parallel spatial RDBMSs
and data distribution approaches to process spatial queries.
1. Scalability: Parallel database systems scale well into the tens and rarely
even into the low hundreds of machines. Unfortunately, parallel database systems
as they are implemented today, unlike Hadoop, do not scale well into the realm
of many thousands of nodes. Enormous quantities of spatial data are constantly
being generated from various sources such as satellites, sensors, and mobile
devices. NASA's Earth Observing System (EOS), for instance, generates
1 terabyte of data every day [15]. Processing such large volumes of spatial data
on a daily basis requires many more machines, probably on the order of a few
thousand, which parallel DBMS technology does not support.
2. Fault Tolerance: Fault tolerance is the ability of a system to cope with
node/task failures. A fault-tolerant DBMS is simply one that does not have to
restart a query if one of the nodes involved in query processing fails. Hadoop has
been especially designed to be fault tolerant since it works on commodity hard-
ware. In a parallel DBMS, the intermediate results of a query are pipelined to the
next query operator or another sub-query without being written to disk. If any
sub-query fails, the intermediate results processed so far are lost and the entire
query has to be restarted. In Hadoop, however, the intermediate results of the
mappers (or Reducers) are always written to disk before they are fetched by the
Reducers (or the mappers of the next MapReduce stage). Thus, instead of
pipelining intermediate results to subsequent processes, Hadoop processes
themselves are pipelined to operate on the target data. In case of a task/node
failure, the same task is restarted on another node to operate on the target
intermediate data, which still exists on disk.
3. Performance: Parallel DBMSs have been designed to work in real time and
therefore focus on performance, whereas Hadoop has been designed for batch
processing. Hadoop was not originally designed for structured data analysis, and
thus is significantly outperformed by parallel database systems on structured
data analysis tasks. In fact, Hadoop takes around 6-8 seconds just to initiate
distributed processing on a 3-4 node cluster, whereas a parallel DBMS finishes
much of the computation in this time period. Hadoop’s slower performance is also
because Hadoop stores data in the accompanying distributed file system (HDFS),
in the same textual format in which the data was generated. Consequently, this
default storage method places the burden of parsing the fields of each record
on user code. This requires each Map and Reduce task to repeatedly parse and
convert string fields into the appropriate types. This further widens
the performance gap between MapReduce and parallel DBMSs [3].
To summarize, MapReduce offers excellent scalability and good fault tolerance
which enables MapReduce to process larger data sets on sufficiently large clusters
of commodity machines, whereas parallel DBMS technology is limited to cluster
sizes of up to a few dozen nodes but outperforms MapReduce in terms of response
time. The authors in [3] discuss the comparison between MapReduce and parallel
DBMS in greater detail.

3 Related Work

Parallel spatial DBMSs such as Oracle Spatial have been in use for carrying out
spatial analysis on moderately large spatial data sets. Today, spatial RDBMSs
have improved to support a variety of spatial indexing mechanisms that enable
them to process spatial queries very quickly. But parallel DBMSs, because of their
limited scalability, fail to handle the ever-increasing size of spatial repositories.
To overcome this barrier, researchers have focused on data distribution as an
alternate solution which is capable of executing a variety of spatial operations
such as spatial joins [[7],[8],[9]], nearest neighbor queries [5] and Voronoi diagram
construction [10].
There has been recent work that discusses how spatial queries can be natu-
rally expressed with the MapReduce programming model but without explicitly
addressing any of the details of data distribution or parallelization. The work
discusses algorithmic strategies to parallelize spatial operations such as spatial
join, Nearest Neighbor query and data partitioning in a MapReduce framework.
Spatial Join with MapReduce (SJMR) is a strategy for performing a spatial join
between two data sets in a shared-nothing environment. [7], [8], and [9] mainly
focus on different variations of SJMR and show that MapReduce is applicable
in computation-intensive spatial applications. Our focus has been to realize an
end-to-end system that can take an SQL-like spatial query and execute it using
MapReduce while fetching the relevant data from a spatial DB. The task
of mapping an SQL-like syntax to MapReduce semantics is non-trivial, as is
integrating the MapReduce environment (HDFS in particular) with spatial DBs
such as postGIS.

4 HadoopDB - Integrated System of MapReduce and DBMS

HadoopDB [12] is a hybrid strategy that combines the reliability of spatial
databases with scalable and fault-tolerant Hadoop/MapReduce systems. It
comprises Postgres spatial on each node, forming the database layer, and Hadoop/
MapReduce as a communication layer that coordinates the multiple nodes, each
running Postgres. By taking advantage of Hadoop (particularly HDFS, Hadoop
scheduling, and job-tracking), HadoopDB distinguishes itself from many of the
current parallel and distributed databases by dynamically monitoring and ad-
justing for slow nodes and node failures to optimize performance in heteroge-
neous clusters. Especially in cloud computing environments, where there might
be dramatic fluctuations in the performance and availability of individual nodes,
fault-tolerance and the ability to perform in heterogeneous environments are
critical. The system is designed to process most of the problem logic within
the database layer, thus speeding up queries by making use of the database's
optimized capabilities, such as indexing, which are not supported in MapReduce,
whereas the aggregation of data from multiple nodes, if required, is done in the
MapReduce environment.
Figure 1 shows the architecture of the system. The Database Connector (DC)
component of the system has the responsibility to connect to the databases
hosted on cluster machines. DC probes the Catalog file residing in HDFS to
locate the host address, port number and database name for a given table name.
It also contains the replication details of all tables. The databases hosted on
cluster nodes are spatially enabled, open source Postgres databases which we
shall now refer to as postGIS. The Hadoop daemon, called the Task Tracker
runs on each cluster node to assist and control the execution of local Maps and
Reducers.
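
A rough sketch of the catalog lookup performed by the Database Connector is given below; the element and attribute names assumed for catalog.xml (table, host, port, database, user, password) are illustrative, since the actual catalog schema is not specified here, and in the real system the file would be read from HDFS rather than from an arbitrary stream.

import java.sql.Connection;
import java.sql.DriverManager;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch of a Database Connector lookup: find the postGIS instance that hosts
// a given table and open a JDBC connection to it.
public class DatabaseConnector {
    public static Connection connectFor(String tableName, java.io.InputStream catalogXml)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(catalogXml);   // catalog.xml, normally read from HDFS
        NodeList tables = doc.getElementsByTagName("table");
        for (int i = 0; i < tables.getLength(); i++) {
            Element t = (Element) tables.item(i);
            if (tableName.equals(t.getAttribute("name"))) {
                String url = "jdbc:postgresql://" + t.getAttribute("host")
                        + ":" + t.getAttribute("port")
                        + "/" + t.getAttribute("database");
                return DriverManager.getConnection(url,
                        t.getAttribute("user"), t.getAttribute("password"));
            }
        }
        throw new IllegalArgumentException("No catalog entry for " + tableName);
    }
}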
Geoserver [13] comprises the front end of the system. It allows users to edit and
query geospatial data. Designed for interoperability, it publishes data from any
major spatial data source using open standards (Geography Markup Language
or GML). The HadoopDB library relies on the HIVE SMS (SQL-to-MapReduce-
to-SQL) [12][11] planner to provide high level SQL interface which converts SQL
query into equivalent MapReduce plan. But it doesn’t support spatial data types
and operations. Therefore, we have implemented a simple SQL-to-MapReduce
Converter module (SMC) in the Geoserver that recognize the basic spatial data
types viz Polygons, Multipolygons, LineStrings and Points and translates the
spatial SQL queries into the equivalent compiled MapReduceSQL code. We shall
describe its capabilities and features in Section 4.2.

4.1 Vector Data Distribution


We shall now discuss the strategy to distribute the vector data across the cluster
nodes. The distribution of data across the cluster nodes is primarily governed by
the JOIN operation which is the most commonly used and expensive operation
to perform. In particular, spatial joins combine two spatial data sets by their
spatial relationship such as Intersection, containment or within. In shared noth-
ing distributed DBMSs, if two tables residing on different sites need to be joined,
then one of the tables has to be imported onto the other’s site prior to performing
the join. Spatial data are often large in size and therefore expensive to transfer
from disk over the network. Vector spatial data, by its nature, is well suited to be
processed on clusters following shared-nothing architecture. Hosting all spatial
objects enclosed within a finite geographical boundary (termed a partition) as
tables on a single database site eliminates the need to manipulate tables across
database sites, thus abiding by Hadoop’s shared-nothing architecture. For exam-
ple, any spatial object enclosed within a region A would not overlap, intersect,
meet, or touch any spatial object in another geographical region B; the two sets
can therefore be hosted on two different database sites, as any (predicate-based) join
across the two sets would always return a null result. Also, it is highly unlikely
that there would be a request for a join between tables containing data that is
not spatially proximal.
Fig. 1. HadoopDB architecture with Geo-server front end (figure: the Geoserver front end with the SMC module sends the compiled MapReduce code to the Hadoop namenode; the Database Connector reads catalog.xml from HDFS; cluster nodes 1-3 each run a TaskTracker and a postGIS instance)

Partitioning Strategy. For a collection of spatial objects, we define the uni-


verse as the minimum bounding rectangle (MBR) that encloses all objects in the
collection. In order to distribute the data sets across shared-nothing database
sites following the discussion above, it is required to decompose the universe
into smaller regions or partitions. The dimensions of the universe are determined
by manual analysis of the data set or through hand-coded scripts. This is static
and permanent information; once computed, it need not be computed again
throughout the lifetime of the data set. The number of partitions into which the
universe is to be spatially decomposed depends on the maximum table size a
database can process efficiently without using temporary disk buffers (or running
out of memory). If the total number of spatial objects in the universe is N, and
the average number of objects that can be stored in a database table while
avoiding disk buffer access during query execution is M, then the number of
partitions to be made is the ceiling of N/M. The partition boundaries are
determined by dividing the universe into smaller rectangular regions of equal size.
Partitioning of the spatial data sets is done by testing the spatial relationship
between the partitions and the MBR of each spatial object as per the predicate
condition, say overlap. A spatial object that satisfies the predicate with one or
more partitions becomes a member of those partitions. This step produces
candidates which are a superset of the actual result. Figure 2 shows the
decomposition of the spatial data space into four partitions. Each partition
consists of the spatial objects whose MBRs test positive for overlap with the
partition. All the spatial objects
belonging to a particular partition reside on a single database site in the set of
distributed DBMSs. Also, note that the spatial object labeled O1 in the figure
overlaps two partitions, P1 and P4, so it is a member of both partitions and
therefore resides on the two corresponding database sites.
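
The following Java sketch illustrates this static partitioning step under the stated assumptions: the universe is split into a regular grid of about ceiling(N/M) equal-sized rectangles, and each object is assigned to every partition its MBR overlaps (so an object like O1 may be replicated to more than one database site). The Rect type and the grid layout are illustrative only.

import java.util.ArrayList;
import java.util.List;

public class Partitioner {
    static class Rect {   // minimal bounding-rectangle type (hypothetical)
        double xMin, yMin, xMax, yMax;
        Rect(double xMin, double yMin, double xMax, double yMax) {
            this.xMin = xMin; this.yMin = yMin; this.xMax = xMax; this.yMax = yMax;
        }
        boolean overlaps(Rect o) {
            return xMin <= o.xMax && o.xMin <= xMax && yMin <= o.yMax && o.yMin <= yMax;
        }
    }

    // N = total number of spatial objects, M = objects a database table can hold
    // without resorting to temporary disk buffers
    static int numPartitions(long n, long m) {
        return (int) Math.ceil((double) n / m);
    }

    // Decompose the universe into a cols x rows grid of equal-sized partitions
    static List<Rect> decompose(Rect universe, int cols, int rows) {
        List<Rect> parts = new ArrayList<>();
        double w = (universe.xMax - universe.xMin) / cols;
        double h = (universe.yMax - universe.yMin) / rows;
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                parts.add(new Rect(universe.xMin + c * w, universe.yMin + r * h,
                                   universe.xMin + (c + 1) * w, universe.yMin + (r + 1) * h));
        return parts;
    }

    // An object joins every partition its MBR overlaps (object O1 in Fig. 2
    // overlaps P1 and P4 and is therefore stored on both database sites)
    static List<Integer> assign(Rect objectMbr, List<Rect> partitions) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < partitions.size(); i++)
            if (partitions.get(i).overlaps(objectMbr)) targets.add(i);
        return targets;
    }
}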

Fig. 2. Decomposition of the Universe into Partitions (figure: the universe divided into four partitions P1-P4, with object O1 overlapping both P1 and P4)

Partition Skew. In reality, the distribution of spatial features over 2D spatial


data space is generally uneven. For example, there are more roads in cities than
in the rural areas. Therefore, the distribution of spatial objects into partitions
may be imbalanced. Figure 2 shows that partition P3 consists of the least number
of spatial objects, whereas partitions P1 and P4 are densely populated. This
situation is termed Partition Skew and is not uncommon. Since each partition
corresponds to the tables residing on the same database site, this uneven
distribution causes tables residing on different database sites to vary from each
other in size. Consequently, different amounts of query computation are carried
out on different cluster nodes, resulting in an increase in the overall job execution
time. The overall execution time of the job is determined by the cluster node that
is the last to finish its share of the computation. Therefore, we need load
balancing for a balanced distribution of objects among partitions.

Fig. 3. Tile Based Partitioning Scheme (figure: partition P1 decomposed into tiles numbered 0-11, with tiles assigned to partitions P1-P4 in round-robin order)


(Data) Load Balancing. To deal with the problem of partition skew, a tile-
based partitioning method [9] is used for a balanced distribution of objects among
partitions. This method involves the decomposition of the universe into N smaller
regions called tiles, where N >> P (the number of partitions). There is also a
many-to-one mapping between tiles and partitions. Every spatial object that tests
positive for the overlap test with a tile (or tiles) is copied to the partition(s) the
tile(s) map to. The larger the number of tiles the universe is decomposed into,
the more uniform the distribution of objects among partitions. In Figure 3 above,
the universe is decomposed into 48 tiles. We have shown the decomposition of
only one partition, P1, into tiles numbered from 0 to 11; the other partitions are
decomposed in the same manner (not shown in the figure). Tiles are mapped
to partitions in round-robin fashion. Some spatial objects that are spatially
enclosed within this partition are now mapped to other partitions. For example,
some spatial objects of partition P1 which overlap tiles 2 and 5 will now be
members of partitions P3 and P2, respectively. In the same manner, some spatial
objects from other partitions are also mapped to partition P1. This results in
a more uniform distribution of spatial objects among partitions.
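
A minimal sketch of this tile-to-partition mapping is given below; it assumes a row-major numbering of tiles over the universe and assigns tile i to partition i mod P, so an object whose MBR covers several tiles is copied to each of the corresponding partitions. The grid layout and class name are assumptions.

public class TileMapper {
    final int tileCols, tileRows, numPartitions;
    final double xMin, yMin, tileWidth, tileHeight;

    TileMapper(double xMin, double yMin, double xMax, double yMax,
               int tileCols, int tileRows, int numPartitions) {
        this.xMin = xMin; this.yMin = yMin;
        this.tileCols = tileCols; this.tileRows = tileRows;
        this.numPartitions = numPartitions;
        this.tileWidth = (xMax - xMin) / tileCols;
        this.tileHeight = (yMax - yMin) / tileRows;
    }

    // Row-major index of the tile containing point (x, y)
    int tileIndex(double x, double y) {
        int col = Math.min(tileCols - 1, (int) ((x - xMin) / tileWidth));
        int row = Math.min(tileRows - 1, (int) ((y - yMin) / tileHeight));
        return row * tileCols + col;
    }

    // Round-robin assignment: consecutive tiles go to consecutive partitions,
    // which evens out the number of objects per partition under skewed data
    int partitionOf(int tileIndex) {
        return tileIndex % numPartitions;
    }
}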

4.2 Query Execution Steps


The SMC module is capable of transforming any spatial query into the equivalent
MapReduceSQL form provided that no collation of data is needed across
different database sites except through a GROUP BY clause, and the only
aggregate functions supported are sum, max, and min. Table 1 shows the set of
rules to map SQL constructs to MapReduce. As long as the SQL query has no
GROUP BY clause, the equivalent MapReduceSQL has only Map functions. A
GROUP BY clause requires records having the same value of the grouped field
to be collated across different database sites, thereby necessitating a Reduce
function. For this MapReduce code, the input specification directs the input data
to be retrieved from the cluster databases instead of HDFS. Once the data is
fetched out of the databases, the rest of the computation proceeds as per the
usual MapReduce paradigm.

Table 1. SQL to MapReduce Mapping

SQL construct                   MapReduce construct
No GROUP BY clause              Only Map
GROUP BY clause                 Map and Reduce
GROUP BY field                  Output key of the Mappers and input key of the Reducers
Aggregate functions supported   sum, min, max
Data types                      Primitive data types + geometry data types
Selected set of fields          Map input value

The compiled MapReduce job, produced by SMC, is copied by the Hadoop


Master node to relevant cluster nodes as a single jar file. Here relevant cluster
nodes are the nodes which host any of the tables specified in the original query.
This information comes from the catalog file residing on HDFS. The query
execution passes through three phases: (a) The first phase involves executing the
original query inside the database locally on each of the cluster nodes. That is
why we call it an SQL-enabled MapReduce job: its input data source is the
DBMSs instead of HDFS. (b) In the second phase, the tuples extracted from the
DBMSs in the first phase, called the ResultSet, are read by the Mappers. Here
the Map job performs any extra computation that may not be supported at the
postGIS layer. For example, although we can output all pairs of roads that
intersect each other with a simple DBMS query, if we are specifically interested
in finding all T-point intersections between roads, the map phase can test whether
the two roads, which are now confirmed to intersect, actually intersect at around
90 degrees or not. (c) In the third phase, Reducers start when all mappers have
finished; each reducer aggregates the individual map outputs, consolidates them,
and writes the final results back onto HDFS, which can then be read by the
Geoserver for visual rendering. This phase is optional and is not required if no
aggregation of map outputs from different cluster nodes is needed. Usually, the
third phase comes into the picture for nested queries or queries with a GROUP
BY clause.
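
As an illustration of these phases, the following single-process Java sketch uses JDBC in place of the Database Connector and a HashMap in place of the shuffle; the query follows the spatial-join example used in Section 5, while the class name and the exact column aliases are assumptions.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

// Illustrative walk-through of the three phases (in HadoopDB phase (b) runs
// inside Mappers and phase (c) inside Reducers across the cluster).
public class QueryPhasesSketch {
    public static Map<String, Double> run(Connection localPostgis) throws Exception {
        // Phase (a): execute the SQL inside the local postGIS instance, so that
        // filtering and index usage happen in the database layer
        String sql = "SELECT a.id AS id, length(b.geom) AS len "
                   + "FROM polygons a, roads b WHERE intersects(a.geom, b.geom)";
        Map<String, Double> partialSums = new HashMap<>();   // stands in for the shuffle
        try (Statement st = localPostgis.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            // Phase (b): the "Map" step reads the ResultSet and could apply extra
            // geometric tests not supported by the database layer
            while (rs.next()) {
                String key = rs.getString("id");              // GROUP BY field -> map output key
                partialSums.merge(key, rs.getDouble("len"), Double::sum);
            }
        }
        // Phase (c): the "Reduce" step aggregates values sharing a key across all
        // cluster nodes and writes the final result back to HDFS
        return partialSums;
    }
}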
Inter site Spatial Join: As mentioned earlier, partitioning of the spatial data
sets among database sites is governed primarily by Spatial joins. As long as the
spatial join operand tables reside on the same database sites, the database layer
takes care of performing speedy joins by exploiting spatial indices. However,
there can be scenarios where we need to perform a join across tables residing
on different database sites. We call such spatial joins inter-site spatial joins.
For example, suppose we have two tables, counties and soils, which store the
geometry of counties and the soil distribution (as polygons), respectively, of the
state of California and reside on two different database sites. Here we exploit the advantage of
having MapReduce as a task coordination layer between the databases in the
sense that it has the capability to programmatically represent a wide variety
of logic that operates on tuples extracted from different DBs. We can therefore
shift the entire spatial join algorithm to the MapReduce layer. Let us suppose we
have spatial data sets R and S residing on database sites Ri and Si, respectively.
Performing an inter-site spatial join involves three steps:
1. Read Source Data: Read qualified tuples from the sites Ri and Si in parallel
as per the WHERE SQL clause, if any. These tuples are read by the Map
Phase.
2. Spatial Data Partitioning: The partitioning scheme described in the previous
section is now performed online and is implemented in the Map phase. This
phase needs the characteristics of the data sets, such as the universe dimensions
and the number of partitions, as additional input, which is essential to decompose
the universe into partitions. Each partition contains the spatial objects from
the sets R and S which are potential candidates to satisfy the join predicate.
3. Performing actual spatial join: Each partition is then processed by reducers
in parallel to compute the spatial join between R and S. We implement the
well-known sweepline algorithm in this phase to perform the spatial join.
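
A skeleton of this inter-site join as a Hadoop MapReduce job is sketched below; the record tagging scheme ("R|"/"S|" prefixes), the partition lookup, and the nested loop standing in for the sweepline algorithm are all simplifying assumptions made for illustration.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class InterSiteJoin {
    // Map: assign each tuple read from its database site to the partition(s)
    // whose region its MBR overlaps (online partitioning)
    public static class PartitionMapper extends Mapper<Object, Text, IntWritable, Text> {
        @Override
        protected void map(Object key, Text taggedTuple, Context context)
                throws IOException, InterruptedException {
            for (int p : partitionsOverlapping(taggedTuple))
                context.write(new IntWritable(p), taggedTuple);
        }
        private int[] partitionsOverlapping(Text tuple) {
            return new int[] {0};   // placeholder: compare the tuple MBR with the partition grid
        }
    }

    // Reduce: all R and S tuples of one partition arrive together; join them locally
    public static class JoinReducer extends Reducer<IntWritable, Text, Text, Text> {
        @Override
        protected void reduce(IntWritable partition, Iterable<Text> tuples, Context context)
                throws IOException, InterruptedException {
            List<String> r = new ArrayList<>(), s = new ArrayList<>();
            for (Text t : tuples) {
                String v = t.toString();
                if (v.startsWith("R|")) r.add(v.substring(2)); else s.add(v.substring(2));
            }
            // The sweepline join goes here; a naive nested loop stands in for it
            for (String a : r)
                for (String b : s)
                    if (mbrsOverlap(a, b)) context.write(new Text(a), new Text(b));
        }
        private boolean mbrsOverlap(String a, String b) { return true; }  // placeholder test
    }
}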

5 Experimental Evaluation
We now present a set of benchmarks to assess the performance of Geoserver on
top of spatial HadoopDB as compared to a single node Geoserver (a geoserver
with localhost postGIS in the backend) in the domain of spatial data processing.
We subject each of the systems to spatial queries with different execution plans
to explore the behavior of the two systems.
The test data comprises the counties (polygons) and roads (LineStrings) of
three states of the United States: California, Arizona, and Texas. The following
are the details of the environment in which we conduct the experiments.

Table 2. Hardware and Test Data Description

Node State # Counties # roads CPU, RAM(GiB), freq(GHz)


Node 1 Texas 32693 1377372 intel 4 core, 2, 2.66
Node 2 Arizona 11963 718556 intel 4 core, 2, 2.66
Node 3 California 62096 2062872 intel 2 core, 2, 1.66

Data Distribution: The test data is distributed across a three node cluster. In
case of Hadoop, we upload the input files onto HDFS which are then scattered
into fixed size data blocks across HDFS. In case of HadoopDB, one postGIS
database server is active on each node. We distribute the data state-wise, that
is, each node stores the county and roads tables of exactly one state. All the
experiments are performed on this three-node cluster setup. The network com-
munication between cluster nodes is established through a 100 Mbps Ethernet
backplane.

Query 1: Highly Selective Spatial Queries


Goal: To show the improvement in response time by distributing the query over
multiple postGIS servers. Hypothesis: Highly selective spatial queries, such as
the one shown in Figure 4, aim at selecting a very small number of tuples which
satisfy the given predicate condition from large data sets (on the order of tens of
millions of rows). HadoopDB has spatial indexing support. By replacing Hadoop's
default read-only data source, HDFS, with the database layer, MapReduce is no
longer bound to scan all the data blocks (or chunks) in a brute-force manner to
retrieve the required result as per the business logic. Hadoop by itself does not
have any support for building indices on the input data sets. The MapReduce
framework splits large files into smaller chunks which are then distributed across
cluster nodes. Each data chunk is bound to exactly one Mapper. When Mappers
start, data chunks are independently processed by their respective mappers in
parallel across the cluster. However, the potential tuples which actually satisfy
the selection criteria may belong to only a few, or even to one, data chunk.
Hadoop's inability to map a data tuple to the data chunk that contains it requires
it to process all the data chunks and thus unnecessarily launch as many mappers
as there are data chunks, thereby increasing the job tracker overhead to control
the ongoing computation and resulting in over-consumption of cluster resources.

Fig. 4. Performance evaluation of a Highly Selective Query (bar chart, time in seconds, for 3-node Hadoop, 3-node Geoserver, and 1-node Geoserver; query: select id, geom from roads where length(geom) > 0.01)

Result and Explanation: The query in our experiment outputs only those roads
whose length is greater than 0.01 units. HadoopDB clearly outperforms single
node Geoserver, as shown in Figure 4. In HadoopDB, the qualified tuples are fetched
out of the database layer as per the SQL WHERE condition logic. Tuples not
satisfying the constraint are filtered out at the database layer itself. Hence, the
workload of the MapReduce environment is very low as compared to that of the
pure MapReduce case. Hadoop scans all the data tuples and so exhibits terrible
performance.
Query 2: Spatial Join Queries
Goal: To evaluate the performance of Hadoop, HadoopDB and that of single
postGIS while performing spatial joins.
We perform the spatial join between counties and roads of all three states.
We aim to determine those roads which intersect with each other in all counties
(some roads intersect at the boundary of the counties). We employ the SJMR
algorithm [6] in which the partitions correspond to bounding boxes of states. For
HadoopDB and the single DB, we use the SQL query shown in Figure 5.
Hypothesis: We perform the above spatial join query by implementing SJMR
on Hadoop, which involves the online partitioning of the spatial data sets in the
Map phase followed by a Reduce phase performing the actual spatial join. In the
case of an intra join on HadoopDB (that is, the join operand tables reside on the
same database sites), data partitioning was done offline and is not part of the
run-time processing. The spatial join query logic is pushed into the database
layer, thus completely absolving
the Map phase of any compute-intensive geometric computations, and we also
avoid the Reduce phase altogether. We also perform an inter-site join on HadoopDB
by redistributing the test data between two database sites, which is similar to SJMR
except that its data source is a set of database tables rather than HDFS.

Fig. 5. Performance evaluation of the Spatial Join Query (bar chart, time in minutes split into Map and Reduce phases, for 3-node Hadoop, 3-node Geoserver, 3-node HadoopDB with inter and intra join, and 1-node Geoserver; query: select a.id, sum(length(b.geom)) from polygons as a, roads as b where intersects(a.geom, b.geom) group by a.id)


Result and Explanation: As shown in Figure 5, HadoopDB intra join clearly
outperforms Hadoop and the single-node Geoserver. However, HadoopDB's
performance degrades to that of Hadoop in the case of an inter join. This is
because the join processing has now been shifted from the database layer down
to the MapReduce layer which, like SJMR, involves online partitioning followed
by a Reduce phase.
Query 3: Global Sorting
Goal: To evaluate the performance of the systems when the network bandwidth
becomes the bottleneck.
Hypothesis: The query shown in Figure 6 requires the counties to be first read
out of HDFS (or the DBMS in the case of HadoopDB) and then aggregated at
a single reducer process for sorting. This results in large volumes of data flowing
across the network. The overall completion time also includes the time taken
for data aggregation at a single machine over the 100 Mbps link and so the
performance is largely driven by network bandwidth.
Result and Explanation: Figure 6 shows that there is no significant difference in
the performance of the three systems for this query, because the MapReduceSQL
implementation of this query merely reads all tuples from each local database
in the case of HadoopDB and from HDFS in the case of Hadoop. The single-node
Geoserver performs slightly better for this query as it suffers no network overhead.
However, the single-node Geoserver is largely limited by the capacity of the
machine on which it runs (the size of its memory) and easily runs out of memory
while processing large data sets.
Fig. 6. Performance evaluation of the Global Sort Query (bar chart, time in seconds split into Map and Reduce phases, for 3-node Hadoop, 3-node Geoserver, and 1-node Geoserver; query: select id, geom from counties order by area(geom))

Query 4: KNN Queries


Certain spatial queries do not show any improvement even if the geometry
column is indexed. In fact, the execution of such queries is drastically slowed
down if they involve the join operation. For example, a KNN (K nearest neighbor)
query computes the K neighbors that are nearest to a given spatial object in terms
of Euclidean distance.
Hypothesis: The KNN query (see Figure 7) is executed within a cursor loop over
every polygon t. In every iteration, it computes the KNN of a polygon t. For
moderate to large data sets, this exercise becomes painfully slow because distance
is not an indexable function, as it involves a relation between two entities. This
is because functions such as distance cannot be reduced to questions like "Is
a within b?" or "Do a and b overlap?". More concretely, GiST indices can
only operate on the bounding boxes of two objects. We have also implemented
the KNN algorithm using pure MapReduce for k = 5.
Result and Explanation: Figure 7 shows that it is very expensive to perform
queries involving non-indexable functions. Hadoop, as usual, partitions the
data sets in the Map phase; then three reducers corresponding to the three states
evaluate the 5 nearest neighbors for each county in parallel.
Query 5: Anti Shared-Nothing Spatial Queries
Goal: Performance evaluation of Hadoop, HadoopDB and single node Geoserver
for spatial queries which tend to go against the shared-nothing restriction.
Hypothesis: Certain spatial queries tend to go against Hadoop's shared-nothing
restriction by requiring communication between independent Map and
Fig. 7. Performance evaluation of K nearest neighbor for k = 5 (bar chart, time in minutes split into Map and Reduce phases, for 3-node Hadoop, 3-node Geoserver, and 1-node Geoserver; query, executed per polygon t: select t.geom, b.geom from polygons as b order by Distance(t.geom, b.geom) limit k)

Reduce processes running on cluster machines. The query shown in Figure 8
returns all the roads of the state of California which are longer than the longest
road of Arizona and Texas. Since the roads tables of the three states reside on
three different database sites, we first need to evaluate the result of the subquery,
which is then taken as input by the outer query to yield the final result. Because
the results of different database sites (the length of the longest road of Arizona
and Texas) need to be communicated to the California database site, the
execution plan of this query goes against Hadoop's shared-nothing restriction,
and therefore this query cannot be represented by a single-stage MapReduce
program. To implement the above query in HadoopDB, the MapReduceSQL
contains two MapReduce stages. In the first stage, the subquery is processed on
the Arizona and Texas sites in parallel and the local results (the length of the
longest road of each state) are written onto HDFS. In the second MapReduce
stage, the outer query takes the result of the previous MapReduce stage from
HDFS as input at run time and is processed on the California site only. The same
mechanism is followed by Hadoop by setting the input directories to Texas and
Arizona for the first MapReduce stage, and to the California directory for the
second MapReduce stage.
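
The second stage can be sketched as a Hadoop Mapper that loads the stage-1 result from HDFS in setup() and then filters the California roads it reads; the HDFS path, configuration key, and record layout (id, length, geometry) used below are assumptions made for illustration.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Stage 2 of the nested query: the threshold (the longest road length of
// Arizona and Texas, written to HDFS by stage 1) is read once per Map task.
public class Stage2FilterMapper extends Mapper<Object, Text, Text, Text> {
    private double threshold;

    @Override
    protected void setup(Context context) throws IOException {
        Configuration conf = context.getConfiguration();
        Path stage1Out = new Path(conf.get("stage1.output", "/tmp/stage1/part-r-00000"));
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(FileSystem.get(conf).open(stage1Out)))) {
            String line = in.readLine();
            threshold = (line == null) ? Double.MAX_VALUE : Double.parseDouble(line.trim());
        }
    }

    @Override
    protected void map(Object key, Text record, Context context)
            throws IOException, InterruptedException {
        String[] f = record.toString().split(",", 3);   // id, length, geometry (assumed layout)
        if (Double.parseDouble(f[1]) > threshold)
            context.write(new Text(f[0]), new Text(f[2]));
    }
}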
Results and Explanations: Figure 8 shows that Hadoop’s performance is the
worst of the three setups due to obvious reasons. However, the performance
of the three-node HadoopDB is comparable to that of the single-node Geoserver.
This is because the overhead of launching two MapReduce tasks one after the
other dominates the overall effective query execution. The Hadoop framework
takes around 8-10 seconds just to initiate the MapReduce jobs.
Fig. 8. Performance of Nested Spatial Query (bar chart, time in seconds split into MR stages 1 and 2, for 3-node Hadoop, 3-node Geoserver, and 1-node Geoserver; query: select geom from california roads where length(geom) > ALL ((select max(length(geom)) from arizona roads) UNION (select max(length(geom)) from texas roads)))

Discussion
HadoopDB outperforms Hadoop in distributed computations on spatial data due
to storing data in spatial DBs instead of as flat files. However, the database layer
alone cannot capture spatial problems that require spatial continuity analysis.
For example, in the KNN query problem, independent local query execution on
database sites might yield incorrect results. This is due to the fact that some
true nearest neighbors of a spatial object may reside on a different database site
as a result of the partitioning. HadoopDB relies on the MapReduce layer to
compute the nearest neighbors that span multiple database sites. Other
distributed shared-nothing spatial DBMSs, however, have to rely on a table
import strategy alone to solve such problems. It should also be noted that in
spatial analysis it is not uncommon to perform a join on a common non-spatial
attribute between two tables. This is trivially done via SQL when the operand
tables are hosted on the same database sites. But in case the tables reside on
different database sites, we need to employ the MapReduce layer to perform the
relational join. However, MapReduce can capture relational joins only on the
equality predicate. It is a limitation of the MapReduce paradigm that it cannot
capture inequality-based joins such as T1.A < T2.A.
With the space partitioning scheme we followed, spatial objects that satisfy
the overlap criterion with two or more partitions may get replicated to two or
more database sites. This results in redundant computation, and the final results
of the original query may contain duplicates.
6 Conclusion
We conclude that the MapReduce programming paradigm alone is sufficient to
express most spatial query logic, but its lack of support for spatial indexing
mechanisms and its brute-force nature make it impractical for interactive,
real-time spatial data analysis systems. HadoopDB shows great promise in query
execution speed, as the spatial indices of postGIS add a significant advantage;
on the other hand, performance degrades to no better than MapReduce for
queries whose execution plan tends to go against the "shared-nothing"
restriction, such as the inter-site spatial join. We also realize that vector spatial
data, by its nature, is well suited to being processed on shared-nothing
distributed database clusters. Hosting all spatial objects confined within a finite
geographical boundary as a single table chunk on one database node eliminates
the need to manipulate tables across database nodes, thus abiding by Hadoop's
shared-nothing architecture, avoiding the dependency on the MapReduce layer
and therefore yielding high performance. But this advantage comes at the cost
of the correctness of the results of some uncommon spatial queries, such as KNN
queries. The situation is compounded if the spatial data suffers from partition
skew and load balancing is required, which is not uncommon.

References
1. Dean, J., Ghemawat, S.: Mapreduce: simplified data processing on large clusters.
In: Proceedings of the 6th Conference on Symposium on Operating Systems Design
and Implementation, vol. 6, p. 10. USENIX Association, San Francisco (2004)
2. Bialecki, A., Cafarella, M., Cutting, D., Malley, O.: Hadoop: a framework for
running applications on large clusters built of commodity hardware, Wiki at
http://lucene.apache.org/hadoop
3. Pavlo, A., Paulson, E., Rasin, A., Abadi, D.J., DeWitt, D.J., Madden, S.R., Stone-
braker, M.A.: A comparison of approaches to large-scale data analysis. In: Proceed-
ings of the 35th SIGMOD International Conference on Management of Data, pp.
165–178. ACM Press, New York (2009)
4. Stonebraker, M., Abadi, D., DeWitt, D.J., Madden, S., Paulson, E., Pavlo, A.,
Rasin, A.: MapReduce and parallel DBMSs: friends or foes? Commun. ACM 53(1),
64–71 (2010)
5. Zhang, J., Mamoulis, N., Papadias, D., Tao, Y.: All-nearest-neighbors queries in
spatial databases, p. 297 (June 2004)
6. Zhang, S., Han, J., Liu, Z., Wang, K., Xu, Z.: SJMR: Parallelizing spatial join with
MapReduce on clusters. In: Proceedings of CLUSTER, pp. 1–8 (2009)
7. Dittrich, J.P., Seeger, B.: Data redundancy and duplicate detection in spatial join
processing. In: ICDE 2000: Proceedings of the 16th International Conference on
Data Engineering, pp. 535–546 (2000)
8. Brinkhoff, T., Kriegel, H.P., Seeger, B.: Parallel processing of spatial joins using
R-trees. In: ICDE 1996: Proceedings of the Twelfth International Conference on
Data Engineering, pp. 258–265 (1996)
9. Patel, J.M., DeWitt, D.J.: Partition based spatial-merge join. In: Proceedings of
the 1996 ACM SIGMOD International Conference on Management of Data, pp.
259–270. ACM, New York (1996)
10. Akdogan, A., Demiryurek, U., Banaei-Kashani, F., Shahabi, C.: Voronoi-based
geospatial query processing with MapReduce. In: Proc. IEEE International Confer-
ence on Cloud Computing Technology and Science (CloudCom) (2010)
11. Thusoo, A., Sarma, J.S., Jain, N., Shao, Z., Chakka, P., Anthony, S., Liu, H., Wyck-
off, P., Murthy, R.: Hive - a warehousing solution over a map-reduce framework.
PVLDB 2(2), 1626–1629 (2009)
12. Abouzeid, A., Bajda-Pawlikowski, K., Abadi, D., Silberschatz, A., Rasin, A.:
HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for
analytical workloads. In: Proc. VLDB 2009 (2009)
13. http://en.wikipedia.org/wiki/GeoServer
14. http://arcdata.esri.com/data/tiger2000/tiger_download.cfm
15. Leptoukh, G.: NASA remote sensing data in earth sciences: Processing, archiving,
distribution, applications at the GES DISC. In: Proc. of the 31st Intl. Symposium
of Remote Sensing of Environment (2005)
Feathered Tiles with Uniform Payload Size
for Progressive Transmission of Vector Data

Andrew Dufilie and Georges Grinstein

Institute for Visualization and Perception Research,


University of Massachusetts Lowell,
Lowell, Massachusetts, USA
{adufilie,grinstein}@cs.uml.edu

Abstract. We introduce Feathered Tiles, a novel vector data tiling


method for web mapping. This method eliminates redundant data
transfer, greatly reduces the amount of excess data transmitted for pro-
gressive refinements, and supports smooth zooming operations with on-
the-fly generalization. For a given set of geometries, the effective area
of each vertex is computed and stored as a third coordinate, along with
the bounds of the effective area. The results are partitioned in three
dimensions into tiles of a desired byte length. Each tile is stored along
with the 3-dimensional bounds encapsulating the effective area of all ver-
tices contained within. Individual tiles can then be retrieved on demand
with 3-dimensional queries to reproduce a simplified set of geometries
for a given scale and viewport. The key to reducing excess data transfer
lies in associating tiles with the effective bounds of individual vertices
rather than the bounds of the geometries that contain the vertices. This
tiling method is implemented in the open source visualization framework,
Weave.

Keywords: Vector Data, Vector Tiling, Progressive Transmission, Web
Mapping, Generalization, Data Structures, Open Source.

1 Introduction

Our motivation for designing a vector data tiling method came from the require-
ments of our open source, web-based visualization framework, Weave [12,40]. Our
goals included immediate feedback when the user visits the page and a highly
interactive and customizable visualization interface. We needed the capability of
rendering individual geometries with dynamic line and fill styles as well as poly-
gon intersection testing during brushing operations. We also wanted to be able to
interactively explore large data sets without requiring powerful server machines.
Since available solutions did not meet these requirements, a new solution had to
be developed.
To achieve these goals, it was apparent that progressive transmission of vector
data was necessary. If a fully detailed visualization cannot be transferred or
processed in a timely manner, the user should be allowed to interact with


a coarse representation of the data instead. To determine the order in which


vector data should be transmitted to the client, it is necessary to implement a
ranking system.
Our first prototype ranked each vertex in a collection of geometries by a
computed importance value and stored the results in a SQL table. Each time the
map was panned or zoomed to a new location, the client application queried the
server for a 3-dimensional range of data (x, y, and importance) and reassembled
the geometries from the resulting subset. Though this allowed large data sets to
be explored, it was apparent that a tiling system was necessary in order to avoid
redundant data transfer and reduce the computational burden of the server.
Due to the non-uniform distribution of vertices in typical geographic vector
data, a spatially uniform distribution of tiles is not practical as resulting tile
payload sizes can range anywhere from bytes to megabytes. The resulting un-
predictability of transfer and processing requirements for any given tile request
would be unacceptable for an interactive web-based system. Having seen the
success of the Slippy Map¹ image tiling scheme [33], we desired the same pre-
dictability and reliability for vector tiles. Our goal thus became a vector tiling
system in which the tiles have a uniform payload size.
To minimize server requirements, we generate vector tiles once using a prepro-
cessor rather than generating tiles on-the-fly. For each vertex, the preprocessor
computes its effective area which is then treated as a third coordinate. The re-
sults are partitioned in three dimensions into tiles with uniform payload size
and overlapping bounds. The server component provides a list of tiles with their
3-dimensional bounds and allows them to be retrieved by their ID numbers. The
client determines which tiles are needed based on the visible extent and scale,
and remembers which tiles have been received to avoid redundant data transfer.
The client reorganizes the tiled vector data into efficient data structures to en-
able on-the-fly filtering and generalization for smooth zooming operations. The
server and client are implemented in Java and ActionScript, respectively.
This paper contributes several advancements to the field of vector-based web-
mapping. We present Feathered Tiles, a novel vector tiling method which pro-
duces tiles of uniform payload size, eliminates redundant data transfer, and does
not compromise data precision. Novel methods are presented for partitioning
vector data and reducing excess data transfer in an overlapping tile scheme. We
also suggest a non-traditional usage of BLG-tree [26] structures which makes
smooth zooming operations possible without explicitly storing and transmitting
such structures to the client.
The rest of the paper is organized as follows. In Sect. 2 we provide an overview
of related vector mapping solutions, Sects. 3–5 describe our architecture, Sect. 6
discusses the benefits of our solution, and the paper concludes in Sect. 7 with
future work.

¹ A Slippy Map is a web-based map which uses tiled images and supports zoom and
pan interactions. It uses a fixed set of zoom levels corresponding to magnification
factors of two. Zoom level N uses 4^N square images arranged in a grid covering the
entire world. Each tile is identified by a set of three integer coordinates (Z, Y, X).
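As a concrete illustration of this convention, the following Java sketch converts a WGS84 coordinate into the (X, Y) tile indices for a given zoom level Z. It uses the standard OpenStreetMap formula and is illustrative only; it is not part of the Feathered Tiles implementation, and the class and method names are assumed.

public final class SlippyMapTiles {
    // Returns {x, y} tile indices for the given longitude/latitude at zoom level z.
    // At zoom z the grid has 2^z x 2^z (= 4^z) tiles in the Web Mercator projection.
    public static int[] tileIndices(double lonDeg, double latDeg, int z) {
        int n = 1 << z;                                  // 2^z tiles per axis
        int x = (int) Math.floor((lonDeg + 180.0) / 360.0 * n);
        double latRad = Math.toRadians(latDeg);
        int y = (int) Math.floor(
            (1.0 - Math.log(Math.tan(latRad) + 1.0 / Math.cos(latRad)) / Math.PI) / 2.0 * n);
        x = Math.min(Math.max(x, 0), n - 1);             // clamp to guard against rounding at the edges
        y = Math.min(Math.max(y, 0), n - 1);
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        int[] xy = tileIndices(-71.316, 42.633, 12);     // Lowell, MA at zoom 12
        System.out.println("Z=12 X=" + xy[0] + " Y=" + xy[1]);
    }
}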

2 Related Work

This section provides a diverse sample of existing vector-based web-mapping so-


lutions. There are two main approaches: those that use multiple representations
of vector data for discrete levels of detail, and those supporting progressive re-
finements. Approaches that use multiple representations suffer from redundant
data transfer when a change in zoom occurs, while those that support progres-
sive refinements seek to eliminate redundancies. Ours is the only method which
partitions vector data with respect to byte length, thus the transfer requirement
for any given server request in the other methods is unpredictable.

2.1 Approaches Using Multiple Representations


Antoniou et al. [1] use an SVG [36] tiling system for vector data. They avoid
redundancy across tiles by splitting polygons at tile boundaries and then merging
them on the client. In order to make sure the polygons render correctly at the
edges of tiles, their system requests an extra set of tiles outside the viewing area.
This does not solve the problem in all situations, however, as there could be a
polygon crossing three tiles but not having a vertex in the middle tile, in which
case the polygon could be rendered incorrectly at the edge of the screen.
The approaches of Campin [7] and Langfeld et al. [18] generate SVG tiles by
clipping polygons at tile boundaries.
GIS Cloud [15] provides compact JSON-formatted tiles in a Slippy Map [33]
tiling scheme by snapping all vertices to pixel coordinates, eliminating features
smaller than a pixel, and using clever indexing and lookup techniques. The server
generates the tiles on-the-fly [30]. This solution achieves impressive client-side
performance in exchange for its tradeoffs.
Mapsforge [19] uses a custom binary tile format for vector data. It is not
streamed from a server, but allows for efficient storage of geographical informa-
tion, fast tile-based access, and filtering of map objects by zoom level.
OpenScienceMap [35] is an open source Android application supporting tiled
vector data using multiple representations for discrete zoom levels.
TileMill2 [37] is an experimental utility that generates vector tiles arranged
in a Slippy Map tiling scheme. They are stored in a binary format and are
never transferred directly to the client. The binary format contains a set of map
features defined by a list of vector graphics instructions such as moveTo and
lineTo. The advantage of defining map features this way is that the tile can be
stored once and re-used to generate any number of raster images with different
styles quickly on the server.
TileStache [38] generates GeoJSON [14] tiles with clipped geometries in a
Slippy Map tiling scheme which can then be rendered in Polymaps [29] as SVG.
For more examples, the OpenStreetMap Wiki provides an extensive list of
vector tiling solutions [28].

2.2 Approaches Supporting Progressive Refinements


The tGAP-tree is a structure suitable for progressive data transfer with the
server component performing dynamic queries on complex SQL tables [27,20].
Our approach uses progressive refinements and supports what Schmalstieg
et al. [34] call smooth levels of detail. Although we eliminate redundant data
transfer, we do have one drawback as pointed out by Han et al. [16]: “progressive
lossless vector transmission takes longer than downloading the entire raw data
set because of added encoding indexes.” Starting with the next section, the
remainder of this paper describes our approach.

3 Preprocessing Method
This section describes our preprocessor which converts a set of geometries into
a set of vector tiles. We first describe how we assign importance values to every
vertex in a set of geometries. Second, we describe the TileSplit algorithm for
partitioning three-dimensional data into tiles. Third, we describe how we apply
the TileSplit algorithm, and fourth we explain the critical details for minimizing
tile overlap and why we named our method Feathered Tiles.

3.1 Vertex Importance Calculations


Throughout our implementation we define importance values using area (in data
coordinates) as the unit. A different implementation could use a different unit
as long as all components are updated accordingly. Our client uses the area of
a single pixel as the minimum threshold for considering an object during spa-
tial queries and on-the-fly generalization. This eliminates the need to preserve
topological consistency during preprocessing, since topological inconsistencies
are difficult to discern when the error is less than a single pixel [6]. If a larger
minimum threshold is desired for further reduction of data transfer and process-
ing requirements, topologically consistent simplification methods [9,21] should
be used to calculate importance values.
Our architecture reproduces valid simplified geometries by skipping all ver-
tices with an importance value less than a given threshold value. Such values
are generated using Visvalingam’s area-based method for ranking vertices [39].
This algorithm iteratively removes vertices with the least effective area, defined
by the triangle formed by a vertex and its two neighboring vertices. Because
the refinement process is exactly the inverse of the simplification process [17],
this algorithm guarantees that our progressive refinements occur in order of de-
scending effective area, meaning that the map stabilizes quickly. In contrast, the
more widely known Douglas-Peucker (DP) algorithm [11] produces less pleasing
results because it tends to produce spikes where there are none [39] and produces
jumpy progressive refinements because the reverse of the simplification process
is not guaranteed to give progressively lower error values [26]. In fact the DP
algorithm is specifically designed to find the biggest jump possible in each of its
iterations, while Visvalingam’s does the opposite.

The simplification process stops when a polygon or polyline is reduced to its


minimum number of vertices. The remaining vertices are marked as “required”
and their importance values are set equal to the area of the geometry’s bounding
box. This ensures that all required vertices will be included with a shape once
it becomes visible during a zoom-in operation. For polygons that have multiple
parts (islands or donut holes)², we add a special placeholder at the index before
a new part begins and treat it as a required vertex of the part that follows.
These placeholders are necessary to avoid incorrectly treating vertices from mul-
tiple parts as a single closed loop. Since required vertices of individual parts of
polygons have importance values equal to the area of the part’s bounding box,
islands and donut holes will be excluded when they are smaller than a single
pixel.
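For illustration, the following Java sketch assigns effective areas to the vertices of a single open polyline in the spirit of Visvalingam's method as described above. The class and method names are assumptions, the O(n^2) scan is chosen for clarity, and the bounding-box value given to the remaining required vertices mirrors the rule stated in the previous paragraph; the paper's actual preprocessor is not reproduced here.

import java.util.ArrayList;
import java.util.List;

final class VisvalingamRanking {
    // Area of the triangle formed by a vertex b and its two neighbours a and c.
    static double triangleArea(double[] a, double[] b, double[] c) {
        return Math.abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0;
    }

    // Returns one importance value (effective area) per vertex of an open polyline.
    static double[] effectiveAreas(double[][] pts) {
        double[] importance = new double[pts.length];
        List<Integer> alive = new ArrayList<>();
        for (int i = 0; i < pts.length; i++) alive.add(i);
        double maxSoFar = 0.0;
        while (alive.size() > 2) {
            int worst = -1;
            double worstArea = Double.POSITIVE_INFINITY;
            for (int k = 1; k < alive.size() - 1; k++) {
                double area = triangleArea(pts[alive.get(k - 1)], pts[alive.get(k)], pts[alive.get(k + 1)]);
                if (area < worstArea) { worstArea = area; worst = k; }
            }
            // Keep the recorded areas non-decreasing in removal order, so that skipping
            // all vertices below a threshold always yields a valid simplification.
            maxSoFar = Math.max(maxSoFar, worstArea);
            importance[alive.get(worst)] = maxSoFar;
            alive.remove(worst);
        }
        // The remaining required vertices get the area of the polyline's bounding box.
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] p : pts) {
            minX = Math.min(minX, p[0]); maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]); maxY = Math.max(maxY, p[1]);
        }
        for (int idx : alive) importance[idx] = (maxX - minX) * (maxY - minY);
        return importance;
    }
}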

3.2 The TileSplit Algorithm


The TileSplit algorithm can be used for partitioning any data with geospatial
aspects into tiles suitable for web mapping. The purpose of this algorithm is to
produce tiles with a uniform payload size by partitioning the data in three di-
mensions with respect to its length in bytes. Two data structures are introduced
in this algorithm:
– StreamObject is an interface for any object with three coordinates (X, Y,
and importance), a queryBounds and a payload. The queryBounds specifies
the (X, Y) range in which the StreamObject is required at or below its
importance level. The payload can be any length of data to be included in a
tile.
– StreamTile has a queryBounds, an importance range, a list of StreamOb-
jects, and a payload. The queryBounds is the minimum bounding rectangle
containing the queryBounds of every StreamObject in the tile. The impor-
tance range covers the minimum and maximum importance values of all the
StreamObjects. The payload contains the concatenated payloads of all the
StreamObjects included in the StreamTile.
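These two structures can be written down as plain Java types as follows. The field and method names mirror the description above; the Bounds helper and the concrete signatures are assumptions made purely for illustration.

import java.util.List;

interface StreamObject {
    double getX();
    double getY();
    double getImportance();
    Bounds getQueryBounds();   // (X, Y) range in which the object is required
    byte[] getPayload();       // arbitrary data to be placed into a tile
    default int getPayloadSize() { return getPayload().length; }
}

final class Bounds {
    double xMin = Double.POSITIVE_INFINITY, yMin = Double.POSITIVE_INFINITY;
    double xMax = Double.NEGATIVE_INFINITY, yMax = Double.NEGATIVE_INFINITY;

    // Grow this rectangle to the minimum bounding rectangle of itself and b.
    void includeBounds(Bounds b) {
        xMin = Math.min(xMin, b.xMin); yMin = Math.min(yMin, b.yMin);
        xMax = Math.max(xMax, b.xMax); yMax = Math.max(yMax, b.yMax);
    }
}

final class StreamTile {
    final Bounds queryBounds = new Bounds();   // MBR of the members' queryBounds
    double minImportance = Double.POSITIVE_INFINITY;
    double maxImportance = Double.NEGATIVE_INFINITY;
    final java.io.ByteArrayOutputStream payload = new java.io.ByteArrayOutputStream();

    StreamTile(List<StreamObject> members) {
        for (StreamObject so : members) {
            queryBounds.includeBounds(so.getQueryBounds());
            minImportance = Math.min(minImportance, so.getImportance());
            maxImportance = Math.max(maxImportance, so.getImportance());
            payload.writeBytes(so.getPayload());   // concatenated member payloads
        }
    }
}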
The TileSplit algorithm first sorts a list of StreamObjects by their importance
values. Then, it iteratively consumes chunks from the stream in descending order
of importance such that the first chunk is the size of a single tile and each
successive chunk is four times larger than the previous. Each chunk is then
partitioned into tiles with respect to the X and Y dimensions, each partition
with a byte length approximately equal to the target tile payload size. The result
is a layered pyramid of tiles similar to the Slippy Map image tiling scheme [33],
except that the bounding boxes and importance levels are non-uniform. This
non-uniform coverage is required to achieve the goal of uniform tile payload size,
and this is how we mitigate the problem of non-uniform distribution of geometric
detail. Pseudocode for implementing the TileSplit algorithm is shown below, and
Fig. 1 shows an example of resulting tile boundaries.
² Also known as weakly simple polygons.

Pseudocode for the TileSplit algorithm

Function TileSplit(Array<StreamObject> input, Integer tileSize)


// Divides a stream into StreamTile objects with
// payload size approximately equal to tileSize.
Array<StreamObject> chunk
Array<StreamTile> output
Integer tally, tileCount

SortByImportance(input)
output = new Array<StreamTile>
tileCount = 1
While (input.length > 0)
(chunk, tally) = RemoveChunk(input, tileCount * tileSize)
// Prevent the last level from having under-sized tiles
While (tileCount > 1) And (tileCount * tileSize > tally)
tileCount = tileCount / 4
EndWhile
QuadSplit(chunk, tally, tileCount, output)
tileCount = tileCount * 4
EndWhile
Return output
EndFunction

Function RemoveChunk(Array<StreamObject> input, Integer chunkSize)


// Removes a chunk from a stream
// with respect to StreamObject payload size.
Array<StreamObject> output = new Array<StreamObject>
Integer tally = 0
While (input.length > 0) And (tally < chunkSize)
StreamObject so = input.pop()
tally = tally + so.getPayloadSize()
output.push(so)
EndWhile
Return (output, tally)
EndFunction

Function SplitInHalf(Array<StreamObject> input, Integer totalSize)


// Splits a stream in half
// with respect to StreamObject payload size.
Array<StreamObject> half
(half, _) = RemoveChunk(input, totalSize / 2)
Return (input, half)
EndFunction

Function QuadSplit(Array<StreamObject> input, Integer tally,
                   Integer tileCount, Array<StreamTile> output)


// Groups StreamObjects into StreamTile objects,
// partitioning the input in the X and Y dimensions.
Array<StreamObject> west, east, nw, ne, sw, se
If (input.length == 0) Then Return
If (tileCount == 1)
// All objects in a single tile
output.push( new StreamTile(input) )
Return
EndIf
SortByX(input)
(west, east) = SplitInHalf(input, tally)
SortByY(west)
SortByY(east)
(nw, sw) = SplitInHalf(west, tally/2)
(ne, se) = SplitInHalf(east, tally/2)
QuadSplit(nw, tally/4, tileCount/4, output)
QuadSplit(sw, tally/4, tileCount/4, output)
QuadSplit(ne, tally/4, tileCount/4, output)
QuadSplit(se, tally/4, tileCount/4, output)
EndFunction

Fig. 1. Example tile boundaries generated by the TileSplit algorithm overlayed on the
6-megabyte shapefile used to produce them

3.3 Tile Payloads


For a given collection of geometries, we run the TileSplit algorithm twice to
produce a set of metadata tiles and a set of geometry tiles. Keeping these separate
allows the client to request the metadata without requesting the geometry detail,
but a different implementation could combine all the information into one set

of tiles if desired. Each tile payload contains a stream of objects, and since the
byte-level details have no effect on the outcome we will only describe the contents
at an object level.

Metadata Tiles. Each object in a metadata tile corresponds to a geometry


and contains a shapeID (an integer), a shapeKey (a string) and a bounding
box (four coordinates). To simplify our storage model, our implementation also
includes shared metadata (projection and geometry type) in the first tile. It is
safe to do so because the first tile generated by our TileSplit algorithm covers
the entire (X, Y) range and has the highest importance range, and thus is always
requested by the client. If in the future we use a different TileSplit algorithm, we
may have to relocate this shared metadata. For use with the TileSplit algorithm,
each metadata object implements the StreamObject interface as follows:
x, y : Center coordinates of bounding box
importance : Area of bounding box
queryBounds : Equal to the bounding box

Geometry Tiles. The geometry tiles contain CombinedPoint objects which


correspond to (X, Y) locations appearing in the geometry data. A Combined-
Point object contains x, y, importance, and a list of (shapeID, vertexID) pairs.
This information is used for dynamically reconstructing the original geometries,
and is similar to a structure used by Zhang et al. [42] containing x, y, shapeID,
and vertexID. The added importance value allows us to perform on-the-fly gen-
eralization of individual geometries. We group vertices by (X, Y) location in
order to reduce the size of the final output for polygon collections that represent
geographic boundaries sharing common borders. For the TileSplit algorithm, the
CombinedPoint implements the StreamObject interface as follows:
x, y : Coordinates shared by all referenced vertices
importance : Highest importance value for any referenced vertex
queryBounds : Envelops the effective area of all referenced vertices
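For illustration, a CombinedPoint could expose the StreamObject view like this, reusing the illustrative Bounds and StreamObject types sketched in Sect. 3.2. The payload serialization shown is a placeholder, since the byte-level format is left open above.

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class CombinedPoint implements StreamObject {
    final double x, y;
    double importance = 0.0;                                 // highest importance of any referenced vertex
    final Bounds queryBounds = new Bounds();                 // envelops the effective areas of all referenced vertices
    final List<int[]> shapeAndVertexIds = new ArrayList<>(); // (shapeID, vertexID) pairs

    CombinedPoint(double x, double y) { this.x = x; this.y = y; }

    void addReference(int shapeId, int vertexId, double vertexImportance, Bounds effectiveBounds) {
        shapeAndVertexIds.add(new int[] { shapeId, vertexId });
        importance = Math.max(importance, vertexImportance);
        queryBounds.includeBounds(effectiveBounds);
    }

    public double getX() { return x; }
    public double getY() { return y; }
    public double getImportance() { return importance; }
    public Bounds getQueryBounds() { return queryBounds; }

    public byte[] getPayload() {
        // Placeholder serialization only; the real byte format is implementation specific.
        StringBuilder sb = new StringBuilder().append(x).append(',').append(y).append(',').append(importance);
        for (int[] ref : shapeAndVertexIds) sb.append(',').append(ref[0]).append(':').append(ref[1]);
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }
}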

3.4 Minimizing Tile Overlap to Reduce Excess Data Transfer

When vertices from a single geometry are spread across multiple tiles in the X or
Y dimensions, vertices from some of the off-screen tiles may still be required to
correctly render the part of the geometry that is on-screen. Possible approaches
to this missing data problem include duplicating vertices across tiles, introduc-
ing new vertices at tile boundaries [22,7,18], and using overlapping tile query
bounds. Duplicating or creating additional vertices increases the size of each
tile unpredictably, which conflicts with our goal of creating tiles with uniform
payload size. Overlapping tile query bounds is the best approach in our case
as it does not add any additional complexity since our tile bounds are already
non-uniform.
The simplest way to ensure a tile is requested when it is required is to extend
the tile’s query bounds to envelop each geometry referenced in the tile. That is

the approach used in a winged-edge topology [27,32], where each edge is asso-
ciated with two polygons and the abox (area box) that envelops them is used
as filtering criteria. Though this approach solves the missing data problem it
creates the additional problem of excess data, since we do not necessarily need
all off-screen vertices in order to render a geometry correctly.
To reduce excess data transfer, we extend our tile’s query bounds to envelop
only the effective area of the included vertices rather than the bounds of the
referenced geometries. As mentioned in Sect. 3.1, the effective area of a vertex is
the area of the triangle it forms with its two adjacent vertices during the simpli-
fication process. This distinction is critical because this approach minimizes the
amount of tile overlap, which in turn reduces the amount of data the client will
download at a given scale and viewport, as illustrated in Sect. 6. The similarities
of our method to the winged-edge topology and the importance of this detail led
us to name our method Feathered Tiles.
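The distinction can be made explicit in code. The sketch below, reusing the illustrative types from Sect. 3.2, derives a tile's query bounds either the feathered way, from the per-vertex effective bounds, or the winged way, from the bounds of every referenced geometry; the geometry-bounds map is a hypothetical input, not something the paper defines.

import java.util.List;
import java.util.Map;

final class TileQueryBounds {
    // Feathered: envelop only the effective area of the vertices contained in the tile.
    static Bounds feathered(List<StreamObject> tileMembers) {
        Bounds b = new Bounds();
        for (StreamObject so : tileMembers) b.includeBounds(so.getQueryBounds());
        return b;
    }

    // Winged: envelop the full bounds of every geometry referenced by the tile.
    // referencedGeometryBounds maps each member to the bounds of the geometries it belongs to.
    static Bounds winged(List<StreamObject> tileMembers,
                         Map<StreamObject, List<Bounds>> referencedGeometryBounds) {
        Bounds b = new Bounds();
        for (StreamObject so : tileMembers)
            for (Bounds g : referencedGeometryBounds.get(so))
                b.includeBounds(g);
        return b;
    }
}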

4 Tile Management
This section describes the roles of the client and server when managing and request-
ing tiles. Our approach is client-heavy with few requirements of the server beyond
hosting the data, which allows servers to accommodate more simultaneous users.

4.1 Client Tile Requests


Each tile collection contains a list of tile descriptors, each of which includes
an ID number, bounding box coordinates, and an importance range. The client
first examines the tile descriptors to determine which tiles to request based on
the active scale and extent, much like the metadata file described by Zhang et
al. [42]. We index the tiles into a 5-dimensional KD-tree [3] with four dimensions
for the bounding box as done by Rosenberg [31] with a fifth dimension added for
the maximum importance value of the tiles. Other structures could conceivably
be used for this purpose such as range trees [4]. When performing a range query
on the tree, the minimum importance threshold is set to the area covered by a
single pixel in the viewport at the current scale. Thus all tiles with importance
equal to or greater than the current pixel area are caught by the query, ensuring
that the client will receive all the progressive refinements necessary to render
what is visible at the desired scale.
When the client changes its view parameters, it queries the tile tree for a list
of tiles required by the current view. If any tile references are found, they are
removed from the tile tree and requested from the server. Using this approach
tiles are requested only once. To account for interrupted downloads, the removed
tile references may be kept in a separate “pending” list so they can be added
back to the tile tree if their download did not complete.
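A minimal sketch of this client-side selection is given below. The paper's client indexes the descriptors in a 5-dimensional KD-tree; for brevity the sketch scans a plain list, which is functionally equivalent, and all type and method names are illustrative.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class TileDescriptor {
    int id;
    double xMin, yMin, xMax, yMax;   // tile query bounds
    double maxImportance;            // upper end of the tile's importance range
}

final class TileSelector {
    private final List<TileDescriptor> notYetRequested;

    TileSelector(List<TileDescriptor> descriptors) {
        this.notYetRequested = new ArrayList<>(descriptors);
    }

    // Returns the IDs of tiles needed for the current view and forgets them,
    // so that each tile is requested only once.
    List<Integer> tilesForView(double viewXMin, double viewYMin, double viewXMax, double viewYMax,
                               double viewWidthPx, double viewHeightPx) {
        // Minimum importance = data-space area covered by one pixel at the current scale.
        double pixelArea = ((viewXMax - viewXMin) / viewWidthPx) * ((viewYMax - viewYMin) / viewHeightPx);
        List<Integer> needed = new ArrayList<>();
        for (Iterator<TileDescriptor> it = notYetRequested.iterator(); it.hasNext(); ) {
            TileDescriptor t = it.next();
            boolean overlapsView = t.xMax >= viewXMin && t.xMin <= viewXMax
                                && t.yMax >= viewYMin && t.yMin <= viewYMax;
            if (overlapsView && t.maxImportance >= pixelArea) {
                needed.add(t.id);
                it.remove();   // analogous to removing the reference from the tile tree
            }
        }
        return needed;
    }
}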

4.2 Server Tile Management


Given that the client independently determines which tiles it needs, the server
component has very few additional requirements. A minimal server would

require no special services running. The tile descriptors could be stored in a


separate file in the same folder as the individual files for the tiles. In our imple-
mentation we store the tiles as rows in a database, indexed by their ID numbers.
The client is allowed to request multiple tiles at once, and the server responds by
concatenating the payload of each tile into a single stream. The advantage of this
approach is a reduced number of client-server round-trip communications. The
drawback is that the dynamic nature of the requests prevents the web browser
from caching the results. We were not particularly concerned with this aspect of
the architecture during development, but if we decide we want a cache-friendly
solution, we have that option. Note that a cache-friendly solution does not re-
quire the tiles to be stored as individual files on disk, since URL patterns can
be redirected to servlet calls, which would enable both SQL storage and browser
caching.
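A minimal sketch of the server side, with an in-memory map standing in for the database table used in the actual implementation and all names assumed for illustration:

import java.io.ByteArrayOutputStream;
import java.util.Map;

final class TileServer {
    private final Map<Integer, byte[]> tilePayloadsById;   // tiles indexed by their ID numbers

    TileServer(Map<Integer, byte[]> tilePayloadsById) {
        this.tilePayloadsById = tilePayloadsById;
    }

    // Concatenates the payloads of all requested tiles into a single response stream.
    byte[] respond(int[] requestedTileIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int id : requestedTileIds) {
            byte[] payload = tilePayloadsById.get(id);
            if (payload != null) out.writeBytes(payload);
        }
        return out.toByteArray();
    }
}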

5 Client Processing and Rendering

When the client receives tiles from the server, it asynchronously parses the pay-
load stream and dynamically builds data structures that facilitate on-the-fly
generalization with smooth levels of detail. This section explains how these struc-
tures are built and how they are used to improve the performance of the client.

5.1 View-Based Filtering

We use the same type of 5-dimensional KD-tree as described in Sect. 4.1 for
filtering geometric features based on the current scale and extent. Geometry
features outside the viewport or smaller than a single pixel are excluded from
the query result. The tree is built using the information included in the metadata
tiles (see Sect. 3.3) and is rebuilt every time we observe that the list of pending
metadata tiles has been completely parsed. Since optimally balanced KD-trees
are computationally expensive to build, we randomize the insertion order of
nodes as a fast alternative to avoid worst-case performance. Since there are far
fewer geometries than there are vertices, metadata tiles are requested nowhere
near as often as geometry tiles.

5.2 Implicit BLG-Trees for On-the-Fly Generalization

In order to achieve acceptable performance with highly detailed geometries, the


client must be able to generalize detailed polygons and polylines on the fly. In
Sect. 3.1 we explained that we can use the vertex importance values as filtering
criteria for line generalization. Therefore, we can derive simplified geometries by
skipping vertices with importance values below a given threshold. However, we
want to avoid checking the importance values of all the vertices if possible. For
that purpose, we generate BLG-trees [26] dynamically from the tiled geometry
data as it is received.

The BLG-tree is traditionally used to store results from the Douglas-Peucker


(DP) line simplification algorithm [11] to facilitate on-the-fly generalization of
a polyline [26]. Each node of the BLG-tree contains coordinates and an error
threshold value for a single vertex in a polyline, and the tree is constructed
such that a full in-order traversal of the tree will visit every vertex of the orig-
inal polyline in order. Generalization is achieved by skipping nodes with error
values below a desired threshold during an in-order traversal. Because the DP
algorithm is not guaranteed to produce error values in decreasing order [26],
the parent-child node relationships are a necessary part of the result and these
BLG-trees cannot be reconstructed from the DP algorithm’s error values and
vertex IDs alone. Because of this, BLG-tree structures are traditionally stored
on a server and transmitted to a client, adding undesirable communication and
administrative overhead [21].
In our case, we are able to implicitly derive BLG-trees from our importance
values and vertex IDs since we require that the importance values define the
ranking. No matter the order in which the data is received, a valid BLG-tree can
be dynamically constructed by inserting and rearranging nodes such that the
vertices appear in their original order and deeper nodes have lower importance
values. This is an atypical usage of the BLG-tree structure, since it has no
relation to the DP algorithm.
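The paper does not prescribe a particular insertion procedure for these dynamically built trees. One way to maintain both invariants, that an in-order traversal yields the original vertex order and that deeper nodes have lower importance, is a treap-style insertion keyed by vertexID with the heap property on importance. The sketch below uses hypothetical names; the traversal also shows how whole subtrees below the importance threshold can be pruned.

final class BlgNode {
    final int vertexId;
    final double x, y, importance;
    BlgNode left, right;
    BlgNode(int vertexId, double x, double y, double importance) {
        this.vertexId = vertexId; this.x = x; this.y = y; this.importance = importance;
    }
}

final class ImplicitBlgTree {
    BlgNode root;

    void insert(int vertexId, double x, double y, double importance) {
        root = insert(root, new BlgNode(vertexId, x, y, importance));
    }

    // Binary search tree on vertexId; rotations restore the max-heap property on importance.
    private static BlgNode insert(BlgNode node, BlgNode added) {
        if (node == null) return added;
        if (added.vertexId < node.vertexId) {
            node.left = insert(node.left, added);
            if (node.left.importance > node.importance) node = rotateRight(node);
        } else {
            node.right = insert(node.right, added);
            if (node.right.importance > node.importance) node = rotateLeft(node);
        }
        return node;
    }

    private static BlgNode rotateRight(BlgNode n) { BlgNode p = n.left;  n.left = p.right;  p.right = n; return p; }
    private static BlgNode rotateLeft(BlgNode n)  { BlgNode p = n.right; n.right = p.left;  p.left = n;  return p; }

    // In-order traversal; because deeper nodes are never more important, an entire
    // subtree can be skipped as soon as its root falls below the threshold.
    void traverse(BlgNode node, double minImportance, java.util.List<BlgNode> out) {
        if (node == null || node.importance < minImportance) return;
        traverse(node.left, minImportance, out);
        out.add(node);
        traverse(node.right, minImportance, out);
    }
}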

5.3 Off-screen Vertex Skipping


In early versions of our software we noticed that zooming in to large polygons
with thousands of vertices would slow down the rendering significantly. To pre-
vent this from occurring, we eliminate unnecessary off-screen vertices in our
BLG-tree traversal routine by considering two parameters instead of one: min-
Importance, and visibleBounds. We use a variation of the Cohen-Sutherland [24]
algorithm to skip vertices that are outside the viewing rectangle. We do not per-
form clipping on line segments because we have not experienced any significant
performance hit resulting from a large, simplified portion of a polygon being
off-screen in Flash Player. The need for clipping should be re-assessed if the
client is implemented in a different run-time environment.
During BLG-tree traversal, two flag values are kept for the two previous ver-
tices added to the resulting node list. The flag values are generated by the
GridTest routine, shown below. The code snippet that follows is taken from the
BLG-tree traversal routine and shows how to use the GridTest result for skipping
vertices. To determine if a particular vertex can be skipped, we check the result
of applying the binary AND operator on three consecutive flag values. Section 6
gives sample results of this off-screen simplification process.
Pseudocode for the GridTest routine
Function GridTest(x, y, xMin, yMin, xMax, yMax)
// Returns a value to be ANDed with two previous results.
Return (x < xMin ? 0x0001 : (x > xMax ? 0x0010 : 0))
| (y < yMin ? 0x0100 : (y > yMax ? 0x1000 : 0))
EndFunction

Pseudocode for skipping off-screen vertices while traversing a BLG-tree structure


// Begin snippet for NodeVisit (not a stand-alone function)
If (visibleBounds != Null)
gridTest = visibleBounds.getGridTest(node.x, node.y)
If (prevPrevGridTest & prevGridTest & gridTest)
// Drop previous node.
// Keep current prevPrevGridTest value.
result.removeLast();
Else
// Don’t drop previous node.
// Shift prev grid test values.
prevPrevGridTest = prevGridTest;
EndIf
prevGridTest = gridTest;
EndIf
// append this node to the results
result.append(node);
// End snippet
There is one caveat to this vertex skipping process: In order to avoid see-
ing slivers of simplified off-screen lines, either the drawing routine must omit
off-screen line strokes or the visibleBounds parameter must be padded. The for-
mer approach is similar to how Langfeld et al. [18] separate the border from
the fill, while the latter approach is used by Campin [7], TileStache [38], and
Polymaps [23].

6 Evaluation and Discussion


The benefits of progressive transmission and on-the-fly generalization for vector
data are well documented in related work [5,8,10,41]. Progressive transmission
reduces the amount of data required to be transferred, and on-the-fly general-
ization reduces the amount of data processed during rendering. However, the
effectiveness of these solutions depends greatly on the details of their implemen-
tation. When a web mapping client zooms far in to a highly detailed portion of
vector data, the client must make sure that it a) does not request more data
than necessary; and b) can efficiently render only the portion of data which is
visible.
Progressive transmission makes it possible to retrieve fully detailed geometry
data when required, but it is important to avoid excess data transfer. Nordan [25]
gives a perfect example of when this matters: “If the user was zoomed in to look
at the border between Russia and Finland, the considerable time and comput-
ing power required to download and assemble the entire outline of Russia at
that zoom level would be a complete waste.” In Sect. 3.4 we described how we

minimize tile overlap to reduce excess data transfer. Using Nordan’s example,
we can see how much tile overlap matters. Figure 2 shows the borders of Nor-
way, Finland, and Russia, and Table 1 shows the results of applying different
tile overlapping methods. If each tile’s query bounds is extended to include the
bounds of every referenced geometry (the winged method), the entire outlines of
the three countries are downloaded and parsed at the extent shown. Under the
Feathered Tiles method only 15% of the data is transmitted. Results will vary
with the tile payload size and input file, but Feathered Tiles will always produce
less tile overlap and in turn reduce excess data transfer.

Fig. 2. Displaying a 13-megabyte shapefile of countries of the world with 3-meter
accuracy, zoomed in to the borders of Norway, Finland, and Russia. At this extent,
only a small fraction of the data is required for rendering.

In the previous example, reducing the download size is only half the prob-
lem. Suppose that the client already had the full detail of the geometry cached
in memory as a result of panning along the borders, or the client has explic-
itly loaded a large, local shapefile into memory. In a highly detailed shapefile,
individual polygons may have thousands or millions of vertices. It’s clear that
an increased number of vertices will take a longer time to process, so it makes
sense not to waste time with off-screen vertices (OSVs). This problem is solved
by OSV skipping, described in Sect. 5.3. Figure 3 demonstrates two examples
before and after OSV skipping, with related statistics shown in Table 2.

Table 1. The method for determining tile query bounds greatly affects the amount of
excess data transfer in Fig. 2. The winged method extends the query bounds of a tile to
include the bounds of each referenced geometry, while the feathered method includes
only the effective area of the vertices contained within. In both cases, the target tile
payload size was set to 32 kilobytes.

Tile overlap method   Overall tile overlap   Tiles requested at extent shown   Vertices received at extent shown
Winged                405%                   117                               126,548
Feathered             3%                     17                                18,775

Fig. 3. Examples before (left) and after (right) off-screen vertex skipping when zoomed
in to Michigan (top) and Louisiana (bottom) shorelines. Off-screen portions are faded
out. The data comes from a 42-megabyte United States boundary shapefile. Only a
small fraction of the data is required to render the visible portion of the polygons.

Table 2. Skipping certain off-screen vertices in Fig. 3 allows correct rendering of
polygons using only a fraction of the data

Shoreline    Total vertices at scale shown   On-screen vertices at extent shown   Percentage of vertices required for rendering
Michigan     14,000                          3,500                                25%
Louisiana    10,000                          1,500                                15%

7 Conclusion and Future Work


This paper presents Feathered Tiles, a novel approach for vector-based web
mapping which eliminates redundant data transfer and supports smooth zooming
operations with on-the-fly generalization. Tiles are partitioned to uniform byte
length which enables planned, predictable progressive transmission techniques.
One critical aspect of Feathered Tiles is the definition of the effective area of a
tile, which includes only the effective area of the vertices contained within the tile
rather than the bounds of the geometries it references. It has been demonstrated
that this decision can greatly reduce the amount of data requested by the client.
Finally, important client-side performance enhancements were outlined which
enable selective processing on large amounts of vector data for highly interactive
vector-based web-mapping.
There are several directions our future work can take. Firstly, different im-
portance calculation methods can be used to improve preprocessing speed and
output quality. For example, the algorithm proposed by Buzer has a time com-
plexity of O(n log n) and produces minimal representations of polylines targeted
for given pixel scales without introducing visible topological inconsistencies [6].
Another possibility is to eliminate the need for tile descriptors. The client re-
quest would then consist of a data range, a scale, and a bitmask for filtering out
the tiles it has or is currently receiving. With some polishing, our tiling method
could be encapsulated in a new standalone file format to facilitate on-the-fly ex-
ploration and generalization of large geometry sets. To tackle the issue of large
geometry sets exceeding the memory capacity of lower-end machines, a method
for freeing unused parts of the cache could be developed. We could also consider
using a more adaptive tiling method [2,13] to further reduce excess data transfer.
Finally, different encoding methods for data compression could be explored. For
example, grouping vertices by geometry ID or importance value rather than x,y
pairs may improve the storage efficiency.

References
1. Antoniou, V., Morley, J., Haklay, M(M.): Tiled vectors: A method for vector trans-
mission over the web. In: Carswell, J.D., Fotheringham, A.S., McArdle, G. (eds.)
W2GIS 2009. LNCS, vol. 5886, pp. 56–71. Springer, Heidelberg (2009)
2. The Astrophysical Research Consortium: Tiling and Adaptive Tiling. The Sloan
Digital Sky Survey Project Book. Princeton University (1993), http://www.astro.princeton.edu/PBOOK/tiling/tiling.htm
3. Bentley, J.L.: Multidimensional binary search trees used for associative searching.
Communications of the ACM 18(9), 509–517 (1975)
4. Bentley, J.L., Friedman, J.H.: Data Structures for Range Searching. ACM Comput.
Surv. 11(4), 397–409 (1979)
5. Bertolotto, M., Egenhofer, M.J.: Progressive transmission of vector map data over
the world wide web. GeoInformatica 5(4), 345–373 (2001)
6. Buzer, L.: Optimal simplification of polygonal chains for subpixel-accurate render-
ing. Computational Geometry 42(1), 45–59 (2009), http://dx.doi.org/10.1016/j.comgeo.2008.03.002

7. Campin, B.: Use of vector and raster tiles for middle-size Scalable Vector Graph-
ics mapping applications. In: SVGOpen 2005 (2005), http://www.svgopen.org/2005/papers/VectorAndRasterTilesForMappingApplications/
8. Corcoran, P., Mooney, P., Bertolotto, M., Winstanley, A.: View- and scale-based
progressive transmission of vector data. In: Murgante, B., Gervasi, O., Iglesias,
A., Taniar, D., Apduhan, B.O. (eds.) ICCSA 2011, Part II. LNCS, vol. 6783, pp.
51–62. Springer, Heidelberg (2011)
9. Corcoran, P., Mooney, P., Bertolotto, M.: Line simplification in the presence of non-
planar topological relationships. In: Bridging the Geographic Information Sciences,
pp. 25–42. Springer, Heidelberg (2012), doi: http://dx.doi.org/10.1007/978-3-642-29063-3_2
10. Costa, D.C., Teixeira, M.M., De Paiva, A.C., de Souza Baptista, C.: A service-
oriented architecture for progressive transmission of maps. In: Proceedings of IX
Brazilian Symposium on GeoInformatics, INPE 2007. GeoInfo, Campos do Jordão,
Brazil, November 25-28, pp. 97–108 (2007)
11. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points
required to represent a digitized line or its caricature. Cartographica: The Inter-
national Journal for Geographic Information and Geovisualization 10(2), 112–122
(1973)
12. Dufilie, A., Fallon, J., Stickney, P., Grinstein, G.: Weave: A Web-based Architecture
Supporting Asynchronous and Real-time Collaboration. In: Proceedings of the AVI
Workshop on Supporting Asynchronous Collaboration in Visual Analytics Systems
(2012)
13. Environmental Systems Research Institute, Inc.: Tiled processing of large datasets.
ArcGIS Desktop 8.3 Help (2009), http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Tiled+processing+of+large+datasets
14. GeoJSON – JSON Geometry and Feature Description, http://geojson.org/
15. GIS Cloud, http://www.giscloud.com/
16. Han, H., Tao, V., Wu, H.: Progressive vector data transmission. In: Proceedings of
the 6th AGILE, Lyon, France, pp. 103–113 (2003)
17. Haunert, J.H., Dilo, A., van Oosterom, P.: Constrained set-up of the tGAP struc-
ture for progressive vector data transfer. Computers and Geosciences 35(11), 2191–
2203 (2009)
18. Langfeld, D., Kunze, R., Vornberger, O.: SVG Web Mapping. Four-dimensional
visualization of time- and geobased data. In: SVGOpen 2008 (2008),
http://www.svgopen.org/2008/papers/92-SVG_Web_Mapping/
19. Mapsforge, http://code.google.com/p/mapsforge/wiki/SpecificationBinaryMapFile
20. Meijers, M.: Cache-friendly progressive data streaming with variable-scale data
structures. In: Proceedings of the ICA/ISPRS Workshop on Generalisation and
Multiple Representation, Paris, France, June 30-July 1 (2011)
21. Meijers, M.: Simultaneous & topologically-safe line simplification for a variable-
scale planar partition. In: Advancing Geoinformation Science for a Changing
World, pp. 337–358. Springer, Heidelberg (2011)
22. Migurski, M.: TileStache Mailing List (July 19, 2011),
https://groups.google.com/d/msg/tilestache/p7OotBbz5tE/clvzx0YAtUYJ
23. Migurski, M.: StackExchange answer (November 22, 2010), http://gis.stackexchange.com/questions/3712/create-vector-tiles-for-polymaps
24. Newman, W.M., Sproull, R.F.: Principles of interactive computer graphics, 124,
252. McGraw-Hill, Inc. (1979)

25. Nordan, R.P.V.: An Investigation of Potential Methods for Topology Preservation


in Interactive Vector Tile Map Applications. Master Thesis. Norwegian University
of Science and Technology (2012)
26. van Oosterom, P., Van Den Bos, J.: An object-oriented approach to the design of
geographic information systems. Computers and Graphics 13(4), 409–418 (1989)
27. van Oosterom, P.: Variable-scale topological data structures suitable for progres-
sive data transfer: The GAP-face tree and GAP-edge forest. Cartography and
Geographic Information Science 32(4), 331–346 (2005)
28. Vector Tiles - OpenStreetMap Wiki, http://wiki.openstreetmap.org/wiki/Vector_tiles
29. Polymaps, http://www.polymaps.org
30. Ravnic, D.: Re: GisCloud showing tons of vectors features on Web Browser.
OpenLayers-Users mailing list (September 23, 2011), http://lists.osgeo.org/pipermail/openlayers-users/2011-September/022351.html
31. Rosenberg, J.B.: Geographical data structures compared: A study of data struc-
tures supporting region queries. IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems 4(1), 53–67 (1985)
32. Samet, H.: Foundations of Multidimensional and Metric Data Structures, pp. 317–
329 (2006)
33. Slippy Map Tilenames, http://wiki.openstreetmap.org/wiki/Slippy_map_tilenames
34. Schmalstieg, D., Schaufler, G.: Smooth levels of detail. In: Virtual Reality Annual
International Symposium, pp. 12–19. IEEE (March 1997)
35. Schmid, F., Janetzek, H., Wladysiak, M., Hu, B.: OpenScienceMap: open and free
vector maps for low bandwidth applications. In: Proceedings of the 3rd ACM Sym-
posium on Computing for Development. ACM, New York (January 2013)
36. Scalable Vector Graphics. Wikipedia entry, http://en.wikipedia.org/wiki/Scalable_Vector_Graphics
37. TileMill2, https://github.com/mapbox/tm2
38. TileStache documentation. TileStache.Vector, http://tilestache.org/doc/TileStache.Vector.html (accessed June 2013)
39. Visvalingam, M., Whyatt, J.D.: Line generalisation by repeated elimination of
points. The Cartographic Journal 30(1), 46–51 (1993)
40. Weave: Web-based Analysis and Visualization Environment,
http://www.oicweave.org
41. Yang, B.S., Purves, R.S., Weibel, R.: Implementation of progressive transmis-
sion algorithms for vector map data in web-based visualization. The International
Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
34. Part XXX (2004)
42. Zhang, L., Zhang, L., Ren, Y., Guo, Z.: Transmission and visualization of large
geographical maps. ISPRS Journal of Photogrammetry and Remote Sensing 66(1),
73–80 (2011)
Trajectory Aggregation for a Routable Map

Sebastian Müller¹, Paras Mehta¹, and Agnès Voisard¹,²

¹ Institut für Informatik, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany
sebastian.mueller@fu-berlin.de
http://www.mi.fu-berlin.de/en/inf/groups/ag-db/
² Fraunhofer FOKUS

Abstract. In this paper, we compare different approaches to merge tra-


jectory data for later use in a map construction process. Merging tra-
jectory data reduces storage space and can be of great help as far as
data privacy is concerned. We consider different distance measures and
different merge strategies, taking into account the cost of calculation, the
connectivity of the results, and the storage space of the result. Finally,
we give an indication of the possible information loss for each approach.

Keywords: Trajectory Summarization, Trajectory Data, Subtrajecto-


ries, Movement Patterns, GPS.

1 Introduction
The amount of available trajectories of mobile users, in the form of GPS tracks,
is rapidly increasing. A major underlying reason is the availability of cheap
GPS receivers connected to the Internet. We assume that nearly every current
smartphone has integrated GPS. According to [1], there were a total of 173.7
million smartphones shipped in the 3rd quarter of 2012, which was an annual
increase of 44%. On the basis of these numbers, we can expect the number of
users who are able to record GPS trajectories to grow by up to 173.7 million
per quarter.
The merging of trajectories is important for answering non-individual ques-
tions. Our motivation is the construction of a map based on trajectories. Map
construction has recently gained popularity in scientific research. The ACM Dig-
ital Library lists 12719 publications with the keywords “map construction” for
the period between 2008 and 2012 compared to 8196 in the period between 2003
and 2007 [2]. Nowadays, road maps are available in good quality. However, map
construction can still be used to detect changes in the road network for various
applications. Additionally, map construction can be used for company territories
and to create maps used in various outdoors activities such as sports, e.g., maps
for racing bicycles. This has already been done for taxi driving directions [3].
Merged trajectories can help ensure privacy requirements as well as reduce
storage effort, while still providing enough correct data to create a confident map
with lower calculation effort.
One use of our approach is a further anonymization of data. Work has already
been done in the anonymization of trajectory data. Nevertheless, this work often


has another scope and the data is afterwards not used for map construction, but
for other tasks, e.g., data mining of crowd movements [4–6]. In these approaches,
one motivation is urban planning, and therefore complete trajectories are used.
In our approach, we split trajectories in order to be able to build a subtrajectory
based on a larger set of trajectories with a lower distance between them.
We first need to define what we consider as merging of trajectories. A tra-
jectory is the path that a moving object follows through space as a function of
time. In our case, we consider a set of linear movements as a trajectory with the
condition that every end point of a linear movement is a start point of another
linear movement, except for the start and the end point of the whole trajectory.
As the input to the merging process we have 2 or more trajectories. We define
the output as the network of trajectories. Trajectories in a network can be con-
nected at a node. The trajectories in the network have additional information,
namely the number of trajectories which were integrated in the merged trajec-
tory and the variance of the integrated trajectory. We abbreviate the network
of trajectories as an aggregation and for clarity we call a trajectory which is a
candidate to be merged with trajectories in the aggregation as a single trace.
The merging process is divided into two major tasks: the first task is the
selection of trajectories or parts of trajectories to be merged and the second
task is the merging itself.
This paper is organized as follows. Related work is discussed in Section 2. Sec-
tion 3 discusses the selection of trajectories. Section 4 focuses on the problem
of the merging of trajectories. In Section 5, we present our prototypical imple-
mentation. Finally, we present the evaluation of our system (Section 6) and our
conclusions (Section 7).

2 Related Work

We consider methods from the field of computational geometry (such as spa-


tial distance measures) as related work, as well as different approaches for map
construction. The Fréchet distance is an important measure for the closeness of
two trajectories. Its computation is described in [7]. In our case, it has to be
applied for partial curves or subtrajectories [8]. Another spatial distance mea-
sure is the Hausdorff distance [9]. In [10], there is a comparison of trajectory
merging strategies and a new merge process based on the Fréchet distance. The
focus of this work is on objects in Geographic Information Systems (GIS) and
their integration.
The most comparable approach to the trajectory aggregation discussed in
this paper is the approach of incremental data acquisition [11]. In this approach,
there is a road map as precondition and additional information from trajecto-
ries is added incrementally. The main difference in comparison to our approach
is that we first build an aggregation and this is the input for constructing a
map. Conclusively, our iteration step refines an aggregation and the iteration
step described in [11] refines a road map. Other approaches rely directly on a
set of GPS traces and have no iteration or refinement steps [12]. There are also

approaches which use the Fréchet distance to find similarity of trajectories with
the aim of creating subtrajectories in order to detect commuting patterns [13].
Additionally, subtrajectories can be found via clustering [14], also with the help
of the Fréchet distance [15, 16]. This is a very interesting approach that we fol-
low, too. Nevertheless, in this work we aim to find subtrajectories using thresholds,
which avoids additional overhead. This allows us to concentrate on dis-
tance measures. Subtrajectories can also be found using movement similarities
[17]. Nevertheless, these subtrajectories cannot be used for map construction;
rather, they express a movement pattern. Also related are approaches
which find median trajectories [18]. These approaches focus more on a complete
trajectory than on partial trajectories which could represent roads.

3 Selection of Trajectories

We select trajectories by distance and by angle so that close and similarly aligned
trajectories are considered to be merged. Both distance and angle can be ex-
pressed in many different ways. Distance and angle can be measured from a local
and from a global viewpoint. We define a local viewpoint as the comparison be-
tween two nodes or two edges, where a node is a start or end point of a linear
movement and an edge is a linear movement. Of the two nodes, one always comes
from the aggregation and one from the single trace; the same holds for the two
edges. A global viewpoint may include multiple nodes or
multiple edges. In the following, we first discuss a measure for the angle and
then we include this measure in defining the distance from a local and a global
viewpoint.

3.1 Angle Measuring

The aim of angle measuring is to find either edges or nodes (together with their
outgoing edges) with similar directions. More precisely, we call it similar direc-
tion measure because this property could also be expressed by a comparison of
slopes. Measuring the angle helps to prevent the merging of nearby nodes or
edges that follow different directions. The most important examples are cross-
ings and bridges or tunnels. In order to be able to make connections between
different directions it is important to keep these directions instead of merging
them (crossings). And, in order to ensure no connection between unconnected
streets we also need to store these trajectories separately (bridges over streets).
Furthermore, we would like to be able to include an angle variation in our dis-
tance measure in order to merge traces in similar directions more likely. Following
these characteristics we use an angle threshold and an angle expression that can
be included in the distance measure.
We mention slope calculation as a possible replacement for angle calculation,
not because of a better semantic expression, but because of lower calculation
costs. A relative slope calculation is able to replace a relative angle calculation.
Nevertheless, there is a major difference between angle and slope calculation: The

increase of the angle is proportional while the increase of the slope is progressive.
We can flatten this progression, e.g., above the value 1, by taking the inverse of
the slope and giving 2 minus the inverse as result. That way we have a value
range from 0 to 2 for the slope instead of a value range from 0 to infinity.
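A minimal Java sketch of this flattening step (the same mapping appears again in lines 17 and 18 of Algorithm 1 below); the class and method names are only illustrative:

final class FlattenedSlope {
    // Slopes with magnitude above 1 are mapped to 2 minus the inverse of their magnitude,
    // preserving the sign, so the magnitude range becomes [0, 2] instead of [0, infinity).
    static double flatten(double slope) {
        if (slope > 1.0 || slope < -1.0) {
            return (2.0 - 1.0 / Math.abs(slope)) * Math.signum(slope);
        }
        return slope;
    }

    public static void main(String[] args) {
        System.out.println(flatten(0.5));    //  0.5   (unchanged)
        System.out.println(flatten(4.0));    //  1.75  (2 - 1/4)
        System.out.println(flatten(-10.0));  // -1.9   (sign preserved)
    }
}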
Next, the calculation of relative slopes should distinguish four possible direction
groups, which result from the combinations of {up, down} and {left, right}.
Table 1 shows these possible values. For each direction group, the sign function
(sgn) of the slope (m_{a,b}, where a and b are the initial and final nodes of an
edge), the difference of the x-axis values (δ(x)_{a,b}), and the difference of the
y-axis values (δ(y)_{a,b}) are shown. Note that each sign function can be derived
from the other two; all are shown for completeness. Taking these into account,
we have 4 × 4 possible combinations of the direction groups. Which formula is
needed to calculate the relative slope as the difference between the two slopes
can be decided based on the combination of the sign functions of the two edges.

Table 1. Different directions which should be taken into account when calculating a
relative slope

a) {up, left}:     sgn(m_{a,b}) = −1   sgn(δ(x)_{a,b}) = −1   sgn(δ(y)_{a,b}) = 1
b) {up, right}:    sgn(m_{a,b}) = 1    sgn(δ(x)_{a,b}) = 1    sgn(δ(y)_{a,b}) = 1
c) {down, left}:   sgn(m_{a,b}) = 1    sgn(δ(x)_{a,b}) = −1   sgn(δ(y)_{a,b}) = −1
d) {down, right}:  sgn(m_{a,b}) = −1   sgn(δ(x)_{a,b}) = 1    sgn(δ(y)_{a,b}) = −1

Algorithm 1. Calculation of the flattened slope

Require: xA1, yA1, xA2, yA2, xB1, yB1, xB2, yB2
 1: diff1 ⇐ false
 2: diff2 ⇐ false
 3: for G = A → B do
 4:   dx ⇐ xG2 − xG1
 5:   dy ⇐ yG2 − yG1
 6:   if dy = 0 then
 7:     mG ⇐ maximum
 8:   else
 9:     mG ⇐ dx/dy
10:   end if
11:   if dx < 0 AND dy < 0 then
12:     diff1 ⇐ ¬diff1
13:   end if
14:   if dx < 0 AND dy > 0 then
15:     diff2 ⇐ ¬diff2
16:   end if
17:   if mG > 1 ∨ mG < −1 then
18:     mG ⇐ (2 − (1/|mG|)) ∗ sign(mG)
19:   end if
20: end for
21: if sign(mA) ≠ sign(mB) ∧ diff1 ≠ diff2 then
22:   return |mA| + |mB|
23: else if sign(mA) = sign(mB) ∧ ¬diff1 ∧ ¬diff2 then
24:   return |mA − mB|
25: else if sign(mA) = sign(mB) ∧ (diff1 ∨ diff2) then
26:   return 4 − |mA − mB|
27: else
28:   return 4 − (|mA| + |mB|)
29: end if

The flattened slope can be calculated using Algorithm 1. The inputs are two
lines (A and B) with two points each (e.g., A1 and A2). In order to distinguish
the cases shown in Table 1, the variables diff1 and diff2 are used. Within the
for loop (line 3), the differences of the x and y coordinates of the start and
end points are calculated. According to these differences, the states diff1 and
diff2 are adjusted (lines 11 to 16). In the same loop, the two slopes (mA and
mB) are calculated. Finally (lines 21 to 29), the result is computed according to
the states of diff1 and diff2 and the sign functions of the two slopes. The
formulas are also shown as an overview in Table 1.
Nevertheless, by comparing the relative angle measure with the relative slope
measure we will detect some inconsistencies in the relative slope measure. As
mentioned before, the slope increases exponentially, not proportionally, which is
why we flattened the values between 1 and (after the flattening) 2. The flattening
does reduce this effect, but cannot eliminate it. Figure 1 shows the value ranges
of angles and their respective flattened slope and vice versa. In Figure 1b, one

can see that the intervals are not proportional when using a scale based on the
flattened slope.
(a) Proportional increase of angle:
    angle 10° → flattened slope 0.1764;  20° → 0.364;  30° → 0.5773;  40° → 0.839;
    50° → 1.161;  60° → 1.4227;  70° → 1.636;  80° → 1.8236;  90° → 2.0

(b) Proportional increase of flattened slope:
    flattened slope 0.2 → angle 11.31°;  0.4 → 21.80°;  0.6 → 30.96°;  0.8 → 38.66°;
    1.0 → 45.00°;  1.2 → 51.33°;  1.4 → 59.04°;  1.6 → 68.20°;  1.8 → 78.69°;  2.0 → 90.00°

Fig. 1. Comparison of values of the flattened slope measure and the angle measure

We evaluated the performance on a UNIX terminal server with two Intel Xeon
5160 CPUs @ 3 GHz and 16 GB RAM. The calculation of a flattened slope takes
on average 292.5 ns while the calculation of an angle takes on average 945 ns.
This is a reduction of calculation costs of 69%.

3.2 Local Difference Measure


The local difference measure considers single nodes or edges for merging.
For measuring the distance in meters we use the JCoord package [19], which
calculates the distance in meters from a pair of latitude/longitude coordinates.
Including altitudes would add unnecessary computation because we do not
expect much variation and, in particular, no significant influence on the overall
merging process.
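The paper delegates the metric distance computation to JCoord; since its API is not discussed here, the following is an equivalent, hedged sketch using the haversine formula (the constant and function name are our own, not part of JCoord).

import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters; assumption for this sketch

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))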
The distance can be calculated between nodes and between edges. The calcu-
lation between nodes is a standard distance calculation via latitudes and longi-
tudes. The calculation between edges allows for variations. Using the two end-
points of each edge and additionally four more points found via perpendiculars,
we have eight points between which distances can be measured. Figure 2 shows
these eight points for the aggregation and the single trace.

Fig. 2. Two directed edges with extensions and perpendiculars marking the crossings
of the perpendiculars with the edges or their extensions

Since the edges are directed, the first approach is to measure the distance
between the two start points and the distance between the two end points. These
distances express a difference in length or a difference in angle: if the distance
between the end points is higher than between the start points, the angle or
the length of the edges must differ, as shown in Figure 3. A higher end-point
distance caused by a difference in angle is a good indicator, because we do not
want to merge edges with different directions. In contrast, a higher end-point
distance caused by a difference in length is misleading, because such edges are
good candidates for merging: they follow the same direction and are near to
each other.
Using perpendiculars we can avoid this problem. A difference in length does
not influence a distance measured via perpendiculars, whereas a difference in
angle does. Nevertheless, another issue arises when using a distance calculated
via the perpendicular. Figure 4 shows cases with equal distances calculated via
the perpendiculars; they differ in the distance between start and end points.
The first case, with a low distance between start and end points, is a good match
because the edges are near each other, but in the second case the edges are not
near and it is probable that there is a better match (left of the single trace).
Regarding these issues, we prefer to use both measures, with the requirement
that both measures are good and outliers are penalized. The distance in meters
therefore has to be combined with a measure for similar directions.

Fig. 3. Two directed edges which differ in length or in angle, and the distances between
their start and end points: (a) difference in length; (b) difference in angle

Fig. 4. Two directed edges which are parallel but differ in the distance between start
and end points: (a) low difference in start points; (b) high difference in start points

The combination of these two measures can be either a sum or a product:

    sum:     c = w_a · a + w_d · d
    product: c = (w_a · a) · (w_d · d)

where a is the angle measure, d is the distance measure in meters, c is the
resulting difference, and w_a and w_d are weights used to balance angle and
distance. Combining them as a sum keeps the two measures independent of each
other, while a product can increase or decrease the effect of one measure
depending on the value of the other. The important effect of the product is that
it yields a higher difference when both attributes have similar values than when
one value is an outlier and the other is small. Since we suspect that it is better
to merge nodes or edges which are near and in a similar direction at the same
time, and that an outlier in either measure is not a good indicator for merging,
our preferred combination is the sum.
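A minimal sketch of this combination, with illustrative (not prescribed) weights; angle_diff could be the flattened-slope difference from Algorithm 1 and distance the node or edge distance in meters.

def combined_difference(angle_diff, distance, w_a=1.0, w_d=0.1, use_product=False):
    """Combine an angle difference and a distance into one local difference
    value, either as a weighted sum (preferred in the text) or as a product."""
    if use_product:
        return (w_a * angle_diff) * (w_d * distance)
    return w_a * angle_diff + w_d * distance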

3.3 Global Difference Measure


The global difference measure calculates a difference based on multiple nodes or
edges. There are distance measures which calculate a distance between itineraries;
the ones we take into account are the Fréchet distance [7] and the Hausdorff
distance [9]. We would like to distinguish between distance and difference: we
use distance for the underlying distance measure and difference for the combina-
tion of the distance with other measures, such as the angle, that describe a
difference. As in the local difference measure, we would like to include the angle
in the global difference measure.
One reason to include the angle in the difference measure is to avoid merging bridges
or tunnels. Figure 5 shows three GPS traces that could have been logged if the un-
derlying road network has two tunnels or bridges. The three traces were recorded
with different window sizes or different speeds, so that the distances between their
nodes differ. If we only took a distance measure into account, all edges which lie
completely within the gray area would be considered for merging. In this case the
red dotted lines in the gray box would be merged, which is not desired if the traces
indicate a tunnel or a bridge, as in this example.
We first consider integrating the Fréchet distance into our global difference mea-
sure. In order to use the Fréchet distance for merge decisions, we would expand the
matched part as long as a distance below ε is maintained. We would search for one
point of the aggregation and one point of the single trace which have a distance
equal to or below ε. Next, we check for connections to this point. First, we consider
the first connection and try to expand the trace we want to merge. While the trace
we want to merge grows, we repeat this expansion step.
In this expansion step we can integrate a hard threshold for the angle. Also, we
can replace ε by a combined difference measure. Initially, this raises the question
of how we would measure an angle difference between two traces instead of two
edges. Considering the example with bridges or tunnels given earlier (see
Figure 5), we would like to exclude differences in angle which are valid for the
complete trace. That is why we only need to care about the angle from the start

Fig. 5. Three GPS traces which can occur when the road network contains two tunnels
or bridges

to the end point of a trace and do not need to consider all the angles in between.
Angle and Fréchet distance can be combined into a global difference measure
in the same way as in the local difference measure (see Section 3.2). The
Hausdorff distance could replace the Fréchet distance, but this will be evaluated
in future work.
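As an illustration of a trace-to-trace distance for the selection step, the following sketches the discrete Fréchet distance via the standard dynamic program; the paper refers to the (continuous) Fréchet distance [7], and the discrete variant (used, e.g., in [10]) is only an approximation of it, so this is not the authors' implementation.

import math

def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines p and q,
    each given as a list of (x, y) points."""
    def d(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = d(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = cost
            elif i == 0:
                ca[i][j] = max(ca[i][j - 1], cost)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][j], cost)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), cost)
    return ca[n - 1][m - 1]

A selection rule could then be: consider a trace for merging only if discrete_frechet(agg_points, trace_points) <= eps, where eps is the threshold ε discussed above.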

4 Merging of Trajectories

After the traces that should be merged have been chosen, the actual merging
starts. Because we propose an incremental approach for building our trace net-
work, we want to store how many traces have already influenced the current
aggregation. This allows us to continuously improve the aggregation while pre-
venting it from adjusting excessively to noisy single traces. For this reason, we
added an attribute that stores the number of traces which have already influ-
enced the aggregation. We also added this information to the GPX data format
[20] as an extension.
A merging approach has to take into account that not only two edges but
also whole traces might be joined, and that it is not favorable to just adjust the
start and end points of the identified edges. For example, Figure 6 (derived from
Figure 3) shows a merge of the aggregation and a single trace in which only the
two edges were taken into account. The dashed green trace shows the new
aggregation. We can observe that the formerly smooth aggregation became
noisy, which represents a bad merging process. Consequently, we also need to
take parts of edges into account for merging.
In order to take parts of edges into account when merging, we can use
the points found via the perpendiculars (see also Figure 2). If we only consider
the points whose perpendiculars actually hit the other edge, together with the
points found via these perpendiculars, we can avoid noisy merges. The inner
points which should be merged are shown in Figure 7.

Fig. 6. Noise induced by merging two edges which differ in length

Figure 8 shows the modified merging process, which takes parts of edges into
account for the scenario in Figure 6 and is based on the points found via the
perpendiculars. It is a better merging result because it does not produce noise;
it just takes the new information from the trace into account.

Fig. 7. Highlighted merging area: two directed edges with extensions and perpendicu-
lars marking the crossings of the perpendiculars with the edges

Other merging approaches are possible. In [10], an approach is described that
divides distances within the geometries (in our case, traces). This approach
also seems promising when used with the Fréchet distance. One problem with
the approach of using perpendiculars to find points arises over many merge
iterations. The same reason that led us to split the single trace in the example
above (see Figures 6 and 8) also leads us to split the aggregation. As this process
is repeated, the distance between points is reduced. After some iterations, a
cleaning step might become necessary in order to keep storage use low. An
approach based on divided distances, which keeps the distances constant, would
not lead to reduced distances between points and thus would not require a
cleaning step.
Another aspect of the merging process is the evolution of the aggregation
over several merges. The aggregation becomes more stable as more traces par-
ticipate. In order to include this aspect, each edge in the aggregation carries a
weight that depends on how many traces have already influenced it:

Fig. 8. Merging result of two edges which differ in length, taking parts of edges into
account

    nn = (we ∗ ng + nt) / (we + 1)

where nn is the newly added node, we is the current weight of the edge, ng is
the ghost node (the projection of the trace node onto the aggregation, see
Section 5), and nt is the node of the trace which is merged into the aggregation.
After the merge, the weight is increased by one.
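A small sketch of this weighted merge, applying the formula component-wise to the node coordinates (our reading; the names follow the symbols in the formula above):

def merge_node(ghost_node, trace_node, edge_weight):
    """Weighted position of the new aggregation node nn: the ghost node keeps
    a weight proportional to the number of traces that already influenced the
    edge, while the incoming trace node counts once."""
    gx, gy = ghost_node
    tx, ty = trace_node
    nx = (edge_weight * gx + tx) / (edge_weight + 1)
    ny = (edge_weight * gy + ty) / (edge_weight + 1)
    return (nx, ny), edge_weight + 1  # new position and increased weight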

5 Implementation

In order to evaluate different methods for iterative map construction (via ag-
gregation), we implemented one aggregation based on a local difference measure
and one based on a global difference measure.
The implementation based on the local difference measure is a complete im-
plementation capable of processing a set of GPX traces and creating a map in
OpenStreetMap XML format [21]. The steps are cleaning, aggregation and road
generation.
In the cleaning step, we first remove errors which are typical for GPS, e.g., points
that deviate strongly from the actual position when the GPS receiver re-initializes.
To handle this, we remove points that seem impossible to reach. Furthermore,
we remove points which go backwards for short distances, which usually occurs
when a car stops at a traffic light and the GPS position varies around the actual
position. Next, we apply the Ramer-Douglas-Peucker filter [22, 23].
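For reference, a compact recursive sketch of the Ramer-Douglas-Peucker simplification [22, 23]; it treats coordinates as planar, which is an approximation for latitude/longitude input, and the tolerance epsilon is an assumed parameter, not a value prescribed in the paper.

import math

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification of a polyline given as a list of
    (x, y) points; returns the simplified list of points."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    seg_len = math.hypot(x2 - x1, y2 - y1)

    def dist(p):
        # perpendicular distance of p to the chord between the end points
        if seg_len == 0:
            return math.hypot(p[0] - x1, p[1] - y1)
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / seg_len

    idx, dmax = max(((i, dist(p)) for i, p in enumerate(points[1:-1], 1)),
                    key=lambda t: t[1])
    if dmax > epsilon:
        return rdp(points[:idx + 1], epsilon)[:-1] + rdp(points[idx:], epsilon)
    return [points[0], points[-1]]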
The aggregation step includes the sub-steps selection and merge. In this imple-
mentation, we deviate from our proposed scenario: the scenario would always add
one trace to the aggregation, so cleaning and aggregation would be performed per
trace. This implementation focuses on the evaluation of the aggregation
performance, so all traces are first cleaned and then added to the aggregation.
The overall result is the same. The implementation currently uses a node-to-node
difference which is expanded incrementally in both directions for all nodes in the
aggregation and all nodes in the single trace. This means that the selection is done
completely before the merging. In order to merge, the marked points are projected
via the perpendicular onto the aggregation, as shown in Figure 8. We call the
nodes found this way ghost nodes. The weight of the aggregation influences how
far the ghost node is moved.
The aggregation already includes nodes with more than two edges. These
nodes are created when a new trace can be partially added to the aggregation
but parts of it cannot be matched anywhere. This is an important precondition
of the road generation.

Fig. 9. Stages of the road generation process within agg2graph [24]: (a) empty map;
(b) input data (blue) and cleaned data (green); (c) aggregation; (d) road network

The road generation identifies crossings based on the nodes which connect
more than two edges. Furthermore, the road generation identifies road classes:
primary, secondary, and tertiary. They are identified based on the variance of
the matches: if many traces influenced an edge in the aggregation with a high
variance, it is more likely to be a main street with multiple lanes [24].
Figure 9 shows the different stages within the agg2graph software. All trajec-
tories are shown with an arrow indicating their direction. The test case shown
here is an urban area in Berlin. The data was gathered from OpenStreetMap
GPS traces [25].
The other implementation addresses the global difference measure, where we
use the Fréchet distance to select traces to be merged. This implementation
focuses on the aggregation. Nevertheless, within the aggregation it also creates
nodes with more than two edges, which is a prerequisite for the first im-
plementation. Besides the selection, it also evaluates different merging strategies.

6 Evaluation
We evaluated our measures graphically and statistically. We first distinguish
between the evaluation of the two implementations because, depending on the
implementation, we have to use different evaluation criteria. Both evaluations
use the GPS trace set provided by OpenStreetMap [25].

6.1 Local Difference Measure


The evaluation of the local difference measure was done for a rural as well
as an urban scenario. The rural scenario is Bevern, with a bounding box
from 52.7438049 N, 7.9694866 E to 52.7062756 N, 8.0461723 E. The urban sce-
nario is in Berlin, with a bounding box from 52.5143927 N, 13.2676005 E to
52.5199552 N, 13.2817841 E. In both scenarios we measured how the statistical
data varies when reducing the confidence of the road network. We vary the param-
eter "confidence of the road network", which is the minimum number of traces
each node or edge has to be influenced by; e.g., if the confidence is set to 2, only
those edges or nodes which were influenced by at least 2 traces are part of the
road network. Table 2 shows the results for both scenarios and confidence levels
from 1 to 3. We have to mention the limitation that these results cannot
indicate a good or bad performance of the local difference measure because they
are not comparable to another measure. The results can, however, be explained
logically; e.g., the reduction in total length goes along with higher confidence.
This shows that the whole system provides reasonable results.

6.2 Global Difference Measure


The evaluation of the global difference measure was done visually by comparing
the input traces to the computed aggregation.

Table 2. Statistical results for the use of the local difference measure in an urban and
a rural scenario

                                           rural                  urban
confidence                             1      2      3        1      2      3
total length of road network (m)   83451  23319  10352    14405   6271   2501
average street length (m)            488    496    863      141    179    208
number of streets                    171     47     12      102     35     12
number of crossings                   91     20      3       51     21      7

Figure 10 shows the GPS traces of a highway crossing and the aggregation
computed on the basis of these traces. As a difference measure, the Fréchet
distance was used, ignoring angle differences. The merge strategy took parts
of edges into account. The two green arrows show a shortcoming of the selec-
tion of ε for the Fréchet distance: in this case it was chosen too high, so that
an independent part of the road was merged with traces of the aggregation
which lie on another part of the road. If it had been chosen lower, this part
would have been detected as a separate road, but other parts that actually
belong together might also have been separated.

Fig. 10. Merging result of a highway crossing [26], traces are green, aggregation is red

7 Conclusion
In this paper, we showed the different stages of an iterative map construction ap-
proach. This is a basis for a privacy-preserving collection process of GPS traces.
For every stage of this approach we showed alternative methods, i.e., distance and
angle measures resulting in difference measures, and we pointed out important
challenges in the merging stage. Our implementation includes a small selec-
tion of these methods. The implementation is part of the open source project
agg2graph [27]. In the evaluation we were able to show the applicability of this
approach as well as to give a short insight into possible further evaluation: increas-
ing the confidence and measuring statistical data is comparable to increasing k
in k-anonymity. Evaluation of this approach should be done both statistically and
visually in order to measure quality, but also to detect shortcomings, as we
detected a "forgotten" road segment caused by choosing ε too high.
Future work should mainly include the implementation of more measures in
an integrated environment in order to provide comparable evaluation results. In
order to create maps for special purposes, we consider extending the distance
measure with altitude variations. In order to compare different global difference
measures, we will implement and evaluate the Hausdorff distance. We plan to
implement further merging strategies to evaluate their influence on the overall
outcome. All our new implementations will be part of the agg2graph project
for better comparability and integration. It would be interesting to compare our
results with other methods for computing subtrajectories and to evaluate them
against each other. We would like to enhance the results by using existing
smoothing techniques,
like kernel smoothers [28], smoothing splines [29], Kalman filters [30], and other
statistical smoothing approaches [31]. Another interesting extension seems to be
a spatio-temporal approach [32, 33]. It could be used to construct maps not only
for different vehicles, but also for different times of day, e.g., for better
navigation at a certain time of day. We will consider data privacy issues in
further implementations, e.g., by implementing k-anonymity [34]. We also want
to look into different approaches to provide a distributed system, e.g., a client-
server architecture. Finally, we plan to integrate a map comparison to a map
from OpenStreetMap in order to evaluate the correctness of our road network.

Acknowledgments. The authors wish to thank the students who participated
in the prototype, and more precisely Johannes Mitlmeier, Jens Fischer, and
Franz Gatzke. The research leading to these results has received funding from
the European Union Seventh Framework Programme - Marie Curie Actions,
Initial Training Network GEOCROWD (http://www.geocrowd.eu) under grant
agreement No. FP7-PEOPLE-2010-ITN-264994.

References
1. Canalys: Sony and HTC overtake RIM and Nokia in smart phones (2012),
http://www.canalys.com/newsroom/sony-and-htc-overtake-rim-and-nokia-
smart-phones
2. Association for Computing Machinery: ACM digital library (2013),
https://dl.acm.org/
3. Yuan, J., Zheng, Y., Zhang, C., Xie, W., Xie, X., Sun, G., Huang, Y.: T-drive:
driving directions based on taxi trajectories. In: Proceedings of the 18th SIGSPA-
TIAL International Conference on Advances in Geographic Information Systems,
GIS 2010, pp. 99–108. ACM, New York (2010)
52 S. Müller, P. Mehta, and A. Voisard

4. Evans, M.R., Oliver, D., Shekhar, S., Harvey, F.: Summarizing trajectories into k-
primary corridors: a summary of results. In: Proceedings of the 20th International
Conference on Advances in Geographic Information Systems, SIGSPATIAL 2012,
pp. 454–457. ACM, New York (2012)
5. Andrienko, G., Andrienko, N., Giannotti, F., Monreale, A., Pedreschi, D.: Move-
ment data anonymity through generalization. In: Proceedings of the 2nd SIGSPA-
TIAL ACM GIS 2009 International Workshop on Security and Privacy in GIS and
LBS, SPRINGL 2009, pp. 27–31. ACM, New York (2009)
6. Goel, P., Kulik, L., Kotagiri, R.: Privacy aware trajectory determination in road
traffic networks. In: Proceedings of the 20th International Conference on Advances
in Geographic Information Systems, SIGSPATIAL 2012, pp. 406–409. ACM, New
York (2012)
7. Alt, H., Godau, M.: Computing the Fréchet distance between two polygonal curves.
Int. J. Comput. Geometry Appl. 5, 75–91 (1995)
8. Buchin, K., Buchin, M., Wang, Y.: Exact algorithms for partial curve matching via
the Fréchet distance. In: Proceedings of the Twentieth Annual ACM-SIAM Sym-
posium on Discrete Algorithms, SODA 2009, pp. 645–654. Society for Industrial
and Applied Mathematics, Philadelphia (2009)
9. Rockafellar, R.: Variational analysis. Springer, Berlin (1998)
10. Devogele, T.: A new merging process for data integration based on the discrete
Fréchet distance. In: Richardson, D.E., Van Oosterom, P., van Oosterom, P.J.M.
(eds.) Advances in Spatial Data Handling: 10th International Symposium on Spa-
tial Data Handling, Ottawa, Canada, pp. 167–181 (2002)
11. Zhang, L., Sester, M.: Incremental data acquisition from GPS-traces. In: Geospa-
tial Data and Geovisualization: Environment, Security, and Society; Special Joint
Symposium of ISPRS Commission IV and AutoCarto 2010 in Conjunction with
ASPRS/CaGIS 2010 Special Conference. ASPRS/CaGIS 2010 (2010)
12. Cao, L., Krumm, J.: From GPS traces to a routable road map. In: Proceedings of
the 17th ACM SIGSPATIAL International Conference on Advances in Geographic
Information Systems, GIS 2009, pp. 3–12. ACM, New York (2009)
13. Buchin, K., Buchin, M., Gudmundsson, J., Löffler, M., Luo, J.: Detecting commut-
ing patterns by clustering subtrajectories. In: Hong, S.-H., Nagamochi, H., Fuku-
naga, T. (eds.) ISAAC 2008. LNCS, vol. 5369, pp. 644–655. Springer, Heidelberg
(2008)
14. Lee, J.G., Han, J., Whang, K.Y.: Trajectory clustering: a partition-and-group
framework. In: Proceedings of the 2007 ACM SIGMOD International Conference
on Management of Data, SIGMOD 2007, pp. 593–604. ACM, New York (2007)
15. Zhu, H., Luo, J., Yin, H., Zhou, X., Huang, J.Z., Zhan, F.B.: Mining trajectory
corridors using Fréchet distance and meshing grids. In: Zaki, M.J., Yu, J.X., Ravin-
dran, B., Pudi, V. (eds.) PAKDD 2010, Part I. LNCS, vol. 6118, pp. 228–237.
Springer, Heidelberg (2010)
16. Gudmundsson, J., Valladares, N.: A GPU approach to subtrajectory clustering
using the Fréchet distance. In: Proceedings of the 20th International Conference
on Advances in Geographic Information Systems, SIGSPATIAL 2012, pp. 259–268.
ACM, New York (2012)
17. Dodge, S., Laube, P., Weibel, R.: Movement similarity assessment using symbolic
representation of trajectories. Int. J. Geogr. Inf. Sci. 26(9), 1563–1588 (2012)
18. van Kreveld, M., Wiratma, L.: Median trajectories using well-visited regions and
shortest paths. In: Proceedings of the 19th ACM SIGSPATIAL International Con-
ference on Advances in Geographic Information Systems, GIS 2011, pp. 241–250.
ACM, New York (2011)
Trajectory Aggregation for a Routable Map 53

19. Scott, J.: JCoord (2013), http://www.jstott.me.uk/jcoord/


20. Foster, D.: GPX: the GPS exchange format (2013),
http://www.topografix.com/gpx.asp
21. OpenStreetMap Community: OSM XML - OpenStreetMap wiki (2013),
https://wiki.openstreetmap.org/wiki/OSM XML
22. Ramer, U.: An iterative procedure for the polygonal approximation of plane curves.
Computer Graphics and Image Processing 1(3), 244–256 (1972)
23. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points
required to represent a digitized line or its caricature. Cartographica: The Inter-
national Journal for Geographic Information and Geovisualization 10(2), 112–122
(1973)
24. Mitlmeier, J.: Generierung von Straßengraphen aus aggregierten GPS-Spuren.
Master thesis, Freie Universität Berlin (2012)
25. OpenStreetMap Community: Public GPS traces,
http://www.openstreetmap.org/traces (2013)
26. Fischer, J.: GPS track aggregation with use of Fréchet distance. Bachelor thesis,
Freie Universität Berlin (2012)
27. Müller, S.: Agg2graph (2013), http://sebastian-fu.github.com/agg2graph/
28. Hastie, T., Tibshirani, R., Friedman, J.H.: The elements of statistical learning:
data mining, inference, and prediction: with 200 full-color illustrations. Springer,
New York (2001)
29. Hastie, T.J., Tibshirani, R.J.: Generalized additive models. Chapman & Hall, Lon-
don (1990)
30. Welch, G., Bishop, G.: An introduction to the Kalman filter. Technical report,
Chapel Hill, NC, USA (1995)
31. Chazal, F., Chen, D., Guibas, L., Jiang, X., Sommer, C.: Data-driven trajectory
smoothing. In: Proceedings of the 19th ACM SIGSPATIAL International Con-
ference on Advances in Geographic Information Systems, GIS 2011, pp. 251–260.
ACM, New York (2011)
32. Buchin, M., Driemel, A., van Kreveld, M., Sacristán, V.: An algorithmic framework
for segmenting trajectories based on spatio-temporal criteria. In: Proceedings of
the 18th SIGSPATIAL International Conference on Advances in Geographic In-
formation Systems, GIS 2010, pp. 202–211. ACM, New York (2010)
33. Xie, K., Deng, K., Zhou, X.: From trajectories to activities: a spatio-temporal join
approach. In: Proceedings of the 2009 International Workshop on Location Based
Social Networks, LBSN 2009, pp. 25–32. ACM, New York (2009)
34. Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzzi-
ness Knowl.-Based Syst. 10(5), 557–570 (2002)
A Study of Users’ Movements Based on Check-In Data
in Location-Based Social Networks

Jinzhou Cao1, Qingwu Hu1,*, and Qingquan Li2,3


1 School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, P.R. China
{caojinzhou,huqw}@whu.edu.cn
2 Shenzhen Key Laboratory of Spatial Smart Sensing and Services, Shenzhen University, Shenzhen 518060, P.R. China
3 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, P.R. China
liqq@szu.edu.cn

Abstract. With the development of GPS technology and the increasing popular-
ity of mobile devices, Location-based Social Networks (LBSN) have become a
platform that promotes the understanding of user behavior and offers unique
conditions for the study of users' movement patterns.
Characteristics of users' movements can be expressed by the places they have
visited. This paper presents a method to analyze characteristics of users'
movements in the spatial and temporal domains based on data collected from the
Chinese LBSN Sina Weibo. We analyze spatial characteristics of users'
movements by clustering geographic areas through their check-in popularity.
Meanwhile, temporal characteristics and the variation of users' movements on the
timeline are analyzed by applying statistical methods.

Keywords: Check-In, Location-based Social Networks, Users’ movements.

1 Introduction

Improved means of geographic data acquisition and the rise of mobile Internet
technology make it possible to create location data in social networks anytime and
anywhere. Social networks driven by geographic location are called Location-based
Social Networks (LBSN). This kind of network not only adds a location to an
existing social network, but also generates a knowledge base inferred from an
individual's location (history) and location-tagged data, e.g., common interests,
behavior, and activities [1]. For instance, a user whose trajectory often appears at
a stadium might like sports; a trajectory that frequently crosses the countryside
shows a preference for outdoor activities.
LBSN has become a platform to promote the understanding of user behavior,
which offers unique conditions for the study of users' movement patterns. Hence,
how to take full advantage of the huge amount of geographic data generated in
LBSN to mine knowledge becomes particularly important.


Mobile social networking services have received attention from many scholars in
recent years. In early years, most studies were based on non-geospatial networks
and the impact of geographical space was ignored. However, follow-up studies
suggest that geographical space plays a constraining role on social networks and
that many complex networks are embedded in it [2]. Zheng et al. mined
recommendable locations and representative activities from a large amount of GPS
trajectories to provide a roadmap for travelers [3]. Liang et al. proposed a way,
based on the study of check-in data, to help urban public space managers improve
the spatial arrangement and operation of urban space at lower cost and higher
efficiency [4].
Unlike traditional GPS data, which are collected passively, data generated by
LBSN are characterized by large volume, high efficiency, and a high degree of
socialization. As a result, the subjective desires of users, such as interests and
habits, are well reflected. Hence, we argue that if location check-in data is fully
mined, a higher level of knowledge and information can be obtained, e.g.,
understanding the similarity between users based on their location histories [5].
Commercial social media themselves actively analyze users' check-in records to
recommend and push advertisements in order to create new profits [6].
Characteristics of users' movements can be expressed by the places they have
visited. In this paper, we present an approach to analyzing users' daily movement
patterns from a spatial and temporal perspective using check-in data from Sina
Weibo, one of the most popular social networks in China. First, we provide a
general overview of the dataset collected from Sina Weibo and briefly analyze the
spatial and frequency distribution of the data. Then, we introduce the principles
and methods for the spatial modeling analysis and the temporal statistical analysis
of users' movement patterns. After that, we collect data for specific regions and
users through the Sina API interface and conduct experiments. The results are
analyzed and discussed. Finally, we conclude with a discussion and highlight
directions for future work.

2 Location Check-In Dataset


Social behavior is directly related to location in users' daily lives. When a user
arrives at a place (e.g., a restaurant or gymnasium), he or she will usually engage
in the activities of this place (e.g., eating or fitness). Nevertheless, we need large
data sources to study the statistical characteristics and confirm that this
correlation is not accidental.
Sina Weibo is a Chinese microblogging website, a hybrid of Twitter and Facebook
with a market penetration similar to what Twitter has established in the USA.
Users check in at places through a dedicated mobile client that uses GPS and other
sensing technologies to automatically detect their location and post on the Sina
Weibo platform. It has more than 0.5 billion registered users as of 2013, 57% of the
total number of microblogging users in China, and the number of daily active users
has reached more than 60 million, with frequent information updates, which
provides a strong data foundation [7].

Moreover, more than 600 million check-in records have accumulated in Sina
Weibo. Most of the records are in three major cities in China (Beijing, Shanghai,
and Guangzhou), and about 60% of them are restaurant spots and 20% scenic
spots, which confirms the relationship between users' check-in activities and their
movements.

Fig. 1. Sina Weibo mobile client check-in interface

Previous research often used only two attributes of check-in data (geographic
coordinates and timestamp), without more detailed information to support the
analysis. The Sina Weibo API, however, provides location service interfaces freely,
and we can acquire various attributes about a place: name, category, geographic
coordinates, total number of check-ins, number of visitors who checked in, etc.
Thus, it can meet the needs of multi-level and multi-angle analysis and processing.
We crawled data in Shanghai, China, between January 1st, 2013 and March 31st,
2013. Because the data are generated voluntarily by users, data quality issues, such
as low accuracy, data redundancy, and incorrect formatting, should be taken into
account [8]. Thus, preprocessing is necessary to obtain standard data. After
preprocessing, we selected 1,514,470 check-ins. Each record corresponds to a
check-in at one of 34,963 POIs. The spatial distribution of the collected dataset is
depicted in Fig. 2. A circle represents a geographic venue and its radius the
popularity of the venue in units of the number of check-ins. Each color corresponds
to one of the 10 categories shown in Table 1. The spatial distribution of the dataset
highlights the diversity of users' movements.

Fig. 2. Spatial distribution of collected dataset in Shanghai

The number of check-ins is an indicator of the popularity of places among users
[9]. The complementary cumulative distribution function (CCDF) of the number of
check-ins at different places is shown in Fig. 3: there is a significant heavy tail in
the distribution and the data approximately exhibit a log-normal distribution. Only
a few places have a large number of check-ins, while a higher number of places have
only few check-ins; about 20% of places have just one check-in, with 30% above 10,
whereas around 50% of places have more than 100 check-ins. This reflects the
heterogeneity in users' movements, and the reasons behind it can be many, ranging
from subjective ones (e.g., forgetting to check in at a place) to social ones (e.g.,
sharing a location with others). Checking in has always been voluntary rather than
mandatory, which is why the characteristics of users' check-ins are a good indicator
of users' movements.

Fig. 3. Complementary cumulative distribution function (CCDF) of the number of
check-ins at different places. The data approximately exhibit a log-normal distribution.

3 Users’ Movement Pattern Analysis

People can be profiled according to the categories of places they visit, whereas geo-
graphic areas can be modeled according to their constituent venues. In this section
we model users' movement patterns by clustering geographic areas through their
check-in popularity. In particular, we propose the use of place categories to create
a squared-area feature vector, define the similarity measurement, and then apply
the spectral clustering algorithm [10]. In addition, we analyze temporal patterns of
users' movements by applying statistical methods in order to demonstrate the
characteristics and variation on the timeline. The flow chart is shown in Fig. 4.

Fig. 4. The flow chart of users' movement pattern analysis: check-in data collection via
the Sina API, data preprocessing, squared area division and check-in data modeling,
square vector similarity measurement, spectral clustering, and daily/weekly temporal
statistical analysis



3.1 Spatial Modeling of Users' Movements

Squared Areas Division. Dividing the space into squared areas effectively is a basis
for subsequent operations. The square size of each area is an important factor to
consider. If the size is too large, its check-in records may contain multiple
categories, so the characterization of the area is hard to determine. On the contrary,
if it is too small, the amount of data inside the area may be too small to generate a
reasonable statistical representation. We set a threshold on the number of check-ins
per area and from it calculate a reasonable square size and number of areas.
An area of 158 square kilometers in the center of Shanghai was chosen as the
dataset for the experiment. Imposing the threshold of at least 30 check-in records
per area generated 559 areas. The spatial distribution of the squared areas is shown
in Fig. 5.

Fig. 5. Spatial distribution of squared areas. Squared areas not covered in blue contain
fewer than 30 check-in records.

Location Check-In Data Modeling. There is a need to merge and split location
categories according to the characteristics of users' movements, because the
location categories provided by Sina Weibo differ from what we need. We finally
classified the places into 10 categories, as shown in Table 1, and manually modified
the category attributes of the acquired data.

Table 1. The location category classification

1 Home        2 Work            3 Education
4 Shopping    5 Travel          6 Outdoors
7 Food        8 Life services   9 Leisure
10 Fitness

A detailed description of the location check-in data modeling is the following:
Considering a squared area A within a city, we divide A into a certain number of
equally sized
squares, each one representing a smaller local area a. The representation of a is de-
fined according to the categories of nearby places and the number of check-ins that
took place at them. In this way, not only do we know what types of places are in an
area, but we also have a measure of their importance from the perspective of users'
movements. We define the weight x(c, a) of a category c for a geographic area a,
over all places p that belong to category c within a, as follows:

    x(c, a) = Σ_{p ∈ c ∩ a} checkins(p)                                   (1)

Hence, any area a can be represented by a vector x_a whose dimensionality is the
number of classified categories and whose feature values are the x(c, a) of the
corresponding categories. In particular, x(c, a) can be normalized in order to
facilitate the analysis.
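A minimal sketch of how the per-area category vectors of Equation (1) could be built; the record layout (area id, category, number of check-ins) is an assumption for illustration and not prescribed by the paper.

from collections import defaultdict

def area_feature_vectors(checkins, categories):
    """Build, for every squared area, the vector of per-category check-in
    counts; `checkins` is an iterable of (area_id, category, num_checkins)
    records and `categories` is the fixed category list of Table 1."""
    index = {c: k for k, c in enumerate(categories)}
    vectors = defaultdict(lambda: [0.0] * len(categories))
    for area_id, category, num in checkins:
        vectors[area_id][index[category]] += num
    # optional normalisation so each area vector sums to 1
    for vec in vectors.values():
        total = sum(vec)
        if total > 0:
            vec[:] = [v / total for v in vec]
    return dict(vectors)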

Square Vector Similarity Measurement. Let X be the feature sample matrix formed
by all values of x(c, a); with a squared areas and c categories, its matrix form is
shown in Equation (2):

    X = | x_{1,1} ... x_{1,a} |
        |   ...   ...   ...   |                                           (2)
        | x_{c,1} ... x_{c,a} |

where x_{i,j} represents the number of check-ins that belong to category i within area j.
We now define the similarity s(i, j) between two square vectors i and j. Distance
measures (e.g., Euclidean distance, Minkowski distance, and Mahalanobis distance)
and similarity functions (e.g., SMC, cosine, correlation coefficient) are common
similarity measurement methods [11, 12]. Nevertheless, the similarity matrices
calculated by different formulas can be very different, and different matrices will
lead to different clustering results. For instance, Euclidean distance is commonly
used in image segmentation, and the cosine measure is often used in text data
clustering. Because cosine similarity can be used to compare vectors of any
dimensionality, especially in high-dimensional spaces, we adopt it as our similarity
measurement; see Equations (3) and (4):

    s(i, j) = cos(x_i, x_j)                                               (3)

    cos(x_i, x_j) = Σ_k x_{k,i} x_{k,j} / ( sqrt(Σ_k x_{k,i}^2) · sqrt(Σ_k x_{k,j}^2) )   (4)

The similarities between all vectors constitute the similarity matrix W, as shown in
Equation (5):

    W = | w_{1,1} ... w_{1,a} |
        |   ...   ...   ...   |                                           (5)
        | w_{a,1} ... w_{a,a} |

where w_{i,j} represents the similarity between samples i and j, equal to s(i, j).



Spectral Clustering. Traditional clustering algorithms do not take the impact of the
similarity matrix on the clustering result into consideration. The direct analysis of
the similarity matrix itself, as done in spectral clustering, largely avoids the
limitations introduced by distributional assumptions on the sample space. Spectral
clustering can, in theory, cluster sample spaces of arbitrary shape and has been
widely applied to speech recognition, text mining, and other fields [13].
Spectral clustering views samples as vertices, and the similarity between two
samples as a weighted edge. From this point of view, the clustering problem is
converted into a graph partitioning problem: divide the graph into groups so that
the weight of edges between groups is as low as possible (i.e., the similarity between
groups is low) and the weight of edges within groups is as high as possible (i.e., the
similarity within a group is high) [14].
In this paper, we treat each squared area as a vertex in the graph. The graph is
generated by connecting the vertices according to the similarities between squared
areas. We then divide the graph into groups, and each group is a cluster. The
detailed steps are listed as follows (a minimal code sketch is given after the list):

1. Create the similarity graph from the squared areas and generate the weight
   matrix W.
2. Compute the Laplacian matrix L by Equation (6), in which D is the degree
   matrix:

       L = D − W                                                          (6)

3. Compute the k smallest eigenvectors of L.
4. Combine the k eigenvectors into an N × k matrix, in which every row is a
   k-dimensional vector. Finally, run the k-means algorithm to cluster these rows
   and obtain the result [15].
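A minimal sketch of steps 1 to 4, assuming the feature matrix has one row per squared area (the transpose of Equation (2)) and using cosine similarity as in Equations (3) to (5); this is not the authors' implementation.

import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, k):
    """Cluster the rows of X (shape: n_areas x n_categories) into k groups:
    cosine similarity graph, unnormalised Laplacian L = D - W, k smallest
    eigenvectors, then k-means on the rows of the eigenvector matrix."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    Xn = X / norms
    W = Xn @ Xn.T                      # cosine similarity matrix (Eq. 3-5)
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))         # degree matrix
    L = D - W                          # Eq. (6)
    _, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    U = eigvecs[:, :k]                 # k eigenvectors of the k smallest eigenvalues
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)

Calling spectral_clustering(X, 8) would correspond to the eight-cluster setting reported in Section 4.1.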

3.2 Temporal Statistical Analysis of Users’ Movements


Characteristics of users' movements are largely associated with time. Temporal
patterns of check-in data can be obtained by statistical analysis of the check-ins'
time attribute, and they are presented as the temporal characteristics and variation
of users' movements on the timeline. Generally, statistical analysis over time can be
conducted in two different temporal bands, day and week [16, 17].
Generally speaking, users' dining and sleeping behaviors are daily-dependent:
these kinds of activities take place each day and are closely related to the time of
the day. Thus we can conduct statistical analysis on daily-dependent behavior
separately for the corresponding location categories. Meanwhile, users' working and
entertainment behaviors are weekly-dependent: users show different behavior on
weekends and weekdays. Because of this, users' weekly-dependent behaviors are
analyzed per week.
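A small sketch of the daily and weekly statistics described above; the timestamp format and the record layout are assumptions made for illustration.

from collections import Counter
from datetime import datetime

def temporal_histograms(checkins):
    """Hour-of-day and day-of-week check-in frequencies (in percent) from an
    iterable of (timestamp_string, category) records."""
    hourly, weekly = Counter(), Counter()
    total = 0
    for ts, _category in checkins:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")  # assumed format
        hourly[t.hour] += 1
        weekly[t.strftime("%a")] += 1
        total += 1
    if total == 0:
        return {}, {}

    def to_pct(counter):
        return {key: 100.0 * value / total for key, value in counter.items()}

    return to_pct(hourly), to_pct(weekly)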

4 Experimental Results and Analysis


4.1 Area Clustering Results
We now present the results of clustering the 559 areas. Eight clusters are displayed
in different colors, as seen in Fig. 6. Each cluster is represented in Table 2 by its
top 5 categories, ranked according to their popularity among the cluster members.

Fig. 6. Spectral clustering results. The correspondence between colors and cluster
numbers is shown on the right.

A common observation from Table 2 is that each area has a dominant category,
usually with a much higher score than the second. The proportion of the top-ranked
category is more than 50% in all clusters except Cluster 1. Cluster 1 suggests the
coexistence of Food and Travel, covering the most central area of Shanghai with
many famous scenic spots, and has the highest membership amongst all clusters.
Cluster 4 may signify residential areas and ranks second amongst all clusters. These
two clusters share close to 60% of all squared areas, which is in line both with the
characteristics of urban POI categories, dominated by restaurants and residential
areas, and with the characteristics of users' movements in urban areas. It is notable
that the categories Food and Home are among the top five categories in all clusters,
which further confirms this conclusion.

Table 2. Squared area clustering. The category of Life Services is abbreviated as Life.

Cluster1 (211)     Cluster2 (36)      Cluster3 (19)      Cluster4 (172)
Food     0.379     Leisure  0.644     Outdoors 0.649     Home      0.564
Travel   0.253     Home     0.088     Work     0.116     Education 0.124
Leisure  0.084     Travel   0.079     Home     0.076     Travel    0.087
Shopping 0.081     Food     0.077     Food     0.05      Food      0.072
Home     0.068     Outdoors 0.043     Travel   0.03      Work      0.056

Cluster5 (66)      Cluster6 (25)      Cluster7 (25)      Cluster8 (5)
Work     0.507     Shopping 0.579     Life     0.549     Fitness   0.785
Food     0.126     Food     0.098     Home     0.113     Education 0.088
Home     0.106     Work     0.082     Travel   0.109     Home      0.063
Travel   0.087     Home     0.069     Food     0.08      Food      0.02
Leisure  0.052     Travel   0.064     Work     0.036     Leisure   0.018

4.2 Temporal Distribution Results

Applying statistical measures to check-ins over hours and days reveals meaningful
patterns closely related to users' movements from a temporal point of view. Fig. 7
provides a general overview of the temporal distribution of check-ins.

Fig. 7. (a) Daily and (b) weekly temporal distributions of check-ins

As depicted in Fig. 7(a), users typically check in frequently at noon and in the
evening, with most check-ins occurring between 9:00 and 23:00 and two peaks at
around 13:00 and 19:00. This is due to the fact that most POIs are related to
restaurants and food, and check-in activities are mostly concentrated around dinner
time. A related observation can be made for Fig. 7(b): users' movements related to
dining, shopping, and leisure are over-represented, and we find the highest volume
of check-ins on Saturdays and Sundays. Overall, we can see that the data are
reasonable, and no evidence contrary to common sense can be found, e.g., a higher
number of check-ins in the middle of the night or a lower number during weekends.
This ensures that the characteristics extracted from the data are meaningful.
To analyze the characteristics and variation on the timeline in more detail, we
apply statistical measures to those categories which are daily-dependent and
weekly-dependent. Fig. 8 plots the daily check-in patterns for three different
categories: Home, Food, and Work.
As can be seen in Fig. 8(a), home-related check-ins increase from 6am, reaching a
long-lasting plateau between 10am and 3pm. This may be related to the fact that
people go out for work or other activities at this time. A linearly increasing
distribution is then observed between 3pm and 11pm, which indicates that more
and more people return home to rest.
The pattern for places related to food is shown in Fig. 8(b), with two clear peaks
at 12pm and 6pm, demonstrating that users check in at restaurants at the peak
dining times, while almost no check-ins can be observed between midnight and
6am. These findings are in line with what may be expected by a human observer
and with daily living habits. A specific point to note, however, is that check-ins do
not show a continuous rise at breakfast time, between 6am and 9am. The reason
behind this pattern may be that breakfast places are mostly not fixed and people
do not stay long when buying breakfast. This also suggests that most office-goers
tend to get breakfast on their way to work rather than at breakfast restaurants.

Fig. 8. Daily temporal distributions of check-ins for different daily-dependent
categories: (a) Home, (b) Food, (c) Work

Work-related check-ins in Fig. 8(c) show a steep increase of 2% at 9am compared
with 7am, indicating the rush hour at this time. Although a drop in the growth rate
from 9am can be observed, the frequency remains high. Check-ins decrease from 2pm.
Figure 9 adds the weekly check-in patterns for three different categories: Home,
Entertainment, and Work. Check-ins related to home, as shown in Fig. 9(a), stay
relatively high throughout every day of the week with a frequency above 10%, and
the highest number of check-ins takes place at weekends with above 15%. In
contrast to these characteristics, places tagged as work, depicted in Fig. 9(c), show
a significant check-in decay during the weekend, which is in line with common
sense. Fig. 9(b) plots the variation of check-ins related to entertainment. This
distribution does not show significant patterns on weekdays, but rises sharply on
weekends, especially on Saturday.
From the discussion above, we can draw the following conclusions: The frequency
statistics of users' movements are concordant with users' daily schedules and
behavior. Daily-dependent behaviors are closely tied to eating, work, commuting,
and other daily periodic activities, and show a cyclic effect to some degree.
Weekly-dependent behaviors exhibit a weekend effect, i.e., a significant difference
in check-in frequency between weekdays and weekends, which is related to whether
it is a working or non-working day.
Finally, while a single temporal band may not be sufficient to identify unique
patterns of users' movements, we argue that multiple temporal bands can be
combined to provide an accurate and meaningful description of different users'
movement patterns [18].

Fig. 9. Weekly temporal distributions of check-ins for different weekly-dependent
categories: (a) Home, (b) Entertainment, (c) Work

5 Discussion and Future Work

As discussed in the previous section, we can reach a general consensus that LBSN
offers opportunities to easily relate users to specific real-world locations, and that
users' movement patterns can be extracted quickly by analyzing the attributes of
check-in data (e.g., category, number of check-ins). We argue that users' movements
and preferences are deeply embedded in the digital geographic space and are shared
and publicly accessible. This benefits sociologists, who can understand users'
movement patterns from data generated by LBSN, and urban scientists, who can
plan the layout of the city better.
In terms of future work we intend to improve the clustering algorithm and to
evaluate and improve the accuracy of the clustering, thereby improving the accuracy
of the analysis of users' movements. Moreover, additional semantic information
such as comments and tags could be mined more deeply. Extraction and modeling
of semantic information can allow a deeper study of the motivation behind users'
movements, the experienced quality of the movements, etc.

Acknowledgment. The authors would like to thank the National Natural Science
Foundation of China for supporting this project (Grant No. 41371377).

References
1. Zheng, Y., Zhou, X.: Computing with spatial trajectories. Springer Science+Business Me-
dia (2011)
2. Garlaschelli, D., Loffredo, M.I.: Structure and evolution of the world trade network. Phy-
sica A: Statistical Mechanics and its Applications 355, 138–144 (2005)
3. Zheng, Y., Zhang, L., Xie, X., Ma, W.: Mining interesting locations and travel sequences
from GPS trajectories, pp. 791–800 (2009)
4. Liang, L.Y., Ren, L.L., Wan, Y.H.: “LBS-based Social Network” of the Management and
Operations in Urban public Space. Information Security and Technology 7, 56–63 (2011)
5. Li, Q., Zheng, Y., Xie, X., Chen, Y., Liu, W., Ma, W.: Mining user similarity based on lo-
cation history, p. 34 (2008)
6. Zheng, Y., Zhang, L., Ma, Z., Xie, X., Ma, W.: Recommending friends and locations
based on individual location history. ACM Transactions on the Web (TWEB) 5, 5 (2011)
7. Wikipedia, http://en.wikipedia.org/wiki/Sina_Weibo
8. Goodchild, M.F., Glennon, J.A.: Crowdsourcing geographic information for disaster re-
sponse: a research frontier. International Journal of Digital Earth 3, 231–241 (2010)
9. Scellato, S., Mascolo, C.: Measuring user activity on an online location-based social net-
work. In: 2011 IEEE Conference on Computer Communications Workshops (INFOCOM
WKSHPS), pp. 918–923 (2011)
10. Noulas, A., Scellato, S., Mascolo, C., Pontil, M.: Exploiting semantic annotations for clus-
tering geographic areas and users in location-based social networks (2011)
11. Bishop, C.M., Nasrabadi, N.M.: Pattern recognition and machine learning, vol. 1. Sprin-
ger, New York (2006)
12. Ng, A.Y., Jordan, M.I., Weiss, Y., et al.: On spectral clustering: Analysis and an algorithm.
In: Advances in Neural Information Processing Systems, vol. 2, pp. 849–856 (2002)
13. Hagen, L., Kahng, A.B.: New spectral methods for ratio cut partitioning and clustering.
IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems 11,
1074–1085 (1992)
14. Ng, A.Y., Jordan, M.I., Weiss, Y., et al.: On spectral clustering: Analysis and an algorithm.
In: Advances in Neural Information Processing Systems, vol. 2, pp. 849–856 (2002)
15. Mei, Y.C., Wei, Y.K., Yit, K.C., Angeline, L., Teo, K.T.K.: Image segmentation via nor-
malised cuts and clustering algorithm. In: 2012 IEEE International Conference on Control
System, Computing and Engineering (ICCSCE), pp. 430–435 (2012)
16. Noulas, A., Scellato, S., Mascolo, C., Pontil, M.: An empirical study of geographic user
activity patterns in foursquare. In: ICWSM 2011 (2011)
17. Aubrecht, C., Ungar, J., Freire, S.: Exploring the potential of volunteered geo-graphic in-
formation for modeling spatio-temporal characteristics of urban population. In: Proceed-
ings of 7VCT 11, p. 13 (2011)
18. Ye, M., Janowicz, K., Mülligann, C., Lee, W.: What you are is when you are: the temporal
dimension of feature types in location-based social networks. In: Proceedings of the 19th
ACM SIGSPATIAL International Conference on Advances in Geographic Information
Systems, pp. 102–111. ACM (2011)
Key Frame Selection Algorithms
for Automatic Generation of Panoramic Images
from Crowdsourced Geo-tagged Videos

Seon Ho Kim1 , Ying Lu1 , Junyuan Shi1 , Abdullah Alfarrarjeh1 ,


Cyrus Shahabi1 , Guanfeng Wang2 , and Roger Zimmermann2
1 Integrated Media Systems Center, Univ. of Southern California, Los Angeles, CA
2 School of Computing, National University of Singapore, Singapore 117417
{seonkim,ylu720,junyuans,alfarrar,shahabi}@usc.edu,
{wanggf,rogerz}@comp.nus.edu.sg

Abstract. Currently, an increasing number of user-generated videos


(UGVs) are being collected – a trend that is driven by the ubiquitous
availability of smartphones. Additionally, it has become easy to contin-
uously acquire and fuse various sensor data (e.g., geospatial metadata)
together with video to create sensor-rich mobile videos. As a result,
large repositories of media contents can be automatically geo-tagged at
the fine granularity of frames during video recording. Thus, UGVs have
great potential to be utilized in various geographic information system
(GIS) applications, for example, as source media to automatically gen-
erate panoramic images. However, large amounts of crowdsourced media
data are currently underutilized because it is very challenging to manage,
browse and explore UGVs.
We propose and demonstrate the use of geo-tagged, crowdsourced mo-
bile videos by automatically generating panoramic images from UGVs for
web-based geographic information systems. The proposed algorithms lever-
age data fusion, crowdsourcing and recent advances in media processing to
create large scale panoramic environments very quickly, and possibly even
on-demand. Our experimental results demonstrate that by using geospa-
tial metadata the proposed algorithms save a significant amount of time
in generating panoramas while not sacrificing image quality.

Keywords: Geo-tagged videos, crowdsourcing, key frame selection,


geospatial metadata, panorama.

1 Introduction

A number of trends have recently emerged around mobile video. First, we are
experiencing enormous growth in the amount of mobile video content that is be-
ing collected with handheld devices. Second, the continuous fusion of geo-spatial
metadata with video frames at a fine granular level (e.g., frames) has become
feasible and transparent for the end user, leading to the concept of sensor-rich
mobile videos [1]. However, even though these correlated data are now available,


the browsing and exploration of large video repositories still present tremendous
challenges, but also great opportunities. In particular, the utilization of such
plentiful data for the generation of new visual information for GIS applications,
such as panoramic images, has not been studied much. Since web-based GIS
applications increasingly integrate panoramic images for, e.g., situation awareness,
there exists a need to quickly and easily capture dynamically changing environments.
This research studies how to effectively utilize the geospatial metadata
for the automatic generation of panoramic images from UGVs.
Conventional systems for generating panoramic images generally fall into two
categories: 1) images are collected with professional equipment, pre-processed,
and then presented as panoramic images (e.g., Google Street View); or 2) the
data is crowdsourced (also referred to as user-generated-videos or UGVs) with
a wide variety of mobile devices, i.e., a very heterogeneous set of hardware and
software.
The professional approach has the advantage of a relatively uniform quality
of the media material. However, this comes with the drawback of data only
being available in the most popular cities and areas, and information being
refreshed only at very long intervals (i.e., years between updates). Crowdsourced
information, on the other hand, can be continuously updated and hence can be
very “fresh” and available under a variety of conditions (e.g., day and night,
or during specific events). Hence, more lively and informative images might be
provided to GIS.
However, panorama generation from UGVs faces the following challenge: the
camera positions, trajectories, and view directions of UGVs are determined by
individual users. Such videos are not usually captured with panorama genera-
tion in mind. To overcome this issue, we leverage another technological trend.
Current smartphones contain sensors that can capture the geographic properties
of the recorded scene, specifically the camera position (GPS receiver) and the
viewing direction (digital compass). We address the above challenge by propos-
ing a new approach that makes effective use of crowdsourced mobile videos and
their associated metadata. The key idea is to cross-fuse spatial, temporal, vi-
sual and other crowdsourced data to enable new, up-to-date, and exploratory
applications. Specifically, we describe a use case of leveraging sensor-rich videos
for the automatic generation of panoramic images from user generated mobile
videos. The main contribution of our work is a set of spatial selection algorithms
of key frames from multiple geo-tagged videos to reduce the processing time re-
quired for panorama generation without loss of image quality. Thus, the achieved
efficiency enables very scalable, user-driven solutions.
Please note that we are not focusing on specific image stitching techniques for
panorama generation in this study. Rather, we demonstrate how to intelligently
select the most relevant input image set using spatial metadata before applying
commercial or open source stitching techniques. Our hypothesis is that well
prepared input image datasets are critical for reducing the processing time of
any stitching techniques and enhancing the quality of the resulting images. Our
approach is to effectively select a complete image set that covers all directions
(in order) with proper overlaps for stitching between adjacent images. Many
conventional methods to select input images for such purposes struggle due to the
lack of automatic filtering. Even though photos and videos can be location-tagged
with some commercial cameras, the result is usually just one geo-coordinate
even for a long mobile video. In practice this is not sufficient and we therefore
leverage fine-grained geo-tagged mobile videos, a concept which was introduced
previously [1], [2].
We propose key frame selection algorithms for two different types of panorama
images: point and route panoramas. Experimental results show that our ap-
proach can achieve a 20 to 30 times faster processing time than a naive baseline
approach while providing comparable or better panoramic image quality. Ad-
ditionally, geo-tagging, key frame selection, and stitching can be automatically
pipelined for a quick generation of panoramic environments.
The remaining parts of this paper are organized as follows. Section 2 surveys
techniques related to our work. Section 3 describes the proposed algorithms
followed by the experimental results in Section 4. Finally, Section 5 concludes
the study.

2 Related Work
Generating panoramic images has been explored extensively in the fields of com-
puter vision and multimedia in the context of omnidirectional cameras [3], hand-
held cameras [4], mobile phones [5], or web videos [6], [7]. Some vision-based
techniques generate spherical panorama around a fixed point [8] and others cre-
ate a panorama along a line or route to show a consecutive view along the
path [9] [10]. Regardless of the source device, panoramas can be synthesized
from images [8], [11], [12] or videos [13], [14], [7]. To avoid stitching all video
frames, which typically contain significant redundancy and hence result in a
long processing time, a set of approaches were proposed [15], [16], [17] to se-
lect key frames from videos as input to panorama generation algorithms. Some
methods [15], [16] adaptively identify key frames based on the number of tracked
feature points and the amount of image-to-image overlap. Fadaeieslam et al. [17]
use a Kalman filter to predict the overlap area between each frame and its pre-
vious key frame. Most existing selection techniques in the literature work only
on one video source and assume that video frames are spatially adjacent. In
addition, they find the common feature points between frames to choose a set
of representative key frames. However, our study proposes a novel way to select
key frames from multiple videos purely based on the overlap of contextual ge-
ographical metadata that is associated with videos, which enables a far faster
generation of panoramic images without degradation of image quality.
This work is complementary to our earlier work employing geo-tagged videos.
For instance, Zhang et al. [18] used the concept of crowdsourced geo-tagged
videos to create video summarizations along a geographical path and Arslan Ay
et al. [1] proposed a search approach for large volumes of videos by considering
videos tagged with geo-metadata. Additionally, Kazemi et al. [19] studied max-
imizing the task assignment problem in spatial crowdsourcing and the proposed
techniques can be used to ask a set of workers to record geo-tagged videos in spe-
cific locations. These methods can be combined together to form an integrated
system such as MediaQ [2].

3 Framework and Algorithms


To generate panoramas from UGVs, we use a two-pass approach. The first pass is
to select a near-minimum number of key video frames among the UGV dataset.
The second pass is to use the selected video frames for panorama generation
with freely available or open-source software packages (e.g., Autostitch [20]),
which use content-based processing techniques. To accelerate the process of
panorama stitching, we focus on the first pass, termed Geo-Pre-Selection, i.e.,
pre-selecting the near-minimum number of frames from large-scale UGV sets
based on their geo-information while still generating comparable (or even better)
quality panoramas, compared with the panoramas generated without Geo-Pre-
Selection. We are motivated by the following two objectives:
1. Acceleration of panorama stitching in the second pass. Panorama stitching
involves a pipeline of complex algorithms for feature extraction, feature
matching, image selection, adjustment, and blending, etc., of which image
adjustment is the most time consuming component. To the best of our knowl-
edge, the time complexity of the classical image adjustment algorithm [21]
is cubic in the number of images, and cannot scale to process a large set of
videos with millions of frames as input.
2. Improving the quality of the generated panoramic images. Consecutive frames
in a video typically have large overlap. Too much overlap between two ad-
jacent video frames not only increases the unnecessary computational cost
with redundant information [22], but also impacts blending effectiveness and
thus reduces the panorama quality.

3.1 Preliminaries

Let V be a video dataset. For a video v ∈ V, each video frame in v is denoted
as fi at time ti. As shown in Figure 1, the scene of video frame fi is represented
in a 2D Field-of-View (FOV) model with four parameters (p, θ, R, α): p is the
camera position consisting of the latitude and longitude coordinates read from
the GPS sensor, θ is the angle of the view direction d with respect to north
obtained from the digital compass sensor, R is the maximum visible distance at
which an object can be recognized, and α is the visible angle obtained based on
the camera and lens properties at the current zoom level.

Fig. 1. 2D Field-of-View (FOV) model

Let F be the video frame set {f | ∀f ∈ v, ∀v ∈ V}, i.e., all the video frames of
all the videos in V are treated as one large video frame set F. Consequently, the
video frame selection is transformed into the task of FOV selection. Thus, the
Geo-Pre-Selection problem addressed in this paper is, given an FOV dataset F,
to select a subset of F with a near-minimum number of FOVs, such that the
quality of the panorama generated from the subset is comparable to or better
than that of the panorama generated without Geo-Pre-Selection.

3.2 Selection Criteria

The Geo-Pre-Selection problem presents two main challenges: (1) what are the
FOV selection criteria based on the geo-metadata of videos? and (2) how should
the selection algorithms be designed based on these criteria to minimize the
number of selected FOVs as much as possible? The selection criteria fall into
the following cases:

– Criteria 1: The camera locations of the selected FOVs should be as close
as possible to the query object (e.g., a point, a route). Obviously, FOVs
whose camera locations are far away from the specified object would not be
selected.
– Criteria 2: Every two adjacent selected FOVs should have appropriate over-
lap. Specifically, too much image overlap results in distortions and excessive
processing for stitching, while too little image overlap may result in failed
stitching.
– Criteria 3: The selected FOVs should cover the scene around the specified
object as much as possible.

Based on these criteria, we proceed to present the baseline algorithms and the
Geo-Pre-Selection algorithms for the point panorama in Section 3.3 and route
panorama in Section 3.4, respectively.

3.3 Point Panorama Generation


3.3.1 Baseline Algorithm (BA-P)
The baseline algorithm for panorama generation, denoted as BA-P, exploits
Criteria 1 which states that the selected video frames should be close to the
given location q. We select video frames whose camera positions are located
within a predefined threshold radius r (e.g., 10 meters, which is a typical GPS
error margin) from location q.
The baseline algorithm aims to prune all the frames that are too far away
from the given location q, which forms the input set in conventional approaches.
However, this is not sufficient since it only considers the camera positions of the
video frames. The next two algorithms below follow the filter-refine paradigm
and use BA-P for filtering. We proceed to present two algorithms to enhance
the video frames selection by additionally considering Criteria 2 and 3.
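To make the filter step concrete, the following is a minimal Python sketch (ours, not part of the paper) of the BA-P range filter. It assumes each FOV record carries its camera latitude and longitude; the dictionary field names and the haversine helper are illustrative assumptions.

import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in meters between two (lat, lon) points.
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def ba_p_filter(fovs, q_lat, q_lon, r=10.0):
    # BA-P: keep FOVs whose camera position lies within radius r (meters) of q.
    return [f for f in fovs if haversine_m(f["lat"], f["lon"], q_lat, q_lon) <= r]

In practice the same filter would be answered by a spatial index (the RangeQuery step in Algorithms 1 and 2); the linear scan above is only for illustration.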
3.3.2 Direction-based Algorithm (DA-P)


Let the candidate video frames filtered by the baseline method be a set CF .
Recall that the camera locations of the candidate video frames in CF can be
close to each other. We define two terms OverlapP and CoverP among the video
frames in CF for point panoramas as follows.

Definition 1 (OverlapP). Given any two FOVs f1, f2 in CF, the overlap of
f1 and f2, denoted by OverlapP(f1, f2), is the intersecting viewing angle, which
can be calculated as (f2.α/2 + f1.α/2) − |f2.θ − f1.θ|.

Definition 2 (CoverP). Given a set of FOVs F = {f1, . . . , fn}, F ⊂ CF,
ranked by the viewing direction in increasing order, the cover of F, denoted by
CoverP(F), is the union of the viewing angles in F. It is calculated as

CoverP(F) = Σ_{i=1}^{n} fi.α − Σ_{j=1}^{n−1} OverlapP(fj, fj+1).

Figure 2 shows the overlap and cover of two FOVs f1 and f2 . Additionally, the
overlap ratio of video frame f1 (with respect to f2 ) is OverlapP (f1 , f2 )/f1 .α.
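As a small executable illustration of Definitions 1 and 2 (ours, not the authors' code), the functions below compute OverlapP, CoverP and the overlap ratio for FOVs represented as dictionaries with a viewing direction theta and a viewable angle alpha (both in degrees); clamping negative overlaps to zero for non-overlapping frames is our assumption.

def overlap_p(f1, f2):
    # Definition 1: intersecting viewing angle of two FOVs (degrees).
    return (f1["alpha"] / 2 + f2["alpha"] / 2) - abs(f2["theta"] - f1["theta"])

def cover_p(fovs):
    # Definition 2: union of the viewing angles; FOVs are ranked by theta.
    fovs = sorted(fovs, key=lambda f: f["theta"])
    total = sum(f["alpha"] for f in fovs)
    for f1, f2 in zip(fovs, fovs[1:]):
        total -= max(0.0, overlap_p(f1, f2))  # assumption: ignore negative (no) overlap
    return total

def overlap_ratio_p(f1, f2):
    # Overlap ratio of f1 with respect to f2, as used in Definition 3.
    return overlap_p(f1, f2) / f1["alpha"]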

Fig. 2. OverlapP and CoverP between two FOVs f1 and f2 for point panorama
Fig. 3. Divided direction groups

Then, the Geo-Pre-Selection problem for point panorama is formally defined


as follows.

Definition 3. Given the candidate video frames set CF, a user specified location
q, an overlap parameter p (0 ≤ p ≤ 1), the Geo-Pre-Selection for Point
Panorama Problem is to select a subsequence F = {f1 , . . . , fn } of FOVs from
CF ranked by the angle of the viewing direction in increasing order, such that
for any two adjacent FOVs fi and fi+1 in F , OverlapP (fi , fi+1 )/fi .α ≥ p,
CoverP (F ) = CoverP (CF ) and |F | is minimal, where α is the viewable angle
of each FOV in CF and |F | is the number of FOVs in F .

To answer the Geo-Pre-Selection for point panoramas problem efficiently,


we designed a heuristic algorithm, named Direction-based Algorithm for Point
panorama DA-P. DA-P uses the filter-refine paradigm. In the filter phase, it
employs the baseline method to filter out the FOVs whose camera locations are
outside of the range of the circle with the predefined radius r to obtain a set
CF of candidate FOVs. In the refinement phase, it first ranks the FOVs in CF
by the angle of the viewing directions in increasing order. Next it initializes the
first video frame with the FOV with the smallest viewing direction, and then,
for each previously selected video frame fpre, it selects as the next FOV the one
with the maximum viewing direction angle among the FOVs whose overlap ratio
with fpre is no less than the parameter p. For FOV fpre, the
direction of the next ideal selected FOV having overlap ratio p with fpre is given
in Eqn. (1). The pseudocode of the DA-P is given in Algorithm 1.

fpre .θ + (1 − p) × f.α (1)

Algorithm 1. DA-P (F: FOV dataset, q: user-specified location, r: radius in
filter step, p: the overlap ratio of two adjacent FOVs)
Output: FOV results in Results of Geo-Pre-Selection Point Panorama Problem.
1: CF ← RangeQuery(F, q, r); //filter step: BA-P
2: Rank FOVs in CF by the view directions in increasing order;
3: Let fpre be the FOV with the smallest view direction angle in CF ;
4: Results ← fpre ;
5: for each FOV f in CF in increasing order of view direction angle do
6: if f is the FOV with the maximum view direction angle in {f |f.θ ≤ fpre .θ +
(1 − p) × f.α} then
7: Results ← f ;
8: fpre ← f ;
9: end if
10: end for
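For readers who prefer an executable form, the refinement phase of Algorithm 1 can be transcribed in Python roughly as follows. This is an illustrative sketch under our own assumptions (FOVs as dictionaries with theta and alpha in degrees, candidate set cf already produced by the BA-P filter), not the authors' implementation; the fallback for directional gaps is also our addition.

def da_p_refine(cf, p):
    # DA-P refinement (Algorithm 1): greedy sweep over FOVs sorted by view direction.
    cf = sorted(cf, key=lambda f: f["theta"])
    if not cf:
        return []
    results = [cf[0]]
    i, j = 0, 1                # i: index of the previously selected FOV (f_pre)
    while j < len(cf):
        best = None
        # FOVs with theta <= f_pre.theta + (1 - p) * alpha overlap f_pre by ratio >= p
        # (Eqn. (1)); keep scanning and remember the one with the largest direction.
        while j < len(cf) and cf[j]["theta"] <= cf[i]["theta"] + (1 - p) * cf[j]["alpha"]:
            best = j
            j += 1
        if best is None:       # assumption: bridge a directional gap with the next frame
            best = j
            j += 1
        results.append(cf[best])
        i = best
    return results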

3.3.3 Direction-Location-based Algorithm (DLA-P)


The drawback of the Direction-based Algorithm DA-P is that it only considers
directions of candidate video frames in the refinement phase. To improve the
quality of the selected FOVs for point panorama generation, we next consider
both, the viewing directions and the camera locations of video frames, in the
refinement phase and propose a new heuristic selection algorithm Direction-
Location-based Algorithm for Point panorama, denoted as DLA-P.
Like DA-P, DLA-P uses the filter-refine paradigm. The filter phase is the same
as the baseline method BA-P. In the refinement phase, the algorithm equally
divides 360 degrees into n directions around location q and groups the FOVs into
n groups based on their directions. For each group, the best matching FOV is
selected. The “best” metric is measured by the linear combination of the distance
and the direction difference.
Figure 3 shows the n divided groups. The direction of group j, denoted by θj ,
is defined as the middle direction in the group.
For FOV f with view direction angle f.θ, the group number j it belongs to is
given in Eqn. (2), where n is the total number of groups:

j = ⌈f.θ/360 × n⌉   (2)
The measurement of the difference between an FOV f in group j and the best
FOV is formally defined as Eqn. (3). Here, Dist(q, f.p) is the Euclidean distance
between the camera location of f and the user-specified location q, MaxDist
is the maximum Euclidean distance between pairs of distinct objects in CF, i.e., the
value of MaxDist is two times the predefined radius r, cos(θj, f.θ) is the cosine
of the direction difference between the group direction θj and the angle of the
view direction f.θ of f, and β is a parameter for adjusting the balance between
the camera location distance and the direction difference.

DLScoreP(f, q) = β × Dist(q, f.p)/MaxDist + (1 − β) × (1 − cos(θj, f.θ))   (3)

To ensure that the overlap ratio between two adjacent video frames is no less
than the parameter p and the scene coverage of the selected FOVs is maximal,
the group number n can be calculated as in Eqn. (4), where αavg is the average
viewable angle of the FOVs in CF. The pseudocode of the Algorithm DLA-P is
given in Algorithm 2.

 
n = ⌈360 / ((1 − p) × αavg)⌉   (4)

Algorithm 2. DLA-P (F: FOV dataset, q: user-specified location, r: radius in
filter step, p: the overlap ratio of two adjacent FOVs, β: balance factor of camera
location distance and direction difference)
Output: FOV results in Results of Geo-Pre-Selection Point Panorama Problem.
1: CF ← RangeQuery(F, q, r); //Filter step: BA-P
2: Initialize n tuples Ti {V al, F ov} with {1, ∅}, 1 ≤ i ≤ n;
3: for each FOV f in CF do
4: j ← ⌈f.θ/360 × n⌉; //group index as in Eq.(2), with n calculated in Eq.(4)
5: if DLScoreP (f, q)<Tj .V al //DLScoreP given in Eq.(3)
6: then Tj .V al = DLScoreP ; Tj .F ov = f ;
7: end for
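The DLA-P refinement can likewise be sketched in Python as shown below; the group count follows Eqn. (4), the group index follows Eqn. (2), and the score follows Eqn. (3). The field names (lat, lon, theta, alpha), the 1-indexed grouping, and the injected distance function (e.g., a wrapper around the haversine_m helper in the BA-P sketch) are our assumptions for illustration only.

import math

def dla_p_refine(cf, q, r, p, beta, dist):
    # DLA-P refinement (Algorithm 2): keep the best-scoring FOV per direction group.
    if not cf:
        return []
    alpha_avg = sum(f["alpha"] for f in cf) / len(cf)
    n = math.ceil(360 / ((1 - p) * alpha_avg))        # Eqn. (4)
    max_dist = 2 * r                                  # MaxDist is twice the filter radius
    best = {}                                         # group index -> (score, FOV)
    for f in cf:
        j = max(1, math.ceil(f["theta"] / 360 * n))   # Eqn. (2), 1-indexed groups
        theta_j = (j - 0.5) * 360 / n                 # middle direction of group j
        diff = math.radians(theta_j - f["theta"])
        score = (beta * dist(q, (f["lat"], f["lon"])) / max_dist
                 + (1 - beta) * (1 - math.cos(diff)))  # Eqn. (3)
        if j not in best or score < best[j][0]:
            best[j] = (score, f)
    return [fov for _, fov in sorted(best.values(), key=lambda t: t[1]["theta"])]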

3.4 Route Panorama Generation


In this section, we present the Geo-Pre-Selection algorithms for route panorama
generation. Unlike a point panorama which is viewed from one point, a route
panorama is viewed along a route. Given a specified route se and the direction
D indicating at which side of the route the panorama is required, we aim to select
a visually coherent, near-minimum number of video frames to generate the
route panorama.

3.4.1 Baseline Algorithm (BA-R)


The baseline keyframe selection algorithm for route panorama, denoted as BA-R,
selects video frames whose distance from the route se is no larger than
a predefined threshold r (e.g., 10 meters, which is a typical GPS error margin)
and whose frame directions are within [D − ε, D + ε], where ε is the compass error
margin (e.g., 5 degrees).

Fig. 4. Illustration of OverlapR and CoverR between two FOVs for route panorama
The BA-R algorithm can roughly filter out video frames that are far away from
the route se or whose view directions deviate too much from the specified
direction, keeping only relevant frames. However, the number of output frames of BA-
R is still large and the set contains many redundant frames. To accelerate the
panorama stitching time, we proceed to present a refinement algorithm that is
expected to select the near minimum number of keyframes.

3.4.2 Projection-based Algorithm (PA-R)


Definition 4 (LocationProjection). Given any FOV f(p, θ, R, α) in F and the
user-specified route se, we call the projection of the vector sp onto the route se the
LocationProjection of the FOV f, denoted as LP(f), which is calculated as
LP(f) = Dist(s, p) × cos(∠(sp, se)), where ∠(sp, se) is the angle between the
vectors sp and se.

Definition 5 (SceneProjection). Given any FOV f(p, θ, R, α) in F and the
user-specified route se, we call the projection of FOV f onto the route se the
SceneProjection of f, denoted as SP(f), which is calculated as
SP(f) = 2 × R × sin(f.α/2).

Definition 6 (OverlapR). Given any two FOVs f1(p1, θ1, R1, α1) and f2(p2, θ2,
R2, α2) in F and the query route se, the overlap of f1 and f2, denoted by
OverlapR(f1, f2), is their intersecting projection on the route se, which can be
calculated as OverlapR(f1, f2) = SP(f1)/2 + SP(f2)/2 − |LP(f1) − LP(f2)|.

Definition 7 (CoverR). Given a set of FOVs F = {f1, . . . , fn}, F ⊂ F, ranked
by the location projection in increasing order, and the query route se, the cover
of F, denoted by CoverR(F), is the union of the scene projections of F on the route se.
It is calculated as CoverR(F) = Σ_{i=1}^{n} SP(fi) − Σ_{j=1}^{n−1} OverlapR(fj, fj+1).

Figure 4 shows the overlap and cover of two FOVs f1 and f2 for a route
panorama. Additionally, the overlap ratio of video frame f1 with respect to f2 ,
denoted as OverlapRatio(f1 , f2 ) is OverlapR(f1 , f2 )/SP (f1 ).
Then, the Geo-Pre-Selection for route panorama problem is formally defined


as follows.

Definition 8. Given the candidate video frame set CF filtered by BA-R, a


user specified route se, an overlap parameter p (0 ≤ p ≤ 1), the Geo-Pre-
Selection for Route Panorama Problem is to select a subsequence F =
{f1 , . . . , fn } of FOVs from CF ranked by their location projections LP (fi ) in in-
creasing order, such that for any two adjacent FOVs fi and fi+1 in F ,
OverlapR(fi , fi+1 )/SP (fi ) ≥ p, CoverR(F ) = CoverR(CF ) and the number
of FOVs in F is minimal.

Algorithm 3. PA-R (F: FOV dataset, se: user-specified route, D: user-specified
query direction indicating at which side of the route the panorama is required, i.e.,
either left or right side of the route se, r: filter rectangle width, : direction error
margin, p: Overlap ratio threshold)
Output: FOV results in Results of Geo-Pre-Selection Route Panorama Problem.
1: CF ← RectangleQuery(F, se, r, ); //Filter step: BA-R
2: CF ← Rank FOVs in CF by their LocationP rojections in increasing order;
3: Let fpre be the first FOV in the ranked CF;
4: Results ← fpre ;
5: for each FOV f in CF do
6: if f is the FOV with the minimum OverlapRatio(fpre, f) such that OverlapRatio(fpre, f) ≥ p then
7: Results ← Results ∪ f ;
8: fpre ← f ;
9: end if
10: end for

To answer the Geo-Pre-Selection for route panoramas problem efficiently, we


developed a heuristic algorithm, named Projection-based Algorithm for Route
panorama PA-R. PA-R follows the filter-refine paradigm. In the filter phase, it
employs the baseline method BA-R to filter out the irrelevant FOVs. In the
refinement phase, it first ranks the candidate FOVs by their location projection
in increasing order. Next it initializes the first video frame with the FOV with the
smallest location projection, and then for each previously selected video frame
fpre , it selects the FOV with the minimum overlap ratio that is larger than
or equal to the given overlap ratio p. The pseudocode of the Projection-based
Algorithm PA-R is given in Algorithm 3.
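To make Definitions 4–7 and Algorithm 3 concrete, the following Python sketch computes the location projection, scene projection, and overlap on a straight route segment from s to e, and then performs the PA-R refinement. The planar-coordinate assumption (lat/lon already projected to meters), the field names, and the fallback for coverage gaps are ours, not the paper's.

import math

def location_projection(f, s, e):
    # Definition 4: scalar projection of vector s->p onto the route direction s->e
    # (assumes s != e and planar coordinates in meters).
    sp = (f["x"] - s[0], f["y"] - s[1])
    se = (e[0] - s[0], e[1] - s[1])
    return (sp[0] * se[0] + sp[1] * se[1]) / math.hypot(*se)

def scene_projection(f):
    # Definition 5: length of the FOV scene projected onto the route.
    return 2 * f["R"] * math.sin(math.radians(f["alpha"]) / 2)

def overlap_r(f1, f2, s, e):
    # Definition 6: intersecting projection of two FOVs on the route.
    return (scene_projection(f1) / 2 + scene_projection(f2) / 2
            - abs(location_projection(f1, s, e) - location_projection(f2, s, e)))

def pa_r_refine(cf, s, e, p):
    # PA-R refinement (Algorithm 3): greedy selection along the route in projection order.
    cf = sorted(cf, key=lambda f: location_projection(f, s, e))
    if not cf:
        return []
    results = [cf[0]]
    i, j = 0, 1
    while j < len(cf):
        best = None
        # Advance while the overlap ratio with f_pre stays >= p; the last such FOV has
        # the minimum admissible overlap ratio, since projections increase monotonically.
        while (j < len(cf) and
               overlap_r(cf[i], cf[j], s, e) / scene_projection(cf[i]) >= p):
            best = j
            j += 1
        if best is None:            # assumption: bridge coverage gaps with the next frame
            best = j
            j += 1
        results.append(cf[best])
        i = best
    return results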

4 Experimental Results

We implemented the proposed algorithms on top of the prototype mobile video
collection system, termed MediaQ [2], which we built in our previous research, and
generated panoramic images from videos captured with our Android app. MediaQ is an online
media management system to collect, organize, share, and search mobile multi-
media contents using automatically tagged geospatial metadata. User-generated-
videos can be uploaded to MediaQ from users’ smartphones and displayed
accurately on a map interface according to their automatically sensed geospatial


metadata which are used for the selection of frames in this study. Using MediaQ,
we captured mobile videos in two locations, the University of Southern Califor-
nia (USC) in Los Angeles and the downtown area of Singapore. We collected 345
videos (77,642 frames) for the experiments. Three different Android phones (Mo-
torola Milestone - Qwerty, HTC EVO 3D, and Samsung Galaxy S4) were used to
record Standard Definition videos (at 720×480 resolution).
To evaluate the effectiveness of the algorithms, we report both quantitative and
qualitative measurements from the experiments. For the quantitative evaluation,
we used the following metrics: (1) SelectionTime: the processing time of select-
ing key frames from all the videos based on the proposed selection algorithms,
(2) StitchingTime: the processing time (panorama stitching time) of the software
used in the experiments on the selected video frames, (3) SelectedFOV#:
the number of the selected video frames to generate a single panoramic image, and
(4) Video#: the number of different videos from which the selected video frames
were extracted. For the qualitative evaluation, we performed a user study to mea-
sure QualityRank: a rank based on how the resulting images were perceived by
humans based on image quality (e.g., the clarity and the covered viewable angle).
QualityRank is an average rank position, where a lower value indicates better perceived
quality. The QualityRank of a result image was calculated by QualityRank =
p1 × 1 + p2 × 2 + p3 × 3, where p1 (resp., p2 or p3) is the fraction of participants
that ranked the image first (resp., second or third).
In the user study, twenty-two people, 9 males and 13 females, participated. The
participants were chosen among those who were familiar with the chosen locations
for a more accurate comparison. The participants were requested to rank the re-
sulting panoramas generated by different algorithms.
Panoramic images were generated using off-the-shelf software, AutoStitch
(http://www.cs.bath.ac.uk/brown/autostitch/autostitch.html).
The parameter set we used for the panorama generation in AutoStitch was 1,000
RANSAC iterations, a JPEG quality factor of 85 and a resolution of about
3,600×500. On average, it took eight to nine seconds to generate a panorama in
our experiments using an Intel i3 2.53 GHz processor. The time cost could be
reduced with a lower quality parameter setting or with a faster processor.
We selected 10 different locations for the generation of two types of panoramic
images (point and route panoramas) using the algorithms in Section 3. In this
section we mainly present the results for point panoramas since the results from
point and route panorama generation algorithms showed similar observations.
Due to space limitations, the results for route panoramas are not elaborated in detail.

4.1 Point Panorama Generation

As shown in Table 1, the DA-P and DLA-P algorithms used a far smaller num-
ber of frames than BA-P in generating panoramic images. To generate one
panoramic image, BA-P, DA-P, and DLA-P used 210, 17 and 15 FOVs on
average, respectively. Consequently, their processing time (SelectionTime and
StitchingTime) were far less than those of BA-P. The average SelectionTime was
85.3, 2.34, and 2.02 seconds for BA-P, DA-P, and DLA-P, respectively, while the av-
erage StitchingTime was 148.5, 8.51, 8.65 (seconds) by BA-P, DA-P, and DLA-P,
respectively. This shows that an effective selection of input frames for the gen-
eration of panoramic images can save a lot of processing time compared to a
baseline approach, i.e., BA-P.
The qualitative evaluation using a user study demonstrated that the results
from BA-P and DA-P were comparable by showing a similar QualityRank, 2.19
and 2.21 on average. However, DLA-P produced better quality images by scoring
1.6 (on average) in QualityRank, in most cases DLA-P generated comparable
quality images to either BA-P or DA-P. This demonstrates that we can generate
a better quality panoramic image while using a far smaller number of frames by
effectively filtering out redundant frames.

Table 1. Point Panorama results at different locations

Loc  SelectionTime (sec)    StitchingTime (sec)    SelectedFOV#     Video#        QualityRank
     BA      DA     DLA     BA      DA     DLA     BA   DA   DLA    BA  DA  DLA   BA    DA    DLA
1 154.67 2.02 1.88 225.4 8.7 9.2 372 17 14 4 3 3 2.56 1.93 1.5
2 59.48 1.98 1.75 58.3 7.87 11.1 77 14 14 4 4 2 2.90 1.82 1.29
3 105.51 2.11 1.51 128.4 9.50 7.3 228 17 13 3 2 3 2.07 2.32 1.61
4 68.97 2.48 1.92 68.97 11.1 11.1 145 17 15 2 2 2 1.61 2.96 1.43
5 120.17 2.32 2.09 132.0 9.1 9.46 249 17 15 3 2 3 1.21 2.96 1.82
6 60.28 3.88 2.91 203.6 8.2 8.1 127 17 15 2 2 2 2.17 2.35 1.51
7 86.56 1.99 2.06 110.2 8.5 8.7 182 17 15 3 3 3 2.36 2.14 1.5
8 72.74 2.13 1.89 149.0 6.3 5.4 210 16 15 5 4 1 2.71 1.43 1.86
9 124.81 1.99 1.98 262.0 8.45 7.6 262 17 16 3 3 3 2.96 1.93 1.11
10 107.15 2.54 2.16 146.76 7.4 8.5 248 21 17 5 2 3 1.39 2.25 2.36
AVG 85.32 2.34 2.02 148.46 8.51 8.65 210 17 15 3.4 2.7 2.5 2.19 2.21 1.60

As visual examples of the generated images from different algorithms, we present


two cases: an open space in Figure 5 and a small pathway in Figure 6. In Figure 5,
BA-P selected 228 video frames from the whole video dataset with 69,238 video
frames while DA-P and DLA-P selected only 17 and 13 key frames, respectively,
by effectively filtering out redundant images. This case represents a very success-
ful generation of a panorama image since all three results look visually accurate
and comparable with few artifacts. In general, generating panoramic images of an
open space resulted in high quality images by all three algorithms. However, the
selected numbers of FOVs by DA-P and DLA-P were about 16 times smaller than
those by BA-P; hence, the computational cost of stitching was significantly reduced.
Figure 6 demonstrates a different case. BA-P selected 77 video frames while DA-P
and DLA-P each selected 14 key frames. The images generated by
DA-P and DLA-P were much better than that by BA-P, with DLA-P producing
the best result. In this case there exist many nearby buildings and other obstruc-
tions such as trees in a relatively tight space. Too many redundant frames and
too much image overlap resulted in distortions and increased the matching errors
in stitch processing. These examples clearly demonstrate the superiority of the
DA-P and DLA-P algorithms compared with BA-P. Note that most panoramic
images generated in our experiments were of satisfactory quality similar to the
ones shown in Figure 5.
(a) Algorithm BA-P, SelectedFOV# = 228, Video# = 3.

(b) Algorithm DA-P, SelectedFOV# = 17, Video# = 2.

(c) Algorithm DLA-P, SelectedFOV# = 13, Video# = 3.

Fig. 5. Panorama results around Cromwell field at USC. Here the results are visually
pleasing and quite comparable for all three algorithms.

In summary for point panorama generation, compared to BA-P, the proposed


algorithms DA-P and DLA-P select significantly fewer video frames and gen-
erate comparable (DA-P ) or better (DLA-P ) quality panoramic images since
they refine the candidate video frame set using the selection Criteria 2 and 3.
Additionally, the DLA-P algorithm performs better than the DA-P algorithm
since it considers both the viewing directions and camera locations of the video
frames.

4.2 Route Panorama Generation


We repeated the same experiments for route panorama generation. For route
panorama images at ten locations, we obtained similar results as in the point
panorama case. On average, the SelectionTime of BA-R was 150.2 seconds while
that of PA-R was 2.3 seconds. The StitchingTime of BA-R was 866 seconds
while PA-R took only 15.1 seconds. Based on the user study, the QualityRank
of PA-R was comparable to that of BA-R. This clearly shows that the selection
of input images using our algorithm greatly saves processing time to generate
route panorama images. One visual example is given in Figure 7.

4.3 Observations and Discussion


The quality of both types of panorama results is highly related to the availability
of proper video data. Some of the challenges include the following:
(a) Algorithm BA-P, SelectedFOV# = 77, Video# = 4.

(b) Algorithm DA-P, SelectedFOV# = 14, Video# = 4.

(c) Algorithm DLA-P, SelectedFOV# = 14, Video# = 2.

Fig. 6. Panorama results near the Mudd Hall of Philosophy (MHP) at USC. Significant
artifacts are produced if too many FOVs are used as input into the stitching process,
i.e., algorithm BA-P in (a).

Video Quality. The produced images depend on the video quality captured
by mobile devices. Since we are selecting all possible frames, not only I-frames,
a poor camera sensor or compression method in a mobile device may result in
distortions and unnatural colors in selected frames. This may further lead to mis-
matches in all feature-based stitching algorithms. In our experiments, we mainly
used mobile phones supporting 720×480 Standard Definition video recording.
When some videos were recorded in HD, the quality of the final results was
considerably better (see Figure 8). Lighting conditions may also vary across
different videos, which can further create challenges when images are selected
from different videos. However, the technological trends are very promising since
mobile video quality is rapidly improving (e.g., some of the latest smartphones
record very high quality videos) and stitching technologies are getting more
effective.
(a) BA-R, SelectedFOV# = 64, Video# = 3.

(b) PA-R, SelectedFOV# = 25, Video# = 3.

Fig. 7. Route Panorama results at Exposition Park - Los Angeles

Fig. 8. An example of an HD panoramic image: a portion to illustrate the visual clarity

Object Coverage. The results depend on the availability of data at a cer-


tain location. To achieve complete coverage, data needs to be available from all
around (i.e., 360 degrees) at the same location. A number of techniques can be
used to achieve better coverage, for example, by crowdsourcing video recordings
when there exist missing images.
Sensor Accuracy. The metadata of videos are collected from the sensors
on mobile devices. Normally, the error range of GPS locations is 5 to 10 meters,
and that of compass orientations is less than several degrees, which is sufficient to
detect overlapping FOVs. However, at times GPS and compass accuracy can be
disturbed by various factors such as a weak GPS signal. Inaccurate metadata
affects the selection of input images for stitching. Recently a number of studies
have investigated GPS and compass data correction via post-processing so that
sensor accuracy would not be a problem in most outdoor videos, especially in
videos where consecutive sensor data are available.
Media Processing. Panorama generation also depends on the media processing
technology used (e.g., computer vision algorithms). In this study we used
existing software packages which allowed us little control over the output. For
example, input image ordering can help the overall performance of stitching;
however, AutoStitch did not take this into account. Since images can be selected from
different videos captured by different cameras, an algorithm which can consider
different camera properties can also enhance the quality of the resulting images.

5 Conclusions

We have presented novel methods to effectively and automatically generate point


and route panoramic images from a large set of crowdsourced mobile videos.
Our approach leverages geo-spatial sensor metadata to achieve a high efficiency
without sacrificing quality. Our goal is to provide components to automate large-
scale, up-to-date panoramas for various GIS applications.
We found that, while considerable challenges still exist, it is increas-
ingly possible to use crowdsourced, user-generated videos to create visual ele-
ments such as panoramas. Since UGVs can be very up-to-date, new exploratory
applications are possible. Note that, while this study focused on a specific al-
gorithm for panoramas, our approach can be applied to generate other useful
visual elements, for example, 3D models or video summaries, which can then be
presented together in an integrated, geo-immersive environment. In such a case
a map, or a 3D environment such as Google Earth, may function as a backdrop
into which the visual elements are placed and explored by users. By combining
all these visual elements (i.e., original footage such as images and videos and
post-processed information such as panoramas and 3D models) users can ex-
plore distant geographic regions and obtain an excellent understanding of the
surroundings.

Acknowledgments. This research has been funded in part by Award No. 2011-
IJCX-K054 from the National Institute of Justice, Office of Justice Programs,
U.S. Department of Justice, as well as NSF grant IIS-1320149, the USC Inte-
grated Media Systems Center (IMSC) and unrestricted cash gifts from Google
and Northrop Grumman. Any opinions, findings, and conclusions or recommen-
dations expressed in this material are those of the authors and do not necessarily
reflect the views of any of the sponsors such as the National Science Foundation
or the Department of Justice. This research has also been supported in part at
the Centre of Social Media Innovations for Communities (COSMIC) by the Sin-
gapore National Research Foundation under its International Research Centre @
Singapore Funding Initiative and administered by the IDM Programme Office.
References
1. Arslan Ay, S., Zimmermann, R., Kim, S.H.: Viewable Scene Modeling for Geospa-
tial Video Search. In: 6th ACM Intl. Conference on Multimedia, pp. 309–318 (2008)
2. Kim, S.H., Lu, Y., Constantinou, G., Shahabi, C., Wang, G., Zimmermann, R.:
MediaQ: Mobile Multimedia Management System. In: ACM Multimedia Systems
Conference (2014)
3. Kawanishi, T., Yamazawa, K., Iwasa, H., Takemura, H., Yokoya, N.: Generation
of High-resolution Stereo Panoramic Images by Omnidirectional Imaging Sensor
using Hexagonal Pyramidal Mirrors. In: 14th International Conference on Pattern
Recognition, vol.1, pp. 485–489. IEEE (1998)
4. Zhu, Z., Xu, G., Riseman, E.M., Hanson, A.R.: Fast Generation of Dynamic and
Multi-resolution 360 Panorama from Video Sequences. In: Int’l Conference on Mul-
timedia Computing and Systems, pp. 400–406. IEEE (1999)
5. Wagner, D., Mulloni, A., Langlotz, T., Schmalstieg, D.: Real-time Panoramic Map-
ping and Tracking on Mobile Phones. In: Virtual Reality Conference (VR), pp.
211–218. IEEE (2010)
6. Liu, F., Hu, Y.H., Gleicher, M.L.: Discovering panoramas in web videos. In: 16th
ACM International Conference on Multimedia, pp. 329–338. ACM (2008)
7. Szeliski, R.: Video Mosaics for Virtual Environments. IEEE Computer Graphics
and Applications 16(2), 22–30 (1996)
8. Szeliski, R., Shum, H.Y.: Creating Full View Panoramic Image Mosaics and Envi-
ronment Maps. In: 24th Annual Conference on Computer Graphics and Interactive
Techniques, pp. 251–258. ACM Press/Addison-Wesley Publishing Co. (1997)
9. Agarwala, A., Agrawala, M., Cohen, M., Salesin, D., Szeliski, R.: Photograph-
ing long scenes with multi-viewpoint panoramas. ACM Transactions on Graphics
(TOG) 25, 853–861 (2006)
10. Zheng, J.Y.: Digital route panoramas. IEEE Multimedia 10(3), 57–67 (2003)
11. van de Laar, V., Aizawa, K., Hatori, M.: Capturing Wide-view Images with Uncali-
brated Cameras. In: Electronic Imaging 1999, pp. 1315–1324. International Society
for Optics and Photonics (1998)
12. Nielsen, F.: Randomized Adaptive Algorithms for Mosaicing Systems. IEICE
Transactions on Information and Systems 83(7), 1386–1394 (2000)
13. Mann, S., Picard, R.W.: Virtual Bellows: Constructing High Quality Stills from
Video. In: International Conference on Image Processing (ICIP), vol. 1, pp. 363–
367. IEEE (1994)
14. Peleg, S., Herman, J.: Panoramic Mosaics by Manifold Projection. In: Interna-
tional Conference on Computer Vision and Pattern Recognition, pp. 338–343. IEEE
(1997)
15. Steedly, D., Pal, C., Szeliski, R.: Efficiently Registering Video into Panoramic Mo-
saics. In: 10th International Conference on Computer Vision (ICCV), vol. 2, pp.
1300–1307. IEEE (2005)
16. Hsu, C.T., Cheng, T.H., Beuker, R.A., Horng, J.K.: Feature-based Video Mosaic.
In: International Conference on Image Processing, vol. 2, pp. 887–890. IEEE (2000)
17. Fadaeieslam, M.J., Fathy, M., Soryani, M.: Key frames selection into panoramic
mosaics. In: 7th International Conference on Information, Communications and
Signal Processing (ICICS), pp. 1–5. IEEE (2009)
18. Zhang, Y., Ma, H., Zimmermann, R.: Dynamic Multi-video Summarization of
Sensor-Rich Videos in Geo-Space. In: Li, S., El Saddik, A., Wang, M., Mei, T.,
Sebe, N., Yan, S., Hong, R., Gurrin, C. (eds.) MMM 2013, Part I. LNCS, vol. 7732,
pp. 380–390. Springer, Heidelberg (2013)
19. Kazemi, L., Shahabi, C.: GeoCrowd: Enabling Query Answering with Spatial
Crowdsourcing. In: ACM SIGSPATIAL GIS, pp. 189–198 (2012)
20. Brown, M., Lowe, D.: AutoStitch: A New Dimension in Automatic Image Stitching
(2008)
21. Lourakis, M.I.A., Argyros, A.A.: SBA: A Software Package for Generic Sparse
Bundle Adjustment. ACM Transactions on Mathematical Software, 1–30 (2009)
22. Fadaeieslam, M., Soryani, M., Fathy, M.: Efficient Key Frames Selection for
Panorama Generation from Video. Journal of Electronic Imaging 20(2), 023015
(2011)
ReSDaP: A Real-Time Data Provision System
Architecture for Sensor Webs

Huan Li, Hong Fan*, Huayi Wu, Hao Feng, and Pengpeng Li

LIESMARS, Wuhan University, 129 Luoyu Road, Wuhan, PR China, 430079
{lihuan,hfan3,wuhuayi}@whu.edu.cn
* Corresponding author.

Abstract. More and more sensors for environment surveillance are deployed
around the world given the rapid development of sensor networks. Since sensor
data is produced continually for months or even years in many places, a huge
amount of data is stored all over the world. This study proposes ReSDaP, an ar-
chitecture to bridge sensor networks and spatio-temporal databases by conti-
nuously creating and running Provision Items (PIs). A PI is responsible for the
continual linkage-processing-storage (LPS) of the data stream produced by a
sensor, and acts as a pipeline for nonstop transmission of data into spatio-
temporal databases at fixed time intervals. Actual data provisions from Wuhan
meteorological sensors are used as a case study of ReSDaP. This implementa-
tion demonstrates that ReSDaP has good scalability and increases the availabili-
ty of sensor web data.

Keywords: Sensor Web Service, Spatio-temporal data, Real-time provision,


Data stream pipeline, System Architecture.

1 Introduction

In tandem with advances in sensor technology, a sensor network-based “e-skin” is


being extended and now covers the whole earth and continuously monitors the Earth's
environment. Real-time and long term sensor data are acquired by ubiquitous sensing
tools or even virtual sensors. Sensors collect short term information like weather con-
ditions, traffic, or the water level of a river, but also provide long period surveillance
to analyze trends or changes of a phenomenon or an artificial/natural object. Long
term data of interest include deformations of bridges and buildings, landslides, and
seismic changes. Thus, the twofold use of sensor data presents a challenge in that
long term and varied uses require consistent and reliable data. Therefore, it is neces-
sary to gather these sensing data uniformly, process them effectively, and manage
them quickly and efficiently.
From an engineering viewpoint, a sensor is a device that converts physical,
chemical or biological parameters into electrical signals [1]. A sensor network is a
computer-accessible network of spatially distributed sensors for monitoring a global
environment, for data concerning temperature, sound, vibration, pressure, motion or

pollutants at different locations [2]. A sensor web is a group of interoperable web


services which all comply with a specific set of sensor behaviors and interface speci-
fications [3]. Lai et al. [4] classified sensor types from different perspectives, such as
signal type, sensor measures, media types by which sensor data transmit, locations
where sensor nodes are deployed, and their ability to automatically adjust to changing
conditions. According to the characteristics of sensor web data, sensors are divided
into four types: in-situ sensors, mobile sensors, image sensors and video sensors. Ding
and Gao [5] only classify all sensors as the first two types, but considering the special
nature of image and video sensor data, this paper separates sensors into four indepen-
dent types.
Lee and Reichardt [6] assert that standardized interfaces and data encoding
schemes are needed to rapidly discover, access, integrate, fuse, and use sensor-derived
data for data collection from heterogeneous and multi-vendor sensors. Lee and Rei-
chardt detail the specifications of the Open Geospatial Consortium (OGC) Sensor
Web Services Interface with IEEE 1451 standards. The OGC standards provide a
means to enable the realization of web-based sensor data interfaces for convenient
data acquisition through the Internet. Furthermore, flexible and universal implementa-
tion for OGC Sensor Observation Services (SOSs) was introduced by Chen et al [2].
This new generation of Sensor Web Enablement (SWE) [1] is more applicable when
bringing sensor resources onto the Web so as to make them available to applications.
With sensor data services following open standards, sensor data are easily acquired
from the Internet once they are published, creating a spatial data stream. A spatial
data stream transmission is a process of transferring spatial data through the network
continuously and in real-time [7]. Sensors generate real-time data flows [2, 8]. Based
on this analysis, there are several challenges to be considered to collect data smoothly
for various applications:
1. Time constraints are of different degrees in different applications. Urban fires
and floods require faster data acquisition than road traffic congestion and
weather conditions. The sampling frequencies of sensors also have a big dif-
ference requiring a flexible time configuration to customize data acquisition
solutions for different needs.
2. Observations from heterogeneous, multi-vendor sensors need a unified
processing framework. Different data types include numerical data, raster data,
and video data. To handle these different data for different applications, it is better
not to have separate processing system definitions, but rather a unified data
management framework. Therefore, data processing can be uniform and flexible.
3. The number of sensors is very large and must be handled with systematic
control units. For large-scale sensor data provisioning, we need a flexible
provision management mechanism to observe the conditions of data streams
and to control the process and its progress.
Aiming at these problems, this study proposes a real-time data provision
system architecture for sensor web, named the ReSDaP (Real-time Sensor Data
Provision Architecture) with a flexible time configuration scheme. It includes a cus-
tomizable and extensible plug-in framework for sensor network data processing
through dynamic loading, selecting, and deleting user-defined plug-ins. It can also
query and show the data streams that are being saved to the database as status data for
geo-objects. Data acquired by a sensor is regarded as a status/state of the geo-object
the sensor observes. The data stored to the database can be numerical, video streams,
images, and other data types. Users can control the functioning of Provision Items
(PIs) by a series of management operations like pause, stop, restart and delete. ReS-
DaP is also capable of creating PIs through visual interface.
The remainder of the paper is organized as follows. Section 2 describes related
work. In Section 3, we describe the general situation of sensor data provision, design
considerations, and architecture. Section 4 presents our implementation and expe-
riences from Wuhan meteorological sensor data provision. Finally we draw conclu-
sions and prospects for future work in Section 5.

2 Related Work
At present, most studies focus on sensor data collection [9], integration and fusion
[10], energy-saving deployment and communication optimization [11, 12] among
sensors at the hardware level. Meanwhile, only a few studies have described effective
ways of flexible real-time processing of sensor web data and dynamic loading of
user-defined data processing plug-ins.
All the sink nodes of a sensor web can be connected together, and play a gateway
role to other networks [8, 13]. The application layer protocol serves to make the un-
derlying hardware transparent to the upper layer users [13]. Based on these technolo-
gies, it is easy to obtain sensor data from the network through gateway services for
sensor webs.
At the database level, many studies have focused on the management of big data, sto-
rage of data of different types, and database structures [8, 14]. Many of the current open
source NoSQL (Not only SQL) databases such as MongoDB [15], CouchDB [16],
ArangoDB [17], provide solutions for storage of large amounts of unstructured data.
For real-time sensor data processing, general approaches lack flexibility and can-
not achieve dynamic loading of user-defined processing plug-ins. Therefore, there is
an urgent need for a system architecture that can acquire sensor web data in real time,
flexibly process the data, and store it in the database system continuously.
The IoT-ClusterDB [5] architecture focuses on spatio-temporal data modelling and
effective data query, realizing data uploads to the Internet of Things (IoT). IoT-
ClusterDB achieves numerical data access and real-time processing, and supports
analysis of multimedia data to extract numerical data. It realizes sparsification of
high-frequency sampling data, but it is not appropriate for particular applications of
user-defined data processing, and time interval configuration related to the sampling
rate of sensors has not been addressed.
ISHSN [18] consists of a gateway of the Internet of Things (IoT) and Access Agent
(AA). It receives a variety of sensor data in xml format, JSON format, or other data
formats and encapsulates them into a Basic Data Format (BDF). It uses a combination
of centralized and distributed storage solutions using a balanced scheduling algorithm
in the AA to store data into a local database or a global database.
ChukwaX [19] was developed based on Yahoo's large data acquisition and analysis
system Chukwa. It adopts a hierarchical architecture to disperse the whole access
pressure and adaptor pattern to match different protocols to support sensing network
dynamic accessing. It uses different adaptors in the Sensor Network access Agent to
fit different sensor networks. These adaptors are responsible for parsing sensor net-
work protocols. ChukwaX supports dynamic access management, so sensor networks
can freely connect to or disconnect from it. However, it can only cut off a connection
with a sensor network in a simple manner and lacks visual and flexible systematic
management.

3 Design

3.1 General Situation


The system structure of sensor data acquisition is shown in Fig. 1. The sensor net-
works are deployed to continuously collect environmental monitoring data, and
transmit and integrate the data in the centralized sink nodes or gateways in a wired or
wireless way. The data are then published in the form of Sensor Observation Services
(SOS) to the Internet. These, in turn, are acquired and stored by Data Provision Con-
trol Units in the Data Management Center. Users can obtain the data they want by
accessing the SOS through the Internet, or they can get the required data from the
Data Management Center by controlling sensor data provisions through the Data
Provision Control Units.

Fig. 1. System structure for sensor data acquisition

Sensors can be deployed in space, in the air, or on land (e.g., satellites, aviation
aircraft, and ground sensors), and their data are transmitted through comprehensive
ground data receiving units, which can also be nodes in a sensor web.
Sensor networks release SOS in line with OGC SWE standards through sink nodes
or gateways. Users can access the data for a certain range of time and specific extent
directly through the Internet. Such data access has limitations: users can only get a
single data service through a query at a specific SOS address.
The Data Management Center is the storage and management unit for sensor data,
wherein the data management method is flexible according to specific requirements.
For relatively simple applications, file management can be enough. For relatively
large amounts of simple data, a relational database can be used. For spatio-temporal
data, more complex data structures and management methods might be considered.
The Data Provision Control Unit accesses SOS through the Internet. It acquires
historical data and real-time observation data through the configuration of time and
space parameters. Then, the data are stored in the Data Management Center. Informa-
tion naturally occurs in the form of a sequence (stream) of data values. A data stream
is a real-time, continuous, ordered (implicitly by arrival time or explicitly by time-
stamp) sequence of items [20]. The Federal Standard 1037C defines a data stream as a
sequence of digitally encoded signals used to represent information in transmission
[21]. Users operate a Data Provision Unit directly to observe the data provision
progress and which sensor data are being ingested. The Data Provision Control Unit is
mainly responsible for large data stream provision, including real-time collection,
processing, and storage of sensor data.
The necessary considerations for the design of the architecture are introduced in
the next section.

3.2 Considerations

Management of Provision Items


A Provision Item (PI) is a new concept defined as a data stream bridge created in a
series of steps including accessing the sensor data, data processing, and data storage.
It is an Input-Process-Output (IPO) process. Since there may be thousands of sensors
to be ingested, the management of these PIs needs to be considered.
The system must have the capability to manage the PIs with creating, starting,
stopping, deleting and sorting functionalities. New items can be set up through a wi-
zard, getting the parameters for settings step by step.
The sensors which need to be provided may be in the hundreds or even thousands.
It would be very time-consuming to ingest sensors one by one. Thus, we need to have
a batch mode for intelligently creating PIs with the same parameters including server
address and spatio-temporal database connection settings, among others. In the mean-
time, providing a way to define xml configuration files that save the settings of a PI for
reuse and sharing also needs to be considered, so that a large number of PIs can be
created by simply importing a configuration file.
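As a sketch of the kind of control the PI management component needs to expose, the following Python fragment (ours, not the ReSDaP implementation) models a Provision Item with create/start/pause/stop operations and a batch creation helper that shares common parameters.

import threading

class ProvisionItem:
    """One linkage-processing-storage (LPS) pipeline for a single sensor's data stream."""

    def __init__(self, sensor_id, sos_url, db_conn, interval_s=60):
        self.sensor_id = sensor_id
        self.sos_url = sos_url          # SOS endpoint to poll
        self.db_conn = db_conn          # spatio-temporal database connection settings
        self.interval_s = interval_s    # user-defined request interval in seconds
        self.state = "created"
        self._stop = threading.Event()

    def start(self):
        self.state = "running"
        self._stop.clear()
        threading.Thread(target=self._run, daemon=True).start()

    def pause(self):
        self.state = "paused"

    def stop(self):
        self.state = "stopped"
        self._stop.set()

    def _run(self):
        # Acquire -> process -> store at fixed intervals until stopped (details omitted).
        while not self._stop.wait(self.interval_s):
            if self.state == "running":
                pass  # request SOS data, apply processing plug-ins, write to the database

def create_batch(sensor_ids, sos_url, db_conn, interval_s=60):
    # Batch mode: create many PIs that share the same server and database settings.
    return [ProvisionItem(sid, sos_url, db_conn, interval_s) for sid in sensor_ids]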

Data Processing
An application system is unable to complete all of the data processing functions be-
cause of the diversity of sensor types and the complexity of data structures. The plug-
in framework, however, is a flexible programming design technique that solves this
problem. It realizes a separation of the application framework and the functions of the
application. These functions are implemented as plug-ins which can be loaded and
unloaded from the framework in order to achieve or delete specific functions. It is a
feasible way to define a plug-in framework which achieves user-defined data
processing functions by providing uniform function interfaces.
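One way to realize such a uniform function interface is an abstract plug-in base class plus dynamic loading by module name. The following is a minimal, hypothetical Python sketch, not the ReSDaP code; the example plug-in and the loader conventions are our assumptions.

import importlib
from abc import ABC, abstractmethod

class DataProcessingPlugin(ABC):
    """Uniform interface every user-defined processing plug-in must implement."""

    @abstractmethod
    def process(self, observation):
        """Transform one raw observation; return the value(s) to be stored."""

class CelsiusToKelvin(DataProcessingPlugin):
    # Example plug-in: unit conversion for a numerical temperature observation.
    def process(self, observation):
        return observation + 273.15

def load_plugin(module_name, class_name):
    # Dynamically load a user-defined plug-in so it can be added or removed at runtime.
    cls = getattr(importlib.import_module(module_name), class_name)
    plugin = cls()
    if not isinstance(plugin, DataProcessingPlugin):
        raise TypeError(class_name + " does not implement DataProcessingPlugin")
    return plugin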
Time Configuration
There are two ways to upload the sampling data of sensors to the data service center:
active and passive. In active uploading, data is uploaded if some condition is satisfied.
In passive uploading, data is uploaded at fixed periods and frequencies. Because the
latter does not require any computing power from the sensor itself, it simplifies the
data acquisition process, and is more widely used.
As soon as sensor data arrive at the data center, they are published via a Sensor Observation Service (SOS), and users can retrieve the data they want through the Internet. Sensor web data requests to SOS can be configured in three ways.
1. Default frequency. There is a fixed frequency at which sensor data are uploaded to the data service center. It can be used as the default frequency at which ReSDaP periodically sends data requests to SOS.
2. Obtaining the latest data. Requests for the latest data are continuously sent to SOS. If the data resolved from the XML response are the same as the last data acquired, they are discarded. If not, they are passed to the next processing stage or stored directly in the database as the latest data.
3. User-defined time interval. The fixed request frequency is defined by the user. Depending on application needs, the time unit can be seconds, minutes, or hours.

The time consumed can differ depending on the method adopted to get sensor web data from SOS.
Based on the TCP/IP protocol, ReSDaP uses WinSocket to send requests to SOS and to receive responses from the server. First, ReSDaP sends an HTTP POST request to the sensor web server whose body contains a GetCapabilities, DescribeSensor, or GetObservation operation in XML format. Second, it receives a corresponding XML-format response from the server according to the parameters defined in the request. Third, the relevant data are acquired by parsing the XML document.

3.3 Architecture
An architecture named ReSDaP has been adopted for sensor web data provision. The system architecture is shown in Fig. 2. It contains three parts. The first is the data supply, whose major components are the sensor web services (SOS) and other data, including raster, vector, model, and dynamic off-line data. The second is a distributed spatio-temporal database for storing the data streams. The last and most important part is the data provision system, shown in the middle of Fig. 2. It consists of tools for data storage, an algorithm plug-in library for data processing, and the management system for Provision Items (PIs).
By sending a request to the service system interface, ReSDaP obtains sensor data. It then associates the sensors with the geo-objects they observe and, after applying a processing plug-in to the data, calls the data storage API of the database management system to store them. ReSDaP provides a message listening port for the sensor web service system, receiving HTTP POST messages such as sensor registering, launching, pausing, and unregistering, and it has a corresponding message handling mechanism. A data importing tool is used to store the other types of data in the database.

Fig. 2. The system architecture for ReSDaP

Users can customize and extend the algorithm library and dynamically load or delete user-defined plug-ins. Through the PI management function, a PI can be controlled with starting, pausing, deleting, and other operations.

Data Stream Pipelines


The core target of the ReSDaP system is to build high-speed data stream pipelines, through which the sensor data obtained from SOS can be quickly stored in the database after stream processing. This provides a complete, reliable data source for subsequent applications such as building a spatio-temporal database, data processing, historical data display, big data analysis, and data mining.

Fig. 3. ReSDaP data stream pipelines

As seen in Fig. 3, the sensor web data produced by sensor networks are published to the Internet through a sink node or a specific gateway. Data streams are transferred to the database through the data provision system pipelines. Once the ReSDaP data stream pipelines are established, sensor web data can be requested from the Internet, processed through specific procedures, and then saved to the database as data streams. The specific pipeline processing procedures include associating sensors with the geo-objects they observe, data processing if a processing plug-in is selected, and then saving according to the database settings.

Linkage between sensors and geo-objects


Sensor observations are stored in the database as the status data of geo-objects for several reasons. First, it is more convenient for user queries. From the users' perspective, it is easy to query the desired data by entering keywords through the interface of a search engine or a database. The keywords are usually properties of a specific geo-object, such as the temperature, wind speed, or air pollution concentration of a region, rather than information about sensors or sensor IDs. Saving the observations of sensors to the database as descriptions or statuses of the corresponding geo-objects is therefore the more reasonable approach and better suits the needs of practical applications.
Second, 2D and 3D visualization and management are more efficient. Currently, it is common practice to use layers to manage the visualization of spatio-temporal data. A layer is a basic unit for data management and visualization in GIS. Data of the same type or property are placed in the same layer, and a layer is generally not named after the ID or name of a sensor. For example, when the observations of two sensors are both temperature, these data can be displayed in the same layer named "Temperature".
To achieve this, the observations of sensors need to be associated with the geo-objects they observe. The procedure obtains a list of sensor IDs by sending a GetCapabilities request to the Sensor Observation Service (SOS) server, and then sends a DescribeSensor request to the server with a sensor's ID as the parameter to get that sensor's metadata.

Fig. 4. Linkage procedure for sensors and geo-objects

As shown in Fig. 4, after obtaining the list of sensors and the corresponding metadata of these sensors, it is easy to determine which geo-object in the database a sensor is associated with; if no geo-object in the database meets the desired conditions, a new geo-object is established. The whole process is geo-object-centered. The first step is to determine whether to batch process. If so, new geo-objects are created in bulk and linked to the selected sensors. If not, a query is executed to determine whether the geo-object exists in the target database. If it exists, the user selects the target geo-object; otherwise the user must create a new one.
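As a compact illustration of this geo-object-centered flow, the sketch below mirrors the decision steps of Fig. 4 in Java; all type and method names are hypothetical and not part of the ReSDaP implementation.

import java.util.List;

// Hypothetical types mirroring Fig. 4; not the paper's actual API.
interface Sensor { String id(); String observedProperty(); }
interface GeoObject { String id(); }
interface GeoObjectStore {
    GeoObject findMatching(String observedProperty);   // query the target database
    GeoObject create(String observedProperty);         // create a new geo-object
    void link(String sensorId, String geoObjectId);    // associate a sensor with a geo-object
}

class LinkageSketch {
    // Geo-object-centered linkage: batch mode creates and links in bulk,
    // otherwise each sensor is matched against existing geo-objects first.
    static void linkSensors(List<Sensor> sensors, GeoObjectStore store, boolean batchMode) {
        for (Sensor s : sensors) {
            GeoObject obj = batchMode ? store.create(s.observedProperty())
                                      : store.findMatching(s.observedProperty());
            if (obj == null) {
                obj = store.create(s.observedProperty());  // no matching geo-object exists yet
            }
            store.link(s.id(), obj.id());
        }
    }
}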

Plug-in framework
This article uses plug-in technology to achieve real-time processing of the sensor data.

Fig. 5. Plugin framework and interface definition

The plug-in management framework and interface definition are shown in Fig. 5. Example plug-ins are Extracting NDVI, Data sparseness, and Data interpolation. For plug-ins implemented in accordance with the specifications of the system, the plug-in management module handles registration and management operations. The plug-in implementation specification defines five externally exposed interface functions: plug-in initialization, plug-in description, setting of the plug-in parameters, data processing, and plug-in shutdown. Four of these functions cover the initialization, description, parameter setting, and closing operations of a plug-in. The fifth function is the core interface, for data processing. Its input parameters are described by the "any" type of the Boost library and can therefore carry any data type. The output parameters can also be of any type, according to the users' needs in specific circumstances. Typing the data with "any" makes the plug-in interface more flexible and versatile.
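The paper implements these five functions in C++ with inputs and outputs typed as Boost's any; the following Java analogue is only a sketch of the same contract, with Object standing in for boost::any and all names chosen for illustration.

import java.util.Map;

// Java analogue of the five-function plug-in contract described above;
// Object stands in for the boost::any type used in the paper's C++ implementation.
interface ProcessingPlugin {
    void initialize();                          // plug-in initialization
    String describe();                          // plug-in description
    void setParameters(Map<String, String> p);  // settings for the plug-in parameters
    Object process(Object input);               // core data processing interface
    void shutdown();                            // closing/unloading the plug-in
}

A concrete plug-in such as data interpolation would implement this interface and be registered with the plug-in management module through its XML description file.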
To load the plug-ins according to a universal standard, every plug-in has a configuration file in XML format. The file describes the plug-in's data processing specification: it defines the name of the plug-in, a brief description of the functions it realizes, the plug-in file names, and other information such as descriptions of the input, output, and return-value parameters. Thus, all plug-ins can be loaded, registered, and managed correctly and consistently.
The data processing step is optional, as shown in Fig. 3. The input sensor data are processed (or not) by a user-defined data processing algorithm and then output. If algorithm processing is used, the data are output after processing by the user-selected algorithm from the library. Otherwise they are output directly; in both cases they are then converted into geo-object data and saved to the database.

Provision Item List and View


A tri-layer tree structure is used to store all PIs. The first layer is the root, used to represent and control all PIs. The second layer includes single PIs created directly by the wizard and batch nodes, which are groups of PIs established in batch mode. The third layer consists of the child nodes of the batch nodes, and every child node is a PI.
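A minimal sketch of this tri-layer structure, with hypothetical Java class and field names, could look as follows.

import java.util.ArrayList;
import java.util.List;

// Tri-layer tree of Provision Items: root -> (single PIs | batch nodes) -> child PIs.
class ProvisionItem {
    String name;                                           // PI name from the config file
    ProvisionItem(String name) { this.name = name; }
}

class BatchNode {
    String name;                                           // name of the batch group
    List<ProvisionItem> children = new ArrayList<>();      // third layer: PIs created in batch mode
    BatchNode(String name) { this.name = name; }
}

class ProvisionRoot {
    List<ProvisionItem> singleItems = new ArrayList<>();   // second layer: wizard-created PIs
    List<BatchNode> batchNodes = new ArrayList<>();        // second layer: batch nodes
}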

(a) An example config file of PIs

(b) The tri-layer tree structure diagram of PIs in the config file shown in (a)
Fig. 6. An example definition for the access item file and its data structure

An example of an XML definition file and its corresponding tri-layer data structure is shown in Fig. 6. Fig. 6(a) displays the definition file for PIs, with the settings of each PI comprising management parameters, data source parameters, data storage parameters, and data processing parameters. Parameters for PI management include the PI name, creator, creation time, and others. Data source parameters describe the sensor web services, including the service address that determines where to send data requests. Data storage parameters are the connection settings for the database system, including the database server IP, port, database name, username, password, and so on. Data processing parameters include the association of sensor IDs with geo-object IDs and information about the processing algorithms. Fig. 6(b) is the corresponding tri-layer tree data structure diagram of the PIs shown in Fig. 6(a).

4 Meteorological Sensor Data Provision for the Wuhan Weather Bureau

4.1 Data Preparation and Publishing


The performance of the database is not the key point of this study; rather, we focus on the performance of the provision of sensor web data. The in-situ sensor data used for testing were real-time meteorological data generated by a simulation program based on real meteorological sensor data from the Wuhan Weather Bureau, China. Following Chen et al. [2], we published the sensor web data through a geospatial Sensor Observation Service (SOS), which conceals the underlying sensor hardware and makes the sensor data available on the Internet.

4.2 Configuration of Provision Parameters


The configuration of the PIs' main parameters is achieved through the New Wizard for provision items in the four steps described below.
The first step covers the global settings, including naming the PIs, whether the data are to be processed by algorithm plug-ins, whether batching is used, the SOS service address settings, and settings for where and how to store the data. The creation time is added automatically from the computer system time.
The second step is to associate meteorological sensors with the geo-objects they observe. First, sensors are selected from the "Sensors List" accessed via the SOS service address given in the previous step. The Sensors List is a list of sensor IDs with related meta-information: coordinates, frequencies, and work statuses. Then, the corresponding geo-objects for the selected sensors are chosen from the "Objects List", acquired from the storage address of the previous step. If the desired object does not exist, a new object is created by setting the object name, for example "TempOfEastLake" meaning the temperature of Wuhan East Lake, the object type "Situ", and the object description.
The third step is to configure the time interval at which data requests are sent to the SOS address, based on the real-time requirements of the practical application. Since meteorological data are neither urgent nor highly frequent, for our test we set the time interval to 30 minutes, according to the sensor observation frequency acquired in the second step.
The fourth step is to set the algorithm parameters if an algorithm was selected during the first step. After setup is complete, the provision items are started.

4.3 Monitoring and Controlling the Status of Provision Items


In order to facilitate Provision Item (PI) management, a PI list is established with six major attribute columns. Other minor attributes, such as creation date and modification time, can also be displayed in the list. The six properties listed are as follows: PI name, sensor ID, sensor type, object ID, status of the PI, and status description. Right-clicking a selected PI or multiple selected PIs opens a pop-up menu with several
operations, including launch, pause, stop, and delete, applied individually or in batch. When the mouse is moved over a PI, a tooltip pops up automatically to display specific PI information, such as the sensor web services and database connection information.

Fig. 7. Meteorological sensor data provision for Wuhan, Hubei Province. The background consists of a raster map and a vector map of Wuhan. The green icons represent the sensors, with their sensor IDs labeled alongside.

In addition to the PI list for management, the same operations can be performed in a PI view. As shown in Fig. 7, a PI view displays the meteorological sensor data provision for Wuhan, Hubei Province. The sensors are labeled by their IDs. The view is a two-dimensional window displaying the sensors at their current positions. When the mouse hovers over a sensor, a tooltip with relevant information for that sensor pops up. By right-clicking the sensor, the same operations can be performed through the shortcut menu as in the PI list.
A web data service for viewing the sensor data provided to the database is shown in Fig. 8. It displays a real-time data view interface of the database system for monitoring the status of data provision. Fig. 8(a) represents a list of geo-objects in the database. The data shown include the center point position, bounding box, start time and end time of the geo-object status, and the type of the sensors observing the geo-objects. Whether the status data of a geo-object are being provided in real time can easily be determined by comparing the end time with the current time. Fig. 8(b) is a status data list of one geo-object. Data observed by the corresponding sensor are stored in this table as status data of the geo-object.
The staff of the Wuhan Weather Bureau can easily control the PIs, since PIs can be paused, stopped, restarted, and removed according to actual needs. Through the management of PIs, enormous amounts of data from meteorological sensors can be ingested into the databases, accurately and in real time, as states of the target geo-objects they observe. The batch processing mechanism can also be adopted for convenient management.

(a) List of geo-objects

(b) List of status data of a geo-object


Fig. 8. Web data service shows real-time data provided to a database

The provision procedures of the PIs can be monitored. Users can query which sensor observes a given geo-object and to which geo-objects the observations of a sensor are linked. Users can bind selected sensors to newly created spatio-temporal geo-objects or to ones already existing in the database. The configurations of PIs can be saved as XML files and imported directly for re-use.

4.4 Discussion
The implementation of real sensor data provision for the Wuhan Weather Bureau shows that ReSDaP is suitable for sensor data acquisition, processing, and management in real time. Users can customize and extend the processing algorithm plug-ins based on their actual applications. Real-time data provision from the sensor web can be visualized, monitored, customized, extended, and controlled, since the ReSDaP architecture is flexible and broadly applicable.

5 Conclusions and Outlook

With the rapid development of hardware technology, the cost of sensors is decreasing, and an increasing number of smart sensors is used in a variety of applications, replacing human labor. Sensors are widely distributed around the earth, monitoring the environment of land, water, and air. Real-time data provision from sensor webs has broad application prospects and can also be extended to the field of the Internet of Things (IoT).
Addressing the characteristics of sensors, namely their large numbers, enormous data volumes, real-time nature, and long observation periods, ReSDaP is proposed to ingest sensor data, providing data support for fast responses to disasters and emergencies, spatio-temporal data modelling, and historical data acquisition and playback. This architecture solves problems in real-time data provision for the sensor web by making the provision of sensor data customizable and extensible. Provision Items (PIs) are used for acquiring, controlling, and managing data streams from sensor webs. The time configuration is flexible for different needs and applications. The provision procedures of PIs can be monitored, and sensors and their provision states are visible in a PI list table showing the attributes of the PIs.
The next step is to research the auto-discovery of sensor web services and the
smart ingestion of sensor data to spatio-temporal databases from the perspective of
semantic web.

Acknowledgements. This work is supported by the National High Technology Research and Development Program of China (Grant No. 2012AA121401). We also sincerely thank Dr. Kwan for important proposals for this paper, and Prof. Chen N.C. and Zhang W.J. for providing the Wuhan meteorology sensor web services. The authors would like to thank the other members of our project for their valuable advice and generous assistance.

References
1. Broering, A., Echterhoff, J., Jirka, S., Simonis, I., Everding, T., Stasch, C., Liang, S.,
Lemmens, R.: New Generation Sensor Web Enablement. Sensors 11, 2652–2699 (2011)
2. Chen, N., Di, L., Yu, G., Min, M.: A flexible geospatial sensor observation service for di-
verse sensor data based on Web service. ISPRS Journal of Photogrammetry and Remote
Sensing 64, 234–242 (2009)
3. Di, L.: Geospatial sensor web and self-adaptive Earth predictive systems (SEPS). In: Pro-
ceedings of the Earth Science Technology Office (ESTO)/Advanced Information System
Technology (AIST) Sensor Web Principal Investigator (PI) Meeting, San Diego, USA, pp.
1–4 (2007)
4. Lai, X., Liu, Q., Wei, X., Wang, W., Zhou, G., Han, G.: A survey of body sensor net-
works. Sensors 13, 5406–5447 (2013)
5. Ding, Z., Gao, X.: A Database Cluster System Framework for Managing Massive Sensor
Sampling Data in the Internet of Things. Chinese Journal of Computers 35, 1175–1191
(2012)
6. Lee, K.B., Reichardt, M.E.: Open standards for homeland security sensor networks. IEEE
Instrumentation & Measurement Magazine 8, 14–21 (2005)
7. Liu, Y., Gong, J., Guo, W.: Resource locating and selection for spatial image streaming on
P2P network. Acta Geodaetica et Cartographica Sinica 39, 383–389 (2010)
8. Diallo, O., Rodrigues, J.J.P.C., Sene, M.: Real-time data management on wireless sensor
networks: A survey. Journal of Network and Computer Applications 35, 1013–1021
(2012)
9. Wang, N., Shen, S., Shen, Z., Zhao, M.: A germplasm resources data collection and man-
agement geographic information system based heavy weight network. Measurement 1734-
2739 (2013)
10. Moro, G., Monti, G.: W-Grid: A scalable and efficient self-organizing infrastructure for
multi-dimensional data management, querying and routing in wireless data-centric sensor
networks. Journal of Network and Computer Applications 1218-1234 (2012)
11. Cristescu, R., Vetterli, M.: On the optimal density for real-time data gathering of spatio-
temporal processes in sensor networks. In: Proceedings of the 4th International Sympo-
sium on Information Processing in Sensor Networks, p. 21. IEEE Press (2005)
12. Shi, H.Y., Wang, W.L., Kwok, N.M., Chen, S.Y.: Game theory for Wireless Sensor Net-
works: a survey. Sensors 12, 9055–9097 (2012)
13. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: A survey on sensor networks.
IEEE Communications Magazine 40, 102–114 (2002)
14. Pokorny, J.: Database architectures: Current trends and their relationships to environmen-
tal data management. Environmental Modelling & Software 1579-1586 (2006)
15. MongoDB, Inc., http://www.mongodb.com/
16. Apache Software Foundation, http://couchdb.apache.org/
17. TriAGENS GmbH, http://www.arangodb.org/
18. Chen, J., Cao, J., Chen, Q.: ISHSN: An integration system for heterogeneous sensor networks. Journal of Computer Applications, 1191–1193, 1207 (2013) (in Chinese)
19. Chen, Q.-K., Lv, X.-M., Hao, J.-T., Zhang, Z., Zhuang, S.-L.: ChukwaX: A Heterogeneous
Data Access System for Internet of Things. Computer Engineering 12-15 (2012) (in Chi-
nese)
20. Golab, L., Özsu, M.T.: Issues in data stream management. ACM Sigmod Record 32, 5–14
(2003)
21. National Telecommunications & Information Administration: Data stream, http://www.its.bldrdoc.gov/fs-1037/dir-010/_1451.htm
GeosensorBase: Integrating and Managing Huge
Number of Heterogeneous Sensors Using Sensor
Adaptors and Extended SQL Querying

Min Soo Kim¹, Chung Ho Lee¹, In Sung Jang¹, and Ki-Joune Li²
¹ Spatial Information Research Laboratory, ETRI, Daejeon, Korea
{minsoo,leech,e4dol2}@etri.re.kr
² Department of Computer Engineering, Pusan National University, Pusan, Korea
lik@pnu.edu

Abstract. Recently, there has been much interest in observing sensor readings ranging from network cameras to small wireless sensor nodes.
The sensor web integrates such sensor readings and provides tremendous volumes of real-time sensor data. OGC SWE, Microsoft SenseWeb, and Oak Ridge National Lab Sensorpedia are the most representative sensor web platforms providing sensor data services. However, the existing platforms have some problems when building real-world application services. The first problem is the heterogeneity of sensors. Real-world services may need to integrate various kinds of sensors. Smart sensors that support web services can easily be integrated with a sensor web platform, but most sensors provide their readings in their own ways, for example over raw TCP/IP, and integrating such sensors is costly in time and money. The second problem is that the existing platforms only support limited query operations for simple filtering. In real-world services, however, flexible query operations of various kinds are expected to become more and more necessary to efficiently find the sensor readings of interest. Therefore, we propose a GeosensorBase system
sor readings of interest. Therefore, we propose a GeosensorBase system
that can easily integrate heterogeneous sensors and can support flexible
user requests. Specifically, we propose efficient sensor data management
methods and an extended query processor. Finally, in order to verify the
advantages of the GeosensorBase, we demonstrate a prototype for envi-
ronmental monitoring and a test bed for smart city monitoring. We think
the GeosensorBase can not only dramatically reduce the costs and the
efforts consumed in new sensor integration, but also can support pow-
erful query processing capabilities which are suitable for various kinds
of sensor applications. So, we argue that the GeosensorBase gives great
help to developers of sensor applications.

Keywords: Open Geospatial Consortium, Sensor Observation Service


(SOS), Sensor Planning Service (SPS), Sensor Web, GeoWeb, Geosensor.

1 Introduction
Recently, there has been much interest in observing sensor readings ranging from
network cameras to small wireless sensor nodes embedded in certain objects.


The sensor web integrates such heterogeneous sensor readings and provides tremendous volumes of real-time sensor data. The sensor web is increasingly attracting interest from various fields of application. OGC SWE [1], Microsoft SenseWeb [2], and Oak Ridge National Lab Sensorpedia [3] are the most representative sensor web platforms providing sensor data services.
However, the existing platforms have two major problems when building real-world application services. The first problem is the heterogeneity of sensors. Real-world services may need to integrate huge numbers and various kinds of sensors. Some smart sensors can provide their readings to the web using web services such as OGC SWE; these smart sensors can easily be integrated with a sensor web platform. However, most sensors provide their readings in their own ways, such as TCP/IP, HTTP, serial, and so on. The integration of such sensors is costly in time and money, and the cost recurs whenever a new sensor is added. The second problem is the limited query processing capability of sensor web platforms. The existing platforms only support limited query operations for simple filtering. In real-world services, however, as the number of sensors grows rapidly, flexible query operations of various kinds are expected to become more and more necessary to efficiently find the sensor readings of interest.
Therefore, in this paper, we propose a GeosensorBase system that can efficiently manage heterogeneous sensor readings using sensor adaptors and extended SQL querying. The proposed system provides a rapid and simple integration tool for various kinds of heterogeneous sensors. It also supports flexible query processing capabilities, irrespective of the processing capabilities of the heterogeneous sensors. Finally, we demonstrate a prototype and a test bed based on GeosensorBase.
The rest of the paper is organized as follows. Section 2 introduces related work on the sensor web and sensor network middleware. In Section 3, we present the proposed GeosensorBase solution, including the system design. Section 4 presents its implementation and application to test bed services. Finally, Section 5 presents our conclusion and future work.

2 Related Work

2.1 Sensor Web

OGC SWE [1], Microsoft SenseWeb [2], and Oak Ridge National Lab Sensorpedia [3] are the most representative sensor web platforms. They make sensor data accessible on the web. OGC SWE is a sensor web platform specifying interoperable interfaces and metadata encodings that enable the real-time integration of heterogeneous sensor data into the information infrastructure. Oak Ridge National Lab (ORNL)'s Sensorpedia is a web-based application consisting of a Google Maps interface where users can search and explore published sensor data. It enables individuals, communities, and enterprises to share, find, and use sensor data online. The Microsoft SenseWeb platform is an open and scalable infrastructure
for the sharing of sensor data streams. The platform mainly focuses on efficient user query processing of numerical sensor data through caching and indexing.
However, the existing platforms do not offer a solution to the problem of sensor heterogeneity. With the recent advancement of IoT (Internet of Things) technology, an explosive increase in sensor types is expected in the near future. Such sensors may provide their readings using various communication protocols (serial, TCP/IP, HTTP, SOAP) and various data formats (plain text, binary, XML, GML, SensorML, JSON). Therefore, in this paper, we propose a system that supports the rapid and simple integration of such heterogeneous sensors. The existing platforms also do not offer a solution to the problem of limited query operations. As the variety of IoT application services increases, an explosive increase in user requests to sensors is expected. In this paper, we therefore propose a sensor web system with flexible and extensible query operators that support various kinds of user requests.

2.2 Sensor Network Middleware


Various kinds of sensor network middleware such as Cougar [4], TinyDB [5],
COSMOS [6], MiLAN [7], DSWare [8] have been proposed for efficient sensor
data management. Especially, Cougar, TinyDB, and COSMOS have viewed a
sensor network as a virtual database system and have proposed query processors.
Recently, many researchers have been interested in spatial query processing, which acquires sensor readings from sensors inside a specified geographical area of interest. To efficiently process such spatial queries in sensor networks, many kinds of in-network spatial indexing methods have also been proposed: Peer-tree [9], SPIX [10], and GR-tree [11], which are based on the R-tree, and DQT [12] and DIST [13], which are based on the Quad-tree. In addition, there have been many works [14], [15], [16] on the distributed processing of spatial queries. Their experimental results showed that in-network spatial indexing and processing provide a meaningful reduction in the energy consumption of sensors. As a result, such in-network processing has the advantage of both reducing the energy consumption of sensors and filtering sensor readings inside the sensors.
Even though various kinds of in-network algorithms have been proposed, most existing middleware does not exploit the advantages of in-network processing. Such middleware performs query processing at a server after acquiring all sensor readings from the sensors, which is called centralized processing. Centralized processing, though simple, incurs high energy consumption in acquiring all sensor readings. A simple optimization is to distribute the query execution among the sensors. In this paper, we propose a system that makes full use of the in-network processing capabilities of sensors. The proposed system may perform query processing at a server or distribute it among the sensors, depending on the capabilities of the sensors.

3 Proposed System
In this section, we present the proposed GeosensorBase system, which manages heterogeneous sensors and supports flexible queries. We first present the system design, followed by the sensor management that integrates heterogeneous sensors, and finally the extended query processing that executes various kinds of user requests on sensors.

3.1 System Design


Our proposed system provides a rapid and simple integration solution for heterogeneous sensors. It also efficiently processes various kinds of queries in cooperation with the in-network query processing of sensors. Finally, the system provides users with the filtered sensor readings through a variety of interfaces. Figure 1 shows the system architecture composed of five components, namely the integration adaptor, the data manager, the query processor, the service interface, and the metadata manager. The system operates on the OSGi (Open Services Gateway initiative) framework, a service platform for the Java programming language that implements a complete and dynamic component model. Therefore, the five components, which come in the form of bundles for deployment, can be remotely installed, started, stopped, and updated without requiring a reboot of the whole system. The service registry allows bundles to detect the addition or removal of services. Finally, the OSGi framework makes it possible to dynamically integrate new sensors with our system simply by installing and starting a new sensor adaptor bundle. In other words, new sensors can be integrated directly with our system while the system is running.
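For illustration, dynamically installing and starting such an adaptor bundle at runtime uses the standard OSGi API roughly as follows; the bundle location string is a placeholder, not part of the paper.

import org.osgi.framework.Bundle;
import org.osgi.framework.BundleContext;
import org.osgi.framework.BundleException;

// Dynamically integrating a new sensor: install and start its adaptor bundle
// without restarting the rest of the system.
class AdaptorInstaller {
    static Bundle installAdaptor(BundleContext context, String bundleLocation)
            throws BundleException {
        Bundle adaptor = context.installBundle(bundleLocation); // register the bundle
        adaptor.start();   // activate it; its services appear in the service registry
        return adaptor;
    }
}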

[Fig. 1 shows the GeosensorBase architecture on the OSGi framework: the service interface (SOS/SPS interface and application programming interface), the query processor (query analyzer & optimizer, query executor, result manager), the metadata manager, the data manager (sensor data manager over database and memory), and the integration adaptor with sensor adaptors.]

Fig. 1. System Architecture of GeosensorBase



The integration adaptor is a tool that easily integrates any sensor using metadata about the sensor and a simple code implementation. Metadata means the vendor-specific sensor information, such as the type of sensor data, the sensor name, and the location of the sensor. Simple code implementation means programming a sensor adaptor that converts the vendor-specific data protocol into our system's data protocol. The integration adaptor automatically generates new sensor adaptor source code using the metadata of the newly integrated sensors. Therefore, the simple code implementation is just to rewrite a part of the sensor adaptor source code using the vendor-specific API. Details about this simple code implementation are described in Section 3.2, Heterogeneous Sensors Management.
The data manager stores sensor readings acquired from sensors and provides them to the query processor. By default, all sensor readings are temporarily stored in main memory. For sensor data logging, however, selected sensor data can be stored with timestamp information in an external database. Such logged sensor data are used to respond to queries on past sensor readings. The data manager also supports both pull-mode and push-mode sensors. Here, a pull-mode sensor provides the data manager with sensor readings as a response to a query, while a push-mode sensor periodically provides its readings to the data manager regardless of any query. In the push mode, the provided sensor readings are temporarily stored in main memory, and user queries on push-mode sensors are internally processed using the sensor data in main memory. Therefore, from the user's perspective, one does not need to consider whether a sensor is pull-mode or push-mode.
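As a simple illustration of this distinction (the types below are hypothetical and not the system's actual classes), push-mode readings can be buffered in main memory as they arrive, while pull-mode readings are requested only when a query needs them.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: push-mode readings are buffered in memory as they arrive,
// pull-mode readings are requested from the sensor (gateway) when a query needs them.
class DataManagerSketch {
    interface PullSensor { double read(); }   // responds to an explicit request

    private final Map<String, Double> latestPushed = new ConcurrentHashMap<>();

    // Called by a sensor adaptor whenever a push-mode sensor delivers a reading.
    void onPush(String sensorId, double value) {
        latestPushed.put(sensorId, value);
    }

    // A query obtains a value the same way in both modes.
    double readingFor(String sensorId, PullSensor pullSensorOrNull) {
        if (pullSensorOrNull != null) {
            return pullSensorOrNull.read();                          // pull mode: ask the sensor now
        }
        return latestPushed.getOrDefault(sensorId, Double.NaN);     // push mode: use the buffered value
    }
}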
The query processor supports spatial, event, and periodic queries in order to satisfy various kinds of user requirements on sensor readings. The query analyzer/optimizer parses and analyzes a query and then creates an optimized query plan, taking the processing capability of the sensor into account. For example, if a sensor is capable of processing a spatial query, the query plan including the spatial constraint is sent to the sensor. Otherwise, the query plan without the spatial constraint is sent to the sensor, and the spatial constraint is processed in the query executor of our system. The processing capability is an element of the metadata. The query executor manages the life cycles of many query plans, schedules periodic queries according to their defined time intervals, and sometimes performs spatial and numeric operations. The result manager asynchronously receives sensor readings for the queries that the query executor has scheduled. Finally, the query processor is implemented with a multi-threaded architecture based on multiple queues in order to efficiently support the processing of many asynchronous queries.
As the service interface, the proposed system first provides its own Java RMI interface for efficient Java programming. It is easy to use and provides powerful application programming interfaces: interfaces for extended SQL processing on sensors, for the metadata management of sensors, and for the management of virtual sensor groups. Second, the proposed system provides the web service interfaces OGC SOS [17] and SPS [18], where the web interfaces are internally implemented using the above Java RMI interfaces.
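The actual RMI interface signatures are not given in the paper; the following sketch is therefore purely illustrative of the kind of call a Java client might make to submit an extended SQL statement.

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;
import java.util.Map;

// Purely illustrative RMI-style interface; the real GeosensorBase API is not published here.
interface GeosensorService extends Remote {
    // Submits an extended SQL statement (periodic, event, or spatial query)
    // and returns one map of column name to value per matching sensor reading.
    List<Map<String, Object>> executeQuery(String extendedSql) throws RemoteException;
}

A client would look up such a remote object in an RMI registry and pass it statements like those shown later in Fig. 7.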

3.2 Heterogeneous Sensors Management


As mentioned above, in order to easily integrate and access heterogeneous sensors, an efficient sensor management method is required. In this section, we propose an access method for real sensors, a metadata management method for such sensor access, and finally a virtual sensor group creation method that supports the efficient management of many complex sensors.
Most sensors generally have their own gateways installed on a PC. The connections between the sensors and their gateways use wireless communication protocols such as ZigBee, IEEE 1451, Bluetooth, and so on. In order to access these remote sensors, our proposed system connects to the gateways by default and sends queries to them. A sensor adaptor of our system makes the connection and querying to a gateway easy; one sensor adaptor is needed for each gateway. From Figure 1, we can see that the integration adaptor supports the development of a new sensor adaptor by providing an automatic sensor adaptor source code generation wizard. The automatic source code generation dramatically reduces the costs and efforts consumed in new gateway integration, because the generated source code already includes all the implementation for data management, querying, and web services based on the metadata about the sensors. Only a simple implementation is needed, which requests sensor readings using the vendor-specific API of the sensors. Users do not have to worry about anything else when accessing the sensors. Figure 2 shows a wizard example for integrating a new gateway.
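The generated adaptor code itself is not reproduced in the paper; the sketch below, with hypothetical names, only indicates the shape of the one method a developer typically fills in with vendor-specific API calls.

import java.util.Map;

// Hypothetical shape of a generated sensor adaptor: everything except readFromGateway()
// would be produced by the wizard from the sensor metadata.
abstract class GeneratedSensorAdaptor {
    protected final String gatewayAddress;   // taken from the metadata

    protected GeneratedSensorAdaptor(String gatewayAddress) {
        this.gatewayAddress = gatewayAddress;
    }

    // The only part a developer writes: call the vendor-specific API of the gateway
    // and map its response to the system's column names (e.g., "temperature").
    protected abstract Map<String, Object> readFromGateway();

    // Generated part: hands the readings on to the data manager and query processor (omitted here).
    public final Map<String, Object> acquire() {
        return readFromGateway();
    }
}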

Fig. 2. Wizard for a Sensor Adaptor Source Code Generation



As shown in Figure 2, the wizard generates Java source code and supports
various kinds of protocols such as HTTP, TCP, SOAP, and so on. The wizard
for a new sensor adaptor generates the source code using the metadata on the
newly added sensor. Therefore, defining and writing the metadata is a necessary part of the new sensor integration process. In this system, we define metadata that includes vendor-specific information, such as the sensor ID, the sensor location, the type and name of the sensor readings, and the sensor's query processing capability. Figure 3 presents a metadata example for one sensor group, called GS1, edited in XML.

1: <?xml version="1.0" encoding="UTF-8"?>


2: <geosensorNetworks xmlns="http://ugis/gsn/metadata">
3: <geosensorNetwork gsnID="GS1">
4: <feature operatable="true" eventable="true" periodicable="true"
5: aggregatable="true" spatialable="true" queryCount="1" gsnMode="pull"/>
6: <nodeSet>
7: <node id="100" name="Node 100" x="100" y="100"/>
8: <node id="101" name="Node 101" x="150" y="150"/>
9: </nodeSet>
10: <sensorSet>
11: <sensor id="1" name="temperature" type="string"/>
12: <sensor id="2" name="moisture" type="float"/>
13: <sensor id="3" name="pressure" type="float"/>
14: </sensorSet>
15: </geosensorNetwork>
16: </geosensorNetworks>

Fig. 3. XML Representation of a Sensor Group (GS1) Metadata

The metadata in Figure 3 means that the sensor group GS1 supports comparison operator processing, event query processing, periodic query processing, aggregate query processing, and even spatial query processing. It also means that GS1 supports one in-network query at a time and works in the pull mode, returning sensor readings in response to a given query. Finally, we can see that GS1 has two sensor nodes, 100 and 101, and that each node provides three sensor readings, temperature, moisture, and pressure, with the type information string and float.
In addition to this simple sensor management method, our system supports the management of complex sensors through virtual sensor group creation. Virtual sensor group creation logically integrates two or more heterogeneous sensors into one virtual sensor group. It is used when complex and heterogeneous sensors whose readings have the same meaning are to be managed efficiently as one sensor group. Such a virtual sensor group can be created like a view in a conventional DBMS. Figure 4 shows a virtual sensor group creation example between GS1 sensors and GS2 sensors.

GS1:
ID    temperature  moisture  location
100   19.2         45.3      (3, 5)
101   19.8         43.6      (5, 7)
...   ...          ...       ...

GS2:
ID    temperature  pressure  humidity
200   21.3         62.3      42.8
201   22.3         61.5      41.5
...   ...          ...       ...

VS1 (Virtual Sensor Group 1) creation:
• GS1.temperature, GS2.temperature → T
• GS1.moisture, GS2.humidity → M
• Operator: GS1.moisture > 30

VS1:
ID    T     M
100   19.2  45.3
101   19.8  43.6
200   21.3  42.8
201   22.3  41.5
...   ...   ...

Fig. 4. Virtual Sensor Group Creation between GS1 and GS2

From Figure 4, we can see a virtual sensor group VS1, where GS1's temperature and moisture columns and GS2's temperature and humidity columns are mapped into VS1's T and M columns, respectively. Various kinds of virtual sensor groups can be created simply by selecting specific sensors that satisfy spatial predicates and selection predicates such as GS1.moisture > 30. Figure 5 shows an XML representation that creates the example VS1 of Figure 4.

1: <?xml version="1.0" encoding="UTF-8"?>


2: <viewSet xmlns="http://ugis/gsn/metadata">
3: <view viewName="VS1">
4: <viewColumn gsnID="GS1" column="temperature" viewColumnName="T"/>
5: <viewColumn gsnID="GS1" column="moisture" viewColumnName="M"/>
6: <viewColumn gsnID="GS2" column="temperature" viewColumnName="T"/>
7: <viewColumn gsnID="GS2" column="humidity" viewColumnName="M"/>
8: <operator gsnID="GS1" opType="1" column="moisture" operator=">" operand="30"/>
9: </view>
10: </viewSet>

Fig. 5. XML Representation of a Virtual Sensor Group (VS1)

3.3 Extended Query Processor

Sensor applications may need various kinds of filtering functions [19]. Many researchers have therefore noted the benefits of a query processor-like interface to sensors: it is easy to use and powerful enough to represent various kinds of user requests. In this section, we propose an extended query processor for sensor readings, called EQP (Extended Query Processor). The EQP supports special query processing capabilities for the efficient acquisition of sensor readings, besides the default SQL capabilities. It supports periodic query processing and event query processing as well as spatial query processing. A periodic query acquires sensor readings periodically, according to the given period and lifetime. An event query acquires sensor readings only when the given event occurs. Figure 6 shows our summarized grammar of EQP.

1: SQLSelect() :
2: [“EVENT ON” SQLOrExpr()]
3: “SELECT” SQLSelectColumn()
4: (LOOKAHEAD(2) “,” SQLSelectColumn())*
5: “FROM” SQLTableList()
6: [SQLWhere()] [SQLGroupBy()] [SQLPeriodAndLifetime()]
7: SQLOrExpr() :
8: ......
9: SQLPeriodandLifetime() :
10: (“PERIOD” <INTEGER> [SQLTimeUnit()]
11: “FOR” <INTEGER> [SQLTimeUnit()])
12: SQLTimeUnit() :
13: (“SECOND” | “MINUTE” | “HOUR”)

Fig. 6. Summarized Grammar of the Extended Query Processor

An event query (EVENT ON) is defined in line 2, where the event conditions are included in SQLOrExpr(). The SQLWhere() syntax in line 6 includes not only comparison operators but also spatial operators, such as Within, Contains, Overlaps, Intersects, Touch, Disjoint, Crosses, and Equals. Lines 9–13 define a periodic query requiring period and lifetime input. Figure 7 shows several query examples based on the EQP grammar.

(1) SELECT id, temperature, humidity FROM GS WHERE (temperature >= 20 AND
temperature <= 25) PERIOD 10 MINUTE FOR 24 HOUR
(2) EVENT ON(temperature >= 100 AND id = 2 ) SELECT id, temperature, location FROM GS
(3) SELECT id, temperature, humidity, location FROM GS1, GS2, GS3, GS4 WHERE
temperature > 30 AND WITHIN (location, POLYGON (10 10, 10 40, 40 40, 40 10, 10 10))
(4) SELECT A.id, A.humidity, A.location, B.id, B.co, B.location FROM GS1 AS A, GS2 AS B
WHERE DISTANCE(A.location, B.location) < 30 AND A.humidity > 60 AND B.co > 8

Fig. 7. Query Examples that Requests Sensor Readings

Query (1) shows an example of a periodic query that acquires id, temperature, and humidity from GS every 10 minutes for 24 hours under the condition 20 ≤ temperature ≤ 25. Query (2) shows an example of an event query that performs "SELECT id, temperature, location FROM GS" whenever sensor 2's temperature exceeds 100. Queries (3) and (4) show a simple spatial query and a distance join query, respectively.
However, not all sensors support the above query processing: some smart sensors can process all of these queries internally, while other, more limited sensors cannot perform the queries at all. Therefore, we have to create a proper query plan while carefully considering the in-network query processing capability of the sensors. With respect to query processing capability, we classify sensors into five types in this paper, determined by whether the capabilities "operatable, eventable, periodicable, aggregatable, spatialable" in Figure 3 are supported or not. For example, for query (3), various kinds of query plans can be created depending on the capability of each GSi, as in Figure 8.

Metadata of GS1: operatable = "true", eventable = "true", periodicable = "true", aggregatable = "true", spatialable = "true"
Query plan QP1: Query ID: 1; Query Type: Periodic; Columns: id, temperature, humidity, location; Operator: temperature > 30; Spatial Operator: Within(location, Polygon(10 10, 10 40, 40 40, 40 10, 10 10)); Period: 5 Minute, Lifetime: 2 Hour

Metadata of GS2: operatable = "true", eventable = "true", periodicable = "false", aggregatable = "true", spatialable = "true"
Query plan QP2: Query ID: 1; Query Type: Snapshot; Columns: id, temperature, humidity, location; Operator: temperature > 30; Spatial Operator: Within(location, Polygon(10 10, 10 40, 40 40, 40 10, 10 10)); Period & Lifetime: null

Metadata of GS3: operatable = "true", eventable = "true", periodicable = "false", aggregatable = "true", spatialable = "false"
Query plan QP3: Query ID: 1; Query Type: Snapshot; Columns: id, temperature, humidity, location; Operator: temperature > 30; Spatial Operator: null; Period & Lifetime: null

Metadata of GS4: operatable = "false", eventable = "true", periodicable = "false", aggregatable = "true", spatialable = "false"
Query plan QP4: Query ID: 1; Query Type: Snapshot; Columns: id, temperature, humidity, location; Operator: null; Spatial Operator: null; Period & Lifetime: null
Fig. 8. Creation of Query Plan Depending on the Capabilities of Each GSi

From Figure 8, we can see that different query plans are transmitted to each GSi for the given query (3). This work is carried out by the query optimizer of our extended query processor using each GSi's metadata. First, the query plan QP1 is transmitted to GS1, since GS1 supports periodic query processing, comparison operations, and spatial operations. For GS2, which cannot support periodic query processing, QP2 is transmitted instead, with a snapshot query type and null values for period and lifetime; the remaining periodic predicates are executed at our query processor. Specifically, the query executor periodically schedules QP2 with the given period of 5 minutes over a lifetime of 2 hours. For GS3, the query executor transmits QP3, which has a null value for the spatial operator, and then executes the periodic predicates and spatial operators itself. Finally, for GS4, the simple query plan QP4, containing only the column data, is transmitted, and the remaining periodic condition, comparison operators, and spatial operators are executed at the query executor. We note that query plan creation and query execution are processed internally by the query processor, so that clients do not have to consider the in-network processing capabilities of the sensors. In other words, all user queries can be processed regardless of the in-network processing capabilities of the sensors.
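As a schematic view of this capability-dependent plan creation (the types below are illustrative, and the real optimizer handles more cases), predicates that a sensor can evaluate are pushed into its plan, while the rest remain at the server.

// Schematic sketch of capability-dependent plan splitting; Capability and QueryPlan
// are illustrative types, not the system's actual classes.
class PlanSketch {
    static class Capability {
        boolean operatable, periodicable, spatialable;   // subset of the metadata flags
    }
    static class QueryPlan {
        String type = "Snapshot";     // becomes "Periodic" only if the sensor supports it
        String operator;              // comparison predicate pushed to the sensor, or null
        String spatialOperator;       // spatial predicate pushed to the sensor, or null
        boolean serverSidePeriod;     // true if the server must schedule the period itself
        boolean serverSideOperator;   // true if the server must evaluate the comparison
        boolean serverSideSpatial;    // true if the server must evaluate the spatial part
    }

    static QueryPlan planFor(Capability c, String operator, String spatialOperator) {
        QueryPlan p = new QueryPlan();
        if (c.periodicable) p.type = "Periodic"; else p.serverSidePeriod = true;
        if (c.operatable) p.operator = operator; else p.serverSideOperator = true;
        if (c.spatialable) p.spatialOperator = spatialOperator; else p.serverSideSpatial = true;
        return p;
    }
}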

4 Implementation and Test Bed

In this section, we present our implementation and its application to actual services. We first present GeosensorBase Studio, which provides an integrated development environment (IDE) for the rapid integration and efficient management of heterogeneous sensors. Then the actual implementation of the proposed GeosensorBase system, composed of the integration adaptor, the data manager, the query processor, the service interface, and the metadata manager, follows. Finally, we show our simulated prototype for environmental monitoring and a test bed example for smart city monitoring.

4.1 GeosensorBase Studio


GeosensorBase Studio is a graphical authoring tool for the integration and management of heterogeneous sensors. It provides developers with an Eclipse-based integrated development environment (IDE), enabling them to leverage the entire ecosystem of Eclipse plug-ins. GeosensorBase Studio builds on the existing JDT (Java Development Tools), PDE (Plug-in Development Environment), WTP (Web Tools Platform), WST (Web Standard Tools), and BIRT (Business Intelligence and Reporting Tools). Here, we add our integration adaptor, data manager, query processor, service interface, and metadata manager to Eclipse.
GeosensorBase Studio supports various kinds of graphical tools for the efficient use of the GeosensorBase system: graphical tools for sensor adaptor creation as in Figure 2, map-based metadata editing, and complex query input. Besides the graphical tools, it supports basic monitoring functions on actual sensor readings as well as sensor simulation functions. Figure 9 shows examples of metadata editing and complex query processing.

[Fig. 9 shows two panels: metadata editing based on Google Maps (left) and complex query processing for a CCTV sensor (right).]

Fig. 9. Metadata Editing and Complex Query Processing based on GeosensorBase Studio

Our proposed GeosensorBase system was implemented using JDK 1.6, the Spring framework, and the OSGi framework on Windows 7. In the data manager, the external database connection for sensor data logging was implemented using JDBC; we tested Apache Derby and Oracle 11g as the external database. In the query processor, parsing and execution of the extended SQL were implemented using the Java Compiler Compiler (JavaCC) parser generator. In particular, the query executor and the result manager were implemented with asynchronous execution and thread pooling for the efficient processing of a huge number of queries. Finally, the service interface was implemented to support OGC SOS 1.0, SPS 1.0, and the extended SQL API.

4.2 Prototype and Test Bed


In order to verify the GeosensorBase system, we first implemented a prototype for environmental monitoring at the laboratory level. In this prototype, we used four kinds of heterogeneous sensor groups. The first sensor group was composed of four sensors made by Crossbow, and the second sensor group of two sensors made by ArchRock; these two sensor groups acquire the same sensor readings of temperature, humidity, and illumination. The third sensor group was composed of four sensors that measure object velocity. Finally, a network camera sensor made by AXIS was used. Figure 10 shows the software architecture and hardware configuration of the prototype.

Fig. 10. Software Architecture and Hardware Configuration of the Prototype

As shown in Figure 10, we created four sensor adaptors for the AXIS, Crossbow, ArchRock, and ETRI sensors. This creation process is easily executed using the sensor adaptor source code generation wizard mentioned in Section 3.2. Sensor readings from the sensor adaptors are then processed and managed by the data manager and the query processor. Finally, the sensor readings are sent to a geospatial engine through a web service or an open API. The geospatial engine provides an environmental monitoring service by integrating the sensor readings with a map. In the hardware configuration, an LCD panel displays the map, and various kinds of wireless sensors are placed over the panel. In this example, we request two queries
of a periodic query and an event query. One is to periodically acquire sensor readings of temperature, humidity, illumination, and velocity. The other is to
acquire all sensor readings and video information of the region having high illu-
mination whenever the illumination value is unusually high. The query results
can be directly displayed and charted on the map of the panel. Besides these
queries, various types of user requests can be processed in this prototype.
Second, we built a test bed example for smart city monitoring. This test bed is intended to verify functionality and to perform conformance testing between GeosensorBase and actual outdoor sensors. The target area of the test bed was Sejong Special Self-Governing City in Korea. Eight kinds of heterogeneous sensor groups with about 246 sensors were deployed: a sensor package for ground facilities monitoring, a sensor package for underground facilities monitoring, a sensor package for watershed monitoring, an RTLS (Real-Time Locating System) sensor package for construction site monitoring, a sensor package for soil monitoring, a WSN package for environmental monitoring, and a vehicle sensor for atmospheric monitoring. In this test bed, our GeosensorBase efficiently integrated and managed the heterogeneous sensor readings of these groups and provided the sensor web services SOS and SPS. From this test bed, we can note that the integration of heterogeneous sensors was easy and simple, the acquisition method for sensor readings was powerful, and the standard sensor web services were directly available. Figure 11 shows a screenshot of the test bed example.

[Fig. 11 shows test bed screenshots: vehicle monitoring, underground facilities monitoring, sensor installation for road facilities, and facilities monitoring.]

Fig. 11. Test Bed Example Based on GeosensorBase System

5 Conclusions
We presented the GeosensorBase system that can integrate heterogeneous sen-
sors and support flexible user requests. Specifically, we presented efficient sen-
sor data management methods of the GeosensorBase, such as sensor adaptor
creation wizard, metadata editing method, and virtual sensor group creation
method. We also presented an extended query processor that can support var-
ious kinds of queries, such as periodic query, event query, and spatial query.
Finally, we presented two implementations, a prototype and a test bed, to verify the advantages of the GeosensorBase.
We think the main contributions of the GeosensorBase are as follows. First, GeosensorBase can dramatically reduce the costs and efforts consumed in new sensor integration; the sensor adaptor creation wizard based on XML metadata makes such integration easy. Second, GeosensorBase can efficiently manage a large number of heterogeneous sensor groups using logical sensor groups. Third, it supports powerful query processing capabilities that are suitable for various sensor applications, and these capabilities are available regardless of the capabilities of the sensors. Finally, it offers an easy-to-use toolkit, GeosensorBase Studio, which can also dramatically reduce developers' effort and time. For the test bed example, developers could integrate, query, and serve the eight kinds of heterogeneous and unfamiliar sensor groups within a week by using this toolkit. We therefore think that GeosensorBase is well suited to building various kinds of useful sensor applications in the field.
As remaining work, we would like to embed our sensor adaptor inside the sensors themselves. This would enable GeosensorBase to connect heterogeneous sensors directly, without a gateway, since the sensor adaptor inside a sensor can operate as a software gateway. Finally, it would make it possible to directly integrate a huge number of sensors, such as smartphones, in the near future.

Acknowledgment. This work was supported by the IT R&D program of MOTIE/KEIT [10041790, Development of Advanced Ship Navigation Supporting System based on Oncoming International Marine Data Standard].

ForestMaps: A Computational Model
and Visualization for Forest Utilization

Hannah Bast, Jonas Sternisko, and Sabine Storandt

Department of Computer Science, University of Freiburg, Germany


{bast,sternis,storandt}@informatik.uni-freiburg.de

Abstract. We seek to compute utilization information for public spaces,


in particular forests: which parts are used by how many people. Our con-
tribution is threefold. First, we present a sound model for computing this
information from publicly available data such as road maps and popu-
lation counts. Second, we present efficient algorithms for computing the
desired utilization information according to this model. Third, we provide
an experimental evaluation with respect to both efficiency and quality, as
well as an interactive web application that visualizes our results as a heat-
map layer on top of OpenStreetMap data. The link to our web application
can be found at http://forestmaps.informatik.uni-freiburg.de.

1 Introduction
Recreation is an important part of human life. Most people spend a significant
fraction of their recreation time in public spaces such as forests, parks, or zoos.
For the authorities of these public spaces, it is important to have utilization
statistics about which parts of these spaces have been visited or are going to be
visited by how many people.
Such utilization statistics are useful for a number of purposes, for example,
for the prioritization of maintenance works, or for selecting proper locations for
new construction works (e.g. a look-out or an inn) or facilities (e.g. litter bins).
Forests, in particular, are also used for purposes other than recreation, most no-
tably for logging and preservation. Here past and projected visitor information
helps to find a meaningful assignment of the various parts of the forest to the var-
ious purposes. Indeed, the original motivation for this paper was a request from
the German forest authorities to compute such usage statistics for exactly these
reasons.
Our problem could be easily solved if we could track the movements of each
visitor in the area of interest. But for a large number of visitors this is practically
infeasible, and it would also be a major privacy issue. Instead, our approach is
to come up with a computational model for how many people move through an
area of interest on which paths. The input for this model should be publicly
available data. Once computed, we visualize our usage statistics as a heat map,
overlaid on top of a standard map.


Fig. 1. Overview of the complete pipeline for our utilization distribution generator on
the example of public forest areas. Pipeline stages: extract population data for
cities/counties (Wikipedia, GeoNames) and map it to vertices in the street graph;
extract forest areas, paths, waters, places of interest, and street data from
OpenStreetMap; compute forest entry points; compute travel times from all vertices in
the street graph to close-by entry points; determine the popularity of entry points;
identify probable round-tours and tours through the forest; combine everything into
the forest utilisation heat map.

1.1 Contribution

The contribution of this paper is threefold. First, we present a new model


for utilization statistics of (paths in) public spaces based on publicly avail-
able data such as road maps and census data. We also show how to incor-
porate additional information like survey data (e.g. the mean time spent in
the forest) or points of interest (e.g. look-outs or inns). Second, we developed
and implemented a pipeline of efficient algorithms that uses this input data to
compute the desired utilization statistics. Third, we provide an experimental
evaluation on various data sets with respect to both efficiency and quality, as
well as an interactive web application that is freely available. In particular, we
created a web site which provides zoomable heat maps for all our data sets. The
link, the details behind the heat map realization, as well as all our data sets
and code (thus enabling full reproducibility of our results) can be found here:
http://forestmaps.informatik.uni-freiburg.de.
Throughout the paper, we will consider forest areas as a representative of pub-
lic spaces. Forest areas are particularly hard to deal with, harder than parks
or zoos, for a number of reasons.
number of potential entry points, and it becomes part of our problem to deter-
mine these. Second, access to forest areas is usually unrestricted and there are
no ticket booths or similar facilities, which could provide historical data on how
many people entered at that particular entry point. Third, forest areas are often
large, which entails a very large number of possible paths and round-tours. Since
we want our algorithms to be efficient, we cannot simply enumerate these paths,
but have to resort to more sophisticated solutions.
The main steps of our pipeline are as follows. A schematic illustration is
provided in Figure 1 above.

(1) Given a road map and the boundaries of the forest areas, compute the set
of forest entry points efficiently.
(2) Given the population number of a whole area, compute a sound estimate of
the distribution of inhabitants inside that area.
(3) For each forest entry point, use the road map and the result of (2) to estimate
the number of people that are likely to use that entry point.
(4) Extract a representative set of routes and round-tours within the forest areas
and estimate their relative attractiveness.
(5) Combine the information from (3) and (4) to estimate which parts of the
forest areas are utilized to which extent.
(6) Visualize the utilization information from (5) in an intuitive and interactive
manner in a web application.

We describe each of these steps in detail in the following. Steps (1)-(3) are
explained in Section 2. In Section 3, we propose two approaches for route and
round-tour extraction as required in step (4), and show how step (5) can be
accomplished on that basis. In Section 4, we explain how additional information
like points of interest and survey data can be incorporated. Although this in-
formation is not crucial for our pipeline, it can enrich the model if available. In
Section 5, we provide the setup and results of our experimental evaluation on
three data sets, namely the road maps and forest areas of Germany, Austria, and
Switzerland. In our evaluation, we consider both efficiency and quality. For our
largest data set, the whole of Germany, our pipeline can be completed in about
two hours. To estimate the quality, we compare our utilization information with
GPS traces extracted from OpenStreetMap.

1.2 Related Work

From an algorithmic point of view, we are facing two main challenges. (1) Map-
ping population data given for an area to individual locations inside that area.
(2) Computing a set of meaningful paths in the forest and their attractiveness.

Challenge (1) has been addressed in [1]. Here, aerial photographs are used as
a basis to detect buildings in an area, and to extract building features like a
building’s footprint and its height. Given these characteristics, together with a
pre-defined classification of buildings, the number of residents per block is com-
puted. This approach is refined further by additionally considering city maps
that distinguish between industrial and residential areas. This leads to a sophis-
ticated multi-step algorithm that achieves very high accuracy. For over 90% of all
buildings, the number of inhabitants is estimated correctly within a tolerance of
1 person for houses and 5 persons for apartments. The drawback of this approach
is that it requires very sophisticated input data (in particular, high-resolution
aerial images), which is not available for many areas. Also, this input data is
large and complex and very time-consuming to compute with. In comparison,
our approach is much simpler and uses widely available data like road maps and
census information for whole countries. Thus we cannot, of course, estimate the
number of residents for individual buildings. But we achieve good accuracy for
estimating population distribution within sub-areas, see Section 2.4. And this is
all we need for our purpose here: more fine-grained information would not help
us to compute better utilization statistics.
Concerning challenge (2), several approaches to determine “nice” routes inside
a given area have been developed. In [2], the problem of finding good jogging
routes is investigated. It is first shown that an exact version of the problem
is NP-complete. Then several simple and fast heuristics are proposed, which
return useful routes in practice. These heuristics take as input a road map,
attractiveness estimates for sub-areas, and a desired route length. In [3], a similar
problem is addressed, namely tour suggestions for outdoor activities. The input
there is a maximal tour length together with a tolerance value. The algorithm
is based on spatial filtering and the computation of concatenations of shortest
paths.
In both [2] and [3], the goal is to find a few good routes or round-tours, or even
just a single one. In contrast, for our problem we need to compute a compre-
hensive set of meaningful tours, from which we can then estimate the desired
utilization statistics of the whole area. Previous work that comes slightly closer
to this task is the computation of alternative routes in street networks. In [4], for
example, the via-node approach is introduced. Here, all shortest paths from a
given start s to a given target t via a third node v are computed, for every node
v in the graph. Then a representative subset of paths is selected via criteria such
as route length and spatial properties. We adapt the basic idea of this approach
and turn it into a via-edge approach. We employ this approach to determine the
degree of utilization of an edge inside the forest.

2 Computation of Entry Points and Their Popularity

In this section, we describe how to compute entry points and their popularity
based solely on freely available data from OpenStreetMap. We already remarked
above that for areas which require an entry fee, like amusement parks or zoos, this
information about entry points is usually available. Entry points are then equal
to ticket booth positions and their popularity can be measured by the number of
tickets sold. For freely accessible grounds like forests such data usually does not
exist, and we need to estimate it by different means. To determine potential entry
points, we compute the boundary polygon of an area and intersect it with the
given path network. To determine the popularity of an entry point, we consider
the population distribution in the surrounding area and the reachability of each
entry point. Both of these are non-trivial procedures, and are described in more
detail at the end of the section.

2.1 Extracting Street Networks and Forest Areas from


OpenStreetMap

We evaluate our algorithms on data from OpenStreetMap (OSM). This project


provides geographical information for nodes (in particular: latitude and longi-
tude), polygonal paths (so called “ways”, referencing previously defined nodes)
and compositions thereof (“relations”, referencing sets of nodes, ways or rela-
tions). The OSM data is provided in XML format. Each entity can have several
attached tags, which are tuples of the form key:value. We parse all nodes, ways,
and relations from the relevant OSM files and translate relations and ways to
sequences of coordinates. We then build the road network from ways with a
highway:* tag. We generate the forest areas from entities with tags landuse:forest
and natural:wood. Polygons with tag boundary:administrative are retrieved as
boundaries of municipalities. Furthermore, we select nodes whose tags match
certain combinations of tags as points of interest (POIs). For example, places
with the tags man_made:tower and tourism:viewpoint are considered as POIs.
Note that our algorithms are independent of the particular data format and can
also be applied when the data is given in other common GIS formats like ESRI
shape files.
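For illustration, a minimal sketch of this extraction step (not the authors' parser; it assumes a plain OSM XML extract and uses only the tag keys named above) could classify ways and relations with the standard library XML parser as follows:

```python
# Illustrative sketch: classify OSM ways/relations by the tags named above.
# Only ids are collected; node references and geometry handling are omitted.
import xml.etree.ElementTree as ET

def classify_osm_entities(osm_file: str):
    """Return ids of highway ways, forest/wood areas, and admin boundaries."""
    highways, forests, boundaries = [], [], []
    for _, elem in ET.iterparse(osm_file, events=("end",)):
        if elem.tag not in ("way", "relation"):
            continue
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        if "highway" in tags:
            highways.append(elem.get("id"))
        if tags.get("landuse") == "forest" or tags.get("natural") == "wood":
            forests.append(elem.get("id"))
        if tags.get("boundary") == "administrative":
            boundaries.append(elem.get("id"))
        elem.clear()  # keep memory bounded on large country-sized extracts
    return highways, forests, boundaries
```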

2.2 Computing Forest Entry Points (FEPs)

The pre-processing of our input data, described above, provides us with the
network of all paths, as well as polygons that bound the forest areas. We then find
all edges from this network that intersect one of these polygons. The intersection

Fig. 2. Detail of a street network combined with forest areas (green). Red nodes high-
light forest entry points. They are outside the forests, but have an adjacent edge that
crosses a forest boundary.

points are then our forest entry points (FEPs). Both the path network and
the polygons consist of line segments. Our computation therefore reduces to
computing the intersection between pairs of line segments. This can be done
easily in constant time per pair. However, the naive approach of intersecting each
path segment with each polygon segment would take “quadratic” time (number
of path segments times number of polygon segments). This could be reduced by
simple pruning techniques, for example, considering only path segments that lie
in the bounding box of a forest area. But even then the computational effort
would be too large. Also, forest areas extracted from OSM sometimes overlap,
and the described pruning only works for disjoint areas. Identifying and merging
overlapping forest areas first is again time-consuming.
We speed up this computation as follows. Instead of searching for intersections
of line segments, we check for each node of the network if it falls inside a forest
polygon or not. Using this classification, we iterate over all nodes in the graph
and check for nodes outside the forest if they have an adjacent edge towards a
node inside the forest. Such edges determine forest entry points. For the sake of
simplicity, we do not add a new node for the crossing of the path with the forest
boundary but instead use the last node on a path that is still outside of the
forest. See Figure 2 for a classification example. To handle the large numbers of
nodes for which the membership test has to be done, we rasterize the polygon to
a bit array of sufficiently fine resolution (we used a precision of 10 × 10 meters
per bit in our experiments). Consider this as an image in which the forest areas are
painted white and the remaining parts black. The membership test for a
node is then reduced to a constant-time lookup of the pixel value at that
node’s coordinates. For our largest data set (Germany), we can thus compute
all forest entry points in a fraction of the time needed for the whole pipeline (3
minutes of about 2 hours); see Table 3 in Section 5.
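A minimal sketch of this rasterized entry-point test, under assumed data structures (a hypothetical node-coordinate map, adjacency lists, and a precomputed bit raster; not the authors' code), could look as follows:

```python
# Sketch of FEP detection: a node outside the forest is an entry point if it
# has an adjacent edge leading to a node inside the forest.
from typing import Dict, List, Tuple

def find_forest_entry_points(
    coords: Dict[int, Tuple[float, float]],   # node id -> (x, y) in meters
    adjacency: Dict[int, List[int]],          # node id -> neighboring node ids
    inside_forest: List[List[bool]],          # raster: inside_forest[row][col]
    cell_size: float = 10.0,                  # ~10 x 10 m per raster cell
) -> List[int]:
    def is_inside(node: int) -> bool:
        x, y = coords[node]
        row, col = int(y // cell_size), int(x // cell_size)
        return inside_forest[row][col]        # constant-time pixel lookup

    feps = []
    for v in adjacency:
        if is_inside(v):
            continue                          # FEPs lie outside the forest
        if any(is_inside(w) for w in adjacency[v]):
            feps.append(v)                    # some edge (v, w) crosses the boundary
    return feps
```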

2.3 Incorporating Population Data


The next question is how many people are likely to use each of these FEPs. To
answer that question, we need to know the population distribution in the sur-
rounding area. Population numbers are widely available for larger administrative
units (states, cities, villages), for example from the Wikipedia info boxes1 or from
GeoNames2 . Using such data for our purposes entails two main challenges.
First, we need to know the boundary of the area to which a particular pop-
ulation pertains. Both kinds of data are available, but usually from different
sources. Getting the proper assignment is non-trivial, because of different data
schemas and variations in naming and spelling. We solve this problem by using
an approximate match on a carefully selected set of name tags.
Second, we need more fine-grained information than just the population of a
whole village. Specifically, we need an estimate of the number of people that live
near each line segment from the given path network. We here make the following
assumption, which we later validate in our experiments in Section 5.3.
1
http://de.wikipedia.org/wiki/Saarland
2
http://www.geonames.org/2842635/saarland.html

Assumption 1. The population number in an area is strongly correlated to the


accumulated length of local streets in this area. In other words: the population
density and the density of local streets coincide.
The intuition behind this assumption is that every house is typically close to
some street, while the length of a street provides a good upper bound on the
number of houses there and the density of houses does not vary too much in
typical residential areas.

Fig. 3. Left: a small artificial example of a street Voronoi diagram. There is one distinct
color for each vertex, and all the parts of edges belonging to the Voronoi cell of that
vertex are drawn in that color. Right: a real-world population distribution based on
such a street Voronoi diagram, where larger circles indicate higher population numbers.

We make use of this assumption as follows. For every street vertex, we compute
the sum of the lengths of the street segments for which this vertex is the closest
one. This is simply half of the sum of the lengths of all adjacent edges. From a
geometric point of view, this amounts to computing a Voronoi diagram for all
vertices and summing up the street lengths inside the respective Voronoi cells.
For one-way segments we map the whole length to the tail node of that segment
(since all residents must leave via that node). An example is given in Figure 3,
left side. The running time of this approach is linear in the number of vertices
and edges in the street graph. It hence scales well to large networks. Dividing
the computed sum of lengths for every vertex by the total sum of all edge lengths,
we obtain percentage values. Multiplying these with the total population of the
whole area results in an individual population estimate for each vertex. This is
illustrated in Figure 3, right side. Types of streets that are normally
unpopulated, such as motorways, are simply excluded from this procedure.
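The following sketch illustrates this street-Voronoi population mapping under assumed inputs (an edge list with lengths and one-way flags); it is an illustration, not the authors' implementation:

```python
# Sketch: distribute a region's total population over street vertices in
# proportion to the street length closest to each vertex.
from collections import defaultdict
from typing import Dict, List, Tuple

def map_population_to_vertices(
    edges: List[Tuple[int, int, float, bool]],  # (u, v, length_m, one_way)
    total_population: float,
) -> Dict[int, float]:
    length_share = defaultdict(float)
    for u, v, length, one_way in edges:
        if one_way:
            length_share[u] += length            # all residents leave via the tail node
        else:
            length_share[u] += length / 2.0      # half of the edge to each endpoint
            length_share[v] += length / 2.0
    total_length = sum(length_share.values())
    return {v: total_population * share / total_length
            for v, share in length_share.items()}
```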

With the procedure as described so far, there is still some imbalance due to
different building density along streets in different areas. For example, multi-
story buildings with a large number of people are more likely in a city center,
whereas houses in remote areas tend to be sparser and to have fewer inhab-
itants. In an extreme case, like an industrial zone, there might be streets but
no actual inhabitants at all. We alleviate this imbalance by identifying (large)
clusters with a high density of living streets, typically metropolitan areas, using
a simple grid-based approach with 1000 × 1000 cells. Inside such clusters, we
increase the percentage values by multiplying the above-mentioned sums of
segment lengths by a constant weight factor, specified below. We say that a grid cell is
dense, if the sum of the lengths of the contained streets is 25% or more above
the average (taken over all grid cells that contain at least one street). We use a
weight factor of 3 for such grid cells. We found this to be a typical ratio when
comparing population count divided by total street length in city vs. rural areas.
Moreover, we say that a grid cell is super dense, if the sum of the lengths of the
contained streets is 50% or more above the average. We use a weight factor of
6 for such grid cells. This simple approach requires only constant time per edge
and grid cell. In combination with the street Voronoi diagram, this gives us a
very efficient tool for population estimation.
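A small sketch of this density-based reweighting, using the thresholds and factors stated above (25% above average gives factor 3, 50% above average gives factor 6; function name and grid handling are assumptions), might look like this:

```python
# Sketch of the grid-cell weight factors used to correct the street-length shares.
def cell_weight(cell_street_length: float, avg_street_length: float) -> float:
    """Weight factor for the street-length shares of vertices in a grid cell."""
    if cell_street_length >= 1.5 * avg_street_length:
        return 6.0   # "super dense" cell (>= 50% above average)
    if cell_street_length >= 1.25 * avg_street_length:
        return 3.0   # "dense" cell (>= 25% above average)
    return 1.0       # regular cell
```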

2.4 Computing Popularity Values for Entry Points

The closer someone lives to a certain FEP, the higher the probability that this
person will use this FEP for visiting the forest. If several FEPs are nearby, the
likelihood for usage is distributed among them. To compute for every street
vertex v ∈ V the set of suitable FEPs, we could execute Dijkstra’s algorithm
from each such vertex. For a more realistic model, we restrict the travel time
to 30 minutes. For each FEP f contained in the Dijkstra search tree, we then
compute the usage probability as

\[
  u(v, f) \;=\; 1 \;-\; \frac{d(v, f)}{\sum_{f'\,:\,d(v, f') < 30\,\text{min}} d(v, f')},
\]

where d denotes the travel time between the given node and the FEP. The popularity
value of each FEP f is then computed as \(\sum_{v} \mathrm{pop}(v) \cdot u(v, f)\), where pop(v) is
the population for v computed as described in the previous section.
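For illustration, a direct (unoptimized) sketch of these two formulas is given below; it assumes precomputed travel times as input (hypothetical dictionaries, not the authors' code). The efficient bucket-based variant that we actually use is described next.

```python
# Minimal sketch of u(v, f) and the FEP popularity sum defined above.
# Assumptions: d[v][f] holds travel times in minutes, pop[v] the population
# estimate per street vertex; both are hypothetical inputs for illustration.
from typing import Dict

def fep_popularity(
    d: Dict[int, Dict[int, float]],
    pop: Dict[int, float],
    limit: float = 30.0,
) -> Dict[int, float]:
    popularity: Dict[int, float] = {}
    for v, dists in d.items():
        reachable = {f: t for f, t in dists.items() if t < limit}
        denom = sum(reachable.values())
        if not reachable or denom <= 0.0:
            continue
        for f, t in reachable.items():
            u = 1.0 - t / denom                       # u(v, f) from the formula
            popularity[f] = popularity.get(f, 0.0) + pop.get(v, 0.0) * u
    return popularity
```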
As stated above, the straightforward approach to computing these popu-
larity values would be to run a Dijkstra from each street vertex v. Even with a
bounded radius of 30 minutes, the computation time would be several days for
a data set like Germany. What comes to our rescue here is that the number of
FEPs is about two orders of magnitude smaller than the number of all nodes
in the street network. We therefore run a Dijkstra computation (bounded to a
radius of 30 minutes) on the backward graph from each FEP. That way, each
street vertex can (and will) occur in a number of Dijkstra results, typically on
the order of hundreds. Explicitly storing all the values that contribute to the
u(v, f ) from above would hence consume many times more space than needed
for storing the actual network.
To avoid this, we discretize the d(v, f ) distances into buckets with a resolution
of 5 minutes. That way, we need to store only a few counts for each vertex.
Namely, the number of FEPs reachable in less than 5 minutes, the number of
FEPs reachable between 5 and 10 minutes, and so on.
It then remains to distribute the populations pop(v) over the FEPs. This is
easily done with another backwards Dijkstra from each FEP.
Figure 4 illustrates the backwards approach by a small example.

Fig. 4. Backwards approach for four forest entry points (red) and three travel time
buckets (indicated by the circular areas with the color ranging from orange to yellow).
The upper image shows the result of the first round of backward Dijkstras. For each
street vertex (black), we obtain the number of FEPs in each bucket. The lower image
shows the result of the second round of backward Dijkstras. The counters from above
are converted to usage likelihood values. See for example the resulting tuple (0, 0.6, 0.4)
generated from the counters (0, 1, 1). Here, we have one FEP reachable between 5 and
10 minutes, and another one between 10 and 15 minutes. Summing up the average
travel time for these buckets, we get 7.5 + 12.5 = 20 minutes. The likelihood for the
FEP in the 5-10 bucket is then 1 − (7.5/20) = 0.625 ≈ 0.6. The likelihood for the
other FEP equals the remaining probability of 0.375 ≈ 0.4. These values are then used
as coefficients to map population values (blue) to FEPs. The resulting values are the
popularity counts for each FEP (red).

3 Computing the Utilization Distribution


The utilization distribution that we want to compute depends on two main
factors: the popularity of entry points and the attractiveness of paths through
the public space. We have shown how to compute the popularity of entry points
in the previous section. In this section, we show how to compute a comprehensive
set of paths and round-tours and their relative attractiveness. We consider two
approaches: flooding (Section 3.1) and via-edge (Section 3.2). Both approaches
allow us to combine entry point popularity with tour attractiveness values in order
to better estimate the desired utilization distribution.

3.1 The Flooding Approach


The goal of this section is to assign to each edge in the forest an attractiveness
value that reflects how likely this edge is to be included in a tour. A naive approach
is to run a Dijkstra search from every FEP, considering only the forest subgraph
of the street network. Then, for every edge (v, w) explored by this search, its
attractiveness is increased by the popularity of the FEP f divided by the
distance from f to w. Think of this technique as “flooding” the entry point’s
popularity along shortest paths into the nearby forest. Consequently, a higher
FEP popularity contributes to a higher attractiveness of the edges in its range.
Conversely, the further away an edge is from the entry points to the forest, the
lower is its attractiveness. This approach is simple and efficient. However, it has
the disadvantage that it might leave some edges with an attractiveness of zero,
because they are not part of any shortest path (Figure 5). But it is not uncommon
that people walk along non-shortest paths during leisure activity. As a remedy,
we could compute the k shortest paths to each inner vertex, either forbidding
loops [5] or allowing them [6]. Another way would be to define several edge
metrics (length, niceness, quietness, ...) if such information is available, and then
search for optimal paths for several linear combinations of these metrics. Both of
these approaches would generate multiple paths between a FEP and a node inside
a forest, but still there is no guarantee that each edge of a path is considered at
least once. Moreover, this simple approach only models round-tours reasonably,
whereas hiking from one FEP to another is not included. We therefore propose,
in the following subsection, an alternative approach that captures tours through
the forest as well.
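The following sketch illustrates the flooding idea with one bounded Dijkstra per FEP over the forest subgraph (assumed graph representation; an illustration, not the authors' code):

```python
# Sketch of the flooding approach: push each FEP's popularity along the
# shortest-path tree into the forest, attenuated by the distance from the FEP.
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple

def flood_attractiveness(
    forest_adj: Dict[int, List[Tuple[int, float]]],  # forest subgraph: v -> [(w, length)]
    fep_popularity: Dict[int, float],                 # FEP node -> popularity value
) -> Dict[Tuple[int, int], float]:
    attractiveness = defaultdict(float)
    for fep, popularity in fep_popularity.items():
        dist, parent, settled = {fep: 0.0}, {}, set()
        queue = [(0.0, fep)]
        while queue:
            d, v = heapq.heappop(queue)
            if v in settled:
                continue
            settled.add(v)
            if v in parent:
                # edge (parent[v], v) lies on the shortest path from the FEP to v
                attractiveness[(parent[v], v)] += popularity / d
            for w, length in forest_adj.get(v, []):
                nd = d + length
                if w not in settled and nd < dist.get(w, float("inf")):
                    dist[w], parent[w] = nd, v
                    heapq.heappush(queue, (nd, w))
    return attractiveness
```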

3.2 The Via-Edge Approach


In the via-edge approach, we iterate over the forest edges one by one to calculate
their respective attractiveness. Specifically, for each edge (v, w), we run a back-
wards Dijkstra from v and a forwards Dijkstra from w. This provides us with a set
of paths of the type FEP1 →∗ v → w →∗ FEP2 (where, of course, FEP1 = FEP2
is possible). Now for every such path we increase the attractiveness value of the
edge (v, w) by the minimum of the popularity values of FEP1 , FEP2 multiplied
with the shortest path distance between FEP1 and FEP2 , divided by the total
Fig. 5. The upper path is three meters longer than the lower one, so any Dijkstra-
based point-to-point search will not include the upper path in a solution. Still, this
path section might be used in a hike.

path length (which can be extracted from the labels created in the two Dijkstra
runs and the length of the edge). For round-tours (FEP1 = FEP2 ) the attrac-
tiveness increase would be zero, hence this case is handled like in the flooding
approach. So the attractiveness increases with the popularity of the entry points,
and the smaller the difference between the via-edge path and the shortest path
the higher the attractiveness. That way, we consider all tours via a certain edge
and we generate meaningful attractiveness values for all edges in the forest.
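A sketch of the via-edge scoring for a single forest edge is given below; it assumes that the two Dijkstra results and the FEP-to-FEP shortest-path distances are already available as dictionaries (hypothetical names, not the authors' code):

```python
# Sketch of the via-edge attractiveness for one forest edge (v, w).
from typing import Dict, Tuple

def via_edge_score(
    dist_to_v: Dict[int, float],              # travel time FEP f ->* v (backwards Dijkstra)
    dist_from_w: Dict[int, float],            # travel time w ->* FEP f (forwards Dijkstra)
    popularity: Dict[int, float],             # popularity per FEP
    edge_length: float,                       # length of the via edge (v, w)
    shortest: Dict[Tuple[int, int], float],   # shortest-path distance between FEP pairs
) -> float:
    score = 0.0
    for f1, d1 in dist_to_v.items():
        for f2, d2 in dist_from_w.items():
            if f1 == f2:
                continue                      # round-tours are handled by flooding instead
            path_length = d1 + edge_length + d2
            score += (min(popularity.get(f1, 0.0), popularity.get(f2, 0.0))
                      * shortest[(f1, f2)] / path_length)
    return score
```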

4 Incorporation of Additional Information


4.1 Survey Data on User Preferences
The two main behavioral factors that influence utilization intensities in forests
are: first, the travel time to get to the forest, and second, the amount of time
spent in the forest. If these two factors are captured by survey data, we can in-
corporate them into our pipeline and thereby increase the accuracy of our model.
Table 1 shows the results of a survey on recreational behavior from a German
forest authority3 as published in [7]. We observe that most people prefer forest
entries in the vicinity of their residences (about 80% take ≤ 15 minutes to get
to the forest). The time spent in the woods has a peak between 30 minutes and
one hour, and the distribution has a positive skew towards longer sojourns. We
can plug in these observations as follows: for the computation of FEP popularity
values, we use travel time buckets in the backwards approach as provided by
the survey. After the backwards Dijkstra runs from the FEPs, we distribute the
population values of each street vertex according to the percentage values given
in Table 1 (if for a bucket no FEP is available, we add the percentage to the
previous bucket). As an example, assume that from a vertex v we can reach one
FEP fA in less than 5 minutes and two FEPs fB and fC between 10 and 15
minutes. We conclude that 63% of the population of v use fA , and 18.5% use
fB and fC respectively. To incorporate the duration of stay times we use the
percentage values of the second column of Table 1 to weight tours detected by
our algorithms (flooding or via-edge). Thus, edges used in tours with a length
favored by more people receive higher attractiveness values.

3
http://www.fva-bw.de/

Table 1. Survey data about forest visits according to [7]

travel time to the forest time spent in the forest


up to 5 min 38% up to 30 min 25%
6 to 10 min 25% 31 to 60 min 42%
11 to 15 min 16% 61 to 90 min 11%
16 to 30 min 18% 91 to 120 min 15%
longer than 30 min 3% longer than 120 min 7%
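The bucket-based redistribution described before Table 1 can be sketched as follows (illustrative only; the handling of leading empty buckets is our assumption, since only the fallback to the previous bucket is specified):

```python
# Sketch: distribute a vertex's population over reachable FEPs using the
# travel-time shares from Table 1; an empty bucket's share falls back to the
# previous non-empty bucket (leading empty buckets roll forward: assumption).
from typing import Dict, List

BUCKET_SHARES = [0.38, 0.25, 0.16, 0.18, 0.03]   # <=5, 6-10, 11-15, 16-30, >30 min

def distribute_population(pop_v: float, feps_per_bucket: List[List[int]]) -> Dict[int, float]:
    """feps_per_bucket[i]: FEP ids reachable from v within travel-time bucket i."""
    share = list(BUCKET_SHARES)
    last_nonempty, carry = None, 0.0
    for i, feps in enumerate(feps_per_bucket):
        if feps:
            share[i] += carry
            carry, last_nonempty = 0.0, i
        elif last_nonempty is not None:
            share[last_nonempty] += share[i]
            share[i] = 0.0
        else:
            carry += share[i]
            share[i] = 0.0
    result: Dict[int, float] = {}
    for i, feps in enumerate(feps_per_bucket):
        for f in feps:
            result[f] = result.get(f, 0.0) + pop_v * share[i] / len(feps)
    return result

# Example from the text: one FEP within 5 min, two FEPs between 10 and 15 min
# -> 63% for the first FEP and 18.5% for each of the other two.
```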

4.2 Points of Interest (POIs)

Of course, the presence of attractors like an inn, look-outs, or a lake in the
forest, vivaria in a zoo, or sunbathing lawns in parks affects the likelihood
of tours passing in the vicinity and thus the edge attractiveness. If such POIs
are available, an easy way to incorporate them would be to assign a “niceness”
value to each edge which increases with the proximity to an attractor. Then
every tour created by our via-edge approach described above gets weighted with
the respective niceness value of the edge. Unfortunately, combined with our
via-edge approach, this covers only tours visiting a single sight. But popular routes
often pass multiple POIs, which is not properly modeled this way.
One remedy would be to compute the shortest paths via all permutations and
subsets of attractors. But this would be too time-consuming as the number
of possible tours grows exponentially in the number of POIs. So instead, we
compute shortest paths between all pairs of attractors and increase the niceness
of edges on these paths accordingly, which can be accomplished in polynomial
time. In this way, visitor flow concentrates on paths between attractors and
the utilization intensity increases on such sections.
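A sketch of this pairwise-attractor niceness boost, assuming a helper that returns the edges of a shortest path between two nodes (hypothetical name), could look like this:

```python
# Sketch: raise the "niceness" of edges lying on shortest paths between every
# pair of attractors; all pairs instead of all subsets keeps this polynomial.
from itertools import combinations
from typing import Callable, Dict, Iterable, List, Tuple

def poi_niceness(
    pois: Iterable[int],
    shortest_path_edges: Callable[[int, int], List[Tuple[int, int]]],  # assumed helper
    bonus: float = 1.0,
) -> Dict[Tuple[int, int], float]:
    niceness: Dict[Tuple[int, int], float] = {}
    for a, b in combinations(pois, 2):
        for edge in shortest_path_edges(a, b):
            niceness[edge] = niceness.get(edge, 0.0) + bonus
    return niceness
```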

5 Experimental Results

In this section, we evaluate our algorithms on real-world data with respect to


both efficiency and quality. All our algorithms are implemented in C++, except
for the raster-based FEP extraction described in Section 2.2, which is written in
Java. As we see in Table 3 below, the running time of this component is only a
small fraction of the total running time. All our experiments were performed on
a single core of an Intel i5-3360M CPU with 2.80 GHz and 16 GB RAM.

5.1 Data

We extracted street networks and forest boundary polygons from OpenStreetMap


(OSM) for three countries: Germany, Austria, and Switzerland.4 Note that for
all of these countries the available data for street networks and forest areas can
be considered as almost complete, so the numbers of nodes, edges and
4
Retrieved from http://download.geofabrik.de on November 26, 2013.

Table 2. Our five data sets (ordered by graph size) and their main characteristics

Graph                  nodes        edges         forest areas   population
                       (OSM extract)                             (Wikipedia)
Saarland               535,595      1,149,654     2,165          994,000
Switzerland            6,562,482    12,314,060    39,087         7,997,000
Baden-Württemberg      7,156,371    15,460,922    26,507         10,598,000
Austria                13,473,037   23,469,568    66,498         8,462,000
Germany                47,839,447   93,811,864    394,522        81,890,000

forest areas closely resemble reality. To get evidence for the scalability of our approach,
we also performed some of our experiments on two states within Germany, one
large (Baden-Württemberg) and one small (Saarland). Population statistics were
looked up manually in Wikipedia. The main characteristics of our data sets are
summarized in Table 2.

5.2 Running Times


We measured the running time for each step of our pipeline, excluding the data
extraction from OSM. The resulting times along with the total running time can
be found in Table 3.

Table 3. Running times for the main steps of our pipeline in seconds. The numbers for
the “edge attractiveness” column are for our (more sophisticated) via-edge approach.

Graph                  population   FEP          FEP          edge              total
                       mapping      extraction   popularity   attractiveness
Saarland               0.01         1.92         63.53        40.65             2 min
Switzerland            0.38         11.95        51.20        211.34            5 min
Baden-Württemberg      0.43         20.78        584.65       1223.43           31 min
Austria                0.79         24.15        56.78        912.55            17 min
Germany                2.86         181.25       512.29       6958.33           135 min

We observe that the fraction of time spent on the population mapping is


negligible. It is just a few seconds even for our largest data set (Germany). This
is easy to understand, since the computation of the (street) Voronoi diagram
requires only a single sweep over all edges. The fraction of time spent on the
FEP extraction is also relative small. This is due to the efficient rasterization
approach described in Section 2.3. The resulting number of FEPs per input can
be found in Table 4. This number influences the runtime of the subsequent steps,
especially the popularity value computation which requires two runs of Dijkstra
for each FEP. We observe that the number of FEPs for Austria, Switzerland
and Baden-Württemberg are very similar, despite their widely different number

Table 4. Number of forest entry points for our test graphs

Graph number of forest entry points (FEPs)


Saarland 12,157
Switzerland 124,374
Baden-Württemberg 139,468
Austria 198,241
Germany 971,517

of forest areas. This is because the number of FEPs is influenced by both forest
area size and street network density.
The two remaining steps, the computation of the FEP popularity values (Sec-
tion 2.4) and of the forest path attractiveness values (Section 3), are the most
time-consuming of our whole pipeline. This is because both steps require a large
number of Dijkstra computations. Note that the numbers reported in Table 3
for the last step are for the more sophisticated via-edge approach, described in
Section 3.2. The more simplistic flooding approach is about 40 times faster.
We also observe that the total running time for Baden-Württemberg (BW) is
slightly larger than for Austria, despite the smaller street graph for BW. This
is because the forest areas in BW are larger and more compact, with many
more paths inside.

5.3 Quality of the Estimated Population Distribution


In Section 2.3, we described how to distribute the population given for a dis-
trict over the individual nodes of the contained street network. We evaluated
the quality of our approach as follows. We do not have access to the positions
of individual buildings and the number of people inhabiting them. We instead
manually researched the population for some lower administrative districts con-
tained in our data sets. For each such district, we compared this number to the
sum of populations we computed for all nodes of the contained street network.
Table 5 provides the result of this comparison for the German state of Saarland
and its six major sub-districts.
The average deviation is only about 10%, with a maximum deviation of 21%
for one sub-district. This is surprisingly good, given our simple approach. We re-
peated the same experiment on the whole of Germany (81.9 million inhabitants)
and chose 20 districts randomly with population ranging between 3.4 million
(“Berlin”) and 56,312 (“Wittmund”). On average, we over- or underestimated
the population in a district by 28%, while never being off by more than a factor
of 2 from the correct result.

Table 5. Accuracy of our estimated population for the six sub-districts of the German
federal state Saarland. Δ denotes the relative deviation of our result compared to the
ground truth in percent.

Sub-district of Saarland      actual population    estimated    Δ
1 Merzig-Wadern 103,520 96,903 −7%
2 Neunkirchen 134,099 118,628 −12%
3 Saarbrücken 326,638 315,866 −3%
4 Saarlouis 196,611 175,705 −11%
5 Saarpfalz-Kreis 144,291 157,316 +9%
6 St. Wendel 89,128 107,452 +21%

5.4 Quality of Our Estimated Utilization Distribution


Evaluating the quality of our final result (which parts of the given forest areas
are used by how many people) is difficult, because there is no such empirical data
available. Indeed, the lack of availability of such data is the main motivation for
this work, see Section 1.
However, large numbers of GPS traces from people contributing to Open-
StreetMap are publicly available5 . Specifically, we used packages containing
about 17,000 traces for Switzerland, 22,000 for Austria and more than 200,000
for Germany. Clearly these traces are susceptible to all kinds of bias and cannot
be considered as a “ground truth”. However, we found it quite safe to assume
that highly frequented paths indeed correlate with prominent numbers of GPS
traces. We hence proceed as follows. We provide two comparisons, one visual
and one quantitative.
For the visual comparison, we produce comparable heat maps for the GPS
data and the utilization intensities computed by our two approaches (flooding
and via-edge). For the GPS data, we intersect the GPS traces with our extracted
forest polygons and overlay the traces, each with a low transparency. We then
map the aggregated transparency values to the same color range used in the
heat maps for our two approaches. For a fair comparison, the color values were
normalized for each of the three maps individually. To avoid distortions by high
intensities from a few outliers, the top 2% of intensities all get assigned to the
most intense color from our color scheme (the reddest red in our pictures). The
remaining intensities are mapped linearly to the remaining color range (from red
over orange and yellow to no color).
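As a sketch, the normalization described above (clamping the top 2% of intensities and mapping the rest linearly) can be expressed as follows; the exact percentile handling is our assumption:

```python
# Sketch: map raw heat-map intensities to [0, 1], clamping the top 2% to 1.0
# so that a few outliers do not distort the color range.
from typing import List

def normalize_intensities(values: List[float]) -> List[float]:
    ordered = sorted(values)
    cutoff = ordered[int(0.98 * (len(ordered) - 1))]   # 98th-percentile threshold
    return [min(v, cutoff) / cutoff if cutoff > 0 else 0.0 for v in values]
```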
Figure 6 shows these three heat maps for Baden-Württemberg. A number of
interesting observations can be made.

5
http://www.openstreetmap.org/traces and http://goo.gl/wczD8H

Fig. 6. Comparison of heat maps for the GPS data (middle), our simple flooding ap-
proach (left), and the more sophisticated via-edge approach (right). The intensities
range from yellow over orange to red; the redder, the more intense. Forest areas
with no paths or no data are green. A detailed discussion is provided in the text below.

(1) For both of our approaches, the hot spots are in similar locations as for the
GPS data.
(2) The flooding approach tends to produce larger hot spots. This is an undesired
artifact of the approach, arising from not considering tours through the forest,
but only proximity of forest edges to FEPs.
(3) The via-edge approach tends to produce more pronounced hot spots, very
    similar to those from the GPS data. It also singles out specific paths and
    edges, differentiating attractiveness values between several forest walks in
    the same area much better than the flooding approach does.
(4) Many of the smaller hot spots from the GPS data (e.g. those in the lower
right part of the map) can be found in the heat map of the via-edge approach,
but not in the heat map of the flooding approach. Especially in small forest
areas, people tend to walk through the forest instead of making a round tour.
Only the via-edge approach models this behavior satisfactorily.
(5) The coverage of the GPS data is very limited, whereas the coverage of both of our
approaches is very good.
(6) In the lower left part, parallel to the border, we see an intense forest usage
indicated by the OSM traces, but only a small to medium intensity according
to our models. This is at least partly a border effect, because the cut-off at
the state boundary leads to smaller population values in the surrounding
areas of forest entry points near this boundary.

Figure 7 shows an enlarged version of the heat map for Baden-Württemberg, as


produced with the via-edge approach. Figures 8 and 9 show the respective heat
maps for Austria and Switzerland. For Austria and Switzerland, we observe
a large fraction of yellow areas, because many forest areas there are far from populated areas.
And again, border effects might play a role in some areas, as seen for Baden-
Württemberg before. Furthermore, many forest areas in Austria and Switzerland
are tourist spots. If corresponding tourist usage data were available, it would make
sense to include it in our model.

Fig. 7. Heat map for Baden-Württemberg, as produced with our via-edge approach

For our quantitative comparison, we extracted the top-50 hot spots for the OSM
trace map and for the heat maps computed by our two approaches. To extract the
top hot spots, we laid a grid over the map and summed up the heat map intensities
in each grid cell. For each of the three maps, we then extracted the 50 grid cells with
the highest value. We then counted what percentage of the top 50 grid cells from
the OSM trace map are also among the top 50 grid cells of our two approaches. The re-
sult is reported in Table 6. As in the visual comparison, the quantitative results also
show a clear advantage of the via-edge approach over the simple flooding approach.
However, this advantage is smaller than one might expect from the visual compar-
ison, in particular Figure 6. This is because the grid discretization and the top-50
approach over-emphasize the (easy) top hot spots and blur the subtle differences
between flooding and via-edge discussed above.
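The grid-based top-50 comparison can be sketched as below (assumed inputs: per-map lists of weighted points; cell size and the number of hot spots are parameters):

```python
# Sketch: sum heat intensities per grid cell, take the k highest cells per map,
# and report the overlap with the GPS-trace hot spots in percent.
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float, float]            # (x, y, intensity)

def top_cells(points: List[Point], cell: float, k: int = 50) -> Set[Tuple[int, int]]:
    sums: Dict[Tuple[int, int], float] = defaultdict(float)
    for x, y, intensity in points:
        sums[(int(x // cell), int(y // cell))] += intensity
    return set(sorted(sums, key=sums.get, reverse=True)[:k])

def hot_spot_detection_rate(gps: List[Point], model: List[Point], cell: float) -> float:
    gps_cells, model_cells = top_cells(gps, cell), top_cells(model, cell)
    return 100.0 * len(gps_cells & model_cells) / len(gps_cells)
```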

Fig. 8. Heat map for Austria, as produced with our via-edge approach

Table 6. Quality analysis of our two edge attractiveness models using a set of traces
extracted from OSM as ground truth. Hot spot detection is evaluated against this
baseline; all values are given in percent.

Graph                  Edge Coverage                   Hot Spot Detection
                       OSM    flooding   via-edge      flooding   via-edge
Saarland 37 99 100 27 38
Switzerland 28 97 100 20 24
Baden-Württemberg 29 95 100 22 31
Austria 22 93 100 18 23
Germany 26 91 100 17 24

Fig. 9. Heat map for Switzerland, as produced with our via-edge approach

6 Conclusions and Future Work

We have designed, implemented, and evaluated a pipeline of algorithms for es-


timating the utilization distribution of public spaces, in particular forests. We
used only simple and publicly available map data as input. Our approach pre-
dicts not only the utilization distribution in an area but also the utilization on
the fine-grained level of paths. We have also provided a visualization of our re-
sults in an interactive web application, along with all data sets and code needed
for reproducibility. For future work, it would be most interesting to get hold of a
more comprehensive ground truth (e.g. usage statistics from traces or polls), and
use these for a more thorough quality comparison and to fine-tune our model
and our algorithms.

References
1. Ural, S., Hussain, E., Shan, J.: Building population mapping with aerial imagery
and GIS data. International Journal of Applied Earth Observation and Geoinforma-
tion 13(6), 841–852 (2011)
2. Gemsa, A., Pajor, T., Wagner, D., Zündorf, T.: Efficient Computation of Jogging
Routes. In: Bonifaci, V., Demetrescu, C., Marchetti-Spaccamela, A. (eds.) SEA 2013.
LNCS, vol. 7933, pp. 272–283. Springer, Heidelberg (2013)
3. Maervoet, J., Brackman, P., Verbeeck, K., De Causmaecker, P., Vanden Berghe, G.:
Tour suggestion for outdoor activities. In: Liang, S.H.L., Wang, X., Claramunt, C.
(eds.) W2GIS 2013. LNCS, vol. 7820, pp. 54–63. Springer, Heidelberg (2013)
4. Luxen, D., Schieferdecker, D.: Candidate sets for alternative routes in road networks.
In: Klasing, R. (ed.) SEA 2012. LNCS, vol. 7276, pp. 260–270. Springer, Heidelberg
(2012)
5. Yen, J.Y.: Finding the k shortest loopless paths in a network. Management Sci-
ence 17(11), 712–716 (1971)
6. Eppstein, D.: Finding the k shortest paths. SIAM Journal on Computing 28(2),
652–673 (1998)
7. Ensinger, K., Wurster, M., Selter, A., Jenne, M., Bethmann, S., Botsch, K.: “Ein-
tauchen in eine andere Welt” - Untersuchungen über Erholungskonzepte und Erhol-
ungsprozesse im Wald. German Journal of Forest Research 184(3), 70–83 (2012)
Isibat: A Web and Wireless Application
for Collecting Urban Data about Seismic Risk

Paule-Annick Davoine1,∗, Jérôme Gensel1, Philippe Gueguen2,


and Laurent Poulenard1
1
Grenoble Computer Science Laboratory (LIG), STeamer Research Team, 681, rue de la
Passerelle 38402 Saint Martin d’Hères, France
{davoine,poulenard,gensel}@imag.fr
2
Earth Sciences Institute & IFSTTAR, Risques Research Team, Rue de la Chimie,
38402 Saint Martin d’Hères, France
philippe.gueguen@ujf-grenoble.fr

Abstract. In the field of seismic risk, conducting an inventory in an urban area is
one of the main challenging tasks, considering the large number and the
heterogeneity of the buildings and constructions to be studied in every city. In
this paper, we present Isibat, a client-server application that makes use of
wireless networks and technologies as well as Geoweb software for collecting,
analysing and visualizing seismic data. On the client side, through an iPhone
application, mobile users moving in the field are invited, through screens,
menus and maps, to collect data that are used to evaluate either the vulnerability
of the surrounding buildings (in a pre-seismic phase) or the damage caused (in a
post-seismic phase). On the server side, data are stored and can also be queried
and visualized through adapted vulnerability or damage maps.

Keywords: client-server application, geoweb, mobile software, seismic risks,


citizen seismology.

1 Introduction

Urban seismic vulnerability is characterized by the ability of buildings and built
structures to resist and withstand seismic shaking and aftershocks. Many
works have shown that there is a strong correlation between the features of buildings,
structures and constructions (nature of materials, shape of the roof, number of floors,
building irregularities, etc.) and the level of damage at a given level of seismic stress
[12], [2], [4]. Having a good knowledge of the urban environment, and especially the
existing buildings, is very important in order to manage, predict, and assess both their
pre-seismic vulnerability and their post-seismic integrity. In the past, various
methods for estimating the seismic vulnerability of buildings have been proposed
(HAZUS 19971) [5], [11], which have led to the establishment of a global programme


This research is funded by the French National Research Agency – ANR-09-Risk-009.
1
http://www.fema.gov/hazus


for reflection on seismic vulnerability, at worldwide scale (Global Earthquake Model,


GEM). However, since urban spaces are vast and heterogeneous, one of the
challenges to be addressed here is to collect a sufficient amount of urban data about
seismic risk, at different levels of detail, in order to perform some relevant analysis.
Indeed, designing and developing methods and tools that allow for an inventory of the
buildings in a city, in order to get detailed and useful data about constructions and
buildings, is an important challenge in the field of seismic risk prevention.
Nowadays, wireless networks, mobile technologies and devices (smartphones, tablet
PCs…), coupled with Web 2.0 technologies, and more particularly those supporting
the Geoweb, form an interesting approach for acquiring data in the field or in-situ.
Smartphones, for instance, have become omnipresent and familiar tools. These mobile
devices are small computers connected to the Internet, handling multimedia
information, and equipped with various kinds of sensors (compass, gyroscope,
accelerometer…), including most of the time a GPS receiver that makes it possible to
get the location of the user and, accordingly, of surrounding objects or points of
interest. These GPS coordinates can, in turn, be exploited by mobile and/or Web
applications. For instance, GPS coordinates can be semantically enriched by means of
Web services (e.g., the GeoNames Web service) that allow for annotating and
describing any point or located object of interest on the Earth’s surface [13].
In this paper, we present a client/server application, developed in the framework of
the Urbasis2 project that makes use on the client side of an Apple iPhone in order to
collect in-situ urban data for the assessment of seismic risk. Since the massive co-
production of data by both expert and non-expert users is expected, the
approach adopted here has a strong connection with the Volunteered Geographical
Information (VGI) initiative advocated by Michael Goodchild [7]. On the server side,
the data collected by expert as well as non-expert mobile users during their collective
and participatory in-situ inventory are made available for visualisation, navigation,
and analysis. The server stores and manages a database that encompasses information
about inventoried buildings, and contributes to the estimation of both the
vulnerability/exposure to the seismic risk of any identified urban region (i.e. in a pre-
seismic period), and the damages caused by an earthquake (i.e. in a post-seismic
period).
The remainder of the paper is organized as follows: in Section 2, we recall the
principles, issues and challenges related to a seismic inventory of an urban area and
we present some dedicated estimation methods. In Section 3, we highlight the
usefulness of Geoweb technologies within such a risk prevention context. Then, the
Isibat application is presented in Section 4 before we conclude in Section 5 by giving
some future research perspectives.

2
The Urbasis Project is funded by the French National Research Agency on the theme: urban
seismology: innovating methods for assessing vulnerability and seismic damages. ANR-09-Risk-
009. http://users.isterre.fr/pgueg/URBASIS/Accueil.html (in French).

2 Towards a Seismic Inventory of the Urban Built

The estimation of the seismic vulnerability relies on the European Macroseismic
Scale EMS-98³ that consists of a classification into six classes based on the
characteristics or features of a building (see Fig. 1). The purpose of the EMS-98 is to
support a fine-grained cartography of the seismic vulnerability or earthquake damages in
any European city, in order to improve both the knowledge and the management of
the seismic risk.

Fig. 1. The evaluation grid based on the European Macroseismic Scale EMS-98. Buildings are
grouped by type of construction, according to the material used.

Most of the existing approaches are based on the analysis of orthophotos and
digital terrain models, coupled with cadastral geographic information, or based on
urban data collected at the district level [4]. In the former case, an image showing the
composition and the diversity of the urban morphology is obtained. One of the criteria
adopted here is called “urban ruggedness”, defined as the ratio between the average
height of the buildings and the urban density (itself defined as the ratio between
the area occupied by the sum of the building surfaces and the total area of the studied
zone). However, a lot of other features, such as the type of building material used,
the shape of the roof, irregularities in the height or the layout of the buildings, or
the tilt of the slope on which buildings are constructed, etc., also need to be taken into

3
http://www.franceseisme.fr/EMS98_Original_english.pdf

account. In the latter case, urban seismic vulnerability studies rely on some
aggregated spatial data or on deduced urban parameters [4]. However, most of the
time, such aggregations correspond to statistical summaries that are more or less
relevant, and lead to an oversimplification and a generalisation of the characteristics
of the studied urban area. These approaches do not account for local variations that
can be observed in the field.
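Restating the “urban ruggedness” criterion above in formulas (the notation is ours, not the original authors’): with \(\bar{h}\) the average building height, \(A_b\) the summed building footprint area, and \(A_z\) the total area of the studied zone,

\[
  d_{\text{urban}} \;=\; \frac{A_b}{A_z}, \qquad
  r_{\text{urban}} \;=\; \frac{\bar{h}}{d_{\text{urban}}}.
\]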
Regarding the estimation of the damages, which is performed during a post-seismic
period, data collected in-situ are favoured. This task consists of a visual auscultation
of the buildings based on the standard EMS-98 evaluation grid (see Fig. 2).

Fig. 2. Damage evaluation grid according to the EMS-98 scale

Such field surveys are usually conducted by expert investigators who scour the site
of the disaster and record their observations in paper field notebooks. Once the
observations are made, data are entered into a computer and then processed. This
manual process is time-consuming and costly; it is not well suited for establishing
a cartography of the damages in real time, nor for identifying the areas or zones
with the highest (human, material, economical, environmental…) stakes.
The objective here is two-fold: 1) being able to draw a map of the caused damages
that efficiently contributes to solving a crisis management problem; 2) being able to
estimate the level of seismic intensity over the studied stricken area, according to the
damage and intensity scales provided by the EMS-98. Moreover, the damages
caused to a building impact its seismic vulnerability. So, it is of the utmost
importance to be able to re-assess its vulnerability between two seismic aftershocks.
Whatever their goal, existing methods and available data have several
limitations: i) they do not allow for a very detailed knowledge of the buildings in an
urban area; ii) they do not allow for establishing relevant correlations between the
type of construction, the vulnerability and the damages caused, at a given level of
earthquake; iii) they do not account for the identification of specific buildings with
high stakes in a studied area. In order to overcome these limitations, collecting a large
amount of geo-referenced data at the building level is required, as well as facilitating
the reuse of such data.

3 Collecting Urban Seismic Data Using the Geoweb

3.1 Motivations
The Geoweb is the convergence of Geographical Information Technologies with
Information and Communication Technologies, relying on the Web, the Internet,
mobile technologies and location-based systems [8], and it offers new opportunities
for collecting data in-situ.
On the one hand, mobile devices such as smartphones or tablets are now equipped
with GPS and geolocation functionalities, and can serve as electronic field notebooks.
Entered data and information are digital and automatically georeferenced. This
way, the error-prone transcription step is skipped and data quality is therefore
improved. Still, collecting in-situ data requires adopting shared protocols,
formats and forms in order to homogenize such data. Also, mobile devices allow for
contextualizing collected geo- and time-referenced data by annotating them with
different kinds of metadata: photos, videos, audio recordings, textual comments, etc. [13].
On the other hand, crowdsourcing, and more particularly Volunteered
Geographic Information (VGI), which offers tools for creating, gathering and publishing
geographic data willingly collected by individuals or citizens [7], has become a
powerful means for producing and sharing geographic information collectively [9].
This way, a set or a community of actors, following a common goal, can produce and
share geographic data on a given topic [9]. Through the Web interfaces and adapted
functionalities of such VGI tools, users are now given the opportunity to become
active contributors in the process of producing information. Following this
participatory approach, OpenStreetMap4 is one of the best-known applications. In the
last few years, a lot of similar initiatives have been launched in the field of natural
hazard prevention, particularly in order to quickly collect data for evaluating
the damages caused during a crisis period (for instance, the Ushahidi5 project during
the post-earthquake period in Haiti in 2010), as well as in the field of environmental
observation [10].
Being able to conduct a comprehensive inventory of the structural characteristics
of the existing buildings in a city is a time-consuming and labour-intensive task.
Investigators have to walk through a vast urban space, visually observing each
building and recording its geographical position and the associated list of its
structural parameters. It seems to us that, in the domain of seismic risk prevention,
adopting an approach based on the co-production of data is an interesting and
promising avenue. Our idea is to provide contributors with a software environment
coupling a mobile application for acquiring data with a Web application for storing
and visualizing these data. This environment would help in conducting inventories for
assessing both the vulnerability and the damages caused. We also argue that the
notion of co-production is important. Instead of a contribution that could only be
made by expert investigators, we assume that any person may, in her own way,

4 http://openstreetmap.org
5 www.ushahidi.com

contribute to such a seismic inventory of existing buildings. This hypothesis
requires considering the terms and conditions of such a participatory and volunteered
contribution.

3.2 How to Contribute to a Seismic Inventory?


Our goal is to make it possible for any person equipped with a smartphone or a
tablet PC to collect data about the structural parameters of the buildings in an urban
environment, through a well-adapted interface. For this, the structural parameters to be
described and collected have to be understandable by anyone. In our context, the list of
these parameters follows the one defined and recommended by the EMS-98, and is
shown in Table 1.
At first sight, filling in the values of such structural parameters through a form
would not require a high level of expertise. However, in the EMS-98 nomenclature,
those parameters are organised in a hierarchical way, ranging from the most general
level (level 1) to the most specific or precise (level n). For instance, for the roofing
parameter, level 1 consists in indicating whether the roofing of the observed building is a
terrace or not. The other levels help in describing in more detail the shape, the
slope, and the characteristics of the roofing construction. Some parameters (at the
highest levels) deal with more technical specifications and are useful for
refining the estimation methods. Filling in those parameters may require a certain
level of expertise, but even if they are left unknown (no answer or value is assigned to
them), an assessment of vulnerability, more or less complete, can still be
achieved.

Table 1. Structural parameters for the assessment of seismic vulnerability and damage (at the
most general level) [5]

Vulnerability | Damage
Building typology | EMS-98 damage level
Date of construction | Hairline cracks in walls
Soil type | Large cracks in walls
Roof | Small blocks fallen from stone or brick walls
Number of stories | Heavy blocks fallen from stone or brick walls
Regularity in elevation | Hairline cracks at the connection between walls
Strength | Collapse at the connection between walls
Height of floors | At least one floor partially collapsed
Regularity in plan | At least one floor totally collapsed
Quality-reinforcement | Chimney, moderate or heavy damage
Distance between walls | Some tiles fallen
State of preservation | Roof partially collapsed
Horizontal diaphragm | Roof totally collapsed

Therefore, in the context of urban seismology, depending on the parameter,
some expertise might be needed, which makes collecting data by means of a citizen,
volunteered approach a rather sensitive point. As a consequence, we propose
to allow for a process of data collection or production for both citizen and expert
users, according to the complexity and the organisation of such data as described by
the EMS-98. Then, one can distinguish between:
- a citizen data collection that mainly deals with general data, at level 1, which can be
handled by non-expert contributors who are aware of the seismic risk issue;
- an expert data collection that is more oriented towards the more complex
parameters, and which is rather part of a process of data co-production. Experts can
perform this collection during planned campaigns or at any moment.
Data quality is another important issue and constraint that has to be handled as it
may impact the potential reusability of the data. Indeed, many works [6], [10] have
shown that VGI data can be error-prone, such errors being geometric as well as
semantic. In our case, the geographic dimension concerns only the location
(geographic coordinates) of the studied building, not its geometric shape nor its
planimetric precision. Semantic data must be checked when entered in order to
control and ease their description and to limit input errors. Thus, the user should be
invited to fill in Boolean input items that require only two answers (true or false), or
to select one value from a finite list of values associated with the other input items
describing structural parameters. Such values can be qualitative or quantitative, and
described by pictograms. Moreover, in order to ease the understanding and the
identification of the parameters, some help should be provided in the form of texts or
diagrams.
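As a minimal illustration of such constrained input (the item names and allowed values below are hypothetical examples, not the Isibat form schema), entries can be checked against Boolean items or finite value lists as follows:

# Sketch of constrained data entry; item names and value lists are hypothetical.
FORM_ITEMS = {
    "terrace_roof": {"type": "boolean"},                                   # level-1 Boolean item
    "building_typology": {"type": "choice",
                          "values": ["masonry", "reinforced concrete", "steel", "wood"]},
    "number_of_stories": {"type": "choice", "values": ["1-2", "3-5", "6+"]},
}

def validate_entry(item, value):
    """Return True if the value is acceptable for the given form item."""
    spec = FORM_ITEMS[item]
    if spec["type"] == "boolean":
        return value in (True, False)
    return value in spec["values"]           # only values from the finite list are accepted

print(validate_entry("terrace_roof", True))                # True
print(validate_entry("building_typology", "masonry"))      # True
print(validate_entry("number_of_stories", "10"))           # False: free-text input is rejected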
The motivation of contributors is an important aspect of the process of VGI or
participatory geography [9]. While there is no doubt about the motivation of expert
contributors, given the stakes, it is less clear what can push citizens
to voluntarily and spontaneously collect such data. Given the success of what Bossu
[1] calls citizen seismology, one can expect that, in a post-seismic period, the citizen
contribution would be important and massive. In contrast, in a pre-seismic
period, strong incentives have to be found, especially considering that such
information about the structural and seismological characteristics of buildings can
also have a negative impact on their assessment, or even on their real-estate market
value. The idea is to give people the feeling of contributing to a global, citizen
project. We therefore propose that, after being processed and cartographically
generalized, such collected and open data should be used to publish thematic
information on seismic hazard maps. More generally, we propose to define different
ways of accessing the data depending on whether the contributor is involved in a
citizen or an expert data collection.

4 The Isibat Application

Isibat is an application that allows users to co-produce geo-referenced information
about the buildings of a city in order to assess their vulnerability or damage. Isibat is

both an application running on a mobile device (client side) and a Web application
(server side). On the client side, when the collection phase is active, the IsibatMobile
application allows for the acquisition of in-situ seismic data about the inventoried
buildings, as well as the evaluation in real time of their levels of vulnerability or
damage. On the server side (for instance, on a workstation), IsibatOnline is a Web
application that provides storage, visualization and dissemination functionalities for
the collected data. The Web application also offers some analysis functionalities for
the assessment of the seismic risk over the studied areas of a city.

4.1 Principles
The Isibat application relies on the following principles:
- An interactive and multi-level acquisition of semantic data: the user enters,
through her mobile device, the parameters characterizing the studied buildings, at
different levels of accuracy, depending on her skills and knowledge (expertise
and competencies).
- An automatic acquisition of contextual data: the geographic coordinates of the
buildings, as well as the date and the time of the collection, are automatically
provided by the mobile device.
- A contextualized use: various and progressive collection modes are offered,
mainly a seismic vulnerability assessment during a pre-seismic period, and a
damage evaluation during a post-seismic phase.
- A cartographic visualization: both the IsibatMobile and the IsibatOnLine
applications rely on a cartographic component approach throughout the collection,
display and dissemination processes.
- A co-production and sharing of information: the application is designed to
allow multiple contributors, with varying levels of knowledge and different
objectives, to collect data either spontaneously or in the context of specific and
planned acquisition campaigns. Once data are collected, they are stored on a
server and made available in various ways, through a Web interface.

4.2 IsibatMobile
The IsibatMobile application can be downloaded for free from the Apple Store6. It
runs in two modes: Vulnerability and Damage. Any inventoried building can be
edited alternately in either of the two modes. Each of these modes offers functionalities
for entering, deleting and modifying the compulsory parameters needed for the
assessment of vulnerability and damage. Semantic data are entered using drop-down
lists containing items to select (Vulnerability mode) or "box select" items (Damage mode).
The investigator may also, using her mobile device, take pictures or record
audio comments, which act as metadata for the collected data.

6 Look for the Isibat iPhone application in iTunes.

- In Vulnerability mode, the values of the structural parameters are organized
hierarchically, allowing an incremental and progressive input of data. When
the user selects a value at level 1 (see Fig. 3 – screen 1), the possible values for
level 2 show up, and so on until the last level (see Fig. 3 – screen 2); a minimal
sketch of such a hierarchy is given after this list. This approach allows the user
to refine the seismic inventory based on her level of expertise. Given the
complexity of the parameters and in order to avoid misunderstandings, an
explanatory text, sometimes accompanied by figures, is associated with each
selected parameter (see Fig. 3 – screen 3). As soon as a building is inventoried,
its seismological characteristics are calculated and displayed according to the
EMS-98 scale (see Fig. 3 – screen 4).

Fig. 3. Isibat: the main functionalities when using the Vulnerability mode: (screen 1) editing of
the parameters at levels 1 and 2; (screen 2) editing of parameters at higher levels; (screen
3) example of help; (screen 4) computation of the vulnerability of the building and multimedia
recording

- In Damage mode, the parameters correspond to a visual observation that the
investigator performs in-situ (in the field). Three values are possible for these
parameters: yes, no or not specified. The level of damage is determined visually
by selecting one image on the grid proposed by the EMS-98 (see Fig. 4).
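As a purely illustrative sketch of the hierarchical, level-by-level input used in Vulnerability mode (announced above), the fragment below walks a small parameter tree; the roofing values shown are hypothetical and do not reproduce the EMS-98 nomenclature.

# Minimal sketch of level-by-level parameter entry; the tree content is hypothetical.
ROOF_PARAMETER = {
    "terrace": {},                                   # level-1 value with no further refinement
    "pitched": {                                     # level-1 value refined at level 2
        "slope < 30 degrees": {},
        "slope >= 30 degrees": {
            "wooden frame": {},                      # level-3 refinements
            "concrete slab": {},
        },
    },
}

def collect(tree, chosen):
    """Walk the hierarchy, recording a value at each level; stop when the user skips."""
    level = 1
    while tree:
        value = chosen.get(level)                    # in the app this would come from the UI
        if value not in tree:
            break                                    # unknown/skipped: the assessment stays coarser
        print(f"level {level}: {value}")
        tree = tree[value]
        level += 1

collect(ROOF_PARAMETER, {1: "pitched", 2: "slope >= 30 degrees"})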

The mobile application has a map component that makes it possible to locate the
inventoried buildings and display their level of vulnerability and/or damage (see
Fig. 5). The location associated with an identified building can be refined
interactively (see Fig. 5 – screen 1). The base maps used by default are those provided
by Google, but the user is allowed to integrate her own base maps, in either vector
or raster mode.

Fig. 4. Isibat: the main functionalities when using the Damage mode: on the left, the editing of
the damage parameters; on the right, the editing of damage levels (3)

Fig. 5. Isibat: (screen 1) a Google pin showing the location of a building; (screen 2) general
view of the cartographic component

4.3 IsibatOnLine
On the server side, the Web application IsibatOnline is in charge of the management,
visualization and cartographic processing of the collected data. The application consists
of several modules (see Fig. 6):
- A management module, which allows the registration or the download of
collected data: an upward flow sends the collected data to the server database,
while a downward flow allows a contributor to download data from a
collection that she wishes to continue, modify or complete.

Fig. 6. The Isibat Architecture

- A module based on a PostgreSQL/PostGIS database, which stores the collected data.
- A validation module, which validates the data submitted by a contributor.
Modifications of existing data can be performed by the contributor according
to her granted permissions.
- A module for publishing data using OGC Web services (WMS, WFS, ...). This
module displays the location of the identified buildings on the maps of the
Web application, or exports them to remote OGC-standard-compliant
client applications.
- A visualization module (MAP), which gives access to the collected data. This
is done through different display modes: maps (point maps,
choropleth maps or density maps), graphics (bar charts, pie charts, ...) or data
tables (see Fig. 7). The consultation can be carried out according to various
criteria corresponding to the structural and seismic parameters (levels of
vulnerability and damage).

Through the cartographic interface of IsibatOnLine, one can, for example, create
the map of the buildings built before a given date. The OGC module is then used to
dynamically build the map corresponding to this filter criterion. If buildings are
subsequently added, they will automatically appear on this map. Also, maps
associated with the classes of vulnerability (as presented in Section 2) allow for the
analysis of the homogeneity or inhomogeneity of some urban areas, at different
geographic scales (see Fig. 8).
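For illustration, such a filtered map layer can be requested from the publication module with a standard WMS GetMap query, as sketched below. The server URL, layer name and CQL_FILTER parameter (a common vendor extension, e.g. in GeoServer, rather than part of the core WMS standard) are hypothetical, since the paper does not detail the actual service configuration.

# Sketch of fetching a WMS map image for a filtered building layer; URL, layer and filter are hypothetical.
import requests

params = {
    "SERVICE": "WMS",
    "VERSION": "1.1.1",
    "REQUEST": "GetMap",
    "LAYERS": "isibat:buildings",               # hypothetical layer name
    "SRS": "EPSG:4326",
    "BBOX": "5.68,45.15,5.75,45.22",            # rough bounding box around Grenoble
    "WIDTH": "800",
    "HEIGHT": "600",
    "FORMAT": "image/png",
    "CQL_FILTER": "construction_date < 1950",   # hypothetical attribute filter
}

response = requests.get("https://example.org/geoserver/wms", params=params, timeout=30)
with open("buildings_before_1950.png", "wb") as f:
    f.write(response.content)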

Fig. 7. The IsibatOnLine cartographic interface

Fig. 8. IsibatOnLine cartographic interface displaying the vulnerability classification of a given
area

4.4 Technological and Functional Aspects
IsibatMobile being an iPhone application, it is mainly based on Apple
technologies. It has been developed with the Xcode development environment and is
based on the object-oriented language Objective-C. The mapping part of the
application is based on the Apple MapKit framework (interactive maps) and on the
Core Location framework (geolocation). Data are stored using SQLite.

SQLite is a serverless database engine embedded on the iPhone; it is open
source and in the public domain.
On the server side, IsibatOnline is a Web application that has been mainly built
with the Google Web Toolkit (GWT) framework. This solution allowed us to write
the code in Java during the development phase, and then to translate this code into
JavaScript at compile time. This choice was made for two reasons: first, it facilitates
the writing of automated unit and functional tests, and, second, it overcomes compatibility
problems with different browsers, since the compilation generates several
JavaScript versions adapted to the different browsers (Firefox, Chrome, Internet Explorer, ...).
The interactive maps of the application rely on a GWT component that is itself based on
OpenLayers components. The base map uses either OpenStreetMap or Google Maps
layers (chosen by the user). On this base map, OGC layers (essentially WMS) are
added: they display the data collected during the collections performed using
IsibatMobile. The statistics diagrams and the user interface have been designed using
GXT, an open-source component library. A PostgreSQL/PostGIS server ensures
the persistence of and access to the data. The application is installed on a Tomcat server.
Exchanges between the two applications are made using XML files based on a
proprietary schema, or simply image files for photos. Each of the two applications
parses these files and inserts their content into the database (SQLite or PostgreSQL) it uses. To
transfer these files to or from the iPhone, one must use iTunes. Uploads to and
downloads from the Web application are performed in the classical way.

5 Conclusion and Future Work

Recent advances in mobile networks and wireless technologies on the one hand, and in map
components on the other hand, now make it possible to envisage and build platforms
for collecting in-situ geo-referenced data, based on efficient and powerful client-
server architectures which take advantage of the numerous sensors now embedded in
most of the mobile devices (smartphones, tablet PCs, ...) found on the market.
In this context, we have presented the Isibat application, which is dedicated to the
in-situ seismic inventory of existing buildings, either for vulnerability or for damage
assessment purposes. Isibat actually consists of two applications: first, IsibatMobile is
a mobile client application, available on iPhone and iPad, intended to assist contributors,
in an intuitive and interactive way, during a campaign of data collection in a pre- or a
post-seismic period. IsibatMobile can be used by experts as well as
by citizens. Second, IsibatOnLine is a Web application that acts as a server dedicated
to the management of the data collected using IsibatMobile. IsibatOnLine provides
access to the data and, through dynamic and interactive map components, it allows for
querying the inventoried areas.
Up to now, Isibat has been used by expert contributors-investigators only, in two
case studies: one concerning the city of Grenoble (France) for the assessment of urban
vulnerability, and the other in the area of Ferrara in the north of Italy, to assess the
damage after the earthquake that struck the region in 2012. Feedback from expert users
allowed us to improve the usability of both the IsibatMobile and the IsibatOnLine

interfaces. Yet, experiments and collections involving non-expert users have not
been conducted. This raises questions about data quality (how to represent it? how to
evaluate it? how to take it into account in future analyses and processing?) and then
about data validation, which deserve to be studied thoroughly
and which, in fact, are questions raised by any application adopting a VGI-like
approach based on the participation of citizens. We have recently started to work
on such issues.

References
1. Bossu, R., Gilles, S., Mazet-Roux, G., Roussel, F.: Citizen Seismology or How to Involve
the Public in Earthquake Response. In: Miller, D.M., Rivera, J. (eds.) Comparative
Emergency Management: Examining Global and Regional Responses to Disasters, pp. 237–
259. Auerbach/Taylor and Francis Publishers (2011)
2. Calvi, G., Pinho, R., Magenes, G., Bommer, J., Restrepo-Velez, L., Crowley, H.:
Development of seismic vulnerability assessment methodologies over the past 30 years.
Indian Society Journal of Earthquake Technology 43(3), 75–104 (2006)
3. EMS98, “L’Echelle Macrosismique Européenne 1998”, Conseil de l’Europe, Cahiers du
Centre Européen de Géodynamique et de Séismologie, vol. 19, p. 124 (2001) (in French)
4. Gueguen, P., Michel, C., LeCorre, L.: A simplified approach for vulnerability assessment
in moderate-to-low seismic hazard regions: application to Grenoble (France). Bulletin of
Earthquake Engineering 4(3), 467–490 (2007), http://www.springerlink.com/
content/14hmjgn512805344/
5. GNDT: Istruzioni per la Compilazione della Scheda di Rilevamento Esposizione e
Vulnerabilità Sismica degli Edifici. Gruppo Nazionale per la Difesa dai Terremoti,
Regione Emilia Romagna e Regione Toscana, Italy (1986) (in Italian)
6. Girres, J.-F., Touya, G.: Quality Assessment of the French OpenStreetMap Dataset.
Transactions in GIS 14, 435–459 (2010)
7. Goodchild, M.F.: Citizens as voluntary sensors: spatial data infrastructure in the world of
Web 2.0. International Journal of Spatial Data Infrastructures Research 2, 24–52 (2007)
8. Mericskay, B., Roche, S.: Cartographie 2.0: le grand public, producteur de contenus et de
savoirs géographiques avec le Web 2.0. Cybergeo: European Journal of Geography (2010),
http://cybergeo.revues.org/24710 (in French)
9. Noucher, M.: Coproduction of spatial data: from compromise to argumentative consensus.
Conditions and participatory processes for producing spatial data together. International
Journal of Geomatics and Spatial Analysis, Special issue (2011)
10. Ruitton-Allinieu, A.M.: Crowdsourcing of geoinformation: data quality and possible
applications. Aalto University, School of Engineering, Department of Surveying (2011)
11. RiskUE, An Advanced approach to earthquake risk scenarios with applications to different
European towns, Projet Européen, EVK4-CT-2000-00014 (2003)
12. Spence, R., Lebrun, B.: Earthquake scenarios for European cities – the risk-UE project.
Bull. Earthquake Eng. 4(4) (2006) (special issue)
13. Viana, W., Miron, A.-D., Moisuc, B., Gensel, J., Villanova-Oliver, M., Martin, H.:
Towards the semantic and context-aware management of mobile multimedia. Multimedia
Tools Appl. 53(2), 391–429 (2011)
A Journey from IFC Files to Indoor Navigation

Mikkel Boysen, Christian de Haas, Hua Lu, and Xike Xie

Department of Computer Science, Aalborg University, Denmark


{bikkelmoysen,theelg}@gmail.com, {luhua,xkxie}@cs.aau.dk

Abstract. In many scenarios, people have to walk through unfamiliar indoor
spaces such as large airports, office buildings, commercial centers, etc. As a
result, indoor navigation is of realistic importance and great potential. Existing
indoor space models for indoor navigation assume that relevant indoor space
information is already available and precise in the model-specific format(s). How-
ever, such information, e.g., indoor topology that is indispensable to indoor nav-
igation, is only implicitly (and even imprecisely) hidden in industry standards
like the Industry Foundation Classes (IFC) that describe building projects. This
paper is motivated to bridge the apparent gap between industry standards and in-
door navigation. In particular, we propose an effective method to construct indoor
topology by carefully processing IFC files. We also refine an existing method that
decomposes large and/or irregular indoor partitions, which helps speed up rout-
ing in indoor navigation. Furthermore, we design an algorithm that computes in-
door distances involving concave partitions. We conduct extensive experiments to
evaluate our proposals. The experimental results demonstrate that our proposals
provide effective processing of IFC files and efficient indoor navigation.

1 Introduction

Every day, many people around the world have to walk through unfamiliar indoor
spaces such as large airports, office buildings, commercial centers, etc. Due to various
factors like modern design and functional requirements, many such indoor spaces are
becoming increasingly large and complex. Therefore, indoor navigation is of realistic
importance and great potential. In particular, indoor navigation is useful for those peo-
ple that are not familiar with the internal structure of the building. Indoor navigation
is also very useful for those in a hurry to get from one place to another in an indoor
environment, e.g., from the check-in desk to the correct boarding gate at a large airport.
An indoor navigation system requires an appropriate indoor space model that repre-
sents the indoor entities as well as topology among those entities. Such a requirement is
alike to the use of maps of road networks in outdoor navigation settings. Nevertheless,
modeling indoor spaces is unique due to at least two reasons. First, movement inside
an indoor space is enabled and constrained by various indoor entities like walls, doors,
staircases, etc. Second, Euclidean distances and network distances that work for outdoor
settings fall short in indoor spaces because of the indoor entities and topologies.
Recently, some indoor space models [12, 18, 22] have been proposed to support in-
door navigation at different semantic levels. These proposals more or less assume that
all information about indoor entities and topology is already available and precise in


the model-specific format(s) when the indoor space model is to be generated for a given
indoor space. This assumption, however, is often questionable in reality.
In the architecture, engineering and construction (AEC) industries, the Industry Foun-
dation Classes (IFC) model, an international standard registered by ISO, is often used
to describe buildings. For example, its use is compulsory for publicly funded building
projects in Denmark [17]. The IFC model does not represent indoor space information
in the format(s) assumed by those navigation-oriented models mentioned above. In-
stead, it focuses on the geometric representation for indoor entities, and its description
for indoor topology is only implicit or even incomplete.
Apparently, there is a gap between the widely used industry standard for describing
indoor spaces and the indoor space models for navigation purposes. The research de-
scribed in this paper is thus motivated to bridge the gap by a series of technical steps
that altogether fulfil the purpose of indoor navigation. As a matter of fact, construct-
ing indoor topology from IFC files benefits not only indoor navigation but also other
applications like indoor tracking data cleansing [2].
In particular, we make the following contributions in this paper. First, we propose
an effective method that processes the IFC file of an indoor space and constructs in-
door topology from the file. Second, after the indoor topology is constructed, we design
a refined decomposition method that breaks down large indoor partitions in irregular
shapes. The decomposition can help speed up routing that involves large indoor par-
titions in irregular shapes. Third, with the availability of indoor topology and refined
decomposition, we design an algorithm that is able to compute indoor distances for
complex partition shapes. Last, we conduct extensive experiments to evaluate our pro-
posals. The results demonstrate that our proposals provide effective processing of IFC
files and efficient indoor navigation.
The rest of this paper is organized as follows. Section 2 reviews related work. Sec-
tion 3 elaborates on processing IFC files. Section 4 presents the refined decomposition.
Section 5 addresses the indoor routing. Section 6 reports the experimental results. Sec-
tion 7 concludes the paper.

2 Related Work

As we work on building indoor navigation systems from digital building information,
the related work comes from two aspects: the development of digital building informa-
tion, and the proposals about indoor space models as well as indoor navigation.

2.1 Digital Building Information

Digital representations of indoor space information make it possible to generate in-
door space models automatically by computer programs. Currently, two methodologies
exist supporting digital building information: CAD and BIM. CAD lacks certain fea-
tures required to improve the working process within the AEC industries, and therefore
BIM (building information modeling) has been established as “digital representation of
physical and functional characteristics of a facility” [14]. A main advantage of BIM is
interoperability, which enables quick and easy sharing of building information [13].

In order to eliminate waste resulting from recollection and recreation of information
in the AEC industries, BuildingSMART has developed the Industry Foundation Classes
(IFC), a neutral data format supporting BIM. IFC can be used to describe, exchange and
share construction project information, and is supported by about 150 software appli-
cations worldwide [5]. Due to its popularity, we use IFC as the source for indoor space
model generation. An introduction to the structure of IFC files is given in Section 3.1.

2.2 Indoor Space Models and Indoor Navigation

Lee [10] proposes the 3D Geometric Network Model that treats the vertical and hori-
zontal connectivity relationship among 3D spatial cells separately. Whiting et al. [16]
propose a 3D metrical-topological model that describes both shapes and connectivity of
spatial cells. Becker et al. [15] combine space partitions with possible events in a dual
space to enable navigation in multi-layered buildings. Focusing on topological relation-
ships, these models do not support distance-aware indoor navigation.
CityGML, promoted by Kolbe et al. [6], is an OGC GML standard for modeling urban
elements as 3D objects. It offers architectural models for building interiors at its finest Level of
Detail (LOD 4). Topological connections between geometries are optional in CityGML.
Most recently, Li et al. [7] are working on IndoorGML to develop it into an application
schema of OGC GML for indoor spaces.
Anagnostopoulos et al. [1] propose an ontological framework for indoor routing
that considers user profiles. Yang and Worboys [21] propose a navigation ontology for
outdoor-indoor spaces. Li and Lee [11] design a lattice-based semantic location model
where “length” of an indoor path is measured by the number of doors. This model falls
short in many practical scenarios as it does not calculate the real indoor distances [12].
Yuan and Schneider [22] propose the iNav model for indoor navigation. It however
does not support arbitrary indoor position-to-position distances. Xu and Güting [20]
propose a generic data model for moving objects in three kinds of spaces: free space,
road networks, and indoor space. Both models [20, 22] map doors to graph nodes and
rooms to edges. Such a design does not support the door directionality and is incompat-
ible with realistic operations like closing/opening a door. Yuan and Schneider [23] also
propose the LEGO representation that involves only so-called connector-to-connector
indoor distances. Lu et al. [12] develop a distance-aware indoor space model that sup-
ports indoor navigation for two arbitrary positions in an indoor space. Therefore, we
employ this latest indoor space model in this research.
This work differs substantially from previous works by the following features. First,
rather than formulating a new indoor space model, this work proposes the necessary tech-
niques for instantiating an existing indoor space model [12] by processing raw digi-
tal building information appropriately. Such efforts have not been seen in the aforementioned
existing research on indoor space models. Second, this work improves existing tech-
niques in relevant aspects. In particular, this work proposes a refined indoor partition
decomposition method compared to an existing method [19]. Also, this work elabo-
rates on intra-partition indoor distance computation, which is only abstract in previous
research. Third, this work demonstrates the complete procedure of building indoor nav-
igation systems from raw digital building information. We also implement a prototype

system [3], which consists of a server and wireless-enabled mobile terminals, to verify
and evaluate the proposals in this work.

3 Processing IFC Files


In this section, we briefly present the IFC file format and then detail how to construct
indoor topology from the IFC files.

3.1 IFC File Format


IFC is an open data format for representation and exchange of digital building informa-
tion. The default, and most commonly used type of IFC file format is IFC-SPF (.ifc),
an exchange file structure, which is defined by ISO 10303-21, also known as a STEP-
File [4]. IFC files of this type contain a header section for meta-data and a data section
for the building project data. In the data section, each line represents an IFC entity in-
stance, which has its own unique identifier. Instances include attributes, which describe
actual data, and references to other instances via their unique identifiers. A segment
of an IFC file is shown in Listing 1.1.
Listing 1.1. Example of IfcSpace in an IFC file
#79155= IFCDIRECTION((-1.,0.,0.));
#79159= IFCCARTESIANPOINT((71980.,62135.,0.));
#79163= IFCAXIS2PLACEMENT3D(#79159,#36,#79155);
#79166= IFCLOCALPLACEMENT(#105,#79163);
#79169= IFCSPACE('1mijPj97fEJeJTfqANk3Do',#13,'0.0.56',$,$,#79166,#79151,'Toilet',.ELEMENT.,.INTERNAL.,$);
A typical IFC file contains an IfcSite element at the top. In the top-down fashion,
this element contains one or more IfcBuilding elements each of which contains one
or more IfcBuildingStorey elements. Each IfcBuildingStorey element in turn contains
the building elements (e.g., doors) on that storey. In other words, an IfcSite element
corresponds to a building project, consisting of one or several buildings, each of which contains
one or several floors populated with doors, walls, etc. All elements are located according
to a 3D Cartesian coordinate system within the building project.
Important elements in an IFC file are described below:
IfcBuilding represents a building and the root of all of the following elements.
IfcBuildingStorey represents a floor and contains all of the following elements.
IfcSpace corresponds to an indoor partition constrained by walls. It is created with
location, direction and polyline coordinates, which are used to determine the position
and geometry of the partition.
IfcDoor corresponds to a door. An IfcDoor element is created with position and
direction coordinates, which are used to determine the position of the door in the global
coordinate system.
IfcStair corresponds to a stair. An IfcStair element is created with a reference to the floor
on which it is defined.
IfcTransportElement corresponds to an elevator. It is created with reference to the
floor on which it is defined.

An IFC file contains explicit information about the containment relationship between
floors and building elements, but it lacks the topological relationships between con-
nected elements, e.g., a door and the connected room(s). Consequently, IFC files cannot
be used directly in indoor navigation that requires indoor topology.
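As a reading aid for the format just described, the following minimal sketch extracts the entity identifier, the entity type and the raw attribute string from IFC-SPF data lines such as those in Listing 1.1. It is only an illustration under simplifying assumptions (attributes containing escaped quotes or nested records are not handled); it is not the parser used in this work, which could equally rely on a full STEP parser.

# Simplified illustration: one regular expression per IFC-SPF data line.
import re

LINE_RE = re.compile(r"#(\d+)\s*=\s*(\w+)\s*\((.*)\)\s*;\s*$")

def parse_ifc_lines(lines):
    """Yield (entity_id, entity_type, raw_attributes) for each matching data line."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            entity_id, entity_type, attributes = match.groups()
            yield int(entity_id), entity_type, attributes

example = [
    "#79159= IFCCARTESIANPOINT((71980.,62135.,0.));",
    "#79169= IFCSPACE('1mijPj97fEJeJTfqANk3Do',#13,'0.0.56',$,$,#79166,#79151,'Toilet',.ELEMENT.,.INTERNAL.,$);",
]
for eid, etype, attrs in parse_ifc_lines(example):
    print(eid, etype, attrs)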

3.2 Construction of Indoor Topology from IFC Elements


In this section, we propose an effective and efficient method to construct indoor topol-
ogy from the geometric information available in an IFC file.
We start by describing a naive idea that works as follows. It treats each door as a
point and assumes all walls are parallel to the x or the y axis. It attempts to connect a
door to the walls (and thus the partitions) whose x-distance (or y-distance) to the door
is not beyond a given threshold. We call this naive idea the threshold based method.
The aforementioned simplifications required by the naive idea may not hold in re-
alistic indoor settings. Therefore, we propose a generic indoor topology construction
method that does not need the simplifications. The proposed generic method regards
each door as a line segment, and can handle walls in arbitrary orientations. Generally,
it consists of two steps and follows the filter-and-refinement paradigm.
The first step is a filtering step to quickly prune the search space to a small number of
candidate partitions. It first draws a line segment perpendicular to the line representing
the door. Next, only those partitions that are on the same floor and intersect with the
perpendicular line segment will be passed to the second step. In the example shown in
Figure 1(a), partitions A, C, E, G are pruned after the perpendicular line segment is drawn.

Fig. 1. An example of the generic method
The second step only runs if there are more than two candidate partitions as each door
is naturally connected to two indoor partitions. In this step, all the walls in the remaining
candidate partitions are inspected and if a wall intersects with the perpendicular line
segment introduced in the first step, the partition that the wall belongs to is stored along
with the shortest distance from the wall to the door. The two partitions containing the
wall(s) with the shortest distance to the door are returned as the partitions to which
the door should be connected. This step is illustrated in Figure 1(b), where the walls
in green color are found to contain the door, and therefore the door is connected to
partitions D and F finally.
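To make the two steps concrete, the following sketch implements the filter-and-refinement idea on simplified data structures, using the Shapely geometry library. The partition dictionaries and the probe length are illustrative assumptions; the actual system works directly on the geometries extracted from the IFC file.

# Sketch of the generic door-to-partition matching (filter-and-refinement); data structures simplified.
from shapely.geometry import LineString
from shapely.affinity import rotate, scale

def perpendicular_probe(door: LineString, length: float = 2.0) -> LineString:
    """Build a probe segment perpendicular to the door, centred on the door's midpoint."""
    mid = door.interpolate(0.5, normalized=True)
    probe = rotate(door, 90, origin=mid)                 # rotate the door segment by 90 degrees
    factor = length / probe.length
    return scale(probe, xfact=factor, yfact=factor, origin=mid)

def connect_door(door: LineString, partitions: list) -> list:
    """Return the names of the (at most two) partitions the door should be connected to.

    Each partition is assumed to be a dict {"name": ..., "polygon": Polygon,
    "walls": [LineString, ...]} lying on the same floor as the door.
    """
    probe = perpendicular_probe(door)
    # Filtering step: keep only partitions whose polygon intersects the perpendicular probe.
    candidates = [p for p in partitions if p["polygon"].intersects(probe)]
    if len(candidates) <= 2:
        return [p["name"] for p in candidates]
    # Refinement step: keep the two partitions whose intersecting walls are closest to the door.
    scored = []
    for p in candidates:
        hit_walls = [w for w in p["walls"] if w.intersects(probe)]
        if hit_walls:
            scored.append((min(w.distance(door) for w in hit_walls), p["name"]))
    scored.sort()
    return [name for _, name in scored[:2]]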
We experimentally compare the generic method with the threshold based method
(the naive idea) in Section 6.1. The superiority of the generic method is verified with
respect to various criteria.

4 Refined Indoor Partition Decomposition


A previous work [19] proposes to decompose indoor partitions that are large and/or of
irregular shapes, such that all indoor partitions are balanced in terms of size and shape.
A benefit is that the dead space (empty area in which no data resides) is reduced
significantly in the tree node MBRs when indoor partitions are indexed by an R-tree,
and thus the search for indoor partitions via the R-tree becomes more efficient. We also
use an R-tree to index the indoor partitions when we implement the distance-aware
indoor space model proposed elsewhere [12].
In this section, we propose a refined decomposition algorithm. In addition to improv-
ing the R-tree structure, the refined decomposition algorithm also helps speed up indoor
routing, as demonstrated by the experimental results shown in Section 6.2.

4.1 The Existing Decomposition Algorithm


We briefly present the existing decomposition algorithm in [19]. To decompose a con-
cave partition, turning points are used. A turning point in a partition is a point that
creates an internal angle greater than 180 degrees. When a concave partition is decom-
posed, it is split by a line segment drawn perpendicular to the longer dimension of the
partition through the turning point closest to the middle of the partition. This is done
recursively until no more turning points exist. Convex partitions are decomposed if the
lengths of their dimensions are imbalanced, i.e. the ratio between the width and height
exceeds a predefined threshold, Tshape . Here, the partition is split by a line segment
drawn perpendicular to the longer dimension through the middle point on that dimen-
sion. This is also done recursively until the partition is balanced.
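For illustration, a turning point (a reflex vertex whose interior angle exceeds 180 degrees) can be detected with a simple cross-product test on the polygon's vertex sequence. The sketch below assumes a simple polygon given in counter-clockwise order; it is not the code of [19].

# Sketch: find turning points (reflex vertices, interior angle > 180 degrees) of a simple polygon.
# Assumes vertices are listed counter-clockwise, without repeating the first vertex at the end.
def turning_points(vertices):
    reflex = []
    n = len(vertices)
    for i in range(n):
        ax, ay = vertices[i - 1]          # previous vertex
        bx, by = vertices[i]              # current vertex
        cx, cy = vertices[(i + 1) % n]    # next vertex
        # z-component of the cross product of edges (a->b) and (b->c);
        # for CCW polygons a negative value means the interior angle at b exceeds 180 degrees.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross < 0:
            reflex.append((bx, by))
    return reflex

# An L-shaped (concave) partition: the vertex (2, 2) is its single turning point.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
print(turning_points(l_shape))   # [(2, 2)]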

4.2 Refined Decomposition Algorithm


Dead Space Threshold. The existing decomposition algorithm [19] may result in in-
appropriate sub-partitions on special inputs. In Figure 2(a), the decomposition creates
many small sub-partitions. In Figure 2(b), the decomposition creates a very imbalanced
partition, i.e., a partition with significantly longer width than height. Further decompo-
sition of this sub-partition involves multiple dimensional splits, again resulting in many
small sub-partitions. The number of small sub-partitions created is inversely proportional
to the percentage of dead space in the MBR, which, in practice, can cause
memory problems. Considering the small amount of dead space removed and the large
amount of small sub-partitions created, decomposition on these kinds of partitions is
considered unnecessary. Therefore, a threshold, Tdeadspace , is introduced to ensure that
partitions are not decomposed unless they have a minimum predefined percentage of
dead space in their MBRs.
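The dead-space condition can be expressed directly on the partition polygon and its MBR, as sketched below with Shapely; the threshold value of 0.2 is an arbitrary example, not the setting used in the experiments.

# Sketch: decide whether a partition has enough dead space in its MBR to justify decomposition.
from shapely.geometry import Polygon

def dead_space_ratio(partition: Polygon) -> float:
    """Fraction of the partition's minimum bounding rectangle not covered by the partition."""
    return 1.0 - partition.area / partition.envelope.area    # envelope = axis-aligned MBR

def needs_decomposition(partition: Polygon, t_deadspace: float = 0.2) -> bool:
    return dead_space_ratio(partition) > t_deadspace

l_shape = Polygon([(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)])
print(round(dead_space_ratio(l_shape), 2))      # 0.25: a quarter of the MBR is dead space
print(needs_decomposition(l_shape))             # True with the assumed threshold of 0.2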
Best Split on Turning Point. In the original algorithm, splitting the partition using
turning points is performed perpendicular to the longer dimension. However, this will
not always result in a split where the sub-partitions created are most balanced, i.e. the
ratio between the width and height is closest to equal. Sometimes, a split made perpen-
dicular to the shorter dimension can result in a more balanced split than a split made
perpendicular to the longer dimension.

An example of this can be seen in Figure 3. The two scenarios shown in the figure
are identical apart from how the split is performed. As can be seen, the height of the
partition exceeds the width and the split should, according to the original algorithm,
be performed perpendicular to the height, as illustrated in Figure 3 (a). This, however,
creates a large and a small sub-partition, where the small partition is imbalanced. If the
split is performed perpendicular to the width of the partition, as seen in Figure 3 (b),
more balanced and evenly sized sub-partitions are created. As such, the solution to this
problem is to find and use the split which creates the most balanced sub-partitions.

Fig. 2. Example of problematic decompositions
Fig. 3. Example of choosing the best split

Alignment on Dimension Split. Using a dead space threshold, the existing algorithm
[19] is able to decompose imbalanced convex partitions as intended. However, it is still
possible that a resulting partition contains turning points and is concave. This can cause
problems when making the dimensional split, i.e., the split perpendicular to the longer
dimension through the middle point on that dimension. An example is shown in Fig-
ure 4, where the partition is decomposed into two sub-partitions A and B. By examining
the partition before it is decomposed, it can be seen that it does contain a turning point.
However, assuming that the percentage of dead space does not exceed the threshold,
the partition is not decomposed using turning points. After the dimensional split, the
percentage of dead space in sub-partition A may exceed the threshold, allowing for
decomposition using the turning point. Using the best split solution described above
would create a small sub-partition in the upper right corner of sub-partition A. This
small sub-partition becomes more imbalanced when the dimensional splitting line is
closer to the turning point, possibly resulting in multiple dimensional splits.
The solution to this problem is to search the partition for turning points when making
the dimensional split. If any turning point in the partition is within the distance of a pre-
defined threshold, Talign , the splitting line is aligned with the turning point, eliminating
the aforementioned problem.

Fig. 4. An example of dimensional split problem



Algorithm. The improvements described above are integrated into Algorithm 1. The
refined algorithm takes as input the partition region r and three thresholds: Tshape ,
Tdeadspace , and Talign . First, in Line 2, the set of turning points is found for r. Next, in
Line 3, a check is made to see if r is concave and the amount of dead space in r exceeds
Tdeadspace . If that is the case, r is decomposed using turning points in Line 4-8. Here,
the turning point t closest to the middle of r is used to create two splitting lines. The
splitting line that results in the best split, is used to divide r into two or more regions,
{ri }. The decomposition algorithm is then run recursively on each of those regions.
If the check in Line 3 is not valid, another check is performed to determine if r is
imbalanced, as seen in Line 11. If that is the case, r is decomposed using dimensional
split (Lines 12-18). Here, the middle point m on r’s longer dimension is used to create
a splitting line s. Then s is aligned with any nearby turning point using the threshold
Talign . Finally, r is divided into two or more regions, {ri }, and the decomposition
algorithm is run recursively on each of those regions.

Algorithm 1. Refined Decomposition


1: function Decompose(region r, thresholds Tshape , Tdeadspace , Talign )
2: find the set of turning points P for r;
3: if r is concave and deadspace(r) > Tdeadspace then
4: select a turning point t ∈ P on r’s boundary, such that t is closest to the middle of r;
5: create two splitting lines, one for each dimensions to find the best split;
6: use the best splitting line to divide r into two or more regions: {ri };
7: for each ri in {ri } do
8: Decompose(ri, Tshape , Tdeadspace , Talign );
9: else
10: let R(r) be the MBR of r;
11: if min(len(R(r)1), len(R(r)2)) / max(len(R(r)1), len(R(r)2)) < Tshape then
12: find the middle point m on r’s longer dimension d;
13: create a splitting line s perpendicular to d through m;
14: if a turning point t ∈ P is within Talign of s then
15: align s to cut through t;
16: use s to divide r into two or more regions: {ri };
17: for each ri in {ri } do
18: Decompose(ri , Tshape , Tdeadspace , Talign );

5 Intra-Partition Distances in Indoor Routing

This section describes the algorithm for indoor distance computation and routing. The
overall routing algorithm we use is from a previous work [12]. It is based on door-to-
door distances and is able to compute the shortest route between two positions in the
indoor space. This is done by using an exploratory approach to find possible shortest
routes between two points and updating the result if a shorter route is found as the dif-
ferent paths are traversed. As the algorithm runs, the currently shortest indoor distance

from a source door to a destination door is stored and used to avoid unnecessary com-
putations. Furthermore, the door-to-door distances computed at a given time are reused
to limit the search. The computation of intra-partition distances between doors is not
the focus of the previous work [12]. In the following, we focus on the intra-partition
distances and design a refined algorithm to compute them.
In a simplified way, the shortest distance between two doors can be assumed to be
the Euclidean distance between them. However, this is not always correct, because the
path that represents this distance can be obstructed by obstacles or walls in concave
shaped partitions. An example is shown in Figure 5(a), where s and t are doors.

Fig. 5. Example of intra-partition distance
Fig. 6. Example of the initial split

A straightforward approach to compute intra-partition distances is to use Dijkstra's
algorithm [8], as proposed in [24]. This is done by creating a graph in which the ver-
tices of the partition polygon and the start/end points are modeled as vertices. Edges are
applied between graph vertices if a line segment can be drawn between them without
intersecting the polygon or being outside it. If an edge is applied, the weight is set to
be the length of it, i.e., the Euclidean distance between the two vertices. An example
of such a graph is illustrated in Figure 5(b), for the start and end points s and t in Fig-
ure 5(a). When Dijkstra’s algorithm is run on the graph, the shortest distance between
the start and end point is computed. We call this approach Obstructed Distance (OD). It
can be computationally expensive if the partition polygon has many vertices.
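A compact sketch of this obstructed-distance computation, combining Shapely for the geometric tests with NetworkX for Dijkstra's algorithm, is given below. It illustrates the idea on an L-shaped partition and is not the implementation evaluated in Section 6.

# Sketch of the Obstructed Distance (OD) approach: visibility graph + Dijkstra.
import networkx as nx
from shapely.geometry import LineString, Polygon

def obstructed_distance(partition: Polygon, s, t) -> float:
    """Shortest start-to-end distance inside the partition via a visibility graph."""
    nodes = list(partition.exterior.coords)[:-1] + [tuple(s), tuple(t)]
    graph = nx.Graph()
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            segment = LineString([a, b])
            # Keep the edge only if the segment stays inside the partition
            # (covers() also accepts segments lying on the boundary).
            if partition.covers(segment):
                graph.add_edge(a, b, weight=segment.length)
    return nx.dijkstra_path_length(graph, tuple(s), tuple(t), weight="weight")

# L-shaped partition: the straight segment from s to t leaves the polygon,
# so the shortest obstructed path bends around the reflex corner (2, 2).
room = Polygon([(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)])
print(round(obstructed_distance(room, (3.5, 1.0), (1.0, 3.5)), 3))   # about 3.606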
In the sequel, we attempt to devise a more efficient method to compute intra-partition
distances. We call it Walk the Line (or WTL for short), which is formulated in Algo-
rithm 2. It takes as input a partition p, a start point s, and an end point t. As part of
the initialization, two polylines are created from the boundary of p, using the start and
end point, as seen in Line 3. An example of this procedure is depicted in Figure 6,
where the two highlighted polylines, which constitute the boundary of the partition, are
created using points s and t as splitting points.
The algorithm returns a set of points, Rpath , which constitute the shortest path from
s to t. After adding s to Rpath and initializing a temporal point temp to be s in Line 4-
5, a while-loop is run from Line 6-16, which encompasses the code to find the shortest
path. This part of the algorithm is explained using the example shown in Figure 7.
In Figure 7(a), the line segment from s to t is represented by a dashed line, which
corresponds to Line 7. This line intersects with the boundary of the partition, and an
intermediate point must be found, which is done in Lines 11-14.

Algorithm 2. Walk the Line


1: function WalkTheLine(partition p, point s, point t)
2: set of points Rpath ; point temp;
3: split the boundary of p into polylines pl1 and pl2 using s and t;
4: add s to Rpath ;
5: temp ← s;
6: while temp != t do
7: draw a line segment ltt from temp to t;
8: if ltt intersects with the boundary of p then
9: set point cp to the intersecting point closest1 to temp;
10: set polyline pl to pl1 or pl2 , such that pl contains cp;
11: for each concave vertex v on pl, starting with the one closest to t on pl do
12: draw a line segment ltv from temp to v;
13: if ltv does not intersect and is within the boundary of p then
14: add v to Rpath ; temp ← v; break
15: else
16: temp ← t; add t to Rpath ;
17: return Rpath ;

Fig. 7. An example of the steps of Walk the Line



Here, the two polylines created in the initialization are used and since the intersecting
point closest1 to s, the point cp, lies on the polyline highlighted in blue, the intermediate
point is chosen from the concave vertices on this polyline by “walking the line”. This is
done by iterating through the concave vertices, starting with the one encountered first
when traversing the polyline from t to temp, in this case v4 . For each concave vertex,
the line segment from temp to the vertex is drawn, and the first vertex that the line
segment does not intersect and that is within the boundary of the partition, is chosen as
the intermediate point. This process is shown in Figure 7(b) and 7(c). First, v4 is tested
but the line drawn intersects with the boundary of the partition. So the next concave
vertex v1 is tested and it passes as an intermediate point since the line segment drawn
does not intersect with the boundary of the partition. As such, v1 is added to Rpath and
chosen as the new temp (Line 14).
Figure 7(d) illustrates the start of a new while-loop iteration, and the procedure
shown in Figure 7(a) is repeated with v1 as temp. In this iteration, cp lies on the other
polyline created in the initialization, which is then used. The for-loop in Lines 11-14 is
repeated once again, as shown in Figure 7(e), where the first concave vertex encountered
on pl, v3 , is tested. The process is continued and repeated until t is reached, as shown
in Figure 7(f).

6 Experimental Studies

In this section, we report the results of our experimental studies. Section 6.1 reports the
results on the effectiveness and efficiency of IFC file processing. Section 6.2 reports the
results on indoor navigation.
In the experiments, we use eight IFC files (see Table 1) to evaluate our technical
proposals. Those files describe eight real buildings in Scandinavia. All experiments are
done on a Windows 7 enabled PC with an Intel Core 2 Duo P8600 @ 2.4 GHz processor
and 2.25 GB RAM.

Table 1. IFC files used in the experiments and relevant results

File alias | Size (MB) | Floor(s) | Partition(s) | Door(s) | Stair(s) | Elevator(s) | Avg. exe. time (ms)
AC11 | 2.77 | 5 | 82 | 77 | 4 | 0 | 1945
Office A | 4.00 | 3 | 99 | 102 | 2 | 0 | 3541
NEM-FZK | 10.16 | 2 | 5 | 5 | 1 | 0 | 350
Cassio | 10.82 | 4 | 352 | 433 | 14 | 0 | 8806
Clinic | 12.90 | 4 | 269 | 249 | 3 | 0 | 8940
Dds | 42.73 | 5 | 99 | 106 | 0 | 0 | 2893
HITOS | 62.59 | 7 | 243 | 198 | 14 | 3 | 8292
Statsbygg | 66.95 | 5 | 120 | 124 | 7 | 1 | 4991

1 The point which has the shortest Euclidean distance to another given point.

6.1 Results on IFC File Processing

We first compare the two indoor topology construction methods, namely the threshold
based method and the generic method detailed in Section 3.2. The results are listed
in Table 2. It is apparent that the generic method is more effective as it successfully
connects more doors to their partitions. This is attributed to the fact that it lifts the two
simplifying assumptions that may be hard to satisfy in reality. In contrast, the threshold
based method performs quite badly when the threshold is low (100 in the experiments),
and it is difficult for an average user to set appropriate threshold values.

Table 2. Results of the two indoor topology construction methods

#doors connected to partitions
File name | Method | to 0 | to 1 | to 2 | to >2
"AC11" | threshold (100) | 5 | 72 | 0 | 0
"AC11" | threshold (200) | 5 | 70 | 2 | 0
"AC11" | threshold (400) | 0 | 5 | 72 | 0
"AC11" | generic | 0 | 1 | 76 | 0
"Office A" | threshold (100) | 49 | 53 | 0 | 0
"Office A" | threshold (200) | 20 | 41 | 40 | 1
"Office A" | threshold (400) | 19 | 38 | 44 | 1
"Office A" | generic | 0 | 15 | 87 | 0
"Nem-FZK" | threshold (100) | 2 | 3 | 0 | 0
"Nem-FZK" | threshold (200) | 2 | 3 | 0 | 0
"Nem-FZK" | threshold (400) | 1 | 1 | 3 | 0
"Nem-FZK" | generic | 0 | 2 | 3 | 0
"Cassio" | threshold (100) | 64 | 364 | 5 | 0
"Cassio" | threshold (200) | 44 | 112 | 276 | 1
"Cassio" | threshold (400) | 36 | 95 | 300 | 2
"Cassio" | generic | 12 | 86 | 335 | 0
"Clinic" | threshold (100) | 121 | 127 | 1 | 0
"Clinic" | threshold (200) | 54 | 98 | 92 | 5
"Clinic" | threshold (400) | 47 | 98 | 95 | 9
"Clinic" | generic | 2 | 21 | 226 | 0
"Dds" | threshold (100) | 2 | 72 | 32 | 0
"Dds" | threshold (200) | 2 | 22 | 82 | 0
"Dds" | threshold (400) | 2 | 11 | 89 | 4
"Dds" | generic | 0 | 8 | 98 | 0
"HITOS" | threshold (100) | 48 | 135 | 15 | 0
"HITOS" | threshold (200) | 24 | 75 | 98 | 1
"HITOS" | threshold (400) | 21 | 74 | 102 | 1
"HITOS" | generic | 1 | 16 | 181 | 0
"Statsbygg" | threshold (100) | 31 | 65 | 28 | 0
"Statsbygg" | threshold (200) | 20 | 23 | 80 | 1
"Statsbygg" | threshold (400) | 13 | 41 | 69 | 1
"Statsbygg" | generic | 0 | 1 | 123 | 0

Table 3. Processing time on IFC files

File and Floor | Decomposed | #Partitions | #Access points | generic (ms) | threshold (ms)
"Ground Floor" in Cassio | No | 170 | 192 | 1933 | 2202
"Ground Floor" in Cassio | Yes | 241 | 266 | 2350 |
"First Floor" in Clinic | No | 155 | 173 | 1147 | 2235
"First Floor" in Clinic | Yes | 211 | 231 | 1260 |
"Level 1" in Office A | No | 60 | 68 | 451 | 1180
"Level 1" in Office A | Yes | 93 | 105 | 468 |
"Third Floor" in Dds | No | 37 | 42 | 507 | 579
"Third Floor" in Dds | Yes | 52 | 57 | 516 |

We also look at the execution time of processing the IFC files. We process each IFC
file 100 times and report the average processing time. The results obtained using
the threshold based method are shown in Table 1 (the last column). Even the IFC
file with the largest number of indoor entities takes less than 9 seconds to process.
We further run the generic method 100 times on a particular floor of each of the four
selected IFC files and give the average results in Table 3 (the last column). The four
IFC files are selected to represent the different sizes of all the IFC files we have. For
comparison, we also give the corresponding processing times of the threshold based
method. As we can see, the generic method is more efficient.

6.2 Results on Indoor Distances and Routing


In computing routes for indoor navigation, we use the approach proposed in the previ-
ous work [12] after the indoor space model is generated from the IFC files. In particular,
when computing intra-partition distances we compare the walk the line method (WTL
described in Section 5) and the obstructed distance method (OD) that adopts Dijkstra’s
algorithm. Also, we investigate how the routing efficiency is affected by the partition
decomposition. Further, when multiple routing requests are processed, we use a cache to
store the door-to-door distances and reuse them in processing subsequent requests.
The experiments are conducted on four IFC files: Office A, Cassio, Clinic, and Dds.
Again, they are selected to represent well the different sizes of all the IFC files we have.
In the experiments, we generate random indoor route requests as follows. First, an
indoor partition is chosen at random from all partitions in a building. Inside that parti-
tion, a position is decided at random as the start point of the route request. Then, the
end point is decided likewise, except that it is always in a different partition from
the start point. This is to make the tests more meaningful. Furthermore, the following
configurations are set for the different tests:
WTL vs. OD. This test is run with four different IFC files, which differ in number of
access points, partitions, and concave partitions. The routing algorithm is run for dif-
ferent files because the computation time of intra-partition distance computations
varies with the shape of the partition polygon and the placement of access points. For
each file, the routing algorithm is run on the same 50 randomly generated start and
end points for WTL and for OD.

Fig. 8. Computation time for 50 routes on IFC files: (a) Office A, (b) Clinic, (c) Cassio, (d) Dds



Using a cache (WTL vs. OD). This test is performed by running each algorithm, start-
ing with an empty cache, for 50 randomly generated routes. This process is repeated
50 times to calculate the average computation time for each algorithm after a spe-
cific number of routes. The same four IFC files are also used for this test.
Decomposing Partitions. Another test is performed after large indoor partitions are
decomposed in the indoor space model. In the routing algorithm, the WTL solution
is used for these tests. This method is referred to as WTLD.

Routing Efficiency. We first report the computation cost for WTL and OD. The results
are shown in Figure 8. The bar charts show the computation time using OD in percent-
age of the computation time using WTL for the same route, where the red horizontal
lines indicate the cost of using WTL. For example, as shown in Figure 8(a), the com-
putation time using OD is almost 200% the time using WTL for the first route in the
indoor space described by IFC file Office A. In each of the four charts, it is clear that
route computation using OD takes longer than that using WTL. This indicates that
WTL is more efficient at intra-partition distance computations. However,
the computation time varies a lot from route to route, which can be explained by the
varying number of concave partitions in those routes.
According to the results obtained using the IFC file Clinic (shown in Figure 8(b)),
OD is significantly (up to 23 times) slower than WTL. The performance gap is much
larger than that observed in the three other figures. A review of the Clinic file reveals
that it contains a relatively high number of concave partitions, which in
most cases are connected to many doors. As a result, more intra-partition distances need
to be computed for the Clinic file. Furthermore, many partitions in the Clinic file have
a high number of vertices in their graph representations. As the OD method basically
employs Dijkstra’s algorithm to search the local graph, it inevitably incurs considerably
higher computation time than the WTL method.
As mentioned, the difference in computation time between using OD and using WTL
varies significantly depending on the route. To provide an overview of the average com-
putation time for one route, for each of the different IFC files, a bar chart is shown in
Figure 9. For the file Clinic, the average computation time using OD is as much as
114.4 s. However, the bar chart is intentionally cut off at 25 s due to space limitations.
Again, it is very clear that the average computation time for OD is significantly longer
than that for WTL in all tested cases.

Fig. 9. The average computation time per route



Fig. 10. Effect of cache on IFC files: (a) Office A, (b) Clinic, (c) Cassio, (d) Dds

Another important note from Figure 9 is that WTLD clearly outperforms WTL and
OD in terms of the average route computation time. This is an expected result. The
number of concave partitions is reduced substantially after the decomposition, and thus
WTLD is able to save considerable computation costs that otherwise would be needed
in routing involving concave partitions.

Effect of Cache. The results of the tests using cache are shown in Figure 10. For the
first couple of routes, in each of the charts, WTLD is faster than WTL, which is in turn
faster than OD. This is expected as the cache does not contain many reusable results at
the early stage. Therefore, these results reflect the base performance of each algorithm,
as already analyzed in Section 6.2.
When more routes are to be computed, both WTL and OD improve clearly. This
indicates that the cached door-to-door distances are reused heavily by these two methods.
A different but interesting observation from these figures is that WTLD does not benefit
visibly from the cache. When a large indoor partition is decomposed, several small par-
titions are generated. For each pair of adjacent small partitions, one, two, or even more
“virtual” doors are also generated to connect them. As a result, there are considerably
more door-to-door distances when decomposition is used. As the cache size is fixed,
the ratio of cached door-to-door distances decreases, which lowers the
cache utilization when intra-partition distances are calculated using WTLD.
On the other hand, when large indoor partitions are not decomposed, the ratio of
cached door-to-door distances is relatively high, and thus the cache is utilized more when
OD and WTL are used. Nevertheless, all methods eventually converge once sufficiently
many route requests have been processed, because few new door-to-door distances
remain to be cached after many routes have been calculated.
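The door-to-door distance cache discussed above can be sketched as a simple fixed-capacity memo table. The eviction policy and all names below are illustrative assumptions; the paper does not specify them:

# Sketch of a fixed-size door-to-door distance cache (illustrative only).
from collections import OrderedDict

class DoorDistanceCache:
    def __init__(self, capacity, compute_distance):
        self.capacity = capacity                  # fixed cache size, as in the experiments
        self.compute_distance = compute_distance  # hypothetical routing callback
        self.table = OrderedDict()                # (door_a, door_b) -> distance

    def distance(self, door_a, door_b):
        key = (door_a, door_b)
        if key in self.table:                     # reuse a cached door-to-door distance
            self.table.move_to_end(key)
            return self.table[key]
        d = self.compute_distance(door_a, door_b)
        self.table[key] = d
        if len(self.table) > self.capacity:       # evict the oldest entry
            self.table.popitem(last=False)
        return d

With decomposition, the number of distinct door pairs grows while the capacity stays fixed, so a smaller fraction of pairs can be cached, which is consistent with the weaker cache effect observed for WTLD.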

7 Conclusion
There is a large gap between indoor space navigation needs and the industry standards
for describing indoor spaces. The research reported in this paper accepts the Industry
Foundation Classes (IFC) as input and supports efficient indoor navigation with spe-
cialized techniques. In particular, we propose an effective method that processes IFC
files and creates necessary topological relationships needed by indoor navigation. Also,
we refine an existing decomposition method that decomposes large indoor partitions
into smaller ones such that indoor routing can be done faster. Further, we design a local
algorithm for calculating intra-partition distances that renders indoor routing more effi-
cient. We conduct a series of experiments using several real IFC files. The experimental
results demonstrate that our proposals offer effective processing of IFC files and result
in more efficient indoor navigation than alternatives.
For future research, it is interesting to extract stair and elevator information from IFC
files to support indoor navigation across floors. The techniques proposed in this paper
can be easily extended for that purpose. It is also interesting to collaborate with the AEC
industry on enhancing the IFC standard with built-in support for indoor navigation.

Acknowledgments. This work is partly supported by the NILTEK project funded by the
European Regional Development Fund and the BagTrack project funded by the Danish
National Advanced Technology Foundation under grant no. 010-2011-1.

References
1. Anagnostopoulos, C., Tsetsos, V., Kikiras, P., Hadjiefthymiades, S.P.: OntoNav: A semantic
indoor navigation system. In: Workshop on Semantics in Mobile Environments (2005)
2. Baba, A.I., Lu, H., Xie, X., Pedersen, T.B.: Spatiotemporal Data Cleansing for Indoor RFID
Tracking Data. In: MDM (1), pp. 187–196 (2013)
3. Boysen, M., de Haas, C., Lu, H., Xie, X., Pilvinyte, A.: Constructing Indoor Navigation
Systems from Digital Building Information. In: ICDE, 4p. (2014)
4. buildingSMART. IFC overview, http://www.buildingsmart-tech.org/specifications/ifc-overview (accessed in February 2014)
5. buildingSMARTalliance. About the buildingSMART alliance, http://www.buildingsmartalliance.org/index.php/about/ (accessed in February 2014)
6. CityGML (2012), http://www.citygml.org/ (accessed in February 2014)
7. IndoorGML SWG, http://www.opengeospatial.org/projects/groups/indoorgmlswg (accessed in February 2014)
8. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn.
The MIT Press (2009)
9. Gallaher, M.P., O’Connor, A.C., Dettbarn Jr., J.L., Gilday, L.T.: Cost analysis of inadequate
interoperability in the U.S. capital facilities industry. Technical report, U.S. Department
of Commerce Technology Administration National Institute of Standards and Technology
(2004)
10. Lee, J.: A spatial access-oriented implementation of a 3-D GIS topological data model for
urban entities. GeoInformatica 8(3), 237–264 (2004)
11. Li, D., Lee, D.L.: A lattice-based semantic location model for indoor navigation. In: MDM,
pp. 17–24 (2008)

12. Lu, H., Cao, X., Jensen, C.S.: A foundation for efficient indoor distance-aware query pro-
cessing. In: ICDE, pp. 438–449 (2012)
13. DeLacey, M.: Why BIM will become even more important in 2012 (2012), https://enr.construction.com/technology/bim/2012/0111-why-bim-will-become-even-more-important-in-2012.asp (accessed in February 2014)
14. National Institute of Building Sciences. Building information modeling (2008),
http://www.wbdg.org/bim/
15. Becker, T., Nagel, C., Kolbe, T.H.: A multilayered space-event model for navigation in indoor
spaces. In: Proc. 3rd International Workshop on 3D Geo-Info, pp. 61–77 (2008)
16. Whiting, E., Battat, J., Teller, S.: Topology of Urban Environments. In: CAAD Futures, pp.
114–128 (2007)
17. Wikipedia. Industry Foundation Classes (2013), http://en.wikipedia.org/wiki/Industry_Foundation_Classes (accessed in February 2014)
18. Worboys, M.F.: Modeling indoor space. In: ISA, pp. 1–6 (2011)
19. Xie, X., Lu, H., Pedersen, T.B.: Efficient distance-aware query evaluation on indoor moving
objects. In: ICDE, pp. 434–446 (2013)
20. Xu, J., Güting, R.H.: A generic data model for moving objects. GeoInformatica 17(1), 125–
172 (2013)
21. Yang, L., Worboys, M.F.: A navigation ontology for outdoor-indoor space: (work-in-
progress). In: ISA, pp. 31–34 (2011)
22. Yuan, W., Schneider, M.: iNav: An indoor navigation model supporting length-dependent
optimal routing. In: AGILE, pp. 299–314 (2010)
23. Yuan, W., Schneider, M.: Supporting 3D route planning in indoor space based on the lego
representation. In: ISA, pp. 16–23 (2010)
24. Zhang, J., Papadias, D., Mouratidis, K., Zhu, M.: Spatial queries in the presence of obsta-
cles. In: Bertino, E., Christodoulakis, S., Plexousakis, D., Christophides, V., Koubarakis, M.,
Böhm, K. (eds.) EDBT 2004. LNCS, vol. 2992, pp. 366–384. Springer, Heidelberg (2004)
Using Cameras to Improve Wi-Fi
Based Indoor Positioning

Laura Radaelli1, Yael Moses2, and Christian S. Jensen3

1 Department of Computer Science, Aarhus University, Aarhus, Denmark
radaelli@cs.au.dk
2 The Efi Arazi School of Computer Science, The Interdisciplinary Center, Herzliya, Israel
yael@idc.ac.il
3 Department of Computer Science, Aalborg University, Aalborg, Denmark
csj@cs.aau.dk

Abstract. Indoor positioning systems are increasingly being deployed to enable
indoor navigation and other indoor location-based services. Systems based on
Wi-Fi and video cameras rely on different technologies and techniques and have
so far been developed independently by different research communities; we show
that integrating information provided by a video system into a Wi-Fi based system
increases its maintainability and avoids drops in accuracy over time. Specifically,
we consider a Wi-Fi system that uses fingerprint measurements collected in the
space for positioning. We improve the system’s room-level accuracy by means of
automatic, video-driven collection of fingerprints. Our method is able to relate a
Wi-Fi user to unidentified movements detected by cameras by exploiting the ex-
isting Wi-Fi system, thus generating fingerprints automatically. This use of video
for fingerprint collection reduces the need for manual collection and allows on-
line updating of fingerprints, thus increasing system accuracy. We report on an
empirical study showing that automatic fingerprinting induces only a few false
positives and yields a substantial accuracy improvement.

Keywords: Indoor Positioning, Wi-Fi Fingerprinting, Video Tracking.

1 Introduction
Over the past decade, location-based services (LBS) have gained in prominence. LBS
accounted for a revenue of USD 2.8 billion in 2010 and the expected revenue in 2015
is USD 10.3 billion [20]. However, today’s location-based services target mostly out-
door users. In contrast, studies find that people spend some 87% of their time in-
doors [5, 10, 13], and 70% of cellular calls and 80% of data connections in the USA
originated from indoors in 2013 [14]. Additionally, the indoor LBS market is forecast
to grow by 40% over the period 2012–2016 [21]. Thus, the time is ripe for enabling
indoor location-based services as well, with indoor positioning as a key enabler. Specifically,
indoor positioning systems enable a range of indoor location-based services, includ-
ing simple navigation services as well as more complex shopping assistants and friend
finders, to name but a few.
Key characteristics of indoor positioning systems include their maintainability and
accuracy. The maintainability of a system captures the cost of maintaining the system,


in particular ensuring that its accuracy remains acceptable. Next, accuracy often refers
to the average positioning error in 2D or 3D with respect to the ground truth. Instead,
we adopt the notion of room-level accuracy, where a position is considered correct if
it belongs to the same room as the ground truth position. Thus, the exact position of
a user inside a room is not relevant, but it is crucial to position the user in the correct
room. Room-level accuracy enables many indoor services (e.g., silence-your-phone-
in-a-meeting-room, in-store ads, and locate-a-friend) and data analyses (e.g., finding
hang-outs in airports, popular shops in malls, and frequent sequences of shops visited).
We consider the task of building a system with room-level accuracy that is capable of
maintaining its accuracy over time.
While GPS is ineffective in indoor settings, video cameras and Wi-Fi are promising
technologies for indoor positioning. Cameras can track all people in a monitored space,
without distinguishing among them, with high room-level accuracy, but they cannot easily
match a specific person with a location (without relying on complex identification
techniques). Hence, cameras alone are insufficient for positioning. On the other hand,
Wi-Fi based systems can po-
sition only collaborative people that have a Wi-Fi device turned on, and can provide
location of the specific user through the device, but they achieve highly different levels
of room-level accuracy in different settings (reported accuracy varies from sub-meter
up to 40 meters for pure Wi-Fi based systems).
There are different approaches to Wi-Fi positioning: model-based, fingerprinting,
and trilateration. We focus on Wi-Fi systems that use fingerprinting. A fingerprint con-
sists of a pair of an indoor location and a set of signal strengths of the Wi-Fi access
points seen at that location. Given a set of fingerprints and the signal strengths observed
by a mobile device, the system can position the device. In such a system one of the
main challenges is the collection of fingerprints. When done manually by surveyors,
this is time-consuming and expensive, and it needs to be repeated over time to maintain
system accuracy. The need for a surveyor can be avoided: a user’s Wi-Fi device
can constantly report signal strength measurements, which can generate fingerprints
if associated with the user’s position. A classic solution is to ask users to mark their
locations on a map or select them from a list (e.g., [1, 19]). Such solutions reduce the cost,
but also increase the effort required of users. We propose a method that automates the collection
of fingerprints, thus saving effort for users and increasing the fingerprint update rate
to avoid decreased accuracy over time. At the core of our approach is the use of a cam-
era to determine the room the user is in, resulting in automatic room-level fingerprinting
that is transparent to the user.
Figure 1 illustrates how different information from the two sources can be integrated
for reaching our goal. Two rooms are connected only to a corridor that is monitored
by a camera whose field of view (FOV) is shaded gray. A user walks from the corridor
to room 2 holding a Wi-Fi device; we can see the user’s actual position and reported
position (gray phones) at four times (t1–t4). First, the example shows that the room-
level positioning of a Wi-Fi system can be inaccurate: at time t4, the user is in room 2,
but the reported position is in room 1. Second, the video system can tell that the user
entered room 2 at time t3 and did not leave; therefore, we can generate a fingerprint that
associates room 2 with the Wi-Fi signal strength measurements of the user received
between t3 and t4.

Fig. 1. Running example

To the best of our knowledge, only a few works propose integrating fixed cam-
eras with other positioning technologies, and they all investigate intra-room position-
ing, while we consider room-level positioning. None of these works explores the use of cameras in
fingerprinting. We provide a key building block for an automatic fingerprinting system
by proposing a solution for corridor-like spaces. The proposed method can be applied
to a more complex floor plan using a large set of cameras that cover the space. The ba-
sic solution proposed here is sufficient when the space is decomposed into corridor-like
subspaces and the method is applied to each subspace.
The remainder of the paper is structured as follows: in Sec. 2, we formulate the
problem and state our assumptions; Sec. 3 gives an overview of the assumed Wi-Fi
and video systems, and presents the proposed integration method; Sec. 4 reports on
experimental studies. We cover related work in Sec. 5 and conclude and provide future
research directions in Sec. 6.

2 Problem Formulation

Setting: We aim to automate fingerprint collection by generating fingerprints without
any active intervention by users. We consider an indoor space composed of rooms that
are all accessible only via a central corridor.
A Wi-Fi positioning system is deployed in the space continuously collecting infor-
mation on the signal strength measurements and location of each Wi-Fi user. In addition,
a video system monitors the corridor, capturing the movements of people. It continu-
ously detects events of a person entering or leaving a room and the associated time.
We assume that a person remains in a room from the time of entering until the time of
leaving the room.
Automatic Fingerprinting: A valid fingerprint for a given location can be obtained
from the collected Wi-Fi signal strength measurements of a device, as long as the device
remains in the same location for at least th_fp time units. This threshold varies from
system to system, but it is fixed at deployment time. For automatic fingerprinting, the
location as well as the time interval for which the user remains in this location must be
determined. In our case, only the time interval a person spends within a room, the in-room-
interval, plays a role. The interval is determined by the times of entering and leaving a
room.
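As a minimal sketch of this rule (the data layout and all names are illustrative, not from the paper):

# Sketch: build a room-level fingerprint from one device's measurements,
# provided the in-room-interval is at least th_fp long (illustrative only).
def build_fingerprint(room, measurements, enter_t, leave_t, th_fp):
    """measurements: list of (timestamp, {access_point: rss}) for one device."""
    if leave_t - enter_t < th_fp:
        return None                               # the device did not stay long enough
    in_room = [m for t, m in measurements if enter_t <= t <= leave_t]
    if not in_room:
        return None
    fingerprint = {}
    for ap in {ap for m in in_room for ap in m}:
        values = [m[ap] for m in in_room if ap in m]
        fingerprint[ap] = sum(values) / len(values)   # mean signal strength per access point
    return (room, fingerprint)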

Assumptions: Synchronization is required when a pair of sensors monitor and report
the position of the same object and do not share a physical clock. We assume that given
a synchronization checkpoint when the clocks of the two sensors are aligned, the drift
does not affect the synchronization within a short time frame. Fast drifting could occur
when several frames are dropped in the video, but we consider very low resolution and
low frame rate, hence this phenomenon is unlikely. We assume that for short videos
with a synchronization checkpoint at the start (as in our experimental study), sources
remain sufficiently synchronized.
A diagram of the floor is used to relate rooms with doors, and a door in the image
seen by the camera is related to the appropriate room. The process of detecting doors
in the image of the camera can be done either manually or automatically (by observing
where people disappear from the field of view). Furthermore, a mapping between
manual fingerprints in the Wi-Fi system and rooms is required to compute the room-
level accuracy of a system.
As indicated above, we also assume that the space has a star topology: a central room
(“corridor”) has connections to n rooms that do not have connections to other rooms.
It is unrealistic to assume that all people in an indoor space are equipped with a Wi-
Fi device; rather, we assume that some people may be walking in the space with no
Wi-Fi positioning. We refer to people with Wi-Fi positioning as users. For simplicity
we assume that there are no people in the space at time t0 (i.e., at the beginning of
each video monitoring session), and in our experiments we consider two people (not
necessarily two users) walking in the monitored space at the same time.

3 Methods

We describe a method for integrating two different sources of information on the move-
ments of people, namely Wi-Fi positioning and video tracking, in order to automate
Wi-Fi location fingerprinting.

3.1 Wi-Fi Fingerprinting

For a Wi-Fi positioning system, a fingerprint is a pair of an indoor location and a combi-
nation of signal strengths of Wi-Fi access points visible from the location. A fingerprint
is traditionally collected by a surveyor standing in a specific location with a Wi-Fi de-
vice and recording the signal strengths measured by the device for some time. The
resulting fingerprints are stored in a database and are subsequently used by the system
for positioning.
When a user submits the signal strength measurements “seen” by the user’s device,
the system returns the position in the database that corresponds to the fingerprint whose
signal strength measurements are the most similar to those submitted by the user. A
user’s position can be computed either on the server or on the device, with each ap-
proach having different pros and cons (e.g., cost of computation, privacy). A mapping
between the coordinate system of the location and an image of the floor plan is usu-
ally available so that the user’s position can be shown on the floor plan for navigation
purposes.
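As a minimal sketch of this lookup (the paper does not specify the similarity measure; a Euclidean distance over the access points seen by both the query and the fingerprint is assumed here, and all names are illustrative):

# Sketch: position a device by the most similar stored fingerprint (illustrative only).
import math

def locate(observed, fingerprints):
    """observed: {access_point: rss}; fingerprints: {location: {access_point: rss}}."""
    best_location, best_dist = None, math.inf
    for location, reference in fingerprints.items():
        common = set(observed) & set(reference)
        if not common:
            continue                              # no shared access points
        dist = math.sqrt(sum((observed[ap] - reference[ap]) ** 2 for ap in common))
        if dist < best_dist:
            best_location, best_dist = location, dist
    return best_location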

A room-level fingerprint can be generated from signal strength measurements of a
user if the in-room-interval of the user is known. In the next sections, we describe how
to achieve this by using cameras.

3.2 Video Tracking


The tracking of people by using surveillance cameras is an active research topic in
computer vision. We propose to use video data to assist a Wi-Fi system in collecting
automatic fingerprints. Specifically, the video system can provide the crucial informa-
tion about the in-room-interval of users by identifying the entrance and leaving times
of a user to a given room.
Many sophisticated video tracking algorithms exist that can provide the informa-
tion we need (see review in [23]), and the general method we propose can use any of
these. In our implementation we use a relatively simple tracking algorithm that relies on
background subtraction. Our algorithm assumes a fixed background, which is sufficient
when the scene is static and not crowded as in our setting. We take an image of the
empty monitored space as a fixed background and compute the pixel-wise difference
with a frame to detect changes in the space. We ignore differences that are smaller than
a noise threshold (more sophisticated methods are given in [4]).
After applying the basic background subtraction algorithm, we examine each frame
object-wise, looking for connected components, i.e., groups of pixels that form a blob
and that could represent objects moving in the scene. For each frame we extract the
number of objects in the scene (one in Fig. 2), the line of feet of each object (l1 in
Fig. 2), and the center of the shape of each object (c1 in Fig. 2).
Fig. 2. Example of features extracted from video
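A minimal version of this per-frame processing can be sketched with OpenCV; the library choice, the thresholds, and the way the line of feet and shape center are derived are simplifying assumptions rather than the exact procedure used in the paper:

# Sketch of background subtraction and blob extraction for one frame (illustrative only).
import cv2

def detect_objects(background_gray, frame_gray, noise_threshold=30, min_area=500):
    """Return (feet_y, center) for each blob that differs from the fixed background."""
    diff = cv2.absdiff(background_gray, frame_gray)            # pixel-wise difference
    _, mask = cv2.threshold(diff, noise_threshold, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    objects = []
    for i in range(1, n):                                      # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] < min_area:
            continue                                           # ignore small noise blobs
        feet_y = stats[i, cv2.CC_STAT_TOP] + stats[i, cv2.CC_STAT_HEIGHT]  # line of feet
        objects.append((feet_y, tuple(centroids[i])))          # center of the shape
    return objects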

From these features, we can compute in-room-intervals of users and people enter-
ing/leaving the monitored space. The disappearing and appearing events of persons in
the scene can be identified by comparing the detected objects in successive frames. In
particular, when a tracked object disappears from the field of view, it indicates that a
person either entered a room or left the corridor. When a new object enters the field of
view, it may be either a person leaving a room or a person entering the corridor.
We assume that a room is associated with each door in the image seen by the camera.
We determine which room is involved in the event according to the line of feet and the
shape center with respect to the marked doors. If the person’s location does not corre-
spond to any of the doors, we label the event as a corridor-leave (or corridor-enter) event. For
robustness, we consider detection in w successive frames.

Our video tracking algorithm returns a set of events, each with the timestamp of the
event, its type (enter/leave), and the room involved. Fig. 3 shows the result of applying
the algorithm to a video that records a person walking in the corridor and entering room
337 at time tx+1 , and then another person leaving the same room at time ty+1 ; the al-
gorithm detects a room-enter event enter337,tx+1 and a room-leave event leave337,ty+1 .

Fig. 3. Room-enter and room-leave events

We want to couple a person entering/leaving a room with a Wi-Fi device. We directly
use the Wi-Fi positioning for identifying the user involved in the event. Another
approach would be to use sophisticated image recognition techniques to identify the
user entering/leaving a room. This is left for future research.

3.3 Integration Technique


The video system provides information on people entering and leaving rooms, and the
Wi-Fi system provides signal strength measurements for all users. To generate a finger-
print, we need to match the interval a person spends in a specific room with a specific
Wi-Fi user.
The proposed heuristic method aims to find proper assignments of Wi-Fi users to the
in-room-intervals reported by the video system. Fingerprints are then generated using
these assignments. The method is structured as follows:
1. For each room, find in-room-intervals using the video system.
2. To each in-room-interval, assign a Wi-Fi user if one exists.
3. For each user, compute when the user entered/left the corridor.

We describe each step in the following.



Find in-room-intervals: We first process the sequence of room events detected by the
video system. Four different situations might occur, as shown in Fig. 4.

Fig. 4. In-room-intervals in a sequence of events

We can exclude unopened in-room-intervals: as we assume that at the starting time
t0 there are no people in the space, the video system must observe an enter event
before each leave event. To contend with unclosed in-room-intervals, we set the end
of the interval to the time of computation tc (which coincides with the end of a video
session in our experimental study). We consider an interleaved in-room-interval as a
single in-room-interval that extends from the first enter event until the last leave event.
We process the events sequentially. When encountering an enter event, we look
for the next event. Three cases can occur. First, when the current event is the last in
the sequence, it is an unclosed in-room-interval (UcI), and we set the end time of the
interval to current computation time tc . Second, when the next event is a leave event,
we save the interval as a simple in-room-interval (SI). Third, when the next event is
an enter event, we have encountered an interleaved in-room-interval (II), and we do
not close the interval until it contains the same number of enter and leave events. In
this case, ambiguity occurs. The system can choose to ignore such intervals, but we
propose a heuristic to select a possible solution. The output of this step is a sequence of
in-room-intervals.
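This sequential pass over one room's events can be sketched as follows (SI, UcI, and II are the interval types defined above; the event representation and all names are illustrative):

# Sketch: turn a room's ordered enter/leave events into in-room-intervals (illustrative only).
def find_intervals(events, t_c):
    """events: time-ordered list of ('enter'|'leave', timestamp); t_c: computation time."""
    intervals, i = [], 0
    while i < len(events):
        kind, t = events[i]
        if kind != 'enter':
            i += 1
            continue                                          # unopened intervals are excluded
        if i == len(events) - 1:
            intervals.append(('UcI', t, t_c))                 # unclosed: close at t_c
            break
        if events[i + 1][0] == 'leave':
            intervals.append(('SI', t, events[i + 1][1]))     # simple in-room-interval
            i += 2
            continue
        depth, j = 1, i + 1                                   # interleaved: balance enters/leaves
        while j < len(events) and depth > 0:
            depth += 1 if events[j][0] == 'enter' else -1
            j += 1
        end_t = events[j - 1][1] if depth == 0 else t_c
        intervals.append(('II', t, end_t))
        i = j
    return intervals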
Assign Wi-Fi Users: Having found in-room-intervals for each room, we now assign
each interval to a Wi-Fi user. Some of the challenges in doing so are given by the
overlapping intervals and the presence of people who are not Wi-Fi users.
We exploit the existing Wi-Fi positioning system to select the users that may be
assigned to an in-room-interval. Candidates are all the users with a Wi-Fi position δ-
close to the door of the room at both enter and leave time of a simple in-room-interval. If
no user fulfills the requirements for being a candidate, we conclude that the person that
entered the room is not a Wi-Fi user. If there is more than one candidate for the same
interval, we simply select the first detected user, but more complex strategies could be
adopted, e.g., considering the distance of candidates to the door.
The algorithm used to assign in-room-intervals to users is detailed in Algorithm 1.
The algorithm goes through all in-room-intervals ordered by enter time and deals with
simple and unclosed in-room-intervals (lines 8–12) and interleaved in-room-intervals in
a different manner (lines 13–21). First, we select the candidate users U (lines 6, 18, 19).
U consists of the users that are δ-close to the door of the room during a period of time
of 2τ around the enter and leave times of the in-room-interval (function getUsers(·)).

Algorithm 1. assignIntervals()
1: Input:Intervals, Users, τ, δ
2: PrevIntervals ← ∅;
3: UsedEvents ← ∅;
4: for i ∈ Intervals do
5: if i .type ∈ {SI, UcI} then
6: U ← getUsers(i , τ, δ) \ overlap(i , PrevIntervals);
7: if |U | > 0 then
8: user ← pickUser(U );
9: Assignments ← Assignments ∪ {(user , i )};
10: if i .type ∈ {II} then
11: iteration ← 1 ;
12: while (|PrevCombo| ∗ 2 < |i .events|) do
13: Combinations ← computeComb(i );
14: for l ∈ 1 , . . . , |Users| do
15: for c ∈ Combinations(l ) do
16: skipCombos(iteration − 1 );
17: if c.enter ∈ UsedEvents ∧ c.leave ∈ UsedEvents then
18: U ← c.users \ overlap(c.interval , PrevIntervals);
19: U ← U \ overlap(c.interval , PrevCombos);
20: if |U | > 0 then
21: c.user ← pickUser(U );
22: PrevCombos ← PrevCombos ∪ {c};
23: UsedEvents ← UsedEvents ∪ {c.enter , c.leave};
24: iteration ← iteration + 1 ;
25: Assignments ← Assignments ∪ getSimpleIntervals(PrevCombos);
26: PrevIntervals ← PrevIntervals ∪ {i };
27: return Assignments ;

U excludes users already assigned to other rooms for in-room-intervals overlapping
in time with the current one (function overlap(·)). Second, if U contains more than one
candidate, we use the function pickUser(·) to select which candidate to assign to the
in-room-interval. In our case, the function selects the first user in U . For interleaved
in-room-intervals, we have to pair enter and leave events: Combinations contains all
the possible pairs in the interval, each coupled with the users retrieved by getUsers(·).
We iterate through all the combinations (lines 14–23), starting with the ones that have
only one assigned user (l = 1). If no suitable combination is found during the first
iteration, the next iteration skips the first combination (using function skipCombos(·))
to avoid examining the combinations in the same order as in the previous iteration.
More sophisticated heuristics can be designed to address an increasing number of events
and ambiguities.
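The candidate selection performed by getUsers(·) can be sketched as follows; δ-closeness is checked against the Wi-Fi positions reported within τ of the enter and leave times, and all data layouts and names are illustrative:

# Sketch of getUsers(.): users whose Wi-Fi position is delta-close to the room's door
# around both the enter and the leave time of an in-room-interval (illustrative only).
def get_users(enter_t, leave_t, users, positions, door, tau, delta):
    """positions: {user: [(timestamp, (x, y)), ...]}; door: (x, y)."""
    def close_to_door(user, t):
        window = [p for ts, p in positions.get(user, []) if abs(ts - t) <= tau]
        return any(((x - door[0]) ** 2 + (y - door[1]) ** 2) ** 0.5 <= delta
                   for x, y in window)
    return [u for u in users
            if close_to_door(u, enter_t) and close_to_door(u, leave_t)]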
Assign Corridor Events: Having determined in-room-intervals for each user, we still
need to decide when each one of them entered and left the corridor.
As discussed in Sec. 3.2, we assume that from the video source, we can detect events
such as “entering the corridor” and “leaving the corridor.” In some cases, different op-
tions are available for pairing corridor and room events, some of which are shown in Fig. 5.

Fig. 5. Some of the possible pairings between room and corridor events

We do not have enough information to identify which pairing option is correct.
Therefore, we apply the same pairing policy for all events: we assign a corridor event
to the closest room event of the same type. This means that we pair a room-enter event
with the closest available corridor-enter event that occurred before it, and we pair a
room-leave event with the closest corridor-leave event that occurred after it. In terms of
the options in Fig. 5, we choose the black straight arrows and discard the dotted ones.
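This pairing policy can be sketched directly; events are (timestamp, type) pairs and each corridor event is used at most once (names are illustrative):

# Sketch: pair each room event with the nearest corridor event of the same type,
# looking backwards for enter events and forwards for leave events (illustrative only).
def pair_corridor_events(room_events, corridor_events):
    """room_events, corridor_events: lists of (timestamp, 'enter'|'leave')."""
    remaining = list(corridor_events)
    pairs = []
    for t, kind in sorted(room_events):
        if kind == 'enter':
            cands = [e for e in remaining if e[1] == 'enter' and e[0] <= t]
            match = max(cands, default=None)          # closest corridor-enter before
        else:
            cands = [e for e in remaining if e[1] == 'leave' and e[0] >= t]
            match = min(cands, default=None)          # closest corridor-leave after
        if match is not None:
            remaining.remove(match)                   # each corridor event is used once
        pairs.append(((t, kind), match))
    return pairs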

4 Experimental Study
We evaluate the proposed method in a controlled environment, namely a corridor of
offices in the Department of Computer Science at Aarhus University. We examine the
quality of the automatic fingerprints (Sec. 4.2), study the integration method (Sec. 4.3),
and consider the positioning accuracy achieved by automatic fingerprinting (Sec. 4.4).

4.1 Settings
The floor plan, in Fig. 6, encompasses a corridor with 14 offices. A Dell Integrated
Webcam mounted on a laptop, positioned as shown in the figure, is used for the monitoring.
We employ a Wi-Fi positioning system [12] with two different sets of fingerprints in
two different studies, and we use Samsung Galaxy S3 and Nexus 3 phones connected
through a dedicated Android application.

4.2 Study 1: Fingerprint Comparison


We first compare manual and automatic fingerprints. We perform 10 surveys, during
each of which we collect one manual fingerprint in the middle of room 337 (using a
dedicated mobile application) and then collect an automatic fingerprint for the same
room. The fingerprint location is marked as a cross in Fig. 6(a), where dots are pre-
existing fingerprints. We compare manual and automatic fingerprints in two different
ways. First, we compute the similarity between the manual and automatic fingerprint
and between the automatic and all the other pre-existing fingerprints. Second, we com-
pare the manual and automatic fingerprints based on two of the main features of a
fingerprint.
We compute the similarity between two fingerprints by using the same algorithm
used for positioning a user. The result is the same for all 10 surveys: the similarity be-
tween manual and automatic fingerprint is 1, while the similarity between the automatic
fingerprint and all the other pre-existing fingerprints is near-zero.

Fig. 6. Floor plan of our hallway (rooms 333–346 along a central corridor): (a) fingerprints in Study 1; (b) fingerprints in Study 2

Next, we compare the manual and automatic fingerprints based on two features: the
mean feature μ, a vector consisting of mean values of signal strengths for each visible
access point; and the standard-deviation feature σ, defined as a vector of corresponding
standard deviations (i.e., the ith entry is the standard deviation from the mean value for
the ith visible access point). We compute μ∗ = (1/10) Σ_{j=1}^{10} μ_j (where μ_j is the
mean vector from the jth survey) as the average over the 10 surveys; similarly, we compute
σ∗ = (1/10) Σ_{j=1}^{10} σ_j as the average of the standard-deviation features.
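These features can be computed directly from the per-survey measurements; the sketch below assumes each survey is summarized as per-access-point lists of signal strength samples (illustrative names only):

# Sketch: per-access-point mean and standard-deviation features, averaged over surveys
# (illustrative only).
import statistics

def survey_features(samples):
    """samples: {access_point: [rss values]} for one survey; returns (mu, sigma)."""
    mu = {ap: statistics.mean(v) for ap, v in samples.items()}
    sigma = {ap: statistics.pstdev(v) for ap, v in samples.items()}
    return mu, sigma

def average_features(surveys):
    """surveys: list of (mu, sigma) pairs; returns (mu_star, sigma_star)."""
    aps = {ap for mu, _ in surveys for ap in mu}
    mu_star = {ap: statistics.mean(mu[ap] for mu, _ in surveys if ap in mu) for ap in aps}
    sigma_star = {ap: statistics.mean(sg[ap] for _, sg in surveys if ap in sg) for ap in aps}
    return mu_star, sigma_star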
Fig. 7(a) shows μ∗ for manual (solid line) and automatic (dashed line) fingerprints.
For each bar, the point in the middle represents the average value for the access point,
and the height describes the standard deviation. For all access points, the average values
of manual and automatic fingerprints are very close. For access points 2 and 19, the
standard deviation of the automatic fingerprints is larger than for the manual ones, but
in all the remaining cases, also the standard deviations are comparable.
Fig. 7(b) shows σ ∗ . The value of the average standard-deviation feature for automatic
fingerprints is approximately double that of manual fingerprints. The impact of this
difference depends on how relevant the standard deviation is in the similarity measure
used for positioning a user and on the ratio of manual/automatic fingerprints in the
system. In fact, if most of the fingerprints are actually automatic then this does not affect
the system at all. A reason for the difference in standard deviation might be the fact that
during manual collection, the phone is stationary, while during automatic collection,

Fig. 7. Manual and automatic fingerprint comparison: (a) avg. mean feature μ∗; (b) avg. standard-deviation feature σ∗; (c) avg. mean feature μ∗, 2 phones; (d) avg. standard-deviation feature σ∗, 2 phones

the phone might be moving in the room. This may actually be an advantage, since users
being positioned are more likely to be moving around than to be stationary.
We also perform 10 surveys of the same type, but with two different phones, so that
different phones are used for manual and automatic fingerprinting. The similarity results
are as before, i.e., the similarity between manual and automatic fingerprints is 1, and the
similarity of automatic fingerprint with the other points in the radio map is near-zero.
A comparison of the average mean feature is shown in Fig. 7(c). Fingerprints col-
lected with the two phones are similar, but the automatic ones generally have a lower
mean feature and a lower standard deviation. Results for the standard-deviation feature,
in Fig. 7(d), show higher average values for one phone, but similar standard deviations.
We conclude that manual and automatic fingerprints are comparable, implying that
a system that contains both types of fingerprints, or possibly only automatic ones, is
feasible. When using different phones for collection, the results are less clear. Finger-
prints collected with different phones at the same location are more similar than the
fingerprints collected at different locations with different phones. We also find that fin-
gerprints taken with different phones show slightly different feature values, which might
reduce accuracy depending on the specifics of the Wi-Fi positioning system.

4.3 Study 2: Automatic Fingerprinting Evaluation


Evaluation Metrics. To evaluate the integration method, we look at true positives,
false positives, and false negatives. The meanings of these when matching in-room-
intervals to users are:
– true positives: correct assignments, suggesting how much manual fingerprinting
work is avoided;
– false positives: assignments of an in-room-interval to an incorrect user; these assign
a fingerprint to a wrong location and are very undesirable;
– false negatives: number of in-room-intervals not assigned to any user, while a user
was actually in the room; these indicate how much room there is for improvement.
Experiments. We record different videos of people moving in the monitored space,
and we compare the results of the algorithm with the ground truth.
We perform 9 experiments. Each consists of one or two people walking in the mon-
itored space according to a script, while the camera is recording a video and the Wi-Fi
system is recording positioning data of the Wi-Fi devices they are carrying. The Wi-
Fi system is initialized using fingerprints shown as dots in Fig. 6(b), where automatic
fingerprint locations are shown as crosses. The experiments are sequenced according to
the room involved and the complexity of the scene and script. We start with a single per-
son walking and entering/leaving a room; we continue with two people under different
conditions: walking separately or together and with or without a Wi-Fi device.
Event Detection. For each experiment, we check whether the video tracking recog-
nizes when a person enters/leaves a room and recognizes the room involved. All but one
of the experiments are processed with the same parameters for background subtraction;
in experiment #7 we use different values due to unusual lighting and background
interference (a plant at the end of the corridor happened to be of the same color as the
user’s clothing).
Fig. 8(a) shows the number of events detected by the algorithm with respect to the
real events.
We have no false negatives, i.e., we do not miss any events. We have only a few false
positives, i.e., cases where we detect an event that did not occur. Most of
these are due to the configuration of the scene (people leaving the FOV are classified as
entering the last room in the corridor) or to the simple background
subtraction algorithm we use (people wearing clothing of the same color as part
of the background). We can conclude that we achieve good results for our setting and
our experiments. However, in a more general setting, a more sophisticated background
subtraction algorithm must be considered. Only experiment #9 was affected by false
positives. Here, the person leaving the corridor is classified as entering room 340, cre-
ating an incorrect fingerprint. All other errors are “dismissed,” either because there is
no corresponding enter/leave event or because no Wi-Fi traces correspond to the event.
User Assignment. To generate an automatic fingerprint, we need to assign a Wi-Fi user
to an in-room-interval; assignment results from our algorithm are shown in Fig. 8(b).
We observe two false positives. The false positive in experiment #3 corresponds to an
empty in-room-interval (duration 0) derived from a faulty event detection, and the false
positive in experiment #9, as already mentioned, gives a real error, since a fingerprint
is generated for a wrong assignment of an in-room-interval to a user. We have only one

(a) Room enter and leave event detection (rows: actual value; columns: algorithm outcome)

                  Event              Not event
                  enter    leave     enter    leave
  total             25       12        0        0
  Event             13        9        0        0
  Not event         12        3        -        -

(b) User x to room i match (rows: actual value; columns: algorithm outcome)

                  x in i    x not in i
  total             14          3
  x in i            12          1
  x not in i         2          2

(c) User x to corridor event match (rows: actual value; columns: algorithm outcome)

                  User x             Not User x
                  enter    leave     enter    leave
  total             16       11        0        1
  User x            16       11        0        1
  Not User x         0        0        -        -

Fig. 8. Confusion matrices

false negative: in experiment #7, an in-room-interval is not assigned to user1, but to an
individual without a Wi-Fi device. This does not lead to a wrong fingerprint.
We can conclude that the assignment of in-room-intervals to users in our method
performs well; all types of in-room-intervals (simple, unclosed, and interleaved) are
detected correctly, and users are correctly assigned in all cases but two (one with no
consequences, and one yielding a wrong fingerprint).
Corridor Event Assignment. We want to check whether the algorithm can correctly
recognize when a user enters/leaves the corridor. Corridor events are not used in our
experiments to generate fingerprints due to the configuration of the corridor; it is very
long, and we cannot assign all the measurements collected along the whole corridor to
a single point in its middle. In a different setting, these events could be used to
generate automatic fingerprints.
The results in Fig. 8(c) show that we have only one error in the corridor assignments
(experiment #9), which is due to a faulty event detection that leads to a faulty user
assignment. We conclude that assignment of corridor events in our method works well
in the experiments.

4.4 Study 2: Room-Level Accuracy Evaluation


Evaluation Metrics. Our method targets room-level positioning. Knowing the floor
plan of the building, we can map each Wi-Fi location to a room. Hence, we can define
the Wi-Fi room location ru,t as the room where user u is at time t according to the Wi-Fi
system. We use R to denote a set of Wi-Fi room locations.
An error occurs when a position differs from the ground truth. For a user u at time t,
let r^true_{u,t} denote the ground truth room location. We consider a function roomMatch(·) that
counts the number of matches with the ground truth in a set of Wi-Fi room locations R:

roomMatch(R) = |R_match|, where R_match = {r_{u,t} ∈ R | r_{u,t} = r^true_{u,t}}.   (1)
We also count matches in an alternative way based on the overall state and not on
a single user. Then a match occurs at a specific time only if all users are correctly
positioned at that time:

timeMatch(R) = |T_match|, where T_match = {t | ∀u (r_{u,t} = r^true_{u,t})}.   (2)

When R contains locations of a single user only, roomMatch(R) = timeMatch(R).
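Both metrics can be expressed compactly; the sketch below assumes Wi-Fi room locations and ground truth are given as dictionaries keyed by (user, time) pairs (illustrative names only):

# Sketch of the two accuracy metrics from Equations (1) and (2) (illustrative only).
def room_match(R, ground_truth):
    """R, ground_truth: {(user, t): room}; counts correctly positioned entries."""
    return sum(1 for key, room in R.items() if room == ground_truth.get(key))

def time_match(R, ground_truth):
    """Counts timestamps at which all users are positioned in the correct room."""
    times = {t for (_, t) in R}
    return sum(1 for t in times
               if all(room == ground_truth.get((u, tt))
                      for (u, tt), room in R.items() if tt == t))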

Experiments. We want to check whether uploading a new automatic fingerprint im-
proves the room-level accuracy. Therefore, before each experiment described in the
previous section, we upload the automatic fingerprints generated in the previous exper-
iment, so that we can compare the accuracy before and after experiments.
For each experiment, we measure the room-level accuracy of the Wi-Fi system for
the period of time during which at least one user is inside the monitored space. Results
are shown in Fig. 9, where the number of room matches for each user is computed using
the roomMatch(·) function (Equation 1) and the number of matches for both users is
computed using the timeMatch(·) function (Equation 2). We expect the accuracy of

Fig. 9. Room-level accuracy evaluation (percentage of matched vs. not matched positions for user1, user2, and all, in experiments #1–#9)

the system to increase from one experiment to the next due to the new and more
up-to-date fingerprints uploaded. From the results we see that this is true to some extent.
Notice that for each pair of experiments involving the same room (e.g., experiments
#1 and #2), the accuracy is about three times higher in the experiment performed after
uploading an automatic fingerprint. Therefore, uploading a new fingerprint with current
measurements improves the positioning accuracy for that location.
On the other hand, we observe a drop every two experiments (every time a new room
is visited); one possible reason for this behavior is that measurements are more similar
to new fingerprints collected in other locations than to old fingerprints collected at the
ground truth location. This calls for a method to “retire” fingerprints when they are too
old, or at least include the “age” of a fingerprint in the computation of a user’s position.
In experiments #8 and #9, users are asked to walk in the corridor without entering
any room. The resulting low accuracy can be due to the fact that no new fingerprints are
uploaded for the corridor during any of the experiments.
We perform a last experiment (experiment #10), where the path taken by the user
is the same as in experiment #1; the user walks in the corridor, enters room 337, stays
there for 1 minute, and then leaves the room and the corridor. A comparison of the
room-level positioning accuracy in the two experiments is shown in Fig. 10(a).
The accuracy in the last experiment is three times higher than the accuracy in the
first one, and three automatic fingerprints have been uploaded for room 337 between
the two experiments.

Fig. 10. Experiments #1 and #10: (a) room-level accuracy comparison; (b) trace comparison

Fig. 10(b) shows the traces in the two experiments compared with the ground truth.
The solid line represents the ground truth, the dashed line experiment #1, and the dotted
line experiment #10. We see that not uploading new fingerprints for the corridor reduces
the accuracy of positioning in it, since the position of the user is assigned only to rooms
in experiment #10. Moreover, the wrong fingerprint generated in experiment #9, which
assigns a user to room 340 when the user is actually in the corridor, affects the position-
ing in experiment #10 (when the user is in the corridor, the position is reported as room
340). We also find that the positioning accuracy in rooms for which new fingerprints are
uploaded increases, and the user is correctly positioned in room 337 most of the time.

5 Related Work

Different technologies can provide positioning in indoor spaces [16]. We utilize two
different technologies: Wi-Fi and video cameras.
Wi-Fi based positioning has been studied widely, and a wide range of systems have
been proposed. Liu et al. [15] present a survey of approaches to wireless positioning.
There are three main approaches: model-based, fingerprinting, and trilateration.
In model-based and trilateration positioning, the locations of access points are

known and positioning is done using propagation models or lateration methods [2]. Our
work is focused on the systems that use Wi-Fi fingerprinting. Kjærgaard [12] presents
a taxonomy of fingerprinting techniques. A known problem of fingerprinting-based
systems is that they call for the collection of fingerprints, which is costly in terms of
time and resources. Thus, a number of recent studies investigate the use of robots [11],
techniques that enable users to do fingerprinting [19], and techniques that enable user
feedback [7] for reducing the cost of fingerprint collection. We tackle the problem of fin-
gerprint collection by exploiting cameras to allow automatic fingerprinting, thus elimi-
nating or reducing the need for active user involvement.
In the realm of video-based systems, different techniques have been investigated for
the tracking of objects in a space monitored by cameras. Some techniques use a single
camera, and some use multiple cameras with or without overlapping fields of view;
other techniques aim to recognize the shapes of objects; and yet some techniques focus
on positioning. Yilmaz et al. [23] contribute a comprehensive survey. We employ a
simple background subtraction technique that is sufficient in our experimental setting.
More advanced techniques are likely to be needed in more general settings, especially
when execution time is an issue [3, 6].
We propose a method of improving a Wi-Fi based positioning system by using cam-
eras. Only a few works consider the integration of cameras with Wi-Fi based systems,
most of which consider phone cameras and not fixed cameras such as surveillance cam-
eras [8, 17]. These other approaches rely on a phone camera for the main positioning
and use Wi-Fi positioning to prune the space and improve the efficiency of video match-
ing algorithms. Our approach is the opposite, as we use Wi-Fi as the main positioning
technology.
Van den Berghe et al. [22] consider fixed cameras in a proposal for fusing cam-
era and Wi-Fi sensor data in order to provide intra-room positioning. They consider a
fingerprinting-based Wi-Fi positioning system and a camera installed on the ceiling of
a room. Their proposal detects moving objects in video, and it projects a found object
onto the floor plan using a Gaussian model. A particle filter “combines” the results of
camera detection with the Wi-Fi positioning in real time. The accuracy of the resulting
system is reported in terms of 2D error inside the room. When only one person is
in the room, the accuracy of the underlying Wi-Fi system alone is improved by 0.5–2
meters, but when many people are walking in the room, the improvement is insignifi-
cant (less than 0.5 meters). Results are not given with respect to room-level accuracy.
The use of external cameras has also been explored in connection with RFID lo-
calization systems. Works in this direction vary with respect to RFID localization al-
gorithms and video processing algorithms [9, 18]. Most of the proposed systems are
evaluated in controlled environments; we use the same general approach to evaluate
our system.

6 Conclusions and Future Work


We propose a method that enables automatic collection of fingerprints for a Wi-Fi based
indoor positioning system in corridor-like spaces, thus providing an important building
block towards a full-fledged organic positioning system. The method integrates two dif-
ferent technologies, namely Wi-Fi positioning and video cameras. We target room-level

accuracy and study this level of accuracy in a real positioning system when introducing
automatic fingerprinting.
Our experimental study indicates that automatic fingerprints are comparable with
fingerprints that are collected manually. In an evaluation of the effect of uploading new
automatic fingerprints to the positioning system, we find that the accuracy increases for
the rooms that have been newly fingerprinted, but also decreases for rooms with old
fingerprints. Hence, we conclude that a system that allows new fingerprints to be up-
loaded also needs a method for discarding obsolete fingerprints. In experiments where
we upload new fingerprints for rooms, but not for the corridor, the overall trace might
not show that a user actually passed through the corridor before entering a room.
A natural next step is to utilize the proposed method in a solution that covers an
entire floor or building. Our method can be applied independently to different monitored
subspaces, and by exploiting topological connections between the subspaces, it may be
possible to further improve the in-room-interval detection capability. A key challenge
is to extend the automatic fingerprinting to non-corridor-like spaces (e.g., rooms with
multiple doors), where one might need to recognize a person entering and leaving a
room through different doors. A possible solution is to rely on vision re-identification
techniques.
An interesting direction for future work is to study an online version of the method
that would make it possible to turn off the phone's Wi-Fi or to reduce the scanning
frequency in order to save battery. In fact, in addition to positioning accuracy, battery
consumption is an important concern. Under the assumption that the user needs
room-level positioning, we may use cameras that detect when a user enters and leaves
a room to turn the Wi-Fi device off and on.

Acknowledgments. The authors thank Thor S. Prentow for setting up and running the
Wi-Fi positioning system. This research was supported in part by the Geocrowd Initial
Training Network funded by the European Commission as an FP7 People Marie Curie
Action under grant agreement number 264994. Laura Radaelli did part of this work
while visiting IDC Herzliya.

References
1. Barry, A., Fisher, B., Chang, M.L.: A long-duration study of user-trained 802.11 localization.
In: Fuller, R., Koutsoukos, X.D. (eds.) MELT 2009. LNCS, vol. 5801, pp. 197–212. Springer,
Heidelberg (2009)
2. Bell, S., Jung, W.R., Krishnakumar, V.: WiFi-based enhanced positioning systems: accuracy
through mapping, calibration, and classification. In: Proc. ISA, pp. 3–9 (2010)
3. Breitenstein, M., Reichlin, F., Leibe, B., Koller-Meier, E., Van Gool, L.: Online multiperson
tracking-by-detection from a single, uncalibrated camera. IEEE TPAMI 33(9), 1820–1833
(2011)
4. Cristani, M., Farenzena, M., Bloisi, D., Murino, V.: Background Subtraction for Automated
Multisensor Surveillance: A Comprehensive Review. EURASIP J. Adv. Signal Process. Ar-
ticle 43, 24 pages (2010)
5. Dörre, W.H.: Time-activity-patterns of some selected small groups as a basis for exposure
estimation: A methodological study. J. Expo. Anal. Environ. Epidemiol. 7, 471–491 (1997)
6. Elgammal, A.: Background subtraction: Theory and practice. In: Wide Area Surveillance.
Augmented Vision and Reality, vol. 6, pp. 1–21 (2014)
7. Gallagher, T., Li, B., Dempster, A., Rizos, C.: Database updating through user feedback in
fingerprint-based Wi-Fi location systems. In: Proc. UPINLBS, pp. 1–8 (2010)
8. Hile, H., Borriello, G.: Positioning and orientation in indoor environments using camera
phones. IEEE Computer Graphics and Applications 28(4), 32–39 (2008)
9. Isasi, A., Rodriguez, S., Armentia, J.L.D., Villodas, A.: Location, tracking and identification
with RFID and vision data fusion. In: Proc. RFID Sys. Tech., pp. 1–6 (2010)
10. Jensen, C.S., Li, K.-J., Winter, S.: ISA 2010 workshop report: the other 87%: A report on the
second international workshop on indoor spatial awareness (San Jose, California - November
2, 2010). In: SIGSPATIAL Special 3(1), 10–12 (2011)
11. Kim, K.-H., Min, A.W., Shin, K.G.: Sybot: an adaptive and mobile spectrum survey system
for WiFi networks. In: Proc. MOBICOM, pp. 293–304 (2010)
12. Kjærgaard, M.B.: Indoor location fingerprinting with heterogeneous clients. Pervasive and
Mobile Computing 7(1), 31–43 (2011)
13. Klepeis, N.E., Nelson, W.C., Ott, W.R., Robinson, J.P., Tsang, A.M., Switzer, P., Behar,
J.V., Hern, S.C., Engelmann, W.H.: The national human activity pattern survey (NHAPS):
A resource for assessing exposure to environmental pollutants. J. Expo. Anal. Environ. Epi-
demiol. 11(3), 231–252 (2001)
14. Lacroix, T.E.: Indoor LBS market report (September 2013)
15. Liu, H., Darabi, H., Banerjee, P., Liu, J.: Survey of wireless indoor positioning techniques
and systems. IEEE Trans. Syst., Man, Cybern., C 37(6), 1067–1080 (2007)
16. Mautz, R.: Indoor Positioning Technologies. Geodätisch-geophysikalische Arbeiten in der
Schweiz (2012)
17. Morimitsu, H., Pimentel, R., Hashimoto, M., Cesar, R., Hirata, R.: Wi-Fi and keygraphs for
localization with cell phones. In: Proc. ICCV Workshops, pp. 92–99 (2011)
18. Nick, T., Cordes, S., Gotze, J., John, W.: Camera-assisted localization of passive RFID labels.
In: Proc. IPIN, pp. 1–8 (2012)
19. Park, J.-G., Charrow, B., Curtis, D., Battat, J., Minkov, E., Hicks, J., Teller, S., Ledlie, J.:
Growing an organic indoor location system. In: Proc. MobiSys, pp. 271–284 (2010)
20. Sythoff, J.T., Morrison, J.: Location-based services market forecast, 2011–2015. Pyramid
Research (May 2011)
21. Technavio. Global indoor LBS market 2012–2016 (August 2013)
22. Van den Berghe, S., Weyn, M., Spruyt, V., Ledda, A.: Fusing camera and Wi-Fi sensors for
opportunistic localization. In: Proc. UBICOMM, pp. 169–174 (2011)
23. Yilmaz, A., Javed, O., Shah, M.: Object tracking: a survey. ACM Comput. Surv. 38(4) (2006)
Integrating IndoorGML and CityGML
for Indoor Space

Joon-Seok Kim, Sung-Jae Yoo, and Ki-Joune Li

Department of Computer Science and Engineering


Pusan National University, Pusan 609-735, South Korea
{joonseok,sjyoo,lik}@pnu.edu

Abstract. Recent progress on indoor positioning and mobile devices allows indoor
spatial information services, such as indoor LBS or indoor disaster management, to be
provided. In order to realize these services, indoor maps are a crucial and expensive
component of the system. For this reason, interoperability among services and the
sharing of indoor maps and spatial information are fundamental requirements of an
indoor spatial information system. Several geospatial standards have been and are
being developed to meet these requirements, among which CityGML LoD 4 (Level
of Detail 4) and IndoorGML are the most relevant ones for indoor spatial information.
However, the objectives and scope of these standards differ, although their integration
may yield a synergy effect. In this paper, we discuss the issues of integrating IndoorGML
and CityGML LoD 4 and propose two methods: automatic derivation of IndoorGML
data from a CityGML LoD 4 dataset, and external references from an IndoorGML
instance to an object in CityGML data. The derivation and the referencing of external
objects are based on the mapping relationships between feature types in CityGML and
IndoorGML that are investigated in this paper. A simple prototype that has been
developed to validate our methods is also presented.

Keywords: IndoorGML, CityGML, indoor spatial information, mapping relationships
between IndoorGML and CityGML, derivation of IndoorGML from CityGML, external
reference of IndoorGML to CityGML.

1 Introduction
With the progress of indoor positioning technologies and mobile devices such
as smart phones, a number of indoor map and navigation services have been
provided within large and complex buildings such as shopping malls [4]. For
these services, the demand for indoor spatial information has been increasing,
and geospatial standards have accordingly become important for sharing data and
enhancing interoperability. There are three geospatial standards that may cover
indoor space: CityGML [6][10] and KML 2.0 of the OGC (Open Geospatial
Consortium), and IFC (Industry Foundation Classes) [1] of buildingSMART, which
provide standard data models and XML schemas for visualization, geometric
representation, and semantic properties of building components. In particular, the
level of detail 4 (LoD 4) of CityGML is intended to describe the interior space of
buildings. However, these standards lack features related to an indoor space model,
a navigation network, and indoor semantics, which are critical requirements of most
applications of indoor spatial information.
In order to meet these requirements, a working group for an OGC candidate standard,
called IndoorGML [8], was launched in 2012. The basic goals of IndoorGML are to
provide a standard framework of semantic, topological, and geometric models for
indoor spatial information. IndoorGML is, however, a complement to the existing
standards rather than an independent one. The integration of IndoorGML with these
existing standards therefore arises as an important issue for two reasons. First, a part
of IndoorGML data can be derived from data in existing standard specifications such
as CityGML LoD 4 or IFC. Second, an IndoorGML dataset may contain external
references to indoor spatial objects defined in other datasets, such as CityGML datasets.
Among the existing standards dealing with indoor space, we particularly focus on
CityGML in this paper, since there are common feature types in both IndoorGML and
CityGML LoD 4 and it is necessary to handle them in an integrated way.
In this paper, we discuss several issues in integrating IndoorGML data and CityGML
data. First, we study how to derive IndoorGML data from CityGML LoD 4. Second,
we investigate the correspondence between the feature types in CityGML LoD 4 and
IndoorGML, and propose solutions to integrate IndoorGML and CityGML LoD 4
datasets via external references. The rest of this paper is organized as follows. In
Section 2, we explain the basic concepts of IndoorGML and CityGML, and in Section 3
we investigate the correspondence between IndoorGML and CityGML LoD 4. In
Section 4, we propose a method to derive IndoorGML data from CityGML and to
create a proper network model for indoor navigation, which is a key part of IndoorGML
data creation. In Section 5, we propose a method to link an IndoorGML feature to a
feature in CityGML via an external reference. We conclude the paper in Section 6.

2 Related Work and Motivation


2.1 IndoorGML
IndoorGML is a standard data model for representing, storing, and exchanging indoor
spatial information, together with an XML application schema based on GML 3.2.1 [11].
Note that we refer to version 0.8.1 of IndoorGML in this paper. While CityGML and
IFC focus on feature types of building components such as roofs, ceilings, floors, and
walls, the main focus of IndoorGML is the representation of indoor spaces, called cells,
which are the basic space units in the IndoorGML data model. IndoorGML therefore
provides a standard framework for representing the geometry, network, and semantics
of cells in indoor space. We briefly explain how these aspects are represented in
IndoorGML.
186 J.-S. Kim, S.-J. Yoo, and K.-J. Li

– Geometry of cell: there are three options for representing the geometry of a cell,
as shown in Figure 1. The first option is to reference an object defined in another
dataset, such as CityGML, which contains its geometric property. The second option
is to include the geometric property of the cell within the IndoorGML data, either as
a solid in 3D or as a surface in 2D. The third option is not to include any geometric
property of the cell.

Fig. 1. Geometry of Cell of IndoorGML

– Network of cell: IndoorGML is composed of a core module and extension modules;
the basic space model of the core module is the structured space model, depicted in
Figure 2. The upper part of Figure 2 shows the primal space with 2D or 3D geometry
and induced topology, while the lower part illustrates the dual space, which represents
the network structure called the NRG (Node-Relationship Graph) [7]. The transformation
from the primal space to the dual space is explained by Poincaré duality [7], where
a 3D volumetric object and a 2D boundary surface between two 3D volumetric objects
are transformed to a node and a link, respectively. Note that nodes in the geometric
NRG carry (x, y, z) position data, whereas no position data is included in the logical
NRG in Figure 2. The data model of the NRG is depicted in Figure 3. Nodes and
edges are represented as instances of indoorCore::State and indoorCore::Transition,
respectively, which together form an indoorCore::SpaceLayer. While indoorCore::State
and indoorCore::Transition in Figure 3 form the network of the dual space,
indoorCore::CellSpace and indoorCore::CellBoundary represent objects in the primal
space; they may have inline geometric properties or references to external objects in
other datasets.
– Semantics of cell: there are different semantic interpretations of an indoor space,
and each interpretation gives a different semantic model. The core module of
IndoorGML is neutral with respect to any semantic interpretation, and extensions may
be defined on the core module to provide a semantic context for the model.

Fig. 2. Structured space model

Fig. 3. Core module of IndoorGML



In the current version of IndoorGML, an extension for the context of indoor navigation
is defined, as shown in Figure 4. In this data model, several feature types of indoor
cells and boundaries are defined in terms of indoor navigation. For example, cells for
movement, such as corridors, stairs, or elevator shafts, are represented as
indoorNavi::TransitionSpace in the indoor navigation module, while cells for staying,
such as rooms, are represented as indoorNavi::GeneralSpace.

Fig. 4. Navigation module of IndoorGML

An instance of indoorCore::MultiSpaceLayer consists of multiple instances of
indoorCore::SpaceLayer, each of which defines a different decomposition of the indoor
space, as shown in Figure 3. For example, while a big hall may be considered a single
cell in one space layer, it can be partitioned into multiple small cells in another space
layer. The inter-layer relationship between cells of different layers is defined as
indoorCore::InterLayerConnection in IndoorGML.

2.2 CityGML
CityGML is an OGC standard that defines an XML application schema to represent,
store, and exchange 3D virtual city models. The most recent version of CityGML is
version 2.0, which is based on GML 3.2.1. It includes not only the core module and
the appearance module but also several thematic modules such as digital terrain,
tunnels, bridges, and buildings. It also provides a notion of levels of detail (LoD),
from LoD 0 to LoD 4, where LoD 4 aims to represent building interior space. In this
paper, we focus on the integration of IndoorGML and CityGML for indoor space and
consequently deal with LoD 4 of the CityGML building module.
Figure 5 shows a simplified data model of LoD 4 of the building model. A building
mainly consists of three basic components: first, BoundarySurface, such as walls, roofs,
ceilings, and floors; second, Room, such as rooms and corridors; and third, Opening,
for windows and doors.

Fig. 5. Building model of CityGML

While BoundarySurface and Opening are geometrically defined as a surface or a
multi-surface (gml:MultiSurface in GML), the geometry of Room is defined either as
an inline solid (gml:Solid in GML) or as a set of BoundarySurface. If the geometry of
a room is defined as a set of BoundarySurface, it must form a closed space. For
example, if there is no physical boundary between a kitchen and a living room, we
need to create a virtual boundary using ClosureSurface to make each room a closed
space. This implies that the connectivity between rooms is found either via Opening
or via ClosureSurface in CityGML.

2.3 Motivation

Linking IndoorGML and CityGML is useful for several reasons. First, a large part of
IndoorGML data can be derived from CityGML data. Second, the integration of
IndoorGML and CityGML via external references compensates for the weaknesses of
each standard. However, little work has been done on rules or guidelines for linking
and integrating IndoorGML and CityGML. The goals of this paper are to propose a
method for deriving IndoorGML data from CityGML data and to discuss the issues
and possible solutions for integrating datasets in the two standards via the external
references of IndoorGML.

3 Correspondence between IndoorGML and CityGML

In order to explore the integration issues between IndoorGML and CityGML, we need
to investigate the relationships between feature types in both data models.

While the Room feature type of CityGML corresponds to indoorCore::CellSpace in the
IndoorGML data model, CityGML provides no further classification of room types.
However, it is necessary to tell which subtype of indoorCore::CellSpace a room
belongs to in IndoorGML, for example whether it is an indoorNavi::GeneralSpace or
an indoorNavi::TransitionSpace.
In this paper, we propose a mapping between the feature types of IndoorGML and
CityGML based on the ontology in [3] and the code list defined in Annex C of
CityGML [10]. According to the ontology in [3], a Room in CityGML corresponds to
either a room or a passage, as shown in Figure 6. Room and passage are then mapped
to indoorNavi::GeneralSpace and indoorNavi::TransitionSpace, respectively, where
corridor, stairway, escalator, elevator, moving walkway, ramp, and lobby belong to
passage.

Fig. 6. Mapping relationships between IndoorGML and space types

A more delicate mapping between IndoorGML and CityGML is found between Door
in CityGML and the feature types of IndoorGML.

– anchor: if an instance of Door is a gate connecting indoor and outdoor spaces, it is
considered an anchor rather than a door in IndoorGML; otherwise, it is considered
a door.
– thin door vs. thick door: while Door is geometrically defined as a multi-surface in
CityGML, it is either a surface or a solid in IndoorGML, depending on whether the
thin door model or the thick door model is used, as shown in Figure 7. For example,
'D1' and 'Cell D1' in Figure 7 are represented in the thin door model and the thick
door model, respectively. If the thin door model is employed in IndoorGML, Door of
CityGML is mapped to indoorNavi::ConnectionBoundary; otherwise, it is mapped to
indoorNavi::ConnectionSpace.

Fig. 7. Thin door model vs. thick door model

The mapping relationships between feature types of IndoorGML and CityGML are
summarized in Table 1.

Table 1. Mapping relationships between feature types of IndoorGML and CityGML

CityGML Feature Type (CodeList)   IndoorGML Feature Type
Room (stairs)                     TransitionSpace
Room (escalator)                  TransitionSpace
Room (elevator)                   TransitionSpace
Room (lobby)                      TransitionSpace
Room                              GeneralSpace
Door                              ConnectionSpace
ClosureSurface                    ConnectionBoundary

4 Derivation from CityGML to IndoorGML

4.1 Generating Reference of IndoorGML to CityGML


Based on the mapping relationships between IndoorGML and CityGML, we can
generate the features of IndoorGML data from CityGML data that belong to either
NavigableSpace or NavigableBoundary. External references to the corresponding
features in the CityGML dataset are then included in the IndoorGML data. For these
external references to CityGML data, a unique feature identifier must be included in
each object in CityGML.
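
As a minimal illustration of such an external reference, the sketch below pairs each
derived IndoorGML cell with the gml:id of its source Room; the document-plus-fragment
URI pattern and the gml:id values are assumptions made for illustration.

```python
def external_reference(citygml_file: str, room_gml_id: str) -> str:
    """Build an xlink-style reference from an IndoorGML cell to a CityGML object."""
    # Assumed pattern: document URI plus the fragment identifier of the referenced gml:id.
    return f"{citygml_file}#{room_gml_id}"

# Hypothetical gml:ids of two Room objects in a CityGML LoD 4 file.
cell_refs = {f"cell-{gml_id}": external_reference("building_lod4.gml", gml_id)
             for gml_id in ["Room_101", "Room_102"]}
```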
In order to clarify this mapping relationship, we propose guidelines to identify
the correspondence as follows.

– CodeList: the definition of Room in CityGML is broad; it covers not only rooms
such as bedrooms and living rooms, but also corridors and stairways. Therefore, we
need more information to specify the correspondence. Fortunately, a code list for the
function and usage of Room is given in CityGML. For example, the code list values
for stairway, escalator, elevator, and lobby defined in CityGML correspond to
indoorNavi::TransitionSpace of IndoorGML. The code list therefore enables us to
precisely specify the correspondence between a feature type in IndoorGML and an
instance in CityGML. In the case where the code list value of a Room in CityGML is
missing or unknown, the room can only be mapped to indoorNavi::NavigableSpace.
– Control value: a control value is defined in space syntax theory in terms of the
connectivity of a cell. In [5], a method that uses the degree of connectivity to find
spaces with a high probability of being rooms (excluding living rooms) is proposed.
In [3], the concept of space syntax was introduced, which allows the properties of
space cells to be investigated. If the control value of a cell is less than 1, the cell is
considered an instance of indoorNavi::GeneralSpace, such as an office or a bedroom.
Otherwise, it may belong to indoorNavi::TransitionSpace, since it is connected with
other cells and is therefore used as a passage between cells. A sketch that combines
both guidelines is given below.
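
The sketch below shows one plausible way to combine the two guidelines when
classifying a CityGML Room: the code-list check comes first, and the control value is
used as a fallback when the code list is missing. The code-list strings, the adjacency
input, and the fallback strategy are assumptions for illustration.

```python
from typing import Dict, List, Optional

# Code-list values of Room that indicate passage-like spaces (illustrative strings).
TRANSITION_CODES = {"corridor", "stairway", "escalator", "elevator", "lobby",
                    "moving walkway", "ramp"}

def control_value(room_id: str, adjacency: Dict[str, List[str]]) -> float:
    """Space-syntax control value: sum of 1/degree(n) over the neighbours n of the room."""
    return sum(1.0 / max(len(adjacency.get(n, [])), 1) for n in adjacency.get(room_id, []))

def classify_room(room_id: str, code: Optional[str],
                  adjacency: Dict[str, List[str]]) -> str:
    """Map a CityGML Room to an IndoorGML navigation feature type (cf. Table 1)."""
    if code in TRANSITION_CODES:
        return "indoorNavi::TransitionSpace"
    if code is not None:
        return "indoorNavi::GeneralSpace"
    # Code list missing or unknown: fall back to the control-value heuristic.
    if control_value(room_id, adjacency) < 1.0:
        return "indoorNavi::GeneralSpace"
    return "indoorNavi::TransitionSpace"
```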

4.2 Automatic Generation of Geometry of State and Transition

In this subsection, we discuss the automatic derivation of indoorCore::State and
indoorCore::Transition from instances in CityGML. Based on the mapping relationships
and the generation of external references, we see how each instance in CityGML is
mapped to an IndoorGML instance, which eventually belongs to either
indoorCore::CellSpace or indoorCore::CellBoundary of the primal space.
Here we focus on the geometric graph, in which the positions of indoorCore::State and
indoorCore::Transition are specified. The position of an indoorCore::State is easily
computed as the centroid of its indoorCore::CellSpace. However, in order to generate
indoorCore::Transition instances, we need to check the connectivity between cells in
the CityGML dataset. This process is carried out in two steps (a sketch of the two
steps is given after the list).

– First, given an instance of Room in the CityGML data, we find the Door instances
belonging to the BoundarySurface or ClosureSurface of the room.
– Second, we find the Room instances that share a Door or ClosureSurface instance
found in the first step. We then obtain an instance of indoorCore::Transition connecting
the two indoorCore::State instances that correspond to the Room instances of the
first step.
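
A sketch of these two steps is given below. It assumes that the CityGML content has
already been reduced to plain Python records, each with a room identifier, a precomputed
centroid, and the gml:ids of the Door or ClosureSurface objects bounding the room;
this input format is an assumption for illustration, not a parser for real CityGML files.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class RoomInfo:
    room_id: str                               # gml:id of the Room, used as external reference
    centroid: Tuple[float, float, float]       # assumed to be precomputed from the room geometry
    boundary_ids: List[str]                    # gml:ids of Door / ClosureSurface objects of the room

def derive_states_and_transitions(rooms: List[RoomInfo]):
    """Derive the geometric NRG: one State per Room, one Transition per shared opening."""
    states = {r.room_id: {"id": "state-" + r.room_id,
                          "position": r.centroid,
                          "external_ref": r.room_id} for r in rooms}
    # Step 1: index rooms by the Door / ClosureSurface instances that bound them.
    rooms_by_boundary: Dict[str, List[str]] = defaultdict(list)
    for r in rooms:
        for b in r.boundary_ids:
            rooms_by_boundary[b].append(r.room_id)
    # Step 2: every boundary shared by exactly two rooms yields a Transition between their States.
    transitions = []
    for b, ids in rooms_by_boundary.items():
        if len(ids) == 2:
            a, c = ids
            transitions.append({"id": "trans-" + b,
                                "connects": (states[a]["id"], states[c]["id"]),
                                "geometry": (states[a]["position"], states[c]["position"])})
    return states, transitions
```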

The geometry of an indoorCore::Transition is determined as a straight line connecting
two indoorCore::State instances, as shown in Figure 8. This captures the topological
structure of the indoor space, but it is insufficient for determining navigation routes
and therefore for computing an optimal route between two points in indoor space.

Fig. 8. Topographic layer derived from CityGML

Fig. 9. Connection between State and CellSpace

For example, the route between n2 and n9 differs considerably from the navigation
route of an ordinary pedestrian. This strange route has two causes. The first is a big
hall or a long corridor with several doors, such as n7 in Figure 8. The second is the
case where the shape of a cell is concave and the straight line between two
indoorCore::State instances passes through walls.
In this paper, we propose a solution to this problem by introducing an extra layer for
navigation, as shown in Figure 9. The indoorCore::State instances of the original space
layer are split into multiple instances to improve the routes. For example, the
indoorCore::State instance n7 is split into two instances, n7-1 and n7-2, as shown in
Figure 10; the route from n2 to n9 then becomes more natural than the original one in
Figure 8 (a simplified sketch of this splitting is given after Figure 10). A more detailed
node-splitting algorithm is found in [12].

Fig. 10. Navigation layer after subspacing
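
The following sketch illustrates the idea behind the navigation layer of Figure 10: a
corridor state is replaced by one sub-state per door, placed on the corridor axis and
chained along it. It is a simplified illustration only, not the node-splitting algorithm
of [12]; the door positions and the straight corridor axis are assumed inputs.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def split_corridor_state(corridor_id: str,
                         door_positions: Dict[str, Point],
                         axis_start: Point,
                         axis_end: Point):
    """Replace one corridor State by one sub-state per door, chained along the corridor axis."""
    (x0, y0), (x1, y1) = axis_start, axis_end
    dx, dy = x1 - x0, y1 - y0
    length2 = dx * dx + dy * dy or 1.0

    # Project each door onto the corridor axis and sort the doors along it.
    def along_axis(p: Point) -> float:
        return ((p[0] - x0) * dx + (p[1] - y0) * dy) / length2

    sub_states: List[dict] = []
    for i, (door_id, pos) in enumerate(sorted(door_positions.items(),
                                              key=lambda kv: along_axis(kv[1]))):
        t = along_axis(pos)
        sub_states.append({"id": f"{corridor_id}-{i + 1}",
                           "door": door_id,
                           "position": (x0 + t * dx, y0 + t * dy)})
    # Chain consecutive sub-states with transitions that follow the corridor axis.
    transitions = [{"connects": (a["id"], b["id"])}
                   for a, b in zip(sub_states, sub_states[1:])]
    return sub_states, transitions
```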

5 Mapping Cardinality for External Reference


In this section, we discuss how to generate external references from IndoorGML
instances to objects in CityGML. The mapping relationships between IndoorGML and
CityGML were discussed in Section 3. There are cases where the mapping cardinality
from IndoorGML instances to CityGML objects is not one-to-one. Figure 11 shows an
example where two rooms in CityGML correspond to three cells of IndoorGML.
However, one-to-many and many-to-many mappings are not allowed for the external
reference from IndoorGML, since only one external reference can be specified for a
cell in IndoorGML; one-to-one and many-to-one mappings are allowed.

Fig. 11. M:N relationship in spaces

In this paper, we propose a method to resolve this mapping cardinality by introducing
an overlapping layer (Figure 12) or a subspacing layer (Figure 13). Figure 12 shows an
example of the first option, where an additional layer Layer2 is introduced to guarantee
a one-to-one mapping between CityGML and IndoorGML (n4 to Room1 and n5 to
Room2). We then define the inter-layer connection between Layer1 and Layer2, where
the topological property of the inter-layer connection is equal, overlap, or contain.
Integrating IndoorGML and CityGML for Indoor Space 195

Fig. 12. Overlapping layers

An alternative option for the extra layer is shown in Figure 13, where each intersection
of a Room object of CityGML and a cell of IndoorGML becomes a separate cell of
IndoorGML. In this case, the topological property of the inter-layer connection is
either contain or equal. Both options can be applied to remove the one-to-many and
many-to-many cases (a sketch of the subspacing option is given after Figure 13).

Fig. 13. Sharing the subspaced layer
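
A sketch of the subspacing option is shown below: every non-empty intersection of a
CityGML Room footprint and an IndoorGML cell footprint becomes a cell of the new
shared layer, and the topological property of each inter-layer connection is classified
as equals or contains. Representing the 2D footprints as shapely polygons is an
assumption made for illustration.

```python
from typing import Dict, List, Tuple
from shapely.geometry import Polygon

def subspace_layer(citygml_rooms: Dict[str, Polygon],
                   indoor_cells: Dict[str, Polygon]):
    """Build the shared subspaced layer of Figure 13 and its inter-layer connections."""
    new_cells: Dict[str, Polygon] = {}
    connections: List[Tuple[str, str, str]] = []   # (original cell, new cell, topology)
    for room_id, room in citygml_rooms.items():
        for cell_id, cell in indoor_cells.items():
            part = room.intersection(cell)
            if part.is_empty or part.area == 0.0:
                continue
            new_id = f"{cell_id}|{room_id}"
            new_cells[new_id] = part               # maps one-to-one to Room room_id (external reference)
            topo = "equals" if part.equals(cell) else "contains"   # original cell contains the sub-cell
            connections.append((cell_id, new_id, topo))
    return new_cells, connections
```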

6 Conclusions
Two geospatial standards, CityGML LoD 4 and IndoorGML, have been developed by
the OGC (Open Geospatial Consortium) to provide interoperability among indoor
spatial information systems. While CityGML aims to provide a framework for 3D city
models including interior space, IndoorGML mainly focuses on indoor navigation in
terms of a cellular space model. To overcome the mismatches between the two
geospatial standards, they can serve as complements to each other, and their integration
is useful for many indoor spatial information service areas.
In this paper, we discussed the integration issues and proposed methods and guidelines
for the integration. The main contributions of the paper are as follows.
– First, we identified the mapping relationships between the feature types of CityGML
LoD 4 and IndoorGML, which serve as a fundamental basis for the integration.
– Second, we presented methods and guidelines for the automatic derivation of
IndoorGML instances from a CityGML LoD 4 dataset. Even though the CityGML
LoD 4 dataset must respect some restrictions for a successful derivation, we can
accurately build an IndoorGML dataset with ease.

– Third, we discussed the issue of mapping cardinality between CityGML and
IndoorGML for external references and proposed a solution based on the multi-layered
space model of IndoorGML.

A prototype has been developed to validate our approaches; it showed that the
approaches are feasible but require further extensions. In particular, part of the
derivation of IndoorGML data is automatic, but manual work is still required in
several cases. More future work is therefore required to strengthen the automatic
derivation and to reduce the manual part. In addition, only the integration between
CityGML and IndoorGML was handled in this paper. The integration between
IndoorGML and IFC remains future work, since IFC is widely used not only in
architectural engineering but also in the geospatial information area, and IFC can
serve as a source of raw data in many cases.

Acknowledgment. This research was supported by a grant (11 High-tech G11) from
the Architecture & Urban Development Research Program funded by the Ministry of
Land, Infrastructure and Transport of the Korean government.

References
1. buildingSMART, IFC4 (Industrial Foundation Classes XML 4) RC4, http://www.
buildingsmart-tech.org/specifications/ifcxml-releases/ifcxml4-release
2. Computational Geometry Algorithms Library, http://www.cgal.org
3. Hillier, B., Hanson, J.: The Social Logic of Space. Cambridge University Press
(1984)
4. Indoor Google Maps, http://maps.google.com/help/maps/indoormaps
5. Kim, J.S., Han, Y.S., Li, K.J.: K-anonymity in Indoor Spaces Through Hierarchical
Graphs. In: Fourth ACM SIGSPATIAL International Workshop on Indoor Spatial
Awareness, pp. 21–28 (2012)
6. Kolbe, T.H., Gröger, G., Plümer, L.: CityGML: Interoperable access to 3D city models.
In: 1st International Symposium on Geo-Information for Disaster Management,
pp. 21–23 (2005)
7. Lee, J.: 3D GIS for Geo-coding Human Activity in Micro-scale Urban Environ-
ments. In: Egenhofer, M., Freksa, C., Miller, H.J. (eds.) GIScience 2004. LNCS,
vol. 3234, pp. 162–178. Springer, Heidelberg (2004)
8. Lee, J., Li, K.J., Zlatanova, S., Kolbe, T.H., Nagel, C., Becker, T.: Requirements
and Space-Event Modeling for Indoor Navigation. OGC 10-191r1 (2010)
9. Li, K.J., Yoo, S.J., Han, Y.S.: Geocoding Scheme for Multimedia in Indoor Space.
In: 20th International Conference on Advances in Geographic Information Systems,
pp. 434–437 (2013)
10. Open Geospatial Consortium, OGC City Geography Markup Language (CityGML)
Encoding Standard, Version 2.0, OGC 12-019 (2012)
11. Open Geospatial Consortium, OpenGIS Geography Markup Language (GML) En-
coding Standard Version 3.2.1, OGC 07-036 (2007)
12. Tsetsos, V., Anagnostopoulos, C., Kikiras, P., Hasiotis, P., Hadjiefthymiades, S.:
A Human-centered Semantic Navigation System for Indoor Environments. In: In-
ternational Conference on Pervasive Services, pp. 146–155 (2005)
Author Index

Alfarrarjeh, Abdullah 67
Bast, Hannah 115
Bellur, Umesh 1
Boysen, Mikkel 148
Cao, Jinzhou 54
Davoine, Paule-Annick 134
de Haas, Christian 148
Dufilie, Andrew 19
Fan, Hong 85
Feng, Hao 85
Gensel, Jérôme 134
Grinstein, Georges 19
Gueguen, Philippe 134
Hu, Qingwu 54
Jang, In Sung 100
Jensen, Christian S. 166
Kim, Joon-Seok 184
Kim, Min Soo 100
Kim, Seon Ho 67
Lee, Chung Ho 100
Li, Huan 85
Li, Ki-Joune 100, 184
Li, Pengpeng 85
Li, Qingquan 54
Lu, Hua 148
Lu, Ying 67
Mehta, Paras 36
Moses, Yael 166
Müller, Sebastian 36
Poulenard, Laurent 134
Radaelli, Laura 166
Shahabi, Cyrus 67
Shi, Junyuan 67
Sternisko, Jonas 115
Storandt, Sabine 115
Voisard, Agnès 36
Wang, Guanfeng 67
Wu, Huayi 85
Xie, Xike 148
Yoo, Sung-Jae 184
Zimmermann, Roger 67
