
Routing For OpenStreetMaps

Submitted By Venkata R Kella rvk4@le.ac.uk

Dissertation

Supervisor Name: Alexander Kurz
Second marker Name: Emilio Tuosto

DECLARATION

All sentences or passages quoted in this report, or computer code of any form whatsoever used and/or submitted at any stages, which are taken from other people’s work have been specifically acknowledged by clear citation of the source, specifying author, work, date and page(s). Any part of my own written work, or software coding, which is substantially based upon other people’s work, is duly accompanied by clear citation of the source, specifying author, work, date and page(s). I understand that failure to do these amounts to plagiarism and will be considered grounds for failure in this module and the degree examination as a whole.

Name:
Signed:
Date:


Abstract

In the past few years, routing services have become a part of daily life, whether online, offline or as standalone navigation devices. No single algorithm is used by most route planners today; the algorithms vary in running time, memory usage and complexity. Planning optimum routes on a very large road network is still a challenge for route planners. Well-known algorithms such as Dijkstra’s and A* cannot be used as they are on road networks containing millions of nodes and ways. Many techniques have been developed to speed up the calculation and reduce memory usage. In this project we discuss pre-processing techniques and approaches for speeding up the existing algorithms. A thesis report on routing algorithms for car navigation devices forms the basis of this project. The project aims at developing a routing service that calculates a shortest path using the data provided by OpenStreetMaps. OpenStreetMaps is a free mapping service developed to encourage developers to work with real map data. During the course of the project, the working of OpenStreetMaps was studied. The data provided by OpenStreetMaps was loaded into a dedicated database and used to construct a graph structure. The router takes the source and destination locations and returns the set of nodes making up the route.


Contents

Abstract
Chapter 1
1. Introduction
1.1 Motivation
1.2 Background
1.3 Project Aim
1.4 Objectives
1.5 Requirements
1.5.1 The Map Loader
1.5.2 The Router
1.6 Project Plan
Chapter 2
2. Problem Formulation
2.1 Massive Data
2.2 Time Dependency
2.3 Data Structuring
2.4 Level of Route Planning
Chapter 3
3 Literature survey
3.1 Search Algorithms
3.1.1 Dijkstra’s Algorithm and Best-First-Search
3.1.2 A*- Heuristic Improvement
3.2 Road Networks
3.2.1 Time independent Model
3.2.2 Time Dependant Model
3.3 Pre-Processing
3.3.1 Creating a Search Graph
3.3.1 Quality of a Partition
3.3.2 Algorithms of partitioning
Splitting Algorithm
Merging-Algorithm
Chapter 4
4 Routing Services
4.1 Tom-tom
4.2 Google Maps
4.3 Open Street Maps
4.3.1 OSM Data
Chapter 5
5. Design and Implementation
5.1 Design
5.1 Structuring data in database
5.1.1 Node:
5.1.2 Way:
5.1.3 Relation:
5.2 Constructing the graph data structure (Graph Abstract data structure):
5.2.1 Graph Class
5.2.2 Creating a Node
5.2.3 Creating a Edge
5.3 Implementing Routing Algorithm
Chapter 6
6.1 Discussion
6.2 Future Work
Bibliography

Chapter 1

1. Introduction

We encounter the problem of determining an optimum path in many applications, and it is a major issue in many branches of computer science. We often play games in which a player-controlled character needs to find a route to a destination while avoiding enemies and collecting points. A robot may need to find a route on its master’s command, a network administrator may need to route data packets through a computer network, and, most relevant here, navigation devices and applications need to plan routes. Path finding addresses the problem of finding a good path from the source to the goal, avoiding obstacles and enemies and minimizing cost factors such as fuel, time, distance and money. The optimum path can be of a custom type, such as the shortest, the fastest or the most economical route. Route planners have become part of everyday life in the past few years. Many types of route planners are available, online, offline and standalone, providing routing services to cars, bikers and pedestrians. A broad range of navigation devices is available for vehicles, and many websites provide routing services online. Though they are easy to use, the complexity behind them is considerable.

1.1 Motivation

The planning of optimum routes on a road network can be traced back thousands of years. People travelling long distances traditionally relied on geographical maps: they had to spot the start location and the destination and consider all possible routes towards the destination in order to pick the least expensive one. Thanks to the advancement of technology, car navigation systems have been introduced which can plan the route between any source and destination. The user is expected to enter the destination point. Information on all possible destinations is stored in a database, and the system uses this information to calculate a route and present it to the user. The digital map stored in the database contains information about all countries, states, cities and the roads connecting them. It consists of millions of nodes and edges representing geographical locations and the ways connecting them. Some pre-processing is therefore needed before the actual routing, to cut down the number of nodes and edges considered when planning a route.

1.2 Background

The key components of a navigation system are positioning, i.e. determining the current position of the vehicle on the road network; route planning, i.e. planning a route from the current position of the car to the destination; and guidance, i.e. guiding the driver by giving both visual and oral instructions. The driver can choose between the fastest and the shortest route, or select routes avoiding motorways or ferries. Once the position of the car has been determined, which can be done by GPS, the challenge is to plan the route from the current location to the location desired by the user. The map information is made available and a routing algorithm is used to determine the route. Many factors influence route planning, namely the terrestrial distance between source and destination, traffic information (i.e. information on traffic jams, road works and road conditions, received for example via RDS-TMC, the Radio Data System Traffic Message Channel) and rules. The progress of the car is reported to the driver continuously. Put more clearly, a route planning system determines the current position on the road and plans a route from the current position to the destination, giving instructions along the way. The algorithm used for planning a route is generally based on Dijkstra’s shortest path algorithm. The map network is pictured as a number of connected road segments. The major challenge in modelling this type of system lies in the several factors that affect route planning. The user has the option to choose among several optimisation criteria, such as the shortest route, the fastest route or a preference for motorways. To be more effective, the system should also allow the user to take into account the risk of traffic jams on the chosen route and accident statistics. Planning an optimum route that takes daily congestion patterns into account can be seen as planning a route in a graph whose costs are not constant.

1.3 Project Aim

The aim of the project extends from using the data provided by OpenStreetMap [1], i.e. structuring the data in a database, to building the graph structure for a routing algorithm to work upon. This report discusses pre-processing techniques and also tries to implement some of them. The data provided by OpenStreetMap is not guaranteed to be ordered and hence has to be structured. The graph structure built from the OSM data is used to find the optimum route by solving the path finding problem.

1.4 Objectives

The thesis titled Route Planning Algorithms for Car Navigation [4], submitted by Ingrid C.M. Flinsenberg, provided the platform for this project. The main challenges include understanding research-level material, implementing it and validating the results. A further challenge is working with real map data. We use the data provided by OpenStreetMaps [3]. The data is provided as OSM files in XML format. The OSM file needs to be parsed and its contents structured in a database. The data needs to be filtered, as it contains some information that is of no interest to us. Thereafter a shortest path should be calculated between any two locations of choice. For this, a graph structure is formed from the data structured previously. Calculating the shortest path on a graph built from all available map data is expensive and slow, so some pre-processing is needed before the routing takes place. Pre-processing limits the search area, thereby reducing the memory needed for storing the map. Graph partitioning, which divides the whole road graph into partitions, is the advanced challenge.

1.5 Requirements

This project involves developing two applications: the map loader, for parsing the OSM file provided by OpenStreetMaps, and the router, for performing routing on the map data.

The Map Loader
The Router

1.5.1 The Map Loader

This application parses the OSM XML file and populates the MySQL database with the map data. The application is coded in Java and uses a DOM parser to do the job. Challenges include filtering the data and removing nodes and ways having null values. It uploads data to three tables, namely nodes, ways and way_nodes, which are meant to contain the data of the geographical locations and of the nodes making up each way.
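The loading step described above can be illustrated with a minimal sketch. It is not the project's Java code: it assumes Python's standard xml.etree.ElementTree parser in place of the Java DOM parser and an in-memory SQLite database in place of MySQL. The table names follow the text, but the column names (lat, lon, highway, seq) and the sample data are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Tiny hand-written OSM extract; node 4 has no coordinates and way 11 has no
# nodes, so both should be filtered out by the loader.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="52.6200" lon="-1.1240"/>
  <node id="2" lat="52.6210" lon="-1.1250"/>
  <node id="3" lat="52.6220" lon="-1.1260"/>
  <node id="4"/>
  <way id="10">
    <nd ref="1"/><nd ref="2"/><nd ref="3"/>
    <tag k="highway" v="residential"/>
  </way>
  <way id="11">
    <tag k="highway" v="footway"/>
  </way>
</osm>"""


def load_osm(xml_text, conn):
    """Parse OSM XML and fill the nodes, ways and way_nodes tables."""
    conn.executescript("""
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat REAL, lon REAL);
        CREATE TABLE ways (id INTEGER PRIMARY KEY, highway TEXT);
        CREATE TABLE way_nodes (way_id INTEGER, node_id INTEGER, seq INTEGER);
    """)
    root = ET.fromstring(xml_text)
    for node in root.iter("node"):
        lat, lon = node.get("lat"), node.get("lon")
        if lat is None or lon is None:
            continue  # drop nodes with null coordinates
        conn.execute("INSERT INTO nodes VALUES (?, ?, ?)",
                     (int(node.get("id")), float(lat), float(lon)))
    for way in root.iter("way"):
        refs = [int(nd.get("ref")) for nd in way.findall("nd")]
        if len(refs) < 2:
            continue  # a way needs at least two nodes to form a segment
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        way_id = int(way.get("id"))
        conn.execute("INSERT INTO ways VALUES (?, ?)",
                     (way_id, tags.get("highway")))
        # seq records the position of each node along the way
        conn.executemany("INSERT INTO way_nodes VALUES (?, ?, ?)",
                         [(way_id, ref, seq) for seq, ref in enumerate(refs)])
    conn.commit()


conn = sqlite3.connect(":memory:")
load_osm(SAMPLE_OSM, conn)
```

Keeping the position of each node within its way (the assumed seq column) matters, because consecutive nodes of a way are what later become graph edges.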


1.5.2 The Router

The Router has to plan a route between any two chosen locations. It delivers the set of nodes making up the route. The key challenges are likely to be reducing the amount of space used by the graph structure and the response time after the source and destination have been selected.

1.6 Project Plan

In order to achieve the project requirements, the development process was spread over a series of stages. The learning process started with research on a topic (planning as model checking) that is not relevant to the current project; the first two weeks were spent on it. I then moved from planning as model checking to routing for Open Street Maps. From then on I spent time understanding research material on route planning algorithms, which took weeks 3 and 4. For implementing routing algorithms and pre-processing techniques we worked with the data provided by Open Street Maps, so the next week was spent on understanding Open Street Maps. Weeks 6 to 9 involved developing the parser for populating the MySQL database and building the graph structure for implementing the routing algorithm. The final stage involved working on the pre-processing techniques and on ideas for simplifying the graph structure built for routing.


Chapter 2

2. Problem Formulation

This section throws some light on the basic functionality of route navigation systems and introduces the challenges involved. As mentioned above, car navigation systems have become part of daily life and are no longer expensive for the common man. Although the device sounds easy to use, the complexities involved are considerable.

2.1 Massive Data

A car navigation system uses a map, stored on a physical medium, which consists of millions of nodes and road segments corresponding to geographical locations and real roads. The user enters the destination and the car navigation system plans the route from the current location to the destination; the user is not likely to wait long for the system to plan the route and present it. While planning the route, data has to be retrieved from the database, which is time consuming. The main aim is to design algorithms and approaches that enable the car navigation system to plan optimum routes on large road networks with millions of nodes and edges within a small amount of time. Approaches include pre-processing techniques that load only a part of the data into the system’s physical memory at a time. Many of the current approaches do their best to minimise the response time. Because the road network we are going to work with contains millions of nodes and edges, it is necessary to do some pre-processing before the actual router plans a route. The result of the pre-processing makes it easier and faster to plan optimum routes on road networks containing millions of nodes. This dissertation tries to explain the route planning functionality of a car navigation system, the pre-processing techniques, and the actual routing algorithms designed to solve the complex path finding problem. It also tries to implement these techniques with the real-world map data provided by Open Street Maps. Open Street Map, OSM for short, creates and provides free geographical data to anyone who wants it, without legal or technical restrictions. The primary challenges are parsing the raw data provided by Open Street Maps, viewing the data as a graph structure and implementing a routing algorithm.


2.2 Time Dependency

Considering timing constraints, road networks can be classified into time-independent, time-dependent and stochastic time-dependent models. Some roads may be closed during specific time periods; for example, a road can be closed for construction work for several hours or days. Algorithms and approaches vary depending on the network model considered.

2.3 Data Structuring

As said earlier, the data is provided by OpenStreetMaps. The issues described here may be specific to this project: the data provided by OpenStreetMaps is not guaranteed to be ordered and hence has to be structured. The data is publicly edited by a large set of OSM users and hence need not be correct. It contains dumb nodes and ways which are of no interest to us, and also includes information on cycle ways and footpaths which is of no use to us.

2.4 Level of Route Planning

A further issue is the level at which the route search should take place, i.e. the type of roads we consider when planning routes. For instance, when routing between two cities there is little point in considering residential roads or other small roads rather than motorways and carriageways. The challenge is to determine the types of roads to be considered when planning custom routes.
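One way to picture the choice of level is a filter over the road classes before the search starts. The sketch below is only an illustration under stated assumptions: the ranking of OSM highway values in ROAD_RANK and the dictionary layout of a way are hypothetical, not taken from the project.

```python
# Hypothetical ranking of OSM "highway" values, most important roads first.
ROAD_RANK = {
    "motorway": 0,
    "trunk": 1,
    "primary": 2,
    "secondary": 3,
    "tertiary": 4,
    "residential": 5,
}


def roads_for_level(ways, max_rank):
    """Keep only ways whose road class is at least as important as max_rank.

    Ways with an unknown highway value are treated as least important.
    """
    unknown = len(ROAD_RANK)
    return [w for w in ways if ROAD_RANK.get(w["highway"], unknown) <= max_rank]


ways = [
    {"id": 1, "highway": "motorway"},
    {"id": 2, "highway": "residential"},
    {"id": 3, "highway": "primary"},
    {"id": 4, "highway": "footway"},
]

# City-to-city planning: consider only the major roads.
long_distance = roads_for_level(ways, ROAD_RANK["primary"])
```

For a route inside a town the same function could be called with a larger max_rank, so that residential roads are included as well.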


Chapter 3

3 Literature survey

Finding a route on a road network can be treated as a shortest path problem, which has been studied extensively over the past few decades. Several algorithms have been developed to solve it. The best known is Dijkstra’s algorithm (1959) [1], and several algorithms and data structures have been put forward since. Dijkstra’s algorithm works by visiting the nodes of the graph from the starting point, repeatedly examining all adjacent nodes which have not yet been visited. It expands outwards until it reaches the goal node, i.e. all nodes closer to the start are examined before the goal is reached. Its main strength is that it is guaranteed to find a shortest path, provided no edge has a negative cost. Dijkstra’s algorithm can be sped up by taking a heuristic estimate of the distance to the destination into account. The A* algorithm [Hart, Nilsson & Raphael, 1968] [2] is the next best-known algorithm; it calculates the shortest path between two nodes using heuristics. Extensive research has produced many other algorithms, among them the Bellman-Ford algorithm [Bellman, 1958; Ford Jr. & Fulkerson, 1962], the D’Esopo-Pape algorithm [Pape, 1974] and the Floyd-Warshall algorithm [Floyd, 1962]. The main intention of all this research and development is to calculate a shortest path from source to destination using minimum time and memory. For a car navigation system, a standard Dijkstra-like algorithm [Dijkstra, 1959; Hart, Nilsson & Raphael, 1968] is not fast enough to plan optimum routes in large real-world road networks. Because of the high demands on planning speed, the route planning process has to be sped up, which can be done by pre-processing the road network. This section introduces the path finding problem and discusses the evolution of the algorithms designed to solve it and the pre-processing techniques used to reduce the time and memory required to calculate a route. It also gives an introduction to the various road network models in existence.

3.1 Shortest Path Problem

The problem of finding a route of minimum cost on a road network can be referred to as a shortest path problem. The shortest path can be defined as the sequence of edges or nodes between two vertices of the road network for which the sum of the costs of traversing each edge is as low as possible. Throughout this report the road network is modelled as a directed, weighted graph. Nodes are formed from the intersections of streets. A graph is defined as G(V, E), where V is the finite set of nodes and E is the finite set of edges. Each edge has a cost associated with it, i.e. the cost of traversing the edge. The routing algorithm returns a path consisting of a sequence of vertices (v1, v2, ..., vn). Each node is expected to remember its parent and child. After the routing algorithm finishes, the path is reconstructed by tracing back through the set of nodes. The most common approaches for solving the minimisation problem include [9]:

Exhaustive search
Linear programming
Relaxation methods
Simulated annealing
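The graph model and the path reconstruction described before the list can be sketched as follows. This is a minimal illustration, assuming a nested-dictionary adjacency map; the node names, costs and the parent table are made up for the example.

```python
# Directed, weighted graph G(V, E) as an adjacency map: node -> {neighbour: cost}.
graph = {
    "a": {"b": 4.0, "c": 1.0},
    "b": {"d": 5.0},
    "c": {"b": 2.0},
    "d": {},
}


def path_cost(graph, path):
    """Sum of the edge costs along a sequence of vertices (v1, v2, ..., vn)."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def reconstruct_path(parent, goal):
    """Trace parent links back from the goal and return the path start-first."""
    path = [goal]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    path.reverse()
    return path


# Parent links as a routing algorithm might leave them (hypothetical values).
parent = {"a": None, "c": "a", "b": "c", "d": "b"}
route = reconstruct_path(parent, "d")
total = path_cost(graph, route)
```

Here the route is a, c, b, d with a total cost of 8.0, which is cheaper than the direct a, b, d alternative at 9.0.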

All of the approaches listed above have their pros and cons. We choose the approach that best suits, and can be adapted to, our problem. We consider the relaxation technique, as it performs better than exhaustive search. Exhaustive search enumerates all available paths and returns the shortest one; this approach does not scale to a large road network. Relaxation methods determine the optimum path from a single source to all other vertices. At each step they choose the best path from a single node to all its neighbours. Dijkstra’s algorithm, the Bellman-Ford algorithm and A* are examples of the relaxation technique.

3.2 Algorithms

As discussed earlier, there exists a wide range of algorithms that solve the path finding problem. We now go into detail.

3.2.1 Dijkstra’s Algorithm and Best-First-Search

Dijkstra’s algorithm (1959) [1] works by visiting nodes in the graph beginning at the starting point, repeatedly examining all adjacent nodes which have not yet been visited. It expands outwards until it reaches the goal node, i.e. all nodes closer to the start are examined before the goal is reached. Its main strength is that it is guaranteed to find a shortest path, provided no edge has a negative cost.
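A minimal sketch of Dijkstra’s algorithm as described above, assuming a nested-dictionary graph and a binary heap as the priority queue. The graph roads and its costs are invented for illustration, and the code assumes non-negative edge costs.

```python
import heapq


def dijkstra(graph, source, goal):
    """Return (path, cost) of a shortest path, or (None, inf) if unreachable.

    graph maps each node to {neighbour: non-negative edge cost}.
    """
    dist = {source: 0.0}
    parent = {source: None}
    visited = set()
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u in visited:
            continue  # outdated queue entry
        visited.add(u)
        if u == goal:
            break
        for v, cost in graph.get(u, {}).items():
            new_dist = d + cost
            # Relaxation: keep the cheaper way of reaching v.
            if new_dist < dist.get(v, float("inf")):
                dist[v] = new_dist
                parent[v] = u
                heapq.heappush(queue, (new_dist, v))
    if goal not in visited:
        return None, float("inf")
    path = [goal]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1], dist[goal]


roads = {
    "a": {"b": 4.0, "c": 1.0},
    "b": {"d": 5.0},
    "c": {"b": 2.0},
    "d": {},
}
shortest, length = dijkstra(roads, "a", "d")
```

The search stops as soon as the goal is taken from the queue, because at that point no cheaper way of reaching it can remain.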


The Greedy Best-First-Search algorithm [16], the next popular search algorithm, uses an estimate, called a heuristic, of how far any node is from the destination. It runs quicker than Dijkstra’s algorithm but cannot guarantee a shortest path, as it relies on the heuristic value to guide it towards the goal very quickly. For instance, if the goal is towards the south-east of the map, best-first search focuses on the ways that lead towards the south-east. The speed of Best-First-Search and the guaranteed shortest path of Dijkstra’s algorithm are combined in a more efficient and quicker algorithm called the A* algorithm.

3.2.2 A*- Heuristic Improvement

The A* algorithm [Hart, Nilsson & Raphael, 1968] was developed in 1968 as a combination of the heuristic approach of Greedy Best-First-Search and the formal approach of Dijkstra’s algorithm. Like Dijkstra’s algorithm it can find the shortest path, and like Greedy Best-First-Search it uses a heuristic to guide itself towards the goal. A* is capable of returning the best path (if one exists) between any pair of nodes, according to the accessibility/orientation and the cost of the arcs. In more detail, A* uses simple data structures to maintain the explored and unexplored nodes, namely the open list and the closed list. In simple words, the open list holds all the nodes we are currently working on and the closed list contains all the nodes which are fully explored. Each node has three values associated with it: G, H and F.

G: Goal Value, the exact cost to reach the current node from the starting node.
H: Heuristic Value, the estimated (i.e. heuristic) cost to reach the destination from the current node (explained in more detail below).
F: Fitness Value, the sum of the above two values, G + H, which records how expensive it will be to reach our goal through the current node.

F(n) = G(n) + H(n)


In addition to these values, each node needs to be aware of its parent, so that it is possible to establish how we reached the node. Each time the open list is consulted, the node with the lowest F score is chosen. In more detail, the algorithm starts with the open list containing the starting node and the closed list empty. The best node in the open list, i.e. the node with the lowest F score, is selected every time and moved to the closed list. So the first time round the starting node is selected, all its adjacent nodes are added to the open list, and the starting node is dropped from the open list and added to the closed list. Then the best node in the open list is selected and all its adjacent nodes are examined. Two cases are considered when examining a neighbouring node.

Case 1: If the neighbouring node is already in the closed list or the open list and the G value of the path through the current node is lower, the G value of the neighbouring node is updated with the lower value and the neighbouring node’s parent is set to the current node.
Case 2: If the neighbour is in neither list, we add it to the open list and calculate its scores.

The above process is repeated until we reach the goal node. Working backwards from the goal node, following each node to its parent until we reach the starting node, gives the shortest path.

A word on heuristics: H can be estimated in a variety of ways, for example Manhattan distance, diagonal distance or Euclidean distance. As the heuristic is just an estimate that guides us towards the goal node, we need to consider how it affects the result of the A* algorithm.

If H(n) is 0, only G(n) plays a role and A* turns into Dijkstra’s algorithm, which is guaranteed to find a shortest path.
If H(n) is lower than or equal to the actual cost of moving from the current node to the goal, A* is still guaranteed to find a shortest path, but the lower H(n) is, the more nodes it explores and the slower the search.
If H(n) is exactly equal to the cost of moving from the current node to the goal node, A* is very fast. Remember, though, that H(n) is only an estimate of the cost.
If H(n) is greater than the actual cost of moving from the current node to the goal, A* is not guaranteed to find a shortest path but will run quicker. If H(n) is very high relative to G(n), only H(n) plays a role and A* turns into Greedy Best-First-Search.
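The open list, closed list and F = G + H bookkeeping described above can be sketched as follows. This is an illustrative sketch, not the project's implementation: it assumes planar coordinates with a Euclidean heuristic, a heap as the open list, and that a node whose G value improves is simply pushed onto the open list again. The coordinates and costs are invented.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def a_star(graph, coords, start, goal, heuristic=euclidean):
    """Return (path, cost), or (None, inf) if the goal cannot be reached."""
    g = {start: 0.0}                      # G: exact cost from the start
    parent = {start: None}
    open_list = [(heuristic(coords[start], coords[goal]), start)]  # (F, node)
    closed = set()
    while open_list:
        _, current = heapq.heappop(open_list)   # node with the lowest F
        if current in closed:
            continue                            # outdated open-list entry
        if current == goal:
            path = [goal]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1], g[goal]
        closed.add(current)
        for neighbour, cost in graph.get(current, {}).items():
            tentative_g = g[current] + cost
            # Covers both cases: a new node, or a known node reached more cheaply.
            if tentative_g < g.get(neighbour, math.inf):
                g[neighbour] = tentative_g
                parent[neighbour] = current
                closed.discard(neighbour)       # reopen if it was closed
                f = tentative_g + heuristic(coords[neighbour], coords[goal])
                heapq.heappush(open_list, (f, neighbour))
    return None, math.inf


coords = {"a": (0, 0), "b": (2, 0), "c": (0, 1), "d": (2, 1)}
grid = {
    "a": {"b": 2.5, "c": 1.0},
    "b": {"d": 1.0},
    "c": {"d": 2.0},
    "d": {},
}
route, cost = a_star(grid, coords, "a", "d")
```

Passing heuristic=lambda a, b: 0.0 makes H(n) zero everywhere, so the same function then behaves like Dijkstra’s algorithm, as noted in the first heuristic case above.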

The one main issue with both Dijkstra’s algorithm and the A* algorithm is memory usage: as the search space increases, the memory usage increases. In our problem the search space may contain millions of nodes and ways. For each node visited we calculate the G cost and the F cost and maintain the parent and the child of the node. Because of these memory requirements A* has been extended to Iterative Deepening A* (IDA*) and Simplified Memory-Bounded A* (SMA*). More details can be found in [17] and [18]. We now discuss the suboptimal techniques developed for calculating the shortest path between two geographical locations. Some of them include:

Bidirectional search
Multilevel approach

3.3 Bi-directional Search

This technique runs an arbitrary search algorithm in both directions: from the source node in the forward direction and from the destination node in the backward direction. That is, we start two searches, one from start to finish and the other from finish to start. The two searches are expected to meet at some point, and when they meet the two partial paths together give the route. With this approach the search area can be reduced compared with unidirectional search. Bidirectional search may be more useful if the map is complex. The front-to-front variation of bidirectional search links the two searches together. Instead of selecting the best forward search node n by its F value g(start, n) + h(n, goal) or the best backward search node m by its F value g(m, goal) + h(start, m), this approach chooses the pair of nodes n and m with the best g(start, n) + h(n, m) + g(m, goal). The retargeting approach does not run simultaneous searches in both directions. Instead it first performs a forward search to choose a best forward node, then performs a backward search towards the selected best forward node, in which it chooses the best backward node. In the next stage it performs a forward search from the chosen best forward node to the chosen best backward node. This process continues until the two nodes are one and the same. [19]
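The basic two-sided idea can be sketched with Dijkstra’s algorithm as the underlying search. This is one possible illustration, not the front-to-front or retargeting variants: it assumes non-negative costs, searches the reversed graph from the goal, and uses the common stopping rule that the search may end once the smallest keys of the two queues add up to at least the best route found so far. The helper names and the sample graph are hypothetical.

```python
import heapq
import math


def reverse_graph(graph):
    """Return the graph with every edge direction flipped."""
    rev = {u: {} for u in graph}
    for u, edges in graph.items():
        for v, cost in edges.items():
            rev.setdefault(v, {})[u] = cost
    return rev


def bidirectional_dijkstra(graph, source, goal):
    """Return (path, cost), or (None, inf) if no route exists."""
    if source == goal:
        return [source], 0.0
    graphs = (graph, reverse_graph(graph))      # 0 = forward, 1 = backward
    dist = ({source: 0.0}, {goal: 0.0})
    parent = ({source: None}, {goal: None})
    queues = ([(0.0, source)], [(0.0, goal)])
    settled = (set(), set())
    best, meeting = math.inf, None
    while queues[0] and queues[1]:
        # No undiscovered route can beat the best one found so far.
        if queues[0][0][0] + queues[1][0][0] >= best:
            break
        side = 0 if queues[0][0][0] <= queues[1][0][0] else 1
        d, u = heapq.heappop(queues[side])
        if u in settled[side]:
            continue
        settled[side].add(u)
        for v, cost in graphs[side].get(u, {}).items():
            if d + cost < dist[side].get(v, math.inf):
                dist[side][v] = d + cost
                parent[side][v] = u
                heapq.heappush(queues[side], (d + cost, v))
            # The searches meet at v if the other side has reached it too.
            if v in dist[1 - side] and dist[0][v] + dist[1][v] < best:
                best = dist[0][v] + dist[1][v]
                meeting = v
    if meeting is None:
        return None, math.inf
    path, node = [], meeting
    while node is not None:                     # source ... meeting
        path.append(node)
        node = parent[0][node]
    path.reverse()
    node = parent[1][meeting]
    while node is not None:                     # meeting ... goal
        path.append(node)
        node = parent[1][node]
    return path, best


roads = {"a": {"b": 4.0, "c": 1.0}, "b": {"d": 5.0}, "c": {"b": 2.0}, "d": {}}
route, cost = bidirectional_dijkstra(roads, "a", "d")
```

Note that the first node at which the two searches touch is not necessarily on the best route, which is why the sketch keeps the best meeting point seen so far and only stops once the queue keys rule out anything cheaper.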


3.4

Multi-level Approach

The multi-level approach is closely related to the pre-processing techniques discussed in later sections. When planning routes between locations in different cities, the largest part of the shortest path remains the same regardless of the exact source and destination. This idea is extended in Highway Hierarchies, introduced by Sanders and Schultes [6]. The approach performs pre-processing to create multiple levels on the road graph. Starting from the lowest level, which is the original road graph, it generates multiple levels, each higher level being an abstraction of the level below. Each time a new level is generated, some nodes and edges are deleted and some edges may be generalised. When planning a route between two cities, highways form the major part of the shortest path, so the multi-level approach finds the highways and adds them to the next level. At each new level, a path whose intermediate nodes have only two neighbours is replaced with a single edge connecting its start point and end point, and all isolated nodes are removed. When the desired number of levels has been created, all nodes in each level are linked to the corresponding nodes in the level below to form the final graph. This linking ensures that the shortest path can still be found.
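The replacement of two-neighbour chains by single edges can be illustrated with a short Python sketch. It is a simplified single pass over an undirected weighted graph, under the assumption that the graph is stored as a dictionary of dictionaries. The function name `contract_degree_two` is hypothetical and is not taken from the Highway Hierarchies work.

```python
def contract_degree_two(graph):
    """Replace chains of two-neighbour nodes by one edge of equal total cost.

    graph is an undirected weighted graph: {node: {neighbour: cost}}.
    A new dictionary is returned; the input is left untouched.
    """
    g = {u: dict(nbrs) for u, nbrs in graph.items()}
    for u in list(g):
        if u not in g or len(g[u]) != 2:
            continue
        (a, wa), (b, wb) = g[u].items()
        # Remove u and connect its two neighbours directly.
        del g[a][u], g[b][u], g[u]
        # If a and b were already connected, keep the cheaper connection.
        g[a][b] = g[b][a] = min(wa + wb, g[a].get(b, float("inf")))
    # Drop nodes that ended up without any neighbour.
    return {u: nbrs for u, nbrs in g.items() if nbrs}

level0 = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 2},
    "C": {"B": 2, "D": 3},
    "D": {"C": 3, "E": 1, "F": 1},
    "E": {"D": 1},
    "F": {"D": 1},
}
level1 = contract_degree_two(level0)
print(level1["A"])
```

Here the chain A, B, C, D collapses into one edge from A to D whose cost is the sum of the three original edges. The removed nodes B and C can no longer be used as a source or destination on the higher level, which is why the levels have to be linked back to the original graph.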

3.2

Road Networks

A car navigation system uses a map containing a road network to plan routes. This road network contains all the information needed to plan routes. Information is stored about road segments and about the intersections of road segments. It may include geographical coordinates, the length of the road segment, speed, highway type, street names and so on. A road network or map can be viewed as a multigraph containing nodes and edges. The intersections form the nodes, and the road segments between intersections form the edges. In simple words, nodes represent geographical locations and the edges connecting them represent the roads between those locations. A multigraph is permitted to have multiple edges with the same end nodes. A multigraph G is an ordered pair G = (N, E), where N denotes the finite non-empty set of nodes and E denotes the finite non-empty set of edges.
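A multigraph of this kind can be sketched in a few lines of Python. The class name `RoadMultigraph`, its methods, the coordinates and the edge attributes are all hypothetical and chosen only for illustration. The point is that edges are appended to a list rather than stored under a unique key, so two roads between the same pair of intersections stay distinct.

```python
from collections import defaultdict

class RoadMultigraph:
    """Directed multigraph: parallel edges between the same nodes are allowed."""

    def __init__(self):
        self.coords = {}                     # node id -> (lat, lon)
        self.out_edges = defaultdict(list)   # node id -> [(target, attributes)]

    def add_node(self, node_id, lat, lon):
        self.coords[node_id] = (lat, lon)

    def add_edge(self, u, v, **attributes):
        # Appending (never overwriting) keeps parallel edges distinct.
        self.out_edges[u].append((v, attributes))

    def edges_between(self, u, v):
        return [attrs for target, attrs in self.out_edges[u] if target == v]

g = RoadMultigraph()
g.add_node(1, 52.6399, -1.1079)
g.add_node(2, 52.6412, -1.1051)
g.add_edge(1, 2, length_m=240, highway="residential")
g.add_edge(1, 2, length_m=310, highway="cycleway")
print(len(g.edges_between(1, 2)))
```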


We introduce three road network models. The time-independent road network is the basic model, in which driving times are constant. In the time-dependent road network, time-dependent driving times and costs are incorporated, so daily congestion patterns can be taken into account when planning routes. Extending the time-dependent model with uncertainty in travel times and costs results in the third model, the stochastic time-dependent model.

3.2.1 Time Independent Model

As said earlier, this is the basic model, which does not take time constraints into account. Information on traffic rules such as turn restrictions and one-way roads is taken into consideration. Because one-way roads exist, all edges are directed. The cost of traversing an edge is determined by the cost function We. Traffic rules may prevent drivers from taking a particular turn or from accessing some routes. Traffic rules are incorporated by modelling a cost function on pairs of edges. If a traffic rule forbids a turn between two edges, an infinite cost is assigned to that pair of edges. The rule cost is denoted by Wr. A time-independent road network can therefore be defined as the graph (N, E, We, Wr), where N is the set of vertices, E is the set of edges, We is the cost of traversing an edge and Wr is the rule cost. Rules are used to represent, for instance, forbidden turns in a road graph. A slightly modified A* algorithm can be used to accommodate rule costs. The modified algorithm evaluates edges instead of nodes and can plan optimum routes in a graph with rules [8].

3.2.2 Time Dependent Model

The time-independent model extends to the time-dependent model, in which the cost functions We and Wr depend on the time at which the edge is traversed. In the real world certain properties of a road network change over time: for example, roads may be closed for construction during some period, or the driving time of a route may be longer during rush hours. To incorporate such timing costs, the time-dependent road network is introduced. The node and edge sets of the road graph remain constant over time and only the costs change, because if an edge is not available at some time the rule cost associated with it can be made infinite. We define the time-dependent road network as a road network with a timing variable, Gt = (N, E, wet, wrt, tet, trt), where N is the set of vertices, E is the set of edges, wet is the cost of traversing an edge at time t, wrt is the rule cost at time t, tet is the driving time needed to traverse an edge at time t, and trt is the time associated with a rule at time t.
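The interplay of edge costs, rule costs and time can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the rush-hour windows, the factor of 1.5, the node names and the function names `edge_time` and `route_cost` are hypothetical, and driving time is used as the only cost.

```python
import math

def edge_time(base_minutes, hour):
    """Driving time of an edge entered at the given hour (0-23)."""
    rush = 7 <= hour < 9 or 16 <= hour < 18      # assumed rush-hour windows
    return base_minutes * (1.5 if rush else 1.0)

# Rule costs are defined on pairs of consecutive edges.
# A pair that is not listed costs nothing; a forbidden turn costs infinity.
rule_cost = {(("A", "B"), ("B", "C")): math.inf}

def route_cost(nodes, base, depart_hour):
    """Total driving time of a route given as a node list (inf if forbidden)."""
    total, clock = 0.0, depart_hour * 60.0       # clock in minutes since midnight
    for i in range(len(nodes) - 1):
        edge = (nodes[i], nodes[i + 1])
        if i > 0:
            previous = (nodes[i - 1], nodes[i])
            total += rule_cost.get((previous, edge), 0.0)
        step = edge_time(base[edge], int(clock // 60) % 24)
        total += step
        clock += step
    return total

base = {("A", "B"): 10, ("B", "C"): 10, ("B", "D"): 10}
print(route_cost(["A", "B", "D"], base, 12))
print(route_cost(["A", "B", "D"], base, 8))
print(route_cost(["A", "B", "C"], base, 12))
```

The same route is cheaper at noon than at eight in the morning, and the route that uses the forbidden turn from edge A-B into edge B-C receives an infinite cost, so a route planner would never select it.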

3.3

Pre-Processing

An important feature of a navigation system is the time it takes to compute a route and present it to the user. The road graph used by a navigation system consists of millions of nodes and edges. Although the standard A* algorithm is efficient for planning routes from source to destination, on a large real-world graph it may take too long to compute the path, and it is not fast enough for car navigation systems. To decrease the time taken to compute the shortest path we use pre-processing techniques, which shrink the search graph and thereby reduce the number of nodes and edges to be examined.

We discuss a pre-processing approach that reduces the size of the road graph used for planning routes, while it remains possible to plan optimum routes on the derived graph. The approach is called partitioning and proceeds by creating partitions in the road graph. The road graph is divided into a number of disjoint subgraphs, connected by a boundary graph, before it is given as input to the routing algorithm. Each subgraph consists of a set of nodes and edges, and every node of the road graph is contained in exactly one subgraph. Such a subgraph is called a cell. Cells are formed by highly connected subgraphs and can correspond to a city or a district. The edges connecting cells are called boundary edges, and the end nodes of a boundary edge are boundary nodes. The collection of boundary nodes and boundary edges forms a new graph, called the boundary graph, which is not necessarily connected.
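Given an assignment of nodes to cells, the boundary graph follows directly from these definitions. The following Python sketch assumes that edges are given as node pairs and that a dictionary maps every node to its cell. The function name `boundary_graph` and the example data are hypothetical.

```python
def boundary_graph(edges, cell_of):
    """Split the edges of a partitioned graph into boundary and internal edges.

    edges   : iterable of (u, v) node pairs
    cell_of : dict mapping every node to the id of the single cell containing it
    """
    boundary_edges, boundary_nodes, internal = [], set(), {}
    for u, v in edges:
        if cell_of[u] != cell_of[v]:
            # The edge connects two different cells.
            boundary_edges.append((u, v))
            boundary_nodes.update((u, v))
        else:
            internal.setdefault(cell_of[u], []).append((u, v))
    return boundary_edges, boundary_nodes, internal

edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (3, 6)]
cell_of = {1: "X", 2: "X", 3: "X", 4: "Y", 5: "Y", 6: "Y"}
b_edges, b_nodes, internal = boundary_graph(edges, cell_of)
print(b_edges, sorted(b_nodes))
```

In this example nodes 3, 4 and 6 are boundary nodes, and the two edges leaving cell X form the boundary graph.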


Figure 2.1. Road graph before division into cells.

Figure 2.2. Road graph after division into cells.

Figure 2.3. Resulting boundary graph.

Figures taken from Route Planning Algorithms for navigation [8]


3.3.1 Creating a Search Graph

The previous section forms the basis for creating a search graph. The optimum route between two boundary nodes of a cell is defined as a route edge. When a path needs to be planned through a cell, it simply follows a route edge, since that is the optimum route between the boundary nodes. The optimum routes between every pair of boundary nodes are stored in the map data. When a route needs to be planned between a chosen source and destination, a search graph is created and given as input to the routing algorithm. A search graph consists of the boundary graph, the cells containing the start node and the destination node, and all route edges of all other cells [4].
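Route edges can be pre-computed by running a shortest-path search from every boundary node of a cell. The Python sketch below makes a simplifying assumption that is not taken from the source: the optimum route between two boundary nodes of a cell is searched only inside that cell. The function names `dijkstra` and `route_edges` and the example graph are hypothetical.

```python
import heapq

def dijkstra(adj, source, allowed):
    """Shortest distances from source using only nodes in the set `allowed`."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale queue entry
        for v, w in adj[u].items():
            if v in allowed and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def route_edges(adj, cell_nodes, boundary_nodes):
    """Optimum cost between every ordered pair of boundary nodes of one cell."""
    result = {}
    for b in boundary_nodes:
        dist = dijkstra(adj, b, cell_nodes)
        for other in boundary_nodes:
            if other != b and other in dist:
                result[(b, other)] = dist[other]
    return result

adj = {
    1: {2: 1, 3: 5},
    2: {1: 1, 3: 1, 4: 7},
    3: {1: 5, 2: 1, 4: 1},
    4: {2: 7, 3: 1, 9: 2},   # node 9 lies in a neighbouring cell
    9: {4: 2},
}
print(route_edges(adj, {1, 2, 3, 4}, [1, 4]))
```

For a cell with b boundary nodes this produces up to b(b - 1) route edges, which is one reason why a good partition keeps the number of boundary nodes per cell small.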

Figure 2.4. Search graph for source S and destination D.

As a route edge only stands for an optimum route between boundary nodes, it has to be replaced with the actual road segments once the route has been planned.

3.3.1 Quality of a Partition

Determining the quality of a partition plays a prominent role, as the running time of the route planning algorithm depends on the number of nodes and edges in the search graph. Creating an optimal partition is considered to be an NP-hard problem. Since no efficient algorithm is known that finds optimal solutions to NP-hard problems, two approximation algorithms are used to partition the large graph. The quality of the partition depends on the size of the graph given as input to the routing algorithm, described earlier as the search graph. The main criterion is to minimise the number of boundary edges between cells [Pothen, 1997] and [Falkner, Rendl & Wolkowicz, 1994]. As the number of cells increases, the number of boundary edges increases. We now discuss how the quality of a partition is measured.

The running time of the route planning algorithm depends on the number of nodes and edges of the search graph. To determine the quality of a partition we therefore minimise the number of edges [8]. We can choose to minimise either the maximum or the average number of edges in the search graph over all possible start and destination pairs. The average number of edges in the search graph is equal to the sum of the number of edges in the boundary graph, the average number of route edges in all cells except those containing the start and destination nodes, and the average number of edges in the cells containing the start and destination nodes. The average number of edges is denoted by AE(G, C1, ..., Ck), where (C1, ..., Ck) is the partition and G is the graph. Let n be the number of nodes of G, n_i the number of nodes of cell C_i, m_i the number of internal edges of C_i, r_i the number of route edges of C_i and m_B the number of edges of the boundary graph. If start and destination nodes are drawn uniformly and independently, cell C_i contains neither of them with probability (1 - n_i/n)^2, which gives

$$AE(G, C_1, \ldots, C_k) = \sum_{i=1}^{k} \left\{ \left(1 - \left(1 - \frac{n_i}{n}\right)^2\right) m_i + \left(1 - \frac{n_i}{n}\right)^2 r_i \right\} + m_B$$

The goal is to minimise the average number of edges evaluated by the route planning algorithm. The actual number of evaluated edges in the search graph is very hard to determine, as it depends on various factors such as the planning algorithm, the partition and the partition criterion. Therefore we approximate the average number of evaluated edges.

Figure showing the search area of the route planning algorithm [8].

The A* algorithm evaluates edges in an ellipse-shaped area, with the start and target nodes as the foci of the ellipse. From the distance between the start node and the destination node we can calculate the area of the ellipse. Let SE be the area of the ellipse and SM the area of the entire road graph. The number of edges evaluated by the route planning algorithm can then be estimated by the average number of internal edges of the start and destination cells plus α = SE/SM times the average number of boundary and route edges in the search graph. This leads to the estimated average number of edges evaluated by the route planning algorithm, denoted by EE(G, C1, ..., Ck):

$$EE(G, C_1, \ldots, C_k) = \sum_{i=1}^{k} \left(1 - \left(1 - \frac{n_i}{n}\right)^2\right) m_i + \alpha \left( \sum_{i=1}^{k} \left\{ \left(1 - \frac{n_i}{n}\right)^2 r_i \right\} + m_B \right)$$
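The estimate can be evaluated directly for a candidate partition. The Python sketch below assumes a particular reading of the formula: a cell with n_i of the n nodes contributes its m_i internal edges with probability 1 - (1 - n_i/n)^2 and its r_i route edges with probability (1 - n_i/n)^2, and the route and boundary edges are weighted by α. The function name `estimated_edges` and the example numbers are hypothetical.

```python
def estimated_edges(cells, m_boundary, alpha):
    """Estimated number of evaluated edges for one partition.

    cells      : list of (n_i, m_i, r_i) = nodes, internal edges, route edges
    m_boundary : number of edges in the boundary graph
    alpha      : ratio of the search ellipse area to the map area (SE / SM)
    """
    n = sum(n_i for n_i, _, _ in cells)
    # Internal edges count when the cell holds the start or the destination.
    internal = sum((1 - (1 - n_i / n) ** 2) * m_i for n_i, m_i, _ in cells)
    # Route edges count when the cell holds neither of them.
    route = sum((1 - n_i / n) ** 2 * r_i for n_i, _, r_i in cells)
    return internal + alpha * (route + m_boundary)

cells = [(50, 120, 6), (50, 100, 12)]
print(estimated_edges(cells, m_boundary=4, alpha=1.0))
print(estimated_edges(cells, m_boundary=4, alpha=0.0))
```

With α = 1 the value equals the average number of edges in the search graph. With α = 0 only the internal edges of the start and destination cells remain.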

By minimising EE the best possible partition can be obtained, as confirmed by Van der Horst [2003]. EE is evaluated for various values of α. Choosing α = 1 leads to minimising the number of edges in the search graph, while α = 0 leads to partitions whose cells contain a single node. Smaller values of α lead to smaller partitions [Flinsenberg, Van der Horst, Lukkien & Verriet, 2004].

3.3.2 Algorithms of partitioning

As discussed earlier, the objective of creating a partition is to minimise the estimated number of edges given as input to the routing algorithm. The problem of creating a partition is considered to be NP-hard. The algorithms have to be suitable for partitioning very large road networks containing millions of nodes and edges. Partitioning can be done either by recursively splitting a particular cell into a number of subgraphs, or in reverse, by starting with cells that contain one node each and recursively merging cells in each step. So we have two algorithms for partitioning large road graphs: the splitting algorithm and the merging algorithm.

Splitting Algorithm

As the name conveys, this algorithm recursively splits a cell into a fixed number of parts. The cell with the most nodes is divided into a constant number of subgraphs, not necessarily of equal size. An N-partitioning splitting algorithm splits the subgraph into N parts. For our graph 2-partitioning is preferable, as higher partitioning may result in empty cells, and 2-partitioning is also simpler than 3-partitioning or higher. The best partition is always the one with the minimum estimated number of evaluated edges. A brief description of the algorithm follows. When a cell C is chosen to be divided into two subgraphs, the algorithm starts by creating an empty cell C1 and a cell C2 equal to C. The algorithm works by moving selected nodes from one cell to the other. If cell C2 is the entire road graph, we randomly choose an internal node u from cell C2; otherwise we randomly select a boundary node u belonging to cell C2. The node u is moved to cell C1. We then repeatedly select a node v from cell C2 that is adjacent to a node in cell C1 and move it to cell C1. The node that is moved to cell C1 is selected according to a priority function. Boundary nodes with relatively many adjacent edges are given higher priority, and internal nodes are given zero priority. This algorithm is discussed in detail in [4].

Merging Algorithm

This approach is the opposite of the previous one, which recursively splits the graph into subgraphs. The merging approach recursively merges two cells to form a new cell, which results in a new approximation algorithm called the merging algorithm. It is explained in detail in [Flinsenberg, Van der Horst, Lukkien & Verriet, 2004]. The merging algorithm is a greedy algorithm that starts with each cell containing exactly one node. In each step it merges two cells into a new cell. This process continues until the graph consists of a single cell. The whole process is called a run. To create good partitions, the cells to merge are chosen according to a priority function. As mentioned earlier, a good partition contains few boundary nodes per cell. For a more detailed explanation refer to page 48 of [4].

3.4 Cartography

This section gives an introduction to map making and serves as a platform for understanding OpenStreetMaps, discussed in the next chapter. Cartography is the study of map making. The advent and spread of computers revolutionised cartography: digital maps came into existence and largely replaced traditional paper maps. In the next few sections we discuss the common terminology of mapping.

Digital mapping is the process of collecting, storing and querying geographical data. It is often associated with a Geographical Information System (GIS), a system that captures, stores, analyses, manages and visualises data about geographical locations. GIS is a merging of cartography and database technology [20]. Geographical data can be represented in two formats, namely raster data and vector data.

Raster data is represented as a two-dimensional grid of numerical values representing some measurable characteristic, where the spatial data surface is projected onto the grid plane. Raster data consists of rows and columns of cells. Each cell stores a single value. Additional values may be used for storing further information about the cell.

Vector data can also be called geometric data. Data is represented as geometrical shapes: geographical locations are expressed by points, lines and polygons. This type of data is widely used in mapping applications. Vector data can be classified as spatial data and non-spatial data. Geographical databases hold huge amounts of data and hence should be maintained efficiently. Dedicated databases came into existence to maintain geographical data. We discuss spatial databases in the next section.

3.4.1 Spatial databases

A spatial database is a database that is optimised to store and query geographical data. It offers spatial data types in its data model and query language, and it provides spatial indexing and spatial joins. Spatial data types include points, lines and polygons. Oracle Spatial and PostgreSQL with the PostGIS extension are examples of databases that provide geospatial functions.

3.4.2 Projection techniques

The earth is approximately spherical in shape, so it has to be projected in order to be displayed on a 2D surface. This is where projection techniques come in: by simple or complex mathematics, depending on the type of projection, the spherical earth can be projected onto a 2D surface. There are different projection techniques; some of them are the Mercator projection, Plate Carrée (the equirectangular projection) and the Miller projection.
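As an illustration of how simple or complex the mathematics can be, the following Python sketch implements two of the projections on a spherical earth model. The radius constant is the WGS 84 equatorial radius commonly used for spherical Mercator. The function names `equirectangular` and `mercator` and the sample coordinate are chosen only for this example.

```python
import math

EARTH_RADIUS_M = 6378137.0   # WGS 84 equatorial radius in metres

def equirectangular(lat_deg, lon_deg):
    """Plate Carrée: x and y are simply scaled longitude and latitude."""
    return (EARTH_RADIUS_M * math.radians(lon_deg),
            EARTH_RADIUS_M * math.radians(lat_deg))

def mercator(lat_deg, lon_deg):
    """Spherical Mercator: y is stretched as latitude approaches the poles."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

lat, lon = 52.6399567, -1.1078991    # a node in Leicester
print(equirectangular(lat, lon))
print(mercator(lat, lon))
```

Both projections give the same x coordinate. The Mercator y coordinate grows faster than the equirectangular one away from the equator, which is the well-known stretching of areas at high latitudes.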


Chapter 4

4 Routing Services

There is a wide range of routing applications available online, offline or as stand-alone devices. Navigation systems designed for automobiles acquire the position of the user through a GPS signal. Using the road database, the unit can give directions to the user and guide them to the destination. Some of the routing services are described below.

4.1 Tom-tom

Tom-tom is a large international company that offers stand-alone navigation devices. Their devices are among the most popular ones, mainly because of the intuitive user interface, the speed with which they calculate routes and the accuracy of their data. The devices can calculate routes by car, by bike or on foot. Unfortunately, their routing algorithm and data source are not open to the public. Recently Tom-tom has released an online version of their route planner as well; however, this version misses many of the features the stand-alone device offers. Among other things, it lacks an option to plan routes by bike or on foot. Both the online version and the stand-alone version are available in Dutch.

4.2 Google Maps

Google has recently started their own online map and routing service, named Google Maps. It is a very fast, free service. It is obvious that Google has performed some kind of pre-processing or caching to make their routing service this fast, but details of their routing algorithm stay within the company. Google Maps is available in Dutch, but only offers routes by car and on foot.

4.3 Open Street Maps

OpenStreetMap is an open initiative to create and provide free geographical data such as street maps to anyone who wants them. The OpenStreetMap Foundation is an international non-profit organisation supporting this project. There are many offline, embedded and web-based routing services using OpenStreetMap data. As OpenStreetMap data is used in this project, we will focus more on it.


4.3.1 OSM Data

OpenStreetMap creates and provides free geographic data such as street maps to anyone who wants them. The project was started because most maps you think of as free actually have legal or technical restrictions on their use, holding back people from using them in creative, productive, or unexpected ways. In more detail, people gather location data with GPS devices or from free satellite imagery, upload it, and add names and other tags. The mapping service works as a five-step process:

1. Collect data
2. Upload data
3. Create/edit OSM data
4. Label data and add details
5. Use and render maps

Open Street Maps provides its data as an OSM XML file, which can be viewed in a text editor. It consists of data related to geographical locations and the roads connecting them, street names, highway type (residential, motorway, footway), one-way status and so on. It also contains time stamps, the user who created the data, the map editor used and the change sets. The database is a key component of OpenStreetMap, because the map data is stored there in its most raw form. Dedicated databases exist which support geospatial data, and all well-known databases have GIS support. Open Street Maps uses PostgreSQL with PostGIS to store the data uploaded by users. The PostGIS extension for PostgreSQL is often used for geographic data, as it adds geospatial functions. Open Street Maps also provides tools to populate databases with OSM data, including osm2pgsql and osmosis. These tools are capable of creating a dedicated spatial database with map data. As discussed earlier, projection techniques are needed to map the spherical surface of the earth onto a 2D surface. For this, OSM uses the Mercator and Plate Carrée projections.

Chapter 5

5. Design and Implementation

We work with real map data provided by Open Street Maps to perform the actual routing. As said earlier, our main challenge is to structure the data provided by Open Street Maps in a database. To implement a routing algorithm we need to construct a graph structure.

5.1 Design

We tackle this problem in a series of stages, in which we structure the data provided by OSM in a database, view it as a graph data structure and then implement a route planning algorithm. We can see this as four stages:

1. Structuring data in the database.
2. Constructing the graph data structure (graph abstract data structure).
3. Actual implementation of the planning algorithm.
4. Pre-processing techniques.

5.1 Structuring data in database

The raw data provided by OpenStreetMaps should be structured before we can actually use it for graph traversal. Structuring the data includes removing dumb nodes and null ways, which are of no use to us and will yield bad results if we allow them to remain. Proper structuring of the data helps the performance of the algorithm and therefore of the application. Data is retrieved as and when needed, so retrieval should not take long, as it affects performance. OSM maps are made up of simple elements, namely nodes, ways and relations. Each element may have an arbitrary number of properties, which are key-value pairs.

5.1.1 Node: A node is the basic element of the OSM data structure. Nodes correspond to a geographical location with latitude and longitude entries. Each node is identified by a unique node id.


<node id="26508924" lat="52.6399567" lon="-1.1078991" user="morwen" uid="2851" visible="true" version="1" changeset="235220" timestamp="2007-03-12T21:24:32Z"/>

Though the node has a few more attributes (uid, visible, changeset and so on), they are of no interest to us.

5.1.2 Way: A way is an ordered interconnection of at least two nodes that describes a linear feature such as a street, an area, or any other physical feature. Each way has a unique way id and tags, which are key-value pairs with no key occurring twice. Name, highway and oneway are the key tags.

<way id="4344944" user="morwen" uid="2851" visible="true" version="3" changeset="148447" timestamp="2008-02-18T17:53:22Z">
  <nd ref="26462982"/>
  <nd ref="26462983"/>
  <nd ref="26462984"/>
  <nd ref="26462985"/>
  <nd ref="26462986"/>
  <nd ref="197707558"/>
  <nd ref="26462987"/>
  <nd ref="197707724"/>
  <nd ref="26462988"/>
  <nd ref="26462989"/>
  <nd ref="248188357"/>
  <nd ref="26462990"/>
  <nd ref="26462991"/>
  <nd ref="26462992"/>
  <nd ref="26462993"/>
  <nd ref="26462994"/>
  <nd ref="26462995"/>
  <nd ref="26462996"/>
  <tag k="created_by" v="JOSM"/>
  <tag k="highway" v="residential"/>
  <tag k="name" v="Evington Drive"/>
</way>
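Elements of this form can be read with a standard XML parser. The following Python sketch uses the standard library module xml.etree.ElementTree and turns nodes and ways into flat rows that are easy to load into database tables. It is an illustration only and not the parser written for this project: the function name `parse_osm`, the row layouts and the coordinates in the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened sample document; the coordinates are made up for illustration.
OSM_SAMPLE = """<osm version="0.6">
  <node id="26462982" lat="52.6201" lon="-1.0991"/>
  <node id="26462983" lat="52.6203" lon="-1.0987"/>
  <way id="4344944">
    <nd ref="26462982"/>
    <nd ref="26462983"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Evington Drive"/>
  </way>
</osm>"""

def parse_osm(xml_text):
    """Return (nodes, ways, way_nodes) as lists of plain tuples."""
    root = ET.fromstring(xml_text)
    nodes = [(n.get("id"), float(n.get("lat")), float(n.get("lon")))
             for n in root.iter("node")]
    ways, way_nodes = [], []
    for w in root.iter("way"):
        tags = {t.get("k"): t.get("v") for t in w.iter("tag")}
        if "highway" not in tags:
            continue                     # keep only ways that describe roads
        ways.append((w.get("id"), tags.get("name"), tags["highway"],
                     tags.get("oneway", "no")))
        # The sequence number preserves the order of nodes along the way.
        for seq, nd in enumerate(w.iter("nd")):
            way_nodes.append((w.get("id"), seq, nd.get("ref")))
    return nodes, ways, way_nodes

nodes, ways, way_nodes = parse_osm(OSM_SAMPLE)
print(ways)
```

Storing a sequence number with every way-node row matters because a way is an ordered list: edges can only be created between consecutive nodes.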

5.1.3 Relation: A relation is a group of objects in which each object may take on a specific role. It is used to specify relationships between objects. A relation can group other elements together: nodes, ways and even other relations. Elements are members of the relation and each member has a role. A relation may have an arbitrary number of tags.

When we actually implement the algorithm, it performs several iterations and retrieves an arbitrary amount of data in each iteration. Well-structured data therefore always helps the running time of the algorithm. We developed a parser that parses an OSM XML file of a location of our preference and populates our database with map data. Our database consists of three tables:

Nodes: maintains the data of each node, i.e. the unique node id and the geographical coordinates of the node.
Ways: maintains information about the streets and roads connecting nodes, i.e. Way_name, Way_id, the type of highway and the oneway information.
Way_nodes: maintains all ways and the nodes making up each way.

As the implementation of the routing service is at an early stage, the relation element is not considered; it can be taken into consideration in further development.

5.2 Constructing the graph data structure (Graph Abstract data structure):

Now we have the raw data modelled into structured tables. We have nodes with geographical coordinates and ways made up of ordered lists of nodes. The challenge is now to put the data into a graph structure. A graph can be modelled as an adjacency list or an adjacency matrix. These two techniques differ in how nodes and edges are maintained. An adjacency list graph maintains a set of nodes, and each node maintains a list of its neighbours. The adjacency list is a space-efficient representation: a graph with V nodes and E edges requires on the order of V + E entries. An adjacency matrix uses an n x n matrix for a graph with n nodes, so it is less space efficient than an adjacency list. We therefore implement the graph as an adjacency list.

5.2.1 Graph Class

The graph class creates the graph structure and has dedicated methods to create each node and each edge. Data is fetched from the database which we created earlier. The MySQL connector provides the connection to the MySQL database.

5.2.2 Creating a Node

The Node class represents a single node in the graph. It holds the node id, the latitude and longitude, the list of neighbouring nodes, and a list of costs giving the weight of the edge from the node to each neighbour.

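class Node
{
    // Unique OSM node id.
    public string Id { get; private set; }

    // Geographical coordinates in degrees.
    public double Latitude { get; set; }
    public double Longitude { get; set; }

    // Adjacent nodes reachable by a single edge.
    public NodeList Neighbors { get; private set; }

    // costs[i] is the weight of the edge to the i-th neighbour.
    private List<double> costs;
}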

5.2.3 Creating an Edge

We can add a directed edge between two nodes. As a way is made up of an ordered list of nodes, we have to create an edge between each pair of successive nodes. The cost of traversing an edge, i.e. the geographical distance between two coordinates (latitude and longitude), can be calculated with the Haversine formula:

R = earth’s radius (mean radius = 6,371 km)
Δlat = lat2 − lat1
Δlong = long2 − long1
a = sin²(Δlat/2) + cos(lat1) · cos(lat2) · sin²(Δlong/2)
c = 2 · atan2(√a, √(1−a))
d = R · c
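The formula translates almost line by line into code. The sketch below is written in Python for brevity, whereas the project's own classes are in C#. The function name `haversine_km` is hypothetical. Angles are given in degrees and converted to radians before the trigonometric functions are applied.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two points given in degrees."""
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dlon / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return radius_km * c

# One degree of longitude along the equator.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))
```

One degree of longitude on the equator corresponds to roughly 111 km, which is a convenient sanity check for an implementation.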


5.3 Implementing Routing Algorithm

A* is one of the least complex routing algorithms, being only slightly more complex than Dijkstra's algorithm. The algorithm uses two data structures to maintain the closed list and the open list. The closed list is a hash set, while the open list is a priority queue.

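// Requires: using System.Collections.Generic;
// PriorityQueue, Path and distanceCalculator are project classes, not part of the framework.
static public Path<Node> FindPath(Node start, Node destination, Graph graph)
{
    var closed = new HashSet<Node>();                      // nodes already expanded
    var queue = new PriorityQueue<double, Path<Node>>();   // open list ordered by F cost
    queue.Enqueue(0, new Path<Node>(start));

    while (!queue.IsEmpty)
    {
        // Take the partial path with the lowest estimated total cost.
        var path = queue.Dequeue();
        if (closed.Contains(path.LastStep))
            continue;
        if (path.LastStep.Equals(destination))
            return path;
        closed.Add(path.LastStep);

        foreach (Node n in path.LastStep.Neighbors)
        {
            // G cost: distance travelled so far plus the new edge.
            double d = distanceCalculator.Distance(path.LastStep, n);
            var newPath = path.AddStep(n, d);
            // F cost: G cost plus the straight-line estimate to the destination.
            queue.Enqueue(newPath.TotalCost + distanceCalculator.Distance(n, destination), newPath);
        }
    }
    return null;   // no route exists
}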

A priority queue is chosen because every element in it has a priority associated with it. Every time we dequeue, it returns the least expensive path found so far. The Path class maintains the nodes of the path explored in each iteration and keeps track of the previous steps and the total cost involved. The distance between two nodes is calculated by the distance calculator class using the Haversine formula.
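The role of the priority queue can also be seen in a compact, self-contained sketch. The version below is written in Python with the standard heapq module and is not the project's C# code. It assumes planar coordinates and a straight-line heuristic instead of the Haversine distance. The function name `a_star` and the example graph are hypothetical.

```python
import heapq
import math

def a_star(neighbours, coords, start, goal):
    """Return (cost, path) of a cheapest route, or (inf, []) if none exists.

    neighbours : {node: {neighbour: edge_cost}}
    coords     : {node: (x, y)} used for the straight-line heuristic
    """
    def h(n):
        return math.dist(coords[n], coords[goal])

    open_heap = [(h(start), 0.0, start, [start])]   # entries are (F, G, node, path)
    closed = set()
    while open_heap:
        _, g, node, path = heapq.heappop(open_heap)  # lowest F cost first
        if node == goal:
            return g, path
        if node in closed:
            continue
        closed.add(node)
        for nxt, cost in neighbours[node].items():
            if nxt not in closed:
                g2 = g + cost
                heapq.heappush(open_heap, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return math.inf, []

coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
neighbours = {
    "A": {"B": 1.0, "C": 1.5},
    "B": {"A": 1.0, "C": 1.0, "D": 1.45},
    "C": {"A": 1.5, "B": 1.0, "D": 1.0},
    "D": {"B": 1.45, "C": 1.0},
}
cost, path = a_star(neighbours, coords, "A", "D")
print(cost, path)
```

Because no edge in the example is shorter than the straight line between its end points, the heuristic never overestimates, and the first time the destination is taken from the queue its path is a cheapest one.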


Chapter 6

6. Conclusions and Further Work

6.1 Discussion

Routing for OpenStreetMaps is derived from the combination of route planning algorithms for car navigation (the thesis report submitted by Ingrid C.M. Flinsenberg [4]) and Open Street Maps. It is an implementation of the research conducted by Flinsenberg on real map data provided by Open Street Maps. This project grew out of a paradigm named planning as model checking, the paradigm I am interested in. Planning as model checking is a technique that synthesises a plan from a formal model of a domain. Given the planning domain, a plan guides the state transition model towards the goal by issuing a series of actions. In the process of choosing a planning domain, a route planning system was considered as a domain to work on, and the thesis report mentioned above was taken as its basis. Route planning algorithms for car navigation and the pre-processing techniques used for planning a route on a large road network seemed particularly interesting. Hence I decided to implement them with real map data provided by Open Street Maps, which led to the project Routing for Open Street Maps.

This project is aimed at using the free data provided by Open Street Maps and implementing the algorithms and techniques mentioned in the thesis report. It populates a MySQL database with the map data, and the router uses the uploaded map data to create the graph structure. OSM differs from most established GIS software and data formats. Geographical Information Systems (GIS) play a prominent role in mapping projects. Dedicated databases have been developed to support geographical data. Oracle Spatial and PostgreSQL with the PostGIS extension are some of those which support spatial data features. Though I have not built a spatial database for the routing service, building one would make routing more precise and faster. Spatial databases support geographical data types such as points, lines and polygons.

The current routing application returns a list of nodes which make up the shortest path between two geographical locations. Displaying the world map through integrated graphics has not been considered during the project period. The map data can be rendered and displayed by rendering tools provided by Open Street Maps. Kosmos, Osmarender and Mapnik are among these rendering tools [8].

Although the current application can be made into a stand-alone routing service, either online or offline, it can still be linked with the planning as model checking paradigm. We should be able to model a road network as a state transition system and express the requirements in linear temporal logic. Requirements could be anything like choosing a quantified route, i.e. a route whose measured length falls between specified limits of distance, or a route which has an accident risk rate of less than one. This concept of specifying quantitative limits on requirements is called graded modalities. Thus it could lead to a new standard.

6.2 Future Work

The current implementation can only perform routing on the Open Street Map data. There is definitely much more to do, as discussed above. Spatial databases can be built with data derived from OSM. We call it derived data because OSM data is converted into spatial coordinates through projection techniques. There are already well-developed tools that can populate a database with spatial data. Osm2pgsql is one of the tools capable of loading data into a PostGIS database.

Due to time constraints, visualisation of the map data was not considered during the project period. This can be done using OpenLayers, a JavaScript library for displaying map data in web browsers. OpenLayers is used to deploy a slippy map, a web interface for browsing and displaying Open Street Map data. As mentioned in the previous section, there exist map rendering tools which can render OSM data. Mapnik is an open-source toolkit for rendering slippy maps.

We discussed pre-processing techniques which could reduce the computation time for calculating the shortest path between any pair of geographical locations. In section 3.3, we discussed creating route edges in each cell between boundary nodes, each being the optimum route for traversing the cell. Unfortunately, Open Street Maps data has no artifacts for supporting route edges. So once one of the partitioning algorithms has been implemented successfully, we could derive a route element similar to the node element and the way element in the OSM data. This could look something like the element shown below.


<route id="23242863">
  <routenode ref="251490798" />
  <routenode ref="251490797" />
  <routenode ref="251490799" />
  <routenode ref="251490800" />
  <routenode ref="251490801" />
  <routenode ref="251490802" />
  <routenode ref="251490803" />
  <routenode ref="251490804" />
  <routenode ref="251490805" />
  <routenode ref="251490806" />
  <routenode ref="251490807" />
  <tag k="Total Distance" v="456" />
  <tag k="distance type" v="miles" />
</route>
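As a rough illustration of how such a route element might be derived once a cell and its boundary nodes are known, the following Python sketch runs Dijkstra's algorithm [1] between two boundary nodes of a small, made-up cell and serialises the result as a route element. The cell graph, its edge weights, the function names shortest_path and route_element, and the choice of miles are assumptions made for illustration only; they are not part of the current implementation or of the OSM data model.

```python
import heapq
import xml.etree.ElementTree as ET


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph.

    Returns (distance, [node ids]) or (inf, []) if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(queue, (nd, neighbour))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


def route_element(route_id, distance, path):
    """Build a hypothetical <route> element for a pre-computed path."""
    route = ET.Element("route", id=str(route_id))
    for node_id in path:
        ET.SubElement(route, "routenode", ref=str(node_id))
    ET.SubElement(route, "tag", k="Total Distance", v=f"{distance:g}")
    ET.SubElement(route, "tag", k="distance type", v="miles")
    return route


# Hypothetical cell: node id -> {neighbour id: distance in miles}.
cell = {
    251490798: {251490797: 1.5, 251490799: 4.0},
    251490797: {251490798: 1.5, 251490799: 2.0},
    251490799: {251490797: 2.0, 251490798: 4.0},
}

distance, path = shortest_path(cell, 251490798, 251490799)
print(ET.tostring(route_element(1, distance, path), encoding="unicode"))
```

In a real pre-processing step this would be repeated for every pair of boundary nodes in every cell, so that a query only needs to search within the source and destination cells and across the stored route edges.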

The route element can be extended by maintaining a data table for routes in our database.
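One way such a table could look is sketched below, using Python's built-in sqlite3 module with an in-memory database as a stand-in for the project database. The table and column names (routes, route_nodes, total_distance, distance_type, seq) and the helper functions are assumptions chosen to mirror the route element; the actual schema would have to fit the existing node and way tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE routes (
        id             INTEGER PRIMARY KEY,
        total_distance REAL NOT NULL,
        distance_type  TEXT NOT NULL
    );
    CREATE TABLE route_nodes (
        route_id INTEGER NOT NULL REFERENCES routes(id),
        seq      INTEGER NOT NULL,   -- position of the node along the route
        node_ref INTEGER NOT NULL,   -- id of the referenced OSM node
        PRIMARY KEY (route_id, seq)
    );
    """
)


def store_route(conn, route_id, node_refs, total_distance, distance_type="miles"):
    """Insert one route and its ordered node references in a single transaction."""
    with conn:
        conn.execute(
            "INSERT INTO routes VALUES (?, ?, ?)",
            (route_id, total_distance, distance_type),
        )
        conn.executemany(
            "INSERT INTO route_nodes VALUES (?, ?, ?)",
            [(route_id, seq, ref) for seq, ref in enumerate(node_refs)],
        )


def load_route(conn, route_id):
    """Return the node references of a route in travel order."""
    rows = conn.execute(
        "SELECT node_ref FROM route_nodes WHERE route_id = ? ORDER BY seq",
        (route_id,),
    ).fetchall()
    return [row[0] for row in rows]


store_route(conn, 23242863, [251490798, 251490797, 251490799], 456)
print(load_route(conn, 23242863))
```

Keeping the node sequence in a separate table preserves the order of the routenode entries, which matters because a route is a path rather than a set of nodes.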


Bibliography

[1] E. Dijkstra, “A note on two problems in connexion with graphs,” in Numerische Mathematik, vol. 1, 1959, pp. 269–271.
[2] P. E. Hart, N. J. Nilsson and B. Raphael, “A Formal Basis for the Heuristic Determination of Minimum Cost Paths,” in IEEE Transactions on Systems Science and Cybernetics, vol. SSC-4, no. 2, 1968, pp. 100–107.
[3] U. Pape, “Implementation and efficiency of Moore-algorithms for the shortest route problem,” in Mathematical Programming, vol. 7, 1974, pp. 212–222.
[4] L. R. Ford Jr. and D. R. Fulkerson, Flows in Networks. Princeton, New Jersey, United States: Princeton University Press, 1962.
[5] R. Bellman, “On a routing problem,” in Quarterly of Applied Mathematics, vol. 16, 1958, pp. 87–90.
[6] R. W. Floyd, “Algorithm 97: Shortest path,” in Communications of the ACM, vol. 5, 1962, p. 345.
[7] OpenStreetMap [online], April 2010, www.openstreetmap.org
[8] I. C. M. Flinsenberg, Route planning algorithms for car navigation. Eindhoven: Technische Universiteit Eindhoven, 2004. alexandria.tue.nl/extra2/200412420.pdf
[9] T. H. Cormen, C. E. Leiserson, R. L. Rivest, Introduction to Algorithms. The MIT Press/McGraw-Hill, 1990.
[10] Microsoft.com, “From Trees to graphs” [online], April 2010, http://msdn.microsoft.com/en-us/library/ms379574(VS.80).aspx
[11] Leniel.net, A star implementation, April 2010, http://www.leniel.net/2009/06/astarpathfinding-search-in-csharp.html
[12] A. V. Goldberg and C. Harrelson, “Computing the shortest path: A* search meets graph theory,” in Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2005, pp. 156–165.
[13] OpenStreetMap, “Beginners guide”, http://wiki.openstreetmap.org/wiki/Beginners_Guide_1.5
[14] OSM based routing services [online], April 2010, http://wiki.openstreetmap.org/wiki/List_of_OSM_based_Services
[15] http://www-cs-students.stanford.edu/~amitp/gameprog.html, “Greedy Best-first-search algorithm”
[16] J. Pearl, Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley, 1984, p. 48.
[17] R. E. Korf, “Depth-first iterative-deepening: an optimal admissible tree search,” in Artificial Intelligence, vol. 27, 1985, pp. 97–109.
[18] S. Russell, “Efficient memory-bounded search methods,” in Proceedings of the 10th European Conference on Artificial intelligence, 1992, pp. 1–5.
[19] Variations of A-star [online], May 2010, http://theory.stanford.edu/~amitp/GameProgramming/Variations.html
[20] P. Sanders and D. Schultes, “Engineering highway hierarchies,” in Proceedings of the 14th European Symposium on Algorithms, 2006, pp. 804–816.
[21] Geographical Information System, May 2010, http://en.wikipedia.org/wiki/Geographic_information_system
