
This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS 1

Accelerated Map Matching for GPS Trajectories


Marko Dogramadzi and Aftab Khan, Member, IEEE

Abstract— The processing and analysis of large-scale journey trajectory data is becoming increasingly important as vehicles become ever more prevalent and interconnected. Mapping these trajectories onto a road network is a complex task, largely due to the inevitable measurement error generated by GPS sensors. Past approaches have had varying degrees of success, but achieving high accuracy has come at the expense of performance, memory usage, or both. In this paper, we solve these issues by proposing a map matching algorithm based on Hidden Markov Models (HMM). The proposed method is shown to be more efficient when compared against a traditional HMM based map matching method, whilst maintaining high accuracy and eschewing any requirements for CPU-intensive and memory-expensive pre-processing. The proposed algorithm offers a method for significantly accelerating transition-probability calculations using instances of high data-availability, which have previously been a large bottleneck in map matching algorithm performance. It is shown that this can be accomplished with the application of road-network segmentation combined with a spatially-aware heuristic. Experiments are performed using two different datasets, with over 9 hours of GPS samples. We show that the proposed framework is able to offer a reduction in run-time of over 90% with no significant effect on the algorithm’s accuracy when compared against the traditional HMM approach.

Index Terms— Map matching, GPS, trajectories.

I. INTRODUCTION

Map matching refers to the problem of matching a series of (potentially inaccurate) raw GPS points to a road network. An example of this problem can be observed in car navigation systems in which it is required to map location coordinates received from the GPS to the car’s actual position on the road-map displayed on the screen. Smartphone mapping software faces the same issue, with a similar requirement of relating the user’s GPS location to the road network when providing a turn-by-turn navigation service.

In addition to the aforementioned applications, map matching is also required for identifying road segments that have been traversed during a journey, retrospectively. Although offline, this process still needs to be conducted in near real-time, for example in the context of traffic-flow analysis [1], where the volume of vehicles travelling on certain roads needs to be ascertained quickly using GPS data from their journeys. Map matching is also required in traverse time estimation [2], where many trajectories need to be rapidly analysed and compared to determine travel times. Due to the ever-increasing number of vehicles on the road, the scale of this type of problem has increased significantly, exacerbating the need to perform map matching even more efficiently.

Moreover, map matching is also required due to the inevitable measurement errors associated with GPS sensors. These errors manifest in different forms such as the Urban Canyon effect, where the presence of tall buildings in an urban environment blocks a number of GPS satellites and there are not enough available satellite signals to estimate the positioning information of a fix [3]. Other errors may occur due to multipath satellite signals that arrive at a receiver via a non-direct path, such as being reflected off high buildings in built-up city areas. Instead of resulting in a lack of fix, these can lead to an inaccurate position being calculated. This is potentially more challenging than simply obtaining no GPS point, as it can lead to the map matching algorithm propagating these errors during the route calculation. Additionally, the sampling rate of the GPS data has a significant impact on the accuracy of map matching algorithms. Historically, it has not been feasible to sample the GPS sensor regularly due to energy and transmission-bandwidth limitations. However, the inclusion of GPS sensors within vehicles and phones with increasingly efficient bandwidth utilisation has allowed GPS sampling at much higher rates. This increased volume of data helps to improve accuracy, although it dramatically increases the computational cost of existing map matching algorithms.

The widespread adoption and increasing popularity of these systems has led to a large amount of research being conducted in this area [4]–[8]. Although some approaches use techniques such as Neural Networks and Fuzzy-Logic [7], the majority of research has been based on the work by Newson [4] that utilises a Hidden Markov Model combined with the Viterbi algorithm.

To address the performance issues, an accelerated map matching algorithm (AMM) is presented in this paper that circumvents the bottlenecks of traditional approaches to map matching whilst maintaining similarly high accuracy. The proposed algorithm uses a weighted Hidden Markov Model approach that depends upon spatially-aware heuristics to accelerate the determination of the traversed road segments. This work builds upon the approach proposed in [4], by taking advantage of instances when data availability is high in order to eschew computationally intensive elements of the algorithm.

Manuscript received August 6, 2019; revised February 19, 2020, June 17, 2020, and October 8, 2020; accepted November 23, 2020. This work was supported in part by the EU REPLICATE (Renaissance of Places with Innovative Citizenship and Technology) project under the aegis of EU’s Horizon 2020 Programme under Grant 691735. This work is based on a previously granted patent [26]. The Associate Editor for this article was J. W. Choi. (Corresponding author: Aftab Khan.)
Marko Dogramadzi is with Toshiba Europe Ltd., Bristol Research and Innovation Laboratory, Bristol BS1 4ND, U.K., and also with the Department of Electrical and Electronic Engineering, University of Bristol, Bristol BS8 1UB, U.K.
Aftab Khan is with Toshiba Europe Ltd., Bristol Research and Innovation Laboratory, Bristol BS1 4ND, U.K. (e-mail: aftab.khan@toshiba-bril.com).
Digital Object Identifier 10.1109/TITS.2020.3046375
1558-0016 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: University of Melbourne. Downloaded on September 28,2021 at 15:20:52 UTC from IEEE Xplore. Restrictions apply.

In Section II, the three main approaches to map matching are discussed. This is followed by an explanation of the operation of Newson’s map matching method, with a description of the Viterbi algorithm that is used to estimate the most likely path through the road network.

In Section III, the exact part of Newson’s algorithm that results in the performance bottleneck is identified. This is followed by the road-map processing stage which is required to ascertain the topographical information used in the final algorithm. Next, it is shown how the performance bottleneck in the baseline approach can be overcome using this topographical information, whilst maintaining a similarly high accuracy. Removing this performance bottleneck leads to a significant decrease in the algorithm’s run-time, which forms the main contribution of this paper. After this, a hybridisation technique is outlined which describes how the proposed algorithm can be seamlessly integrated with Newson’s [4] in an automated way, to dynamically adapt to the availability of data. In order to significantly improve performance when performing map matching on larger road-maps, we also propose a process called ‘Map Stitching’ in this section.

The rest of the paper is organised as follows. Section IV outlines the two datasets used, and the experimental protocol. In Section V, results using the proposed method are shown with comparisons against the baseline, followed by an explanation of how these improvements are related to the data sampling rate. Finally, conclusions are presented in Section VI.

II. BACKGROUND

A. Approaches to Map Matching
There are three main approaches to performing map matching that are reported in the literature: local, incremental, and global. The local approach [9]–[11] simply maps each GPS point to the closest point on the road network. As can be observed when comparing Figures 1a and 1b, there are cases when this approach leads to an incorrect mapping, particularly when there is a GPS point with multiple roads nearby. Therefore, although many map matching methods utilise the distance from a GPS point to its nearest road, this only addresses one part of the overall estimate. Local approaches are rarely used in practice: although they can be very fast and efficient, they are extremely vulnerable to measurement noise.

Fig. 1. Visualisation of the local, incremental, and global approaches to map matching. The point in (a) is the centre point in (b). (c) represents the whole journey.

The incremental approach [12]–[15] is the most prevalent and includes the algorithm proposed by Newson et al. [4] and others that seek to optimise this approach [6], [8]. This approach makes predictions for a given point based on the combination of the point’s own position and the position of the point which precedes it. This means that topological information and travel distance can both be taken into account. The main advantage of using such an approach is in attaining a relatively high accuracy whilst requiring much less processing time and memory than the global approach (described next). An example of this approach is proposed in [4], which provides high accuracy using a GPS data set collected in Washington (also outlined in Section IV-B). This approach can be improved further, particularly in run-time, which forms the main contribution of this paper.

The global approach [16], [17] considers the entire journey, and utilises elements from the local and incremental approaches in order to map the entire journey to a path along the road network using some similarity measures. This avoids the error propagation problem encountered with incremental algorithms, when one mis-matched point leads to the next being in error and so on. This also allows for novel approaches such as supervised learning to be used which take individual driver preferences for certain routes into account [18]. Therefore, the highest accuracy can be achieved by utilising a global method, although the trade-off in performance makes the suitability of such an approach debatable. An interesting example of this approach is the AntMapper algorithm [5], which adopts an ant-colony-based optimisation approach that mimics the transporting actions of ants in nature. This claims accuracy improvements over Newson’s method which range from 2.18% to 11.65% as the sampling rate decreases. It is important to note that no testing was performed on sampling rates above 0.1Hz, as higher sampling rates exponentially increase processing time for the global approach to map matching. The AntMapper paper also claims dramatic performance improvements over Newson’s HMM method [4], reducing run-time by over 80%. However, the authors do not include their pre-processing stage in the run-time calculation, which if included would potentially increase the run-time to even higher than Newson’s method. Therefore, although the global approach can be seen to increase accuracy considerably for lower sampling rates, this comes at a significant expense in performance, especially as the sampling rate increases.

Other extensions have also been proposed for Newson’s [4] algorithm, such as by Goh et al. [19]. However, this suffers from its own computational bottleneck, like [5], requiring pre-processing in the form of training an SVM classifier prior to the algorithm’s deployment.

This paper presents an adaptation of the incremental map matching method of [4] with the aim to significantly improve the run-time performance whilst maintaining the high accuracy associated with it. Newson’s method, used as a benchmark approach in this paper, employs a Hidden Markov Model combined with the Viterbi algorithm to find the most likely path. It will be shown that by taking into account the topology of the road network, the calculations required to perform the Viterbi algorithm can be significantly accelerated in terms of run-time with very little change in the overall accuracy.
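As a rough illustration of the local approach described in Section II-A, the following sketch snaps a GPS point to the nearest point on a set of straight road segments. The function names, the planar (x, y) coordinates in metres, and the two-road example are assumptions made for illustration; they are not taken from [9]–[11].

```python
import math


def closest_point_on_segment(p, a, b):
    """Return the point on the straight segment a-b that is nearest to p."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: both ends coincide
        return a
    # Projection parameter, clamped so the result stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def local_match(p, segments):
    """Local map matching: pick the nearest road point, ignoring history."""
    candidates = [closest_point_on_segment(p, a, b) for a, b in segments]
    return min(candidates, key=lambda q: math.dist(p, q))


# Two hypothetical parallel roads, 5 m apart.
roads = [((0.0, 0.0), (10.0, 0.0)), ((0.0, 5.0), (10.0, 5.0))]
print(local_match((5.0, 2.0), roads))  # snaps to the lower road
print(local_match((5.0, 3.0), roads))  # a 1 m shift flips the match
```

The second call shows why the local approach is vulnerable to noise: with no knowledge of the previous point, a shift of one metre is enough to move the match onto the other road.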


TABLE I
NOTATION

Fig. 2. Hidden Markov model represented by a Bayesian network diagram, where: S=State, O=Observation.

B. Newson’s Method
The map matching method proposed by Newson takes a set of GPS points as an input and then uses two metrics, measurement probability and transition probability, to estimate the most likely route through the road network. The relationship between GPS points and road network positions is modelled using the Hidden Markov Model, while the calculation of the most likely route is performed by the Viterbi algorithm. Both are detailed in the following section.

1) Hidden Markov Model: A Hidden Markov Model is a type of Markov Chain, which is defined as “A stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event” [20]. This is related to the incremental map matching approach, which considers the previous point when matching the current one. A Hidden Markov Model, however, has hidden states which are related (but not directly linked) to observations. In other words, an observation provides information about the state, but does not completely reveal which state it comes from. The current state is also dependent on the previous one, as can be observed in Figure 2.

Figure 2 shows the model used; the states represent the car’s actual location on the road, and the observations are the raw GPS points. The states are related because there is a direct link between a car’s current location and its previous location. Each observation is related to its state, but because there is noise in the measurement it is impossible to be certain about exactly which state it has come from. This is why the states are considered ‘hidden’. In the context of map matching, a ‘state’ is a candidate point on the road network, that is, a possible point to which a raw GPS point may map. A set of candidate points is obtained by finding the closest point on each road, within some threshold radius, to the GPS point. The threshold radius accounts for noise in the GPS measurement. For example, a GPS point on a motorway will have candidate points on the road running in each direction, and it is up to the map matching algorithm to decide which one is correct (e.g. by observing where the previous point is).

2) Viterbi Algorithm: The next core element of Newson’s paper is the use of the Viterbi algorithm. This is an efficient way of finding the most likely sequence of hidden states, or rather the most likely route along the road network. It should be noted that it does not find a global optimum but rather a local one. Finding a global optimum likelihood would be computationally expensive, so the Viterbi algorithm makes the Markov assumption: a state is dependent only on the previous state. In practice, this returns a sequence which is very close to the most likely one, but not guaranteed to be the optimum. The Viterbi algorithm operates using two sets of probabilities, which represent the likelihoods of individual states and the transitions between them:
• Measurement Probability
• Transition Probability

The measurement probability is the likelihood of each candidate point for a GPS point belonging to that point. For example, a candidate point closer to the raw GPS point would have a higher measurement probability than one that is further away. This is intuitive as GPS error can be modelled as a zero-mean Gaussian distribution [21], and so we should expect the correct candidate to be close to the raw GPS point. This can be written as follows:

$$M_p = \frac{1}{\sigma} e^{-\frac{D_e}{\sigma}}$$

where $D_e$ is the Euclidean distance between the raw GPS point and the candidate point, and $\sigma$ is the standard deviation of GPS measurement noise. The value of $\sigma$ was calculated in Newson’s paper to be just over 4 metres.

The transition probability is the probability that each candidate point for the current GPS point comes after each candidate for the previous GPS point. For example, two candidates which lie on the same section of road are more likely than two candidates which lie on parallel roads, as it is very unlikely that the car has jumped from one road to the other across two sequential observations. This is calculated by assuming that the Euclidean distance between two candidates should be similar to the road-network distance between them, with Newson confirming that this fits an exponential distribution with the parameter $\beta$. To take the earlier example, the difference between these distances when two candidates lie on the same section of road is near zero, whereas if they are on parallel roads then the routing distance would be much higher and therefore the difference greater (and probability lower). The transition probability $T_p$ can therefore be expressed as follows:

$$T_p = \frac{1}{\beta} e^{-\frac{D_t}{\beta}}$$

where $D_t = |D_r - D_e|$ is the absolute difference between the road-network distance $D_r$ and the Euclidean distance $D_e$ between the two candidates.

The Viterbi algorithm takes the series of hidden states (candidate points), along with their associated measurement and transition probabilities, as input. The first matched point is found simply as the candidate with the highest measurement probability, as there are no transition probabilities for


this base case. After this, the sets of candidates (for each GPS point) are iterated through and the candidate with the highest combination of measurement and transition probability is selected, considering only the transitions to the previous candidate points as per the Markov assumption.

Fig. 3. Segmentation of a road, where $X_a$ and $X_z$ represent its two end-points, is performed by recursively halving it until the maximum segment length is less than the threshold of 60m. The distance of 140m is used for illustrative purposes only.

Algorithm 1 Accelerated Map Matching
procedure ROUTE DISTANCE CALCULATION
    ▷ For every candidate pair across sets
    for $(c_i^x, c_{i-1}^y)$ in $(C_i, C_{i-1})$ do
        ▷ Start/End coordinates of $c_i^x$ and $c_{i-1}^y$ sections
        $(s_{x1}, s_{x2})$ ← segment of $c_i^x$
        $(s_{y1}, s_{y2})$ ← segment of $c_{i-1}^y$
        if $(s_{x1}, s_{x2}) == (s_{y1}, s_{y2})$ then    ▷ Same Segment
            route_distance ← $D_e(c_j, c_k)$
        else if $s_{x1} == s_{y2}$ then    ▷ Adjacent Segments
            route_distance ← $D_e(c_j, c_k) + s$
        else if $s_{x2} == s_{y1}$ then    ▷ Adjacent Segments
            route_distance ← $D_e(c_j, c_k) + s$
        else    ▷ Non-Adjacent Segments
            route_distance ← $D_e(c_j, c_k) + l$
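The probability expressions of Section II-B and the route-distance rule of Algorithm 1 can be sketched together in Python. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function names, the planar coordinates in metres, the representation of a segment as a (start, end) pair, and the numeric values of σ, β and the large penalty in the example are all hypothetical. Only the small penalty follows the text, which sets it equal to the segment length.

```python
import math


def measurement_probability(d_e, sigma):
    """M_p = (1 / sigma) * exp(-D_e / sigma), as written in Section II-B."""
    return math.exp(-d_e / sigma) / sigma


def transition_probability(d_r, d_e, beta):
    """T_p = (1 / beta) * exp(-D_t / beta) with D_t = |D_r - D_e|."""
    return math.exp(-abs(d_r - d_e) / beta) / beta


def simulated_route_distance(cand_cur, seg_cur, cand_prev, seg_prev,
                             small_penalty, large_penalty):
    """Algorithm 1 for one candidate pair: Euclidean distance plus a penalty."""
    d_e = math.dist(cand_cur, cand_prev)
    (sx1, sx2), (sy1, sy2) = seg_cur, seg_prev
    if (sx1, sx2) == (sy1, sy2):      # same segment
        return d_e
    if sx1 == sy2 or sx2 == sy1:      # adjacent segments share an end-point
        return d_e + small_penalty
    return d_e + large_penalty        # non-adjacent segments


# Hypothetical example: three consecutive 35 m segments of one straight road.
seg_a = ((0.0, 0.0), (35.0, 0.0))
seg_b = ((35.0, 0.0), (70.0, 0.0))
seg_c = ((70.0, 0.0), (105.0, 0.0))
prev_cand = (30.0, 0.0)  # previous candidate, lying on seg_a

for cur_cand, cur_seg in [((34.0, 0.0), seg_a),
                          ((40.0, 0.0), seg_b),
                          ((80.0, 0.0), seg_c)]:
    d_e = math.dist(cur_cand, prev_cand)
    d_r = simulated_route_distance(cur_cand, cur_seg, prev_cand, seg_a,
                                   small_penalty=35.0, large_penalty=1000.0)
    print(cur_seg, transition_probability(d_r, d_e, beta=5.0))
```

Because the penalty is added to the simulated route distance while the Euclidean distance is unchanged, the penalty becomes the value of $D_t$ directly, so the transition probability falls from its maximum on the same segment to a much smaller value for non-adjacent segments.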
III. ACCELERATED MAP MATCHING

A. Accelerating Calculation of $T_p$
The operation of Newson’s map matching algorithm should now be clear, especially with regard to the calculation and usage of the measurement and transition probabilities. To optimise the algorithm, it is necessary to analyse which part is causing a performance bottleneck, or in other words which part is taking up a high proportion of the run-time.

The biggest bottleneck in Newson’s map matching algorithm is the routing function used to find the road-network distance from each of a GPS point’s candidates ($C_i$) to each of the candidates for the point which came before it ($C_{i-1}$). The road-network distance is needed for the calculation of $T_p$ between each pair of candidate points.

The proposed algorithm is able to circumvent the requirement of using the routing function by segmenting each road on the road-map which the GPS points are being matched to. Figure 3 shows how every road is split in half until its segments are shorter than the maximum threshold length, which in this case was chosen to be 60m. It should be noted that this means that the minimum possible segment length is 30m, as a segment just over 60m long will be halved. It should also be noted that every segment is only part of one road, and cannot stretch across multiple roads. The value of 30m was chosen because a car travelling at 70mph would travel approximately this distance in one second. Therefore, at legal driving speeds, the commonly used 1Hz sampling rate would almost always place subsequent GPS points on adjacent segments.

This change in the topology of the roadmap led to the main reasoning behind the Accelerated Map Matching algorithm: using the adjacency of the sections in order to speed up the transition-probability calculations.

For every GPS point from $x_0$ to $x_N$, the transition probability needs to be calculated for every possible pairing of points from $C_i$ to $C_{i-1}$, which requires the road-network distance to be calculated between each such pairing. It should be noted that this volume of routing calculations is the main reason why Newson’s algorithm takes so long to run. The proposed algorithm is based on the method of simulating this route-distance calculation given in Algorithm 1.

Algorithm 1 provides a method for quickly obtaining a ‘simulation’ of the actual routing distance which combines the fast Euclidean Distance function with the adjacency information of the segments on which the candidate points lie. The intuition behind this is that candidate points on the same segment will have a very similar routing and Euclidean distance, as they are close together. Adjacent segments will also have this property, although the small penalty applied is useful for reasons to be explained later. If the candidate points are not on the same segment, nor adjacent segments, then the large penalty serves to make this transition unlikely as it will artificially increase the value of $|D_r - D_e|$ and thus lower the transition probability. This method completely obviates the need to find the road-network distance, which is responsible for a large speed improvement. However, this approach does have limitations; if the sampling rate is too low then it is unlikely that neighbouring points will fall on adjacent segments. This causes the proposed algorithm to revert to Newson’s approach. A high sampling rate ensures that the correct candidate point for each GPS point almost always lies on the same or an adjacent segment to the correct candidate for the previous GPS point. This is what allows for the ‘acceleration’ in the Accelerated Map Matching approach; having dense GPS data removes the need for the routing function and speeds up the calculation of the transition probabilities. The extent of this improvement will be analysed in the Results section.
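The recursive halving shown in Fig. 3 can be sketched as follows. This is an illustrative sketch only: it treats a road as a single straight line between its two end-points in planar metre coordinates, whereas real road geometry is usually a polyline, and the function name and data representation are assumptions.

```python
import math

MAX_SEGMENT_LENGTH_M = 60.0  # threshold quoted in the text


def segment_road(start, end, max_len=MAX_SEGMENT_LENGTH_M):
    """Recursively halve the straight road start-end until no piece
    is longer than max_len. Returns a list of (start, end) segments."""
    if math.dist(start, end) <= max_len:
        return [(start, end)]
    mid = ((start[0] + end[0]) / 2.0, (start[1] + end[1]) / 2.0)
    return segment_road(start, mid, max_len) + segment_road(mid, end, max_len)


# The 140 m road used for illustration in Fig. 3.
for seg_start, seg_end in segment_road((0.0, 0.0), (140.0, 0.0)):
    print(seg_start, seg_end, math.dist(seg_start, seg_end))
```

For the 140 m example, the road is first halved into two 70 m pieces, each of which still exceeds the threshold and is halved again, giving four 35 m segments. Consecutive segments share an end-point, which is the adjacency information that Algorithm 1 tests for.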


Fig. 4. Diagram showing where the proposed algorithm breaks down. Green circles are GPS points with arrows mapping to correct candidate point. Blue lines represent a segmented road. $L_s$ is the maximum segment length.

B. Hybridisation & Error Correction
The Accelerated Map Matching method is able to accelerate the transition probability calculation when neighbouring GPS points are close together. However, when the points are not close together, the correct candidate points for the current and previous GPS points may no longer lie on the same or adjacent segments. In this case, Accelerated Map Matching would apply the ‘large penalty’ to the transition between the two correct candidates, causing accuracy problems when determining the most likely route. This is illustrated using the nodes connected with a red arrow in Figure 4.

Therefore, it is required that the Accelerated Map Matching algorithm have an automated way of detecting when it should use the accelerated approach to route-distance calculation, and when it should fall back to Newson’s approach which uses the actual routing distance. Figure 4 shows that the distance between sequential GPS points is similar to the distance between their correct candidate points. Therefore, the accelerated approach is run only when the distance between two sequential GPS points is less than the minimum segment length. This ensures that the accelerated approach is only run when the correct candidates are on the same or adjacent segments, which maintains the accuracy of the road-network distance calculation.

This represents a hybridised approach to calculating the route distance, which dynamically adapts itself to the availability of GPS data. If the GPS data for a journey is sparse, Newson’s approach is used. If the journey has certain stretches with dense GPS data, then the acceleration is run for these parts. This allows a large reduction in run-time to be achieved with almost no change in accuracy.

There are certain situations where the accelerated approach causes small matching errors. For example, this may occur when the journey involves making a turn but the GPS points do not track the turn very precisely, such as in Figure 5. This occurs because at an intersection, the candidate points on all roads leading out of the intersection are on an adjacent section to the previous correct candidate. Therefore, the algorithm simply chooses whichever candidate is closest to the GPS point. However, when the GPS point overshoots the turn, the closest candidate is on the wrong road, causing the error as shown in Figure 5. This is corrected by applying a small penalty (equal to the segment length) whenever an adjacent segment is transitioned to, i.e., whenever the current candidate point is on an adjacent segment to the previous candidate point. This has the effect of penalising multiple transitions over the same node, which in this case is the intersection. The image on the right of Figure 5 shows how this achieves the correct matching of points.

Fig. 5. Diagram showing overshoot; left figure shows the incorrect placement of matched points due to the raw GPS points overshooting the turn. Right figure illustrates how a small penalty can fix this. Orange dots are raw GPS points and blue dots are matched points.

C. Map Stitching
The main performance bottleneck identified in Newson’s algorithm was the routing function used to calculate route distance between the candidate points. Once this bottleneck was removed with the Accelerated Map Matching algorithm, a new one became apparent, especially with large/complex road-maps. This new bottleneck occurred during the process of identifying candidate points on the road network; for every GPS point it is necessary to check every segment in the roadmap to see whether it is close enough to map a candidate point onto.

Figure 6 provides a visualisation of this: to check if a segment is within the orange circle where candidate points are mapped onto road segments, it is required to run a distance calculation to every segment in the entire roadmap. If there are many GPS points composing a journey, then the roadmap which encloses them will be very large, and a large number of distance calculations will be required, especially as this distance check needs to be performed for every single GPS point.

Map-stitching is the process of breaking up the sequence of GPS points into smaller groups, fetching the roadmap enclosing only the group currently being matched, and then performing the map matching algorithm as usual on each group. The matched points returned from each group are then appended into a list which contains the matched points for the entire journey.
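The hybrid rule of Section III-B and the grouping step of map-stitching can be sketched as below. This is a hedged illustration, not the authors' code: the function names, the planar metre coordinates, the group size, and the `match_group` callback (which stands in for fetching the enclosing roadmap and running the map matching algorithm on one group) are all hypothetical. The one-point overlap between groups follows the paper's explanation accompanying Fig. 7, where the last point of each group is prepended to the next.

```python
import math


def use_accelerated(prev_gps, cur_gps, min_segment_len):
    """Hybrid rule: accelerate only when consecutive GPS points are closer
    together than the minimum segment length; otherwise fall back to routing."""
    return math.dist(prev_gps, cur_gps) < min_segment_len


def stitch_groups(points, group_size):
    """Split a journey (a list of points) into consecutive groups. Every group
    after the first starts with the last point of the group before it."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups = []
    for start in range(0, len(points), group_size):
        overlap = points[start - 1:start] if start > 0 else []
        groups.append(overlap + points[start:start + group_size])
    return groups


def match_journey(points, group_size, match_group):
    """Match each group separately and append the results into one list.
    match_group is assumed to return one matched point per input point."""
    matched = []
    for index, group in enumerate(stitch_groups(points, group_size)):
        result = match_group(group)
        # Skip the overlap point's match: the previous group already supplied it.
        matched.extend(result if index == 0 else result[1:])
    return matched


print(use_accelerated((0.0, 0.0), (20.0, 0.0), min_segment_len=30.0))
print(stitch_groups(list(range(7)), group_size=3))
```

Carrying the overlap point into the next group gives the first new point of that group a predecessor, so its transition probability can still be computed.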


TABLE II
PARKUS DATASET SUMMARY

TABLE III
WASHINGTON DATASET SUMMARY

Fig. 6. Left shows journey from Washington data. Right shows a single GPS point and the area around it within which candidate points are mapped to segments.

Fig. 7. Visualisation of the map-stitching method, with the roadmaps being mapped to shown using blue boxes.

Figure 7 illustrates the way in which map-stitching operates. Every blue box contains a sub-set of GPS points which compose the journey, and the box itself represents the roadmap which the sub-set is being matched to. It can be seen that the areas of the roadmap which are not contained within a blue box are no longer in consideration as potential roads which may have a candidate point on them. Therefore, the distance to them no longer needs to be calculated, which dramatically reduces the run-time of the algorithm; this is shown in the next section, within Table V.

The blue boxes in Figure 7 overlap slightly because it is necessary to append the last point of each group to the start of the next one. This allows for the transition probability calculation to be performed, and the accuracy of the overall algorithm to remain exactly the same as it would be without map-stitching.

IV. EXPERIMENTS

A. ParkUs Dataset
ParkUs is a smart parking application [22], [23] providing parking availability information to users in order to reduce parking search times. In this work we use an extension of the ParkUs dataset, which contains GPS data from 117 journeys conducted across a trial of the ParkUs mobile application¹ (see Table II). The ParkUs application collects each user’s trip data. Then, parking information is computed using a ‘cruising’ (searching for parking) detection algorithm [22], [23]. The app requires data from a large number of trips which are processed simultaneously. The app suffers from prohibitive delays in map matching these many trips, which results in providing parking information that may be stale by the time all the computations are completed and presented to the users. The collected dataset contains trip data only in the last 400m of each journey, resulting in a large number of relatively short journeys. The GPS data was sampled at 1Hz, the maximum allowed by the Android operating system. The approach presented in this paper solves this problem, with results detailed in the next section.

¹ https://play.google.com/store/apps/details?id=toshiba.parkus

B. Washington Dataset
The Washington Dataset was first presented by Newson et al. [4], and is one of the most widely utilised benchmark data sets for testing map matching algorithms. It contains GPS data from a drive around Seattle, WA, USA using a SiRF Star III GPS chipset with WAAS (Wide Area Augmentation System) enabled. The journey was sampled at 1Hz and contains just over two hours of driving in both challenging inner city environments and the outer suburbs. The total route was 80km long with 7531 data samples containing latitude and longitude pairs. Table III shows an overview of this dataset. Further details regarding the dataset can be found in [4].

V. RESULTS

A. ParkUs Dataset Results
Table IV shows the average run-time of the map matching algorithms across both trials. We used an Intel i5-4670 CPU @ 3.4GHz with 8GB of DDR3 RAM to perform the run-time analysis. It can be seen that the proposed approach offers a run-time reduction of over 95% using this data set. This demonstrates the algorithm’s ability to bring the execution time down from a matter of minutes to a few seconds. Furthermore,

Authorized licensed use limited to: University of Melbourne. Downloaded on September 28,2021 at 15:20:52 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DOGRAMADZI AND KHAN: ACCELERATED MAP MATCHING FOR GPS TRAJECTORIES 7

TABLE IV TABLE V
AVERAGE RUN -T IME OF N EWSON ’ S M ETHOD AND P ROPOSED M ETHOD RUN -T IME OF N EWON ’ S M ETHOD AND P ROPOSED M ETHOD FOR
FOR PARK U S D ATA S ET W ITHOUT M AP -S TITCHING WASHINGTON D ATA S ET

Fig. 8. Proposed method vs Newson’s method: GPS sampling rate vs


algorithm run-time for Washington data set.

Fig. 9. Proposed method vs Newson’s method: GPS sampling rate vs


this has been made possible without creating a large table of accuracy (% of points matched to correct road segment) for Washington data
pre-computed distances as in [6], which keeps the memory set.
usage of the algorithm minimal.
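As an illustration of the map-stitching step of Section III(C), which the Washington results rely on, the following Python sketch splits a journey into groups that share one boundary point and derives a bounding box for each group. It is a minimal sketch under stated assumptions rather than the implementation evaluated in this paper: the names stitch_groups, bounding_box and segment_may_intersect, the fixed group size, and the padding margin in degrees are all hypothetical, and a road segment is simplified to a pair of end points.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]      # (latitude, longitude) in degrees
Segment = Tuple[Point, Point]    # simplified road segment: two end points


@dataclass(frozen=True)
class Box:
    south: float
    west: float
    north: float
    east: float


def stitch_groups(points: Sequence[Point], group_size: int) -> List[List[Point]]:
    """Split a journey into consecutive groups of at most `group_size` new points.

    Every group after the first is prefixed with the last point of the
    previous group, so the transition between the two can still be scored.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups: List[List[Point]] = []
    for start in range(0, len(points), group_size):
        group = list(points[start:start + group_size])
        if groups:
            group.insert(0, groups[-1][-1])  # overlap point shared with previous group
        groups.append(group)
    return groups


def bounding_box(group: Sequence[Point], margin_deg: float) -> Box:
    """Axis-aligned box around a group, padded by an assumed margin in degrees."""
    lats = [lat for lat, _ in group]
    lons = [lon for _, lon in group]
    return Box(min(lats) - margin_deg, min(lons) - margin_deg,
               max(lats) + margin_deg, max(lons) + margin_deg)


def segment_may_intersect(segment: Segment, box: Box) -> bool:
    """Conservative test: does the segment's own bounding box overlap `box`?"""
    (lat1, lon1), (lat2, lon2) = segment
    return (min(lat1, lat2) <= box.north and max(lat1, lat2) >= box.south
            and min(lon1, lon2) <= box.east and max(lon1, lon2) >= box.west)


# Example: seven points heading north, stitched into groups of three.
journey = [(47.60 + 0.01 * i, -122.33) for i in range(7)]
groups = stitch_groups(journey, group_size=3)
boxes = [bounding_box(g, margin_deg=0.001) for g in groups]
```

Each group would then be matched against only the segments that pass the box filter. The filter is conservative: it may keep a segment that does not actually cross the box, but it never discards one that does.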
B. Washington Dataset Results
Table V shows the run-time analysis of map matching for both algorithms, as well as the reductions offered by using the map-stitching approach from Section III(C). The same computing platform as described above is used to perform this analysis. Without map-stitching, it is evident that the accelerated approach offers a much smaller percentage improvement in run-time, as shown in Table V. This is due to the size of the roadmap being matched to; every GPS point checks every single road segment to see if it is close enough to place a candidate point on. This new bottleneck means that even though the accelerated approach is faster, the magnitude of improvement is much lower.
This is why the map-stitching approach was developed; segmenting the GPS points and mapping only parts of the journey at a time means that the map being matched to becomes much smaller, without making any difference to the accuracy. This eliminates a large part of this new bottleneck, and makes the matching of a large number of points much more efficient, as can be observed in the percentage reductions along the bottom of Table V. The combination of map-stitching and the accelerated algorithm reduces the run-time of map matching on the Washington data set from 1974.87 seconds to 49.4 seconds, a percentage reduction of 97.5%.
When both methods use map-stitching, the run-time of the proposed algorithm is 58.21% lower than that of Newson's method, which is a smaller run-time reduction than the one achieved for the ParkUs data set. The reason for this is the size of the road-map being matched to; even with map-stitching, the road-map is much larger in the Washington data set. As described in Section III(C), this increases the overhead caused by checking which road segments the algorithm should consider a candidate point for. The performance of the proposed algorithm is therefore dependent on the size of the road-map. The 95% run-time reduction observed with the ParkUs data set can be achieved only if the road-map is not too large. This can be ensured using the map-stitching method, as detailed in Section III(C).

C. Sampling Rates
Figure 8 shows how the performance of Newson's method and the proposed method varies with the sampling rate of the GPS data. Newson's method speeds up as the sampling rate is reduced. This is because there are fewer points to match, which leads to direct run-time improvements, although there is an accuracy penalty which is discussed below. The proposed method takes slightly longer when the rate is dropped from 1Hz to 0.5Hz. This is because it begins to use Newson's routing-calculation approach more frequently, as a greater number of GPS points are further apart than the segment length. As the sampling rate drops further, the run-times eventually converge, as Newson's method is used for the majority of route calculations. In summary, the proposed method will always be at least as fast as Newson's, but the degree of improvement will always depend on the sampling rate of the data.
Figure 9 illustrates the accuracy drop as the sampling rate is decreased. The term 'accuracy' in this context refers to the percentage of GPS points which have been matched to the correct segment of the road network.
The accuracy drops when the sampling rate is lowered because the previous GPS point is further away and therefore provides less information about the current one. However, the drop between 1Hz and 0.5Hz is minimal. Comparing the results of Figures 8 and 9, it can be deduced that a 0.5Hz sampling rate provides the best trade-off between accuracy and run-time for both approaches.
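The quantities reported in this section can be reproduced with a few lines of Python. The sketch below is illustrative only and rests on assumptions not stated in the paper: lower sampling rates are emulated by keeping every k-th sample of the 1Hz trace, matched and ground-truth road segments are represented by plain string identifiers, and the function names downsample, matching_accuracy and percent_reduction are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def downsample(samples: Sequence[T], source_hz: float, target_hz: float) -> List[T]:
    """Emulate a lower sampling rate by keeping every k-th sample (k = source/target)."""
    ratio = source_hz / target_hz
    step = round(ratio)
    if step < 1 or abs(ratio - step) > 1e-9:
        raise ValueError("target rate must divide the source rate exactly")
    return list(samples[::step])


def matching_accuracy(matched: Sequence[str], truth: Sequence[str]) -> float:
    """Percentage of GPS points matched to the correct road segment."""
    if len(matched) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(m == t for m, t in zip(matched, truth))
    return 100.0 * correct / len(truth)


def percent_reduction(before_s: float, after_s: float) -> float:
    """Run-time reduction, in percent, relative to the original run-time."""
    return 100.0 * (before_s - after_s) / before_s


# A 1Hz trace reduced to 0.5Hz keeps every second sample.
print(downsample(list(range(10)), 1.0, 0.5))                        # [0, 2, 4, 6, 8]
# Three of four points on the correct segment.
print(matching_accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"]))  # 75.0
# Washington data set: 1974.87 s down to 49.4 s.
print(round(percent_reduction(1974.87, 49.4), 1))                   # 97.5
```

The last line checks the 97.5% figure quoted for the combination of map-stitching and the accelerated algorithm.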


Fig. 10. Proposed method (left) showing correct matching at the parallel roads compared against Newson’s method (right). Orange represents the ground
truth and blue is the estimated match.

Fig. 11. Proposed method (left) showing incorrect matching at the roundabout compared against Newson’s method (right). Orange and blue represent the
ground truth and estimated matches respectively.

Further analysing the results, in Figure 10, we compare the performance of the proposed method against Newson's approach [4]. It can be seen that the proposed method matches the GPS points to the road network more accurately than Newson's method in this scenario. This is due to Newson's method propagating an error that was made on one match. Such errors are inherently avoided in the proposed method due to the penalisation of multiple transitions over the same intersection, as demonstrated in Figure 5. In Figure 11, it can be seen that the proposed method does not achieve the same accuracy as Newson's in this scenario. This is due to difficulties stemming from the segmentation of the roundabout 'road', which leads to the algorithm not performing as expected.

VI. CONCLUSION AND FUTURE WORK
In this paper, a new method for map matching was proposed for efficiently mapping GPS trajectories onto a road network. This was achieved by leveraging the higher sampling rate required by the majority of novel applications (including our test dataset) to accelerate the routing-distance calculations. These calculations were a known computational bottleneck in the performance of previous algorithms. We extensively evaluated the proposed approach and, combined with the map-stitching method, demonstrated the reduction in the total run-time of the open source Washington data to be approximately one-sixth of the time taken by our baseline algorithm. We also showed that the proposed algorithm provides an average run-time reduction of 95.40% using our smart parking dataset (ParkUs), whilst maintaining a similar level of accuracy compared against the baseline.
The performance of both the proposed and benchmark map-matching methods relies on two factors that we identified in this paper: a) road-map size, i.e., if the road-map size (or road-map density) is large, run-time performance degrades; and b) GPS sampling rate, where lower rates slightly reduce the map-matching accuracy. For the former we proposed a map-stitching approach in which we divide the journey into shorter sets of GPS points. We then match each set against only the corresponding subsection of the road-map, which improves the computational time by 97% on a widely used benchmark dataset that has over 2 hours of data for a single journey. For sampling rates, based on our evaluation, less than 1% performance degradation was observed for 0.1Hz when compared against 1Hz. Although nearly all mobile devices support GPS sampling rates of over 1Hz, we believe that sampling rates under 0.1Hz might cause further performance degradation and are therefore not recommended for map-matching.
In this paper, we assumed a Gaussian distribution for GPS errors [21]. As future research, other, more realistic error distributions that are more applicable for modeling GPS errors may be considered. This may help in more adverse conditions where GPS errors are severe (e.g., due to tall buildings or loss of signal). Such errors may also be handled by using other forms of sensing information, such as dead reckoning [24], and using the estimated position as an input for our map matching algorithm. Other methods such as [25] may also be used in conjunction with the proposed method for further optimisation based on the optimal sampling rate for a given set of points. Moreover, we also observed that segmentation of roads in specific scenarios, such as when traversing a roundabout, leads to incorrectly matched GPS points. This may be resolved by handling segmentation of roundabouts either via reducing the segment lengths or dynamically handling this based on the vehicle speed.

REFERENCES
[1] A. Hofleitner, R. Herring, P. Abbeel, and A. Bayen, “Learning the dynamics of arterial traffic from probe data using a dynamic Bayesian network,” IEEE Trans. Intell. Transp. Syst., vol. 13, no. 4, pp. 1679–1693, Dec. 2012.
[2] Y. Wang, Y. Zheng, and Y. Xue, “Travel time estimation of a path using sparse trajectories,” in Proc. 20th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining (KDD), Aug. 2014, pp. 25–34.
[3] Y. J. Cui and S. S. Ge, “Autonomous vehicle positioning with GPS in urban canyon environments,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), vol. 2, May 2001, pp. 1105–1110.
[4] P. Newson and J. Krumm, “Hidden Markov map matching through noise and sparseness,” in Proc. 17th ACM SIGSPATIAL Int. Conf. Adv. Geographic Inf. Syst. (GIS), Nov. 2009, pp. 336–343.
[5] Y.-J. Gong, E. Chen, X. Zhang, L. M. Ni, and J. Zhang, “AntMapper: An ant colony-based map matching approach for trajectory-based applications,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 2, pp. 390–401, Feb. 2018, doi: 10.1109/TITS.2017.2697439.
[6] C. Yang and G. Gidófalvi, “Fast map matching, an algorithm integrating hidden Markov model with precomputation,” Int. J. Geographical Inf. Sci., vol. 32, no. 3, pp. 547–570, Mar. 2018.
[7] M. Pashaian and M. Mosavi, “Accurate intelligent map matching algorithms for vehicle positioning system,” Int. J. Comput. Sci., vol. 9, no. 2, pp. 114–118, 2012.
[8] J. Xu, N. Ta, C. Xing, and Y. Zhang, “Online map matching algorithm using segment angle based on hidden Markov model,” in Proc. 14th Web Inf. Syst. Appl. Conf. (WISA), Nov. 2017, pp. 50–55.
[9] C. E. White, D. Bernstein, and A. L. Kornhauser, “Some map matching algorithms for personal navigation assistants,” Transp. Res. Part C: Emerg. Technol., vol. 8, nos. 1–6, pp. 91–108, Feb. 2000. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0968090X00000267
[10] G. Taylor, G. Blewitt, D. Steup, S. Corbett, and A. Car, “Road reduction filtering for GPS-GIS navigation,” Trans. GIS, vol. 5, no. 3, pp. 193–207, Jun. 2001. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/1467-9671.00077
[11] H. Wu, W. Sun, and B. Zheng, “Is only one gps position sufficient to locate you to the road network accurately?” in Proc. ACM Int. Joint Conf. Pervas. Ubiquitous Comput., New York, NY, USA, Sep. 2016, pp. 740–751, doi: 10.1145/2971648.2971702.
[12] N. R. Velaga, M. A. Quddus, and A. L. Bristow, “Developing an enhanced weight-based topological map-matching algorithm for intelligent transport systems,” Transp. Res. C, Emerg. Technol., vol. 17, no. 6, pp. 672–683, Dec. 2009.
[13] M. Quddus and S. Washington, “Shortest path and vehicle trajectory aided map-matching for low frequency GPS data,” Transp. Res. C, Emerg. Technol., vol. 55, pp. 328–339, Jun. 2015.
[14] M. Hashemi and H. A. Karimi, “A weight-based map-matching algorithm for vehicle navigation in complex urban networks,” J. Intell. Transp. Syst., vol. 20, no. 6, pp. 573–590, Nov. 2016, doi: 10.1080/15472450.2016.1166058.
[15] F. Abdallah, G. Nassreddine, and T. Denoeux, “A multiple-hypothesis map-matching method suitable for weighted and box-shaped state estimation for localization,” IEEE Trans. Intell. Transp. Syst., vol. 12, no. 4, pp. 1495–1510, Dec. 2011.
[16] S. Brakatsoulas, D. Pfoser, R. Salas, and C. Wenk, “On map-matching vehicle tracking data,” in Proc. 31st Int. Conf. Very Large Data Bases (VLDB), 2005, pp. 853–864. [Online]. Available: http://dl.acm.org/citation.cfm?id=1083592.1083691
[17] R. R. Joshi, “A new approach to map matching for in-vehicle navigation systems: The rotational variation metric,” in Proc. IEEE Intell. Transp. Syst., Aug. 2001, pp. 33–38.
[18] M. Xu, Y. Du, J. Wu, and Y. Zhou, “Map matching based on conditional random fields and route preference mining for uncertain trajectories,” Math. Problems Eng., vol. 2015, Sep. 2015, Art. no. 717095, doi: 10.1155/2015/717095.
[19] C. Y. Goh, J. Dauwels, N. Mitrovic, M. T. Asif, A. Oran, and P. Jaillet, “Online map-matching based on hidden Markov model for real-time traffic sensing applications,” in Proc. 15th Int. IEEE Conf. Intell. Transp. Syst., Sep. 2012, pp. 776–781.
[20] L. R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. San Francisco, CA, USA: Morgan Kaufmann, 1990, pp. 267–296.
[21] F. Diggelen. (2007). GPS Accuracy: Lies, Damn Lies, and Statistics. GPS World. [Online]. Available: http://gpsworld.com/gps-accuracy-lies-damn-lies-and-statistics/
[22] P. Carnelli, J. Yeh, M. Sooriyabandara, and A. Khan, “Parkus: A novel vehicle parking detection system,” in Proc. 31st AAAI Conf. Artif. Intell., 2017, pp. 4650–4656. [Online]. Available: http://dl.acm.org/citation.cfm?id=3297863.3297869
[23] M. Jones, A. Khan, P. Kulkarni, P. Carnelli, and M. Sooriyabandara, “ParkUs 2.0: automated cruise detection for parking availability inference,” in Proc. 14th EAI Int. Conf. Mobile Ubiquitous Syst., Comput., Netw. Services, New York, NY, USA, Nov. 2017, pp. 242–251, doi: 10.1145/3144457.3144495.
[24] Z. Tao, Y. Diange, L. Ting, and L. Xiaomin, “Vehicle state estimation system aided by inertial sensors in GPS navigation,” in Proc. Int. Conf. Electr. Control Eng., Jun. 2010, pp. 5793–5796.
[25] A. Khan, N. Hammerla, S. Mellor, and T. Plötz, “Optimising sampling rates for accelerometer-based human activity recognition,” Pattern Recognit. Lett., vol. 73, pp. 33–40, Apr. 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0167865516000040
[26] M. Dogramadzi and A. Khan, “Method and device for accelerated map-matching,” U.S. Patent 10 598 499 B2, Mar. 24, 2020.


Marko Dogramadzi is currently a Graduate Engineer at Siemens plc., having joined after finishing a degree in computer science with electronics, while being sponsored by Siemens through the E3 Academy. His third-year industrial project was related to the development of a scalable smart parking system (ParkUs), which involved creating a high-performance map-matching algorithm (presented in this paper). This method has also been patented with Toshiba. His fourth-year project was related to the creation of a voice-controlled infra-red remote-control, with the goal of creating a home-automation system which can function without Internet connectivity. His work at his current employment in Siemens has been oriented around industrial-cloud technology, web-app development, and project management.

Aftab Khan (Member, IEEE) received the B.Eng. (Hons.) degree in electronic engineering and the Ph.D. degree in machine learning from the University of Surrey, in 2008 and 2013, respectively. He is currently a Principal Research Engineer with the Bristol Research and Innovation Laboratory, Toshiba Europe Limited, U.K. Prior to joining Toshiba in 2015, he worked as a Post-Doctoral Research Associate at Newcastle University, U.K. During his Ph.D., he developed methods for hierarchical analysis of time series data as a part of the EPSRC Project (ACASVA). He developed these concepts further, which led to one of the first papers on automated and generalized skill assessment from body-worn sensor data, for which he (and his coauthors) received the Honorable Mention Award at the 2015 ACM Ubicomp Conference (developed under the EPSRC SiDE Project). As a part of the EU H2020 REPLICATE Project, he and his coauthors developed one of the first methods for automatically detecting the behavior of searching for parking using mobile crowdsensing, with the aim of providing real-time parking information within a smart city. His research agenda is mainly focused on machine learning, artificial intelligence, and pattern recognition, with a particular interest in human behavior analysis through automated activity recognition and IoT sensing.
