You are on page 1of 26

International Journal of

Geo-Information

Article
An Intersection-First Approach for Road Network
Generation from Crowd-Sourced Vehicle Trajectories
Caili Zhang 1 , Longgang Xiang 1 , Siyu Li 1, * and Dehao Wang 2
1 State Key Laboratory of LIESMARS, Wuhan University, 129 Luoyu Road, Wuhan 430079, China;
cailizhang@whu.edu.cn (C.Z.); geoxlg@whu.edu.cn (L.X.)
2 Alibaba, Building No.9 Wangjing East Garden 4th Area, Chaoyang District, Beijing 100000, China;
dehao.wdh@alibaba-inc.com
* Correspondence: lsrain@whu.edu.cn

Received: 29 July 2019; Accepted: 21 October 2019; Published: 24 October 2019 

Abstract: Extracting highly detailed and accurate road network information from crowd-sourced
vehicle trajectory data, which has the advantages of being low cost and able to update fast, is a
hot topic. With the rapid development of wireless transmission technology, spatial positioning
technology, and the improvement of software and hardware computing ability, more and more
researchers are focusing on the analysis of Global Positioning System (GPS) trajectories and the
extraction of road information. Road intersections are an important component of roads, as they play
a significant role in navigation and urban planning. Even though there have been many studies on
this subject, it remains challenging to determine road intersections, especially for crowd-sourced
vehicle trajectory data with lower accuracy, lower sampling frequency, and uneven distribution.
Therefore, we provided a new intersection-first approach for road network generation based on
low-frequency taxi trajectories. Firstly, road intersections from vector space and raster space were
extracted respectively via using different methods; then, we presented an integrated identification
strategy to fuse the intersection extraction results from different schemes to overcome the sparseness
of vehicle trajectory sampling and its uneven distribution; finally, we adjusted road information,
repaired fractured segments, and extracted the single/double direction information and the turning
relationships of the road network based on the intersection results, to guarantee precise geometry and
correct topology for the road networks. Compared with other methods, this method shows better
results, both in terms of their visual inspections and quantitative comparisons. This approach can
solve the problems mentioned above and ensure the integrity and accuracy of road intersections and
road networks. Therefore, the proposed method provides a promising solution for enriching and
updating navigable road networks and can be applied in intelligent transportation systems.

Keywords: low frequency; crowd-sourced vehicle trajectories; intersections; fusion; road network

1. Introduction
Digital road information is a pivotal part of basic geographic information and plays an important
role in urban planning, intelligent transportation, and location services [1]. In the past, the collection
of road information, mostly through traditional measurement methods, had a high cost of time and
manpower. With the development of remote sensing technology, many scholars have proposed road
extraction methods based on remote sensing images [2,3] and Light Detection and Ranging (LIDAR)
point cloud data [4]. However, due to the mixture of multiple spatial information, road extraction
is susceptible to interference. Additionally, the acquisition cost of remote sensing images and point
cloud data is high and has a lag problem. Although it is possible to collect and edit data via a Mobile
Mapping System, this method is also expensive, and road extraction information also often lags behind

ISPRS Int. J. Geo-Inf. 2019, 8, 473; doi:10.3390/ijgi8110473 www.mdpi.com/journal/ijgi


ISPRS Int. J. Geo-Inf. 2019, 8, x473
FOR PEER REVIEW 22 of
of 27
26

the latest changes in the actual road. With the development of the Internet and the popularity of GPS
technology, the methods of geographic information acquisition have undergone tremendous changes.
More and more locations can be reached via GPS or hand-held devices. Additionally, data are collected
and uploaded by the public. These phenomena stimulate the generation of crowd-sourced trajectory
data, which are acquired by soliciting contributions from a large group of volunteers. These data
are different from those obtained by traditional measurement and remote sensing methods and have
the advantages of being low cost, in real-time, and on a large scale, which is more suitable for the
acquisition and rapid update of large-scale rural [5] and urban [6] road networks. At the same time,
these data contain a large amount of driving record information, which makes a great contribution
to the extraction of road-related geometry and attribute information, such as sidewalk networks [7],
implicit entities [8], road boundary information [9], road slope and aspect information [10], and other
road semantic information [11–13]. Therefore, the big real-time traffic data from crowd-sourced data
not only make it a reality to extract a low-cost, fast-updating, highly detailed, and accurate road
network, but can also mine a lot of road semantic information. This is a significant research field for
road information extraction.
At present, existing studies dedicated to generating road maps from vehicle GPS traces are
common and have made great progress. Previous work mainly focused on the extraction of the
road skeletons, which can be divided into incremental methods [14–16], clustering methods [17–19],
and raster methods [20,21]. Incremental methods combine routes one-by-one to form a graph [22],
which starts from an empty road network, taking the first trajectory as the benchmark, and continuously
adjusts and improves the initial road network by gradually inserting the trajectory. Clustering methods
use some kind of clustering algorithm, such as K-means, to form a road network by considering the
position or direction characteristics of the tracking points. Raster methods first rasterize the GPS track
points to form a binary map and then use image-processing techniques to extract the road network.
However, incremental methods are calculation-intensive and cannot determine the correct
intersections. Clustering methods, on the other hand, have difficulty handling roads that are close in
space. These two kinds of methods can only process high frequency, high precision, and high-density
trajectories; they perform poorly for low frequency and high GPS error data. Even though the raster
method can fit low-frequency trajectories, the precise geometry and correct topology of road networks
cannot be guaranteed, and the morphology of a road is often distorted. As shown in Figure 1,
the intersection position often deviates its right location and road segments around intersections with
a few track points often distorted and broken.

(a) (b) (c)


Figure 1. Intersections position deviation, and road segments around intersections are distorted and
broken: (a) Position deviation; (b) distorted road segments; (c) broken road
road segments.
segments.

To
To overcome
overcome the the shortcomings
shortcomings of of previous
previous road
road data
data extraction
extraction methods
methods based
based on the trajectory
on the trajectory
data,
data, some studies argue that correct intersections can generate a high-quality road network,some
some studies argue that correct intersections can generate a high-quality road network, and and
researchers have have
some researchers begun to dotostudies
begun on on
do studies thethe
extraction of of
extraction road
roadintersections
intersectionsbased
based onon Longest
Longest
Common
Common Subsequence
Subsequence(LCSS)
(LCSS)[23],
[23],heading
headingdirection
directionchanging
changing[24–27], G index
[24,25,26,27], G analysis [28,29],[28,29],
index analysis shape
descriptors
shape descriptors [30], etc. Additionally, some researchers [27,28,29] have attempted to extractbased
[30], etc. Additionally, some researchers [27–29] have attempted to extract turn rules turn
on clustering
rules based onmethods. However,
clustering the above
methods. roadthe
However, intersection extraction
above road methods
intersection are either
extraction complexare
methods or
ISPRS Int. J. Geo-Inf. 2019, 8, 473 3 of 26

inefficient [30], need to train a large number of data and build an empirical model to be viable, or are
only applicable to high-frequency sampling data or high-density areas.
Therefore, there exist many challenges for road network extraction from crowd-sourced vehicle
GPS trajectories, as follows:
First, high-frequency tracks are hard to obtain. Low-frequency tracks are more common, and most
trajectory data are obtained by Commodity GPS devices in which the average sampling interval is
more than 40 s, which means that the next points are often recorded after the vehicle passes one or
more intersections. Thus, most methods mentioned above are inefficient for this kind of data.
Second, trajectory data are unevenly distributed, and road track points are intensive in traffic hub
regions, while sparse in branch roads.
Third, the urban road network is extremely complex, as its intersections have multiple levels and
types and may be too close to each other, meaning that the extraction of precise road segments around
intersections cannot be guaranteed. Moreover, the morphology of a road is often distorted and broken
for its low-frequency data properties.
In order to overcome the aforementioned challenges, a new intersection priority strategy to
extract the road network using real-world vehicle GPS trajectory data was presented in this paper.
We developed an integrated identification strategy focusing on the detection of road intersections and
then constructed road improvement methods based on intersection results. The contributions of this
paper are the following:

(1) In grid space, we extracted the intersections by identifying the number of eight neighborhoods,
per pixel, in the single-pixel “center-lines” and considered different resolutions, which can keep
as many intersections as possible in our extraction method and also make the extraction results
more accurate in the subsequent fusion processing.
(2) In vector space, the clustering method to rapidly search for and find density peaks (CFDP) [31]
was used to extract road intersections for the first time. Moreover, our method, which considers
the distribution density of trajectory data and the traffic flow characteristic of roads, is available
for both low-frequency and high-frequency trajectories, thereby solving the problem faced by
other methods, such as heading and speed only being available for high-frequency trajectories.
(3) An intersection fusion mechanism that integrates the advantages of cluster computing and image
recognition was developed to overcome sampling sparseness and uneven distribution of the
vehicle trajectory data. At the same time, this fusion mechanism can recognize true intersections
and undetermined intersections, thereby guaranteeing the integrity of the extraction results and
reducing the blindness of eliminating false intersections.
(4) Making full use of the direct dependency between intersections and center-lines of the road
can further adjust road information and repair fractured segments, thereby guaranteeing the
precision of road segments around intersections. We also constructed an Intersection–Link model
and used the road segment between intersections as the unit to identify single/double directional
information and the turning relationships of the road network, to guarantee the precise geometry
and correct topology of the road networks.

The remainder of this paper is organized as follows. Section 2 describes the method’s flow and the
strategy used to extract the intersections and build a road network. Section 3 outlines the procedure
of our integrated intersection extraction method. Section 4 outlines the procedure of our method for
building a road network. Section 5 presents a set of experimental analyses. Section 6 discusses the
conclusions and suggestions for further work.

2. Road Network Generation Framework


On the one hand, intersections are key parts of a road network, as their effective identification is
the basis of constructing a navigable road network. On the other hand, identifying road segments and
turning relationships based on intersection extraction results is a divide-and-conquer method, which is
ISPRS Int. J. Geo-Inf. 2019, 8, 473 4 of 26
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 4 of 27

which
helpfulistohelpful
reduceto reduce
the the difficulty
difficulty of urban
of urban road roadconstruction.
network network construction.
However, However, the frequency
the sampling sampling
frequency of crowd-sourced vehicle GPS trajectories is low, and its accuracy is not high.
of crowd-sourced vehicle GPS trajectories is low, and its accuracy is not high. Furthermore, urban roadFurthermore,
urban road
networks networks
are are intersections
dense, and dense, and intersections areso
are very close, very
it isclose,
very so it is very challenging
challenging to constructto construct
urban road
urban road networks based on crowd-sourced vehicle GPS trajectories. This paper
networks based on crowd-sourced vehicle GPS trajectories. This paper proposes an intersection-first proposes an
intersection-first approach for road network generation that uses low-frequency real-world
approach for road network generation that uses low-frequency real-world vehicle GPS trajectories to vehicle
GPS trajectories
correctly to above
solve the correctly solve theAccording
problems. above problems. According
to our proposed to our
road proposedextraction
information road information
scheme,
extraction
there are four key parts for obtaining road network information from GPS trajectories, asfrom
scheme, there are four key parts for obtaining road network information shown GPS
in
trajectories,
Figure 2. as shown in Figure 2.

Input:
Output:
GPS Trajectories
Road Network

Multiresolution Intersection-Link Single/double


Estimating density
rasterization Model
direction s and turning

relationships
Intersections extracting
Morphological filter by CFDP extraction
Intersections fusion &
eliminating false
intersections
Multiresolution
Morphology thinning Road modifying
intersections extracting
and repairing

Simplifying center-lines by Douglas-Peucker algorithm


Vectorization

Figure 2. Workflow for road information extracting.


Figure 2. Workflow for road information extracting.
(1) In the grid space, road intersections were extracted by identifying the number of pixels in
(1) Ineach pixel’s
the grid eightroad
space, neighborhoods
intersectionsfrom weredifferent
extractedresolution single-pixel
by identifying “center-lines”
the number of pixelstoin
extract
each
as many road intersections as possible. Before this, corresponding mathematical
pixel’s eight neighborhoods from different resolution single-pixel “center-lines” to extract as many morphology
processing, such
road intersections as rasterization,
as possible. opencorresponding
Before this, operation, close operation, filling
mathematical holes, smoothing
morphology surfaces,
processing, such
and morphology thinning, needed to be done.
as rasterization, open operation, close operation, filling holes, smoothing surfaces, and morphology
(2) In the
thinning, vectortospace,
needed we considered the distribution density of the trajectory and extracted the
be done.
road intersections via the clustering method of CFDP. In this critical part, in order to eliminate the
(2) Intrajectory
the vectorpoints
space,with
we considered the distribution
large deviations density
caused by drift, of the
Kernel trajectory
Density and extracted
Estimation (KDE) the
wasroad
first
intersections via the clustering
used to smooth the trajectory data. method of CFDP. In this critical part, in order to eliminate the
trajectory points with large deviations caused by drift, Kernel Density
(3) In the aspect of integrated intersection extraction, a corresponding fusion mechanism wasEstimation (KDE) was first
used constructed.
to smooth theThe trajectory data.
true intersections and undetermined intersections were distinguished according
to the fusion of the clustering
(3) In the aspect of integrated intersectionresults of theextraction,
CFDP algorithm and the mathematical
a corresponding morphological
fusion mechanism was
extraction results from different resolution raster images. Finally, Principal
constructed. The true intersections and undetermined intersections were distinguished according Component Analysisto
(PCA) was used to analyze the undetermined intersections and detect
the fusion of the clustering results of the CFDP algorithm and the mathematical morphological the pseudo intersections so
as to ensure
extraction results the
fromaccuracy of eliminating
different the false
resolution raster intersections.
images. Finally, Principal Component Analysis
(4)
(PCA)Inwas
termsusedof to
road network
analyze the generation,
undetermined a road improvement
intersections method
and detect thewas proposed
pseudo to guarantee
intersections so as
the precise geometry and correct topology of
to ensure the accuracy of eliminating the false intersections. the road networks based on the above intersection
extraction results. At the same time, we also designed the Intersection–Link model to identify the
(4) Insingle/double
terms of road direction
network generation,
informationaand roadturning
improvement method
relationships of was proposed
the road to guarantee the
network.
precise geometry and correct topology of the road networks based on the above intersection
3. Integrated
extraction Intersection
results. Extraction
At the same time, we also designed the Intersection–Link model to identify the
single/double direction information
Road intersections, which usually and turning
connect relationships
two or moreofroad the road network.
segments, are key parts of the
road network. In this study, we extracted the road intersections first and then processed the road
3. Integrated Intersection Extraction
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 5 of 27

ISPRS Int. J. Geo-Inf. 2019, 8, 473 5 of 26


Road intersections, which usually connect two or more road segments, are key parts of the road
network. In this study, we extracted the road intersections first and then processed the road
information. To ensure the accuracy and integrity of the result’s location, two methods were used to
intersections. One method involves using mathematical morphology to calculate the trajectory
extract intersections.
resolutions, and the other involves clustering the GPS trajectories to obtain
raster images of different resolutions,
the "road cluster point" via the clustering algorithm CFDP. The intersection results are obtained by
fusing these two methods, so the intersection extraction is as complete as possible while the position
accuracy.
maintains high accuracy.

3.1. Extracting Intersections


3.1. Extracting Intersections under
under aa Raster
Raster Space
Space Using
Using the
the Morphology
Morphology Method
Method
Mathematical
Mathematicalmorphology
morphologyprocessing,
processing, which
whichhashas
been widely
been usedused
widely in theinroad
theextraction of remote
road extraction of
sensing images, images,
remote sensing is a computational method for
is a computational grid space.
method Remote
for grid space.sensing
Remoteimages
sensing contain
images a variety
containofa
different
variety ofspectral
different information, which havewhich
spectral information, a great impact
have on impact
a great road extraction, while images
on road extraction, whilegenerated
images
by rasterization
generated based on taxi
by rasterization GPSon
based trajectories
taxi GPS are not interfered
trajectories are notwith by other
interfered ground
with information,
by other ground
thereby producing
information, therebybetter results. In
producing addition,
better using
results. In the mathematical
addition, using the morphology
mathematical approach requires
morphology
less
approach requires less time to process the trajectory data. Based on our observations, the cells of
time to process the trajectory data. Based on our observations, the cells in which the number in
pixel
whichpoints in eightofneighborhoods
the number pixel points iniseight
moreneighborhoods
than two in single-pixel
is more than “center-lines” have the “center-
two in single-pixel greatest
opportunity
lines” have the to begreatest
road intersections.
opportunityAs toshown
be roadin Figure 3, cell a As
intersections. is an example
shown of a road
in Figure 3, intersection,
cell a is an
while
examplecellofb aisroad
an example of a non-intersection.
intersection, while cell b is an In this study,
example we used the mathematical
of a non-intersection. morphology
In this study, we used
method to obtain the
the mathematical single-pixel
morphology “center-lines”
method to obtainand the extracted the“center-lines”
single-pixel intersections by andidentifying
extracted the
number of pixels
intersections in each pixel’s
by identifying eight neighborhoods.
the number of pixels in each pixel’s eight neighborhoods.

Figure 3. Intersection recognition example.

The
The key
key steps
steps for
for extracting
extracting roadroad intersections
intersections via
via the
the morphological
morphological method method are are as
as follows:
follows:
Step 1: Rasterizing the GPS trajectories. In order to minimize the effects
Step 1: Rasterizing the GPS trajectories. In order to minimize the effects of noise, of noise, we set a statistical
we set a
value c forvalue
statistical each pixel basedpixel
c for each on the characteristics
based of a large number
on the characteristics of track
of a large number points, and specify
of track points, that
and
when a track point falls into a pixel of the raster image, the value of c for that pixel
specify that when a track point falls into a pixel of the raster image, the value of c for that pixel will will be increased by
1.
beIfincreased
the c value by of
1. aIfpixel
the cisvalue
ultimately greater
of a pixel than or equal
is ultimately to the
greater thangiven threshold,
or equal to the the pixel
given value is
threshold,
set to 255; otherwise, the value is 0.
the pixel value is set to 255; otherwise, the value is 0.
Step
Step 2:2: Preprocessing
Preprocessingthe theraster
rasterimage.
image.ToTo accurately
accurately restore
restore thethe shape
shape of the
of the original
original image,image,
the
the etched
etched imageimage is used
is used as as
thethe marker
marker image
image andthe
and theoriginal
originalimage
imageasasthe the mask
mask image
image for for the
the
mathematical morphological reconstruction to eliminate outliers.
mathematical morphological reconstruction to eliminate outliers. At the same time, theAt the same time, the reconstruction
involves an iterative
reconstruction involves dilation process,
an iterative whichprocess,
dilation can ensure
whichthat holes
can are that
ensure filledholes
without changing
are filled the
without
shape
changing of the
theobject.
shape Operation
of the object. preprocessing, such as open and
Operation preprocessing, suchclose operations,
as open and closecan operations,
also be usedcan to
achieve image enhancement and smoothing, respectively.
also be used to achieve image enhancement and smoothing, respectively.
Step
Step 3: Extracting the
3: Extracting the single-pixel
single-pixel “center-lines”
“center-lines” of of the
the road
road byby morphology
morphology thinning
thinning (shown
(shown in in
Figure
Figure 4).4).
ISPRS Int. J. Geo-Inf. 2019, 8, 473 6 of 26
ISPRS
ISPRS Int.
Int. J.
J. Geo-Inf.
Geo-Inf. 2019,
2019, 8,
8, xx FOR
FOR PEER
PEER REVIEW
REVIEW 66 of
of 27
27

Figure
Figure 4.
4. Raster
Raster image preprocessing and
image preprocessing and morphology
morphology thinning.
thinning.

Step
Step 4: Calculating the number of pixels in eacheachpixel’s eighteight
neighborhoods in thein thinning result.
Step 4:4: Calculating
Calculating the the number
number of of pixels
pixels inin each pixel’s
pixel’s eight neighborhoods
neighborhoods in the the thinning
thinning
The pixels
result. whose neighborhood numbers are more than two are considered as intersections.
result. The
The pixels
pixels whose
whose neighborhood
neighborhood numbers
numbers are are more
more than
than two
two are
are considered
considered as as intersections.
intersections.
After
After preprocessing of of
mathematical morphology, the intersections can be obtained more accurately.
After preprocessing of mathematical morphology, the intersections can be obtained more
preprocessing mathematical morphology, the intersections can be obtained more
However,
accurately. when the trajectory points are rasterized, the resolution of the trajectory raster image will
accurately. However,
However, when when the the trajectory
trajectory points
points are
are rasterized,
rasterized, thethe resolution
resolution of of the
the trajectory
trajectory raster
raster
affect
image the final extraction result. Using a higher resolution, the location of the road can be reflected more
image willwill affect
affect the
the final
final extraction
extraction result.
result. Using
Using aa higher
higher resolution,
resolution, thethe location
location ofof the
the road
road can
can be
be
accurately,
reflected as shown in Figure 5a. However, for areas where thefor track points are sparsely distributed,
reflected more accurately, as shown in Figure 5a. However, for areas where the track points are
more accurately, as shown in Figure 5a. However, areas where the track points are
such processing may such
sparsely generate more outliers that will be treated as noise and eliminated. If using a
sparsely distributed,
distributed, such processing
processing may may generate
generate moremore outliers
outliers that
that will
will bebe treated
treated as as noise
noise and
and
lower resolution,
eliminated. as shown in Figure 5b, the proportion of trajectory pixels in the total number of pixels
eliminated. If using a lower resolution, as shown in Figure 5b, the proportion of trajectory pixels in
If using a lower resolution, as shown in Figure 5b, the proportion of trajectory pixels in
will
the increase, which facilitates increase,
the enhancement of sparse areas. Therefore, according to the strategy
the total
total number
number of of pixels
pixels will
will increase, which
which facilitates
facilitates thethe enhancement
enhancement of of sparse
sparse areas.
areas. Therefore,
Therefore,
mentioned
according above,strategy
this paper uses high resolutionpaperand low resolution to extract road intersections,
according to to the
the strategy mentioned
mentioned above,
above, this
this paper uses uses high
high resolution
resolution and and low
low resolution
resolution toto
respectively,
extract road so our methodrespectively,
intersections, can extract as so many
our intersections
method can as possible
extract as manyandintersections
also keep theas extraction
possible
extract road intersections, respectively, so our method can extract as many intersections as possible
results
and more accurate in the subsequent fusion processing.
and also
also keep
keep thethe extraction
extraction results
results more
more accurate
accurate in in the
the subsequent
subsequent fusion
fusion processing.
processing.

(a)
(a) (b)
(b)
Figure 5.
Figure 5. Effect
Effect of
of image resolution on
image resolution on rasterization:
rasterization:
rasterization: (a)
(a) high
high resolution; (b) low
resolution; (b) low resolution.
resolution.
resolution.

Urban road
Urban road widths
road widths
widthsare are diverse,
arediverse, ranging
diverse,ranging
rangingfrom from
from3030
30toto
to4040 meters
40meters
meters wide
wide
wide forfor main
main
for main roads
roads
roads and
andand10 10 to
to 15
to 15
10 m
15
meters
wide
metersforwide for
for secondary
secondary
wide roads. Inroads.
secondary order In
roads. order
order to
to accurately
In to accurately
reflect thereflect
accurately the
theoflocation
location
reflect all roadsof
location all
ofand roads
roads and
all reduce reduce
the impact
and reduce
the
the impact of noise, the raster resolution should not be too small. Therefore, two resolutions of 2.5
of impact
noise, theof noise,
raster the raster
resolution resolution
should not should
be too not be
small. too small.
Therefore, Therefore,
two two
resolutions resolutions
of 2.5 m of
and 5m
2.5 m
and
were 5 m were
and 5selected selected
m wereforselected for
morphological morphological processing.
processing. processing.
for morphological The
The intersection intersection
and center-line
The intersection and center-line
andextraction
center-line extraction
results from
extraction
results
results from
differentfrom different
resolutions
different viaresolutions
rasterization
resolutions via rasterization
viausing the Wuhan
rasterization using the
dataset
using Wuhan
the are shown
Wuhan dataset are
in Figure
dataset are shown in
in Figure
6. The results
shown show
Figure 6.
6.
The
that results
the show
number ofthat
2.5 m the number
intersection of 2.5 m
extraction intersection
results is extraction
66, the results
number of 5
The results show that the number of 2.5 m intersection extraction results is 66, the number of 5 m is
m 66, the number
extraction resultsof 5
is m
77,
extraction
which is 11results
extraction is
more than
results is 77, which
77,the is
is 11
number
which 11ofmore
2.5 m
more than the
the number
extraction
than of
of 2.5
results.
number 2.5 m
m extraction
extraction results.
results.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 7 of 26
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 7 of 27

(a) (b)
Figure 6. Intersection and relative center line extraction results from different resolutions via
rasterization: (a) 2.5 m results;
Figure 6. Intersection (b) 5 mcenter
and relative results.
line extraction results from different resolutions via
rasterization: (a) 2.5 m results; (b) 5 m results.
Even though the method considers sparse areas and sets different kinds of resolutions to extract
intersections,
Even though it is the
stillmethod
not suitable for less
considers sparsedense
areasregions,
and setsasdifferent
some centerkindslines and intersections
of resolutions to extract in
sparse areas have
intersections, it is still
still not
not been completely
suitable extracted.
for less dense regions,Therefore,
as someitcenter
is necessary
lines andto combine
intersectionsthe peak
in
sparse areas
density have method
clustering still not and
beenmathematical
completely extracted.
morphology Therefore,
methoditto is extract
necessary the to combine the
intersections to peak
ensure
density
the clusteringofmethod
completeness and mathematical morphology method to extract the intersections to
the results.
ensure the completeness of the results.
3.2. Extracting Intersections Under a Vector Space Based on CFDP
3.2. Extracting Intersections Under a Vector Space Based on CFDP
Due to the complexity of the intersections, a large number of GPS sampling points are gathered
near the
Dueintersection
to the complexityso, although the GPS trajectories
of the intersections, a large of some areas
number of GPSnear intersections
sampling points areare gathered
sparse, the
near theclustering
density intersection so, although
algorithm can betheusedGPS to trajectories of some areas
extract intersections. Thenear
CFDP intersections are sparse,isthe
clustering algorithm used
density
in clustering
this work becausealgorithm
it capturescanthebelocal
used to extractdensity
maximum intersections.
positions The ofCFDP clustering
cells and algorithm
is not easily is
disturbed
used
by in thistrack
uneven workdensity
becausedistribution.
it captures the local maximum
Moreover, the CFDP density positions
algorithm hasofa cells and is not settings.
few threshold easily
disturbed
In contrast,byDBSCAN
uneven track mergesdensity distribution.
proximal Moreover,
high-density cells the
intoCFDP algorithm
clusters; the mean has ashift
fewalgorithm
threshold is
settings. In contrast, DBSCAN merges proximal high-density
often affected by locally dense areas, and the results fluctuate greatly. cells into clusters; the mean shift
algorithm
The GPS is often affected
trajectory by locally
points dense area
in the study areas, and the
range from results fluctuate
thousands greatly. and the presence of
to millions,
discrete points (noise points) may result in the detection of false intersections. Sinceand
The GPS trajectory points in the study area range from thousands to millions, it isthe presence to
impossible
of discrete points (noise points) may result in the detection of false intersections.
process the entire study area directly using the CFDP algorithm, KDE is used for data smoothing. Since it is impossible
to process the entire study area directly using the CFDP algorithm, KDE is used for data smoothing.
3.2.1. Estimating Density
3.2.1. Estimating Density
KDE is used to calculate the density of the point elements, which represents the sum of the kernel
KDEthat
functions is used
fallto calculate
within the the
bump,density of the point
as shown elements,
in Equation (1):which represents the sum of the kernel
functions that fall within the bump, as shown in Equation (1):
m
3 X 1
^ f (ˆx) = 3 K(1 (x − xi )) (1)
𝑓(𝑥) = mh2 i=1𝐾( h (𝑥 − 𝑥 )) (1)
𝑚ℎ ℎ
where m is the number of neighbor cells, xi is the center point of the ith cell, h is the bandwidth,
where m is the number of neighbor cells, xi is the center point of the ith cell, h is the bandwidth, and
and K(x) is the kernel function adopted in this work, as shown in Equation (2):
K(x) is the kernel function adopted in this work, as shown in Equation (2):

2
3𝜋3π−1(1
(1 −
−X 𝑋T X𝑋)
) , ,X𝑋T X𝑋<<
1 1

𝐾(𝑥)
K(x)=

= (2) (2)

 0,0,𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
otherwise
In the density grid, large-value cells always exist within intersections, and small-value cells (also
In the density grid, large-value cells always exist within intersections, and small-value cells (also
called noisy cells) usually exist at the margins of intersections or along roadways, as shown in Figure
called noisy cells) usually exist at the margins of intersections or along roadways, as shown in Figure 7a.
7a. Based on the statistics of the density grid, we can acquire a statistical distribution map, as shown
Based on the statistics of the density grid, we can acquire a statistical distribution map, as shown
in Figure 7b. Setting the appropriate threshold K, we can extract the main road information, which
consists of high-density cells for intersection detection.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 8 of 26

in Figure
ISPRS 7b. Setting
Int. J. Geo-Inf. 2019, 8,the appropriate
x FOR threshold
PEER REVIEW K, we can extract the main road information, 8which
of 27
consists of high-density cells for intersection detection.

(a) (b)
Figure7.
Figure Estimatingdensity:
7.Estimating density:(a)
(a)Kernel
Kerneldensity;
density;(b)
(b)statistical
statisticaldistribution.
distribution.

3.2.2.Extracting
3.2.2. ExtractingIntersections
Intersections
The CFDP
The CFDP algorithm
algorithm isis based
based onon the
the assumption
assumption that that the
the cluster
cluster center
center isis surrounded
surrounded by by
neighboringpoints
neighboring pointswithwithlower
lowerlocal
localdensities
densitiesand andhashasaarelatively
relativelylarge
largedistance
distancefrom
fromany anypoint
pointwith
with
a higher density. Thus, for each point in the whole research region, two quantities
a higher density. Thus, for each point in the whole research region, two quantities are calculated: the are calculated:
the local
local density
density of theofpoint
the point anddistance
and the the distance
from from the point
the point to theto the point
point with awith a higher
higher local density,
local density, both
both of which depend on the distance between
of which depend on the distance between the data points. the data points.
Basedon
Based onestimating
estimatingdensity
densityprocessing,
processing,experimental
experimentaldata dataretains
retainsthe
the main
main information
informationabout
about
the intersection, and some noise points are removed
the intersection, and some noise points are removed while the number while the number of data is reduced. Moreover,
 N n oofN
data is reduced. Moreover,
the extracted results
the results have
havethe
thedensity attributeρi{ρ
densityattribute i=}1 . Then,
. Then, let set qi {q }denote
let set i=1
a descending
denote order
a descending
 N {ρ }
of set . In other words, as ρ ≥ ρ ≥ ⋯ ≥ ρ
of set ρi i=1 . In other words, as ρq1 ≥ ρq2 ≥ . . . ≥ ρqN , then the point distance is defined as Equation
order , then the point distance is defined as
Equation
(3): (3): n o

𝑚𝑖𝑛{𝑑 },, i𝑖 ≥≥2 2
min d qiqj



qj



𝛿δqi ==  j < i

(3)(3)

 𝑚𝑎𝑥 (𝛿
 
), 𝑖
δqj , i = 1 = 1
 max



j≥2
According
According to the decision
to the decisiondiagram
diagramofofthe the relationship
relationship between
between δ and
δqi and ρ ,cluster
ρi , the the cluster
center,center,
which
which is a candidate for intersection points, can be selected. Because
is a candidate for intersection points, can be selected. Because the distribution of the GPSthe distribution of sampling
the GPS
sampling points is inconsistent,
points is inconsistent, if theand
if the distance distance and
density density threshold
threshold are considered
are considered at the same attime,
the same time,
the sparse
the sparse area extraction results will be affected, and the integrity of the extraction
area extraction results will be affected, and the integrity of the extraction results will be reduced. results will be
reduced. Therefore, this paper takes into account the special qualities of intersections
Therefore, this paper takes into account the special qualities of intersections and only considers the and only
considers the distance threshold.
distance threshold.
With
Withthetheprocessing
processing ofofthe
thekernel
kerneldensity
densityanalysis,
analysis,the
thedata
datacluster
clusterisissmoothed,
smoothed,and andthethedistance
distance
within
within the cluster is suppressed. The larger the distance between the clusters, the more likely
the cluster is suppressed. The larger the distance between the clusters, the more likely the
the
intersection is. Amplifying the range of the red rectangular in the decision graph
intersection is. Amplifying the range of the red rectangular in the decision graph shown in Figure 8b, shown in Figure 8b,
we
we can
can see
see that
that the
the distance
distance of of many
many points
points clusters
clusters under
under 20 20 meters.
meters. Thus,
Thus, setting
setting the
the distance
distance
threshold
thresholdD Dasas2020meters,
meters,the theextraction
extractionresults
resultsusing
usingthe
theWuhan
Wuhandataset
datasetvia
viathe theCFDP
CFDPalgorithm
algorithmare are
shown in Figure 8c. The extraction of the intersection is only related to distance,
shown in Figure 8c. The extraction of the intersection is only related to distance, which solves the which solves the
difficult
difficult parameter
parameter adjustment
adjustment problems.
problems.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 9 of 26
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 9 of 27

(a)

(b)

(c)

Figure
Figure Extractionresults
8. 8.Extraction resultsby
byclustering:
clustering: (a)
(a) decision
decision graph;
graph; (b)
(b)amplifying
amplifyingresult;
result;(c)(c)
intersection
intersection
extraction results (overlaying 2.5 m raster track
extraction results (overlaying 2.5 m raster track data).data).
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 10 of 27
ISPRS Int. J. Geo-Inf. 2019, 8, 473 10 of 26

3.3. Fusion and Eliminating Pseudo Intersections


3.3. Fusion andtoEliminating
In order reduce the Pseudo Intersections
blindness of eliminating pseudo intersections and ensure the integrity and
accuracy of the
In order extracted
to reduce theintersections,
blindness of this paper establishes
eliminating a corresponding
pseudo intersections fusionthe
and ensure mechanism
integrity
based on the extraction
and accuracy resultsintersections,
of the extracted of different methods to distinguish
this paper establishes abetween true intersections,
corresponding pseudo
fusion mechanism
intersections,
based on the extraction results of different methods to distinguish between true intersections, pseudo
and undetermined intersections. This study also uses PCA to eliminate
intersections.
intersections, The
androad intersectionsintersections.
undetermined fusion frame isThis
shown in Figure
study 9. PCA to eliminate pseudo
also uses
intersections. The road intersections fusion frame is shown in Figure 9.

High resolution results Low resolution results Clustering results

Fusion rules

False Undetermined True

PCA Analysis

Extracted Results

Figure 9. Frame of the road intersections fusion frame.

3.3.1. Fusion
3.3.1. Fusion of
of the
the Intersection
Intersection Extraction
Extraction Results
Results
Considering the
Considering thedensity
densitycharacteristics of theofroad
characteristics theintersections compensates
road intersections for the morphology
compensates for the
extraction results, but also results in some pseudo intersections. Consequently,
morphology extraction results, but also results in some pseudo intersections. Consequently, we proposedwe a
variety
proposed of afusion rules
variety (listedrules
of fusion in Table 1) based
(listed in Tableon1)our extraction
based on our results, including
extraction results, high-resolution
including high-
results, low-resolution results, and CFDP clustering results, to
resolution results, low-resolution results, and CFDP clustering results, to help help recognize the undetermined
recognize the
results and reduce
undetermined the and
results errorreduce
rate ofthe
eliminating
error ratepseudo intersections.
of eliminating pseudoIf only high-resolution
intersections. If onlyresults
high-
or low-resolution results are present within the distance of a given fusion threshold
resolution results or low-resolution results are present within the distance of a given fusion threshold R1, then the
intersections are false. If only the CFDP clustering results are present, then
R1, then the intersections are false. If only the CFDP clustering results are present, then the the intersections are
undetermined.
intersections areIn other situations,
undetermined. the fusion
In other results
situations, theare based
fusion on theare
results high-resolution results (if they
based on the high-resolution
exist). Otherwise, the fusion results are based on the low-resolution results.
results (if they exist). Otherwise, the fusion results are based on the low-resolution results.

Table 1. Fusion Rules. CFDP, clustering method via fast searching and the finding of density peaks.
Table 1. Fusion Rules. CFDP, clustering method via fast searching and the finding of density peaks.
High-resolution Results Low-resolution
High-resolution Low-resolution Results CFDP Clustering Results
CFDP clustering Judging Results
√ √ √ Judging results
results results √ results True

√√ √ √ √ True True
True
√ √ √ √ True True
√√ √ True False

√ √ √
True False
√ Undetermined
False
False

Based on the above rules, we set 80 meters as the fusion threshold value. The fusion results
√ 57 undetermined
distinguish 79 true intersections, 4 false intersections, and Undetermined
intersections, which are
Based on the
shown in Figure 10.above rules, we set 80 meters as the fusion threshold value. The fusion results
distinguish 79 true intersections, 4 false intersections, and 57 undetermined intersections, which are
shown in Figure 10.
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 11 of 27

ISPRS Int. J. Geo-Inf. 2019, 8, 473 11 of 26


ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 11 of 27

(a) (b)
Figure 10. The fusion results (overlaying 2.5 m raster track data): (a) True and false results; (b)
undetermined results.(a) (b)
Figure 10.
Figure
Undetermined 10. TheThefusion
fusionresults
results results (overlaying
all come from 2.5
(overlaying m raster
2.5 m raster
clustering track data):
track
results (a)greatly
data):
that True and
(a) True false
and results;
supplement (b) raster
false results;
the
undetermined results.
(b) undetermined results.
extraction results, especially the intersections of the residential streets, living streets, and pedestrian
streets that have no (or few) taxi tracks. However, the congestion in some sections of the road and
Undetermined
Undeterminedresults resultsallall comecome from clustering
from clusteringresults that greatly
results supplement
that greatly the raster
supplement theextraction
raster
long taxi stays result in a large number of loci, and some pseudo intersections have also been
results,
extraction especially
results,the intersections
especially of the residential
the intersections streets, living
of the residential streets,
streets, living and pedestrian
streets, streets that
and pedestrian
extracted. We also extracted a method to eliminate the pseudo intersections, focusing on the
have
streets nothat(or few)
havetaxi no (or tracks.
few)However,
taxi tracks. theHowever,
congestion theincongestion
some sections of the
in some road and
sections long
of the taxiand
road stays
undetermined
long taxi stays
results.
result in a large number of loci, and some pseudo intersections have also been
result in a large number of loci, and some pseudo intersections have also been extracted. We also
extracted.a method
extracted We alsotoextracted eliminateathe method
pseudo tointersections,
eliminate thefocusing pseudoon intersections,
the undetermined focusing on the
results.
3.3.2. Elimination of False Intersections from Undetermined Results
undetermined results.
3.3.2.Pseudo
Elimination of False (that
intersections Intersections
is, non-real fromintersections)
Undetermined thatResults
are identified as intersections may be
3.3.2.
caused Elimination
by taxis of False
stopping, cars Intersections from Undetermined Results
Pseudo intersections (thatturning at gasintersections)
is, non-real stations, viaducts, that are or identified
other special road landmarks.
as intersections mayFor be
this
caused reason,
Pseudo we
by taxis use the
intersections PCA
stopping,(that method
cars is, to
non-real
turning analyze the
intersections)
at gas linear significance
that are identified
stations, viaducts, of
or other as spatial distribution
intersections
special may near
road landmarks. be
undetermined
caused
For this by taxisintersections,
reason, stopping,
we use thecars PCA which
turning
method are
at toextracted
gas stations,
analyze based
the linearonsignificance
viaducts, orthe otherabove offusion
special rules
road landmarks.
spatial to further
distribution For
near
distinguish
this reason, pseudo-intersections
we use the PCA from
method undetermined
to analyze the intersections.
linear significance
undetermined intersections, which are extracted based on the above fusion rules to further distinguish First, ofwe created
spatial a circular
distribution buffer
near
with radius R2 at
undetermined
pseudo-intersections thefromundetermined
intersections, which intersection
undetermined extractedto based
are intersections. collect
First, the
on wethetrack points
above
created falling
fusion
a circular into to
rules
buffer the
with circular
further
radius
buffer.
distinguish
R2 at the Then, we construct
pseudo-intersections
undetermined a covariance
intersection matrixthe
fromtoundetermined
collect by track
taking x andfalling
intersections.
points yFirst,
coordinates
we created
into from thebuffer.
a circular
the circular set buffer
of Then,
track
points
with as
radiusvariables
R2 at and
the calculate
undetermined the eigenvalues
intersection λ1
to and
collect λ2 of
the the
track
we construct a covariance matrix by taking x and y coordinates from the set of track points as variables matrix.
points Finally,
falling we
into use
the Equation
circular
(4)
and tocalculate
buffer. judgeThen, the main
we
the direction
construct
eigenvalues λ1ofand
traffic
a covariance λ2 offlow
theat
matrix thetaking
by
matrix. undetermined
x andwe
Finally, y use intersections
coordinates
Equationfrom and
(4) the
the
to intensity
set
judge of
thetrack
mainin
each
points direction.
direction as of It is flow
variables
traffic obvious
and calculatethatundetermined
at the the
thelarger the Δ,
eigenvalues the
λ1 stronger
and λ2 and
intersections ofthe
the linear
thematrix.feature,
intensity inwhich
Finally, we use
each means the It
Equation
direction. less
is
(4)
likely to judge
it is tothe main
become direction
a true of traffic
intersection, flow asat the
shown undetermined
in Figure
obvious that the larger the ∆, the stronger the linear feature, which means the less likely it is to become intersections
11. In order and
to the
retainintensity
more in
true
aeach
truedirection.
intersections,
intersection,weIt can
is as
obvious
choose
shown that
in the
the larger
Figurelarger
Δ11. the
In Δ,
value asthe
order thestronger
retainthe
tothreshold more linear
value truefeature,
to which
filter the
intersections, means
intersection
we can thepoints.
less
choose
Δlikely
theis larger it is
calculated to based
∆ value become onathe
as the true intersection,
following
threshold valueformula:as shown
to filter in Figure points.
the intersection 11. In order to retain based
∆ is calculated more on truethe
intersections, we can choose the larger Δ value as the threshold value to filter the intersection points.
following formula: 𝑚𝑎𝑥( 𝜆 , 𝜆 ) − 𝑚𝑖𝑛( 𝜆 , 𝜆 )
Δ is calculated based on the following 𝛥 = formula:
max(λ1 , λ2 ) − min(λ1 , λ2 ) (4)
∆= 𝑚𝑎𝑥( 𝜆 , 𝜆 ) (4)
𝑚𝑎𝑥( 𝜆 ,max 𝜆 )(− λ1 ,𝑚𝑖𝑛(
λ2 ) 𝜆 , 𝜆 )
𝛥= (4)
𝑚𝑎𝑥( 𝜆 , 𝜆 )

Figure 11. A real/false intersection schematic diagram.


ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 12 of 27

ISPRS Int. J. Geo-Inf. 2019, 8, 473 Figure 11. A real/false intersection schematic diagram. 12 of 26

4. Road Network Generation


4. Road Network Generation
According to the above work, we extracted the single-pixel “center-lines” of the road by
According
morphology to the Our
thinning. abovegoalwork,
was to weproduce
extracted the single-pixel
an initial “center-lines”
road map consisting of the
of nodes and road by
edges.
morphology
Thus, we firstthinning.
vectorizedOur the goal was image
skeleton to produce an initial
and used road map consisting
the Douglas–Peucker algorithmof nodes
[32] toand edges.
produce
Thus, we first
the edges thatvectorized
make up the skeleton
shape ofimage and used
each road the Douglas–Peucker
segment. Secondly, in order algorithm [32] to the
to determine produce
road
the edgesand
integrity thattopological
make up the shape of each
consistency, we used road segment.
the extractedSecondly, in order
intersections to determine
to adjust the distortedthe road
integrity
segmentsand topological
around consistency,and
the intersection we used the extracted
connected intersections
the fracture segmentsto adjust the distorted
to guarantee the road
segments
network’saround the Finally,
accuracy. intersection and connected
we designed the fracture segments
the Intersection–Link model to guarantee
and took the road road network’s
segments
accuracy.
between the Finally, we designed
intersections as the Intersection–Link
units to identify model and took the road
the single/double segments
directions and between
turning
the intersections as the units
relationships of the road network. to identify the single/double directions and turning relationships of the
road network.
4.1. The Road Improvement
4.1. The Road Improvement
The road segments around the intersections are distorted and broken. In this section, we build
Thefor
buffers road segments around
intersections. We set the
80 mintersections
(the same asare thedistorted and broken.
fusing threshold R1) Inas this section,
the buffer we build
radius and
buffers for intersections. We set 80 m (the same as the fusing threshold R1) as
designed the specific algorithm to adjust the roads’ endpoints falling into the buffer of the relative the buffer radius and
designed the to
intersections specific
correctalgorithm to adjust
the position and the
shape roads’
of theendpoints falling into
road segments the buffer
around of the relative
the intersections, as
intersections
shown in Figure to correct
12a,b.the
At position
the sameand shape
time, of the
we also road segments
connected around the
the fractured intersections,
segments accordingas shown
to the
in Figure
broken 12a,b. Atdistance
segment’s the sameandtime, we alsodeviation,
direction connectedor the fractured
the distancesegments according
and direction to thebetween
deviation broken
segment’s distance and direction deviation, or the distance and direction deviation
the broken segment and the relative intersection, by using extracted intersection results, as shown between the brokenin
segment and
Figure 12c,d. the relative intersection, by using extracted intersection results, as shown in Figure 12c,d.

(a) (b)

(c) (d)

Someexamples
Figure 12. Some examplesforfor
roadroad segment
segment improvement
improvement around
around intersections
intersections (gray(gray
is theis pre-
the
pre-correction
correction road road segments
segments andand
redred is the
is the corrected
corrected roadsegments):
road segments):(a)
(a)Position
Positionand
andshape
shapecorrection;
correction;
(b) position
positionand
andshape correction;
shape (c) connecting
correction; fractured
(c) connecting segments;
fractured (d) connecting
segments; fractured segments.
(d) connecting fractured
segments.

Except for the intersection extraction results of set I in this stage, this connecting strategy requires
to detect the suspension points set S and the suspension lines set L and to calculate the following two
parameters: dis(pm,pn) < 500 m or dis(pm,Pi) < 500 m and dir(lm, (pn-pm)) < 30o or dir(lm, (Pi-pm)) <30o,
ISPRS Int. J. Geo-Inf. 2019, 8, 473 13 of 26

Except for the intersection extraction results of set I in this stage, this connecting strategy requires
to detect the suspension points set S and the suspension lines set L and to calculate the following two
parameters: dis(pm ,pn ) < 500 m or dis(pm ,Pi ) < 500 m and dir(lm , (pn -pm )) < 30o or dir(lm , (Pi -pm )) < 30o ,
wherepm , pn ∈ S, m, n = 0, 1, 2, · · · , m , n,Pi ∈ I, i = 0, 1, 2, · · · ,lm ∈. The point that satisfies the above
conditions and has the minimum distance will be selected to connect the suspension point. These
parameters were determined based on the distance and direction between fractured segments as well
as after much experimental exploration to yield the best results.

4.2. Identifying Single/Double Directions and Turning Relationships


A road intersection is an intersection of two or more roads that divide the urban road network into
sections of moderate length without overlapping. Therefore, we designed the Intersection–Link model
and used the road segment between intersections as the unit to identify the single/double directions
and turning relationships of the road network.

4.2.1. Intersection–Link Model Construction


Building the Intersection–Link model is a process of track segmentation. A track is divided
into several sections according to the road intersections, and each pair of adjacent road intersections
corresponds to multiple track segments. Denoting any two adjacent road intersections in the node
set as Ik and Ij, traversing the original track data, and calculating all n track segments between them,
the Intersection–Link model can be obtained:

(Ik , Ij) ~ {t1,t2,t3, . . . ,tn}.

Here, tk(0<k≤n) represents the information of a track segment between Ik and Ij, which are two
adjacent road intersections, including track ID and the track point coordinate information.
The Intersection-Link model’s construction is used to assign the original track points to an adjacent
road intersection. For any track, it is necessary to determine which point passes through a road
intersection. If two points in the track pass through two road intersections successively, then the point
between them belongs to the section between these two intersections.

4.2.2. Removing the Pseudo-Intersection–Link Relationship


Due to the low sampling frequency of the track data, some tracks may cross several sections before
passing through a road Intersection, thereby forming a pseudo-Intersection–Link relationship. Then,
we adopted the following strategies to eliminate the pseudo-Intersection–Link relationship:

• We use the straight line connecting the two road intersections as the diameter to establish the
circular buffer zone. If there are too many other intersections in the buffer, the two intersections
are too far apart. If so, delete the Intersection–Link relationship of the two intersections.
• The Intersection–Link relationships are formed between two of the three road intersections (such as
a, b, and c), which are located in the same straight line. The lengths of ab, ac, and bc were calculated
accordingly to eliminate the Intersection–Link relationship of the longest pair of intersections,
as shown in Figure 13a.
• Most sections of urban roads are straight lines, while some sections have small radians. Based
on this observation, the maximum distance d from the intersection line (let the length be L)
in the normal track points is calculated. If d/L is greater than the threshold, then delete the
Intersection–Link relationship. This strategy can eliminate the situation seen in Figure 13b.
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 14 of 27

ISPRS Int. J. Geo-Inf. 2019, 8, 473 14 of 26


ISPRS Int.a J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 14 of 27

a f
b
f
d
b
d
c
e
(a) c (b)
Figure 13. A pseudo-Intersection–Link relationship (points a-f are road intersections): (a) A pseudo-
e
Intersection–Link(a)relationship located in the same straight line; (b) A (b)pseudo-Intersection–Link
relationship
Figure
Figure13.13. Alocated in the sections which
pseudo-Intersection–Link
pseudo-Intersection–Link have some(points
relationship
relationship radians.
(points
a-f a-f
are are
roadroad intersections):
intersections): (a) A
(a) A pseudo-
pseudo-Intersection–Link relationship
Intersection–Link relationship located
located in in
thethesame
samestraight
straight line; (b)
(b)AApseudo-Intersection–Link
pseudo-Intersection–Link
4.2.3. Determining
relationship
relationship theinCharacteristics
located
located thethe
in sections of have
which
sections a Road
which Network
some
have radians.
some radians.
The Intersection–Link
4.2.3. Determining model shows
the Characteristics the relationship
of a Road Network between road segments and track segments,
4.2.3. Determining
which can be usedthe to Characteristics of a Road Network
identify the single/double directions and turning relationships of the road
The Intersection–Link
network. Suppose the model showsintersections
start-and-end the relationship of abetween
road road segments
section a andand track segments,
The Intersection–Link model shows the relationship between roadare segments b,andrespectively.
track segments, The
which can be
Intersection–Linkused to identify
relationship the single/double directions and turning relationships of the road network.
which can be used to identifyonly the exists between adirections
single/double to b or b to and a, indicating that this road
turning relationships of section
the road is
Suppose
anetwork. the start-and-end
single-lane road section.intersections
The of a road sectionrelationship
Intersection–Link are a and b, respectively.
exists betweenTheaIntersection–Link
to b and b to a,
Suppose the start-and-end intersections of a road section are a and b, respectively. The
relationship
indicating onlythe
that exists
road between
sectionaonlyistoa btwo-way
or b to a,road
indicating
section. that
Take this road section is a3 single-lane 14road
Intersection–Link relationship exists between a to b or b toroad intersection
a, indicating in Figure
that this road section as an
is
section.
example. The Intersection–Link
Intersections 2 and 3relationship
have an exists between
Intersection–Link a to b and
relationship. b to a, indicating
Intersections 3that
and the
4 roadan
have
a single-lane road section. The Intersection–Link relationship exists between a to b and b to a,
section is a two-way road section.
Intersection–Link Takesame roadtime,
intersection is3insufficient
in Figure 14 as an example. Intersections
indicating that therelationship
road sectionatisthe a two-way which
road section. Take roadtointersection
indicate that
3 inthere is a 14
Figure turning
as an
2 relationship
and 3 have betweenan Intersection–Link
the relationship. Intersections 3 and 4 have an Intersection–Link
example. Intersections 2 two
and sections.
3 have anThus, the track segment
Intersection–Link in two Intersection–Link
relationship. Intersections 3 and relationships
4 have an
relationship
needs to beatchecked.
the same time,
If the which
track is insufficient intersections
to indicate that there is a turning relationship
Intersection–Link relationship atsegment
the samebetween
time, which is insufficient 2 and to3indicate
belongs to the
that theresameis atrack as
turning
between
the track the two sections.
segment between Thus, the track 3segment
intersections and 4, in two
and the Intersection–Link
vehicle's time from relationships
intersection 2needs
to 4 to
does
relationship between the two sections. Thus, the track segment in two Intersection–Link relationships
benot
checked.
exceed If the track segment then between intersections 2 and 3 belongs is ato3turning
the same track as the track
needs to beachecked.
certain threshold,
If the track it can
segment be determined
between that
intersections there
2 and belongsrelationship
to the samebetween track as
segment
section between
A and intersections
section D at 3 and
intersection 4, and
3. the
It is vehicle’s
worth time
mentioning from intersection
that the number2 to 4
of does not exceed
intersections that
the track segment between intersections 3 and 4, and the vehicle's time from intersection 2 to 4 does
a certain
can threshold, then it can be determined that there is a turning relationship between section A
not be usedafor
exceed turning
certain at road intersections
threshold, then it can beisdetermined
often morethat thanthere
thoseisthat cannotrelationship
a turning be used for betweenturning,
and
so section D
the selection at intersection
of storage 3. It is worth
turning restriction mentioning that
relationships the number of intersections that can be
section A and section D at intersection 3. It is worth mentioningcan thateffectively
the number improve the storage
of intersections that
used for turning at road intersections is often more than those that cannot be used for turning, so the
efficiency.
can be used for turning at road intersections is often more than those that cannot be used for turning,
selection of storage turning restriction relationships can effectively improve the storage efficiency.
so the selection of storage turning restriction relationships can effectively improve the storage
efficiency.

Figure 14. An example of a road network model based on node-edge representation.


Figure 14. An example of a road network model based on node-edge representation.
5. Experiments and Analysis

5. Experiments
5.1. andand
Study AreaFigure Analysis
Data
14. Setsexample of a road network model based on node-edge representation.
An

Real-world
5.1. Study Area vehicle
and trajectory data for seven days from May to June 2014 in the Wuhan city and
5. Experiments andData Sets
Analysis
Chicago Campus Bus Datasets were used to test the feasibility and effectiveness of the proposed
5.1. Study Area and Data Sets
ISPRS Int. J. Geo-Inf. 2019, 8, 473 15 of 26
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 15 of 27

intersection extraction method and road construction method. Both of them include the following
main fields: vehicle ID; time; longitude; latitude; speed (m/s); and heading (degree), which is relative
to north.
Wuhan dataset covers a region of 4.2 km × 2.8 km, containing about 2000 taxi trajectories and
nearly 800,000 trajectory points. The average sampling interval of the dataset is more than 40 seconds.
The experimental area is an old central urban area near Hankou in Wuhan city, and the track points are
unevenly distributed on different volume roads with significant noise, as shown in Figure 15a. Part of
the dataset can be shown in Figure 15c. The Chicago dataset, which has been widely used in related
work [33] and is available on the website [34], is relatively clean and contains 118,364 GPS points and
889 trajectories within a region of 3.88 km × 2.4 km at the University of Illinois in Chicago, as shown in
Figure 15b.

(a) (b)

(c)

Figure 15. Trajectory datasets: (a) GPS trajectory dataset for Wuhan; (b) GPS trajectory dataset for
Chicago; (c) trajectory text data example for Wuhan.
Figure 15. Trajectory datasets: (a) GPS trajectory dataset for Wuhan; (b) GPS trajectory dataset for
5.2. Experimental
Chicago; (c) Results
trajectoryoftext
thedata
Proposed Method
example for Wuhan.

Based on the approach described above, we developed a set of programs to generate intersections
5.2. Experimental Results of the Proposed Method
and road networks from vehicle GPS trajectories using Python and C# via the ArcEngine platform.
The experimental parameters involved in this experiment include the size of the structural elements
S during morphological operations, kernel density threshold K, clustering distance threshold D, fusing
threshold R1, circular buffer radius R2, and the threshold ∆ of eliminating false intersection points.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 16 of 26

In order to extract more intersection points, the close operation uses larger structural elements, the open
operation uses smaller structural elements, and ∆ needs to be set to the larger value. The kernel density
threshold can be set according to the quantile classification method. For the Wuhan dataset, K was set
to 25% of the maximum kernel density. For the Chicago dataset, K was set to 100% of the maximum
kernel density. Since the distribution of the Chicago dataset is relatively even and the dataset has few
noises, the threshold of K is reasonable for the Chicago dataset, which also indicates the feasibility of
the quantile classification method. Clustering distance threshold D and fusing threshold R1 are related
to the spacing of road intersections at all levels, so they can be set according to the minimum spacing
planning for road intersections in relative research areas. However, in a situation in which the distance
δqi of many points clusters under a certain distance, as shown in Figure 8, the clustering distance
threshold needs to be set as the corresponding distance for more extraction results. The circular buffer
radius R2 was empirically set to 100. The parameter settings for the experiment are shown in Table 2.

Table 2. The parameters settings for experimental results.

S
2.5 Resolution 5 Resolution
Datasets K D R1 R2 ∆
Close Open Open Open
Operation Operation Operation Operation
Wuhan 10 4 4 3 0.25 20 80 100 0.96
Chicago 4 3 3 3 1 75 80 100 0.9

5.2.1. Intersection Detecting Results


The extraction results of the road intersections in Chicago and Wuhan are presented in Figure 16a,b.
After PCA analysis and the elimination of false intersections from the undetermined results, a total
of 117 and 51 intersections were detected from the Wuhan dataset and Chicago dataset, respectively.
Overlaying the final extraction intersections, the remote sensing image and 2.5 m raster track data
revealed a good fit with the map, as shown in Figure 16. Compared with the morphology method
alone, the fusion method not only ensures the integrity of the results but also confirms the correction of
the extraction results. Greater than 90% of the detected results are true intersection points, indicating
that our methods can correctly distinguish a variety of road intersections from other roadways.
However, there are also some road intersections not being detected due to having insufficient GPS
data. As shown in Figure 16, there are non-existent, or few taxi tracks passing residential streets, living
streets, and pedestrian streets, so some of these types of road intersections cannot be extracted. Though
our method complements some kind of these intersections, it’s still a challenging task to recognize all
intersections only using taxi tracks. This issue could be solved by considering walking trajectory data.
Finally, due to the uneven distribution of primary and secondary roads in some areas, a few false
intersections could not be removed by the PCA method. Meanwhile, the PCA method, which is used
to determine false intersection points according to their linear characteristics, can not apply to the
turning area.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 17 of 26
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 17 of 27

(a)

Legend
!( Correctly detection
D Incorrectly detection

(b)

Figure
Figure 16.16.
TheThe finalintersection
final intersectionextraction
extraction results:
results: (a)
(a) results
resultsofofthe
theintersection
intersectiondetection
detectionin in
Wuhan;
Wuhan;
(b)(b) results
results of of
thethe intersectiondetection
intersection detectionin
inChicago.
Chicago.

5.2.2. Road Network


However, thereDetection Results
are also some road intersections not being detected due to having insufficient
GPS data. As shown in Figure 16, there are non-existent, or few taxi tracks passing residential streets,
Due to the serious absence of the road information for 2.5 m resolution thinning images,
living streets, and pedestrian streets, so some of these types of road intersections cannot be extracted.
we vectorize and simplify the 5 m resolution thinning image to get the vector road center-lines
Though our method complements some kind of these intersections, it’s still a challenging task to
by using the Douglas–Peucker algorithm. The results, according to road reprocessing, are shown in
Figure 17. Almost all the roads that taxi tracks pass through can be extracted. The roads that are
ISPRS Int. J. Geo-Inf. 2019, 8, 473 18 of 26

residential streets, living streets, and pedestrian streets with no taxi tracks also could be extracted by
fusing the walking trajectory data. We must mention that some extracted road segments are seriously
distorted simply because the road segments are too close to each other or do not have enough trajectory
points to form the correct road shape. These are all difficult problems for current research.

ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 19 of 27


(a)

Legend
Road extraction results

(b)

Figure The
17.17.
Figure Thefinal
finalroad
roadnetwork
networkextraction
extraction results: (a) results
results: (a) resultsofofroad
roadnetwork
networkdetection
detectionin in Wuhan;
Wuhan;
(b)(b)
results of of
results road
roadnetwork
networkdetection
detectionininChicago.
Chicago.

5.2.3. Single/Double Direction Information and Turning Relationship Detection Results


The Intersection–Link relationship can be used to extract the single/double direction information
and turning relationship of the road network. Taking intersection 69, extracted from the Wuhan
dataset, as an example (shown in Figure 18), the result of this relationship is shown in Table 3 and
Table 4.
(b)

ISPRSFigure 17. The


Int. J. Geo-Inf. final
2019, road network extraction results: (a) results of road network detection in Wuhan;
8, 473 19 of 26
(b) results of road network detection in Chicago.

Single/Double Direction Information and Turning


5.2.3. Single/Double Turning Relationship
Relationship Detection
Detection Results
Results
The Intersection–Link relationship can be used to extract
extract the
the single/double
single/double direction information
and turning
turningrelationship
relationshipofof
thethe
road network.
road Taking
network. intersection
Taking 69, extracted
intersection from the
69, extracted Wuhan
from dataset,
the Wuhan
as an example
dataset, (shown in
as an example Figurein
(shown 18), the result
Figure of this
18), the relationship
result is shown inisTables
of this relationship shown3 in
and 4. 3 and
Table
Table 4.

62

65

69

72

75

Figure 18. Intersection 69.


Figure 18.
Table 3. The single/double direction decision.
Table 3. The single/double direction decision.
The Single/Double
Paths Whether
Whether
oneOne
can Can Pass Paths
Whether Whether OneThe
one Cansingle/double
Pass direction
Paths Paths Direction Decision
pass can pass decision
65–69 Yes 69–65 No Single
65–69
62–69 Yes No 69–65 69–62No Yes Single Single
62–69
72–69 No No 69–62 69–72Yes Yes Single Single
75–69
72–69 No Yes 69–72 69–75Yes NO Single Single
75–69 Yes 69–75 NO Single
Table 4. The turning results of intersection 69.
Table 4. The turning results of intersection 69.
Paths Paths Whether One Can Pass Whether One Can Pass in Reality
65–69 69–62 Yes Yes
65–69 69–72 Yes Yes
65–69 69–75 NO NO
62–69 69–65 NO NO
62–69 69–72 NO NO
62–69 69–75 NO NO
72–69 69–62 NO NO
72–69 69–65 NO NO
72–69 69–75 NO NO
75–69 69–62 Yes Yes
75–69 69–65 NO NO
75–69 69–72 Yes Yes

According to the Intersection–Link model, we can determine which road sections and road
intersections the track passed and then extract the single/double direction information, as well
as the turning relationship, of the road. Where the viaduct passes, it may form an area with a
pseudo-intersection, which can also be recognized by the Intersection–Link model.

5.3. Results Evaluation and Analysis


In this section, we compare our proposed method with the aforementioned method proposed by
Ahmed [15] and Davies [20]. For Ahmed, the values of ε to the cluster sub trajectories are 140 and 80 m
ISPRS Int. J. Geo-Inf. 2019, 8, 473 20 of 26

for Wuhan and Chicago, respectively. The parameter of the proximity for Davies’ algorithms is 17 m.
Directly using the Wuhan dataset to extract the road network via the methods of Ahmed and Davies
cannot achieve appropriate results for low-frequency data. Therefore, in this experiment, the Delaunay
triangulation method [35] was used for the Wuhan dataset preprocessing, and then we extracted the
trajectory points within 60 s and deleted the trajectory whose number of track points fewer than two.

5.3.1. Visual Inspection


Figure 19 shows the detection results from Ahmed’s method and Davies’ method. Compared
with the extraction results from the proposed method shown in Figures 16 and 17, these two methods
did not extract more road intersections and skeleton information, and a lot of results deviated from
their correct positions. For the Chicago dataset, although Ahmed’s method extracted more skeleton
information, it produced a large number of offset and messy outputs. The visualization clearly shows
that the proposed method is better than the other two methods.

5.3.2. Quantitative Comparisons


We also use precision, recall, and the F-value to evaluate our method. Accuracy comparison for
detecting road intersections can be shown in Table 5. The proposed method achieved a significantly
higher precision value and F-value than the method of Ahmed and Davies, which demonstrates
superior performance for detecting road intersections. Three indicators were computed utilizing
Equation (5):
correctly detected
Precision = (5a)
correctly detected + incorrectly detected
correctly detected
Recall = (5b)
correctly detected + not detected
2 ∗ precision ∗ recall
F − value = (5c)
precision + recall

Table 5. Accuracy comparison for detecting road intersections.

dataset Methods Precision Recall F-value


Proposed 50/51 = 98.04% 50/71 = 70.42% 81.97%
Chicago Ahmed 40/52 = 76.92% 40/71 = 56.34% 65.04%
Davies 19/19 = 100% 19/71 = 26.76% 42.22%
Proposed 109/117 = 93.16% 109/123 = 88.62% 90.84%
Wuhan Ahmed 37/50 = 74% 37/123 = 30.08% 42.77%
Davies 69/81 = 85.19% 69/123 = 56.10% 67.65%

According to the visual inspections and quantitative comparisons, we can see that the proposed
method extracted more intersections and has considerable consistency with the remote sensing image.
At the same time, the road extraction results also show more accurate and richer information, especially
for the Wuhan dataset, which contains low-frequency data. However, based on the results in Table 4,
there was also about a 2% and 7% chance of incorrectly identifying road intersections, and about
30% and 20% of the road intersections were not identified for the Chicago dataset and the Wuhan
dataset, respectively.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 21 of 26
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 21 of 27

(a)

Legend
!( Correctly detection
D Incorrectly detection
Road extraction results

(b)

Figure 19. Cont.


ISPRS Int. J. Geo-Inf. 2019, 8, 473 22 of 26
ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 22 of 27

Legend
!( Correctly detection
D Incorrectly detection
Road extraction results

(c)

Legend
!( Correctly detection
D Incorrectly detection
Road extraction results

(d)

Figure 19. The


Figure detection
19. The results
detection using
results usingDavies’
Davies’ method andAhmed’s
method and Ahmed’s method:
method: (a) (a) Davies’
Davies’ detection
detection
results
results in Wuhan;
in Wuhan; (b) (b) Ahmed’s
Ahmed’s detectionresults
detection results in
in Wuhan;
Wuhan;(c)(c)Davies’
Davies’detection results
detection in Chicago;
results in Chicago;
(d) Ahmed’s
(d) Ahmed’s detection
detection results
results in in Chicago.
Chicago.

5.3.2.further
For Quantitative Comparisons
observation of the extraction results, we overlay the final extraction results, Open-street
Map (theWe blue line is the urban traffic
also use precision, recall, androads, the yellow
the F-value line our
to evaluate is the urbanAccuracy
method. living streets), 2.5 m
comparison forraster
trackdetecting
data androad
remote sensing image,
intersections as shown
can be shown in Figure
in Table 5. The20. The incorrectly
proposed identified
method achieved intersections are
a significantly
higher
mainly precision on
concentrated value and F-value
elevated than the method
road entrances and exits,of Ahmed
sharply andcurvedDavies, which
points, demonstrates
points close to living
superior
streets, performance
and points for where
of interest detectingcarsroad
are intersections.
parked for a Three indicators
long period. Wewere computedrandom
can consider utilizingforest
Equation (5):
and other technologies to further improve the accuracy of eliminating pseudo intersections. In fact,
these special false intersections are also =important𝑐𝑜𝑟𝑟𝑒𝑐𝑡𝑙𝑦
Precision
𝑑𝑒𝑡𝑒𝑐𝑡𝑒𝑑
for navigation. We can consider distinguishing (5a) these
𝑐𝑜𝑟𝑟𝑒𝑐𝑡𝑙𝑦 𝑑𝑒𝑡𝑒𝑐𝑡𝑒𝑑 + 𝑖𝑛𝑐𝑜𝑟𝑟𝑒𝑐𝑡𝑙𝑦 𝑑𝑒𝑡𝑒𝑐𝑡𝑒𝑑
points in future work and provide a reminder service for the navigation user. Thus, further analyses
and improvements are still needed.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 23 of 26

(a)

ISPRS Int. J. Geo-Inf. 2019, 8, x FOR PEER REVIEW 24 of 27

(b)

(c) (d)
Figure 20.
Figure The situation
20. The situation of
of incorrectly
incorrectly identifying
identifying road
road intersections
intersections (the
(the blue
blue line
line is
is the
the urban
urban traffic
traffic
roads and the yellow line is the urban living streets): (a) elevated road entrances and exits; (b)
roads and the yellow line is the urban living streets): (a) elevated road entrances and exits; (b) sharply sharply
curved points;
curved points; (c)
(c) points
points close
close to
to the
the living
living streets;
streets; (d)
(d) points
points of
of interests.
interests.

Based on
Based on thetheresults
resultsshown
shownininFigure
Figure16,16,
thethe road
road intersections,
intersections, which
which werewere
not not identified,
identified, are
are mainly the intersections of the residential streets, living streets, and pedestrian streets
mainly the intersections of the residential streets, living streets, and pedestrian streets that have no that have
no (or
(or few)
few) taxitaxi tracks.
tracks. Although
Although ourour method
method complements
complements someofofthese
some theseintersections,
intersections,ititisis still
still aa
challenging task to recognize them all only using taxi tracks. This issue can be solved
challenging task to recognize them all only using taxi tracks. This issue can be solved by consideringby considering
walking
walking trajectory
trajectory data.
data.
There also
There also remains
remains aa small
small amount
amount of of road
road skeleton
skeleton information that was
information that was not
not extracted
extracted forfor the
the
Chicago dataset and Wuhan dataset, which can be fitted via Multivariate Adaptive
Chicago dataset and Wuhan dataset, which can be fitted via Multivariate Adaptive Regression Regression Splines
(MARS) according
Splines (MARS) to the intersection
according results and
to the intersection the Intersection–Link
results model, asmodel,
and the Intersection–Link shown asin Figure
shown 21. in
Figure 21.
(or few) taxi tracks. Although our method complements some of these intersections, it is still a
challenging task to recognize them all only using taxi tracks. This issue can be solved by considering
walking trajectory data.
There also remains a small amount of road skeleton information that was not extracted for the
Chicago
ISPRS Int. J.dataset and8,Wuhan
Geo-Inf. 2019, 473 dataset, which can be fitted via Multivariate Adaptive Regression
24 of 26
Splines (MARS) according to the intersection results and the Intersection–Link model, as shown in
Figure 21.

(a) (b)
Figure 21. Fitting the road segment-based Multivariate Adaptive Regression Splines (MARS): (a) track
points
Figure belonging
21. Fitting to
thetwo intersections;
road segment-based(b) the fitted result.
Multivariate Adaptive Regression Splines (MARS): (a) track
points belonging to two intersections; (b) the fitted result.
In addition, time-cost is an important factor to be considered in algorithm application. Since the
overhead of the time-cost
In addition, CFDP algorithm is O(n3 ),factor
is an important our method has a larger
to be considered time costapplication.
in algorithm compared Since
with the
the
overhead of the CFDP algorithm is O(n ), our method has a larger time cost compared withroad
methods of Ahmed and Davies. It took about
3 five and seven hours longer than them to extract the
information
methods based on
of Ahmed theDavies.
and WuhanItand tookChicago datasets,
about five respectively,
and seven even though
hours longer than themwe had considered
to extract road
the kernel density estimation to smooth the data and reduce the amount of data
information based on the Wuhan and Chicago datasets, respectively, even though we had consideredcalculation. Despite
this,kernel
the compared
densitywith some incremental
estimation to smooth the methods (such
data and as Cao’s
reduce methodof[14])
the amount dataand neural network
calculation. Despite
methods
this, (such as
compared Fathi’s
with somemethod
incremental[30]), methods
the overhead
(suchofasour method
Cao’s method is not
[14])soandbad,neural
because these
network
methods may take dozens of days. Parallel computing will be considered to solve
methods (such as Fathi’s method [30]), the overhead of our method is not so bad, because these this problem in
the future.

6. Conclusions
Crowd-sourced Vehicle Trajectories have the disadvantage of lower accuracy, lower sampling
frequency, more noise, and uneven distribution, which makes it more difficult to determine a road
network than in most existing approaches that use high precision and high-frequency GPS trajectories.
Thus, because these problems cannot be solved by a single method, we focused instead on the
identification of urban road network intersections, proposing an integrated strategy that considers
both dense and sparse areas, in order to extract intersections using density peak clustering and
mathematical morphological processing in vector space and raster space, respectively. Compared with
the traditional detection algorithms for road intersections, our methods have higher detection accuracy
and integrity, as well as good application value and practical significance for quickly building road
networks and acquiring more fine digital information. At the same time, based on the extracted road
intersection information, this paper improves the morphology road network extraction results and
makes them more closely resemble the real road. The proposed Intersection–Link model can also detect
the single/double direction information of road sections and extract the turning rules of intersections.
In summary, the proposed method represents a considerable advancement in generating a
navigable digital road network. However, improvements are still required to make the method
better, such as considering the random forest algorithm and other technologies to further improve the
accuracy of eliminating pseudo intersections and optimizing the extraction algorithm and calculation
process to achieve real-time application. Moreover, our method only focused on the location of urban
road intersections and urban road skeleton information, without considering more road semantic
information (such as road congestion information, road grades, driving time, road slope, and aspect
information) and detailed road geometry information (such as the shape of complex interchanges, road
lanes, sidewalk network). In the future, we will consider elevation data, Point of Interest (POI) data,
and remote sensing data together with all kinds of crowd-sourced GPS traces to mine more significant
road information, which will involve not only the urban road network but also the rural road network.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 25 of 26

Author Contributions: Conceptualization, Caili Zhang, Longgang Xiang and Siyu Li; Data curation, Caili Zhang,
Longgang Xiang and Siyu Li; formal analysis, Caili Zhang; funding acquisition, Longgang Xiang; investigation,
Siyu Li; methodology, Caili Zhang, Longgang Xiang and Dehao Wang; project administration, Longgang Xiang;
resources, Longgang Xiang; software, Caili Zhang; supervision, Longgang Xiang; validation, Caili Zhang;
visualization, Caili Zhang; writing—original draft, Caili Zhang; writing—review and editing, Caili Zhang,
Longgang Xiang and Siyu Li.
Funding: This paper was funded by the National Natural Science Foundation of China under grant No. 41771474.
Acknowledgments: We express our thanks to the anonymous reviewer for constructive comments.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Chiang, Y.Y.; Knoblock, C.A. Automatic extraction of road intersection position, connectivity, and orientations
from raster maps. In Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in
Geographic Information Systems, Irvine, CA, USA, 5–7 November 2008.
2. Fu, G. Road Extraction Method Using Multi-Source Remote Sensing Data; Tsinghua University: Beijing,
China, 2014.
3. Li, R.; Si, Y.; Ma, D.; Qu, Y. Extraction method of high-resolution image of road intersections based on
semantic rules. J. Surv. Mapp. Sci. Technol. 2017, 34, 168–174.
4. Hu, X.; Li, Y.; Shan, J.; Zhang, J.; Zhang, Y. Road Centerline Extraction in Complex Urban Scenes from LiDAR
Data Based on Multiple Features. IEEE Trans. Geosci. Remote Sens. 2014, 52, 7448–7456.
5. Oloo, F. Mapping Rural Road Networks from Global Positioning System (GPS) Trajectories of Motorcycle
Taxis in Sigomre Area, Siaya County, Kenya. ISPRS Int. J. Geo-Inf. 2018, 7, 309. [CrossRef]
6. Ekpenyong, F.; Palmerbrown, D.; Brimicombe, A. Extracting road information from recorded GPS data using
snap-drift neural network. Neurocomputing 2009, 73, 24–36. [CrossRef]
7. Mobasheri, A.; Huang, H.; Degrossi, L.; Zipf, A. Enrichment of OpenStreetMap Data Completeness with
Sidewalk Geometries Using Data Mining Techniques. Sensors 2018, 18, 509. [CrossRef]
8. Amin, M. A Rule-Based Spatial Reasoning Approach for OpenStreetMap Data Quality Enrichment; Case Study
of Routing and Navigation. Sensors 2017, 17, 2498.
9. Wei, Y.; Tinghua, A.; Wei, L. A Method for Extracting Road Boundary Information from Crowdsourcing
Vehicle GPS Trajectories. Sensors 2018, 18, 1261.
10. John, S.; Hahmann, S.; Rousell, A.; Löwner, M.O.; Zipf, A. Deriving incline values for street networks from
voluntarily collected GPS traces. Cartogr. Geogr. Inf. Sci. 2017, 44, 152–169. [CrossRef]
11. Li, J.; Qin, Q.; You, L.; Xie, C. Parking lot extraction method based on floating car data. Geomat. Inf. Sci.
Wuhan Univ. 2013, 38, 599–603.
12. Tang, L.; Kan, Z.; Zhang, X.; Yang, X.; Huang, F.; Li, Q. Travel time estimation at intersections based on
low-frequency spatial-temporal GPS trajectory big data. Cartogr. Geogr. Inf. Sci. 2016, 43, 417–426. [CrossRef]
13. Winden, K.V.; Biljecki, F.; Spek, S.V.D. Automatic Update of Road Attributes by Mining GPS Tracks. Trans.
GIS 2016, 20, 664–683. [CrossRef]
14. Cao, L.; Krumm, J. From GPS traces to a routable road map. In Proceedings of the 17th ACM SIGSPATIAL
International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 4–6 November
2009.
15. Ahmed, M.; Wenk, C. Constructing street networks from GPS trajectories. In Proceedings of the 20th Annual
European Symposium on Algorithms, Ljubljana, Slovenia, 10–12 September 2012.
16. Tang, L.L.; Liu, Z.; Yang, X. Spatial-temporal trajectory fusion and road network generation method in line
with cognitive rules. Acta Surv. Mapp. 2015, 44, 1271–1276.
17. Edelkamp, S.; Schrödl, S. Route Planning and Map Inference with Global Positioning Traces. In Computer
Science in Perspective, Essays Dedicated to Thomas Ottmann; Springer: Berlin/Heidelberg, Germany, 2003.
18. Bruntrup, R.; Edelkamp, S.; Jabbar, S.; Scholz, B. Incremental map generation with GPS traces. In Proceedings
of the 9th IEEE International Conference on Intelligent Transportation Systems, Las Vegas, NV, USA,
6–8 June 2005.
19. Liao, L.C.; Jiang, X.H.; Zou, F.M.; Li, L.M.; Lai, H.T. Directed density method for trajectory data clustering of
floating vehicles. J. Earth Inf. Sci. 2015, 17, 1152–1161.
ISPRS Int. J. Geo-Inf. 2019, 8, 473 26 of 26

20. Davies, J.J.; Beresford, A.R.; Hopper, A. Scalable, Distributed, Real-Time Map Generation. IEEE Pervasive
Comput. 2006, 5, 47–54. [CrossRef]
21. Biagioni, J.; Eriksson, J. Map inference in the face of noise and disparity. In Proceedings of the International
Conference on Advances in Geographic Information Systems, Redondo Beach, Los Angeles County, CA,
USA, 6–9 November 2012.
22. Mariescu-Istodor, R.; Fränti, P. CellNet: Inferring Road Networks from GPS Trajectories. ACM Trans. Spat.
Algorithms Syst. 2018, 4, 8. [CrossRef]
23. Xie, X.; Liao, W.; Aghajan, H.; Veelaert, P.; Philips, W. Detecting road intersections from GPS traces using
longest common subsequence algorithm. ISPRS Int. J. Geo-Inf. 2017, 6, 1. [CrossRef]
24. Karagiorgou, S.; Pfoser, D.; Skoutas, D. Segmentation-based road network construction. In Proceedings
of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems,
Orlando, FL, USA, 5–8 November 2013.
25. Xie, X.; Bingyungwong, K.; Aghajan, H.; Veelaert, P.; Philips, W. Inferring directed road networks from GPS
traces by track alignment. ISPRS Int. J. Geo-Inf. 2015, 4, 2446–2471. [CrossRef]
26. Tang, L.; Niu, L.; Yang, X.; Zhang, X.; Li, Q.; Xiao, S. Recognition and Structural Extraction of Urban Road
Intersection Using Large Trajectory Data. Acta Geod. Cartogr. Sin. 2017, 46, 770–779.
27. Wang, J.; Wang, C.; Song, X.; Raghavan, V. Automatic intersection and traffic rule detection by mining
motor-vehicle GPS trajectories. Comput. Environ. Urban Syst. 2017, 64, 19–29. [CrossRef]
28. Wang, J.; Rui, X.; Song, X.; Tan, X.; Wang, C.; Raghavan, V. A novel approach for generating routable road
maps from vehicle GPS traces. Int. J. Geogr. Inf. Sci. 2014, 29, 69–91. [CrossRef]
29. Deng, M.; Huang, J.; Zhang, Y.; Liu, H.; Tang, L.; Tang, J.; Yang, X. Generating urban road intersection models
from low-frequency GPS trajectory data. Int. J. Geogr. Inf. Sci. 2018, 32, 2337–2361. [CrossRef]
30. Fathi, A.; Krumm, J. Detecting road intersections from GPS traces. In Proceedings of the 6th International
Conference on Geographic Information Science, Zurich, Switzerland, 14–17 September 2010; pp. 56–69.
31. Rodriguez, A.; Laio, A. Clustering by fast search and find of density peaks. Science 2014, 344, 1492. [CrossRef]
[PubMed]
32. Douglas, D.H.; Peucker, T.K. Algorithms for the Reduction of the Number of Points Required to Represent a
Digitized Line or Its Caricature. Cartogr. Int. J. Geogr. Inf. Geovis. 1973, 10, 112–122. [CrossRef]
33. Ahmed, M.; Karagiorgou, S.; Pfoser, D.; Wenk, C. A comparison and evaluation of map construction
algorithms using vehicle tracking data. GeoInformatica 2015, 19, 601–632. [CrossRef]
34. Mapconstruction. Available online: https://pfoser.github.io/mapconstruction/ (accessed on 22 July 2018).
35. Tang, L.; Yang, X.; Kan, Z.; Wang, X.; Li, Q.; Shaw, S. Traffic Lane Numbers Detection Based on the Naive
Bayesian Classification. China J. Highw. Transp. 2016, 29, 116–123.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).

You might also like