
2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation

Urban Micro-circulation Bus Planning Based on Temporal and Spatial Travel Demand
Bowen Du†, Yanan Qiao†, Jiejie Zhao†, Leilei Sun†∗, Weifeng Lv†, Runhe Huang§

†SKLSDE Lab and BDBC, Beihang University, Beijing, P.R. China.
§Computer and Information Sciences, Hosei University, Tokyo, Japan.
Email: {dubowen, qiaoyanan, zjj, lwf}@buaa.edu.cn, leileisunr@gmail.com, rhuang@hosei.ac.jp

Abstract: Bike sharing systems improve public transport by providing first- and last-mile connections for citizens and therefore bring great convenience to modern urban life. However, the rapid growth of bike sharing systems, particularly the dockless ones, has led to a series of new problems, such as destructive competition, waste of resources, and disorderly parking. To this end, this paper develops a new type of public transport service, namely the micro-circulation bus, based on spatiotemporal analysis of multimode pick-up and drop-off demands. In particular, we first identify the stations of the micro-circulation bus by clustering large-scale pick-up and drop-off points, and then construct a transition network according to spatiotemporal Origin-Destination (OD) analysis. Finally, an iterative greedy algorithm is proposed to plan the routes of micro-circulation buses by gradually reducing the transportation burden on the constructed transition network. Experiments conducted on real-world data demonstrate the effectiveness of the proposed methods and reveal the potential of micro-circulation buses as a first- and last-mile transportation service.

Keywords: public transport, sharing economy, spatiotemporal analysis, micro-circulation bus

I. INTRODUCTION

Public transport (e.g., bus and subway) is the most important transport mode in many cities around the world because it not only offers a safe, affordable, and convenient way to travel within a city but also saves fuel and reduces congestion and carbon emissions [1]. For instance, in Beijing the proportion of green trips reached 73% and the rail transit mileage exceeded 630 kilometers in 2018¹. Better public transport planning can help reduce congestion and, more importantly, improve travel quality for passengers [2].

In many cities, public transport systems are usually well designed for long-distance travel around the city. For short-distance travel, however, most public transport systems offer no service, leaving bike sharing systems such as Ofo and Mobike as the only option for first- or last-mile trips; these systems have come to dominate short-distance travel. Although bike sharing brings great convenience to modern urban life by providing first- and last-mile connections for citizens, its rapid growth has also led to a series of new problems, such as destructive competition, waste of resources, and disorderly parking.

To solve this problem, the micro-circulation bus has recently been introduced as a new type of public transport service. Similar services, Mini Cab and Dollar Cab, have appeared in England and America, respectively [3]. All of them aim to provide a travel service for first- or last-mile connections. Compared with bike sharing systems, the micro-circulation bus provides a safer and more convenient way to travel, especially in bad weather. Figure 1 shows the micro-circulation bus planning scenario. Traditional bus planning methods rely on surveys of people's mobility patterns [4]. Despite the substantial time and overhead of the survey process, these methods do not provide detailed analyses of the spatiotemporal Origin-Destination (OD) pairs.

Fig. 1. A sketch of the micro-circulation bus planning scenario. The key issues in the scene are: 1) how to find the virtual stations for picking up passengers; 2) how to find the optimal routes to nearby subway stations.

With Global Positioning System (GPS) devices installed on taxis, online car-hailing vehicles, and bicycles, rich information about multiple transport modes can be collected and extracted, including where and when passengers are picked up or dropped off and which route a taxi takes for a given trip. The information provided by ODs helps to understand passengers' mobility patterns in a city at different times of day and with different transport modes, making it possible to accurately plan new micro-circulation bus routes that are expected to maximize the number of passengers along the routes and minimize the cost.

In this paper, we explore the data-driven micro-circulation bus planning problem by leveraging data from multiple transport modes, including bicycle trips, online car-hailing trips, and taxi GPS traces. This problem can be divided into three subproblems: bus station identification, transition network construction, and micro-circulation bus route planning. Addressing these problems is difficult because of the following challenges: 1) the pick-up and drop-off points of passengers in these modes are distributed across the whole city, but there is no clear guideline on where micro-circulation bus stations should be sited; 2) different transport modes are heterogeneous in terms of spatiotemporal distribution, so it is difficult to capture the integrated travel demand from multimode human mobility data and to model the OD in a holistic way; 3) to deliver more passengers, a micro-circulation bus route should pass through more stations and a single route is not sufficient, but this incurs more cost. Hence, a trade-off between cost and the number of passengers carried is needed.

Fig. 2. Framework overview.

To fill this research gap, we develop a new type of public transport service, namely the micro-circulation bus, according to spatiotemporal OD analysis of pick-up and drop-off demands. Figure 2 presents the framework of our approach. First, we identify the candidate stations by clustering the large-scale pick-up and drop-off points. Then, we construct a transition network according to spatiotemporal OD analysis. Finally, the micro-circulation bus routes are planned by an iterative greedy algorithm. Experiments are performed on real-world datasets to demonstrate the effectiveness of the approach.

Our contributions can be summarized as follows:
• A Density Peak Cluster (DPC) based method is proposed to identify the hot spots by clustering the large-scale multimode transport pick-up and drop-off demands.
• The transition network is constructed according to spatiotemporal OD analysis and can be utilized in route planning.
• An iterative greedy algorithm is designed to plan the routes of micro-circulation buses by gradually reducing the transportation burden on the constructed transition network.

The remainder of this paper is organized as follows. Section II summarizes the related literature. Section III presents the formalization of the studied problem and provides the preliminaries of this paper. The spatiotemporal demand analysis and the proposed iterative greedy algorithm are provided in Sections IV and V, respectively. Finally, the experiments are presented in Section VI, and the conclusion is drawn in Section VII.

∗ Corresponding Author
¹ http://www.sohu.com/a/289511442 745330

978-1-7281-4034-6/19/$31.00 ©2019 IEEE
DOI 10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00193
Authorized licensed use limited to: Pontificia Universidad Javeriana. Downloaded on September 26,2023 at 07:02:11 UTC from IEEE Xplore. Restrictions apply.

II. RELATED WORK

This section briefly reviews the related work, which can be grouped into two categories. The first concerns the micro-circulation bus, and the second concerns bus routing optimization in urban planning.

Micro-circulation Bus. There are several public transport modes around the world that are similar to the micro-circulation bus. For example, a public transport service called Mini Cab is provided in England, and Dollar Cab has appeared in America. Recent research mainly focuses on the discovery of travel demands and the optimization of bus routes and departure intervals. Szeto et al. [5] proposed several constraints and regarded user costs, such as the number of transfers and the total travel time, as the objective function. Almasi et al. [6] used meta-heuristic algorithms to optimize transit services and coordinated schedules for feeder buses. Ciaffi et al. [7] simultaneously generated routes and frequencies of the feeder bus network in a real-size large urban area. They combined a heuristic algorithm and a shortest-path algorithm with a genetic algorithm. Ge et al. [8] developed a double circulation method to obtain the required stop set using an optimal model, given the threshold of Average Walking Distance. In general, research on the micro-circulation bus has mainly used parametric methods from the traffic field, and route planning has been confined to genetic algorithms, ant colony algorithms, etc.

Bus Routing Optimization. Bus network design is an intensely studied area in the urban planning and transportation field, and it is known to be a complex, non-linear, non-convex, multi-objective NP-hard problem [9], [10], [11]. Bus route optimization typically proposes an objective function with constraints. Widely used objectives include shortest distance, shortest travel time, lowest cost, and maximum service quality, with constraints such as time, capacity, and resources [12]. In fact, the selection of objectives often considers user demands, which are often conflicting, leading to a trade-off decision.

To solve this problem, various models have been proposed, including cost-oriented models, passenger-oriented models, game-theoretic models, and location-based models [13]. Traditionally, passenger flows and demands are given by user surveys or population estimation. However, with the wide deployment of Automatic Fare Collection (AFC) systems on bus networks and GPS devices on taxis and shared bicycles, more and more studies have tried to analyze travel patterns and to understand human mobility [14], [15]. In this way, using a data-driven method for bus route planning or urban planning has become a trend [16], [17], [11], [18]. For instance, Liu et al. [18] exploited heterogeneous human mobility patterns for intelligent bus routing, and Chen [11] addressed the night-bus route design problem by leveraging taxi passenger OD flow data. Different from all of them, we integrate the real mobility demand for short-term travel based on different transport modes.

III. PROBLEM DEFINITION

In this section, we first describe the data and definitions used in this paper and then present the formulation of the micro-circulation bus planning problem.

A. Data Description

Discovering the real demand in the researched area is the premise for designing a proper micro-circulation bus route. A bus route consists of several stations; therefore, only when the demands are identified accurately can a bus route be most profitable. The pick-ups and drop-offs of taxis, online car-hailing, and shared bicycles reflect, to some extent, the demand for short trips and reveal where the bus is inconvenient. We use one week of data, from March 23rd to 30th, 2018, from the three datasets.
• Taxi Data: Taxi trajectory data contain not only the pick-ups and drop-offs of passengers but also the trajectories of taxis on the road, from which we can design the route between two stations. Over 7.8 million trips are extracted in our experiments.
• Online Car-hailing: Recently, passengers have come to prefer online services when traveling. Compared with taxis, passengers have more freedom in choosing where to depart, which better reflects their real travel demand. About 5.8 million trips are extracted.
• Shared Bicycles: Shared bicycles are often used for short trips, especially when transferring to a subway station. The micro-circulation bus is planned mainly to replace these trips. We extract about 14 million trips in Beijing for analyzing the mobility patterns.

These transportation data contain different travel information. As shown in Fig. 3, there is a distinct difference among the mobility patterns of the three transport modes. The mobility patterns also vary significantly over different days and different time periods of a day.

Fig. 3. The origin points distribution of Wangjing, Beijing: (a) bicycle origins; (b) taxi origins; (c) car-hailing origins.

B. Definitions

To understand the micro-circulation bus route, we formally provide the following definitions.

Definition 1: Micro-circulation Bus Route is a fast and short bus route with a length of 3 to 10 km, between 6 and 20 stations, and a distance of 200 to 1700 m between adjacent stations. It is used to address frequent local traffic demand and to connect residences with subway stations.

Definition 2: Trip. A trip describes a passenger's origin and destination. It can be defined as

$$trip = ((o.t, o.lon, o.lat), (d.t, d.lon, d.lat)), \tag{1}$$

where $o$ and $d$ represent the origin and destination points, and $t$, $lon$, and $lat$ are respectively the timestamp, longitude, and latitude of the point. The set of all trips is denoted as $T$.

Definition 3: Candidate Station. In order to plan the bus route, the candidate stations are prepared first. A bus station must satisfy two criteria: 1) there is enough demand around the bus station; 2) it is the most convenient position for all passengers around the station. With station-free transportation such as taxis or shared bicycles, passengers may have arbitrary departure and arrival locations. In this paper, we aggregate the scattered locations using a DPC-based method, and cluster centers are used to represent demand centers. We regard them as candidate stations for route planning. The set of all candidate stations is denoted as $S$.

Definition 4: Station-to-Station Transition. Given a trip $trip$, the origin and destination can be mapped to identified candidate stations. The station-to-station transition can be denoted as

$$tr = ((o.t, o.s), (d.t, d.s)), \tag{2}$$

where $t$ denotes the timestamp and $s$ represents the corresponding candidate station.

Definition 5: Transition Flow. Let $(o_i, d_j)$ represent a trip from station $s_i$ to $s_j$, where $i, j \in \{1, 2, \ldots, N\}$. The transition flow $W_{i,j}$ is defined as the number of trips from $s_i$ to $s_j$:

$$W_{i,j} = |\{(o_i, d_j) : o_i = s_i \wedge d_j = s_j\}|. \tag{3}$$

Definition 6: Station Demand. We use station demand as station attraction, which is defined as

$$d_i = |\{trip : trip \in T \wedge trip.o.s = i\}|, \tag{4}$$

where $d_i$ represents the demand of station $i$ and $T$ is the set of all trips. The demand of a station is the number of trips departing from the station.

C. Problem Formulation

This paper aims to plan $n$ micro-circulation bus routes to reduce the traffic burden. A route $r$ is defined as:

$$r = (\langle s, s_1 \rangle, \langle s_1, s_2 \rangle, \ldots, \langle s_{n-1}, s_n \rangle), \tag{5}$$

where $s_i$ is a candidate station in $S$. In particular, as the goal of a micro-circulation bus is to connect residences with subway stations, we select a subway station $s$ in the researched area as the route origin.

The key problem of micro-circulation bus route planning is to satisfy the short-distance trip demand as much as possible with affordable operation expense. Therefore, an objective function $f$ that maximizes the traffic flow under given constraints is proposed in this paper. The properties of stations are also taken into consideration. Our main goal is to find $n$ bus routes that maximize the function $f$, which is discussed in Section V.

IV. MICRO-CIRCULATION BUS STATION IDENTIFICATION

In the proposed framework, the first phase is to identify micro-circulation bus stations by analyzing passenger behaviors. To find spatiotemporal patterns and identify candidate stations, the whole process consists of three parts: 1) to find temporal characteristics, we cluster pick-ups and drop-offs for a single transport mode and time slot; 2) for each cluster, we extract features that are essential for candidate stations; 3) we merge and fuse the clusters from different transport modes and time slots to obtain candidate stations.

A. Hot Spots Discovery

To find candidate stations, we first cluster the pick-up and drop-off points of each transport mode separately. The cluster centers are regarded as hot spots. In consideration of the significant influence of time on mobility patterns, a DPC-based clustering method with temporal partition is applied.

1) Temporal partition: Passengers' travel demand has a significant time-varying characteristic, which should be taken into account when planning bus routes. People may go to different places on weekends than on weekdays. Travel patterns also vary over different time periods. As Fig. 4 shows, there are two travel peaks on weekdays, while travel demand varies relatively smoothly at other times.

Fig. 4. Statistical result of travel demand per hour on weekday and weekend.

To incorporate these facts, we segment the studied period into the temporal slots shown in TABLE I. Since few people regularly take a bus or subway between 11:00pm and 5:00am, the micro-circulation bus is planned to run from 5:00am to 11:00pm. We use $c = 1, \ldots, 6$ to denote the temporal slots.

TABLE I
TEMPORAL SLOTS FOR WEEKDAY AND WEEKEND

Slot     Weekday            Weekend
slot 1   5:00am-8:00am      5:00am-8:00am
slot 2   8:00am-11:00am     8:00am-11:00am
slot 3   11:00am-2:00pm     11:00am-2:00pm
slot 4   2:00pm-5:00pm      2:00pm-5:00pm
slot 5   5:00pm-8:00pm      5:00pm-8:00pm
slot 6   8:00pm-11:00pm     8:00pm-11:00pm

2) DPC-based clustering: Assuming that the distribution of demand consists of a core with larger local density and a relatively sparser halo surrounding it, and that the core can serve as the candidate station, we develop a DPC-based clustering method, shown in Algorithm 1.

Algorithm 1: DPC-based clustering
Input: ρ: local density array of each point
       δmin: the minimum distance between points
       dc: cutoff distance
Output: discovered hot spots
    let n, δ be two new arrays and U be a two-dimensional array;
    let χ(x) = 1 if x < 0 else χ(x) = 0;
    let ρ_i = Σ_j χ(d_ij − dc);
    ρ ← [ρ_i1, ρ_i2, ..., ρ_in] ← [ρ_1, ρ_2, ..., ρ_n]; // sort ρ in descending order
    let t be a new k-d tree initialized with i1, where k = 2;
    n_i1, δ_i1 ← −1, +∞;
    U.append(i1);
    for j = 2 to n do
        n_ij ← search nearest neighbor of ij in t;
        δ_ij ← distance between n_ij and ij;
        t.insert(ij);
        if δ_ij ≥ δmin ∧ ρ_ij ≥ ρmin then
            U.append(ij);
    let C ← U;
    foreach ρ_ij in ρ do
        if ij not in C then
            s ← nearest neighbor in C;
            U[s].append(ij)
    return U

In particular, considering that people always choose the nearest station to take a bus, every pick-up or drop-off in this problem is assigned to the nearest cluster center. In total, there are 36 partitions to cluster: 6 time slots, on weekdays and weekends, for three transport modes.

B. Hot Spots Feature Extraction

When setting up a station, the attributes of the station should be considered in addition to its position. Therefore, we also extract features to capture the attributes of hot spots.
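The features in this subsection are computed on the hot spots produced by the discovery step of Section IV-A. As a reminder of how that step might look in practice, the following is a minimal Python sketch in the spirit of Algorithm 1, not the authors' implementation. Several points are assumptions made for illustration: the function name `dpc_hot_spots` is hypothetical, the k-d tree is replaced by a brute-force nearest-neighbor search, a point does not count itself in its own density, planar Euclidean distance is used instead of geographic distance, and `rho_min` is passed explicitly because Algorithm 1 uses it in its condition without listing it as an input.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def dpc_hot_spots(points: Sequence[Point], d_c: float,
                  delta_min: float, rho_min: int) -> Dict[int, List[int]]:
    """Return {center index: member indices} for a DPC-style clustering."""
    n = len(points)
    if n == 0:
        return {}

    def dist(i: int, j: int) -> float:
        return math.hypot(points[i][0] - points[j][0],
                          points[i][1] - points[j][1])

    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist(i, j) < d_c)
           for i in range(n)]

    # Visit points in descending density (ties keep the input order).
    order = sorted(range(n), key=lambda i: -rho[i])

    centers = [order[0]]  # the densest point is always kept, as in Algorithm 1
    for rank in range(1, n):
        i = order[rank]
        # delta_i: distance to the nearest point visited earlier (denser or tied).
        delta = min(dist(i, j) for j in order[:rank])
        if delta >= delta_min and rho[i] >= rho_min:
            centers.append(i)

    # Every point (centers included) joins its nearest center.
    clusters: Dict[int, List[int]] = {c: [] for c in centers}
    for i in range(n):
        nearest = min(centers, key=lambda c: dist(i, c))
        clusters[nearest].append(i)
    return clusters
```

With two tight groups of points and one stray point, the sketch keeps one center per group and attaches the stray point to the closer group. The brute-force search is quadratic in the number of points, so a spatial index such as the k-d tree named in Algorithm 1 would be needed at the scale of the datasets described in Section III.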

Station demand: The station demand is defined in Section III. It represents the attraction of the station, which is considered in most studies.

Coverage radius: To describe the area influenced by the station, we assume each cluster is a circle, and the radius of the cluster is denoted as

$$R_{s_i} = \max\{dist(p_k, s_i) \mid p_k \in C_i\}, \tag{6}$$

where $R_{s_i}$ represents the coverage radius of station $i$, $C_i$ is the set of points of cluster $i$, and $dist$ is a function that computes the distance between two points. We use the coverage radius as passengers' longest walking distance to the station.

Persistence: The stability of passenger flow is an important factor for bus route planning. As we divide a day into six time slots, the persistence is defined as the number of slots, out of all six, in which the hot spot appears, and it is computed after the clustering results of different time periods are merged. Thus $persis \in \{1, 2, 3, 4, 5, 6\}$ represents the persistence of a station.

Transport capacity: The public transport capacity is also considered when setting up a station. We use the number of existing bus stations to represent it, which is defined as

$$TC_i = \frac{k_i + \xi}{M}, \tag{7}$$

where $k_i$ is the number of existing bus stations near station $i$, $\xi$ is a small threshold, and $M$ denotes the total number of existing bus stations in the researched area.

C. Merge-based Spatiotemporal Demand Fusion

Hot spots at different time periods and on different transport modes represent temporary, mode-specific demand, which cannot be used as candidate stations directly. In this paper, a merge-based hot spot fusion algorithm is proposed to generate the demand over a long period.

A three-dimensional merge is designed as follows: 1) merging the demand of the six time slots of a day, called Y-Merge; 2) merging different days, with weekdays and weekends handled separately, called X-Merge; 3) merging the demand of the three transport modes, called Z-Merge. The process is shown in Fig. 5, and the algorithm is given as Algorithm 2.

Fig. 5. The process of three-dimensions merge. Y-Merge, Z-Merge and X-Merge are done in order.

Algorithm 2: Hot Spots Merge
Input: c1, c2: two clusters on different time
       δ: threshold of overlay area
       σ: threshold of reserved passenger demand
Output: merge of two clusters
    if c1 ∩ c2 > δ then
        num = (c1 ∩ c2)/(c1 + c2) ∗ (c1.num + c2.num)/2; // c1 ∩ c2 is the overlay area of the two clusters, c1 + c2 is the total area of the two clusters
        center = (c1.center + c2.center)/2;
        update R;
    else if c1 ⊂ c2 then
        num = max(c1.num, c2.num);
        center = (c1.center + c2.center)/2;
        R = max(c1.R, c2.R);
    else
        if num > σ then
            reserve the cluster;
        else
            discard the cluster;

As shown, three relations between hot spots are considered, and the features are updated correspondingly. The persistence of hot spots is computed in Y-Merge and updated in the other two merges.

Finally, the fused demands are the candidate stations that can be chosen to generate the bus route.

V. MICRO-CIRCULATION ROUTE PLANNING

After the candidate stations are found, the aim of phase two is to generate micro-circulation bus routes that are expected to maximize passenger flow and reduce the traffic burden. The first step is to build a transition matrix from existing trips to estimate the passenger flow; then an iterative greedy algorithm is proposed to plan the routes.

A. Transition Network Construction

Route planning should be designed to reduce the transportation burden to the greatest extent. Therefore, to estimate the traffic flow between stations, we build the transition network from historical trips. Although we use several transport modes in this paper, shared bicycles are the common transport for short trips, so we use the bicycle trips to construct the transition matrix.

The transition network is a network whose nodes represent the stations and whose edges represent the estimated demand between two nodes. As the historical trips record the real positions of pick-ups and drop-offs, we first assign them to specific stations. The definition is given in Equation (3).

B. Candidate Bus Routes Generation

Designing a new bus route is challenging since two conflicting requirements must be met: one is to maximize the number of passengers along the route; the other is that the micro-circulation bus route should have a limited number of stops and run within a limited time. For example, if we always choose the bus stop with heavier traffic flow, the route may become too long or fail to satisfy the constraints of the bus route. Considering this, we propose an iterative greedy algorithm with constraints.

1) Objective Function: The passenger flow is an important factor that influences the bus route. Therefore, we use transitions to estimate the demand and add it to the objective function $f$. However, the features of stations also have a big impact on the bus route. For example, a station may have a large traffic flow in one period but little demand in other periods, which is not suitable for a bus route. Therefore, we define the station score using the features proposed in Section IV to evaluate the station and add it to $f$. The definitions are as follows:

$$score_i = num_i \times R_i \times persis_i \times TC_i, \tag{8}$$

$$f(W, score, \mu) = \sum_{i,j} W_{i,j} + \mu \sum_{i=1}^{k} score_i, \tag{9}$$

where $score_i$ represents the score of station $i$; $num_i$, $R_i$, $persis_i$, and $TC_i$ are the features of station $i$; and $k$ is the number of stations in the designed routes. The first term is the expected passenger flow, and the second term favors stations with stable demand and high impact.

2) Constraints: Considering the characteristics of the micro-circulation bus, the following constraints should be obeyed when generating bus routes.

Proper distance. We constrain the length of the micro-circulation bus route and the distance between two adjacent stations:

$$L_{min} \le L \le L_{max}, \tag{10}$$

$$\delta_{min} \le dist(s_{i+1}, s_i) \le \delta_{max}, \tag{11}$$

where $L$ is the total length of the whole route, and $L_{min}$, $L_{max}$, $\delta_{min}$, $\delta_{max}$ are user-specified parameters.

Amount of stations. Since the goal of the micro-circulation bus is to provide last-mile connections for citizens, the number of stations cannot be too large:

$$S_{min} \le S \le S_{max}. \tag{12}$$

No zigzag route.

$$\arg\min_{s_j} \big(dist(s_{i+1}, s_j)\big) = s_i \quad (j = 1, 2, \cdots, i), \tag{13}$$

which ensures the smoothness of the route. There would be no sharp zigzag path in the designed route.

3) The iterative greedy algorithm for route generation: Even with these constraints on the route, the problem of enumerating all possible routes has been proved to be NP-hard, and the solution space is too large to search exhaustively. A greedy algorithm is therefore proposed to find the route. To maximize the objective function $f$, we choose as the next station the one with the largest sum of station score and transition flow that satisfies the constraints. An iterative process is then used to find $n$ routes that deliver more passengers and reduce the traffic burden. Once the first route is generated, the transition network is updated by removing the demand already covered by the first route, and the same method is used to generate the next route. This ensures that the $n$ designed routes have little duplication and, to some extent, maximizes the passenger flow.

VI. EXPERIMENTAL EVALUATION

In this section, we first introduce the evaluation metrics and settings of our experiments. Then we evaluate the results of the proposed algorithms. Finally, we present a case study on real datasets.

A. Evaluation Metrics

To evaluate the performance of the proposed clustering method, we consider the following metrics, which are frequently used in clustering problems.
• Calinski-Harabaz (CH). The Calinski-Harabaz score is given as the ratio of the between-cluster dispersion mean to the within-cluster dispersion [19]:

$$CH(k) = \frac{Tr(B_k)}{Tr(W_k)} \times \frac{N - k}{k - 1}, \tag{14}$$

$$W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T, \tag{15}$$

$$B_k = \sum_{q} n_q (c_q - c)(c_q - c)^T, \tag{16}$$

where $B_k$ is the between-group dispersion matrix, $W_k$ is the within-cluster dispersion matrix, $N$ is the number of points in the data, $C_q$ is the set of points in cluster $q$, $c_q$ is the center of cluster $q$, $c$ is the center of all the data, and $n_q$ is the number of points in cluster $q$.
• Silhouette Coefficient (SC) measures how well the clustering fits the original data based on statistical properties of the clustered data [20]:

$$s = \frac{b - a}{\max(a, b)}, \tag{17}$$

where $a$ is the mean distance between a sample and all other points in the same cluster, and $b$ is the mean distance between a sample and all other points in the next nearest cluster. A higher Silhouette Coefficient score relates to a model with better-defined clusters.

To evaluate the performance of the proposed route planning method, we consider the following metrics:
• Demand Coverage (DC) evaluates the effectiveness of the route planning algorithm by computing the demand covered by the planned routes.
• Routing Similarity (RS). To evaluate the relationship among the $n$ routes, we use route similarity, which is defined as follows:

$$sim(A, B) = \frac{|A \cap B|}{|A \cup B|}, \tag{18}$$

where $A$ and $B$ represent two designed routes, $|A \cap B|$ is the number of bus stations shared by the two routes, and $|A \cup B|$ denotes the total number of distinct bus stations in the two routes.

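The routing metrics of Section VI-A are applied to routes produced by the iterative greedy procedure of Section V-B, so a compact sketch of that procedure may help when reading the results. The Python below is a simplified illustration under stated assumptions, not the authors' algorithm: the names `greedy_route` and `plan_routes` are hypothetical, the gain of a candidate is taken as the transition flow from the current station plus μ times the station score, only the upper bounds of Eq. (10) and Eq. (12) and the spacing bounds of Eq. (11) are enforced, the no-zigzag constraint of Eq. (13) is omitted, and "covered demand" is assumed to mean the flow between consecutive stations of a finished route.

```python
import math
from typing import Dict, Hashable, List, Tuple

Station = Hashable
Flows = Dict[Tuple[Station, Station], float]


def greedy_route(origin: Station, coords: Dict[Station, Tuple[float, float]],
                 flows: Flows, score: Dict[Station, float], mu: float,
                 delta_min: float, delta_max: float,
                 l_max: float, s_max: int) -> List[Station]:
    """Grow one route from the origin, always taking the best feasible station."""
    route, length = [origin], 0.0
    while len(route) < s_max:
        cur = route[-1]
        best, best_gain, best_step = None, 0.0, 0.0
        for cand in coords:
            if cand in route:
                continue
            step = math.dist(coords[cur], coords[cand])
            # Spacing between adjacent stations and total route length.
            if not (delta_min <= step <= delta_max) or length + step > l_max:
                continue
            gain = flows.get((cur, cand), 0.0) + mu * score.get(cand, 0.0)
            if best is None or gain > best_gain:
                best, best_gain, best_step = cand, gain, step
        if best is None:
            break  # no feasible next station
        route.append(best)
        length += best_step
    return route


def plan_routes(n_routes: int, origin: Station,
                coords: Dict[Station, Tuple[float, float]],
                flows: Flows, score: Dict[Station, float], mu: float,
                **constraints: float) -> List[List[Station]]:
    """Generate n routes, removing covered demand after each one."""
    remaining = dict(flows)  # leave the caller's transition network untouched
    routes = []
    for _ in range(n_routes):
        route = greedy_route(origin, coords, remaining, score, mu, **constraints)
        routes.append(route)
        for a, b in zip(route, route[1:]):
            remaining[(a, b)] = 0.0  # this demand is now served
    return routes
```

On a toy network with two corridors leaving the same origin, the first route follows the heavier corridor and, once its flows are zeroed, the second route follows the other one. This is the mechanism by which the iteration keeps the overlap between routes low.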
B. Experimental Settings
Researched area: The micro-circulation bus is used to
build the connections with the subway stations. Considering
passengers will take subway when go to work or go to school
or go shopping, we choose an area called Wangjing in Beijing.
This area is mixed by a lot of companies, big malls, and
schools. To ensure our assumption, we analyze the trips in
Beijing, and there are 0.24 million trips per day in this area,
which is larger than most of the other areas. Fig. 7. Station identification. Weekday (left) and weekend (right) are quite
Parameters settings: The parameters of bus route is set in- different.
spired by existed bus. Specifically, we set the maximum length
of bus Lmax = 10 km, and the minimum length Lmin = 3 km. As for the distance between two adjacent stops, δmin = 200 m and δmax = 1500 m. The number of stations is set to no more than 10.

C. Performance on Clustering Algorithms

In this paper, we develop two clustering methods: one is Density Peak Cluster (DPC) on single datasets, and the other is merge-based cluster fusion (DPC-M). We compare them with K-means and DBSCAN.

K-means tries to separate samples into n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares [21].

DBSCAN views clusters as areas of high density separated by areas of low density. The central component of DBSCAN is the concept of core samples, which are samples that are in areas of high density [22].

We use CH and SC to evaluate these algorithms. As shown in Fig. 6, the four algorithms perform differently, and their performance also varies between the two metrics. The proposed DPC-M performs well on both metrics, while DPC performs poorly on SC. The results of DPC-M on the real dataset are shown in Fig. 7, from which the difference can be clearly seen. The demand is distributed irregularly, and weekday demand is more concentrated than weekend demand.

Fig. 6. Performance of the clustering algorithms. CH (left) and SC (right) are compared on four algorithms.

D. Performance on Iterative Greedy Algorithm for Route Planning

To demonstrate the effectiveness of the proposed iterative greedy algorithm (IGA) for route planning, we compare our method with the classical Random Walk (RW) model [23] and the Probabilistic Random Walk (PRW) model [24]. In particular, the probability is defined by using transitions only. We do not consider the constraints proposed. When n = 2, the result is shown in TABLE II. As there are too many candidate stations, RW performs poorly, but PRW performs much better than RW since transitions vary widely and it is easier to choose a station with a higher transition. Our method performs much better in terms of DC due to the iteration of two routes.

TABLE II
ROUTE COMPARISON

Metrics    DC       RS
RW         1256     0.053
PRW        8980     0.250
IGA        16582    0.176

Fig. 8 presents the demand coverage of our method for different n. From this figure we can see that the demand coverage increases when a route is added, but the rate of increase differs with n. When n > 4, the rate of increase clearly drops, which means that adding another route is not essential.

Fig. 8. Demand coverage comparison on three different origins for the proposed method with different iteration times.

E. Case Study

To illustrate the effectiveness of the solution, we select two routes calculated from the same origin, Wangjing subway station, and mark them on the map. To make the routes clearer, we adjust the planned routes according to the actual roads, as shown in Fig. 9. The two routes share only one station other than the origin, which
can clearly decrease the traffic afford. Route 1 begins at Wangjing subway station, then runs through the community, and its destination is a school which has a large demand. Route 2 goes south, passes the big mall, and finally arrives at the destination, which is close to another subway station, South of Wangjing subway station.

Fig. 9. Two designed micro-circulation routes.

VII. CONCLUSIONS

In this paper, we have investigated the problem of micro-circulation bus planning by leveraging taxi GPS traces, shared-bicycle data and online car-hailing data, motivated by the need for first/last-mile connections for citizens. To solve the problem, we develop a new type of public transport service, namely, the micro-circulation bus, according to the spatiotemporal analysis of multimode pick-up and drop-off demands. Specifically, a DPC-based method is first used to identify the hot spots, and a fusion algorithm is applied on spatiotemporal data to get the candidate stations. Then, a transition network is constructed on real data. Finally, we provide a solution for generating n micro-circulation bus routes using an iterative greedy algorithm. Experiments are conducted on real-world data; the results demonstrate the effectiveness of the proposed method and also reveal the potential of first and last-mile transport service by micro-circulation bus.

ACKNOWLEDGMENT

We thank the anonymous reviewers for their constructive comments on this research work. This work is supported by the National Natural Science Foundation of China under Grant No. 51822802, 51778033, and U1811463, and the Science and Technology Major Project of Beijing under Grant No. Z171100005117001.

REFERENCES

[1] J. d. D. Ortúzar and L. G. Willumsen, Modelling transport, 2002, vol. 3.
[2] H. Poonawala, V. Kolar, S. Blandin, L. Wynter, and S. Sahu, “Singapore in motion: Insights on public transport service level through farecard and mobile data analytics,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and data mining. ACM, 2016, pp. 589–598.
[3] M. Glöss, M. McGregor, and B. Brown, “Designing for labour: uber and the on-demand mobile workforce,” in Proceedings of the 2016 CHI conference on human factors in computing systems. ACM, 2016, pp. 1632–1643.
[4] J. Aslam, S. Lim, X. Pan, and D. Rus, “City-scale traffic estimation from a roving sensor network,” in Proceedings of the 10th ACM Conference on Embedded Network Sensor Systems. ACM, 2012, pp. 141–154.
[5] W. Y. Szeto and Y. Wu, “A simultaneous bus route design and frequency setting problem for tin shui wai, hong kong,” European Journal of Operational Research, vol. 209, no. 2, pp. 141–155, 2011.
[6] M. H. Almasi, A. Sadollah, S. M. Mounes, and M. R. Karim, “Optimization of a transit services model with a feeder bus and rail system using metaheuristic algorithms,” Journal of Computing in Civil Engineering, vol. 29, no. 6, p. 04014090, 2014.
[7] F. Ciaffi, E. Cipriani, and M. Petrelli, “Feeder bus network design problem: A new metaheuristic procedure and real size applications,” Procedia-Social and Behavioral Sciences, vol. 54, pp. 798–807, 2012.
[8] Y. Ge, J.-j. Zhao, Y. Bian, and J. Rong, “Feeder bus stop selection within an integrated feeder bus planning framework,” in ICCTP 2011: Towards Sustainable Transportation Systems, 2011, pp. 2854–2865.
[9] M. H. Baaj and H. S. Mahmassani, “Trust: A lisp program for the analysis of transit route configurations,” Transportation Research Record, vol. 1283, no. 1990, pp. 125–135, 1990.
[10] G. Newell, “Some issues relating to the optimal design of bus routes,” Transportation Science, vol. 13, no. 1, pp. 20–35, 1979.
[11] C. Chen, D. Zhang, Z.-H. Zhou, N. Li, T. Atmaca, and S. Li, “B-planner: Night bus route planning using large-scale taxi gps traces,” in 2013 IEEE international conference on pervasive computing and communications (PerCom). IEEE, 2013, pp. 225–233.
[12] S. A. Bagloee and A. Ceder, “Transit-network design methodology for actual-size road networks,” Transportation Research Part B: Methodological, vol. 45, no. 10, pp. 1787–1804, 2011.
[13] V. Guihaire and J.-K. Hao, “Transit network design and scheduling: A global review,” Transportation Research Part A: Policy and Practice, vol. 42, no. 10, pp. 1251–1273, 2008.
[14] X. Ma, C. Liu, H. Wen, Y. Wang, and Y.-J. Wu, “Understanding commuting patterns using transit smart card data,” Journal of Transport Geography, vol. 58, pp. 135–145, 2017.
[15] N. Lathia, J. Froehlich, and L. Capra, “Mining public transport usage for personalised intelligent transport systems,” in 2010 IEEE International Conference on Data Mining. IEEE, 2010, pp. 887–892.
[16] T. A. Chua, “The planning of urban bus routes and frequencies: A survey,” Transportation, vol. 12, no. 2, pp. 147–172, 1984.
[17] F. Bastani, Y. Huang, X. Xie, and J. W. Powell, “A greener transportation mode: flexible routes discovery from gps trajectory data,” in Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 2011, pp. 405–408.
[18] Y. Liu, C. Liu, N. J. Yuan, L. Duan, Y. Fu, H. Xiong, S. Xu, and J. Wu, “Exploiting heterogeneous human mobility patterns for intelligent bus routing,” in 2014 IEEE International Conference on Data Mining. IEEE, 2014, pp. 360–369.
[19] V. Punzo and M. Montanino, “Speed or spacing? cumulative variables, and convolution of model errors and time in traffic flow models validation and calibration,” Transportation Research Part B: Methodological, vol. 91, pp. 21–33, 2016.
[20] M. Nilashi, D. Jannach, O. bin Ibrahim, and N. Ithnin, “Clustering-and regression-based multi-criteria collaborative filtering with incremental updates,” Information Sciences, vol. 293, pp. 235–250, 2015.
[21] K. Alsabti, S. Ranka, and V. Singh, “An efficient k-means clustering algorithm,” 1997.
[22] H.-P. Kriegel and M. Pfeifle, “Density-based clustering of uncertain data,” in Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, 2005, pp. 672–677.
[23] J. Lee, M. Cho, and K. M. Lee, “Hyper-graph matching via reweighted random walks,” in CVPR 2011. IEEE, 2011, pp. 1633–1640.
[24] H. Yu, L. Chen, X. Cao, Z. Liu, and Y. Li, “Identifying top-k important nodes based on probabilistic-jumping random walk in complex networks,” in International Conference on Complex Networks and their Applications. Springer, 2017, pp. 326–338.
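
As a supplementary illustration of the route comparison in Section VI-D, the following minimal Python sketch contrasts the three next-stop selection rules on a toy transition network. It is not the authors' implementation. The station names, the transition counts, the helper names, and the coverage proxy (sum of transition counts along consecutive stops) are all assumptions made for illustration; the length and stop-spacing constraints are ignored, and only the limit of 10 stations per route is kept. The iterative step is modeled, under the same assumptions, by zeroing out the transitions that an earlier route already serves.

```python
import random

# Hypothetical toy transition network: TRANSITIONS[a][b] is the number of
# observed trips from station a to station b (made-up values).
TRANSITIONS = {
    "O": {"A": 50, "B": 5, "C": 20},
    "A": {"B": 40, "D": 10},
    "B": {"D": 30, "C": 2},
    "C": {"D": 25, "E": 15},
    "D": {"E": 35},
    "E": {},
}

MAX_STATIONS = 10  # the paper limits a route to no more than 10 stations


def candidates(net, current, visited):
    """Unvisited successors of `current` with a positive transition count."""
    return {s: w for s, w in net[current].items() if s not in visited and w > 0}


def build_route(net, origin, choose, max_stations=MAX_STATIONS):
    """Grow a route from `origin`, picking each next stop with `choose`."""
    route = [origin]
    while len(route) < max_stations:
        options = candidates(net, route[-1], set(route))
        if not options:
            break
        route.append(choose(options))
    return route


def choose_uniform(rng):
    """RW baseline: every candidate station is equally likely."""
    return lambda options: rng.choice(sorted(options))


def choose_weighted(rng):
    """PRW baseline: probability proportional to the transition count."""
    def pick(options):
        names = sorted(options)
        return rng.choices(names, weights=[options[n] for n in names], k=1)[0]
    return pick


def choose_greedy(options):
    """Greedy step: take the candidate with the largest transition count."""
    return max(sorted(options), key=options.get)


def coverage(net, route):
    """Assumed coverage proxy: sum of transition counts on consecutive stops."""
    return sum(net[a].get(b, 0) for a, b in zip(route, route[1:]))


def iterative_greedy(net, origin, n_routes):
    """Plan n routes; after each one, zero out the transitions it serves."""
    remaining = {a: dict(nbrs) for a, nbrs in net.items()}  # keep input intact
    routes, total = [], 0
    for _ in range(n_routes):
        route = build_route(remaining, origin, choose_greedy)
        total += coverage(remaining, route)
        for a, b in zip(route, route[1:]):
            remaining[a][b] = 0
        routes.append(route)
    return routes, total


if __name__ == "__main__":
    rng = random.Random(0)
    print("RW :", build_route(TRANSITIONS, "O", choose_uniform(rng)))
    print("PRW:", build_route(TRANSITIONS, "O", choose_weighted(rng)))
    print("IGA:", iterative_greedy(TRANSITIONS, "O", 2))
```

On this toy network the second greedy route is pushed away from the stops already served by the first one, which mirrors the observation in the case study that the two planned routes share few stations.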
