
Transportation Research Part B 124 (2019) 1–17


A latent-class adaptive routing choice model in stochastic time-dependent networks
Jing Ding-Mastera a, Song Gao b,∗, Erik Jenelius c, Mahmood Rahmani c, Moshe Ben-Akiva d
a Stantec, New York, NY 10017, USA
b University of Massachusetts, Amherst, MA 01003, United States
c Royal Institute of Technology, Stockholm, Sweden
d Massachusetts Institute of Technology, Cambridge, MA 02139, United States

Article info

Article history: Received 18 November 2017; Revised 20 February 2019; Accepted 25 March 2019

Abstract

Transportation networks are inherently uncertain due to random disruptions; meanwhile, real-time information potentially helps travelers adapt to realized traffic conditions and make better route choices under such disruptions. Modeling adaptive route choice behavior is essential in evaluating real-time traveler information systems and related policies. This research contributes to the state of the art by developing a latent-class routing policy choice model in a stochastic time-dependent network with revealed preference data. A routing policy is defined as a decision rule applied at each link that maps possible realized traffic conditions to decisions on the link to take next. It represents a traveler’s ability to look ahead in order to incorporate real-time information not yet available at the time of decision.

A case study is conducted in Stockholm, Sweden and data for the stochastic time-dependent network are generated from hired taxi Global Positioning System (GPS) readings. A latent-class Policy Size Logit model is specified, with routing policy users who follow routing policies and path users who follow fixed paths. Two additional layers of latency in the measurement equation are accounted for: 1) the choice of a routing policy is latent and only its realized path on a given day can be observed; and 2) when GPS readings have relatively long gaps, the realized path cannot be uniquely identified, and the likelihood of observing vehicle traces with non-consecutive links is instead maximized.

Routing policy choice set generation is based on the generalization of path choice set generation methods. The generated choice sets achieve 95% coverage for 100% overlap threshold after correcting GPS mistakes and breaking up trips with intermediate stops, and further achieve 100% coverage for 90% overlap threshold.

Estimation results show that the routing policy user class probability increases with trip length, and the latent-class routing policy choice model fits the data better than a single-class path choice or routing policy choice model. This suggests that travelers are heterogeneous in terms of their ability and/or willingness to plan ahead and utilize real-time information, and an appropriate route choice model for uncertain networks should take into account the underlying stochastic travel times and structured traveler heterogeneity in terms of real-time information utilization.

© 2019 Published by Elsevier Ltd.

∗ Corresponding author.
E-mail addresses: jing.ding-mastera@stantec.com (J. Ding-Mastera), sgao@umass.edu (S. Gao), jenelius@kth.se (E. Jenelius), mahmoodr@kth.se (M. Rahmani), mba@mit.edu (M. Ben-Akiva).

https://doi.org/10.1016/j.trb.2019.03.018
0191-2615/© 2019 Published by Elsevier Ltd.

1. Background and literature overview

1.1. Background and motivation

Transportation networks are frequently subject to random disruptions such as incidents and bad weather, resulting in variable and unpredictable traffic conditions. According to the 2015 Urban Mobility Scorecard (Schrank et al., 2015), traffic congestion in the United States cost 160 billion dollars in 2014. Meanwhile, with the fast development of sensor and telecommunication technologies, real-time information is increasingly available to help travelers and system operators make better decisions in such an uncertain network, through channels that include radio, websites, smartphone applications, in-vehicle navigation systems, and connected vehicles. A crucial component in designing and evaluating real-time traveler information systems is understanding travelers’ route choice behavior in response to a wide range of traveler information situations in a network with dynamic and random traffic conditions.
A traveler makes decisions based on his or her knowledge of the available alternatives and their attributes. This knowledge is periodically updated by both personal experience and exogenous information, and as a result the decisions might be revisited and revised. In other words, a traveler “adapts” to the decision environment. The time scale at which route choice adaptation happens can be broadly divided into two types: day-to-day and within-day. In a day-to-day context, a traveler’s route choice today might be different from yesterday due to information collected yesterday during the trip. In a within-day context, route choice could be revised en route, e.g., taking a detour upon receiving information on a crash along the original route. This research focuses on within-day adaptive route choice, where the real-time information reflects travel conditions at or close to the decision time; this is arguably the most researched area of traveler response to real-time traveler information systems.

1.2. Literature overview and research contributions

Three types of route choice models (fixed path models, adaptive path models, and routing policy choice models) have
been studied based on whether and how travelers react to real-time information.

Fixed path models. Fixed path models assume that a traveler is non-adaptive: he or she chooses a fixed path at the origin of a trip and follows it to the end, not accounting for any real-time information provided en route. One of the most widely used models is the Multinomial Logit (MNL) due to its attractive closed-form formula, and corrections have been proposed to overcome the overlapping problem among route alternatives, e.g., the C-Logit model (Cascetta et al., 1996) and Path Size Logit model (Ben-Akiva and Bierlaire, 1999). Notably, the Path Size Logit model has been successfully implemented based on RP data in Dynamic Traffic Assignment (DTA) in a subnetwork of Beijing, China (Ben-Akiva et al., 2012). Later on, more complicated models were developed, such as Multinomial Probit (Bolduc and Ben-Akiva, 1991), the Error Component model (Frejinger and Bierlaire, 2007), a latent route choice model with network-free data (Bierlaire and Frejinger, 2008), a model assuming a universal choice set estimated based on a sampling approach (Frejinger et al., 2009), a model assuming a universal choice set estimated through repeated link choices based on a dynamic discrete choice approach (Fosgerau et al., 2013), and cross-nested logit based on sampling of choice sets (Lai and Bierlaire, 2015).

Adaptive path models. Travelers’ route choice behavior in an uncertain network with real-time information will conceivably
be different from that in a deterministic network. With real-time information provided en-route, travelers could make route
choice decisions at intermediate nodes based on the current situation in order to avoid delay downstream. An adaptive path
model assumes that a traveler is reactive and route choice is a series of path choices at each node.
Some researchers model adaptive route choice behavior by successively estimating a sequence of non-adaptive path
choice models at each node and updating the attributes of alternative paths to the destination to reflect real-time infor-
mation. In principle all fixed path models can be applied this way. The simulation-based traffic prediction models in Dy-
naMIT (Ben-Akiva et al., 2002) and DYNASMART (Mahmassani, 2001) are such examples, which update the path choice at
intermediate decision nodes according to the latest network travel times. These models are calibrated over aggregate mea-
surements such as counts and speeds, in which route choice parameters are among the calibration variables. In this regard,
adaptive path models have been calibrated as a part of traffic prediction models in real-life networks.
A large body of research on route choice in response to real-time information focuses on binary route switchings in real
life (e.g., Polydoropoulou et al. (1996); Chatterjee and McDonald (2004); Peeta and Ramos (2006); Tsirimpa et al. (2007)), or
more advanced hypothetical real-time traveler information system in SP surveys or driving simulators (e.g., Mahmassani and
Liu (1999); Srinivasan and Mahmassani (2003); Abdel-Aty and Abdalla (2004); Peeta and Yu (2005); Bogers et al. (2005);
Abdel-Aty and Abdalla (2006); Ardeshiri et al. (2015)). For recent reviews of empirical studies of traveler response to information, please refer to Balakrishna et al. (2013) and Ben-Elia and Avineri (2015). In all of these studies, travelers are assumed to respond to real-time information on the spot, and the complete decision process is a series of path choices, each of which is based on updated traffic conditions revealed by real-time information at the time of decision. The implicit assumption is that a traveler is myopic and does not look ahead for future information.

Routing policy choice models. Although an adaptive path model could account for diversions from an initial chosen path,
it assumes that travelers are simply reactive to information on the spot and do not plan ahead for real-time information
that will be available later in the trip, even though in reality they might switch to another path at the next node. Both
reactive and looking-ahead travelers adapt to real-time information at intermediate nodes; the difference between a reactive and a looking-ahead traveler lies in whether the decision at an intermediate node takes into account future information availability and diversion possibilities. A looking-ahead traveler fully considers the future information availability and the possible actions they might take at all future nodes. Therefore they decide which link to take next, but not the next path to the
destination. A routing policy is defined as a decision rule applied at each link that maps possible realized traffic conditions
to decisions on the link to take next.
A routing policy choice model was first developed in Gao (2005) and embedded in a dynamic traffic assignment model. Gao et al. (2008) studied two types of models that account for travelers’ adaptation to real-time information: an adaptive path model and a routing policy choice model based on synthetic data and a simplified network. Empirical studies of routing policy choice to date have only been carried out with SP data (Razo and Gao, 2010; 2013). Traffic prediction
models where routing policy choices are assumed for travelers have also been studied using simplified networks either in
an equilibrium context (e.g., Gao (2012)) or a disequilibrium context (e.g., Boyer et al. (2015)).

Contributions. In the literature, fixed path models and adaptive path models have been estimated in both hypothetical
simplified networks and real-life networks based on both SP and RP data. Routing policy choice models have been estimated
in hypothetical simplified networks based on SP data. However, the estimation of routing policy choice models in real-life
networks under real-time information using RP data remains an unexplored area. The challenges associated with estimating such a model in real-life networks are that 1) travelers are heterogeneous in terms of their ability and/or willingness to plan ahead and utilize real-time information (whether they follow fixed paths or routing policies); 2) the choice of a routing policy is latent and only its realized path on a given day can be observed; and 3) when GPS readings have relatively long gaps, the realized path cannot be uniquely identified. This research thus contributes to the state of the art by specifying a latent-class, latent-choice, latent-path Policy Size Logit model to address the above three challenges, and demonstrating its applicability to real-life networks using revealed preference (RP) data.

Organization. The remainder of the paper is structured as follows. Section 2 introduces the modeling framework and
methodologies. Section 3 presents a case study in Stockholm, Sweden. Section 4 concludes and provides future directions.

2. Modeling framework and methodologies

2.1. Network, information, route choice behavior

A stochastic time-dependent (STD) network has link travel times that are jointly distributed time-dependent random
variables, and is denoted as G = (N, A, H, P ), where N is the set of nodes, A the set of links with |A| = m, H the set of time
intervals {0, 1, . . . , K − 1} with an equal length δ , and P the probabilistic representation of link travel times. Beyond the end
of time interval K − 1, travel times are static and deterministic. F(i, j, k, t) is the deterministic turning penalty from link (i,
j) to link (j, k) when turning at node j happens at time period t.
A support point is defined as a distinctive value that a discrete random variable can take, or a distinctive vector of values
that a discrete random vector can take depending on the context. Thus a probability mass function (PMF) of a random
variable (or vector) is a combination of support points and the associated probabilities. A joint PMF of all time-dependent
link travel time random variables is used: $P = \{v_1, v_2, \ldots, v_R\}$, where $v_i$ is a vector of dimension $K \times m$, $i = 1, 2, \ldots, R$, and $R$ is the number of support points. The $r$th support point has a probability $p_r$, and $\sum_{r=1}^{R} p_r = 1$. When link travel time observations from multiple days are available, a support point can be viewed as a day, $R$ is the number of days, and $p_r = 1/R$, $\forall r$.
Real-time information is assumed to include realized travel times of certain links at certain time periods. For example,
perfect online information (POI) includes realized travel times on all links up to the current time, while global pre-trip
information includes realized travel times of all links up to the departure time. See Gao and Huang (2012) for discussions of a number of types of real-time information access. The passive GPS readings of taxi drivers used in this study cannot tell us what
real-time information the drivers have. POI is assumed, since taxi drivers are in general highly sensitive to traffic conditions
and stay informed at all times. The discussion in the remainder of the research is therefore specific to POI.
With the help of online information, a traveler becomes more certain about the future traffic conditions, that is, the network becomes less stochastic. At a given time period t, the available real-time information is represented by a joint realization of travel times on all links at previous time periods 0, 1, ..., t. The joint realization corresponds to a unique subset of compatible support points, defined as an event collection, EV. It represents the conditional distribution of future link travel times given the realization of past link travel times. As more information becomes available, the size of an event collection decreases or remains the same. When an event collection becomes a singleton, the network becomes deterministic.

Fig. 1. An illustrative STD network.
When a traveler is at the end of link (i, j) at time t with event collection EV, she makes a decision to take the next link (j, k). Upon arrival at the end of link (j, k), she will be in a different time period due to the traversal time on link (j, k) and the turning penalty F(i, j, k, t). She will also have a potentially different event collection EV′, which accounts for realized link travel times between t and the arrival time at the end of link (j, k). She continues the routing decision process based on dynamically evolving event collections. Define x as a state with three elements: link (i, j), time t and event collection EV. A routing policy μ is therefore defined as a mapping from all possible states to the decision of the link to take next, μ: x → (j, k).
A routing policy can capture traveler’s looking-ahead capability in that the decision at state x depends on the evaluation
of all possible future states throughout the remainder of the trip. Specifically, the fact that more information will be available
in the future is represented by the series of EV that could be encountered. A routing policy is realized as a path for a given
support point (day), and the realized path topologies potentially vary from day to day due to the randomness of travel times.

2.2. An illustrative example of routing policy

An illustrative example is shown in Fig. 1 and Table 1. The network consists of two time intervals, five nodes including
a dummy node o, and five links including a dummy link 0 going out of the dummy origin o with zero travel time. Travel
times are expressed as multiples of the length of a time interval. There are two support points, each with a probability of
0.5, for the joint distribution of 8 travel time random variables (links 1, 2, 3, and 4 at time intervals 0 and 1). Travel times
beyond time interval 1 are the same as those in time interval 1 in either of the two support points. All turning penalties
are assumed to be zero for simplicity. Two paths are available: link 0 - link 1 - link 2 - link 4 (path 1) and link 0 - link 1
- link 3 - link 4 (path 2). At time 0, there is only one possible event collection (v1 , v2 ), as travel times on all links are the
same across the two support points. At time 1, there are two possible event collections, v1 and v2 .
Consider the following routing policy: the traveler starts with an initial state {link 0, time 0,(v1 , v2 )} and takes the only
outgoing link, link 1 with a travel time of 1; at the end of link 1 (node j), two states {link 1, time 1, v1 } or {link 1, time
1, v2 } are possible. If the state {link 1, time 1, v1 } is encountered, the traveler first takes link 2 and arrives at node k with
a state of {link 2, time 4, v1 } as travel time on link 2 is 3 in v1 , and then takes link 4 and arrives at the destination with
a final state of {link 4, time 5, v1 }. If the state {link 1, time 1, v2 } is encountered, the traveler first takes link 3 and
arrives at node k with a state of {link 3, time 3, v2 } as travel time on link 3 is 2 in v2 , and then takes link 4 and arrives at
the destination with a final state of {link 4, time 4, v2 }. This is represented intuitively in Fig. 2 as a “state tree” for routing
policy μ1 . Fig. 2 also includes the other three routing policies, μ2 , μ3 , μ4 . μ2 and μ4 are not adaptive to states and are
simply fixed paths. It is also illustrated that multiple routing policies can be realized as the same path for a given support
point, e.g., μ1 and μ2 are realized as path 1 for support point v1 , μ1 and μ4 are realized as path 2 for support point v2 .
Table 1
Support points and event collections for the illustrative STD network.

Time   Link   v1   v2
0      1      1    1
0      2      2    2
0      3      3    3
0      4      1    1
1      1      1    2
1      2      3    2
1      3      2    2
1      4      1    1

Fig. 2. State trees of all routing policies in the illustrative example.

The choice of a routing policy is latent: it can be viewed as a plan in the traveler’s mind, and only the result of the plan execution is observed, which is the realized path. The model must be estimated based on path observations and thus a latent-choice specification is needed. A routing policy is realized as a path for a given support point, which can be fully defined by the observed travel times on all random links. In Fig. 2, μ1 and μ2 are two different routing policies but they are realized as the same path for support point v1. Thus if this path is observed for support point v1, the traveler could have chosen either routing policy μ1 or μ2 and both policies should be considered.
Furthermore, some GPS readings have large gaps and thus the realized path cannot be uniquely identified. In Fig. 2, if
two GPS readings are matched to link 1 and link 4 respectively, then either path 1 or 2 could be the taken path.
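
The illustrative example can be traced in a few lines of code. The sketch below is only one possible reading of Table 1 and of the description of μ1: the dictionary layout and the function names (`event_collection`, `realize`, `contains_trace`) are assumptions made for illustration, and the rule that travel times beyond time interval 1 equal those of interval 1 is encoded as stated in the text. It is not the authors' implementation.

```python
# Travel times from Table 1, indexed as TT[support_point][time_interval][link].
TT = {
    "v1": {0: {1: 1, 2: 2, 3: 3, 4: 1}, 1: {1: 1, 2: 3, 3: 2, 4: 1}},
    "v2": {0: {1: 1, 2: 2, 3: 3, 4: 1}, 1: {1: 2, 2: 2, 3: 2, 4: 1}},
}
LAST_INTERVAL = 1  # beyond this interval, travel times stay as in interval 1


def travel_time(sp, link, t):
    """Travel time on `link` when entered at time t under support point sp."""
    if link == 0:  # dummy link out of the dummy origin
        return 0
    return TT[sp][min(t, LAST_INTERVAL)][link]


def event_collection(true_sp, t):
    """Support points compatible with realized times on all links up to time t (POI)."""
    last = min(t, LAST_INTERVAL)
    return frozenset(
        sp for sp in TT
        if all(TT[sp][k] == TT[true_sp][k] for k in range(last + 1))
    )


def mu1(link, t, ev):
    """Routing policy mu_1 from the text: state (link, t, EV) -> next link."""
    if link == 0:
        return 1
    if link == 1:
        return 2 if ev == frozenset({"v1"}) else 3
    if link in (2, 3):
        return 4
    return None  # destination reached


def realize(policy, sp):
    """Follow `policy` on support point sp; return (realized path, arrival time)."""
    link, t, path = 0, 0, [0]
    while True:
        nxt = policy(link, t, event_collection(sp, t))
        if nxt is None:
            return path, t
        t += travel_time(sp, nxt, t)
        link = nxt
        path.append(nxt)


def contains_trace(path, trace):
    """True if the links of `trace` appear in `path` in the same order, gaps allowed."""
    it = iter(path)
    return all(link in it for link in trace)


path_v1, arrival_v1 = realize(mu1, "v1")
path_v2, arrival_v2 = realize(mu1, "v2")
print(path_v1, arrival_v1)
print(path_v2, arrival_v2)
```

Under these assumptions μ1 is realized as path 1 (links 0, 1, 2, 4) on v1 and as path 2 (links 0, 1, 3, 4) on v2, and a sparse trace matched only to links 1 and 4 is contained in both realized paths, which is the ambiguity described above.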

2.3. Model specification and estimation

It is hypothesized that there are two classes of travelers, routing policy users who follow routing policies, and path users
who follow fixed paths. λ is defined as the probability of a traveler belonging to the policy user class, and thus (1 - λ) is the
probability of a traveler belonging to the path user class. The major difference between the two classes is the choice sets,
where the routing policy choice set $\tilde{C}_n$ is a superset of the path choice set $C_n$, as a path is a special routing policy where routing decisions are independent of real-time information. In general, an attribute of a routing policy (e.g., travel time, number of intersections) is calculated as the expected value of that attribute over its realized paths.
Eqs. (1) and (2) show that the choice of an alternative (path i or policy μ) for individual n from either class is described by a Logit model with systematic utility V, which is a function of explanatory variables; the parameters of the variables (β or β′) are to be estimated from data. PS (Path Size) is a deterministic correction for overlapping of paths, and PoS (Policy Size) is its counterpart for routing policies, calculated as the expected path size. The utility functions and parameter sets could differ by class, and a simplified case is when the difference is only by a scale, i.e. β = Scale ∗ β′. PSi and PoSμ can be calculated by Eqs. (3) and (4) respectively.
$$P(i \mid C_n; \beta) = \frac{\exp\left(V_i(\beta) + \ln PS_i\right)}{\sum_{j \in C_n} \exp\left(V_j(\beta) + \ln PS_j\right)} \qquad (1)$$

$$P(\mu \mid \tilde{C}_n; \beta') = \frac{\exp\left(V_\mu(\beta') + \ln PoS_\mu\right)}{\sum_{\theta \in \tilde{C}_n} \exp\left(V_\theta(\beta') + \ln PoS_\theta\right)} \qquad (2)$$

$$PS_i = \sum_{l \in I_i} \left(\frac{T_l}{T_i}\right) \frac{1}{M_{l,n}} \qquad (3)$$

where
$I_i$ = set of links of path i,
$T_l$ = travel time of link l,
$T_i$ = travel time of path i,
$M_{l,n}$ = number of paths in choice set $C_n$ using link l.

$$PoS_\mu = \sum_{r=1}^{R} \left[ \sum_{l \in I_\mu^r} \left(\frac{T_l^r}{T_\mu^r}\right) \frac{1}{M_{l,n}^r} \right] P(r) \qquad (4)$$

where
$I_\mu^r$ = set of links on the realized path of routing policy μ for support point r,
$T_l^r$ = travel time of link l for support point r,
$T_\mu^r$ = realized travel time for routing policy μ for support point r,
$M_{l,n}^r$ = number of routing policies in choice set $\tilde{C}_n$ using link l for support point r, and
$P(r)$ = probability of support point r.
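
A minimal sketch of Eqs. (3) and (4) may help make the two size terms concrete. The function names and data layout are assumptions for illustration. The static link times in the Path Size example are hypothetical, the realized link times in the Policy Size example are read from Table 1, and the behavior of μ3 (path 2 on v1, path 1 on v2) is inferred from the text rather than stated in it.

```python
def path_size(path, choice_set, link_time):
    """Eq. (3). path: list of links; choice_set: list of paths; link_time: {link: T_l}."""
    total = sum(link_time[l] for l in path)  # T_i
    ps = 0.0
    for l in path:
        m = sum(1 for p in choice_set if l in p)  # M_{l,n}
        ps += (link_time[l] / total) / m
    return ps


def policy_size(mu, realized, prob):
    """Eq. (4). realized[r][policy]: list of (link, realized travel time) on support point r."""
    pos = 0.0
    for r, by_policy in realized.items():
        path = by_policy[mu]
        total = sum(t for _, t in path)  # T_mu^r
        inner = 0.0
        for link, t in path:
            # M^r_{l,n}: policies in the choice set whose realized path on r uses this link
            m = sum(1 for p in by_policy.values() if any(l == link for l, _ in p))
            inner += (t / total) / m
        pos += prob[r] * inner
    return pos


# Path Size with hypothetical static link times.
paths = [[1, 2, 4], [1, 3, 4]]
static_time = {1: 1, 2: 3, 3: 2, 4: 1}
ps1 = path_size(paths[0], paths, static_time)
ps2 = path_size(paths[1], paths, static_time)

# Policy Size with realized link times taken from Table 1.
p1_v1 = [(1, 1), (2, 3), (4, 1)]
p2_v1 = [(1, 1), (3, 2), (4, 1)]
p1_v2 = [(1, 1), (2, 2), (4, 1)]
p2_v2 = [(1, 1), (3, 2), (4, 1)]
realized = {
    "v1": {"mu1": p1_v1, "mu2": p1_v1, "mu3": p2_v1, "mu4": p2_v1},
    "v2": {"mu1": p2_v2, "mu2": p1_v2, "mu3": p1_v2, "mu4": p2_v2},
}
prob = {"v1": 0.5, "v2": 0.5}
pos_mu1 = policy_size("mu1", realized, prob)
print(ps1, ps2, pos_mu1)
```

With these inputs the path sizes are 0.8 and 0.75, and the policy size of μ1 is 0.5 × 0.4 + 0.5 × 0.375 = 0.3875, up to floating-point rounding.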
Route choice observations are obtained from individual level passive GPS readings. In some applications these readings
are sparse with large gaps (e.g., longer than 1 minute), and thus an individual’s chosen route cannot be uniquely identified.
The estimation problem is thus based on maximizing the likelihood of observing vehicle traces, where a trace is an ordered
set of map-matched links between an OD pair where the links are generally not consecutive.1
Eq. (5) describes the likelihood of observing trace g for a path user n on day r. The first equality shows that day r is
irrelevant, since the individual does not adapt her choice to realized traffic conditions on any given day. The likelihood of
observing trace g is the sum of the likelihood of observing paths from the choice set Cn that contain trace g. P(g|i) is a
binary indicator which is equal to 1 if path i contains trace g and 0 otherwise.

$$P^{\text{path}}_{n,r}(g \mid \beta) = P^{\text{path}}_{n}(g \mid \beta) = \sum_{i \in C_n} P(i \mid C_n; \beta)\, P(g \mid i) \qquad (5)$$

Eq. (6) describes the likelihood of observing trace g for a policy user n on day r as the sum of the likelihood of choosing
policies from the choice set C˜n that contain GPS trace g. A routing policy μ is not observable and it is viewed as chosen if
the realized path i on day r contains trace g. Pr (g|μ) is a binary indicator which is equal to 1, if the realized path of routing
policy μ on day r contains trace g, and 0 otherwise.
  
$$P^{\text{policy}}_{n,r}(g \mid \beta') = \sum_{\mu \in \tilde{C}_n} P(\mu \mid \tilde{C}_n; \beta')\, P_r(g \mid \mu) \qquad (6)$$

Eq. (7) describes the likelihood of observing a GPS trace g on day r for individual n as the convex combination of the likelihoods from the two classes. λ is represented by a logit-form membership function in Eq. (8), where W is a linear function of a constant and explanatory variables for being a routing policy user. The explanatory variables could include trip attributes, such as an indicator of a long trip, and characteristics of the travelers, such as an indicator of an experienced driver. These variables are not alternative-specific.
 
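$$P_{n,r}(g \mid \beta, \beta') = \lambda\, P^{\text{policy}}_{n,r}(g \mid \beta') + (1 - \lambda)\, P^{\text{path}}_{n,r}(g \mid \beta) \qquad (7)$$

$$\lambda = \frac{\exp(W)}{\exp(W) + 1} \qquad (8)$$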
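
Eqs. (5) to (8) can be combined into a short likelihood routine. The following is a sketch under stated assumptions, not the estimation code used in the study: the function names are hypothetical, each utility passed in is taken to already include the ln PS or ln PoS term, and the realized paths of μ1 to μ4 follow the illustrative example (with μ3 inferred as path 2 on v1 and path 1 on v2).

```python
import math


def contains_trace(path, trace):
    """P(g|i) as a boolean: links of `trace` appear in `path` in order, gaps allowed."""
    it = iter(path)
    return all(link in it for link in trace)


def logit(utilities):
    """Logit probabilities from {alternative: V + ln(size term)}."""
    vmax = max(utilities.values())  # subtract the maximum for numerical stability
    expv = {a: math.exp(v - vmax) for a, v in utilities.items()}
    total = sum(expv.values())
    return {a: e / total for a, e in expv.items()}


def path_likelihood(trace, path_utils, paths):
    """Eq. (5): sum of choice probabilities of paths containing the trace."""
    probs = logit(path_utils)
    return sum(p for i, p in probs.items() if contains_trace(paths[i], trace))


def policy_likelihood(trace, day, policy_utils, realized):
    """Eq. (6): sum over policies whose realized path on `day` contains the trace."""
    probs = logit(policy_utils)
    return sum(p for mu, p in probs.items()
               if contains_trace(realized[mu][day], trace))


def membership(W):
    """Eq. (8): probability of belonging to the routing policy user class."""
    return math.exp(W) / (math.exp(W) + 1.0)


def trace_likelihood(trace, day, W, path_utils, paths, policy_utils, realized):
    """Eq. (7): convex combination of the two class likelihoods."""
    lam = membership(W)
    return (lam * policy_likelihood(trace, day, policy_utils, realized)
            + (1.0 - lam) * path_likelihood(trace, path_utils, paths))


paths = {"path1": [1, 2, 4], "path2": [1, 3, 4]}
realized = {
    "mu1": {"v1": paths["path1"], "v2": paths["path2"]},
    "mu2": {"v1": paths["path1"], "v2": paths["path1"]},
    "mu3": {"v1": paths["path2"], "v2": paths["path1"]},
    "mu4": {"v1": paths["path2"], "v2": paths["path2"]},
}
path_utils = {"path1": 0.0, "path2": 0.0}      # hypothetical utilities
policy_utils = {mu: 0.0 for mu in realized}    # hypothetical utilities

ambiguous = trace_likelihood([1, 4], "v1", 0.0, path_utils, paths, policy_utils, realized)
specific = trace_likelihood([1, 2], "v1", 0.0, path_utils, paths, policy_utils, realized)
print(ambiguous, specific)
```

In an estimation routine, the logarithm of `trace_likelihood` would be summed over observed traces and maximized over β, β′ and the coefficients in W; the optimizer itself is outside the scope of this sketch.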

2.4. Choice set generation

There could be numerous alternative paths/routing policies in a general transportation network for an OD pair, but some
of them may be unrealistic by being too circuitous or otherwise unsuitable. Therefore, the objective of the choice set gen-
eration is to provide a subset of realistic alternatives considered by travelers. For the path user class, the choice set Cn is
composed of fixed paths; for the policy user class, the choice set C˜n is composed of routing policies and contains the path
choice set as a subset, Cn ⊂ C˜n , as a path is a special type of routing policy.
Two choice set generation processes are carried out. The first process generates paths (Section 2.4.1), $D_n$, and the second process generates routing policies (Section 2.4.2), $\tilde{D}_n$, which contains two mutually exclusive and collectively exhaustive subsets: $\tilde{D}_n^{\text{path}}$, the degenerate routing policies; and $\tilde{D}_n^{\text{adaptive}}$, the routing policies that are adaptive, that is, realized as different paths over different support points. The final path choice set $C_n$ is the union of $D_n$ and $\tilde{D}_n^{\text{path}}$, while the final routing policy choice set $\tilde{C}_n$ is the union of $D_n$ and $\tilde{D}_n$.
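
The set algebra above can be sketched as follows. As a simplifying assumption made only for this illustration, a routing policy is represented by its realized path on each support point, so that a fixed path becomes a policy with the same realized path everywhere; the function names are hypothetical.

```python
def as_policy(path, support_points):
    """A fixed path viewed as a degenerate policy: same realized path on every support point."""
    return tuple((r, tuple(path)) for r in support_points)


def split_policies(policies):
    """Split generated policies into degenerate (path-like) and adaptive subsets."""
    degenerate, adaptive = set(), set()
    for mu in policies:
        realized_paths = {p for _, p in mu}
        (degenerate if len(realized_paths) == 1 else adaptive).add(mu)
    return degenerate, adaptive


def final_choice_sets(generated_paths, generated_policies, support_points):
    """Return (C_n, C~_n) as the unions described in the text."""
    degenerate, _ = split_policies(generated_policies)
    # C_n: generated paths plus the paths behind degenerate policies.
    c_path = {tuple(p) for p in generated_paths} | {mu[0][1] for mu in degenerate}
    # C~_n: generated paths (as degenerate policies) plus all generated policies.
    c_policy = ({as_policy(p, support_points) for p in generated_paths}
                | set(generated_policies))
    return c_path, c_policy


days = ("v1", "v2")
d_paths = [(1, 2, 4)]
d_policies = {
    as_policy((1, 3, 4), days),                 # degenerate policy
    (("v1", (1, 2, 4)), ("v2", (1, 3, 4))),     # adaptive policy
}
c_path, c_policy = final_choice_sets(d_paths, d_policies, days)
print(sorted(c_path), len(c_policy))
```

In this toy case the path choice set contains both paths and the routing policy choice set contains three alternatives, with every path also present as a degenerate policy.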

2.4.1. The generation of Dn


The generation of Dn follows the typical process in the literature in a transformed static and deterministic network, where the link travel time is the average over both time-of-day and days of the original STD link travel times. The most commonly used methods in the literature include link elimination, simulation, and generalized cost methods (for recent reviews, see, e.g., Bekhor et al. (2006); Ding-Mastera (2016)), which are adopted in this research, and results from all three methods are pooled.

1 The term “vehicle trajectory” is often used if the full route is observed, that is, map-matched links are consecutive.

Link elimination method. The shortest path is first calculated for an OD pair. Links on the shortest path are removed from
the network one at a time, and a new shortest path is generated and added to the choice set if not already included.
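
The link elimination procedure can be sketched with a standard Dijkstra search. The graph representation, the function names, and the small example network are assumptions for illustration; the study's own implementation details are not given in the text.

```python
import heapq


def shortest_path(graph, origin, dest, banned=frozenset()):
    """Dijkstra on graph = {node: {successor: cost}}, skipping links in `banned`."""
    dist, prev, done = {origin: 0.0}, {}, set()
    heap = [(0.0, origin)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dest:
            break
        for v, cost in graph.get(u, {}).items():
            if (u, v) in banned:
                continue
            if d + cost < dist.get(v, float("inf")):
                dist[v], prev[v] = d + cost, u
                heapq.heappush(heap, (d + cost, v))
    if dest not in dist:
        return None  # destination unreachable
    path = [dest]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return path[::-1]


def link_elimination(graph, origin, dest):
    """Remove each link of the shortest path in turn and collect the new shortest paths."""
    base = shortest_path(graph, origin, dest)
    if base is None:
        return []
    choice_set = [base]
    for link in zip(base, base[1:]):
        alt = shortest_path(graph, origin, dest, banned={link})
        if alt is not None and alt not in choice_set:
            choice_set.append(alt)
    return choice_set


# Hypothetical four-node network with average link travel times.
graph = {"A": {"B": 1.0, "C": 2.0}, "B": {"D": 1.0}, "C": {"D": 1.0}, "D": {}}
choice_set = link_elimination(graph, "A", "D")
print(choice_set)
```

Each link is removed one at a time and restored before the next removal, which is why the banned set holds a single link per search.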

Simulation method. Each link travel time is perturbed independently following a certain zero-mean distribution (usually a normal distribution) and a shortest path is generated for each joint perturbation of the link travel times. The path is added to the choice set if not already included. The number of joint perturbations is pre-determined and adjusted empirically.

Generalized cost method: highway bias, functional class change penalty, and intersection penalty. Additional attributes are included to capture considerations other than travel time (see estimation results in Section 3.3.2) and diversify the choice set. First, highway bias captures travelers’ preference towards highways: travel times on highway links are multiplied by a positive scalar smaller than 1. Second, a functional class change penalty is introduced to capture travelers’ aversion to switching between road types, especially getting on and off highways. A positive constant is added to any highway link whose succeeding link is non-highway and any non-highway link whose succeeding link is highway. Third, aversion to intersections is captured by adding a positive intersection penalty to any link ending at a 4-way or more complicated intersection. In each of the three cases, the relevant parameter is assigned a number of different values, and for each value a shortest path is generated and added to the choice set.
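
The simulation method and the highway bias variant of the generalized cost method both amount to re-running a shortest path search on modified link costs. The sketch below makes several assumptions that are not in the text: the function names, the example network, the parameter values, and the truncation of perturbed times at a small positive value so that Dijkstra's algorithm remains valid.

```python
import heapq
import random


def shortest_path(cost, origin, dest):
    """Dijkstra on link costs {(u, v): cost}; returns a tuple of nodes or None."""
    succ = {}
    for (u, v), c in cost.items():
        succ.setdefault(u, []).append((v, c))
    dist, prev, done = {origin: 0.0}, {}, set()
    heap = [(0.0, origin)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, c in succ.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v], prev[v] = d + c, u
                heapq.heappush(heap, (d + c, v))
    if dest not in dist:
        return None
    path = [dest]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return tuple(reversed(path))


def simulation_method(time, origin, dest, draws, sigma, rng):
    """One shortest path per joint perturbation with zero-mean normal noise."""
    found = []
    for _ in range(draws):
        # Assumption: truncate at a small positive value to keep costs valid.
        perturbed = {l: max(1e-6, t + rng.gauss(0.0, sigma)) for l, t in time.items()}
        p = shortest_path(perturbed, origin, dest)
        if p is not None and p not in found:
            found.append(p)
    return found


def highway_bias_method(time, highway, origin, dest, scales):
    """One shortest path per highway scale factor (each a positive scalar below 1)."""
    found = []
    for s in scales:
        biased = {l: t * s if l in highway else t for l, t in time.items()}
        p = shortest_path(biased, origin, dest)
        if p is not None and p not in found:
            found.append(p)
    return found


# Hypothetical network: A-B-D is local, A-C-D is highway.
time = {("A", "B"): 1.0, ("B", "D"): 1.0, ("A", "C"): 1.5, ("C", "D"): 1.5}
highway = {("A", "C"), ("C", "D")}
biased_paths = highway_bias_method(time, highway, "A", "D", scales=[0.9, 0.5])
simulated_paths = simulation_method(time, "A", "D", draws=50, sigma=0.5,
                                    rng=random.Random(0))
pooled = set(biased_paths) | set(simulated_paths)
print(biased_paths, sorted(pooled))
```

The functional class change penalty and the intersection penalty would follow the same pattern, adding a constant to the affected links instead of scaling them.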

2.4.2. The generation of D˜ n


Most methods for generating a path choice set can be generalized to generate a routing policy choice set by replacing the shortest path algorithm with the optimal routing policy (ORP) algorithm (Ding et al., 2014) and using the original STD
network. The essence of any path (routing policy) choice set generation method is to systematically modify the network and
generate a shortest path (ORP) in each modified network. The nuances in generalization are discussed below.
For path choice set generation, in the link elimination method, the modified network is a subnetwork of the original
network, while in the other two methods, the modified network is the same as the original network in topology (different
in some travel times), and thus the generated paths always exist in the original network.
On the other hand, a routing policy in an STD network is defined by not only topology but also link travel times, that is,
it is a mapping from states to actions where a state contains link travel time realizations. When travel times are changed in
the modified network, the resulting ORP does not necessarily exist in the original STD network, as the link travel times that
define the state space have changed. When the number of links with different travel times is small, states in the original
and modified networks can be matched following certain rules for finding “closest neighbors”. This is the case for the link
elimination and generalized cost methods. In the simulation method, however, almost all link travel times have changed in
the modified network, and it is conceivably difficult to find meaningful matching between states in the original and modified
networks, especially when the variance of the random perturbations is large. Therefore, the generation of D˜ n only entails
the link elimination and generalized cost methods.

3. Stockholm case study

As the capital and the largest city of Sweden, Stockholm constitutes the most populated urban area in Scandinavia. As for the transportation network, Stockholm is at the junction of the European routes E4, E18 and E20, and a half-completed motorway ring road exists on the south and west sides of the city center.
A subset of the Stockholm network is studied, which includes the Arlanda airport area, E4 motorway between the airport
and the city, and northeast part of the inner city. In this sub-network, according to the observations of local residents, taxi
drivers adapt to traffic conditions when making route choices going into and out of the city center. In particular, between
point A and point B shown in Fig. 3, there is a choice between two common routes, either the western route along E4 or the
eastern route along E18 and LV276.

3.1. Data processing

Network and map-matching. The network is represented as a directed graph with links for streets and nodes for intersections and for locations where link attributes change (Table 2). Each link has a number of attributes including speed limit, functional
class and presence of traffic signal. The network is simplified so that links in series with identical speed limit and functional
class attributes are merged, reducing time and memory requirements of subsequent processing.
Time-stamped GPS coordinates of taxis from a fleet management system in Stockholm were obtained from November
1, 2012 through January 18, 2013, covering the time periods of Mondays through Fridays, resulting in 56 days (support
points). They are matched to the road network using a 4-step map-matching method designed for sparse Floating Car Data
(FCD), which is data collected from traced vehicles that “float” with the traffic (Rahmani and Koutsopoulos, 2013). The
method first finds candidate links in the vicinity of each GPS coordinate, then connects the candidate links of each pair

Fig. 3. Road network in the Stockholm case study.

Table 2
Statistics for the Stockholm network.

# of nodes 2872

# of links 5447
# of stochastic links 619
# of taxis 1500
# of support points 56
GPS reading time gap 1–2 min
# of hired taxi traces 4520
# of traces for model estimation 500
Time interval length 5 min
Study period duration 7:30 AM–11:30 AM
Departure time duration 7:30 AM–9:00 AM

of coordinates. The method then creates a candidate graph between a sequence of coordinates and, finally, finds the most
likely path (inferred path) from the candidate graph.

Vehicle traces. Vehicle traces are the route choice observations against which the proposed model is estimated as shown
in Eq. (7). Only hired taxi traces are used, since when there are passengers on board, taxi drivers have clearly specified
origins and destinations, and their objectives and behaviors are conceivably similar to those of regular commuters, whereas
for-hire taxis roam the network in order to pick up passengers. It is likely that taxi drivers are more experienced, aggressive,
and knowledgeable about the area than regular commuters. Therefore, the developed model represents behaviors of a
subset of the general drivers who are knowledgeable about the network and sensitive to real-time traffic information. The
methodology, however, is general and can be applied to model regular commuters’ behaviors if data is available.

Empirical, joint link travel time distribution. The distribution is represented as a collection of support points, where a
support point is comprised of travel times on all links over all times for a given day. A non-parametric method is used to
compute the link travel times per time interval using the map-matched GPS data. For each road segment between a pair
of GPS coordinates, the observed travel time (i.e., the difference between the time stamps) is decomposed to the traversed

Table 3
Summary of coverage for each choice set generation method (100% overlap).

Choice sets      Methodology                                 Average # of paths/policies in choice sets   # of matching trips   Coverage
Path             (1) Link elimination                        15                                           450                   90.0%
                 (2) Simulation                              3                                            418                   83.6%
                 (3) Generalized cost                        23                                           454                   90.8%
                 (4) Combination of (1), (2) and (3)         24                                           456                   91.2%
                 (5) Combination of (4) and Paths from (8)   24                                           456                   91.2%
Routing policy   (6) Link elimination                        27                                           438                   87.6%
                 (7) Generalized cost                        2                                            402                   80.4%
                 (8) Combination of (6) and (7)              28                                           452                   90.4%
                 (9) Combination of (8) and (4)              52                                           456                   91.2%

links proportionally to their free-flow speeds and overlapping lengths. The weighted average, where the weight reflects
the overlap with both the considered link and other links, over observations from different vehicles within the same time
interval is the estimated link travel time. The travel time estimation is performed for each time interval separately for each
day in the data set, producing an empirical, joint travel time distribution. Please refer to Rahmani et al. (2015) for detailed
evaluation of the method.
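The decomposition and averaging steps can be illustrated with a minimal Python sketch. It is not the estimator of Rahmani et al. (2015): the function names are hypothetical, and the allocation rule (each link's share is proportional to its free-flow traversal time over the overlapped length) is an assumed reading of "proportionally to their free-flow speeds and overlapping lengths".

```python
def allocate_travel_time(observed_time, segments):
    """Split one observed travel time over the traversed links.

    segments: list of (link_id, overlap_length_m, free_flow_speed_mps).
    Assumption: a link's share is proportional to its free-flow
    traversal time over the overlapped length (length / speed).
    """
    free_flow = {link: length / speed for link, length, speed in segments}
    total = sum(free_flow.values())
    return {link: observed_time * t / total for link, t in free_flow.items()}


def weighted_link_time(observations):
    """Weighted average of (travel_time, weight) pairs for one link and interval."""
    total_weight = sum(weight for _, weight in observations)
    return sum(time * weight for time, weight in observations) / total_weight


# Hypothetical example: 90 s observed between two GPS points over two links.
parts = allocate_travel_time(90.0, [("a", 500.0, 25.0), ("b", 250.0, 12.5)])
# Hypothetical example: two vehicles observed on the same link and interval.
estimate = weighted_link_time([(40.0, 1.0), (50.0, 3.0)])
```

In this sketch the weights passed to `weighted_link_time` are supplied by the caller; how the paper derives them from the overlap is not reproduced here.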
With the available data, there are link-day-interval combinations for which the travel time cannot be estimated due
to lack of observations. These missing values are filled in through a sequence of inter/extrapolation steps. Furthermore,
unreasonably high or low link travel times are removed to produce reliable estimates.
A link is treated as deterministic when there is not enough variation of travel time over time and day, or not enough
observations to derive reliable travel time estimates. In this case, a single mean travel time is estimated across all days and
time intervals.

Vehicle trace sampling. 500 out of 4520 hired taxi traces are sampled for model estimation. To ensure geographic spread,
the airport area is divided into three zones and the downtown area is divided into nine zones. A total of 500 trips are then
sampled with trip ends (Os and Ds) evenly distributed across the airport and downtown zones.

3.2. Choice set evaluation

Path choice sets and routing policy choice sets are generated for the 500 vehicle traces. Their coverage and adaptiveness
are reported in Sections 3.2.1 and 3.2.2 respectively. The systematic approach described in Section 2.4 is applied first and
its results are reported. To further improve the coverage, the specific data set is then investigated and those results
are reported next.

3.2.1. Coverage
Coverage describes how well generated choice sets cover observed routes, following the same criterion used in the lit-
erature (Ben-Akiva et al., 2004). A choice set is said to cover an observed route, if a generated alternative in the choice set
“matches” the observed route. Coverage is defined as the percentage of observed routes that are covered by their choice sets.
The definition of “matching” warrants a few notes:
• Matching is based on a certain level of overlap, defined as the percentage of the observed route’s travel time shared by a
generated route. A higher overlap is a stricter criterion of matching. Overlap lower than 100% is typically used when the
network contains a large number of similar routes that are difficult to match to 100%, such as a downtown grid network
encountered in this case.
• Due to the relatively large gap (1–2 minutes) of GPS readings used in the study, vehicle traces instead of complete routes
were observed. Therefore the calculation of overlap and the determination of matching are based on a trace (a sequence
of non-consecutive links) instead of a route.
• The network is STD for modeling purposes, but for checking coverage it is converted to a deterministic and static network.
• A routing policy is a contingency plan and not observable, and only the realized path on a given day can be observed.
Therefore a generated routing policy is matched to an observed trace on a given day, if the routing policy is realized, on
the same day, as a path that is matched to the observed trace.
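The matching rules above can be sketched in Python as follows. This is a minimal illustration, assuming that traces and alternatives are lists of link identifiers and that `travel_time` is a hypothetical dictionary of deterministic link travel times; for a routing policy, the alternative passed in would be its realized path on the day of the observation.

```python
def overlap(observed_trace, generated_path, travel_time):
    """Share of the observed trace's travel time on links also in the generated path."""
    generated = set(generated_path)
    total = sum(travel_time[link] for link in observed_trace)
    shared = sum(travel_time[link] for link in observed_trace if link in generated)
    return shared / total


def is_covered(observed_trace, choice_set, travel_time, threshold=1.0):
    """True if at least one alternative reaches the overlap threshold."""
    return any(overlap(observed_trace, alt, travel_time) >= threshold
               for alt in choice_set)


def coverage(traces, choice_sets, travel_time, threshold=1.0):
    """Fraction of observed traces covered by their own choice set."""
    covered = sum(is_covered(trace, alts, travel_time, threshold)
                  for trace, alts in zip(traces, choice_sets))
    return covered / len(traces)


# Hypothetical data: one trace and a choice set with two alternatives.
travel_time = {"a": 60, "b": 30, "c": 10, "d": 45}
trace = ["a", "b", "c"]
alternatives = [["a", "b", "d"], ["a", "d"]]
best_overlap = max(overlap(trace, alt, travel_time) for alt in alternatives)
```

With these toy numbers the best alternative shares 90 of the 100 time units, so the trace is covered at a 90% threshold but not at 100%.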

Results from choice set generation methods. Table 3 shows the average number of alternatives per choice set and the coverage
for each method described in Section 2.4 separately, as well as those from pooling the results from multiple methods,
all with 100% overlap. Fig. 4 shows a scatter plot of the number of alternatives in a choice set for each of the 456 originally
matched trips. It can be seen that the size of a choice set ranges from 2 to over 100, and the average choice set size
is 24 for the path choice set and 52 for the routing policy choice set. Both the path and routing policy choice sets achieve a
coverage of 91.2% (Table 3).
It is also noted that generated alternatives always overlap with the chosen alternative to a certain extent, that is, al-
ternatives that are completely edge-disjoint with the chosen alternative do not exist. Further investigation shows that the

Fig. 4. # of Alternatives in a choice set for the 456 originally matched trips.

Table 4
A summary of different methods that improve the coverage.

Method                                             # of additionally matched trips   Overlap   Coverage
Choice set generation (Section 2.4)                456                               100%      91.2%
Correcting GPS mistakes                            12                                100%      93.6%
Breaking up trips with intermediate destinations   7                                 100%      95%
Relaxing overlap threshold for downtown trips      25                                90%       100%

overlap often occurs at the start or end of a trip. This is intuitive given that this data set is restricted to trips in or out
of the airport, which usually has limited access roads. The diversity of the choice sets in terms of number of edge-disjoint
alternatives would be an interesting metric in a more diversified data set.

Further investigation of the data set and network. To further improve the coverage, investigation of the specific data set
and network used in the case study is conducted manually. Obvious GPS mistakes are identified and corrected. One example
is shown in Fig. 5 where a GPS link is out of the vicinity of any other GPS links. Red links are map-matched GPS readings,
and the blue route is a generated alternative.
Unusually long detours could be caused by an intermediate destination. One example is shown in Fig. 6 in which the
traveler takes a detour south before traveling north to the final destination. A trip with an intermediate destination is
manually divided into two trips and the major trip is kept for model estimation.
Some trips are not covered due to the many similar alternatives in a downtown grid network. Overlap is thus relaxed to
90% to avoid unnecessarily increasing the choice set size without adding much insight.
Table 4 shows a summary of the improvement of coverage. After correcting GPS mistakes and breaking up trips with
intermediate stops, the coverage achieves 95% at a 100% overlap. A 100% coverage is achieved when the overlap threshold
is relaxed to 90% for trips that go through downtown.
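The cumulative coverage figures can be checked with a few lines of Python. The trip counts are those reported in Table 4; the variable names are illustrative only.

```python
total_trips = 500
# (method, number of additionally matched trips), as reported in Table 4
steps = [
    ("Choice set generation (Section 2.4)", 456),
    ("Correcting GPS mistakes", 12),
    ("Breaking up trips with intermediate destinations", 7),
    ("Relaxing overlap threshold for downtown trips", 25),
]

covered = 0
cumulative = []
for method, added in steps:
    covered += added
    cumulative.append((method, covered, round(100.0 * covered / total_trips, 1)))

for method, count, percent in cumulative:
    print(f"{method}: {count} trips, {percent}% coverage")
```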

3.2.2. Adaptiveness
Adaptiveness for a routing policy is defined as the ratio of the number of different realized paths divided by the num-
ber of support points (days in this study). If a routing policy is realized as the same path on all days, the routing policy
adaptiveness is 1 divided by 56, which equals 0.0179; if a routing policy is realized as a different path on every different
day, the routing policy adaptiveness is 1. Adaptiveness for a given OD pair is the average over all routing policies in the
choice set. This is a new criterion for routing policy choice sets, as path choice sets contain paths that are fixed over days.

Fig. 5. An example of GPS mistake.

Fig. 6. An example of long detour.



Fig. 7. Adaptiveness histogram of the 456 originally covered routing policy choice sets.

It evaluates how well a routing policy choice set captures adaptive behavior. Note that this is an ad hoc measure dependent
on the number of support points, so comparisons across different datasets are not advised.
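The adaptiveness measure can be written as a short Python sketch, assuming a routing policy is summarized by its realized path (a sequence of link identifiers) on each support point. The function names and toy paths are hypothetical.

```python
def policy_adaptiveness(realized_paths):
    """Number of distinct realized paths divided by the number of support points.

    realized_paths: one realized path (sequence of link ids) per support point.
    """
    distinct = {tuple(path) for path in realized_paths}
    return len(distinct) / len(realized_paths)


def choice_set_adaptiveness(policies):
    """Average adaptiveness over all routing policies in a choice set."""
    return sum(policy_adaptiveness(p) for p in policies) / len(policies)


# Hypothetical policies over 56 support points (days).
fixed = [["a", "b"]] * 56                    # same path every day
fully_adaptive = [[day] for day in range(56)]  # a different path every day
lowest = policy_adaptiveness(fixed)
highest = policy_adaptiveness(fully_adaptive)
```

The two toy policies reproduce the bounds stated in the text: 1/56 (about 0.0179) for a policy that never diverts and 1 for a policy realized as a different path every day.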
440 out of the 456 originally covered routing policy choice sets have adaptiveness larger than 0.0179, indicating that
most routing policies are realized as different paths over days. The average adaptiveness is 0.113 (6.3 different paths over 56
days), and the median is 0.103 (5.8 different paths over 56 days). The histogram is shown in Fig. 7.
The adaptiveness of the 440 routing policy choice sets increases with the expected travel time averaged over the choice
set, as shown in Fig. 8. This trend is intuitive, as longer trips generally allow for more diversion opportunities.

3.3. Systematic utility specification and model estimation

3.3.1. Systematic utility functions


Long Trip Dummy and Alternative Specific Constant (ASC) are in the membership function for routing policy user prob-
ability. Long Trip Dummy is a dummy variable that equals 1 if the shortest path travel time between the OD is at least 15
minutes, and 0 otherwise.
The systematic utility function for a path or routing policy alternative is linear in parameters, with attributes of Expected
Travel Time (min), Travel Time Range (min), interaction term between Travel Time Range and Airport Bound (dummy),
# of Signals, # of Left Turns, # of Functional Class Changes, Average Speed (m/s), as well as dummy variables for Min
Expected Travel Time, Max Expected % of Highway Distance, and Min # of Functional Class Changes. For routing policies,
the attributes are averaged over all support points. The parameters of Policy Size and Path Size are fixed at 1 following the
original definition of Path Size (Ben-Akiva and Ramming, 1998). The attribute of Travel Time Range (the difference of the
maximum and minimum travel time) is a measure of travel time reliability. Other measures of reliability have also been
tested, including travel time standard deviation, variance, travel time reserve (difference between the 95th percentile and median
travel time), and coefficient of variation (the ratio of the travel time standard deviation and the mean travel time). Average
Speed is calculated as the distance divided by Expected Travel Time. The parameters for the two classes of travelers differ
by a scale (Path Parameters = Scale × Policy Parameters), as introduced in Section 2.3.
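The reliability measures listed above can be computed from the travel times of one alternative over the support points, as in the following minimal sketch. The choice of population standard deviation and of the inclusive percentile interpolation are assumptions, since the text does not specify them, and the sample travel times are hypothetical.

```python
import statistics


def reliability_measures(times):
    """Reliability measures of one alternative's travel times over support points."""
    mean = statistics.fmean(times)
    std = statistics.pstdev(times)  # assumption: population standard deviation
    # 19 cut points for n=20; the last one is the 95th percentile.
    p95 = statistics.quantiles(times, n=20, method="inclusive")[18]
    return {
        "range": max(times) - min(times),             # Travel Time Range
        "std": std,
        "variance": std ** 2,
        "reserve": p95 - statistics.median(times),    # 95th percentile minus median
        "cv": std / mean,                             # coefficient of variation
    }


# Hypothetical travel times (min) of one alternative on five days.
measures = reliability_measures([10, 12, 11, 15, 12])
```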

3.3.2. Latent-class routing policy model estimation results


All model estimation was performed using BIOGEME Python 2.0 (Bierlaire, 2003, 2008). Table 5 presents the estimation
results of the latent-class routing policy choice model as well as two restricted models, based on the 475 covered trips with
100% overlap threshold.
Long Trip Dummy coefficient in the routing policy user membership function is positive and significant at the 0.1 level,
indicating that travelers are more likely to look ahead for longer trips, which is intuitive since longer trips allow for more
diversion possibilities and travelers plan more carefully for longer trips. Individual characteristics, such as driving experience
and familiarity with technology, might affect the membership function; however, such factors cannot be investigated in this
study due to data limitations. As smartphone-based surveys become more prevalent, passive travel trajectory monitoring can

Fig. 8. Adaptiveness vs. expected travel time.

Table 5
Estimation results for latent-class routing policy model and restricted models.

Parameters                           Latent-class policy model   Policy user probability = 1   Path user probability = 1
                                     Estimate    t-stat          Estimate    t-stat            Estimate    t-stat
ASC                                  -2.23       -1.60           NA                            NA
Long Trip Dummy (SPT ≥ 15 min)       3.60        1.75∗           NA                            NA
Expected Travel Time (min)           -1.15       -2.91∗∗∗        -0.669      -5.34∗∗∗          -0.626      -5.19∗∗∗
Travel Time Range (min)              0.162       0.94            0.153       1.40              0.0558      0.610
Travel Time Range ∗ Airport Bound    -0.894      -1.96∗∗         -0.591      -2.55∗∗           -0.485      -2.17∗∗
# of Signals                         -0.266      -1.86∗          -0.149      -1.83∗            -0.165      -2.38∗∗
# of Left Turns                      -0.992      -2.83∗∗∗        -0.494      -2.40∗∗           -0.604      -2.99∗∗∗
# of Functional Class Changes        -2.30       -3.49∗∗∗        -1.32       -6.21∗∗∗          -1.15       -6.84∗∗∗
Average Speed (m/s)                  1.82        2.63∗∗∗         0.877       4.91∗∗∗           1.04        4.98∗∗∗
Min Expected Travel Time             2.78        5.07∗∗∗         1.38        2.71∗∗∗           1.24        3.91∗∗∗
Max Expected % of Highway Distance   1.78        2.23∗∗          1.34        2.67∗∗∗           0.941       2.91∗∗∗
Min # of Functional Class Changes    2.19        2.40∗∗          2.58        8.01∗∗∗           1.03        2.39∗∗
Path Class Scale                     0.491       3.86∗∗∗         NA                            NA
Sample Size                          475                         475                           475
# of Parameters                      13                          10                            10
Adjusted Rho Squared                 0.620                       0.602                         0.616
Null Loglikelihood                   -731.4                      -731.4                        -731.4
Final Loglikelihood                  -265.2                      -281.4                        -271.1

NA indicates that the parameter is not included in a model.



∗ : significant at the 0.10 level; ∗∗ : significant at the 0.05 level; ∗∗∗ : significant at the 0.01 level.

be combined with user surveys to enrich the dataset and thus help better understand looking-ahead behaviors as related to
individual characteristics.
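To give a sense of the magnitude of the Long Trip Dummy effect, the sketch below evaluates the routing policy user probability implied by the ASC and Long Trip Dummy estimates in Table 5. It assumes that the membership function of Section 2.3 is a binary logit; that functional form and the function names are assumptions made here for illustration.

```python
import math


def policy_user_probability(long_trip, asc=-2.23, beta_long=3.60):
    """Probability of being a routing policy user (assumed binary logit).

    long_trip: 1 if the shortest path travel time is at least 15 min, else 0.
    asc, beta_long: estimates taken from Table 5.
    """
    return 1.0 / (1.0 + math.exp(-(asc + beta_long * long_trip)))


def mixture_probability(p_given_policy_user, p_given_path_user, membership):
    """Latent-class mixture of the two class-specific choice probabilities."""
    return membership * p_given_policy_user + (1.0 - membership) * p_given_path_user


p_short = policy_user_probability(0)
p_long = policy_user_probability(1)
print(f"short trips: {p_short:.3f}, long trips: {p_long:.3f}")
```

Under this assumed form, the implied routing policy user probability is roughly 0.10 for short trips and roughly 0.80 for long trips.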
One of the most important factors affecting travelers’ choices is travel time. Travelers do not like long travel times, and
the negative sign of the coefficient for Expected Travel Time and the positive sign of that for Min Expected Travel Time
Dummy agree with the intuition.
Travelers also in general do not like variations in travel time (represented by Travel Time Range), and it is shown that the
attitude towards travel time variation varies by travel direction. Travelers are risk neutral, that is, the variation in travel time
has no impact, when not traveling towards the airport, as indicated by the statistically and numerically insignificant parameter
estimate. They are risk averse when traveling airport bound, as indicated by the statistically significant negative coefficient of
the interaction between Travel Time Range and Airport Bound Dummy. This is intuitive since variation in travel time when

going to the airport can have the serious consequence of a missed flight. The ratio of the coefficient estimates for travel
time range and travel time mean for airport-bound trips is around 0.78 (0.894/1.15), indicating that travelers are willing to accept
0.78 minutes of average travel time increase to obtain a 1 minute reduction in travel time range. A recent literature review
(Jin et al., 2015) shows a wide range of estimates for the value of reliability (VOR) in comparison to the value of time (VOT),
due to the large variation in the definition of variability (standard deviation, variance, schedule delay), and the types of data
used (RP survey, SP survey, loop detector and dynamic toll data). VOR estimates varied from 0.55 to 3.22 times the VOT
estimates.
# of Signals and # of Left Turns estimates show that alternatives with fewer signals and left turns are preferred. # of
Functional Class Changes and Min # of Functional Class Changes Dummy estimates suggest that travelers also prefer not to
switch on/off highways frequently. Speed is also an important factor that affects travelers’ route choice. For instance, given
two alternatives with the same travel time, many travelers choose the one with the higher speed even if it is longer in distance. This
phenomenon is related to travelers’ preference for highways, which is further substantiated by the positive estimate for Max
Expected % of Highway Distance Dummy. While Policy Size and Path Size coefficients are fixed at 1, the parameters
for path users are 0.491 times those for routing policy users (statistically different from 0 or 1 at the 0.01 level). This
suggests higher random errors in path user utility functions. The reasons for this difference are not immediately clear, and
one hypothesis is that path users who are fixated on a particular path might be less knowledgeable about the network, and
thus have higher perception errors of the route attributes.
Overall the model achieves a final log likelihood of -265.2, and an adjusted rho squared of 0.620 when compared with a
null model. The null model is a path choice model where all parameters are zero except that for Path Size to discount paths
that are overlapping. This is a more reasonable benchmark than the equal-probability model, which can be manipulated to
have a very low log likelihood (and thus an inflated model fit for the final model) by adding a large number of alternatives
to the choice set.
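As a check on the reported fit, the following sketch applies the standard definition of the adjusted rho squared, 1 − (LL_final − K) / LL_null, with K the number of estimated parameters. Using this particular definition is an assumption here, although it reproduces the three values in Table 5.

```python
def adjusted_rho_squared(ll_final, ll_null, n_params):
    """Adjusted rho squared: 1 - (LL_final - K) / LL_null."""
    return 1.0 - (ll_final - n_params) / ll_null


ll_null = -731.4  # null log likelihood (Table 5)
# (final log likelihood, number of parameters) per model, from Table 5
models = {
    "latent-class": (-265.2, 13),
    "policy-only": (-281.4, 10),
    "path-only": (-271.1, 10),
}
fit = {name: round(adjusted_rho_squared(ll, ll_null, k), 3)
       for name, (ll, k) in models.items()}
print(fit)
```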

3.3.3. The latent-class model vs. restricted models


Two restricted models are estimated where all users are path users or routing policy users respectively. The latent-class
model reduces to either of the restricted models when the class probability approaches 0 or 1, achieved by setting the ASC
in the membership function to either positive or negative infinity. The attributes in the restricted models are similar to
those in the unrestricted, latent-class model, except that there are no path user class scale or membership function related
parameters. The restricted path choice model uses the path choice sets only, and the restricted routing policy model uses
the routing policy choice sets only.
A likelihood ratio test performed on the unrestricted latent-class routing policy model over the two restricted models
shows that either of the restricted models is rejected at the 0.05 level. This suggests that travelers are heterogeneous in
terms of their ability and willingness to plan ahead and utilize real-time information. Therefore, there could be potential
biases when simplified assumptions are applied that travelers follow fixed path choice under real-time information. An
appropriate route choice model for uncertain networks should take into account the underlying stochastic travel times and
structured traveler heterogeneity in terms of real-time information utilization.
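The likelihood ratio statistics can be recomputed from the final log likelihoods in Table 5, as in the sketch below. Treating each statistic as chi-squared with 3 degrees of freedom (13 − 10 parameters) is an assumption made here for illustration; the closed-form tail probability used is specific to 3 degrees of freedom.

```python
import math


def likelihood_ratio(ll_unrestricted, ll_restricted):
    """Likelihood ratio statistic: 2 * (LL_unrestricted - LL_restricted)."""
    return 2.0 * (ll_unrestricted - ll_restricted)


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


ll_latent = -265.2                       # latent-class model (Table 5)
lr_policy = likelihood_ratio(ll_latent, -281.4)  # vs. policy-only model
lr_path = likelihood_ratio(ll_latent, -271.1)    # vs. path-only model
p_policy = chi2_sf_3df(lr_policy)
p_path = chi2_sf_3df(lr_path)
print(f"policy-only: LR = {lr_policy:.1f}, p = {p_policy:.2g}")
print(f"path-only:   LR = {lr_path:.1f}, p = {p_path:.2g}")
```

Both statistics (about 32.4 and 11.8) exceed the 0.05 critical value of 7.815 for 3 degrees of freedom, in line with the rejection reported above.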

3.3.4. Impact of relaxing the overlap threshold


As discussed in Section 3.2.1, 25 out of the 500 trips are not covered with 100% overlap and are covered when overlap is
relaxed to 90%. To investigate the impact of relaxing overlap threshold due to the non-coverage issue, an additional latent-
class routing policy choice model is estimated with all 500 trips. Note that given the relaxed overlap threshold in assessing
coverage, the indicator functions P(g|i) in Eq. (5) and Pr (g|μ) in Eq. (6) are also evaluated based on 90% overlap.
Table 6 presents the estimation results for the latent-class routing policy model based on two data sets generated based
on different coverage. The estimates appear stable across different samples. A Hausman’s test is conducted to compare the
two vectors of estimates. The statistic H is calculated as:

\[ H = (\beta_1 - \beta_0)^{\top} \left[ \mathrm{Cov}(\beta_0) - \mathrm{Cov}(\beta_1) \right]^{-1} (\beta_1 - \beta_0) \tag{9} \]


H has an asymptotically chi-squared distribution with the number of degrees of freedom equal to the rank of the differ-
ence in the variance-covariance matrices, Cov(β0 ) − Cov(β1 ). β 0 and β 1 are the two vectors of estimates for 100% and 90%
overlap threshold. The statistic for the two vectors of estimates equals 0.993 and the number of degrees of freedom is 13.
Therefore, the differences are not significant and the two vectors of estimates are consistent at the level of 0.01.
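The statistic in Eq. (9) can be computed as in the following NumPy sketch. The covariance matrices of the two estimations are not reported in the paper, so the inputs below are purely hypothetical toy values, and the use of a pseudo-inverse for a possibly rank-deficient covariance difference is an assumption.

```python
import numpy as np


def hausman_statistic(beta0, cov0, beta1, cov1):
    """Hausman statistic and its degrees of freedom, following Eq. (9)."""
    diff = np.asarray(beta1, dtype=float) - np.asarray(beta0, dtype=float)
    cov_diff = np.asarray(cov0, dtype=float) - np.asarray(cov1, dtype=float)
    # Assumption: the pseudo-inverse handles a rank-deficient difference.
    stat = float(diff @ np.linalg.pinv(cov_diff) @ diff)
    dof = int(np.linalg.matrix_rank(cov_diff))
    return stat, dof


# Hypothetical two-parameter example (not the paper's estimates).
beta0 = [1.0, 2.0]
beta1 = [1.1, 1.8]
cov0 = np.diag([0.05, 0.08])
cov1 = np.diag([0.04, 0.04])
stat, dof = hausman_statistic(beta0, cov0, beta1, cov1)
```

In the toy example the covariance difference is diag(0.01, 0.04), so the statistic is 0.1²/0.01 + 0.2²/0.04 = 2 with 2 degrees of freedom.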

3.3.5. Prediction
10-fold cross-validation is performed for the latent-class model and two restricted models. In each fold, 2/3 of the ob-
servations (318) are randomly chosen as the training set for model estimation, while the remaining 1/3 of the observations
(157) are used as the validation set for prediction. The three models are estimated based on the training set, and evaluated
on the validation set based on the estimated parameters from the corresponding training set. Table 7 shows the training
and validation results in terms of the final loglikelihood and adjusted rho squared for each of the 10 folds.
The estimation results show the same trend as those in the full dataset reported in Table 5, that is, on average, the
latent-class policy model fits the data better than either of the restricted models does, as suggested by the adjusted rho
squared. However, it should be noted that latent-class and path-only models are close. The prediction result suggests that

Table 6
Estimation results for latent-class routing policy model with samples of different coverage.

Parameters                           Coverage = 95%            Coverage = 100%
                                     Estimate β0    t-stat     Estimate β1    t-stat
ASC                                  -2.23          -1.60      -2.17          -1.65
Long Trip Dummy (SPT ≥ 900 s)        3.60           1.75       4.40           1.20
Expected Travel Time (min)           -1.15          -2.91      -1.06          -3.36
Travel Time Range (min)              0.162          0.94       0.116          0.79
Travel Time Range ∗ Airport Bound    -0.894         -1.96      -0.756         -1.94
# of Signals                         -0.266         -1.86      -0.227         -1.96
# of Left Turns                      -0.992         -2.83      -0.843         -2.72
# of Functional Class Changes        -2.30          -3.49      -2.07          -4.12
Average Speed (m/s)                  1.82           2.63       1.60           2.82
Min Expected Travel Time             2.78           5.07       2.45           5.09
Max Expected % of Highway Distance   1.78           2.23       1.57           2.51
Min # of Functional Class Changes    2.19           2.40       2.14           3.03
Scale for Two Class Parameters       0.491          3.86       0.549          4.62
Sample Size                          475                       500
# of Parameters                      13                        13
Final Loglikelihood                  -265.223                  -274.532

Table 7
Cross validation results of the latent-class and restricted models.

Set and measure                Latent-class Policy Model   Policy User Probability = 1   Path User Probability = 1
                               Training   Validation       Training   Validation         Training   Validation

Set 1 Null Loglikelihood -945.6 -480.4 -945.6 -480.4 -945.6 -480.4


Final Loglikelihood -175.6 -92.9 -185.1 -99.7 -178.5 -94.8
Adjusted Rho Squared 0.510 0.449 0.493 0.429 0.511 0.455
Set 2 Null Loglikelihood -957.1 -469.0 -957.1 -469.0 -957.1 -469.0
Final Loglikelihood -183.4 -83.2 -194.6 -88.0 -188.5 -83.6
Adjusted Rho Squared 0.497 0.485 0.476 0.475 0.492 0.499
Set 3 Null Loglikelihood -949.0 -477.0 -949.0 -477.0 -949.0 -477.0
Final Loglikelihood -171.0 -97.4 -185.6 -97.7 -173.8 -99.3
Adjusted Rho Squared 0.519 0.434 0.488 0.448 0.519 0.440
Set 4 Null Loglikelihood -952.5 -473.6 -952.5 -473.6 -952.5 -473.6
Final Loglikelihood -170.7 -94.8 -182.5 -100.0 -176.9 -94.9
Adjusted Rho Squared 0.529 0.425 0.506 0.413 0.521 0.440
Set 5 Null Loglikelihood -958.1 -467.9 -958.1 -467.9 -958.1 -467.9
Final Loglikelihood -183.9 -82.9 -195.3 -87.7 -188.8 -83.0
Adjusted Rho Squared 0.497 0.485 0.475 0.475 0.492 0.500
Set 6 Null Loglikelihood -948.4 -477.6 -948.4 -477.6 -948.4 -477.6
Final Loglikelihood -182.3 -85.2 -193.5 -89.6 -186.6 -85.9
Adjusted Rho Squared 0.487 0.501 0.465 0.494 0.483 0.513
Set 7 Null Loglikelihood -955.0 -471.1 -955.0 -471.1 -955.0 -471.1
Final Loglikelihood -194.8 -72.4 -205.8 -77.3 -200.6 -72.6
Adjusted Rho Squared 0.488 0.502 0.468 0.491 0.481 0.518
Set 8 Null Loglikelihood -943.5 -482.5 -943.5 -482.5 -943.5 -482.5
Final Loglikelihood -176.9 -95.0 -186.2 -96.8 -178.9 -94.7
Adjusted Rho Squared 0.514 0.420 0.498 0.427 0.517 0.438
Set 9 Null Loglikelihood -960.6 -465.5 -960.6 -465.5 -960.6 -465.5
Final Loglikelihood -190.9 -79.7 -198.6 -84.1 -193.3 -78.9
Adjusted Rho Squared 0.478 0.502 0.466 0.495 0.480 0.523
Set 10 Null Loglikelihood -961.6 -464.5 -961.6 -464.5 -961.6 -464.5
Final Loglikelihood -192.1 -81.1 -204.1 -79.3 -193.1 -81.0
Adjusted Rho Squared 0.475 0.496 0.452 0.522 0.480 0.513
Sample Size 427 48 427 48 427 48

# of Parameters 13 10 10
Average Adjusted Rho Squared 0.499 0.470 0.479 0.467 0.498 0.484

the latent-class model performs better than the policy-only model (although close), but worse than the path-only model.
This reversed result for the path-only model is consistent with what has been observed in the literature, that is, a model that fits a
dataset the best is not necessarily the best in prediction for another dataset. The lack of obvious advantage of the latent-class
policy model in this case study might be due to the geographic restriction of the data, in that observations were collected
for travel between the airport and city center only, along which freeways dominate and diversion possibilities might not be
large enough. Future applications of the methodology should focus on areas with a healthy mix of complementary roads to
make diversions and forward thinking advantageous.
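The averages in the last row of Table 7 can be reproduced from the per-set validation values, as in the short check below. The numbers are copied from Table 7; the variable names are illustrative.

```python
import statistics

# Validation adjusted rho squared for Sets 1-10 (Table 7)
validation = {
    "latent-class": [0.449, 0.485, 0.434, 0.425, 0.485,
                     0.501, 0.502, 0.420, 0.502, 0.496],
    "policy-only": [0.429, 0.475, 0.448, 0.413, 0.475,
                    0.494, 0.491, 0.427, 0.495, 0.522],
    "path-only": [0.455, 0.499, 0.440, 0.440, 0.500,
                  0.513, 0.518, 0.438, 0.523, 0.513],
}
averages = {name: round(statistics.fmean(values), 3)
            for name, values in validation.items()}
ranking = sorted(averages, key=averages.get, reverse=True)
print(averages, ranking)
```

The ranking on the validation sets is path-only, then latent-class, then policy-only, which is the ordering discussed in the text.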

4. Conclusions and future directions

Conclusions. A latent-class routing policy choice model in an STD network based on sparse GPS readings is developed and
estimated using hired taxi GPS data from Stockholm, Sweden. Two classes of travelers, routing policy users and path users,
differ by their choice sets and utility function parameters. A routing policy represents travelers’ ability to look ahead and
account for traffic information not yet available, and the choice set generation for routing policies is a generalization of path
choice set generation. A path is a special case of a routing policy, and thus the routing policy choice set for any given OD
always contains the path choice set.
The ensemble of choice set generation methods (link elimination, simulation, generalized cost) can achieve a 95% cover-
age with 100% overlap and further achieves a 100% coverage with 90% overlap.
Estimation results show that the routing policy user class probability increases with trip length, and the latent-class
routing policy choice model fits the data better than a single-class path choice or routing policy choice model. This suggests
that travelers are heterogeneous in terms of their ability and/or willingness to plan ahead and utilize real-time information,
and an appropriate route choice model for uncertain networks should take into account the underlying stochastic travel
times and structured traveler heterogeneity in terms of real-time information utilization. Travelers are risk averse when
traveling to the airport and risk neutral otherwise. Path user class parameters have a smaller scale than those of routing
policy class, indicating that the two classes differ not only by choice set, but also perception of attributes. Further studies to
understand the underlying behavioral processes of travelers’ decision making under uncertainty with real-time information
could shed light on the sources of the difference.
Cross-validation results show that a model that fits a dataset the best is not necessarily the best in prediction for another
dataset. Future applications of the methodology should focus on areas with a healthy mix of complementary roads to make
diversions and forward thinking advantageous.

Future directions. Passive GPS data cannot generate observations regarding the information access of travelers, and the fact
that the information in a dynamic network changes over time and space makes it more difficult to observe. This study
circumvents this difficulty by assuming POI access for taxi drivers who are generally attentive to traffic information. Commuters
and other travelers typically have a wide variety of information access that varies in spatial and temporal scope, e.g.,
VMS, radio, Google Maps Traffic, Waze. The proposed modeling methodologies can be applied to any type of information
access given the encapsulation of information access in the definition of a routing policy (Gao and Chabini, 2006), and the
challenge lies in identifying information access over time and space for any given traveler. Potential means include periodic
survey questions delivered to smartphones in real time after some major diversion points are traversed, location prompted
recall survey at the end of day, and records of in-vehicle GPS navigation systems that provide real-time information.
Fosgerau et al. (2013) developed a link-based route choice model without choice set generation and Mai et al. (2015) ex-
tended it to nested logit. They adopted a dynamic discrete choice approach for consistently estimating route choice model
parameters based on path observations through repeated link choices. The approach does not require choice set generation
or sampling. So far such studies have only been carried out on path choice models in static and deterministic networks.
Routing policy choice is naturally suited for such an approach due to the sequence of link choices. Therefore, a potential
future direction is to explore link-based dynamic discrete choice models for routing policy choice in an STD network.

Acknowledgement

The research is funded by the US Department of Transportation through the New England University Transportation
Center (UTC).

References

Abdel-Aty, M., Abdalla, M.F., 2004. Modeling drivers’ diversion from normal routes under ATIS using generalized estimating equations and binomial probit
link function. Transportation (Amst) 31, 327–348.
Abdel-Aty, M., Abdalla, M.F., 2006. Examination of multiple mode/route-choice paradigms under ATIS. IEEE Trans. Intell. Transp. Syst. 7 (3), 332–348.
Ardeshiri, A., Jeihani, M., Peeta, S., 2015. Driving simulator-based study of compliance behavior with dynamic message sign route guidance. IET Intell.
Transport. Syst. 9, 765–772.
Balakrishna, R., Ben-Akiva, M., Bottom, J., Gao, S., 2013. Information Impacts on Traveler Behavior and Network Performance: State of Knowledge and Future
Directions. In: Ukkusuri, S.V., Ozbay, K. (Eds.), Advances in Dynamic Network Modeling in Complex Transportation Systems, Vol. 2 of Complex Networks
and Dynamic Systems. Springer, New York, pp. 193–224.
Bekhor, S., Ben-Akiva, M.E., Ramming, S., 2006. Evaluation of choice set generation algorithms. Ann. Oper. Res. 144 (1).
Ben-Akiva, M., Bierlaire, M., 1999. Discrete Choice Methods and Their Applications to Short-term Travel Decisions. In: Hall, R. (Ed.), Handbook of
Transportation Science. Kluwer, pp. 5–34.
Ben-Akiva, M., Bierlaire, M., Koutsopoulos, H.N., Mishalani, R., 2002. Real-time Simulation of Traffic Demand-supply Interactions within DynaMIT. In:
Gendreau, M., Marcotte, P. (Eds.), Transportation and Network Analysis: Current Trends. Springer, pp. 19–36.
Ben-Akiva, M., Gao, S., Wei, Z., Wen, Y., 2012. A dynamic traffic assignment model for highly congested urban networks. Transport. Res. Part C 24, 62–68.
Ben-Akiva, M., Ramming, M.S., Bekhor, S., 2004. Route Choice Models. In: Schreckenberg, M., Selten, R. (Eds.), Human Behaviour and Traffic Networks.
Springer, New York, pp. 23–45.
Ben-Akiva, M., Ramming, S., 1998. Lecture Notes: Discrete Choice Models of Traveler Behavior in Networks. Prepared for Advanced Methods for Planning
and Management of Transportation Networks, Capri, Italy.
Ben-Elia, E., Avineri, E., 2015. Response to travel information: a behavioural review. Transp. Rev. 35, 352–377.
Bierlaire, M., 2003. Biogeme: A Free Package for the Estimation of Discrete Choice Models. In: Proceedings of the 3rd Swiss Transport Research Conference,
Ascona, Switzerland.
Bierlaire, M., 2008. An introduction to biogeme version 1.6. http://biogeme.epfl.ch.
Bierlaire, M., Frejinger, E., 2008. Route choice modeling with network-free data. Transport. Res. Part C 16, 187–198.
Bogers, E., Viti, F., Hoogendoorn, S., 2005. Joint modeling of advanced travel information service, habit, and learning impacts on route choice by laboratory
simulator experiments. Transport. Res. Rec. J. Transport. Res. Board 1926, 189–197.
Bolduc, D., Ben-Akiva, M., 1991. A multinomial probit formulation for large choice sets. Proceed. 6th Int. Conf. Travel Behav., Quebec, Canada.
Boyer, S., Blandin, S., Wynter, L., 2015. Stability of transportation networks under adaptive routing policies. Transport. Res. Part B 81, 886–903.
Cascetta, E., Nuzzolo, A., Russo, F., Vitetta, A., 1996. A Modified Logit Route Choice Model Overcoming Path Overlapping Problems: Specification and Some
Calibration Results for Interurban Networks. In: Lesort, J.B. (Ed.), Proceedings of the 13th International Symposium on Transportation and Traffic Theory,
Lyon, France.
Chatterjee, K., McDonald, M., 2004. Effectiveness of using variable message signs to disseminate dynamic traffic information: evidence from field trials in
European cities. Transp. Rev. 24 (5), 559–585.
Ding, J., Gao, S., Jenelius, E., Rahmani, M., Huang, H., Ma, L., Pereira, F., Ben-Akiva, M., 2014. Routing policy choice set generation in stochastic
time-dependent networks: case studies for Stockholm and Singapore. Transp. Res. Rec. 2466, 76–86.
Ding-Mastera, J., 2016. Adaptive Route Choice in Stochastic Time-Dependent Networks: Routing Algorithms and Choice Modeling. PhD thesis. Department
of Civil and Environmental Engineering, University of Massachusetts Amherst.
Fosgerau, M., Frejinger, E., Karlstrom, A., 2013. A link based network route choice model with unrestricted choice set. Transport. Res. Part B 56, 70–80.
Frejinger, E., Bierlaire, M., 2007. Capturing correlation with subnetworks in route choice models. Transport. Res. Part B 41, 363–378.
Frejinger, E., Bierlaire, M., Ben-Akiva, M., 2009. Sampling of alternatives for route choice modeling. Transport. Res. Part B 43 (10).
Gao, S., 2005. Optimal Adaptive Routing and Traffic Assignment in Stochastic Time-Dependent Networks. PhD thesis. Massachusetts Institute of Technology.
Gao, S., 2012. Modeling strategic route choice and real-time information impacts in stochastic and time-dependent networks. IEEE Trans. Intell. Transp. Syst.
13 (3), 1298–1311.
Gao, S., Chabini, I., 2006. Optimal routing policy problems in stochastic time-dependent networks. Transport. Res. Part B 40 (2), 93–122.
Gao, S., Frejinger, E., Ben-Akiva, M., 2008. Adaptive route choice models in stochastic time-dependent networks. Transp. Res. Rec. 2085, 136–143.
Gao, S., Huang, H., 2012. Real-time traveler information for optimal adaptive routing in stochastic time-dependent networks. Transport. Res. Part C 21 (1),
196–213.
Jin, X., Hossan, M.S., Asgari, H., 2015. Investigating the value of time and value of reliability for managed lanes. Florida International University.
Technical report, FDOT project number BDV29-977-12.
Lai, X., Bierlaire, M., 2015. Specification of the cross-nested logit model with sampling of alternatives for route choice models. Transport. Res. Part B 80,
220–234.
Mahmassani, H., Liu, Y.H., 1999. Dynamics of commuting decision behaviour under advanced traveller information systems. Transport. Res. Part C 7, 91–107.
Mahmassani, H.S., 2001. Dynamic network traffic assignment and simulation methodology for advanced system management applications. Netw. Spat. Econ.
1, 267–292.
Mai, T., Fosgerau, M., Frejinger, E., 2015. A nested recursive logit model for route choice analysis. Transport. Res. Part B 75, 100–112.
Peeta, S., Ramos, J.L.J., 2006. Driver response to variable message signs-based traffic information. IEE Proceed. Intell. Transport. Syst. 153, 2–10.
Peeta, S., Yu, J.W., 2005. A hybrid model for driver route choice incorporating en-route attributes and real-time information effects. Netw. Spat. Econ. 5,
21–40.
Polydoropoulou, A., Ben-Akiva, M., Khattak, A., Lauprete, G., 1996. Modeling revealed and stated en-route travel response to advanced traveler information
systems. Transp. Res. Rec. 1537, 38–45.
Rahmani, M., Jenelius, E., Koutsopoulos, H.N., 2015. Non-parametric estimation of route travel time distributions from low-frequency floating car data.
Transport. Res. Part C 58B, 343–362.
Rahmani, M., Koutsopoulos, H.N., 2013. Path inference from sparse floating car data for urban networks. Transp. Res. Part C 30, 41–54.
Razo, M., Gao, S., 2010. Strategic thinking and risk attitudes in route choice: a stated preference approach. Transp. Res. Rec. 2085, 136–143.
Razo, M., Gao, S., 2013. A rank-dependent expected utility model for strategic route choice with stated preference data. Transport. Res. Part C 27, 117–130.
Schrank, D., Eisele, B., Lomax, T., Bak, J., 2015. 2015 Urban Mobility Scorecard. Texas A&M Transportation Institute. Technical report.
Srinivasan, K.K., Mahmassani, H., 2003. Analyzing heterogeneity and unobserved structural effects in route-switching behavior under ATIS: a dynamic kernel
logit formulation. Transport. Res. Part B 37, 793–814.
Tsirimpa, A., Polydoropoulou, A., Antoniou, C., 2007. Development of a mixed multinomial logit model to capture the impact of information systems on
travelers’ switching behavior. J. Intell. Transport. Syst. 11 (2), 79–89.
