
Oper Res Int J (2019) 19:1033–1058
DOI 10.1007/s12351-017-0314-9
ORIGINAL PAPER
Dynamic routing with real-time traffic information
Guodong Yu1 • Yu Yang1
Received: 11 June 2016 / Revised: 1 October 2016 / Accepted: 13 April 2017 /

Published online: 15 May 2017
© Springer-Verlag Berlin Heidelberg 2017
Abstract We consider the vehicle routing problem (VRP) with real-time traffic
information, where stochastic intermediate times (travel times and service times)
follow probability distributions and are realized at the end of each customer's
service, before the next customer to visit is determined. We propose a dynamic
VRP (DVRP) model addressing the varying intermediate times and show that the
DVRP can significantly reduce the total duration compared with the static, or a
priori, VRP model. To solve the DVRP model, we develop an approximate dynamic
programming algorithm based on a semi-infinite linear program, which is
derived from a class of affine time-to-go approximation functions and generates a
lower bound that depends only on the expected duration and a description of the
support set of the stochastic time vectors. We also propose a greedy heuristic
time-directed policy to produce good solutions and improve computational efficiency
even in the worst case, and prove that it can be solved in polynomial time. The
results show that our approach is highly applicable to the VRP with dynamic,
real-time traffic.
Keywords Dynamic VRP · Stochastic intermediate times · Dynamic programming · Approximation · Greedy heuristic
Yu Yang: yuyang@cqu.edu.cn
Guodong Yu: yuguodong@cqu.edu.cn
1 State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing, China
1 Introduction
The vehicle routing problem (VRP), first proposed by Dantzig and Ramser (1959),
has been recognized as an essential problem in the fields of transportation and
logistics. Among various VRP models, travel times and service times are important
data. The assumption that the travel times between two arbitrary locations
(including depots and customers) and the service times at every customer are
constant can be found extensively in the existing literature. However, this
assumption is rather idealized, especially in urban environments where travel times
are strongly affected by traffic accidents and rush-hour congestion, and in
door-to-door service where the details of the service to be performed at each
customer may be unknown beforehand. Consequently, stochastic VRPs addressing
uncertain travel times and service times have been studied extensively during the
last two decades. As travel times and service times are associated with internal
nodes or arcs of the VRP network, they are combined into intermediate times in this
paper. A similar definition can also be found in Pillac et al. (2013).
Within the stochastic routing context, one of the most common approaches for
representing the stochastic parameters is preassigning a uniform distribution, e.g.,
Errico et al. (2016) assume that the probability distributions of service times are
known exactly ahead of time. However, in practice, this distribution is typically
only partially known through historical data or the beliefs of the decision maker, and
collecting the required data is usually very difficult. In this case, families
of probability distributions and the consequent robust optimization problems are
developed to deal with ambiguous data, e.g., Han et al. (2013) and Jaillet et al.
(2016). As the solutions of the aforementioned models are devised before being
implemented in practice, these models are called a priori optimization methods.
Though important benefits can be obtained from the priori VRP optimization, the
applicability of its solutions is also restricted for the changeable real-time
information. Hence, the dynamic VRP (DVRP) models under stochastic conditions
have been considered to enhance the flexibility and adaptability. In particular, for
routing problems in which traffic congestion exerts a great effect on the routes
duration, the DVRP based on real-time traffic information can significantly reduce
travel times (Kok et al. 2010). In the light of Pillac et al. (2013), four categories of
the VRP can be identified: static and deterministic, static and stochastic, dynamic
and deterministic and dynamic and stochastic, according to two important
dimensions, i.e., evolution and quality of information, proposed by Psaraftis
(1980). The topic we address falls into the dynamic and stochastic VRP. As for
VRPs in general, one can refer to Laporte (2007, 2009) and Golden et al. (2008) for
comprehensive overviews.
In the dynamic routing problem with uncertain intermediate times, the driver or
operation officer reoptimizes the remaining routes when new traffic
information emerges. For example, Toriello et al. (2014) assume the salesman can
observe outgoing arc realizations at each city before determining the next
city to visit in a dynamic Traveling Salesman Problem. For each reoptimization
during the execution, we find that the decision maker is assumed to be risk-neutral
when facing probabilistic traffic congestion and even road failure. However, most
transportation managers are depicted as either risk-averse or risk-seeking, rather
than risk-neutral (Dixit et al. 2015). Several studies of risk taking in related
fields also attest to risk-awareness in optimization, especially for dangerous goods
transportation (Gheorghe et al. 2005), hazardous material transportation (Kang et al. 2014)
and dynamic emergency routing (An et al. 2015). Furthermore, it has been argued
that a risk-averse approach for decision-making problems under uncertainty can
provide more robust solutions compared to the risk-neutral approach (Toumazis and
Kwon 2013; Sun et al. 2015). Despite this, the literature has very little discussion on
risk preferences in dynamic routing problems.
In this paper, the DVRP where intermediate times are unknown exactly in
advance is proposed to minimize the total duration of the route. Following Toriello
et al. (2014), we assume drivers can observe outgoing route realizations at the end
of a service for each customer and before choosing the next destination. This
assumption is reasonable in practice because a driver can get sufficient real-time
traffic information from navigation software or mobile applications. We also
assume the decision maker’s risk attitude is consistent over the horizon. Then, we
propose a DVRP model addressing both stochastic intermediate times and risk
preference. In our case, the aforementioned assumption means the states of our DP
are extended by all possible route realizations, which may cause an uncountable state
space and raise a tough issue for traditional DP due to the curse of dimensionality.
To solve the DVRP model, we develop an approximate dynamic programming
(ADP) using a class of time-to-go functions and propose a greedy heuristic time-
directed policy to produce good solutions and improve solution efficiency.
The remainder of the paper is organized as follows. Section 2 reviews the
literature relevant to our problem. Section 3 describes the DVRP with stochastic
intermediate times. Section 4 explains our solution methodology of the DVRP
model. The results of our computational study are reported in Sect. 5. The
conclusion follows in Sect. 6.
2 Literature review
In this section, we present an overview of related work. We survey the VRP
literature addressing stochastic travel and service times from both a static
and a dynamic perspective.
First, from a static perspective, the VRP literature with stochastic travel and
service times has proposed different models, such as pure stochastic programming,
chance-constrained programming and robust optimization (Errico et al. 2016). Laporte
et al. (1992) develop a chance-constrained model, a three-index simple recourse
model and a two-index recourse model to deal with the stochastic service and travel
times, and they present a branch and cut algorithm to solve these models. Lambert
et al. (1993) model an integer stochastic programming and devise a heuristic
procedure for the VRP in banking context where travel times are random. Sungur
et al. (2010) consider a variant of the VRP, the courier delivery problem (CDP),
where customers and corresponding service times are uncertain. The authors present
a robust optimization and an insertion-based solution heuristic for the uncertainty in

service times. Li et al. (2010) formulate the VRP with stochastic travel and service
times as a chance constrained programming and use a tabu search heuristic to
efficiently solve the model. Lei et al. (2012) propose a generalized variable
neighborhood search heuristic for the VRP with stochastic service times. Nguyen
et al. (2016) propose a chance-constrained programming and use satisfying measure
approach for the VRP with stochastic travel times, and a tabu-search heuristic is
proposed to solve this model. Errico et al. (2016) model the VRP with stochastic
service times as a two-stage stochastic program and propose exact branch-cut-and-
price algorithms to solve it.
These static VRP models, as a priori optimization approaches, contain one or more
stochastic parameters, which means that some future events are random variables
with a known probability distribution (Ritzinger et al. 2015).
The optimal plan from a priori optimization is computed before the uncertainty is
realized and can undergo only minor changes during its execution.
Second, we review the VRP studies from a dynamic perspective. In the light of
Psaraftis (1995), if some inputs of a problem are not known ahead of time but are
revealed as time goes by, the problem can be modeled dynamically. Dynamic
models for the VRP with stochastic travel and service times can be found
in the previous literature. Fu (2002) studies the VRP in a dial-a-ride context, where
travel and service times are time-varying and stochastic owing to traffic congestion. A
conventional heuristic algorithm coupled with a first-in-first-out
assumption is proposed to solve the problem. Dabia et al. (2013) present a dynamic
model and a branch-and-price algorithm for the VRP with time-dependent travel
times. Ghannadpour et al. (2013) present a multi-objective dynamic VRP with
fuzzy travel times, of which the required data are not known in advance. They use
the genetic algorithm to form the dynamic solving strategy.
For the dynamic VRPs, routing decisions are made based on the most current
state, which updates every time the vehicle arrives at a location (Novoa and Storer
2009). Thus the driver can select the most appropriate next customer according to
the current state.
In the above literature, uncertain information makes the VRP more complex,
especially its solution methodology. We find that most solutions, except those of
Dabia et al. (2013) and Errico et al. (2016), are computed by heuristic
algorithms, which can significantly improve computational efficiency compared
with the long solution times of exact algorithms. To this end, more
recently, Novoa and Storer (2009) propose an approximate dynamic programming
approach for the VRP with stochastic demands. They extend the roll-out algorithm
by implementing different base sequences. The approximate dynamic programming
is also used by Powell (2009), Topaloglu and Powell (2006) and Topaloglu and
Powell (2007) to solve the dynamic problems with high dimensions. This provides a
motivation for our later solution method.
For risk management in routing problems, risk-averse static routing models
have been studied extensively. Bell (2006), Kang et al. (2014), Sun et al. (2015),
Toumazis and Kwon (2013) and Toumazis and Kwon (2015) develop several risk-averse
methods to design robust routing plans for hazardous materials. Zhao et al.
(2013) incorporate two risk measures (travel-time budget and mean-excess travel
time) into the transit service routing problem. However, we have found very few
papers addressing the decision maker's risk attitude in dynamic routing problems,
except Xiao and Lo (2013), who develop an adaptive navigation approach for
risk-averse travelers under real-time traffic conditions.
3 DVRP with stochastic intermediate times
3.1 Description and formulation
Generally, the VRP can be represented by a graph $G = (V, A)$, where
$V \triangleq \{0, 1, \dots, n\}$ is the set of nodes and
$A \triangleq \{(i, j) \in V \times V,\ i \neq j\}$ is the set of arcs.
The node $\{0\}$ denotes the depot where a fleet of homogeneous vehicles is initially
located; $V_c \triangleq \{1, \dots, n\}$ is the set of customers. In the VRP, the vehicles
visit each customer in $V_c$ exactly once, starting from and returning to the depot. In our
model, each intermediate arc time, namely the duration from the departure of a vehicle
from customer $i-1$ until it finishes the service at customer $i$, is random. For
notational convenience, we assign this duration to arc $(i-1, i)$; in this way, the
intermediate time equals the associated arc time and is realized at the end of the
arc, as shown in Fig. 1. We aim to minimize the expected total duration and
the risk value of the route by choosing the next customer to visit at the current
location. Let $T_i \triangleq (T_{ij}^l : j \in V \setminus i,\ l \in L_{ij})$ denote the
real-valued random vector of outgoing times at customer $i$, where $L_{ij}$, indexed by $l$,
is the set of alternative routes between $i$ and $j$. The distribution of each $T_i$,
$\Pr\{T_{ij}^l\}$, can be known upon arrival at customer $i$ and can be different among
customers. We assume that all $T_i$ are pairwise independent, which means that $T_i$ is
determined only by the conditions before the current customer $i$ and not by the remaining
customers, even though they are not necessarily completely independent because of the
shared tail. We denote the support of $T_i$ by a compact set
$\mathcal{T}_i \subseteq \mathbb{R}^n$. In this paper, for ease of notation, we
let $\bar t_{ij} \triangleq E[T_{ij}^l]$, $\check t_{ij} \triangleq \min_{l \in L_{ij}} T_{ij}^l$
and $\hat t_{ij} \triangleq \max_{l \in L_{ij}} T_{ij}^l$, respectively.
We formulate the DVRP based on the classical VRP model (Laporte 1992;
Christofides et al. 1981) but including the outgoing arc time. A state of the DVRP
represents the current customer, the remaining customers, and the realized outgoing
arc time vector. Thus the state space can be formulated as
Fig. 1 The arc time
$$S \triangleq \{(i, U, t_i) : i \in V,\ U \subseteq V \setminus i,\ t_i \in \mathcal{T}_i\} \cup \{(0, \emptyset)\},$$
where a state $(i, U, t_i)$ describes an intermediate stage of the route: the vehicle is
serving customer $i$, the customers in $U$ remain to be visited, and the realized outgoing
times are given by $t_i$. If $i = 0$, the vehicle is at the depot, the start of a route, and
has $n$ customers to visit. The terminal state $(0, \emptyset)$ indicates the end of the
whole route. When $U \neq \emptyset$ in the state $(i, U, t_i)$, the driver must choose a
next customer $j \in U$. Then the state updates to $(j, U \setminus j, t_j)$ with
$t_j \in \mathcal{T}_j$ by a transition based on the distribution of $T_j$. As the
transitions proceed, the cardinality $|U|$ decreases. Let $x^*_{i,U}(t_i)$ denote the
minimum expected time-to-go from state $(i, U, t_i)$. For the state space $S$, a
reasonable interpretation is that $(x^*_{i,U}(t_i) : t_i \in \mathcal{T}_i)$ can be seen as a
function of $t_i$. With the terminal value $x^*_{0,\emptyset} = 0$, we have the recursive
formulation of the DVRP,
$$x^*_{i,U}(t_i) \triangleq \begin{cases} \min_{j \in U} \big\{ t_{ij} + E\big[x^*_{j,U \setminus j}(T_j)\big] \big\}, & U \neq \emptyset, \\ t_{i0}, & i \in V_c,\ U = \emptyset, \\ 0, & i = 0,\ U = \emptyset. \end{cases} \tag{1}$$
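We can obtain a policy, $\wp : S \setminus (0, \emptyset) \to V$, based on the solution of (1),
which assigns an action $j \in U$ to the current state $(i, U, t_i) \in S$,
$$\wp(i, U, t_i) = \begin{cases} \arg\min_{j \in U} \big\{ t_{ij} + E\big[x^*_{j,U \setminus j}(T_j)\big] \big\}, & U \neq \emptyset, \\ 0, & U = \emptyset. \end{cases} \tag{2}$$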
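Recursion (1) and policy (2) can be sketched in a few lines for a toy instance. This is only an illustration under stated assumptions: each outgoing vector $T_i$ is given a small finite set of scenarios, and the instance data, the `SCENARIOS` table and all function names are hypothetical and not taken from the paper.

```python
from functools import lru_cache

# Hypothetical toy instance: node 0 is the depot, nodes 1..3 are customers.
# SCENARIOS[i] lists (probability, {j: time}) realizations of the outgoing
# arc-time vector T_i. All numbers are made up for illustration.
SCENARIOS = {
    0: [(0.5, {1: 2.0, 2: 4.0, 3: 3.0}), (0.5, {1: 5.0, 2: 1.0, 3: 3.0})],
    1: [(0.5, {0: 2.0, 2: 1.0, 3: 6.0}), (0.5, {0: 2.0, 2: 4.0, 3: 1.0})],
    2: [(0.5, {0: 3.0, 1: 1.0, 3: 2.0}), (0.5, {0: 3.0, 1: 3.0, 3: 5.0})],
    3: [(1.0, {0: 1.0, 1: 2.0, 2: 2.0})],
}


def time_to_go(i, remaining, t_i):
    """x*_{i,U}(t_i) from recursion (1) for a realized outgoing vector t_i."""
    if not remaining:
        # Last leg back to the depot, or zero at the terminal state.
        return t_i[0] if i != 0 else 0.0
    return min(t_i[j] + expected_time_to_go(j, remaining - {j}) for j in remaining)


@lru_cache(maxsize=None)
def expected_time_to_go(i, remaining):
    """E[x*_{i,U}(T_i)], averaging over the scenarios of T_i."""
    return sum(p * time_to_go(i, remaining, t) for p, t in SCENARIOS[i])


def policy(i, remaining, t_i):
    """Action (2): the next customer, or the depot 0 when U is empty."""
    if not remaining:
        return 0
    return min(remaining,
               key=lambda j: t_i[j] + expected_time_to_go(j, remaining - {j}))


customers = frozenset({1, 2, 3})
dynamic_value = expected_time_to_go(0, customers)
first_move = policy(0, customers, SCENARIOS[0][0][1])
```

The memoised `expected_time_to_go` enumerates every subset of remaining customers, so the sketch is only usable for very small $n$; this is the curse of dimensionality that motivates the approximation in Sect. 4.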
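Therefore, the linear programming formulation of (1) can be represented as
$$\max_x\ E\big[x_{0,V_c}(T_0)\big] \tag{3a}$$
$$\text{s.t.}\quad x_{0,V_c}(t_0) - E\big[x_{i,V_c \setminus i}(T_i)\big] \le t_{0i}, \quad \forall i \in V_c,\ t_0 \in \mathcal{T}_0 \tag{3b}$$
$$x_{i,U \cup j}(t_i) - E\big[x_{j,U}(T_j)\big] \le t_{ij}, \quad \forall i \in V_c,\ j \in V_c \setminus i,\ U \subseteq V_c \setminus \{i, j\},\ t_i \in \mathcal{T}_i \tag{3c}$$
$$x_{i,\emptyset}(t_i) \le t_{i0}, \quad \forall i \in V_c,\ t_i \in \mathcal{T}_i. \tag{3d}$$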
Observe that each variable $x_{i,U}$ is a function from the infinite set $\mathcal{T}_i$ to
$\mathbb{R}$, and all constraints (3b–3d) relate to (state, action) pairs
indexed partly by the sets $\mathcal{T}_i$; thus this LP is potentially doubly infinite.
Considering that (1) is non-decreasing, each $x_{i,U}$ can be
restricted to an appropriate function space, e.g., the space of continuous
functions on $\mathcal{T}_i$. Note that it is not necessary to select an exact space because
problem (3) is naturally intractable and we do not intend to solve it directly.
However, it is worth stating here because feasible solutions of (3) provide
lower bounds on the optimal expected time-to-go.
Proposition 1 Let $x_{i,U} : \mathcal{T}_i \to \mathbb{R}$ for $i \in V$ and $U \subseteq V_c \setminus i$ be feasible for
problem (3). Then $x_{i,U}(t_i) \le x^*_{i,U}(t_i)$ for all $(i, U, t_i) \in S$. In particular,
$E[x_{0,V_c}(T_0)] \le E[x^*_{0,V_c}(T_0)]$.
Proof The claim follows by induction from the definition of $x^*$.
3.2 Advantages compared with the static VRP
To demonstrate the advantages of the DVRP (1), we compare it with the static VRP.
The issue that concerns us most is whether the DVRP improves adaptability over
the static model. We use a deterministic VRP with arc times given
by $\bar t_{ij} = E[T_{ij}]$ to denote the static problem. We first consider a particular example
in which arc times are i.i.d. Bernoulli random variables with parameter $p \in (0, 1)$, and
the arc time at the depot, $i = 0$, is zero with probability one. Let $E_p[T]$ denote the
total expected time of the static VRP. In this case, we have the following
proposition.
Proposition 2 For all $i \in V$, $x^*_{i,V}(t_i) \le E_{i,p}[T]$.
Proof For the static VRP, the expected time at location $i$ is
$E_{i,p}[T] = pn$. In the DVRP, at period $t = 0, \dots, n-1$, the policy $\wp$ incurs the
expected arc time with probability $p^{n-t}$, and can even lead to zero time when
all outgoing arcs have zero time. Therefore,
$$x^*_{i,V}(t_i) = \sum_{t=0}^{n-1} p^{n-t} = \frac{p(1 - p^n)}{1 - p}.$$
Dividing one by the other, we have
$$f = \frac{E_{i,p}[T]}{x^*_{i,V}(t_i)} = \frac{n(1 - p)}{1 - p^n}.$$
Obviously the ratio $f \to +\infty$ when $n \to +\infty$. In other words, the decisions of the
DVRP can significantly decrease the expected arc time over a route.
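The closed forms in this proof are easy to check numerically. The following minimal sketch (function names are ours, chosen for illustration) evaluates the geometric sum, its closed form, and the ratio $f$ for a few values of $n$.

```python
def static_expected_time(n, p):
    """Static VRP in the Bernoulli example: n arcs, each with mean p."""
    return n * p


def dynamic_expected_time(n, p):
    """Sum of p^(n-t) over periods t = 0, ..., n-1, as in the proof."""
    return sum(p ** (n - t) for t in range(n))


def dynamic_closed_form(n, p):
    """Geometric-series closed form p(1 - p^n) / (1 - p)."""
    return p * (1 - p ** n) / (1 - p)


def ratio(n, p):
    """f = static / dynamic, which equals n(1 - p) / (1 - p^n)."""
    return static_expected_time(n, p) / dynamic_expected_time(n, p)


ratios = [ratio(n, 0.3) for n in (5, 10, 100)]
```

For fixed $p$ the denominator $1 - p^n$ tends to 1, so $f$ grows essentially linearly in $n$.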
Next, we consider the case when each support set $\mathcal{T}_i$ is sufficiently small. In this
case, the difference between the DVRP and the static VRP may be small. Define the
optimal time-to-go function for the static VRP as
$$\bar x_{i,U} \triangleq \begin{cases} \min_{j \in U} \big\{ \bar t_{ij} + \bar x_{j,U \setminus j} \big\}, & U \neq \emptyset, \\ \bar t_{i0}, & i \in V_c,\ U = \emptyset, \\ 0, & i = 0,\ U = \emptyset. \end{cases} \tag{4}$$
We also define the diameter of a set $W \subseteq \mathbb{R}^n$ as
$D(W) \triangleq \sup_{x,y \in W} \|x - y\|$. Then the following proposition is immediate.
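Proposition 3 For all $(i, U, t_i) \in S$, we have
$$\big| x^*_{i,U}(t_i) - \bar x_{i,U} \big| \le \sum_{j \in i \cup U} D(\mathcal{T}_j) \tag{5a}$$
$$\big| E\big[x^*_{0,V_c}(T_0)\big] - \bar x_{0,V_c} \big| \le \sum_{i \in V} D(\mathcal{T}_i) \tag{5b}$$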
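Proof Let $i \in V_c$ and $t_i \in \mathcal{T}_i$. We begin with $U = \emptyset$; as $\bar t_i \in \mathrm{conv}(\mathcal{T}_i)$,
$$\big| x^*_{i,\emptyset}(t_i) - \bar x_{i,\emptyset} \big| = |t_{i0} - \bar t_{i0}| \le \sup_{t_i \in \mathcal{T}_i} \{ |t_{i0} - \bar t_{i0}| \} \le D(\mathcal{T}_i).$$
For $U \neq \emptyset$, the difference of the time-to-go functions between the DVRP and the static VRP is
$$\begin{aligned} \big| x^*_{i,U}(t_i) - \bar x_{i,U} \big| &= \Big| \min_{j \in U} \big\{ t_{ij} + E\big[x^*_{j,U \setminus j}(T_j)\big] \big\} - \min_{j \in U} \big\{ \bar t_{ij} + \bar x_{j,U \setminus j} \big\} \Big| \\ &\le \max_{j \in U} \Big| t_{ij} + E\big[x^*_{j,U \setminus j}(T_j)\big] - \bar t_{ij} - \bar x_{j,U \setminus j} \Big| \\ &\le \max_{j \in U} \big| t_{ij} - \bar t_{ij} \big| + \max_{j \in U} \Big| E\big[x^*_{j,U \setminus j}(T_j)\big] - \bar x_{j,U \setminus j} \Big| \\ &\le D(\mathcal{T}_i) + \max_{j \in U} \Big| E\big[x^*_{j,U \setminus j}(T_j)\big] - \bar x_{j,U \setminus j} \Big| \\ &\le \sum_{j \in i \cup U} D(\mathcal{T}_j), \end{aligned}$$
where the last step applies the same bound inductively to the smaller sets $U \setminus j$.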
The norm in $D$ can be any $\ell_p$ norm, i.e., $\|x\|_p = \big(\sum_i |x_i|^p\big)^{1/p}$. Note that $\bar t$
in this proof only needs to satisfy $\bar t_i \in \mathrm{conv}(\mathcal{T}_i)$. Thus, for any time
vector in the convex hull of the support set, we obtain a similar result. In other words,
the difference in Proposition 3 is bounded by the diameters of the support sets. Hence,
when the diameter is sufficiently small, e.g., constant in $n$, the DVRP can be
approximated deterministically with a possible realization of arc times and a close
approximation may be obtained.
To further observe the advantages of the DVRP over the static VRP, we consider
the optimistic solution of the static VRP with arc times $\check t_{ij} = \min_{t_i \in \mathcal{T}_i} t_{ij}$. Based on
(4), we denote the optimistic time-to-go function by $\check x_{i,U}$, with times $\check t$ instead of
$\bar t$. For the difference of two sets $(Q, W)$, we use the deviation of sets from Shapiro et al.
(2014), defined by $D(Q, W) \triangleq \sup_{x \in Q} \inf_{y \in W} \|x - y\|$. Then we have the following
proposition.
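Proposition 4 For all $(i, U, t_i) \in S$, we have
$$x^*_{i,U}(t_i) - \sum_{j \in i \cup U} D(\mathcal{T}_j, \check t_j) \le \check x_{i,U} \le x^*_{i,U}(t_i) \tag{6a}$$
$$E\big[x^*_{0,V_c}(T_0)\big] - \sum_{i \in V} D(\mathcal{T}_i, \check t_i) \le \check x_{0,V_c} \le E\big[x^*_{0,V_c}(T_0)\big] \tag{6b}$$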
i2V
Proof The left-hand inequalities can be proved in the same way as
Proposition 3. For the right-hand inequalities, we see that $\check x$ is
feasible for (3) if we set $x_{i,U}(t_i) = \check x_{i,U}$ for all $t_i \in \mathcal{T}_i$; thus $\check x_{i,U}$ is a lower bound for
the time-to-go at the state $(i, U, t_i)$. In particular, $E[\check x_{0,V_c}(T_0)] = \check x_{0,V_c}$ is a lower
bound of (3).
From Proposition 4, we find that any bound for the time-to-go of the static VRP
with times $\check t$ also generates a lower bound for the time-to-go functions of the DVRP,
and the quality of the bound depends on the difference between the actual
times and the optimistic prediction $\check t$. We detail this effect in the next section.
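The relation between the three values discussed in this section can be checked on a toy instance. The sketch below is an illustration under our own assumptions (finite scenario sets, made-up numbers, hypothetical names): it computes the static value with mean times, the static value with optimistic times, and the dynamic expected value. The optimistic value is a lower bound by Proposition 4; the mean-time static value is an upper bound because the expectation of a minimum never exceeds the minimum of expectations.

```python
from functools import lru_cache

# Hypothetical scenarios: SCENARIOS[i] = [(probability, {j: time}), ...].
SCENARIOS = {
    0: [(0.5, {1: 2.0, 2: 4.0, 3: 3.0}), (0.5, {1: 5.0, 2: 1.0, 3: 3.0})],
    1: [(0.5, {0: 2.0, 2: 1.0, 3: 6.0}), (0.5, {0: 2.0, 2: 4.0, 3: 1.0})],
    2: [(0.5, {0: 3.0, 1: 1.0, 3: 2.0}), (0.5, {0: 3.0, 1: 3.0, 3: 5.0})],
    3: [(1.0, {0: 1.0, 1: 2.0, 2: 2.0})],
}
NODES = sorted(SCENARIOS)
CUSTOMERS = frozenset(c for c in NODES if c != 0)


def mean_times(i):
    """Expected outgoing times from node i (the static times in (4))."""
    return {j: sum(p * t[j] for p, t in SCENARIOS[i]) for j in NODES if j != i}


def min_times(i):
    """Smallest realizable outgoing times from node i (optimistic times)."""
    return {j: min(t[j] for _, t in SCENARIOS[i]) for j in NODES if j != i}


def static_value(times):
    """Recursion (4) with deterministic times[i][j], started at the depot."""
    @lru_cache(maxsize=None)
    def go(i, remaining):
        if not remaining:
            return times[i][0] if i != 0 else 0.0
        return min(times[i][j] + go(j, remaining - {j}) for j in remaining)
    return go(0, CUSTOMERS)


@lru_cache(maxsize=None)
def dynamic_expected(i, remaining):
    """E[x*_{i,U}(T_i)] from recursion (1)."""
    total = 0.0
    for p, t in SCENARIOS[i]:
        if not remaining:
            total += p * (t[0] if i != 0 else 0.0)
        else:
            total += p * min(t[j] + dynamic_expected(j, remaining - {j})
                             for j in remaining)
    return total


mean_static = static_value({i: mean_times(i) for i in NODES})
optimistic_static = static_value({i: min_times(i) for i in NODES})
dynamic = dynamic_expected(0, CUSTOMERS)
```

On any such instance one should observe `optimistic_static <= dynamic <= mean_static`; the width of the gap on the left is what the deviation terms in (6b) control.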
4 Solution methodology
4.1 Approximate linear program
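Problem (3) has no tractable solution method even for instances of
moderate size. To this end, we introduce a collection of basis functions
$b_{i,U} : \mathcal{T}_i \to \mathbb{R}^m$ for each $i \in V$, $U \subseteq V_c \setminus i$ and $m \ll |S|$. Then, for any
$a \in \mathbb{R}^m$, the time-to-go can be approximated as $x_{i,U}(t_i) \approx \langle a, b_{i,U}(t_i) \rangle$,
where $\langle \cdot, \cdot \rangle$ is the inner product.
Therefore, the resulting approximate program is
$$\max_a\ E\big[\langle a, b_{0,V_c}(T_0) \rangle\big] \tag{7a}$$
$$\text{s.t.}\quad \big\langle a,\ b_{0,V_c}(t_0) - E\big[b_{i,V_c \setminus i}(T_i)\big] \big\rangle \le t_{0i}, \quad \forall i \in V_c,\ t_0 \in \mathcal{T}_0 \tag{7b}$$
$$\big\langle a,\ b_{i,U \cup j}(t_i) - E\big[b_{j,U}(T_j)\big] \big\rangle \le t_{ij}, \quad \forall i \in V_c,\ j \in V_c \setminus i,\ U \subseteq V_c \setminus \{i, j\},\ t_i \in \mathcal{T}_i \tag{7c}$$
$$\langle a, b_{i,\emptyset}(t_i) \rangle \le t_{i0}, \quad \forall i \in V_c,\ t_i \in \mathcal{T}_i. \tag{7d}$$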
Based on Proposition 4, any feasible solution of (7) also provides a lower bound
on (3) through (7a), because it gives a feasible solution for problem (3). Next
we further analyze the approximate program by letting the basis $b$ depend only
on the current customer $i$ and the set of remaining customers $U$, namely
$b_{i,U}(t_i) = b_{i,U}$. Since $\check t_{ij} = \min_{t_i \in \mathcal{T}_i} t_{ij}$ and the left-hand sides of the
constraints (7b–7d) do not change with $t$, (7) can be reformulated as follows.
$$\max_a\ E\big[\langle a, b_{0,V_c} \rangle\big]$$
$$\text{s.t.}\quad \langle a,\ b_{0,V_c} - b_{i,V_c \setminus i} \rangle \le \check t_{0i}, \quad \forall i \in V_c,$$
$$\langle a,\ b_{i,U \cup j} - b_{j,U} \rangle \le \check t_{ij}, \quad \forall i \in V_c,\ j \in V_c \setminus i,\ U \subseteq V_c \setminus \{i, j\},$$
$$\langle a, b_{i,\emptyset} \rangle \le \check t_{i0}, \quad \forall i \in V_c.$$
We see that any feasible solution of this approximate program provides a lower
bound on the static VRP with optimistic times $\check t$; thus it is also a lower bound on (3)
according to Proposition 4. When the assumption $b_{i,U}(t_i) = b_{i,U}$ does not hold for all
states, the approximation responds to changes in the arc times and the problem becomes more
complex. In this case, to improve the optimistic bound, we propose affine
time-to-go approximation functions. Consider the time-to-go approximation
$$x_{0,V_c}(t_0) \approx a_0 + \sum_{i \in V_c} t_{0i} b_{0i}, \quad \forall t_0 \in \mathcal{T}_0, \tag{8a}$$
$$x_{i,U}(t_i) \approx a_{i0} + \sum_{g \in U} \big( a_{ig} + t_{ig} b_{ig} \big), \quad \forall i \in V_c,\ \emptyset \neq U \subseteq V_c \setminus i,\ t_i \in \mathcal{T}_i, \tag{8b}$$
$$x_{i,\emptyset}(t_i) \approx a_{i0} + t_{i0} b_{i0}, \quad \forall i \in V_c,\ t_i \in \mathcal{T}_i, \tag{8c}$$
where $a \in \mathbb{R}^{n^2+1}$ and $b \in \mathbb{R}^{n^2+n}$. According to (8), the route accrues time $a_{i0}$
for the vehicle returning to the depot, and $a_{ig}$ for each remaining customer $g \in U \neq \emptyset$. We
then adjust $a$ by $b_{ig}$ depending on the realization $t_{ig}$, the outgoing time to a customer
$g$ that may be visited next. Since the vehicle always has the same remaining customers at the
depot, the approximation for the initial states depends only on $a_0$.
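Using this approximation, we reformulate (7) as
$$\max_{a,b}\ E\Big[ a_0 + \sum_{g \in V_c} T_{0g} b_{0g} \Big] \tag{9a}$$
$$\text{s.t.}\quad a_0 - a_{i0} + t_{0i} b_{0i} + \sum_{g \in V \setminus \{0,i\}} \big( t_{0g} b_{0g} - a_{ig} - \bar t_{ig} b_{ig} \big) \le t_{0i}, \quad \forall i \in V_c,\ t_0 \in \mathcal{T}_0 \tag{9b}$$
$$a_{i0} - a_{j0} + a_{ij} + t_{ij} b_{ij} + \sum_{g \in U} \big( a_{ig} + t_{ig} b_{ig} - a_{jg} - \bar t_{jg} b_{jg} \big) \le t_{ij}, \quad \forall i \in V_c,\ j \in V_c \setminus i,\ \emptyset \neq U \subseteq V_c \setminus \{i, j\},\ t_i \in \mathcal{T}_i \tag{9c}$$
$$a_{i0} - a_{j0} + a_{ij} + t_{ij} b_{ij} - \bar t_{j0} b_{j0} \le t_{ij}, \quad \forall i \in V_c,\ j \in V_c \setminus i,\ t_i \in \mathcal{T}_i \tag{9d}$$
$$a_{i0} + t_{i0} b_{i0} \le t_{i0}, \quad \forall i \in V_c,\ t_i \in \mathcal{T}_i. \tag{9e}$$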
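As a worked step, constraint (9c) can be recovered by substituting the affine form (8b) into (7c). The derivation below is a sketch in the notation used here, where $\bar t_{jg} = E[T_{jg}]$ is assumed to be the mean of the corresponding arc time, so that the expectation of the affine function of $T_j$ is obtained by replacing $T_{jg}$ with $\bar t_{jg}$.

```latex
\begin{align*}
x_{i,U\cup j}(t_i) - E\big[x_{j,U}(T_j)\big]
  &\approx a_{i0} + \sum_{g\in U\cup j}\big(a_{ig} + t_{ig}\,b_{ig}\big)
     - a_{j0} - \sum_{g\in U}\big(a_{jg} + \bar t_{jg}\,b_{jg}\big) \\
  &= a_{i0} - a_{j0} + a_{ij} + t_{ij}\,b_{ij}
     + \sum_{g\in U}\big(a_{ig} + t_{ig}\,b_{ig} - a_{jg} - \bar t_{jg}\,b_{jg}\big)
  \;\le\; t_{ij}.
\end{align*}
```

The term for $g = j$ is split off from the first sum, which is why $a_{ij} + t_{ij} b_{ij}$ appears outside the summation over $U$. Constraint (9d) is the same computation with $U = \emptyset$, using (8c) for the state at $j$.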

With these approximate programs, we have reduced the number of decision
variables. However, the number of constraints still grows exponentially in $n$, so the
constraint set is too large. To obtain an acceptable problem scale, we next use duality
to gain further insight into the approximate program.
4.2 Duality
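The dual of problem (9) is
$$\min_y \Big[ \sum_{i \in V_c} \sum_{t_0 \in \mathcal{T}_0} t_{0i}\, y_{0i}^{t_0} + \sum_{i \in V_c} \sum_{j \in V_c \setminus i} \sum_{U \subseteq V_c \setminus \{i,j\}} \sum_{t_i \in \mathcal{T}_i} t_{ij}\, y_{ij,U}^{t_i} + \sum_{i \in V_c} \sum_{t_i \in \mathcal{T}_i} t_{i0}\, y_{i0}^{t_i} \Big] \tag{10a}$$
$$\text{s.t.}\quad \sum_{i \in V_c} \sum_{t_0 \in \mathcal{T}_0} y_{0i}^{t_0} = 1, \tag{10b}$$
$$\sum_{i \in V_c} \sum_{t_0 \in \mathcal{T}_0} t_0\, y_{0i}^{t_0} = \bar t_0, \tag{10c}$$
$$\sum_{t_i \in \mathcal{T}_i} y_{i0}^{t_i} + \sum_{j \in V_c \setminus i} \sum_{U \subseteq V_c \setminus \{i,j\}} \Big[ \sum_{t_i \in \mathcal{T}_i} y_{ij,U}^{t_i} - \sum_{t_j \in \mathcal{T}_j} y_{ji,U}^{t_j} \Big] - \sum_{t_0 \in \mathcal{T}_0} y_{0i}^{t_0} = 0, \quad \forall i \in V_c, \tag{10d}$$
$$\sum_{U \subseteq V_c \setminus \{i,j\}} \sum_{t_i \in \mathcal{T}_i} y_{ij,U}^{t_i} + \sum_{g \in V_c \setminus \{i,j\}} \sum_{U \subseteq V_c \setminus \{i,j,g\}} \Big[ \sum_{t_i \in \mathcal{T}_i} y_{ig,U \cup j}^{t_i} - \sum_{t_g \in \mathcal{T}_g} y_{gi,U \cup j}^{t_g} \Big] - \sum_{t_0 \in \mathcal{T}_0} y_{0i}^{t_0} = 0, \quad \forall i \in V_c,\ j \in V_c \setminus i, \tag{10e}$$
$$\sum_{t_i \in \mathcal{T}_i} t_{i0}\, y_{i0}^{t_i} - \bar t_{i0} \sum_{j \in V_c \setminus i} \sum_{t_j \in \mathcal{T}_j} y_{ji,\emptyset}^{t_j} = 0, \quad \forall i \in V_c, \tag{10f}$$
$$\sum_{U \subseteq V_c \setminus \{i,j\}} \sum_{t_i \in \mathcal{T}_i} t_{ij}\, y_{ij,U}^{t_i} + \sum_{g \in V_c \setminus \{i,j\}} \sum_{U \subseteq V_c \setminus \{i,j,g\}} \Big[ \sum_{t_i \in \mathcal{T}_i} t_{ij}\, y_{ig,U \cup j}^{t_i} - \bar t_{ij} \sum_{t_g \in \mathcal{T}_g} y_{gi,U \cup j}^{t_g} \Big] - \bar t_{ij} \sum_{t_0 \in \mathcal{T}_0} y_{0i}^{t_0} = 0, \quad \forall i \in V_c,\ j \in V_c \setminus i, \tag{10g}$$
$$y \ge 0, \quad y \text{ has finite support.} \tag{10h}$$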
Here, variable $y_{ij,U}^{t_i}$ represents the probability of visiting state $(i, U \cup j, t_i)$ and
choosing the next action $j$; variables $y_{0i}^{t_0}$ and $y_{i0}^{t_i}$ respectively refer to the
probability of selecting customer $i$ from the initial state $(0, V_c, t_0)$ and of visiting state
$(i, \emptyset, t_i)$ immediately before the terminal state. Depending on the sets $\mathcal{T}_i$, the
index sets of these decision variables may be finite or infinite, but only finitely many
variables can be positive in a solution. The objective function
(10a) minimizes the total expected duration with the probability variables $y$.
Constraint (10b), associated with $a_0$, indicates that the probabilities of visiting the
state-action pairs $(0, V_c, t_0, i)$ at the depot sum to one. Constraints (10c),
corresponding to $b_{0i}$ for each $i \in V_c$, mean that the expected outgoing time under the
probabilities $y_{0i}^{t_0}$ should equal $\bar t_0$. Constraints (10c) can be seen as a relaxation, since the
probabilities corresponding to an actual policy must match not only the expectation
but the exact distribution of $T_0$. Constraints (10d), associated with $a_{i0}$,
enforce a probability flow balance: the probability of visiting customer $i$ equals
the probability of leaving it. Constraints (10e), corresponding to $a_{ij}$, guarantee a
probability flow balance: the probability of visiting customer $i$ before customer $j$
must equal the probability of leaving customer $i$ when customer $j$
remains, either by going to customer $j$ itself or to another remaining customer. Here,
(10e) can also be seen as a relaxation because in an actual solution this must hold not only
for individual customers $j$, but for sets of customers as well. Constraints (10f),
involving $b_{i0}$, indicate that the expected time of returning to the depot 0 from $i$,
the last customer visited, should be $\bar t_{i0}$. Constraints (10g), corresponding to $b_{ij}$, enforce
that the expected arc time of $(i, j)$, conditioned on visiting customer $i$ before customer $j$,
equals $\bar t_{ij}$.
Proposition 5 Strong duality holds between problems (9) and (10), that is,
their optimal values are equal.
Proof For problem (9), the index set of the semi-infinite constraints varies
over a compact space, and all coefficients of the variables and the right-hand side
functions are continuous in the index. We can easily construct a Slater
point: $a_0 = a_{ij} = C$ for sufficiently large $C$, with all other variables set to 0.
Based on Goberna and López (1998), the semi-infinite program is
Farkas–Minkowski, and hence strong duality holds for formulation (10).
Intuitively, we expect higher realized times in any state to
yield a higher time-to-go. This leads to the following corollary.
Corollary 1 In problem (9), we can set $b \ge 0$ without loss of optimality.
Proof By adding the constraints $b \ge 0$ to (9), we obtain a new optimization
problem with a strong dual similar to (10), in which constraints (10c), (10f) and (10g) are
relaxed to greater-than-or-equal inequalities. Since no optimal solution contains
outgoing probability flow corresponding to longer durations in these constraints,
the modified model must satisfy them at equality.
To demonstrate the quality of the bound that problem (9) provides for the original
problem (3), we compare it with the LP relaxation of the arc-based formulation of the
static VRP with optimistic durations $\check t$. To distinguish the formulations, variable $z$ is
used in place of variable $y$. The LP relaxation can be solved in
polynomial time. As it is a lower bound on $\check x_{0,V_c}$, the value of the static VRP with arc
times $\check t$, it is also a lower bound on the optimal expected value $E[x^*_{0,V_c}(T_0)]$ of our
original problem by Proposition 4.
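$$\min_z\ \sum_{i \in V} \sum_{j \in V \setminus i} \check t_{ij} z_{ij} \tag{11a}$$
$$\text{s.t.}\quad \sum_{j \in V \setminus i} z_{ij} = 1, \quad \forall i \in V, \tag{11b}$$
$$\sum_{j \in V \setminus i} z_{ji} = 1, \quad \forall i \in V, \tag{11c}$$
$$\sum_{i \in U} \sum_{j \in V \setminus U} z_{ij} \ge 1, \quad \forall\, \emptyset \neq U \subsetneq V, \tag{11d}$$
$$z \ge 0. \tag{11e}$$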
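Theorem 1 The optimal solutions of problem (9) provide lower bounds which are
greater than or equal to the bounds derived from problem (11).
Proof Setting $b_{ij} = 0$ for all $i \in V$, $j \in V_c$ in (9), we obtain the LP
$$\max_a\ a_0$$
$$\text{s.t.}\quad a_0 - a_{i0} - \sum_{g \in V_c \setminus i} a_{ig} \le \check t_{0i}, \quad \forall i \in V_c,$$
$$a_{i0} - a_{j0} + a_{ij} + \sum_{g \in U} \big( a_{ig} - a_{jg} \big) \le \check t_{ij}, \quad \forall i \in V \setminus 0,\ j \in V_c \setminus i,\ U \subseteq V_c \setminus \{i, j\},$$
$$a_{i0} \le \check t_{i0}, \quad \forall i \in V_c,$$
where the right-hand sides are the smallest realizations because each constraint must hold
for every $t_i \in \mathcal{T}_i$. This LP has the form treated in Theorem 8 of Toriello (2014), from
which we conclude that the optimal solutions of problem (9) provide lower bounds that are
greater than or equal to the bounds derived from problem (11).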
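As an example of Theorem 1, we assume that $T_{ij}$ ($j \in V_c$) are i.i.d. Bernoulli random
variables with parameter $p \in (0, 1)$, and that $T_{i0} = 0$ with probability one. Since
$\check t = 0$, no deterministic bound can improve on the zero bound. The ALP formulation (9) is
represented as
$$\max_{a,b}\ \Big[ a_0 + p \sum_{g \in V_c} b_{0g} \Big]$$
$$\text{s.t.}\quad a_0 - a_{i0} + t_{0i} b_{0i} + \sum_{g \in V_c \setminus i} \big( t_{0g} b_{0g} - a_{ig} - p\, b_{ig} \big) \le t_{0i}, \quad \forall i \in V_c,\ t_0 \in \{0, 1\}^{V_c},$$
$$a_{i0} - a_{j0} + a_{ij} + t_{ij} b_{ij} + \sum_{g \in U} \big( a_{ig} + t_{ig} b_{ig} - a_{jg} - p\, b_{jg} \big) \le t_{ij}, \quad \forall i \in V_c,\ j \in V_c \setminus i,\ U \subseteq V_c \setminus \{i, j\},\ t_i \in \{0, 1\}^{U \cup j},$$
$$a_{i0} \le 0, \quad \forall i \in V_c,$$
$$b \ge 0.$$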
By the symmetry of the example, we have $a_{i0} = a_{j0}$ and $a_{ig} = a_{jg}$ for
distinct $i, j, g \in V_c$, with similar equalities holding for $b$. The last constraint class
implies $a_{i0} = 0$, and $b_{i0}$ can be neglected because it is not used in the LP.
Given that $b \ge 0$, the worst case of the first and second constraints sets the remaining
arc times to one (there are $n-1$ such terms in the first constraint and at most $n-2$ in the
second), which reduces them to
$$a_0 + \big( t_{0i} + (n-1) \big) b_{0i} \le t_{0i} + (n-1) \big( a_{ij} + p\, b_{ij} \big), \quad \forall t_{0i} \in \{0, 1\},$$
$$a_{ij} + \big( t_{ij} + (n-2)(1-p) \big) b_{ij} \le t_{ij}, \quad \forall t_{ij} \in \{0, 1\}.$$
Therefore, the ALP can be converted to two LPs with only two variables each and a simple
relationship,
$$\max_{a_0, b_{0i}}\ \big[ a_0 + n p\, b_{0i} \big]$$
$$\text{s.t.}\quad a_0 + (n-1) b_{0i} \le (n-1) s,$$
$$a_0 + n\, b_{0i} \le 1 + (n-1) s,$$
$$b_{0i} \ge 0,$$
and
$$\max_{a_{ij}, b_{ij}}\ \big[ a_{ij} + p\, b_{ij} \big]$$
$$\text{s.t.}\quad a_{ij} + (n-2)(1-p)\, b_{ij} \le 0,$$
$$a_{ij} + \big( 1 + (n-2)(1-p) \big) b_{ij} \le 1,$$
$$b_{ij} \ge 0,$$
where $s$ refers to the optimal value of the second LP. Obviously,
$s = \max\{0,\ p - (n-2)(1-p)\}$. Based on this, the optimal value of the
first LP is $\max\big\{0,\ (n-1)\big(p(n-1) - (n-2)\big),\ np - (n-1)^2(1-p)\big\}$. For
$p > (n-2)/(n-1)$, this bound is nonzero and approaches the optimal expected time-to-go
as $p \to 1$.
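The two small LPs can be checked numerically against the stated closed forms. The sketch below is a generic vertex-enumeration routine for bounded two-variable LPs (our own helper, not part of the paper). It takes the coefficient of $s$ in the first LP to be $n-1$, which is the value for which the stated optimal value $\max\{0, (n-1)(p(n-1)-(n-2)), np-(n-1)^2(1-p)\}$ is reproduced.

```python
from itertools import combinations


def solve_2d_lp(objective, constraints):
    """Maximize objective . (u, v) subject to rows (cu, cv, rhs) meaning
    cu*u + cv*v <= rhs. Enumerates pairwise intersections of constraint
    boundaries; assumes the optimum is finite and attained at a vertex."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries, no vertex
        u = (r1 * b2 - r2 * b1) / det
        v = (a1 * r2 - a2 * r1) / det
        if all(cu * u + cv * v <= rhs + 1e-9 for cu, cv, rhs in constraints):
            value = objective[0] * u + objective[1] * v
            best = value if best is None else max(best, value)
    return best


def alp_bound(n, p):
    """Solve the second LP for s, then the first LP, for the Bernoulli example."""
    c = (n - 2) * (1 - p)
    s = solve_2d_lp((1.0, p),
                    [(1.0, c, 0.0), (1.0, 1.0 + c, 1.0), (0.0, -1.0, 0.0)])
    rhs = (n - 1) * s
    first = solve_2d_lp((1.0, n * p),
                        [(1.0, n - 1.0, rhs), (1.0, float(n), 1.0 + rhs),
                         (0.0, -1.0, 0.0)])
    return s, first


def closed_form(n, p):
    """Closed-form values of s and of the first LP as stated in the text."""
    s = max(0.0, p - (n - 2) * (1 - p))
    first = max(0.0, (n - 1) * (p * (n - 1) - (n - 2)),
                n * p - (n - 1) ** 2 * (1 - p))
    return s, first


results = {p: alp_bound(5, p) for p in (0.5, 0.8, 0.95)}
```

For $n = 5$ the threshold $(n-2)/(n-1)$ is 0.75, so the bound is zero at $p = 0.5$ and strictly positive at $p = 0.8$ and $p = 0.95$.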
4.3 Constraint generation
From a computational standpoint, it is necessary to generate the constraints
in problem (9) efficiently. In this section, we do this using near-optimal policies, in the
light of Adelman and Klabjan (2012). Observe that the separation problems
associated with constraints (9b), (9d) and (9e) are equivalent to maximizing a linear
function over the sets $\mathcal{T}_i$ ($i \in V$), so we need only a method that solves this
maximization problem in polynomial time. However, constraints (9c) make the
situation more complex. In this case, we first fix $a$ and $b$; we then have an equivalent
formulation of the separation problem for ordered customer pairs $i \in V_c$ and
$j \in V_c \setminus i$,
$$\max_{t_i \in \mathcal{T}_i,\ \emptyset \neq U \subseteq V_c \setminus \{i,j\}} \Big\{ t_{ij} (b_{ij} - 1) + \sum_{g \in U} \big( a_{ig} + t_{ig} b_{ig} - a_{jg} - \bar t_{jg} b_{jg} \big) \Big\}. \tag{12}$$
Problem (12) is a bilinear optimization problem involving binary variables over the
compact set $\mathcal{T}_i \times \{0, 1\}^{V_c \setminus \{i,j\}}$, which is usually challenging to solve.
Lemma 1 The separation problem (12) is NP-hard even when the time support
sets $T_i$ are $\ell_2$ balls.
Proof For $i \in V_c$, $j \in V_c \setminus i$, and fixed $a$ and $b$, suppose
$T_i = \{ t \in \mathbb{R}^n : t = m + u,\ \|u\|_2 \le 1 \}$, i.e., $T_i$ is an $\ell_2$ unit ball centered at
$m \in \mathbb{R}^n$. Because

$$\max_{t_i \in T_i} \Big\{ t_{ij}(b_{ij} - 1) + \sum_{g \in U} t_{ig} b_{ig} \Big\} = m_{ij}(b_{ij} - 1) + \sum_{g \in U} m_{ig} b_{ig} + \sqrt{(b_{ij} - 1)^2 + \sum_{g \in U} b_{ig}^2},$$

problem (12) can be represented as

$$\max_{\emptyset \ne U \subseteq V_c \setminus \{i,j\}} \Big\{ \sum_{g \in U} b_g + \sqrt{a_j + \sum_{g \in U} a_g} \Big\} \qquad (13)$$

with an appropriate choice of $a_j$, $a_g \ge 0$ and $b_g$. With these settings,
problem (13) becomes a particular instance of the utility maximization problem
proposed by Adelman and Klabjan (2012). By Proposition 1 of Adelman
and Klabjan (2012), problem (13) is NP-hard. Therefore problem (12), which
generalizes problem (13), must also be NP-hard.
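The identity used in the proof is the standard fact that a linear function $c^\top t$ over the unit $\ell_2$ ball centered at $m$ is maximized at $t = m + c / \|c\|_2$, with value $c^\top m + \|c\|_2$. The sketch below illustrates this numerically; the function names and the sample vectors are hypothetical and chosen only for the illustration.

```python
import math


def max_linear_over_ball(c, m):
    """Closed-form maximum of sum(c_k * t_k) over the unit l2 ball centered at m."""
    norm = math.sqrt(sum(ck * ck for ck in c))
    return sum(ck * mk for ck, mk in zip(c, m)) + norm


def maximizer(c, m):
    """Point of the ball attaining the maximum: m + c / ||c||_2 (c must be nonzero)."""
    norm = math.sqrt(sum(ck * ck for ck in c))
    return [mk + ck / norm for ck, mk in zip(c, m)]


def linear_value(c, t):
    """Value of the linear function c^T t."""
    return sum(ck * tk for ck, tk in zip(c, t))
```

In the setting of the lemma, `c` would collect the coefficients $(b_{ij} - 1)$ and $b_{ig}$ for $g \in U$, and `m` the corresponding center coordinates $m_{ij}$ and $m_{ig}$.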
Proposition 6 For $i \in V_c$, if each set $T_i$ has polynomially many extreme points,
say $O(p(n))$, then problem (13) can be solved within $O(n\,p(n))$ time for
each customer pair $(i, j)$, and therefore problem (9a) can be solved in
polynomial time.
Proof For each $g \in V_c \setminus \{i, j\}$, put all positive terms in $U$ and make sure $U \ne \emptyset$. If
each set $T_i$ has $O(p(n))$ extreme points, repeating this procedure at each extreme point
yields the overall maximizer within $O(n\,p(n))$ time.
This proposition suggests a tractable way of dealing with the arc time support sets
$T_i$: solve (13) by defining $\check{t}_{ij} \triangleq \min_{l \in L_{ij}} T_{ijl} \triangleq \min_{t_i \in T_i} t_{ij}$ and
$\hat{t}_{ij} \triangleq \max_{l \in L_{ij}} T_{ijl} \triangleq \max_{t_i \in T_i} t_{ij}$, respectively. This approach may lead to
conservative results, but it is computationally more efficient.
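The procedure in the proof of Proposition 6 can be sketched as follows for the linear separation objective (12): at each extreme point of $T_i$, every customer $g$ contributes an independent term, so $U$ collects the positive terms, and if none is positive the largest one is kept so that $U$ stays non-empty. This is an illustrative sketch only. The function name `separate`, the dictionary layout of `a`, `b` and the extreme points, and the argument `t_from_j` (the constants $t_{jg}$) are assumptions, and at least one customer other than $i$ and $j$ is assumed to exist.

```python
def separate(i, j, customers, extreme_points, a, b, t_from_j):
    """Return (value, U, t) maximizing objective (12) for the ordered pair (i, j).

    extreme_points: list of dicts mapping a customer g to the arc time t_ig.
    a, b:           dicts keyed by (tail, head) with the fixed multipliers.
    t_from_j:       dict mapping g to the constant t_jg.
    """
    others = [g for g in customers if g not in (i, j)]
    best = None
    for t in extreme_points:  # O(p(n)) extreme points
        # O(n) work per extreme point: one independent term per customer g
        terms = {
            g: a[i, g] + t[g] * b[i, g] - a[j, g] - t_from_j[g] * b[j, g]
            for g in others
        }
        chosen = [g for g in others if terms[g] > 0]  # keep every positive term
        if not chosen:  # U must be non-empty: keep the least negative term
            chosen = [max(others, key=terms.get)]
        value = t[j] * (b[i, j] - 1) + sum(terms[g] for g in chosen)
        if best is None or value > best[0]:
            best = (value, frozenset(chosen), t)
    return best


# Tiny illustrative instance (all numbers are made up).
customers = [1, 2, 3, 4]
a = {(1, 3): 0.5, (1, 4): -1.0, (2, 3): 0.2, (2, 4): 0.3}
b = {(1, 2): 0.4, (1, 3): 0.6, (1, 4): 0.1, (2, 3): 0.5, (2, 4): 0.2}
t_from_j = {3: 1.0, 4: 2.0}
extreme_points = [{2: 1.0, 3: 2.0, 4: 1.0}, {2: 3.0, 3: 0.5, 4: 4.0}]
value, U, t_best = separate(1, 2, customers, extreme_points, a, b, t_from_j)
```

A violated constraint exists for the pair $(i, j)$ exactly when the returned value exceeds the constant part of the constraint's right-hand side, which is omitted here.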
4.4 Heuristic time-directed policies
A candidate solution $\tilde{x}$ of problem (3) yields a policy: the greedy policy associated
with (2), with $\tilde{x}$ substituted for $x^*$. Following Adelman (2007), this policy can also
be called price-directed. For the particular approximation in problem
(8a), for any $a \in \mathbb{R}^{n^2+1}$, $b \in \mathbb{R}^{n^2+n}$, and breaking ties arbitrarily, we have

$$\ell_{a,b}(i, U, t_i) = \begin{cases} \arg\min_{j \in U} \Big\{ t_{ij} + a_{j0} + \sum_{g \in U \setminus j} \big( a_{jg} + t_{jg} b_{jg} \big) \Big\}, & |U| \ge 2, \\ j, & U = \{j\},\ j \in V_c \setminus i, \\ 0, & U = \emptyset. \end{cases}$$
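The selection rule of the price-directed (time-directed) policy can be sketched as below. This is a minimal illustration under stated assumptions: the function name `time_directed_action` and the dictionary layout are hypothetical, `t_mean[j, g]` stands for the arc time $t_{jg}$ that enters the approximation (taken here to be an expected arc time, which is our assumption), the inner sum runs over the customers still to be visited after $j$, and ties are broken by the smallest customer label.

```python
def time_directed_action(i, U, t_i, a, b, t_mean, depot=0):
    """Next node chosen by the price-directed policy at state (i, U, t_i).

    t_i:    dict mapping customer j to the realized arc time t_ij.
    a, b:   dicts keyed by (tail, head) with the approximation multipliers.
    t_mean: dict keyed by (j, g) with the arc time used inside the approximation.
    """
    U = set(U)
    if not U:
        return depot  # nothing left to visit: return to the depot
    if len(U) == 1:
        return next(iter(U))  # a single remaining customer is visited next

    def score(j):
        # realized time to j plus the affine time-to-go estimate from j
        to_go = a[j, depot] + sum(a[j, g] + t_mean[j, g] * b[j, g] for g in U - {j})
        return t_i[j] + to_go

    return min(sorted(U), key=score)  # sorted() makes tie-breaking deterministic


# Illustrative data (made up): three customers, vehicle currently at the depot.
customers = [1, 2, 3]
a = {(j, 0): 0.0 for j in customers}
a.update({(j, g): 1.0 for j in customers for g in customers if g != j})
b = {(j, g): 0.5 for j in customers for g in customers if g != j}
t_mean = {(j, g): 2.0 for j in customers for g in customers if g != j}
t_now = {1: 5.0, 2: 1.0, 3: 3.0}
next_customer = time_directed_action(0, customers, t_now, a, b, t_mean)
```

With these symmetric multipliers every customer has the same time-to-go estimate, so the policy simply follows the smallest realized arc time.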
To further analyze the policy, we begin by assuming some auxiliary structure on the arc
times and the optimal expected time-to-go. In particular, we assume that the
optimal expected time-to-go is non-decreasing in the remaining customer set, and
thus we have

$$\mathbb{E}\big[x^*_{i,U}(T_i)\big] \le \mathbb{E}\big[x^*_{i,U \cup j}(T_i)\big], \quad \forall i \in V_c,\ j \in V_c \setminus i,\ U \subseteq V_c \setminus \{i, j\}, \qquad (14a)$$

$$\mathbb{E}\big[x^*_{i,V_c \setminus i}(T_i)\big] \le \mathbb{E}\big[x^*_{0,V_c}(T_0)\big], \quad \forall i \in V_c. \qquad (14b)$$

We believe that this assumption is reasonable in practice: the more customers
the vehicle has left to visit, the longer the remaining duration. For the deterministic
case of the static VRP, (14) holds when the arc times satisfy the triangle inequality. However, in
our DVRP, we cannot assume that the triangle inequality always holds without
violating the independence of arc times with different tails.
The result for time-directed policies obtained from the first approximation is generic.
Since all approximations are lower bounds on the optimal expected time-to-go, the
following theorem requires only that the approximation be a lower bound on
the expectation.
Theorem 2 Let $\tilde{x}_{i,U} : T_i \to \mathbb{R}$ be an approximate time-to-go function for each
$i \in V$, $U \subseteq V_c \setminus i$, satisfying $\mathbb{E}[\tilde{x}_{i,U}(T_i)] \le \mathbb{E}[x^*_{i,U}(T_i)]$. Assume (14) holds,
$\mathbb{E}[x^*_{i,U}(T_i)] \ge 0$, and

$$k\, \mathbb{E}\big[x^*_{i,U}(T_i)\big] \le \mathbb{E}\big[\tilde{x}_{i,U}(T_i)\big], \quad \forall i \in V,\ U \subseteq V_c \setminus i \qquad (15)$$

for some $k \in (0, 1]$. Then the expected duration of the time-directed policy with
approximation $\tilde{x}$ is bounded by $(1 + (1 - k)n)\, \mathbb{E}[x^*_{0,V_c}(T_0)]$.
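Proof Suppose the policy selects $j \in U$ as the next action, but $g \in U$ is an optimal
decision. Then

$$t_{ij} + \mathbb{E}\big[\tilde{x}_{j,U \setminus j}(T_j)\big] \le t_{ig} + \mathbb{E}\big[\tilde{x}_{g,U \setminus g}(T_g)\big] \le t_{ig} + \mathbb{E}\big[x^*_{g,U \setminus g}(T_g)\big] = x^*_{i,U}(t_i);$$

therefore,

$$t_{ij} \le x^*_{i,U}(t_i) - \mathbb{E}\big[\tilde{x}_{j,U \setminus j}(T_j)\big] \le x^*_{i,U}(t_i) - k\, \mathbb{E}\big[x^*_{j,U \setminus j}(T_j)\big].$$

Let $t \in T_0 \times \cdots \times T_n$ denote the realizations of all arc times. Under this
realization, the customers are relabeled in the order in which the time-directed policy
visits them. Thus the total duration of the time-directed route satisfies

$$\sum_{i=0}^{n-1} t_{i,i+1} + t_{n0} \le x^*_{0,V_c}(t_0) + \sum_{i \in V_c} \Big( x^*_{i,\{i+1,\ldots,n\}}(t_i) - k\, \mathbb{E}\big[x^*_{i,\{i+1,\ldots,n\}}(T_i)\big] \Big).$$

The remaining customer set when $i$ is visited by the time-directed policy is denoted
by a random variable $U_i$. Taking expectations of the previous inequality,

$$\mathbb{E}[T_0] \le \mathbb{E}\big[x^*_{0,V_c}(T_0)\big] + \mathbb{E}\Big[ \sum_{i \in V_c} x^*_{i,U_i}(T_i) - \sum_{i \in V_c} k\, \mathbb{E}\big[x^*_{i,U_i}(T_i)\big] \Big].$$

Since distinct $T_i$ are independent, $U_i$ and $T_i$ are independent, and hence

$$\mathbb{E}[T_0] \le \mathbb{E}\big[x^*_{0,V_c}(T_0)\big] + (1 - k) \sum_{i \in V_c} \sum_{U \subseteq V_c \setminus i} P(U_i = U)\, \mathbb{E}\big[x^*_{i,U}(T_i)\big] \le (1 + (1 - k)n)\, \mathbb{E}\big[x^*_{0,V_c}(T_0)\big],$$

where the last inequality follows from (14), which bounds each $\mathbb{E}[x^*_{i,U}(T_i)]$ by $\mathbb{E}[x^*_{0,V_c}(T_0)]$.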
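To get a feel for the guarantee in Theorem 2, the multiplicative factor $1 + (1 - k)n$ can be evaluated for a few values of $k$. The helper below is only a worked arithmetic illustration; the name `duration_factor` is hypothetical.

```python
def duration_factor(n, k):
    """Factor 1 + (1 - k) * n bounding the policy's expected duration
    relative to the optimal expected duration (Theorem 2)."""
    if not 0 < k <= 1:
        raise ValueError("k must lie in (0, 1]")
    return 1 + (1 - k) * n
```

With an exact approximation ($k = 1$) the factor is 1, so the policy is optimal in expectation; for $n = 25$ customers and $k = 0.625$ the factor is $1 + 0.375 \cdot 25 = 10.375$. The guarantee therefore weakens linearly in $n$ as the approximation quality $k$ decreases.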
Note that we cannot use this result directly with an optimal solution of problem
(9). Even in the static VRP, such a solution may provide a tight bound for
$\mathbb{E}[x^*_{0,V_c}(T_0)]$ while giving a poor approximation of the other optimal expected
time-to-go values. We will observe this later in the numerical experiments.
Therefore, to take advantage of Theorem 2, we follow the rollout framework and
recalculate the bound at every step for every available action. This rollout
framework solves $O(n^2)$ LPs of the form (9) in total. If the ALP
can be solved in polynomial time, our policy also runs in polynomial time. In the
numerical experiments, we use a heuristic modification of this scheme that generates
high-quality solutions within practical solution times.
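The rollout framework can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the bound computation is passed in as a callable `lower_bound(j, remaining)` that stands in for re-solving the ALP (9) with $j$ as the start and `remaining` as the customers to visit, and a fixed matrix `times` plays the role of one realization of all arc times. The stand-in bound `cheapest_exit_bound` is a deliberately crude assumption used only to make the sketch runnable.

```python
def rollout_route(times, lower_bound, depot=0):
    """Build a route by re-evaluating a time-to-go bound for every available action.

    times[i][j]:               realized arc time from i to j.
    lower_bound(j, remaining): estimate of the expected time-to-go from j
                               through `remaining` and back to the depot.
    Returns the route and the number of bound evaluations.
    """
    remaining = {v for v in range(len(times)) if v != depot}
    route, current, calls = [depot], depot, 0
    while remaining:
        if len(remaining) == 1:
            nxt = next(iter(remaining))  # no choice left to evaluate
        else:
            scores = {}
            for j in sorted(remaining):
                scores[j] = times[current][j] + lower_bound(j, remaining - {j})
                calls += 1  # one bound computation per available action
            nxt = min(scores, key=scores.get)
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    route.append(depot)
    return route, calls


def cheapest_exit_bound(times, depot=0):
    """Stand-in bound: each node still to be left uses its cheapest allowed arc."""
    def bound(j, remaining):
        total = 0.0
        for u in [j] + sorted(remaining):
            targets = [v for v in remaining if v != u] + [depot]
            total += min(times[u][v] for v in targets)
        return total
    return bound


# Illustrative 5-node instance (node 0 is the depot; numbers are made up).
times = [
    [0, 4, 2, 7, 3],
    [4, 0, 5, 1, 6],
    [2, 5, 0, 3, 2],
    [7, 1, 3, 0, 4],
    [3, 6, 2, 4, 0],
]
route, calls = rollout_route(times, cheapest_exit_bound(times))
```

With $n$ customers the loop performs $n + (n-1) + \cdots + 2$ bound evaluations, which is the $O(n^2)$ count mentioned above.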
Theorem 3 If (i) $\check{t}_{ij} > 0$ and $\hat{t}_{ij} / \check{t}_{ij} \le \omega$ for some $\omega > 0$ and all distinct $i, j \in V$, and (ii) $\check{t}_{ij}$
is symmetric and satisfies the triangle inequality, i.e., $\check{t}_{ij} = \check{t}_{ji}$ and $\check{t}_{ij} \le \check{t}_{ig} + \check{t}_{gj}$ for all
distinct $i, j, g \in V$, then, at any state $(i, U, t_i)$ with $U \ne \emptyset$, let $\mathbb{E}[\tilde{x}_{j,U \setminus j}(T_j)]$ for $j \in U$
be given by recalculating (9) $|U|$ times, using every remaining customer $j \in U$
instead of 0 as the start customer and $U \setminus j$ as the customers to visit. Then Theorem 2
applies with $k = 5/(8\omega)$, so that the total expected duration of this policy is bounded by
$\big(1 + \big(1 - \tfrac{5}{8\omega}\big) n\big)\, \mathbb{E}[x^*_{0,V_c}(T_0)]$.
Proof From (i) we have $\omega\, \check{x}_{j,U \setminus j} \ge \mathbb{E}[x^*_{j,U \setminus j}(T_j)]$ for all $\emptyset \ne U \subseteq V_c$, $j \in U$. Here $\check{x}_{j,U \setminus j}$
is the duration of the shortest Hamiltonian path starting from customer $j$, visiting
customers $U \setminus j$ and ending at 0. According to Theorem 1, $\mathbb{E}[\tilde{x}_{j,U \setminus j}(T_j)]$ provides a
lower bound on $\check{x}_{j,U \setminus j}$ at least as good as its LP relaxation. Then, based on Sebő
(2013), we obtain a bound within a factor of 5/8 of $\check{x}_{j,U \setminus j}$ by considering (ii).

Fig. 2 Comparison of optimal values with independently distributed times for 25 customers
5 Computational results
In this section, we present several numerical examples to illustrate the performance
of our models and the proposed solution methodology. We are particularly
interested in the performance of various bounds and policies, and we compare the
quality of the bounds from ALP with the optimistic bound and the posterior bound.
We use the asymmetric data set from Dabia et al. (2013), which includes 25 or more
customers. We run the numerical experiments with CPLEX 12.6.0 on a
personal computer with an Intel Core i5-4570 processor and 8 GB of RAM, operating
under 64-bit Microsoft Windows 7 Enterprise.
In the first instance, we suppose that there are independently distributed arc times
with two possible realizations, high and low. For the high realization, the
deterministic time is multiplied by the factor $h = 1 + c_h$, and for the low realization by the
factor $\ell = 1 + c_\ell$. The input parameter $c_h$ indicates the
markup of high arc times and $\Pr(h)$ denotes the probability of the high realization.
The probability of the low realization is $\Pr(\ell) = 1 - \Pr(h)$. To match the
deterministic arc time, we compute $c_\ell$ by $c_\ell = -\dfrac{c_h \Pr(h)}{1 - \Pr(h)}$. For example, if high and low
arc times are equally likely, then $c_h = -c_\ell$, whereas when the probability of a high
time is 80%, $c_\ell = -4 c_h$. In this way, the optimal expected duration of a fixed
route equals the deterministic optimal duration, which can be used as the benchmark
for our results. In the second instance, there are also two realizations, high and low. It
differs from the first instance in that the outgoing arc times at a customer are all
either high or low, which can represent traffic conditions such as heavy traffic or
smooth traffic. In this case, the support set $T_i$ consists of two points.
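The construction of the two-point distribution can be checked with a few lines of arithmetic. The sketch below assumes the sign convention $\ell = 1 + c_\ell$ with $c_\ell \le 0$, so that the mean multiplier $\Pr(h)(1 + c_h) + (1 - \Pr(h))(1 + c_\ell)$ equals 1; the function names are illustrative.

```python
def low_markup(c_h, pr_h):
    """c_l such that Pr(h) * (1 + c_h) + (1 - Pr(h)) * (1 + c_l) = 1."""
    if not 0 <= pr_h < 1:
        raise ValueError("Pr(h) must lie in [0, 1)")
    return -c_h * pr_h / (1 - pr_h)


def factors(c_h, pr_h):
    """High and low multipliers (h, l) applied to the deterministic arc time."""
    return 1 + c_h, 1 + low_markup(c_h, pr_h)


def mean_factor(c_h, pr_h):
    """Expected multiplier; equals 1 when the distribution matches the deterministic time."""
    h, low = factors(c_h, pr_h)
    return pr_h * h + (1 - pr_h) * low
```

For instance, `low_markup(0.1, 0.8)` is approximately `-0.4`, i.e. $c_\ell = -4 c_h$, and `mean_factor(0.25, 0.65)` is 1 up to rounding, which corresponds to the most extreme setting $h = 1.25$, $\Pr(h) = 0.65$ used in the experiments.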
In both instances, recall that $\bar{x}_{0,V_c}$ represents the optimal expected duration of a
route, which equals the optimal duration from the deterministic
model. Based on the arc time distribution, $\check{t}_{ij} = \ell\, \bar{t}_{ij}$ for every customer pair $(i, j)$,
and thus we have the best optimistic bound $\check{x}_{0,V_c} = \ell\, \bar{x}_{0,V_c}$. We will compare this
optimistic bound $\check{x}_{0,V_c}$ with the ALP bound (9). In addition, we compare the
performance of ALP with an a posteriori bound (Secomandi 2001), which is able to
solve many NP-hard optimization problems efficiently. We will also observe
the performance of the heuristic policy, which is proposed to improve the
time-directed policy for the solution of (9) with (13). At a state $(i, U, t_i)$ with $|U| \ge 2$,
instead of the optimistic duration, the heuristic policy selects

$$\wp^{LP}(i, U, t_i) \triangleq \arg\min_{j \in U} \big\{ t_{ij} + x^{LP}_{j,U \setminus j} \big\},$$

where $x^{LP}_{j,U \setminus j}$ is the optimal value of the LP
relaxation with deterministic times $\bar{t}$. In the heuristic policy, we use $x^{LP}_{j,U \setminus j}$ as the
time-to-go estimate.
We assume $|V_c| \in \{25, 50, 100\}$, $h \in \{1.00, 1.05, 1.1, 1.15, 1.2, 1.25\}$ and
$\Pr(h) \in \{0.5, 0.55, 0.6, 0.65\}$, and thus there are 72 numerical instances for these
two types of arc time distributions. We report the results of the numerical tests in
Figs. 2, 3, 4, 5, 6 and 7. As illustrated in these figures, we obtain

$$\text{optimistic bound } \check{x}_{0,V_c} \le \text{ALP bound } a_0 + \sum_{g \in V_c} t_{0g} b_{0g} \le \text{posterior bound} \le \text{heuristic policy} \le \text{best fixed tour } \bar{x}_{0,V_c}. \qquad (16)$$
Fig. 5 Comparison of optimal values with high/low correlated times for 25 customers

Moreover, we observe a similar pattern for all other instances, with independently
distributed times and with high/low correlated times: the differences between two
neighboring quantities in (16) uniformly increase with $h$ and $\Pr(h)$.
According to the conclusion of Secomandi (2001), the bound from the
posterior model is the tightest in most instances. Hence we need only compare
the gaps between the bounds from our model and the posterior model. From our
results, we find that the proposed ALP bound is second only to the posterior
bound, even in the most extreme instances. Note that our ALP bound
always performs better than the optimistic bound $\check{x}_{0,V_c}$, which is usually the best
bound with deterministic times. Our heuristic policy also provides quite tight bounds
in most instances. In particular, in extreme instances, the heuristic policy prevails
over the route with deterministic times. For example, for the extreme case
$\Pr(h) = 0.65$ and $h = 1.25$, the duration of the heuristic policy is only 59% of the
deterministic route's duration. We find similar results in other instances with independently
distributed times.
Figures 5, 6 and 7 illustrate the bound performance for the instances with
high/low correlated arc times. There are some differences compared with the first test.
In most cases, the gaps between the posterior bound and the other two bounds
depend clearly on the parameters $\Pr(h)$ and $h$: smaller parameters correspond to
smaller gaps and larger parameters to larger gaps. The only exception is that the gaps
between the posterior bound and the heuristic policy are much smaller. For the
instances with 50 customers, the values obtained by the
heuristic policy are within 10% of optimality in all instances.
As for the reasons behind the gaps, Proposition 3 provides $x^{LP}_{j,U \setminus j}$ as a feasible and
computationally reasonable approximation of each action's time-to-go $\mathbb{E}[x^*_{j,U \setminus j}(T_j)]$.
Thus the heuristic policy can find a near-optimal solution.
In the above experiments, we tested the tightness of several methods. From a computational
point of view, it is equally important to examine the solution times of these methods. We
summarize the CPU times (s) in Table 1, which contains the total seconds to
compute the ALP bounds and the average seconds to compute the posterior bounds and the
heuristic policy. The solution times of the ALP method are clearly much
longer than those of the other two methods. This is the reason why we develop the heuristic
policy in Theorem 3, although the ALP bounds are tighter than the heuristic policy.

Table 1 CPU times (s) for instances with 50 customers

                 Independent times                  High/low correlated times
Pr(h)   h        Heuristic   ALP      Posterior     Heuristic   ALP      Posterior
                 policy      model    model         policy      model    model
0.5     1        79          3687     46            78          8005     38
        1.05     86          2144     54            80          7569     41
        1.1      115         2641     66            110         7125     52
        1.15     151         3024     101           140         7621     65
        1.2      188         2813     113           165         7214     71
        1.25     214         3214     161           200         7244     93
0.55    1        87          3115     59            75          8012     44
        1.05     98          2044     65            91          7921     47
        1.1      135         2798     70            113         7655     59
        1.15     177         3211     121           135         6811     78
        1.2      200         2810     110           160         7210     73
        1.25     256         3116     101           211         7580     76
0.6     1        90          2056     70            80          8612     48
        1.05     106         1987     85            95          7321     68
        1.1      211         2001     83            121         7314     79
        1.15     230         2810     100           150         6944     70
        1.2      265         3211     86            177         7211     66
        1.25     288         2840     73            265         7821     64
0.65    1        108         1896     78            90          7885     51
        1.05     122         1925     96            125         7701     65
        1.1      220         2144     115           166         7955     77
        1.15     256         3100     78            240         8002     75
        1.2      299         3254     65            288         8366     80
        1.25     301         3655     55            321         8410     64

6 Conclusions
In this paper we have proposed a dynamic VRP model with stochastic travel and
service times, which can be applied to deal with real-time traffic information in
practice. We derived a bound for the dynamic model based on approximate
linear programming. The approximation depends only on the
expected duration and the description of the real-time traffic information. To
produce good solutions and improve solution efficiency, we have also developed a
greedy heuristic time-directed policy. The numerical experiments show that our
model can generate tighter bounds than the heuristic method and than dynamic and
deterministic models, and that our algorithms are more effective than standard ALP.
Although the proposed approximate programming reduces the number of
decision variables, it is still NP-hard, and the approximation may be difficult to
find. In addition, we have used a greedy policy, instead of exact methods, to improve the
computational efficiency. Thus, in the future, we aim to study how to
obtain the approximate programming in general conditions, and to explore a feasible
exact method.
Acknowledgements This research is supported by the National Natural Science Foundation of China
(No. 71571023).
References
Adelman D (2007) Price-directed control of a closed logistics queueing network. Oper Res
55(6):1022–1038
Adelman D, Klabjan D (2012) Computing near-optimal policies in generalized joint replenishment.
INFORMS J Comput 24(1):148–164
An S, Cui N, Bai Y, Xie W, Chen M, Ouyang Y (2015) Reliable emergency service facility location under
facility disruption, en-route congestion and in-facility queuing. Transp Res Part E Logist Transp Rev
82:199–216
Bell MG (2006) Mixed route strategies for the risk-averse shipment of hazardous materials. Netw Spat
Econ 6(3–4):253–265
Christofides N, Mingozzi A, Toth P (1981) Exact algorithms for the vehicle routing problem, based on
spanning tree and shortest path relaxations. Math Program 20(1):255–282
Dabia S, Ropke S, Van Woensel T, De Kok T (2013) Branch and price for the time-dependent vehicle
routing problem with time windows. Transp Sci 47(3):380–396
Dantzig GB, Ramser JH (1959) The truck dispatching problem. Manag Sci 6(1):80–91
Dixit VV, Harb RC, Martı́nez-Correa J, Rutström EE (2015) Measuring risk aversion to guide
transportation policy: contexts, incentives, and respondents. Transp Res Part A Policy Pract
80:15–34
Errico F, Desaulniers G, Gendreau M, Rei W, Rousseau LM (2016) A priori optimization with recourse
for the vehicle routing problem with hard time windows and stochastic service times. Eur J Oper Res
249(1):55–66
Fu L (2002) Scheduling dial-a-ride paratransit under time-varying, stochastic congestion. Transp Res Part
B Methodol 36(6):485–506
Ghannadpour SF, Noori S, Tavakkoli-Moghaddam R (2013) Multiobjective dynamic vehicle routing
problem with fuzzy travel times and customers satisfaction in supply chain management. Eng
Manag IEEE Trans 60(4):777–790
Gheorghe AV, Birchmeier J, Vamanu D, Papazoglou I, Kröger W (2005) Comprehensive risk assessment
for rail transportation of dangerous goods: a validated platform for decision support. Reliab Eng Syst
Saf 88(3):247–272
Goberna MA, López MA (1998) Linear semi-infinite optimization. Wiley, Chichester, England
Golden BL, Raghavan S, Wasil EA (2008) The vehicle routing problem: latest advances and new
challenges, vol 43. Springer, Berlin
Han J, Lee C, Park S (2013) A robust scenario approach for the vehicle routing problem with uncertain
travel times. Transp Sci 48(3):373–390
Jaillet P, Qi J, Sim M (2016) Routing optimization under uncertainty. Oper Res 64(1):186–200
Kang Y, Batta R, Kwon C (2014) Value-at-risk model for hazardous material transportation. Ann Oper
Res 222(1):361–387
Kok AL, Meyer CM, Kopfer H, Schutten JMJ (2010) A dynamic programming heuristic for the vehicle
routing problem with time windows and european community social legislation. Transp Sci
44(4):442–454
Lambert V, Laporte G, Louveaux F (1993) Designing collection routes through bank branches. Comput
Oper Res 20(7):783–791
Laporte G (2007) What you should know about the vehicle routing problem. Nav Res Logist (NRL)
54(8):811–819
Laporte G (2009) Fifty years of vehicle routing. Transp Sci 43(4):408–416
Laporte G (1992) The vehicle routing problem: an overview of exact and approximate algorithms. Eur J
Oper Res 59(3):345–358
Laporte G, Louveaux F, Mercure H (1992) The vehicle routing problem with stochastic travel times.
Transp Sci 26(3):161–170
Lei H, Laporte G, Guo B (2012) A generalized variable neighborhood search heuristic for the capacitated
vehicle routing problem with stochastic service times. Top 20(1):99–118
Li X, Tian P, Leung SC (2010) Vehicle routing problems with time windows and stochastic travel and
service times: models and algorithm. Int J Product Econ 125(1):137–145
Nguyen V, Jiang J, Ng K, Teo K (2016) Satisficing measure approach for vehicle routing problem with
time windows under uncertainty. Eur J Oper Res 248(2):404–414
Novoa C, Storer R (2009) An approximate dynamic programming approach for the vehicle routing
problem with stochastic demands. Eur J Oper Res 196(2):509–515
Powell WB (2009) Approximate dynamic programming: solving the curses of dimensionality. 2nd edn,
Wiley Series in Probability and Statistics, New Jersey
Pillac V, Gendreau M, Guéret C, Medaglia AL (2013) A review of dynamic vehicle routing problems. Eur
J Oper Res 225(1):1–11
Psaraftis HN (1980) A dynamic programming solution to the single vehicle many-to-many immediate
request dial-a-ride problem. Transp Sci 14(2):130–154
Psaraftis HN (1995) Dynamic vehicle routing: status and prospects. Ann Oper Res 61(1):143–164
Ritzinger U, Puchinger J, Hartl RF (2015) A survey on dynamic and stochastic vehicle routing problems.
Int J Product Res 54(1):1–17
Sebő A (2013) Eight-fifth approximation for the path TSP. In: Integer programming and combinatorial
optimization. Springer, pp 362–374
Secomandi N (2001) A rollout policy for the vehicle routing problem with stochastic demands. Oper Res
49(5):796–802
Shapiro A, Dentcheva D et al (2014) Lectures on stochastic programming: modeling and theory, vol 16.
SIAM
Sun L, Karwan MH, Kwon C (2015) Robust hazmat network design problems considering risk
uncertainty. Transp Sci 50(4):1188–1203
Sungur I, Ren Y, Ordóñez F, Dessouky M, Zhong H (2010) A model and algorithm for the courier
delivery problem with uncertainty. Transp Sci 44(2):193–205
Topaloglu H, Powell WB (2006) Dynamic-programming approximations for stochastic time-staged
integer multicommodity-flow problems. INFORMS J Comput 18(1):31–42
Topaloglu H, Powell WB (2007) Sensitivity analysis of a dynamic fleet management model using
approximate dynamic programming. Oper Res 55(2):319–331
Toriello A (2014) Optimal toll design: a lower bound framework for the asymmetric traveling salesman
problem. Math Program 144(1–2):247–264
Toriello A, Haskell WB, Poremba M (2014) A dynamic traveling salesman problem with stochastic arc
costs. Oper Res 62(5):1107–1125
Toumazis I, Kwon C (2013) Routing hazardous materials on time-dependent networks using conditional
value-at-risk. Transp Res Part C Emerg Technol 37:73–92
Toumazis I, Kwon C (2015) Worst-case conditional value-at-risk minimization for hazardous materials
transportation. Transp Sci 50(4):1174–1187
Xiao L, Lo HK (2013) Adaptive vehicle routing for risk-averse travelers. Transp Res Part C Emerg
Technol 36:460–479
Zhao H, Zhang C, Gao Z, Si B (2013) Risk-based transit schedule design for a fixed route from the view
of equity. J Transp Eng 139(11):1086–1094