
Accepted Manuscript

A Hybrid Dynamic Programming and Memetic Algorithm to the Traveling Salesman Problem with Hotel Selection

Yongliang Lu, Una Benlic, Qinghua Wu

PII: S0305-0548(17)30231-9
DOI: 10.1016/j.cor.2017.09.008
Reference: CAOR 4320

To appear in: Computers and Operations Research

Received date: 14 April 2017
Revised date: 7 September 2017
Accepted date: 9 September 2017

Please cite this article as: Yongliang Lu, Una Benlic, Qinghua Wu, A Hybrid Dynamic Programming and Memetic Algorithm to the Traveling Salesman Problem with Hotel Selection, Computers and Operations Research (2017), doi: 10.1016/j.cor.2017.09.008

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

A Hybrid Dynamic Programming and Memetic Algorithm to the Traveling Salesman Problem with Hotel Selection

Yongliang Lu a, Una Benlic b,c, Qinghua Wu a,∗

a School of Management, Huazhong University of Science and Technology, No. 1037, Luoyu Road, Wuhan, China, email: luyongliang@hust.edu.cn; qinghuawu1005@gmail.com
b School of Electronic Engineering and Computer Science, Queen Mary University of London, London, email: u.benlic@qmul.ac.uk
c University of Electronic Science and Technology of China, North Jianshe Road, Sichuan 610054

Abstract

The Traveling Salesman Problem with Hotel Selection (TSPHS) is a variant of the classic Traveling Salesman Problem. It arises from a number of real-life applications in which the maximum travel time for each "day trip" is limited. In this paper, we present a highly effective hybrid between dynamic programming and a memetic algorithm for TSPHS. The main features of the proposed method include a dynamic programming approach that finds an optimal hotel sequence for a given tour, three dedicated crossover operators for solution recombination, an adaptive rule for crossover selection, and a two-phase local refinement procedure that alternates between feasible and infeasible searches. Experiments on four sets of 131 benchmark instances from the literature show a remarkable performance of the proposed approach. In particular, it finds improved best solutions for 22 instances and matches the best-known results for 103 instances. Additional analyses highlight the contribution of the dynamic programming approach, the joint use of crossovers and the two local search phases to the performance of the proposed algorithm.

Keywords: dynamic programming; the traveling salesman problem; infeasible local search

∗ Corresponding author.

1 Introduction

The Traveling Salesman Problem with Hotel Selection (TSPHS), originally proposed by Vansteenwegen et al. [35], is an extension of the classic Traveling Salesman Problem (TSP) where a salesman is required to visit a set of cities (customers) with a limit on the number of working hours per day. As a salesman often cannot visit all customers in a single trip, he needs to select a hotel after each trip such that the hotel location where the salesman ends one day is the same as the hotel location where the salesman starts the next day. We use the term "trip" to refer to an ordered set of customers visited in a day together with a starting and an ending hotel, while a "tour" denotes an ordered set of trips covering all the customers. More formally, the problem is defined over a complete weighted graph G = (V, E) with a set of vertices V = C ∪ H, where H = {1, ..., m} is the set of m hotel locations and C = {m+1, ..., m+n} is the set of n customer locations. Each edge (i, j) ∈ E (i, j ∈ V) is associated with a weight t_{ij} that represents the required travel time between the two corresponding locations i, j ∈ V. Furthermore, each customer i is associated with a service time τ_i (where ∀i ∈ H, τ_i = 0).
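To make the notation above concrete, the following minimal C++ sketch shows one possible in-memory representation of a TSPHS instance; the struct and field names are illustrative assumptions and are not taken from the paper.

#include <vector>

// Illustrative (not from the paper) representation of a TSPHS instance:
// vertices 0..m-1 are hotels, m..m+n-1 are customers.
struct TsphsInstance {
    int m;                                  // number of hotel locations
    int n;                                  // number of customer locations
    std::vector<std::vector<double>> t;     // t[i][j]: travel time between locations i and j
    std::vector<double> tau;                // tau[i]: service time (0 for every hotel)
    double T;                               // time budget of a single day trip

    bool isHotel(int v) const { return v < m; }
};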

The primary objective of the problem is to minimize the number of trips in a tour (the required number of working days), while the secondary objective is to minimize the total travel time of the tour, such that the total travel time per trip does not exceed a maximum limit T (also termed the time budget).

In addition to its theoretical significance as an NP-hard combinatorial problem, TSPHS is notable for its ability to formulate a number of practical and important problems. A simple TSPHS application is the problem of finding an optimal path for an electric vehicle visiting all customers in a transportation network [28]. The path can be divided into trips whose maximal duration is constrained by the battery charge. At the end of each trip, the electric vehicle must find a charging station to recharge before its battery runs out. Another application closely related to TSPHS arises from the combination of multi-day tour planning for truck drivers and the refueling problem for gasoline-powered vehicles, where drivers must decide at which station to refuel in order to minimize the total cost of fuel. While the selection of refueling stations is based on the price of fuel, the selection of hotels in TSPHS is based solely on the distance to the particular hotel and not on its cost.

This study introduces a highly effective hybrid between dynamic programming and memetic search (denoted as HDM) for TSPHS that integrates a number of distinguishing and novel ingredients: (i) a fast dynamic programming approach to optimally partition a given tour into a set of feasible trips, which is also incorporated into the local refinement procedure of HDM for further improvement of produced solutions; (ii) three dedicated crossover operators for generating high-quality offspring solutions; (iii) an adaptive rule for crossover selection; (iv) a local refinement procedure that alternates between a feasible and an infeasible search for an effective exploration of the search space; and (v) a simple quality-based population update strategy for population management. From the computational perspective, the proposed hybrid approach shows very competitive performance on 4 sets of 131 commonly used benchmark instances from the literature [35]. We further provide experimental analyses to highlight the relevance of the novel dynamic programming approach, the joint use of crossovers and the combination of two local search phases to the performance of the proposed algorithm.

The remainder of this paper is organized as follows. Section 2 provides a short review of the existing solution methods for TSPHS and a brief summary of related node routing problems. Section 3 presents the algorithmic framework and details the building blocks of the proposed approach. Section 4 is dedicated to computational results and comparisons with the best-performing TSPHS approaches from the literature. Section 5 investigates some important ingredients of the HDM algorithm, followed by conclusions in Section 6.
2 Literature review

This section provides a brief overview of related node routing problems and a summary of the most recent advances in solution methods for TSPHS.

As TSPHS is a node routing problem that involves the planning of multiple trips in a tour, it shares some similarities with several other routing problems dealing with multiple-route planning, including the multiple Traveling Salesman Problem (mTSP) [27], the Vehicle Routing Problem (VRP) [33] and the Multi-Depot Vehicle Routing Problem (MDVRP) [9]. Given a predefined number of salesmen, all starting and ending at the same depot, mTSP is to determine trips for each salesman such that each customer is visited exactly once with a minimal total travel length. In VRP, the objective is to find an optimal set of routes for a fleet of vehicles with limited capacities to cover all the customers, such that each route begins and ends at the same depot. MDVRP is a variation of VRP with multiple available depots, each with a fleet of vehicles. Although similar, two main features distinguish TSPHS from these three problems: (i) while a trip in TSPHS may start at one hotel and end at another, 'trips' in the other three problems have to start and end at the same depot, and (ii) since there is only one vehicle or salesman in TSPHS, all trips have to be connected.

The constraint that stipulates that all routes must start and end at the same depot is relaxed in a class of routing problems with intermediate facilities (IF). Similar to TSPHS, these problems allow a route to be split into trips that may start from and end at a different IF. Several routing problems with IFs have been considered in the literature [18,11,32,1]. The Waste Collection Vehicle Routing Problem with Time Windows (WCVRPTW) [18] consists of a set of customers at which waste is collected, a set of waste disposal facilities and an unlimited number of vehicles located at a single depot. Departing empty from the depot, vehicles repeatedly collect waste from customers and empty it at waste disposal facilities. At the end of a work shift, vehicles return empty to the depot. The problem further includes several practical constraints, such as time windows, drivers' lunch breaks, workload balancing, vehicle capacity and route compactness. The work in [11] tackles routing problems with IFs and introduces a Multi-Depot Vehicle Routing Problem with Inter-Depot Routes (MDVRPI), which considers intermediate depots where vehicles can be replenished with goods during the course of a rotation. The set of all the routes traversed by a vehicle of limited capacity is called a rotation, which corresponds to the term "tour" used in TSPHS. A route may start and end at the same depot or at different depots, and the total duration of a rotation cannot exceed a certain time limit. Other interesting routing problems with IFs include the Vehicle Routing Problem with Intermediate Replenishment Facilities (VRPIRF) [32] and the Periodic Vehicle Routing Problem with Intermediate Facilities (PVRP-IF) [1]. Although these node routing problems with IFs exhibit some similarities with TSPHS, they differ from TSPHS in two important features: (i) in TSPHS, the length of each trip is constrained by a time budget and the capacity of the salesman is not defined, while the length of a trip (route) in these routing problems with IFs is limited by vehicle capacity and no explicit time limit constraint is applied to the route length; instead, in some of these problems, the time budget constraint applies to the whole tour (rotation) but not to a trip (route) traversed by a vehicle; (ii) in TSPHS, the primary objective is to minimize the number of trips, rather than to minimize the total travel distance.
Another routing problem involving hotel selection is the Orienteering Problem with Hotel Selection (OPHS) [13], which can be viewed as a variant of TSPHS. The problem contains a set of vertices, each with a score, and a set of hotels. The objective of OPHS is to determine a tour visiting a selection of vertices so as to maximize the sum of the collected scores. An OPHS tour can be divided into a fixed number of connected trips, each with a limited duration and starting from and ending at one of the hotels. Unlike in TSPHS, not all the vertices need to be visited and the number of trips is limited beforehand.

We next briefly summarize the solution methods proposed in the literature for TSPHS. The problem was first introduced by Vansteenwegen et al. [35], together with a set of benchmark instances of varying sizes. The article also proposed a two-index formulation for TSPHS that can be solved optimally by CPLEX (10.0) for instances with up to 40 nodes. To approximately solve TSPHS, the authors further provided an iterated local search algorithm that uses two initialization methods and an improvement approach. In [4], Castro et al. presented a new integer programming model that modifies the formulation previously proposed in [35], and provided a powerful memetic algorithm which combines a random one-point crossover operator based on hotel exchange with a tabu search procedure. According to the results reported in [4], the memetic algorithm produces highly competitive results on four sets of commonly used instances. To the best of our knowledge, it constitutes the best-performing existing heuristic for TSPHS. In [5], Castro et al. proposed a set-partitioning formulation for TSPHS and an effective heuristic approach, which combines an order-first split-second construction method with a fast local search improvement. Finally, Sousa et al. [31] presented a variable neighborhood search method for TSPHS.
3 Hybrid between dynamic programming and memetic search (HDM)
This section presents the main framework of the proposed hybrid for TSPHS, and provides a detailed description of its algorithmic components. The key elements of HDM include: (i) a dynamic programming procedure for optimal hotel sequencing in a given TSP tour, which is used for generating initial solutions and for further improvement of a TSPHS tour; (ii) a procedure for generating an initial population of solutions; (iii) three crossover operators for creating offspring solutions; (iv) an adaptive mechanism for crossover selection; (v) a local refinement procedure for offspring improvement; and (vi) a population updating rule for managing the population.
3.1 Main scheme


The general idea behind memetic approaches (MA) [22] is to combine the advantages of both crossover, for exploration of promising search space regions, and local search, for concentrating the search around these regions. Given an initial population consisting of locally optimal solutions, an MA creates a new set of improved local optima by means of a crossover and/or a mutation operator, followed by local refinement. Aside from their application to TSPHS [4], MAs have been successfully used for tackling many node routing problems including the vehicle routing problem, the capacitated arc routing problem and general routing problems with nodes, edges and arcs [25].

A distinguishing feature of the proposed HDM algorithm is the alternation between feasible and infeasible search space regions. Each solution in the population involves the same number of trips, and is initialized with one of two solution construction procedures, C1 or C2 (see Section 3.3). Each time a feasible solution with a smaller number of trips is found, HDM constructs a new initial population consisting of solutions with a reduced number of trips. This is achieved by introducing the newly found feasible solution into the population and by rebuilding the other members of the population using construction method C2 described in Section 3.3.
Algorithm 1 Main framework of HDM for TSPHS
Require: A TSPHS instance
Ensure: Best solution s∗ found during the search
1: Initialize population P
2: s∗ ← Best(P) /*s∗ records the best solution found so far*/
3: while stop condition is not met do
4:    Randomly select p1 and p2 from P
5:    (o1, o2) ← Crossover(p1, p2)
6:    for each offspring o ∈ {o1, o2} do
7:       Local_refinement(o)
8:       Update s∗ with o if applicable
9:       Update population P with o if applicable
10:   end for
11: end while
12: Return s∗

The general framework of the proposed HDM, evolving a population of solutions with a fixed number of trips, is summarized in Algorithm 1. After population initialization, the algorithm enters the main optimization process, which is mainly driven by recombination operators and by a local refinement procedure. At each generation of HDM, two randomly chosen parents are mated to generate two new offspring solutions by means of one of the three dedicated crossover operators. One crossover is based on hotel sequence exchange, while the other two are based on both customer sequence and hotel sequence exchanges. An adaptive selection mechanism is applied to choose among the three crossovers. Each newly created offspring is further improved by the local refinement procedure. Finally, the population updating rule decides whether the resulting solution should be inserted into the population and which existing solution should be replaced. The algorithm terminates when a time limit has been reached. The following subsections provide a detailed description of the HDM components.

3.2 A dynamic programming approach for optimal hotel sequencing in a tour

Given a TSP tour which starts and ends at a predefined hotel and visits the complete set of n customers, we use a dynamic programming (DP) approach to optimally partition the tour into a set of feasible trips during the population initialization phase (see Section 3.3). It is also incorporated into the local refinement procedure to join two consecutive trips during the Join-trips move (see Section 3.5.1) and to re-optimize the hotel sequence at the end of local refinement (see Section 3.5.3). The DP approach can be more formally described as follows. Let S = (s_0, s_1, ..., s_{n+1}) be the sequence of nodes forming a TSP tour, where s_1, ..., s_n represent the n customers while s_0 and s_{n+1} denote a starting and an ending hotel, respectively. Moreover, let T be the maximum duration of a day trip, let t_{ij} be the travel time between two locations i and j, and let τ_i be the service time at location i. The DP approach maintains a T × (n+2) matrix M, where each entry M[t, v] denotes the optimal travel cost that can be attained when arriving at the v-th node in S with t time still left for a trip. The travel cost is composed of two parts: (i) the total travel time to reach location s_v, including both the travel time and the customer service time, and (ii) the total cost of staying at hotels. To minimize the number of intermediate hotels, a large fixed value δ is added to the travel cost each time a hotel is visited.
Initially, all entries of M are set to an infinite cost, with the exception of the first entry M[T, 0] (corresponding to the starting point s_0 of the travel), which is set to zero. Starting from entry M[T, 0], DP computes the matrix in ascending order with respect to column v. For each column v, the algorithm extends each entry M[t, v] ≠ ∞ (0 ≤ t ≤ T) of column v to column v+1 using the following two forward recursion methods:

R1: If (t_{s_v,s_{v+1}} + τ_{s_{v+1}}) ≤ t and (M[t, v] + t_{s_v,s_{v+1}} + τ_{s_{v+1}}) < M[t − t_{s_v,s_{v+1}} − τ_{s_{v+1}}, v+1], then M[t − t_{s_v,s_{v+1}} − τ_{s_{v+1}}, v+1] = M[t, v] + t_{s_v,s_{v+1}} + τ_{s_{v+1}}. This recursion represents the case where the salesman goes directly from location s_v to s_{v+1}.

R2: For each hotel h ∈ H, if t ≥ t_{s_v,h}, T ≥ (t_{h,s_{v+1}} + τ_{s_{v+1}}) and (M[t, v] + t_{s_v,h} + t_{h,s_{v+1}} + τ_{s_{v+1}} + δ) < M[T − t_{h,s_{v+1}} − τ_{s_{v+1}}, v+1], then M[T − t_{h,s_{v+1}} − τ_{s_{v+1}}, v+1] = M[t, v] + t_{s_v,h} + t_{h,s_{v+1}} + τ_{s_{v+1}} + δ. This recursion represents the case where the salesman leaves location s_v and chooses to stay at hotel h before visiting s_{v+1}.
The above forward recursion methods determine the values of each M[t, k] (0 ≤ t ≤ T, 1 ≤ k ≤ n+1). The entry with the minimum M value in the last column of the matrix corresponds to the optimal TSPHS tour. The hotel sequence achieving the optimal travel cost is easily recovered by applying a backtrack procedure. The steps performed by DP are presented in Algorithm 2. N[t, v] ≠ ∞ implies that the salesman leaves location s_{v−1} and chooses to stay at hotel N[t, v] before visiting s_v; otherwise, the salesman goes directly from location s_{v−1} to s_v.

Algorithm 2 Dynamic Programming (DP) procedure
Require: TSP tour S = (s_0, s_1, ..., s_{n+1}); available intermediate hotels H = {1, ..., m}
Ensure: Feasible TSPHS tour S′
1: S′ ← S
2: for v = 0, ..., n+1 do
3:    for t = 0, ..., T do
4:       M[t, v] ← ∞, L[t, v] ← ∞, N[t, v] ← ∞
5:    end for
6: end for
7: M[T, 0] ← 0
   /* forward recursion procedure */
8: for v = 0, ..., n do
9:    for t = 1, ..., T do
10:      if M[t, v] ≠ ∞ then
11:         if (t_{s_v,s_{v+1}} + τ_{s_{v+1}}) ≤ t and (M[t, v] + t_{s_v,s_{v+1}} + τ_{s_{v+1}}) < M[t − t_{s_v,s_{v+1}} − τ_{s_{v+1}}, v+1] then
12:            M[t − t_{s_v,s_{v+1}} − τ_{s_{v+1}}, v+1] ← M[t, v] + t_{s_v,s_{v+1}} + τ_{s_{v+1}}
13:            N[t − t_{s_v,s_{v+1}} − τ_{s_{v+1}}, v+1] ← ∞
14:            L[t − t_{s_v,s_{v+1}} − τ_{s_{v+1}}, v+1] ← t
15:         end if
16:         for each hotel h ∈ H do
17:            if t ≥ t_{s_v,h}, T ≥ (t_{h,s_{v+1}} + τ_{s_{v+1}}) and (M[t, v] + t_{s_v,h} + t_{h,s_{v+1}} + τ_{s_{v+1}} + δ) < M[T − t_{h,s_{v+1}} − τ_{s_{v+1}}, v+1] then
18:               M[T − t_{h,s_{v+1}} − τ_{s_{v+1}}, v+1] ← M[t, v] + t_{s_v,h} + t_{h,s_{v+1}} + τ_{s_{v+1}} + δ
19:               N[T − t_{h,s_{v+1}} − τ_{s_{v+1}}, v+1] ← h
20:               L[T − t_{h,s_{v+1}} − τ_{s_{v+1}}, v+1] ← t
21:            end if
22:         end for
23:      end if
24:   end for
25: end for
26: Select t ∈ arg min_{0≤t≤T} M[t, n+1]
   /* backtrack procedure */
27: for v = n+1, ..., 1 do
28:   if N[t, v] ≠ ∞ then
29:      Insert intermediate hotel N[t, v] between locations s′_{v−1} and s′_v
30:   end if
31:   t ← L[t, v]
32: end for
33: Return S′
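To complement the pseudo-code, the following C++ sketch translates the forward recursion (R1 and R2) and the backtrack of Algorithm 2, assuming integer travel and service times (see the scaling remark at the end of this subsection). It omits the acceleration techniques described next; the function and variable names are illustrative and a feasible split is assumed to exist.

#include <limits>
#include <vector>

// Sketch of Algorithm 2: optimally insert intermediate hotels into a fixed
// TSP tour S = (s_0, ..., s_{n+1}).  Returns, for every position v >= 1, the
// hotel visited just before s_v, or -1 if the salesman travels directly.
std::vector<int> dpHotelSplit(const std::vector<int>& S,
                              const std::vector<int>& hotels,
                              const std::vector<std::vector<int>>& t,   // travel times
                              const std::vector<int>& tau,              // service times
                              int T, long long delta)
{
    const long long INF = std::numeric_limits<long long>::max() / 4;
    const int n2 = static_cast<int>(S.size());                          // n + 2 nodes
    std::vector<std::vector<long long>> M(T + 1, std::vector<long long>(n2, INF));
    std::vector<std::vector<int>> N(T + 1, std::vector<int>(n2, -1));   // hotel before s_{v}
    std::vector<std::vector<int>> L(T + 1, std::vector<int>(n2, -1));   // predecessor time index

    M[T][0] = 0;                                      // start at s_0 with a full budget
    for (int v = 0; v + 1 < n2; ++v) {
        for (int tt = 0; tt <= T; ++tt) {
            if (M[tt][v] == INF) continue;
            const int nxt = S[v + 1];
            const int step = t[S[v]][nxt] + tau[nxt];
            // R1: go directly from s_v to s_{v+1}
            if (step <= tt && M[tt][v] + step < M[tt - step][v + 1]) {
                M[tt - step][v + 1] = M[tt][v] + step;
                N[tt - step][v + 1] = -1;
                L[tt - step][v + 1] = tt;
            }
            // R2: sleep at an intermediate hotel h before visiting s_{v+1}
            for (int h : hotels) {
                const int afterHotel = t[h][nxt] + tau[nxt];
                if (tt < t[S[v]][h] || afterHotel > T) continue;
                const long long cost = M[tt][v] + t[S[v]][h] + afterHotel + delta;
                if (cost < M[T - afterHotel][v + 1]) {
                    M[T - afterHotel][v + 1] = cost;
                    N[T - afterHotel][v + 1] = h;
                    L[T - afterHotel][v + 1] = tt;
                }
            }
        }
    }
    // Pick the cheapest entry in the last column and backtrack the hotel choices.
    int best = 0;
    for (int tt = 1; tt <= T; ++tt)
        if (M[tt][n2 - 1] < M[best][n2 - 1]) best = tt;
    std::vector<int> hotelBefore(n2, -1);
    for (int v = n2 - 1, tt = best; v >= 1; --v) {
        hotelBefore[v] = N[tt][v];
        tt = L[tt][v];
    }
    return hotelBefore;
}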

We further propose several acceleration techniques to speed up the DP procedure. The first acceleration technique is a filtering strategy that omits hotels that cannot be used to extend the TSP path to an optimal TSPHS tour. More precisely, when extending all entries in column i to entries in column i+1 using recursion method R2 (i.e., extending the TSPHS tour from s_i to s_{i+1} by visiting an intermediate hotel), some hotels can be directly omitted from consideration instead of examining all the hotels during this procedure. Let h∗ be the hotel with minimal travel time to s_{i+1}. Inserting a hotel h ∈ H between s_i and s_{i+1} cannot extend the given TSP tour to an optimal TSPHS solution if t_{s_i,h} > t_{s_i,h∗}, since the partial route (s_i, h∗, s_{i+1}) would lead to a smaller travel cost.

The second acceleration technique uses dominance rules to avoid extending dominated entries, which cannot lead to the optimal TSPHS tour. Each time we extend all entries in column i to generate new entries in column i+1, we only need to extend the non-dominated entries of column i. Let M[t1, i] and M[t2, i] be two entries in the same column i; M[t1, i] dominates M[t2, i] if t1 > t2 and M[t1, i] < M[t2, i], since M[t1, i] represents a shorter path from s_0 to s_i with more remaining time to continue the journey. In our implementation, we speed up the identification of dominated entries in a given column i by first considering entries with larger values of t. Furthermore, we maintain a variable m_latest that records the cost of the last extended entry in column i. Each time we move to a new entry, if its cost value is larger than m_latest, we ignore it and move to the next entry in the column; otherwise, we extend it and update m_latest with the cost value of this entry. In this way, we can easily avoid extending the dominated entries.
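A compact sketch of how this scan can be organized is given below; columnCost and extendEntry are illustrative placeholders for the column of M under consideration and for the application of R1/R2, respectively.

#include <functional>
#include <limits>
#include <vector>

// Sketch of the dominance filter: scanning column i from larger to smaller
// remaining time t, an entry is extended only if its cost is strictly smaller
// than the cost of the last extended entry (mLatest); all other entries are
// dominated (or unreached) and skipped.  columnCost[t] stands for M[t, i].
void extendNonDominated(const std::vector<long long>& columnCost, int T,
                        const std::function<void(int)>& extendEntry) {
    long long mLatest = std::numeric_limits<long long>::max();
    for (int t = T; t >= 0; --t) {
        if (columnCost[t] >= mLatest) continue;   // dominated or unreached entry
        extendEntry(t);                           // apply recursions R1 and R2 from M[t, i]
        mLatest = columnCost[t];
    }
}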
The recursion methods R1 and R2 defined above require the travel time t_{ij} between two locations i and j to take an integer value. In our work, we use the same sets of benchmark instances as in [35], where some instances were generated by rounding the travel time t_{ij} to the nearest whole (integer) number. The proposed DP approach can be applied directly to these instances. In the other instances from [35], t_{ij} is a decimal number rounded to the nearest tenth. In order to apply our DP approach to these instances, we first multiply t_{ij} by 10 to convert it into an integer. The customer service times (τ) and the maximum duration of a day trip (T) are also multiplied by 10.

3.3 Population initialization


To generate an initial diversified population consisting of high-quality but not necessarily feasible solutions, we use a population initialization procedure similar to that used in [4]. Each solution is obtained with one of two construction methods. The first solution in the population is generated with the first construction method (denoted as C1), adopted from the work in [4]. C1 initially finds a high-quality TSP tour by means of the Lin-Kernighan heuristic [20], and then optimally partitions the TSP tour into feasible trips using the DP approach described in Section 3.2. It is worth noting that the partitioning of the TSP tour with the original C1 in [4] is performed with a split method inspired by the work in [24] and based on the work by Beasley [2]. The other solutions in the population are generated with the second construction method (denoted as C2), which is identical to the one in [4]. C2 builds a TSPHS tour from a random sequence of |H| intermediate hotels. Starting from the first trip, C2 inserts customers into each trip one by one, until no new customer can be added without violating the time budget. The order in which customers are added to a trip is determined with the approach proposed in [8]. Finally, customers that are not included in any of the trips are added to the tour at the best position, i.e., the one that results in the least extra travel time. Note that a solution generated with C2 may not be feasible due to a poor sequence of intermediate hotels. Each newly constructed solution is improved with the local refinement procedure described in Section 3.5, and introduced into the population if its hotel sequence differs from that of every solution currently in the population.
3.4 Crossover operator

At each generation of our HDM algorithm, a crossover operator is applied to create new offspring individuals by recombining two parent solutions selected from the population at random. It is commonly recognized that the success of evolutionary algorithms crucially depends on the recombination operator, which should not only generate offspring solutions that differ from their parents but should also transfer quality elements (building blocks) from parents to offspring. Our HDM algorithm employs three crossover operators and uses a probabilistic mechanism for crossover selection. The first crossover operator CX1 is based on the exchange of hotel subsequences, while the other two operators, CX2 and CX3, are based on the exchange of both hotel and customer subsequences. The analysis presented in Section 5.2 reveals that the combined use of the three crossovers constitutes an important ingredient of the proposed algorithm.

The three crossovers are detailed below.

CX1: This crossover is a single-point random operator. It has previously been used for TSPHS within a memetic framework [4]. The operator simply swaps a subset of hotels between two randomly selected parent solutions, while leaving the customer sequences unchanged. More precisely, the customer sequence of offspring o1 is inherited from parent p1, while the customer sequence of o2 is copied from parent p2. To exchange the hotel subsequences between the two parent solutions, a crossover point is first selected at random to divide the hotel sequences of both parents into two parts. The first part of the hotel sequence from p1 is then combined with the second part of the hotel sequence from p2 to generate the hotel sequence for o1. The remaining two parts of the hotel sequences from the two parents are combined to build the hotel sequence for o2.

CX2: To generate more diversified offspring solutions, CX2 exchanges both the hotel and the customer sequences between two selected parent solutions. Given two parents p1 and p2, crossover CX2 proceeds in three steps. First, the hotel sequences of the selected parents are recombined using CX1 to create two new hotel sequences for the offspring solutions, denoted as Ho1 and Ho2 respectively. Next, the customer sequences of the parents are recombined using a linear-order-based crossover to generate two new TSP tours, denoted as To1 and To2. Finally, to obtain a full TSPHS tour for offspring o1, the hotels in Ho1 are added to the TSP tour To1 at the same positions as in p1 and following the order imposed by Ho1. The TSPHS tour for offspring o2 is generated with the same procedure.
Fig. 1. Process of the Crossover CX2 operator

The linear-order-based crossover is a widely used crossover operator for both TSP [23] and VRP [24]. Given the two customer sequences Tp1 and Tp2 of the randomly selected parent solutions, the linear-order-based crossover proceeds as follows. First, two cutting points are selected at random. The customer sequence To1 of the first offspring is obtained by transferring the subsequence between the two cutting points from Tp1 to To1, and by copying the remaining customers from Tp2 in the same order. The customer sequence To2 of the second offspring is obtained in the same way. Fig. 1 gives an example of a crossover operation with CX2 on two selected parent solutions p1 = {h1p1, c8, c6, c7, h2p1, c5, c1, h3p1, c2, c3, c4, h4p1} and p2 = {h1p2, c5, c8, h2p2, c2, c3, c4, h3p2, c6, c7, c1, h4p2}. First, the single-point crossover CX1 is applied to the hotel sequences of the parent solutions, resulting in two hotel sequences for the offspring solutions, Ho1 = {h1p1, h2p1, h3p2, h4p2} and Ho2 = {h1p2, h2p2, h3p1, h4p1}. Second, the linear-order-based crossover is applied to create new TSP tours To1 and To2 for the offspring solutions. It starts by selecting two cutting points at random: the first cut is between positions 3 and 4, while the second is between positions 6 and 7. Next, the customers between the two cutting points of Tp1 and Tp2 (i.e., {c5, c1, c2} and {c3, c4, c6}) are kept at their positions in To1 and To2 respectively, while the remaining customers of To1 are copied in the order in which they appear in Tp2, and those of To2 in the order in which they appear in Tp1. The obtained customer TSP tours for offspring o1 and o2 are To1 = {c8, c3, c4, c5, c1, c2, c6, c7} and To2 = {c8, c7, c5, c3, c4, c6, c1, c2}. Finally, the hotels in Ho1 are added to To1 according to the hotel positions in p1 and following the order imposed by Ho1. Since the hotel positions in p1 are 1, 5, 8 and 12, we fill these positions in o1 with the hotels in Ho1. Thus, the resulting offspring solution o1 is {h1p1, c8, c3, c4, h2p1, c5, c1, h3p2, c2, c6, c7, h4p2}. The other offspring solution o2 = {h1p2, c8, c7, h2p2, c5, c3, c4, h3p1, c6, c1, c2, h4p1} is obtained in the same way.
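A minimal C++ sketch of this linear-order-based recombination of two customer sequences is given below (customers are encoded as integers; the function name is illustrative). Applied to the tours of Fig. 1 it reproduces To1.

#include <cstddef>
#include <unordered_set>
#include <vector>

// Sketch of the linear-order-based crossover on customer sequences: the child
// keeps the segment [cut1, cut2] of parent a at the same positions and fills
// the remaining positions, from left to right, with the customers of parent b
// in their order of appearance in b.
std::vector<int> linearOrderChild(const std::vector<int>& a, const std::vector<int>& b,
                                  std::size_t cut1, std::size_t cut2) {
    std::vector<int> child(a.size(), -1);
    std::unordered_set<int> kept(a.begin() + cut1, a.begin() + cut2 + 1);
    for (std::size_t i = cut1; i <= cut2; ++i) child[i] = a[i];   // copy the middle segment of a
    std::size_t pos = 0;
    for (int c : b) {                                             // fill the gaps in the order of b
        if (kept.count(c)) continue;
        while (pos < child.size() && child[pos] != -1) ++pos;
        child[pos++] = c;
    }
    return child;
}

// Example of Fig. 1 (cuts between positions 3/4 and 6/7, i.e. indices 3..5):
// linearOrderChild({8,6,7,5,1,2,3,4}, {5,8,2,3,4,6,7,1}, 3, 5)
// yields {8,3,4,5,1,2,6,7}, which matches To1 in the text.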
CX3: Like CX2, CX3 also generates its offspring solutions in three steps. It first exchanges the hotel subsequences between two parent solutions, then exchanges the customer subsequences between the parents, and finally combines the new hotel and customer sequences to form a TSPHS tour. CX3 differs from CX2 mainly in the way the customer sequences of the offspring are created: it employs a subtour-order-based crossover [29] to generate TSP tours covering all customers, while preserving the common subtours of the two parents.

Fig. 2. Common subtours

A subtour refers to a subsequence of a TSP tour with a length greater than or equal to two. Fig. 2 shows an example of the common subtours
in two parent TSP tours. Assume that the two parent TSP tours are
Tp1 = {c8, c6, c7, c5, c1, c2, c3, c4} and Tp2 = {c5, c8, c2, c3, c4, c6, c7, c1}. The
common subtours in Tp1 and Tp2 are {c6, c7} and {c2, c3, c4}.

The subtour-order-based crossover can be summarized with the following general procedure:

(1) Identify the common subtours of Tp1 and Tp2, each time extracting a common subtour with a maximum number of locations shared by Tp1 and Tp2. This process is repeated until no new common subtour can be found. The problem of finding a common subtour with a maximum number of locations can be viewed as the Longest Common Subsequence (LCS) problem [34], which can be solved by Hirschberg's algorithm [16] with a computational complexity of O(|Tp1|).
(2) Copy the common customer subtours to To1 and To2 following the same order as in Tp1 and Tp2.
(3) Copy the rest of the customers from Tp2 to To1 in the order in which they appear in Tp2. In the same way, copy the rest of the customers from Tp1 to To2 following their order in Tp1.

Fig. 3. Process of the Crossover CX3 operator


Figure 3 illustrates the operation of the proposed subtour-order-based crossover
M

operator on two TSP tours Tp1 = {c8, c6, c7, c5, c1, c2, c3, c4} and Tp2 =
{c5, c8, c2, c3, c4, c6, c7, c1}. First, we find all the common subtours of Tp1 and
ED

Tp2 - {c6, c7} and {c2, c3, c4}. Second, we copy the common subtours to To1
and To2 at the same positions as in Tp1 and Tp2 respectively. Third, we copy the
remaining locations {c5, c8, c1} from Tp2 to To1 by following the order of their
appearance in Tp2 , resulting in a TSP tour To1 = {c5, c6, c7, c8, c1, c2, c3, c4}
PT

for the first offspring. In the same way, the TSP tour for the second offspring
To2 = {c8, c5, c2, c3, c4, c6, c7, c1} is obtained.
CE

It is notable that all the three crossovers may lead to infeasible trips, i.e., with
durations exceeding the given time budget. Any infeasibility introduced with
the crossovers is repaired in the subsequent local refinement phase.
AC

At each generation, HDM selects a crossover with an adaptive probabilistic mechanism inspired by Martí et al. [21]. The probability of selecting any of the three crossovers is proportional to the number of high-quality solutions produced with the given crossover during the search. We consider a solution to be of high quality if it is introduced into the population pool during the population update phase. Specifically, the initial probability of selecting each of the three crossovers is set to 1/3, and is dynamically updated to favor the crossover that produces the highest-quality solutions. For each of the three crossovers CXi, we thus maintain an integer counter q_i that counts the number of times a solution created with CXi has been introduced into the population pool. The probability of selecting crossover CXi is then determined with the following formula:

P(i) = (33 + q_i) / (99 + q_1 + q_2 + q_3)

It is easy to see that the higher the value of q_i, the more probable it is that the corresponding crossover is selected.
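The selection rule can be implemented directly with a discrete distribution whose weights are 33 + q_i, which yields exactly the probabilities above; the short C++ sketch below (with illustrative names) shows one way to do this.

#include <array>
#include <random>

// Sketch of the adaptive crossover selection rule: q[i] counts how many
// offspring produced by crossover CX_{i+1} have entered the population, and
// P(i) = (33 + q[i]) / (99 + q[0] + q[1] + q[2]).
int pickCrossover(const std::array<int, 3>& q, std::mt19937& rng) {
    std::array<double, 3> weight;
    for (int i = 0; i < 3; ++i) weight[i] = 33.0 + q[i];   // proportional to 33 + q_i
    std::discrete_distribution<int> dist(weight.begin(), weight.end());
    return dist(rng);                                      // 0, 1 or 2
}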

3.5 Local refinement procedure
Algorithm 3 Local refinement procedure
Require: A TSPHS instance, an initial solution s0
Ensure: Best solution sbest found so far
1: sbest ← s0
   /*Infeasible local search phase*/
2: sbest ← InfLS(s0) /*Section 3.5.2*/
3: if sbest is feasible then
4:    s0 ← sbest
      /*Feasible local search phase*/
5:    sbest ← FLS(s0) /*Section 3.5.3*/
6: end if
7: Return sbest
Another key component of our HDM algorithm is the local refinement procedure, which plays the critical role of exploiting the search space. The proposed local refinement procedure consists of two local search phases: an Infeasible Local Search (InfLS) phase, which explores both feasible and infeasible regions, and a Feasible Local Search (FLS) phase, which only examines the feasible search space. The general scheme is summarized in Algorithm 3. Starting from an initial solution, which can be feasible or infeasible, the procedure first enters the infeasible local search phase, during which both feasible and infeasible solutions are allowed in order to favor a large exploration of the search space and to facilitate transitions between structurally different solutions. If the best solution found during the infeasible search phase is feasible, the search then switches to the feasible local search phase to further refine the solution by concentrating the search on the feasible region. Both local search phases are based on the tabu search framework [15] and jointly use eight move operators, which are explained below. Note that the output of the local refinement procedure may be infeasible due to a poor sequence of intermediate hotels generated by the second construction method C2 or by the single-point crossover CX1.

3.5.1 Move operators

During both search phases, the local refinement procedure jointly employs eight move operators to generate neighborhoods: single insertion, two insertion, three insertion, single swap, two swap, three swap, intra-trip 3-opt and join-trips. These operators are briefly described as follows:

• Single insertion: delete a customer from its original trip and insert it into another trip (a sketch of the corresponding cost change is given after this list);
• Two insertion: delete two consecutive customers from a given trip and insert them into another trip, in their original or reverse order;
• Three insertion: delete three consecutive customers from a given trip and insert them into another trip, in their original or reverse order;
• Single swap: swap two customers from two different trips;
• Two swap: swap two pairs of consecutive customers from two different trips, in their original or reverse order;
• Three swap: swap two 3-tuples of consecutive customers from two different trips, in their original or reverse order;
• Intra-trip 3-opt [19]: this operator is used to re-optimize a trip within a tour. It deletes 3 edges of one trip, reconnects the trip in all other possible ways, and then selects the best of the newly formed trips. This move operator has also been applied previously in [4];
• Join-trips: if the deletion of an intermediate hotel between any two consecutive trips of a feasible solution results in a feasible tour, use DP to rebuild a new optimal hotel sequence for the current solution, with hopefully a smaller number of intermediate hotels. The original Join-trips operator in [4] only deletes an intermediate hotel to join two consecutive trips while ensuring feasibility, i.e., there is no further optimization of the new hotel sequence.
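As announced in the first item of the list, the following C++ sketch shows how the change in total travel time of a single-insertion move can be evaluated in constant time; the trip representation (with the bounding hotels stored explicitly) and all names are illustrative assumptions.

#include <cstddef>
#include <vector>

// Sketch of the change in total travel time of a single-insertion move: the
// customer at position i of tripA is removed and re-inserted between positions
// j-1 and j of tripB.  Each trip is assumed to be stored with its bounding
// hotels, e.g. {h_start, c_1, ..., c_k, h_end}.  (Per-trip durations, and hence
// feasibility or the excess EX(s), must be updated separately.)
double singleInsertionDelta(const std::vector<std::vector<double>>& t,
                            const std::vector<int>& tripA, std::size_t i,
                            const std::vector<int>& tripB, std::size_t j) {
    const int c = tripA[i];
    // travel time saved in tripA by closing the gap left by c
    const double removed = t[tripA[i - 1]][c] + t[c][tripA[i + 1]] - t[tripA[i - 1]][tripA[i + 1]];
    // travel time added in tripB by inserting c between tripB[j-1] and tripB[j]
    const double added = t[tripB[j - 1]][c] + t[c][tripB[j]] - t[tripB[j - 1]][tripB[j]];
    return added - removed;   // a negative value corresponds to a shorter tour
}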

The following two subsections explain how these move operators are employed in the two local search phases.
3.5.2 The infeasible local search phase


For many node routing problems such as TSPHS [4], VRP [14] and some of its variants [7], it is known that a controlled exploration of infeasible solutions helps discover high-quality feasible solutions, as it facilitates transitions between structurally different solutions. Following this commonly used strategy, the refinement procedure considers infeasible solutions during the InfLS phase by relaxing the time budget constraint on trips. During InfLS, solutions are evaluated with the following penalty cost function:

Algorithm 4 Infeasible Local Search (InfLS)
Require: A TSPHS instance, initial solution s0, search depth L_InfLS
Ensure: Best found solution s∗
1: Iter ← 0 /*Iteration counter*/
2: NI ← 0 /*NI is the number of consecutive iterations during which s∗ or s′ is not improved*/
3: s∗ ← NIL
4: s′ ← s0 /*s′ records the best solution according to the objective function defined in Eq. 1*/
5: s ← s0
6: Initialize tabu list
7: while NI < L_InfLS do
8:    Construct neighborhoods of s with the insertion and the swap move operators /*see Section 3.5.1*/
9:    Choose the best neighbor s″ that is non-tabu or satisfies the aspiration criterion
10:   Update solution s ← s″
11:   Update tabu list
12:   NI ← NI + 1, Iter ← Iter + 1
13:   if Iter mod uc = 0 then
14:      Re-optimize each trip with the Intra-trip 3-opt operator
15:   end if
16:   Use the Join-trips operator
17:   if s is a feasible solution and f(s) < f(s∗) then
18:      s∗ ← s
19:      NI ← 0
20:   end if
21:   if φ(s) < φ(s′) then
22:      s′ ← s
23:      NI ← 0
24:   end if
25:   if Iter mod up = 0 then
26:      Update penalty factor ϕ
27:   end if
28: end while
29: if s∗ ≠ NIL then
30:   Use DP to re-build an optimal hotel sequence for s∗
31:   Return s∗
32: else
33:   Return s′
34: end if

φ(s) = Σ_{d=1}^{D} Σ_{(i,j)∈E} (t_{i,j} + τ_j) x_{ij}^d + ϕ · EX(s),        (1)

where x_{ij}^d = 1 if arc (i, j) is traversed during day trip d and 0 otherwise, EX(s) = Σ_{d=1}^{D} max{0, Σ_{(i,j)∈E} (t_{i,j} + τ_j) x_{ij}^d − T} is the total excess of the time budget T in s, and ϕ is the self-adjusting penalty parameter.
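Assuming that the duration Σ(t_{i,j} + τ_j) of every day trip of s is already available, Eq. (1) can be evaluated as in the following C++ sketch (names are illustrative).

#include <algorithm>
#include <vector>

// Sketch of the penalized evaluation of Eq. (1): tripDurations[d] is assumed
// to hold the duration of day trip d; any excess over the budget T is weighted
// by the self-adjusting penalty factor phi.
double penalizedCost(const std::vector<double>& tripDurations, double T, double phi) {
    double total = 0.0, excess = 0.0;
    for (double d : tripDurations) {
        total += d;                          // first term of Eq. (1)
        excess += std::max(0.0, d - T);      // contribution to EX(s)
    }
    return total + phi * excess;             // plus phi * EX(s)
}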

Algorithm 4 outlines the InfLS procedure for TSPHS. Let s∗ be the best feasible solution found in the current InfLS phase and let s′ be the highest-quality solution according to the evaluation function defined in Eq. 1. Starting from an initial solution s0, InfLS examines the complete neighborhoods produced with the three insertion move operators and the three swap move operators. As the Intra-trip 3-opt move operator is computationally expensive, it is only applied periodically, as suggested in [4], to re-optimize each trip modified by the insertion and the swap move operators. This periodicity is controlled by the parameter uc. At the end of each iteration, the Join-trips operator is called to merge two consecutive trips whenever possible, by means of the DP procedure. If no improvement with respect to s′ or s∗ is achieved in L_InfLS consecutive iterations, InfLS terminates and returns the best feasible solution s∗ in case a feasible solution was detected during the InfLS phase; otherwise, solution s′ is returned. Solution s∗ is further improved by the dynamic programming procedure before being returned as the final output of InfLS.

To prevent the InfLS search from short-term cycling, each time a customer is moved from its original trip to another trip with an insertion or a swap move, it is forbidden to move it back to its original trip for the next tl iterations (tl is called the tabu tenure). The tabu tenure tl is tuned according to the number n of customers:

tl = α × (n / 100) + random(10),        (2)

where random(10) takes a random value from the range [1, 10] and α is the tabu tenure management factor.

Along with this tabu rule, a simple aspiration criterion is used to permit the application of a move, regardless of its tabu status, if it leads to a solution better than s′ in terms of the evaluation function in Eq. 1 or to a feasible solution better than s∗. Finally, to tune the self-adjusting penalty parameter ϕ of Eq. 1, we use the same method as suggested in [4]. More precisely, ϕ is updated with a frequency up (i.e., after every up iterations). Initially set to 1, ϕ is reduced (increased) by a factor of 2 if feasible (infeasible) solutions have been achieved in up consecutive iterations.
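The two ingredients above can be sketched in C++ as follows; whether n/100 in Eq. (2) is an integer or a real ratio, and how mixed feasible/infeasible periods are treated, are not fully specified in the text, so both choices below are assumptions.

#include <random>

// Sketch of the tabu tenure of Eq. (2); here n/100 is taken as a real ratio
// (an assumption) and random(10) draws uniformly from [1, 10].
int tabuTenure(int n, int alpha, std::mt19937& rng) {
    std::uniform_int_distribution<int> r(1, 10);
    return static_cast<int>(alpha * (n / 100.0)) + r(rng);
}

// Sketch of the self-adjusting penalty factor phi (initially 1), revised every
// u_p iterations: halved when the last u_p solutions were feasible, doubled
// when they were infeasible; mixed periods are left unchanged (an assumption).
double updatePenaltyFactor(double phi, bool allFeasible, bool allInfeasible) {
    if (allFeasible) return phi / 2.0;
    if (allInfeasible) return phi * 2.0;
    return phi;
}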


3.5.3 The feasible local search phase


Algorithm 5 Feasible Local Search (FLS)
Require: A TSPHS instance, initial feasible solution s0, search depth L_FLS
Ensure: Best found solution s∗
1: Iter ← 0 /*Iteration counter*/
2: NI ← 0 /*NI is the number of consecutive iterations during which s∗ is not improved*/
3: s ← s0, s∗ ← s0
4: Initialize tabu list
5: while NI < L_FLS do
6:    Construct neighborhoods of s with the insertion and the swap operators, respecting the time budget constraint
7:    Choose the best allowed neighbor s′ that is non-tabu or satisfies the aspiration criterion
8:    Update solution s ← s′
9:    Update tabu list
10:   Iter ← Iter + 1, NI ← NI + 1
11:   if Iter mod uc = 0 then
12:      Re-optimize each trip with the Intra-trip 3-opt operator
13:   end if
14:   Use the Join-trips operator
15:   if f(s) < f(s∗) then
16:      s∗ ← s
17:      NI ← 0
18:   end if
19: end while
20: Use DP to re-build an optimal hotel sequence for s∗
21: Return s∗

Starting from the best feasible solution produced by InfLS, FLS aims to further improve it by exploring more thoroughly the feasible search region around the given starting point. The evaluation function employed in this phase is the total travel time of a tour, defined as:

f(s) = Σ_{d=1}^{D} Σ_{(i,j)∈E} (t_{i,j} + τ_j) x_{ij}^d.        (3)

The FLS procedure strictly imposes the time budget constraint and ensures that all performed moves respect the trip time budget. The general scheme of the FLS procedure is provided in Algorithm 5, where s∗ represents the best feasible solution found during the search. At each iteration, FLS examines all the neighboring solutions obtained with the insertion and the swap operators, and selects the overall best feasible solution according to Eq. 3 that is not prohibited by the tabu list. Like InfLS, FLS applies the Intra-trip 3-opt operator to re-optimize the trips every uc iterations. At the end of each iteration, it also makes use of the Join-trips operator to merge consecutive trips whenever possible. To avoid short-term cycling, FLS applies the same tabu rule as InfLS. FLS stops when no improved feasible solution has been found for L_FLS consecutive iterations. At the end, DP improves (reconstructs) the hotel sequence corresponding to the best found solution s∗.

3.6 Population management

For each offspring solution s′ generated by the crossover operator and further improved with the local refinement procedure, we decide whether s′ should be introduced into the population and which member of the population should be replaced. To maintain a healthily diversified population, our algorithm applies a simple distance-and-quality replacement strategy which inserts s′ into the population if it satisfies one of the following two criteria: (1) if the hotel sequence of s′ is distinct from that of every solution in the population, and if its quality is better than that of the worst member of the population in terms of the objective value, then s′ replaces this worst member; (2) if the hotel sequence of s′ is identical to that of another solution s″ in the population, and if the quality of s′ is better than that of s″, then s″ is replaced by s′.
4 Computational Experiments
The purpose of this section is to evaluate the performance of the proposed HDM algorithm by means of extensive comparisons with the best-performing state-of-the-art TSPHS approaches from the literature.

4.1 Benchmark and Experimental Protocol


We assess the performance of the proposed HDM on the following four sets of 131 benchmark instances from [35]:
SET 1: This data set is formed from six well-known VRP instances with time windows from Solomon [30], and 10 multi-depot VRP instances with time windows from Cordeau et al. [10]. The Solomon instances consist of 100 customers and 6 hotels, while the number of customers in [10] ranges from 48 to 288, with 6 hotels.
SET 2: This set consists of small instances created from SET 1 by considering only the first 10, 15, 30 and 40 customers. Optimal solutions can easily be attained for the complete set using commercial solvers.
SET 3: This set is created from a selection of classic TSP instances such that the total travel length of the optimal TSPHS solution and the known optimal TSP solution are equal. Instances from SET 3 are grouped into three subsets containing three, five and ten hotels respectively.
SET 4: The final set¹ arises out of the 16 TSP instances of SET 3 by placing 10 hotels at randomly selected customer locations, and by arbitrarily determining the time budget. These instances constitute the hardest TSPHS benchmark set used in the literature.

The complete benchmark is available at http://antor.uantwerpen.be/?s=tsphs.
To evaluate the performance of HDM², we perform comparisons with the Memetic Algorithm (MA) from [4] and a fast local search method named P-LS from [5]. As MA clearly outperforms the heuristic from [35] (denoted as I2LS in the corresponding paper), we do not include the latter in our comparisons. For a relatively fair comparison, we use the procedure described in [7] to scale the CPU times reported for MA and P-LS in the corresponding papers. The stopping condition for HDM is a maximum time limit, set equal to the scaled CPU time used by MA on the given instance. All reported results are based on a single execution, as the number of runs per test instance is not specified in [4] and [5].
Our HDM algorithm is coded in C++ and compiled with the g++ compiler using the '-O3' option. The experiments are performed on a computer with an Intel Xeon E5440 processor (2.83 GHz CPU and 2 GB RAM) running the Linux operating system. Table 1 shows the setting of the HDM parameters used in the reported experiments.

Table 1
Setting of parameters

Parameter  Section  Description                                      Value
P          3.3      size of population                               5
L          3.5      search depth during the InfLS and FLS phases     {100, 200}
α          3.5      tabu tenure management factor                    10
uc         3.5      frequency of re-optimizing each trip             5
up         3.5      frequency of updating ϕ                          15

For the HDM parameters that are shared with the existing MA algorithm (i.e., parameters P, uc and up), we use the same settings as in [4]. We set L = 100 for instances with n ≤ 200, and L = 200 otherwise. The values of L and α are determined based on a preliminary experiment on a sample set of 12 representative instances of different sizes, selected uniformly at random from the four data sets SET 1 - SET 4. To identify the best setting for L and α, we tested 25 parameter pairs within a reasonable range (L ∈ {100, 150, 200, 250, 300}, α ∈ {1, 5, 10, 15, 20}), and applied the Friedman non-parametric statistical test to estimate the sensitivity of the parameter pair. The Friedman test revealed a statistically significant difference in performance between the tested settings (with p-value = 7.8228 × 10⁻⁵), implying that greater performance gains could be achieved by a careful tuning of L and α.

¹ As an erroneous solution is given for the instance lin105, we do not consider it in our comparisons.
² The best solutions found by HDM are publicly available at: http://msor-world.com/resources/TSPHSresults.zip.

4.2 Computational results

We next provide computational comparisons with the current best-performing memetic algorithm MA and the local search method P-LS on each instance of the TSPHS benchmark set. For the instances of SET 2, we further include comparisons with the optimal solutions given in [4]. For the few instances of SET 2 for which the optimal result is not known, we indicate the best-known lower bound instead. Columns 'Instance' and 'N' respectively provide the instance name and the number of customers in the given instance. For each instance, we show the result reported by each method, including the number of trips in a tour, the travel length and the computing time in seconds. When the optimal solution is not known (for SET 1 and SET 4), column 'Gap(%)' denotes the percentage gap between the total travel time of the tour obtained with MA/P-LS and with HDM respectively. The percentage gap between two results L1 and L2 is computed as

Gap = 100 × (L1 − L2) / L1.

Positive gap values indicate that our HDM outperforms MA and P-LS. When the optimal/exact solution is known (for SET 2 and SET 3), 'Gap(%)' shows the percentage gap between the total travel time of the tour obtained with HDM/MA/P-LS and the known optimum.
The results for SET 1 are presented in Table 2. Within a comparable time limit, we observe that HDM reports a better or matching performance with respect to MA on all but one of the 16 instances, with an average gap of 0.09%. When compared with P-LS, HDM reports a better or matching performance on all 16 instances, with an average gap of 0.42%.

Tables 3-6 present the results for the SET 2 instances involving 10, 15, 30 and 40 customers respectively. On all 52 small instances, HDM attains the same performance as MA within a very short CPU time. More precisely, both HDM and MA find the known optimum whenever it is available (47 instances), and very likely attain the unproven optimum for the remaining 5 instances.
21
ACCEPTED MANUSCRIPT

Table 2
Computational results for the benchmark instances from SET 1. The entries in bold
indicate the cases for which HDM is able to provide a better solution than MA and
P-LS.
Instance N HDM MA P-LS
Trips Length Time(s) Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%)
c101 100 9 9591.1 24.0 9 9595.6 24.8 0.04 9 9596.9 0.1 0.06
r101 100 8 1695.5 24.5 8 1704.6 24.7 0.53 8 1717.4 0.2 1.27
rc101 100 8 1673.4 29.4 8 1674.1 30.3 0.04 8 1674.3 0.2 0.05
c201 100 3 9559.9 16.3 3 9560.0 16.6 0.00 3 9563.1 0.1 0.03
r201 100 2 1642.8 12.4 2 1643.4 11.9 0.03 2 1648.1 0.1 0.32
rc201 100 2 1642.7 12.6 2 1642.7 12.7 0.00 2 1644.3 0.2 0.09
pr01 48 2 1412.2 2.0 2 1412.2 2.8 0.00 2 1412.2 0.0 0.00
pr02 96 3 2548.8 18.1 3 2543.3 18.6 -0.21 3 2551.3 0.2 0.09
pr03 144 4 3404.2 48.6 4 3415.1 49.8 0.31 4 3421.1 0.3 0.49
pr04 192 5 4215.3 166.2 5 4217.4 170.7 0.04 5 4217.4 0.6 0.04
pr05 240 5 4948.9 340.4 5 4958.7 341.7 0.19 6 4974.7 1.1 0.51

T
pr06 288 7 5960.9 337.6 7 5963.1 337.5 0.03 7 6032.0 1.6 1.17
pr07 72 3 2070.3 12.5 3 2070.3 13.1 0.00 3 2070.3 0.0 0.00
pr08 144 4 3367.7 66.9 4 3372.0 66.8 0.12 4 3399.9 0.4 0.94

IP
pr09 216 5 4414.9 234.7 5 4420.3 234.8 0.12 5 4445.7 1.1 0.69
pr10 288 7 5932.0 422.8 7 5940.5 421.2 0.14 7 5991.5 2.4 0.99

Avg. 0.09 0.42

CR
For instances in SET 2, the computation times are not reported for P-LS in
[5], while P-LS was able to reach the known optimum for 46 instances.
Table 3
Results for instances from SET 2 containing 10 customers.

Instance N | Exact: Trips Cost | HDM: Trips Length Time(s) Gap(%) | MA: Trips Length Time(s) Gap(%) | P-LS: Trips Length Gap(%)
c101 10 1 955.1 1 955.1 <1 0.00 1 955.1 <1 0.00 1 955.1 0.00
r101 10 2 272.8 2 272.8 <1 0.00 2 272.8 <1 0.00 2 272.8 0.00
rc101 10 1 237.5 1 237.5 <1 0.00 1 237.5 <1 0.00 1 237.5 0.00
pr01 10 1 426.6 1 426.6 <1 0.00 1 426.6 <1 0.00 1 426.6 0.00
pr02 10 1 661.9 1 661.9 <1 0.00 1 661.9 <1 0.00 1 661.9 0.00
pr03 10 1 553.3 1 553.3 <1 0.00 1 553.3 <1 0.00 1 553.3 0.00
pr04 10 1 476.4 1 476.4 <1 0.00 1 476.4 <1 0.00 1 476.4 0.00
pr05 10 1 528.9 1 528.9 <1 0.00 1 528.9 <1 0.00 1 528.9 0.00
pr06 10 1 597.4 1 597.4 <1 0.00 1 597.4 <1 0.00 1 597.4 0.00
pr07 10 1 670.2 1 670.2 <1 0.00 1 670.2 <1 0.00 1 670.2 0.00
pr08 10 1 573.4 1 573.4 <1 0.00 1 573.4 <1 0.00 1 573.4 0.00
pr09 10 1 645.5 1 645.5 <1 0.00 1 645.5 <1 0.00 1 645.5 0.00
pr10 10 1 461.5 1 461.5 <1 0.00 1 461.5 <1 0.00 1 461.5 0.00
Avg. 0.00 0.00 0.00

Table 4
Results for instances from SET 2 containing 15 customers.

Instance N | Exact: Trips Cost | HDM: Trips Length Time(s) Gap(%) | MA: Trips Length Time(s) Gap(%) | P-LS: Trips Length Gap(%)
c101 15 2 1452.2 2 1452.2 <1 0.00 2 1452.2 <1 0.00 2 1452.2 0.00
r101 15 2 379.8 2 379.8 <1 0.00 2 379.8 <1 0.00 2 379.8 0.00
rc101 15 2 303.2 2 303.2 <1 0.00 2 303.2 <1 0.00 2 303.2 0.00
pr01 15 1 590.4 1 590.4 <1 0.00 1 590.4 <1 0.00 1 590.4 0.00
pr02 15 1 745.6 1 745.6 <1 0.00 1 745.6 <1 0.00 1 745.6 0.00
pr03 15 1 632.9 1 632.9 <1 0.00 1 632.9 <1 0.00 1 632.9 0.00
pr04 15 1 683.4 1 683.4 <1 0.00 1 683.4 <1 0.00 1 683.4 0.00
pr05 15 1 621.2 1 621.2 <1 0.00 1 621.2 <1 0.00 1 621.2 0.00
pr06 15 1 685.2 1 685.2 <1 0.00 1 685.2 <1 0.00 1 685.2 0.00
pr07 15 1 795.3 1 795.3 <1 0.00 1 795.3 <1 0.00 1 795.3 0.00
pr08 15 1 707.2 1 707.2 <1 0.00 1 707.2 <1 0.00 1 707.2 0.00
pr09 15 1 771.7 1 771.7 <1 0.00 1 771.7 <1 0.00 1 771.7 0.00
pr10 15 1 611.9 1 611.9 <1 0.00 1 611.9 <1 0.00 1 611.9 0.00
Avg. 0.00 0.00 0.00

Tables 7-9 present the results for the three benchmark subsets of SET 3, which include 3, 5 and 10 hotels respectively. The optimal total travel length of a tour is known and is given in column 'TSP'.

Table 5
Results for instances from SET 2 containing 30 customers.
InstanceN Exact HDM MA P-LS
Trips Cost Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%) Trips Length Gap(%)
c101a 30 3 2829.4 3 2863.2 1.4 1.18 3 2863.2 < 1 1.18 3 2863.6 1.19
r101 30 3 655.2 3 655.2 2.1 0.00 3 655.2 <1 0.00 3 655.2 0.00
rc101a 30 3 610.0 3 705.5 1.7 13.53 3 705.5 <1 13.53 4 683.8 10.79
pr01 30 1 964.8 1 964.8 <1 0.00 1 964.8 <1 0.00 1 964.8 0.00
pr02 30 2 1078.3 2 1078.3 < 1 0.00 2 1078.3 < 1 0.00 2 1078.3 0.00
pr03 30 1 952.5 1 952.5 <1 0.00 1 952.5 <1 0.00 1 952.5 0.00
pr04 30 2 1091.6 2 1091.6 1.8 0.00 2 1091.6 < 1 0.00 2 1091.6 0.00
pr05 30 1 924.7 1 924.7 <1 0.00 1 924.7 <1 0.00 1 924.7 0.00
pr06 30 2 1063.2 2 1063.2 < 1 0.00 2 1063.2 < 1 0.00 2 1063.2 0.00
pr07 30 2 1130.4 2 1130.4 < 1 0.00 2 1130.4 < 1 0.00 2 1130.4 0.00
pr08 30 2 1006.2 2 1006.2 < 1 0.00 2 1006.2 < 1 0.00 2 1006.2 0.00
pr09 30 2 1091.4 2 1091.4 < 1 0.00 2 1091.4 < 1 0.00 2 1091.4 0.00
pr10 30 1 918.9 1 918.9 <1 0.00 1 918.9 <1 0.00 1 918.9 0.00

Avg. 1.13 1.13 0.92
a Values in the ‘Exact’ columns correspond to the best-known lower bound.

Table 6
Results for instances in SET 2 containing 40 customers.

Instance N Exact HDM MA P-LS
Trips Cost Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%) Trips Length Gap(%)
c101a 40 4 3817.5 4 3866.1 2.0 1.25 4 3866.1 2.6 1.25 4 3867.3 1.28
r101a 40 4 842.9 4 862.8 1.6 2.30 4 862.8 2.0 2.30 4 873.5 3.50
rc101a 40 3 652.1 4 850.3 3.3 23.30 4 850.3 2.2 23.30 5 870.8 25.11
pr01 40 2 1160.5 2 1160.5 < 1 0.00 2 1160.5 < 1 0.00 2 1160.5 0.00
pr02 40 2 1336.9 2 1336.9 1.6 0.00 2 1336.9 < 1 0.00 2 1336.9 0.00
pr03 40 2 1303.4 2 1303.4 < 1 0.00 2 1303.4 < 1 0.00 2 1303.4 0.00
pr04 40 2 1259.5 2 1259.5 < 1 0.00 2 1259.5 < 1 0.00 2 1259.5 0.00
pr05 40 2 1200.7 2 1200.7 < 1 0.00 2 1200.7 < 1 0.00 2 1200.7 0.00
pr06 40 2 1242.9 2 1242.9 < 1 0.00 2 1242.9 < 1 0.00 2 1242.9 0.00
pr07 40 2 1407.0 2 1407.0 1.7 0.00 2 1407.0 < 1 0.00 2 1410.3 0.23
pr08 40 2 1222.2 2 1222.2 < 1 0.00 2 1222.2 < 1 0.00 2 1222.2 0.00
pr09 40 2 1284.4 2 1284.4 < 1 0.00 2 1284.4 < 1 0.00 2 1284.4 0.00
pr10 40 2 1200.4 2 1200.4 < 1 0.00 2 1200.4 < 1 0.00 2 1200.4 0.00

Avg. 2.06 2.06 2.31



a Values in the ‘Exact’ columns correspond to the best-known lower bound.

For the cases with 3 hotels (see Table 7), we observe that both HDM and MA were able to find a solution with the optimal tour length for 15 out of the 16 instances. However, HDM exhibited a better performance than MA on instance a280, producing a solution with a shorter travel length and one fewer day trip. For the cases with 5 hotels and 10 hotels (see Tables 8 and 9), HDM reached the solution with the optimal travel length in 15 out of the 16 and in 14 out of the 16 cases respectively, outperforming MA on two instances with 5 hotels (a280 and pr1002) and on two instances with 10 hotels (a280 and pcb442). For these two latter subsets of instances, HDM thus reduced the average optimality gap from 0.17% to 0.14% and from 0.45% to 0.27% respectively. On the other hand, for each subset of instances, P-LS reports the solution of optimal travel length in only 13 out of the 16 cases and produces significantly worse results than HDM on three instances (eil76, a280, pcb442). It is worth mentioning that for most of these instances, the best result was attained during solution construction with the C1 procedure, prior to the solution recombination and local refinement phases.

Table 7
Results for instances from SET 3 containing 3 hotels. The entries in bold indicate
the cases for which HDM was able to find a better solution than MA and P-LS.
Instance N TSP HDM MA P-LS
Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%)
eil51 51 426 4 426 3.0 0.00 4 426 3.9 0.00 4 426 0.0 0.00
berlin52 52 7542 4 7542 3.0 0.00 4 7542 3.2 0.00 4 7542 0.0 0.00
st70 70 675 4 675 3.1 0.00 4 675 7.2 0.00 4 675 0.0 0.00
eil76 76 538 4 538 3.1 0.00 4 538 28.5 0.00 5 556 0.0 3.23
pr76 76 108159 4 108159 10.2 0.00 4 108159 15.0 0.00 4 108159 0.1 0.00
kroa100 100 21282 4 21282 10.3 0.00 4 21282 14.6 0.00 4 21282 0.1 0.00
kroc100 100 20749 4 20749 10.2 0.00 4 20749 15.5 0.00 4 20749 0.0 0.00
krod100 100 21294 4 21294 10.4 0.00 4 21294 16.3 0.00 4 21294 0.1 0.00
rd100 100 7910 4 7910 10.7 0.00 4 7910 16.1 0.00 4 7910 0.1 0.00
eil101 101 629 4 629 10.6 0.00 4 629 17.4 0.00 4 629 0.1 0.00
lin105 105 14379 4 14379 10.9 0.00 4 14379 16.8 0.00 4 14379 0.1 0.00
ch150 150 6528 4 6528 10.7 0.00 4 6528 36.7 0.00 4 6528 0.2 0.00
tsp225 225 3916 4 3916 10.2 0.00 4 3916 14.5 0.00 4 3916 0.5 0.00
a280 280 2579 4 2583 235.3 0.15 5 2591 235.2 0.46 5 2615 0.8 1.37
pcb442 442 50778 4 50778 693.2 0.00 4 50778 692.2 0.00 5 51144 3.8 0.71
pr1002 1002 259045 4 259045 3266.8 0.00 4 259045 3268.0 0.00 4 259045 35.7 0.00

Avg. 0.01 0.02 0.33

Table 8
Results for instances from SET 3 containing 5 hotels. The entries in bold indicate
the cases for which HDM was able to find a better solution than MA and P-LS.
Instance N TSP HDM MA P-LS
Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%)
eil51 51 426 6 426 3.0 0.00 6 426 3.6 0.00 6 426 0.0 0.00
berlin52 52 7542 6 7542 3.0 0.00 6 7542 3.7 0.00 6 7542 0.0 0.00
st70 70 675 6 675 3.1 0.00 6 675 7.3 0.00 6 675 0.0 0.00
eil76 76 538 6 538 3.0 0.00 6 538 9.5 0.00 6 566 0.1 4.95
pr76 76 108159 6 108159 3.0 0.00 6 108159 8.3 0.00 6 108159 0.1 0.00
AN
kroa100 100 21282 6 21282 10.2 0.00 6 21282 14.9 0.00 6 21282 0.1 0.00
kroc100 100 20749 6 20749 10.3 0.00 6 20749 14.2 0.00 6 20749 0.1 0.00
krod100 100 21294 6 21294 10.3 0.00 6 21294 15.1 0.00 6 21294 0.1 0.00
rd100 100 7910 6 7910 10.3 0.00 6 7910 13.9 0.00 6 7910 0.1 0.00
eil101 101 629 6 629 10.2 0.00 6 629 16.7 0.00 6 629 0.1 0.00
lin105 105 14379 6 14379 10.0 0.00 6 14379 15.8 0.00 6 14379 0.1 0.00
ch150 150 6528 6 6528 10.4 0.00 6 6528 41.4 0.00 6 6528 0.2 0.00
tsp225 225 3916 6 3916 89.8 0.00 6 3916 89.4 0.00 6 3916 0.4 0.00
a280 280 2579 7 2639 199.1 2.27 7 2646 198.8 2.53 7 2652 0.7 2.75
pcb442 442 50778 6 50778 490.9 0.00 6 50778 497.6 0.00 7 51087 3.2 0.60
pr1002 1002 259045 6 259045 3987.5 0.00 7 259774 3998.9 0.28 6 259045 21.2 0.00

Avg. 0.14 0.17 0.51

Table 9
Results for instances from SET 3 containing 10 hotels. The entries in bold indicate
the cases for which HDM was able to find a better solution than MA and P-LS.
Instance N TSP HDM MA P-LS
Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%)
eil51 51 426 10 426 3.0 0.00 10 426 3.3 0.00 10 426 0.0 0.00
berlin52 52 7542 8 7864 4.9 4.09 8 7864 4.0 4.09 9 7542 0.0 0.00
st70 70 675 10 675 3.1 0.00 10 675 6.7 0.00 10 675 0.0 0.00
eil76 76 538 11 538 3.0 0.00 11 538 9.2 0.00 12 576 0.1 5.11
pr76 76 108159 11 108159 10.1 0.00 11 108159 8.1 0.00 11 108159 0.1 0.00
kroa100 100 21282 11 21282 10.1 0.00 11 21282 14.5 0.00 11 21282 0.1 0.00
kroc100 100 20749 11 20749 10.1 0.00 11 20749 14.2 0.00 11 20749 0.1 0.00
krod100 100 21294 11 21294 10.1 0.00 11 21294 14.5 0.00 11 21294 0.1 0.00
rd100 100 7910 10 7910 10.1 0.00 10 7910 14.7 0.00 10 7910 0.1 0.00
eil101 101 629 11 629 10.1 0.00 11 629 15.6 0.00 11 629 0.1 0.00
lin105 105 14379 10 14379 10.1 0.00 10 14379 15.7 0.00 10 14379 0.1 0.00
ch150 150 6528 11 6528 10.7 0.00 11 6528 36.8 0.00 11 6528 0.2 0.00
tsp225 225 3916 11 3916 84.9 0.00 11 3916 82.2 0.00 11 3916 0.4 0.00
a280 280 2579 11 2585 135.2 0.23 11 2613 138.1 1.30 12 2596 0.7 0.65
pcb442 442 50778 11 50778 526.9 0.00 11 51774 526.7 1.92 12 50919 2.6 0.27
pr1002 1002 259045 11 259045 3325.8 0.00 11 259045 3320.8 0.00 11 259045 12.3 0.00

Avg. 0.27 0.45 0.37

Finally, Table 10 shows the results for SET 4. We observe that HDM outperforms MA on 9 out of the 15 instances and is outperformed by MA on three instances. The average gap between the two algorithms is 0.33% in favor of HDM. Moreover, HDM outperforms P-LS on 11 out of the 15 instances, with an average gap of 0.57% in favor of HDM. For instances pr76, kroa100 and rd100, HDM and MA led to a smaller number of trips than the P-LS approach.

Table 10
Computational results for the benchmark instances SET 4. The entries in bold
indicate the cases for which HDM provides a better solution than MA and P-LS.
Instance N HDM MA P-LS
Trips Length Time(s) Trips Length Time(s) Gap(%) Trips Length Time(s) Gap(%)
eil51 51 6 433 4.2 6 429 3.9 -0.93 6 436 0.0 0.68
berlin52 52 7 8642 4.1 7 8642 4.3 0.00 7 8642 0.0 0.00
st70 70 6 718 10.2 6 723 8.7 0.69 6 731 0.0 1.77
eil76 76 6 539 12.6 6 548 12.7 1.64 6 539 0.0 0.00
pr76 76 6 118318 8.0 6 118061 8.4 -0.21 7 118719 0.1 0.33
kroa100 100 6 22205 20.3 6 22343 19.7 0.61 7 22044 0.1 -0.73
kroc100 100 6 20933 12.2 6 20933 13.1 0.00 6 21116 0.1 0.86
krod100 100 6 21548 17.0 6 21664 17.8 0.53 6 21464 0.1 -0.39
rd100 100 6 8352 23.7 6 8244 24.9 -1.31 7 8245 0.1 -1.29
eil101 101 6 634 23.0 6 634 23.4 0.00 6 652 0.1 2.76
ch150 150 6 6578 56.1 6 6647 56.3 1.03 6 6728 0.2 2.22
tsp225 225 6 4508 120.3 6 4571 122.2 1.37 6 4502 0.9 -0.13
a280 280 6 2645 168.6 6 2646 163.4 0.03 6 2658 0.9 0.48
pcb442 442 6 54137 890.8 6 54339 898.6 0.37 6 55134 5.9 1.80
pr1002 1002 7 289337 3529.5 7 292690 3526.1 1.14 7 290110 30.5 0.26
Avg. 0.33 0.57

To conclude, Table 11 summarizes the results reported with the three approaches on SET 1-4. For each approach, the first row shows the number of instances solved to optimality, the second row indicates the number of best-known solutions found, while the last row provides the average gap to the optimal solution (or to the best-known solution if the optimum is unknown). When compared on the 131 benchmark instances, HDM attains 22 improved, 103 equal and 6 worse results relative to the optimal (or best-known) solutions. Table 11 reveals the largest performance gap between MA/P-LS and HDM on SET 1 and SET 4, which constitute the most challenging instances from the TSPHS benchmark.
Table 11
Summary of results
SET 1 SET 2 SET 3 SET 4
HDM MA P-LS HDM MA P-LS HDM MA P-LS HDM MA P-LS
Optimal solutions - - - 47/47 47/47 46/47 - - - - - -
Best-known solutions 15/16 4/16 2/16 - - - 44/48 42/48 39/48 10/15 6/15 4/15
Average Gap(%) 0.01 0.10 0.43 0.00 0.00 0.00 0.14 0.22 0.41 0.24 0.57 0.82
5 Discussion

In this section, we turn our attention to an analysis of three important ingredients of the proposed algorithm: the dynamic programming approach, the adaptive selection of crossovers, and the combined use of two local search phases.

25
ACCEPTED MANUSCRIPT

5.1 Impact of dynamic programming

As described in Section 3.3, the first construction method C1 optimally partitions a TSP tour into feasible trips using the DP algorithm described in Section 3.2. However, the original C1 from [4] is based on a minimum cost path algorithm (Dijkstra's algorithm [12]) to sequentially minimize the number of trips and the total traveled length. Starting from an identical TSP tour generated with the Lin-Kernighan heuristic [20], we compare the two versions of C1, using the standard Dijkstra's algorithm and DP respectively, on a selection of 12 representative instances from SET 1 - SET 4. The experimental results are shown in Table 12. As both algorithms are exact and are applied to an identical TSP tour, we observe the same results in terms of the number of trips and the total traveled length for both versions. However, DP appears to be faster than Dijkstra's algorithm on the large-sized instances.

Table 12
Computational results for the two versions of the C1 construction procedure using
Dijkstra’s algorithm and DP respectively.

Set Instance N TSP Dijkstra DP
Trips Length Time(ms) Trips Length Time(ms)
SET 1 rc101 100 1643.0 8 1714.5 1 8 1714.5 2
SET 1 pr03 144 3374.2 4 3495.0 3 4 3495.0 3
SET 1 pr10 288 5879.4 7 6070.5 14 7 6070.5 28
SET 2 rc101 30 610.6 4 702.1 0 4 702.1 0
SET 2 c101 40 3828.3 4 3898.0 0 4 3898.0 0
SET 2 rc101 40 760.8 5 884.3 0 5 884.3 0
SET 3 eil76.s3 76 538.0 4 538.0 0 4 538.0 0
SET 3 a280.s3 280 2579.0 5 2691.0 6 5 2691.0 3
SET 3 pcb442.s10 442 50778.0 13 51581.0 133 13 51581.0 26
SET 4 eil76 76 538.0 6 577.0 2 6 577.0 1
SET 4 pcb442 442 50778.0 8 64042.0 113 8 64042.0 53
SET 4 pr1002 1002 259045.0 7 330991.0 648 7 330991.0 525

Additionally, our algorithm uses the DP approach to join two consecutive trips during the Join-trips move and to re-optimize the hotel sequence at the end of local refinement (Section 3.5). To assess the contribution of DP to the performance of HDM, we compare it with a version that does not use DP for solution refinement, where the original Join-trips operator [4] without DP is used. We run the two algorithms on a selection of 12 representative instances from SET 1 - SET 4 (Table 13). The Wilcoxon test reveals a statistically significant difference in performance between the two versions, with a p-value < 0.01. From the sum of the signed ranks, the advantage of local refinement with DP is evident. This experiment thus confirms the contribution of the DP approach to the overall HDM performance.


Table 13
Wilcoxon test on solutions obtained with and without the use of DP for local
refinement.
N Positive Ranks Negative Ranks P-Value Diff?
12 0 45 0.0077 Yes


5.2 Impact of the joint use of three crossover operators

Fig. 4. Solution gaps (%) to the optimal/best-known result for HDM with the adaptive crossover selection and for the three single-crossover variants (CX1, CX2, CX3) on 12 representative instances (x-axis: instance index; y-axis: gap).

As described in Section 3.4, HDM jointly uses three crossover operators by means of a probabilistic selection mechanism. To investigate the power of the proposed crossover combination, Figure 4 compares our algorithm with three weakened versions of HDM that use a single crossover CX1, CX2 and CX3 respectively. Experiments are performed on a selection of 12 representative instances from SET 1 - SET 4, with 50 generations per algorithm execution. The x-axis shows the considered instance, while the y-axis indicates the percentage gap to the optimal/best-known solution. The percentage gap between two results L1 and L2 is calculated as gap = 100 × (L1 − L2)/L1. For example, for instance a280 with 3 hotels, HDM's tour length of 2583 against the optimal length 2579 yields a gap of 100 × (2583 − 2579)/2583 ≈ 0.15%, as reported in Table 7. The lower the gap to the optimal/best-known solution, the better the performance of the algorithm. The analysis clearly shows that the choice of a crossover is instance-dependent, while the best performance can be achieved by exploiting their complementarities.
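The exact selection probabilities used by HDM follow the adaptive rule of Section 3.4. As a generic illustration of how such a mechanism can be organized, the sketch below maintains an exponentially smoothed reward per crossover and mixes it with a uniform floor so that none of CX1, CX2 and CX3 is ever discarded; the class and parameter names (AdaptiveCrossoverSelector, p_min, decay) are ours and purely illustrative.

import random

class AdaptiveCrossoverSelector:
    """Generic adaptive selection among several crossover operators.

    Illustrative scheme only (smoothed reward plus a uniform probability
    floor); it is not the specific adaptive rule of Section 3.4.
    """

    def __init__(self, operators, p_min=0.1, decay=0.8):
        self.operators = list(operators)   # e.g. [cx1, cx2, cx3]
        self.scores = [1.0] * len(self.operators)
        self.p_min = p_min                 # floor keeping every operator alive
        self.decay = decay                 # weight given to past rewards

    def probabilities(self):
        total = sum(self.scores) or 1.0
        k = len(self.operators)
        # mix a uniform floor with the score-proportional distribution
        return [self.p_min / k + (1.0 - self.p_min) * s / total for s in self.scores]

    def select(self):
        idx = random.choices(range(len(self.operators)), weights=self.probabilities())[0]
        return idx, self.operators[idx]

    def reward(self, idx, improvement):
        # exponentially smoothed reward, e.g. the offspring's gain over its better parent
        self.scores[idx] = self.decay * self.scores[idx] + (1.0 - self.decay) * max(improvement, 0.0)

In such a scheme, the reward passed to reward() could, for instance, be the improvement of the offspring over the better of its two parents after local refinement; the floor p_min reflects the observation above that the best crossover is instance-dependent, so no operator should be ruled out permanently.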
AC

5.3 Impact of the combined use of feasible and infeasible local search phases

As described in Section 3.5, our local refinement procedure consists of two local search phases: an Infeasible Local Search (InfLS) and a Feasible Local Search (FLS). To evaluate the merit of the joint use of the two local searches, we compare our algorithm with a weakened version of HDM whose local refinement procedure only uses the Infeasible Local Search.
27
ACCEPTED MANUSCRIPT

Fig. 5. Solution gaps (%) to the optimal/best-known result for HDM with both local search phases (InfLS and FLS) and for the weakened variant using InfLS only, on 12 representative instances (x-axis: instance index; y-axis: gap).
In this comparison, we do not include a version of HDM that relies solely on FLS, since an initial solution generated with the second construction method C2 or by a crossover operator may not be feasible and, as such, cannot be improved by FLS. Therefore, FLS cannot be independently integrated into the HDM algorithm.
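One possible way to organize such an alternation is sketched below: the infeasible phase always runs first, and the feasible phase is launched only once feasibility has been reached. The sketch abstracts the neighborhoods, penalty management and stopping rules of Section 3.5 behind the callables infeasible_ls, feasible_ls and reoptimize_hotels (all names are ours); it is a structural illustration, not the actual HDM procedure.

def two_phase_refinement(solution, infeasible_ls, feasible_ls, reoptimize_hotels, max_rounds=10):
    """Alternate an infeasible and a feasible local search phase (structural sketch).

    FLS is launched only once a feasible solution is available, mirroring the
    fact that offspring produced by C2 or by a crossover may be infeasible and
    therefore cannot be handled by FLS alone.
    """
    best = solution if solution.is_feasible() else None
    for _ in range(max_rounds):
        # Phase 1: penalized search that is allowed to traverse infeasible regions.
        solution = infeasible_ls(solution)
        if not solution.is_feasible():
            continue
        # Phase 2: intensification restricted to the feasible region.
        solution = feasible_ls(solution)
        if best is None or solution.cost() < best.cost():
            best = solution
    # Final step: re-optimize the hotel sequence of the best solution found (e.g. with the DP).
    return reoptimize_hotels(best) if best is not None else solution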


Figure 5 summarizes the results reported with the two HDM versions for a selection of 12 representative instances from SET 1 - SET 4. The x-axis shows the considered instance, while the y-axis indicates the percentage gap to the optimal/best-known solution. We observe that the weakened HDM algorithm produces inferior results for 5 out of the 12 instances and outperforms the proposed HDM on only a single instance, which clearly highlights the benefit of the combined use of feasible and infeasible local search phases.
6 Conclusions
We presented a highly effective hybrid approach between dynamic programming and memetic search (HDM) for the Traveling Salesman Problem with Hotel Selection (TSPHS), an NP-hard problem that arises from several interesting real-world applications. An important element of HDM is a dynamic programming procedure for an optimal partition of a TSP tour into a set of feasible trips. Another key factor in the good performance of the proposed algorithm is the alternation between exploitation in feasible and infeasible search regions, which may lead to high-quality feasible solutions as it facilitates transitions between structurally different solutions. Furthermore, effective exploration of the search space is ensured by the combined use of three different crossover operators.

Computational assessments on four sets of 131 benchmark instances revealed that the proposed HDM approach competes favorably with two recent best-performing approaches (MA and P-LS) from the literature. In particular, HDM reports new best solutions for 22 benchmark instances, while reaching the existing best-known results for all but 6 instances from the problem benchmark. Analyses showed that the proposed dynamic programming approach, the joint use of three crossover operators and the two local search phases provide a significant contribution to the good performance of HDM.

For future work, several research directions can be considered. First, the proposed HDM algorithm could perhaps be further improved by introducing other crossover operators used for the TSP or the VRP (e.g., the alternating edge crossover [26]), and dedicated move operators for local refinement such as the multi-route exchange operator used in [3]. Furthermore, to escape from local optimum traps during local refinement, it would be useful to consider several perturbation mechanisms, such as the worst removal and random insertion strategy [36]. Second, it is worth considering the combined use of both feasible and infeasible local search to tackle other highly constrained combinatorial optimization problems such as the Constraint Maximum Collection problem [17] and the Capacitated Minimum Spanning Tree problem [6]. Finally, some ideas presented in this work could be applied to other similar problems such as the Orienteering Problem with Hotel Selection and the Refueling Problem for Gasoline-powered Vehicles.



7 Acknowledgment

We are grateful to the anonymous referees for valuable suggestions and comments which helped us improve the paper. We would like to thank Marco Castro for answering our questions. This work is partially supported by the National Natural Science Foundation Program of China [Grant Nos. 71401059, 71620107002, 71531009], Huazhong University of Science and Technology (5001300001), and the DAASE project funded by EPSRC programme grant EP/J017515/1.

29
ACCEPTED MANUSCRIPT

References

[1] Angelelli, E., & Speranza, M. G. (2002). The periodic vehicle routing problem
with intermediate facilities. European Journal of Operational Research, 137(2),
233-247.

[2] Beasley, J.E. (1983). Route first - cluster second methods for vehicle routing.
Omega, 11(4):403-408.

[3] Belhaiza, S., Hansen, P., & Laporte, G. (2014). A hybrid variable neighborhood
tabu search heuristic for the vehicle routing problem with multiple time
windows. Computers & Operations Research, 52, 269-281.

[4] Castro, M., Sörensen, K., Vansteenwegen, P., & Goos, P. (2013). A memetic
algorithm for the travelling salesman problem with hotel selection. Computers
& Operations Research, 40(7), 1716 - 1728.

[5] Castro, M., Sörensen, K., Vansteenwegen, P., & Goos, P. (2015). A fast
metaheuristic for the travelling salesman problem with hotel selection. 4OR,
13(1), 15-34.

[6] Chandy, K. M., & Lo, T. (1973). The capacitated minimum spanning tree.
Networks, 3(2), 173-181.

[7] Chen, Y., Hao, J. K., & Glover, F. (2016). A hybrid metaheuristic approach
for the capacitated arc routing problem. European Journal of Operational
Research, 253(1), 25-39.

[8] Christofides, N., Mingozzi, A., Toth, P. & Sandi, C. (1979). Combinatorial
Optimization. Wiley, Chichester, pp. 315 - 338.

[9] Cordeau, J., Gendreau, M., & Laporte, G. (1995). A tabu search heuristic for
periodic and multi-depot vehicle routing problems. Networks, 30(2), 105-119.

[10] Cordeau, J. F., Laporte, G., & Mercier, A. (2001). A unified tabu search heuristic
for vehicle routing problems with time windows. Journal of the Operational
Research Society, 52(8), 928-936.

[11] Crevier, B., Cordeau, J. F., & Laporte, G. (2007). The multi-depot vehicle
routing problem with inter-depot routes. European Journal of Operational
Research, 176(2), 756-773.

[12] Dijkstra, E. W. (1959). A note on two problems in connexion with graphs.
Numerische Mathematik, 1(1), 269-271.

[13] Divsalar, A., Vansteenwegen, P., Sörensen, K., & Cattrysse, D. (2014). A
memetic algorithm for the orienteering problem with hotel selection. European
Journal of Operational Research, 237(1), 29-49.

[14] Gendreau, M., Hertz, A., & Laporte, G. (1994). A tabu search heuristic for the
vehicle routing problem. Management science, 40(10), 1276-1290.


[15] Glover, F. (1989). Tabu search - Part I. ORSA Journal on Computing, 1(3), 190-206.

[16] Hirschberg, D. S. (1975). A linear space algorithm for computing maximal
common subsequences. Communications of the ACM, 18(6), 341-343.

[17] Kataoka, S., & Morito, S. (1988). An algorithm for single constraint maximum
collection problem. Journal of the Operations Research Society of Japan, 31(4),
515-531.

[18] Kim, B. I., Kim, S., & Sahoo, S. (2006). Waste collection vehicle routing problem
with time windows. Computers & Operations Research, 33(12), 3624-3642.

[19] Lin, S. (1965). Computer solutions of the traveling salesman problem. Bell
System Technical Journal, 44(10), 2245–2269.

[20] Lin, S., & Kernighan, B.W. (1973). An Effective Heuristic Algorithm for the
Traveling-Salesman Problem. Operations Research, 21(2), 498-516.

[21] Martı́, R., Duarte, A., & Laguna, M. (2009). Advanced scatter search for the
max-cut problem. INFORMS Journal on Computing, 21(1), 26-38.

[22] Merz, P., & Freisleben, B. (1999). A comparison of memetic algorithms, tabu
search, and ant colonies for the quadratic assignment problem. Proceedings of
the 1999 Congress on Evolutionary Computation, Vol.3, pp. 2063–2070.

[23] Oliver, I. M., Smith, D. J., & Holland, J. R. C. (1987). A Study of Permutation
Crossover Operators on the Traveling Salesman Problem. Proceedings of the
Second International Conference on Genetic Algorithms and their Application,
pp. 224-230.

[24] Prins, C. (2004). A simple and effective evolutionary algorithm for the vehicle
routing problem. Computers & Operations Research, 31(12), 1985-2002.

[25] Prins, C., & Bouchenoua, S. (2005). A Memetic Algorithm Solving the VRP,
the CARP and General Routing Problems with Nodes, Edges and Arcs. Recent
Advances in Memetic Algorithms. Springer Berlin Heidelberg, pp. 65-85.

[26] Puljić, K., & Manger, R. (2013). Comparison of eight evolutionary crossover
operators for the vehicle routing problem. Mathematical Communications, 18(2), 359-375.

[27] Bektas, T. (2006). The multiple traveling salesman problem: an overview
of formulations and solution procedures. Omega, 34(3), 209-219.

[28] Schneider, M., Stenger, A., & Goeke, D. (2013). The electric vehicle-routing
problem with time windows and recharging stations. Transportation Science,
48(4), 500-520.

[29] Soak, S. M., & Ahn, B. H. (2003). New subtour-based crossover operator for
the TSP. International Conference on Genetic and Evolutionary Computation.
Springer-Verlag, vol. 2724, pp. 1610-1611.


[30] Solomon, M.M. (1987). Algorithms for the vehicle routing and scheduling
problems with time window constraints. Operations Research, 35(2), 254-265.

[31] Sousa, M. M., Ochi, L. S., Coelho, I. M., & Goncalves, L. B. (2015). A variable
neighborhood search heuristic for the traveling salesman problem with hotel
selection. Conferencia LatinoAmericana de Informática, pp. 1-12.

[32] Tarantilis, C. D., Zachariadis, E. E., & Kiranoudis, C. T. (2008). A
hybrid guided local search for the vehicle-routing problem with intermediate
replenishment facilities. INFORMS Journal on Computing, 20(1), 154-168.

[33] Toth, P., & Vigo, D. (2008). The vehicle routing problem. Siam Monographs
on Discrete Mathematics & Applications, 94(2-3), 127-153.

[34] Ullman, J. D., Aho, A. V., & Hirschberg, D. S. (1976). Bounds on the complexity
of the longest common subsequence problem. Journal of the ACM, 23(1), 1-12.

[35] Vansteenwegen, P., Souffriau, W., & Sörensen, K. (2012). The traveling
salesman problem with hotel selection. Journal of the Operational Research
Society, 63(2), 207-217.

[36] Wei, L., Qin, H., Zhu, W., & Wan, L. (2015). A study of perturbation operators
for the pickup and delivery traveling salesman problem with lifo or fifo loading.
Journal of Heuristics, 21(5), 617-639.