
Computers & Industrial Engineering 136 (2019) 70–79


Computers & Industrial Engineering


journal homepage: www.elsevier.com/locate/caie

Optimal mathematical programming for the warehouse location problem with Euclidean distance linearization
Meng You, Yiyong Xiao, Siyue Zhang, Pei Yang, Shenghan Zhou

School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China

ARTICLE INFO

Keywords: Warehouse location problem; Mixed-integer linear programming; Clustering; Optimization

ABSTRACT

The warehouse location problem (WLP) involves determining one (or multiple) locations as the materials/products collecting/distributing centers for serving a group of customers scattered geographically in a region, at a minimum total transportation cost. The most conventional and widely used approach for solving the WLP is the weighted k-means algorithm. However, this is not a global approach, because it is easily trapped in local optima and is sensitive to the initial settings. Our numerical examples demonstrate that the solutions obtained by the weighted k-means can depart from the optimal values by as much as 16.8% on average. In this paper, we present an optimal programming approach based on mixed-integer linear programming (MILP) for the WLP, which is independent of the initial solution and can be optimally solved by commercial solvers. For large-sized datasets, we developed an MILP-based dynamic iterative partial optimization (MILP-DIPO) to search for near-optimal results with controllable computational time. Experiments on 14 datasets, including 6 small-sized synthesized datasets and 8 variants of the known benchmark datasets in the UEF repository, were performed to validate the proposed model and heuristics. The computational results confirm that improvements with the proposed method could be as great as −22.9% (−14.0% on average) for small-sized datasets. For the eight benchmark datasets, the MILP-DIPO algorithm delivered near-optimal solutions in a reasonable computational time, with up to −8.0% (−2.6% on average) improvement compared to the results obtained by the conventional weighted k-means algorithm.

1. Introduction

The warehouse location problem (WLP) plays an important role in the cost-efficient operation of a logistics system. It can be briefly described as how to identify the best locations for a number of dispatching/collecting warehouse centers, to serve the surrounding customers at the minimum transport cost measured by the total distances between the customers and facility centers weighted by the logistics volumes (i.e., demands). German scholar Alfred Weber first recognized the industrial layout design as a continuous site selection problem, called the Weber problem, which seeks to determine a point in the plane that minimizes the sum of the transportation costs from this point to n destination points (Friedrich, 1929). In the following years, the site selection problem has inspired several major variants such as the set covering problem, center location problem, and p-median problem (Brandeau & Chiu, 1989; Wang, 2006).

The WLP has been frequently considered a clustering problem, and there are a number of methods in the literature developed to solve this type of clustering problem. An earlier version of WLP can be found in Hakimi (1964), who developed the first model to evaluate the optimum locations for police stations in the communication network of a highway system. Krarup and Pruzan (1983) studied an important family of discrete, deterministic, single-criterion, NP-hard, and widely applicable optimization problems known as the simple plant location problem. Sridharan (1995) studied the capacitated plant location problem considering each plant with a maximum capacity, and examined two innovative concepts: the Lagrangian heuristic and variable splitting. In addition to the static and deterministic models mentioned above, Owen and Daskin (1998) considered the effects of future event uncertainties including time-varying environmental factors, population shift, and market development trends, and reviewed the literature offering significant improvements on solving the facility location problem. Snyder (2006) reviewed the stochastic and robust facility location models existing in the literature and summarized a rich variety of approaches and their applications in different fields (see Fig. 1).


Corresponding author.
E-mail addresses: youmdyx@buaa.edu.cn (M. You), Xiaoyiyong@buaa.edu.cn (Y. Xiao), zhang_sy@buaa.edu.cn (S. Zhang), yangpei@buaa.edu.cn (P. Yang),
zhoush@buaa.edu.cn (S. Zhou).

https://doi.org/10.1016/j.cie.2019.07.020
Received 25 April 2019; Received in revised form 7 July 2019; Accepted 8 July 2019
Available online 09 July 2019
0360-8352/ © 2019 Elsevier Ltd. All rights reserved.
M. You, et al. Computers & Industrial Engineering 136 (2019) 70–79

Fig. 1. Framework of weighted k-means algorithm for WLP.
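
As a rough illustration of the weighted k-means framework that Fig. 1 refers to, the following minimal Python sketch alternates a nearest-center assignment with a demand-weighted mean update. It is a generic Lloyd-style sketch and not necessarily the exact procedure of the figure; the function name `weighted_kmeans` and its `init`, `iters`, and `seed` parameters are assumptions introduced here for illustration.

```python
import math
import random

def weighted_kmeans(points, demands, k, init=None, iters=100, seed=0):
    """Generic weighted k-means sketch (illustrative, not the paper's exact steps)."""
    rng = random.Random(seed)
    # Initial centers: caller-supplied, or k customers drawn at random (assumption).
    centers = list(init) if init is not None else rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each customer joins its nearest center.
        new_assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        # Update step: each center moves to the demand-weighted mean of its customers.
        new_centers = []
        for c in range(k):
            members = [i for i, a in enumerate(new_assign) if a == c]
            w = sum(demands[i] for i in members)
            if w == 0:  # empty cluster: keep the previous center
                new_centers.append(centers[c])
                continue
            new_centers.append((sum(demands[i] * points[i][0] for i in members) / w,
                                sum(demands[i] * points[i][1] for i in members) / w))
        if new_assign == assign and new_centers == centers:
            break  # neither assignments nor centers changed
        assign, centers = new_assign, new_centers
    # WLP-style cost: demand-weighted Euclidean distance to the assigned center.
    cost = sum(demands[i] * math.dist(points[i], centers[assign[i]])
               for i in range(len(points)))
    return centers, assign, cost

# Four unit-demand customers on a 10 x 1 rectangle, two different initializations.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
demands = [1, 1, 1, 1]
_, _, cost_a = weighted_kmeans(points, demands, 2, init=[(0, 0), (0, 1)])
_, _, cost_b = weighted_kmeans(points, demands, 2, init=[(0, 0), (10, 0)])
print(cost_a, cost_b)  # 20.0 2.0
```

In this toy instance the two initializations converge to different fixed points, which is the sensitivity to initial settings that the paper attributes to the weighted k-means approach.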

With further consideration of the practical factors and environmental conditions, the latest models for WLP have become considerably closer to a practical application. García et al. (2014) and Huang, Wang, Rajan, and Rakesh (2015) considered the nature of stored goods, and included the effects of time and price on demand. Further to the quantitative factors, Demirel, Nihan, and Cengiz (2010) also considered qualitative factors that could be inaccurate, incomplete, or vague, and adopted a multi-criteria method for evaluating the solution result of the WLP. Recently proposed models of the WLP in the literature, such as Maharjana and Shinya (2017) and Shahabi, Tafreshian, Unnikrishnan, and Boyles (2018), have considered more practical factors such as development, disaster safety, transportation accessibility, and demand change to obtain more realistic solutions.

The k-means-like algorithms are effective and frequently used methods for warehouse or facility location problems. Gupta and Tangwongsan (2008) proposed local search algorithms for metric instances of facility location problems including the uncapacitated facility location problem (UFLP), as well as uncapacitated versions of the k-median, k-center, and k-means problems. Liao and Guo (2010) transformed the Capacitated Facility Location Problem (such as the warehouse location problem) into a clustering model and developed a weighted k-means clustering method. However, drawbacks inherent to the classical WLP algorithm also exist in the weighted k-means clustering algorithm, including uncontrollable error, results being influenced by the initial solution, and not being able to guarantee an optimal solution. The latest survey on the k-means algorithm and several expanded methods can be found in Blömer, Lammersen, Schmidt, and Sohler (2016). In addition to the weighted k-means, the fuzzy c-means is another popular algorithm employed for the WLP; it uses a fuzzy membership function to assign one object to multiple clusters with different degrees of belongingness (Zalik, 2006; Küçükdeniz & Büyüksaatçi, 2008; Küçükdeniz, Baray, Ecerkale, & Esnaf, 2012; Zhang, Kaku, & Xiao, 2012; Şakir, Küçükdeniz, & Tunçbilek, 2014).

However, the majority of the mathematical models for the WLP existing in the literature are nonlinearly formulated and are solved by using clustering algorithms (such as the weighted k-means) or heuristic optimization algorithms (such as the genetic algorithm (GA), variable neighborhood search (VNS), or Simulated Annealing (SA)) (Xiao, Kaku, Zhao, Zhang, 2011, 2012; Xiao and Konak, 2016, 2017; Lin & Wang, 2018; Adasme, 2018). Therefore, the obtained results are not guaranteed to be globally optimal and, according to our comparative experiments, had potentially intolerable deviations from the optimal solutions. The main reason leading to this phenomenon is related to the use of the weighted mean center as the cluster center, which can deviate considerably from the optimal cluster center, especially when customers' locations and demands are unevenly distributed. Lin and Wang (2018) proposed a mixed-integer nonlinear model for the WLP and developed a two-stage solution approach that uses a GA for the first stage and a gradient method for the second stage. Adasme (2018) proposed a new mixed-integer linear programming (MILP) model for the p-median problem and proposed a VNS-based meta-heuristic algorithm as the solution approach. Similar works can be found in Mladenovic, Brimberg, Hansen, and Perez (2007), Herda (2017), Kocatepe, Ozguven, Horner, and Ozel (2017), and Adler, Njoya, and Volta (2018). However, the p-median requires the cluster centers to be selected from a set of candidate sites that are known beforehand. The WLP involves determining the continuous location coordinates of the warehouses/facilities in a region; the Euclidean distance metrics always cause the mathematical models to be nonlinear and thereby only non-optimal solutions can be obtained with heuristic processes.

In this study, we formulate the WLP as an MILP model, referred to as WLP-MILP, with the Euclidean distance linearization method introduced by Xie, Zhou, Xiao, Sadan, and Konak (2018). Compared to the nonlinear models existing in the literature, the proposed WLP-MILP model overcomes the previously mentioned drawbacks, and can be optimally solved using a commercial MILP solver. Furthermore, the linearization approximation method can guarantee the maximum deviation between actual and approximated distances within a small error range; hence, the solutions can be deemed as practically optimal. For large-scale problems that cannot be solved to optimality by MILP solvers, we develop a fix-and-optimize heuristic approach based on dynamic iterative partial optimization to obtain near-optimal solutions within a controllable CPU time. Computational experiments were conducted to demonstrate the accuracy and computational efficiency of the proposed approach. The results were compared to those obtained by the conventional weighted k-means algorithm, which demonstrated that the optimal/best solutions obtained by the proposed method are considerably superior to those obtained by the traditional methods (−14.0% improvement on average).

The remainder of this paper is organized as follows. In Section 2, we introduce the WLP and its conventional solution approach, i.e., the weighted k-means algorithm. In Section 3, an MILP model is presented for the WLP, in addition to the MILP-based dynamic iterative partial optimization (MILP-DIPO) heuristic algorithm. In Section 4, computational experiments are performed on benchmark datasets to test the proposed model and heuristic algorithm, and compare the obtained results with those of the conventional weighted k-means algorithm. Finally, the conclusions are presented in Section 5.

2. Warehouse location problem

The WLP can be described as follows: A set $N$ of customers are to be served by a set $K$ of warehouses. Each customer $i$, where $i \in N$, is associated with a demand $a_i$ and a distance $d_{ik}$ to each $k$ in $K$. The problem is to determine the optimal locations for the warehouses (by coordinate variables $x_k$ and $y_k$) and the assignments of customer to warehouse (by binary variable $w_{ik}$) with the objective of minimizing the total transportation cost measured by the sum of the product of the customer's
demand and distance to the warehouse to which it belongs. A mathematical formulation of the WLP can be described as follows (Liao & Guo, 2010):

Problem WLP:

$$\min\; F(X, W) = \sum_{i \in N} \sum_{k \in K} w_{ik} \cdot d_{ik} \cdot a_i \qquad (1)$$

Subject to:

$$\sum_{k \in K} w_{ik} = 1, \quad \forall i \in N \qquad (2)$$

$$d_{ik} = \sqrt{(x_i - x_k)^2 + (y_i - y_k)^2}, \quad \forall i \in N,\; k \in K \qquad (3)$$

$$w_{ik} \in \{0, 1\};\; x_k \ge 0;\; y_k \ge 0; \quad \forall i \in N,\; k \in K \qquad (4)$$

In the above formulations, $N$ represents the set of customers, and $a_i$ and $(x_i, y_i)$ represent the demand and coordinates of customer $i$, respectively. Terms $(x_k, y_k)$ are the coordinates of warehouse $k$, where $(x_k, y_k) \in X$ and $k \in K$. Variable $w_{ik}$ indicates if customer $i$ is served by warehouse $k$ ($w_{ik} = 1$) or not ($w_{ik} = 0$). The objective function in Eq. (1) is to minimize the total weighted distance of customers to their assigned warehouses. Constraint (2) requires that each customer be assigned to exactly one warehouse. Constraint (3) computes the Euclidean distances of customers to their assigned warehouse. Constraint (4) defines the value domain of the decision variables.

The traditional steps of the weighted k-means algorithm for the WLP (Chen, Yin, Tu, & Zhang, 2009) are outlined in Fig. 1. However, this algorithm has two drawbacks: (1) it is a greedy process that is quite likely to become trapped in a local optimum, and (2) the final results are sensitive to the initial warehouse locations. We provide an example in Fig. 2 to demonstrate how the weighted k-means algorithm can fail to identify the optimal solution, where a warehouse location must be determined for 4 customers represented by circular points "1", "2", "3", and "4" with coordinates (0, 1), (0, 0), (0, −1), and (4, 0), respectively. Each customer has a demand of 1. The algorithm determines the weighted mean coordinates of the four customers at point A (1, 0), with the objective function value of $1 + 3 + 2\sqrt{2} \approx 6.83$. However, we can easily find a reduced objective value of 6 if the warehouse location is set at another point B (0, 0). Further, to demonstrate the effect of different demands on the warehouse location optimization, we assume another two points, P (1, 0) and Q (4, 0), that have demands of 1 and 3, respectively. According to the weighted k-means algorithm, the weighted center location for them should be (3.25, 0), with a cost of 4.5. However, one can easily verify that the cost can be reduced to 3 if point Q (4, 0) is used as the center location (a reduction of about 33%; equivalently, the weighted-center cost is 50% higher).

Fig. 2. Simple example of WLP.

3. Mathematical programming model and heuristic approach

In this section, we present the linear model for the WLP with mathematical programming techniques, which is directly solvable using a commercial MILP solver (such as CPLEX or Lingo) for small-sized datasets. For large datasets that cannot be directly solved within an acceptable CPU time, we present a fix-and-optimize heuristic approach for obtaining a near-optimal solution. We hereby refer to the datasets that can be optimally solved by an MILP solver as small-sized, and refer to those that cannot be solved to optimality within an acceptable CPU time (within 24 h) as large-sized.

3.1. Mixed-integer linear programming model

In this subsection, we model the WLP as an MILP. The parameters and decision variables used to describe the MILP model are summarized as follows:

Parameters:

$N$                  set of customers, $N = \{1, 2, 3, \ldots, n\}$
$i$                  index of customers, $i \in N$
$K$                  set of warehouses/facilities, $K = \{1, 2, 3, \ldots, m\}$
$k$                  index of warehouses/facilities, $k \in K$
$(X_i, Y_i)$         coordinates of customer $i$
$X_{min}, X_{max}$   minimum and maximum coordinates of customers in the x direction
$Y_{min}, Y_{max}$   minimum and maximum coordinates of customers in the y direction
$a_i$                amount of demand (or service) of customer $i$
$\beta$              a given maximum error percentage allowed in the linearization approximation of the Euclidean distance
$\theta$             a constant angle calculated by $\theta = \arccos(1 + 4\beta + 2\beta^2)$ that is used to determine the number of linear constraints for Euclidean distance linearization
$q$                  an integer calculated by $q = \lceil \pi / (2\theta) \rceil$, indicating the number of linear constraints for Euclidean distance linearization, where $\lceil \cdot \rceil$ is the ceiling operator
$M$                  a large number

Decision variables:

$u_{ik}$             binary decision variable denoting whether customer $i$ is serviced by warehouse $k$
$(x_k, y_k)$         non-negative continuous variables denoting the coordinates of warehouse $k$
$d^x_{ik}$           non-negative continuous variable denoting the distance between customer $i$ and warehouse $k$ in the x-axis direction
$d^y_{ik}$           non-negative continuous variable denoting the distance between customer $i$ and warehouse $k$ in the y-axis direction
$d_{ik}$             non-negative continuous variable denoting the Euclidean distance between customer $i$ and warehouse $k$

The objective function is to minimize the total logistics volume measured as the sum of the distance from each customer to its corresponding warehouse center weighted by the demand amount. Thus, the WLP can be formulated as an MILP model as follows:

Problem WLP-MILP:

$$\min\; Total\_Log = \sum_{k=1}^{m} \sum_{i=1}^{n} d_{ik} \cdot a_i \qquad (5)$$

Subject to:

$$\sum_{k=1}^{m} u_{ik} = 1, \quad \forall i \in N \qquad (6)$$

$$d^x_{ik} \ge X_i - x_k - M(1 - u_{ik}), \quad \forall i \in N,\; k \in K \qquad (7)$$

$$d^x_{ik} \ge x_k - X_i - M(1 - u_{ik}), \quad \forall i \in N,\; k \in K \qquad (8)$$

$$d^y_{ik} \ge Y_i - y_k - M(1 - u_{ik}), \quad \forall i \in N,\; k \in K \qquad (9)$$

$$d^y_{ik} \ge y_k - Y_i - M(1 - u_{ik}), \quad \forall i \in N,\; k \in K \qquad (10)$$

$$d_{ik} \ge d^x_{ik} \cos(p\theta) + d^y_{ik} \sin(p\theta), \quad \forall i \in N,\; k \in K,\; p = 0, 1, 2, \ldots, q - 1 \qquad (11)$$
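
To make the role of $\theta$ and $q$ in Constraint (11) concrete, the short Python sketch below evaluates $\theta = \arccos(1 + 4\beta + 2\beta^2)$ and $q = \lceil \pi / (2\theta) \rceil$ for several values of $\beta$, and computes the smallest value of $d$ that satisfies the $q$ inequalities for given axis distances. The function names `linearization_params` and `linearized_distance` are hypothetical helpers introduced here; the sketch only evaluates the formulas and is not a solver model.

```python
import math

def linearization_params(beta):
    """Return (theta, q) for a maximum relative error beta (negative, e.g. -0.001)."""
    theta = math.acos(1 + 4 * beta + 2 * beta ** 2)
    q = math.ceil(math.pi / (2 * theta))
    return theta, q

def linearized_distance(dx, dy, beta):
    """Smallest d with d >= dx*cos(p*theta) + dy*sin(p*theta) for p = 0, ..., q-1."""
    theta, q = linearization_params(beta)
    return max(dx * math.cos(p * theta) + dy * math.sin(p * theta) for p in range(q))

# Angle and number of constraints for a range of error levels.
for beta in (-0.05, -0.01, -0.005, -0.001, -0.0005, -0.0001):
    theta, q = linearization_params(beta)
    print(f"beta={beta:+.2%}  theta={theta:.4f}  q={q}")

# Compare the linearized value with the exact Euclidean distance for dx=3, dy=4.
approx = linearized_distance(3.0, 4.0, beta=-0.001)
exact = math.hypot(3.0, 4.0)
print(approx, exact, (approx - exact) / exact)
```

Each inequality is the projection of the vector $(d^x_{ik}, d^y_{ik})$ onto a unit direction, so the linearized value never exceeds the exact Euclidean distance; a smaller $|\beta|$ adds directions and tightens the approximation from below.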


$$u_{ik} \in \{0, 1\};\; x_k \ge 0;\; y_k \ge 0;\; d^x_{ik} \ge 0;\; d^y_{ik} \ge 0;\; d_{ik} \ge 0; \quad \forall i \in N,\; k \in K \qquad (12)$$

In the above formulations, Constraint (6) guarantees that each customer $i$ is assigned to one warehouse. Constraints (7) and (8) require that variable $d^x_{ik}$ have a value that is no less than the distance from customer $i$ to warehouse $k$ in the x-axis direction. Similarly, Constraints (9) and (10) require that variable $d^y_{ik}$ have a value that is no less than the distance from customer $i$ to warehouse $k$ in the y-axis direction. Constraint (11) uses the inequalities developed in Xie et al. (2018) to approximate the Euclidean distance $d_{ik} = \sqrt{(d^x_{ik})^2 + (d^y_{ik})^2}$ within a maximum allowed error rate $\beta$. The relation of parameters $\theta$, $q$, and $\beta$ is displayed in Table 1. In the experimental section, we used $\beta = -0.1\%$; pilot experiments indicated that this setting satisfies most practical requirements for solution accuracy and does not significantly increase the computational burden. Constraint (12) defines the value domains of all the variables.

Table 1
Parameters θ and q with respect to different β values.

β    −5.00%   −1.00%   −0.50%   −0.1%    −0.05%   −0.01%
θ    0.6351   0.2831   0.2001   0.0895   0.0632   0.0283
q    3        6        8        18       25       56

The following Constraint (13) is used to break the clustering symmetry of the solution space, which reduces the computational time by approximately 50% according to our experimental comparison. This is because solutions with the same locations of cluster centers can be deemed as the same, though the clusters could be numbered differently.

$$x_{k-1} \le x_k \;\text{ or }\; y_{k-1} \le y_k, \quad \forall k \in K,\; k > 1 \qquad (13)$$

3.2. Fix-and-optimize solution approach

The fix-and-optimize heuristic is a concrete adaptation of the dynamic iterative partial optimization strategy to solve the WLP-MILP model (Xiao, Kaku, et al., 2011; Xiao, Zhang, Kaku, 2011; Xiao et al., 2014; Xiao and Konak, 2016, 2017). Its basic principle is described as follows: For a large-sized complex problem with multiple decision variables that cannot be solved optimally within an acceptable CPU time, a partial optimization can be applied on only a smaller set of the selected decision variables while fixing a majority of the other variables with given values. Thus, the selected decision variables can be efficiently optimized in minimal CPU time. This is then repeated, selecting a different smaller set of the decision variables to fulfill the partial optimization iteratively, until no further improvement can be made to the objective function after a given number of consecutive attempts.

In WLP-MILP, the variable $u_{ik}$ is an independent decision variable that determines the membership of customers to warehouses. Other variables, including $x_k$, $y_k$, $d^x_{ik}$, $d^y_{ik}$, and $d_{ik}$, are dependent on $u_{ik}$. Thus, the fix-and-optimize heuristic was designed to conduct dynamic iterative partial optimizations on variable $u_{ik}$ using an MILP solver (this was named MILP-DIPO). The MILP-DIPO algorithm starts from a random initialization of the variable $u_{ik}$ and calculates the objective function using an MILP solver (e.g., CPLEX). Then, it repeatedly selects a set of $u_{ik}$ to fulfill the partial optimization while fixing the remaining part of $u_{ik}$ with their current values until the objective function cannot be further improved after a given number of attempts or a time limit. The number of variables to be selected in the next attempt of partial optimization is dynamically adjusted according to the CPU time used in the last attempt. The main steps of the MILP-DIPO algorithm for WLP are outlined in Fig. 3.

Fig. 3. Main framework of MILP-DIPO algorithm.

To improve the efficiency and accuracy of the algorithm, the priority policies introduced in Xiao, Zuo, Kaku, Zhou, and Pan (2019) are applied in Step 3(B) to guide the selection of the $p$ percentage of $u_{ik}$. The first is the Weighted Distance Priority Policy (WDPP), which assumes that customers with greater weighted distances to their assigned warehouse centers should have higher probabilities to be selected. The underlying philosophy is that because these customers are farther from their warehouse centers, they contribute more to the objective function and are more likely to shift their memberships to reduce the cost function. The second policy is called the Time Priority Policy (TPP). This suggests that a customer waiting for a longer time (not selected) should have a higher probability to be selected next time. Under TPP, each customer is associated with a recorder that counts how many times it has been selected, and the customers with lower recorded counts should have higher probabilities to be selected. Further, we also use the random selection policy (RSP) to guide the iterative partial optimization. The effect of mixed use of these three policies was verified in our computational experiments.

In the above MILP-DIPO algorithm, the CPU time was primarily consumed by Step 3(D), where the MILP solver is called to implement a partial optimization on the WLP-MILP model. For a given parameter $p$, its computational time can be deemed as a constant represented by $T(p)$. We adopted a dynamic strategy to automatically adjust parameter $p$ in Steps 3(F) and 3(G), where parameter $p$ increases by 1% if the used CPU time is less than a given minimum threshold $t_{min}$ (e.g., 1 s), or decreases by 1% if the CPU time used is greater than a given maximum threshold $t_{max}$ (e.g., 10 s). The algorithm's computational complexity can be judged by how many times the MILP solver is called, which is determined linearly by the parameter $C_{max}$. Thus, we can roughly estimate the complexity of the algorithm as $O(n \times m \times C_{max} \times T(p))$, where $n$ and $m$ correspond to the numbers of customers and warehouses and $T(p)$ represents the CPU time used by the MILP solver under parameter $p$. The conventional weighted k-means algorithm is known for its high computational efficiency and can be estimated with a polynomial complexity of $O(k \times n)$, where $k$ is the number of clusters and $n$ is the number of customers.

4. Computational experiments

The computational experiments were performed on a Linux PC server with two 2.30 GHz Intel Xeon (R) CPUs and 128 GB RAM. The MILP solver AMPL/CPLEX (version 12.6.0.1) was used to solve the tested instances. The experiment consisted of two parts. The first part was on six synthesized small datasets to test the optimality effect of the proposed WLP-MILP model compared to the traditional weighted k-means algorithm. The second part was to test the performance of the proposed MILP-DIPO algorithm on solving benchmark instances existing in the literature. All the test datasets and optimal solutions are available on the Mendeley Data Repository (Xiao, 2019) at link https://data.mendeley.com/datasets/46kw4zxxnw/draft?a=76812ccd-543a-49eb-8f44-fc12d74b3226.

4.1. Experiments on testing optimality effect

First, we randomly synthesized 6 small-sized instances (named "R1" to "R6") with customer numbers from 5 to 40, and cluster numbers from 2 to 3 to illustrate the difference between the optimal solutions by the WLP-MILP model and the heuristic solutions by the weighted k-means algorithm. The customer locations were scattered randomly in a 100 × 100 plane region and their demands were generated randomly in the range [1, 100]. We first applied the WLP-MILP model to obtain their optimal solutions using the CPLEX solver. We then used the weighted k-means algorithm to solve them repeatedly ten times and obtained ten heuristic solutions. The results are displayed and compared in Table 2, where column k indicates the number of clusters, column n indicates the number of customers, and column Obj. indicates the objective values obtained by the WLP-MILP model and by the weighted k-means method. The terms Min. and AVG. represent, respectively, the minimum and
average values obtained by the ten runs.

It can be observed that the optimal solutions by the WLP-MILP model are on average −14.0% less than the heuristic solutions by the weighted k-means (or the latter departs from the former by 16.8% on average), and the maximum improvement exceeded 20% in magnitude for instances R1 and R2. The average improvement by the new approach was −8.1% for the tested instances, even compared to the best solutions found by the weighted k-means algorithm. In Figs. 4–9, we compare the optimal solutions by the WLP-MILP model and the best solution delivered by the weighted k-means algorithm, respectively, where hollow "squares" represent customers, filled "circles" represent warehouses (i.e., cluster centers), and customers in the same cluster are plotted in the same color and outlined in dotted ellipses. It can be observed that the solutions obtained by the different methods have considerably different clustering centers. This means that noticeably different locations have been suggested according to the two different methods, even for the same problem instance, with minimum cost rate deviations ranging from −4.3% to −10.6%. As indicated in Fig. 4, for instance R1, even though the customers are clustered identically by both methods, the warehouse locations are considerably different, resulting in a cost deviation of −9.8%. Other instances from R2 to R6 have not only different costs, but also different clustered layouts under the two methods.

Furthermore, we solved these six instances again using the weighted k-means algorithm starting from the optimal centers determined by the WLP-MILP model. That is, we set the optimal centers as the initial seeds to execute the weighted k-means algorithm again and thereby validate the optimality of this traditional algorithm. We received the resulting costs 410.9, 8649.0, 8126.8, 32293.3, 35052.5, and 42280.4 for instances R1 to R6, respectively. These results remain greater than the optimal values by 10.9%, 4.8%, 2.6%, 1.2%, 2.1%, and 2.8%, respectively. Thus, this comparison suggests that the traditional weighted k-means algorithm cannot determine the optimal solutions, regardless of the quality of the initialization it is started from.
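
The same effect can be replayed on the small four-customer example of Fig. 2: a single weighted-mean update started from the better location already moves the center to a worse one. The following Python sketch is only an illustration of that arithmetic; `weighted_mean` and `total_cost` are hypothetical helper names, and the coordinates and demands are those quoted for the Fig. 2 example and for points P and Q.

```python
import math

def weighted_mean(points, demands):
    """Demand-weighted mean of a set of points (the k-means center update)."""
    w = sum(demands)
    return (sum(a * x for a, (x, _) in zip(demands, points)) / w,
            sum(a * y for a, (_, y) in zip(demands, points)) / w)

def total_cost(center, points, demands):
    """Demand-weighted sum of Euclidean distances to a single center."""
    return sum(a * math.dist(center, p) for a, p in zip(demands, points))

# Four unit-demand customers of the Fig. 2 example.
customers = [(0, 1), (0, 0), (0, -1), (4, 0)]
demands = [1, 1, 1, 1]
start = (0, 0)                              # point B
moved = weighted_mean(customers, demands)   # one update step lands on point A
cost_b = total_cost(start, customers, demands)
cost_a = total_cost(moved, customers, demands)
print(moved, cost_b, cost_a)

# Two customers P and Q with demands 1 and 3.
pq, pq_demands = [(1, 0), (4, 0)], [1, 3]
center_pq = weighted_mean(pq, pq_demands)
print(center_pq, total_cost(center_pq, pq, pq_demands), total_cost((4, 0), pq, pq_demands))
```

The weighted mean minimizes the weighted sum of squared distances rather than the weighted sum of distances, which is why the update can increase the WLP objective even when it starts from a better center.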

Table 2
Result comparison on synthesized small-sized datasets.

               WLP-MILP model      10 runs of weighted k-means       Improvements
Ins.  k  n     Obj.       T (s)    AVG obj.   Min. obj.   T (s)      Dev.       Min. %    AVG %
R1    2  5     370.6      <1       465.9      410.9       <1         −40.3      −9.8%     −20.5%
R2    2  8     8253.8     <1       10702.5    9230.7      <1         −976.9     −10.6%    −22.9%
R3    2  10    7923.0     <1       8707.2     8604.3      <1         −681.3     −7.9%     −9.0%
R4    3  30    31919.5    145      36969.6    35340.7     <1         −3421.2    −9.7%     −13.7%
R5    3  35    34344.7    331      37927.1    36630.8     <1         −2286.1    −6.2%     −9.4%
R6    3  40    41121.6    1319     44988.7    42985.1     <1         −1863.5    −4.3%     −8.6%
AVG                                                                             −8.1%     −14.0%

Note: bold face indicates optimal values.
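
The improvement columns of Table 2 can be recomputed from the objective values. The sketch below assumes that an improvement is the relative change of the WLP-MILP objective with respect to the k-means objective, (opt − heur) / heur; this reading is an assumption that happens to reproduce the tabulated percentages. The helper name `improvement` is hypothetical.

```python
def improvement(opt, heur):
    """Relative change of the optimal cost with respect to the heuristic cost."""
    return (opt - heur) / heur

# (instance, WLP-MILP objective, minimum k-means objective over the 10 runs).
# R3 uses 8604.3 as the k-means minimum, the value consistent with Dev. = -681.3.
rows = [
    ("R1", 370.6, 410.9),
    ("R2", 8253.8, 9230.7),
    ("R3", 7923.0, 8604.3),
    ("R4", 31919.5, 35340.7),
    ("R5", 34344.7, 36630.8),
    ("R6", 41121.6, 42985.1),
]

imps = [improvement(opt, heur) for _, opt, heur in rows]
for (name, opt, heur), imp in zip(rows, imps):
    print(f"{name}: Dev. = {opt - heur:9.1f}   Min. % = {imp:+.1%}")

avg_imp = sum(imps) / len(imps)
print(f"AVG: {avg_imp:+.1%}")
```

Under this convention a negative percentage means that the WLP-MILP cost is lower than the k-means cost.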


Fig. 4. Clustering results of R1 by different approaches (n = 5, k = 2).

Fig. 5. Clustering results of R2 by different approaches (n = 8, k = 2).

Fig. 6. Clustering results of R3 by different approaches (n = 10, k = 2).


Fig. 7. Clustering results of R4 by different approaches (n = 30, k = 3).

Fig. 8. Clustering results of R5 by different approaches (n = 35, k = 3).

Fig. 9. Clustering results of R6 by different approaches (n = 40, k = 3).


Table 3
Computational results on benchmark datasets.

                          10 runs of MILP-DIPO               10 runs of weighted k-means       Improvement
Type          k  n      Min. obj.   AVG obj.    T (s)      Min. obj.   AVG obj.    T (s)     Min.     AVG
Flame         2  240    37606.4     37606.4     337        38469.4     38472.1     <1        −2.2%    −2.3%
Jain          2  373    126360.8    126377.5    200        128688.8    128768.8    <1        −1.8%    −1.9%
Pathbased     3  300    69603.3     69603.3     423        70022.0     70026.6     <1        −0.6%    −0.6%
Spiral        3  312    91710.5     91716       625        92043.9     92098.3     <1        −0.4%    −0.4%
S2            3  334    148208.4    156964.8    475        150931.5    168847.2    <1        −1.8%    −7.0%
S1            5  334    108355.3    131303.3    589        115537.2    133704.1    <1        −6.2%    −1.8%
Compound      6  399    53980.1     59493.5     1075       54599.8     64651.5     <1        −1.1%    −8.0%
Aggregation   7  394    66578.3     69279.8     1114.1     66713.6     68477.2     <1        −0.2%    1.2%
AVG                                             604                                          −1.8%    −2.6%

Fig. 10. Trends of incumbent solution by MILP-DIPO algorithm.
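
The two control mechanisms of MILP-DIPO described in Section 3.2, the dynamic adjustment of the fraction p and the priority-based selection of customers, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the lower bound on p, the TPP weight formula, and the weighted sampling scheme (Efraimidis-Spirakis keys) are all choices made here for illustration, and the call to the MILP solver itself is omitted.

```python
import math
import random

def adjust_fraction(p, cpu_time, t_min, t_max, step=0.01):
    """Raise p by one step after a fast solve, lower it after a slow one."""
    if cpu_time < t_min:
        p += step
    elif cpu_time > t_max:
        p -= step
    # Keeping p within [step, 1] is an assumption, not stated in the paper.
    return min(1.0, max(step, p))

def wdpp_weights(points, demands, centers, assign):
    """WDPP: weight = demand times distance to the currently assigned center."""
    return [a * math.dist(pt, centers[c])
            for pt, a, c in zip(points, demands, assign)]

def tpp_weights(times_selected):
    """TPP: customers selected less often get larger weights (assumed 1/(1+t))."""
    return [1.0 / (1 + t) for t in times_selected]

def select_customers(weights, p, rng):
    """Draw ceil(p * n) distinct customers, favoring larger weights."""
    count = max(1, math.ceil(p * len(weights)))
    keys = []
    for i, w in enumerate(weights):
        # Efraimidis-Spirakis key u**(1/w); zero-weight customers rank last.
        key = rng.random() ** (1.0 / w) if w > 0 else 0.0
        keys.append((key, i))
    chosen = sorted(keys, reverse=True)[:count]
    return sorted(i for _, i in chosen)

# Example: two of four customers are freed for the next partial optimization.
rng = random.Random(0)
freed = select_customers([0.0, 5.0, 0.0, 1.0], 0.5, rng)
print(freed, adjust_fraction(0.05, cpu_time=0.5, t_min=1.0, t_max=10.0))
```

In a full implementation, the assignment variables of the selected customers would be left free while all others are fixed, the MILP would be re-solved, and the measured CPU time would be fed back into `adjust_fraction`.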

4.2. Experiments on testing the MILP-DIPO algorithm

In this subsection, we tested the MILP-DIPO algorithm proposed in Section 3.2 using benchmark datasets existing in the literature. The test datasets were from the UEF repository (http://cs.uef.fi/sipu/datasets/) with customer numbers ranging from 240 to 399. We refer to these as "Flame", "Jain", "Pathbased", "Spiral", "S2", "S1", "Compound", and "Aggregation". The customers' demands were randomly generated between 1 and 100. We used the MILP-DIPO algorithm described in Fig. 3 to solve the datasets with parameter settings $p = 5\%$, $C_{max} = 50$, $t_{min} = 5$, and $t_{max} = 5$, and compared the obtained results to those obtained by the weighted k-means algorithm. We repeated both algorithms ten times, and display their average and best solutions in Table 3.

It can be observed that in terms of the best solutions obtained, the MILP-DIPO heuristic outperformed the weighted k-means for all tested datasets, with improvements ranging from −0.2% to −6.2% and an average improvement of −1.8%. In terms of the average improvement, except the instance "Aggregation", the instances demonstrated improvements ranging from −0.4% to −8.0% with an average improvement of −2.6%. It is notable that the overall improvement was not as significant as that of the small-sized datasets in Table 2. This is likely because of two reasons: (1) the solutions by MILP-DIPO are not guaranteed to be optimal and can possibly be further optimized and (2) the uneven distributions of the customers and demands were balanced to an extent by introducing more customers in a random manner, which could have neutralized the solution differences.

In Fig. 10, we select two instances (Compound and Aggregation) to demonstrate the detailed performance of the MILP-DIPO algorithm, where the convergence progress of the incumbent solutions is plotted against the solving time. As can be observed in these figures, the incumbent solution decreased rapidly at the beginning and improved gradually as the partial optimization process continued. The result was delivered after expending the given time length without improvement.

Fig. 11. Comparison of success rates of different operators.

In Fig. 11, we display the operator success rates (OSR) of the three operators used in the iterative partial optimization: WDPP, Frequency Priority Policy (FPP), and RSP. The OSR is defined as the ratio of the number of times an operator improved the incumbent solution to the number of times it was used (Xiao et al., 2016). An operator with a higher OSR indicates greater efficiency in optimizing the problem. Fig. 11 indicates that WDPP and FPP have higher rates than RSP, contributing more improvements to the incumbent solution. Note that we used these three operators with equal probabilities in the MILP-DIPO algorithm. The competitive strategy of Dellaert and Jeunet (2000), where the operator with the higher OSR is given a higher probability of use, could be


Fig. 12. Initialization of MILP-DIPO* algorithm.

Table 4
Computational results with improved initialization.

                       MILP-DIPO* (10 runs)             Weighted k-means (10 runs)       Improvement
Type         k   n     Min. obj.   AVG obj.    T. (s)   Min. obj.   AVG obj.    T. (s)   Min.     AVG
Flame        2   240   37606.4     37607.1     272      38469.4     38472.1     <1       −2.2%    −2.2%
Jain         2   373   126360.8    126360.8    244      128688.8    128768.8    <1       −1.8%    −1.9%
Pathbased    3   300   69603.3     70800.4     340      70022.0     70026.6     <1       −0.6%    1.1%
Spiral       3   312   91686.5     92143.5     478      92043.9     92098.3     <1       −0.4%    0.0%
S2           3   334   148208.4    152860.3    294      150931.5    168847.2    <1       −1.8%    −9.5%
S1           5   334   108355.3    134072.3    320      115537.2    133704.1    <1       −6.2%    0.3%
Compound     6   399   53980.1     59242.5     383      54599.8     64651.5     <1       −1.1%    −8.4%
Aggregation  7   394   66589.8     70215.9     834      66713.6     68477.2     <1       −0.2%    2.5%
AVG                                            395                                       −1.8%    −2.2%
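
The Improvement columns of Table 4 appear to be relative differences with respect to the weighted k-means objective, so that negative values mean a lower transportation cost for MILP-DIPO*. The following minimal Python sketch recomputes them under that assumption. The definition is inferred from the tabulated numbers rather than stated in the text, and the names TABLE4, relative_change, improvement_rows and column_means are hypothetical helpers introduced only for illustration.

```python
# Sketch: recompute the "Improvement" columns of Table 4.
# ASSUMPTION (inferred from the tabulated values, not stated in the text):
#   improvement = (MILP-DIPO* objective - k-means objective) / k-means objective

# name -> ((MILP-DIPO* min, MILP-DIPO* avg), (k-means min, k-means avg)),
# objective values copied from Table 4
TABLE4 = {
    "Flame":       ((37606.4, 37607.1),   (38469.4, 38472.1)),
    "Jain":        ((126360.8, 126360.8), (128688.8, 128768.8)),
    "Pathbased":   ((69603.3, 70800.4),   (70022.0, 70026.6)),
    "Spiral":      ((91686.5, 92143.5),   (92043.9, 92098.3)),
    "S2":          ((148208.4, 152860.3), (150931.5, 168847.2)),
    "S1":          ((108355.3, 134072.3), (115537.2, 133704.1)),
    "Compound":    ((53980.1, 59242.5),   (54599.8, 64651.5)),
    "Aggregation": ((66589.8, 70215.9),   (66713.6, 68477.2)),
}


def relative_change(new_obj, base_obj):
    """Relative change of new_obj w.r.t. base_obj in percent (negative = cheaper)."""
    return 100.0 * (new_obj - base_obj) / base_obj


def improvement_rows(table):
    """Return name -> (min improvement %, avg improvement %), unrounded."""
    return {
        name: (relative_change(m_min, k_min), relative_change(m_avg, k_avg))
        for name, ((m_min, m_avg), (k_min, k_avg)) in table.items()
    }


def column_means(rows):
    """Arithmetic mean of the two improvement columns over all instances."""
    n = len(rows)
    mean_min = sum(r[0] for r in rows.values()) / n
    mean_avg = sum(r[1] for r in rows.values()) / n
    return mean_min, mean_avg


if __name__ == "__main__":
    rows = improvement_rows(TABLE4)
    for name, (imp_min, imp_avg) in rows.items():
        print(f"{name:<12} min {imp_min:+.1f}%   avg {imp_avg:+.1f}%")
    mean_min, mean_avg = column_means(rows)
    print(f"{'AVG':<12} min {mean_min:+.1f}%   avg {mean_avg:+.1f}%")
```

Under this reading, a positive entry such as the average-objective value for Aggregation means that the ten-run average of MILP-DIPO* was worse than that of the weighted k-means, even though its best run was better.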

helpful in accelerating and improving the optimizing process.
Next, we re-executed the MILP-DIPO algorithm with an improved initialization. That is, before starting the iterative partial optimization on variable uik, we applied the partial optimization on variables (xk, yk) and uik alternately as a whole. The initialization is depicted in Fig. 12 and the computational results are presented in Table 4, where the MILP-DIPO algorithm with the new initialization method is denoted as MILP-DIPO*. Compared to the performance of MILP-DIPO in Table 3, MILP-DIPO* provided a similar level of solution quality in terms of both the minimum and average improvements over the results of the weighted k-means algorithm. However, MILP-DIPO* was more efficient, reducing the average computational time by 34%.

5. Conclusions

This study developed an optimal mathematical programming approach for solving the warehouse location problem (WLP) with truly optimal solutions. The new approach resolved the two known drawbacks of the conventional weighted k-means algorithm: (1) a tendency to become trapped in local optima, and (2) sensitivity to the initial locations. The comparative experiments demonstrated that the new approach consistently delivered superior solutions to those of the traditional weighted k-means algorithm for small-sized instances. The average cost reductions were considerable, ranging from −8.6% to −22.9% across the tested instances. This cost-saving potential could be highly significant to logistics companies in real-life applications, but has received little attention from researchers and practitioners in the literature. This paper is the first to bring this issue to the fore and to quantify the potentially large gaps between the traditional solutions and the truly optimal ones for the WLP model.
The proposed MILP-DIPO algorithm proved capable of obtaining better heuristic solutions than the traditional weighted k-means for real-scale instances. The tested problem sizes ranged from 240 to 399 customers, which can be deemed medium-sized from the perspective of real-life applications. However, the new algorithm has efficiency drawbacks for large-sized problems and does not guarantee solution optimality in such situations. Future work is planned in two directions: (1) to develop more efficient and effective heuristics for solving large-sized problem instances, and (2) to consider a capacitated WLP with limited warehouses and stochastic customer demands.

Acknowledgments

This study was partly funded by the National Natural Science Foundation of China under Grant Nos. 71871003 and 71571004, and the Fundamental Research Funds for the Central Universities (grant number YWF-19-BJ-J-330).

References

Adasme, P. (2018). P-Median based formulations with backbone facility locations. Applied Soft Computing, 67, 261–275.
Adler, N., Njoya, T. E., & Volta, N. (2018). The multi-airline p-hub median problem applied to the African aviation market. Transportation Research Part A, 107, 187–202.
Blömer, J., Lammersen, C., Schmidt, M., & Sohler, C. (2016). Theoretical analysis of the k-means algorithm-a survey. Algorithm Engineering, 9220, 81–116.
Brandeau, M. L., & Chiu, S. S. (1989). An overview of representative problems in location research. Management Science, 35(6), 645–674.
Chen, X., Yin, W., Tu, P., & Zhang, H. (2009). Weighted k-means algorithm based text clustering. IEEE Computer Society, 253, 51–55.
Dellaert, N., & Jeunet, J. (2000). Solving large unconstrained multilevel lot-sizing problems using a hybrid genetic algorithm. International Journal of Production Research, 38, 1083–1099.
Demirel, T., Nihan, C. D., & Cengiz, K. (2010). Multi-criteria warehouse location selection using Choquet integral. Expert Systems with Applications, 37, 3943–3952.
Friedrich, J. C. (1929). Alfred Weber's theory of the location of industries. American Journal of Sociology, 35(5), 853.
García, Alvarado, Blanco, Jiménez, & Maldonado, Cortés (2014). Multi-attribute evaluation and selection of sites for agricultural product warehouses based on an analytic hierarchy process. Computers and Electronics in Agriculture, 100, 60–69.
Gupta, A., & Tangwongsan, K. (2008). Simpler analyses of local search algorithms for facility location. Journal of Religious Studies, 82, 191–196.
Hakimi (1964). Optimum locations of switching centers and the absolute centers and medians of a graph. Operations Research, 3, 450–459.


Herda, M. (2017). Parallel genetic algorithm for capacitated p-median problem. Procedia Engineering, 192, 313–317.
Huang, S., Wang, Q., Rajan, B., & Rakesh, N. (2015). An integrated model for site selection and space determination of warehouses. Computers & Operations Research, 62, 169–176.
Kocatepe, A., Ozguven, E., Horner, M., & Ozel, H. (2017). Pet- and special needs-friendly shelter planning in south Florida: A spatial capacitated p-median-based approach. International Journal of Disaster Risk Reduction, 153, 650–666.
Krarup, Pruzan (1983). The simple plant location problem: Survey and synthesis. European Journal of Operational Research, 12, 36–81.
Küçükdeniz, T., Baray, A., Ecerkale, K., & Esnaf, Ş. (2012). Integrated use of fuzzy c-means and convex programming for capacitated multi-facility location problem. Expert Systems with Applications, 39(4), 4306–4314.
Küçükdeniz, T., & Büyüksaatçi, S. (2008). Fuzzy C-means and center of gravity combined model for a capacitated planar multiple facility location problem. International Conference on Multivariate Statistical Modeling & High Dimensional Data Mining, 142, 23–37.
Liao, K., & Guo, D. (2010). A clustering-based approach to the capacitated facility location problem. Transactions in GIS, 12(3), 323–339.
Lin, Y. S., & Wang, K. J. (2018). A two-stage stochastic optimization model for warehouse configuration and inventory policy of deteriorating items. Computers & Industrial Engineering, 120, 83–93.
Maharjana, R., & Shinya, H. (2017). Warehouse location determination for humanitarian relief distribution in Nepal. Transportation Research Procedia, 25, 1151–1163.
Mladenovic, N., Brimberg, J., Hansen, P., & Perez, M. (2007). The p-median problem: A survey of metaheuristic approaches. European Journal of Operational Research, 179, 927–939.
Owen, S. H., & Daskin (1998). Strategic facility location: A review. European Journal of Operational Research, 111, 423–447.
Şakir, E., Küçükdeniz, T., & Tunçbilek, N. (2014). Fuzzy C-means algorithm with fixed cluster centers for uncapacitated facility location problems: Turkish case study. Supply Chain Management under Fuzziness, 313, 489–516.
Shahabi, M., Tafreshian, A., Unnikrishnan, A., & Boyles, S. D. (2018). Joint production–inventory–location problem with multi-variate normal demand. Transportation Research Part B Methodological, 110, 60–78.
Snyder (2006). Facility location under uncertainty: A review. IISE Transactions, 38, 547–564.
Sridharan (1995). The capacitated plant location problem. European Journal of Operational Research, 87, 203–213.
Wang, F., Xu, Y., & Li, Y. (2006). A review of the discrete facility location problem. International Journal of Plant Engineering & Management, 11(1), 40–50.
Xiao, Y., Kaku, I., Zhao, Q., & Zhang, R. (2011a). A variable neighborhood search based approach for uncapacitated multilevel lot-sizing problems. Computers & Industrial Engineering, 60, 218–227.
Xiao, Y., Zhang, R., & Kaku, I. (2011b). A new approach of inventory classification based on loss profit. Expert Systems with Applications, 38(8), 9382–9391.
Xiao, Y., Kaku, I., Zhao, Q., & Zhang, R. (2012). Neighborhood search techniques for solving uncapacitated multilevel lot-sizing problems. Computers & Operations Research, 39(3), 647–658.
Xiao, Y., Zhang, R., Zhao, Q., Kaku, I., & Xu, Y. (2014). A variable neighborhood search with an effective local search for uncapacitated multilevel lot-sizing problems. European Journal of Operational Research, 235(1), 102–114.
Xiao, Y., & Konak, A. (2016). The heterogeneous green vehicle routing and scheduling problem with time-varying traffic congestion. Transportation Research Part E: Logistics and Transportation Review, 88, 146–166.
Xiao, Y., & Konak, A. (2017). A genetic algorithm with exact dynamic programming for the green vehicle routing & scheduling problem. Journal of Cleaner Production, 167, 1450–1463.
Xiao, Y., Zuo, X., Kaku, I., Zhou, S., & Pan, X. (2019). Optimal mathematical programming and variable neighborhood search for k-modes categorical data clustering. Pattern Recognition, 90, 183–195.
Xiao Y., 2019. Data for: Optimal mathematical programming for the warehouse location problem with Euclidean distance linearization, Mendeley Data, v1. https://data.mendeley.com/datasets/46kw4zxxnw/draft?a=76812ccd-543a-49eb-8f44-fc12d74b3226.
Xie, Y., Zhou, S., Xiao, Y., Sadan, K., & Konak, A. (2018). A β-accurate linearization method of Euclidean distance for the facility layout problem with heterogeneous distance metrics. European Journal of Operational Research, 265, 26–38.
Zalik, K. R. (2006). Fuzzy c-means clustering and facility location problems. Artificial Intelligence and Soft Computing, 544, 256–261.
Zhang, R., Kaku, I., & Xiao, Y. (2012). Model and heuristic algorithm of the joint replenishment problem with complete backordering and correlated demand. International Journal of Production Economics, 139(1), 33–41.
