
JID: OME

ARTICLE IN PRESS [m5G;May 23, 2020;6:11]


Omega xxx (xxxx) xxx

Contents lists available at ScienceDirect

Omega
journal homepage: www.elsevier.com/locate/omega

Data-driven decision and analytics of collection and delivery point location problems for online retailers
Xianhao Xu a, Yaohan Shen a, Wanying (Amanda) Chen b,∗, Yeming Gong c, Hongwei Wang a
a School of Management, Huazhong University of Science and Technology, Wuhan, China
b Zhejiang Gongshang University, Hangzhou, 310018, China
c EMLYON Business School, 23 avenue Guy de Collongue, Ecully Cedex 69134, France

ARTICLE INFO

Article history:
Received 27 December 2019
Accepted 14 May 2020
Available online xxx

Keywords:
Data-driven decision making and analytics
Customer behavior data
Machine learning
Collection and delivery points
Facility location

ABSTRACT

The location of collection and delivery points (CDPs), which is affected by online customers' demand data, plays an important role for online retailers. While previous research on delivery point optimization does not use customer behavior data, we propose new models that integrate customer behavior data analysis to optimize collection and delivery points for online retailers. We explore a real customer behavior data set and use a total of 257,685 user records (212,062 records in the training set and 45,623 records in the test set). We first estimate the customer purchase probability with five data mining models. Based on the estimation results, we establish two facility location models to optimize the locations of attended and unattended CDPs, respectively, with the objective of cost minimization. Our numerical experiments provide a quantitative analysis of customer service level and location cost. Our results can further help online retailers choose suitable CDPs by trading off the consumer service level against the total logistics cost. We make the following contributions: (i) we analyze real customer behavior data and find that the gradient boosting trees algorithm outperforms the other four algorithms when estimating customers' purchase probabilities; (ii) we propose a new data-driven method integrating data mining models and facility location models to determine CDP locations for online retailers.

© 2020 Elsevier Ltd. All rights reserved.

1. Introduction

Our research is partially motivated by new logistics practices of online retailers including Alibaba, Amazon, Vipshop, and Jingdong. When an online retailer delivers products to a customer, the customer may not be at home because of work or holidays. The online retailer then temporarily delivers the products to a so-called collection and delivery point (CDP). The customer can pick up the products from the CDP when it is available. The locations of collection and delivery points, impacted by customers' behavior, play an important role in e-business competition because they affect customer satisfaction by increasing the availability of products and affect operational cost by reducing repeated deliveries or delivery errors [9,27]. The Amazon Locker program (kiosks) was launched in 2011 in New York City, Seattle, and London. Lockers are now available in over 2800 locations in more than 70 cities in the world [8,31]. Amazon partners with retail stores such as 7-Eleven, Spar, Staples, Co-op Food, Morrisons, and RadioShack to host Amazon Locker kiosks. The Wall Street Journal reports that the delivery lockers are Amazon's new secret weapon and bring benefits to Amazon and its partners [8,11]. Retailers receive a stipend from Amazon to host the kiosks, and Amazon gains an advantage in product delivery. This logistics problem and its influence are important: for example, Alibaba and Cainiao build 40,000 CDPs, and Jingdong builds 6906 CDPs to improve their order delivery.

The CDPs can be classified into attended CDPs (see Fig. 1(a)) and unattended CDPs (see Fig. 1(b)). An attended CDP is a service point that must be staffed by human workers and can provide service only during certain hours. Most attended CDPs are of a shop-in-shop type, which means the logistics provider makes a commissioned agency agreement with shop owners and pays for the operational costs. Potential attended CDP locations include supermarkets, communications operating rooms, and leisure facilities (such as sports clubs). An unattended CDP is a smart reception box or smart locker that provides 24 h service and lets customers pick up their packages without staff in attendance. Unattended CDPs are installed, for example, in a company car park, beside petrol stations, or in a central park. The use of unattended CDPs is gradually increasing, and it could significantly improve delivery efficiency [22].

∗ Corresponding author.
E-mail address: wanyingchen@mail.zjgsu.edu.cn (W. (Amanda) Chen).

https://doi.org/10.1016/j.omega.2020.102280
0305-0483/© 2020 Elsevier Ltd. All rights reserved.

Please cite this article as: X. Xu, Y. Shen and W. (Amanda) Chen et al., Data-driven decision and analytics of collection and delivery point
location problems for online retailers, Omega, https://doi.org/10.1016/j.omega.2020.102280

Fig. 1. Collection and delivery points.

The rapid development of e-business changes customers' behavior, which poses a new challenge for the location choice of CDPs. New information technology allows customers' behavior to be studied more accurately through data than before [23]. Therefore, our paper focuses on the optimization of CDPs for online retailers driven by customer behavior data. We consider online retailers who sell clothing, bags, and cosmetic products. These types of commodities have three main characteristics. First, the decision time (between when the products are viewed and when the products are purchased) is longer because customers spend more time viewing and comparing the products. Second, the learning capability of customers is weak. To be exact, the last purchasing experience of a customer does not help them shorten the current decision time. Third, the purchase cycle is usually longer than that of normal daily necessities, but shorter than that of durable necessities. Unlike repeatedly purchased products, customers usually do not purchase the same clothes or shoes at different times. These characteristics imply that customers' demands are dynamic. Therefore, online retailers need to optimize the CDP locations based on customers' dynamic demand.

In the literature, facility location has been investigated in different fields, such as facility location for healthcare facilities [1,35]; with different backgrounds, such as facility location before or after a disaster [36]; for different decision makers, such as government officials [36]; and by different methodologies [10,25]. However, few scholars study the facility location decision of CDPs for online retailers in an e-business environment using optimization models driven by big data. Although [37] propose a non-linear model to study the CDP location of an urban last-mile distribution center, the impact of customers' behavior on the locations of CDPs has not been studied yet. Therefore, how to optimize facility location in the e-business environment considering customer behavior data is still an open question. Based on these considerations and literature gaps, our study answers the following research questions: How to optimize the CDP locations for online retailers with the customers' behavior data, considering the customers' satisfaction and operational cost? How to integrate the facility location optimization models with big data analytics?

To answer these research questions, our research proposes a data-driven method to optimize CDP locations. The method consists of a purchase probability estimation stage and a location optimization stage. At the first stage, we first extract a useful feature set and then explore a real customer behavior data set containing 257,685 user records via five data mining models, including random forest, multilayer perceptron, and naive Bayes models. We find that gradient boosting trees perform best, so we use them to estimate the purchase probabilities of potential customers. At the second stage, we weight customers by their purchase probabilities and build facility location models for attended and unattended CDPs, studying their location problems with techniques from set covering problems. Moreover, we treat the optimization of CDP locations as a multi-period problem, because the demands of online retailers fluctuate and online retailers can terminate the contract with a commissioned agency at a certain stage because of the low delivery performance of that point or changes in customers' demands. Our method can provide suggestions and solutions for optimizing the CDPs in every contract period. Compared with traditional location approaches that generate lower bounds of solutions, our method can effectively reduce the total cost of locating CDPs for online retailers by integrating big data analytics.

Instead of directly exploring customer behavior data via data mining, we propose a data-driven approach to optimize CDPs by utilizing valuable information obtained from customer behavior data. We take advantage of online retailers' characteristics and their large customer behavior data to establish new facility location models. Our research makes the following contributions:

• We extract useful information via data mining models from customer behavior big data to optimize CDPs for online retailers.
• We propose a data-driven method integrating data mining models and facility location models to determine CDP locations.
• Our research explores a real data set, and five machine learning models are employed to analyze the data. We find that gradient boosting trees perform best.
• Our experiments demonstrate how to locate CDPs explicitly, and we provide a quantitative analysis of the relationship between customer service level and retailers' benefit.

The rest of this paper is organized as follows: Section 2 reviews the related literature. Section 3 provides the details of our research problem and the approach we propose. Section 4 describes the purchase probability estimation stage. Section 5 presents the location models and Section 6 presents our experiments. The last section is the conclusion.

2. Literature review

To answer our research questions, we review the related literature from the following aspects: collection and delivery points, the facility location problem, and customer behavior data.

2.1. Collection and delivery points

The worldwide trend of shopping online leads to a growing number of parcels to be delivered. It is well known that last-mile delivery is a challenging logistics problem. Xu et al. [33] propose to set up CDP systems for e-commerce logistics, and analyze operation patterns and investment sources. As an alternative to home delivery, CDPs can effectively reduce labor costs and prevent failed home deliveries. Therefore, establishing a CDP system is beneficial for e-commerce companies. Few studies provide strategies for companies to build CDP systems. Wu et al. [32] suggest locating self-collection points by clustering slightly modified customer locations based on crowd centers identified via public transport data. Wang et al. [29] design an optimization model for pickup points that first maximizes the demand coverage and then customer satisfaction. Bard and Jarrah [4] investigate a design problem of pickup and delivery networks. They separately consider commercial and residential customers to make a strategic


analysis of the networks. Our research considers both attended and unattended CDPs, optimizes CDP locations from the perspective of online retailers, and makes use of customers' online behavioral data and historical purchase data to extract advanced information for online retailers.

Recently many empirical and case studies have analyzed customers' preference trends and CDP acceptance. Kedia et al. [16] study customers' CDP acceptability in New Zealand and conclude that the distance between customers' homes or companies and the CDPs mainly affects citizens' acceptance of them. Yuen et al. [34] give deep insights into the factors influencing consumers' intention to choose self-collection services. Wang et al. [29] analyze consumers' self-collection service adoption behavior and find that their attitudes affect their intentions the most. Liu et al. [19] explore the major factors of trip mode choices to collect parcels at CDPs, and find that 50% of closer CDPs can result in a 7% decrease in the probability of choosing car modes. These studies indicate that the distance between the customer locations and the CDPs, which can be regarded as the service range of the CDPs, is an important parameter of customer service level. Thus, we consider the service range as a crucial parameter that can be decided by online retailers, and we consider cost-changing situations for the service range so that online retailers may trade off cost and customer service level.

2.2. Facility location problem

The facility location problem has been researched in many fields such as military [15] and transportation [28]. In general, to solve the problem, we should decide where to locate one or several facilities to serve a set of demand points [24]. Despite the strategic and long-term nature of facility location decisions, most models adopt a static approach [7,14]. This static approach simplifies the problem by assuming that a decision maker in the future can replicate the present situation. However, in most cases, future scenarios differ from present scenarios, and optimal facility locations are subject to change.

The natural and practical extension of static facility location problems is multiperiod facility location models. Such models seek the optimal policy for the planning period. Early work in the area was performed by Ballou [3] in the context of locating and relocating a single warehouse to maximize the cumulative profits for a planning period. The programming method serves as a mathematical technique to find the optimum solution in that model. Multiperiod location models are applied in diverse fields, including supply chains [20], health-care facilities [12], and emergency logistics [5]. The objective of multiperiod models is to determine optimal facility points in every period so that the level of customer service can be maintained and the cost of maintaining this level of customer service can be minimized.

We consider locating CDPs as a multiperiod decision problem, allowing for constantly changing customer demand over time. However, a typical characteristic of locating CDPs that differs from a conventional facility location problem is the lower set-up and relocation costs. Consequently, the managers of the online retailer can open CDPs and then close them shortly afterwards based on the change of customer demands over time. In this study, we locate CDPs by using set covering and maximal covering models, but with the innovation of integrating them with big data analytics.

2.3. Customer behavior data

To trade off the cost and service level, we propose to optimize CDPs with consideration of the customers' demands. We can estimate demand through customer behavior data, which includes historical purchase actions and records of users' click actions on websites. This is similar to clickstream data but focuses on the customer instead of the user action. It can be transformed from clickstream data, which is defined to be path information from visitors [18]. Path data may contain a user's goals, knowledge, and interests, and presents a new way to predict the customer's purchasing behavior [21]. Specifically, the path encodes the events leading up to a purchase, as opposed to looking at the purchasing occasion alone. The same applies to customer behavior data. Many studies examine how to use data from websites to predict the purchasing conversion rate, which can provide advice for online retailers to personalize their websites. Poel and Buckinx [26] identify nine key factors out of ninety-two possible measures from customer purchase behavior to predict whether a visitor will make a purchase at the next visit. Khan et al. [17] present a framework using customer behavior data to obtain customer churn information ahead of time. A patent uses customer behavior data to produce digital media marketing messages [2]. Guo et al. [13] utilize online information extracted from customers' reviews and clicks to analyse customer preference. Different from the studies above, we investigate the benefits of customer behavior data and analyze them from the perspective of facility location.

3. Problem description and methodology

3.1. Problem description

We consider a retailer that needs to choose strategic locations of attended and unattended CDPs for a certain area and a time horizon H, which is divided into several periods t, that is, $H = \{t\}_{t=1}^{T}$. The online retailer needs to locate CDPs and decide on the number of CDPs to trade off consumer service level and the total logistics cost.

There is sufficient historical customer behavior data Z, which is generated on the retailer's websites before the initial period. Moreover, in every period t, the online retailer possesses updated customer behavior data $z_t$ that is stored in web logs. Based on the data and the customers' address information, the retailer can determine the optimal attended and unattended CDPs denoted by $O_t$, $O_t$. Table 1 shows the notation used in this paper. To model this problem, we make the following assumptions:

• The retailer's demand fluctuates.
• Relocating an unattended CDP within a city incurs a cost $c_r$.
• For attended CDPs, the parcels can only be delivered to customers once. If the customers are not at home during the delivery, they should go to the CDP to pick up the parcels themselves.
• CDPs are selected from potential points, and the potential location set is known.

3.2. Methodology

In this section, we introduce the method proposed in this paper. Since the retailer has to determine optimal CDPs in every contract period, we propose to repeatedly solve the problem of optimizing CDPs for each period. Our method is shown in Fig. 2. The process is divided into two main parts: the purchase probability estimation (Section 4) and the location optimization (Section 5). The data on the customers' e-shopping behavior is collected from the retailer's website logs. With the customer behavior data, we first estimate the demand information, such as demand points and corresponding purchase probabilities. Based on the customer purchase probabilities and the customers' addresses, the optimal CDP locations are selected from the potential location set.


Table 1
Notations.

Symbol                       Meaning

Parameters
$H = \{t\}_{t=1}^{T}$        a time horizon over which the retailer needs to make strategies of locating CDPs (divided into $T \in \mathbb{N}^{+}$ periods)
$I = \{i\}_{i=1}^{n}$        set of demand points (customers)
$L = \{l\}_{l=1}^{m}$        potential set of attended CDPs
$J = \{j\}_{j=1}^{m}$        potential set of unattended CDPs
$c_{lt}$                     operation cost in period t when an attended CDP is located
$c_r$                        cost of relocating an unattended point
$f_{it}$                     purchase probability of demand point i at period t
$\delta$                     coverage ratio that all CDPs are able to reach
$S$                          service range, distance beyond which a demand point is considered "uncovered" (we take a priori value based on experience)
$d_{ij}$                     shortest distance from node i to node j
$N_{it}$                     $\{l \in L : d_{il} \le S\}$, $t \in [0, T]$, the set of potential attended CDPs whose distance from customer i is within the service range S
$M_{it}$                     $\{j \in J : d_{ij} \le S\}$, $t \in [0, T]$, the set of potential unattended CDPs whose distance from customer i is within the service range S
$v_{it}$                     number of products that customer i buys at period t
$v_u$                        capacity of an unattended point
$p_t$                        number of unattended delivery and pickup points to be located in a period
$O_t$                        optimal attended CDPs for period t
$O_t$                        optimal unattended CDPs for period t

Variables
$x_{lt}$                     equal to 1 if an attended CDP is located at site l in period t, otherwise 0
$y_{it}$                     equal to 1 if the demand point i is covered by at least one attended CDP in period t, otherwise 0
$z_{jt}$                     equal to 1 if an unattended CDP is located at site j in period t, otherwise 0
$w_{it}$                     equal to 1 if the demand point i is covered by at least one unattended CDP in period t, otherwise 0

Fig. 2. Illustration of the method framework.
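As an illustration of the notation in Table 1, the following minimal Python sketch builds the covering sets $N_{it}$ for a single period and evaluates the probability-weighted coverage ratio $\delta$. It is not the authors' code: the function names, the toy coordinates, and the use of straight-line (Euclidean) distance in place of the shortest distance $d_{il}$ are all assumptions made for the example.

```python
from math import hypot

def covering_sets(customers, sites, service_range):
    """For each customer i, list the candidate sites l with d_il <= S.

    Assumption: d_il is approximated by Euclidean distance between (x, y) points.
    """
    return {
        i: [l for l, (sx, sy) in sites.items()
            if hypot(cx - sx, cy - sy) <= service_range]
        for i, (cx, cy) in customers.items()
    }

def coverage_ratio(purchase_prob, cover, opened):
    """delta = sum_i f_i * y_i / sum_i f_i, with y_i = 1 if an opened site covers i."""
    covered = sum(f for i, f in purchase_prob.items()
                  if any(l in opened for l in cover[i]))
    return covered / sum(purchase_prob.values())

# Hypothetical toy instance: three customers on a line, two candidate sites.
customers = {"c1": (0.0, 0.0), "c2": (3.0, 0.0), "c3": (10.0, 0.0)}
sites = {"A": (1.0, 0.0), "B": (9.0, 0.0)}
purchase_prob = {"c1": 0.9, "c2": 0.6, "c3": 0.5}

cover = covering_sets(customers, sites, service_range=2.0)
delta = coverage_ratio(purchase_prob, cover, opened={"A"})
```

With only site A opened, customers c1 and c2 are covered, so the weighted coverage is (0.9 + 0.6) / 2.0.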

4. Purchase probability estimation

4.1. Data processing

In this section, we deal with a real-world data set provided by the Ali IJCAI, and estimate customers' purchase probabilities based on the customer behavior data for optimization. The data contain customer behavioral records generated in May, June, July, August, September, and October (six months in total) and on the "Double11" day. The data from the "Double11" day are not representative of ordinary days, so we do not use them. We use the data set generated in the six months, which involves 257,685 labeled records including the users' basic information and their activity logs. The label of each user in the data set indicates, in our case, whether the user made a purchase.

We divide the data set into a training set (212,062 records) and a test set (45,623 records). We first extract customer behavior data from the raw data set, which provides the customers' demographic information and their activity logs, and covers a large number of sales. We regard all merchants together as one large online retailer. To preprocess the data set, we extract a feature set including twenty features, as described in Table 2.

4.2. Data mining model

Given the large quantity of window shoppers, the data sets are imbalanced. We split the training data into two parts: train (70%) for data mining model learning and validation (30%) for model evaluation. Then we try diverse machine learning models with the train data and assess them with the validation data. The parameters of each model are determined after the training step. We compare the performance of several machine learning models, including random forest, naive Bayes, etc. The results are displayed in Table 3. Since gradient boosting trees (GBT) outperform the other models, we choose GBT as our data mining model.

When training the model with historical customer behavior data Z, we sequentially train k decision trees $f_k$ and assemble them in a weighted way, $F(Z) = \sum_k \alpha_k f_k(Z)$, where $\alpha_k$ is calculated by an optimization strategy called line search to minimize the loss.
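The weighted ensemble $F(Z) = \sum_k \alpha_k f_k(Z)$ can be illustrated with a toy, from-scratch sketch. The paper does not state which implementation, loss function, or tree depth was used, so everything below is an assumption for illustration: depth-1 trees (stumps), logistic loss, and a crude grid-based line search for $\alpha_k$. The two toy features are hypothetical stand-ins for Click and AddCart.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def log_loss(labels, scores):
    eps = 1e-12
    total = 0.0
    for y, s in zip(labels, scores):
        p = sigmoid(s)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(labels)

def fit_stump(X, residuals):
    """Least-squares depth-1 tree: (feature, threshold, left_value, right_value)."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, (j, thr, lv, rv))
    return best[1]

def stump_predict(stump, row):
    j, thr, lv, rv = stump
    return lv if row[j] <= thr else rv

def fit_gbt(X, y, n_trees=20):
    scores = [0.0] * len(y)          # F(Z) in log-odds space, starts at 0
    ensemble = []                    # list of (alpha_k, f_k)
    for _ in range(n_trees):
        # Negative gradient of the logistic loss with respect to the scores.
        residuals = [t - sigmoid(s) for t, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        h = [stump_predict(stump, row) for row in X]
        # Line search over a fixed grid: pick alpha_k that minimizes the loss.
        alpha = min((a / 10 for a in range(1, 51)),
                    key=lambda a: log_loss(y, [s + a * hi for s, hi in zip(scores, h)]))
        ensemble.append((alpha, stump))
        scores = [s + alpha * hi for s, hi in zip(scores, h)]
    return ensemble

def predict_proba(ensemble, row):
    return sigmoid(sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble))

# Hypothetical toy data: [clicks, add-to-cart count] and purchase labels.
X = [[1, 0], [2, 0], [3, 1], [8, 2], [9, 3], [10, 1]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gbt(X, y)
```

Calling `predict_proba(model, row)` then plays the role of the purchase probability $f_{it}$ for one customer. A production setting would more likely rely on an established gradient boosting library than on this sketch.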


Table 2
Feature set description.

Feature name    Description

Gender          Customer's gender: 0 for female, 1 for male, 2 for unknown
AgeRange        Age range of the customer: 1 for < 18; 2 for [18,24]; 3 for [25,29]; 4 for [30,34]; 5 for [35,39]; 6 for [40,49]; 7 for ≥ 50
ActTime         A list of action times of the customer
TimeStamp       A list of action numbers in every month (for the training algorithm, we will change it to one-hot format)
TimeVar         The variance of action numbers in every month
TimeRatio       The ratio of days when the customer acts to a month
Click           Total past click times
MeanClick       Average click times of the customer in a month
AddCart         Total times of adding products to the cart
MeanAddCart     Average times of adding products to the cart in a month
AddFav          Total times of adding products to the favorites list
MeanAddFav      Average times of adding products to the favorites list in a month
Purchase        Total number of purchases the customer has ever made
MeanPurchase    Average number of historical purchases in a month
ItemCount       Total number of items that the customer has acted on
ItemRatio       The ratio of acted items to total items
CatCount        Total number of categories that the customer acted on
CatRatio        The ratio of acted categories to total categories
BrandCount      Total number of brands that the customer has acted on
BrandRatio      The ratio of acted brands to total brands
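To make a few of the Table 2 features concrete, the sketch below derives them from one customer's activity log. The log schema (tuples of month, day, and action name), the action labels, the use of population variance for TimeVar, and the reading of TimeRatio as active days divided by the days in the six-month window (30-day months) are assumptions for illustration; the paper does not specify these details.

```python
from collections import Counter
from statistics import pvariance

MONTHS = [5, 6, 7, 8, 9, 10]  # May to October, the six months used in the paper

def customer_features(log):
    """log: list of (month, day, action), action in {'click', 'cart', 'fav', 'purchase'}."""
    by_action = Counter(action for _, _, action in log)
    per_month = [sum(1 for m, _, _ in log if m == month) for month in MONTHS]
    active_days = {(m, d) for m, d, _ in log}
    n_months = len(MONTHS)
    return {
        "Click": by_action["click"],
        "MeanClick": by_action["click"] / n_months,
        "AddCart": by_action["cart"],
        "AddFav": by_action["fav"],
        "Purchase": by_action["purchase"],
        "TimeStamp": per_month,               # action counts per month
        "TimeVar": pvariance(per_month),      # assumed: population variance
        "TimeRatio": len(active_days) / (30 * n_months),  # assumed definition
    }

# Hypothetical log for a single customer.
log = [
    (5, 1, "click"), (5, 1, "click"), (5, 2, "cart"),
    (6, 10, "click"), (6, 10, "purchase"), (9, 3, "fav"),
]
feats = customer_features(log)
```

The remaining count and ratio features (items, categories, brands) would follow the same pattern given item, category, and brand identifiers in the log.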

Table 3
Model evaluation.

Methods                    Accuracy    AUC
Gradient boosting trees    92.953%     0.6041
Random forest              91.010%     0.5588
Naive Bayes                84.745%     0.5997
Logistic regression        65.427%     0.6039
Multilayer perceptron      86.846%     0.5716

After the training step, the model can be used to make estimations with updated customer behavior data.

4.3. Customer purchase probability

We choose GBT to make a purchase estimation on the test set, since it achieves the highest accuracy and a decent AUC (area under the curve). We assume that the retailer needs to decide on a strategic CDP location for half a year and the contract duration is three months, so the time horizon is six months and consists of two periods (every period is three months), that is, H = {t = 1, 2}. Then we use the records generated in the first and the last three months of the test data set to estimate the purchase probability of potential customers. The records involve 11,390 users when t = 1 and 34,233 users when t = 2. As mentioned above, we take the outputs of the data mining model as the customer purchase probabilities $f_{it}$, whose distribution is shown in Fig. 3. We only consider customers whose purchase probability $f_{it} \ge 0.5$; 312 customers meet this condition when t = 1, and 1013 customers when t = 2.

5. Location optimization

5.1. Location models

In this section, we formulate two location optimization models for attended and unattended CDPs, which differ significantly in terms of application conditions, security requirements, operational configurations, and customer preferences. In a region or neighborhood with a low level of safety, lockers will not be safe since they can be damaged and stolen. Moreover, the acceptability of attended CDPs and unattended CDPs differs across regions. In Europe, attended CDPs tend to be more successful than locker points because they are operated by humans, safer, and more flexible in terms of parcel sizes and modes of payment [16,30]. In addition, we distinguish between the two modes to make implementation easier for logistics decision makers. Some small logistics service companies only provide attended CDP service. Therefore, we first build the location model for attended CDPs and then for unattended CDPs.

We assume that the parcels can only be delivered to the customers once for attended CDPs. If the customers are not at home during the delivery, they should go to the CDP to pick up the parcels themselves. Potential locations of attended CDPs are convenience stores, retail chains, petrol stations, etc. Optimal locations of CDPs are selected among a set of potential locations $L = \{l\}_{l=1}^{m}$. We assume that there is a fixed cost $c_{lt}$ for each attended CDP in period t. We use $f_{it}$ to denote the probability that customer i will buy products in period t. The coverage percentage refers to the percentage of customers in the service area of certain attended CDPs, which can be calculated as $\delta = \sum_{i \in I} f_{it} y_{it} / \sum_{i \in I} f_{it}$, where $y_{it} \in \{0, 1\}$. A demand point is covered when that point is at a distance less than or equal to S. A demand point is "uncovered" when the closest CDP is at a distance greater than S.

The attended CDP location model is shown as follows.

Attended model:

$$\min \sum_{l \in L} x_{lt} c_{lt}, \qquad t = 1, 2, \ldots, T \tag{1}$$

subject to:

$$\alpha \le \delta = \frac{\sum_{i \in I} f_{it} y_{it}}{\sum_{i \in I} f_{it}}, \qquad t = 1, 2, \ldots, T \tag{2}$$

$$y_{it} \le \sum_{l \in N_{it}} x_{lt}, \qquad t = 1, 2, \ldots, T \tag{3}$$

$$v_{it} f_{it} y_{it} \le v_a \sum_{l \in N_{it}} x_{lt}, \qquad t = 1, 2, \ldots, T \tag{4}$$

$$\sum_{i \in I} v_{it} f_{it} y_{it} \le v_a \sum_{l \in L} x_{lt}, \qquad t = 1, 2, \ldots, T \tag{5}$$

$$x_{lt}, y_{it} \in \{0, 1\}, \qquad t = 1, 2, \ldots, T \tag{6}$$

The objective is to minimize the operational costs of locating the CDPs, as shown in equation (1). Constraint (2) restricts the coverage ratio $\alpha$. Constraint (3) assures that there is at least one CDP


Fig. 3. Distribution of customers’ purchase probabilities.

to cover the demand point i when it is calculated. Constraint (4) guarantees every customer's demand can be satisfied. Constraint (5) makes sure that selected attended points can serve all "covered" customers.

Some customers prefer unattended CDPs, which do not provide home delivery but have no time limitation. They can pick up parcels on the way home from the workplace at any time or on the way to the workplace without checking the opening hours of CDPs. Therefore, the potential locations of unattended CDPs $J = \{j\}_{j=1}^{m}$ include metro stations, bus exchanges, company car parks, and local urban distribution centers. Considering the high setup costs of an unattended CDP, the number of unattended CDPs in a certain region, such as a city, is limited. We assume that the number of unattended CDPs is less than or equal to $p_t$ at each period. We deal with this problem as a maximal covering location problem. This type of problem seeks the maximal population that can be served within a given distance or with a limited number of facilities [38]. The mathematical formulation of the unattended CDP model can be presented as follows:

Unattended model:

$$\min \sum_{t=1}^{T} \sum_{j=1}^{J} c_r (z_{j,t} - z_{j,t-1})^2 \tag{7}$$

subject to:

$$\alpha \le \delta = \frac{\sum_{i \in I} f_{it} w_{it}}{\sum_{i \in I} f_{it}}, \qquad t = 1, 2, \ldots, T \tag{8}$$

$$w_{it} \le \sum_{j \in M_{it}} z_{jt}, \qquad t = 1, 2, \ldots, T \tag{9}$$

$$v_{it} f_{it} w_{it} \le v_u \sum_{j \in M_{it}} z_{jt}, \qquad t = 1, 2, \ldots, T \tag{10}$$

$$\sum_{i \in I} v_{it} f_{it} w_{it} \le v_u \sum_{j \in J} z_{jt}, \qquad t = 1, 2, \ldots, T \tag{11}$$

$$\sum_{j \in J} z_{jt} \le p_t, \qquad t = 1, 2, \ldots, T \tag{12}$$

$$z_{jt}, w_{it} \in \{0, 1\}, \qquad t = 1, 2, \ldots, T \tag{13}$$

The objective is to minimize the relocation cost, taking into account the purchase probabilities and demand capacities. Constraint (8) restricts the coverage ratio $\alpha$. Constraint (9) requires at least one unattended CDP to be chosen for the region at period t before a demand point there counts as covered. Constraints (10) and (11) are capacity restrictions. The number of unattended CDPs to be allocated is restricted by constraint (12). We set t = 0 as the initial period in which unattended points are randomly selected.

5.2. Analysis of location models

Considering the attended model in one period, the constraints (2), (3) and (4) are equivalent to

$$\sum_{i \in I} \tilde{f}_{it} y_{it} \ge \alpha \tag{14}$$

$$v_a \sum_{i \in I} f(d_{il}) x_{lt} - c_{it} y_{it} \ge 0 \tag{15}$$

where $\tilde{f}_{it} = f_{it} / \sum_i f_{it}$, $f(d_{il}) = (1/2)[1 + \operatorname{sign}(d_{il})]$, $d_{il} = d_{il} - S$, and $v_a$, $c_i = f_{it} v_{it}$ are constants. Moreover, the constraints (5) can be satisfied when the capacity of an attended point satisfies $v_a \ge n \cdot m^{-1} \max(v_{it})$. Therefore, the constraint set of the attended model can be presented as

$$C = \Big\{ x_{lt} \in \mathbb{R}^m,\ y_{it} \in \mathbb{R}^n \ \Big|\ \sum_{i \in I} \tilde{f}_{it} y_{it} \ge \alpha,\ v_a \sum_{i \in I} f(d_{il}) x_{lt} - c_i y_{it} \ge 0 \Big\}$$

Then we give the following proposition:

Proposition 1. If, for $\forall i \in I$, we have $\sum_{i \in I} f(d_{il}) > 0$ and $v_a \ge n \cdot m^{-1} \max(v_i)$, then the constraint set C is nonempty, and the optimal value $f^{*}$ is finite.

Proof. Considering the corresponding LP problem of the attended model, we convert it to a standard minimum problem format:

$$\min F^{T} X$$

subject to

$$AX \ge b,$$

Please cite this article as: X. Xu, Y. Shen and W. (Amanda) Chen et al., Data-driven decision and analytics of collection and delivery point
location problems for online retailers, Omega, https://doi.org/10.1016/j.omega.2020.102280
where $F = [f_{1t}, f_{2t}, \ldots, f_{mt}, \underbrace{0, 0, \ldots, 0}_{n}]$, $X = [x_{1t}, x_{2t}, \ldots, x_{mt}, y_{1t}, y_{2t}, \ldots, y_{nt}]^T$, $b = [\alpha, \underbrace{0, 0, \ldots, 0}_{n}]^T$, and

$$A = \begin{bmatrix}
0 & 0 & \cdots & 0 & \tilde{f}_{1t} & \tilde{f}_{2t} & \cdots & \tilde{f}_{nt} \\
f(d_{11}) & f(d_{12}) & \cdots & f(d_{1m}) & -1 & 0 & \cdots & 0 \\
f(d_{21}) & f(d_{22}) & \cdots & f(d_{2m}) & 0 & -1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
f(d_{n1}) & f(d_{n2}) & \cdots & f(d_{nm}) & 0 & 0 & \cdots & -1 \\
v_a f(d_{11}) & v_a f(d_{12}) & \cdots & v_a f(d_{1m}) & -c_1 & 0 & \cdots & 0 \\
v_a f(d_{21}) & v_a f(d_{22}) & \cdots & v_a f(d_{2m}) & 0 & -c_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
v_a f(d_{n1}) & v_a f(d_{n2}) & \cdots & v_a f(d_{nm}) & 0 & 0 & \cdots & -c_n
\end{bmatrix}$$

Then the constraint set of this LP problem is

$$S = \{X \in \mathbb{R}^{(m+n)} \mid AX \ge b\}.$$

Apparently, when $\forall i \in I$, $\sum_{i \in I} f(d_{il}) > 0$ and $v_a \ge n \cdot m^{-1} \max(v_i)$, we have $X = [1, 1, \ldots, 1] \in C \subset S$, which means that $[1, 1, \ldots, 1]$ is a feasible solution. Therefore, $C \subset S$ is nonempty. Moreover, for any $X$ that satisfies $AX \ge 0$, we have $\sum \tilde{f}_{it} y_{it} \ge 0$ and $\tilde{f}_{it} \in [0, 1]$, so we can infer $y_{it} \ge 0$. From the other constraints we obtain $\sum f(d_{il}) x_{lt} \ge y_{it} \ge 0$ and $f(d_{il}) \in \{0, 1\}$. Therefore, $F^T X \ge 0$. For the LP problem, when the constraint set is nonempty, the following statements are equivalent [6]:

(1) If $AX \ge 0$ for some $X \in \mathbb{R}^{(m+n)}$, then $F^T X \ge 0$.
(2) The optimal value $\bar{f}^*$ is finite.

The first condition is proved to be satisfied, and thus the other condition is satisfied as well. The optimal value of the LP problem $\bar{f}^*$ is finite, and it is a lower bound of the optimal value $f^*$, so $f^*$ is finite. $\square$

Similarly, we analyse the unattended model in a single period, and the constraint set of the unattended model can be presented as

$$C = \Big\{w_{it} \in \mathbb{R}^n,\ z_{jt} \in \mathbb{R}^m \;\Big|\; \sum_{i \in I} \tilde{f}_{it} w_{it} \ge \alpha,\ \sum_{i \in I} v_u f(d_{ij}) z_{jt} - c_i w_{it} \ge 0,\ \sum_{j \in J} z_{jt} \le p_t\Big\}.$$

Then, we give the following proposition:

Proposition 2. The constraint set $C$ is nonempty, and the optimal value of the unattended model $f$ is finite.

Propositions 1 and 2 both indicate that optimal solutions of the attended and unattended models exist. We denote the optimization algorithms for the attended and unattended models as $M(I, L, f_{it})$ and $M(I, J, f_{it})$, respectively.

5.3. Optimization algorithms

We apply the branch-and-cut algorithm [39] to solve the location models in every period (see Appendix A) and implement this algorithm with the Gurobi Optimizer, a state-of-the-art mathematical programming solver. After obtaining demand information and the purchase probability $f_{it}$, we filter out customers whose purchase probability $f_{it} \le 0.5$ to reduce the computational complexity of our models. The optimal attended and unattended CDPs in period $t$ found via this algorithm are denoted by $O_t = M(I, L, f_{it})$ and $O_t = M(I, J, f_{it})$, respectively.

In addition, the retailer requires optimal CDPs for multiple periods, so we propose two algorithms to optimize CDPs for a multiperiod problem. Considering the attended CDP model, we let every demand point be served by its nearest potential attended CDP. After this processing, we obtain a preliminary location scheme and then calculate the service level and the cost to optimize the attended CDP location. The algorithm steps are shown in Algorithm 1.

Algorithm 1: Optimize attended CDP.
Data: A potential CDP set $L$, historical user behavior data $Z$, updated user behavior data $z_t$, a data mining model $F$, an optimization algorithm for location model $M$, and planning time horizons $H = \{t\}_{t=1}^{T}$
Result: Optimal locations of CDPs for every period
for each period $t \in T$ do
    $z_{t-1} \leftarrow z_{t-2}$
    $I_t, f_{it} \leftarrow F(z_{t-1})$
    $O_t = M(I_t, L, f_{it}, f_{it})$
    return $O_t$
end

Different from the attended CDP model, the number of unattended CDPs to be built is limited. We assume that an online retailer is planning to build $p_t$ unattended CDPs in period $t$. The dynamic clustering algorithm (see Appendix B) can be used to solve the unattended CDP model. We apply the branch-and-cut algorithm in every period to obtain better solutions, as shown in Algorithm 2.

Algorithm 2: Optimize unattended CDP.
Data: A potential CDP set $L$, historical user behavior data $Z$, updated user behavior data $z_t$, a data mining model $F$, an optimization algorithm for location model $M$, and planning time horizons $H = \{t\}_{t=1}^{T}$
Result: Optimal locations of CDPs for every period
for each period $t \in T$ do
    $p_t \leftarrow p_{t-1}$
    $z_{t-1} \leftarrow z_{t-2}$
    $I_t, f_{it} \leftarrow F(z_{t-1})$
    $O_t = M(I_t, J, f_{it}, f_{it})$
    return $O_t$
end

6. Numerical experiments

6.1. Instance generation

In this section, we illustrate the optimization process and explicitly show how to locate the sites with the two models. Section 6.1 describes the data that we used in the experiments. Section 6.2 presents the results of our method and compares them with the benchmark. Section 6.3 illustrates the solutions and analyzes them. All experiments were run in Python with Gurobi.

After obtaining demand information, the customers' addresses are essential. However, because of the privacy concerns of online customers and the potential competitive advantages gained by analyzing online customers' private information, online retailers hesitate to share their e-store users' private information, which makes it difficult for researchers and practitioners to collect the customers' location data. We therefore generate several instances for the experiments.

First, we generate the customers' location data, that is, the demand points $I$, on the basis of an e-commerce company's real data. We first estimate the joint distribution of the real data by mixture model algorithms, regarding the data as consisting of several components whose distributions are different from one another. A widely used algorithm in this category is the mixture of Gaussians, which means that the data comes from a mixture of Gaussian distribu-

tions with different mean and covariance matrices. So we generate the demand points with this method and remove the outliers. After obtaining the demand points, we randomly generate potential locations, indicating shops or crowd centers, within the range of the customers' locations; their number is around one third of the number of demand points. The points we generated are longitudes and latitudes. We transform them into points on a plane and show $I_1$ with 312 demand points, $I_2$ with 1013 demand points, and $L = J$ with 367 potential locations in Fig. 4. The operation cost $c_l$ of every potential point is a randomly drawn constant between 1500 and 3000, which is consistent with practice.

Fig. 4. Demand points and potential locations.

6.2. Results and validation

To assess our solution methods, we compare them with traditional location approaches that generate lower bounds of solutions. For attended CDPs, the benchmark (Appendix C) we use is to distribute the overall locations evenly, divide the whole region into several sub-regions, and then select the points in the potential locations nearest to the centers of the sub-regions. The number of sub-regions $K$ is decided by the radius of every CDP service range $S$ and the coverage ratio $\alpha$, that is, $K = \frac{R\alpha}{\pi S^2}$, where $R$ is the area of the whole region. For unattended CDPs, we choose a scenario in which unattended CDPs are selected randomly (see Fig. D.9) and apply the clustering algorithm (Appendix B) as the benchmark. The parameters of the attended model are $S = 8$, $\alpha = 0.9$, $v_{it} \in [1, 10]$, $v_a = 40$. We set the parameters of the unattended model as $p_1 = p_2 = 10$, $S = 8$, $\alpha = 0.8$, $v_{it} \in [1, 5]$, $v_a = 20$, $c_r = 2000$. Then, we solve the problem with the benchmark approach and our methods and display the results in Tables 4 and 5. Our methods effectively reduce the total cost for attended points and the relocation cost for unattended points.

Table 4
Validation for attended model.

Method            Benchmark    Our method (t = 1)    Our method (t = 2)
Total cost        26,140.28    15,004.08             18,706.34
Number of CDPs    12           9                     11

Table 5
Validation for unattended model.

Method / Period                 t = 1     t = 2
Benchmark (Relocation cost)     16,000    16,000
Our method (Relocation cost)    4,000     12,000

6.3. Illustration and analysis

In this section, we use figures to show our solutions intuitively and analyze how to provide guidance for online retailers so that they can formulate location strategies that trade off the service level and the cost. We separately evaluate the consumer service level and the retailers' benefit by the following measures:

• Service range $S$: This is the radius of every CDP service range. The coverage ratio $\alpha$ described above can evaluate the consumer service level as well.
• Total operation cost: For attended points, it is calculated with the attended location model. For unattended points, it consists of the smart locker cost, which is proportional to $p_t$, the number of unattended CDPs to be located in a period, and the relocation cost, which is calculated with the unattended location model.

Therefore, retailers are able to trade off the consumer service level and their own benefit based on the measures above. We quantitatively analyze these parameters in the following two parts.

(1) Attended points

As displayed above, when we set the radius of every CDP service range $S = 8$ and the coverage ratio $\alpha = 90\%$, the result shows that the retailer needs to choose 9 locations from the potential points (the total cost is 15,004.08 when $t = 1$), and to choose 11 locations from the potential points (the total cost is 18,706.34 when $t = 2$). Fig. 5 shows explicitly the optimal attended CDPs located with the optimization algorithm.

A smaller service range $S$ is beneficial to the customers. However, retailers need to consider the total cost and the service level. Therefore, we vary the service range to analyze the variation tendency of the number of optimal locations and the total cost. Fig. 6 shows the outcome of the quantitative analysis in the last period. We observe that, in general, the number of optimal locations and the total cost increase gradually as the service range decreases. We make sure the coverage percentage $\alpha$ is higher than 80% and 90% separately and test the service range $S$ from 3 to 10.
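The relationship between the service range $S$, the coverage ratio $\alpha$, and the number of opened CDPs can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names `coverage_ratio` and `greedy_sites`, the toy coordinates, and the equal weights are hypothetical, and the greedy selection is a simple stand-in heuristic, not the branch-and-cut procedure solved with Gurobi in the experiments.

```python
import math


def coverage_ratio(demand, weights, sites, S):
    """Weighted share of demand points lying within distance S of an open site.

    demand  : list of (x, y) demand points
    weights : purchase probabilities f_i (same order as demand)
    sites   : list of (x, y) open CDP locations
    """
    covered = sum(
        w
        for (x, y), w in zip(demand, weights)
        if any(math.hypot(x - sx, y - sy) <= S for sx, sy in sites)
    )
    return covered / sum(weights)


def greedy_sites(demand, weights, candidates, S, alpha):
    """Assumed stand-in heuristic: open the candidate that raises coverage
    the most until the coverage ratio reaches alpha (or candidates run out)."""
    chosen = []
    remaining = list(candidates)
    while remaining and coverage_ratio(demand, weights, chosen, S) < alpha:
        best = max(
            remaining,
            key=lambda c: coverage_ratio(demand, weights, chosen + [c], S),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    # Hypothetical toy instance: four demand points on a line, three candidates.
    demand = [(0, 0), (4, 0), (8, 0), (12, 0)]
    weights = [1, 1, 1, 1]
    candidates = [(2, 0), (6, 0), (10, 0)]
    for S in (2, 6):
        sites = greedy_sites(demand, weights, candidates, S, alpha=1.0)
        print(S, len(sites), coverage_ratio(demand, weights, sites, S))
```

On this toy instance a smaller service range requires more open sites to reach the same coverage, which is the qualitative tendency reported for Fig. 6.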

Fig. 5. Optimal attended CDP with service range S = 8 and α = 90%.

Fig. 6. Quantitative analysis for the second period.

Fig. 7. Solutions of unattended model when pt = 10 and S = 8.

In Fig. 6(a), the x-axis is the service range $S$, and the y-axis represents the corresponding costs. In other words, the two axes stand for the service level and the retailers' benefit, respectively. Therefore, retailers are able to trade them off according to this analysis. Meanwhile, the y-axis displays the number of optimal locations in Fig. 6(b).

(2) Unattended points

We use Algorithm 2 to solve the unattended location model. The number of unattended CDPs to be located in the two periods is $p_1 = p_2 = 10$, and we set the service range $S = 8$ and the coverage percentage $\alpha = 80\%$. In this case, the cost of smart lockers is fixed once the number of unattended points is determined, and we optimize the cost of relocations. The locations that the retailer should choose are shown in Fig. 7. We also examine the cluster-

ing algorithm and find that our method obtains solutions that require lower cost. But if the data set is very large, it will be time-consuming to solve the problem with it. Solutions via the clustering algorithm are shown in Appendix D.

Then we quantitatively analyze how the coverage ratio $\alpha$ changes when the number of predetermined locations $p_t$ varies and the service range differs, by calculating the optimal coverage ratio for the given location number and service range. In Fig. 8, we compare the coverage ratios under six service range values for diverse numbers of CDP locations in the second period. The two axes represent the service level and the retailers' benefit, respectively. From this figure, we can see that the coverage ratio $\alpha$ rises as the number of locations $p_t$ gradually increases, and the larger the service range, the slower the growth of the coverage ratio.

Fig. 8. Quantitative analysis of coverage ratio α and location number pt.

7. Concluding remarks

In this paper, we design a CDP location procedure that takes full advantage of the characteristics of online retailers, who possess sufficient customer behavior data in the web logs of their websites. We propose to apply state-of-the-art machine learning algorithms to make use of the abundant data, obtaining advance purchase information, and then optimize CDP locations based on customers' address information, which can be easily acquired by online retailers from historical orders or IP addresses. We discuss both attended and unattended CDPs. Taking their features into account, two multi-period optimization models are established for locating them separately. Then we present solutions for the two models and give proofs to make sure the solutions exist. Finally, we illustrate location results under some conditions and make a quantitative analysis of the service level and the benefit of retailers to reveal their relationship, so that online retailers can make wise decisions to trade off these two contradictory factors.

Our research makes the following contributions: We extract useful information via data mining models from customer behavior big data to optimize CDPs for online retailers. We are among the earliest to propose a data-driven method integrating data mining models and facility location models to determine CDP locations for online retailers. Exploring a real data set and employing five machine learning models to analyze the data, we find that gradient boosting trees perform best. Our experiments demonstrate how to locate CDPs explicitly and provide a new quantitative analysis of the relationship between the customer service level and the retailers' benefit.

Our research has limitations since we mainly use customer behavior data. In the future, we can consider external environmental data, including geographic, economic, weather, and traffic big data, in our models to improve the system performance. In this paper, this is infeasible since we have no further channels to coordinate with geographic, economic, weather, and traffic public departments or governments to obtain such information, some of which is not available to the public.

CRediT authorship contribution statement

Xianhao Xu: Funding acquisition, Resources. Yaohan Shen: Software, Data curation, Visualization, Methodology, Formal analysis. Wanying (Amanda) Chen: Writing - review & editing, Methodology, Software. Yeming Gong: Conceptualization, Writing - original draft, Methodology, Investigation, Formal analysis, Writing - review & editing, Supervision. Hongwei Wang: Project administration, Supervision.

Acknowledgements

Yaohan Shen and Xianhao Xu are co-first authors of this paper. We thank Haitao Li for discussion of the earlier version. This research was supported in part by the National Natural Science Foundation of China [grant number 71620107002, 71821001, 71971095], the National Social Science Foundation of China [grant number 16ZDA013], the Ministry of Education, Humanities and Social Sciences of China [grant number 17YJC630013], Zhejiang Natural Science Foundation [grant number LQ19G030004], and Qianjiang River Talent Scholarship (grant number QJC1802002). Yeming Gong is supported by the Business Intelligence Center of EMLYON.

Appendix A. Branch-and-cut algorithm

The branch-and-cut algorithm [39] can be regarded as a search tree in which the nodes stand for linear programming problems to be solved. Five steps are applied in the optimization process:
Step 0: Presolve the model by eliminating unnecessary constraints to reduce the model size.
Step 1: Ignore the integer constraints and solve the corresponding linear programming problem; the optimal value of the LP is then a bound of the original model.
Step 2: Choose a value of one feature to decompose the model into two new subproblems, and solve their LP relaxations respectively.
Step 3: Add a cutting plane to improve the linear programming relaxation. Repeat this step until the optimal solution of the LP is feasible or the LP is unbounded.
Step 4: Select another branching feature and repeat Step 2 and Step 3 until no more subproblems need to be explored. Select the optimal solution from the solutions produced by Step 3.

Appendix B. Clustering algorithm

See Algorithm 3.

Algorithm 3: Clustering algorithm for locating unattended CDPs.
Data: A potential CDP set $J$, historical user behavior data $Z$, a data mining model $F$, a clustering algorithm $C$, and planning time horizons $H = \{t\}_{t=1}^{T}$
Result: Optimal locations of CDPs for every period
for each period $t \in T$ do
    $p_t \leftarrow p_{t-1}$
    $Z_{t-1} \leftarrow Z_{t-2}$
    $I_t \leftarrow F(Z_{t-1})$
    classes $= C(I_t, p_t)$
    for each class $k \in$ classes do
        find the center point $c_k$
        select the point $j_k$ in $J$ nearest to $c_k$
    end
    return $O_t = \{j_k\}_{k=1}^{K}$
end
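The inner loop of Algorithm 3 (cluster the demand points, then snap each cluster center to the nearest potential location) can be sketched as follows. This is a minimal illustration under assumptions: the paper does not specify which clustering algorithm $C$ is used, so plain k-means with a deterministic initialization is assumed here, and the names `kmeans` and `snap_to_candidates` are hypothetical.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on 2D points (assumed choice for the clustering
    algorithm C). Initial centers are simply the first k points."""
    centers = [tuple(map(float, p)) for p in points[:k]]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[idx].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def snap_to_candidates(centers, candidates):
    """For every cluster center c_k, select the potential location j_k nearest to it."""
    return [min(candidates, key=lambda j: math.dist(j, c)) for c in centers]


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of demand points.
    demand = [(0, 0), (0, 1), (10, 10), (10, 11)]
    candidates = [(1, 1), (9, 9), (5, 5)]
    print(snap_to_candidates(kmeans(demand, k=2), candidates))
```

Note that two centers may snap to the same candidate in this sketch; a fuller implementation would need a rule for such ties, which Algorithm 3 does not state.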

Appendix C. Benchmark for attended CDPs

In practice, retailers may locate CDPs according to a uniform distribution: distribute the locations evenly overall, divide the whole region into several sub-regions, and then select the points in the potential locations that are nearest to the centers of the sub-regions. Four steps are applied in the location process:
Step 0: Given the potential location set $L$ and the service area of the whole region $R$.
Step 1: Calculate the number of sub-regions $K$, which is decided by the radius of every CDP service range $S$ and the coverage ratio $\alpha$, that is, $K = \frac{R\alpha}{\pi S^2}$.
Step 2: Locate the centers of the sub-regions.
Step 3: Select the points in the potential locations $L$ that are nearest to the centers of the sub-regions.

Appendix D. Initial locations of unattended CDP and solutions via clustering algorithm

Fig. D9. Initial locations of unattended CDP.

Fig. D10. Solutions of unattended model when pt = 10 and S = 8.
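The benchmark for attended CDPs in Appendix C (Steps 0 to 3) can be sketched in a few lines. Several details are assumptions made only for illustration: rounding $K$ up to an integer, using a rectangular grid over the bounding box as the sub-regions (which may yield slightly more than $K$ cells), and the function names `num_subregions`, `grid_centers`, and `benchmark_sites`.

```python
import math


def num_subregions(area, S, alpha):
    """Step 1: K = R * alpha / (pi * S^2), rounded up (rounding is an assumption)."""
    return math.ceil(area * alpha / (math.pi * S ** 2))


def grid_centers(xmin, ymin, xmax, ymax, K):
    """Step 2 (assumed layout): split the bounding box into a rows x cols grid
    with at least K cells and return the cell centers."""
    cols = math.ceil(math.sqrt(K))
    rows = math.ceil(K / cols)
    w = (xmax - xmin) / cols
    h = (ymax - ymin) / rows
    return [
        (xmin + (c + 0.5) * w, ymin + (r + 0.5) * h)
        for r in range(rows)
        for c in range(cols)
    ]


def benchmark_sites(candidates, centers):
    """Step 3: for each sub-region center, pick the nearest potential location."""
    return [min(candidates, key=lambda p: math.dist(p, ctr)) for ctr in centers]


if __name__ == "__main__":
    # Hypothetical region of area 1000 with S = 8 and alpha = 0.9.
    K = num_subregions(1000, 8, 0.9)
    centers = grid_centers(0, 0, 40, 25, K)
    candidates = [(5, 5), (20, 6), (35, 4), (8, 20), (22, 18), (33, 21)]
    print(K, benchmark_sites(candidates, centers))
```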


Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.omega.2020.102280.

References

[1] Ahmadi-Javid A, Seyedi P, Syam SS. A survey of healthcare facility location. Comput Oper Res 2017;79:223–63.
[2] Angell R, Kraemer J. Generating customized marketing messages at a customer level using current events data. US Patent 2014;8(639):563.
[3] Ballou RH. Dynamic warehouse location analysis. J Mark Res 1968;5:271–6.
[4] Bard JF, Jarrah AI. Integrating commercial and residential pickup and delivery networks: a case study. Omega 2013;41:706–20.
[5] Bélanger V, Ruiz A, Soriano P. Recent optimization models and trends in location, relocation, and dispatching of emergency medical vehicles. Eur J Oper Res 2019;272:1–23.
[6] Bertsekas D, Nedic A. Convex analysis and optimization. 2003.
[7] Chan Y. Facility location: a survey of applications and methods. Transp Sci 1999;33:429–30.
[8] Chao L. 7-eleven expands locker space, hoping to cash in on e-commerce wave. Wall Street J 2015;12.
[9] Dan T, Marcot P. Competitive facility location with selfish users and queues. Oper Res 2019;67:479–97.
[10] Fischetti M, Ljubic I, Sinnl M. Redesigning benders decomposition for large-scale facility location. Manage Sci 2016;63:2146–62.
[11] Greg B. Amazon's new secret weapon: delivery lockers. Wall Street J 2012;7.
[12] Guerriero F, Miglionico G, Olivito F. Location and reorganization problems: the calabrian health care system case. Eur J Oper Res 2016;250:939–54.
[13] Guo M, Liao X, Liu J, Zhang Q. Consumer preference analysis: a data-driven multiple criteria approach integrating online information. Omega 2019:102074.
[14] Hillsman EL. Spatial analysis and location-allocation models. Econ Geogr 1988;64:196–8.
[15] Jenkins PR, Lunday BJ, Robbins MJ. Robust, multi-objective optimization for the military medical evacuation location-allocation problem. Omega 2019:102088.
[16] Kedia A, Kusumastuti D, Nicholson A. Acceptability of collection and delivery points from consumers perspective: a qualitative case study of christchurch city. Case Stud Transp Policy 2017;5:587–95.
[17] Khan MR, Manoj J, Singh A, Blumenstock J. Behavioral modeling for churn prediction: early indicators and accurate predictors of custom defection and loyalty. In: Proceedings of the 2015 IEEE international congress on big data, IEEE computer society, New York, NY, USA; 2015. p. 677–80.
[18] Lee J, Podlaseck M, Schonberg E, Hoch R. Visualization and analysis of clickstream data of online stores for understanding web merchandising. Data Min Knowl Discov 2001;5:59–84.
[19] Liu C, Wang Q, Susilo YO. Assessing the impacts of collection-delivery points to individuals activity-travel patterns: a greener last mile alternative? Transp Res Part E 2019;121:84–99.
[20] Melo M, Nickel S, da Gama FS. Facility location and supply chain management – a review. Eur J Oper Res 2009;196:401–12.
[21] Montgomery AL, S Li KS, Liechty JC. Modeling online browsing and path analysis using clickstream data. Mark Sci 2004;23:579–95.
[22] Morganti E, Dablanc L, Fortin F. Final deliveries for online shopping: the deployment of pickup point networks in urban and suburban areas. Res Transp Bus Manage 2014;11:23–31.
[23] Nguyen TV, Zhou L, Chong AYL, Li B, Pu X. Predicting customer demand for remanufactured products: a data-mining approach. Eur J Oper Res 2020;281:543–58.
[24] Kinay OB, da Gama FS, Kara BY. On multi-criteria chance-constrained capacitated single-source discrete facility location problems. Omega 2019;83:107–22.
[25] Ortiz-Astorquiza C, Contreras I, Laporte G. Multi-level facility location problems. Eur J Oper Res 2018;267:791–805.
[26] Poel DVd, Buckinx W. Predicting online-purchasing behaviour. Eur J Oper Res 2005;166:557–75.
[27] Shan W, Yan Q, Chen C, Zhang M, Yao B, Fu X. Optimization of competitive facility location for chain stores. Ann Oper Res 2019;273:187–205.
[28] Taherkhani G, Alumur SA. Profit maximizing hub location problems. Omega 2019;86:1–15.
[29] Wang X, Yuen KF, Wong YD, Teo CC. An innovation diffusion perspective of e-consumers initial adoption of self-collection service via automated parcel station. Int J Logist Manage 2018;29:237–60.
[30] Weltevreden JWJ. B2c e-commerce logistics: the rise of collection-and-delivery points in the netherlands. Int J Retail Distrib Manage 2008;36(8):638–60.
[31] Williams R. Pick up your amazon deliveries on your tube commute. Telegraph 2014;25.
[32] Wu H, Shao D, Ng WS. Locating self-collection points for last-mile logistics using public transport data. In: Cao T, Lim EP, Zhou ZH, Ho TB, Cheung D, Motoda H, editors. Advances in knowledge discovery and data mining. Springer International Publishing, Cham; 2015. p. 498–510.
[33] Xu J, Hong L, Li Y. Designing of collection and delivery point for e-commerce logistics. In: Proceedings of the 2011 international conference of information technology. In: Computer engineering and management sciences, vol. 03. Washington, DC, USA: IEEE Computer Society; 2011. p. 349–52.
[34] Yuen KF, Wang X, Ng LTW, Wong YD. An investigation of customers intention to use self-collection services for last-mile delivery. Transp Policy 2018;66:1–8.
[35] Chauhan A, Singh A. A hybrid multi-criteria decision making method approach for selecting a sustainable location of healthcare waste disposal facility. J Clean Prod 2016;139(15):1001–10.
[36] Farahani Z, Samira F, Ruiz R, Sara H. OR models in urban service facility location: a critical review of applications and future developments. Eur J Oper Res 2016;276(1):1–27.
[37] Janjevic M, Winkenbach M, Daniel Merchán. Integrating collection-and-delivery points in the strategic design of urban last-mile e-commerce distribution networks. Transp Res E Logist Transp Rev 2019;137:37–67.
[38] Church R, Velle CR. The maximal covering location problem. Papers in regional science 1974;32(1):101–18.
[39] Mitchell JE. Branch-and-Cut Algorithms for Combinatorial Optimization Problems, 1999.
