SSRN Id4322670

Crowdsourcing Last-Mile Delivery with Hybrid Fleets
under Uncertainties of Demand and Driver Supply:

Optimizing Profitability and Service Level
Akshit Goyal
Industrial and Systems Engineering, University of Minnesota, Twin Cities, goyal080@umn.edu
Yiling Zhang
Industrial and Systems Engineering, University of Minnesota, Twin Cities, yiling@umn.edu
Saif Benjaafar
Industrial and Systems Engineering, University of Minnesota, Twin Cities, saif@umn.edu
Electronic copy available at: https://ssrn.com/abstract=4322670

Last-Mile Delivery Optimization with Hybrid Crowdsourced Fleets
Problem definition: With the emergence and growth of e-commerce, e-retailers are challenged to provide
faster and more cost-effective last-mile delivery by crowdsourcing independent drivers (IDs). To mitigate
the risk of insufficient ID supply, retailers such as Amazon, Walmart, and other platforms rely on a hybrid
delivery fleet in which professional drivers (PDs) complement IDs for more viable delivery services. Such
hybrid fleet delivery systems involve complex planning and operational decisions under multiple sources of
uncertainty, which impose significant computational challenges. The potential value of using IDs may vary
with the impact of these uncertainties. Methodology: We formulate the problem as a multistage stochastic
integer program. At the planning level, we optimize the size of the PD fleet and its allocation to each zone under
uncertain demand and IDs’ availability. On a daily basis, each stage corresponds to a delivery time window.
For each time window, given a realized demand-ID scenario, we construct an expanded transportation
network with time and delivery-capacity status to model the operations of the hybrid fleet including order
allocation and routing. We develop an iterative method based on approximate dynamic programming that
makes use of piecewise linear functions. Results: Via testing instances generated based on the City of
Minneapolis, Minnesota, we demonstrate the efficacy of the solution approach and study the benefits of
employing IDs and how the uncertainties impact the employment of IDs. Managerial implications: The
hybrid delivery fleet can significantly increase the profitability and service level of last-mile delivery. The use
of IDs can withstand the impacts of larger demand surges and spatial variation in demand. Given
the lower-fixed-cost and higher-variable-cost structure of hiring IDs, increasing demand volatility can lead
to less use of IDs and smaller system benefits in profitability and service level. More IDs are used (i) when
more IDs are available, (ii) when their vehicle capacity is larger, or (iii) when the demand locations are
relatively far from depots/warehouses.
Key words : crowdsourcing, last-mile delivery, stochastic integer programming, approximate dynamic
programming methods
1. Introduction
With more than 50% of the world's population living in urban areas¹ and wide access to mobile devices,
there is a global trend of transforming retail into a mostly online industry. In the past decades,
business-to-consumer (B2C) e-commerce has experienced astonishing growth and is expected to
take about a one-quarter share of total global retail sales in 2023 (Aggarwal 2019). Over the past few
years, COVID-19 has further accelerated this growth and the transformation by “4 to 6 years” (Koetsier
2020). This growth comes with increasing competition in providing delivery services with shorter
time windows, leading to an escalation from “next-day delivery” to “same-day delivery” to “2-hour
delivery.” In delivery logistics, last-mile delivery is the most expensive and time-consuming
component (Dolan 2018). To improve their logistics competitiveness, large e-retailers are turning
to crowdsourced delivery for providing faster and more cost-effective last-mile delivery.
In crowdsourced delivery systems, a group of independent drivers (IDs) carry out the last-mile
deliveries with their privately owned vehicles, from warehouses or stores to the customer’s doorstep.
The drivers and retailers are connected via a delivery platform that matches the drivers with pickup
orders. The platforms are owned mainly by two categories of companies: e-retailers (e.g., Amazon Flex, Walmart
Spark Delivery, and Target Shipt) and third-party couriers (e.g., Postmates, Instacart, Doordash,
and TaskRabbit). Some crowdsourced delivery systems, for example, Amazon Flex, require IDs to
sign up for their availability in advance. A fixed flat-rate pay is guaranteed for IDs in their available
time slots even if they are not matched with any delivery tasks. Another form of crowdsourced
delivery systems, initially suggested by Walmart (Barr 2013), asks in-store customers to drop off
packages of online customers with a detour from their initial trip home.
Although IDs are preferred by platforms because they are asset-light, one challenge of relying
on IDs is their uncertain availability, as the drivers decide when, where, and how long to work. To
achieve a more reliable service guarantee and enhance customer satisfaction, there is an emerging
business model (employed by, e.g., Amazon, Walmart, Veho, and Bringg) of using a hybrid delivery
fleet that complements ID vehicles with a fleet of vehicles owned by the service provider and operated by
contracted professional drivers (PDs). From a service provider’s point of view, although operating
their own fleet of vehicles can provide more reliable service, coordinating the PD fleet with IDs adds
management and maintenance complexities and, more critically, increases financial investment.
Thus, while maintaining the quality of service (QoS), the overall profitability can be questionable
for the hybrid fleet. For crowdsourced platforms to balance profitability and QoS, at the
planning level, it is important to decide the fleet size of PDs and optimally allocate PDs to service
zones. For daily operations, when orders arrive, platforms need to allocate them to either IDs or
PDs. For more efficient delivery, platforms such as Amazon and Walmart also recommend routes to
drivers. Moreover, spatially and temporally varying demand adds another layer of difficulty
in operating the hybrid fleet. The objectives of this paper pertain mainly to investigating the key
drivers for using more IDs (and thus lowering the financial investment) while maintaining good
QoS by considering the planning and operations of the hybrid fleet under uncertain demand and
ID supply.
1.1. Methodology Overview
Multistage stochastic programming is a model commonly used for system planning under
uncertainty, where the uncertainties are revealed sequentially. Decisions that are implemented
before the observation of uncertainties are proactive and are often associated with planning
decisions, while those made after the revelation of some uncertainties are reactive and are often
associated with operational decisions.
In this paper, we develop a multistage stochastic integer linear programming model for optimizing
operational decisions for each delivery time window after the planning decision of allocating PDs
is determined. We employ the sample average approximation (SAA) method (e.g., Kleywegt et al.
2002, Shapiro et al. 2009) by using Monte Carlo sampling techniques to generate scenario paths
of the random delivery demand and available IDs. For each sample path, we construct expanded
spatial-temporal networks with delivery-capacity status to model the operational decisions of order
dispatch, routing, and relocation based on the planning decision of PD allocation. In particular,
all the planning and operational decision variables are purely integral. Due to the large number
of integral decisions, the computation can be challenging. We employ an approximate dynamic
programming (ADP) method (see, e.g., Topaloglu and Powell 2006) to solve the generated SAA
problem by constructing piecewise linear approximations of the corresponding value function in a
dynamic programming formulation of the problem.
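
The sampling step described above can be sketched roughly as follows. This is a minimal illustration only: the uniform integer distributions, the bounds `max_demand` and `max_ids`, and the names `sample_scenario_paths` and `saa_estimate` are assumptions made for this sketch and are not taken from the model in this paper.

```python
import random


def sample_scenario_paths(num_paths, num_windows, locations, zones,
                          max_demand=5, max_ids=3, seed=0):
    """Draw sample paths; each path lists one (demand, ids) pair per time window.

    The uniform integer distributions are placeholders (an assumption),
    not the distributions used in the paper.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(num_paths):
        path = []
        for _ in range(num_windows):
            demand = {k: rng.randint(0, max_demand) for k in locations}
            ids = {j: rng.randint(0, max_ids) for j in zones}
            path.append((demand, ids))
        paths.append(path)
    return paths


def saa_estimate(paths, cost_of_path):
    """Sample average of a user-supplied path cost, with equal weights."""
    return sum(cost_of_path(p) for p in paths) / len(paths)


paths = sample_scenario_paths(num_paths=4, num_windows=3,
                              locations=["X", "Y"], zones=["A"])
# Hypothetical path cost: total demand along the path.
avg_demand = saa_estimate(paths, lambda p: sum(sum(d.values()) for d, _ in p))
```

In an SAA model, `cost_of_path` would be replaced by the optimal operational cost of a path given the planning decision; here it is only a stand-in.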

1.2. Contributions and Main Results
We summarize our contributions and main results as follows.
1. We develop a new model for hybrid delivery systems with multiple depots/warehouses account-
ing for the strategic planning problem of allocating PD fleet in service zones to satisfy uncertain
demand with uncertain ID supply. We formulate a multistage stochastic integer program-
ming model to optimize fleet planning, including service coverage and fleet deployment while
accounting for routing and matching of drivers, under two uncertainties from delivery demand
and IDs’ availability.
2. Via realistic numerical experiments covering ten Minneapolis zip-code areas that are served by
two Amazon Flex warehouses, we show that the hybrid delivery fleet can significantly improve
profitability and service level. The use of IDs can withstand the impacts of larger demand
surges and spatial variations in demand. Analogous to the base-surge leagile supply chain
strategy, IDs are used as agile principles for fulfilling random surge demand while PDs are used
to fulfill base demand. Different from supply chain literature (e.g., Christopher and Towill
2001, Allon and Van Mieghem 2010), which suggests that increasing demand volatility and
increasing demand surge increase demand allocation to agile principles, we find that increasing
demand volatility and surge can decrease the use of IDs and thus can decrease the demand
fulfilled by IDs given the low-fixed-cost and high-variable-cost structure of IDs. Furthermore,
more IDs are used when the IDs’ availability is higher, when their vehicle capacity is larger,
and when demand locations are relatively far from depots/warehouses.
3. We demonstrate the computational efficacy of an ADP method based on piecewise linear
function approximations for optimizing the multistage stochastic integer programming model
with a large number of operational decision variables in an expanded transportation network
for each delivery time window. The computational performance is further enhanced in parallel
computing.

1.3. Structure of the Paper
The remainder of the paper is organized as follows. In Section 2, we review relevant studies on
crowdsourced last-mile delivery. In Section 3, we describe the problem and formulate the multistage
stochastic integer programming model with the details of the spatial-temporal delivery networks
for each stage. Section 4 presents the ADP algorithm for solving the model. Section 5 shows insights
from the case studies and demonstrates the computational efficacy of the proposed algorithm. In
Section 6, we conclude the paper and discuss future research directions. For easy reference, we
summarize the notation of the multistage formulation in Table 6 in Appendix A.
2. Literature Review
Archetti et al. (2016) is among the first papers to consider the vehicle routing problem (VRP)
with occasional drivers (e.g., in-store customers who can carry out deliveries with a detour on
their way home). They assume that one occasional driver can make at most one delivery and that
there is an unlimited number of PDs available to complete the deliveries that cannot be carried
out by the occasional drivers. The problem is formulated as an integer program and a multi-start
heuristic is proposed. Macrina et al. (2017) extend the VRP with occasional drivers by considering
time windows for both the occasional drivers and customers and assuming that the occasional
drivers can make multiple deliveries. In later work, Macrina et al. (2020) further consider
the VRP with transshipment nodes closer to the delivery areas, which can potentially attract more
occasional drivers to perform deliveries. Raviv and Tenzer (2018) study the routing problem of a
more general setting with service points in the network. Each service point is used as a drop-off,
pickup, or intermediate transfer location. People who sign up for delivery stop by service points
during their trips.
Due to the inherent uncertainties in the driver supply and demand, recently, stochastic crowd-
sourced delivery concerning system uncertainties has attracted increasing interest. Gdowska et al.
(2018) consider the matching and routing problem of stochastic last-mile delivery with a hybrid
fleet, where the occasional drivers may decline delivery tasks. They propose a heuristic method
to identify a subset of customers to be occasional drivers. Dayarian and Savelsbergh (2020) focus
on the routing problem of the hybrid fleet with uncertain occasional drivers’ availability. They
develop rolling horizon dispatching approaches with and without incorporating future informa-
tion of occasional drivers’ availability. Arslan et al. (2019) assume that both delivery demand and
drivers arrive dynamically. They develop a recursive algorithm to match drivers and orders in a
rolling horizon framework. In a bilevel framework, Gdowska et al. (2018) study the occasional
drivers’ willingness to accept or reject assigned delivery tasks. Castillo et al. (2022) use discrete
event simulation techniques to simulate home delivery with a hybrid fleet for a retail pharmacy.
Behrendt et al. (2022) is the most relevant work to our paper, which studies joint planning and
coordination for a hybrid fleet using fluid models. On the planning level, they consider the fleet
size of PDs and compensation for IDs; on the operational level they consider order allocations to
either IDs or PDs. They model the delivery of food or pharmacy orders, which requires each vehicle to
take only one order in a trip and thus routing is not needed. In contrast, our paper allows multiple
orders in one trip, which is common for order delivery by e-commerce retailers.
Crowdsourced delivery systems with intermediate transshipment nodes have also been studied.
Qi et al. (2018) look into a one-transshipment system where the packages are carried by trucks or
vans in the inbound stage and the outbound stage is carried out by IDs. They propose a continuous
approximation model for the planning decisions such as truck routes and service zones. Mousavi
et al. (2022) consider a variant where a mobile depot is allocated for occasional drivers to pick up
packages for delivery. The problem is formulated as a two-stage stochastic integer program. We
refer interested readers to Alnaggar et al. (2021) and Savelsbergh and Ulmer (2022) for recent reviews
of crowdsourced delivery platforms and operations research literature.
All the literature mentioned here assumes one depot/warehouse and one delivery trip (with
one or multiple stops) for crowdsourced drivers on their way to their destinations, whereas in this
paper multiple depots/warehouses and multiple delivery trips are allowed, which is not uncommon
in practice for crowdsourced drivers. To the best of our knowledge, we are the first to integrate
both planning decisions of fleet sizing and operational decisions of matching and routing for a
hybrid delivery fleet into comprehensive stochastic programming models that take into account
both random driver supply and demand.
We also note that there is another stream of research concerning labor acquisition and welfare in
crowdsourced services that considers pricing and wage rates, although it is not a focus of this
paper. For instance, Benjaafar et al. (2022) study labor welfare in on-demand platforms. Lei et al.
(2020) study the problem of dynamically acquiring new drivers to fulfill demand in crowdsourced
delivery platforms. Bai et al. (2019) study pricing strategies on on-demand service platforms.
Fatehi and Wagner (2022) consider a labor planning and pricing problem for crowdsourced delivery
systems using a robust queueing model.
3. Problem Description and Formulation
We consider fleet sizing and management of a hybrid delivery fleet in an urban area where several
depots/warehouses keep inventory for fulfilling orders from online customers. In particular, we
focus on the impact of crowdsourcing when taking into account uncertainties of demand and IDs’
availability. Specifically, we consider a last-mile delivery system over a finite planning horizon. The
planning time horizon has a partition in which each interval represents a delivery time window.
For example, Amazon Flex and Instacart have delivery windows of one or two hours. There is a
set of potential demand locations where online orders need to be delivered. Each demand location
can represent a delivery location for multiple orders and the home locations of multiple customers.
For example, a demand location can represent a subdivision or an apartment complex. When
customers place an order online, they specify a preferred delivery time window, within which the
order has to be delivered. The delivery tasks can be assigned to either PDs or IDs. There are
three major differences between PDs and IDs. (1) Availability: PDs are available throughout the
entire planning horizon, while IDs are only available for the time windows they sign up for. (2)
Cost breakdown: For PDs, the crowdsourced delivery system has to pay for vehicle purchase costs,
drivers’ compensation, and maintenance of vehicles (including fuel, depreciation, etc.), whereas
for IDs, the system only pays for their compensation. (3) After last delivery: At the end of their
respective service durations, from the last delivery locations, PDs have to return to their depots,
while IDs may leave directly. For any demand that is not fulfilled by the crowdsourced delivery
system, we assume a penalty, which may be interpreted as the expense incurred if the delivery is instead
carried out by a third-party logistics company. Different from most other work considering in-store
customers as occasional drivers (see, e.g., Archetti et al. 2016, Dayarian and Savelsbergh
2020) who make only one delivery trip on their way home, the IDs in this paper are available for
multiple trips during the time windows they sign up for. In the rest of the paper, we refer to
depots/warehouses simply as warehouses, which serve as transport hubs for orders.
Consider a service region where there are multiple warehouses. Each warehouse supplies demand
for a pre-specified area. The problem is formulated as a multistage stochastic integer linear pro-
gramming model, where the first stage makes a planning (long-term) decision of how many PDs to
employ for each warehouse. Each subsequent stage corresponds to one delivery time window. In each
delivery time window, upon the arrival of orders and the revelation of available IDs, the system
operator decides the order-driver assignments and routing for both PDs and IDs to estimate the
expected operational cost minus revenue over the planning horizon given the planning decision.
3.1. Planning Problem and Formulation
Let $\mathcal{J} = \{1, \dots, J\}$ be the set of warehouse locations. The number of PDs hired at warehouse
$j$ cannot exceed $M_j$. We denote by the nonnegative integer variable $x_j$ the number of PDs to be
hired at warehouse $j \in \mathcal{J}$, and we associate the per-driver hiring cost $c_j$ with $x_j$ for warehouse $j \in \mathcal{J}$.
Let $x = (x_j, j \in \mathcal{J})^\top$ and $c = (c_j, j \in \mathcal{J})^\top$. We formulate the overall problem as a multistage stochastic integer linear
program as follows.

$$\min_{x} \left\{ c^\top x + V(x) \;:\; x_j \le M_j,\ j \in \mathcal{J},\ x \in \mathbb{Z}_+^{J} \right\} \qquad (1)$$

Here, (1) minimizes the total cost of hiring PDs plus the expected operational cost $V(x)$ over the
planning horizon, which we will specify later in Section 3.3.
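
To make the structure of problem (1) concrete, the following sketch enumerates all feasible fleet vectors for a tiny instance. It is an illustration under stated assumptions: brute-force enumeration is not the solution method of this paper, and `toy_V` is a hypothetical stand-in for the expected operational cost $V(x)$, which in the actual model comes from the multistage problem of Section 3.3.

```python
from itertools import product


def solve_planning(c, M, V):
    """Minimize c^T x + V(x) over integer x with 0 <= x_j <= M_j by enumeration."""
    best_x, best_val = None, float("inf")
    for x in product(*(range(m + 1) for m in M)):
        val = sum(cj * xj for cj, xj in zip(c, x)) + V(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


def toy_V(x):
    """Hypothetical operational cost: each PD short of three in total costs 10."""
    return 10.0 * max(0, 3 - sum(x))


# Two warehouses with hiring costs 4 and 6 and at most two PDs each (made-up data).
best_x, best_val = solve_planning(c=[4.0, 6.0], M=[2, 2], V=toy_V)
```

Enumeration grows exponentially in the number of warehouses, so it is only useful for checking small instances.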

The service region is divided into J zones. Each zone contains exactly one warehouse. There is
a set K = {1, . . . , K } of all potential demand locations in the service region. Denote by αjk the
pre-determined matching of demand locations to warehouses where αjk = 1 if demand location k
is served by warehouse j, 0 otherwise. Orders to location k will be picked up from warehouse j for
delivery by assigned drivers if αjk = 1. For warehouse j ∈ J , denote N (j) = {k ∈ K | αjk = 1} as the
set of demand locations in zone j. For simplicity, in the rest of the paper, we abuse the notation
of j ∈ J to denote the zone that is served by warehouse j.
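
As a small illustration of how the zone sets follow from the matching parameters, the sketch below builds N(j) from a nested dictionary of αjk values. The function name `zone_sets`, the dictionary layout, and the check that every demand location is matched to exactly one warehouse are assumptions of this sketch.

```python
def zone_sets(alpha):
    """Return N(j) = {k : alpha[j][k] == 1} for every warehouse j."""
    zones = {j: {k for k, a in row.items() if a == 1} for j, row in alpha.items()}
    # Assumption of this sketch: each demand location has exactly one warehouse.
    all_locations = set().union(*(row.keys() for row in alpha.values()))
    for k in all_locations:
        served_by = [j for j in alpha if alpha[j].get(k, 0) == 1]
        if len(served_by) != 1:
            raise ValueError(f"location {k} is matched to {len(served_by)} warehouses")
    return zones


# Hypothetical matching: warehouse A serves X and Y, warehouse B serves Z.
alpha = {"A": {"X": 1, "Y": 1, "Z": 0},
         "B": {"X": 0, "Y": 0, "Z": 1}}
zones = zone_sets(alpha)
```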
3.2. Spatial-Temporal Delivery Network
We consider a planning horizon of periods $\mathcal{T} = \{1, \dots, T\}$, which is divided into $N$ time
windows indexed by $n = 1, \dots, N$. The first $N - 1$ time windows are for delivery, where both PDs and IDs are active in
the system. The last time window is for all the PDs returning to their respective warehouses
(while the IDs can directly leave from their last delivery locations). For notational simplicity, the
last time window is viewed as a special delivery time window where there is no demand and
only PDs are active and returning to their warehouses. Delivery time window $n$ consists of periods
$\{t_n, t_n + 1, \dots, s_n - 1, s_n\}$. For time window $n$, given the first-stage decision $x$ and operations in
the previous time windows, we optimize drivers’ operations, including dispatching, delivery, and
relocation over two spatial-temporal delivery networks, $G_{1n} = (\mathcal{N}_{1n}, \mathcal{A}_{1n})$ and $G_{2n} = (\mathcal{N}_{2n}, \mathcal{A}_{2n})$, for
IDs and PDs, respectively.
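
For intuition, the partition of the horizon into windows can be sketched as below. The equal window length is an assumption of this sketch only; the text above does not require windows of equal length, and the name `time_windows` is hypothetical.

```python
def time_windows(T, N):
    """Split periods 1..T into N consecutive windows, returned as (t_n, s_n) pairs.

    Assumes equal-length windows, which is a simplification for illustration.
    """
    if N <= 0 or T % N != 0:
        raise ValueError("T must be a positive multiple of N in this sketch")
    length = T // N
    return [(n * length + 1, (n + 1) * length) for n in range(N)]


# Six periods split into three windows; the last one would be the PD return window.
windows = time_windows(T=6, N=3)
```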
We assume that the PDs are available for the entire planning horizon; while the IDs are only
available for the delivery time windows that they sign up for. For timely delivery, the IDs are
assumed to arrive one period before the signed-up delivery time window so that they will be ready
for delivery when the delivery time window starts. For the delivery time window n, the IDs arrive
at the warehouses at $t_n - 1$. Let $\tilde{y}_{jts}$ be the number of IDs available in zone $j$ from period $t$
to $s$ and $\tilde{d}_{kts}$ be the number of orders that have to be delivered between periods $t$ and $s$ at demand
location $k$. Let $\tilde{d}_n = (\tilde{d}_{k,t_n,s_n}, k \in \mathcal{K})^\top$ be the random demand of delivery time window $n$, and
$\tilde{y}_n = (\tilde{y}_{j,t_n-1,s_n}, j \in \mathcal{J})^\top$ be the random number of IDs available in delivery time window $n$. For
delivery time window $n$, we denote the uncertainty by $\tilde{\xi}_n = [\tilde{d}_n, \tilde{y}_n]$ and construct networks $G_{1n}$ and
$G_{2n}$ for IDs and PDs, respectively. The networks for delivery time window $n$ include nodes one
period earlier than the time window, starting from period $t_n - 1$.
Each order placed at location $k \in \mathcal{K}$ generates a revenue of $r_k$ and is assigned to IDs or PDs. For
any unmet demand, a penalty $\rho_k$ is incurred, which can be interpreted as the cost of fulfilling
the order through a third-party logistics company. When the orders are in zone $j$ and assigned to IDs,
warehouse $j$ dispatches them to the IDs with a cost of $d^{\mathrm{disp}}_{j}$ per vehicle. We assume the dispatch
can be done in one time period. Each ID may take multiple orders in one delivery trip. Then IDs
deliver the assigned orders with a delivery cost $d^{\mathrm{deliv}}_{ik}$ per vehicle per period for delivering from one
location (a warehouse or demand location) $i \in \mathcal{J} \cup \mathcal{K}$ to another demand location $k \in \mathcal{K}$. Within
the IDs’ available time window, IDs may carry out several delivery trips. When returning from
the last delivery location $k$ to warehouse $j$ for the next trip, a return cost $d^{\mathrm{return}}_{kj}$ is incurred per vehicle
per period, where demand location $k$ is within zone $j$, i.e., $k \in \mathcal{N}(j)$. If no more delivery tasks
are assigned to an ID after their last delivery location, we assume that the ID stays idle at the last
delivery location $k$ with an idle cost $d^{\mathrm{idle}}_{k}$ per vehicle per period until the end of their available time
window.
Similarly, for the orders assigned to PDs, we denote by $c^{\mathrm{disp}}_{j}$, $c^{\mathrm{deliv}}_{ik}$, and $c^{\mathrm{return}}_{kj}$ the dispatching cost, the
delivery cost, and the return cost. Assume that the PDs can only be idle at warehouses and an idle
cost $c^{\mathrm{idle}}_{j}$ occurs per vehicle per period. One difference between the PDs and IDs is that the PDs
can be relocated to another zone due to a spatial mismatch between supply and demand. Let $c^{\mathrm{relo}}_{ij}$
be the relocation cost per vehicle per period from zone $i$ to zone $j$.
We use the term “movements” to denote changes in vehicle states and use arcs a ∈ A1n and
a ∈ A2n to represent all the movements in the two spatial-temporal delivery networks G1n and G2n
for IDs and PDs, respectively. We assume homogeneous-sized orders and homogeneous fleets for
IDs and PDs, respectively. Let $B^1 = \{0, 1, \dots, b^1_{\max}\}$ be the set of the number of orders one ID
vehicle can carry, where $b^1_{\max}$ is the delivery capacity; similarly define $B^2$ and $b^2_{\max}$ for PD vehicles.
Let node $m_{itb} \in \mathcal{N}_{1n}$ with $b \in B^1$ represent a state of IDs and node $m_{itb} \in \mathcal{N}_{2n}$ with $b \in B^2$ represent
a state of PDs, i.e., being at location $i \in \mathcal{J} \cup \mathcal{K}$ (a warehouse or a demand location) in time period
$t$ with $b$ orders on board. Moreover, in the node set $\mathcal{N}_{1n}$ for IDs, we introduce a dummy entry node $m_{000}$ to denote the state
of all the IDs before they start services and a dummy exit node $m_{0T0}$ to denote the state of all the
IDs who have finished service. In each network, we construct arcs between pairs of nodes for all
possible transitions between their corresponding states. Denote by $\tau_{i\ell}$ the minimum travel time from
location $i$ to location $\ell$. We consider the following types of arcs $\mathcal{A}_{1n}$ and $\mathcal{A}_{2n}$ in the $n$th delivery
time window, for IDs and PDs, respectively.
Arcs of $\mathcal{A}_{1n}$ for IDs for delivery time window $[t_n, s_n]$.
1. Entry arcs $\mathcal{A}^{\mathrm{entry}}_{1n} = \{(m_{000}, m_{j,t_n-1,0}) : j \in \mathcal{J}\}$: Flows on these arcs represent IDs available
for delivery time window $n$ arriving at warehouse $j$ with zero cost.
2. Dispatch arcs $\mathcal{A}^{\mathrm{disp}}_{1n} = \{(m_{jt0}, m_{j,t+1,b}) : j \in \mathcal{J},\ t_n \le t + 1 < s_n,\ 0 \le b \le b^1_{\max}\}$: Flows on these
arcs represent dispatching $b$ orders to IDs at warehouse $j$ from period $t$ to $t + 1$ with cost $d^{\mathrm{disp}}_{j}$
per unit flow.
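3. Delivery arcs. Define delivery arcs $\mathcal{A}^{\mathrm{deliv}}_{1n} = \mathcal{A}^{\mathrm{f\text{-}deliv}}_{1n} \cup \mathcal{A}^{\mathrm{i\text{-}deliv}}_{1n}$, where $\mathcal{A}^{\mathrm{f\text{-}deliv}}_{1n}$ and $\mathcal{A}^{\mathrm{i\text{-}deliv}}_{1n}$ are
described as follows.

(a) First delivery arcs $\mathcal{A}^{\mathrm{f\text{-}deliv}}_{1n} = \{(m_{jtb}, m_{k,t+\tau_{jk},b-b_k}) : t_n \le t \le t + \tau_{jk} \le s_n,\ 0 \le b_k \le b \le b^1_{\max},\ \tilde{d}_{k,t_n,s_n} > 0,\ k \in \mathcal{N}(j),\ j \in \mathcal{J}\}$: Flows on these arcs represent IDs at warehouse $j$ at
time $t$ with $b$ orders delivering $b_k$ orders to demand location $k$. The cost per unit flow is
$d^{\mathrm{deliv}}_{jk} \tau_{jk} - r_k b_k$, which incorporates the cost $d^{\mathrm{deliv}}_{jk} \tau_{jk}$ of hiring an ID and the revenue $r_k b_k$ of package
delivery.

(b) Intermediate delivery arcs $\mathcal{A}^{\mathrm{i\text{-}deliv}}_{1n} = \{(m_{ktb}, m_{k',t+\tau_{kk'},b-b_{k'}}) : t_n \le t \le t + \tau_{kk'} \le s_n,\ 0 \le b_{k'} \le b \le b^1_{\max},\ \tilde{d}_{k,t_n,s_n} > 0,\ k \ne k' \in \mathcal{N}(j),\ j \in \mathcal{J}\}$: Flows on these arcs represent IDs in
zone $j$ at demand location $k$ at time $t$ with $b$ orders delivering $b_{k'}$ orders to another
demand location $k'$ with cost $d^{\mathrm{deliv}}_{kk'} \tau_{kk'} - r_{k'} b_{k'}$ per unit flow.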

4. Return arcs $\mathcal{A}^{\mathrm{return}}_{1n} = \{(m_{kt0}, m_{j,t+\tau_{kj},0}) : t_n \le t \le t + \tau_{kj} < s_n,\ \mathcal{N}(j) \ni k,\ j \in \mathcal{J},\ k \in \mathcal{K}\}$:
Flows on these arcs represent IDs returning to warehouse $j$ from their last delivery location
$k \in \mathcal{N}(j)$ in one trip from period $t$ to $t + \tau_{kj}$ with cost $d^{\mathrm{return}}_{kj} \tau_{kj}$ per unit flow.

5. Idle arcs $\mathcal{A}^{\mathrm{idle}}_{1n} = \{(m_{kt0}, m_{k,t+1,0}) : k \in \mathcal{K},\ t_n \le t \le t + 1 \le s_n\}$: Flows on these arcs represent
IDs with no orders idling at demand location $k$ from period $t$ to $t + 1$ with cost per vehicle
$d^{\mathrm{idle}}_{k}$.

6. Exit arcs $\mathcal{A}^{\mathrm{exit}}_{1n} = \{(m_{k s_n 0}, m_{0T0}) : k \in \mathcal{K}\}$: Flows on these arcs represent IDs at their last
delivery location $k$ exiting the system at the end time $s_n$ of the $n$th time window. Similar to
the entry arcs, there is zero cost associated with the flows.
The arc set $\mathcal{A}_{1n}$ of IDs is the union of the six types of arcs, i.e., $\mathcal{A}_{1n} = \mathcal{A}^{\mathrm{entry}}_{1n} \cup \mathcal{A}^{\mathrm{disp}}_{1n} \cup \mathcal{A}^{\mathrm{deliv}}_{1n} \cup \mathcal{A}^{\mathrm{return}}_{1n} \cup
\mathcal{A}^{\mathrm{idle}}_{1n} \cup \mathcal{A}^{\mathrm{exit}}_{1n}$. In Figure 1, we illustrate an example of the spatial-temporal delivery network $G_{1n}$ of
IDs, with one warehouse (A), two demand locations (X and Y), five time periods ($\mathcal{T} = \{1, 2, 3, 4, 5\}$),
and delivery capacity of 2 ($B^1 = \{0, 1, 2\}$). The numbers alongside different types of arcs indicate a
flow solution of entering, dispatching, delivering, returning, idling, and exiting IDs. We illustrate
all six types of arcs in Figure 1. Note that within the part of the network for the same location,
there are only dispatch and idle arcs, as they do not require a location transition of IDs. The other
four types of arcs (entry, delivery, return, and exit arcs) connect two nodes representing different
locations.
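
The arc definitions above can be turned into a small enumeration routine. The sketch below is illustrative only: it covers the entry, dispatch, first delivery, return, idle, and exit arcs and omits intermediate delivery arcs for brevity. The function name `build_id_arcs`, the tuple encoding of nodes as (location, period, load), and the unit travel times in the example are assumptions, not part of the paper.

```python
def build_id_arcs(t_n, s_n, T, b_max, zones, tau, demand):
    """Enumerate ID arcs for one delivery window [t_n, s_n].

    Nodes are tuples (location, period, load); zones maps warehouse -> locations;
    tau[(i, k)] is the travel time; demand[k] is the realized demand at k.
    Intermediate delivery arcs are omitted in this sketch.
    """
    source, sink = (0, 0, 0), (0, T, 0)            # dummy entry and exit nodes
    arcs = {name: [] for name in
            ("entry", "dispatch", "first_delivery", "return", "idle", "exit")}
    loads = range(b_max + 1)
    for j, locations in zones.items():
        arcs["entry"].append((source, (j, t_n - 1, 0)))
        for t in range(t_n - 1, s_n - 1):          # t_n <= t + 1 < s_n
            for b in loads:
                arcs["dispatch"].append(((j, t, 0), (j, t + 1, b)))
        for k in locations:
            for t in range(t_n, s_n + 1):
                arrive = t + tau[(j, k)]
                if demand[k] > 0 and arrive <= s_n:
                    for b in loads:
                        for b_k in range(b + 1):   # drop b_k of the b carried orders
                            arcs["first_delivery"].append(
                                ((j, t, b), (k, arrive, b - b_k)))
                if t + tau[(k, j)] < s_n:          # back before the window ends
                    arcs["return"].append(((k, t, 0), (j, t + tau[(k, j)], 0)))
                if t + 1 <= s_n:
                    arcs["idle"].append(((k, t, 0), (k, t + 1, 0)))
            arcs["exit"].append(((k, s_n, 0), sink))
    return arcs


# One warehouse A, demand locations X and Y, window [1, 4], capacity 2.
# Unit travel times are made up for this example.
zones = {"A": ["X", "Y"]}
tau = {("A", "X"): 1, ("X", "A"): 1, ("A", "Y"): 1, ("Y", "A"): 1}
arcs = build_id_arcs(t_n=1, s_n=4, T=5, b_max=2, zones=zones, tau=tau,
                     demand={"X": 2, "Y": 5})
arc_counts = {name: len(found) for name, found in arcs.items()}
```

Even this tiny instance produces several dozen arcs, which hints at why the expanded network leads to a large number of integer decision variables.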
[Figure 1: spatial-temporal network diagram. Panels: Warehouses, Demand locations. Vertical axis: # of pkgs. Horizontal axis: time periods t = 0, ..., 3. Arc legend: Idle, Delivery, Exit, Dispatch, Return, Entry.]
Figure 1 An example of $G_{1n}$ for IDs. Here, A is a warehouse with four IDs available from period 0 to 4. Two
demand locations X and Y request $d_{X14} = 2$ and $d_{Y14} = 5$ orders from period 1 to 4, respectively.

Arcs of $\mathcal{A}_{2n}$ for PDs for delivery time window $[t_n, s_n]$.
1. Dispatch arcs $\mathcal{A}^{\mathrm{disp}}_{2n} = \{(m_{jt0}, m_{j,t+1,b}) : j \in \mathcal{J},\ t_n - 1 \le t < s_n - 1,\ 0 \le b \le b^2_{\max}\}$: Flows on
these arcs represent PDs at warehouse $j$ dispatching $b$ orders from period $t$ to $t + 1$ with cost $c^{\mathrm{disp}}_{j}$
per unit flow.
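2. Delivery arcs. Define delivery arcs $\mathcal{A}^{\mathrm{deliv}}_{2n} = \mathcal{A}^{\mathrm{f\text{-}deliv}}_{2n} \cup \mathcal{A}^{\mathrm{i\text{-}deliv}}_{2n}$, where $\mathcal{A}^{\mathrm{f\text{-}deliv}}_{2n}$ and $\mathcal{A}^{\mathrm{i\text{-}deliv}}_{2n}$ are
defined as follows.

(a) First delivery arcs $\mathcal{A}^{\mathrm{f\text{-}deliv}}_{2n} = \{(m_{jtb}, m_{k,t+\tau_{jk},b-b_k}) : t_n \le t \le t + \tau_{jk} \le s_n,\ 0 \le b_k \le b \le b^2_{\max},\ \tilde{d}_{k,t_n,s_n} > 0,\ k \in \mathcal{N}(j),\ j \in \mathcal{J}\}$: Flows on these arcs represent PDs at warehouse $j$
with $b$ orders delivering $b_k$ orders to demand location $k$ from time period $t$ to $t + \tau_{jk}$. The
cost per unit flow is $c^{\mathrm{deliv}}_{jk} \tau_{jk} + c^{\mathrm{op}} l_{jk} - r_k b_k$, where $c^{\mathrm{deliv}}_{jk}$ is the unit time compensation
of a PD, $c^{\mathrm{op}}$ incorporates operational costs per travel distance, $l_{jk}$ is the travel distance
between $j$ and $k$, and $r_k b_k$ is the delivery revenue.

(b) Intermediate delivery arcs $\mathcal{A}^{\mathrm{i\text{-}deliv}}_{2n} = \{(m_{ktb}, m_{k',t+\tau_{kk'},b-b_{k'}}) : t_n \le t \le t + \tau_{kk'} \le s_n,\ 0 \le b_{k'} \le b \le b^2_{\max},\ \tilde{d}_{k,t_n,s_n} > 0,\ k \ne k' \in \mathcal{N}(j),\ j \in \mathcal{J}\}$: Flows on these arcs represent PDs
at demand location $k$ at time $t$ with $b$ orders delivering $b_{k'}$ orders to demand location $k'$
at time $t + \tau_{kk'}$ with cost $c^{\mathrm{deliv}}_{kk'} \tau_{kk'} + c^{\mathrm{op}} l_{kk'} - r_{k'} b_{k'}$ per unit flow.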

3. Relocation arcs $\mathcal{A}^{\mathrm{relo}}_{2n} = \{(m_{it0}, m_{j,t+\tau_{ij},0}) : t_n \le t \le t + \tau_{ij} < s_n,\ i \ne j \in \mathcal{J}\}$: Flows on these
arcs represent PDs at warehouse $i$ being relocated to warehouse $j$ from period $t$ to $t + \tau_{ij}$ with
cost $c^{\mathrm{relo}}_{ij} \tau_{ij} + c^{\mathrm{op}} l_{ij}$ per unit flow, where $c^{\mathrm{relo}}_{ij}$ is the relocation cost per unit time.

4. Idle arcs $\mathcal{A}^{\mathrm{idle}}_{2n} = \{(m_{jt0}, m_{j,t+1,0}) : j \in \mathcal{J},\ t_n \le t \le t + 1 \le s_n\}$: Flows on these arcs represent
PDs idling at warehouse $j$ from period $t$ to $t + 1$ with cost $c^{\mathrm{idle}}_{j}$ per unit flow.

5. Return arcs $\mathcal{A}^{\mathrm{return}}_{2n} = \{(m_{kt0}, m_{j,t+\tau_{kj},0}) : k \in \mathcal{N}(j),\ j \in \mathcal{J},\ t_n \le t \le s_n\}$: Flows on these arcs
represent PDs at their last demand location $k \in \mathcal{N}(j)$ at time $t$ returning to warehouse $j$ at
time $t + \tau_{kj}$ with cost $c^{\mathrm{return}}_{kj} \tau_{kj} + c^{\mathrm{op}} l_{kj}$ per unit flow, where $c^{\mathrm{return}}_{kj}$ is the return cost per unit
time.
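For delivery time window $n < N$ before the last time window, the set $\mathcal{A}_{2n}$ of PDs is the union of
the five types of arcs, i.e., $\mathcal{A}_{2n} = \mathcal{A}^{\mathrm{disp}}_{2n} \cup \mathcal{A}^{\mathrm{deliv}}_{2n} \cup \mathcal{A}^{\mathrm{return}}_{2n} \cup \mathcal{A}^{\mathrm{relo}}_{2n} \cup \mathcal{A}^{\mathrm{idle}}_{2n}$. For the last time window $N$,
where all PDs are returning, we note that there are only three types of arcs, $\mathcal{A}_{2N} = \mathcal{A}^{\mathrm{return}}_{2N} \cup \mathcal{A}^{\mathrm{relo}}_{2N} \cup
\mathcal{A}^{\mathrm{idle}}_{2N}$:

• Relocation arcs $\mathcal{A}^{\mathrm{relo}}_{2N} = \{(m_{it0}, m_{j,t+\tau_{ij},0}) : t_N \le t \le t + \tau_{ij} \le s_N,\ i \ne j \in \mathcal{J}\}$.

• Idle arcs $\mathcal{A}^{\mathrm{idle}}_{2N} = \{(m_{jt0}, m_{j,t+1,0}) : j \in \mathcal{J},\ t_N + 1 \le t \le t + 1 \le s_N\}$.

• Return arcs $\mathcal{A}^{\mathrm{return}}_{2N} = \{(m_{kt0}, m_{j,t+\tau_{kj},0}) : t_N \le t \le t + \tau_{kj} \le s_N,\ k \in \mathcal{N}(j),\ j \in \mathcal{J}\}$.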
In Figure 2, we illustrate an example of the spatial-temporal delivery network G2n , with two ware-
houses (A and B), three demand locations (X, Y, and Z), six time periods (T = {1, 2, 3, 4, 5, 6}), and
delivery capacity of 2 (B 2 = {0, 1, 2}). We illustrate all five types of arcs (dispatching, delivering,
returning, relocation, and idling) in Figure 2, among which the dispatch and idle arcs are within
the network of the same locations as they do not involve PD’s location transition. The other arcs
of delivery, return, and relocation connect across nodes of different locations.
We denote the per unit net flow cost and capacity on arc a by ca and Ca . The unit flow costs
and capacities of arcs in A1n and A2n of IDs and PDs are summarized in Table 1.
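| Arc set | Type of arc $a$ | Capacity $C_a$ | Cost per unit flow $c_a$ |
|---|---|---|---|
| $A_{1n}$ of IDs | Entry arc $(m_{000}, m_{j,t_n-1,0})$ | $\tilde{y}_{j,t_n-1,s_n}$ | $0$ |
| $A_{1n}$ of IDs | Dispatch arc $(m_{jt0}, m_{j,t+1,b})$ | $+\infty$ | $d^{disp}_j$ |
| $A_{1n}$ of IDs | First delivery arc $(m_{jtb}, m_{k,t+\tau_{jk},b-b_k})$ | $\tilde{d}_{k,t_n,s_n}$ | $d^{deliv}_{jk}\tau_{jk} - r_k b_k$ |
| $A_{1n}$ of IDs | Intermediate delivery arc $(m_{ktb}, m_{k',t+\tau_{kk'},b-b_{k'}})$ | $\tilde{d}_{k',t_n,s_n}$ | $d^{deliv}_{kk'}\tau_{kk'} - r_{k'} b_{k'}$ |
| $A_{1n}$ of IDs | Return arc $(m_{kt0}, m_{j,t+\tau_{kj},0})$ | $+\infty$ | $d^{return}_{kj}\tau_{kj}$ |
| $A_{1n}$ of IDs | Exit arc $(m_{k s_n 0}, m_{0T0})$ | $\sum_{j: k \in N(j)} \tilde{y}_{j,t_n-1,s_n}$ | $0$ |
| $A_{1n}$ of IDs | Idle arc $(m_{kt0}, m_{k,t+1,0})$ | $+\infty$ | $d^{idle}_k$ |
| $A_{2n}$ of PDs | Dispatch arc $(m_{jt0}, m_{j,t+1,b})$ | $\sum_{\ell \in J} x_\ell$ | $c^{disp}_j$ |
| $A_{2n}$ of PDs | First delivery arc $(m_{jtb}, m_{k,t+\tau_{jk},b-b_k})$ | $\tilde{d}_{k,t_n,s_n}$ | $c^{deliv}_{jk}\tau_{jk} + c^{op} l_{jk} - r_k b_k$ |
| $A_{2n}$ of PDs | Intermediate delivery arc $(m_{ktb}, m_{k',t+\tau_{kk'},b-b_{k'}})$ | $\tilde{d}_{k',t_n,s_n}$ | $c^{deliv}_{kk'}\tau_{kk'} + c^{op} l_{kk'} - r_{k'} b_{k'}$ |
| $A_{2n}$ of PDs | Return arc $(m_{kt0}, m_{j,t+\tau_{kj},0})$ | $\sum_{\ell \in J} x_\ell$ | $c^{return}_{kj}\tau_{kj} + c^{op} l_{kj}$ |
| $A_{2n}$ of PDs | Relocation arc $(m_{it0}, m_{j,t+\tau_{ij},0})$ | $\sum_{\ell \in J} x_\ell$ | $c^{relo}_{ij}\tau_{ij} + c^{op} l_{ij}$ |
| $A_{2n}$ of PDs | Idle arc $(m_{jt0}, m_{j,t+1,0})$ | $\sum_{\ell \in J} x_\ell$ | $c^{idle}_j$ |

Table 1 Unit flow costs and capacities of arcs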
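To make the delivery-arc costs concrete, the sketch below evaluates the ID and PD first-delivery expressions for one arc. The function and argument names are hypothetical; the numeric inputs in the usage lines are purely illustrative (a 2-period, 8.31-mile trip carrying 3 orders, with the rates quoted later in Section 5.1).

```python
def id_delivery_cost(rate_per_period, tau, revenue_per_order, orders):
    """Unit flow cost of an ID delivery arc: time-based pay minus revenue."""
    return rate_per_period * tau - revenue_per_order * orders


def pd_delivery_cost(rate_per_period, tau, op_cost_per_mile, miles,
                     revenue_per_order, orders):
    """Unit flow cost of a PD delivery arc: pay plus per-mile cost minus revenue."""
    return (rate_per_period * tau
            + op_cost_per_mile * miles
            - revenue_per_order * orders)


# Illustrative inputs only (assumed trip length, distance, and order count).
id_cost = id_delivery_cost(4.75, 2, 4.50, 3)
pd_cost = pd_delivery_cost(4.00, 2, 1.10, 8.31, 4.50, 3)
```

A negative value means the arc is profitable on net, since the revenue term enters the flow cost with a minus sign.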

[Figure 2: spatial-temporal network diagram with warehouse panels and demand-location panels; each panel plots the load status (# of pkgs) against the time period, with idle, delivery, relocation, dispatch, and return arcs.]
Figure 2 An example of $G_{2n}$ for PDs: A and B are warehouses with five and one PDs hired, respectively, i.e., $x_A = 5$, $x_B = 1$. Three demand locations X, Y and Z request $d_{X13} = 1$, $d_{Y13} = 2$, and $d_{Z13} = 4$ orders from period 1 to 3, respectively. X and Y are served by A and Z is served by B, i.e., $N(A) = \{X, Y\}$, $N(B) = \{Z\}$.
3.3. Multistage Formulation
We formulate the multistage problem based on the two types of spatial-temporal delivery networks of IDs and PDs in the previous section. Specifically, we consider $N$ stages, each corresponding to a delivery time window $\{t_n, t_n + 1, \ldots, s_n - 1, s_n\}$. For stage $n$, define the state variable by $S_n = (R^n_J, R^n_K, W^n_J)$, where $R^n_J := (R^n_j, j \in J)$ with $R^n_j$ as the number of empty PD vehicles at warehouse $j$ at period $t_n - 1$, $R^n_K := (R^n_k, k \in K)$ with $R^n_k$ as the number of empty PD vehicles at demand location $k$ at period $t_n$, and $W^n_J := (W^n_{t,j}, t_n \le t \le s_n, j \in J)$ with $W^n_{t,j}$ as the number of en-route empty PD vehicles that start their return trip in the previous time window and will arrive at warehouse $j$ within the current time window. After obtaining the planning decision $x$, the number of PDs hired at each warehouse, for the state variable $S_1$ of the first stage, let $R^1_J = x$, $R^1_K = 0$, and $W^1_J = 0$. Let $V_n(S_n)$ be the expected operational cost from stage $n$ to the remaining stages given the state variable $S_n$. The expected operational cost function $V(x)$ in (1) is thus equal to $V_1(S_1)$.
Given a solution to the state variable $S_n$ and a realization of uncertainty $\hat{\xi}^n$ of time window $n$, a multicommodity flow problem is solved. The two types of drivers, IDs and PDs, are viewed as two commodities. We let $u_n = (u_{na}, a \in A_{1n})^\top$ and $v_n = (v_{na}, a \in A_{2n})^\top$ denote the flows for IDs and PDs, respectively; let $w_n = (w_{k,t_n,s_n}, k \in K)^\top$ denote the number of undelivered orders within time window $n$. A penalty cost of $\rho_k$ is associated with unmet demand. Recall that $\tilde{\xi}^n = [\tilde{d}^n, \tilde{y}^n]$ denotes the uncertainties from random demand and available IDs of stage $n$. The value function $V_n(S_n)$ is defined as $V_n(S_n) = E[V_n(S_n, \tilde{\xi}^n)]$, where

$$V_n(S_n, \hat{\xi}^n) = \min_{(u_n, v_n, w_n, S_{n+1}) \in Y_n(S_n, \hat{\xi}^n)} \sum_{a \in A_{1n}} c^1_{na} u_{na} + \sum_{a \in A_{2n}} c^2_{na} v_{na} + \sum_{k \in K} \rho_k w_{k,t_n,s_n} + V_{n+1}(S_{n+1}), \qquad (2)$$

where $\hat{\xi}^n$ is a realization of uncertainty $\tilde{\xi}^n$. The set $Y_n(S_n, \hat{\xi}^n) \subset \mathbb{Z}^{q_n}_+$ is the feasible region of $(u_n, v_n, w_n, S_{n+1})$, where the dimension $q_n = O((J + K)^2)$. The details of constraints in the feasible regions are provided in Appendix B.
4. Approximate Dynamic Programming Approach
Given a set of Monte Carlo samples, the multistage formulation (1) can be approximated by a monolithic mixed-integer linear program (MILP) involving the min-cost integer multicommodity flow problem, which is shown to be NP-hard (Even et al. 1975). The problem suffers from the curse of dimensionality (see, e.g., Simao et al. 2009, Powell 2014) as the number of scenarios grows. Due to the sequential nature of the problem, it is natural to apply approximate dynamic programming (ADP) approaches (e.g., Powell 2007) to find practically implementable solutions more efficiently.
In general, the value function is not convex in stochastic integer programs with integer recourse. Under some special cases, such as simple integer recourse, the value function is convex (Haneveld and van der Vlerk 1999). Although the multistage formulation (1) does not have simple recourse, motivated by the computational results of the value function $V_n(\cdot)$ illustrated in Figure 3, we approximate $V_n(\cdot)$ using a convex function.

[Figure 3: surface plot of $V_n$ against $x_{I1}$, the number of PDs at depot I1, and $x_{I2}$, the number of PDs at depot I2.]
Figure 3 Illustration of the value function $V_n(\cdot)$ (See the computational setup in Section 5.1).

For computational tractability, the value function $V_n(\cdot)$ is approximated by a separable piecewise linear convex function (Topaloglu and Powell 2006):
$$\hat{V}_n(S_n) := \sum_{\ell=1}^{d_n} \hat{V}^\ell_n(S_{n,\ell}),$$

where $d_n$ denotes the dimension of the state variable $S_n$. Let $M := \sum_{j \in J} x_j$ denote the total number of PDs hired across all service zones. $\hat{V}^\ell_n(\cdot)$ is a piecewise linear convex function with the discrete support of $\{0, 1, \ldots, M\}$. More specifically, $\hat{V}_n(S_n) = \sum_{\ell=1}^{d_n} \sum_{s=1}^{M} \hat{v}^\ell_{n,s} z^\ell_{n,s}$, where $\hat{v}^\ell_{n,s}$ is the slope of $\hat{V}^\ell_n(\cdot)$ over $(s - 1, s)$, and $\hat{v}^\ell_{n,1} \le \hat{v}^\ell_{n,2} \le \cdots \le \hat{v}^\ell_{n,M}$. For each slope $\hat{v}^\ell_{n,s}$, $z^\ell_{n,s}$ is a binary coefficient such that $z^\ell_{n,s} = 1$ if $s \le S_{n,\ell}$, and thus $S_{n,\ell} := \sum_{s=1}^{M} z^\ell_{n,s}$, where $S_{n,\ell}$ is the $\ell$th element of state variable $S_n$. Then, for a given Monte Carlo sample $\hat{\xi}^n$, the approximation of (2) is

$$\hat{V}_n(S_n, \hat{\xi}^n) = \min_{(u_n, v_n, w_n, S_{n+1}) \in Y_n(S_n, \hat{\xi}^n)} \sum_{a \in A_{1n}} c^1_{na} u_{na} + \sum_{a \in A_{2n}} c^2_{na} v_{na} + \sum_{k \in K} \rho_k w_{k,t_n,s_n} + \sum_{\ell=1}^{d_{n+1}} \sum_{s=1}^{M} \hat{v}^\ell_{n+1,s} z^\ell_{n+1,s} \qquad (3a)$$

$$\text{s.t.} \quad S_{n+1,\ell} = \sum_{s=1}^{M} z^\ell_{n+1,s}, \quad 0 \le z^\ell_{n+1,s} \le 1, \quad \ell = 1, \ldots, d_{n+1},\ s = 1, \ldots, M. \qquad (3b)$$

The integrality constraint on $z^\ell_{n+1}$ is relaxed due to total unimodularity, as its constraint matrix is a concatenation of an identity matrix and a consecutive-ones matrix (e.g., Fulkerson and Gross 1965, Nemhauser and Wolsey 1999).
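Evaluating the separable approximation at an integer state amounts to summing, for each component, the first S slopes, which is what setting z = 1 for s ≤ S does. A minimal sketch under that reading follows; the function names and the nested-list layout of the slopes are assumptions made for illustration.

```python
def separable_value(slopes, state):
    """Evaluate a separable piecewise linear function at an integer state.

    slopes[l][s - 1] is the slope of component l over (s - 1, s);
    state[l] is an integer in {0, ..., M}. Each component is 0 at level 0.
    """
    total = 0.0
    for component_slopes, level in zip(slopes, state):
        if not 0 <= level <= len(component_slopes):
            raise ValueError("state component outside the support {0, ..., M}")
        # z_s = 1 for s <= level, so only the first `level` slopes count.
        total += sum(component_slopes[:level])
    return total


def is_convex(component_slopes):
    """A piecewise linear component is convex iff its slopes are non-decreasing."""
    return all(a <= b for a, b in zip(component_slopes, component_slopes[1:]))


# Illustrative slopes for two state components with M = 3.
example_slopes = [[-5.0, -3.0, -1.0], [-2.0, 0.0, 4.0]]
example_value = separable_value(example_slopes, [2, 3])
```

Because the slopes are non-decreasing, a minimization over z with bounds 0 ≤ z ≤ 1 fills the cheapest (leftmost) segments first, which is why the ordering of segments need not be enforced explicitly.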

The approximation components $\hat{v}^\ell_{n,s}$, $s = 1, \ldots, M$, are not known exactly and are updated iteratively. At iteration $\kappa$, assume $\{\hat{\xi}^{n,\kappa}\}_{n=1}^{N}$ is the realization of the uncertain parameters, $\{\hat{V}^\kappa_n\}_{n=1}^{N}$ is the sequence of value function approximations, and $\{S_n\}_{n=1}^{N}$ is the sequence of state decisions generated by solving the approximation (3). Denote the approximation value as $\nu_n = \hat{V}_n(S_n, \hat{\xi}^{n,\kappa})$. Component $\hat{v}^\ell_n$ is updated by using $\nu^{\ell+}_n - \nu_n$ and $\nu_n - \nu^{\ell-}_n$, where $\nu^{\ell+}_n := \hat{V}_n(S_n + e_\ell, \hat{\xi}^{n,\kappa})$ and $\nu^{\ell-}_n := \hat{V}_n(S_n - e_\ell, \hat{\xi}^{n,\kappa})$ with unit vector $e_\ell$ having one in the $\ell$th position and zeros elsewhere. Let $\hat{v}^{\ell,\kappa}_n$ be the slope estimate in iteration $\kappa$. The slope is updated by projecting $q^{\ell,\kappa}_n$ so that all the elements are in non-decreasing order:

$$\hat{v}^{\ell,\kappa+1}_n = \arg\min_{\gamma} \left\{ \|\gamma - q^{\ell,\kappa}_n\|^2 : \gamma_s \le \gamma_{s+1},\ 1 \le s \le M - 1 \right\}, \qquad (4)$$

and $q^{\ell,\kappa}_n$ is defined as

$$q^{\ell,\kappa}_{n,s} := \begin{cases} (1 - \alpha_\kappa)\hat{v}^{\ell,\kappa}_{n,s} + \alpha_\kappa(\nu^{\ell+}_n - \nu_n), & s = S_{n,\ell}, \\ (1 - \alpha_\kappa)\hat{v}^{\ell,\kappa}_{n,s} + \alpha_\kappa(\nu_n - \nu^{\ell-}_n), & s = S_{n,\ell} - 1, \\ \hat{v}^{\ell,\kappa}_{n,s}, & s \ne S_{n,\ell},\ S_{n,\ell} - 1,\ s = 1, \ldots, M - 1. \end{cases}$$

For the update, $\alpha_\kappa \in (0, 1)$ is the step size in iteration $\kappa$. The detailed algorithms can be found in Algorithms 1 and 2. In Algorithm 1, given a set of initial slope estimates $\{\hat{v}^\ell_n\}_{\ell=1}^{d_n}$, the slope estimates are updated iteratively. If the objective value of $\hat{V}_1$, calculated based on the current slope estimates, changes by at most the relative tolerance $\delta$ for Limit consecutive iterations, the algorithm terminates. Otherwise, the slope estimates are updated according to (4) for every stage. To speed up the computation, the value function approximation procedure in Algorithm 2 can be implemented in parallel across the $M$ components. Topaloglu and Powell (2006) have shown that the ADP approach outperforms the rolling-horizon procedure by significant margins computationally for a stochastic integer multicommodity flow problem.

Algorithm 1: Separable Piecewise Linear Convex Approximation
Input: Slope estimates $\{\hat{v}^\ell_n\}_{\ell=1}^{d_n}$, $n = 1, \ldots, N$, stopping tolerance $\delta$, iteration limit Limit, step size rule $\alpha_\kappa$ for all iterations.
Initialize $\hat{V}_{N+1} = 0$, $\kappa = 0$, count = 0.
while count ≤ Limit do
    Generate samples $\{\xi^{n,\kappa}\}_{n=1}^{N}$ of uncertain parameters, and calculate the approximation function $\hat{V}_1$ using (3) backwards from $\hat{V}_{N+1}$.
    Solve problem (1) using $\hat{V}_1$ to obtain $x$, $S_1$ and its objective value $\nu^\kappa$.
    if $\kappa \ge 1$ and $|\nu^\kappa - \nu^{\kappa-1}| / |\nu^{\kappa-1}| \le \delta$ then
        count ← count + 1
    else
        count ← 0
    end
    $\kappa \leftarrow \kappa + 1$
    Set step size $\alpha_\kappa$.
    $\{\hat{v}^\ell_1\}_{\ell=1}^{d_1}$ ← updateValFuncApprox($S_1$, $\xi^{1,\kappa}$, $\{\hat{v}^\ell_1\}_{\ell=1}^{d_1}$, $\hat{V}_2$, 1, $\alpha_\kappa$)
    Update $\hat{V}_1$ with $\{\hat{v}^\ell_1\}_{\ell=1}^{d_1}$.
    for $n = 1, \ldots, N - 1$ do
        For stage $n$, solve (3) using $\hat{\xi}^n := \xi^{n,\kappa}$, $S_n$ and $\hat{V}_{n+1}$ to obtain $S_{n+1}$.
        $\{\hat{v}^\ell_{n+1}\}_{\ell=1}^{d_{n+1}}$ ← updateValFuncApprox($S_{n+1}$, $\xi^{n+1,\kappa}$, $\{\hat{v}^\ell_{n+1}\}_{\ell=1}^{d_{n+1}}$, $\hat{V}_{n+2}$, $n + 1$, $\alpha_\kappa$)
        Update $\hat{V}_{n+1}$ with $\{\hat{v}^\ell_{n+1}\}_{\ell=1}^{d_{n+1}}$.
    end
end
return $x$, $S_1$ and $\{\hat{V}_n\}_{n=1}^{N}$
Algorithm 2: Value Function Update for Stage n + 1
Input: State $S$, sample $\xi$, slope estimates $\{\hat{v}^\ell_{n+1}\}_{\ell=1}^{d_{n+1}}$, value function approximation $\hat{V}_{n+2}$, stage $n + 1$, step size $\alpha$.
def updateValFuncApprox($S$, $\xi$, $\{\hat{v}^\ell_{n+1}\}_{\ell=1}^{d_{n+1}}$, $\hat{V}_{n+2}$, $n$, $\alpha$):
    For stage $n + 1$, solve (3) using $\hat{\xi}^{n+1} := \xi$, $S_{n+1} := S$ and $\hat{V}_{n+2}$ to obtain objective value $\nu_{n+1}$.
    for $\ell = 1, \ldots, d_{n+1}$ do
5. Computational Experiments
In this section, we present computational results to demonstrate (i) the computational efficiency
of the ADP algorithm, (ii) the benefits of utilizing IDs, and (iii) the impacts of uncertainties.
Specifically, we consider a case study of Minneapolis, Minnesota where the instances are generated
based on the geographic locations of the city as explained in detail in Section 5.1.
5.1. Computational setup
In the city of Minneapolis, we consider two warehouses/distribution centers at the locations (as
{J1, J2} shown in Figure 4) of Amazon delivery stations that serve the Minneapolis area. We
assume each warehouse can hire up to 10 PDs. Ten potential service zones/demand locations
({K1, . . . , K10} shown in Figure 4) are considered based on the zip-code areas. The ten zones are
partitioned into two groups N (J1) = {K2, K3, K7, K10} and N (J2) = {K1, K4, K5, K6, K8, K9},
which are served by the two warehouses, respectively. The partition minimizes the travel distance from the warehouses to the demand locations, where the travel distance is calculated using the Haversine formula (see endnote 2) for the distance on a sphere, given the longitudes and latitudes of the zip-code area centers.
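For reference, the Haversine distance and a nearest-warehouse assignment can be sketched as follows. The Earth radius in miles is an assumed constant, and the coordinates in the usage lines are hypothetical placeholders rather than the actual warehouse or zip-code centers.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius, not stated in the text


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_warehouse(zone, warehouses):
    """Return the name of the warehouse closest to a zone center."""
    return min(warehouses,
               key=lambda name: haversine_miles(*zone, *warehouses[name]))


# Hypothetical coordinates for illustration only.
depots = {"J1": (45.07, -93.25), "J2": (44.86, -93.30)}
assigned = nearest_warehouse((44.90, -93.28), depots)
```

Assigning every zone to its nearest warehouse in this way yields a partition that minimizes each zone's warehouse-to-zone distance.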
For daily operation, the delivery service runs for 8 hours (= 480 minutes), with an additional hour for PDs to return to their warehouses. Each operational time period is 15 minutes, and thus the planning horizon is T = 36 time periods. Each delivery time window is 2 hours, or equivalently 8 time periods. There are four delivery time windows, [0, 8], [8, 16], [16, 24], and [24, 32], and one returning duration of [32, 36] for PDs' return. Given the area of the city of Minneapolis, it takes no longer than an hour (i.e., 4 periods) to travel from one location to another in the area. The cost parameters of hiring PDs are scaled to a daily basis. Each hired PD is provided with a vehicle, and the vehicle purchase cost is set as c_j = $15. The cost is calculated by assuming a delivery vehicle price of $38,300 and a depreciation of $28,000 after an average operation of five years (see endnote 3).
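The time discretization and the daily vehicle cost follow from simple arithmetic, sketched below. The 365 operating days per year is an assumption introduced here to reproduce a figure close to the stated $15 per day; the text does not specify the day count.

```python
def periods_per_day(service_hours=8, return_hours=1, minutes_per_period=15):
    """Number of time periods in the planning horizon."""
    return (service_hours + return_hours) * 60 // minutes_per_period


def daily_depreciation(total_depreciation, years, days_per_year=365):
    """Depreciation spread evenly over every day of the operating life."""
    return total_depreciation / (years * days_per_year)


horizon = periods_per_day()                        # 9 hours of 15-minute periods
window_length = 2 * 60 // 15                       # a 2-hour delivery window
vehicle_cost_per_day = daily_depreciation(28000, 5)
```

Under these assumptions the daily depreciation is about $15.34, consistent with the rounded value of $15 used for c_j.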
The compensation for IDs is set as $19 per hour (i.e., $4.75 per vehicle per period for all types of ID arcs) based on average earnings of Amazon Flex drivers (see endnote 4); the compensation for PDs is $16 per hour (i.e., $4 per vehicle per period for all types of PD arcs) according to the pay rate of Amazon drivers (see endnote 5). Additionally, the dispatch, delivery, return, and relocation arcs of PDs incur an operational cost of $1.10 per mile‡. We use the revenue of r_k = $4.50 per order based on the USPS rate of commercial parcels (see endnote 6). The per-order penalty for unmet demand is ρ_k = $5.

[Figure 4: two map panels with longitude on the horizontal axis and latitude on the vertical axis. (a) The spatial distribution with warehouses J1, J2 and demand locations K1 to K10. (b) The matching of demand locations K1 to K10 with warehouses J1, J2. (The numbers above the dotted lines indicate the distance (in miles) between locations.)]
Figure 4 Warehouse and demand locations
There are two sources of uncertainties: demand at each demand location and IDs available for each delivery time window. For each instance, 50 scenario paths of the two uncertainties are generated by Monte Carlo sampling from the following distributions. Following Fatehi and Wagner (2022), the orders at demand location k ∈ K arrive according to a Poisson process with rate λ_k = 4.5 per period. That is, in time window n: [t_n, s_n], the random demand d̃_{k,t_n,s_n} at location k follows the Poisson distribution with mean λ_k (s_n − t_n) = 36. The number of IDs ỹ_{j,t_n−1,s_n} available at warehouse j for delivery in time window n follows a binomial distribution with success probability p_succ = 0.05 (Torres et al. 2022). For each time window, the maximum supply of potential IDs at each warehouse is 100. We later refer to the instances following the settings described above as the baseline case. After solving the instances, following the same procedure, 100 scenario paths are generated for evaluating the solutions in out-of-sample tests. The detailed evaluation procedure is given in Appendix C.
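A scenario path under the baseline distributions can be sampled with the standard library alone, as in the sketch below. This is an illustration rather than the authors' generator: the Poisson draw uses Knuth's multiplication method and the binomial draw sums Bernoulli trials, both chosen here only to avoid external dependencies, and all function names are hypothetical.

```python
import math
import random


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for moderate means such as 36."""
    threshold = math.exp(-mean)
    count, running_product = 0, rng.random()
    while running_product > threshold:
        count += 1
        running_product *= rng.random()
    return count


def sample_binomial(rng, trials, prob):
    """Number of successes in `trials` independent Bernoulli(prob) draws."""
    return sum(rng.random() < prob for _ in range(trials))


def sample_scenario_path(rng, locations, warehouses, windows,
                         rate=4.5, pool=100, p_succ=0.05):
    """One scenario path: demand per location and available IDs per warehouse, per window."""
    path = []
    for t_n, s_n in windows:
        demand = {k: sample_poisson(rng, rate * (s_n - t_n)) for k in locations}
        drivers = {j: sample_binomial(rng, pool, p_succ) for j in warehouses}
        path.append({"demand": demand, "ids": drivers})
    return path


rng = random.Random(0)  # fixed seed so the sketch is reproducible
windows = [(0, 8), (8, 16), (16, 24), (24, 32)]
path = sample_scenario_path(rng, ["K1", "K2"], ["J1", "J2"], windows)
```

With a pool of 100 potential IDs and a success probability of 0.05, the expected number of available IDs per warehouse and window is 5, against an expected demand of 36 orders per location and window.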
All the computational experiments are performed using Python 3.7 with Gurobi 9.1.2 on Mesabi,
a High Performance Computing (HPC) cluster at the University of Minnesota. We use up to
64 cores provided by Intel Haswell E5-2680v3 processors and 480 GB memory. Algorithm 1 is
implemented by updating value functions using Algorithm 2 in parallel across the M components
of each value function. The parallelization scheme is implemented via multiprocessing.Pool()
of the multiprocessing package in Python. For Gurobi, the number of Threads is set to 4 and
the optimality gap to 0.05.
5.2. Computational time
We first solve the multistage model (1) using Algorithm 1 with the value functions updated sequentially and compare it with the parallel implementation of Algorithm 1. Ten instances are solved for the baseline case. Each PD is assumed to take up to 10 orders. Five instances have IDs with a capacity of 5 orders, and the other five have IDs with a capacity of 10 orders. For each instance, 25 scenario paths are generated following the procedure described in Section 5.1. The input parameters of Algorithm 1 are as follows. Set the stopping tolerance δ = 1.5 × 10^{-3}, iteration limit Limit = 5, and step size α_κ = 1/(1 + 2κ). Initial slope estimates are randomly generated and normalized to a unit vector. Elements of the vector are sorted in non-decreasing order.
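The step-size schedule and the stopping rule of Algorithm 1 can be sketched in isolation. The helper below replays a given sequence of objective values and reports when the loop condition (count ≤ Limit) would first fail; the function names are hypothetical and the objective values in the usage line are made up for illustration.

```python
def step_size(kappa):
    """Harmonic step size alpha_kappa = 1 / (1 + 2 * kappa)."""
    return 1.0 / (1.0 + 2.0 * kappa)


def stopping_iteration(objective_values, delta=1.5e-3, limit=5):
    """Index of the iteration after which the while-loop would exit, or None.

    count tracks consecutive iterations whose relative change is at most
    delta; the loop runs while count <= limit, so it exits once count
    exceeds limit.
    """
    count = 0
    previous = None
    for kappa, value in enumerate(objective_values):
        if (previous not in (None, 0)
                and abs(value - previous) / abs(previous) <= delta):
            count += 1
        else:
            count = 0
        if count > limit:
            return kappa
        previous = value
    return None


# Made-up objective values: one large change, then a flat sequence.
exit_at = stopping_iteration([100.0, 50.0] + [50.0] * 10)
```

With Limit = 5, the loop needs six consecutive stable iterations before it exits, so a single noisy iteration resets the counter and delays termination.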
When implementing Algorithm 1 sequentially, none of the instances could be solved within the time limit of three hours (= 10,800 seconds). The details of implementing Algorithm 1 in parallel are reported in Table 2: the total time t_tot of implementing Algorithm 1, the time t_parallel spent updating value functions in parallel, and the total number of iterations Iter. When the IDs' capacity is b^1_max = 5 orders, all instances are solved within the time limit and the average solution time is about one hour (3,869 seconds). When the IDs' capacity increases to b^1_max = 10 orders, the average solution time increases to about 1.7 hours (6,029 seconds) due to the increasing size of the spatial-temporal networks.
Table 2 Computational time (in seconds) for different ID vehicle capacities

| instance | cap. 5: t_tot | cap. 5: t_parallel | cap. 5: Iter. | cap. 10: t_tot | cap. 10: t_parallel | cap. 10: Iter. |
|---|---|---|---|---|---|---|
| 1 | 3230.17 | 2679.76 | 114 | 8826.52 | 6850.56 | 232 |
| 2 | 2668.85 | 2198.60 | 82 | 6287.39 | 4916.24 | 172 |
| 3 | 1593.64 | 1329.84 | 56 | 4558.28 | 3605.32 | 136 |
| 4 | 9713.32 | 9298.52 | 92 | 6801.35 | 5155.04 | 220 |
| 5 | 2140.53 | 1779.52 | 64 | 3671.27 | 2818.68 | 104 |
| avg. | 3869.30 | 3457.25 | 82 | 6028.96 | 4669.17 | 173 |
5.3. Benefits of Employing IDs
We now study the benefits of employing IDs alongside PDs in the delivery service and how the benefits are affected by IDs' availability and capacities. We consider two performance measures: the total operating profit, i.e., the negative of the objective value in (1) excluding the penalty cost of unserved demand, and the service level, i.e., the percentage of orders fulfilled by either PDs or IDs. The profit represents the benefit to delivery service providers; the service level reflects the self-sustainability and represents the benefit to customers. According to supply chain research (see, e.g., Christopher and Towill 2001, Allon and Van Mieghem 2010) on base-surge policies, cost-service trade-offs are impacted by the service capacity of agile principles. Analogous to the base-surge policies, the service capacity of IDs is expected to impact the performance measures. In particular, we look into the impact of vehicle capacity and availability for IDs. Given that the vehicle capacity of PDs is ten orders, we consider two cases of the IDs' capacity: (1) five orders, smaller than the PDs' capacity, and (2) ten orders, equal to the PDs' capacity. In addition to the baseline case described in Section 5.1, we double the maximum number of potential IDs to 200 for each delivery time window.
Table 3 shows the profits and service levels with detailed results of the number of PDs hired,
cost, and revenue breakdowns under different delivery systems with or without PDs/IDs. The row
of “% increase” shows the percentage of profit increase compared with that of only employing
PDs. There are three main observations from Table 3. (1) Employing IDs can increase profitability
significantly. The benefit is more significant when the IDs’ vehicle capacity is larger or when there
are more available IDs. For example, when the IDs’ vehicle capacity is 10, the hybrid case yields a
574.28% increase in profit. (2) When the IDs’ vehicle capacity is smaller or when there are fewer
available IDs, employing PDs significantly increases the service level compared with the case of
only employing IDs. (3) When the IDs’ vehicle capacity is large and there are sufficient available
IDs, the system yields a good profit and a high service level with only IDs. For instance, when
both types of drivers are allowed in the model, no PDs are employed when IDs’ vehicle capacity
is 10 and IDs’ availability is doubled. The profit increases more than eight times compared to the
case of hiring only PDs and the service level is close to 90%.
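The "% increase" row can be reproduced from the profit figures reported in Table 3, with the PDs-only profit of $324.65 as the baseline. The short sketch below does this for three of the reported configurations; the function name is hypothetical.

```python
def percent_increase(profit, baseline):
    """Relative profit change, in percent, against the baseline profit."""
    return 100.0 * (profit - baseline) / baseline


PDS_ONLY_PROFIT = 4716.00 - 4391.35  # PD revenue minus PD cost

ids_only_cap5 = percent_increase(398.95, PDS_ONLY_PROFIT)
hybrid_cap5 = percent_increase(488.61, PDS_ONLY_PROFIT)
ids_cap10_double = percent_increase(3189.33, PDS_ONLY_PROFIT)
```

The last value is roughly 882%, which is the basis for the statement that the profit increases more than eight times when the ID capacity is 10 and the ID availability is doubled.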
5.4. Impacts of Uncertainties
In this section, we study how the two sources of uncertainties (random IDs' availability and demand) impact IDs' employment. In supply chain base-surge policies, demand volatility is a key driver of demand allocations. It is natural to expect that demand volatility affects the use of IDs as well (Section 5.4.1). Moreover, we also examine the impacts of time variation (Section 5.4.2) and spatial distribution (Section 5.4.3) of demand. In particular, we focus on a single service zone of one warehouse J2 with the six demand locations N(J2) = {K1, K4, K5, K6, K8, K9} (as shown in Figure 4).
Table 3 The benefits of employing IDs and the impact of IDs' vehicle capacity and availability

| ID availability | | baseline | baseline | baseline | baseline | double | double | double | double |
|---|---|---|---|---|---|---|---|---|---|
| ID vehicle cap. | | 5 | 5 | 10 | 10 | 5 | 5 | 10 | 10 |
| PDs | Yes | No | Yes | No | Yes | No | Yes | No | Yes |
| IDs | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| # of PDs | 16 | N/A | 13 | N/A | 6 | N/A | 10 | N/A | 0 |
| service level (%) | 72.53 | 31.69 | 85.54 | 58.09 | 82.16 | 58.61 | 93.20 | 89.72 | 89.72 |
| total profit ($) | 324.65 | 398.95 | 488.61 | 2140.80 | 2189.04 | 540.80 | 596.01 | 3189.33 | 3189.33 |
| % increase | N/A | 22.89 | 50.50 | 559.42 | 574.28 | 66.58 | 83.59 | 882.39 | 882.39 |
| IDs cost ($) | N/A | 1661.96 | 1565.95 | 1636.50 | 1619.35 | 3270.39 | 2825.12 | 2644.56 | 2644.56 |
| IDs revenue ($) | N/A | 2060.91 | 1766.21 | 3777.30 | 3701.84 | 3811.19 | 3200.94 | 5833.89 | 5833.89 |
| PDs cost ($) | 4391.35 | N/A | 3507.70 | N/A | 1533.83 | N/A | 2639.24 | N/A | 0.00 |
| PDs revenue ($) | 4716.00 | N/A | 3796.07 | N/A | 1640.39 | N/A | 2859.44 | N/A | 0.00 |

5.4.1. Impact of volatility To study the impact of volatility, we assume both uncertainties follow normal distributions so that we can adjust their variance while keeping their mean unchanged. Let N(µ, σ^2) denote the normal distribution with mean µ and variance σ^2. The random demand follows the normal distribution with mean µ_d = 36 and variance σ_d^2 = (µ_d/3)^2, i.e., d̃_{k,t_n,s_n} ∼ N(µ_d, σ_d^2); the random number of IDs follows the normal distribution with mean µ_y = 20 and variance σ_y^2 = (µ_y/3)^2, i.e., ỹ_{J2,t_n−1,s_n} ∼ N(µ_y, σ_y^2). Note that the numbers generated from normal distributions are rounded to the nearest integers. We also consider the cases in which either variance is zero (i.e., deterministic demand or deterministic ID availability).
Table 4 shows the performance measures with the numbers of PDs and IDs, and the breakdowns of cost, revenue, and delivery. In the row of “# of IDs”, the first number is the daily average number of IDs employed, and the second number is the daily average number of IDs available. The row of “% increase” shows the percentage of profit increase compared with the case (0, 0) of deterministic demand and available IDs. The row of “delivery (%)” reports the percentage of orders delivered by IDs (or PDs). Although in practice the number of available IDs cannot be perfectly forecasted, Table 4 suggests that, when the ID supply is uncertain, high demand volatility can improve the profitability and service level. For example, when both the demand and available IDs are uncertain, the total profit is $236.82 with a service level of 90.71%, compared with a profit of $212.03 and a service level of 90.32% in the case with deterministic demand and uncertain IDs. Different from the characteristics of base-surge policies, increasing demand volatility does not always encourage, and can even discourage, the use of IDs. For instance, for a deterministic supply of IDs, when the demand is of high variance, the expected number of IDs employed is 56.99 and the percentage of deliveries by IDs is 65.85%, both lower than under deterministic demand (68.00 and 78.70%, respectively). This can be due to the cost structures of the two types of drivers: IDs have lower fixed costs and higher variable costs, while PDs have lower variable costs and higher fixed costs.
Table 4 The benefit of employing IDs under different volatilities.

| variance of (d, y) | (0, 0) | (0, σ_y^2) | (σ_d^2, 0) | (σ_d^2, σ_y^2) |
|---|---|---|---|---|
| # of PDs | 2 | 6 | 4 | 5 |
| # of IDs (employed/available) | 68.00/80.00 | 42.04/79.72 | 56.99/80.00 | 48.55/79.72 |
| service level (%) | 92.59 | 90.32 | 93.51 | 90.71 |
| total profit ($) | 285.09 | 212.03 | 259.07 | 236.82 |
| % increase | N/A | -25.63 | -9.13 | -16.93 |
| IDs cost ($) | 2840.80 | 1773.74 | 2389.83 | 2038.68 |
| IDs revenue ($) | 3060.00 | 1891.80 | 2561.45 | 2182.46 |
| IDs delivery (%) | 78.70 | 48.66 | 65.85 | 56.10 |
| PDs cost ($) | 474.11 | 1526.02 | 988.80 | 1253.05 |
| PDs revenue ($) | 540.00 | 1620.00 | 1076.27 | 1346.09 |
| PDs delivery (%) | 13.89 | 41.67 | 27.67 | 34.60 |
To further look into the impacts of temporal and spatial variation of the demand, in the following
two sections, we will focus on the deterministic system with zero variance for demand and IDs’
availability.
5.4.2. Impact of demand time variation For a district, depending on the time of the day, the demand may vary given the customers' preferences and the major function of the district. For example, for a business district, customers may prefer to receive their orders during the lunch break (e.g., 11 a.m. to 1 p.m.) compared with other times of the day. In this section, we look into the impact of time variation of the demand. We assume the six demand locations are in a business district and consider a demand surge during the second time window of 11 a.m. to 1 p.m. In particular, we consider four levels of the surge: 5%, 10%, 15%, and 20% of the baseline (0%) demand. Figure 5a shows the percentage of orders delivered by IDs for all four delivery time windows. Figure 5b shows the ratio between the numbers of IDs and PDs employed. In contrast to base-surge policies, the relative number of IDs employed does not increase monotonically with the demand surge. When the demand surge is small (5%), one more PD is hired while fewer IDs are employed compared with the baseline case. For larger demand surges (beyond 5%), relatively more IDs are employed.
[Figure 5: two panels with the delivery time window (1 to 4) on the horizontal axis and one series per demand surge level (0%, 5%, 10%, 15%, 20%). (a) Percentage of delivery fulfilled by IDs under different demand surges. (b) Ratio between the numbers of IDs and PDs employed under different demand surges.]
Figure 5 Impact of demand surges.
5.4.3. Impact of demand spatial distribution So far, we have focused on uniform demand
distributions across the entire service zone. Depending on the urban spatial structure of a city,
given the relative location of the warehouse and the residential and business areas, the demand
may vary geographically in the city. In this section, we look into the impact of the spatially varied
demand distribution by considering three distributions of demand.
1. Uniform: The demand locations (e.g., residential and business areas) are geographically dis-
persed. The delivery demands are uniformly distributed among all the six demand locations.
This distribution is considered in the baseline case and used in all the previous studies.

2. Nearby: The demand locations are relatively close to the warehouse. The delivery demands
occur only in the nearby locations (including K6, K9).
3. Faraway: The demand locations are relatively far from the warehouse. The delivery demands
occur only in faraway locations (including K1, K4, K5, K8).
Table 5 shows the solution details under different demand spatial distributions. Compared with the uniform distribution case, when the demand locations are relatively far from the warehouse, the delivery service relies more on IDs, as it is more cost-efficient to assign faraway deliveries to IDs given their flat hourly compensation. In contrast, PDs are cheaper for deliveries to closer locations.
Table 5 Solution details under different demand spatial distributions

| Distribution | Nearby | Uniform | Faraway |
|---|---|---|---|
| # of PDs | 7 | 2 | 0 |
| # of IDs | 42 | 68 | 79 |
| service level (%) | 97.22 | 92.59 | 91.20 |
| total profit ($) | 370.59 | 285.09 | 244.40 |
| IDs cost ($) | 1743.43 | 2840.80 | 3301.60 |
| IDs revenue ($) | 1890.00 | 3060.00 | 3546.00 |
| PDs cost ($) | 1665.97 | 474.11 | 0.00 |
| PDs revenue ($) | 1890.00 | 540.00 | 0.00 |
6. Conclusions
In this paper, we studied a hybrid crowdsourced delivery system with both PDs and IDs. The problem is formulated as a multistage stochastic integer programming model to decide the optimal hiring strategy and fleet deployment under uncertain demand and available IDs. We developed an ADP approach to efficiently solve the proposed model. We further proposed a parallelization scheme for additional computational improvement. Via numerical studies using scenarios generated for the city of Minneapolis, we obtained insights into the key drivers of the use of IDs in the hybrid delivery system. First, our results indicate that the hybrid delivery fleet can benefit the service provider through significantly increased profitability and service level, especially with large vehicle capacity and availability of IDs. Second, the use of IDs withstands the impacts of larger demand surges and demand variations in spatial distributions. Third, given the lower-fixed-cost and higher-variable-cost structure of IDs, increasing demand volatility can result in less use of IDs and less profitability. Thus, understanding the system dynamics of the hybrid fleet delivery system is of great importance for last-mile delivery platforms.
Our model can be extended in several ways. First, the current model assumes the number of available IDs is independent of the compensation scheme. We can plan the hybrid delivery system by considering endogenous (random) ID supply. For example, discrete choice models can be employed to model the supply endogeneity. Second, the current model is formulated for designing a hybrid delivery fleet from scratch. It can be easily extended to handle the expansion of an existing delivery fleet. Third, we do not consider electric vehicles (EVs) in the problem. It is possible that the charging or battery swapping operations bring more opportunities for different cost structures and potential benefits by utilizing EVs as movable electric suppliers to support the power grids when needed.
Acknowledgement
The authors acknowledge the Minnesota Supercomputing Institute (MSI) at the University of
Minnesota for providing resources that contributed to the research results reported within this
paper. URL: http://www.msi.umn.edu
Endnotes
1 See https://population.un.org/wup/Publications/Files/WUP2018-Report.pdf.
2 See https://en.wikipedia.org/wiki/Haversine_formula.
3 See https://www.mbvans.com/.
4 See https://flex.amazon.com/faq.
5 See https://hiring.amazon.com/faq.
6 See https://pe.usps.com/text/dmm300/Notice123.htm.

References
Aggarwal A (2019) E-commerce share of total global retail sales from 2015 to 2023. https://www.statista.com/statistics/534123/e-commerce-share-of-retail-sales-worldwide/.
Allon G, Van Mieghem JA (2010) Global dual sourcing: Tailored base-surge allocation to near- and offshore production. Management Science 56(1):110–124.
Alnaggar A, Gzara F, Bookbinder JH (2021) Crowdsourced delivery: A review of platforms and academic literature. Omega 98:102139, ISSN 0305-0483, URL https://doi.org/10.1016/j.omega.2019.102139.
Archetti C, Savelsbergh M, Speranza MG (2016) The vehicle routing problem with occasional drivers. European Journal of Operational Research 254(2):472–480.
Arslan AM, Agatz N, Kroon L, Zuidwijk R (2019) Crowdsourced delivery–a dynamic pickup and delivery problem with ad hoc drivers. Transportation Science 53(1):222–235.
Bai J, So KC, Tang CS, Chen X, Wang H (2019) Coordinating supply and demand on an on-demand service platform with impatient customers. Manufacturing & Service Operations Management 21(3):556–570.
Barr A, Wohl J (2013) Exclusive: Wal-Mart may get customers to deliver packages to online buyers. https://www.reuters.com/article/us-retail-walmart-delivery/exclusive-wal-mart-may-get-customers-to-deliver-packages-to-online-buyers-idUSBRE92R03820130328.
Behrendt A, Savelsbergh M, Wang H (2022) Crowdsourced same-day delivery: Joint planning and coordination for centralized and decentralized couriers. Available at Optimization-Online: https://optimization-online.org/?p=20103.
Benjaafar S, Ding JY, Kong G, Taylor T (2022) Labor welfare in on-demand service platforms. Manufacturing & Service Operations Management 24(1):110–124.
Castillo VE, Bell JE, Mollenkopf DA, Stank TP (2022) Hybrid last mile delivery fleets with crowdsourcing: A systems view of managing the cost-service trade-off. Journal of Business Logistics 43(1):36–61.
Christopher M, Towill D (2001) An integrated model for the design of agile supply chains. International Journal of Physical Distribution & Logistics Management 31(4):235–246.

32
Dayarian I, Savelsbergh M (2020) Crowdshipping and same-day delivery: Employing in-store customers to
deliver online orders. Production and Operations Management 29(9):2153–2174, URL http://dx.doi.
org/10.1111/poms.13219.
Dolan S (2018) The challenges of last mile delivery logistics and the technology solutions cutting costs.
https://www.businessinsider.com/last-mile-delivery-shipping-explained.
Even S, Itai A, Shamir A (1975) On the complexity of time table and multi-commodity flow problems. 16th
annual symposium on foundations of computer science (SFCS 1975), 184–193 (IEEE).
Fatehi S, Wagner MR (2022) Crowdsourcing last-mile deliveries. Manufacturing & Service Operations Man-
agement 24(2):791–809.
Fulkerson D, Gross O (1965) Incidence matrices and interval graphs. Pacific Journal of Mathematics
15(3):835–855.
Gdowska K, Viana A, Pedroso JP (2018) Stochastic last-mile delivery with crowdshipping. Transportation
research procedia 30:90–100.
Haneveld WKK, van der Vlerk MH (1999) Stochastic integer programming: General models and algorithms.
Annals of operations research 85:39–57.
Kleywegt AJ, Shapiro A, Homem-de Mello T (2002) The sample average approximation method for stochastic
discrete optimization. SIAM Journal on Optimization 12(2):479–502.
Koetsier J (2020) Covid-19 accelerated e-commerce growth ‘4 to 6 years’.
https://www.forbes.com/sites/johnkoetsier/2020/06/12/covid-19-accelerated-e-commerce-growth-4-to-6-years/#7067ef82600f.
Lei YM, Jasin S, Wang J, Deng H, Putrevu J (2020) Dynamic workforce acquisition for crowdsourced
last-mile delivery platforms. Available at SSRN: https://ssrn.com/abstract=3532844.
Macrina G, Pugliese LDP, Guerriero F, Laganà D (2017) The vehicle routing problem with occasional drivers
and time windows. International Conference on Optimization and Decision Science, volume 217,
577–587 (Springer).
Macrina G, Pugliese LDP, Guerriero F, Laporte G (2020) Crowd-shipping with time windows and
transshipment nodes. Computers & Operations Research 113:104806.

Mousavi K, Bodur M, Roorda MJ (2022) Stochastic last-mile delivery with crowd-shipping and mobile
depots. Transportation Science 56(3):612–630.
Nemhauser G, Wolsey L (1999) Integer and Combinatorial Optimization, volume 55 (John Wiley & Sons).
Powell WB (2007) Approximate Dynamic Programming: Solving the curses of dimensionality, volume 703
(John Wiley & Sons).
Powell WB (2014) Clearing the jungle of stochastic optimization. Bridging data and decisions, 109–137
(INFORMS).
Qi W, Li L, Liu S, Shen ZJM (2018) Shared mobility for last-mile delivery: Design, operational prescriptions,
and environmental impact. Manufacturing & Service Operations Management 20(4):737–751.
Raviv T, Tenzer EZ (2018) Crowd-shipping of small parcels in a physical internet. Available at
https://www.researchgate.net/publication/326319843_Crowd-shipping_of_small_parcels_in_a_physical_internet.
Savelsbergh MW, Ulmer MW (2022) Challenges and opportunities in crowdsourced delivery planning and
operations. 4OR 20(1):1–21.
Shapiro A, Dentcheva D, Ruszczyński A (2009) Lectures on Stochastic Programming: Modeling and Theory,
volume 9 (Philadelphia, PA: SIAM).
Simao HP, Day J, George AP, Gifford T, Nienow J, Powell WB (2009) An approximate dynamic programming
algorithm for large-scale fleet management: A case application. Transportation Science 43(2):178–197.
Topaloglu H, Powell WB (2006) Dynamic-programming approximations for stochastic time-staged integer
multicommodity-flow problems. INFORMS Journal on Computing 18(1):31–42.
Torres F, Gendreau M, Rei W (2022) Vehicle routing with stochastic supply of crowd vehicles and time
windows. Transportation Science 56(3):631–653.

Appendix A: Nomenclature
Table 6 summarizes the notation of the multistage formulation used in the paper.

Table 6 Nomenclature

Parameters and sets
$\mathcal{J}$ : Set of warehouses, $\mathcal{J} = \{1, \dots, j, \dots, J\}$
$\mathcal{K}$ : Set of all potential demand locations, $\mathcal{K} = \{1, \dots, k, \dots, K\}$
$\mathcal{T}$ : Set of service operating time periods, $\mathcal{T} = \{1, \dots, t, \dots, T\}$
$N(j)$, $j \in \mathcal{J}$ : Set of demand locations served by warehouse $j$
$b^1_{\max}$ / $b^2_{\max}$ : Vehicle capacity of IDs/PDs
$\mathcal{B}_1$ / $\mathcal{B}_2$ : Set of the numbers of packages carried by one ID/PD, i.e., $\mathcal{B}_1 = \{0, 1, \dots, b, \dots, b^1_{\max}\}$ (or $\mathcal{B}_2 = \{0, 1, \dots, b, \dots, b^2_{\max}\}$)
$M_j$, $j \in \mathcal{J}$ : Hiring capacity of warehouse $j$
$c_j$, $j \in \mathcal{J}$ : Cost of hiring a PD at warehouse $j$
$\tau_{ij}$, $i, j \in \mathcal{K} \cup \mathcal{J}$ : Minimum travel time between locations $i$ and $j$
$l_{ij}$, $i, j \in \mathcal{J} \cup \mathcal{K}$ : Travel distance between locations $i$ and $j$
$\alpha_{jk}$, $j \in \mathcal{J}$, $k \in \mathcal{K}$ : Binary parameter indicating whether demand location $k$ is served by warehouse $j$
$A_{1n}$ / $A_{2n}$ : Set of arcs for IDs/PDs in delivery time window $n$
$N_{1n}$ / $N_{2n}$ : Set of nodes for IDs/PDs in delivery time window $n$
$G_{1n} = (N_{1n}, A_{1n})$ : Spatial-temporal delivery network of IDs
$G_{2n} = (N_{2n}, A_{2n})$ : Spatial-temporal delivery network of PDs
$\rho_k$ : Penalty cost per unmet order at demand location $k$
$d^{\text{disp}}_j$ / $c^{\text{disp}}_j$, $j \in \mathcal{J}$ : Cost per ID/PD vehicle per period for dispatching orders from warehouse $j$
$d^{\text{deliv}}_{ik}$ / $c^{\text{deliv}}_{ik}$, $i \in \mathcal{J} \cup \mathcal{K}$, $k \in \mathcal{K}$ : Cost per ID/PD vehicle per period for delivering an order from location $i$ to location $k$
$c^{\text{relo}}_{ij}$, $i, j \in \mathcal{J}$ : Cost per PD vehicle per period for relocating from zone $i$ to zone $j$
$d^{\text{idle}}_j$ / $c^{\text{idle}}_j$, $j \in \mathcal{J}$ : Cost per ID/PD vehicle per period for idling in zone $j$
$d^{\text{return}}_{kj}$ / $c^{\text{return}}_{kj}$, $j \in \mathcal{J}$, $k \in N(j)$ : Cost per ID/PD vehicle per period for returning from demand location $k$ to warehouse $j$
$c^{\text{op}}$ : Operational cost per PD vehicle per unit of travel distance

Uncertainties
$\tilde{d}_{kts}$, $k \in \mathcal{K}$, $t, s \in \mathcal{T}$ : Number of orders that have to be delivered to location $k$ between periods $t$ and $s$
$\tilde{d}^n = (\tilde{d}_{k,t_n,s_n}, k \in \mathcal{K})^\top$ : Vector of demand in delivery time window $n$
$\tilde{y}_{jts}$, $j \in \mathcal{J}$, $t, s \in \mathcal{T}$ : Number of IDs available in zone $j$ from period $t$ until period $s$
$\tilde{y}^n = (\tilde{y}_{j,t_n-1,s_n}, j \in \mathcal{J})^\top$ : Vector of the numbers of available IDs in delivery time window $n$

Decision variables
$x_j$, $j \in \mathcal{J}$ : Integer decision variable for the number of PDs to hire at warehouse $j$
$u_{na}$, $a \in A_{1n}$ : Flow on arc $a$ of the IDs' network in delivery time window $n$
$v_{na}$, $a \in A_{2n}$ : Flow on arc $a$ of the PDs' network in delivery time window $n$
$w_{k,t_n,s_n}$, $k \in \mathcal{K}$ : Number of undelivered orders to location $k$ in delivery time window $n$
Appendix B: Constraints in the set $\mathcal{Y}_n(S_n, \hat{\xi}^n)$

Let $\delta_n^{+}(m_{jtb})$ and $\delta_n^{-}(m_{jtb})$ be the sets of arcs in the IDs' network $G_{1n}$ for which $m_{jtb}$ is the origin
and destination node, respectively. Similarly, define $\sigma_n^{+}(m_{jtb})$ and $\sigma_n^{-}(m_{jtb})$ for the PDs' network
$G_{2n}$. The feasible region $\mathcal{Y}_n(S_n, \hat{\xi}^n)$ of the multicommodity flow problem is defined by the following
four types of constraints.
1. Entry of IDs
$$(u_a,\ a \in A^{\text{entry}}_{1n})^\top \le \hat{y}^n. \tag{5a}$$
Constraints (5a) ensure that the number of IDs used at a given warehouse cannot exceed the
number of IDs available at that warehouse.

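2. Flow balance constraints of IDs
$$\sum_{a \in \left(\cup_{k \in N(j)} \delta_n^{+}(m_{k,s_n,0})\right) \cap A^{\text{exit}}_{1n}} u_a = u_{a'}, \quad a' \in \delta_n^{-}(m_{j,t_n-1,0}) \cap A^{\text{entry}}_{1n},\ j \in \mathcal{J} \tag{5b}$$
$$\sum_{a \in \delta_n^{+}(m_{itb})} u_a = \sum_{a \in \delta_n^{-}(m_{itb})} u_a, \quad \forall i \in \mathcal{J} \cup \mathcal{K},\ t_n - 1 \le t \le s_n,\ 0 \le b \le b^1_{\max}. \tag{5c}$$
Constraints (5b) and (5c) are flow balance constraints with respect to the network $G_{1n}$ of IDs.
In particular, constraints (5b) require that the number of IDs leaving zone $j$ equals the number
of IDs entering zone $j$.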
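3. Flow balance constraints of PDs
$$\sum_{a \in \sigma_n^{+}(m_{j,t_n-1,0})} v_a = R_j^{n}, \quad \forall j \in \mathcal{J} \tag{5d}$$
$$\sum_{a \in \sigma_n^{+}(m_{k,t_n,0})} v_a = R_k^{n}, \quad \forall k \in \mathcal{K} \tag{5e}$$
$$\sum_{a \in \sigma_n^{+}(m_{jt0})} v_a = \sum_{a \in \sigma_n^{-}(m_{jt0})} v_a + W_{j,t}^{n}, \quad \forall j \in \mathcal{J},\ t_n \le t \le s_n \tag{5f}$$
$$R_j^{n+1} = \sum_{a \in \sigma_n^{-}(m_{j,s_n-1,0})} v_a, \quad \forall j \in \mathcal{J} \tag{5g}$$
$$R_k^{n+1} = \sum_{a \in \sigma_n^{-}(m_{k,s_n,0})} v_a, \quad \forall k \in \mathcal{K} \tag{5h}$$
$$W_{j,t}^{n+1} = \sum_{k \in \mathcal{K}} v_{(m_{k,t-l_{kj},0},\, m_{jt0})}, \quad \forall j \in \mathcal{J},\ t_{n+1} \le t \le s_{n+1} \tag{5i}$$
$$\sum_{a \in \sigma_n^{+}(m_{itb})} v_a = \sum_{a \in \sigma_n^{-}(m_{itb})} v_a, \quad \forall i \in \mathcal{J} \cup \mathcal{K},\ t_n \le t \le s_n,\ b \ge 0, \tag{5j}$$
where constraints (5d)-(5f) link the state variables $S_n = (R^n_{\mathcal{J}}, R^n_{\mathcal{K}}, W^n_{\mathcal{J}})$ to the flows of time
window $n$. Constraints (5j) are the flow balance constraints with respect to the network $G_{2n}$ of PDs.
For the last returning time window $N$, we replace (5g)-(5i) with the following constraints
$$\sum_{a \in \sigma_n^{-}(m_{j,t_N,0})} v_a = x_j, \quad \forall j \in \mathcal{J}$$
to ensure that all the PDs return to warehouses at the end of the planning horizon.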
4. Demand allocation
$$\sum_{a \in A^{\text{deliv}}_{1n}(k)} b_a u_a + \sum_{a \in A^{\text{deliv}}_{2n}(k)} b_a v_a + w_{k,t_n,s_n} = \hat{d}_{k,t_n,s_n}, \quad \forall k \in \mathcal{K}, \tag{5k}$$
where $A^{\text{deliv}}_{1n}(k) \subset A^{\text{deliv}}_{1n}$ and $A^{\text{deliv}}_{2n}(k) \subset A^{\text{deliv}}_{2n}$ represent the sets of delivery arcs with
destination at demand location $k$ for IDs and PDs, respectively; $b_a$ is the number of orders delivered
by each vehicle on arc $a$. In constraints (5k), $w_{k,t_n,s_n}$ is the number of undelivered orders.
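The incidence sets and two of the constraint types above can be illustrated on a toy network. The following is a minimal Python sketch, not the authors' implementation: nodes are written as (location, period, load) tuples to mirror $m_{jtb}$, and the function names, the three-arc network, and all flow values are hypothetical data chosen only for illustration. It builds the outgoing and incoming arc sets, checks a flow balance condition in the style of (5c), and solves a demand allocation equation in the style of (5k) for the number of undelivered orders.

```python
from collections import defaultdict

def incidence(arcs):
    """Return (delta_plus, delta_minus): arcs leaving / entering each node."""
    delta_plus, delta_minus = defaultdict(list), defaultdict(list)
    for arc in arcs:
        origin, destination = arc
        delta_plus[origin].append(arc)
        delta_minus[destination].append(arc)
    return delta_plus, delta_minus

def is_balanced(node, flow, delta_plus, delta_minus):
    """Flow balance in the style of (5c): outflow equals inflow at the node."""
    out_flow = sum(flow[a] for a in delta_plus[node])
    in_flow = sum(flow[a] for a in delta_minus[node])
    return out_flow == in_flow

def undelivered(demand, id_deliveries, pd_deliveries):
    """Solve an equation in the style of (5k) for w at one demand location.

    Each delivery is a pair (b_a, flow_a): orders per vehicle and vehicles on
    a delivery arc ending at the location. A feasible solution is assumed to
    keep the result nonnegative; this helper does not enforce that.
    """
    delivered = sum(b * f for b, f in id_deliveries + pd_deliveries)
    return demand - delivered

# Hypothetical ID network: dispatch at warehouse "j", then two deliveries.
n0, n1, n2, n3 = ("j", 0, 0), ("j", 1, 2), ("k1", 2, 1), ("k2", 3, 0)
arcs = [(n0, n1), (n1, n2), (n2, n3)]
flow = {a: 3 for a in arcs}  # three IDs follow the same route

delta_plus, delta_minus = incidence(arcs)
balanced = all(is_balanced(m, flow, delta_plus, delta_minus) for m in (n1, n2))

# Location k1: demand 5; 3 IDs deliver 1 order each, 1 PD delivers 2 orders.
w_k1 = undelivered(demand=5, id_deliveries=[(1, 3)], pd_deliveries=[(2, 1)])
```

In this toy instance the intermediate nodes are balanced and all five orders at `k1` are covered, so `w_k1` is zero. In the actual model these relations are constraints of an integer program rather than checks applied after the fact.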

Appendix C: Out-of-Sample Policy Evaluation Algorithm
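Algorithm 3: Out-of-Sample Policy Evaluation

Input: $\{\widehat{V}_n\}_{n=1}^{N}$ obtained from Algorithm 1, $N_{\text{test}}$ scenario paths.
Initialize $u^{\kappa} = \{\}$, $v^{\kappa} = \{\}$, $w^{\kappa} = \{\}$, $\forall \kappa = 1, \dots, N_{\text{test}}$.
Solve (1) using $\widehat{V}_1$ to obtain $(x, S_1)$.
for $\kappa = 1, \dots, N_{\text{test}}$ do
    Sample a realization $\{\xi^{n,\kappa}\}_{n=1}^{N}$ of uncertain parameters.
    for $n = 1, \dots, N$ do
        For stage $n$, solve (3) using $\hat{\xi}^n := \xi^{n,\kappa}$, $S_n$ and $\widehat{V}_{n+1}$ to obtain $(u_n, v_n, w_n, S_{n+1})$.
        $u^{\kappa} \leftarrow u^{\kappa} \cup \{u_n\}$; $v^{\kappa} \leftarrow v^{\kappa} \cup \{v_n\}$; $w^{\kappa} \leftarrow w^{\kappa} \cup \{w_n\}$
$\bar{V}_1(S_1) \leftarrow \frac{1}{N_{\text{test}}} \sum_{\kappa=1}^{N_{\text{test}}} \sum_{n=1}^{N} \left( \sum_{a \in A_{1n}} c^1_{na} u^{\kappa}_{na} + \sum_{a \in A_{2n}} c^2_{na} v^{\kappa}_{na} + \sum_{k \in \mathcal{K}} \rho_{nk} w^{\kappa}_{k,t_n,s_n} \right)$
return $\{u^{\kappa}\}_{\kappa=1}^{N_{\text{test}}}$, $\{v^{\kappa}\}_{\kappa=1}^{N_{\text{test}}}$, $\{w^{\kappa}\}_{\kappa=1}^{N_{\text{test}}}$, $\bar{V}_1(S_1)$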
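The control flow of Algorithm 3 can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions rather than the authors' code: `solve_stage` is a hypothetical stand-in for solving the stage problem (3) with the fitted value-function approximation, `sample_path` is a hypothetical scenario generator, and the toy driver and demand numbers are invented. The sketch only shows how the policy is rolled forward along each sampled path, restarting from the same first-stage state, and how the realized costs are averaged.

```python
import random
from typing import Any, Callable, List, Tuple

def evaluate_policy(
    initial_state: Any,
    sample_path: Callable[[random.Random], List[Any]],
    solve_stage: Callable[[int, Any, Any], Tuple[Any, float, Any]],
    n_test: int,
    seed: int = 0,
) -> Tuple[float, List[List[Any]]]:
    """Average realized cost of a fixed policy over n_test sampled paths.

    solve_stage(n, state, xi) must return (decision, stage_cost, next_state).
    """
    rng = random.Random(seed)
    total_cost = 0.0
    decisions = []
    for _ in range(n_test):
        path = sample_path(rng)   # one realization (xi^1, ..., xi^N)
        state = initial_state     # every path starts from the same S_1
        path_decisions = []
        for n, xi in enumerate(path, start=1):
            decision, cost, state = solve_stage(n, state, xi)
            path_decisions.append(decision)
            total_cost += cost
        decisions.append(path_decisions)
    return total_cost / n_test, decisions

# Hypothetical toy policy: the state is a fixed number of drivers, each
# serving one order per time window; unmet orders incur a penalty.
def toy_sample_path(rng: random.Random) -> List[int]:
    return [rng.randint(0, 4) for _ in range(3)]  # demand in 3 time windows

def toy_solve_stage(n: int, drivers: int, demand: int):
    served = min(drivers, demand)
    unmet = demand - served
    cost = 1.0 * served + 10.0 * unmet
    return served, cost, drivers

avg_cost, decisions = evaluate_policy(2, toy_sample_path, toy_solve_stage, n_test=100)
```

Fixing the seed makes repeated evaluations comparable across policies, since each policy then faces the same sampled scenario paths.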
