Article
Multi-UAV Urban Logistics Task Allocation Method Based
on MCTS
Zeyuan Ma 1 and Jing Chen 1,2, *
1 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing,
Wuhan University, Wuhan 430072, China; zeyuanma@whu.edu.cn
2 Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430072, China
* Correspondence: jchen@whu.edu.cn; Tel.: +86-139-7126-0269
Abstract: Unmanned aerial vehicles (UAVs) open up new possibilities for efficient and rapid transportation in urban logistics distribution, where task allocation is a significant issue. In urban logistics systems, the energy status of UAVs is a critical factor in ensuring mission fulfillment. While extensive literature addresses the energy consumption of UAVs during tasks, the feasibility of energy replenishment must also be addressed, which introduces additional uncertainty to the task allocation. This paper realizes multi-task allocation that considers both the energy consumption and the replenishment of UAVs, to ensure that the tasks can be accomplished while reducing energy consumption. It proposes uniform distribution K-means to realize balanced multi-task grouping. Based on the Monte Carlo tree search (MCTS), a task-allocation-oriented MCTS method is proposed that improves the selection and simulation processes of MCTS. The aim is to coordinate multiple trees for node selection and to record historical simulation information to guide subsequent simulations toward better results. Finally, the optimality of the proposed method was validated by comparing it with other relevant MCTS methods through several randomized experiments.
Keywords: multi-robot systems; energy-based task allocation; city delivery; Monte Carlo tree search
(MCTS)
Drones 2023, 7, 679. https://doi.org/10.3390/drones7110679 https://www.mdpi.com/journal/drones
Citation: Ma, Z.; Chen, J. Multi-UAV Urban Logistics Task Allocation Method Based on MCTS. Drones 2023, 7, 679. https://doi.org/10.3390/drones7110679
Academic Editor: Carlos Tavares Calafate
Received: 22 September 2023
Revised: 10 November 2023
Accepted: 15 November 2023
Published: 17 November 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
In recent years, unmanned aerial vehicles (UAVs) have made outstanding progress in structure, control systems, and communication systems [1]. With technological advances and cost reductions, the use of UAVs in areas such as mapping [2], agriculture [3], public safety [4], transportation systems [5], health systems [6], and logistics [7] has been gradually increasing, especially in the logistics sector.
Currently, the logistics industry is experiencing rapid growth, with the number of parcels delivered by various companies increasing annually. However, some inherent limitations and challenges are associated with traditional logistics and transportation methods, such as traffic congestion, transportation time, and high labor costs [8,9]. The emergence of UAVs brings excellent opportunities for change in the logistics industry. First, UAVs are not restricted by ground transportation conditions and can carry out rapid cargo distribution, significantly improving logistics efficiency [10]. Moreover, the flexibility of UAVs enables them to enter areas with complicated terrain or inconvenient transportation options, providing a new solution for logistics supply in remote areas and making up for the gaps that traditional logistics cannot cover [11]. Unlike traditional logistics’ dependence on petroleum resources [12], UAVs use clean energy, reducing energy needs [13,14]. The green and low-carbon application mode of UAVs could help to address problems such as environmental pollution and global warming and meet the needs of green and sustainable development [15,16].
Thanks to the high flexibility, low cost, and eco-friendliness of UAVs [17], many companies are actively exploring the introduction of UAVs into the logistics industry. As
the world’s largest e-commerce company, Amazon has launched the Prime Air drone
delivery service, aiming to deliver parcels to homes within 30 min [18]. As a globally
renowned fast-food chain, Domino’s Pizza has obtained the right to deliver pizzas by
drone. Because drone delivery is strictly limited by local regulators, the company
completed its drone pizza delivery test with the authorization of the New Zealand
government [19]. The German logistics company Deutsche Post DHL has initiated a
project called Parcelcopter, which delivers medical supplies by drone. This service is
particularly suitable for remote areas and situations where drugs are urgently needed [20].
In addition, companies like Google, Walmart, and Alibaba are also committed to drone
delivery research and have conducted a series of delivery tests [21].
Furthermore, to address the issue of UAV flight time being limited by battery capacity,
some enterprises and scholars are focusing on the research of UAV docks [22,23]. UAV
docks allow UAVs to land, change batteries automatically, and then take off again [24].
Through such a design, UAVs can undertake multiple tasks consecutively, extending the
working hours of UAVs and laying the groundwork for UAV applications in logistics.
In the logistics industry, a core issue is allocating multiple tasks to different UAVs
effectively. Allocating multiple tasks to multiple UAVs can be seen as a multi-robot task
allocation (MRTA) problem. In the past few years, many studies have proposed various
methods for MRTA problems, mainly categorized into methods based on linear program-
ming, heuristic-based methods, and auction algorithms.
The task allocation problem for multiple UAVs or robots working collaboratively
is a classic combinatorial optimization problem in mathematics and can be viewed as a
0–1 integer linear programming problem [25]. The primary method for solving linear
programming problems in the early days was the Hungarian algorithm [26], which can be
used to solve allocation optimization problems with one-to-one matching characteristics.
Madridano designed a UAV task allocation system based on the Hungarian algorithm for
building emergencies. This system assigned different tasks to UAVs based on parameters
such as urgency and distance of the task [27]. Mixed-integer linear programming methods
can handle more complex calculations and have been used to solve multi-objective task
allocation problems for robots under uncertain conditions [28–30]. Linear programming is
suitable for problems with well-defined constraints, but it requires the objective function
of the problem to be linear.
Heuristic-based methods for solving MRTA problems can find optimal or near-optimal
solutions while meeting specific constraints [31]. Guided by heuristic information (such as
robot positions and capabilities), these methods repeatedly evaluate and select candidate
allocations until a stopping condition is met. For instance, the genetic algorithm for solving MRTA problems
simulates the evolutionary mechanisms in nature to obtain relevant information during
the search process, yielding the best allocation scheme [32–34]. Similarly, algorithms like
particle swarm optimization (PSO) and ant colony optimization (ACO) can also be used
to address MRTA problems [35–37]. However, heuristic-based methods converge slowly
when faced with problems with large search spaces.
In recent years, increasing research has focused on addressing MRTA problems using
auction algorithms [38]. Auction algorithms draw from the concept of market transactions,
viewing tasks as bidding objects. Each robot bids based on its capabilities and require-
ments, ultimately forming an optimized task allocation scheme [31]. Based on auction
algorithms, the consensus-based auction algorithm (CBAA) and consensus-based bundle
algorithm (CBBA) have been used to solve single-task and multi-task allocation prob-
lems [39]. Moreover, Cheng successfully used auction algorithms to solve multi-UAV task
allocation problems under multiple constraints [40]. Other research has examined how
impaired communication affects the performance of auction algorithms [41]. The
Robot-Group Assignment Strategy combines grouping strategies with auction algorithms
to solve MRTA problems effectively [42].
With the continuous advancement of UAV technology and the increasing complexity
of task allocation scenarios, the methods above have certain limitations when dealing
with MRTA problems with various constraints. To address these challenges, some new
approaches offer better solutions for MRTA problems, such as the Monte Carlo Tree Search
(MCTS). MCTS was initially mainly used to improve computer performance in board
games, especially in Go [43]. AlphaGo, developed by Google DeepMind, employed MCTS
as its key component [44]. Moreover, MCTS has gradually been applied to broader domains.
For instance, in ever-changing factory environments, MCTS can facilitate rapid automated
decisions for human–machine collaboration [45]. Based on MCTS, signal control systems
at intersections can be optimized, providing an effective solution for urban traffic man-
agement [46]. In autonomous driving technology, by combining reinforcement learning
with MCTS, the unsafe behaviors of self-driving vehicles can be minimized, enhancing
their performance in intricate settings [47]. In UAV-aided wireless systems, UAV planning
problems are addressed using the random-MCTS (R-MCTS) strategy [48]. Furthermore,
the greedy-MCTS (G-MCTS) strategy and a combined approach of the random–greedy
simulation strategy (random–greedy-MCTS, RG-MCTS) can effectively tackle the weighted
vertex coloring problem [49]. However, only a few studies leverage MCTS to solve UAV
task allocation challenges.
The information from the related literature and its method type is summarized in
Table 1. Although there have been many studies on task allocation, some aspects have not
been considered: (1) excess numbers of tasks can impact decision-making efficacy; (2) UAV
task planning is strictly limited by energy and capacity; and (3) energy replenishment
services of UAV docks should also be factored in. These aspects will introduce more
uncertainties to the task allocation.
Table 1. Summary of the related literature.
Reference | Method Type | Application Scenario | Objectives | Constraints
[27] | Hungarian method | UAV rescue service | Time consumption | Obstacle avoidance
[28] | Mixed-integer programming | UAV fleet coordination | Travel distance | Obstacle avoidance
[29] | Mixed-integer programming | Robot trajectory planning | Action cost | Robot action feasibility
[30] | Mixed-integer programming | Robot task allocation | Time consumption | Task feasibility
[32] | Genetic Algorithm | UAV inspection service | Time consumption | Obstacle avoidance
[34] | Genetic Algorithm | UAV inspection service | Time and energy consumption | Task feasibility
[35] | Particle Swarm Optimization | Robot task allocation | Travel distance | Computing capacity
[36] | Particle Swarm Optimization | Robot rescue service | Computing time | Robot capability
[39] | Auction algorithm | Autonomous vehicle task allocation | Travel distance and feasible solutions | Communication capability
[40] | Auction algorithm | UAV task allocation | Travel cost | Time window
[41] | Auction algorithm | Robot task allocation | Travel cost | Communication capability
[42] | Auction algorithm | Robot task allocation | Time consumption | Time window
[45] | Monte Carlo Tree Search | Human-robot collaboration | Time consumption | Task execution time
[46] | Monte Carlo Tree Search | Traffic signal optimization | Signal optimization | Computing capacity
[47] | Monte Carlo Tree Search | Autonomous driving | Driving safety | Obstacle avoidance
[48] | Monte Carlo Tree Search | UAV-aided wireless systems | UAV path and computing time | Energy consumption and user fairness
This paper | Monte Carlo Tree Search | City delivery | Energy consumption | Energy consumption and replenishment; payload capacity
The core objective of this paper was to design an effective multi-task allocation method that considers the need for UAVs to be energized via UAV docks to meet the demands of performing longer-range and longer-time missions in the presence of limited UAV payload capacity and battery energy. In this paper, the process of task assignment was divided into two steps. Firstly, to reduce the search range of task allocation, this paper divided all the tasks into different sub-task groups and performed task allocation within each sub-task group. Secondly, for task allocation, this paper proposes the task allocation MCTS (TA-MCTS) method based on MCTS. TA-MCTS aims to minimize the energy consumption of UAVs, considers the constraints of load capacity and battery energy, and rationally formulates UAV flight strategies. The main methods are as follows:
(1) Based on the K-Means clustering method, this paper constrained the number of elements in each sub-task group so that it not only grouped all tasks but also ensured the same number of tasks were allocated to each sub-task group.
(2) Within the divided sub-task groups, task allocation was realized by multiple trees of MCTS corresponding to multiple UAVs.
(3) In the selection and expansion phase of MCTS, selection and expansion optimization strategies were proposed. By considering the energy state of the UAVs during the execution of the mission, the following possible nodes were selectively expanded to reduce the search range of the MCTS.
(4) In the simulation phase of MCTS, a simulation optimization strategy was proposed. The strategy utilized the simulation results during the multi-UAV task assignment process as heuristic information to guide the selection of tasks in the simulation phase.
The rest of the paper is organized as follows: Section 2 provides a problem description. An overview of the research methodology of the paper is given in Section 3. Section 4 analyzes the performance of the proposed methodology in different settings. Finally, the paper concludes in Section 5.
2. Problem Description
2.1. Problem Scenario
Urban UAV logistics is a complex system involving the interaction of multiple entities, including UAVs, distribution centers (DC), UAV docks, and task locations. In this paper, we mainly considered how to assign multiple tasks to UAVs to minimize energy consumption when UAVs have limited energy and UAV docks are available. As shown in Figure 1, although the energy of the UAV could meet the needs of tasks 1 and 2, it could not perform task 3. Therefore, the UAV should replace the battery at the dock after performing task 1 to reach the energy requirement for performing three tasks. Unlike the traditional allocation method, the consumption of UAV energy affects the task allocation process.
Figure 1. Task selection considering the energy level.
To simplify the problem, the following assumptions were made in this paper:
• The distribution center covers a specific area, and the distribution targets of the task
are distributed in this area.
• UAV docks are distributed in this area and serve as service stations for UAVs to change
their batteries.
• The UAVs have limited energy sources, and when low on power, the UAVs need to
travel to docks for battery replacement.
• The UAVs have limited carrying capacity, and each can only carry a limited number of
items of the same weight.
• The UAVs maintain a consistent flight speed throughout the delivery process, so the
energy consumption per kilometer depends on the number of items carried.
2.2. Task Model

The task model and symbols are described as follows. Let $U$ denote the set of $m$ UAVs, with the UAVs denoted by $u_1, u_2, \cdots, u_m$. Each UAV has the same attributes: average flight speed $V_{avg}$, average flight altitude $z_{avg}$, maximum number of items to be carried $P_{max}$, maximum battery capacity $E_{max}$, and energy consumed per kilometer $E_{cost}$. Let $T$ denote the set of $N$ tasks, with the tasks denoted by $t_1, t_2, \cdots, t_N$, and the task location denoted by their coordinates $l = (x, y, z)$. $E_{cost}$ is defined in Equation (1), where $E$ represents the energy consumed by the UAV per kilometer when no items are loaded, and $\alpha$ is the number of loaded items. $\Delta$ denotes the energy consumed by each item (based on the assumption that the weight of the items is the same).
$$E_{cost} = E + \alpha\Delta \qquad (1)$$
When task $t_j$ is assigned to UAV $u_i$, the energy $c_{ij}$ required for UAV $u_i$ to accomplish $t_j$ is shown in Equation (2), where $D(l_{j-1}, l_j)$ denotes the distance of the UAV from position $t_{j-1}$ to position $t_j$.
$$c_{ij} = D(l_{j-1}, l_j)\,E_{cost} \qquad (2)$$
$$D(l_{j-1}, l_j) = \sqrt{(x_j - x_{j-1})^2 + (y_j - y_{j-1})^2} + 2z_{avg} - z_j - z_{j-1} \qquad (3)$$
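Equations (1)–(3) can be illustrated with a minimal Python sketch. The function names, the tuple representation of locations, and the numeric values in the usage note are assumptions made for illustration only; they do not come from the paper.

```python
import math


def energy_per_km(base_energy, items, delta):
    """Equation (1): E_cost = E + alpha * Delta."""
    return base_energy + items * delta


def flight_distance(prev, nxt, z_avg):
    """Equation (3): horizontal distance plus the climb from the previous
    location up to the cruise altitude z_avg and the descent to the next one."""
    (x0, y0, z0), (x1, y1, z1) = prev, nxt
    horizontal = math.hypot(x1 - x0, y1 - y0)
    vertical = 2 * z_avg - z1 - z0
    return horizontal + vertical


def task_energy(prev, nxt, z_avg, base_energy, items, delta):
    """Equation (2): c_ij = D(l_{j-1}, l_j) * E_cost."""
    distance = flight_distance(prev, nxt, z_avg)
    return distance * energy_per_km(base_energy, items, delta)
```

For example, with hypothetical values `prev = (0, 0, 0)`, `nxt = (3, 4, 0)`, `z_avg = 0.1`, `base_energy = 10`, `items = 2`, and `delta = 1.5`, the distance is 5 + 0.2 = 5.2 km, the per-kilometer cost is 13, and the task energy is about 67.6 in the same energy units.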
In this paper, all tasks are grouped, and each sub-task group is denoted by $S$, with the same number of tasks in each group denoted as $n$. The following constraints are satisfied if the task group $S$ is feasible:
$$\sum_{j=1}^{n} x_{ij} \le P_{max} \quad \forall i \in U \qquad (4)$$
$$\sum_{i=1}^{m} x_{ij} \le 1 \quad \forall j \in S \qquad (5)$$
$$x_{ij} \in \{0, 1\} \quad \forall (i, j) \in U \times S \qquad (6)$$
$$E\left(x_{i(j-1)}\right) - c_{ij} > 0 \quad \forall (i, j) \in U \times S \qquad (7)$$
where $x_{ij} = 1$ indicates that task $j$ is assigned to UAV $i$; otherwise, $x_{ij} = 0$. Equation (4) indicates that the number of items carried by each UAV cannot exceed the maximum limit, $P_{max}$. Equation (5) indicates that, at most, one UAV is assigned to each task. $E(x_{ij})$ indicates the energy left after UAV $i$ completes task $j$. Equation (7) indicates that after completing the previous task, the UAV should have enough energy to reach the following task location.
Considering that the MRTA problem can be expressed as a minimization objective function, in this paper, the objective function was defined as all the energy $C$ consumed by the UAV to perform the tasks, and the objective function is shown in Equation (8) under the above constraints:
$$C = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} \qquad (8)$$
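As a rough illustration of how constraints (4), (5), and (7) and the objective in Equation (8) fit together, the sketch below checks a candidate assignment and sums its energy. It rests on simplifying assumptions that are not in the paper: each task carries one item, the costs $c_{ij}$ are precomputed in a table (in the paper they depend on the previous location), and battery replacement at docks is not modelled. All names are hypothetical.

```python
def is_feasible(assignment, costs, p_max, e_max):
    """assignment[i] is the ordered list of task indices given to UAV i;
    costs[i][j] is a precomputed c_ij (an assumption of this sketch)."""
    seen = set()
    for i, tasks in enumerate(assignment):
        if len(tasks) > p_max:              # Equation (4): payload limit
            return False
        energy = e_max
        for j in tasks:
            if j in seen:                   # Equation (5): one UAV per task
                return False
            seen.add(j)
            if energy - costs[i][j] <= 0:   # Equation (7): energy must stay positive
                return False
            energy -= costs[i][j]
    return True


def total_energy(assignment, costs):
    """Equation (8): total energy over all assigned (UAV, task) pairs."""
    return sum(costs[i][j] for i, tasks in enumerate(assignment) for j in tasks)
```

With a hypothetical cost table `[[2, 3, 4], [1, 5, 2]]`, the assignment `[[0, 1], [2]]` is feasible for `p_max = 2` and `e_max = 6` and consumes 7 units in total; lowering `e_max` to 5 makes it infeasible, because the first UAV would reach its second task with no energy left.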
3. Methodology
Generally, a DC in an urban area with many required delivery tasks is responsible for a specific area. In order to accomplish all the tasks with minimum energy consumption, the core idea of TA-MCTS proposed in this paper was to group the tasks and solve the multi-task assignment problem by local search, as shown in Figure 2. This consisted of two main aspects. Firstly, the tasks were grouped into multiple sub-task groups by improved K-means. Then, using MCTS, all the tasks contained in each group were assigned to multiple UAVs.
Figure 2. Methodology of TA-MCTS. Solid circles in different colors representing divided sub-task groups.
3.1. Task Grouping Strategy
A direct assignment of all tasks in DC with many tasks would create a large-scale search range, leading to poor assignment results. Therefore, it was crucial to group all tasks according to UAV characteristics. These sub-task groups directly affected the way the UAV performed its tasks. Moreover, grouping results play a crucial role in reducing energy consumption.
According to the scenario in Section 2.1, the DC set in this paper was surrounded by multiple distribution task locations. Considering the load capacity of UAVs, this paper first clustered the tasks according to their locations. Currently, standard clustering methods include density-based, hierarchical, and partitioning clustering methods.
K-means, among the partitioned clustering methods, is widely used due to its efficiency. The K-means algorithm can randomly assign all tasks to $K$ clusters, with $K$ representing the number of groups. K-means bases its classification on the distances of each element from the center of each cluster without considering the number of elements in each cluster. However, during the task assignment process, if the clustering results in an unequal number of elements, it may result in the number of tasks exceeding the maximum load capacity of the UAV, which will thus fail to complete the task assignment.
In this paper, based on K-means, the uniform distribution K-means (UD-Kmeans) method was proposed to ensure that the number of elements in each cluster was the same according to the payload capacity of the UAV. $K$ is calculated as shown in Equation (9), where $K$ denotes the number of all sub-task groups, and the meanings of $m$, $N$, $P_{max}$ are shown in Section 2.2.
$$K = \min\left(m, \frac{N}{mP_{max}}\right) \qquad (9)$$
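A short worked instance of Equation (9) may help. The values $N = 60$, $m = 3$, and $P_{max} = 4$ are assumed purely for illustration and do not come from the paper's experiments:

```latex
\frac{N}{mP_{max}} = \frac{60}{3 \times 4} = 5, \qquad
K = \min(3,\, 5) = 3, \qquad
n = \frac{N}{K} = \frac{60}{3} = 20 .
```

Under these assumed values the tasks are split into three sub-task groups of 20 tasks each.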
The details of the UD-Kmeans method are described in Algorithm 1. First, a point is randomly selected from all task locations as the first clustering center (Lines 1–2). The shortest distance from each task location to the existing clustering centers is then calculated, and the next center point is selected with a probability weighted by the square of this distance. The remaining clustering centers are selected in the same way (Lines 3–9). When all the $K$ clustering centers have been selected, each task is assigned to the nearest center to form a sub-task group $S$ (Lines 12–15). After that, the number of tasks in each sub-task group $s_i$ is checked against the specified number. If it is exceeded, the task $t_j$ that is farthest from the center of the current $s_i$ is moved out of $s_i$ and into the group $s_k$ closest to $t_j$ (Lines 16–20). Then, the mean value of $s_i$ is calculated, and the clustering center is updated (Lines 22–24). The next iteration is carried out until the clustering centers no longer change.
Algorithm 1: Uniform Distribution K-Means

1: Initialization: randomly select any task location in $T$ as the first center point $o_1$
2: Construct a list of clustering centers $O$, $O = \{o_1\}$
3: for $i = 2$ to $K$ do
4:   for each $t_j$ in $T$ do
5:     Calculate the distance from $t_j$ to all elements of $O$, take the minimum value $dis_j$
6:   end for
7:   Select the new centroid $o_i$ according to the probability $P(o_i) \propto dis_j^2$
8:   Add new center point $o_i$ to $O$
9: end for
10: while $O$ is changed do
11:   Clear all elements in $s_i$, $s_i \in S$
12:   for each $t_j$ in $T$ do
13:     $i = \arg\min_i \{D(o_i, t_j)\}$, $o_i \in O$, $t_j \in T$
14:     $s_i \leftarrow s_i \cup \{t_j\}$
15:   end for
16:   for each $s_i$ in $S$ do
17:     while $|s_i| > mP_{max}$ do
18:       $j = \arg\max_j \{D(o_i, t_j)\}$, $t_j \in s_i$; $s_i \leftarrow s_i - \{t_j\}$
19:       $k = \arg\min_k \{D(o_k, t_j)\}$, $o_k \in O$, $o_k \ne o_i$; $s_k \leftarrow s_k \cup \{t_j\}$
20:     end while
21:   end for
22:   for each $o_i$ in $O$ do
23:     $o_i = \sum_{t_j \in s_i} (x_j, y_j, z_j) / |s_i|$
24:   end for
25: end while
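The steps of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation. It deviates from the listing in one labelled way: a task removed from an overfull group is moved to the nearest group that still has spare capacity (rather than simply the nearest other group), which is an assumption added here so that the rebalancing loop always terminates. The names `ud_kmeans`, `cap`, `max_iter`, and `seed` are hypothetical.

```python
import math
import random


def ud_kmeans(points, k, cap, max_iter=100, seed=0):
    """Group points into k clusters holding at most `cap` points each."""
    if k * cap < len(points):
        raise ValueError("k * cap must be at least the number of tasks")
    rng = random.Random(seed)

    # Lines 1-9: seed the centers, weighting each candidate by its squared
    # distance to the nearest center chosen so far.
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(math.dist(p, c) for c in centers) ** 2 for p in points]
        if sum(d2) == 0:  # all points coincide with existing centers
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])

    groups = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Lines 11-15: assign every task to its nearest center.
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda g: math.dist(p, centers[g]))
            groups[nearest].append(idx)

        # Lines 16-21: push the farthest task out of each overfull group.
        # Assumption of this sketch: the target must have spare capacity.
        for g in range(k):
            while len(groups[g]) > cap:
                far = max(groups[g], key=lambda i: math.dist(points[i], centers[g]))
                groups[g].remove(far)
                open_groups = [h for h in range(k) if h != g and len(groups[h]) < cap]
                target = min(open_groups, key=lambda h: math.dist(points[far], centers[h]))
                groups[target].append(far)

        # Lines 22-24: recompute each center as the mean of its group.
        new_centers = []
        for g, members in enumerate(groups):
            if members:
                dims = zip(*(points[i] for i in members))
                new_centers.append(tuple(sum(d) / len(members) for d in dims))
            else:
                new_centers.append(centers[g])
        if new_centers == centers:  # Line 10: stop once the centers are stable
            break
        centers = new_centers
    return groups, centers
```

Calling `ud_kmeans(points, k=3, cap=4)` on twelve task locations given as `(x, y, z)` tuples returns three index lists of exactly four tasks each, since no group may exceed the cap and every task must be placed.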
3.2. MCTS Task Allocation

MCTS is widely used as a powerful method in games and decision-making problems.
The method constructs a tree and stores information about each action through iterations,
allowing better-informed choices to be made in subsequent iterations. Figure 3
shows one complete iteration in MCTS.
Selection: From the root node, MCTS selects the node with the highest value in each
iteration, according to the selection strategy.
Expansion: If a node is found that has not been expanded, one or more child nodes
are added to this node to indicate the following possible action.
Simulation: Starting from the selected node, a series of actions are simulated until an
end state is reached. Usually, a random or greedy strategy is adopted [50].
Backpropagation: MCTS propagates the simulation results to the root node and
updates all nodes on the path.
In the selection phase, the upper confidence bounds for trees (UCT) method is most commonly used to balance exploration and exploitation of the entire tree, as shown in Equation (10):
$$UCT = \underset{a \in A(s)}{\arg\max}\left\{ Q(s, a) + C\sqrt{\frac{\ln[N(s)]}{N(s, a)}} \right\} \qquad (10)$$
where $A(s)$ denotes the set of actions available in state $s$. $Q(s, a)$ denotes the average reward of action $a$ in state $s$. $N(s, a)$ denotes the number of executions of action $a$ in state $s$. $N(s)$ denotes the number of all executions of state $s$. $C$ is a constant controlling the balance between exploration and exploitation. A larger $C$ implies exploring more unvisited nodes; otherwise, it is more inclined to utilize visited nodes. With the reward value set within the range $[0, 1]$, $C = 1/\sqrt{2}$ favors the construction of the MCTS tree.
Figure 3. One iteration of the MCTS approach.
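The UCT rule in Equation (10) can be sketched in a few lines of Python. The dictionary layout of the statistics and the function name are assumptions for illustration; giving unvisited actions an infinite score is a common convention rather than something stated in the paper.

```python
import math


def uct_select(stats, total_visits, c=1 / math.sqrt(2)):
    """Return the action maximizing Q(s, a) + C * sqrt(ln N(s) / N(s, a)).

    stats maps each action to a pair (Q(s, a), N(s, a));
    total_visits is N(s).
    """
    def score(action):
        q, n = stats[action]
        if n == 0:
            return math.inf  # convention assumed here: try unvisited actions first
        return q + c * math.sqrt(math.log(total_visits) / n)

    return max(stats, key=score)
```

For instance, with `stats = {"a": (0.6, 10), "b": (0.5, 2)}` and `total_visits = 12`, the rarely visited action `"b"` wins because its exploration bonus outweighs the slightly lower average reward; setting `c = 0` removes the bonus and the choice falls back to `"a"`.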
In this paper, based on MCTS, the TA-MCTS method was proposed to solve the multi-task assignment problem of UAVs. First, considering that there were multiple UAVs in the scene, constructing only one tree was expected to lead to a sharp increase in the number of nodes, affecting the search speed and efficiency. Therefore, this paper constructs a separate tree for each UAV.
Figure 4 shows the process of TA-MCTS proposed in this paper. When searching multiple trees in TA-MCTS, a public task set is established to record the changes of selected tasks in the TA-MCTS process corresponding to each UAV to prevent task selection conflicts. Firstly, in this paper, we select a TA-MCTS process corresponding to a UAV, and according to the selection rule, a task is selected and removed from the task set to prevent it from being selected by other trees. After that, the node will expand according to the selected tasks. In the simulation phase, based on the historical information generated in the iterative process and the tasks selected by the current tree, the subsequent tasks that may be executed are obtained and removed from the task set. Finally, a backpropagation phase is performed to update the information.
The TA-MCTS method has two essential aspects: the selection and expansion optimization strategy and the simulation optimization strategy.
Selection and Expansion Optimization Strategy (SEOS): In the selection phase of the MCTS, this strategy ensures that, in an iteration, in addition to considering the reward values of the nodes, there is no task conflict between each UAV. In the expansion phase, this strategy considers the energy consumption of UAVs and selectively expands the possible tasks according to the current energy status of UAVs. This method can efficiently build different nodes in the tree.
Simulation Optimization Strategy (SOS): During the MCTS tree construction, we recorded each node’s value as heuristic information. This information was used in subsequent simulations to compute more rational results.
Figure 4. One iteration of the TA-MCTS method. Blue nodes represent tasks that all UAVs have not selected. The green node represents a task that is currently selected by one of the UAVs. The red node cannot be selected because it represents a task that another UAV has already selected.
Algorithm 2 shows the whole flow of the TA-MCTS method. TA-MCTS builds a search tree for each UAV and sets its state. A public task set $V_s$ is built to represent the tasks visited to avoid task selection conflicts (Lines 3–4). The matrix $M$ is defined as an information repository to record the information for the simulation optimization strategy (Line 5). Lines 6–15 represent all the processes during one iteration of TA-MCTS. First, a search tree corresponding to a UAV $u_i$ is selected. Following the node selection strategy, the tree selects and expands a task node. The simulation optimization strategy is used in the simulation process to guide nodes in choosing possible tasks. Finally, the backpropagation process updates the result based on the simulation outcome. During this process, $V_s$ is continuously updated to collaborate with the corresponding trees of different UAVs to prevent task conflicts. The notations and functions involved are further elaborated in Sections 3.2.1 and 3.2.2.
Algorithm 2: TA-MCTS
1: Initialization:
2:   Set the energy of all UAVs in U to E_max
3:   Construct an empty tree for each UAV with the root node v0 representing the DC
4:   Build a public task set Vs denoting visited tasks
5:   Define matrix M with dimensions (n + L) × (n + L)
6: while iterations are not completed do
7:   UAV u_i is randomly selected
8:   v_j ← Selection(v)
9:   Expand(v)
10:  if the position x_j^i of node v_j belongs to S then
11:    Vs ← Vs ∪ {x_j^i}
12:  end if
13:  C ← Simulation(v)
14:  Backpropagation(v, C)
15: end while
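To illustrate how the public task set Vs in Algorithm 2 can keep the per-UAV trees from claiming the same task, the following minimal Python sketch replaces the Selection, Expand, and Simulation steps with a plain random pick. The function name `allocate_without_conflicts` and the random stand-in are assumptions made purely for illustration; the sketch shows only the shared-set bookkeeping, not the TA-MCTS search itself.

```python
import random


def allocate_without_conflicts(tasks, n_uavs, seed=0):
    """Toy stand-in for the Algorithm 2 loop (hypothetical helper).

    Only the shared-set bookkeeping mirrors the paper; the task choice
    is random instead of being driven by MCTS.
    """
    rng = random.Random(seed)
    visited = set()                          # public task set Vs
    plans = {u: [] for u in range(n_uavs)}   # one task list per UAV tree
    while len(visited) < len(tasks):
        u = rng.randrange(n_uavs)            # Line 7: pick a UAV (tree) at random
        candidates = [t for t in tasks if t not in visited]
        task = rng.choice(candidates)        # placeholder for Selection/Expand
        visited.add(task)                    # Line 11: Vs <- Vs U {task}
        plans[u].append(task)
    return plans


plans = allocate_without_conflicts(["s1", "s2", "s3", "s4", "s5"], n_uavs=2)
assigned = [t for route in plans.values() for t in route]
print(sorted(assigned))  # every task appears exactly once across all UAVs
```

Because each tree only chooses from tasks that are not yet in the shared set, no task can be assigned to two UAVs, which is the property the public task set is meant to guarantee.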
3.2.1. Selection and Expansion Optimization Strategy

Combined with the energy situation of UAVs, this paper proposes a selection and
expansion optimization strategy to optimize the selection and expansion phases of MCTS.
Firstly, through SEOS, we ensured that there was no conflict in the task selection of each
UAV in the respective node selection phase. Furthermore, during the expansion process,
considering that UAV energy has limitations, not all locations satisfy the expansion needs.
SEOS selectively added feasible nodes according to the UAV energy status, enabling the
following simulation process to be effective and improving the multi-task assignment
efficiency for UAVs.
According to the definition in Section 2.2, let $U$ denote the set of $m$ UAVs, with the UAVs denoted by $u_1, u_2, \cdots, u_m$. $p_i$ denotes the number of items the UAV $u_i$ currently carries. In this paper, task assignments for UAVs were performed in each sub-task group. We used $S$ to denote each sub-task group, and the number of tasks in each group is $n$, with the tasks denoted by $s_1, s_2, \cdots, s_n$ as shown in Equation (11).
$$S = [s_1, s_2, \cdots, s_n] \quad (11)$$
Let $R$ denote the set of $L$ UAV docks, with the UAV docks denoted by $r_1, r_2, \cdots, r_L$ as shown in Equation (12).
$$R = [r_1, r_2, \cdots, r_L] \quad (12)$$
The task location $j$ of UAV $u_i$ is denoted by $x_j^i$, where $j \in (S \cup R)$ denotes the UAV docks and task locations. As shown in Equation (13), the energy state of UAV $u_i$ at position $j$ is denoted by $xe_j^i$. When $xe_{j-1}^i - c_{ij} > 0$ is satisfied, the UAV has enough energy to fly from the previous position to $x_j^i$. When $x_j^i$ belongs to $R$, the UAV completes the battery exchange and the energy becomes $P_{max}$ again.
$$xe_j^i = \begin{cases} xe_{j-1}^i - c_{ij}, & \text{if } xe_{j-1}^i - c_{ij} > 0 \text{ and } j \in S \\ P_{max}, & \text{if } xe_{j-1}^i - c_{ij} > 0 \text{ and } j \in R \end{cases} \quad (13)$$
As shown in Figure 5, when the UAV ui is in the node expansion phase, it selects tasks
that have not been visited. Using Equation (13), it evaluates its energy and selects reachable
nodes while removing inaccessible nodes due to insufficient energy. Based on this, we
ensured the nodes’ validity and effectively reduced the search tree’s size, minimizing
invalid and redundant searches.
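The energy check used during expansion can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `next_energy` and `feasible_children`, the dictionary of per-location flight costs, and the `full_energy` argument (the value the battery is reset to at a dock) are hypothetical and chosen only to mirror the two cases of Equation (13).

```python
from typing import Dict, Optional, Set


def next_energy(prev_energy: float, cost: float, is_dock: bool,
                full_energy: float) -> Optional[float]:
    """Energy state after flying to the next location, following Eq. (13).

    Returns None when the remaining energy would not be positive,
    i.e. the location is unreachable from the current state.
    """
    remaining = prev_energy - cost
    if remaining <= 0:
        return None
    # Reaching a dock triggers a battery exchange; a task just consumes energy.
    return full_energy if is_dock else remaining


def feasible_children(prev_energy: float, costs: Dict[str, float],
                      docks: Set[str], full_energy: float) -> Dict[str, float]:
    """Keep only the unvisited locations the UAV can actually reach."""
    children = {}
    for location, cost in costs.items():
        energy = next_energy(prev_energy, cost, location in docks, full_energy)
        if energy is not None:
            children[location] = energy
    return children


# Assumed toy numbers: 100 Wh left, three candidate locations.
print(feasible_children(100.0, {"s1": 40.0, "s2": 120.0, "r1": 60.0},
                        docks={"r1"}, full_energy=400.0))
```

In this toy call, the task `s2` is dropped because its flight cost exceeds the remaining energy, which is the pruning effect described above.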
Figure 5. Expansion process.
Based on the above calculations, the SEOS pseudo-code is shown in Algorithm 3.
Algorithm 3: Selection and Expansion Optimization Strategy
1: function Selection(v)
2:   while state of v satisfies p_i < P_max do
3:     if v not fully expanded then
4:       return v
5:     else
6:       v ← BestChild(v)
7:   end while
8:   return v
9: function Expand(v)
10:  choose j ∈ {(S\Vs) ∪ R}
11:  xe_j^i is obtained by Equation (13)
12:  if xe_j^i > 0 then
13:    set state and position of the new node v′ according to j and xe_j^i
14:    add a new child v′ to v
15:    return v′
16:  else
17:    Expand(v)
18:  end if
19: function BestChild(v)
20:  v is obtained by Equation (10)
21:  return v
The above pseudo-code primarily outlines the SEOS method. Lines 1–8 depict the selection function. If a node has not reached its termination status, it means the UAV $u_i$ is currently carrying fewer than $P_{max}$ items. If this node is not fully expanded, it will proceed with expansion. Otherwise, the optimal node will be selected. Lines 9–18 illustrate the expansion function. An unvisited task location $j$ is chosen, and then $xe_j^i$ is computed based on Equation (13). If $xe_j^i > 0$, a new node will be generated and added to the child list of the prior node. If not, a new task location $j$ that has not been visited will be reselected. Lines 19–21 depict the optimal node computation function, where its reward value is computed through Equation (10).
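The selection and best-child steps can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the `Node` class, its `untried` list of feasible but not yet expanded locations, the infinite score given to unvisited children, and the `is_terminal` callback are all assumptions introduced here. The scoring follows Equation (10) with $C = 1/\sqrt{2}$.

```python
import math


class Node:
    """Hypothetical tree node for one UAV's search tree."""

    def __init__(self, location, parent=None):
        self.location = location
        self.parent = parent
        self.children = []
        self.untried = []        # feasible locations not yet expanded
        self.visits = 0          # N(s) for this node, N(s, a) seen from its parent
        self.total_reward = 0.0  # sum of backpropagated rewards


def uct_score(child, parent_visits, c=1 / math.sqrt(2)):
    """Q(s, a) + C * sqrt(ln N(s) / N(s, a)), as in Eq. (10)."""
    if child.visits == 0:
        return math.inf  # assumption: always try unvisited children first
    q = child.total_reward / child.visits
    return q + c * math.sqrt(math.log(max(parent_visits, 1)) / child.visits)


def best_child(node, c=1 / math.sqrt(2)):
    return max(node.children, key=lambda ch: uct_score(ch, node.visits, c))


def select(node, is_terminal):
    """Descend until a node that is terminal or can still be expanded."""
    while not is_terminal(node):
        if node.untried or not node.children:
            return node
        node = best_child(node)
    return node


root = Node("DC")
root.visits = 10
a, b = Node("s1", root), Node("s2", root)
a.visits, a.total_reward = 5, 4.0
b.visits, b.total_reward = 5, 1.0
root.children = [a, b]
print(best_child(root).location)
```

With equal visit counts the exploration terms are identical, so the child with the higher average reward (`s1` in this toy tree) is selected.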
3.2.2. Simulation Optimization Strategy

Typically, the MCTS simulation phase starts at a selected node and carries out a series
of actions, either randomly or according to some strategy, until it reaches the termination
state. This process is also referred to as a rollout. However, the traditional simulation
approach may not be optimal when faced with the multi-UAV task assignment problem.
The simulation optimization strategy proposed in this paper aims to enhance the
simulation phase of MCTS for the multi-UAV task assignment problem. The following is
the core idea of the strategy:
In solving the multi-UAV task assignment, the method proposed in this paper builds
a separate tree for each UAV. Moreover, as the MCTS iterates the process many times,
many different results are formed. This paper recorded and shared these simulation results
with all UAVs. These results were used as heuristic information to form an information
repository during multiple iterations, which could be used as historical experience to guide
action selection in the subsequent simulation process. The simulation phase could be more
effective by combining the historical experience with the current state information of a
particular UAV. Thus, a stochastic simulation process was transformed into an intelligent
simulation that combined historical heuristic information. SOS improved the effectiveness
of the simulation phase of MCTS and provided a better solution for MCTS to solve the
multi-task assignment problem.
Based on SOS, this paper recorded the results of each iteration in the tree for each UAV and updated the information repository. The matrix $M$ is defined as the information repository with dimension $(n + L) \times (n + L)$, where $n$ is the number of tasks in the sub-task group and $L$ is the number of UAV docks. In matrix $M$, the information between each task position is computed as shown in Equation (14):
$$M_{ij}(t) = M_{ij}(t-1) + c_{ij}^{-1} \quad (14)$$
where $M_{ij}(t)$ denotes the value of information between positions $i$ and $j$ during the $t$-th iteration, $i, j \in (S \cup R)$. The matrix $M$ is constantly updated through multiple iterations of multiple UAVs, which guides the simulation process in MCTS.
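A possible reading of the update in Equation (14) is sketched below. The function name `update_repository`, the list-of-lists representation of $M$ and of the cost matrix, and the choice to apply the update to every consecutive pair of a simulated route are assumptions made for illustration.

```python
def update_repository(m, route, cost):
    """Apply M_ij(t) = M_ij(t-1) + 1 / c_ij to each leg of a simulated route.

    m and cost are square list-of-lists indexed by location; route is the
    ordered list of visited location indices (hypothetical representation).
    """
    for i, j in zip(route, route[1:]):
        m[i][j] += 1.0 / cost[i][j]
    return m


# Toy example with three locations; costs are assumed values.
m = [[0.0] * 3 for _ in range(3)]
cost = [[1.0, 2.0, 4.0],
        [2.0, 1.0, 2.0],
        [4.0, 2.0, 1.0]]
update_repository(m, route=[0, 2, 1], cost=cost)
print(m[0][2], m[2][1])
```

Cheaper legs add larger increments, so transitions that repeatedly appear in low-cost simulations accumulate more information in $M$.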
During the simulation process, the information in the matrix M helps guide the
algorithm to quickly select the following possible actions based on the existing information
to find a suitable allocation scheme. Specifically, the following action is selected in the
simulation process according to a certain random probability, and the random probability
is calculated as shown in Equation (15):
$$P_{ij} = \frac{M_{ij}(t) \times c_{ij}^{-1}}{\sum_{k \in allow} M_{ik}(t) \times (c_{ik})^{-1}} \quad (15)$$
where Pij denotes the probability of selecting location j based on the current location i, and
allow denotes the set of locations that have not yet been visited after removing location j.
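The probability in Equation (15) can be computed as in the following sketch. The function name `transition_probabilities`, the treatment of `allowed` as the full set of candidate next locations, and the uniform fallback used when all weights are zero (for example before $M$ has received any update) are assumptions that are not specified in the text.

```python
def transition_probabilities(i, allowed, m, cost):
    """Probability of moving from location i to each j in allowed, per Eq. (15)."""
    weights = {j: m[i][j] / cost[i][j] for j in allowed}
    total = sum(weights.values())
    if total == 0:
        # Assumed fallback: no information yet, so choose uniformly.
        return {j: 1.0 / len(weights) for j in weights}
    return {j: w / total for j, w in weights.items()}


# Toy example: equal information, but location 2 is three times as costly.
m = [[0.0, 1.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
cost = [[1.0, 1.0, 3.0], [1.0, 1.0, 1.0], [3.0, 1.0, 1.0]]
print(transition_probabilities(0, [1, 2], m, cost))
```

Both the accumulated information and the inverse cost raise the weight of a candidate, so a nearby location with a good history is the most likely next step.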
Algorithm 4 is the simulation process of the TA-MCTS method. First, we determined whether node v′ satisfied the termination state, and if it did, we computed C directly according to Equation (8). Otherwise, we obtained the set of nodes NV (Line 4) that had not been visited. We iterated through all the NV nodes, computed each node's current probability according to Equation (15), and stored it in the list of probabilities (Lines 6–12). Subsequently, the next possible location j was obtained based on the generated random number rn and the list of probabilities. A new node was set up based on the current state, and the matrix M was updated by Equation (14) (Lines 13–17).
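One common way to obtain a location from a random number and a list of probabilities is roulette-wheel selection, sketched below. The paper does not state which scheme is used, so this is an assumption; the function name `pick_by_roulette` and the rescaling by the sum of the stored probabilities (useful when infeasible locations have been skipped and the values no longer sum to one) are likewise illustrative.

```python
import random


def pick_by_roulette(probabilities, rn):
    """Pick a key of `probabilities` using a random number rn in [0, 1)."""
    threshold = rn * sum(probabilities.values())  # rescale if not normalised
    cumulative = 0.0
    chosen = None
    for location, p in probabilities.items():
        cumulative += p
        chosen = location
        if threshold < cumulative:
            return location
    return chosen  # guards against floating-point rounding at the upper end


probabilities = {"s1": 0.2, "s2": 0.5, "r1": 0.3}
rn = random.random()  # Line 13 of Algorithm 4
print(pick_by_roulette(probabilities, rn))
```

Each location is chosen with a frequency proportional to its stored probability, so the rollout stays stochastic while still favouring transitions supported by the information repository.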
Algorithm 4: Simulation Optimization Strategy
1: function Simulation(v)
2:   set v′ = v
3:   while state of v′ satisfies p_i < P_max do
4:     get the set of not visited positions NV = {(S\Vs) ∪ R}
5:     define a list probabilities
6:     for each j in NV do
7:       xe_j^i is obtained by Equation (13)
8:       if xe_j^i > 0 then
9:         calculate P_ij by Equation (15)
10:        probabilities[j] = P_ij
11:      end if
12:    end for
13:    generate a random number rn in the range [0, 1)
14:    get j according to rn and probabilities
15:    set state and position of the new node v′
16:    Vs ← Vs ∪ {x_j^i}, x_j^i is the position of node v′
17:    update matrix M by Equation (14)
18:  end while
19:  get the value of C according to Equation (8)
20:  return C
In summary, based on the characteristics of multi-task assignment, this paper optimized the MCTS simulation phase. The results of multiple MCTS simulations were utilized to create an information repository, which guided the selection of actions during the simulation phase.
4. Experiments
In order to validate the TA-MCTS method, experiments and comparisons were performed. The experimental running environment was a 2.30 GHz CPU, 16.0 GB of RAM, and a 64-bit operating system.
4.1. Experimental Environment
The experimental area of this experiment was part of Hong Kong, as shown in Figure 6. In this paper, three square areas were selected in this region to simulate the experiment.
Figure 6. Selected research area.
Table 2 shows the size and the latitude and longitude of the lower-left corner of each scenario selected in this paper. Based on the location of the lower-left corner, this paper defines the local coordinates of the DC and UAV docks relative to the lower-left corner of the scenario. Moreover, a specific number of task locations were randomly generated in the region. With the expansion of scenarios and the increased number of tasks, UAVs need more energy to complete them all. Therefore, as the scenario broadens, this paper establishes additional UAV docks. The increase in the number of tasks and UAV docks directly adds to the uncertainty of task allocation, which can more effectively evaluate the applicability and effectiveness of the TA-MCTS method in different environments.
Table 2. Positions defined in different scenarios.
| Scenarios | Area Size (km²) | Latitude and Longitude in the Lower-Left Corner | Relative Location of DC (km) | Relative Location of UAV Docks (km) | Number of Tasks |
|---|---|---|---|---|---|
| Scenario A | (1 × 1 × 1) | 114.16885° E, 22.27325° N | (0, 0) | (0.5, 0.5) | 27 |
| Scenario B | (1.5 × 1.5 × 1.5) | 114.12077° E, 22.35433° N | (0.75, 0.75) | (0.75, 1.25); (0.25, 0.25); (1.25, 0.25) | 64 |
| Scenario C | (2 × 2 × 2) | 114.16395° E, 22.30251° N | (1, 1) | (0.6, 0.6); (1.3, 0.6); (0.6, 1.3); (1.3, 1.3) | 125 |
Combining the above settings, the locations of DC, UAV docks, and randomly generated tasks for the three scenarios are shown in Figure 7. The red point represents the DC, the blue point represents the UAV dock, and the green point represents the task location.
Figure 7. Position of distribution centers, UAV docks and tasks.
Scenarios A, B, and C cover small to large areas where the number of UAVs involved and their flight capabilities differ. Table 3 shows the parameter settings of the UAVs in the three scenarios: $m$ is the number of UAVs, $E_{max}$ is the maximum battery capacity, $P_{max}$ is the maximum number of items to be carried, $E$ represents the energy consumed by the UAVs per kilometer when no items are loaded, $\alpha$ is the number of loaded items, and $\Delta$ denotes the energy consumed by each item. In order to better conform to actual conditions, as the payload capacity of UAVs increases, the required battery capacity and the energy consumption per kilometer also increase accordingly. Moreover, variations in the energy parameters of UAVs directly affect the frequency of their visits to UAV docks.
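To make the link between these parameters and dock visits concrete, the short sketch below estimates the flight range on one battery using the values listed in Table 3. It assumes, purely for illustration, that the energy used per kilometer is $E + \alpha\Delta$ and that the load stays constant over the flight; the function names are hypothetical, and the energy model defined earlier in the paper remains the authoritative one.

```python
def energy_per_km(base_wh_per_km, items, per_item_wh_per_km):
    """Assumed linear model: E + alpha * Delta (Wh/km)."""
    return base_wh_per_km + items * per_item_wh_per_km


def range_km(battery_wh, base_wh_per_km, items, per_item_wh_per_km):
    """Distance covered on one full battery under the assumed model."""
    return battery_wh / energy_per_km(base_wh_per_km, items, per_item_wh_per_km)


# (E_max in Wh, E in Wh/km, Delta in Wh/km, P_max) taken from Table 3.
scenarios = {"A": (400, 80, 30, 3), "B": (500, 100, 50, 4), "C": (600, 120, 50, 5)}
for name, (e_max, e, delta, p_max) in scenarios.items():
    full = range_km(e_max, e, p_max, delta)   # carrying P_max items
    empty = range_km(e_max, e, 0, delta)      # carrying nothing
    print(f"Scenario {name}: {full:.2f} km fully loaded, {empty:.2f} km empty")
```

Under this assumed model, a fully loaded UAV covers roughly 2.35 km in Scenario A but only about 1.6 to 1.7 km in Scenarios B and C, which is in line with the remark that the energy parameters affect how often the UAV docks are visited.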
Table 3. UAV parameters in different scenarios.
| Parameters | Scenario A | Scenario B | Scenario C |
|---|---|---|---|
| Number of UAV docks | 1 | 3 | 4 |
| $m$ | 3 | 4 | 5 |
| $E_{max}$ | 400 Wh | 500 Wh | 600 Wh |
| $E$ | 80 Wh/km | 100 Wh/km | 120 Wh/km |
| $\Delta$ | 30 Wh/km | 50 Wh/km | 50 Wh/km |
| $P_{max}$ | 3 | 4 | 5 |
4.2. Task Grouping Comparison
In this paper,
that the number we first
of tasks experimented
within with the task was
each grouping grouping strategy. Task
consistent, thusgrouping
creating fav
aims to cluster tasks according to their geographic location, narrowing the search range of
conditions for subsequent task allocation.
task assignments for UAVs and improving the efficiency and accuracy of the assignment.
In Unlike
orderthe to traditional
verify the effectiveness
K-means, of theproposed
the UD-Kmeans UD-Kmeans method,
in this paper ensuredthisthatpaper
the con
group experiments
number of tasksfor all each
within tasks withinwas
grouping theconsistent,
three scenarios.
thus creatingBased on conditions
favorable the maximum
for subsequent task allocation.
UAVs and the number of UAVs, this paper determined that the number of groups
In order to verify the effectiveness of the UD-Kmeans method, this paper conducted
narios A, B, experiments
group and C was for3,all4, and
tasks 5, respectively,
within according
the three scenarios. Based on tothe Equation
maximum load (9). The
mental results are shown in Figure 8 and Table 4. Figure 8 indicates the grouping
of UAVs and the number of UAVs, this paper determined that the number of groups
in scenarios A, B, and C was 3, 4, and 5, respectively, according to Equation (9). The
with different colored dots, and Table 4 directly shows the number of elements i
experimental results are shown in Figure 8 and Table 4. Figure 8 indicates the grouping
group. results
UD-Kmeans made
with different the dots,
colored number of tasks
and Table in shows
4 directly each the
grouping
number ofresult
elementsconsistent
in
ever, the
eachK-means grouping
group. UD-Kmeans made results indicate
the number inconsistent
of tasks in each groupingtasks in consistent.
result each group. U
However, the K-means grouping results indicate inconsistent tasks in each group. Uneven
grouping results may lead to difficulties in subsequent task assignments. Therefor
grouping results may lead to difficulties in subsequent task assignments. Therefore, UD-
KmeansKmeans
contributes to to
contributes thethestability and
stability and reliability
reliability of subsequent
of subsequent multi-task multi-task
allocations. allocat
Figure 8. Comparison of K-Means vs. UD-Kmeans results.
Table 4. Comparison of K-Means and UD-Kmeans.
| Scenario | Group | K-Means | UD-Kmeans |
|---|---|---|---|
| Scenario A | Group1 | 9 | 9 |
| Scenario A | Group2 | 8 | 9 |
| Scenario A | Group3 | 10 | 9 |
| Scenario B | Group1 | 12 | 16 |
| Scenario B | Group2 | 20 | 16 |
| Scenario B | Group3 | 15 | 16 |
| Scenario B | Group4 | 17 | 16 |
| Scenario C | Group1 | 23 | 25 |
| Scenario C | Group2 | 25 | 25 |
| Scenario C | Group3 | 27 | 25 |
| Scenario C | Group4 | 33 | 25 |
| Scenario C | Group5 | 17 | 25 |
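The imbalance visible in Table 4 can be summarized with a simple spread measure, the difference between the largest and smallest group. The short Python sketch below uses the group sizes from Table 4; the helper name `size_spread` and the choice of this particular measure are assumptions made for illustration.

```python
def size_spread(sizes):
    """Difference between the largest and the smallest group size."""
    return max(sizes) - min(sizes)


# Group sizes copied from Table 4.
kmeans = {
    "A": [9, 8, 10],
    "B": [12, 20, 15, 17],
    "C": [23, 25, 27, 33, 17],
}
ud_kmeans = {
    "A": [9, 9, 9],
    "B": [16, 16, 16, 16],
    "C": [25, 25, 25, 25, 25],
}

for name in kmeans:
    print(name, size_spread(kmeans[name]), size_spread(ud_kmeans[name]))
```

The K-means spread grows from 2 tasks in Scenario A to 8 in Scenario B and 16 in Scenario C, while the UD-Kmeans spread is 0 in every scenario.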
4.3. MCTS Comparison

In order to validate the TA-MCTS, we conducted randomized experiments of TA-
MCTS with R-MCTS, G-MCTS, and RG-MCTS in three scenarios. For RG-MCTS, the
weights of random and greedy simulation strategies were set to 50%.
Compared with the original MCTS method, the TA-MCTS optimized the node selection and simulation strategies. Generally, MCTS builds a single tree for node selection and simulation. However, in the case of task allocation considering multi-UAV and energy, a single tree may result in too many nodes and a slow search. Therefore, this paper modified the three MCTS methods mentioned above to use multiple trees.
Due to the randomness of the methods, a single experiment could not fully reflect
the differences between different methods. In order to achieve a more precise comparison,
this paper executed 50, 100, 150, 200, 1000, and 2000 simulations for each method in each
scenario and then analyzed the results comprehensively.
Figure 9 shows the average energy consumption of different algorithms for multiple
experiments in different scenarios. Figure 9a–c represent the average energy consumption
under scenarios A, B, and C, respectively. The difference between the four methods was
relatively slight due to the small area covered by Scenario A. Figure 9b,c show that TA-
MCTS consumed less energy on average than the other three methods, regardless of the
number of experiments. Moreover, the results obtained by the TA-MCTS method were
more stable, while the other three methods were subject to fluctuations.
In order to show the effect of each method more clearly, this paper used violin plots to
show the characteristics of the data distribution of energy consumption for 2000 experi-
ments; the results are shown in Figures 10–12. As seen in Figure 10, due to the small area of
Scenario A, the UAVs carrying energy could cover almost the whole area, so the difference
in the data distribution obtained by the four methods was small. Figures 11 and 12 show
that in Scenarios B and C, the difference in the distribution of energy consumption among
the four methods was more evident due to the expansion of the area. In Scenario B, the data
concentration of TA-MCTS is better than the remaining three methods. The optimal and
worst energy consumption of TA-MCTS was also better than the other three methods. In
Scenario C, TA-MCTS indicated significantly lower energy consumption in the concentrated
area than RG-MCTS and G-MCTS. R-MCTS performed less well than the other algorithms,
and the distribution of results was relatively scattered due to its intrinsic randomness.
Figure 9. Comparison of average energy consumption. (a) The average energy consumption in scenario A. (b) The average energy consumption in scenario B. (c) The average energy consumption in scenario C.
Figure 10. Violin plot presenting the distribution of energy consumption in Scenario A.
Figure 11. Violin plot presenting the distribution of energy consumption in Scenario B.
Figure 12. Violin plot presenting the distribution of energy consumption in Scenario C.
Figure 13a–c show the iterative process for a sub-task group in scenarios A, B, and C, respectively. In the three scenarios, the number of iterations required to converge was approximately the same for all four methods. However, the TA-MCTS method proposed in this paper obtained better results after a similar number of iterations, proving the superiority of the TA-MCTS method.
Figure 13. Comparison of iteration numbers. (a) The iterative process for a sub-task group in scenario A. (b) The iterative process for a sub-task group in scenario B. (c) The iterative process for a sub-task group in scenario C.
To further illustrate the impact of different data in Tables 2 and 3 on the results, the average number of visits to UAV docks in different scenarios was counted in this paper, as shown in Tables 5–7. In Scenarios A and B, as described in Tables 2 and 3, the smaller area and the small number of missions undertaken by the UAVs meant that the UAVs did not need to visit the UAV docks as frequently to change batteries. Therefore, the four methods show fewer passes through the docks, but overall, the TA-MCTS led to fewer passes through the UAV docks than other methods. In Scenario C, as described in Tables 2 and 3, the increase in the number of missions and the size of the area resulted in the UAVs needing to visit the UAV docks more times to replenish their energy to complete the missions. Therefore, the data in Table 7 show that the average number of times the TA-MCTS method passes through the docks was better than the other algorithms.
Table 5. Average number of passes through the UAV docks in scenario A.
Number of Experiments   TA-MCTS   G-MCTS   R-MCTS   RG-MCTS
50                      0.18      0.14     0.36     0.22
100                     0.17      0.20     0.28     0.26
150                     0.26      0.17     0.15     0.22
200                     0.19      0.17     0.18     0.18
1000                    0.22      0.16     0.21     0.25
2000                    0.21      0.16     0.21     0.24
Table 6. Average number of passes through the UAV docks in scenario B.
Number of Experiments   TA-MCTS   G-MCTS   R-MCTS   RG-MCTS
50                      0.42      1.01     0.56     0.62
100                     0.46      1.02     0.64     0.64
150                     0.44      1.01     0.49     0.57
200                     0.49      0.96     0.58     0.57
1000                    0.47      1.01     0.52     0.66
2000                    0.44      0.99     0.55     0.67
Table 7. Average number of passes through the UAV docks in scenario C.
Number of Experiments   TA-MCTS   G-MCTS   R-MCTS   RG-MCTS
50                      15.54     17.44    19.66    15.84
100                     16.19     18.16    19.17    17.68
150                     16.34     18.59    19.61    16.78
200                     16.44     18.14    19.56    16.78
1000                    16.49     18.63    19.36    16.91
2000                    16.33     18.64    19.29    16.73
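As a rough illustration of the arithmetic behind the Scenario C comparison, the following
Python sketch computes the relative reduction in average dock passes for TA-MCTS against
each baseline, using the 2000-experiment row of Table 7. The function name
`relative_reduction` and the dictionary layout are illustrative assumptions and are not part
of the original experiments.

```python
# Average dock passes at 2000 experiments, copied from Table 7 (Scenario C).
table7_2000 = {
    "TA-MCTS": 16.33,
    "G-MCTS": 18.64,
    "R-MCTS": 19.29,
    "RG-MCTS": 16.73,
}


def relative_reduction(baseline: float, value: float) -> float:
    """Return the fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


# Reduction achieved by TA-MCTS with respect to each of the other methods.
reductions = {
    name: relative_reduction(passes, table7_2000["TA-MCTS"])
    for name, passes in table7_2000.items()
    if name != "TA-MCTS"
}

for name, reduction in reductions.items():
    print(f"TA-MCTS vs {name}: {reduction:.1%} fewer dock passes")
```

With these figures, TA-MCTS made about 12.4% fewer dock passes than G-MCTS, 15.3% fewer than
R-MCTS, and 2.4% fewer than RG-MCTS.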
Figure 14 illustrates the task allocation of TA-MCTS in three scenarios. Figure 14a–c
represents the results under scenarios A, B, and C, respectively. The dotted lines of
different colors represent different task groups, and the green dots represent the task
locations.
Figure 14. The result of task allocation in different scenarios. (a) The result of task
distribution in scenario A. (b) The result of task distribution in scenario B. (c) The result
of task distribution in scenario C.
After the above comparative analysis, the TA-MCTS proposed in this paper was shown to have
significant advantages, and it can reasonably accomplish multi-UAV task allocation. TA-MCTS
ensured the UAVs minimized energy consumption in accomplishing all the tasks, regardless of
the scenario. These experiments highlight the innovation and effectiveness of the TA-MCTS
method and provide a powerful solution for multi-UAV task allocation in urban logistics with
practical applications.
5. Conclusions
This paper investigated a multi-task assignment method for UAV logistics in urban
low-altitude environments. Building on traditional K-means, we proposed UD-Kmeans to realize
multi-task grouping and reduce the scope of multi-task allocation. On this basis, we proposed
TA-MCTS by improving the selection and simulation strategies of MCTS, using multiple
collaborative trees for decision evaluation to accomplish multi-task allocation effectively.
Table 4 and Figure 8 show that UD-Kmeans was stable in task grouping, keeping the number of
tasks in each group consistent. The advantages of TA-MCTS were illustrated through
comprehensive comparisons with other related MCTS methods: Tables 5–7 and Figures 9–13 show
that TA-MCTS performed well in terms of solution optimality. The main findings are as follows:
(1) UD-Kmeans ensures that the number of tasks in each group is balanced while
performing clustering, effectively avoiding uneven task allocation.
(2) In the TA-MCTS method, using SEOS, conflict-free node selection is achieved in the
selection phase. In the expansion phase, selective expansion is performed according to the
UAV’s energy situation to reduce the number of possible subsequent ineffective searches.
(3) TA-MCTS optimizes the simulation strategy by collecting the simulation results
of multiple UAVs and constructing a historical information repository. The historical
information is used to guide the subsequent simulation, thus improving the effectiveness
of the simulation.
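The energy-based selective expansion in finding (2) can be sketched as follows. This is a
minimal illustration under stated assumptions, not the authors' SEOS implementation: it
assumes energy use is proportional to straight-line distance, and it treats a task as
expandable only if the UAV can reach the task and then still reach the nearest UAV dock. The
names `feasible_expansions`, `energy_per_unit`, and the data layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def feasible_expansions(
    position: Point,
    energy: float,
    candidates: Dict[str, Point],
    docks: List[Point],
    energy_per_unit: float = 1.0,  # assumed linear energy model
) -> List[str]:
    """Return the candidate tasks worth expanding from the current node.

    A task is kept only if the remaining energy covers the flight to the
    task plus the flight from the task to the nearest dock (an assumption
    made for this sketch), so infeasible branches are never added.
    """
    feasible = []
    for task_id, task_pos in candidates.items():
        to_task = distance(position, task_pos) * energy_per_unit
        to_dock = min(distance(task_pos, d) for d in docks) * energy_per_unit
        if to_task + to_dock <= energy:
            feasible.append(task_id)
    return feasible


# Hypothetical example: one dock, two candidate tasks.
tasks = {"t1": (3.0, 4.0), "t2": (6.0, 8.0)}
docks = [(3.0, 0.0)]
print(feasible_expansions((0.0, 0.0), 10.0, tasks, docks))
```

In this example only `t1` is expanded, because reaching `t2` and then the dock would need more
energy than the UAV has left. Pruning such children at expansion time is one way to avoid the
ineffective searches mentioned above.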
In subsequent research, we plan to optimize these methods further and explore task
allocation across multiple scenarios. Moreover, as UAV technology develops, we plan to
conduct more extensive real-world research based on the performance of real UAVs.
Author Contributions: Conceptualization, Z.M. and J.C.; methodology, Z.M.; software, Z.M.;
validation, Z.M. and J.C.; formal analysis, Z.M.; investigation, Z.M.; resources, Z.M.; data
curation, Z.M.; writing—original draft preparation, Z.M.; writing—review and editing, Z.M.
and J.C.; visualization, Z.M.; supervision, Z.M. and J.C.; project administration, J.C.;
funding acquisition, J.C. All authors have read and agreed to the published version of the
manuscript.
Funding: This research was funded by the Fundamental Research Funds for the Central
Universities, China (Grant No. 2042022dx0001).
Data Availability Statement: Data is contained within the article.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Ahmed, F.; Mohanta, J.C.; Keshari, A.; Yadav, P.S. Recent Advances in Unmanned Aerial Vehicles: A Review. Arab. J. Sci. Eng.
2022, 47, 7963–7984. [CrossRef] [PubMed]
2. Siebert, S.; Teizer, J. Mobile 3D Mapping for Surveying Earthwork Projects Using an Unmanned Aerial Vehicle (UAV) System.
Autom. Constr. 2014, 41, 1–14. [CrossRef]
3. Kim, J.; Kim, S.; Ju, C.; Son, H.I. Unmanned Aerial Vehicles in Agriculture: A Review of Perspective of Platform, Control, and
Applications. IEEE Access 2019, 7, 105100–105115. [CrossRef]
4. Yeong, S.P.; King, L.M.; Dol, S.S. A Review on Marine Search and Rescue Operations Using Unmanned Aerial Vehicles. Int. J. Mar.
Environ. Sci. 2015, 9, 396–399. [CrossRef]
5. Coifman, B.; McCord, M.; Mishalani, R.G.; Redmill, K. Surface Transportation Surveillance from Unmanned Aerial Vehicles.
In Proceedings of the 83rd Annual Meeting of the Transportation Research Board, Washington, DC, USA, 11–15 January 2004;
Volume 28.
6. Eichleay, M.; Evens, E.; Stankevitz, K.; Parker, C. Using the Unmanned Aerial Vehicle Delivery Decision Tool to Consider
Transporting Medical Supplies via Drone. Glob. Health Sci. Pract. 2019, 7, 500–506. [CrossRef]
7. Hossein Motlagh, N.; Taleb, T.; Arouk, O. Low-Altitude Unmanned Aerial Vehicles-Based Internet of Things Services:
Comprehensive Survey and Future Perspectives. IEEE Internet Things J. 2016, 3, 899–922. [CrossRef]
8. Mehmood, Y.; Ahmad, F.; Yaqoob, I.; Adnane, A.; Imran, M.; Guizani, S. Internet-of-Things-Based Smart Cities: Recent Advances
and Challenges. IEEE Commun. Mag. 2017, 55, 16–24. [CrossRef]
9. Elloumi, M.; Dhaou, R.; Escrig, B.; Idoudi, H.; Saidane, L.A. Monitoring Road Traffic with a UAV-Based System. In Proceedings
of the 2018 IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain, 15–18 April 2018; pp. 1–6.
[CrossRef]
10. Carlsson, J.G.; Song, S. Coordinated Logistics with a Truck and a Drone. Manag. Sci. 2018, 64, 4052–4069. [CrossRef]
11. Pan, J.S.; Song, P.C.; Chu, S.C.; Peng, Y.J. Improved Compact Cuckoo Search Algorithm Applied to Location of Drone Logistics
Hub. Mathematics 2020, 8, 333. [CrossRef]
12. Pollet, B.G.; Staffell, I.; Shang, J.L. Current Status of Hybrid, Battery and Fuel Cell Electric Vehicles: From Electrochemistry to
Market Prospects. Electrochim. Acta 2012, 84, 235–249. [CrossRef]
13. Mitchell, S.; Steinbach, J.; Flanagan, T.; Ghabezi, P.; Harrison, N.; O’Reilly, S.; Killian, S.; Finnegan, W. Evaluating the Sustainability
of Lightweight Drones for Delivery: Towards a Suitable Methodology for Assessment. Funct. Compos. Mater. 2023, 4, 4. [CrossRef]
14. Stolaroff, J.K.; Samaras, C.; O’Neill, E.R.; Lubers, A.; Mitchell, A.S.; Ceperley, D. Energy Use and Life Cycle Greenhouse Gas
Emissions of Drones for Commercial Package Delivery. Nat. Commun. 2018, 9, 409. [CrossRef] [PubMed]
15. Kang, P.; Song, G.; Xu, M.; Miller, T.R.; Wang, H.; Zhang, H.; Liu, G.; Zhou, Y.; Ren, J.; Zhong, R.; et al. Low-Carbon Pathways for
the Booming Express Delivery Sector in China. Nat. Commun. 2021, 12, 450. [CrossRef] [PubMed]
16. Goodchild, A.; Toy, J. Delivery by Drone: An Evaluation of Unmanned Aerial Vehicle Technology in Reducing CO2 Emissions in
the Delivery Service Industry. Transp. Res. Part D Transp. Environ. 2018, 61, 58–67. [CrossRef]
17. Djimantoro, M.I.; Suhardjanto, G. The Advantage by Using Low-Altitude UAV for Sustainable Urban Development Control. In
IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2018; Volume 109. [CrossRef]
18. Singireddy, S.R.R.; Daim, T.U. Technology Roadmap: Drone Delivery—Amazon Prime Air. Innov. Technol. Knowl. Manag. 2018,
387–412. [CrossRef]
19. Hwang, J.; Choe, J.Y. (Jacey). Exploring Perceived Risk in Building Successful Drone Food Delivery Services. Int. J. Contemp. Hosp.
Manag. 2019, 31, 3249–3269. [CrossRef]
20. Scott, J.; Scott, C. Drone Delivery Models for Healthcare. In Proceedings of the 50th Hawaii International Conference on System
Sciences, Hilton Waikoloa Village, HI, USA, 4–7 January 2017.
21. Yoo, W.; Yu, E.; Jung, J. Drone Delivery: Factors Affecting the Public’s Attitude and Intention to Adopt. Telemat. Inform. 2018, 35,
1687–1700. [CrossRef]
22. De Silva, S.C.; Phlernjai, M.; Rianmora, S.; Ratsamee, P. Inverted Docking Station: A Conceptual Design for a Battery-Swapping
Platform for Quadrotor UAVs. Drones 2022, 6, 56. [CrossRef]
23. Grlj, C.G.; Krznar, N.; Pranjić, M. A Decade of UAV Docking Stations: A Brief Overview of Mobile and Fixed Landing Platforms.
Drones 2022, 6, 17. [CrossRef]
24. Bláha, L.; Severa, O.; Goubej, M.; Myslivec, T.; Reitinger, J. Automated Drone Battery Management System—Droneport: Technical
Overview. Drones 2023, 7, 234. [CrossRef]
25. Oh, G.; Kim, Y.; Ahn, J.; Choi, H.-L. Task Allocation of Multiple UAVs for Cooperative Parcel Delivery. In Advances in Aerospace
Guidance, Navigation and Control; Springer: Berlin/Heidelberg, Germany, 2018; pp. 443–454. [CrossRef]
26. Kuhn, H.W. The Hungarian Method for the Assignment Problem. Nav. Res. Logist. Q. 1955, 2, 83–97. [CrossRef]
27. Madridano, Á.; Al-Kaff, A.; Martín, D.; de la Escalera, A. 3D Trajectory Planning Method for UAVs Swarm in Building Emergencies.
Sensors 2020, 20, 642. [CrossRef] [PubMed]
28. Bellingham, J.; Tillerson, M.; Richards, A.; How, J.P. Multi-Task Allocation and Path Planning for Cooperating UAVs. In Cooperative
Control: Models, Applications and Algorithms; Springer: Berlin/Heidelberg, Germany, 2003; pp. 23–41. [CrossRef]
29. Driess, D.; Oguz, O.; Ha, J.S.; Toussaint, M. Deep Visual Heuristics: Learning Feasibility of Mixed-Integer Programs for
Manipulation Planning. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris,
France, 31 May–31 August 2020; pp. 9563–9569. [CrossRef]
30. Weckenborg, C.; Kieckhäfer, K.; Müller, C.; Grunewald, M.; Spengler, T.S. Balancing of Assembly Lines with Collaborative Robots.
Bus. Res. 2020, 13, 93–132. [CrossRef]
31. Seenu, N.; Kuppan Chetty, R.M.; Ramya, M.M.; Janardhanan, M.N. Review on State-of-the-Art Dynamic Task Allocation Strategies
for Multiple-Robot Systems. Ind. Rob. 2020, 47, 929–942. [CrossRef]
32. Liu, C.; Kroll, A. A Centralized Multi-Robot Task Allocation for Industrial Plant Inspection by Using A* and Genetic Algorithms.
In International Conference on Artificial Intelligence and Soft Computing; Springer: Berlin/Heidelberg, Germany, 2012; Volume 7268,
pp. 466–474. [CrossRef]
33. Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial
Intelligence; MIT Press: Cambridge, MA, USA, 1992; ISBN 0262581116.
34. Martin, J.G.; Frejo, J.R.D.; García, R.A.; Camacho, E.F. Multi-Robot Task Allocation Problem with Multiple Nonlinear Criteria
Using Branch and Bound and Genetic Algorithms. Intell. Serv. Robot. 2021, 14, 707–727. [CrossRef]
35. Wei, C.; Ji, Z.; Cai, B. Particle Swarm Optimization for Cooperative Multi-Robot Task Allocation: A Multi-Objective Approach.
IEEE Robot. Autom. Lett. 2020, 5, 2530–2537. [CrossRef]
36. Mouradian, C.; Sahoo, J.; Glitho, R.H.; Morrow, M.J.; Polakos, P.A. A Coalition Formation Algorithm for Multi-Robot Task
Allocation in Large-Scale Natural Disasters. In Proceedings of the 2017 13th International Wireless Communications and Mobile
Computing Conference (IWCMC), Valencia, Spain, 26–30 June 2017; pp. 1909–1914. [CrossRef]
37. Puttewar, A.S.; Chatpalliwar, A.S. An Overview of Ant Colony Optimization (ACO) for Multiple-Robot Task Allocation (MRTA).
Res. J. Eng. Technol. 2013, 4, 107–112.
38. Bertsekas, D.P. Auction Algorithms. Encycl. Optim. 2009, 1, 73–77.
39. Choi, H.L.; Brunet, L.; How, J.P. Consensus-Based Decentralized Auctions for Robust Task Allocation. IEEE Trans. Robot. 2009, 25,
912–926. [CrossRef]
40. Cheng, Q.; Yin, D.; Yang, J.; Shen, L. An Auction-Based Multiple Constraints Task Allocation Algorithm for Multi-UAV System. In
Proceedings of the 2016 International Conference on Cybernetics, Robotics and Control (CRC), Hong Kong, China, 19–21 August
2016; IEEE: New York, NY, USA, 2016; pp. 1–5.
41. Otte, M.; Kuhlman, M.J.; Sofge, D. Auctions for Multi-Robot Task Allocation in Communication Limited Environments. Auton.
Robots 2020, 44, 547–584. [CrossRef]
42. Bai, X.; Fielbaum, A.; Kronmuller, M.; Knoedler, L.; Alonso-Mora, J. Group-Based Distributed Auction Algorithms for Multi-Robot
Task Assignment. IEEE Trans. Autom. Sci. Eng. 2023, 20, 1292–1303. [CrossRef]
43. Coulom, R. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. In International Conference on Computers and
Games; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4630, pp. 72–83. [CrossRef]
44. Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam,
V.; Lanctot, M.; et al. Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature 2016, 529, 484–489.
[CrossRef] [PubMed]
45. Senington, R.; Schmidt, B.; Syberfeldt, A. Monte Carlo Tree Search for Online Decision Making in Smart Industrial Production.
Comput. Ind. 2021, 128, 103433. [CrossRef]
46. Qi, H.; Hu, X. Monte Carlo Tree Search-Based Intersection Signal Optimization Model with Channelized Section Spillover. Transp.
Res. Part C Emerg. Technol. 2019, 106, 281–302. [CrossRef]
47. Mo, S.; Pei, X.; Wu, C. Safe Reinforcement Learning for Autonomous Vehicle Using Monte Carlo Tree Search. IEEE Trans. Intell.
Transp. Syst. 2022, 23, 6766–6773. [CrossRef]
48. Qian, Y.; Sheng, K.; Ma, C.; Li, J.; Ding, M.; Hassan, M. Path Planning for the Dynamic UAV-Aided Wireless Systems Using Monte
Carlo Tree Search. IEEE Trans. Veh. Technol. 2022, 71, 6716–6721. [CrossRef]
49. Grelier, C.; Goudet, O.; Hao, J.K. On Monte Carlo Tree Search for Weighted Vertex Coloring. In European Conference on Evolutionary
Computation in Combinatorial Optimization (Part of EvoStar); Springer: Berlin/Heidelberg, Germany, 2022; Volume 13222, pp. 1–16.
[CrossRef]
50. Browne, C.B.; Powley, E.; Whitehouse, D.; Lucas, S.M.; Cowling, P.I.; Rohlfshagen, P.; Tavener, S.; Perez, D.; Samothrakis, S.;
Colton, S. A Survey of Monte Carlo Tree Search Methods. IEEE Trans. Comput. Intell. AI Games 2012, 4, 1–43. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.

Drones 2023, 7, 679. https://doi.org/10.3390/drones7110679 https://www.mdpi.com/journal/drones
