1 Faculty of Industrial Engineering and Management, Technion, Haifa 3200003, Israel; email: karpase@technion.ac.il
2 Department of Informatics, King’s College London, London WC2B 4BG, United Kingdom; email: daniele.magazzeni@kcl.ac.uk
Review in Advance first posted on
October 14, 2019. (Changes may
still occur before final publication.)
1. INTRODUCTION
Modern robots and the software tools for controlling them are becoming increasingly capable
of performing “basic” tasks such as localization, navigation, and motion planning. While these
tasks are challenging in and of themselves and remain the focus of significant and interesting
research, we are finally at a point where it is possible to intelligently combine these capabilities
in order to achieve a high-level goal. This is the focus of this article, in which we review some
of the work on applying automated planning (1) to intelligently control one or more robots. A
recent survey provides a comprehensive overview of deliberation for autonomous robots (2); rather
than repeating this survey, our objective here is to provide a broad understanding of planning for
robotics and, specifically, how to endow a robot with planning capabilities.
The word planning is used in different ways in robotics, meaning different activities with differ-
ent goals—for example, motion planning (e.g., 3), path planning (e.g., 4), and kinematic planning
(e.g., 5). As mentioned above, this article focuses on task planning, often called automated planning (the term we use throughout this article) or AI planning. Automated planning is often used in combination with other types of planning in integrated solutions.
Annu. Rev. Control Robot. Auton. Syst. 2020.3. Downloaded from www.annualreviews.org
Access provided by Syracuse University Library on 10/21/19. For personal use only.
Automated planning has been used in several real-world robotics scenarios, including space
exploration (e.g., 6), logistics industrial robotics (e.g., 7), and the oil and gas industry (e.g., 8).
Work presented at the International Conference on Automated Planning and Scheduling (ICAPS)
Workshop on Planning and Robotics (PlanRob) and Workshop on Scheduling and Planning Ap-
plications (SPARK) has covered many other applications of planning.
Throughout the article, we use the RoboCup Logistics League (RCLL) (9) as an illustrative ex-
ample. The RCLL is part of the RoboCup competition and is designed to simulate manufacturing
operations in a factory. Each game is played between two teams, each of which controls three mo-
bile robots and six different production machines. The robots can move workpieces between the
production machines, and each machine can perform one of several processing operations on the
workpieces. In the game, orders for workpieces with different configurations arrive on the fly and
are associated with a time window in which the order must be completed. In addition to not crash-
ing into the other team’s robots, each team must solve a planning problem in which it controls its
robots and machines and determines which actions to perform to fulfill orders in a timely manner.
1.1.2. The learning-based approach. In the learning-based approach, the agent (or robot)
learns the correct behavior from experience—either its own (as in reinforcement learning) or an
expert’s (as in imitation learning). The learning-based approach can be (and has been) used to
train robots to perform basic tasks, for which it would also be very difficult to formulate a model.
However, reinforcement learning approaches typically do not offer any guarantees on when an
optimal policy has been found or on the safety of the robot when it is still exploring. In addition,
learning a complex sequence of tasks is not as easy as learning single tasks. Nevertheless, it is often very useful to learn how to perform basic actions and then combine these actions using planning.
In the context of the RCLL scenario, it is conceivable that reinforcement learning will converge
on an optimal policy. However, the large number of possible states in this problem suggests that,
without some generalization, convergence will take a long time to achieve.
1.1.3. The model-based approach. In the model-based approach, we assume that we have
some model of the world, and the agent uses the model to reason about possible courses of action
and how they will affect the state of the world. Automated planning is the model-based approach
to autonomous behavior and is discussed throughout the rest of the article.
In the context of the RCLL scenario, it is possible to quickly synthesize a course of action
that completes all outstanding orders, but finding an optimal solution still requires a significant
computational effort.
Sense: The agent obtains some observations (percepts) from the environment using its
sensors.
Think: The agent thinks about what to do next and chooses the next action to execute.
Act: The agent performs the chosen action, thereby modifying the environment, and the
loop repeats.
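The fixed-policy version of this loop can be sketched in code. The following is a minimal illustration; the environment, policy, and action names are invented for the example and do not correspond to any particular robot framework.

```python
# Minimal sketch of the sense-think-act loop with a fixed policy.
# All names here are illustrative.

def run_agent(policy, env, steps):
    """Run the sense-think-act loop for a fixed number of steps."""
    trace = []
    for _ in range(steps):
        percept = env.sense()      # Sense: obtain an observation
        action = policy(percept)   # Think: choose the next action
        env.act(action)            # Act: modify the environment
        trace.append(action)
    return trace

class ToyEnv:
    """A toy 1-D environment: the robot occupies an integer position."""
    def __init__(self, pos):
        self.pos = pos
    def sense(self):
        return self.pos
    def act(self, action):
        self.pos += {"left": -1, "right": 1, "stay": 0}[action]

# A fixed policy: move toward position 0, then stay.
def policy(pos):
    if pos > 0:
        return "left"
    if pos < 0:
        return "right"
    return "stay"

env = ToyEnv(pos=3)
trace = run_agent(policy, env, steps=5)
print(trace)   # moves left three times, then stays
```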
If the agent’s decision-making is based on a fixed policy (which can be programmed manually
or learned), then this loop elegantly describes the agent’s operation. However, if we want to use an
automated planner in our decision-making, then the simple sense–think–act loop does not work:
Figure 1
Illustration of two agent architectures. (a) The sense–think–act loop, in which the agent obtains observations
of its environment, thinks about what to do next, and executes the chosen action, after which the sequence
repeats. (b) A planning-based agent architecture, in which the decision-making is split between executive and
planner components.
Planning is typically slow (usually measured in seconds, not milliseconds), and thus we cannot call
a planner at the rate of a typical robotic control loop (which typically runs at least once per second,
and often much faster). However, a planner will generate a long-term plan that will ultimately lead
to the goal.
Thus, in a model-based agent that uses a planner (which we refer to as a planning-based agent),
the decision-making is typically split among two different components: the executive and the plan-
ner (as illustrated in Figure 1b). The executive is responsible for interacting with the environment,
executing actions from the plan as appropriate, and receiving observations while monitoring plan
execution to make sure the plan is still valid. The executive also interacts with the planner by
calling for replanning (or plan repair) when the current plan becomes invalid (perhaps because
an action failed or there has been some unforeseen disturbance to the state of the world) and
receiving the plan for execution from the planner when it finishes planning.
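The executive's monitor-and-replan behavior can be sketched as follows. The plan_fn and apply_fn interfaces and the toy counter domain below are illustrative assumptions, not an actual robot interface; real executives interleave execution with observation asynchronously.

```python
# A schematic executive that wraps a planner in a replanning loop.

def execute_with_replanning(plan_fn, apply_fn, state, goal, max_replans=10):
    """Execute plans from plan_fn, replanning when an action's observed
    outcome differs from the expected one."""
    replans = 0
    while not goal(state) and replans <= max_replans:
        plan = plan_fn(state)                 # ask the planner for a plan
        for action, expected in plan:
            state = apply_fn(state, action)   # execute; observe the new state
            if state != expected:             # monitor: plan still valid?
                replans += 1
                break                         # no: trigger replanning
        else:
            break                             # plan completed as expected
    return state, replans

# Toy domain: the state is a counter, the goal is to reach 3, and "inc"
# normally adds 1. One unforeseen disturbance knocks the state back once.
disturbed = {"done": False}
def apply_fn(s, a):
    if s == 1 and not disturbed["done"]:
        disturbed["done"] = True
        return 0          # unforeseen disturbance
    return s + 1

def plan_fn(s):
    return [("inc", v) for v in range(s + 1, 4)]

final, replans = execute_with_replanning(plan_fn, apply_fn, 0, lambda s: s == 3)
print(final, replans)   # 3 1
```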
The above description of how a planning-based agent works is quite vague. In the rest of this
article, we review different types of models (formalisms) and discuss when and under what as-
sumptions they are applicable to robotics. For each such formalism, we also discuss how to execute
such plans. We begin by discussing the simplest type of planning formalism—classical planning.
Both STRIPS and SAS are grounded representations, using a flat (or propositional) representation of the world. However, planning problems could be very similar to each other—for example,
two different games of the RCLL scenario would be very similar to each other, varying only in
the specific positions of the robots and machines and in the orders. We say that these problems
come from the same domain.
The Planning Domain Definition Language (PDDL) (15) was created to facilitate comparing
the performance of different planners in the context of the International Planning Competition
(http://www.icaps-conference.org/index.php/Main/Competitions) and is used to represent
planning problems using a lifted (first-order logic) formalism. PDDL splits the definition of a
planning problem into two parts: the domain and the problem. The domain captures the parts of
the problem that are supposed to be common to all problems, while the problem file describes the
parts that are specific to each problem. PDDL is a rich language, and here we describe only its
core component; for more information, we refer the reader to a recent book dedicated to PDDL
(16).
A PDDL domain describes three things:
The set of possible types of objects. For example, in the RCLL, the types of objects would
be robot, machine, workpiece, etc.
The set of possible predicates, which describe the possible relations between objects of
the different types. For example, in the RCLL, the predicate at (?r, ?m) would indicate
whether robot ?r is at machine ?m. Each predicate is then grounded into several facts, e.g.,
at (r1, m1), at (r2, m1), . . . , each of which represents a specific fact that could be true or false.
The set of schematic operators, which describe the possible actions. For example, the op-
erator move(?r, ?m1, ?m2) would describe the action of robot ?r moving from machine ?m1
to machine ?m2. Again, each operator can be grounded into several grounded actions, e.g.,
move(r3, m2, m7), indicating that robot r3 moves from machine m2 to machine m7.
While the same domain description can be used in different problems, the problem definition
describes the parts that are problem specific:
The objects in the world. For example, in the RCLL, the objects would be robots r1–r3,
machines m1–m6, etc. Grounding predicates and operators is done with respect to the set
of objects in the problem description.
The initial state, which is given as a list of all the true grounded propositions.
The goal, which is given as a logical formula over the grounded propositions, describing the
desired condition goal states must satisfy.
Coming back to our example, in the PDDL domain we would declare that there are two types
of objects, robots and machines:
(:types robot machine)
(:predicates
(at ?r - robot ?m - machine)
)
(:action move
:parameters (?r - robot ?m1 - machine ?m2 - machine)
:precondition (at ?r ?m1)
:effect (and (not (at ?r ?m1)) (at ?r ?m2))
)
In the problem file, we would then declare the objects:
(:objects r1 - robot
m1 m2 m3 m4 m5 m6 - machine)
as well as the initial state and the goal, which we omit here for the sake of brevity.
There are several key differences between PDDL and STRIPS or SAS. First, unlike STRIPS
and SAS, PDDL can be represented in text files. Nevertheless, grounded PDDL and STRIPS are
practically the same, as PDDL was inspired by the STRIPS formalism. This highlights a funda-
mental difference between the roles these formalisms typically serve: PDDL is used mainly as a
representation formalism and was designed to make modeling problems (relatively) easy. On the
other hand, PDDL is more difficult to reason with computationally, as it requires first-order logic
queries. Thus, most modern planners translate their PDDL input into grounded STRIPS or SAS
(e.g., 17), and the planner reasons internally using the grounded representation.
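This grounding step can be illustrated with a short sketch. The object names follow the RCLL running example; unlike real grounders, this simplification does not prune useless instantiations such as move(r1, m1, m1).

```python
# Illustrative sketch of grounding a lifted operator over typed objects,
# as planners do when translating PDDL into STRIPS or SAS.
from itertools import product

objects = {
    "robot": ["r1", "r2", "r3"],
    "machine": ["m1", "m2", "m3"],
}

# The move operator from the running example: parameter names and types.
move_params = [("?r", "robot"), ("?m1", "machine"), ("?m2", "machine")]

def ground(name, params, objects):
    """Enumerate all ground actions of a lifted operator."""
    domains = [objects[t] for _, t in params]
    for binding in product(*domains):
        yield f"{name}({', '.join(binding)})"

ground_moves = list(ground("move", move_params, objects))
print(len(ground_moves))   # 27: 3 robots x 3 machines x 3 machines
print(ground_moves[0])     # move(r1, m1, m1)
```

The combinatorial blow-up visible even in this tiny example (27 ground actions from one operator) is why grounding, and the pruning of unreachable ground actions, is a substantial component of modern planners.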
Of course, having a set of such reliable behaviors is not enough. We must also be able to abstract
the state of the world (or the belief state of the robot) into something that can be represented using
symbolic logic. In the example above, the classical planner does not need to care about the exact
(continuous) position of the robot; rather, it can plan only the movement between the different
machines. Recent work has shown that, given a set of behaviors, it is possible to learn how to
abstract the world into a symbolic description, which a classical planner can later use to plan over
these behaviors (18).
Another important assumption is that a lot of the uncertainty can be absorbed by the controller
responsible for executing each behavior. This would allow us to treat each behavior as having a
deterministic effect and sequence high-level actions using classical planning. If this is the case, then
the executive can wrap the planner in a replanning loop and call the planner whenever an action
has some unexpected outcome (e.g., 20). On the other hand, some problems are probabilistically
interesting (21), meaning that the determinize-and-replan approach described above could fail. In
such cases, a better approach would be to formulate the planning problem as stochastic planning
(22), obtaining a solution that is aware of the inherent uncertainty in the world (we expand on
stochastic planning in Section 4). For example, if there are two alternative plans, one expensive
but safe and the other potentially cheap but risky, the determinize-and-replan approach would
attempt to follow the cheap and risky option (as it is blind to the risk), while the stochastic planning
approach would trade off cost versus risk and make the choice that is better suited to the defined
utility of the agent.
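The trade-off can be made concrete with a toy calculation; all costs and probabilities here are invented for illustration.

```python
# A toy cost/risk trade-off: the determinize-and-replan view compares only
# nominal costs, while a stochastic planner compares expected costs.

safe_cost = 10.0          # expensive but always succeeds

risky_nominal = 4.0       # cheap if it works...
p_fail = 0.4
recovery_cost = 20.0      # ...but failure forces a costly recovery

risky_expected = risky_nominal + p_fail * recovery_cost

print(risky_nominal < safe_cost)    # True: the determinized view picks risky
print(risky_expected)               # 12.0
print(risky_expected > safe_cost)   # True: the expected-cost view picks safe
```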
Finally, the PLANEX (plan execution) layer implemented the executive, which was responsible
for calling the planner when needed and monitoring the execution of the current plan. It also
contained some learning and plan reuse capabilities to potentially speed up calls to the planner
with similar planning problems.
Shakey was a groundbreaking project in its time. A recent effort to reproduce Shakey using modern tools required only several months of work rather than years, demonstrating the significant advances in planning and robotics in the more than 50 years since the Shakey project was started (24).
variables and where actions have a duration. PDDL 2.1 (25) is the extension of PDDL for modeling temporal numeric planning. In this setting, a plan is no longer a sequence of actions; instead,
each action in the plan is associated with a starting time point and a duration. On the one hand,
temporal planning allows the control of robots when faced with deadlines (e.g., one task must be
completed by 6 PM) or time windows (e.g., the charging station is available only between 2 PM
and 4 PM). In addition, temporal planning can be used to coordinate concurrent activities.
It is important to note the difference between temporal planning and scheduling, as they might
look similar. Scheduling is the problem of deciding when to do things, under the assumption that
the things to be done (actions) are already defined. This is not the case for planning, where the
problem is to decide what to do and when to do it. The combination of numeric and temporal rea-
soning allows modeling of numeric change over time, which is particularly useful when resources
have to be managed (e.g., the battery of the robot). This modeling can be done using durative
actions such as the following:
(:durative-action charge
:parameters (?r - robot ?wp - waypoint)
:duration (>= ?duration 0)
:condition (and
(over all (charge_available_at ?wp))
(over all (robot_at ?r ?wp))
(over all (<= (battery_charge ?r) (battery_capacity ?r)))
(at start (undocked ?r)))
:effect (and
(at start (docked ?r)) (at start (not (undocked ?r)))
(at end (undocked ?r))
(increase (battery_charge ?r) (* #t (charging_rate)))
))
This action models the robot recharging its battery. Note that the duration for this action is
not fixed; rather, it is left to the planner to decide for how long to recharge the battery (>=
?duration 0). In fact, the battery charge continuously increases while the action is being
executed (we discuss continuous change in more detail in Section 3.2). One of the conditions for
applying this action is that the robot must be at a location where the charging station is available for
the whole duration of the action (which is modeled using the over all condition that works as an
invariant).
Durative actions can be used nicely with time windows to model scenarios in which some
activities must be performed within specific time bounds. For example, let us assume there is a charging station at location wp19, but it is available for use only between 2 PM and 5 PM (i.e., between times 14.00 and 17.00). It is then possible to specify this constraint in the initial state using timed initial literals as follows:
(at 14.00 (charge_available_at wp19))
(at 17.00 (not (charge_available_at wp19)))
Note that, because the durative action charge contains an over all condition requiring charge_available_at wp19 to be true for the duration of the action, the planner is forced to put the charge action in a valid time frame. For example:
14.00: (goto HST wp4 wp19) [0.34]
14.35: (charge HRS wp19) [1.20]
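The check the planner performs implicitly can be sketched as a simple interval test. This is an illustrative simplification that treats the over all condition as a single availability window.

```python
# Checking that a durative action's execution interval lies within the
# window in which its over-all condition holds (illustrative sketch).

def within_window(start, duration, window):
    """True if [start, start + duration] lies inside the window in which
    the over-all condition (e.g., charge_available_at wp19) holds."""
    w_open, w_close = window
    return w_open <= start and start + duration <= w_close

window = (14.00, 17.00)   # charge_available_at wp19 holds in this window

print(within_window(14.35, 1.20, window))   # True: the plan above is valid
print(within_window(16.50, 1.20, window))   # False: would overrun the window
```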
Timeline-based planning (e.g., see 26–28) is an alternative way of modeling temporal problems
in which the core idea is the use of timelines. A timeline represents the evolution of a state variable
over time, and constraints over the timeline are used to model actions as well as initial and goal
conditions. Timeline-based planning is particularly well suited to problems that have a strong
scheduling component. There are several languages for modeling problems using timelines (e.g.,
see 29, 30). In particular, the Action Notation Modeling Language (ANML) was designed to create
a unifying modeling language for timeline-based planning (31).
An STN plan represents several (often infinitely many) valid executions of the plan, encoded in all the possible satisfying assignments (called consistent assignments) of the STN.
Definition 2 (consistency, from Reference 36). An STN is consistent if there exists an assignment μ : T → R such that each constraint c ∈ C is satisfied by substituting each time point t appearing in c with μ(t). Such an assignment μ is called a consistent assignment.
To use STNs to encode temporal plans, time points are used to represent events, such as the
start and end of an action, while the temporal constraints encode the minimum and maximum
time that can pass between the events. The executor can use this structure to adjust the timing
of actions within these bounds and allow some flexibility in the execution. Planners such as the
Extensible Universal Remote Operations Planning Architecture (EUROPA) (39), IxTeT (26), and
the Advanced Planning and Scheduling Initiative (APSI) planner (40) can directly produce plans
in STN form.
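The consistency check of Definition 2 reduces to detecting negative cycles in the STN's distance graph. The following is a minimal sketch using Floyd-Warshall; real planners use incremental algorithms, but the principle is the same.

```python
# Sketch of STN consistency checking: build the distance graph and run
# Floyd-Warshall; the STN is consistent iff there is no negative cycle.
INF = float("inf")

def stn_consistent(num_points, constraints):
    """constraints: list of (i, j, k) meaning t_j - t_i <= k."""
    d = [[0 if a == b else INF for b in range(num_points)]
         for a in range(num_points)]
    for i, j, k in constraints:
        d[i][j] = min(d[i][j], k)
    for m in range(num_points):
        for a in range(num_points):
            for b in range(num_points):
                if d[a][m] + d[m][b] < d[a][b]:
                    d[a][b] = d[a][m] + d[m][b]
    # A negative self-distance means a negative cycle: inconsistent.
    return all(d[a][a] >= 0 for a in range(num_points))

# Action start t0 and end t1 with duration in [2, 5]:
# t1 - t0 <= 5 and t0 - t1 <= -2.
print(stn_consistent(2, [(0, 1, 5), (1, 0, -2)]))   # True
# Contradictory bounds: duration both <= 1 and >= 2.
print(stn_consistent(2, [(0, 1, 1), (1, 0, -2)]))   # False
```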
One key underlying assumption of the STN framework is that the executor can freely choose
among all the time points to satisfy the constraints. Sometimes, however, the executor is simply
unable to control the duration of an activity or the execution time of an event. In those cases, the
executor can only observe some of the events while deciding when to schedule the others in order
to successfully execute the plan. These situations are covered by using an STNU, an extension of
the STN formalism to represent and reason about temporal knowledge in the presence of tempo-
ral uncertainty. Analogously to STNs, all temporal constraints in an STNU are binary and convex.
Definition 3 (simple temporal network with uncertainty, from Reference 41). An STNU is a tuple ⟨T, C, L⟩, where (a) T is a set of real-valued variables, called time points, that is partitioned into the sets Tc and Tu of controllable and uncontrollable time points, respectively; (b) C is a set of binary constraints, each of the form t1 − t2 ≤ k, for some t1, t2 ∈ T and k ∈ R; and (c) L is a set of contingent links, each of the form ⟨b, l, u, e⟩, where b ∈ Tc, e ∈ Tu, and l, u ∈ R with 0 < l < u.
A contingent link ⟨b, l, u, e⟩ represents a temporal interval from b to e whose duration is uncontrollable but bounded by e − b ∈ [l, u]; b is called the activation time point of the uncontrollable time point e.
In contrast to an STN, the notion of consistency does not apply in an STNU, as the uncon-
trollable time points cannot be scheduled by the executive. Three different kinds of controllability
have been presented in the literature: strong, weak, and dynamic (37).
An STNU is strongly controllable if there is a solution that consists of a fixed, unconditioned,
nonreactive assignment to each controllable time point that is guaranteed to satisfy the constraints
in the network regardless of how the uncontrollable time points turn out while respecting the
contingent links. Such a solution corresponds to a time-triggered plan, in which activities are
started at fixed times that are determined in advance of execution. Intuitively, strong controllability
captures the blind executor—that is, the executor is not allowed to depend on the observed timings
in any way (42).
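A naive way to test whether a fixed, time-triggered schedule witnesses strong controllability is to check the constraints under every extreme combination of contingent durations; for binary difference constraints, checking these corners suffices. Dedicated STNU algorithms do this in polynomial time, but the brute-force sketch below (with invented time-point names) illustrates the definition.

```python
# Naive check of whether a fixed (time-triggered) schedule satisfies the
# constraints for every extreme combination of contingent durations.
from itertools import product

def strongly_satisfies(schedule, contingents, constraints):
    """schedule: fixed times for the controllable points (name -> time).
    contingents: dict end_point -> (activation_point, low, high).
    constraints: list of (p1, p2, k) meaning time(p1) - time(p2) <= k."""
    ends = list(contingents)
    corners = product(*[(contingents[e][1], contingents[e][2]) for e in ends])
    for corner in corners:
        times = dict(schedule)
        for e, dur in zip(ends, corner):
            b, _, _ = contingents[e]
            times[e] = times[b] + dur   # an uncontrollable outcome
        if any(times[p1] - times[p2] > k for p1, p2, k in constraints):
            return False
    return True

# Controllable start A at t=0; contingent end E occurs 1-3 after A;
# controllable B must start after E (E - B <= 0) and by t=4 (B - A <= 4).
schedule = {"A": 0, "B": 3}
contingents = {"E": ("A", 1, 3)}
constraints = [("E", "B", 0), ("B", "A", 4)]
print(strongly_satisfies(schedule, contingents, constraints))            # True
print(strongly_satisfies({"A": 0, "B": 2}, contingents, constraints))   # False
```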
An STNU is weakly controllable if there is a solution strategy that assigns values to the con-
trollable time points as a function of the uncontrollable time points. Although the values for the
uncontrollable time points need not be known when solving the problem, this version of control-
lability presumes that all such information is provided to the executor in advance of execution. In
some sense, we can see a weak strategy as a clairvoyant program that schedules the controllable
time points given a forecast of the uncontrollable outcome. In weak controllability, the executor
is informed a priori of the duration of all the activities, and it can depend on such prediction.
Finally, a network is dynamically controllable if it has a dynamic execution strategy that can
react to uncontrollable time-point observations, but only those that have occurred in the past. In
other words, the values that the execution strategy assigns to the controllable time points may de-
pend on uncontrollable events, but only if that information has already been observed in real time;
it cannot depend on advance knowledge of future uncontrollable events. Typically, the execution
strategy must be able to deal with the branching that derives from delays and may interleave the
start times of activities with the observation of uncontrollable time points. In dynamic controlla-
bility, the executor can depend on past observation but has no access to any prediction. Micheli
(42) provided a good overview of planning in the context of temporal networks.
Building on this work in controllability, Morris & Muscettola (43) and others described efficient algorithms for the execution of temporal plans represented as temporal networks. By executing a temporal plan in the form of a constraint satisfaction problem, the executor may follow a
least commitment approach, in which decisions can be delayed until a commitment must be made.
One important point is the different communication requirements between using strong and
dynamic controllability. If the STNU represents a joint plan for multiple robots, then using strong
controllability requires no communication whatsoever, since each robot executes the agreed-upon
schedule. By contrast, using dynamic controllability for execution requires the robots to commu-
nicate when certain events occur, as the dynamic schedule depends on this information. Some
work (e.g., 44) has also looked at a distributed version of this issue, where robots must explicitly
decide what information to share, although the details are beyond the scope of this article.
Second, it is possible to use continuous numeric effects for effects that occur over time. For example, turning on the same instrument would reduce the energy stored in the battery at a rate of 100 Wh per hour, and this would continue as long as the instrument is still powered. In PDDL, this could be encoded with a continuous effect such as (decrease (energy) (* #t 100)) (the energy fluent here is illustrative), where #t stands for the time differential. Continuous effects have the indirect effect of intro-
ducing a coupling between time and state, as the value of a state variable (for example, available
energy) depends on the time the action using the instrument ends. Note that this also applies to
the durative action charge in the example given at the start of Section 3.
Temporal planners that support continuous effects, such as COLIN (45), replace the STN used
to check for temporal consistency with a linear program that encodes both temporal constraints
and constraints on the numeric variables. Other approaches to temporal planning with continuous
change use discretization (e.g., 46–48) or satisfiability modulo theories (e.g., 49, 50).
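A discretization-based treatment of the continuous charging effect from Section 3 can be sketched by Euler integration; the fluent names and numbers below are illustrative.

```python
# Discretizing the continuous effect
# (increase (battery_charge ?r) (* #t (charging_rate)))
# via Euler integration, clamped at the battery capacity.

def integrate_charge(initial, rate, duration, capacity, steps=100):
    """Approximate the final charge after running the durative action."""
    dt = duration / steps
    charge = initial
    for _ in range(steps):
        charge = min(capacity, charge + rate * dt)   # one discrete step
    return charge

final = integrate_charge(initial=20.0, rate=50.0, duration=1.0, capacity=100.0)
print(round(final, 1))                              # 70.0: 20 + 50 * 1.0
print(integrate_charge(90.0, 50.0, 1.0, 100.0))     # 100.0: clamped at capacity
```

The coupling between time and state mentioned above is visible here: the final value of the battery fluent depends directly on the duration the planner chooses for the action.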
In the above formulation, the rate of change of the continuous effects is fixed. It is also possible
to formulate planning problems in which the rate of change is controllable by a control variable.
For example, if we have a continuous effect updating the position of a mobile robot, then controlling the velocity of the robot controls the rate of change. The Scotty planner (51) solves such problems with controllable rates of change.
A different type of problem formulation allows for numeric action parameters. For example, an
action that withdraws money from an ATM could have a numeric control parameter that controls
how much money to withdraw. Note that this is different from controllable rates of change, as
the control parameters could be used in instantaneous effects. The Partial-Order Planning with
Constrained Real Numerics (POPCORN) planner (52) solves such problems, using mixed-integer
programming to check for consistency.
Finally, combined task and motion planning deals with planning with continuous robot
motions, mostly in the context of manipulation planning. These problems deal with continuous
variables that represent robot configurations, obstacle locations and geometries, and more. The
need to combine collision checking in high-dimensional configuration spaces with planning adds
another layer of complication to these problems. Several powerful frameworks for combined
task and motion planning have been developed in the robotics community, such as the Open
Motion Planning Library from the Kavraki laboratory (53), the Task-Motion Kit (TMKit) from
Dantam et al. (54, 55), and Combined Task and Motion Planning (CTAMP) from McMahon &
Plaku (56). Note that while most planners that address combined task and motion planning are
sequential in nature (e.g., 57–62), a few combine temporal planning with task and motion planning
(63).
Describing task and motion planning problems requires a richer modeling language that can
describe both the symbolic level and the geometric level of the problem. Such a language, together
with a platform-independent evaluation method for task and motion planning, has been proposed
by Lagriffoul et al. (64). As a comprehensive survey of task and motion planning is beyond the
scope of this article, we refer the reader to the Robotics: Science and Systems (RSS) Workshop
on Task and Motion Planning (http://dyalab.mines.edu/2019/rss-workshop), which is home
to many publications on this topic.
While it is tempting to use the most expressive planning formalism available to model the
problem at hand, there are some caveats. Propositional temporal planning is already more com-
putationally difficult than classical planning (65), being EXPSPACE complete (recall that classical
planning is PSPACE complete). Introducing numeric variables makes the planning problem un-
decidable in general (66).
Nevertheless, there is some hope, as some fragments of temporal planning are only PSPACE
complete (65), and some fragments of numeric planning are decidable (66). Furthermore, even
temporal planning problems that are not in PSPACE can be solved by compiling them into a
series of classical planning problems (67). Thus, it is important to understand the implications
that the choices made in modeling the problem have both on the fidelity of the model to reality
and on the ability to use the model to usefully plan within the time constraints.
For example, take a mobile robot driving in a factory, as in the RCLL scenario. If the robot
only ever needs to drive around among a fixed set of locations (in this case, the machines), then it is
possible to encode the robot’s position on a road map and use propositional temporal planning to
solve the planning problem. Temporal uncertainty can be used to encapsulate the uncertainty in
the driving time that results from the presence of the other team’s robots. However, if we give the
robots a new ability to hand a workpiece to another robot, and we assume that this handoff can oc-
cur anywhere, then we can no longer use a road map to encode the robots’ positions and must use
continuous variables to keep track of each robot’s position. In this case, if the robots have a fixed
velocity, we can use continuous linear effects, while if we want to control the robots’ velocities, we
can use a formulation with control variables. Finally, if we also need to reason about the exact place-
ment of the workpieces on the machines, combined task and motion planning would be in order.
Until now, we have discussed only deterministic problems with full information. In this section,
we discuss how to relax these assumptions to deal with two issues that come up in the real world,
especially with robots: nondeterministic dynamics (e.g., actions can fail) and partial observability.
While such planning problems can often be formalized as an MDP (68) or a partially observable
MDP (POMDP) (69), techniques for solving these models often do not scale to solve large prob-
lems. Below, we review some techniques for using deterministic, full-information planners to plan
in uncertain, partially observable worlds.
The first case we address, which is also the easiest to handle in terms of planning, occurs when
the controller responsible for executing an action can absorb the uncertainty about the outcome
of the action, possibly taking longer to achieve this outcome. This occurs, for example, with a
mobile robot that can navigate and avoid obstacles and is guaranteed to reach its goal. In this case,
we can encode each action as having an uncontrollable duration and use deterministic temporal
planning with uncertainty about action durations (an STNU) to solve the planning problem (70).
If the controller cannot guarantee the outcome of the action with certainty but is still very re-
liable, then a simple determinize-and-replan approach often works well (20). The key idea behind
this approach is to look at the expected (or most likely) outcome of each action and treat the action
as a deterministic action with that outcome as its effect. The executive then uses a deterministic
planner but constantly monitors plan execution. If an unexpected outcome occurs for one of the
actions in the plan, then the executive triggers replanning from the current state.
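The determinization step itself is simple to illustrate. The sketch below keeps, for each action, only its most likely outcome; the action and fact names are invented for the example.

```python
# Sketch of determinization: each probabilistic action is replaced by a
# deterministic action whose effect is its most likely outcome.

def determinize(prob_actions):
    """prob_actions: dict mapping action name -> list of (probability, effect)."""
    det = {}
    for name, outcomes in prob_actions.items():
        _, effect = max(outcomes, key=lambda o: o[0])
        det[name] = effect   # keep only the most likely effect
    return det

drive_outcomes = [(0.7, "at(r1, y)"), (0.3, "at(r1, x)")]
det = determinize({"drive(r1, x, y)": drive_outcomes})
print(det["drive(r1, x, y)"])   # at(r1, y)
```

The executive then plans with the determinized actions and, as described above, replans whenever execution produces one of the discarded outcomes.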
The above approach works when the inherent uncertainty is not probabilistically interesting
(21), meaning that any unexpected outcome can be recovered from. However, in some problems,
acting intelligently means taking risk (in the form of unexpected outcomes) into account when
choosing the course of action. For example, equipping a car with a spare tire allows it to deal with
the unexpected outcome of getting a flat tire—something that the above approach would never
do. The question then arises of how pessimistic the planner should be. One interesting solution
is to set a threshold k on the number of faults (unexpected outcomes) we would like the planner
to plan for and then use k-fault tolerant planning (e.g., 71).
While all of the above approaches assume each action has an expected (usually good) outcome
and possibly several other unexpected (bad) outcomes, this is not always the case. For example,
flipping a coin has two equally likely outcomes. When faced with such problems, the more gen-
eral approach is to use a probabilistic planning formulation, in which we encode all the possible
outcomes of each action, each with its probability. Probabilistic PDDL (PPDDL) (72) is a variant
of PDDL that adds probabilistic effects (as well as rewards and uncertainty about the initial state).
For example, if we want to encode a probabilistic effect of a robot r driving from location x to
location y, where it will reach y with 70% probability and remain at x with 30% probability, then
the effect of that action would be encoded as follows:
(probabilistic 0.7 (at ?r ?y) 0.3 (at ?r ?x))
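For context, a full action around this effect might be written as follows (a hypothetical sketch: the action name, the `connected` predicate, and the typing are illustrative; note that because the robot starts at ?x, the 30% branch can equivalently be left as an empty effect):

```pddl
(:action drive
  :parameters (?r - robot ?x ?y - location)
  :precondition (and (at ?r ?x) (connected ?x ?y))
  :effect (probabilistic
            0.7 (and (at ?r ?y) (not (at ?r ?x)))
            0.3 (and)))  ; nothing changes: the robot remains at ?x
```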
While PPDDL is similar to PDDL and thus easier for users familiar with PDDL to learn, there are
several types of planning problems that are difficult to model with it, specifically those that feature
both concurrency (multiple actions can be executed in the same time step, e.g., when controlling
multiple robots) and uncertainty. In such cases, the Relational Dynamic Influence Diagram Lan-
guage (RDDL) (73) might be better suited to encode the planning problem at hand.
Regardless of the choice of description language, when solving probabilistic planning problems
we must use probabilistic planners. While there has been much progress in probabilistic planners
in recent years (e.g., 22), probabilistic planning is still inherently more difficult than deterministic
planning, as the solution must now account for multiple possible outcomes for each action.
So far, we have discussed actions with nondeterministic effects. However, sometimes uncer-
tainty takes a different form. One such case occurs when the actions are deterministic but goals
arrive dynamically, online. An intelligent robot would anticipate these future goals and possibly
take some action to be prepared. For example, in the RCLL scenario, we know future orders will
arrive, so it is usually a good idea to stock raw material, which will likely be used later when the
order becomes known. Anticipatory planning (74) uses a distribution on future goals to take ac-
tions in anticipation of goals that are likely to arrive. The key idea here is to use a deterministic
planner to solve several samples of future goals and then identify the best immediate action based
on its utility in each of the resulting plans. This process is then repeated to select the next action,
and so on.
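A simplified sketch of this sampling scheme (here the candidate first actions are scored by a simple vote rather than the utility measure of Reference 74; `sample_goals` and `plan` are hypothetical helpers):

```python
from collections import Counter

def anticipatory_action(state, sample_goals, plan, n_samples=30):
    """Choose an immediate action by sampling goal sets from the goal
    distribution, planning deterministically for each sample, and
    crediting each resulting plan's first action."""
    votes = Counter()
    for _ in range(n_samples):
        goals = sample_goals()          # one sample of future goals
        p = plan(state, goals)
        if p:
            votes[p[0]] += 1            # credit this plan's first action
    return votes.most_common(1)[0][0] if votes else None
```

After the chosen action is executed, the whole procedure is repeated from the new state to select the next action.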
Another type of uncertainty occurs when we have partial knowledge about which objects exist
in the world. Many planners make the closed-world assumption—that is, the assumption that all
objects in the world are known in advance—which rarely holds in the real world. Nevertheless,
in some cases, even though the closed-world assumption is violated, it is still possible to use a
closed-world planner. In our RCLL scenario, for example, it is possible to formulate an open-
world quantified goal (75) that states that all open orders must be fulfilled. Note that this goal
does not require knowing which orders are currently open (as orders will arrive online), but for
any given set of orders, we can use a closed-world planner to synthesize a sequence of actions.
Both open-world quantified goals and anticipatory planning deal with specific types of uncer-
tainty due to partial knowledge. Generally, planning with partial knowledge (or partial observabil-
ity) is hard (76). Nevertheless, it is sometimes possible to solve a classical planning problem that
compiles away some of the uncertainty. The key idea here is to use a limited subset of epistemic
modal logic (77) to represent that the robot knows some fact p to be true (represented as the fact
K p), that the robot knows some fact p to be false (represented as the fact K ¬p), or that the robot
does not know whether p is true or false (in which case neither K p nor K ¬p is true).
In conformant planning, we are looking for a single sequence of actions that is guaranteed to
achieve the goal (78). In contingent planning, we are looking for a policy (usually represented
as a tree that accounts for the possible outcomes of each action) that is guaranteed to achieve
the goal (79). However, this policy tree could be very large, and only one branch of it will be
needed (although it is impossible to know offline which one). Thus, online planners (e.g., 80),
which choose the next action to perform without synthesizing a complete policy, are often useful
in practice.
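Executing such a policy tree can be sketched as follows (a minimal sketch with illustrative node and outcome names); note that only the branch matching the observed outcomes is ever visited, which is why synthesizing the full tree offline is often wasteful:

```python
def run_contingent_policy(policy, execute):
    """Execute a contingent policy tree. Each node is (action, branches),
    where branches maps the observed outcome of the action to the next
    node; an exhausted branch table marks that the goal is reached."""
    node, trace = policy, []
    while node is not None:
        action, branches = node
        outcome = execute(action)       # which outcome actually occurred
        trace.append((action, outcome))
        node = branches.get(outcome) if branches else None
    return trace
```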
The uncertainty we have addressed so far has been due to nature and thus can be assumed to be
random. However, robots that operate in the presence of other agents (humans or other robots)
must face uncertainty that comes from the strategic behavior of the other agents, which can no
longer be treated as random. In multirobot problems, the relationships among the robots are very
important.
First, the robots could be cooperative (that is, share the same goal). If the robots are controlled
centrally, then this is just single-agent planning. However, if the robots have decentralized
control, we can use models such as decentralized POMDPs (81), which allow us to plan not only
regular actions and sensing actions but also communication actions. While such models are highly
intractable, the use of carefully designed macro actions allows us to solve useful decentralized
planning problems for real robots (82). Alternatively, epistemic logic could also be employed for
multiagent planning, so that each agent reasons about the other agents’ state of knowledge (83).
Second, the robots could be adversarial and have competing objectives. For example, this occurs
in the RoboCup Soccer League (84), in which two teams of robots play a game of soccer. In such
cases, the problem can be formulated as a partially observable stochastic game. While methods
for solving such models exist (85), their practicality in general is still limited.
Finally, the robots could be noncooperative but not adversarial—that is, each robot has its own
goal, but the goals do not compete with each other. In this case, we are faced with the problem of
coordinating multiple robots to ensure that they do not interfere with each other. In such cases,
it is possible to plan for each agent individually using a nondeterministic single-agent planning
formalism by treating the other agents’ actions as uncertainty caused by nature (86). Alternatively,
a hierarchical approach could be used that plans for each robot and then combines the plans
while reasoning over their causal and temporal dependencies (87). A different approach calls for
instituting a social law that prohibits some actions to ensure that each agent can use a deterministic
single-agent planner to choose a plan, and any combination of plans will be guaranteed to work
together (88).
We conclude this section with an example of a real use case of planning under uncertainty: the Spatio-Temporal Representation and Activities for Cognitive Control in Long-Term Scenarios (STRANDS) project (http://strands.acin.tuwien.ac.at). This project developed
a framework for planning under uncertainty for mobile robots (89) that generates policies that
meet specifications given in cosafe linear temporal logic over MDP models of mobile robot be-
havior. The MDPs are based on a discretization of the environment based on a topological map.
As it navigates, the robot gathers data on the failure or success of navigation through an edge
and the time taken to do so. These data are used to build probabilistic edge models of navigation
success and corresponding durations, which allows for the construction of an MDP representing
robot navigation and task execution, where the probabilistic outcomes are learned from experi-
ence. The objective is then posed as finding policies that (a) maximize the probability of satisfying
a cosafe linear temporal logic specification, (b) achieve as much of the specification as possible if
it becomes unsatisfiable during execution, and (c) minimize the expected time to achieve the first
two goals. This approach has been used for long-term deployments of mobile robots (90).
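The learned edge models can be illustrated with a simple frequentist estimate (a sketch only; the STRANDS framework fits richer probabilistic duration models than a single mean):

```python
def edge_model(attempts):
    """Estimate a navigation-edge model from logged traversal attempts,
    given as (success, duration_in_seconds) pairs: returns the edge's
    success probability and the mean duration of successful traversals."""
    durations = [d for ok, d in attempts if ok]
    p_success = len(durations) / len(attempts) if attempts else 0.0
    mean_duration = sum(durations) / len(durations) if durations else float("inf")
    return p_success, mean_duration
```

One such model per topological-map edge yields the transition probabilities and expected times of the navigation MDP.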
For other excellent examples of the successful use of planning for mobile robots, we refer the
reader to the work done by Manuela Veloso and her team (http://www.cs.cmu.edu/~mmv).
There are two different reasons we may want to tell the planner how things should be done.
First and foremost, sometimes modeling the dynamics of the planning problem accurately is
cumbersome and difficult, especially considering that we must also do this while abstracting
the problem. For example, consider a home service robot tasked with getting a beer out of the
refrigerator. We may want to tell our robot to always close the refrigerator door after getting the
beer, which is arguably easier than encoding the laws of thermodynamics that can be used to prove
that the food in the refrigerator will get warmer if the door is left open, as well as some biological
and chemical knowledge that can be used to prove that food at room temperature will spoil faster.
Second, we may have some additional knowledge about how to solve the planning problem
more quickly or more efficiently (that is, advice). However, following the “physics, not advice”
maxim, we should take care to understand the implications of using this control knowledge—can
we guarantee that the optimal solution will remain reachable? Can we guarantee that using the
control knowledge will really speed up planning? Thus, we argue that while the first reason for
using control knowledge is valid, especially in real-world scenarios, where encoding the laws of
nature might be very difficult, we should attempt to encode control knowledge to speed up the
planner only when planning speed is not adequate for the task at hand—and even so, we should do
so carefully. In the remainder of this section, we review some formalisms that allow us to encode
control knowledge.
We begin by describing hierarchical task networks (HTNs), a formalism that has been used in
many practical planning tasks (e.g., 92). In HTN planning, a problem is described by a set of tasks.
Some of these tasks are primitive—that is, they are actions that can be directly executed by the
robot. Nonprimitive tasks can be decomposed into task networks using methods. For example, a
task could be “get beer from refrigerator,” which can be decomposed into the sequence of tasks
“go to refrigerator,” “open refrigerator door,” “get beer,” “close refrigerator door,” “go back to
human,” and “deliver beer.” Depending on the robot’s capabilities, each of these tasks could be
further decomposed until some primitive task can be executed.
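The beer-fetching decomposition can be sketched as follows (a deliberately simplified HTN with one method per task; real HTN planners search over alternative methods and handle ordering and variable constraints):

```python
def decompose(task, methods):
    """Expand a task into primitive actions using an HTN method table.
    methods maps each nonprimitive task to the ordered subtasks of its
    (here unique) decomposition method; tasks without a method are
    primitive and executable as-is."""
    if task not in methods:
        return [task]
    plan = []
    for subtask in methods[task]:
        plan.extend(decompose(subtask, methods))
    return plan

methods = {
    "get_beer_from_fridge": [
        "go_to_fridge", "open_fridge_door", "get_beer",
        "close_fridge_door", "go_to_human", "deliver_beer",
    ],
}
```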
Although HTN planning may seem easier, it is actually more complex than classical planning (93) and in the worst case can even be undecidable. However, recall that worst-case complexity is rarely the right measure for choosing a planner, and furthermore, under some restrictions, HTN planning is only PSPACE-complete. We further remark that, under some restrictions, HTN planning can be translated to PDDL and solved using a classical planner (94).
Another formalism that allows us to encode control knowledge is RMPL (35). RMPL does not
rely on a hierarchical structure of decomposing tasks into lower-level tasks, but rather allows the
user to write a control program. A control program can be viewed as a partially elaborated plan,
where in some parts of the plan the user specifies exactly what the planner must do, while in other
parts the user specifies only some subgoal. The planner must then complete the partially specified
parts of the plan in a way that leaves the entire plan consistent.
The first step is to identify which actions must be sequenced in order to achieve the goal. At this stage, having a list of action names is enough,
possibly already with some of the action parameters. For example, in our RCLL scenario, the
list of actions includes robots moving from machine to machine, robots loading workpieces onto
machines, robots picking up workpieces from machines, and machine processing operations.
After the list of possible actions is available, the next step is to try to model each action in
more detail, identifying its preconditions and effects. At this step, we can also determine whether
actions are deterministic, giving us important input as to the choice of our planning formalism.
Additionally, during this step, we keep track of the predicates used for each action’s preconditions
and effects, and these become the predicates used to model the state of the world.
We also remark that sometimes we need to use numeric variables rather than discrete predi-
cates (for example, to keep track of battery levels). For the sake of readability, we abuse notation
here and call numeric variables predicates as well. The decision about whether to use numeric vari-
ables (such as the x, y position of the robot) or a discrete abstraction (for example, a road map with
discrete locations) often depends on the actions—if we can depend on the robot to robustly navi-
gate between locations on a road map, then we can use a discrete predicate to model its position.
Otherwise, we might need to treat its position as a continuous variable and incur the associated
cost in computation required to plan.
Note that the discussion of predicates demonstrates an important point about how planning
domains are built—they are built in service of sequencing actions, and the predicates themselves
are there to support modeling action preconditions and effects. However, it is important to keep
in mind that any predicate used in the domain will somehow need to be provided as input to
the executive, as we discuss next. In our RCLL scenario, because the robots’ low-level behaviors
are fairly robust, we can model a robot’s position as a discrete road map whose waypoints
are the different machines. Additionally we have predicates describing the assembly status
of each workpiece and the location of each workpiece (in one of the machines or held by a
robot).
The next step, although not strictly part of modeling the task as a planning problem, is nev-
ertheless an important one: defining the state estimator. Starting with the list of predicates from
the above step, we must determine for each predicate whether we can sense it passively (that is,
without taking any special actions) using real sensors and possibly some processing. If this is the
case for all predicates, then things are relatively simple, as we can rely on a state estimator to con-
tinuously feed us observations of the full state. If there is some predicate we cannot sense passively,
then it is sometimes possible to keep track of this predicate by starting from a known value and
updating the value of the predicate according to the actions performed by the robot and occasional
observations. If this is the case, then we can still treat the problem as fully observable, assuming the
robot is the sole agent of change in the world. However, in some cases the agent must proactively
take some actions to sense parts of the world, and we must resort to a partially observable planning
formalism, which makes planning more complex. However, as discussed in Section 4, sometimes
it is possible to still treat the problem as fully observable and handle newly discovered information
via replanning.
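Tracking such a predicate can be sketched as follows (a minimal sketch; the predicate value and action names are illustrative):

```python
class PredicateTracker:
    """Track a predicate that cannot be sensed passively: start from a
    known value and update it from the effects of executed actions,
    correcting it whenever an occasional observation does arrive."""
    def __init__(self, value, effects):
        self.value = value
        self.effects = effects              # action name -> new value
    def on_action(self, action):
        if action in self.effects:          # apply the action's known effect
            self.value = self.effects[action]
    def on_observation(self, observed):
        self.value = observed               # a real observation overrides
```

This scheme is sound only as long as the robot is the sole agent of change for the tracked predicate, as the text notes.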
After we have a list of predicates (with their associated observability properties) and a list of
actions (with their preconditions and effects), we can make a more informed decision about the
type of planning formalism to employ. If all actions are deterministic and all predicates are fully
observable (or at least we can treat them as fully observable), then we must decide whether to
use classical planning or temporal planning. The advantage of using classical planning is that the
planning process will typically be much quicker. However, if we want to support concurrent action
(for example, controlling two robots working together) or reason about deadlines, then we should
use temporal planning instead.
If some actions are nondeterministic, then it is usually a good idea to try to determine whether
the planning problem is probabilistically interesting. If it is not, and we have full observability of
the state, then using a deterministic planner inside a replanning loop is often a good solution. Oth-
erwise, an executive that can handle nondeterministic planning should be used. Note that special
care must be taken when implementing the dispatch of nondeterministic actions, as the executive
will often need feedback about which outcome of the action occurred. This can sometimes be
inferred from the state estimator, but many executives will want the action dispatcher to return
some information about the status of the action, which could lead to potential synchronization
issues in software development.
Finally, if we do not have full state observability, then we will often need to use some formalism
for partially observable planning with sensing. If it is still possible to handle newly discovered
information via replanning, then this is often the most efficient way of integrating a planner into
the robot. Otherwise, we will typically need to use a contingent planner and closely monitor the
observations we get from the state estimator to execute the right branch of the plan according to
the information observed by the robot.
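The decision procedure of the preceding paragraphs can be summarized as a coarse rule of thumb (a sketch only; real domains often mix these cases):

```python
def choose_formalism(deterministic, fully_observable,
                     concurrency_or_deadlines, probabilistically_interesting):
    """Coarse planning-formalism selection mirroring the discussion above."""
    if not fully_observable:
        return "partially observable planning (or replanning, if viable)"
    if deterministic:
        return ("temporal planning" if concurrency_or_deadlines
                else "classical planning")
    return ("probabilistic/nondeterministic planning"
            if probabilistically_interesting
            else "deterministic planner in a replanning loop")
```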
Choosing the right planning formalism for modeling the problem at hand is often the most
important decision when attempting to endow robots with the ability to plan. However, we must
still write some software to implement this intelligent behavior. The open source ROSPlan frame-
work (95; http://kcl-planning.github.io/ROSPlan) provides a collection of tools for automated
planning inside the Robot Operating System (ROS) ecosystem (96) and can be used to build differ-
ent architectures. Once the right formalism and planner have been chosen, ROSPlan components
can be assembled to build an agent architecture that connects to the rest of a ROS-based system.
ROSPlan supports many of the planning formalisms that have been discussed, providing integra-
tion for the PDDL family of languages for propositional, temporal, or contingent planning and
probabilistic planning with RDDL (73) and PPDDL (72).
ROSPlan consists of reusable components that include a knowledge base for storing a rep-
resentation of the state in one formalism and producing problem instances, an interface to the
knowledge base for state estimation, an interface to the planner for producing plans, an interface
to the plan executor, and several default plan executors for propositional, temporal, and proba-
bilistic planning. The motivation of ROSPlan is to provide a framework for connecting automated
planning to a ROS platform in an open and accessible way. For this reason, the tools provided are
lightweight and make few assumptions about the nature of the agent. ROSPlan is therefore well
suited to replace what can be a considerable integration effort.
DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that
might be perceived as affecting the objectivity of this review.
ACKNOWLEDGMENTS
We would like to acknowledge colleagues who gave us useful feedback on earlier drafts of this
article or helped us with specific topics, including Michael Cashmore, Bruno Lacerda, Andrea
Micheli, and Nick Hawes. D.M. was partially supported by Engineering and Physical Sciences
Research Council grant EP/R033722/1, “Trust in Human-Machine Partnerships.”
We dedicate this article to the memory of Nils Nilsson, one of the founding fathers of both
automated planning and robotics.
LITERATURE CITED
1. Ghallab M, Nau D, Traverso P. 2016. Automated Planning and Acting. New York: Cambridge Univ. Press
2. Ingrand F, Ghallab M. 2017. Deliberation for autonomous robots: a survey. Artif. Intell. 247:10–44
3. LaValle SM. 2006. Planning Algorithms. Cambridge, UK: Cambridge Univ. Press. http://planning.cs.
uiuc.edu
4. Stern R, Sturtevant N, Felner A, Koenig S, Ma H, et al. 2019. Multi-agent pathfinding: definitions, vari-
ants, and benchmarks. arXiv:1906.08291 [cs.AI]
5. Giftthaler M, Farshidian F, Sandy T, Stadelmann L, Buchli J. 2017. Efficient kinematic planning for
mobile manipulators with nonholonomic constraints using optimal control. In 2017 IEEE International
Conference on Robotics and Automation, pp. 3411–17. Piscataway, NJ: IEEE
6. Yang G, Chien S. 2017. Review on space robotics: toward top-level science through space exploration.
Sci. Robot. 2:eaan5074
7. Crosby M, Petrick RPA, Rovida F, Krüger V. 2017. Integrating mission and task planning in an industrial
24. Speck D, Dornhege C, Burgard W. 2017. Shakey 2016 – how much does it take to redo Shakey the robot?
IEEE Robot. Autom. Lett. 2:1203–9
25. Fox M, Long D. 2003. PDDL2.1: an extension to PDDL for expressing temporal planning domains.
J. Artif. Intell. Res. 20:61–124
26. Ghallab M, Laruelle H. 1994. Representation and control in IxTeT, a temporal planner. In Proceedings
of the Second International Conference on Artificial Intelligence Planning Systems, pp. 61–67. Palo Alto, CA:
AAAI Press
27. Muscettola N. 1994. HSTS: integrating planning and scheduling. In Intelligent Scheduling, ed. M Zweben,
MS Fox, pp. 169–212. San Francisco, CA: Morgan Kaufmann
28. Gigante N, Montanari A, Mayer MC, Orlandini A. 2017. Complexity of timeline-based planning. In
Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, pp. 116–
24. Palo Alto, CA: AAAI Press
29. Chien S, Rabideau G, Knight R, Sherwood R, Engelhardt B, et al. 2000. ASPEN – automating space mission
operations using automated planning and scheduling. Paper presented at the International Conference on
Space Operations, Toulouse, Fr., June 19–23
30. Barreiro J, Boyce M, Do M, Frank J, Iatauro M, et al. 2012. EUROPA: a platform for AI planning, schedul-
ing, constraint programming, and optimization. Paper presented at the 22nd International Conference on
Automated Planning and Scheduling, Atibaia, Brazil, June 25–29
31. Smith DE, Frank J, Cushing W. 2008. The ANML language. Paper presented at the 18th International
Conference on Automated Planning and Scheduling, Sydney, Sept. 14–18
32. Cashmore M, Cimatti A, Magazzeni D, Micheli A, Zehtabi P. 2019. Robustness envelopes for temporal
plans. In Proceedings of the 2019 AAAI Conference on Artificial Intelligence, pp. 7538–45. Palo Alto, CA: AAAI
Press
33. Verma V, Estlin T, Jónsson A, Pasareanu C, Simmons R, Tso K. 2005. Plan Execution Inter-
change Language (PLEXIL) for executable plans and command sequences. In Proceedings of i-
SAIRAS 2005: The 8th International Symposium on Artificial Intelligence, Robotics and Automation in Space.
Paris: Eur. Space Agency. Available at https://www.esa.int/Our_Activities/Space_Engineering_Technology/Automation_and_Robotics/i-SAIRAS
34. Rusu RB, Jones EG, Marder-Eppstein E, Pantofaru C, Wise M, et al. 2011. Towards autonomous robotic
butlers: lessons learned with the PR2. In 2011 IEEE International Conference on Robotics and Automation,
pp. 5568–75. Piscataway, NJ: IEEE
35. Williams B, Ingham M, Chung S, Elliott P. 2003. Model-based programming of intelligent embedded
systems and robotic space explorers. Proc. IEEE 91:212–37
36. Dechter R, Meiri I, Pearl J. 1991. Temporal constraint networks. Artif. Intell. 49:61–95
37. Vidal T, Ghallab M. 1996. Dealing with uncertain durations in temporal constraint networks dedicated
to planning. In ECAI 96: 12th European Conference on Artificial Intelligence, ed. W Wahlster, pp. 48–54.
Chichester, UK: Wiley & Sons
38. Vidal T, Fargier H. 1997. Contingent durations in temporal CSPs: from consistency to controllabilities.
In Proceedings of TIME ’97: 4th International Workshop on Temporal Representation and Reasoning, pp. 78–85.
Piscataway, NJ: IEEE
39. Frank J, Jónsson A. 2003. Constraint-based attribute and interval planning. Constraints 8:339–64
40. Fratini S, Cesta A, De Benedictis R, Orlandini A, Rasconi R. 2011. APSI-based deliberation in goal ori-
ented autonomous controllers. In 11th Symposium on Advanced Space Technologies in Robotics and Automation.
Paris: Eur. Space Agency. Available at https://www.esa.int/Our_Activities/Space_Engineering_Technology/Automation_and_Robotics/Proceedings_of_ASTRA
41. Vidal T, Fargier H. 1999. Handling contingency in temporal constraint networks: from consistency to
controllabilities. J. Exp. Theor. Artif. Intell. 11:23–45
42. Micheli A. 2016. Planning and scheduling in temporally uncertain domains. PhD Thesis, Univ. Trento, Trento,
Italy
43. Morris P, Muscettola N. 2000. Execution of temporal plans with uncertainty. In Proceedings of the Seven-
teenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial
Intelligence, pp. 491–96. Palo Alto, CA: AAAI Press
44. Barbulescu L, Rubinstein ZB, Smith SF, Zimmerman TL. 2010. Distributed coordination of mobile agent
teams: the advantage of planning ahead. In Proceedings of the 9th International Conference on Autonomous
Agents and Multiagent Systems, Vol. 1, pp. 1331–38. Richland, SC: Int. Found. Auton. Agents Multiagent
Syst.
45. Coles AJ, Coles A, Fox M, Long D. 2012. COLIN: planning with continuous linear numeric change.
J. Artif. Intell. Res. 44:1–96
46. Della Penna G, Magazzeni D, Mercorio F. 2012. A universal planning system for hybrid domains. Appl.
Intell. 36:932–59
47. Piotrowski WM, Fox M, Long D, Magazzeni D, Mercorio F. 2016. Heuristic planning for PDDL+
domains. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 3213–
19. Palo Alto, CA: AAAI Press
48. Scala E, Haslum P, Thiébaux S. 2016. Heuristics for numeric planning via subgoaling. In Proceedings of
the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 3228–34. Palo Alto, CA: AAAI
Press
49. Cashmore M, Fox M, Long D, Magazzeni D. 2016. A compilation of the full PDDL+ language into
SMT. In Proceedings of the Twenty-Sixth International Conference on Automated Planning and Scheduling,
pp. 79–87. Palo Alto, CA: AAAI Press
50. Bryce D, Gao S, Musliner DJ, Goldman RP. 2015. SMT-based nonlinear PDDL+ planning. In Proceedings
of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 3247–53. Palo Alto, CA: AAAI Press
51. Fernández-González E, Williams BC, Karpas E. 2018. ScottyActivity: mixed discrete-continuous plan-
ning with convex optimization. J. Artif. Intell. Res. 62:579–664
52. Savas E, Fox M, Long D, Magazzeni D. 2016. Planning using actions with control parameters. In
ECAI 2016: 22nd European Conference on Artificial Intelligence, ed. GA Kaminka, M Fox, P Boucquet,
E Hüllermeier, V Dignum, et al., pp. 1185–93. Amsterdam: IOS
53. Sucan IA, Moll M, Kavraki LE. 2012. The Open Motion Planning Library. IEEE Robot. Autom. Mag.
19(4):72–82
54. Dantam NT, Chaudhuri S, Kavraki LE. 2018. The Task-Motion Kit: an open source, general-purpose
task and motion-planning framework. IEEE Robot. Autom. Mag. 25(3):61–70
55. Dantam NT, Kingston ZK, Chaudhuri S, Kavraki LE. 2016. Incremental task and motion planning: a
constraint-based approach. In Robotics: Science and Systems XII, ed. D Hsu, N Amato, S Berman, S Jacobs,
pap. 2. N.p.: Robot. Sci. Syst. Found.
56. McMahon J, Plaku E. 2017. Robot motion planning with task specifications via regular languages. Robotica
35:26–49
57. Cambon S, Alami R, Gravot F. 2009. A hybrid approach to intricate motion, manipulation and task plan-
ning. Int. J. Robot. Res. 28:104–26
58. Waldhart J, Gharbi M, Alami R. 2016. A novel software combining task and motion planning for human-
robot interaction. In 2016 AAAI Fall Symposium Series: Artificial Intelligence for Human-Robot Interaction,
pp. 100–2. Palo Alto, CA: AAAI Press
59. Srivastava S, Fang E, Riano L, Chitnis R, Russell S, Abbeel P. 2014. Combined task and motion plan-
ning through an extensible planner-independent interface layer. In 2014 IEEE International Conference on
Robotics and Automation, pp. 639–46. Piscataway, NJ: IEEE
60. Toussaint M. 2015. Logic-geometric programming: an optimization-based approach to combined task
and motion planning. In Proceedings of the 24th International Conference on Artificial Intelligence, pp. 1930–
36. Palo Alto, CA: AAAI Press
61. Garrett CR, Lozano-Pérez T, Kaelbling LP. 2018. FFRob: leveraging symbolic planning for efficient task
and motion planning. Int. J. Robot. Res. 37:104–36
62. Lagriffoul F, Dimitrov D, Bidot J, Saffiotti A, Karlsson L. 2014. Efficiently combining task and motion
planning using geometric constraints. Int. J. Robot. Res. 33:1726–47
63. Edelkamp S, Lahijanian M, Magazzeni D, Plaku E. 2018. Integrating temporal reasoning and sampling-
based motion planning for multigoal problems with dynamics and time windows. IEEE Robot. Autom. Lett.
3:3473–80
64. Lagriffoul F, Dantam NT, Garrett C, Akbari A, Srivastava S, Kavraki LE. 2018. Platform-independent
benchmarks for task and motion planning. IEEE Robot. Autom. Lett. 3:3765–72
65. Rintanen J. 2007. Complexity of concurrent temporal planning. In Proceedings of the Seventeenth Interna-
tional Conference on International Conference on Automated Planning and Scheduling, pp. 280–87. Palo Alto,
CA: AAAI Press
66. Helmert M. 2002. Decidability and undecidability results for planning with numerical state variables. In
Proceedings of the Sixth International Conference on Artificial Intelligence Planning Systems, pp. 44–53. Palo
Alto, CA: AAAI Press
67. Celorrio SJ, Jonsson A, Palacios H. 2015. Temporal planning with required concurrency using classical
planning. In Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling,
pp. 129–37. Palo Alto, CA: AAAI Press
68. Howard RA. 1960. Dynamic Programming and Markov Processes. Cambridge, MA: MIT Press
69. Åström KJ. 1965. Optimal control of Markov processes with incomplete state information. J. Math. Anal.
Appl. 10:174–205
70. Cimatti A, Do M, Micheli A, Roveri M, Smith DE. 2018. Strong temporal planning with uncontrollable durations. Artif. Intell. 256:1–34
86. Muise C, Felli P, Miller T, Pearce AR, Sonenberg L. 2016. Planning for a single agent in a multi-agent
environment using FOND. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial
Intelligence, pp. 3206–12. Palo Alto, CA: AAAI Press
87. Ma H, Hönig W, Cohen L, Uras T, Xu H, et al. 2017. Overview: a hierarchical framework for plan
generation and execution in multirobot systems. IEEE Intell. Syst. 32:6–12
88. Nir R, Karpas E. 2019. Automated verification of social laws for continuous time multi-robot systems. In
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, pp. 7683–90. Palo Alto, CA: AAAI
Press
89. Lacerda B, Faruq F, Parker D, Hawes N. 2019. Probabilistic planning with formal performance guarantees
for mobile service robots. Int. J. Robot. Res. 38:1098–123
90. Hawes N, Burbridge C, Jovan F, Kunze L, Lacerda B, et al. 2017. The STRANDS project: long-term
autonomy in everyday environments. IEEE Robot. Autom. Mag. 24(3):146–56
91. McDermott D. 2000. The 1998 AI Planning Systems Competition. AI Mag. 21(2):35–55
92. Simmons R, Apfelbaum D, Fox D, Goldman RP, Haigh KZ, et al. 2000. Coordinated deployment of
multiple, heterogeneous robots. In Proceedings: 2000 IEEE/RSJ International Conference on Intelligent Robots