
AS03CH04-Magazzeni ARjats.cls October 7, 2019 16:37

Annual Review of Control, Robotics, and Autonomous Systems

Automated Planning for Robotics

Erez Karpas¹ and Daniele Magazzeni²
Annu. Rev. Control Robot. Auton. Syst. 2020.3. Downloaded from www.annualreviews.org
Access provided by Syracuse University Library on 10/21/19. For personal use only.

¹Faculty of Industrial Engineering and Management, Technion, Haifa 3200003, Israel; email: karpase@technion.ac.il
²Department of Informatics, King’s College London, London WC2B 4BG, United Kingdom; email: daniele.magazzeni@kcl.ac.uk

Annu. Rev. Control Robot. Auton. Syst. 2020. 3:4.1–4.23

The Annual Review of Control, Robotics, and Autonomous Systems is online at control.annualreviews.org

https://doi.org/10.1146/annurev-control-082619-100135

Copyright © 2020 by Annual Reviews. All rights reserved

Keywords

automated planning, AI planning, robotics, artificial intelligence

Abstract

Modern robots are increasingly capable of performing “basic” activities such as localization, navigation, and motion planning. However, for a robot to be considered intelligent, we would like it to be able to automatically combine these capabilities in order to achieve a high-level goal. The field of automated planning (sometimes called AI planning) deals with automatically synthesizing plans that combine basic actions to achieve a high-level goal. In this article, we focus on the intersection of automated planning and robotics and discuss some of the challenges and tools available to employ automated planning in controlling robots. We review different types of planning formalisms and discuss their advantages and limitations, especially in the context of planning robot actions. We conclude with a brief guide aimed at helping roboticists choose the right planning model to endow a robot with planning capabilities.

Review in Advance first posted on October 14, 2019. (Changes may still occur before final publication.)

1. INTRODUCTION
Modern robots and the software tools for controlling them are becoming increasingly capable
of performing “basic” tasks such as localization, navigation, and motion planning. While these
tasks are challenging in and of themselves and remain the focus of significant and interesting
research, we are finally at a point where it is possible to intelligently combine these capabilities
in order to achieve a high-level goal. This is the focus of this article, in which we review some
of the work on applying automated planning (1) to intelligently control one or more robots. A
recent survey provides a comprehensive overview of deliberation for autonomous robots (2); rather
than repeating this survey, our objective here is to provide a broad understanding of planning for
robotics and, specifically, how to endow a robot with planning capabilities.
The word planning is used in different ways in robotics, meaning different activities with differ-
ent goals—for example, motion planning (e.g., 3), path planning (e.g., 4), and kinematic planning
(e.g., 5). As mentioned above, this article focuses on task planning, often called automated planning (the term we use throughout this article) or AI planning. Automated planning is often used
in combination with other types of planning in integrated solutions.
Automated planning has been used in several real-world robotics scenarios, including space
exploration (e.g., 6), logistics industrial robotics (e.g., 7), and the oil and gas industry (e.g., 8).
Work presented at the International Conference on Automated Planning and Scheduling (ICAPS)
Workshop on Planning and Robotics (PlanRob) and Workshop on Scheduling and Planning Ap-
plications (SPARK) has covered many other applications of planning.
Throughout the article, we use the RoboCup Logistics League (RCLL) (9) as an illustrative ex-
ample. The RCLL is part of the RoboCup competition and is designed to simulate manufacturing
operations in a factory. Each game is played between two teams, each of which controls three mo-
bile robots and six different production machines. The robots can move workpieces between the
production machines, and each machine can perform one of several processing operations on the
workpieces. In the game, orders for workpieces with different configurations arrive on the fly and
are associated with a time window in which the order must be completed. In addition to not crash-
ing into the other team’s robots, each team must solve a planning problem in which it controls its
robots and machines and determines which actions to perform to fulfill orders in a timely manner.

1.1. The Three Approaches to Autonomous Behavior


It is clear that the RCLL challenge (as well as many others) requires autonomous behavior, as
each team must react to the orders, which arrive dynamically. To put the subsequent discussion
in context, we first provide a brief overview of the three high-level approaches to autonomous
behavior, as identified by Geffner (10): programming-, learning-, and model-based approaches.

1.1.1. The programming-based approach. In the programming-based approach, a human


programmer (or team of programmers) manually specifies the desired behavior using some pro-
gramming language. The major disadvantage of this approach is that the programmer must an-
ticipate every possible scenario, which is an onerous task.
For the RCLL scenario, it is possible to program a controller that will behave reasonably well.
However, more realistic scenarios might pose a challenge, as demonstrated by the loss of the Mars
Polar Lander, which crashed on the surface of Mars in 1999. The lander had three landing legs,
each equipped with a sensor designed to detect when the lander touched down, and the software
was supposed to shut down the main engine when the lander detected that it had touched down.
The most likely cause of the crash was that the lander erroneously shut down its engines 40 m
above the surface due to transient noise in one of the three sensors (11). This condition, which
the programmers did not anticipate, caused the loss of the lander.


1.1.2. The learning-based approach. In the learning-based approach, the agent (or robot)
learns the correct behavior from experience—either its own (as in reinforcement learning) or an
expert’s (as in imitation learning). The learning-based approach can be (and has been) used to
train robots to perform basic tasks, for which it would also be very difficult to formulate a model.
However, reinforcement learning approaches typically do not offer any guarantees on when an
optimal policy has been found or on the safety of the robot when it is still exploring. In addition,
learning a complex sequence of tasks is not as easy as learning single tasks. Nevertheless, it can be
very useful to learn how to perform basic actions and then combine these actions using planning.
In the context of the RCLL scenario, it is conceivable that reinforcement learning will converge
on an optimal policy. However, the large number of possible states in this problem suggests that,
without some generalization, convergence will take a long time to achieve.
1.1.3. The model-based approach. In the model-based approach, we assume that we have
some model of the world, and the agent uses the model to reason about possible courses of action
and how they will affect the state of the world. Automated planning is the model-based approach
to autonomous behavior and is discussed throughout the rest of the article.
In the context of the RCLL scenario, it is possible to quickly synthesize a course of action
that completes all outstanding orders, but finding an optimal solution still requires a significant
computational effort.

1.2. Planning-Based Agents


Having described the three approaches to autonomy, we now discuss the architecture of an agent
following these approaches, focusing on model-based agents. The classical model for how an agent
interacts with the environment is the sense–think–act loop, illustrated in Figure 1a. This inter-
action works in a loop as follows:

 Sense: The agent obtains some observations (percepts) from the environment using its
sensors.
 Think: The agent thinks about what to do next and chooses the next action to execute.
 Act: The agent performs the chosen action, thereby modifying the environment, and the
loop repeats.
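The loop above can be sketched in a few lines of code. This is a toy illustration, not a robotics framework: the `Environment` and `FixedPolicy` classes and the integer-position world are assumptions made up for the example.

```python
# A minimal sketch of the sense-think-act loop. The environment is a
# toy world: a robot at an integer position moving toward a goal.

class Environment:
    def __init__(self, position=0, goal=3):
        self.position = position
        self.goal = goal

    def sense(self):
        return self.position              # fully observable toy state

    def act(self, action):
        if action == "forward":
            self.position += 1

class FixedPolicy:
    """The 'think' step: a fixed policy mapping percepts to actions."""
    def choose(self, percept, goal):
        return "forward" if percept < goal else "stop"

def run_agent(env, policy, max_steps=10):
    for _ in range(max_steps):
        percept = env.sense()                        # Sense
        action = policy.choose(percept, env.goal)    # Think
        if action == "stop":
            break
        env.act(action)                              # Act; loop repeats
    return env.position

print(run_agent(Environment(), FixedPolicy()))
```

With a fixed policy in the think step, this loop is the complete agent; the discussion below shows why a planner does not fit into it directly.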

If the agent’s decision-making is based on a fixed policy (which can be programmed manually
or learned), then this loop elegantly describes the agent’s operation. However, if we want to use an
automated planner in our decision-making, then the simple sense–think–act loop does not work:


Figure 1
Illustration of two agent architectures. (a) The sense–think–act loop, in which the agent obtains observations
of its environment, thinks about what to do next, and executes the chosen action, after which the sequence
repeats. (b) A planning-based agent architecture, in which the decision-making is split between executive and
planner components.


Planning is typically slow (usually measured in seconds, not milliseconds), and thus we cannot call
a planner at the rate of a typical robotic control loop (which typically runs at least once per second,
and often much faster). However, a planner will generate a long-term plan that will ultimately lead
to the goal.
Thus, in a model-based agent that uses a planner (which we refer to as a planning-based agent),
the decision-making is typically split between two components: the executive and the planner
(as illustrated in Figure 1b). The executive is responsible for interacting with the environment,
executing actions from the plan as appropriate, and receiving observations while monitoring plan
execution to make sure the plan is still valid. The executive also interacts with the planner by
calling for replanning (or plan repair) when the current plan becomes invalid (perhaps because
an action failed or there has been some unforeseen disturbance to the state of the world) and
receiving the plan for execution from the planner when it finishes planning.
The above description of how a planning-based agent works is quite vague. In the rest of this
article, we review different types of models (formalisms) and discuss when and under what as-
sumptions they are applicable to robotics. For each such formalism, we also discuss how to execute
such plans. We begin by discussing the simplest type of planning formalism—classical planning.

2. CONTROLLING ROBOTS VIA CLASSICAL PLANNING


We start by describing how a planning-based agent can use classical planning to control a robot.
Classical planning assumes that (a) the world is deterministic, (b) the state is fully observable,
(c) the robot is the sole agent of change in the world, and (d) actions are instantaneous. Even with
these assumptions, however, classical planning is computationally intractable. Specifically, deciding
whether a STRIPS planning problem (explained in Section 2.1) is solvable is PSPACE-complete (12).
We first review different formalisms for describing classical planning problems. We then ex-
plain how a robot can use these formalisms to behave intelligently in the real world despite the
unrealistic assumptions underlying classical planning.

2.1. Classical Planning Formalisms


Classical planning is the simplest type of planning model, and the different formalisms compactly
describe a transition system, consisting of (a) a set of possible states of the world, S; (b) a set of
possible (labeled) transitions between these states, T ⊆ S × S; (c) an initial state, s0 ∈ S; and (d) a
set of goal states, G ⊆ S. A plan is a sequence of transitions (each corresponding to an action) that
leads from the initial state s0 to some goal state sg ∈ G.
Different formalisms differ in how exactly they represent this transition system. In STRIPS
(13), a set of Boolean propositions (or facts) F describes the possible states, where a state s is represented by the set of all propositions that hold in that state. The transitions are obtained by actions,
where each action a is represented by its preconditions pre(a), its add effects add(a), and its delete
effects del(a). Action a can be applied in state s if pre(a) ⊆ s, and leads to the state (s \ del(a)) ∪ add(a).
Finally, the goal states are represented by a set of facts g, where a state s satisfies the goal if and
only if g ⊆ s.
For example, to represent a robot moving around between machines in our RCLL scenario,
we would have the propositions at (r1, m1), at (r1, m2), . . . , at (r1, m6). The movement action
move(r1, m1, m2) would be represented with precondition {at (r1, m1)}, add effect {at (r1, m2)},
and delete effect {at (r1, m1)}, indicating that when robot r1 moves from m1 to m2, it is no longer
at m1.
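The STRIPS semantics just described can be captured directly in code. The sketch below mirrors the move example above; representing propositions as strings and states as Python sets is an implementation choice for illustration, not part of the formalism.

```python
# A minimal sketch of STRIPS semantics: states are sets of propositions,
# and an action rewrites the state via its add and delete effects.

def applicable(state, pre):
    return pre <= state                   # pre(a) ⊆ s

def apply_action(state, pre, add, delete):
    assert applicable(state, pre)
    return (state - delete) | add         # (s \ del(a)) ∪ add(a)

# The move(r1, m1, m2) action from the RCLL example.
s0 = {"at(r1,m1)"}
pre = {"at(r1,m1)"}
add = {"at(r1,m2)"}
delete = {"at(r1,m1)"}

print(apply_action(s0, pre, add, delete))  # {'at(r1,m2)'}
```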


Alternatively, a finite-domain representation, also referred to as a simplified action structure (SAS), can be used to describe the planning problem (14), in which a set of variables V is used
to describe the possible states, where each variable v ∈ V is associated with a finite set of possi-
ble values—its domain, dom(v). The possible states are all assignments to V . The transitions are
specified by a set of actions A, where each action a ∈ A is associated with a precondition pre(a)
and an effect eff(a), both of which are partial assignments to V. Action a is applicable in state s if
s ⊨ pre(a)—that is, the state s is consistent with the precondition—and leads to the state s′, which
is equivalent to s except that the values of the variables in eff(a) are updated according to eff(a).
Using the same example as above, we would have a variable loc(r1) for the location of robot r1,
with domain {m1, . . . , m6}. The move(r1, m1, m2) action would have the precondition loc(r1) =
m1 and effect loc(r1) = m2. Note that there is no need to explicitly encode the fact that r1 is no
longer at m1.
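The finite-domain version of the same action is equally compact in code. Below is a sketch under the same illustrative conventions as before (string-valued variables in a Python dict); note how the effect mentions only the variables that change.

```python
# A sketch of SAS (finite-domain) semantics: a state is an assignment
# to variables, and an action overwrites only the variables in its effect.

def sas_applicable(state, pre):
    return all(state[v] == val for v, val in pre.items())

def sas_apply(state, pre, eff):
    assert sas_applicable(state, pre)
    new_state = dict(state)
    new_state.update(eff)                 # only eff's variables change
    return new_state

# The move(r1, m1, m2) action over the variable loc(r1).
state = {"loc(r1)": "m1"}
pre = {"loc(r1)": "m1"}
eff = {"loc(r1)": "m2"}
print(sas_apply(state, pre, eff))         # {'loc(r1)': 'm2'}
```

Because `loc(r1)` can hold only one value, there is no delete effect to state: assigning `m2` implicitly removes `m1`.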
Both STRIPS and SAS are grounded representations, using a flat (or propositional) representation of the world. However, planning problems could be very similar to each other—for example,
two different games of the RCLL scenario would be very similar to each other, varying only in
the specific positions of the robots and machines and in the orders. We say that these problems
come from the same domain.
The Planning Domain Definition Language (PDDL) (15) was created to facilitate comparing
the performance of different planners in the context of the International Planning Competition
(http://www.icaps-conference.org/index.php/Main/Competitions) and is used to represent
planning problems using a lifted (first-order logic) formalism. PDDL splits the definition of a
planning problem into two parts: the domain and the problem. The domain captures the parts of
the problem that are supposed to be common to all problems, while the problem file describes the
parts that are specific to each problem. PDDL is a rich language, and here we describe only its
core component; for more information, we refer the reader to a recent book dedicated to PDDL
(16).
A PDDL domain describes three things:

 The set of possible types of objects. For example, in the RCLL, the types of objects would
be robot, machine, workpiece, etc.
 The set of possible predicates, which describe the possible relations between objects of
the different types. For example, in the RCLL, the predicate at (?r, ?m) would indicate
whether robot ?r is at machine ?m. Each predicate is then grounded into several facts, e.g.,
at (r1, m1), at (r2, m1), . . . , each of which represents a specific fact that could be true or false.
 The set of schematic operators, which describe the possible actions. For example, the op-
erator move(?r, ?m1, ?m2) would describe the action of robot ?r moving from machine ?m1
to machine ?m2. Again, each operator can be grounded into several grounded actions, e.g.,
move(r3, m2, m7), indicating that robot r3 moves from machine m2 to machine m7.

While the same domain description can be used in different problems, the problem definition
describes the parts that are problem specific:

 The objects in the world. For example, in the RCLL, the objects would be robots r1–r3,
machines m1–m6, etc. Grounding predicates and operators is done with respect to the set
of objects in the problem description.
 The initial state, which is given as a list of all the true grounded propositions.
 The goal, which is given as a logical formula over the grounded propositions, describing the
desired condition goal states must satisfy.


Coming back to our example, in the PDDL domain we would declare that there are two types
of objects, robots and machines:
(:types robot machine)

A robot can be at a machine:

(:predicates
(at ?r - robot ?m - machine)
)

And the move operator is as follows:

(:action move
 :parameters (?r - robot ?m1 - machine ?m2 - machine)
 :precondition (and (at ?r ?m1))
 :effect (and (at ?r ?m2)
              (not (at ?r ?m1))))

Then, in the PDDL problem file, we would declare the objects

(:objects r1 - robot
m1 m2 m3 m4 m5 m6 - machine)
as well as the initial state and the goal, which we omit here for the sake of brevity.
There are several key differences between PDDL and STRIPS or SAS. First, unlike STRIPS
and SAS, PDDL has a standard textual syntax and can be written in plain text files. Nevertheless, grounded PDDL and STRIPS are
practically the same, as PDDL was inspired by the STRIPS formalism. This highlights a funda-
mental difference between the roles these formalisms typically serve: PDDL is used mainly as a
representation formalism and was designed to make modeling problems (relatively) easy. On the
other hand, PDDL is more difficult to reason with computationally, as it requires first-order logic
queries. Thus, most modern planners translate their PDDL input into grounded STRIPS or SAS
(e.g., 17), and the planner reasons internally using the grounded representation.
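The grounding step that planners perform can be sketched as follows: each lifted operator is instantiated over all type-compatible combinations of objects. The object names mirror the PDDL example above; real translators (e.g., 17) also prune unreachable actions, which this toy sketch omits.

```python
# A sketch of grounding the lifted move(?r, ?m1, ?m2) operator over
# the objects of a problem file, producing STRIPS-style actions.

from itertools import product

robots = ["r1"]
machines = ["m1", "m2", "m3"]

def ground_move():
    actions = []
    for r, m1, m2 in product(robots, machines, machines):
        if m1 == m2:
            continue                      # skip self-moves
        actions.append({
            "name": f"move({r},{m1},{m2})",
            "pre": {f"at({r},{m1})"},
            "add": {f"at({r},{m2})"},
            "del": {f"at({r},{m1})"},
        })
    return actions

grounded = ground_move()
print(len(grounded))                      # 1 robot x 3x2 ordered machine pairs = 6
```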

2.2. Modeling Real Robots Using Classical Planning


From the above descriptions of the classical planning formalism and the assumptions it makes
about the world, it may be unclear how classical planners can control robots that operate in the real
world. The real world is nondeterministic and only partially observable, while classical planning
assumes that actions are deterministic and that the full state of the world is known.
The key idea here is abstraction and hierarchy. A classical planner is rarely used to control a
robot’s motors directly. Rather, we typically assume the robot has a set of behaviors it knows how
to execute, where each behavior could be implemented as a policy in a lower-level Markov decision
process (MDP) (e.g., 18), by some rule-based system (e.g., 19), or in various other ways. The role
of the planner is then to sequence (or chain together) these behaviors to achieve the robot’s goal.
For example, in the RCLL scenario, we have seen that we can describe having the robot move
from some machine m1 to some machine m2 as a classical planning operator. In fact, executing
such an action (also called dispatching it) calls the robot’s navigation package, which deals with
an incredibly complex continuous problem in which the robot has to localize itself using its
sensors (odometry, lidar, camera), plan a path that avoids obstacles, execute the planned path
by controlling the wheel motors appropriately, and possibly replan on the fly to avoid moving
obstacles. Fortunately, as mentioned before, this basic robot capability is now at a stage where we
can treat it as a deterministic black box, which we can execute.

Of course, having a set of such reliable behaviors is not enough. We must also be able to abstract
the state of the world (or the belief state of the robot) into something that can be represented using
symbolic logic. In the example above, the classical planner does not need to care about the exact
(continuous) position of the robot; rather, it can plan only the movement between the different
machines. Recent work has shown that, given a set of behaviors, it is possible to learn how to
abstract the world into a symbolic description, which a classical planner can later use to plan over
these behaviors (18).
Another important assumption is that a lot of the uncertainty can be absorbed by the controller
responsible for executing each behavior. This would allow us to treat each behavior as having a
deterministic effect and sequence high-level actions using classical planning. If this is the case, then
the executive can wrap the planner in a replanning loop and call the planner whenever an action
has some unexpected outcome (e.g., 20). On the other hand, some problems are probabilistically
interesting (21), meaning that the determinize-and-replan approach described above could fail. In
such cases, a better approach would be to formulate the planning problem as stochastic planning
(22), obtaining a solution that is aware of the inherent uncertainty in the world (we expand on
stochastic planning in Section 4). For example, if there are two alternative plans, one expensive
but safe and the other potentially cheap but risky, the determinize-and-replan approach would
attempt to follow the cheap and risky option (as it is blind to the risk), while the stochastic planning
approach would trade off cost versus risk and make the choice that is better suited to the defined
utility of the agent.
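The determinize-and-replan loop described above can be sketched as follows. Everything here is a toy placeholder: the state is an integer, the "planner" emits a sequence of identical actions, and the `outcomes` list stands in for nondeterministic execution on a real robot.

```python
# A sketch of an executive wrapping a planner in a replanning loop:
# dispatch each action, monitor its outcome, and replan on failure.

def plan(state, goal):
    """Trivial stand-in planner: enough 'inc' actions to reach the goal."""
    return ["inc"] * (goal - state)

def run(state, goal, outcomes):
    """outcomes[i] says whether the i-th dispatched action succeeds."""
    current_plan = plan(state, goal)
    step = 0
    while state != goal:
        if not current_plan:
            current_plan = plan(state, goal)   # replan from current state
        current_plan.pop(0)                    # dispatch next action
        succeeded = outcomes[step] if step < len(outcomes) else True
        step += 1
        if succeeded:
            state += 1                         # expected (deterministic) effect
        else:
            current_plan = []                  # monitor detects the deviation

    return state

# The first action fails, the executive replans, and the goal is reached.
print(run(0, 3, outcomes=[False]))
```

As the surrounding text notes, this loop is blind to risk: it always follows the current plan and only reacts after a deviation is observed, which is exactly where stochastic planning becomes preferable.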

2.3. Real Example: Shakey the Robot


We conclude this section by briefly reviewing the earliest intelligent robot, which was instrumental
in advancing the state of the art in both robotics and automated planning—Shakey the robot (23).
Although Shakey was built more than 50 years ago, the basic principles for controlling robots via
classical planning still apply today.
Shakey used a four-layer software architecture. At the bottom were the low-level actions—
primitive actions that served as the interface between the software and the hardware. These actions
(almost) directly control the hardware, for example, by setting a target for a controller. Examples
include rolling forward x feet and turning θ degrees right. While these actions allow for the most
flexibility in terms of commanding the robot, planning with these low-level actions is very difficult,
as the model of the world needs to be very precise and be able to deal with continuous variables,
uncertainty, concurrency, and many other features that classical planning does not support.
One level up were the intermediate-level actions. These actions absorb much of the uncer-
tainty in the world and essentially implement a policy using the low-level actions (and possibly
some other intermediate-level actions). In Shakey, these actions were implemented by manually
specifying a policy in the form of a Markov table and had some limited ability to recover from
errors, which in turn allowed the planner to reason about these actions using a simpler model of
the world. Examples include navigating to waypoint x, y while bypassing obstacles on the way.
The STRIPS (Stanford Research Institute Problem Solver) layer, which gave rise to the
STRIPS planning problem formulation, implemented a classical planner, which was responsible
for sequencing intermediate-level actions in order to achieve the goal. Each intermediate-level
action was modeled as a STRIPS operator with preconditions and effects, allowing the planner to
predict the state of the world after a sequence of actions is applied from the current state. For the
sake of historical accuracy, it is important to note that the formulation of STRIPS described in
the context of Shakey used a theorem prover on a first-order logic representation of the problem,
which allowed for handling partial knowledge (as described in Section 4).


Finally, the PLANEX (plan execution) layer implemented the executive, which was responsible
for calling the planner when needed and monitoring the execution of the current plan. It also
contained some learning and plan reuse capabilities to potentially speed up calls to the planner
with similar planning problems.
Shakey was a groundbreaking project in its time. A recent effort to reproduce Shakey using
modern tools showed the significant advances in planning and robotics in the more than 50 years
since the Shakey project was started (24), requiring only several months of work rather than years.

3. CONTROLLING ROBOTS VIA TEMPORAL AND NUMERIC PLANNING
In this section, we describe how planning can be used in domains where the state includes numeric
variables and where actions have a duration. PDDL 2.1 (25) is the extension of PDDL for modeling temporal numeric planning. In this setting, a plan is no longer a sequence of actions; instead,
each action in the plan is associated with a starting time point and a duration. Temporal planning makes it possible to control robots that face deadlines (e.g., one task must be completed by 6 PM) or time windows (e.g., the charging station is available only between 2 PM and 4 PM), and it also makes it possible to coordinate concurrent activities.
It is important to note the difference between temporal planning and scheduling, as they might
look similar. Scheduling is the problem of deciding when to do things, under the assumption that
the things to be done (actions) are already defined. This is not the case for planning, where the
problem is to decide what to do and when to do it. The combination of numeric and temporal rea-
soning allows modeling of numeric change over time, which is particularly useful when resources
have to be managed (e.g., the battery of the robot). This modeling can be done using durative
actions such as the following:

(:durative-action charge
 :parameters (?r - robot ?wp - waypoint)
 :duration (>= ?duration 0)
 :condition (and
   (over all (charge_available_at ?wp))
   (over all (robot_at ?r ?wp))
   (over all (<= (battery_charge ?r) (battery_capacity ?r)))
   (at start (undocked ?r)))
 :effect (and
   (at start (docked ?r)) (at start (not (undocked ?r)))
   (at end (undocked ?r))
   (increase (battery_charge ?r) (* #t (charging_rate)))))

This action models the robot recharging its battery. Note that the duration for this action is
not fixed; rather, it is left to the planner to decide for how long to recharge the battery
(>= ?duration 0). In fact, the battery charge continuously increases while the action is being
executed (we discuss continuous change in more detail in Section 3.2). One of the conditions for
applying this action is that the robot must be at a location where the charging station is available for
the whole duration of the action (which is modeled using the over all condition that works as an
invariant).
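The interaction between the continuous effect and the over all invariant can be worked through numerically. The sketch below assumes an illustrative charging rate and capacity (not taken from the RCLL rules): the charge grows linearly at charging_rate, and the invariant battery_charge ≤ battery_capacity bounds the longest duration the planner may choose.

```python
# A numeric sketch of the charge action's continuous effect:
# battery_charge(t) = charge0 + charging_rate * t, with the over all
# invariant battery_charge <= battery_capacity limiting the duration.

def charge_after(charge, rate, duration):
    return charge + rate * duration       # (increase ... (* #t (charging_rate)))

def max_charge_duration(charge, capacity, rate):
    """Longest duration for which the over all invariant still holds."""
    return (capacity - charge) / rate

rate = 2.0                                # charge units per minute (assumed)
charge0, capacity = 4.0, 10.0

d = max_charge_duration(charge0, capacity, rate)
print(d)                                  # 3.0 minutes
print(charge_after(charge0, rate, d))     # 10.0, exactly at capacity
```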
Durative actions can be used nicely with time windows to model scenarios in which some
activities must be performed within specific time bounds. For example, let us assume there is a


charging station at location wp19, but it is available for use only between 2 PM and 5 PM. It is
then possible to specify this constraint in the initial state using timed initial literals as follows:

(at 14.00 (charge_available_at wp19))
(at 17.00 (not (charge_available_at wp19)))

Note that, because the durative action charge contains an over all condition requiring
charge_available_at wp19 to be true for the duration of the action, the planner is forced to
put the charge action in a valid time frame. For example:

14.00: (goto HST wp4 wp19) [0.34]
14.35: (charge HST wp19) [1.20]

The temporal track of the International Planning Competition (https://ipc2018-temporal.bitbucket.io) is a good reference point for examples of temporal PDDL domains as well as available temporal planners.
Timeline-based planning (e.g., see 26–28) is an alternative way of modeling temporal problems
in which the core idea is the use of timelines. A timeline represents the evolution of a state variable
over time, and constraints over the timeline are used to model actions as well as initial and goal
conditions. Timeline-based planning is particularly well suited to problems that have a strong
scheduling component. There are several languages for modeling problems using timelines (e.g.,
see 29, 30). In particular, the Action Notation Modeling Language (ANML) was designed to create
a unifying modeling language for timeline-based planning (31).

3.1. Executing Temporal Plans for Robots


Given a temporal plan that describes what the robot is supposed to do and when, the purpose of
plan execution is to execute the behaviors of the robot as prescribed by the plan (illustrated in
Figure 1b). However, in robotics, the execution of a behavior is often a process whose duration
can only be observed, and not controlled directly. For example, in the RCLL domain, when the
robot navigates from one machine to another, many factors will affect the duration of the travel,
including the exact initial position of the robot, the power available to the motors, and the presence
of the other team’s robots. To execute the plan successfully or to understand when a deviation has
invalidated the plan, the executor must be able to reason over the temporal constraints of the plan
during execution. Often this deviation from the model is small enough that minor adjustments
to the timing of actions are sufficient to continue execution. The issue treated by temporal plan
execution is to determine when to dispatch each action for execution and to understand when
the deviation of observed durations from those estimated in the model has invalidated the plan.
Recently, satisfiability modulo theories (SMT) techniques have been proposed to create robust
envelopes around plans, checking whether the plans remain valid despite delays in action
execution (32).
The interaction between planning and plan execution allows for many different partitions
of reasoning between the planner and executive levels in an agent’s architecture (Figure 1b).
Whereas PLANEX directly mapped planned actions to behaviors in the Shakey robot, temporal
plan executors perform reasoning over the timing of actions, first refining the plan into a form
that explicitly models the temporal constraints that the executive must reason over to execute the
plan, often as automata [e.g., as in the Plan Execution Interchange Language (PLEXIL) (33) and
SMACH (34)] or as constraints [e.g., the Reactive Model-Based Programming Language (RMPL)
(35)].
In the latter case, temporal plans can be generated in the form of, or can be postprocessed
into, a network of temporal constraints—that is, a simple temporal network (STN) (36) or simple
temporal network with uncertainty (STNU) (37, 38).


Definition 1 (simple temporal network, from Reference 36). An STN is a tuple ⟨T, C⟩,
where T is a finite set of time points and C is a finite set of constraints of the form t1 − t2 ≤ k,
with t1, t2 ∈ T and k ∈ R.

An STN plan compactly represents many (often infinitely many) valid executions of the plan,
encoded by the possible satisfying assignments (called consistent assignments) of the STN.
Definition 2 (consistency, from Reference 36). An STN is consistent if there exists an
assignment μ : T → R such that each constraint c ∈ C is satisfied by substituting each time
point t ∈ T appearing in c with μ(t) in c. Such an assignment μ is called a consistent assignment.
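A consistency check follows directly from Definition 2: build the distance graph of the STN (each constraint t1 − t2 ≤ k becomes an edge from t2 to t1 with weight k) and test for negative cycles. The following Python sketch is illustrative; the function name and constraint encoding are our own, not from the literature:

```python
import math

def stn_consistent(time_points, constraints):
    """Check STN consistency via Floyd-Warshall negative-cycle detection.

    constraints is a list of (t1, t2, k) triples, each encoding t1 - t2 <= k.
    """
    # Distance graph: constraint t1 - t2 <= k becomes edge t2 -> t1 of weight k.
    dist = {a: {b: 0.0 if a == b else math.inf for b in time_points}
            for a in time_points}
    for t1, t2, k in constraints:
        dist[t2][t1] = min(dist[t2][t1], k)
    # All-pairs shortest paths; a negative self-distance reveals a negative cycle.
    for m in time_points:
        for a in time_points:
            for b in time_points:
                if dist[a][m] + dist[m][b] < dist[a][b]:
                    dist[a][b] = dist[a][m] + dist[m][b]
    return all(dist[t][t] >= 0 for t in time_points)

# An action whose end must follow its start by 5 to 10 time units: consistent.
ok = stn_consistent(["start", "end"],
                    [("end", "start", 10), ("start", "end", -5)])
# Requiring the same action to both finish within 3 and take at least 5: not.
bad = stn_consistent(["start", "end"],
                     [("end", "start", 3), ("start", "end", -5)])
```

Polynomial-time checks like this are what make STNs attractive as a plan representation for executors.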

To use STNs to encode temporal plans, time points are used to represent events, such as the
start and end of an action, while the temporal constraints encode the minimum and maximum
time that can pass between the events. The executor can use this structure to adjust the timing
of actions within these bounds and allow some flexibility in the execution. Planners such as the
Extensible Universal Remote Operations Planning Architecture (EUROPA) (39), IxTeT (26), and
the Advanced Planning and Scheduling Initiative (APSI) planner (40) can directly produce plans
in STN form.
One key underlying assumption of the STN framework is that the executor can freely choose
among all the time points to satisfy the constraints. Sometimes, however, the executor is simply
unable to control the duration of an activity or the execution time of an event. In those cases, the
executor can only observe some of the events while deciding when to schedule the others in order
to successfully execute the plan. These situations are covered by using an STNU, an extension of
the STN formalism to represent and reason about temporal knowledge in the presence of tempo-
ral uncertainty. Analogously to STNs, all temporal constraints in an STNU are binary and convex.
Definition 3 (simple temporal network with uncertainty, from Reference 41). An
STNU is a tuple ⟨T, C, L⟩, where (a) T is a set of real-valued variables, called time points,
that is partitioned into the sets Tc and Tu of controllable and uncontrollable time points, re-
spectively; (b) C is a set of binary constraints, each of the form t1 − t2 ≤ k, for some t1, t2 ∈ T
and k ∈ R; and (c) L is a set of contingent links, each of the form ⟨b, l, u, e⟩, where b ∈ Tc,
e ∈ Tu, and l, u ∈ R with 0 < l < u.

A contingent link ⟨b, l, u, e⟩ represents a temporal interval from b to e whose duration is uncon-
trollable but bounded by e − b ∈ [l, u]; b is called the activation time point of the uncontrollable
time point e.
In contrast to an STN, the notion of consistency does not apply in an STNU, as the uncon-
trollable time points cannot be scheduled by the executive. Three different kinds of controllability
have been presented in the literature: strong, weak, and dynamic (37).
An STNU is strongly controllable if there is a solution that consists of a fixed, unconditioned,
nonreactive assignment to each controllable time point that is guaranteed to satisfy the constraints
in the network regardless of how the uncontrollable time points turn out while respecting the
contingent links. Such a solution corresponds to a time-triggered plan, in which activities are
started at fixed times that are determined in advance of execution. Intuitively, strong controllability
captures the blind executor—that is, the executor is not allowed to depend on the observed timings
in any way (42).
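Strong controllability can be checked by rewriting every constraint that mentions an uncontrollable time point against the worst-case duration of its contingent link, and then testing the resulting ordinary STN for consistency (in the spirit of the reduction by Vidal & Fargier). The sketch below handles only the common case in which each constraint involves at most one uncontrollable time point; all names are ours:

```python
import math

def consistent(points, cons):
    # Floyd-Warshall negative-cycle check on the STN's distance graph,
    # where each (t1, t2, k) encodes t1 - t2 <= k.
    d = {a: {b: 0.0 if a == b else math.inf for b in points} for a in points}
    for t1, t2, k in cons:
        d[t2][t1] = min(d[t2][t1], k)
    for m in points:
        for a in points:
            for b in points:
                d[a][b] = min(d[a][b], d[a][m] + d[m][b])
    return all(d[t][t] >= 0 for t in points)

def strongly_controllable(controllable, links, cons):
    """links maps each uncontrollable point e to (b, l, u) for <b, l, u, e>.
    cons holds (t1, t2, k) triples encoding t1 - t2 <= k, each mentioning
    at most one uncontrollable point.  Returns True iff one fixed schedule
    of the controllable points works for every contingent outcome."""
    reduced = []
    for t1, t2, k in cons:
        if t1 in links:            # e - t <= k must hold even when e = b + u
            b, l, u = links[t1]
            reduced.append((b, t2, k - u))
        elif t2 in links:          # t - e <= k must hold even when e = b + l
            b, l, u = links[t2]
            reduced.append((t1, b, k + l))
        else:
            reduced.append((t1, t2, k))
    return consistent(controllable, reduced)

# A drive started at b finishes at e after 1 to 3 time units.  A deadline of
# 4 after the reference point z is strongly controllable; a deadline of 2 is not.
ok = strongly_controllable(["z", "b"], {"e": ("b", 1, 3)},
                           [("e", "z", 4), ("z", "b", 0)])
bad = strongly_controllable(["z", "b"], {"e": ("b", 1, 3)},
                            [("e", "z", 2), ("z", "b", 0)])
```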
An STNU is weakly controllable if there is a solution strategy that assigns values to the con-
trollable time points as a function of the uncontrollable time points. Although the values for the
uncontrollable time points need not be known when solving the problem, this version of control-
lability presumes that all such information is provided to the executor in advance of execution. In
some sense, we can see a weak strategy as a clairvoyant program that schedules the controllable


time points given a forecast of the uncontrollable outcome. In weak controllability, the executor
is informed a priori of the duration of all the activities, and it can depend on such prediction.
Finally, a network is dynamically controllable if it has a dynamic execution strategy that can
react to uncontrollable time-point observations, but only those that have occurred in the past. In
other words, the values that the execution strategy assigns to the controllable time points may de-
pend on uncontrollable events, but only if that information has already been observed in real time;
it cannot depend on advance knowledge of future uncontrollable events. Typically, the execution
strategy must be able to deal with the branching that derives from delays and may interleave the
start times of activities with the observation of uncontrollable time points. In dynamic controlla-
bility, the executor can depend on past observation but has no access to any prediction. Micheli
(42) provided a good overview of planning in the context of temporal networks.
Building on this work in controllability, Morris & Muscettola (43) and others described effi-
cient algorithms for the execution of temporal plans represented as temporal networks. By exe-
cuting a temporal plan in the form of a constraint satisfaction problem, the executor may follow a
least commitment approach, in which decisions can be delayed until a commitment must be made.
One important point is the different communication requirements between using strong and
dynamic controllability. If the STNU represents a joint plan for multiple robots, then using strong
controllability requires no communication whatsoever, since each robot executes the agreed-upon
schedule. By contrast, using dynamic controllability for execution requires the robots to commu-
nicate when certain events occur, as the dynamic schedule depends on this information. Some
work (e.g., 44) has also looked at a distributed version of this issue, where robots must explicitly
decide what information to share, although the details are beyond the scope of this article.

3.2. Planning with Continuous Variables


Propositional temporal planning deals with one continuous variable, time; all other variables are
discrete. For example, a set of robots navigating around a known road map can use propositional
temporal planning to reason about deadlines, coordination, etc. As we have seen above, time can
be handled in a special way, by encoding temporal constraints as temporal networks.
However, sometimes the need to model a problem with continuous variables arises. For exam-
ple, if we want to reason about a mobile robot’s battery level, we can discretize battery levels and
use propositional temporal planning. However, this makes the problem cumbersome, as we need
to encode numeric effects with conditional statements (for example, if the battery level was 50%,
and the robot moved for a duration of 2 min, the battery level is now 40%; if the battery level was
40%, and the robot moved for 4 min, the battery level is now 20%; etc.). A more natural encoding
would use a numeric variable to encode the robot’s battery level and describe the effect that the
battery level decreases at a rate of 5% per minute when the robot is driving.
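A validator for the continuous encoding simply integrates the (constant) rate over each action's duration, instead of enumerating discretized cases. A small illustrative sketch (the timeline format is our own, not part of any PDDL tool):

```python
def battery_level(initial, timeline):
    """timeline: list of (rate_per_min, duration_min) segments, e.g. a
    drive action draining the battery at 5% per minute.

    Returns the battery level after all segments, assuming each segment
    has a constant rate of change (a piecewise-linear profile)."""
    level = initial
    for rate, duration in timeline:
        level += rate * duration  # constant rate: the effect is linear in time
    return level

# Driving for 2 min at -5%/min takes the battery from 50% to 40%, matching
# the discretized rule above, but the same function covers every initial
# level and duration without extra conditional statements.
after = battery_level(50.0, [(-5.0, 2.0)])
```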
Examining the use of numeric variables more closely reveals that there are several ways to use
them. First, it is possible to use instantaneous effects on numeric variables. For example, turning
on an instrument that consumes 100 W of power reduces the available power by 100 W. In PDDL,
this would be encoded as the following effect:

(decrease (available-power) 100)

Second, it is possible to use continuous numeric effects for effects that occur over time. For
example, turning on the same instrument would reduce the energy stored in the battery at a rate
of 100 Wh per hour, and this would continue as long as the instrument is still powered. In PDDL,
this would be encoded as

(decrease (available-energy) (* 100 #t))


where #t stands for the differential of t, δt. Continuous effects have the indirect effect of intro-
ducing a coupling between time and state, as the value of a state variable (for example, available
energy) depends on the time the action using the instrument ends. Note that this also applies to
the durative action charge in the example given at the start of Section 3.
Temporal planners that support continuous effects, such as COLIN (45), replace the STN used
to check for temporal consistency with a linear program that encodes both temporal constraints
and constraints on the numeric variables. Other approaches to temporal planning with continuous
change use discretization (e.g., 46–48) or satisfiability modulo theories (e.g., 49, 50).
In the above formulation, the rate of change of the continuous effects is fixed. It is also possible
to formulate planning problems in which the rate of change is controllable by a control variable.
For example, if we have a continuous effect updating the position of a mobile robot, then con-
trolling the velocity of the robot controls the rate of change. The Scotty planner (51) solves a
second-order cone program to check the consistency of a candidate partial plan.


A different type of problem formulation allows for numeric action parameters. For example, an
action that withdraws money from an ATM could have a numeric control parameter that controls
how much money to withdraw. Note that this is different from controllable rates of change, as
the control parameters could be used in instantaneous effects. The Partial-Order Planning with
Constrained Real Numerics (POPCORN) planner (52) solves such problems, using mixed-integer
programming to check for consistency.
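For a single control parameter with only linear bounds, the consistency check reduces to intersecting intervals (a full planner such as POPCORN solves a mixed-integer program over all plan steps; the sketch below covers only this simplest case, and the constraint values are hypothetical):

```python
def feasible_interval(constraints):
    """Each constraint is ('>=', k) or ('<=', k) on one numeric control
    parameter.  Returns the (lo, hi) interval of values satisfying all of
    them, or None if the constraints are inconsistent."""
    lo, hi = float("-inf"), float("inf")
    for op, k in constraints:
        if op == ">=":
            lo = max(lo, k)   # tighten the lower bound
        else:
            hi = min(hi, k)   # tighten the upper bound
    return (lo, hi) if lo <= hi else None

# withdraw ?amount: a later purchase needs at least 30, the account balance
# caps it at 120, and the ATM's per-transaction limit is 100.
interval = feasible_interval([(">=", 30), ("<=", 120), ("<=", 100)])
```

Any value in the returned interval can then be bound to the action's control parameter in the final plan.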
Finally, combined task and motion planning deals with planning with continuous robot
motions, mostly in the context of manipulation planning. These problems deal with continuous
variables that represent robot configurations, obstacle locations and geometries, and more. The
need to combine collision checking in high-dimensional configuration spaces with planning adds
another layer of complication to these problems. Several powerful frameworks for combined
task and motion planning have been developed in the robotics community, such as the Open
Motion Planning Library from the Kavraki laboratory (53), the Task-Motion Kit (TMKit) from
Dantam et al. (54, 55), and Combined Task and Motion Planning (CTAMP) from McMahon &
Plaku (56). Note that while most planners that address combined task and motion planning are
sequential in nature (e.g., 57–62), a few combine temporal planning with task and motion planning
(63).
Describing task and motion planning problems requires a richer modeling language that can
describe both the symbolic level and the geometric level of the problem. Such a language, together
with a platform-independent evaluation method for task and motion planning, has been proposed
by Lagriffoul et al. (64). As a comprehensive survey of task and motion planning is beyond the
scope of this article, we refer the reader to the Robotics: Science and Systems (RSS) Workshop
on Task and Motion Planning (http://dyalab.mines.edu/2019/rss-workshop), which is home
to many publications on this topic.
While it is tempting to use the most expressive planning formalism available to model the
problem at hand, there are some caveats. Propositional temporal planning is already more com-
putationally difficult than classical planning (65), being EXPSPACE complete (recall that classical
planning is PSPACE complete). Introducing numeric variables makes the planning problem un-
decidable in general (66).
Nevertheless, there is some hope, as some fragments of temporal planning are only PSPACE
complete (65), and some fragments of numeric planning are decidable (66). Furthermore, even
temporal planning problems that are not in PSPACE can be solved by compiling them into a
series of classical planning problems (67). Thus, it is important to understand the implications
that the choices made in modeling the problem have both on the fidelity of the model to reality
and on the ability to use the model to usefully plan within the time constraints.


For example, take a mobile robot driving in a factory, as in the RCLL scenario. If the robot
only ever needs to drive around among a fixed set of locations (in this case, the machines), then it is
possible to encode the robot’s position on a road map and use propositional temporal planning to
solve the planning problem. Temporal uncertainty can be used to encapsulate the uncertainty in
the driving time that results from the presence of the other team’s robots. However, if we give the
robots a new ability to hand a workpiece to another robot, and we assume that this handoff can oc-
cur anywhere, then we can no longer use a road map to encode the robots’ positions and must use
continuous variables to keep track of each robot’s position. In this case, if the robots have a fixed
velocity, we can use continuous linear effects, while if we want to control the robots’ velocities, we
can use a formulation with control variables. Finally, if we also need to reason about the exact place-
ment of the workpieces on the machines, combined task and motion planning would be in order.
4. PLANNING WITH UNCERTAINTY


Until now, we have discussed only deterministic problems with full information. In this section,
we discuss how to relax these assumptions to deal with two issues that come up in the real world,
especially with robots: nondeterministic dynamics (e.g., actions can fail) and partial observability.
While such planning problems can often be formalized as an MDP (68) or a partially observable
MDP (POMDP) (69), techniques for solving these models often do not scale to solve large prob-
lems. Below, we review some techniques for using deterministic, full-information planners to plan
in uncertain, partially observable worlds.
The first case we address, which is also the easiest to handle in terms of planning, occurs when
the controller responsible for executing an action can absorb the uncertainty about the outcome
of the action, possibly taking longer to achieve this outcome. This occurs, for example, with a
mobile robot that can navigate and avoid obstacles and is guaranteed to reach its goal. In this case,
we can encode each action as having an uncontrollable duration and use deterministic temporal
planning with uncertainty about action durations (an STNU) to solve the planning problem (70).
If the controller cannot guarantee the outcome of the action with certainty but is still very re-
liable, then a simple determinize-and-replan approach often works well (20). The key idea behind
this approach is to look at the expected (or most likely) outcome of each action and treat the action
as a deterministic action with that outcome as its effect. The executive then uses a deterministic
planner but constantly monitors plan execution. If an unexpected outcome occurs for one of the
actions in the plan, then the executive triggers replanning from the current state.
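The determinize-and-replan loop can be sketched in a few lines: plan against the expected outcomes, monitor execution, and replan from the current state on any deviation. A toy illustration, where the planner is a trivial breadth-first search over the determinized model and all names are ours:

```python
import random
from collections import deque

def bfs_plan(state, goal, actions, model):
    """Deterministic planner over the expected-outcome (determinized) model."""
    frontier, seen = deque([(state, [])]), {state}
    while frontier:
        s, plan = frontier.popleft()
        if s == goal:
            return plan
        for a in actions:
            nxt = model(s, a)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [a]))
    return None

def execute(state, goal, actions, model, world, rng):
    """Plan, monitor each action's outcome, and replan on any deviation."""
    while state != goal:
        plan = bfs_plan(state, goal, actions, model)
        for a in plan:
            expected = model(state, a)
            state = world(state, a, rng)   # the real (stochastic) outcome
            if state != expected:
                break                      # unexpected outcome: replan from here
    return state

# Toy 1-D world: 'move' advances one cell but fails (stays put) 30% of the time.
model = lambda s, a: s + 1
world = lambda s, a, rng: s + 1 if rng.random() > 0.3 else s
final = execute(0, 5, ["move"], model, world, random.Random(7))
```

Because every failure simply triggers replanning from the observed state, the loop reaches the goal despite the unreliable action, which is exactly the regime where this approach works well.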
The above approach works when the inherent uncertainty is not probabilistically interesting
(21), meaning that any unexpected outcome can be recovered from. However, in some problems,
acting intelligently means taking risk (in the form of unexpected outcomes) into account when
choosing the course of action. For example, equipping a car with a spare tire allows it to deal with
the unexpected outcome of getting a flat tire—something that the above approach would never
do. The question then arises of how pessimistic the planner should be. One interesting solution
is to set a threshold k on the number of faults (unexpected outcomes) we would like the planner
to plan for and then use k-fault tolerant planning (e.g., 71).
While all of the above approaches assume each action has an expected (usually good) outcome
and possibly several other unexpected (bad) outcomes, this is not always the case. For example,
flipping a coin has two equally likely outcomes. When faced with such problems, the more gen-
eral approach is to use a probabilistic planning formulation, in which we encode all the possible
outcomes of each action, each with its probability. Probabilistic PDDL (PPDDL) (72) is a variant
of PDDL that adds probabilistic effects (as well as rewards and uncertainty about the initial state).
For example, if we want to encode a probabilistic effect of a robot r driving from location x to


location y, where it will reach y with 70% probability and remain at x with 30% probability, then
the effect of that action would be encoded as follows:
(probabilistic 0.7 (at ?r ?y) 0.3 (at ?r ?x))
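When simulating such a model (for example, inside a Monte Carlo planner), the executive samples one outcome per action application according to these probabilities. A minimal sketch mirroring the PPDDL construct above (the predicate strings are purely illustrative):

```python
import random

def sample_effect(outcomes, rng):
    """outcomes: list of (probability, effect) pairs with probabilities
    summing to 1, mirroring PPDDL's (probabilistic p1 e1 p2 e2 ...)."""
    r, acc = rng.random(), 0.0
    for p, effect in outcomes:
        acc += p
        if r < acc:
            return effect
    return outcomes[-1][1]  # guard against floating-point round-off

# The drive action from the text: reach y with 70%, stay at x with 30%.
drive = [(0.7, "(at r y)"), (0.3, "(at r x)")]
rng = random.Random(0)
counts = {"(at r y)": 0, "(at r x)": 0}
for _ in range(10000):
    counts[sample_effect(drive, rng)] += 1
```

Over many samples the empirical frequencies approach the declared probabilities, which is what sampling-based probabilistic planners rely on.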

While PPDDL is similar to PDDL and thus easier for users familiar with PDDL to learn, there are
several types of planning problems that are difficult to model with it, specifically those that feature
both concurrency (multiple actions can be executed in the same time step, e.g., when controlling
multiple robots) and uncertainty. In such cases, the Relational Dynamic Influence Diagram Lan-
guage (RDDL) (73) might be better suited to encode the planning problem at hand.
Regardless of the choice of description language, when solving probabilistic planning problems
we must use probabilistic planners. While there has been much progress in probabilistic planners
in recent years (e.g., 22), probabilistic planning is still inherently more difficult than deterministic
planning, as the solution must now account for multiple possible outcomes for each action.
So far, we have discussed actions with nondeterministic effects. However, sometimes uncer-
tainty takes a different form. One such case occurs when the actions are deterministic but goals
arrive dynamically, online. An intelligent robot would anticipate these future goals and possibly
take some action to be prepared. For example, in the RCLL scenario, we know future orders will
arrive, so it is usually a good idea to stock raw material, which will likely be used later when the
order becomes known. Anticipatory planning (74) uses a distribution on future goals to take ac-
tions in anticipation of goals that are likely to arrive. The key idea here is to use a deterministic
planner to solve several samples of future goals and then identify the best immediate action based
on its utility in each of the resulting plans. This process is then repeated to select the next action,
and so on.
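The sampling scheme behind anticipatory planning can be sketched directly: draw several plausible future goals, solve each determinized problem, and rank candidate first actions by how often they appear across the resulting plans. An illustrative sketch with a trivial canned planner (the domain, goal names, and scoring rule are all ours):

```python
import random

def anticipate(state, goal_sampler, plan_fn, n_samples, rng):
    """Score each candidate first action by how often it leads off a plan
    for a sampled future goal, and return the best-scoring one."""
    scores = {}
    for _ in range(n_samples):
        goal = goal_sampler(rng)
        plan = plan_fn(state, goal)
        if plan:                        # credit this plan's first action
            scores[plan[0]] = scores.get(plan[0], 0) + 1
    return max(scores, key=scores.get)

# Toy RCLL-style domain: a future order needs raw material 80% of the time,
# so stocking up now is usually the best immediate action.
def goal_sampler(rng):
    return "fulfil_order" if rng.random() < 0.8 else "idle_maintenance"

def plan_fn(state, goal):
    if goal == "fulfil_order":
        return ["stock_raw_material", "process", "deliver"]
    return ["run_diagnostics"]

best = anticipate("init", goal_sampler, plan_fn, 100, random.Random(1))
```

A real implementation would weight first actions by plan utility rather than a simple vote, but the sample-plan-aggregate structure is the same.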
Another type of uncertainty occurs when we have partial knowledge about which objects exist
in the world. Many planners make the closed-world assumption—that is, the assumption that all
objects in the world are known in advance—which rarely holds in the real world. Nevertheless,
in some cases, even though the closed-world assumption is violated, it is still possible to use a
closed-world planner. In our RCLL scenario, for example, it is possible to formulate an open-
world quantified goal (75) that states that all open orders must be fulfilled. Note that this goal
does not require knowing which orders are currently open (as orders will arrive online), but for
any given set of orders, we can use a closed-world planner to synthesize a sequence of actions.
Both open-world quantified goals and anticipatory planning deal with specific types of uncer-
tainty due to partial knowledge. Generally, planning with partial knowledge (or partial observabil-
ity) is hard (76). Nevertheless, it is sometimes possible to solve a classical planning problem that
compiles away some of the uncertainty. The key idea here is to use a limited subset of epistemic
modal logic (77) to represent that the robot knows some fact p to be true (represented as the fact
K p), that the robot knows some fact p to be false (represented as the fact K ¬p), or that the robot
does not know whether p is true or false (in which case neither K p nor K ¬p is true).
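This three-valued knowledge state is easy to represent, with sensing actions moving facts from "unknown" to known-true or known-false. A minimal sketch (the encoding is our own, not a standard one):

```python
def knows(kb, p):
    """K p: the robot knows p is true."""
    return kb.get(p) is True

def knows_not(kb, p):
    """K not-p: the robot knows p is false."""
    return kb.get(p) is False

def unknown(kb, p):
    """Neither K p nor K not-p holds."""
    return p not in kb

def sense(kb, p, observed_value):
    """A sensing action resolves the robot's uncertainty about p."""
    new_kb = dict(kb)
    new_kb[p] = observed_value
    return new_kb

kb = {"door_open": True}                     # K door_open; beer status unknown
kb2 = sense(kb, "beer_in_fridge", False)     # sensing yields K not-beer_in_fridge
```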
In conformant planning, we are looking for a single sequence of actions that is guaranteed to
achieve the goal (78). In contingent planning, we are looking for a policy (usually represented
as a tree that accounts for the possible outcomes of each action) that is guaranteed to achieve
the goal (79). However, this policy tree could be very large, and only one branch of it will be
needed (although it is impossible to know offline which one). Thus, online planners (e.g., 80),
which choose the next action to perform without synthesizing a complete policy, are often useful
in practice.
The uncertainty we have addressed so far was due to nature and thus can be assumed to be
random. However, robots that operate in the presence of other agents (humans or other robots)
must face uncertainty that comes from the strategic behavior of the other agents, which can no


longer be treated as random. In multirobot problems, the relationships among the robots are very
important.
First, the robots could be cooperative (that is, share the same goal). If the robots are controlled
centrally, then this is just single-agent planning. However, if the robots have decentralized
control, we can use models such as decentralized POMDP (81), which allows us to plan not only
regular actions and sensing actions but also communication actions. While such models are highly
intractable, the use of carefully designed macro actions allows us to solve useful decentralized
planning problems for real robots (82). Alternatively, epistemic logic could also be employed for
multiagent planning, so that each agent reasons about the other agents’ state of knowledge (83).
Second, the robots could be adversarial and have competing objectives. For example, this occurs
in the RoboCup Soccer League (84), in which two teams of robots play a game of soccer. In such
cases, the problem can be formulated as a partially observable stochastic game. While methods
for solving such models exist (85), their practicality in general is still limited.
Finally, the robots could be noncooperative but not adversarial—that is, each robot has its own
goal, but the goals do not compete with each other. In this case, we are faced with the problem of
coordinating multiple robots to ensure that they do not interfere with each other. In such cases,
it is possible to plan for each agent individually using a nondeterministic single-agent planning
formalism by treating the other agents’ actions as uncertainty caused by nature (86). Alternatively,
a hierarchical approach could be used that plans for each robot and then combines the plans
while reasoning over their causal and temporal dependencies (87). A different approach calls for
instituting a social law that prohibits some actions to ensure that each agent can use a deterministic
single-agent planner to choose a plan, and any combination of plans will be guaranteed to work
together (88).
We conclude this section with an example of a real use case where planning with uncer-
tainty is used: the Spatio-Temporal Representation and Activities for Cognitive Control in Long-
Term Scenarios (STRANDS) project (http://strands.acin.tuwien.ac.at). This project developed
a framework for planning under uncertainty for mobile robots (89) that generates policies that
meet specifications given in cosafe linear temporal logic over MDP models of mobile robot be-
havior. The MDPs are based on a discretization of the environment based on a topological map.
As it navigates, the robot gathers data on the failure or success of navigation through an edge
and the time taken to do so. These data are used to build probabilistic edge models of navigation
success and corresponding durations, which allows for the construction of an MDP representing
robot navigation and task execution, where the probabilistic outcomes are learned from experi-
ence. The objective is then posed as finding policies that (a) maximize the probability of satisfying
a cosafe linear temporal logic specification, (b) achieve as much of the specification as possible if
it becomes unsatisfiable during execution, and (c) minimize the expected time to achieve the first
two goals. This approach has been used for long-term deployments of mobile robots (90).
For other excellent examples of the successful use of planning for mobile robots, we refer the
reader to the work done by Manuela Veloso and her team (http://www.cs.cmu.edu/~mmv).

5. ENCODING CONTROL KNOWLEDGE


The general philosophy behind domain-independent planning is to encode “physics, not advice”
(91, p. 37)—that is, to encode all possible actions and let the planner choose what to do. However,
in the context of planning for robots, this is not always feasible or advisable. Given that physical
commands to the hardware set some voltage at some output port, it is clear that we want to model
the problem at a higher level of abstraction, as discussed above. However, in this section, we discuss
going beyond modeling the problem in an abstract way to enrich the planner with knowledge
about how things should be done.

There are two different reasons we may want to tell the planner how things should be done.
First and foremost, sometimes modeling the dynamics of the planning problem accurately is
cumbersome and difficult, especially considering that we must also do this while abstracting
the problem. For example, consider a home service robot tasked with getting a beer out of the
refrigerator. We may want to tell our robot to always close the refrigerator door after getting the
beer, which is arguably easier than encoding the laws of thermodynamics that can be used to prove
that the food in the refrigerator will get warmer if the door is left open, as well as some biological
and chemical knowledge that can be used to prove that food at room temperature will spoil faster.
Second, we may have some additional knowledge about how to solve the planning problem
more quickly or more efficiently (that is, advice). However, following the “physics, not advice”
maxim, we should take care to understand the implications of using this control knowledge—can
we guarantee that the optimal solution will remain reachable? Can we guarantee that using the
control knowledge will really speed up planning? Thus, we argue that while the first reason for
using control knowledge is valid, especially in real-world scenarios, where encoding the laws of
nature might be very difficult, we should attempt to encode control knowledge to speed up the
planner only when planning speed is not adequate for the task at hand—and even so, we should do
so carefully. In the remainder of this section, we review some formalisms that allow us to encode
control knowledge.
We begin by describing hierarchical task networks (HTNs), a formalism that has been used in
many practical planning tasks (e.g., 92). In HTN planning, a problem is described by a set of tasks.
Some of these tasks are primitive—that is, they are actions that can be directly executed by the
robot. Nonprimitive tasks can be decomposed into task networks using methods. For example, a
task could be “get beer from refrigerator,” which can be decomposed into the sequence of tasks
“go to refrigerator,” “open refrigerator door,” “get beer,” “close refrigerator door,” “go back to
human,” and “deliver beer.” Depending on the robot’s capabilities, each of these tasks could be
further decomposed until some primitive task can be executed.
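The recursive decomposition just described is straightforward to sketch: keep expanding nonprimitive tasks via their methods until only primitive tasks remain. A minimal illustration using the beer-fetching example (the method table maps each nonprimitive task to a single ordered method; a real HTN planner would search over alternative methods, and all task names here are ours):

```python
def decompose(task, methods):
    """Recursively expand a task into a sequence of primitive tasks."""
    if task not in methods:
        return [task]                       # primitive: directly executable
    plan = []
    for subtask in methods[task]:
        plan.extend(decompose(subtask, methods))
    return plan

methods = {
    "get_beer_from_fridge": ["go_to_fridge", "open_fridge_door", "get_beer",
                             "close_fridge_door", "go_back_to_human",
                             "deliver_beer"],
    "go_to_fridge": ["navigate_to(kitchen)", "navigate_to(fridge)"],
}
plan = decompose("get_beer_from_fridge", methods)
```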
Although it may seem that HTN planning is easier, it is actually more complex than
classical planning (93) and in the worst case could even be undecidable. However, recall that worst-
case complexity is rarely the right measure for choosing a planner, and furthermore, under some
restrictions, HTN planning can be only PSPACE complete. We further remark that, under some
restrictions, HTN planning can be translated to PDDL and solved using a classical planner (94).
Another formalism that allows us to encode control knowledge is RMPL (35). RMPL does not
rely on a hierarchical structure of decomposing tasks into lower-level tasks, but rather allows the
user to write a control program. A control program can be viewed as a partially elaborated plan,
where in some parts of the plan the user specifies exactly what the planner must do, while in other
parts the user specifies only some subgoal. The planner must then complete the partially specified
parts of the plan in a way that leaves the entire plan consistent.
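The flavor of a control program can be sketched as follows. This is only an illustration of the partially-elaborated-plan idea, not RMPL syntax; the planner stub and all action and subgoal names are hypothetical:

```python
# A control program mixes fully specified steps with subgoals the planner
# must achieve; elaboration fills the holes into a complete plan.

def toy_planner(subgoal, state):
    """Stand-in for a planner: returns actions achieving the subgoal."""
    # Hypothetical domain knowledge: one action per achievable subgoal.
    achievers = {"door_open": ["open_door"], "holding_beer": ["grasp_beer"]}
    return achievers[subgoal]

control_program = [
    ("do", "goto_fridge"),        # the user says exactly what to do
    ("achieve", "door_open"),     # the user only names a subgoal
    ("achieve", "holding_beer"),
    ("do", "goto_human"),
]

def elaborate(program, state=None):
    """Complete the partially specified plan into a full action sequence."""
    plan = []
    for kind, arg in program:
        plan.extend([arg] if kind == "do" else toy_planner(arg, state))
    return plan
```

A real system must also check that the completed plan is globally consistent, rather than filling each hole independently as this sketch does.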

6. SUMMARY: HOW DO I MAKE MY ROBOT SMART?


We conclude this article with a brief guide aimed at roboticists who want to endow their robots
with the ability to plan without needing to become planning experts themselves. Thus, we focus on
how to choose the right tool for the task at hand out of the many tools developed by the planning
community over the years.
The first (and most difficult) step is to model the robot’s task as a planning problem, which
is even more difficult because we must also choose the right planning formalism to model the
problem in. We have found that it is usually best to start by listing the possible actions the robot
knows how to execute—that is, which low-level behaviors are implemented in the robot and should
be sequenced in order to achieve the goal. At this stage, having a list of action names is enough,
possibly already with some of the action parameters. For example, in our RCLL scenario, the
list of actions includes robots moving from machine to machine, robots loading workpieces onto
machines, robots picking up workpieces from machines, and machine processing operations.
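As a sketch of this first step for the RCLL scenario, the action list might simply be recorded as names with parameters; the names below are our own shorthand, not an official RCLL model:

```python
# First modeling step: name the robot's executable actions and their
# parameters, without yet committing to preconditions or effects.

ACTIONS = {
    # action name: ordered parameter names
    "move":    ["robot", "from_machine", "to_machine"],
    "load":    ["robot", "workpiece", "machine"],
    "pick_up": ["robot", "workpiece", "machine"],
    "process": ["machine", "workpiece"],
}
```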
After the list of possible actions is available, the next step is to try to model each action in
more detail, identifying its preconditions and effects. At this step, we can also determine whether
actions are deterministic, giving us important input as to the choice of our planning formalism.
Additionally, during this step, we keep track of the predicates used for each action’s preconditions
and effects, and these become the predicates used to model the state of the world.
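This second step can be sketched in a STRIPS-like style, with preconditions and effects as sets of predicates; the predicate and parameter names are illustrative:

```python
# Second modeling step: attach preconditions and effects to each action.
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    precond: frozenset  # predicates that must hold before execution
    add: frozenset      # predicates the action makes true
    delete: frozenset   # predicates the action makes false

def applicable(action, state):
    return action.precond <= state

def apply_action(action, state):
    assert applicable(action, state)
    return (state - action.delete) | action.add

move_r1 = Action(
    name="move(r1, base_station, cap_station)",
    precond=frozenset({"at(r1, base_station)"}),
    add=frozenset({"at(r1, cap_station)"}),
    delete=frozenset({"at(r1, base_station)"}),
)

state = frozenset({"at(r1, base_station)", "holding(r1, wp1)"})
state = apply_action(move_r1, state)
```

Note how writing the `at` predicate for `move` immediately suggests the predicates needed by `load` and `pick_up` as well, which is exactly how the predicate vocabulary grows out of the action list.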
We also remark that sometimes we need to use numeric variables rather than discrete predi-
cates (for example, to keep track of battery levels). For the sake of readability, we abuse notation
here and call numeric variables predicates as well. The decision about whether to use numeric vari-
ables (such as the x, y position of the robot) or a discrete abstraction (for example, a road map with
discrete locations) often depends on the actions—if we can depend on the robot to robustly navi-
gate between locations on a road map, then we can use a discrete predicate to model its position.
Otherwise, we might need to treat its position as a continuous variable and incur the associated
cost in computation required to plan.
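A minimal sketch of this trade-off puts a discrete position predicate next to a numeric battery variable; the energy costs and the safety threshold below are invented for illustration:

```python
# Discrete vs. numeric state variables: position on a road map is a
# discrete predicate (we trust waypoint navigation), battery is numeric.

state = {
    "at": "base_station",   # discrete predicate
    "battery": 0.80,        # numeric variable, tracked continuously
}

MOVE_COST = {("base_station", "cap_station"): 0.05}  # invented costs

def move(state, dest):
    cost = MOVE_COST[(state["at"], dest)]
    if state["battery"] - cost < 0.10:  # invented safety reserve
        raise ValueError("insufficient battery for this move")
    return {"at": dest, "battery": state["battery"] - cost}

state = move(state, "cap_station")
```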
Note that the discussion of predicates demonstrates an important point about how planning
domains are built—they are built in service of sequencing actions, and the predicates themselves
are there to support modeling action preconditions and effects. However, it is important to keep
in mind that any predicate used in the domain will somehow need to be provided as input to
the executive, as we discuss next. In our RCLL scenario, because the robots’ low-level behaviors
are fairly robust, we can model a robot’s position as a discrete road map whose waypoints
are the different machines. Additionally we have predicates describing the assembly status
of each workpiece and the location of each workpiece (in one of the machines or held by a
robot).
The next step, although not strictly part of modeling the task as a planning problem, is nev-
ertheless an important one: defining the state estimator. Starting with the list of predicates from
the above step, we must determine for each predicate whether we can sense it passively (that is,
without taking any special actions) using real sensors and possibly some processing. If this is the
case for all predicates, then things are relatively simple, as we can rely on a state estimator to con-
tinuously feed us observations of the full state. If there is some predicate we cannot sense passively,
then it is sometimes possible to keep track of this predicate by starting from a known value and
updating the value of the predicate according to the actions performed by the robot and occasional
observations. If this is the case, then we can still treat the problem as fully observable, assuming the
robot is the sole agent of change in the world. However, in some cases the agent must proactively
take some actions to sense parts of the world, and we must resort to a partially observable planning
formalism, which makes planning more complex. That said, as discussed in Section 4, it is
sometimes still possible to treat the problem as fully observable and handle newly discovered
information via replanning.
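The division of labor just described might be sketched as follows, where passively sensed predicates come straight from perception while unsensed predicates are tracked from action effects; all predicate and action names are illustrative:

```python
# Sketch of a state estimator that merges passively sensed predicates with
# predicates tracked from a known initial value via action effects,
# assuming the robot is the sole agent of change.

class StateEstimator:
    def __init__(self, initial_tracked):
        self.tracked = dict(initial_tracked)  # cannot be sensed directly
        self.sensed = {}                      # continuously observed

    def on_observation(self, predicate, value):
        """Passively sensed predicates, straight from perception."""
        self.sensed[predicate] = value

    def on_action_executed(self, effects):
        """Update tracked predicates from the executed action's effects."""
        self.tracked.update(effects)

    def state(self):
        return {**self.tracked, **self.sensed}

est = StateEstimator({"gripper_holding": None})
est.on_observation("at", "cap_station")             # e.g., from localization
est.on_action_executed({"gripper_holding": "wp1"})  # effect of a grasp
```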
After we have a list of predicates (with their associated observability properties) and a list of
actions (with their preconditions and effects), we can make a more informed decision about the
type of planning formalism to employ. If all actions are deterministic and all predicates are fully
observable (or at least we can treat them as fully observable), then we must decide whether to
use classical planning or temporal planning. The advantage of using classical planning is that the
planning process will typically be much quicker. However, if we want to support concurrent action
(for example, controlling two robots working together) or reason about deadlines, then we should
use temporal planning instead.


If some actions are nondeterministic, then it is usually a good idea to try to determine whether
the planning problem is probabilistically interesting. If it is not, and we have full observability of
the state, then using a deterministic planner inside a replanning loop is often a good solution. Oth-
erwise, an executive that can handle nondeterministic planning should be used. Note that special
care must be taken when implementing the dispatch of nondeterministic actions, as the executive
will often need feedback about which outcome of the action occurred. This can sometimes be
inferred from the state estimator, but many executives will want the action dispatcher to return
some information about the status of the action, which could lead to potential synchronization
issues in software development.
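A deterministic planner inside a replanning loop can be sketched as follows. The toy planner, the scripted outcomes, and the grasp domain are invented stand-ins for a real planner and executive:

```python
# Replanning loop: plan deterministically, dispatch, observe the actual
# outcome, and replan whenever the outcome differs from the model.

def plan(state, goal):
    """Toy deterministic planner: one grasp per unachieved goal fact."""
    return [("grasp", wp) for wp in sorted(goal - state)]

def execute_and_observe(action, world, outcomes):
    """Dispatch an action; its real outcome may differ from the model."""
    _, wp = action
    if outcomes.pop(0) == "success":  # scripted nondeterminism
        world.add(wp)
    return frozenset(world)           # full-state observation

def replanning_loop(state, goal, outcomes):
    executed = []
    current_plan = plan(state, goal)
    world = set(state)
    while not goal <= state:
        action = current_plan.pop(0)
        executed.append(action)
        observed = execute_and_observe(action, world, outcomes)
        if observed != state | {action[1]}:  # outcome not as modeled
            current_plan = plan(observed, goal)
        state = observed
    return executed

# The first grasp slips, so the executive replans and retries.
log = replanning_loop(frozenset(), {"wp1"}, ["failure", "success"])
```

The comparison against the modeled successor state is exactly the feedback channel discussed above: here it comes from the full-state observation, but in practice it may instead come from the action dispatcher's status report.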
Finally, if we do not have full state observability, then we will often need to use some formalism
for partially observable planning with sensing. If it is still possible to handle newly discovered
information via replanning, then this is often the most efficient way of integrating a planner into
the robot. Otherwise, we will typically need to use a contingent planner and closely monitor the
observations we get from the state estimator to execute the right branch of the plan according to
the information observed by the robot.
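Executing the right branch of a contingent plan might look like the following sketch, where observation nodes branch on values reported by the state estimator; the plan encoding and predicate names are our own:

```python
# A contingent plan node is ("act", action, next_node),
# ("observe", predicate, {value: next_node}), or ("done",).
plan = ("act", "goto_door",
        ("observe", "door_open", {
            True:  ("act", "drive_through", ("done",)),
            False: ("act", "open_door",
                    ("act", "drive_through", ("done",))),
        }))

def execute(node, sense):
    """Walk the plan tree, branching on state-estimator readings."""
    executed = []
    while node[0] != "done":
        if node[0] == "act":
            _, action, node = node
            executed.append(action)
        else:  # observation node: take the branch matching the sensed value
            _, predicate, branches = node
            node = branches[sense(predicate)]
    return executed

# With the door sensed closed, execution takes the open-then-drive branch.
trace = execute(plan, sense=lambda predicate: False)
```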
Choosing the right planning formalism for modeling the problem at hand is often the most
important decision when attempting to endow robots with the ability to plan. However, we must
still write some software to implement this intelligent behavior. The open source ROSPlan frame-
work (95; http://kcl-planning.github.io/ROSPlan) provides a collection of tools for automated
planning inside the Robot Operating System (ROS) ecosystem (96) and can be used to build differ-
ent architectures. Once the right formalism and planner have been chosen, ROSPlan components
can be assembled to build an agent architecture that connects to the rest of a ROS-based system.
ROSPlan supports many of the planning formalisms that have been discussed, providing integra-
tion for the PDDL family of languages for propositional, temporal, or contingent planning and
probabilistic planning with RDDL (73) and PPDDL (72).
ROSPlan consists of reusable components that include a knowledge base for storing a rep-
resentation of the state in one formalism and producing problem instances, an interface to the
knowledge base for state estimation, an interface to the planner for producing plans, an interface
to the plan executor, and several default plan executors for propositional, temporal, and proba-
bilistic planning. The motivation of ROSPlan is to provide a framework for connecting automated
planning to a ROS platform in an open and accessible way. For this reason, the tools provided are
lightweight and make few assumptions about the nature of the agent. ROSPlan is therefore well
suited to replace what can be a considerable integration effort.

DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that
might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS
We would like to acknowledge colleagues who gave us useful feedback on earlier drafts of this
article or helped us with specific topics, including Michael Cashmore, Bruno Lacerda, Andrea
Micheli, and Nick Hawes. D.M. was partially supported by Engineering and Physical Sciences
Research Council grant EP/R033722/1, “Trust in Human-Machine Partnerships.”
We dedicate this article to the memory of Nils Nilsson, one of the founding fathers of both
automated planning and robotics.


LITERATURE CITED
1. Ghallab M, Nau D, Traverso P. 2016. Automated Planning and Acting. New York: Cambridge Univ. Press
2. Ingrand F, Ghallab M. 2017. Deliberation for autonomous robots: a survey. Artif. Intell. 247:10–44
3. LaValle SM. 2006. Planning Algorithms. Cambridge, UK: Cambridge Univ. Press. http://planning.cs.
uiuc.edu
4. Stern R, Sturtevant N, Felner A, Koenig S, Ma H, et al. 2019. Multi-agent pathfinding: definitions, vari-
ants, and benchmarks. arXiv:1906.08291 [cs.AI]
5. Giftthaler M, Farshidian F, Sandy T, Stadelmann L, Buchli J. 2017. Efficient kinematic planning for
mobile manipulators with nonholonomic constraints using optimal control. In 2017 IEEE International
Conference on Robotics and Automation, pp. 3411–17. Piscataway, NJ: IEEE
6. Yang G, Chien S. 2017. Review on space robotics: toward top-level science through space exploration.
Sci. Robot. 2:eaan5074
7. Crosby M, Petrick RPA, Rovida F, Krüger V. 2017. Integrating mission and task planning in an industrial
robotics framework. In Proceedings of the Twenty-Seventh International Conference on Automated Planning
and Scheduling, pp. 471–79. Palo Alto, CA: AAAI Press
8. Long D. 2019. Planning a way into a deep hole. Presentation at the Workshop on Scheduling and Planning
Applications (SPARK), 29th International Conference on Automated Planning and Scheduling, Berkeley,
CA, July 10–15
9. Niemueller T, Ewert D, Reuter S, Ferrein A, Jeschke S, Lakemeyer G. 2013. RoboCup Logistics League
sponsored by Festo: a competitive factory automation testbed. In RoboCup 2013: Robot World Cup XVII,
ed. S Behnke, M Veloso, A Visser, R Xiong, pp. 336–47. Berlin: Springer
10. Geffner H. 2010. The model-based approach to autonomous behavior: a personal view. In Proceedings of
the Twenty-Fourth AAAI Conference on Artificial Intelligence, pp. 1709–12. Palo Alto, CA: AAAI Press
11. Albee A, Battel S, Brace R, Burdick G, Casani J, et al. 2000. Report on the loss of the Mars Polar Lander and
Deep Space 2 missions. Tech. Rep. 20000061966, Jet Propuls. Lab., Calif. Inst. Technol., Pasadena, CA
12. Bylander T. 1994. The computational complexity of STRIPS planning. Artif. Intell. 69:165–204
13. Fikes RE, Nilsson NJ. 1971. STRIPS: a new approach to the application of theorem proving to problem
solving. In Proceedings of the 2nd International Joint Conference on Artificial Intelligence, pp. 608–20. San
Francisco, CA: Morgan Kaufmann
14. Bäckström C, Klein I. 1991. Planning in polynomial time: the SAS-PUBS class. Comput. Intell. 7:181–97
15. McDermott D, Ghallab M, Howe A, Knoblock C, Ram A, et al. 1998. PDDL – the Planning Domain
Definition Language. Tech. Rep. CVC TR-98-003/DCS TR-1165, Yale Cent. Comput. Vision Control,
New Haven, CT
16. Haslum P, Lipovetzky N, Magazzeni D, Muise C. 2019. An introduction to the Planning Domain Defi-
nition Language. Synth. Lect. Artif. Intell. Mach. Learn. 13(2):1–187
17. Helmert M. 2009. Concise finite-domain representations for PDDL planning tasks. Artif. Intell. 173:503–
35
18. Konidaris G, Kaelbling LP, Lozano-Pérez T. 2018. From skills to symbols: learning symbolic represen-
tations for abstract high-level planning. J. Artif. Intell. Res. 61:215–89
19. Niemueller T, Hofmann T, Lakemeyer G. 2018. CLIPS-based execution for PDDL planners. In Pro-
ceedings of the 28th International Conference on Automated Planning and Scheduling, pp. 509–17. Palo Alto,
CA: AAAI Press
20. Yoon S, Fern A, Givan R. 2007. FF-Replan: a baseline for probabilistic planning. In Proceedings of
the Seventeenth International Conference on Automated Planning and Scheduling,
pp. 352–59. Palo Alto, CA: AAAI Press
21. Little I, Thiebaux S. 2007. Probabilistic planning versus replanning. Paper presented in the Workshop on
IPC: Past, Present and Future, 17th International Conference on Automated Planning and Scheduling,
Providence, RI, Sept. 22–26
22. Kolobov A. 2012. Planning with Markov decision processes: an AI perspective. Synth. Lect. Artif. Intell.
Mach. Learn. 6(1):1–210
23. Nilsson NJ. 1984. Shakey the robot. Tech. Rep. 323, AI Cent., SRI Int., Palo Alto, CA


24. Speck D, Dornhege C, Burgard W. 2017. Shakey 2016 – how much does it take to redo Shakey the robot?
IEEE Robot. Autom. Lett. 2:1203–9
25. Fox M, Long D. 2003. PDDL2.1: an extension to PDDL for expressing temporal planning domains.
J. Artif. Intell. Res. 20:61–124
26. Ghallab M, Laruelle H. 1994. Representation and control in IxTeT, a temporal planner. In Proceedings
of the Second International Conference on Artificial Intelligence Planning Systems, pp. 61–67. Palo Alto, CA:
AAAI Press
27. Muscettola N. 1994. HSTS: integrating planning and scheduling. In Intelligent Scheduling, ed. M Zweben,
MS Fox, pp. 169–212. San Francisco, CA: Morgan Kaufmann
28. Gigante N, Montanari A, Mayer MC, Orlandini A. 2017. Complexity of timeline-based planning. In
Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, pp. 116–
24. Palo Alto, CA: AAAI Press
29. Chien S, Rabideau G, Knight R, Sherwood R, Engelhardt B, et al. 2000. ASPEN – automating space mission
operations using automated planning and scheduling. Paper presented at the International Conference on
Space Operations, Toulouse, Fr., June 19–23
30. Barreiro J, Boyce M, Do M, Frank J, Iatauro M, et al. 2012. EUROPA: a platform for AI planning, schedul-
ing, constraint programming, and optimization. Paper presented at the 22nd International Conference on
Automated Planning and Scheduling, Atibaia, Brazil, June 25–29
31. Smith DE, Frank J, Cushing W. 2008. The ANML language. Paper presented at the 18th International
Conference on Automated Planning and Scheduling, Sydney, Sept. 14–18
32. Cashmore M, Cimatti A, Magazzeni D, Micheli A, Zehtabi P. 2019. Robustness envelopes for temporal
plans. In Proceedings of the 2019 AAAI Conference on Artificial Intelligence, pp. 7538–45. Palo Alto, CA: AAAI
Press
33. Verma V, Estlin T, Jónsson A, Pasareanu C, Simmons R, Tso K. 2005. Plan Execution Inter-
change Language (PLEXIL) for executable plans and command sequences. In Proceedings of i-
SAIRAS 2005: The 8th International Symposium on Artificial Intelligence, Robotics and Automation in Space.
Paris: Eur. Space Agency. Available at https://www.esa.int/Our_Activities/Space_Engineering_Technology/
Automation_and_Robotics/i-SAIRAS/(print)
34. Rusu RB, Jones EG, Marder-Eppstein E, Pantofaru C, Wise M, et al. 2011. Towards autonomous robotic
butlers: lessons learned with the PR2. In 2011 IEEE International Conference on Robotics and Automation,
pp. 5568–75. Piscataway, NJ: IEEE
35. Williams B, Ingham M, Chung S, Elliott P. 2003. Model-based programming of intelligent embedded
systems and robotic space explorers. Proc. IEEE 91:212–37
36. Dechter R, Meiri I, Pearl J. 1991. Temporal constraint networks. Artif. Intell. 49:61–95
37. Vidal T, Ghallab M. 1996. Dealing with uncertain durations in temporal constraint networks dedicated
to planning. In ECAI 96: 12th European Conference on Artificial Intelligence, ed. W Wahlster, pp. 48–54.
Chichester, UK: Wiley & Sons
38. Vidal T, Fargier H. 1997. Contingent durations in temporal CSPs: from consistency to controllabilities.
In Proceedings of TIME ’97: 4th International Workshop on Temporal Representation and Reasoning, pp. 78–85.
Piscataway, NJ: IEEE
39. Frank J, Jónsson A. 2003. Constraint-based attribute and interval planning. Constraints 8:339–64
40. Fratini S, Cesta A, De Benedictis R, Orlandini A, Rasconi R. 2011. APSI-based deliberation in goal ori-
ented autonomous controllers. In 11th Symposium on Advanced Space Technologies in Robotics and Automation.
Paris: Eur. Space Agency. Available at https://www.esa.int/Our_Activities/Space_Engineering_Technology/
Automation_and_Robotics/Proceedings_of_ASTRA
41. Vidal T, Fargier H. 1999. Handling contingency in temporal constraint networks: from consistency to
controllabilities. J. Exp. Theor. Artif. Intell. 11:23–45
42. Micheli A. 2016. Planning and scheduling in temporally uncertain domains. PhD Thesis, Univ. Trento, Trento,
Italy
43. Morris P, Muscettola N. 2000. Execution of temporal plans with uncertainty. In Proceedings of the Seven-
teenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial
Intelligence, pp. 491–96. Palo Alto, CA: AAAI Press


44. Barbulescu L, Rubinstein ZB, Smith SF, Zimmerman TL. 2010. Distributed coordination of mobile agent
teams: the advantage of planning ahead. In Proceedings of the 9th International Conference on Autonomous
Agents and Multiagent Systems, Vol. 1, pp. 1331–38. Richland, SC: Int. Found. Auton. Agents Multiagent
Syst.
45. Coles AJ, Coles A, Fox M, Long D. 2012. COLIN: planning with continuous linear numeric change.
J. Artif. Intell. Res. 44:1–96
46. Della Penna G, Magazzeni D, Mercorio F. 2012. A universal planning system for hybrid domains. Appl.
Intell. 36:932–59
47. Piotrowski WM, Fox M, Long D, Magazzeni D, Mercorio F. 2016. Heuristic planning for PDDL+
domains. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 3213–
19. Palo Alto, CA: AAAI Press
48. Scala E, Haslum P, Thiébaux S. 2016. Heuristics for numeric planning via subgoaling. In Proceedings of
the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 3228–34. Palo Alto, CA: AAAI
Press
49. Cashmore M, Fox M, Long D, Magazzeni D. 2016. A compilation of the full PDDL+ language into
SMT. In Proceedings of the Twenty-Sixth International Conference on Automated Planning and Scheduling,
pp. 79–87. Palo Alto, CA: AAAI Press
50. Bryce D, Gao S, Musliner DJ, Goldman RP. 2015. SMT-based nonlinear PDDL+ planning. In Proceedings
of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 3247–53. Palo Alto, CA: AAAI Press
51. Fernández-González E, Williams BC, Karpas E. 2018. ScottyActivity: mixed discrete-continuous plan-
ning with convex optimization. J. Artif. Intell. Res. 62:579–664
52. Savas E, Fox M, Long D, Magazzeni D. 2016. Planning using actions with control parameters. In
ECAI 2016: 22nd European Conference on Artificial Intelligence, ed. GA Kaminka, M Fox, P Boucquet,
E Hüllermeier, V Dignum, et al., pp. 1185–93. Amsterdam: IOS
53. Sucan IA, Moll M, Kavraki LE. 2012. The Open Motion Planning Library. IEEE Robot. Autom. Mag.
19(4):72–82
54. Dantam NT, Chaudhuri S, Kavraki LE. 2018. The Task-Motion Kit: an open source, general-purpose
task and motion-planning framework. IEEE Robot. Autom. Mag. 25(3):61–70
55. Dantam NT, Kingston ZK, Chaudhuri S, Kavraki LE. 2016. Incremental task and motion planning: a
constraint-based approach. In Robotics: Science and Systems XII, ed. D Hsu, N Amato, S Berman, S Jacobs,
pap. 2. N.p.: Robot. Sci. Syst. Found.
56. McMahon J, Plaku E. 2017. Robot motion planning with task specifications via regular languages. Robotica
35:26–49
57. Cambon S, Alami R, Gravot F. 2009. A hybrid approach to intricate motion, manipulation and task plan-
ning. Int. J. Robot. Res. 28:104–26
58. Waldhart J, Gharbi M, Alami R. 2016. A novel software combining task and motion planning for human-
robot interaction. In 2016 AAAI Fall Symposium Series: Artificial Intelligence for Human-Robot Interaction,
pp. 100–2. Palo Alto, CA: AAAI Press
59. Srivastava S, Fang E, Riano L, Chitnis R, Russell S, Abbeel P. 2014. Combined task and motion plan-
ning through an extensible planner-independent interface layer. In 2014 IEEE International Conference on
Robotics and Automation, pp. 639–46. Piscataway, NJ: IEEE
60. Toussaint M. 2015. Logic-geometric programming: an optimization-based approach to combined task
and motion planning. In Proceedings of the 24th International Conference on Artificial Intelligence, pp. 1930–
36. Palo Alto, CA: AAAI Press
61. Garrett CR, Lozano-Pérez T, Kaelbling LP. 2018. FFRob: leveraging symbolic planning for efficient task
and motion planning. Int. J. Robot. Res. 37:104–36
62. Lagriffoul F, Dimitrov D, Bidot J, Saffiotti A, Karlsson L. 2014. Efficiently combining task and motion
planning using geometric constraints. Int. J. Robot. Res. 33:1726–47
63. Edelkamp S, Lahijanian M, Magazzeni D, Plaku E. 2018. Integrating temporal reasoning and sampling-
based motion planning for multigoal problems with dynamics and time windows. IEEE Robot. Autom. Lett.
3:3473–80


64. Lagriffoul F, Dantam NT, Garrett C, Akbari A, Srivastava S, Kavraki LE. 2018. Platform-independent
benchmarks for task and motion planning. IEEE Robot. Autom. Lett. 3:3765–72
65. Rintanen J. 2007. Complexity of concurrent temporal planning. In Proceedings of the Seventeenth
International Conference on Automated Planning and Scheduling, pp. 280–87. Palo Alto,
CA: AAAI Press
66. Helmert M. 2002. Decidability and undecidability results for planning with numerical state variables. In
Proceedings of the Sixth International Conference on Artificial Intelligence Planning Systems, pp. 44–53. Palo
Alto, CA: AAAI Press
67. Celorrio SJ, Jonsson A, Palacios H. 2015. Temporal planning with required concurrency using classical
planning. In Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling,
pp. 129–37. Palo Alto, CA: AAAI Press
68. Howard RA. 1960. Dynamic Programming and Markov Processes. Cambridge, MA: MIT Press
69. Åström KJ. 1965. Optimal control of Markov processes with incomplete state information. J. Math. Anal.
Appl. 10:174–205
70. Cimatti A, Do M, Micheli A, Roveri M, Smith DE. 2018. Strong temporal planning with uncontrollable
durations. Artif. Intell. 256:1–34
71. Domshlak C. 2013. Fault tolerant planning: complexity and compilation. In Proceedings of the Twenty-Fifth
International Conference on Automated Planning and Scheduling, pp. 129–37. Palo
Alto, CA: AAAI Press
72. Younes HLS, Littman ML. 2004. PPDDL 1.0: an extension to PDDL for expressing planning domains with
probabilistic effects. Tech. Rep. CMU-CS-04-167, Carnegie Mellon Univ., Pittsburgh, PA
73. Sanner S. 2010. Relational Dynamic Influence Diagram Language (RDDL): language description. Unpubl.
Manuscr., Aust. Natl. Univ., Canberra. http://users.cecs.anu.edu.au/~ssanner/IPPC_2011/RDDL.pdf
74. Burns E, Benton J, Ruml W, Yoon SW, Do MB. 2012. Anticipatory on-line planning. In Proceedings of the
Twenty-Second International Conference on Automated Planning and Scheduling, pp. 333–37. Palo Alto, CA:
AAAI Press
75. Talamadupula K, Benton J, Kambhampati S, Schermerhorn P, Scheutz M. 2010. Planning for human-
robot teaming in open worlds. ACM Trans. Intell. Syst. Technol. 1:14
76. Bonet B, Geffner H. 2014. Belief tracking for planning with sensing: width, complexity and approxima-
tions. J. Artif. Intell. Res. 50:923–70
77. Fagin R, Halpern JY, Moses Y, Vardi MY. 2003. Reasoning About Knowledge. Cambridge, MA: MIT Press
78. Palacios H, Geffner H. 2009. Compiling uncertainty away in conformant planning problems with
bounded width. J. Artif. Intell. Res. 35:623–75
79. Brafman RI, Shani G. 2012. A multi-path compilation approach to contingent planning. In Proceedings of
the Twenty-Sixth AAAI Conference on Artificial Intelligence, pp. 1868–74. Palo Alto, CA: AAAI Press
80. Bonet B, Formica G, Ponte M. 2017. Completeness of online planners for partially observable determinis-
tic tasks. In Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling,
pp. 38–46. Palo Alto, CA: AAAI Press
81. Bernstein DS, Zilberstein S, Immerman N. 2000. The complexity of decentralized control of Markov
decision processes. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, pp. 32–
37. San Francisco, CA: Morgan Kaufmann
82. Amato C, Konidaris G, Cruz G, Maynor CA, How JP, Kaelbling LP. 2015. Planning for decentralized
control of multiple robots under uncertainty. In 2015 IEEE International Conference on Robotics and Au-
tomation, pp. 1241–48. Piscataway, NJ: IEEE
83. Engesser T, Bolander T, Mattmüller R, Nebel B. 2017. Cooperative epistemic multi-agent planning
for implicit coordination. In Proceedings of the Ninth Workshop on Methods for Modalities, ed. S Ghosh,
R Ramanujam, pp. 75–90. Electr. Proc. Theor. Comput. Sci. Vol. 243. Waterloo, Aust.: Open Publ. Assoc.
84. Kitano H, Asada M, Kuniyoshi Y, Noda I, Osawa E, Matsubara H. 1997. RoboCup: a challenge problem
for AI. AI Mag. 18(1):73–85
85. Hansen EA, Bernstein DS, Zilberstein S. 2004. Dynamic programming for partially observable stochastic
games. In Proceedings of the 19th National Conference on Artificial Intelligence, pp. 709–15. Palo Alto, CA:
AAAI Press


86. Muise C, Felli P, Miller T, Pearce AR, Sonenberg L. 2016. Planning for a single agent in a multi-agent
environment using FOND. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial
Intelligence, pp. 3206–12. Palo Alto, CA: AAAI Press
87. Ma H, Hönig W, Cohen L, Uras T, Xu H, et al. 2017. Overview: a hierarchical framework for plan
generation and execution in multirobot systems. IEEE Intell. Syst. 32:6–12
88. Nir R, Karpas E. 2019. Automated verification of social laws for continuous time multi-robot systems. In
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, pp. 7683–90. Palo Alto, CA: AAAI
Press
89. Lacerda B, Faruq F, Parker D, Hawes N. 2019. Probabilistic planning with formal performance guarantees
for mobile service robots. Int. J. Robot. Res. 38:1098–123
90. Hawes N, Burbridge C, Jovan F, Kunze L, Lacerda B, et al. 2017. The STRANDS project: long-term
autonomy in everyday environments. IEEE Robot. Autom. Mag. 24(3):146–56
91. McDermott D. 2000. The 1998 AI Planning Systems Competition. AI Mag. 21(2):35–55
92. Simmons R, Apfelbaum D, Fox D, Goldman RP, Haigh KZ, et al. 2000. Coordinated deployment of
multiple, heterogeneous robots. In Proceedings: 2000 IEEE/RSJ International Conference on Intelligent Robots
and Systems, Vol. 3, pp. 2254–60. Piscataway, NJ: IEEE
93. Erol K, Hendler J, Nau DS. 1996. Complexity results for HTN planning. Ann. Math. Artif. Intell. 18:69–93
94. Alford R, Kuter U, Nau D. 2009. Translating HTNS to PDDL: a small amount of domain knowledge can
go a long way. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, pp. 1629–34.
San Francisco, CA: Morgan Kaufmann
95. Cashmore M, Fox M, Long D, Magazzeni D, Ridder B, et al. 2015. ROSPlan: planning in the Robot
Operating System. In Proceedings of the Twenty-Fifth International Conference on Automated Planning and
Scheduling, pp. 333–41. Palo Alto, CA: AAAI Press
96. Quigley M, Conley K, Gerkey BP, Faust J, Foote T, et al. 2009. ROS: an open-source robot operating sys-
tem. Paper presented at the IEEE International Conference on Robotics and Automation, Kobe, Japan,
May 12–17
