
2012 IEEE International Conference on Systems, Man, and Cybernetics

October 14-17, 2012, COEX, Seoul, Korea

COSA2 - A Cognitive System Architecture with Centralized Ontology and Specific Algorithms

Stefan Brüggenwirth and Axel Schulte
Institute of Flight Systems
Bundeswehr University Munich
85577 Neubiberg, Germany
{stefan.brueggenwirth, axel.schulte}@unibw.de

Abstract—In this article, we present the architectural and algorithmic details for a COgnitive System Architecture that uses a Centralized Ontology with Specific Algorithms (COSA2). COSA2 is a layered intelligent agent framework on the basis of the modified Rasmussen model of human performance. It encompasses integrated algorithmic support for goal-oriented situation interpretation, dynamic planning and plan execution, as well as provisions for reactive behavior. A unique feature is the claim for an expressive, centralized knowledge representation, used by all functions to ensure consistency. The framework is being applied to different problems in the domain of uninhabited aerial vehicles. This article focuses on high-level, concept-based behavior and illustrates the modeling and processing details by a simplified UAV mission management example.

Keywords—Intelligent agents; Knowledge modeling; Plan execution, formation, and generation; Rule-based processing
I. INTRODUCTION

Increasing demand for highly automated, uninhabited systems requires the development of generic, yet flexible software frameworks to provide these systems with intelligent decision-making capabilities. The Institute of Flight Systems has successfully applied its Cognitive System Architecture (COSA) [1] in various domains, ranging from multiple uninhabited aerial vehicle (UAV) guidance [2], [3] to intelligent system configuration management [4]. Applications developed for COSA follow a conceptual processing scheme, called the cognitive process, that resembles the human way of information processing. By using this analogy we believe that: (1) the modeling and ontology creation by human domain experts, as well as the interface to a human operator, may benefit from the semantics used in the cognitive process, and (2) intelligent agents using the COSA framework have the ability to act in a flexible and comprehensible way. The first version of the COSA framework used the cognitive processor Soar [5] as a rule-based inference engine. Beside inference, however, it was found that the individual conceptual processing steps specified in the cognitive process may benefit from further algorithmic support provided by a generic cognitive system architecture. In this article, we describe a novel COgnitive System Architecture with a Centralized Ontology and Specific Algorithms (COSA2) to facilitate each such processing step, along with the embracing architecture that connects these subprocesses into an intelligent processing framework.

This article is organized as follows: First, we describe the cognitive process and the resulting overarching agent architecture as the basis for the COSA2 framework. Then, we describe each of its constituting subfunctions and the syntactic and algorithmic details. Throughout the article we illustrate the concepts by an example taken from UAV mission management. The article concludes with open research questions.

II. THE COGNITIVE PROCESS

The cognitive process describes a cyclic information processing scheme that is used within the Cognitive System Architecture. It is based on a model of human performance in information processing published by J. Rasmussen [6] in 1983. It furthermore serves as a guideline for knowledge acquisition from domain experts and as an ontology guideline for domain-specific modeling.

Figure 1. The modified Rasmussen scheme [7] illustrating the processing scheme of COSA2

A. The Modified Rasmussen-Scheme

The modified Rasmussen scheme [7], shown in Fig. 1, laid the intellectual foundation in the attempt to establish an analogous computation paradigm for intelligent software agents. It breaks down the complex phenomena of human cognition into a number of simplified cognitive subfunctions, each responsible for certain dedicated aspects of human cognition. These are shown as boxes in the figure, with arrows indicating the primary flow of information.



through the different cognitive subfunctions. All cognitive sub- hs dĂƌŐĞƚ
functions are associated within one of three horizontal layers, ĚŝƐƚĂŶĐĞ
that describe increasing levels of abstraction and flexibility >Ăƚ сϰϴ͘ϬϮ ϴ͘ϱ >Ăƚ сϰϴ͘ϭϮ
in human behavior. The inner, light-blue box represents the >ŽŶсϭϭ͘ϰϵ >ŽŶсϭϭ͘ϳϮ
subset of human cognitive subfunctions that are algorithmically
Figure 2. A graph with weighted edges is used to describe the situational
supported by COSA2 . knowledge
Note that Rasmussen’s scheme is typically used, e.g. by
psychologists to describe the human cognition performance. In
Each edge in the WM graph represents a certain
its original interpretation, the different subfunctions, as well as
fact in the agent’s world knowledge. An edge e1 =
the three layers of behavior are essentially interwoven. Besides
(U AV, T arget, distance, 8.5) may represent the fact, that
the centralized knowledge representation, we have taken a
UAV is 8.5 km away from Target. A condition is a set of
rather pragmatic, computer-science and application oriented
propositional expressions connected by logical connectives,
approach by separating the cognitive subfunctions. E.g. human
that allow the matching of complex patterns including AND,
skill-based behavior (e.g. visual perception occurring as part
OR, NOT relationships and path-expressions. The central
of the human Feature Formation process) is only partially
pattern-matching algorithm for all cognitive subfunctions GET-
covered by the architecture.
M ATCHING I NSTANCES (condition, WM) searches the WM for
all sets of fact-tuples, that match a certain condition. For
B. Central Knowledge Representation
example, the condition “distanceU AV,T arget < 10 returns the
Even though each cognitive subfunction serves a specific fact e1 . Note, that multiple combinations of facts may match
purpose in the cognitive process, we describe each subfunction the condition, and are hence found as individual instances.
using a generic ordering scheme as shown in Fig. 1. The cog-
nitive process distinguishes static, a-priori knowledge specified D. Related Work
during design time (shown in blue, e.g. C ONCEPTS or TASK Compared to its predecessor COSA, COSA2 shifted the
O PTIONS) and situational knowledge, generated during run- focus from a cognitive architecture (developed to describe
time (shown in red in Fig. 1, e.g. identification relevant cues or human cognition processes, such as Soar or ACT-R [8]) to-
matching concepts). As already indicated, gray boxes represent wards the behavior of rational agents investigated by computer
the cognitive subfunctions that constitute the cognitive process. science or robotics. From this agent-theoretical point of view,
Each cognitive subfunction performs a specific algorithm on the approach resembles a hybrid agent architecture [9] with
the basis of the indicated a-priori knowledge. Algorithms are horizontal layering, as found in Fergusons TouringMachines
divided into the three classes inference-, search and pattern [10] or other Three-Tier architectures [11]. The lowest layer
matching-based, and explained further in section 3. (skill-based) is reduced to converting subsymbolic input data
An important claim of the cognitive process is for a central to an abstract symbolic representation (and vice versa). These
knowledge representation. This means, that both situational symbols are then fed in parallel into a reactive layer1 with
and a-priori knowledge are specified exactly once, and accessi- prestored procedures (procedure-based) and a deliberative,
ble from all the different subfunctions. As a consequence, each goal-oriented layer (concept-based) comparable to a BDI-agent
cognitive subfunction may generally read all situational knowl- design [12], augmented with dynamic planning capabilities
edge elements available in working memory. However, it writes [13].
specifically only into the situational knowledge indicated by A major challenge for the centralized knowledge represen-
the outgoing arrow. For example, the cognitive subfunction tation of COSA2 is a tight integration between action planning
Identification is implemented as an inference algorithm which and reasoning about the consequences within a single world
uses C ONCEPTS as a-priori knowledge. It may read from model. Among the first to address this issue was the possible
anywhere in working memory (with its primary source be- worlds approach by Ginsberg and Smith [14]. Classical plan-
ing identification relevant cues) and exclusively modify the ning languages like STRIPS [15] handle the framing problem
situational knowledge on matching concepts. [16] well, but still suffer from the ramification problem. That
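To make this concrete, the following is a minimal sketch of the weighted-edge fact store and of a conjunctive (AND-only) variant of GETMATCHINGINSTANCES; the names and data layout are illustrative assumptions rather than the actual COSA2 API, and full conditions would additionally support OR, NOT and path expressions:

    from itertools import product

    # A fact is a weighted, directed edge: (source, target, label, weight).
    e1 = ("UAV", "Target", "distance", 8.5)
    wm = {e1, ("UAV", "Base", "distance", 42.0)}  # working memory

    def get_matching_instances(condition, wm):
        """Return every combination of facts satisfying the condition.
        condition: list of per-edge predicates (implicitly AND-connected);
        a returned instance is one tuple of facts, one per predicate."""
        candidates = [[e for e in wm if pred(e)] for pred in condition]
        return list(product(*candidates))

    # The condition "distance(UAV, Target) < 10" as a single edge pattern:
    condition = [lambda e: e[:3] == ("UAV", "Target", "distance") and e[3] < 10]
    print(get_matching_instances(condition, wm))
    # -> [(('UAV', 'Target', 'distance', 8.5),)]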
D. Related Work

Compared to its predecessor COSA, COSA2 shifted the focus from a cognitive architecture (developed to describe human cognition processes, such as Soar or ACT-R [8]) towards the behavior of rational agents as investigated by computer science or robotics. From this agent-theoretical point of view, the approach resembles a hybrid agent architecture [9] with horizontal layering, as found in Ferguson's TouringMachines [10] or other Three-Tier architectures [11]. The lowest layer (skill-based) is reduced to converting subsymbolic input data to an abstract symbolic representation (and vice versa). These symbols are then fed in parallel into a reactive layer1 with prestored procedures (procedure-based) and a deliberative, goal-oriented layer (concept-based) comparable to a BDI-agent design [12], augmented with dynamic planning capabilities [13].

A major challenge for the centralized knowledge representation of COSA2 is a tight integration between action planning and reasoning about the consequences within a single world model. Among the first to address this issue was the possible worlds approach by Ginsberg and Smith [14]. Classical planning languages like STRIPS [15] handle the frame problem [16] well, but still suffer from the ramification problem. That is, all consequences of executing an action, including indirectly inferred effects, must explicitly be listed as postconditions. Derived predicates, introduced later with the Planning Domain Definition Language (PDDL) [17], help relieve this pitfall, but intentionally restrict the expressiveness of indirect effects. A reactive planner that can handle indirect effects was developed by Williams [18] and applied in embedded intelligent systems. The COSA2 framework offers the full expressiveness of its inference engine to derive indirect effects.

1 Reactive behavior is realized only rudimentarily in our framework and mentioned here for completeness.

III. DESCRIPTION OF THE COGNITIVE PROCESSING STEPS

As expressed by the Rasmussen scheme in Fig. 1, each cognitive subfunction is responsible for certain aspects within the human cognition process. This analogy describes the functional requirements on each cognitive subfunction and serves as a modeling guideline for the implementation of the cognitive process as an intelligent agent framework.

During runtime, COSA2 transitions through the individual states of the cognitive process as shown in Fig. 3. Besides the execution state within the cognitive process, the architecture separately maintains the state of the current task agenda, as also illustrated in the figure. The processing cycle can be divided into three parts that use specific algorithms: (1) environment and goal interpretation using inference algorithms, (2) action planning and adjustment using search and constraint satisfaction algorithms, and (3) plan execution using pattern matching.

Figure 3. State transition diagram of the cognitive processing cycle and the task agenda

A. Situational understanding using an inference engine

Within the cognitive process, the three subfunctions Feature Formation, Identification and Goal Determination use an inference mechanism to build up the situational understanding of the agent, which sets the basis for subsequent computation processes. The relevant a-priori knowledge associated with the three subfunctions is given as production rules. A production rule has the form [name]: (condition, effect, priority, [duration]) and expresses a simple 'if condition ⇒ then conclusion' relationship: For each graph-pattern in the working memory that matches the rule's condition (see previous Section II-C), the rule is said to have fired, and a certain effect is applied by adding or modifying edges in the working memory.
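As a sketch, such a production rule can be held in a plain record; the field layout mirrors the [name]: (condition, effect, priority, [duration]) form above, while the type and field names themselves are illustrative assumptions:

    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class ProductionRule:
        name: str
        condition: Callable   # graph pattern matched against working memory
        effect: Callable      # adds or modifies edges for each match found
        priority: int         # ordering on the rule agenda (see Fig. 7)
        duration: Optional[float] = None  # present only for L-3 rules

    rule = ProductionRule(
        name="InterpretMissionOrder",
        condition=lambda wm: ("Target", "Order", "photographed", False) in wm,
        effect=lambda wm: wm.add(("Order", "Order", "fulfilled", False)),
        priority=2,  # e.g. an L-2 CONCEPTS rule in the priority scheme of Sec. III-A6
    )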
1) Endsley's Model of Situation Awareness: To illustrate the design choices made for the inference-based subfunctions, it is worthwhile looking at a perspective on human cognition based on Endsley's model [19] (Fig. 4). The concept focuses on the elements of situation awareness (SA) as observed by human operators. It provides some analogies to the inference-based cognitive subfunctions responsible within the cognitive process for drawing the situational picture. As shown in the figure, she distinguishes three levels of SA with increasing complexity.

Figure 4. Interpretation of situation awareness based on Endsley's model

Level 1 (L-1) simply represents the elementary building blocks of a perceived situation, comparable to Identification Relevant Cues generated by Feature Formation.

Level 2 (L-2) SA combines the different elements to derive a comprehensive representation of the situation, analogous to Matching Concepts as deduced by Identification. In addition, Endsley includes goals and objectives in L-2 SA, since they critically influence the interpretation of a situation.

Finally, Level 3 (L-3) SA takes a projection of future environment states into account. Based on this interpretation of Endsley's model, Feature Formation, Identification and Goal Determination create SA within the cognitive process.

2) Feature Formation (L-1 Perception): The human Feature Formation process encompasses elaborate perception and signal processing tasks, in order to assemble semantic information from an initially unfiltered stream of sensations. Within COSA2, it is assumed that appropriate preprocessing steps take place outside the framework, so that assigning meaningful symbols to incoming input data becomes trivial. Feature Formation is then reduced to being the first subfunction that starts a new processing cycle by sampling the input interface of the agent. It uses a-priori knowledge on CUE MODELS to transform subsymbolic, numeric input data from the work environment into a symbolic representation. These symbols are then provided as identification relevant cues to the concept-based layer or as task-relevant cues to the procedure-based layer. A CUE MODEL rule may for example express the conclusion that each edge representing an input voltage between 0 and 5 V is to be interpreted as 'low' voltage.
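The voltage example can be sketched as a single L-1 rule over the working-memory edges; the edge labels and the helper name are assumptions made for illustration:

    def cue_model_low_voltage(wm):
        """L-1 CUE MODEL rule: every edge carrying an input voltage
        between 0 and 5 V yields a symbolic 'low' classification."""
        derived = set()
        for (src, dst, label, weight) in wm:
            if label == "voltage" and 0.0 <= weight <= 5.0:
                derived.add((src, dst, "voltage_symbol", "low"))
        return derived

    wm = {("PowerBus", "Sensor1", "voltage", 3.2)}
    wm |= cue_model_low_voltage(wm)
    # wm now also contains ('PowerBus', 'Sensor1', 'voltage_symbol', 'low')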
3) Identification (L-2 and L-3 Perception): The Identification subfunction assembles a high-level representation of the situation perceived by the agent. The associated knowledge on CONCEPTS comprises rules to reason about the immediate, currently perceived situation (L-2) and about projections of future states (L-3).

a) Reasoning about the current environment state using L-2 rules: When reasoning about a situation, the Identification subfunction typically matches upon identification relevant cues created by the Feature Formation subfunction, and applies its a-priori knowledge on CONCEPTS to interrelate these and expand a broader, comprehensive situational picture. As shown in Fig. 6, the result is written into the matching concepts namespace in working memory.
This cognitive subfunction condenses various, unrelated symbols from the situational feature space into meaningful, abstract concepts.

Figure 6. Identification expands a comprehensive situational picture [7]

b) Reasoning about hypothetical environment developments using L-3 rules: Motivated by Endsley's model of situation awareness, external events or processes occurring in the environment with a temporal extent are described as L-3 perceptions. L-1 and L-2 perception rules describe immediate conclusions on facts that have already occurred and been observed by the agent. An active L-3 rule, on the other hand, describes an ongoing transition process in the environment that strives towards a steady state in the future, modeled by the rule's effects.

Each L-3 rule is therefore associated with a duration that determines the timelayer at which the nominal, anticipated effects of the transition process apply. Within this timelayer, the anticipated situation is then interpreted by Identification and Goal Determination rules. Effects of an L-3 rule are considered hypothetical until actually encountered in the environment. Due to this hypothetical character, an L-3 rule, along with its anticipated and derived effects, is retracted should its precondition become invalidated. Usually, the precondition of an L-3 rule describes a deviation criterion, measuring the deviation between the current situation and the anticipated effects of the L-3 rule. If the deviation between the current and the anticipated future state falls below a certain threshold, the transition process described by the L-3 rule is no longer active and the rule is retracted.

An L-3 rule might for example express that, if there is a deviation between the commanded and the current position of a UAV (precondition), the current position will strive towards the commanded position (effect), and eventually reach it at a point in time given by the rule duration.
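This position example can be sketched as follows; positions are simplified to one dimension, and all names and thresholds are illustrative assumptions:

    from dataclasses import dataclass

    @dataclass
    class L3Rule:
        name: str
        duration: float  # timelayers until the nominal effect applies

    def enroute_precondition(current_pos, commanded_pos, epsilon=0.1):
        # Deviation criterion: the rule stays active while commanded
        # and current position still deviate by more than epsilon.
        return abs(current_pos - commanded_pos) > epsilon

    def enroute_effect(commanded_pos):
        # Anticipated steady state: the position reaches the commanded value.
        return commanded_pos

    rule = L3Rule("EnrouteToX", duration=10.0)
    # While the precondition holds, the effect is kept hypothetical on the
    # future timelayer; once the deviation drops below epsilon, the rule
    # and its derived effects are retracted as described above.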
4) Goal Determination: The concept-based layer of COSA2 is driven by goals. The Goal Determination step checks the perceived, abstracted situation for a violation of desirable states and constraints, modeled as MOTIVATIONAL CONTEXTS by the knowledge engineer. Like CONCEPTS, these are described as classes, with attributes to hold specific information on the goal, and rules that describe the condition under which the goal becomes instantiated. The semantics here are that an instance of a goal is created in the working memory when the agent is in an undesirable state. With respect to the subsequent planning process, three different types of goals are supported (see the sketch after this list):

• RemovedAtEnd Hard goal that needs to be satisfied after all planned tasks have been executed. E.g., a goal requiring the accomplishment of a certain mission order.

• MayNeverOccur Hard goal that describes a critical state that may never be reached. E.g., a goal describing an out-of-fuel condition.

• PenaltyOnOccurence Soft goal, describing a soft constraint that increases a penalty counter when violated. Among possible solutions that satisfy the hard goals, COSA2 selects the one that minimizes the penalty counter. E.g., the penalty counter is increased by 10 units when crossing an unsafe area.
Figure 5. Different perception levels during environment interpretation

5) Summary and Example: Figure 5 summarizes the rule inference of the different cognitive subfunctions used for environment interpretation by a simple UAV mission management agent. The figure shows the higher-level, concept-based subfunctions Identification and Goal Determination only. Feature Formation is not covered here, as the associated L-1 rules are primarily used to process raw input data.

The scenario contains the UAV and a moving target. The ACU has a built-in goal MaintainDistance of type MayNeverOccur, and another goal of type RemovedAtEnd to fulfill the mission order, in this case to photograph the moving object. The figure illustrates the currently perceived situation. It contains all active rules (represented by colored boxes with round corners), along with the edge modifications they created (represented as labeled arrows). The @-symbol denotes the timelayer at which the rule fires and its edge modifications occur.

At T = 0, a regular L-2 rule (InterpretMissionOrder) fires that analyzes the mission order to determine whether or not the target has been photographed yet. As a result, the edge Fulfilled is set to False (at T = 0), which in turn instantiates a goal AchieveMission. The goal type RemovedAtEnd indicates to the subsequent planning process that the goal must be accomplished after execution of the plan.

In addition, an L-3 rule (TargetMoving, starting at T = 0 with duration D = 10) is triggered that creates a second timelayer at T = 10. The projected situation is then interpreted by the set of L-2 rules. In this example, the rule ComputeDistance would fire and assess the new distance between the UAV and the target. This new distance might exceed the maximum admissible range as modeled by the MaintainDistance goal. The goal type MayNeverOccur indicates to the subsequent planning process that the natural progression of the environment would lead to a critical state. Hence, the agent must not stay passive with respect to the moving object, either by preventing the anticipated effects of the L-3 rule or by performing other actions to avoid the violation of the MaintainDistance goal.
6) Algorithmic Details: COSA2 uses a graph-based inference engine, described in [20], [21]. Central to the algorithm (Fig. 7) is a priority queue (RULE AGENDA) that stores an ordered list of unchecked rules with modified conditions. The interpreter removes the rule with the highest priority from the RULE AGENDA and builds all active rule instances that match the condition in the working memory. Each active instance is then fired by applying its action and submitting the modified facts to the working memory. All rules that match on a modified fact in their condition need to be checked again, and are therefore added to the RULE AGENDA based on their priority. Once no more rules remain on the RULE AGENDA, the cycle has reached the quiescence state. In this state, the working memory content is considered updated and consistent with the given knowledge in the rule base.

    procedure INFERENCE(knowledge, wm)
        time ← 0
        loop
            while ruleAgenda ≠ empty do
                rule ← DEQUEUEFRONT(ruleAgenda)
                matches ← GETMATCHINGINSTANCES(rule, wm)
                if rule ∈ L3 then                  ▷ postpone L-3 rule evaluation
                    L3Rules ← ADD(time + dur, matches)
                    continue
                end if
                for each instance in matches do
                    modFacts ← FIRERULE(instance)
                    SUBMITTOWM(modFacts, wm)
                    touchedRules ← GETTOUCHEDCONDITIONS(modFacts, knowledge)
                    ruleAgenda ← ADD(touchedRules)
                end for
            end while
            if L3Rules = empty then break
            SORTKEYS(L3Rules)                      ▷ next timelayer
            time, matches ← DEQUEUEFRONT(L3Rules)
            for each instance in matches do
                modFacts ← FIRERULE(instance)
                SUBMITTOWM(modFacts, wm)
                touchedRules ← GETTOUCHEDCONDITIONS(modFacts, knowledge)
                ruleAgenda ← ADD(touchedRules)
            end for
        end loop
    end procedure

Figure 7. Inference algorithm used in COSA2

The cognitive processing cycle starts by sampling the input interface and then calls INFERENCE(CUE MODELS ∪ CONCEPTS ∪ MOTIVATIONAL CONTEXTS, WM) as described in Fig. 7. Even though all rules are concurrently relevant, the priority ranges are arranged as follows: CUE MODELS (L-1) > CONCEPTS (L-2) > CONCEPTS (L-3) > MOTIVATIONAL CONTEXT rules. The rules are therefore generally checked in that order; e.g., the first MOTIVATIONAL CONTEXT rule is evaluated after all active CONCEPTS rules have fired.

A special convention holds for the management of L-3 rules within the CONCEPTS knowledge to ensure future timelayers are evaluated in the right order. When an L-3 rule has fired, its effects are not applied right away, but added to an associative array that stores the occurrence time (current timelayer + rule duration) for each instance of a matched L-3 rule. As shown in Fig. 3, the cognitive process progresses to the next timelayer by dequeuing the L-3 instances with the earliest occurrence time and applying their effects. On the next timelayer, inference of matching CONCEPTS and MOTIVATIONAL CONTEXTS starts over until all timelayers are evaluated.
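A minimal sketch of this timelayer convention, using a heap keyed by occurrence time in place of the sorted associative array (names are illustrative; the actual engine is the graph-based matcher of [20], [21]):

    import heapq

    l3_queue = []  # entries: (occurrence_time, rule_name, effects)
    time = 0.0

    def postpone_l3(rule_name, duration, effects):
        # Effects of a fired L-3 rule are not applied immediately but
        # scheduled for the timelayer current time + rule duration.
        heapq.heappush(l3_queue, (time + duration, rule_name, effects))

    def advance_to_next_timelayer(wm):
        global time
        time, rule_name, effects = heapq.heappop(l3_queue)  # earliest first
        wm |= effects  # apply anticipated effects, then re-run inference
        return time

    wm = set()
    postpone_l3("TargetMoving", 10.0, {("Target", "World", "pos2", "X,Y")})
    advance_to_next_timelayer(wm)  # time is now 10.0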
B. Planning of own tasks as a search problem

Planning describes the problem of creating a sequence of tasks to be executed by the agent, with the aim of transitioning from the current environmental state into some desired target state. The current environment state is primarily described by the matching concepts, whereas the target state results from the goals & constraints identified during the Goal Determination phase. The set of possible actions of the agent is described as TASK OPTIONS, where a task is defined by a set of preconditions that must be satisfied to enable it and the expected effects that result from its execution. If the planning process was successful, the selected subset of tasks is stored as the task agenda for execution. A sketch of such a task definition follows below.

Figure 8. The projection based planning approach reuses knowledge on CONCEPTS and MOTIVATIONAL CONTEXTS

1) Projection based planning approach: The idea of central knowledge representation suggests that a-priori knowledge used elsewhere to describe portions of the ontology is reused whenever applicable. In traditional planning languages, the new environment state is often modeled as a direct effect of executing a task. In contrast, the approach taken in the COSA2 framework is based on projection, as shown in Fig. 8: A new search state is expanded by applying the effects of a feasible task candidate and subsequently interpreting the resulting situation (using the inference-based subfunctions as described in Section III-A).

A physical effect to be applied by the agent is often split into an active part, sending the respective command, and a passive part, caused by the anticipated environment reaction to the command and modeled as an L-3 concept rule. If, e.g., the mission requires the UAV to relocate, modeling of this process is split into the activity of sending the command to the flight management system (Task: FlyToX in Fig. 10) and the environment reaction of transitioning the UAV to the new location (L-3: EnrouteToX). The environment reaction is represented by an L-3 concept rule describing the anticipated achievement of the new physical position after a certain duration. The agnostic semantics of an L-3 rule as explained in Section III-A3 are preserved: Assuming the command to program the new waypoint (FlyToX) happens instantaneously and can never fail, the time to reach the new position is naturally described by the duration of the L-3 rule (EnrouteToX), which is automatically monitored for failure by the architecture. This allows the consistent integration with the perception process described in the previous section.
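The FlyToX / EnrouteToX split can be sketched as follows; the dictionaries mirror the records used in the earlier sketches, and all labels and values are illustrative:

    # Active part: an instantaneous, never-failing command to the
    # flight management system (a modeling assumption stated above).
    task_fly_to_x = {
        "name": "FlyToX",
        "effects": {("UAV", "FMS", "cmdPos", "X")},
    }

    # Passive part: the anticipated environment reaction, modeled as an
    # L-3 concept rule whose duration covers the transition to X.
    l3_enroute_to_x = {
        "name": "EnrouteToX",
        "precondition": "cmdPos != currentPos",  # deviation criterion
        "duration": 10,                          # timelayers until arrival
        "effects": {("UAV", "World", "pos", "X")},
    }
    # During planning, expanding a search state applies the task effects
    # and then runs the inference-based subfunctions, so EnrouteToX
    # projects the new position onto the future timelayer; at execution
    # time the same rule is monitored for failure by the architecture.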
2) Transitioning from perception to planning in the example: Fig. 10 shows a valid agenda, that is, a solution to the mission management example given in Section III-A5. During the inference-based subfunctions, the agent perceived a moving target, as indicated by the respective L-3 rule. As mentioned, this describes the anticipated, nominal environment, should the rule's preconditions remain valid.

In the given scenario, the ACU may utilize the external development to its own benefit by scheduling an interception course at the right time. This is indicated by the task FlyToX (shown in red) and the L-3 rule EnrouteToX (shown in green) it triggers. As before, the L-2 rule ComputeDistance determines the new distance between the UAV and the position of the target. A sufficiently low distance in turn establishes the required precondition for two follow-up tasks. TakePhoto will lead to the accomplishment of the mission order and the retraction of the respective goal.

3) Planning architecture in COSA2: In order to efficiently tackle the computational complexity related to planning, three different components are involved in the planning (and replanning) process in COSA2, as shown in Fig. 9. The sequence in which the components are invoked is illustrated in Fig. 3. The planning subfunction transitions through several substates, depending on the state of the task agenda. Should the Goal Determination subfunction report unsatisfied goals or violated constraints, the planning subfunction is initiated by setting the state of the task agenda to UNCHECKED.

Figure 9. COSA2 planning architecture

4) PDDL planning: For initial planning or if replanning is required, an external PDDL planner is used to generate a baseline solution, using nominal parameter values (shown in orange). PDDL [17] is an open academic standard to describe planning problems. It originated from the state-based STRIPS [15] description of an action by a set of preconditions and effects. A task may be selected for execution if its preconditions are fulfilled in the current state. Execution of a task transitions to a follow-up state by applying the task's effects. During the compilation of the COSA2 knowledge model, each task and rule is translated into a PDDL action. Weighted edges of BOOLEAN type are represented as PDDL predicates, INT or FLOAT types as PDDL fluents. Once a valid baseline solution is found, the agenda state transitions to PLANNED. The semantics of the inference-based projection phase are not naturally available in PDDL and are encoded as conditional effects of actions.
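To illustrate the direction of this translation, the following sketch emits a schematic PDDL action for a task; the mapping function and the emitted names are assumptions, not the actual COSA2 compiler:

    def to_pddl_action(name, bool_preconds, bool_effects):
        """BOOLEAN weighted edges map to PDDL predicates; INT/FLOAT
        edges would map to PDDL fluents (omitted here for brevity)."""
        pre = " ".join(f"({p})" for p in bool_preconds)
        eff = " ".join(f"({e})" for e in bool_effects)
        return (f"(:action {name}\n"
                f"  :precondition (and {pre})\n"
                f"  :effect (and {eff}))")

    print(to_pddl_action("TakePhoto",
                         ["in-range uav target"],
                         ["photographed target"]))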
5) Linear optimization and scheduling: Once a valid task agenda is generated, the planning subfunction is reduced to determining whether the existing agenda is still valid and whether progressing with its execution is still in line with the goals. Comparable to the long-lasting intentions of a BDI-agent design, we follow the convention to stick to the structure of a generated agenda as long as possible. This avoids the computationally expensive recomputation of a plan and gives the framework a certain long-term momentum that prevents numeric oscillations with minimal operational benefit. Besides the structure of an agenda (that is, the order of selected tasks), the numeric parameters of the agenda require constant adjustment when executing the plan in a real-world, continuous environment. In particular, floating-point parameters, such as scheduled execution times for tasks or threshold distances that trigger new tasks, usually never precisely hit their nominal values when executed in reality.

However, the essential information that is captured in the structure of the agenda - rather than sticking literally to nominal parameter values - is the set of logical constraints among its elements that must be fulfilled in order to retain the agenda's causal structure (shown in the graph representation of the agenda, e.g. Fig. 10).

Figure 10. Agenda graph during execution, representing a solution to the planning problem

a) Plan Analysis: Once the PDDL planner has found a solution, it is imported into COSA2 and its causal structure is analyzed (shown in blue, Fig. 9). This happens by stepping through each task and inference step while recording the dependencies between added effects and required preconditions, as shown in Fig. 10. A task that adds a precondition required later on by a rule leads to a temporal ordering constraint that enforces the task to be executed before the rule fires. In addition, numeric constraints on variables are recorded, e.g. a maximum threshold for a distance that occurs in a precondition of a rule contained in the structure of the agenda. The state of the agenda is not modified by this.

b) Numeric Agenda Adjustment: Once the temporal and causal constraints of the baseline solution have been analyzed, they are stored and fed into the constraint-satisfaction program CPLEX and solved for the decision variables on each subsequent iteration of the cognitive process.

Decision variables are the start times of tasks and the start/end times of L-3 rules. Initially, the start time for task FlyToX is a decision variable with a yet undetermined value. Timetags are easily computed using the task-agenda graph (Fig. 10). A timetag for the firing of a rule equals the latest (highest) timetag of all its precondition edges. Outgoing edges of a rule indicate a modification of the edge by the rule - the respective timetag of the edge therefore equals the timetag of the rule. While the start time for a task is not constrained any further, the end time of an L-3 rule equals its start time plus its duration.
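The timetag propagation just described amounts to a forward pass over the agenda graph in causal order; a minimal sketch, with the node layout assumed for illustration:

    def compute_timetags(agenda):
        """agenda: nodes in causal (topological) order; each node is a dict
        with 'preconds' (edge names it consumes), 'outputs' (edges it
        modifies) and, for L-3 rules only, a 'duration'."""
        edge_time = {}  # earliest time at which an edge value is available
        for node in agenda:
            # A rule fires at the latest timetag of its precondition edges.
            start = max((edge_time.get(e, 0.0) for e in node["preconds"]),
                        default=0.0)
            # An L-3 rule completes after its duration; tasks are instantaneous.
            end = start + node.get("duration", 0.0)
            for e in node["outputs"]:
                edge_time[e] = end  # outgoing edges carry the rule's timetag
        return edge_time

    agenda = [
        {"preconds": [], "outputs": ["cmdPos"]},                       # FlyToX
        {"preconds": ["cmdPos"], "outputs": ["pos"], "duration": 10},  # EnrouteToX
        {"preconds": ["pos"], "outputs": ["distance"]},                # ComputeDistance
    ]
    print(compute_timetags(agenda))
    # {'cmdPos': 0.0, 'pos': 10.0, 'distance': 10.0}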
Besides the temporal aspects, the architecture can keep track of numerical parameters of a task, which are given a nominal value for PDDL planning and an upper and lower bound for numerical adjustment. COSA2 therefore traces the computations and transformations performed on the parameter throughout the agenda graph, together with the numeric preconditions required (e.g. the distance computation within the ComputeDistance rule). Together, temporal and numeric constraints encode the causal structure of the task agenda as a constraint satisfaction problem.

If the constraint satisfaction problem can be solved, the numerical values of the task agenda are updated and the state transitions to VALID. If no solution is found, execution of the remainder of the agenda has become infeasible, the agenda state transitions to FAIL, and a complete replanning and plan analysis cycle is initiated.

In contrast to the combinatorial planning problem required to generate the structure of the agenda, which is known to be PSPACE-complete in general [22], incremental updates for the plan adjustment can be solved using a linear optimization algorithm, which is computed very efficiently. Adjustment of the numeric parameters leads to greater flexibility in plan execution.

c) Agenda progression monitoring: Once a valid agenda has been found, a key function, located in between environment perception and planning, is to determine the new progression stage within the agenda - that is, to determine which events of the agenda have already occurred and what is to be done next. As already indicated, dynamic execution of the agenda cannot literally stick to the scheduled times or numerical values, but must consider the causal relationships within the agenda instead.

Again, L-3 rules play a critical role, as all numerical and temporal offsets originate in deviations between the expected, nominal environment reaction and reality. Determining the progression state therefore boils down to the question: which environment reactions, represented as L-3 rules in the agenda, have already occurred? This, in turn, depends on the context of the remaining agenda, as the observed environment reaction needs to be "close enough" to allow the remainder of the agenda to be executed. Note that the deviation criterion used for the perception of the L-3 rule, as explained in Section III-A, is not sufficient in general, as the progression stage transitively depends on the remaining agenda.

For each currently perceived L-3 rule, the architecture evaluates whether the L-3 rule has already been achieved, by inserting for each of its effects the current values, while using the nominal values for the effects of all other L-3 rules. If the constraint satisfaction problem can be solved, the respective L-3 rule is marked as achieved. Also, all rules that transitively depend on it are marked as achieved and their effects are fixed to the solution values. Should an L-3 rule not be achieved after its duration has elapsed, the architecture considers it failed and reports an execution monitoring violation.

Fig. 10 shows the agenda during execution with progression stage indications. A yellow cap on the left of a rule indicates that its preconditions are achieved, whereas a yellow cap on the right indicates that the rule's effects are fixed. Note also that after the linear optimization phase, the decision variables have been replaced by their solution values, e.g. the concrete timing information is now attached. Also, the figure indicates two L-3 rules occurring in the environment, with about a fifth of their expected duration elapsed. The red line marks the time elapsed since the agenda was planned.

C. Task Execution and progression monitoring by reactive pattern recognition

1) Subfunctions within the procedure-based layer: According to Rasmussen [6], human behavior is dominated by routine procedures, optimized with respect to everyday situations. This procedure-based behavior increases performance in environments with recurring situational patterns that trigger a predetermined action sequence as reaction. In the modified Rasmussen scheme [7], the Task Determination subfunction is responsible for matching such patterns (TASK SITUATIONS) and associating with them a predefined, routine action procedure for fast, reactive behavior.

In the context of procedure-based behavior, this can be rudimentarily emulated by directly triggering a task from a Feature Formation rule. In the context of concept-based behavior, as focused on in this article, the subfunction Task Determination detects when to toggle the next task from the agenda.

A task may itself be composed of several low-level actions that are stored as automated procedures. Execution of the procedure is controlled by the Task Execution subfunction, while Action Control sends the low-level actions to the output interface.

2) Task Determination: Once a task agenda is generated, the agent needs to successively execute the tasks in it, each at the appropriate time. A task is executed if (1) its preconditions are achieved and (2) its scheduled start time has elapsed. Both conditions become trivial, since the reasoning on when to initiate the next task is primarily done by the agenda progression monitoring step of the concept-based planning subfunction, as explained in the previous section.
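The two execution conditions then reduce to a simple check over the agenda; a sketch with assumed field names:

    def next_task_to_execute(agenda, now, preconds_achieved):
        """Toggle the next task whose (1) preconditions are achieved and
        (2) scheduled start time has elapsed."""
        for task in agenda:
            if (not task["executed"]
                    and preconds_achieved(task)
                    and task["start_time"] <= now):
                return task
        return None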
IV. CONCLUSIONS

In this article, we presented the architectural and algorithmic concepts for a new cognitive system architecture (COSA2). The architecture is an attempt to build an intelligent agent framework on the basis of the modified Rasmussen model of human performance.

It encompasses unified functions for goal-driven situation interpretation, planning and plan execution. A unique feature is the claim for a centralized knowledge representation, enabling the reuse of knowledge modeled once, whenever applicable, to ensure consistency. Three algorithmic classes (inference-based, search-based and based on reactive pattern matching) have been identified and prototypically implemented to close the loop on deliberative behavior. Tighter coupling between the subfunctions, including subsymbolic, skill-based behavior, as originally described in Rasmussen's model of human cognition, is subject to current research. Along this effort, fast, reactive behavior shall be integrated as part of the procedure-based layer.

A first successful flight experiment using the COSA2 framework embedded on a fixed-wing UAV was conducted last year in collaboration with the Naval Postgraduate School, Monterey [23]. Future areas of application for COSA2 will be in the field of assistant systems and task-based UAV guidance. This effort also encompasses the integration of the framework with airborne hardware and sensors, to evaluate the performance of the architecture in real-world scenarios.

REFERENCES

[1] H. Putzer and R. Onken, "COSA - a generic cognitive system architecture based on a cognitive model of human behavior," Cognition, Technology & Work, vol. 5, no. 2, Springer, 2003, pp. 140–151.
[2] A. Schulte, C. Meitinger, and R. Onken, "Human factors in the guidance of uninhabited vehicles: Oxymoron or tautology? The potential of cognitive and co-operative automation," International Journal on Cognition, Technology & Work, Heidelberg, Germany: Springer, 2008.
[3] J. Uhrmann and A. Schulte, "Task-based guidance of multiple UAV using cognitive automation," in COGNITIVE 2011, The Third International Conference on Advanced Cognitive Technologies and Applications, 2011, pp. 47–52.
[4] W. Pecher, S. Brueggenwirth, and A. Schulte, "Using cognitive automation for aircraft general systems management," in 5th International Conference on System of Systems Engineering (SoSE), IEEE, 2010, pp. 1–8.
[5] J. E. Laird, A. Newell, and P. S. Rosenbloom, "Soar: An architecture for general intelligence," Artificial Intelligence, vol. 33, 1987, pp. 1–64.
[6] J. Rasmussen, "Skills, rules and knowledge; signals, signs and symbols, and other distinctions in human performance models," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-13, 1983, pp. 257–266.
[7] R. Onken and A. Schulte, System-Ergonomic Design of Cognitive Automation: Dual-Mode Cognitive Design of Vehicle Guidance and Control Work Systems. Springer, 2010.
[8] T. Johnson, "Control in ACT-R and Soar," in Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society, 1997, pp. 343–348.
[9] M. Wooldridge, An Introduction to Multiagent Systems. Wiley, 2009.
[10] I. A. Ferguson, "TouringMachines: An architecture for dynamic, rational, mobile agents," Ph.D. dissertation, University of Cambridge, 1992.
[11] E. Gat et al., On Three-Layer Architectures. Cambridge: AAAI Press, 1998.
[12] M. Bratman, Intention, Plans, and Practical Reason. Cambridge, MA: Harvard University Press, 1987.
[13] F. Meneguzzi, A. Zorzo, and M. da Costa Mora, "Propositional planning in BDI agents," in Proceedings of the 2004 ACM Symposium on Applied Computing, Nicosia, Cyprus, 2004, pp. 58–63.
[14] M. Ginsberg and D. Smith, "Reasoning about action I: A possible worlds approach," Artificial Intelligence, vol. 35, no. 2, Elsevier, 1988, pp. 165–195.
[15] R. Fikes and N. Nilsson, "STRIPS: A new approach to the application of theorem proving to problem solving," Artificial Intelligence, vol. 2, no. 3-4, Elsevier, 1971, pp. 189–208.
[16] J. McCarthy and P. Hayes, Some Philosophical Problems from the Standpoint of Artificial Intelligence. Stanford University, 1968.
[17] M. Fox and D. Long, "PDDL2.1: An extension to PDDL for expressing temporal planning domains," Journal of Artificial Intelligence Research, vol. 20, no. 1, 2003, pp. 61–124.
[18] B. Williams and P. Pandurang Nayak, "A reactive planner for a model-based executive," in International Joint Conference on Artificial Intelligence, vol. 15, 1997, pp. 1178–1185.
[19] M. R. Endsley and D. J. Garland, Situation Awareness Analysis and Measurement. Mahwah, NJ: Lawrence Erlbaum Associates, 2000.
[20] S. Brueggenwirth, R. Strenzke, A. Matzner, and A. Schulte, "A generic cognitive system architecture applied to the multi-UAV flight guidance domain," in Proceedings of the International Conference on Artificial Intelligence and Agents, Crawley, UK, 2010, pp. 292–298.
[21] A. Matzner, M. Minas, and A. Schulte, "Efficient graph matching with application to cognitive automation," in Applications of Graph Transformations with Industrial Relevance. Berlin, Germany: Springer, 2008, pp. 297–312.
[22] T. Bylander, "The computational complexity of propositional STRIPS planning," Artificial Intelligence, vol. 69, no. 1-2, Elsevier, 1994, pp. 165–204.
[23] S. Clauss, S. Brueggenwirth, P. Aurich, A. Schulte, V. Dobrokhodov, and I. Kaminer, "Design and evaluation of a UAS combining cognitive automation and optimal control," in Proceedings of Infotech@Aerospace. American Institute of Aeronautics and Astronautics, 2012.

