
Second-Order Cybernetics
ARTIFICIAL INTELLIGENCE CONCEPTS IN Second-Order Cybernetics

Foresight Rather than Hindsight?
Future State Maximization As a Computational Interpretation of Heinz von Foerster's Ethical Imperative

Hannes Hornischer • University of Graz, Austria • hannes/at/hornischer.net
Simon Plakolb • University of Graz, Austria • simon.plakolb/at/uni-graz.at
Georg Jäger • University of Graz, Austria • georg.jaeger/at/uni-graz.at
Manfred Füllsack • University of Graz, Austria • manfred.fuellsack/at/uni-graz.at

> Context • Many AI and machine-learning techniques are primarily focused on past-to-future extrapolations of statistical regularities in large amounts of data. We introduce a method that builds on an in-action sampling of probes from possible futures with preference for those that prove promising for maximizing the perceivable space of possibilities. This foresight-oriented (rather than hindsight-oriented) method is particularly promising for handling non-linear or abruptly emerging developments.
> Problem • What von Foerster called the Ethical Imperative seems less strictly derived from physical principles than other well-known concepts in his work. Regarding investigations in recent AI research, however, it appears that the Ethical Imperative corresponds almost literally to the so-called principle of Future State Maximization, a principle that lately has been applied successfully to a range of coordination and learning tasks.
> Method • We discuss the principle of Future State Maximization, as previewed by von Foerster, against a background of a general need for tackling uncertain futures by way of modeling, and introduce three computational investigations on different coordination tasks based on Future State Maximization.
> Results • We show that the principle of Future State Maximization corresponds to von Foerster's Ethical Imperative and to constructivist principles, and that it lends itself to opening up interesting new horizons for AI research.
> Implications • The article suggests an interpretation of how von Foerster's Ethical Imperative can be understood as a foresight- rather than hindsight-oriented method against a background of computer-based modeling and AI research. Furthermore, it shows that computer-based methods conform well with the epistemology of constructivism.
> Keywords • Agent-based modeling, coordination and learning tasks, Ethical Imperative, Future State Maximization, Heinz von Foerster.


Introduction

« 1 »  When asked how to deal with the future, to scientists it may not seem far-fetched to suggest the use of a simulation model. Using a model can help to navigate uncertain futures. However, as we shall see, there seems to be an even more fundamental and generic principle under which model use can be subsumed and which currently is subject to a range of insightful scientific investigations in artificial intelligence (AI). To introduce this principle, we will take a short detour and argue that model use is not only a possible answer to our initial question, but already a precondition for asking it. The reason for this is, as we propose here, that space and time are constructions emerging from the use of models. No model means no space and no time, and hence also no future.

« 2 »  Let us illustrate this with bacteria in a Petri dish containing patches with nutrients. These bacteria prosper where there is food and they die where there is none. To live and to die, in this case, does not necessitate the use of a model.

« 3 »  Now, assume that some bacteria have a basic facility for nutrient procurement, such as the ability to follow a gradient to higher concentrations. Obviously, these bacteria have an advantage. However, this simple ability already implies some fundamental concepts, which, as such, enlarge the space of possible actions of the bacteria. As a minimum, it implies two distinctions: "me here"/"food there" and "now hungry"/"then satiated." In other words, it implies a spatially and temporally structured environment, which cannot be had other than within a virtual reconstruction of what is at stake in the given situation. And this means that it implies a model: clearly, a very basic model, at first, a model, however, that includes self-positioning – a self-model (Metzinger 2007) – and that holds a conception of space and time. The future, one of the aspects of time, thus emerges from the use of models that provide evolutionary advantage through opening up additional, though maybe just virtual, options.
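The gradient-following behavior described above can be sketched in a few lines. The 1-D nutrient field and the step rule below are illustrative assumptions of ours, not part of the article's experiments; the point is only that even this reactive rule already compares "me here" with "food there":

```python
# Minimal sketch (illustrative assumptions): a "bacterium" on a
# one-dimensional nutrient gradient moves toward higher concentration.
# Probing the concentration left and right of the current position is
# the me-here/food-there distinction discussed in the text.

def nutrient(x: float) -> float:
    """Toy nutrient concentration, peaking at x = 10."""
    return -(x - 10.0) ** 2

def follow_gradient(x: float, steps: int, dx: float = 0.5) -> float:
    """Move toward the neighboring position with more nutrient."""
    for _ in range(steps):
        left, right = nutrient(x - dx), nutrient(x + dx)
        if max(left, right) > nutrient(x):
            x = x - dx if left > right else x + dx
    return x

print(follow_gradient(0.0, steps=50))  # → 10.0
```

Starting from x = 0, the rule climbs the gradient and settles at the nutrient peak.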

Handling Editor • Alexander Riegler • Vrije Universiteit Brussel, Belgium
CONSTRUCTIVIST FOUNDATIONS vol. 16, N°1

« 4 »  This advantage becomes obvious as soon as the options include an "acting"/"not acting" distinction, e.g., in situations in which the option of acting involves risks. A system with such a model can pre-test and evaluate decisions about whether to act, and it can anticipate consequences of actions. Such an anticipatory system (Rosen et al. 2012) would be able to design strategies for, e.g., food procurement in the safe virtual scenario of its model. Maintaining a model may be costly, though. However, when these costs get warranted in evolutionary selection, anticipatory systems can become complex, along with the models they use. The number of options increases as the model makes it possible to consider more and more details, up to the point where their number triggers decision problems and thus creates a higher-order demand for guiding principles such as norms, morals and imperatives, which again are inconceivable other than within a (next-order) model.

« 5 »  What we aim to point out with all this is that the future, on a most general level, can be handled by generating additional, albeit just virtual, options, as is exemplarily done when deploying a model. This may seem counterintuitive, at first, as it implies mitigating uncertainty through the increase of uncertainty, i.e., fighting fire with fire. Nevertheless, what we claim is that an answer to the question of how to handle the future could, on a most general level, read: Choose your options so as to increase the number of possible further options.

« 6 »  Such a claim may sound familiar to constructivists, as it coincides with a maxim that second-order cybernetician Heinz von Foerster called the Ethical Imperative: "Act always so as to increase the number of choices" (Foerster 2003: 227).1 And, as mentioned, it also conforms to a method that has recently gained much interest in various areas of AI research, where it is applied in coordination and learning tasks, from cosmology (Bousso et al. 2007) to geosciences and biology (Martyushev & Seleznev 2006), to computer science (Charlesworth & Turner 2019). The method has a physical generalization under the name Causal Entropic Forces (CEF) (Wissner-Gross & Freer 2013; Hornischer 2015) and a computer-based variant under the term Fractal AI (Cerezo & Ballester 2018; Cerezo, Ballester & Baxevanakis 2018). And it has a somewhat older predecessor with the name Empowerment (Klyubin, Polani & Nehaniv 2005). In the context of this article, we will follow the suggestion of Henry Charlesworth and Matthew Turner (2019) and discuss it as Future State Maximization (FSX).

« 7 »  A core aspect of the method is what could be called a "pronounced orientation toward the future." Foresight, rather than hindsight, is the unifying orientation of its manifold variants, with which they meet yet another core aspect in the interests of von Foerster (Foerster, Mora & Amiot 1960; Foerster 1972). In contrast to other AI and machine-learning techniques, FSX is not primarily focused on past-to-future extrapolations of statistical regularities in large amounts of data, but builds on an in-action sampling of probes from possible futures with preference for those that prove promising for maximizing the perceivable space of possibilities. With this, the principle seems particularly promising for handling non-linear or abruptly emerging developments, prone to critical phase transitions or even singularities. In view of forecasts on current global developments like climate change, digital transformation or the social and economic consequences of Covid-19, FSX could prove useful as a methodology for dealing with highly dynamic conditions, as we will show in more detail at the end of this article.

« 8 »  Of particular interest is that FSX seems to conform to fundamental constructivist assumptions. Note that the above formulation addresses the maximization of the perceivable space of possibilities, which alludes to the space that is perceivable within the framework of the model used. As we will discuss in more detail in the context of the third experiment we introduce, FSX builds on the use of a model that emerges as a sort of eigenvalue – a temporarily stable configuration in the sense of von Foerster – from an iterated interplay of irritations (for more details, see also Füllsack 2018). In other words, FSX can be based on models that emerge and are adapted in action. The principle to maximize future options thus resonates with the concept of operational closure, illustrated by Maturana and Varela's (1987: 137) well-known metaphor about the submarine pilot who, on being congratulated for avoiding underwater reefs, answers that all she did was to read certain dials and maintain correlations between indicators within the limits of the equipment of the submarine. FSX, like this pilot, operates on the basis of its own on-board means.

« 9 »  In the rest of the article we will proceed as follows: First, we outline the derivation and justification of the Ethical Imperative in von Foerster's constructivist framework and show how it may conform to FSX. Afterwards, we introduce the principle in more detail, briefly compare its variants, and illustrate its functioning using the example of a simple grid-world-navigation task. In the subsequent three sections we present simulation experiments regarding
• the group dynamics in game-theoretic payoff optimization;
• the coordination of a water-based robot swarm in data gathering tasks; and
• a revised version of the simple grid-world example, which we updated to conform with basic constructivist assumptions.

« 10 »  These simulation experiments are supposed to demonstrate the power of the principle to yield interesting insights in very different problem fields. In the context of the last one, we review the implications the principle may have for navigating uncertain futures. In the conclusion, finally, we venture into the risk of an interpretation of how von Foerster could have seen his Ethical Imperative against the background of the FSX principle. Last but not least, this article should be considered as yet another illustration of how computer-based methods conform to the epistemology of constructivism (Füllsack 2013).

1 | As pointed out on http://www.cybsoc.org/heinz.htm (retrieved on 17 August 2020), von Foerster later changed this wording to: "I always try to act so as to increase the number of choices," which seems more in line with his idea of separating ethics (which is only about the "I") from morality/morals (about the "Thou/You").
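Read computationally, "act always so as to increase the number of choices" suggests a greedy rule over a (model-based) state graph: of the actions currently available, take the one whose successor state offers the most onward options. A minimal sketch; the state graph below is invented purely for illustration:

```python
# Hedged sketch: a toy state graph (states and transitions are made up
# for illustration). The agent follows the Ethical Imperative literally:
# move to the successor state that itself offers the most choices.

GRAPH = {            # state -> reachable successor states
    "corner": ["edge"],
    "edge":   ["corner", "center"],
    "center": ["edge", "open1", "open2", "open3"],
    "open1":  ["center", "open2"],
    "open2":  ["center", "open1", "open3"],
    "open3":  ["center", "open2"],
}

def step(state: str) -> str:
    """Choose the successor offering the largest number of onward choices."""
    return max(GRAPH[state], key=lambda s: len(GRAPH[s]))

s = "corner"
for _ in range(2):
    s = step(s)
print(s)  # → "center": confined states are abandoned for option-rich ones
```

Note that the rule uses only local information about successor states, anticipating the grid-world illustration discussed below.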

https://constructivist.info/16/1/036.hornischer
Heinz von Foerster's Ethical Imperative

« 11 »  A central aspect of von Foerster's constructivist conception was the claim that interacting dynamics generate stable and lasting forms – "objects" – irrespective of their initial conditions (Foerster 1981). Interactions run up to stabilities, even if there is no stable ground whatsoever for starting them in the first place. They suffice to create structure, patterns and forms. There is no need to assume the existence of an a priori structured mind-independent reality (Riegler 2007). Order may simply emerge from noise. This can be read as the core proposition of constructivism.

« 12 »  Von Foerster took this insight from mathematical as well as from psychological findings, such as the insight that simple recursive operations – i.e., operations that feed their output back into themselves as input for generating the next iteration's output – may have eigenvalues or eigenforms to which they converge, regardless of their starting conditions. For example, the operation op(x) = x / 2 + 1 converges to the eigenvalue of 2, irrespective of the integer it is started from. Analogously, in psychology, Jean Piaget (1960) referred to the "object invariance" emerging in the course of an iterated interplay of observation and movement of a person trying to come to terms with her environment. A small child, for instance, may appear irritated by a sensory input, an observation, and react to it with coordinative movement in order to come to terms with the irritation. The movement, however, changes the observational viewpoint of the child, albeit slightly, and thereby again triggers an observation that might induce movement and so on. Step by step from these sensorimotor recursions, the child, just by reacting to irritations, may construct stable concepts of the irritation – a ball with which she can play, the food in front of her, etc. – which eventually become part of the child's "external reality." Hence, in the course of her recursive computations (sensu von Foerster), the child creates virtual stabilities, to which higher-level recursions can connect and subsequently mutually condition the emergence of sufficiently robust conceptions. In "cognitive homeostasis" (Chin 2007), these conceptions may eventually be perceived as a "reality." "The nervous system," von Foerster (2003: 225) writes, "is organized (or it organizes itself) so that it computes a stable reality."

« 13 »  As known from the literature, such a view on cognition may provoke a strong objection that it is about solipsistically enclosed entities whose world is just a world to themselves and who therefore claim to be the center of the world. If whatever is outside is just a projection from the inside, then this inside may be seen as all that matters. Von Foerster was well aware of the ethical implications of this objection. Therefore, he stressed that solipsists may project other solipsists with similar views, and argued that when two such solipsists (be they only mutual projections to each other) insist on each being the world's center, a contradiction arises. Consequently, they will argue with each other and thus interact. And this bears the solution. By interacting, by reacting to mutual irritations, they may generate eigenforms, which eventually evolve into stable "selves." This led von Foerster (2003) to formulate the equation "reality = community." Since no solipsist can exist without others, each solipsist would do well to grant the same rights to them.

« 14 »  In several of his writings, von Foerster lets this reasoning directly be followed by the Ethical Imperative. The logical link, however, between the arguments against solipsism and the imperative itself seems a bit vague, one may argue, in particular if compared with other propositions of his oeuvre, which are diligently derived from physical principles. Von Foerster himself repeatedly claimed that his considerations on ethics were reflections inspired by epistemology rather than the result of a linear causal dependency deriving from the premises of epistemology. "For if it were a consequence it would be a necessity. I assert however: it is not a necessity. It is an attitude that we can select from amongst all possible other attitudes" (Foerster & Bröcker 2002: 64). However, even as just an attitude, the logical foundation of the conclusion may appear somewhat weak, or at least not as steadfast as one would wish. From the perspective of the FSX principle, however, it seems that the vagueness could be mended to some extent and a sort of logical reasoning could be interpreted into it via the reference to models, as argued above. We suggest that conceiving constructions such as the "self" or the "other" as components of a mental model, i.e., as a virtual conception that creates new options, makes it possible to derive the Ethical Imperative as a societal extrapolation of the very basic principle that is expressed in it, i.e., the principle of choosing one's options so as to increase the number of future options.

« 15 »  To see this more clearly, let us have a look at some details of the FSX principle.

Future State Maximization

« 16 »  FSX is a quantifiable – and hence computer-implementable – way of using the information (sensu Shannon) a model provides for testing and optimizing future options. Its core aspect is a suggestion for quantifying the degrees of freedom (or options) an agent (e.g., an organism) has in controlling its environment based on a model, with the model itself possibly emerging in the course of this quantification. More precisely, the concept distinguishes a specific value for an agent's current state according to its model of its environment and the respective values of all possible states as perceivable with this model, weighted by their probability of opening further options. That is, the current control over the agent's environment contrasts with the possible control that it could reach, given an action within its control. The agent uses these differences for directing its actions towards states that, compared to the current state, maximize its options. To put it very simply, the agent (or the organism), according to the FSX principle, simply screens all current options that are perceivable with its on-board means, compares their values, and then goes for the option that promises most control, with control here meaning possibilities for further actions. By striving to maximize its control, the agent simply selects actions that increase its space of possibilities, thereby following a very simple, but universal principle that is solely based on local information.

« 17 »  Technically, the principle can be implemented in several ways. In the case of empowerment, for instance (Klyubin, Polani & Nehaniv 2005), the maximum amount of options is defined as the "maximal potential causal flow" (Ay & Polani 2008) from an agent's actuators (the means it has for performing some action) to its sensors (the means it has for sensing the state of its environment). This is formalized as channel capacity in the sense of Claude Shannon (1948), i.e., as the maximum mutual information, measured in bits, that a probability distribution of received signals on average contains about the probability distribution of the signals that were originally sent.2

« 18 »  Another implementation of the FSX principle utilizes the connection between options and an information-based entropy, where the so-called causal entropy is defined on a finite sequence of actions (Wissner-Gross & Freer 2013). This entropy reflects the amount of possible future evolutions associated with a certain state of the system, i.e., the amount of options available to an agent starting from a certain state of the system. From the causal entropy one can derive the causal entropic force, which points along the gradient of the causal entropy. Letting an agent move along the causal entropic force, it effectively maximizes the amount of accessible future states.3 The two implementations of the FSX principle, empowerment and causal entropic force, are generally closely related.

« 19 »  For an illustration and an exemplary implementation of the FSX principle, consider the following scenario of an agent being positioned on a two-dimensional finite grid world, in which five kinds of actions are possible: going North, going East, going South, going West, or staying put. An agent with a self-model that has a horizon of one time step will be able to check these five possibilities with respect to its neighboring patches. Placed somewhere in the center of the grid world, it will perceive that any of these five actions is possible. However, if placed at one of the edges, it will perceive that it has one option fewer. Positioned in a corner, it even has two options fewer. If this agent is made to follow the empowerment principle, it will thus move away from the corners and edges, because avoiding them means higher empowerment. Figure 1a shows such an agent having moved two steps according to the empowerment principle, with the one-step-horizon empowerment values indicated.4

« 20 »  An agent in the same situation with a larger horizon, however, i.e., with the ability to look further into the future, will move towards the center of the grid world, since there, empowerment will be maximum (Figure 1b).

Figure 1 • (a) Empowerment agent moving within one-step horizon, empowerment values indicated. (b) Empowerment agent within five-step horizon.

2 | For a formal definition of empowerment, see Klyubin, Polani & Nehaniv (2005).
3 | For a formal discussion of causal entropy, see Wissner-Gross & Freer (2013).
4 | Empowerment here is calculated as the base-2 logarithm of visible actions, that is E = log₂(2n² + 2n + 1) for patches without constraints, with n indicating the number of steps the agent can look into the future.

« 21 »  However, as easily imaginable, beyond the limited complexity of a grid world, the number of options an agent has to compare can become vast. For this reason, related concepts such as CEF and Fractal AI build on computing not all, but a sufficiently large sample of the options an agent perceives with its model. The idea here is to equip agents with probes, which we call "walkers" and which, similarly to test pilots, virtually rate options in the agent's model. Walkers can pre-screen the possibility space of the agent by simulating model-based actions to determine further steps through reorienting based on the distribution of the actions' consequences.

« 22 »  To accomplish this task, the density of walkers in different parts of the (modeled) future horizon is altered by redistributing them in correspondence to the consequences of options at a time t in the future. The causal slice in Figure 2 (blue ellipsis) represents the set of consequences at time t resulting from initial actions. If certain regions in this causal slice appear more promising (green circles compared to red circles) in regard to the maximization of options, the walkers are redistributed proportionally to the assessed prospectivity, i.e., a fraction of them is cloned to the new region and deleted at the point where they are. The size of the fraction is determined based on how much exploration over exploitation is to be considered. In this way, it can be guaranteed that walkers prevalently screen promising futures while not completely ignoring the less promising ones, which may prove advantageous at another moment in time (see Code 1).

Figure 2 • Causal cone as viewed by the agent x at time step 0, with walkers sampling future paths indicated in green (prospective) and red (detrimental).

« 23 »  In the simple grid-world example of Figure 1, as well as in the examples in the next two sections, the on-board means of the agent, i.e., the walkers and their determining parameters, are specified at setup. Conceptually, however, the on-board means could be seen as emerging in the course of an agent's interactions with its environment, i.e., as "constructed" stabilities in the sense of von Foerster (1981). The construction, i.e., the development of parameters such as the number of walkers, the number of steps they can look into the future, and their adaptability (i.e., the variability of the size of the fraction of them being cloned in function of exploiting or exploring new options) can be conceived in concordance with the FSX principle. In other words, the model and its components evolve if this evolution increases the agent's fitness, which in the present context amounts to an increase in its number of choices, as will be illustrated in Example 3, where we will revisit the grid-world example and show that an evolution along the FSX principle can be initiated from very basic starting conditions.

    // INITIALIZATION: Create N walkers with copies of the system's state:
    FOR i := 1 TO N DO BEGIN
        // Walkers start at the system's initial state:
        Walker(i).State := System.State
        // Take walker's initial decision:
        Walker(i).Initial_decision := random values
    END
    // SCANNING PHASE: Evolve walkers from time = t to t + Tau in M ticks:
    FOR t := 1 TO M DO BEGIN
        // PERTURBATION:
        FOR i := 1 TO N DO BEGIN
            // At first tick use the stored initial decision:
            IF (t = 1) THEN
                Walker(i).Degrees_of_freedom := Walker(i).Initial_decision
            ELSE
                Walker(i).Degrees_of_freedom := random values
            // Use the simulation to fill the other state's component:
            Walker(i).State := Simulation(Walker(i).State, dt := Tau/M)
        END
    END
    // DECIDING PHASE:
    Best := ArgMax(Reward(Walker(i).State))
    Decision := Walker(Best).Initial_decision

Code 1 • Pseudo code as suggested by Cerezo & Ballester (2018: 25) as a "simple starting point" for the implementation of the FSX procedure, in which walkers are cloned in regard to the prospectivity of their positions in the "causal slice" (i.e., the reward they receive at time t). Note, however, that, as mentioned, FSX is applied in various versions in different disciplines. Implementations can differ significantly.

« 24 »  Surprisingly, the relatively simple FSX principle can be applied to a wide range of tasks, which all focus on the one principle of maximizing the number of possible futures. Alexander Wissner-Gross and Cameron Freer (2013), for instance, showed that it easily solves the classical problem of balancing a rod upright on a moving cart. Sergio Cerezo and Guillem Ballester (2018) presented insightful examples of steering agents through mazes or coordinating simulated spaceships with a hook on a rubber band, forming a chaotic oscillator that is highly sensitive to small changes in initial conditions, see https://youtu.be/HLbThk624jI. In another setting Cerezo, Ballester and Spiros Baxevanakis (2018) applied the principle to 55 Atari 2600 games from OpenAI Gym, https://gym.openai.com, a feat that lately has become a widely used benchmark for AI methods. They show that an algorithm with this principle can learn to play these games faster and more efficiently than human players and some of the most advanced deep learning methods. Christian Guckelsberger, Christoph Salge, and Julian Togelius (2018) used the principle to generate non-player characters with lifelike behavior in computer games, and Charlesworth and Turner (2019) deployed it successfully for the simulation of swarms, showing that it may provide an explanation that is even more fundamental than the well-known swarm principles of separation, alignment and cohesion.

« 25 »  In order to illuminate the principle further, in the next two sections, we introduce two examples of our own research, before we present a revised version of the above grid-world example with a more basic setting, suggesting that the FSX principle conforms to the epistemology of constructivism. Note that not all problems can be easily captured in terms of immediate utility, which makes defining a metric with respect to which the FSX principle is made to maximize options one of the more challenging tasks in applying it. For this reason, we chose the examples in regard to the diversity of their problem contexts.
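The n-step empowerment values shown in Figure 1 can be reproduced directly: count the distinct cells reachable within n moves (North/East/South/West/stay) and take the base-2 logarithm. On an unconstrained patch the reachable set is a diamond of 2n² + 2n + 1 cells, matching footnote 4; near edges and corners the count, and hence the empowerment, drops. A minimal sketch of ours (grid size and positions chosen for illustration):

```python
import math

def reachable(pos, n, size):
    """Distinct cells reachable from pos in at most n moves on a size x size grid."""
    frontier = {pos}
    seen = set(frontier)
    for _ in range(n):
        nxt = set()
        for x, y in frontier:
            # Five actions: stay, North, South, East, West.
            for dx, dy in ((0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)):
                c = (x + dx, y + dy)
                if 0 <= c[0] < size and 0 <= c[1] < size:
                    nxt.add(c)
        seen |= nxt
        frontier = nxt
    return seen

def empowerment(pos, n, size):
    """E = log2(number of reachable cells within horizon n)."""
    return math.log2(len(reachable(pos, n, size)))

# Unconstrained cell, one-step horizon: log2(5) ~ 2.32 (cf. Figure 1a)
print(round(empowerment((5, 5), 1, 11), 2))  # → 2.32
# Edge and corner cells offer fewer options:
print(round(empowerment((0, 5), 1, 11), 2))  # → 2.0  (4 cells)
print(round(empowerment((0, 0), 1, 11), 2))  # → 1.58 (3 cells)
# Five-step horizon at the center: log2(2*25 + 10 + 1) = log2(61) ~ 5.93
print(round(empowerment((5, 5), 5, 11), 2))  # → 5.93
```

An empowerment-following agent would then simply move to whichever neighboring cell scores highest, which reproduces the drift away from corners (one-step horizon) and toward the center (five-step horizon) described in the text.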

Example 1: Group dynamics and the emergence of leadership

« 26 »  In our first example, the FSX principle is applied to a set of social-psychological experiments on collective behavior, which were originally designed to investigate the emergence of leadership in groups of humans (Boos et al. 2014). We implemented these experiments in an FSX-based computational simulation model to see if the results could be reproduced.5

« 27 »  In the human-based version, each of the ten participants ("players") was asked to move computer-generated avatars on a virtual playing field presented to them on a screen hidden from the view of the other players. All verbal, visual or aural interaction among the players was prohibited. The playing field is depicted in Figure 3. Avatars are represented by black circles, with a player's own avatar being displayed larger than the rest. Players started at the center hexagon and were free to move their avatars from the current position to any adjacent hexagon. The number of moves they can make is limited. The small tails in the representation in Figure 3 indicate the directions from which an avatar was moved towards its current position. Six hexagons are marked with a Euro symbol. These are considered reward fields and have to be reached within the allowed number of moves. If an avatar reaches a reward field, its player is rewarded 0.5 points, which directly translates to the amount of money a player receives at the end of the experiment. After reaching a reward field, the avatar remains on that field until the end of the game. The reward for each player is multiplied by the number of avatars on a reward field, which provides an incentive for players to coordinate their actions and reach the same reward field.

Figure 3 • Virtual play field in the experimental setup (Boos et al. 2014).

« 28 »  In order to introduce a mechanism allowing for the emergence of leadership, two of the ten players are told that one of the reward fields, marked with two Euro symbols on their screens, yields the double reward. Neither these two "informed players" nor the others are aware of the informational imbalance. For all, however, it is advantageous to reach reward fields to which more than just one avatar is moving. The aim of this setting was to see if the informed players, who were not allowed to communicate with others, would be able to "lead" the uninformed majority towards the special reward field just by moving their avatars. As the experiments by Boos et al. (2014) showed, this proved possible.

« 29 »  To test the applicability of the FSX principle to this experiment, the setting was implemented with software-based agents making decisions based on Future State Maximization. As in the human version, ten agents were made to start out at the center hexagon with a finite number of moves available to navigate through the playing field. Two of them could see one of the six reward fields as yielding a double reward, and the rewards were multiplied by the number of agents reaching a reward field.

« 30 »  In contrast to the experiments with human players, the software agents were made to choose moves in correspondence to the FSX principle, with the number of choices being defined as the number of states an agent can be in, i.e., the number of hexagonal fields an agent can visit at least once using its remaining available moves. To account for these choices, agents could sample their environment with large numbers of exploratory random walkers, which rated the adjacent fields in function of the amount of future options. The number of unique fields visited by a walker is used as an approximation for the amount of future options associated with the respective adjacent field.

« 31 »  Since, obviously, software agents have no intrinsic understanding of money or Euro signs, a different metric for the incentive to move towards reward fields had to be used. In our experiments, successful sampling activities, i.e., those that indicate or reach a reward field, were rewarded with an additional move the agent in question was allowed to make. Consequently, agents con- […]

5 | The results of this experiment are currently submitted for separate publishing and hence are subject to publication restrictions. Therefore, detailed data cannot be shown here.

[…] tain extent, appear to be reproducible with the FSX principle.

« 33 »  This proposition was further substantiated by the finding that, in most games, human and computer-based, either no or nearly all uninformed agents reached the double-reward field (see Figures 4 and 5 for results with humans). Obviously, humans as well as FSX agents in this setting tend to move in groups, so that once some uninformed agents find the double-reward, the bulk of them will do so, as well. The FSX […]

[Figure 4 – only axis residue recoverable: y-axis "Relative frequency f" (0 to 0.3); x-axis categories "None", "€€", "€"]

[…]

« 37 »  The swarm has the goal of collecting measurements of environmental parameters of the swarm's surroundings, such as temperature, turbidity or oxygen concentration of the water. The swarm as a whole has the ability to move as a collective based on the general goal of increasing available choices, with choices defined as the swarm-wide variance in collected measurements of environmental parameters.

« 38 »  If this swarm is located in a uniform environment without any significant […]
Type of reached reward field principle reproduces this particular behav- deviations in environmental parameters,
ior, too. every robot in the swarm collects the same
Figure 4 • The relative frequency of “unin- « 34 »  Nevertheless, in highlighting the measurements and the swarm as a whole
formed” (not-leading) players positioned close match of human and FSX-agent be- has little variety in information to choose
at the end of the game, with (€) indicating havior, we do not want to imply that human from. In a diverse environment, however,
at regular reward fields, (€€) at the higher behavior can be expressed by equations or a the swarm would collect a variety of dif-
reward field, and “None” anywhere else. The small set of behavioral rules, in general. Our ferent measurements. The behavioral rule
result for regular reward fields is averaged experiments are meant to show that the FSX that dictates the movement direction of the
over all five regular reward fields (Boos et al. principle, and thus implicitly von Foerster’s swarm is based on maximizing the variance
2014). Ethical Imperative, may provide a general in measurements of environmental param-
method for mimicking and reproducing as- eters the swarm investigates. The resulting
pects of human behavior. emergent collective behavior lets the swarm
0.6 move towards the most diverse areas, i.e.,
towards areas with the largest variance in
Example 2: Coordinating
Relative frequency f

measurements. Needless to say, the swarm


0.4
an aquatic robot swarm has no intrinsic understanding of the rel-
evance of its different measurements. In the
« 35 »  In the second example, we ap- following, we describe the communication
0.2 plied the principle of maximizing future mechanism, the agents’ behavioral rule, and
choices to an aquatic robotic swarm (first in the resulting emergent collective behavior of
software and later in physical robots) with the swarm.
0 the task of long-term monitoring and ex- « 39 »  Figure  6 shows schematic il-
0 1 2 3 4 5 6 7 8 ploration of a dynamic environment (Horn- lustrations of the swarm. Squares repre-
Number of uninformed agents on €€ field ischer et al. 2020). sent robots, which randomly, in time, send
42 « 36 »  Marine environments are a chal- messages to their nearest neighbors, who
Figure 5 • The relative frequency of finding lenging target location in the field of robotics in their turn send the messages on to their
a number of uninformed players reaching since reliable communication underwater is neighbors (Figure  6a–d), ultimately result-
the higher reward field by the end of a game non-trivial and can be energy consuming. ing in radially outward propagating mes-
(Boos et al. 2014). Therefore, long-term operation dictates an sages. Figure 6e shows several trajectories of
energy-efficient design. With this in view, a a message. Each message contains measure-
sidering reward fields in terms of additional swarm of aquatic robots was programmed ments of environmental parameters, which
choices faced more options in terms of FSM. to coordinate in a self-organizational man- a robot has locally sampled. When receiving
« 32 »  This procedure proved success- ner, i.e., without the presence of a central a message, each robot appends its own mea-
ful in making agents find the reward fields. coordinating entity, with in-swarm com- surements to it and sends it to its neighbors.
What is more, in both investigations, in the munication limited to local communica- Over time, in this way, messages grow and
computer-based as well as in the human- tion only. The swarm consisted of 50 to 100 all robots receiving the message have access
based, the relative frequency of finding an identical robots (designed after Donati et al. to the measurement values of the sending
uninformed agent at the special (double- 2017), each of which was endowed with the robots. This communication mechanism is
yielding) reward field was significantly high- ability to take measurements at its position described as pseudo code under the INFOR-
er than that of finding it on regular reward and to communicate with its direct neigh- MATION GATHERING PHASE in Code 2.
fields. Obviously, deploying “informed” bors. Furthermore, each robot has a sense of « 40 »  Figure  6f highlights one of the
agents to steer collective behavior can have direction of incoming messages from other robots in the center of the swarm together
significant effects, and these effects, to a cer- robots. with three trajectories of messages initiated

CONSTRUCTIVIST FOUNDATIONs vol. 16, N°1


in the periphery of the swarm, which subsequently propagated inwards, plus a cone of possible future states to choose from. The robot thus receives messages from all directions, which here correspond to the information mediated by the exploratory random walkers as deployed in the examples above. The robot uses those incoming messages for sampling the environment and evaluating it with respect to the variance of measurements6 in order to determine the direction associated with the largest variance in accessible information (see EVALUATION PHASE in Code 2). This principle is applied by all robots in the same way. Each robot determines the direction of increasing variance in information as it occurs from its perspective, i.e., its own preferred direction, and communicates it repeatedly to its nearest neighbors, thereby aligning its own preferred direction slightly towards the preferred direction of its neighbors. Over time, the preferred directions of all agents converge to a common preferred direction close to the average of all preferred directions (see COLLECTIVE DECISION PHASE in Code 2).

[Figure 6: schematic panels (a)–(f) of the swarm.]
Figure 6 • (a–d) The communication mechanism of the swarm. When agents (black squares) send a message, it gets relayed by neighboring agents, which ultimately produces a wave-like propagation of messages through the swarm. (e) Three exemplary trajectories of messages, which are relayed from one robot to the next, respectively. (f) The agent underneath the cone receives three exemplary messages, with the cone indicating the possible future states of the agent to choose from.

« 41 »  This mechanism enables the swarm to collectively decide which direction provides a maximum of accessible information. Figure 7 shows the swarm in an environment containing two domains of different environmental parameters (including a small noise), represented by different background colors. Each square represents a robot and the arrow indicates its individual preferred direction. Most robots in the yellow domain have a preferred direction towards the red domain, whereas most robots in the red domain have a preferred direction towards the yellow domain. As a consequence, the overall preferred direction of the swarm is towards the border of the two domains, since it provides the largest variance in measurements. Figure 7b shows the positions of the center of the swarm over time, i.e., the average position of all robots, with the initial position being the one of the swarm in Figure 7a and eventually moving along the border of the two domains.

6 |  The variance an agent calculates during INFORMATION GATHERING is defined as

    V_k^dir = (1/n) · Σ_{j=1}^{n} (m_k^j − m̄_k)²

where agent k received n measurements m from direction dir, with m_k^j ∈ ℝ and m̄_k the arithmetic mean of all measurements associated with the respective dir.

[Figure 7: two panels (a) and (b).]
Figure 7 • (a) A swarm in an environment containing two different domains of environmental parameter values, shown in yellow and red. Each square represents a robot and the arrow its respective individual preferred direction. (b) The “averaged” positions of the swarm moving towards the red domain and eventually along the border of the two domains.

« 42 »  In this way, the swarm is able to follow a gradient towards maximum information variance, i.e., towards the most diverse areas in the environment. This corresponds to the largest amount of accessible states of the environment and thus adheres to the FSX principle. Figure 8 illustrates this by showing the average variance in information accessible to the swarm during its temporal evolution. The variance increases until the swarm reaches the border of the two domains and then fluctuates around V = 4, while the swarm maintains its distribution around the locations associated with the largest amount of accessible information.

« 43 »  This algorithm was also implemented on physical robots and successfully applied under laboratory conditions in a setup conceptually equivalent to the setup described here. Note that this article only addresses the task of collective decision making, while other common challenges in robotic swarms such as actuation, collision avoidance, and prevention of loss of members are not in the scope of this work. Further details on functionality and limits of the algorithm as well as implementation and results can be found in Hornischer et al. (2020).

Example 3: Learning to behave

« 44 »  The previous two examples introduced applications of the FSX principle to the scientific research work mentioned in the first section. In these cases, FSX operates on the base of a model for screening maximization options, which is pre-determined to a certain extent, be it in the form of a minimalistic notion of remaining possible moves, as in the coordination task in Example 1, or be it as a comprehension of maximal variance, as in the case of the robot swarm in Example 2. This pre-determination implies a sort of input from an assumed external world to the agent, which limits the relevance of the examples for a constructivist epistemology. From a constructivist perspective, it might be more interesting what the minimal requirements are for an FSX agent to start maximizing its options. Our objective, therefore, was to find out what the preconditions are that an agent can use for screening its environment in action while trying to get along in it.

« 45 »  In order to answer this question, we revised the simple grid-world scenario introduced in §§19ff. In this scenario, an agent, when placed randomly somewhere on the grid, simply compares all the options it can see within its vision and settles for the one that yields the highest “freedom,” i.e., the largest space of further options. As illustrated in Figure 1b, this makes it move towards the center of the grid.

// INFORMATION GATHERING PHASE: Communicate measurements within the swarm
FOR t := 1 TO information_gathering_time DO BEGIN
    // Agents randomly in time initiate sending a message:
    IF (random_value < probability_to_initiate_sending_message) THEN
        message := sensor_readings_of_agent
        send message to neighboring agents
        // Briefly ignore incoming messages to avoid repeatedly receiving
        // the same messages
        ignore incoming messages for a brief period
    ELSE IF (received a message) THEN
        append sensor_readings_of_agent to message
        send message to neighboring agents
        // Record the received measurements under the direction they came from
        direction := direction_from_which_message_was_received
        list_of_measurements(direction) ← append measurements_in_received_message
        // Briefly ignore incoming messages to avoid repeatedly receiving
        // the same messages
        ignore incoming messages for a brief period
    ELSE
        wait
END
// EVALUATION PHASE: Determine preferred direction of agent
FOR direction IN list_of_measurements DO BEGIN
    variances(direction) := variance(list_of_measurements(direction))
END
// Choose direction associated with maximum variance as preferred direction
own_preferred_direction ← direction of max(variances)
// COLLECTIVE DECISION PHASE: Let the swarm reach a common preferred direction
FOR t := 1 TO collective_decision_time DO BEGIN
    // Agents randomly in time initiate sending a message:
    IF (random_value < probability_to_initiate_sending_message) THEN
        message := own_preferred_direction
        send message to neighbors
    ELSE IF (received a message) THEN
        adjust own_preferred_direction by factor X
            towards preferred direction contained by the received message
    ELSE
        wait
END

Code 2 • Pseudo code of the underlying algorithm. For ease of understanding, the algorithm was divided into three phases: INFORMATION GATHERING, EVALUATION, and COLLECTIVE DECISION.

[Figure 8: line plot; y-axis: variance V (0–5), x-axis: steps s (0–30).]
Figure 8 • The average variance in information the swarm as a whole has access to, defined as the arithmetic mean of the variances calculated by each agent in the swarm, respectively. The variance in measurements, i.e., the amount of accessible information, increases up to the point where the swarm reaches the border between the two domains. From then on it fluctuates around V = 4.

« 46 »  In the revised version, however, the agent’s setup is different. It does not compare anything in the first place, but starts out by randomly choosing one of its possible options: going North, South, East, West, or staying put. From these random choices, it can derive probabilities about the states that result from its actions. Initially this will not yield more than overly uncertain information conveying a very rudimentary and inaccurate picture about what might follow from a particular action. Nevertheless, obviously, this information suffices for the agent to send its walkers out to screen possible options and to follow the ones among them that reach the most diverse states. Many of these tentative actions will be misguided, however. The FSX principle will not be recognizable as such in this phase. However, step by step, experiences will accumulate and start providing an, at first, rudimentary and rather inaccurate, but eventually increasingly more useful model for choosing options that enlarge the agent’s abstract space of future options.

« 47 »  In other words, in this experimental setting, the agent’s model emerges as information about the probability of being in a certain state after taking a particular action in an initial state. The model gets iteratively constructed from the (initially random) actions of the agent trying all of its options and incorporating those into the model that seem to open up further possibilities – just as in Piaget’s example (§11), the small child constructed the ball’s “object invariance” from sensorimotor recursions of its reactions to irritations (compare this also to the concept of the self-modeling robot by Bongard, Zykov & Lipson 2006). As described in Code 3, the model emerges as an “eigenvalue,” so to say, from the feedbacks the agent receives from its actions. The greater the number of states sampled (by initially random choices), the higher the certainty about the actions’ results. Once certainty is sufficiently high, the FSX principle is applied, in practice, nearly as efficiently as in the original simple grid example, yielding a clear preference of the agent for the central patches of the grid, as can be seen in the bottom-left-most images in Figure 9.

[Figure 9: grid of panels (a)–(h); columns: 50, 100, 200, 400, 1250, 1600, and 2500 samples of training data, and complete knowledge of the original model; rows: looking 2, 3, 4, and 5 steps ahead.]
Figure 9 • Effect of the amount and nature of training data with respect to the size of the screened horizon. The quality of the agent’s choices is influenced by two parameters: the number of samples and the number of steps it looks ahead.

// INITIALIZATION: Create N walkers with copies of the system's state:
Options := {UP, DOWN, LEFT, RIGHT, STAY}
FOR i := 1 TO N DO BEGIN
    // Walkers start at the agent's initial state:
    Walker(i).State := Agent.State
    // Take walker's initial decision:
    Walker(i).Initial_decision := random_choice(Options)
    // Initialize set of reached states:
    Walker(i).Reached_States := Set()
END
// SCANNING PHASE: Evolve walkers from time = t to t + Tau in M ticks:
FOR t := 1 TO M DO BEGIN
    // PERTURBATION:
    FOR i := 1 TO N DO BEGIN
        // At first tick use the stored initial decision
        IF (t = 1) THEN
            Walker(i).Decision := Walker(i).Initial_decision
        ELSE
            Walker(i).Decision := random_choice(Options)
        // Use the learned model to predict the walker's next state:
        Walker(i).State := Agent.Model.predict(Walker(i).State,
                                               Walker(i).Decision,
                                               dt := Tau/M)
        Walker(i).Reached_States.add(Walker(i).State)
    END
END
// DECIDING PHASE: Find the walkers that reached the most unique states
Max := max over i of Size(Unique(Walker(i).Reached_States))
Good_Choices := Set()
FOR i := 1 TO N DO BEGIN
    IF (Size(Unique(Walker(i).Reached_States)) == Max) THEN
        Good_Choices.add(i)
END
Agent.Decision := Walker(random_choice(Good_Choices)).Initial_decision
// EXECUTION:
Init_State := Agent.State
Agent.State := Environment(Agent.State, Agent.Decision, dt := Tau/M)
// LEARNING PHASE:
Agent.Model.learn(Init_State, Agent.Decision, Agent.State, dt := Tau/M)

Code 3 • Pseudo code as used for a learning agent on a 10x10 grid. Note that no reward function is applied as all states have equal reward values. However, the boundaries of the grid are impenetrable, leaving the agent with fewer degrees of freedom close to the edges of its world.

« 48 »  For the learning process in this experiment, an artificial neural network has been applied, but, in principle, any machine-learning technique will do. The learning process is illustrated in Figure 9, with higher “FSX-empowerment” being depicted in brighter colors, allowing tracking of the inappropriateness of early model stages and its gradual improvement. Columns 2 to 6 show that the agent coincidentally gained more information at first about the upper, the left and the right edges of its environment. For quite some time it considers the lower edge patches as providing nearly as much “FSX-empowerment” as the central patches. Only after 2500 samples (7th column) does it appear to get it right and adjust its model accordingly.

« 49 »  Another interesting aspect in our results is that, in addition to aggregating promising feedback (or in other words: to collecting experiences), the agent can optimize the scope of its future horizon. In the early part of the model-building process (the left-most column of the images in Figure 9), a smaller scope (here, looking three steps ahead) seems to yield a more appropriate preliminary model than the 5-step scope, where the agent maneuvers itself into a corner trap. In contrast, in the later stages of the model building, a larger scope yields the appropriate model. Obviously, given enough sample time, the agent is able to adapt its future horizon optimally to the complexity of its environment. It constructs its model while constantly sampling the result of choices and applying the FSX principle. Or, in other words, it constructs its world, its “reality” as perceived through (and only through) its model, by choosing with increasing certainty those of its actions that maximize its future possibility space.

« 50 »  The FSX principle thus builds on the use of a model that emerges as an “eigenvalue” in the sense of von Foerster. It gets constructed from the iterated attempts of an agent to come to terms with its options. Initially, this agent’s possibility space is too vast. It doesn’t give reasons for any particular action. It has to be constrained at first, and this is done by ongoing attempts at deriving probabilities for effects of possible actions, with these effects then being weighted in regard to the action space they are opening up, that is, in regard to what we


here call Future State Maximization. This process is re-iterated up to the point where a stable concept emerges, an “eigenvalue” of what to do and what better not to do in a situation where a maximum of further options is at stake.

« 51 »  The results of the above-described experiment may illuminate yet another aspect of the FSX principle. If, for a moment, we assume that it is not the agent that actively changes its place in its grid world, but that the grid itself shifts, say by a patch at a time in a random direction, with the agent staying put in its place, it seems clear that these environmental changes will have different consequences depending on the original position of the agent. If, for instance, the agent is positioned at the upper edge of the grid and the grid is moved down by one patch, the agent will find itself beyond the grid’s limits and hence stripped of all of its options. Or if the agent is positioned one step away from an edge or a corner when the grid moves, it will be shifted towards the edge and lose a large share of its options. Only in the case where the agent is placed somewhere near the center of the grid may an environmental change by one or even by several patches have no fatal consequences. The agent may still be able to move to one of its four neighboring fields and thus stand the blow to its environment without too many problems.

« 52 »  In other words, the center fields on this grid are the ones that offer the agent highest resilience in case of environmental changes. Whatever may happen, the possibilities to react remain largest in this part of the world. At the same time, these center fields are the ones that are sought out actively according to the FSX principle. The states that an agent prefers by following the FSX principle hence seem to coincide with what maximizes its sustainability in the case of a blow to its environment.

« 53 »  This may not seem to be of huge relevance in the case of small and incremental changes. Adaptation to an environment that changes slowly or with moderate speed can be achieved by conventional reactions. The agent must just keep moving while its environment shifts one patch at a time. However, in a dynamical environment with more severe changes, it can make sense to anticipate the most sustainable states, i.e., the ones that provide highest resilience. And this can obviously be achieved by following the FSX principle. Under conditions of non-linearity and sudden, unpredictable transitions, as diagnosed in the debates on climate change, on digital transformation or on the social and economic consequences of Covid-19, the FSX principle seems to gain its significance. That is where it makes sense to prefer options that open up a maximum of further options.

Conclusion

« 54 »  Heinz von Foerster might have had something similar in mind when he reasoned about his Ethical Imperative that

“in times of socio-cultural change the future will not be like the past. With a future not clearly perceived, we do not know how to act. With only one certainty left, if we don’t act ourselves, we shall be acted upon. Thus, if we wish to be subjects rather than objects, what we see now, i.e., our perception, must be foresight rather than hindsight.” (Foerster 1972: 31)

« 55 »  As agreeable as these indeed rather ethical formulations of von Foerster are, they seem to stand somehow apart from other concepts in his oeuvre, such as the nontrivial machine, the second-order observer, and eigenbehavior, which all seem to build on formal argumentation. We still can only surmise what von Foerster intended to aim at when he formulated his Ethical Imperative as “act always so as to increase the number of choices.” Maybe he had ideas for experiments similar to the ones presented here but did not have the (computational) means to implement them. Or maybe he remained in the vagueness of ethics because he did not see a way to substantiate his inkling about how to answer possible objections against constructivism. As mentioned in §§13f, we think that his rejection of a solipsistic interpretation of constructivism, for instance, would find support from the FSX principle if one conceives of modeling – as in the context we created here – as an opening up of new (virtual) options. Following this line, self-modeling is then just a particularly complex form of it, and the construction of a (virtual) concept of an “other” is – due to the need for differentiation – its consequence. Both the “self” and the “other” (as model components) are, in terms of FSX, options, which when confronted with each other must constrain each other’s possibility spaces, which is the case when, for example, one solipsistically disallows the other’s existence. As mentioned above, von Foerster reasoned in this context that this leads to a contradiction if the other’s disallowance of existence is reciprocal. From here, it seems reasonable to opt for interaction instead, for acting so as to not exclude options, and through this to gain a chance for the emergence of stable eigenforms, of “selves,” and from their interaction eventually of a “reality” (= “community,” Foerster 2003) that again entails further options.

« 56 »  There may be reason in this, but, as von Foerster admits, there is no necessity, and there is no way to prove it, in particular without the computational means to test and analyze the emergence of eigenforms from interacting particles, as it is now commonly done, e.g., in agent-based modeling. Maybe all that von Foerster could do back in his days was to restrict himself to the realm of ethics in proposing to “act always so as to increase the number of choices.”

« 57 »  However, our prime intention here was not to enter into a debate on ethics or philosophy, but to point out that von Foerster’s formulation of the Ethical Imperative nearly literally conforms to the principle that we here call Future State Maximization. Furthermore:
•  As shown in Experiment 3, FSX can be considered in terms of the emergence of eigenvalues.
•  FSX allows for an interpretation of aspects in von Foerster’s oeuvre and thus seems to blend in well with the framework of a constructivist epistemology.
•  The principle could prove to be an interesting and fruitful method for further research, particularly in times of dynamic changes and high demand for possibilities to locate resilient and sustainable solutions.
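The resilience argument of §§51ff lends itself to a quick numerical check. The following Python sketch is our illustration, not part of the original experiments: it enumerates the four one-patch shifts of a bounded 10x10 grid and counts, for an agent at a given position, how many of its five moves survive each shift.

```python
GRID = 10                                            # a 10x10 world, as in Code 3
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]   # N, S, E, W, stay
SHIFTS = [(0, 1), (0, -1), (1, 0), (-1, 0)]          # one-patch shifts of the grid

def on_grid(x, y):
    return 0 <= x < GRID and 0 <= y < GRID

def options_after_shift(pos):
    """Average number of moves still available to an agent at `pos` after
    the grid shifts by one patch (uniformly over the four directions).
    Shifting the grid by (sx, sy) is equivalent to displacing the agent by
    (-sx, -sy); an agent pushed beyond the edge has lost all its options."""
    total = 0
    for sx, sy in SHIFTS:
        x, y = pos[0] - sx, pos[1] - sy
        if on_grid(x, y):
            total += sum(1 for dx, dy in MOVES if on_grid(x + dx, y + dy))
    return total / len(SHIFTS)
```

For the center, `options_after_shift((5, 5))` yields 5.0 (no shift costs anything), whereas a corner agent retains on average only 2.0 options: the cells an FSX agent actively seeks out are exactly those that best withstand a blow to the environment.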
HANNES HORNISCHER
is a PhD candidate at the Systems Science group of the University of Graz, with a background in physics. In his research, he focusses on self-organization in complex systems, using agent-based modeling. He is mainly involved in interdisciplinary projects touching the fields of systems sciences, physics, biology and robotics, as well as social psychology. While major parts of his scientific work consist of fundamental research on artificial intelligence and swarm intelligence, they find application in the control of autonomous robotic swarms as well as the investigation of dynamics in human groups.

SIMON PLAKOLB
is a PhD candidate at the Systems Science group of the University of Graz. His primary research interests are the simulation of complex social systems, using parallel computing, agent-based modeling, and artificial intelligence. As a boundary object for his investigations, he mostly uses traffic-related topics ranging from meso- to microscopically detailed models. His research touches the boundary between computer science and social simulation, where implementation details on parallel hardware are as much of an issue as providing policy-relevant results. https://www.behaviour.space

GEORG JÄGER
obtained his PhD in theoretical and computational physics, where he simulated ultra-cold quantum systems with applications in quantum computing and matter-wave interferometry. During his postdoc he was using similar methods, but his research focus shifted towards complexity and sustainability research. Now, he is Assistant Professor for Computational Systems Sciences at the University of Graz. In his research he is interested in the simulation and analysis of complex systems, using tools like agent-based modeling, machine learning and network science. Investigated systems range from purely abstract ones to subjects closely related to sustainability. https://www.jaeger-ge.org

MANFRED FÜLLSACK
is Professor of Systems Sciences at the University of Graz. His research focuses on complex systems and the possibility of simulating and analyzing them with the help of computer models. He is particularly interested in the emergence and loss of stable equilibria in these systems and the possibility of anticipating state changes using statistical methods. Furthermore, he is interested in current developments in the field of machine learning and artificial intelligence, and especially in their consequences for all social, economic and ecological aspects of human work.


Second-Order Cybernetics
Foresight Rather than Hindsight? Hannes Hornischer et al.

References

Ay N. & Polani D. (2008) Information flows in causal networks. Advances in Complex Systems 11(1): 17–41.
Bongard J., Zykov V. & Lipson H. (2006) Resilient machines through continuous self-modeling. Science 314(5802): 1118–1121.
Boos M., Pritz J., Lange S. & Belz M. (2014) Leadership in moving human groups. PLOS Computational Biology 10(4): e1003541.
Bousso R., Harnik R., Kribs G. D. & Perez G. (2007) Predicting the cosmological constant from the causal entropic principle. Physical Review D 76(4): 043513.
Cerezo S. H. & Ballester G. D. (2018) Fractal AI: A fragile theory of intelligence. ArXiv:1803.05049 [cs]. http://arxiv.org/abs/1803.05049
Cerezo S. H., Ballester G. D. & Baxevanakis S. (2018) Solving Atari games using fractals and entropy. ArXiv:1807.01081 [cs]. http://arxiv.org/abs/1807.01081
Charlesworth H. J. & Turner M. S. (2019) Intrinsically motivated collective motion. Proceedings of the National Academy of Sciences 116(31): 15362–15367. https://www.pnas.org/content/pnas/116/31/15362.full.pdf
Chin G. J. (2007) Cognitive homeostasis. Science 318(5851): 717.
Donati E., van Vuuren G. J., Tanaka K., Romano D., Schmickl T. & Stefanini C. (2017) aMussels: Diving and anchoring in a new bio-inspired under-actuated robot class for long-term environmental exploration and monitoring. In: Gao Y., Fallah S., Jin Y. & Lekakou C. (eds.) Towards autonomous robotic systems. Springer, Cham: 300–314.
Foerster H. von (1972) Perception of the future and the future of perception. Instructional Science 1(1): 31–43. ▶︎ https://cepa.info/1647
Foerster H. von (1981) Objects: Tokens for (eigen-)behaviors. In: Observing systems. Intersystems Publications, Seaside CA: 274–285. Originally published in 1976. ▶︎ https://cepa.info/1270
Foerster H. von (2003) On constructing a reality. In: Understanding understanding. Springer, New York: 211–227. Originally published in 1973. ▶︎ https://cepa.info/1278
Foerster H. von & Bröcker M. (2002) Teil der Welt: Fraktale einer Ethik – Ein Drama in drei Akten [Part of the world: Fractals of an ethic – A drama in three acts]. Carl-Auer-Systeme Verlag, Heidelberg.
Foerster H. von, Mora P. M. & Amiot L. W. (1960) Doomsday: Friday, November 13, AD 2026. Science 132(29): 1291–1295. ▶︎ https://cepa.info/1596
Füllsack M. (2013) Constructivism and computation: Can computer-based modeling add to the case for constructivism? Constructivist Foundations 9(1): 7–16. ▶︎ https://constructivist.info/9/1/007
Füllsack M. (2018) Plasticity, granularity and multiple contingency: Essentials for conceiving an artificial constructivist agent. Constructivist Foundations 13(2): 282–291. ▶︎ https://constructivist.info/12/2/282
Guckelsberger C., Salge C. & Togelius J. (2018) New and surprising ways to be mean: Adversarial NPCs with coupled empowerment minimisation. ArXiv:1806.01387 [cs.AI]. http://arxiv.org/abs/1806.01387
Hornischer H. (2015) Causal entropic forces: Intelligent behaviour, dynamics and pattern formation. Master’s thesis, Georg-August-Universität Göttingen, Germany.
Hornischer H., Varughese J. C., Thenius R., Wotawa F., Füllsack M. & Schmickl T. (2020) CIMAX: Collective information maximization in robotic swarms using local communication. Adaptive Behavior, 1059712320912021.
Klyubin A., Polani D. & Nehaniv C. L. (2005) Empowerment: A universal agent-centric measure of control. In: 2005 IEEE Congress on Evolutionary Computation. Volume 1. IEEE Press, Piscataway NJ: 128–135.
Martyushev L. M. & Seleznev V. D. (2006) Maximum entropy production principle in physics, chemistry and biology. Physics Reports 426(1): 1–45.
Maturana H. R. & Varela F. J. (1987) The tree of knowledge: The biological roots of human understanding. Shambhala, Boston.
Metzinger T. (2007) Empirical perspectives from the self-model theory of subjectivity: A brief summary with examples. In: Banerjee R. & Chakrabarti B. K. (eds.) Models of brain and mind: Physical, computational and psychological approaches. Elsevier, Amsterdam: 215–278.
Piaget J. (1960) Child’s conception of geometry. Basic Books, New York.
Riegler A. (2007) Is Glasersfeld’s constructivism a dangerous intellectual tendency? In: Glanville R. & Riegler A. (eds.) The importance of being Ernst. Echoraum, Vienna: 263–275. ▶︎ https://cepa.info/1776
Rosen R., Rosen J., Kineman J. J. & Nadin M. (2012) Anticipatory systems: Philosophical, mathematical, and methodological foundations. Second edition. Springer, New York.
Shannon C. E. (1948) A mathematical theory of communication. The Bell System Technical Journal 27(3): 379–423.
Wissner-Gross A. D. & Freer C. E. (2013) Causal entropic forces. Physical Review Letters 110(16): 168702.

Received: 14 April 2020
Accepted: 9 September 2020

This target article is part of a bigger picture that encompasses several open peer commentaries and the response
to these commentaries. You can read these accompanying texts on the following pages (or, if you have only the target
article, follow the embedded link that takes you to the journal’s web page from which you can download these texts).

https://constructivist.info/16/1/036.hornischer
