# 29412 Cust: AddisonWesley Au: BarYam Pg. No. 699
Title: Dynamics Complex Systems
Shor t / Normal / Long
8
Human Civilization I:
Defining Complexity
Conceptual Outline
Our ultimate objective is to consider the relationship of a human being to
human civilization, where human civilization is considered as a complex system. We
use this problem to motivate our study of the definition of complexity.
The mathematical definition of the complexity of character strings follows
from information theory. This theory is generalized by algorithmic complexity to allow
all possible algorithms that can compress the strings. The complexity of a string is
defined as the length of the shortest binary input to a universal Turing machine, such
that the output is the string.
The use of mappings from strings onto system states allows us to apply the
concepts of algorithmic complexity to physical systems. However, the complexity of
describing a microstate of the system is not really what we mean by system
complexity. We define and study the complexity profile, which is the complexity of a
system observed with a certain precision in space and time.
We estimate the complexity of various systems, focusing on the complexity
of a human being. Our final estimate is based upon a combination of the length of
descriptions in human language, genetic information in DNA, and component counting.
8.1 Motivation
8.1.1 Human civilization as a complex system
The subject of this and the next chapter is human civilization, the collection of all
human beings on earth. Our long-term objective is to understand whether and how
we can treat human civilization as a complex system and, more particularly, as a
complex organism. In biology, collections of interacting biological organisms acting
together are called superorganisms. At times, we will adopt this convention and refer to
civilization as the human superorganism. Much of what we discuss is in early stages
of development and is designed to promote further research.
BarYamChap8.pdf 3/10/02 10:52 AM Page 699
This subject is distinct from the others we have considered. The primary
distinction is that we have only one example of human civilization. This is not true about the
systems we have discussed in earlier chapters, with the exception of evolution
considered globally. The uniqueness of the human superorganism presents us with
questions of fundamental interest in science, related to how much we can know about an
individual system. When there are many instances, we can use information provided
by various examples and the statistics of their properties. When there is only one
system, to understand its properties or predict its behavior we must apply fundamental
principles that are valid for all complex systems. Since the field of complex systems is
dedicated to uncovering such principles, the subject of the human superorganism
should be considered a premier area for application of complex systems research.
Central questions are: How can we characterize this complex system? How can we
determine its properties? What can we tell about its dynamics, both past and future? We
note that as individuals we are elements of the human superorganism; thus our
spatial and temporal experience may very well be more limited than that appropriate for
analyzing the human superorganism.
The study of human civilization is guided by historical records and
contemporary news. In contrast to protein folding, neural networks, evolution and
developmental biology, there are few reproducible laboratory experiments. Because of the
irreproducibility of historical or contemporary events, these sources of information are
properly not considered part of conventional science. While this can be a limitation,
it is also apparent that there is a large amount of information available. Our task is to
develop systematic methods for considering this kind of information that will enable
us to approach questions about the nature of human civilization as a complex system.
Various aspects of these problems have been studied by historians, anthropologists
and sociologists.
Why consider human civilization as a single complex system? The recently
discussed concept of a global economy, and earlier the concept of a global village,
suggest that we should consider the collective economic behavior of human beings and
possibly the global social behavior as a single system. Considering civilization as a
single entity we are motivated to ask various questions about it. These questions relate to
all of the topics we have covered in the earlier chapters: spatial and temporal
structure, evolution and development. We would also like to understand the interaction of
human civilization with its environment.
In developing an understanding of human civilization, we recognize that a
widespread view of human civilization as a single entity is relatively new and driven
by contemporary developments. At least superficially, the historical epoch described
by the dominance of nation-states appears to be quite different from the present
global economy. While recent events appear to be of particular significance to the
global view, our questions must be addressed in a historical context. Thus we should
include a discussion of the transition to a global economy. We postpone this
historical discussion to the next chapter because of the groundwork that we would like to
build in order to target a particular objective for our analysis: that of complexity
classification.
We are motivated to understand complexity in the context of our effort to
understand the nature of the human superorganism, or the nature of the global
economy. We would like to identify the type of complex system it is, that is, to classify it. The first
distinction that we might make is between a complex material and a complex organism
(see Section 1.3.6). Could part of the global system be modified without affecting the
whole? From historical evidence discussed in the next chapter, the answer appears to
be no. This indicates that human civilization is a complex organism. The next
question we would like to ask is: What kind of complex organism is it? By analogy we could
ask: Is it like a protein, a cell, a plant, an insect, a frog, a human being? What do we
mean by using such analogies? At least in part the problem is to describe the
complexity of an entity's behavior. Intuitively an insect is a simpler organism than a
human being, and this is of qualitative importance for our understanding of their
differences. The degree of complexity should provide a scale that can distinguish
between the many different complex systems we are familiar with.
Our objective in this chapter is to develop a quantitative definition of
complexity and behavioral complexity. We then apply the definition to various complex
systems. The focus will be on the complexity of an individual human being. Once we
have established our complexity scale we will be in a position to apply it to human
civilization. We will understand formally why a collection of complex systems (human
beings) may be, but need not be, complex. Beyond recognizing human civilization as
a complex system, it is far more significant to identify the degree of its complexity. In
the following brief sections we establish some additional context for the importance
of measuring complexity using both unconventional and conventional examples of
organisms whose complexity should be evaluated.
8.1.2 Scenario: alien encounter
The possibility of encountering alien life has been debated within the scientific
community. In popular literature, such encounters have been portrayed in various forms
ranging from benevolent to catastrophic. The scientific debate has focused thus far on
topics such as the statistics of planet formation and the likelihood that planets
contain life. The presence of organic molecules in meteorites and interstellar gases has
been interpreted as suggesting that alien life is likely to exist. Efforts have been made
to listen for signs of alien life in radio communications and to transmit information
to aliens using the Voyager spacecraft, which is leaving the solar system marked with
information about human beings. Thus far there has been no scientifically confirmed
evidence for the existence of alien life. Even a single encounter would change the
human perspective on humanity's place in the universe.
Let us consider one possible scenario for an encounter. An object that flashes
light intermittently is found in orbit around one of the planets of the solar system.
The humans encountering this object are faced with the question of determining
whether the object is: (a) a signal device, specifically a recording, (b) a
communication device, or (c) a living organism. The central problem can be seen to revolve
around determining whether, and in what way, the device is responsive to external
phenomena. Do the flashes of light occur without regard to the external environment
in a predetermined sequence? Are they random? If the flashes are sensitive to the
environment, then what are they sensitive to? We will see that these questions are
equivalent to the question of determining the complexity of the object's behavior.
The concept of life in biology is often defined, or better yet, characterized, in
terms of consumption, excretion and reproduction. As a definition, these
characteristics are well known to be incomplete, since there are lifeforms that do not
reproduce, such as the mule. Furthermore, a particular individual is still considered alive
even if it/he/she does not reproduce. Moreover, there are various physical systems
such as crystals and fire that have all these characteristics in one form or another.
In addition, there does not appear to be a direct connection between these biological
characteristics and other characteristics of life such as sentience and self-awareness.
When considering behavior, the biological perspective emphasizes the survival
instinct as characteristic of life. There are exceptions to this, since there exist lifeforms
that are at times suicidal, either individually or collectively. The question of whether
an organism actively seeks life or death does not appear to be a characterization of life
but rather of lifeforms that are likely to survive. In our discussions, we may be
developing an additional characterization of life in terms of behavioral complexity.
Definitions of life are often considered in speculating about the rights of and
treatment of real or imagined organisms: injured or unconscious humans, robots, or
aliens. The degree of behavioral complexity is a characterization of lifeforms that
may ultimately play a role in informing our ethical decisions with respect to various
biological lifeforms, whether terrestrial or (if found) alien, and artificial lifeforms
that we create.
8.1.3 Scenario: blood cells
One of the areas briefly touched upon in Chapter 6, which is at the forefront of
complex systems research, is the study of the immune system. Blood cells, unlike other cells
in the body, are mobile on a length scale that is large compared to their size. In this
characteristic they are more similar to independent organisms than to the other cells
of the body. By their migration they might be said to "choose" to associate with other
cells of the body, or with foreign chemicals and cells. It is fair to say that our
understanding of the behavior of immune cells remains primitive. In particular, the variety
of possible chemical interactions between cells has only begun to be mapped out. These
interactions involve a variety of chemical messengers. More direct cell-to-cell
interactions where parts of the membrane or cellular fluid are transferred are also possible.
One of the interesting questions that can be asked is whether, or at what level of
complexity, the interactions become identifiable as a form of language. It is not
difficult to imagine, for example, that a chemical communication originating from one
cell might be transferred through a chain of cell interactions to a number of other
cells. In the context of the discussion in Section 2.4.5, the question of existence of a
language might be formulated as a question about the possibility of messages with a
grammar, that is, a combinatorial composition of parts that are categorized like parts of
speech. Such combinatorial mechanisms are known to exist even at the molecular
level in the DNA coding of antibody receptors that are a composite of different parts
of the genome. It remains to be seen whether intercellular communication is also
generated in this fashion.
In the context of this chapter we can reduce the questions about the immune cells
to a single one: What is the degree of complexity of the behavior of the immune
cells? By its very nature this question can only be answered once a complete
understanding of immune cell behavior is reached. A limited understanding establishes a
lower bound for the complexity of the behavior. It should also be understood that
different types of cells will most likely have quite different levels of behavioral
complexity, just as different animals and man have differing levels of complexity. Our
objective in this chapter is to show that it is possible to quantify the concept of complexity
in a way that is both natural and useful. The practical application of these definitions
is a central challenge for the field of complex systems.
8.1.4 Complexity
Mathematical definitions of the complexity of systems are based upon the theories of
information and computation discussed in Sections 1.8 and 1.9. In Section 8.2 they
will be used to treat complexity in the context of mathematical objects such as
character strings. To develop our understanding of the complexity of physical systems
requires that we relate these concepts to those of thermodynamics (Section 1.3) and
various extensions (e.g., Section 1.4) that enable the treatment of nonequilibrium
systems. In Section 8.3 we discuss relevant concepts and tools that may be used for this
purpose. In Section 8.4 we use several semiquantitative approaches to estimate the
value of the complexity of specific systems.
Our use of the word "complexity" is specified as an answer to the question, How
complex is it? We say, Its complexity is <number><units>. Intuitively, we can make a
connection between complexity and understanding. When we encounter something
new, whether personally or in a scientific context, our objective is to understand it.
The understanding enables us to use, modify, control or appreciate it. We achieve
understanding in a number of ways: through classification, description and ultimately
through the ability to predict behavior. Complexity is a measure of the inherent
difficulty of achieving the desired understanding. Simply stated, the complexity of a system
is the amount of information necessary to describe it.
This is descriptive complexity. For dynamic systems the description includes the
changes in the system over time. We will also discuss the response of a dynamic
system to its environment. The amount of information necessary to describe this
response is a system's behavioral complexity. To use these definitions of complexity we
will introduce mathematical expressions based upon the theory of information.
The quantitative definition of information (Section 1.8) is relatively abstract.
However, it can be measured in familiar terms such as by the number of characters in a
text. As a preliminary exercise in the discussion of complexity, the reader is invited to
exercise intuition to estimate the complexity of a number of systems. Question 8.1.1
includes a list of systems that are designed to stimulate some thought about
complexity as a quantitative measure of the behavior of a system. The reader should devote
some thought to this question before proceeding with the rest of the text.
Question 8.1.1 Estimate the complexity of some of the systems in the
following list. For this question use an intuitive definition of
complexity: the amount of information that would be required to describe the
system or its behavior. We use units of bits to measure information. However,
to make it easier to visualize, you may use other convenient units such as
words or pages of text. So, we can paraphrase the question as, How much
would you have to write to describe the system behavior? A rough
conversion factor of 1 bit per character can be used to convert these estimates to
bits. It is not necessary to estimate the complexity of all the systems on the
list. Considering even a few of them is sufficient to develop an
understanding of some of the issues that arise. Indeed, for some of these systems a rough
estimate is far from trivial. Answers to this question will be given in the text
in the remainder of this chapter.
Hint You may find that you would use different amounts of
information depending on what aspects of the system you are describing. In such
cases try to give more than one estimate or a range of values.
Physical Systems:
Ideal gas (1 mole at T = 0°K, P = 1 atm)
Water in a glass
Chemical reaction
Brownian particle
Turbulent flow
Protein
Virus
Bacterium
Immune system cell
Fish
Frog
Ant
Rabbit
Cow
Human being
Radio
Car
IBM 360
Personal Computer (PC/Macintosh)
The papers on your desk
A book
A library
Weather
The biosphere
Nature
Mathematical and Model Systems:
A number
Iterative maps (growth, bifurcation to chaos)
1D random walk
  short time
  long time
Ising model (ferromagnet)
Turing machine
Fractals
Sierpinski gasket
3D random walk
Attractor neural network
Feedforward neural network
Subdivided attractor neural network
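For readers who prefer to estimate in words or pages, the unit conversion suggested in Question 8.1.1 can be sketched as follows. Only the factor of 1 bit per character comes from the question; the characters-per-word and words-per-page values, and all names in the code, are illustrative assumptions of ours.

```python
# A minimal sketch of the unit conversion in Question 8.1.1.
BITS_PER_CHAR = 1      # rough factor stated in the question
CHARS_PER_WORD = 6     # assumption: about 5 letters plus a space
WORDS_PER_PAGE = 500   # assumption: a densely printed page

def description_bits(pages=0, words=0, chars=0):
    """Estimated information, in bits, of a written description."""
    total_chars = chars + CHARS_PER_WORD * (words + WORDS_PER_PAGE * pages)
    return BITS_PER_CHAR * total_chars

one_page = description_bits(pages=1)       # a one-page description
short_book = description_bits(pages=300)   # a 300-page description
```

With these assumed factors a page corresponds to a few thousand bits; different assumptions shift the estimate by a modest factor, which is acceptable for order-of-magnitude estimates.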
8.2 Complexity of Mathematical Models
Complexity is a property of the relationship between a system and various
representations of the system. Our objective is to understand the complexity of systems
composed of physical entities such as atoms, molecules or cells. Abstract representations
of such systems are described in terms of characters or numbers. It is helpful to
preface our discussion of physical systems with a discussion of the complexity of the
characters or numbers that we use to represent them.
8.2.1 Information, computation and algorithmic complexity
The discussion of Shannon information theory in Section 1.8 was based on strings of
characters that were generated by a source. The source generates each string, s, by
selecting it from an ensemble. The information from a particular string was defined as

$$I = -\log(P(s)) \qquad (8.2.1)$$

where P(s) is the probability of the string in the ensemble. If all strings have equal
probability then this is the logarithm of the number of distinct strings. The source
itself (or the ensemble) was characterized by the average information of a large
number of strings

$$\langle I \rangle = -\sum_{s} P(s)\log(P(s)) \qquad (8.2.2)$$
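As a small numerical illustration of Eqs. (8.2.1) and (8.2.2), the following sketch evaluates both expressions for two ensembles of eight strings. We assume base-2 logarithms so that the results are in bits; the biased probabilities are hypothetical values chosen for the example, and the function names are ours.

```python
import math

def information(p):
    """Eq. (8.2.1) with log base 2: information, in bits, of a string of probability p."""
    return -math.log2(p)

def average_information(probabilities):
    """Eq. (8.2.2) with log base 2: average information of the ensemble, in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# All 8 binary strings of length 3, equally likely.
uniform = [1 / 8] * 8
# The same 8 strings with unequal (hypothetical) probabilities that sum to 1.
biased = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 64, 1 / 64, 1 / 64, 1 / 64]

uniform_average = average_information(uniform)  # log2 of the number of strings
biased_average = average_information(biased)    # smaller than the uniform case
```

In the uniform case the average equals the logarithm of the number of distinct strings, as stated above. In the biased case the average is lower, although an individual rare string (probability 1/64) carries 6 bits, more than its 3-character length.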
It was also possible to consider a more general source that selected characters to form
a Markov chain. The probabilistic coupling between sequential characters reduced the
information content of the string. It was possible to compress the string using a
reversible coding algorithm (computation) that would enable the same information to
be represented in a more compact form. The length of the shortest binary compact
form is equal to the average information in a string.
Information theory suggests that we can define the complexity of a string of
characters by the information content of the string. The information content is the same
as the length of the shortest binary encoding of the string. This is intuitive: since the
original string can be obtained from its shortest representation, the same information
must be present in both. Within standard information theory, the encodings would
be limited to compression using a Markov chain model. However, more generally, we
could use any possible algorithm for encoding (compressing) the string. Questions
about all possible algorithms are precisely the domain of computation theory. The
definition of Kolmogorov (algorithmic) complexity of a string makes use of
computation theory to describe what we mean by "any possible algorithm." Allowing all
algorithms is the same as allowing more general models for the string than a Markov
chain. Our objective in this section is to develop an understanding of algorithmic
complexity beginning from the theory of computation.
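To make the idea of "length of the shortest encoding" concrete, the sketch below uses a general-purpose compressor as a stand-in. This is our own illustration, not the definition: zlib explores only a limited family of models, so the compressed length is merely an upper-bound style estimate of the algorithmic complexity (up to the fixed cost of the decompressor). The function name is hypothetical.

```python
import os
import zlib

def compressed_bits(data: bytes) -> int:
    """Length in bits of the zlib-compressed form of data (an estimate from above only)."""
    return 8 * len(zlib.compress(data, 9))

# A highly regular string: "01" repeated 5000 times (10,000 bytes).
regular = b"01" * 5000
# 10,000 bytes from the operating system's random source.
random_like = os.urandom(10000)

regular_bits = compressed_bits(regular)          # far fewer than 80,000 bits
random_bits = compressed_bits(random_like)       # close to (or above) 80,000 bits
```

The regular string compresses to a small fraction of its length, while the random bytes do not compress at all, which anticipates the statement below that most strings cannot be compressed.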
Computation theory (Section 1.9) describes the operations of logic and
computation on symbols. All the operations are deterministic and are expressible in terms of
a few elementary operations. The concept of universality of computation is based on
the understanding that a particular type of conceptual machine/computer, the
universal Turing machine (UTM), can perform all possible computations if the
instructions are properly encoded as a finite string of characters serving as the UTM
input. Since we have no absolute definition of computation, there is no complete proof.
The existing proof shows that the UTM can perform all computations that can be
done by a much larger class of machines, the Turing machines (TM). Other models
for computation have been shown to be essentially equivalent to these TM. A TM is
defined by a table of elementary operations that act on the input string. The word
"program" can be used either to refer to the TM table or to its input and so its use is
best avoided in this context.
We would like to define the algorithmic complexity of a string, s, as the length of
the shortest possible binary TM input, such that the output is s. The relationship of
this to the encoding and decoding of Shannon should be apparent. In order to use this
as a definition, there are several matters that must be cleared up. To summarize: There
are actually two sources of information when we use a TM, the input string and the
table. We need to take both of them into account to define the complexity. There are
many ways to define complexity; however, we can prove that any two definitions of
complexity differ by no more than a constant. We will also show that no matter what
definition we use, most strings cannot be compressed.
In order to motivate the logic of the following discussion, it is helpful to think
about how we might approach compressing various strings of characters. The
shortest compression should then be the complexity of the string. One string might be
formed out of a long substring of zeros followed by a long substring of ones. This is
convenient to write by indicating how many zeros followed by how many ones: $N_0 N_1$.
We would make a binary string notation for $N_0 N_1$ and write a program that would
read this input and then output the original string. Another string might be a
representation of the Fibonacci numbers (1,1,2,3,5,8,…), starting from the $N_0$-th number
and ending at the $N_1$-th number. We could write this using a similar notation as the
previous one, but the program that we would write to generate the string is quite
different. Both programs would be quite simple. Now imagine that we want to
communicate one of the original strings to someone else. If we want to communicate it in
compressed form, we would have to send the program as well as the input. If there
were many strings, we might be clever and send the programs only once. The
problem is that with only the input string, the recipient would not know which program
to apply to obtain the original string. We need to send an additional piece of
information that indicates which program to apply. The simplest way to do this is to assign
numbers to each of the programs and preface the program input with the program
number. Once we do this, the string that we send uniquely determines the string we
wish to communicate. This is necessary, because if the interpretation of the
transmitted string is not unique, then it would be impossible to guarantee a correct
interpretation. We now develop these thoughts using a more formal notation.
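The scheme of prefacing the input with a program number can be sketched in a few lines. This is a toy illustration under our own assumptions: there are exactly two programs (so the program number is a single bit), $N_0$ and $N_1$ are each written in a fixed 8-bit field, and the Fibonacci output is rendered as comma-separated decimals. All names are hypothetical.

```python
FIELD_BITS = 8  # assumed fixed width for each of N0 and N1 (values 0..255)

def zeros_then_ones(n0, n1):
    """Program 0: n0 zeros followed by n1 ones."""
    return "0" * n0 + "1" * n1

def fibonacci_span(n0, n1):
    """Program 1: Fibonacci numbers (1, 1, 2, 3, ...) from the n0-th to the n1-th."""
    a, b = 1, 1
    terms = []
    for index in range(1, n1 + 1):
        if index >= n0:
            terms.append(str(a))
        a, b = b, a + b
    return ",".join(terms)

PROGRAMS = [zeros_then_ones, fibonacci_span]  # program number = list index

def encode(program_number, n0, n1):
    """Preface the input (N0, N1) with a one-bit program number."""
    return f"{program_number:01b}{n0:0{FIELD_BITS}b}{n1:0{FIELD_BITS}b}"

def decode(message):
    """Recover the original string from the transmitted bit string."""
    program = PROGRAMS[int(message[0], 2)]
    n0 = int(message[1:1 + FIELD_BITS], 2)
    n1 = int(message[1 + FIELD_BITS:1 + 2 * FIELD_BITS], 2)
    return program(n0, n1)

message = encode(0, 200, 55)   # 17 bits in total
original = decode(message)     # 200 zeros followed by 55 ones
```

Because the first bit selects the program and the fields have a known width, every transmitted string has exactly one interpretation, which is the uniqueness requirement described above.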
In what follows, the operation of a TM or a UTM will be indicated by functional
notation. The string that results from its application to a tape is indicated by $U(s)$
where s is the nonblank portion of the tape (input string), U is the identifier of the
TM, and the initial position of the TM head is assumed to be at the leftmost nonblank
character.
In order to define the complexity of a string, we identify a particular UTM U.
Then the complexity $C_U(s)$ of the string s is defined as the length of the shortest string
r such that $U(r) = s$. We call an input string r to U that generates s a representation of
s. Thus the length of the shortest representation is $C_U(s)$. The central theorem of
algorithmic complexity relates the complexity according to one UTM $U$ and another
UTM $U'$. Before we state and prove the theorem, we discuss several incidental
matters.
We first ask whether we need to use a UTM and not just any TM in the
definition. The answer is that the use of a UTM is convenient, and we cannot significantly
improve the ability to compress strings by allowing the larger class of TM to be used
in the definition. Let us say that we have a UTM U and a TM V; we define a new
UTM W by:

$$\begin{aligned} W(0s) &= V(s) \\ W(1s) &= U(s) \end{aligned} \qquad (8.2.3)$$

where the first character indicates whether to use the TM V or the UTM U on the rest of
the input. Since the complexity according to the UTM W is at most one more than the
complexity according to the TM V, $C_W(s) \le C_V(s) + 1$, we see that using the larger class
of TM to define complexities cannot improve our results for any particular string by
more than one bit, which is not significant for long complex strings.
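The construction of Eq. (8.2.3) can be tried out on toy decoders. In the sketch below, `V` and `U` are simple Python functions standing in for the machines (they are our own hypothetical examples, not Turing machines, and `U` is not universal), and the shortest representation is found by brute-force search over all short binary inputs.

```python
from itertools import product

def V(r):
    """Toy special-purpose decoder: outputs the input with every bit doubled."""
    return "".join(c + c for c in r)

def U(r):
    """Toy stand-in for the general machine: outputs its input unchanged."""
    return r

def W(r):
    """Eq. (8.2.3): the first character selects V (0) or U (1) for the rest."""
    if r == "":
        return ""
    return V(r[1:]) if r[0] == "0" else U(r[1:])

def complexity(machine, s, max_len=12):
    """Length of the shortest binary input r with machine(r) == s, or None if not found."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if machine("".join(bits)) == s:
                return n
    return None

target = "00110011"
c_v = complexity(V, target)  # V exploits the doubling pattern
c_u = complexity(U, target)  # U must receive the string itself
c_w = complexity(W, target)  # W pays one extra bit to select V
```

For this target the combined decoder is never worse than either component by more than the single selector bit, which is the content of $C_W(s) \le C_V(s) + 1$.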
We may be disturbed that the definition of complexity does not indicate that the
complexity of an incompressible string is the same as the string length itself. Indeed
the definition does not require it. However, if we wanted to impose this as an
auxiliary condition, we could define the complexity of a string using a slightly different
construction. Given a UTM U, we define a new UTM V such that

$$\begin{aligned} V(0s') &= s' \\ V(1s') &= U(s') \end{aligned} \qquad (8.2.4)$$

where the first character indicates whether the string is compressed. We then define the
complexity $C_U(s)$ of any string s as one less than the length of the shortest string r such
that $V(r) = s$. This is not quite a fair definition, because if we wanted to communicate
the string s we would have to indicate all of r, including its first bit. This means that
we should define the complexity as the length of r, which would be a sacrifice of at
most one bit for incompressible strings. Limiting the complexity of a string to be no
longer than the string itself might seem a natural idea. However, we note that the
Shannon information, Eq. (8.2.1), is related only to the probability of a string, and
may be larger than the original string length for a particular string.
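The "compressed or not" flag of Eq. (8.2.4) is a familiar device in practical file formats. The following sketch illustrates it under loud assumptions: zlib decompression plays the role of U, the data are bytes rather than bits, and the flag occupies a whole byte instead of a single bit. The function names are ours.

```python
import os
import zlib

def encode(s: bytes) -> bytes:
    """Choose the shorter of the literal form (flag 0) and the compressed form (flag 1)."""
    packed = zlib.compress(s, 9)
    if len(packed) < len(s):
        return b"\x01" + packed
    return b"\x00" + s

def V(r: bytes) -> bytes:
    """Analogue of Eq. (8.2.4): flag 0 returns the rest as is, flag 1 decompresses it."""
    flag, body = r[:1], r[1:]
    return body if flag == b"\x00" else zlib.decompress(body)

regular = b"0" * 4000 + b"1" * 4000   # compressible string
random_like = os.urandom(4000)        # effectively incompressible string

regular_code = encode(regular)
random_code = encode(random_like)
```

With the flag in place, the representation of any string is never longer than the string plus the flag, at the cost of one flag unit on every string.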
Returning to our basic definition of complexity, we have described the existence
of a shortest possible representation of any string s, and a single machine U that can
reconstruct each s from this representation. The key theorem that we need to prove
relates the complexity defined using one UTM $U$ to the complexity defined using
another UTM $U'$. The theorem is: the complexity $C_U$ based on $U$ and the complexity
$C_{U'}$ based on $U'$ satisfy:

$$C_U(s) \le C_{U'}(s) + C_U(U') \qquad (8.2.5)$$

where $C_U(U')$ is independent of the string s. The proof of this expression results from
the ability of the UTM $U$ to simulate $U'$. To prove this we must improve slightly our
definition of complexity, or equivalently, we have to limit the UTM that are allowed.
This is discussed in Questions 8.2.1–8.2.3. It is shown there that we can preface binary
strings input to the UTM $U'$ with a prefix that will make them generate the same
output when input to $U$. We might call this prefix $r_{U,U'}$ a translation program; it satisfies
the property that for any string r, $U(r_{U,U'}\, r) = U'(r)$. Let $r_{U'}$ be a minimal
representation for $U'$ of the string s. Then $r_{U,U'}\, r_{U'}$ is a representation for $U$ of the string s. The
length of this string must be greater than or equal to the length of the minimum string
$r_U$ necessary to produce the same output:

$$C_U(s) = |r_U| \le |r_{U,U'}\, r_{U'}| = |r_{U'}| + |r_{U,U'}| = C_{U'}(s) + C_U(U') \qquad (8.2.6)$$

$C_U(U') = |r_{U,U'}|$ is the length of the translation program. We have proven the
inequality in Eq. (8.2.5).
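As a brief worked consequence (our own restatement, assuming that both $U$ and $U'$ satisfy the conditions under which Eq. (8.2.5) holds, so that the inequality can also be applied with the roles of the two machines exchanged), the two complexities differ by at most a constant that does not depend on s:

```latex
\begin{aligned}
C_U(s) - C_{U'}(s) &\le C_U(U') \\
C_{U'}(s) - C_U(s) &\le C_{U'}(U) \\
\Rightarrow\quad \bigl| C_U(s) - C_{U'}(s) \bigr| &\le \max\bigl( C_U(U'),\; C_{U'}(U) \bigr)
\end{aligned}
```

This is the sense in which any two definitions of complexity differ by no more than a constant: for long, complex strings the fixed lengths of the translation programs become negligible compared with $C_U(s)$ itself.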
Question 8.2.1 Show that there exists a UTM $U_0$ such that for any TM U
that accepts binary input, there is a string $r_U$ so that for all s and r
satisfying $s = U(r)$, we have that $s = U_0(r_U\, r)$.
Hint One way to do this is to use a modified form of the construction
given in Section 1.9. The new construction requires modifying the nature of
the UTM, i.e., a trick.
Solution 8.2.1 We call the UTM described in Section 1.9 $\tilde{U}_0$. We can
simulate the UTM U using $\tilde{U}_0$; however, the form of the input string would not
quite satisfy the conditions of this theorem. $\tilde{U}_0$ has an input that looks like
$r_U\, r_t(r)$, where the right part is only a function of the input string r and the
left part is only a function of the UTM U. However, the tape part of the
representation $r_t(r)$ uses a doubled binary form for characters and markers
between them so that it is not the same as the original tape. We must replace
the tape part of the representation with the original string in order to have
an input string of the form $r_U\, r$.
Both $\tilde{U}_0$ and U have binary input strings. This means that we might try
to use the tape of U without modification in the tape part of the
representation given in Section 1.9. Then there would be no delimiters between
characters and no doubled binary representation. There is, however, one
difficulty. The UTM $U_0$ must keep track of where the current position of the
UTM U would be during the same calculation. This was accomplished in
Section 1.9 by converting one of the $M_1$ markers to $M_6$ at the current
location of the UTM U. There are a number of ways to overcome this problem,
but all require us to introduce something new. We will do this by allowing
the UTM $U_0$ to have a counter that can keep track of the current position of
the UTM U. There are two ways to argue this. One is to allow, by
proclamation, a counter that can reach arbitrarily high numbers. The other is to
recognize that the longest string we might conceivably encounter is smaller
than the number of particles in the known universe, or very roughly
$10^{90} \approx 2^{300}$. This means that we can use an internal memory of 300 bits to
represent such a counter. This counter is initialized to 0 and set to the current
location of the UTM U at every step of the calculation. This construction
gives us the desired UTM $U_0$.
Question 8.2.2 Using the result of Question 8.2.1, prove Eq. (8.2.5). See the text for a hint.
Solution 8.2.2 The problem is that Eq. (8.2.5) is not actually correct for all UTM (see Question 8.2.3) so we need to modify our conditions. In a sense, the modification is minor because we only improve the definition slightly. We do this by defining the complexity $C_U(s)$ for an arbitrary UTM as the minimum length of r such that W(r) = s where W is defined by:
$$W(0s) = U_0(s), \qquad W(1s) = U(s) \tag{8.2.7}$$
that is, the first bit specifies whether to use U or the special UTM $U_0$ constructed in Question 8.2.1. $C_U(s)$ defined this way is at most one bit more than our previous definition, for any particular string. It might be significantly smaller. This should not be a problem, because our objective is to find short representations of strings. By using our special UTM $U_0$ in this definition, we guarantee that for any two UTM U and U′, whose complexity is defined in terms of W and W′ by Eq. (8.2.7), we can write $W(r_{WW'}r_{W'}) = W'(r_{W'})$. This is possible because W inherits the properties of $U_0$ when the first character of its input string is 0.
Question 8.2.3 Show that some form of qualification of Eq. (8.2.5) is necessary by demonstrating that there exists a UTM that does not satisfy this inequality. Therefore, Eq. (8.2.5) cannot be extended to all UTM.
Solution 8.2.3 One possibility is to have a UTM that uses only certain characters in its input string. Specifically, define a UTM U that acts the same as a UTM U′ but uses only every other character in its input string: U(r) = U′(r′) if r is any string whose odd characters are the characters of r′. The complexity of a string according to U is twice the complexity according to U′ and therefore Eq. (8.2.5) is invalid in this case. With the modified definition of complexity given in Question 8.2.2 this is no longer a problem.
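The construction in Solution 8.2.3 can be tried out on a small scale. The following is a sketch under stated assumptions: u_prime is a hypothetical toy decoder standing in for U′ (it is not a UTM), u reads only the odd characters of its input, and complexity is an exhaustive search that is feasible only because the strings are tiny. In this toy the shortest input to U has length 2C − 1 rather than exactly 2C, because the final ignored character can be dropped; the gap between the two complexities still grows with the string instead of staying below a constant.

```python
from itertools import product


def u_prime(r: str) -> str:
    """Toy decoder U': a 2-bit repeat count followed by the pattern to repeat."""
    if len(r) < 2:
        return ""
    return r[2:] * int(r[:2], 2)


def u(r: str) -> str:
    """Toy decoder U: acts like U' but reads only the 1st, 3rd, 5th, ... characters."""
    return u_prime(r[::2])


def complexity(machine, s: str, max_len: int = 14):
    """Length of the shortest input (up to max_len bits) on which machine outputs s."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if machine("".join(bits)) == s:
                return n
    return None  # no representation found within the bound


for s in ["000", "0101", "0110"]:
    c_prime, c = complexity(u_prime, s), complexity(u, s)
    # The difference c - c_prime grows with c_prime, so no fixed constant bounds it.
    print(s, c_prime, c, c - c_prime)
```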
Switching U and U′ in Eq. (8.2.5) gives a similar inequality with a constant $C_{U'}(U)$. Defining the larger of the two translation program lengths to be
$$C_{U,U'} = \max(C_U(U'), C_{U'}(U)) \tag{8.2.8}$$
we have proven that complexities defined by the UTM differ by no more than $C_{U,U'}$:
$$|C_U(s) - C_{U'}(s)| \le C_{U,U'} \tag{8.2.9}$$
Since this constant is independent of the complexity of the string s, it becomes insignificant for large enough complexities. Thus, for strings that are complex enough, it doesn't matter which UTM we use to define their complexity. Up to this constant, the complexity defined by one UTM is the same as the complexity defined by another UTM. This consistency (universality) in the complexity of a string is essential in order for it to be well defined. We will use a few examples to illustrate the nature of universality provided by this definition.
The first example illustrates the relationship of algorithmic complexity to string compression. Given a string s we can ask what methods of compression are useful for the string. A useful compression algorithm corresponds to a pattern in the characters of the string. A string might have many repetitive digits, or cyclically repeating digits. Alternatively, it might be a sequence that can be generated using simple mathematical operations such as the Fibonacci series, or the digits of π. There are many such patterns that are relevant to the compression of strings. We can choose a finite set of N algorithms {$V_i$}, where each one is represented by a TM that reconstructs a string s from a shorter string r by taking advantage of properties of the pattern. We now construct a new TM U which is defined by:
$$U(r_i r') = V_i(r') \tag{8.2.10}$$
where $r_i$ is a binary representation of the number i, having log(N) bits. This is a UTM if any of the $V_i$ is a UTM or it can be made into a UTM by Eq. (8.2.3). We use U to define the complexity $C_U(s)$ of any string as described above. This complexity includes both the length of r′ and the number of bits (log(N)) in $r_i$ that together constitute the length of the input r to U. Once it is defined, this complexity is a measure of the complexity of all strings. We do not use different TM to define the complexity of each string; one UTM is used to define the complexity of all strings.
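The construction of Eq. (8.2.10) can be sketched with ordinary compression libraries. This is only an illustration under stated assumptions: the compressors zlib, bz2 and lzma from the Python standard library stand in for the algorithms $V_i$ (they are practical codecs, not Turing machines, and the resulting decoder is not a UTM), and the names CODECS, INDEX_BITS, encode, decode and description_length_bits are hypothetical. The index of the chosen algorithm plays the role of $r_i$ and is charged log(N) bits.

```python
import bz2
import lzma
import math
import zlib

# Each entry is a (compress, decompress) pair standing in for one algorithm V_i.
CODECS = [
    (bytes, bytes),                       # V_0: literal copy, no compression
    (zlib.compress, zlib.decompress),     # V_1
    (bz2.compress, bz2.decompress),       # V_2
    (lzma.compress, lzma.decompress),     # V_3
]
INDEX_BITS = math.ceil(math.log2(len(CODECS)))  # log(N) bits for the prefix r_i


def encode(s: bytes):
    """Return (i, r'): the index of the best algorithm and its compressed payload."""
    payloads = [compress(s) for compress, _ in CODECS]
    i = min(range(len(CODECS)), key=lambda j: len(payloads[j]))
    return i, payloads[i]


def decode(i: int, payload: bytes) -> bytes:
    """U(r_i r') = V_i(r'): dispatch on the index and decompress."""
    return CODECS[i][1](payload)


def description_length_bits(s: bytes) -> int:
    """Length of the input to U: the index bits plus the payload bits."""
    _, payload = encode(s)
    return INDEX_BITS + 8 * len(payload)


repetitive = b"ab" * 5000
print(description_length_bits(repetitive), "bits for", 8 * len(repetitive), "bits of input")
```

The number obtained this way is only an upper bound on the length of a description; a different set of algorithms could give a shorter one.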
Despite the message of the last example, let us assume that we are evaluating the complexity of a particular string s. We define a new UTM $U_s$ by:
$$U_s(0s') = s, \qquad U_s(1s') = U(s') \tag{8.2.11}$$
that is, the first character tells $U_s$ if the string is s. We can use this new UTM to define the complexity of all strings and for this definition the complexity of s is one. How does this relate to our theorem about the universality of complexity? The point is that in this case the translation program between U and $U_s$ contains the complete information about s and therefore must be at least as long as $C_U(s)$. What we have done is to take the particular string s and insert it into the table of $U_s$. We see in this example how universality is tied to an assumption that the complexities that are discussed are longer than the TM translation programs or, equivalently, the information in their tables. Conceptually, we would say that universality of complexity is tied to an assumption of lack of specific knowledge on the part of the recipient (represented by the UTM) of the information itself. The choice of a particular UTM might be dictated by an implicit understanding of the set of strings that we would like to represent, even though the complexity of a string is defined without reference to an ensemble of strings. However, this apparent relativism of the complexity is limited by our basic theorem that relates the complexity of distinct UTM, and by additional results about the impossibility of compressing most strings discussed in the following paragraphs.
We have gained an additional result from the construction of a single UTM that generates all strings from their compressed forms. This is that a representation r only represents one string s. We can now prove that the probability that a string of length N can be compressed is very small. The proof proceeds from the observation that the number of possible strings decreases very rapidly with decreasing string length. A string s of length |s| = N compressed by k bits is represented by a particular string r of length |r| = C(s) = N − k. Since there are only $2^{N-k}$ strings of length N − k, at most $2^{N-k}$ of the $2^N$ strings of length N can be compressed by k bits. The fractional compression is k/N. For example, among all strings of length $10^6$ bits, at most 1 string in $2^{100} \approx 10^{30}$ can be compressed by 100 bits or .01% of the string length. This is not a very significant compression. Even so, this estimate of the average number of strings that can be compressed is much too large, because strings that are not of length N, e.g., strings of length N − 1, N − 2, …, N − k, would also be represented by strings of length N − k. Thus most strings are incompressible. Moreover, selecting a string at random will, with overwhelming probability, yield an incompressible string.
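The counting argument can be checked with exact arithmetic. The sketch below is a direct transcription of the bound, not a measurement on real strings; the function names are introduced here for illustration. The second function counts all representations of length 0 through N − k, which bounds the fraction of strings compressible by k bits or more.

```python
from fractions import Fraction
import math


def max_fraction_compressed_by(k: int) -> Fraction:
    """At most 2^(N-k) of the 2^N strings of length N fit in N-k bits: a fraction 2^-k."""
    return Fraction(1, 2 ** k)


def max_fraction_compressed_by_at_least(n: int, k: int) -> Fraction:
    """Bound using every representation of length 0, 1, ..., n-k."""
    shorter = 2 ** (n - k + 1) - 1  # number of binary strings of length 0..n-k
    return Fraction(shorter, 2 ** n)


# The example in the text: strings of 10^6 bits compressed by k = 100 bits.
print(max_fraction_compressed_by(100))   # 1 string in 2^100
print(math.log10(2 ** 100))              # exponent of the corresponding power of ten
print(100 / 10 ** 6)                     # fractional compression k/N
print(float(max_fraction_compressed_by_at_least(20, 5)))
```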
Question 8.2.4 Calculate a strict lower bound for the average complexity of strings of length N.
Solution 8.2.4 We assume that strings of length N are compressed so that they are represented by all of the shortest strings. One string is represented by the null string (length 0), two strings are represented by a single bit (length 1), and so on. The relationship:
$$2^N = \sum_{l=0}^{N-1} 2^l + 1 \tag{8.2.12}$$
means that we will fill all of the possible strings up to length N − 1 and then have one string left of length N. The average representation length for any complexity measure must then satisfy:
$$\langle C(s) \rangle \ge \frac{1}{2^N}\left(\sum_{l=0}^{N-1} l\,2^l + N\right) \tag{8.2.13}$$
The sum can be evaluated using a table of sums or:
$$\sum_{l=0}^{N-1} l\,2^l = \frac{1}{\ln(2)}\,\frac{d}{d\lambda}\sum_{l=0}^{N-1} 2^{\lambda l}\,\Big|_{\lambda=1} = \frac{1}{\ln(2)}\,\frac{d}{d\lambda}\,\frac{2^{\lambda N}-1}{2^{\lambda}-1}\,\Big|_{\lambda=1} = N 2^N - 2(2^N - 1) \tag{8.2.14}$$
giving:
$$\langle C(s) \rangle \ge (N-2) + \frac{1}{2^N}(N+2) > N-2 \tag{8.2.15}$$
Thus the average complexity of strings of length N cannot be reduced by more than two bits. This strict lower bound applies to all measures of complexity.
We can also interpret this discussion to mean that the best UTMs to use to define complexity are those that are invertible: they have a one-to-one mapping of strings to representations. In this case we have a mapping r(s) which gives the unique representation of a string. The reason that such UTM are better is that there are only a limited number of representations shorter than N; if we use up more than one of them for a particular string, then we will have fewer representations to use for others. Such UTM are closely analogous to our understanding of encoding and decoding as described in information theory. The UTM is the decoder and the mapping of the string onto its representation is the encoding.
Because most strings are incompressible, we can also prove that if we have an ensemble of strings defined by the probability P(s), then the average algorithmic complexity of these strings is essentially the same as the Shannon information. In particular, the ensemble of all of the strings of length N has a Shannon information of N bits and an average algorithmic complexity which is the same. The catch is recognizing that to specify P(s) itself requires an algorithm whose complexity must enter into
the discussion. The proof follows from the discussion in Section 1.8. An ensemble defined by a probability P(s) can be encoded in such a way that the average string length is given by the Shannon information. We now realize that to define the string complexity we must include the description of the decoding operation:
$$\sum_s P(s)\,C(s) = \sum_s P(s)\,I_s + C(P) \tag{8.2.16}$$
where the expression C(P) represents the complexity of the decoding operation for the universal computer U for the ensemble given by P(s). C(P) depends in part on the algorithm used to specify the ensemble probability P(s). For the average ensemble complexity to be essentially equal to the average Shannon information, the specification of the ensemble must itself be simple.
For Markov chains a similar result applies: the Shannon information of a string representing a Markov chain is the same as the algorithmic complexity of the same string, as long as the algorithm specifying the Markov chain is simple.
A general consequence of the definition of algorithmic complexity is a limitation on what TM can do. No TM can generate a string more complex than the input string that it is provided with, plus the information in its table; otherwise we would have redefined the complexity of the output string to take this into consideration. This is a key limitation of TM: TM (and computers that are realizations of this model) cannot generate new information. They can only process information they are given. As discussed briefly in Section 1.9.7, this limitation can be overcome by a TM that is given a string of random bits as input. The infinitely complex input means the limitation does not apply. It remains to be demonstrated what tasks such a TM can perform that are not possible for a conventional TM. If such tasks are identified, there will be important implications for computer design. In this context, it may also be suggested that some forms of creativity might be linked to the availability of randomness (see Section 1.9.7). We will return to this issue at the end of the chapter.
While the definition of complexity using UTM is appealing, there is a profound difficulty with this proof. It is nonconstructive. No method is given to determine the complexity of a particular string. Indeed, it can be proven that this is a fundamentally difficult task: the time necessary for a TM to determine C(s) grows exponentially with the length of s. At least this is true when there is a bound on the complexity, e.g., by Eq. (8.2.4). Otherwise the complexity is noncomputable. We find the complexity of a string by trying all input strings in the UTM to see which one gives the necessary output. If the complexity is not bounded, then the halting problem implies that we cannot tell if the UTM will halt on a particular input, thus it is noncomputable. If the complexity of the string is bounded, then we only try strings up to this bound, and it is possible to determine if the UTM will halt for members of this bounded set of strings. Nevertheless, trying each string requires a time that grows exponentially with the bound, and therefore is not practical except for a few very simple strings. The process of finding the complexity of a string is akin to a process of trying models for the string. A model is a TM that might, when given the proper input, generate the string. It is possible to try many models. However, to determine the actual compressed string may not be practical in any reasonable time.
With any particular set of models, we can, however, find an upper bound on the complexity of a string. One of the possible models is that of a Markov chain as used by Shannon information theory. Algorithmic complexity allows more general TM models. However, by our discussion it is improbable that a randomly chosen string will be compressible by any algorithm.
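The search over input strings, and its exponential cost, can be sketched for a machine that always halts. Everything here is an assumption made for illustration: toy_machine is a hypothetical, deliberately non-universal decoder (so the halting problem does not arise), and brute_force_complexity and candidates_up_to are names introduced for this sketch. The literal mode of the toy machine plays the role of a trivial model that gives the upper bound of the string length plus one bit.

```python
from itertools import product


def toy_machine(program: str) -> str:
    """'0' + bits copies the bits literally; '1' + 4-bit count + pattern
    repeats the pattern count times. Always halts."""
    if not program:
        return ""
    if program[0] == "0":
        return program[1:]
    if len(program) < 6:  # need the count and a nonempty pattern
        return ""
    return program[5:] * int(program[1:5], 2)


def candidates_up_to(bound: int) -> int:
    """Number of binary strings of length 0..bound: 2^(bound+1) - 1."""
    return 2 ** (bound + 1) - 1


def brute_force_complexity(target: str, bound: int):
    """Try every input up to the bound, shortest first.
    Returns (length, program, number of inputs tried)."""
    tried = 0
    for n in range(bound + 1):
        for bits in product("01", repeat=n):
            tried += 1
            program = "".join(bits)
            if toy_machine(program) == target:
                return n, program, tried
    return None, None, tried


target = "01" * 8
length, program, tried = brute_force_complexity(target, bound=len(target) + 1)
print(length, program, tried, candidates_up_to(len(target) + 1))
```

Each additional bit in the bound doubles the number of candidate inputs, which is why the search is practical only for very simple strings.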
In summary, the universality of complexity is a statement that the use of different UTMs in the definition of complexity affects the result by no more than a constant. This constant is the length of the program that translates the input of one UTM to the other. Significantly, the more complex the string is, the more universal is the value of its complexity. This follows because the length of translation programs becomes less and less relevant for longer and longer descriptions/representations. Since we are interested in properties of complex systems whose descriptions are long, we can, with caution, rely on the universality of their complexity. This is not the case with simple systems whose descriptions and therefore complexities are “subjective”: they depend on the conventions for description. These conventions, in our mathematical definition, are represented by the choice of UTM used to define complexity. We also showed that most strings are not compressible and that the Shannon information measure is the same as the average algorithmic complexity for all concisely describable ensembles. In what follows, unless otherwise mentioned, we assume a particular definition of complexity C(s) using the UTM U.
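The relationship between Shannon information and description length can be explored numerically for a simply specified ensemble. The sketch below makes several assumptions that are not part of the text: the ensemble is a biased coin with probability p = 0.1 of a 1, the sample is drawn with Python's random module and a fixed seed, and the general-purpose compressor zlib stands in for a decoder. A practical compressor gives only an upper bound on the description length, so we expect it to fall between the Shannon information and the raw string length rather than to reach the Shannon value.

```python
import math
import random
import zlib


def binary_entropy(p: float) -> float:
    """Shannon information per bit, in bits, for a coin with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def sample_bits(n: int, p: float, seed: int = 0):
    """Draw n independent bits with P(1) = p (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def pack(bits) -> bytes:
    """Pack a list of bits (length a multiple of 8) into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


n, p = 80_000, 0.1
bits = sample_bits(n, p)
shannon_bits = n * binary_entropy(p)                     # ensemble average
compressed_bits = 8 * len(zlib.compress(pack(bits), 9))  # one upper bound
print(round(shannon_bits), compressed_bits, n)
```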
8.2.2 Mathematical systems: numbers and functions
One of the difficulties in discussing complexity is that many elementary mathematical constructs have unusual properties when considered from the point of view of complexity. Philosophers have been troubled by these points, and they have been extensively debated over the centuries. Most of the problems revolve around various forms of infinity. Unlimited numbers and infinite precision often simplify symbolic mathematical discussions; however, they are not well behaved from the point of view of complexity measures. There appears to be a paradox here that will be clarified when we distinguish between the complexity of a set of numbers and the complexity of an element of the set.
Let us consider the complexity of specifying a single integer. The difficulty with integers is that there are infinitely many of them. Using an information theory point of view, assigning equal probability to all integers would imply that any particular integer would have no probability of occurring. If I ask you to give me a positive integer, from 1 to infinity with equal probability, there is no chance that you will give me an integer below any particular cutoff value, say N. This means that you will need arbitrarily many digits to specify the integer, and there is no limit to the information required. Thus the complexity of specifying a single integer is infinite. However, if we allow only integers between 1 and a large positive number (say $N = 10^{90}$, roughly the number of elementary particles in the known universe) the complexity of specifying one of the integers is only log(N), about 300 bits. The drastic difference between the complexity of specifying an arbitrary integer (infinite) and the complexity of an enormously large number of integers (300 bits) suggests that systems that are easy to define may be highly complex. The whole field of number theory has shown that integers are not as simple as they first appear. The measure of complexity of specifying a single integer may appear to be far from more abstract discussions like those of the halting problem or Gödel's theorem (Section 1.9.5); however, they are related. This is apparent since these theorems do not apply to finite sets.
In what sense are integers simple? We can consider the length of a UTM input string that can generate all the positive integers. As discussed in the last section, this is similar to the definition of their Kolmogorov or algorithmic complexity. The program would, starting from zero and keeping a list, progressively add one to the preceding integer. The problem is that such a program never halts, and the task is not complete. We can generalize our definition of a Turing machine to allow for this case by saying that, by definition, this simple program is generating all integers. Then the algorithmic complexity of the integers is quite small. Another way to do this is to consider the complexity of recognizing an integer: the recognition complexity. Recognizing an integer is trivial if we are considering only binary strings, because all of them represent integers. The point, however, is that we can expand the space of possible characters to include various symbols: letters, punctuation, mathematical operations, etc. The mathematical operations might act upon integers. We then ask how long is a TM program that can recognize any integer that appears as a combination of such characters. The length of such a program is also small.
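Both programs described above are indeed short, as a sketch in Python suggests. The names are introduced here for illustration, and the recognizer is a simplification: it accepts only decimal integer literals with an optional sign, and does not evaluate expressions in which mathematical operations act upon integers.

```python
import itertools
import re


def all_positive_integers():
    """Generates 1, 2, 3, ... and never halts; by convention we say that
    this short program generates all positive integers."""
    n = 0
    while True:
        n += 1
        yield n


_INTEGER_LITERAL = re.compile(r"[+-]?[0-9]+")


def is_integer_literal(text: str) -> bool:
    """Recognize an integer written in an alphabet that also contains
    letters, punctuation and other symbols."""
    return _INTEGER_LITERAL.fullmatch(text) is not None


print(list(itertools.islice(all_positive_integers(), 5)))
print([is_integer_literal(t) for t in ["1729", "-42", "17a", "3.14", ""]])
```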
We see that we must distinguish between the complexity of elements of a set and the set itself. A program that recognizes integers is concerned with the attributes of the integers required to define them as a set, rather than the specification of a particular integer. The algorithmic complexity of the set of all integers is small even though the information contained in a single integer can be arbitrarily large. This distinction between the information contained in an element of a set and the information necessary to define the set will also be important when we consider the complexity of physical systems.
The complexity of a single real number is also infinite. Specifying an arbitrary real number requires infinitely many digits. However, if we confine ourselves to any reasonable precision, the complexity becomes very manageable. For example, the most accurately known fundamental constant in science is the electron magnetic moment in Bohr magnetons
$$\mu_e/\mu_B = 1.001159652193(10) \tag{8.2.17}$$
where the parenthesis indicates the error estimate, corresponding to 11 accurate decimal digits or 37 binary digits. If we consider $1 - \mu_e/\mu_B$ we immediately lose 3 decimal digits. Thus, similar to integers, the practical complexity of a real number is not very large.
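The bit counts quoted for bounded integers and for the measured constant can be checked with exact integer arithmetic. The function name bits_to_index is introduced here for illustration; it counts the binary digits needed to label a given number of equally likely possibilities.

```python
def bits_to_index(n_values: int) -> int:
    """Number of binary digits needed to label n_values distinct possibilities."""
    if n_values < 1:
        raise ValueError("need at least one possibility")
    return (n_values - 1).bit_length()


print(bits_to_index(10 ** 90))  # integers from 1 to 10^90: 299, about 300 bits
print(bits_to_index(10 ** 11))  # 11 accurate decimal digits: 37 binary digits
```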
The discussion of integers and reals suggests that under practical circumstances a single number is not a highly complex object. Generally, the complexity of a system arises because of the presence of a large number of parameters that must be specified. However, there is only reason to consider them collectively as a system if they are coupled to each other.
The next category of mathematical objects that we consider are functions. To specify a function f(s) we must either describe its operation by a formula or specify its action on each possible argument. We consider Boolean functions (functions with binary output, see Section 1.9.2), f(s) = ±1, of a binary string, $s = (s_1 s_2 \ldots s_{N_e})$. The number of arguments of the function (input bits) is $N_e$. There are $2^{N_e}$ possible values of the input string. For each of these there are two possible outcomes (output values). All Boolean functions may be specified by listing the binary output for each possible input state. Each possible output is independent. The number of different Boolean functions is the number of possible sets of outputs which is $2^{2^{N_e}}$. Assuming that all of the possible Boolean functions are equally likely, the complexity of a Boolean function (the amount of information necessary to specify it) is the logarithm of this number or $C(f) = 2^{N_e}$. The representation of a Boolean function in terms of C(f) binary variables can also be made explicit as a string representing the presence or absence of terms in the disjunctive normal form described in Section 1.9.2.
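The counting of Boolean functions can be made concrete for small numbers of inputs. In this sketch the outputs are written as 0 and 1 rather than ±1, inputs are listed in lexicographic order, and the function names are introduced only for illustration. A function is specified by its truth table, a string of $2^{N_e}$ bits, and every such string is a distinct function.

```python
from itertools import product


def truth_table(f, n_inputs: int) -> str:
    """Encode f as a string with one output bit per possible input."""
    return "".join(str(f(*bits)) for bits in product((0, 1), repeat=n_inputs))


def function_from_table(table: str):
    """Decode a truth-table string of length 2^n back into a function of n bits."""
    n_inputs = len(table).bit_length() - 1
    rows = dict(zip(product((0, 1), repeat=n_inputs), table))
    return lambda *bits: int(rows[bits])


def count_boolean_functions(n_inputs: int) -> int:
    """Count truth tables by enumeration: one function per string of 2^n bits."""
    return sum(1 for _ in product("01", repeat=2 ** n_inputs))


def xor(a, b):
    return a ^ b


print(truth_table(xor, 2))         # 4 bits specify a function of 2 inputs
print(count_boolean_functions(2))  # 2^(2^2) functions
print(count_boolean_functions(3))  # 2^(2^3) functions
```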
A binary function with $N_a$ outputs is the same as $N_a$ independent Boolean functions. If we assume that all possible combinations of Boolean functions are equally likely, then the total complexity is the sum of the complexity of each, or
$$C(f) = N_a 2^{N_e} \tag{8.2.18}$$
The asymmetry between input and output is a fundamental one. It arises because we need to specify for each possible input which of the possible outputs is output. Specifying “which” is a logarithmic operation in the number of possibilities, and therefore the influence of the output space on the complexity is logarithmic compared to the influence of the input. This discussion will be generalized later to consider a physical system that acts in response to its environment. The environment will be specified by a number of binary variables (environmental complexity) $N_e$, and its actions will be specified by a number of binary variables (action complexity) $N_a$.
8.3 Complexity of Physical Systems
In order to apply our understanding of the complexity of mathematical constructs to physical systems, we must develop a fundamental understanding of representations. The complexity of a physical system is to be defined as the length of the shortest string s that can represent its properties: the results of possible measurements/observations. In Section 8.3.1 we discuss the relationship between thermodynamics and information theory. This will enable us to define the complexity of ergodic and nonergodic systems. The resulting information measure is essentially that of Shannon information theory. When we consider algorithmic complexity, we can ask whether this is the smallest amount of information that might be used. This is discussed in Section 8.3.2. Section 8.3.3 introduces the complexity profile, which measures the complexity as a function of the scale of observation. Implications of the time scale of observation, for chaotic dynamics, are discussed in Section 8.3.4. Section 8.3.5 discusses examples and properties of the complexity profile. Sections 8.3.1 through 8.3.5 are based upon descriptive complexity. To better account for the behavior of a system in response to its environment we consider behavioral complexity in Section 8.3.6. This turns out to be closely related to descriptive complexity. Other issues related to the role of the observer are discussed in Section 8.3.7.
8.3.1 Entropy and the complexity of physical systems
The definition of complexity of a system requires us to develop an understanding of the relationship of information to the physical properties of a system. The most direct relationship is the relationship of entropy and information. At the outset, it should be understood that these are very different concepts. Entropy is a specific physical property of systems that are in equilibrium, or are in well-defined ensembles. Information is not a unique physical property. Instead it is related to representations of digits. Information can be a property of a time sequence or any other set of degrees of freedom. For example, the information content of a set of characters written on a piece of paper can be given. The entropy, however, would be largely a property of the paper or the ink. The entropy of paper is difficult to determine precisely, but simpler substances have entropies that have been determined and are tabulated at specific temperatures and pressures. We also know that entropy is conserved in reversible adiabatic processes and increases in irreversible ones.
Despite the significant conceptual difference between information and entropy, the formal definition of information discussed in Section 1.8 appears very similar to the definition of entropy discussed in Section 1.3. Thus, it makes sense that the two are related when we develop an understanding of complexity. It is helpful to review the definitions. The entropy was defined first for the microcanonical ensemble, which specifies the macroscopic energy U, number of particles N, and volume V, of the system. We assume that all states (microstates) of the system with this energy, number of particles and volume are equally likely in the ensemble. The entropy was written as
$$S = k \ln \Omega(U, N, V) \tag{8.3.1}$$
where Ω(U, N, V) is the number of such states. The coefficient k is defined so that the units of entropy are consistent with units of energy and temperature for the thermodynamic relationship T = dU/dS.
Information was defined for a string of characters. Given the probability of the string of characters, the information is defined by Eq. (8.2.1). The logarithm is taken to be base 2 so that the information is measured in units of bits. We see that the information content is related to selecting a single state out of an ensemble of possibilities.
We can relate the two definitions in a mathematically direct but conceptually significant way. If we want to specify a particular microstate of a thermodynamic system, we must select this microstate from the whole ensemble. The probability of this particular state is given in the microcanonical ensemble by P = 1/Ω. If we think about the state of the system as a message containing information, we can use Eq. (8.2.1) to give the amount of information as:
$$I(\{x,p\} \,|\, (U,N,V)) = S(U,N,V)/(k \ln 2) \tag{8.3.2}$$
This expression should be understood as the amount of information contained in a microstate {x,p}, when the system is in the macrostate specified by U, N, V; it is also the information necessary to describe precisely the microstate. This is the fundamental relationship we are looking for. We review its meaning in terms of the description of a particular idealized physical system.
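The conversion between entropy and bits in Eq. (8.3.2) is a change of units, which a short calculation makes explicit. The sketch assumes SI units, with the Boltzmann constant taken at its exact defined value; the function names and the example values (a system with $2^{40}$ equally likely microstates, and an entropy of 1 J/K) are chosen here for illustration.

```python
import math

BOLTZMANN_K = 1.380649e-23  # J/K, exact value in the SI


def microcanonical_entropy(omega: float) -> float:
    """S = k ln(Omega) for Omega equally likely microstates, in J/K."""
    return BOLTZMANN_K * math.log(omega)


def entropy_to_bits(entropy: float) -> float:
    """I = S / (k ln 2): bits needed to single out one microstate."""
    return entropy / (BOLTZMANN_K * math.log(2))


# 2^40 equally likely microstates correspond to 40 bits.
print(entropy_to_bits(microcanonical_entropy(2 ** 40)))

# A macroscopic entropy of 1 J/K corresponds to roughly 10^23 bits.
print(f"{entropy_to_bits(1.0):.3e}")
```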
If we want to describe the microstate of a system, like a gas of particles in a box, classically we must specify all of the positions and momenta of the particles {$x_i$, $p_i$}. If N is the number of particles, then there are 6N coordinates, 3 position and 3 momentum coordinates for each particle. To specify exactly the position of each particle appears to require arbitrary precision in these coordinates. If we had to specify even a single position exactly, it would take an infinite number of binary digits. However, quantum mechanics is inherently granular, thus there is a smallest distance ∆x within which we do not need to specify one position coordinate of a particle. The particle location is uniquely given once it is within a region ∆x. More correctly, the particle must be located within a region of position and momentum of ∆x∆p = h, where h is Planck's constant. The granularity defines the precision necessary to specify the positions and momenta, and thus also the amount of information (number of bits) needed in order to describe completely the microstate. The definition of the entropy takes this into account, otherwise the counting of possible microstates of the system would be infinite. The complete calculation of the entropy (which also takes into account the indistinguishability of the particles) is given in Question 1.3.2. We now recognize that the calculation of the entropy is precisely a calculation of the information necessary to describe the microstate.
There is another way to think about the relationship of entropy and information. It follows from the recognition that the number of states of a string of I({x,p}|(U,N,V)) bits is the same as the number of states of the system. If we consider a mapping of system states onto strings, the strings enumerate or label the system states. If there are I({x,p}|(U,N,V)) bits in each string, then there is a one-to-one mapping of system states onto the strings, and a string uniquely identifies a system state. We say that a string represents a system microstate.
We thus identify the entropy of a physical system as the amount of information necessary to identify a single microstate from a specified macroscopic ensemble. For an ergodic macroscopic system, this definition is a robust one. It does not matter if we consider a typical or an average amount of information. What happens if the system is nonergodic? There are two kinds of nonergodic systems we will discuss: a magnet with a well-defined magnetization below its ordering phase transition (see Section 1.6), and a glass where there are many frozen coordinates describing the local arrangements of atoms (see Section 1.4). Many of these coordinates do not change during the time of a typical experiment. Should we include the information necessary to specify the frozen variables as part of the entropy? We would like to separate the discussion of the frozen variables from the fast ones that are in equilibrium. We use the entropy S to refer to the fast ensemble: the enumeration of the kinetically accessible states of the system. The same function of the frozen variables we will call C.
For the magnet, the amount of information contained in frozen variables is small. For the Ising model of a magnet (Section 1.6), below the magnetization transition only a single binary variable is necessary to specify whether the system magnetization is UP or DOWN. We treat the magnet by giving the information about the magnetization explicitly as part of the ensemble description. The amount of information is insignificant compared to the information in the microstate of a system, and therefore is generally ignored.
In contrast, for a glass, the amount of information that is included in the frozen variables is large. How does this information relate to the thermodynamic treatment of the system? The conventional thermodynamic theory of phase transitions does not consider the existence of frozen information. It is designed for systems like the magnet, where this information is insignificant, and thus it does not apply to the glass transition. A different theory is necessary which includes the change from an ergodic to a nonergodic system, or a change from information in fast variables to information in frozen variables. Is there any relationship between the frozen information and the entropy? If they are related at all, there are two intuitive possibilities. One is that we must specify the frozen variables as part of the ensemble, and the amount of information necessary to describe the fast variables is just as large as if there were no frozen variables. The other is that the frozen variables balance against the fast variables so that when there is more frozen information there is less information in the fast variables. In order to determine which is correct, we will need to consider an experiment that measures both. As long as an experiment is being performed in which the frozen variables never change, then the amount of information in the frozen variables is fixed. Thermodynamic experiments only depend on entropy differences. We will need to consider an experiment that changes the frozen variables, for example, heating up a glass until it becomes a liquid or cooling it from a liquid to a glass. In such an experiment the frozen information must be accounted for. The difficulty with a glass is that we do not have an independent way to determine the amount of frozen information. Fortunately, there is another system where we do.
There is an intermediate example between a magnet and a glass that is of considerable interest. The structure of ice has a glasslike frozen disorder of its hydrogen atoms below approximately 100 K. The simplest way to think about this disorder is that it arises from a choice of orientations of the water molecule around the position of the oxygen atom. This means that there is a macroscopic amount of information necessary to specify the static structure of ice. The amount of information associated with this disorder can be calculated directly using a model for the structure of ice that takes into account the correlations between molecular orientations that are needed to form a self-consistent hydrogen structure within the oxygen lattice. A first estimate is based on an average of 3/2 orientations per molecule, or C = Nk ln(3/2) = 0.806 cal/mole K. A review of better calculations is given in a book by Fletcher. The best is C = 0.8145 ± 0.0002 cal/mole K. The other calculation we need is the amount of entropy in steam. This can be obtained using a slight modification of the ideal gas calculation that takes into account the rotational and internal vibrational motion of the water molecule.
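The first estimate quoted above is a one-line calculation, sketched here as a check on the units. The value of the gas constant is the standard one for kN per mole; the calorie conversion factor (4.184 Joule per calorie) and the function name are assumptions of this sketch rather than values given in the text.

```python
from math import log

R_JOULE = 8.3144          # gas constant k*N_0 in Joule/K/mole
JOULE_PER_CAL = 4.184     # thermochemical calorie (assumed conversion)

def frozen_information_cal(orientations_per_molecule):
    """C = N k ln(W) for one mole, in cal/mole K, with W orientations per molecule."""
    return R_JOULE / JOULE_PER_CAL * log(orientations_per_molecule)

c_first_estimate = frozen_information_cal(3 / 2)
c_best = 0.8145           # best calculated value quoted in the text

print(f"first estimate: C = {c_first_estimate:.3f} cal/mole K")
print(f"relative difference from best value: {(c_best - c_first_estimate) / c_best:.1%}")
```

The simple count of 3/2 orientations per molecule comes within about one percent of the more careful calculations, which is why it serves as a useful first estimate.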
The key experiment is to measure the change in the entropy of the system as a function of temperature as it is heated from ice all the way to steam. We find the entropy using the standard thermodynamic relationship (Section 1.3)
$$q = T\,dS \qquad (8.3.3)$$
where q is the heat added to the system. Close to a temperature of zero degrees Kelvin (T = 0 K) the entropy is zero because all motion stops, and there is only one possible state of the system. Thus we would expect
$$S(T) = \int_0^T \frac{q}{T} \qquad (8.3.4)$$
that is, the total amount of entropy added to the system as it is heated up should be the same as the entropy of the gas. However, experimentally there is a difference of 0.82 ± 0.05 cal/mole K between the two. This is the amount of entropy in the gas that was not added to the system as it was heated. The coincidence of two numbers, the amount of entropy missing and the calculated information in the frozen structure of the hydrogen atoms, suggests that the missing entropy was present in the original state of the ice:
$$S(T) = C(T=0) + \int_0^T \frac{q}{T} \qquad (8.3.5)$$
This in turn implies that the information in the frozen degrees of freedom was transferred (but conserved) to the fast degrees of freedom. Eq. (8.3.5) is not consistent with the standard thermodynamic relationship in Eq. (8.3.3). Instead it should be modified to read:
$$q = T\,d(S + C) \qquad (8.3.6)$$
This should be understood as implying that adding heat to a system increases the information either of the fast or frozen variables. Adding heat (e.g., to ice) increases the temperature of the system, so that fewer variables are frozen. In this case C decreases and S increases more than would be given by the conventional relationship of Eq. (8.3.3). When heat is not added to a system, we see that there can be processes that change the number of fast degrees of freedom and the number of static degrees of freedom while leaving their sum the same. We will consider this further in later sections.
Eq. (8.3.6) is important enough to present it again from a different perspective. The discussion will help demonstrate its validity by using a theoretical argument (Fig. 8.3.1). Rather than considering it from the point of view of heating ice until it becomes steam, we consider what happens either to ice or to a glass when we cool it down through the transition where degrees of freedom become frozen. In a theoretical description we start, above the freezing-in transition, with an ensemble of systems. As we cool the system we remove heat, and this is reflected in a decrease in the number of possible states of the system. We think of this as a shrinking of the number of elements of the ensemble. However, as we go through the freezing-in transition, the ensemble breaks up into disjoint pieces that cannot make transitions to each other. Any particular material must be in one of the disjoint pieces. Thus for a particular material we must track only part of the original ensemble. For an incremental decrease in temperature due to an incremental removal of heat, the information needed to identify (describe) a particular microstate is the sum of the information necessary to describe which of the disjoint parts of the ensemble the system is in, plus the information needed to specify which of the microstates the system is in once its ensemble fragment has been specified. This is the meaning of Eq. (8.3.6). The information to specify the ensemble fragment was transferred from the entropy S to the ensemble information C. The reduction of the entropy, S, is not reflected in the amount of heat that is removed.
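As a consistency check, Eq. (8.3.6) can be integrated along the heating path from T = 0 to the vapor. The short derivation below is a sketch under two assumptions taken from the discussion of ice: the fast entropy vanishes at zero temperature, S(0) = 0, and no frozen information remains in the vapor, C(T) = 0. The integration variable T' is introduced here only to distinguish it from the upper limit.

```latex
\begin{align}
\int_0^T \frac{q}{T'} &= \int_0^T d(S + C)
   = \bigl[S(T) + C(T)\bigr] - \bigl[S(0) + C(0)\bigr] \\
 &= S(T) - C(0)
   \qquad \text{using } S(0) = 0,\; C(T) = 0 \\
\Longrightarrow\quad S(T) &= C(T{=}0) + \int_0^T \frac{q}{T'}
\end{align}
```

The last line has the form of Eq. (8.3.5): the entropy of the vapor exceeds the integrated heat by exactly the information that was frozen into the solid at low temperature.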
We are now in a position to give a first definition of complexity. In order to describe a system and its behavior over time, we must describe the ensemble it is in. This information is given by C/k ln(2). If we insist on describing the microstate of the system, we must add the information contained in the fast degrees of freedom S/k ln(2). The question is whether we should insist on describing the microstate. Typically, the whole point of describing an ensemble is that we don't need to specify the particular microstate. We will return to address this question in greater detail later. However, for now it is reasonable to consider describing the system to be specifying just the ensemble. This implies that the information in the frozen variables C/k ln(2) is the complexity. For a thermodynamic system in the microcanonical ensemble, the complexity would be given by the (small) number of bits in the specification of the three variables (U,N,V) and the number of bits necessary to specify the type of element (atom, molecule) that is present. The actual amount of information seems not to be precisely defined. For example, we have not identified the number of bits to be used in specifying (U,N,V). As we have seen in the discussion of algorithmic complexity, this is to be expected, since the conventions of how the information is specified are crucial when there is only a small amount.
Figure 8.3.1 Schematic illustration of the effect on motion in phase space of cooling through a glass transition. Above the glass transition (T1, T2 and T3) the system is ergodic: it explores the entire phase space. Cooling the system causes the phase space to shrink smoothly. The entropy, the logarithm of the volume of phase space, decreases. Below the glass transition, T4, the system is no longer ergodic and the phase space breaks up into pieces. A particular system explores only one of the pieces. The total amount of information necessary to specify a particular microstate (e.g., indicated by the *) is the sum of C/k ln(2), the information necessary to specify which piece, and S/k ln(2), the information necessary to specify the particular state within the piece.
We have learned from this discussion that for a nonergodic system, the complexity (the frozen ensemble information) is bounded by the sum of the information in the fast and static degrees of freedom (C + S > C). For material systems, we know in principle how to measure this. As in the case of ice, we heat up the system to the vapor phase where the entropy can be calculated, then subtract the entropy added during the heating process. This gives us the value of C + S at the temperature from which the heating began. If we know that C >> S, then the result is the complexity itself. In order for this technique to work at all, the complexity must be large enough so that experimental accuracy can enable its measurement. Estimates we will give later imply that complexities of biological organisms are too small to be measured in this way.
The concept of frozen degrees of freedom immediately raises the question of the time scale in which the experiment is performed. Degrees of freedom that are frozen on one time scale are not frozen on sufficiently longer ones. If our time scale of observation were arbitrarily long, we would always describe systems in equilibrium. The entropy would then be large and the complexity would be negligible. On the other hand, if our time scale of observation were extremely short so that microscopic motions were detected, then our complexity would be large and the entropy would be negligible. This motivates the introduction of the complexity profile in Section 8.3.3.
Question 8.3.1 Calculate the information necessary to specify the microstate of a mole of an ideal gas at T = 0°C and P = 1 atm. Use the mass of a helium or neon atom for the mass of the ideal gas particle. This requires a careful investigation of units. A table of fundamental physical constants is given in Table 8.3.1.
Solution 8.3.1 The entropy of an ideal gas is found in Section 1.3 to be:
$$S = kN\left[\ln\!\left(\frac{V}{N\lambda(T)^3}\right) + \frac{5}{2}\right] \qquad (8.3.7)$$
$$\lambda(T) = \left(\frac{h^2}{2\pi mkT}\right)^{1/2} \qquad (8.3.8)$$
The information content of a microstate is given by Eq. (8.3.2).
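Each of the quantities must be evaluated numerically from appropriate tables. A mole of particles is
$$N_0 = 6.0221367 \times 10^{23}\ \text{/mole} \qquad (8.3.9)$$
At the temperature
$$T_0 = 0\,^{\circ}\text{C} = 273.15\ \text{K} \qquad (8.3.10)$$
$$kT_0 = 0.0235384\ \text{eV} \qquad (8.3.11)$$
and pressure
$$P_0 = 1\ \text{atm} = 1.01325 \times 10^{5}\ \text{Pascal} = 1.01325 \times 10^{5}\ \text{Newton/m}^2 \qquad (8.3.12)$$
the volume (of a mole of particles) of an ideal gas is:
$$V = N_0 kT_0 / P_0 = 22.41410 \times 10^{-3}\ \text{m}^3/\text{mole} \qquad (8.3.13)$$
the volume per particle is:
$$V/N = 37219.5\ \text{Å}^3 \qquad (8.3.14)$$
At the same temperature we have:
$$\lambda(T) = (2\pi mkT/h^2)^{-1/2} = m[\text{AMU}]^{-1/2} \times 1.05633\ \text{Å} \qquad (8.3.15)$$
The constant term in the entropy per particle, expressed in bits, is 18.5533, and the mass term must be converted to bits in the same way. This gives the total information for a mole of helium gas at these conditions of
$$I = N_0\left(18.5533 + \tfrac{3}{2}\log_2(m[\text{AMU}])\right) = 1.30 \times 10^{25}\ \text{bits} \qquad (8.3.16)$$
Note that the amount of information per particle is only about 20 bits.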
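The numerical evaluation can be reproduced in a few lines, which is a convenient way to keep the units straight. The sketch below works entirely in SI units with the constants of Table 8.3.1; the function name `bits_per_particle` is an assumption of the sketch, and the conversion to bits is taken to be division of S/k by ln(2). With every term converted to bits, including the mass term, the helium total comes out near 1.30 × 10^25.

```python
from math import log, pi

# Constants as listed in Table 8.3.1 (SI units).
K_B = 1.380658e-23      # Boltzmann constant, Joule/K
H = 6.6260755e-34       # Planck's constant, Joule second
N_0 = 6.0221367e23      # particles per mole
AMU = 1.6605402e-27     # kilogram

def bits_per_particle(mass_amu, temperature=273.15, pressure=1.01325e5):
    """Information per particle, S/(N k ln 2), for an ideal gas using Eq. (8.3.7)."""
    volume_per_particle = K_B * temperature / pressure            # V/N in m^3
    mass = mass_amu * AMU                                         # kilogram
    wavelength = H / (2 * pi * mass * K_B * temperature) ** 0.5   # lambda(T), Eq. (8.3.8)
    entropy_over_kn = log(volume_per_particle / wavelength**3) + 2.5
    return entropy_over_kn / log(2)

for name, mass_amu in [("helium", 4.0026), ("neon", 20.179)]:
    per_particle = bits_per_particle(mass_amu)
    print(f"{name}: {per_particle:.2f} bits/particle, {per_particle * N_0:.3e} bits/mole")
```

Setting `mass_amu=1.0` isolates the mass-independent constant, which makes it easy to compare against the constant term in Eq. (8.3.16).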
Table 8.3.1 Fundamental constants
  hc              = 12398.4 eV Å
  k               = 1.380658 × 10^-23 Joule/K
  R = kN_0        = 8.3144 Joule/K/mole
  c               = 2.99792458 × 10^8 Meter/second
  h               = 6.6260755 × 10^-34 Joule second
  e               = 1.60217733 × 10^-19 Coulomb
  ProtonMass      = 1.6726231 × 10^-27 kilogram
  1 AMU           = 1.6605402 × 10^-27 kilogram = 9.31494 × 10^8 eV
  M[Helium]       = 4.0026 AMU
  M[Neon]         = 20.179 AMU
  M[Helium] c^2   = 3.7284 × 10^9 eV
  M[Neon] c^2     = 1.87966 × 10^10 eV
8.3.2 Algorithmic complexity of physical systems
The complexity of a system is designed to measure the amount of information necessary to describe it, or its behavior. In this section we address the key word "necessary." This word suggests that we are after the minimum amount of information. The minimum amount of information depends on our capabilities of inference from a smaller amount of information. As discussed in Section 8.2.2, logical inference and computation lead to the definition of algorithmic complexity. However, for an ensemble that can be described simply, the algorithmic complexity is no different from the Shannon information.
Since we have established a connection between the complexity of physical systems and representations in terms of character strings, we can apply these results directly to physical systems. A physical system in equilibrium is represented by an ensemble. At any particular time, it is in a single microstate. The specification of this microstate can be compressed by encoding in certain rare cases. However, on average the compression cannot lead to an amount of information significantly different from the entropy (divided by k ln(2)) of the system. This conclusion follows because the microcanonical (or canonical) ensemble can be concisely described. For a nonergodic system like a glass, the microstate description has been separated into two parts. It is no longer true that the ensemble of dynamically accessible states of a particular system is concisely describable. The information in the frozen degrees of freedom is precisely the information necessary to specify the ensemble of dynamically accessible states. The total information, (C + S)/k ln(2), represents the selection of a microstate from a simple ensemble (microcanonical or canonical). Since the total information cannot be compressed, neither can either of the two parts of the information: the frozen degrees of freedom that we have identified with the complexity, or the additional information necessary to specify a particular microstate. Thus the algorithmic complexity is the same as the information for either part.
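The statement that a typical member of a simple ensemble cannot be compressed, while rare ordered members can, is easy to observe with a practical compressor. The sketch below is only suggestive: zlib is a stand-in for the ideal shortest description (it is far from a universal compressor), and the "ensemble" is assumed to be byte strings with every byte drawn independently and uniformly.

```python
import random
import zlib

def compressed_fraction(data):
    """Length of the zlib-compressed data relative to the original length."""
    return len(zlib.compress(data, 9)) / len(data)

rng = random.Random(0)                      # fixed seed so the run is repeatable
n_bytes = 100_000

# Typical member of the ensemble: every byte drawn independently and uniformly.
typical = bytes(rng.getrandbits(8) for _ in range(n_bytes))
# A rare, highly ordered member of the same ensemble: all bytes zero.
ordered = bytes(n_bytes)

print(f"typical string: {compressed_fraction(typical):.3f} of original size")
print(f"ordered string: {compressed_fraction(ordered):.5f} of original size")
```

The typical string does not shrink at all, while the ordered one collapses to a tiny fraction of its length. Since ordered strings are an exponentially small part of the ensemble, the average description length stays at the Shannon information.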
We can now, finally, explain the experimental observation that an adiabatic process does not change the entropy of a system (Section 1.3). The algorithmic description of an adiabatic process requires only a few pieces of information, e.g., the size of a force applied over a specified distance. If a new microstate of the system can be described by the original microstate plus the process of adiabatic change, then the amount of information in the microstate has not been changed, and the adiabatic process does not change the microstate algorithmic complexity, which is the entropy of the system. Like other aspects of statistical mechanics (Section 1.3), this should not be understood as a proof but rather as an explanation of the relationship of the thermodynamic observation to the microscopic properties. Using this explanation, we can identify the nature of an adiabatic process as one that is described microscopically by a small amount of information.
This becomes clearer when we compare adiabatic and irreversible processes. Our argument that an adiabatic process does not change the entropy is based on considering the information necessary to describe an adiabatic process, such as slowly moving a piston to expand the space available to a gas. An irreversible process could achieve a similar expansion, but would not be thermodynamically the same. Take, for example, the removal of a partition that separates the gas from a second, initially empty, chamber. The irreversible process of expansion of the gas results in a final state which has a higher entropy (see Question 1.3.4). The removal of a partition in itself does not appear to require a lot of information to describe. One moment after the partition is removed, the entropy of the system is the same as before. To understand how the entropy increases, we must consider the nature of irreversible dynamics.
A key ingredient in our understanding of physical systems is that the time evolution of an isolated system can be obtained from the simple laws of mechanics (classical or quantum). This means that the dynamics of an isolated system conserves the amount of information as well as the energy. Such dynamics are called conservative. If we consider an ensemble of systems starting in a particular region of phase space, the phase space position evolves in time, but the volume of the phase space that is occupied (the entropy) does not change. This conservation of phase space can be understood from our discussion of algorithmic complexity: since the deterministic dynamics of a system can be computed, the algorithmic complexity of the system is conserved. Where does the additional entropy come from for the final equilibrium state after the expansion?
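Conservation of the occupied phase space volume under deterministic, invertible dynamics can be seen in a discretized toy model. The sketch below is an assumption-laden illustration, not a model taken from the text: phase space is a 64 × 64 grid of (x, p) cells, and the hypothetical `step` function is a simple "kick then drift" map that is invertible on the grid.

```python
def step(state, size):
    """One time step of a deterministic, invertible map on a size x size phase space grid."""
    x, p = state
    p = (p + x) % size      # "kick": the momentum changes according to the position
    x = (x + p) % size      # "drift": the position changes according to the new momentum
    return x, p

size = 64
# Ensemble: all states in a small square region of the discretized phase space.
ensemble = {(x, p) for x in range(8) for p in range(8)}
initial_count = len(ensemble)

for _ in range(50):
    ensemble = {step(state, size) for state in ensemble}

print(initial_count, len(ensemble))
```

Because each step can be undone, two distinct states never merge, so the number of occupied cells stays fixed however far the ensemble is stretched and folded across the grid. The logarithm of that count, the entropy of the ensemble, is therefore unchanged.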
There are two parts to the process of proceeding to a true equilibrium state. In the first part the distinction between the nonequilibrium and equilibrium state is obscured. At first there is macroscopically observable information: the particles are in one half of the chamber. This information is converted to microscopic correlations between atomic positions and momenta. The conversion occurs when the gas expands to fill the chamber, and various currents that follow this expansion become smaller and smaller in extent. The microscopic correlations cannot be observed on a macroscopic scale, and for standard observations the system is indistinguishable from an equilibrium state. The transfer of information from macroscopic to microscopic scale is related to issues of chaos in the dynamics of physical systems, which will be discussed later.
The second part to the process is an actual increase in the entropy of the system. The additional entropy must come from outside the system. In macroscopic physical processes, we are not generally concerned with isolating the system from information transfer, only with isolating the system from energy transfer. Thus we can surmise that the expansion of the gas is followed by an information transfer that enables the entropy to increase to its equilibrium value without changing the energy of the system. Many of the issues related to describing this nonequilibrium process will not be addressed here. We will, however, begin to address the topic of the scale of observation at which correlations appear using the complexity profile in the following section.
8.3.3 Complexity profile
General approach In this section we discuss the relationship of microscopic and macroscopic complexity. Our objective is to develop a consistent language for discussing complexity as a function of length scale. In the following section we will discuss the complexity as a function of time scale, which generalizes the discussion of frozen and fast degrees of freedom in Section 8.3.1.
When we describe a system, we are not generally interested in a microscopic description of the positions and velocities of all of the particles. For a thermodynamic system there are only a few macroscopic parameters that we use to describe the system. This is indeed the reason we use entropy as a summary of the many hidden parameters of the system that we are not interested in. The microscopic parameters change too fast and over too small distances to matter for our macroscopic measurements and experience. The same is true more generally about systems that are not in equilibrium: a macroscopic description does not require specifying the position of each atom. This implies that we must develop an understanding of complexity that is not tied to the microscopic description, but is relevant to observations at a particular length and time scale.
This point lies at the root of a conceptual problem in thinking about the complexity of systems. A gas in equilibrium has a large entropy which is its microscopic complexity. This is counter to our understanding of complex systems. Systems in equilibrium are intuitively simpler than nonequilibrium systems such as a human being. In Section 8.3.1 we started to address this problem by identifying the complexity of a nonergodic system as the information necessary to specify the frozen degrees of freedom. We now discuss a more systematic approach to dealing with macroscopic observations.
In order to consider the macroscopic complexity, we have to define what we mean by macroscopic in a formal sense. The concept of macroscopic must be understood in relation to a particular observer. While we often consider experimental results to be independent of the observer, there are various ways in which the observer is essential to the observation. In this context, in which we are concerned with the meaning of macroscopic, considering the observer is essential.
How do we characterize the difference between a microscopic and a macroscopic observer? The most crucial difference is that a microscopic observer is able to distinguish between all inherently distinguishable states of the system, while a macroscopic observer cannot. For a macroscopic observer, many microscopically distinct states appear the same. This is related to our understanding of complexity, because the macroscopic observer need only specify which of the macroscopically distinct states the system is in. The microscopic observer must specify which of the microscopically distinct states the system is in. Thus the macroscopic complexity must always be smaller than the microscopic complexity of a system. Instead of considering a unique macroscopic observer, we will consider a sequence of observers with a progressively poorer ability to distinguish microstates. Using these observers, we will define the complexity profile.
Ideal gas These ideas can be directly applied to the ideal gas. We generally think about a macroscopic observer as having an inability to distinguish fine-scale distances. Thus we expect that the usual uncertainty in particle position ∆x will increase for a macroscopic observer. However, we learn from quantum mechanics that a unique microstate of the system is defined using an uncertainty in both position and momentum, ∆x∆p = h. Thus for the macroscopic observer to confuse distinct microstates, the product ∆x∆p must be larger than its minimum value: an observation of the system provides measurements of the position and momentum of each particle, whose uncertainty has a product greater than h. We can label our observers by this uncertainty, which we call h̃.
If we retrace our steps to the calculation of the entropy of an ideal gas (Question 1.3.2), we can recognize that essentially the same calculation applies to the complexity with the uncertainty h̃. An observer with the uncertainty h̃ will determine the complexity of the ideal gas according to Eq. (8.3.7) and Eq. (8.3.8), with h replaced by h̃. Thus we define the complexity profile for the ideal gas in equilibrium as:
$$C(\tilde{h}) = S - 3kN\ln(\tilde{h}/h), \qquad \tilde{h} > h \qquad (8.3.17)$$
This equation describes a complexity that decreases as the ability of the observer to distinguish states decreases. This is as we expected. Despite the weak logarithmic dependence on h̃, C(h̃) decreases rapidly because the coefficient of the logarithm is so large. By the time h̃ is a few hundred times h (about 150 times for helium and 330 times for neon) the complexity profile has become negative for the ideal gases described in Question 8.3.1.
What does a negative complexity mean? It actually means that we have not done the calculation quite right. The counting of states we did for the ideal gas assumed that the particles were well separated from each other. If they begin to overlap then we must count the possible states differently. This overlap is significant precisely when Eq. (8.3.17) becomes negative. If the particles really overlapped then quantum statistics would become important; the gas is said to be degenerate and satisfies either Fermi-Dirac or Bose-Einstein statistics. In our case the overlap arises only because the observer cannot distinguish different particle positions. In this case, the counting of states is appropriate to a classical ideal gas, as we now explain.
To calculate the complexity as a function of h̃ for an equilibrium state whose entropy is S, we start by calculating the number of microstates that the observer cannot distinguish. The logarithm of this number of microstates, which we call S(h̃)/k ln(2), is the amount of information necessary to specify a microstate, if the macrostate is known. Thus we have that:
$$C(\tilde{h}) = S - S(\tilde{h}) \qquad (8.3.18)$$
To count the number of microstates that the observer cannot distinguish, we note that the possible microstates of a particular particle are grouped together by the observer into bins (regions or cells of position and momentum) of size (∆x∆p)^d = h̃^d, where d = 3 is the dimensionality of space. The observer determines only that a particle is within a certain region. In the classical ideal gas each particle moves independently, so more than one particle may occupy the same microstate. However, this is unlikely. As h̃ increases it becomes increasingly likely that there is more than one particle in a region. If the number of particles in a certain region is n_i, then the number of distinct microstates of the bin that the observer does not distinguish is:
$$\frac{g^{n_i}}{n_i!} \qquad (8.3.19)$$
where g = (h̃/h)^d is the number of microstates within a region. This is the product of the number of states each particle may be in, corrected for particle indistinguishability. The number of microstates of the whole system that appear to the observer to be the same is the product of such terms for each region:
(8.3.20)
From this we can deter mine the complexity of the state deter mined by the obser ver
as:
(8.3.21)
If we consider this expression when g · 1—a microscopic obser ver—then n
i
is almost
always either zero or one and each term in the product is one (a more exact treatment
requires treating the statistics of a degenerate gas). Then C (
˜
h) is S, which means that
the microstate complexity is just the entr opy. For g > 1 but not t oo large, n
i
will still
be either zer o or one, and we r ecover Eq. (8.3.17). On the other hand, using this ex
pression it is possible to show that for a large value of g, when the values of n
i
are sig
nificantly larger than one, the complexity goes to zero.
We can understand this by r ecognizing that as g increases, the number of par t i
cles in each bin increases and becomes closer to the average number of par ticles in a
bin according to the macroscopic probability distribution. This is the equilibrium
macrostate. By our conventions we are measuring the amount of infor mation neces
sary f or the observer to specify its observation in relation to the equilibrium state.
Therefor e, when the average number of particles in a bin becomes close enough to this
distr ibut ion,t here is no infor mation that must be given. To write this explicitly, when
n
i
is much larger than one we apply Ster ling’s approximation to the factorial in
Eq. (8.3.21) to obtain:
(8.3.22)
where P
i
· n
i
/g is the probability a part icle is in a part icular state according to t h e ob
s er ver. It is shown in Quest i on 8.3.2 t hat C (
˜
h) is zero wh en P
i
is t he equ i l i br iu m
prob a bi l i ty for finding a part i cle in r egi on i ( n o te t hat i stands for both po s i ti on and
m om en tum (x, p) ) .
There are additional smaller terms in Sterling’s ap proximation to the factor ial
that we have neglected. These t erms are gener ally igno red in calculations of the en
t ropy because they are not propor tional to the number of par t icles. They are, how
ever, relevant to calculations of the complexit y:
(8.3.23)
The additional t erms are r elated to fluct uations in the d ensit y. This will become ap
parent when we analyze nonunifor m systems below.
We will discuss additional examples of the complexity profile below. First we sim
plify the complexity profile for obser vers that measure only the positions and not the
momenta of par ticles.
C(
˜
h ) ·S −k
i
∑
n
i
ln( g / n
i
) +1
( )
+k
i
∑
ln( 2 n
i
)
C(
˜
h ) ·S −k
i
∑
n
i
ln( g / n
i
) +1
( )
·S +k g
i
∑
P
i
ln(P
i
) −kN
C(
˜
h ) ·S −S(
˜
h ) ·S −k ln(
g
n
i
n
i
!
i
∏
)
g
n
i
n
i
!
i
∏
728 H u m a n Ci v i l i z a t i o n I
# 29412 Cust: AddisonWesley Au: BarYam Pg. No. 728
Title: Dynamics Complex Systems
Shor t / Normal / Long
BarYamChap8.pdf 3/10/02 10:52 AM Page 728
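The size of the neglected Stirling terms in Eq. (8.3.23) can be checked numerically. The
following Python sketch is an illustration, not part of the original derivation: the function
names and the sample values are assumptions chosen for the example. It compares the exact
per-bin quantity ln(g^n / n!) from Eq. (8.3.21) with the leading form n(ln(g/n) + 1) used in
Eq. (8.3.22), and with the form that also subtracts ln(sqrt(2 pi n)) as in Eq. (8.3.23).

```python
import math

def log_bin_count_exact(n, g):
    """Exact ln(g**n / n!) for one bin, using lgamma(n + 1) = ln(n!)."""
    return n * math.log(g) - math.lgamma(n + 1)

def log_bin_count_stirling(n, g, with_correction=False):
    """Stirling form n*(ln(g/n) + 1); optionally subtract ln(sqrt(2*pi*n))."""
    value = n * (math.log(g / n) + 1.0)
    if with_correction:
        value -= 0.5 * math.log(2.0 * math.pi * n)
    return value

if __name__ == "__main__":
    n, g = 1000, 5000  # hypothetical occupation number and states per bin
    print("exact               :", log_bin_count_exact(n, g))
    print("leading Stirling    :", log_bin_count_stirling(n, g))
    print("with sqrt correction:", log_bin_count_stirling(n, g, True))
```

For these sample values the leading form misses the exact result by a few units, which is
negligible next to a term proportional to n but not next to the complexity itself, while the
corrected form agrees with the exact value much more closely.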
Question 8.3.2  Show that Eq. (8.3.22) is zero when $P_i$ is the equilibrium
probability of locating a particle in a particular state identified by momentum
$p$ and position $x$. For simplicity assume that all $g$ states in the cell
have essentially the same position and momentum.
Solution 8.3.2  We calculate an expression for $P_i \to P(x,p)$ using the
Boltzmann probability for a single particle (since all are independent):
$$P(x,p) = N Z^{-1} e^{-p^2/2mkT} \tag{8.3.24}$$
where $Z$ is the one-particle partition function given by:
$$Z = \sum_{x,p} e^{-p^2/2mkT} = \int \frac{d^3x\, d^3p}{h^3}\, e^{-p^2/2mkT} = \frac{V}{\lambda^3} \tag{8.3.25}$$
We evaluate the expression:
$$-k g \sum_i P(x,p) \ln(P(x,p)) + kN \tag{8.3.26}$$
which, by Eq. (8.3.22), we want to show is the same as the entropy. Since all
$g$ states in cell $i$ have essentially the same position and momentum, this is
equal to:
$$-k \sum_{x,p} P(x,p) \ln(P(x,p)) + kN = kN + k \sum_{x,p} \Big(\ln(V/N\lambda^3) + p^2/2mkT\Big) \frac{N\lambda^3}{V}\, e^{-p^2/2mkT} \tag{8.3.27}$$
which is most readily evaluated by recognizing it as:
$$kN + kN Z^{-1} \Big(\ln(V/N\lambda^3) - \beta \frac{d}{d\beta}\Big) Z = kN \Big(\ln(V/N\lambda^3) + 5/2\Big) \tag{8.3.28}$$
(here $\beta = 1/kT$), which is $S$ as given in Eq. (8.3.7).
Position without momentum  The use of the scale parameter $\Delta x \Delta p$ in the above
discussion should trouble us, because we do not generally consider the momentum
uncertainty on the macroscopic scale. The resolution of this problem arises because
we have assumed that the system has a known energy or temperature. If we know the
temperature then we know the thermal velocity or momentum:
$$\Delta p \approx \sqrt{mkT} \tag{8.3.29}$$
It does not make sense to have a momentum uncertainty of a particle that is much
greater than this. Using $\Delta x \Delta p = h$ this means there is also a natural uncertainty in
position, which is the thermal wavelength given by Eq. (8.3.8). This is the maximal
quantum position uncertainty, unless the observer can distinguish the thermal
motion of individual particles. We can now think about a sequence of observers who do
not distinguish the momentum of particles (they have a larger uncertainty than the
thermal momentum) but have increasing uncertainty in position given by $L = \Delta x$, or
$g = (L/\lambda)^d$. For such observers the equilibrium momentum probability distribution
is to be assumed. In this case the number of particles in a cell $n_i$ contributes a term
to the entropy that is equal to the entropy of a gas with this many particles in the
volume $L^d$. This gives a total entropy of:
$$S(L) = k \sum_i n_i \Big(\ln(L^d / n_i \lambda^3) + 5/2\Big) \tag{8.3.30}$$
and the complexity is:
$$C(L) = S - k \sum_i n_i \Big(\ln(g/n_i) + 5/2\Big) \tag{8.3.31}$$
which differs in form from Eq. (8.3.22) only in the constant.
While we generally do not think about measuring momentum, we do measure
velocity. This follows from the content of the previous paragraph. We consider
observers that measure particle positions at different times and from this they may infer
the velocity and indirectly the momentum. Since the observer measures $n_i$, the
determination of velocity depends on the observer's ability to distinguish moving spatial
density variations. Thus we consider the measurement of $n(x,t)$, where $x$ has
macroscopic meaning as a granular coordinate that has discrete values separated by $L$. We
emphasize, however, that this description of a space- and time-dependent density
assumes that the local momentum distribution of the system is consistent with an
equilibrium ensemble. The more fundamental description is given by the distribution of
particle positions and momenta, $n_i = n(x,p)$. Thus, for example, we can also describe
a rotating disk that has no macroscopic changes in density over time, but the rotation
is still macroscopic. We can also describe fluid flow in an incompressible fluid. In this
section we continue to restrict ourselves to the description of observations at a
particular time. The time dependence of observations will be considered in Section 8.3.5.
Thus far we have considered systems that are in generic states selected from the
equilibrium ensemble. Equilibrium systems are uniform on all but very microscopic
scales, unless we are exactly at a phase transition. Thus, most of the complexity
disappears on a scale that is far smaller than typical macroscopic observations. This is
not necessarily true about nonequilibrium systems. Systems that are in states that are
far from equilibrium can have nonuniform densities of particles. A macroscopic
observer will see these macroscopic variations. We will consider a couple of different
examples of nonequilibrium states to illustrate some properties of the complexity
profile. Before we do this we need to consider the effect of algorithmic compression on
the complexity profile.
Algorithmic complexity and error  To discuss macroscopic complexity more
completely, we turn to algorithmic complexity as a function of scale. The complexity of a
system, particularly a nonequilibrium system, should be defined in terms of the
algorithmic complexity of its description. This means that patterns that are present in the
positions (or momenta) of its particles can be used to simplify the description.
Using this discussion we can reformulate our understanding of the complexity
profile. We defined the profile using observers with progressively poorer ability to
distinguish microstates. The fraction of the ensemble occupied by these states defined
the complexity. Using an algorithmic perspective we say, equivalently, that the
observer cannot distinguish the true state from a state that has a smaller algorithmic
complexity. An observer with a value of $g = 2$ cannot distinguish which of two states
each particle occupies in the real microstate. Let us label the single-particle states
using an index that enumerates them. We can then imagine a checkerboard (in six
dimensions of position and momentum) where odd-indexed states are black and even
ones are white. The observer cannot tell if a particle is in a black or a white state. Thus,
no matter what the real state is, there is a simpler state where only odd (or only even)
indexed states of the particles are occupied, which cannot be distinguished from the
real system by the observer. The algorithmic complexity of this state with particles in
odd-indexed states is essentially the complexity that we determined above, $C(g = 2)$:
it is the information necessary to specify this state out of all the states that have
particles only in odd-indexed states. Thus, in every case, we can specify the complexity of
the system for the observer as the complexity of the simplest state that is consistent
with the observations; by Occam's razor, this is the state that the observer will use to
describe the system.
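The counting behind the checkerboard argument can be made concrete. The sketch below is
an illustration under stated assumptions, not the author's calculation: it assumes a dilute
classical gas of N particles among M single-particle states, counted as M^N / N! microstates,
and it assumes every bin of g states holds at most one particle. Under those assumptions
Eq. (8.3.21) gives C(g)/k = S/k - N ln(g), and for g = 2 this equals the logarithm of the
number of microstates that use only the M/2 odd-indexed states. The function names are
hypothetical.

```python
import math

def log_states(num_states, num_particles):
    """ln(M**N / N!): assumed dilute-gas count of microstates."""
    return num_particles * math.log(num_states) - math.lgamma(num_particles + 1)

def complexity_over_k(num_states, num_particles, g):
    """C(g)/k from Eq. (8.3.21) when every bin holds at most one particle.

    Each occupied bin contributes ln(g**1 / 1!) = ln(g), so the subtracted
    term is N * ln(g).
    """
    return log_states(num_states, num_particles) - num_particles * math.log(g)

if __name__ == "__main__":
    M, N = 10**6, 100  # hypothetical numbers of states and particles
    print("C(g=2)/k             :", complexity_over_k(M, N, 2))
    print("ln(# odd-only states):", log_states(M // 2, N))
```

The two printed numbers coincide, which is the statement that the observer at g = 2 needs
only enough information to pick one state out of those with odd-indexed occupation.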
We note that this is also equivalent to defining the complexity profile as the length
of the description as the error allowed in the description increases. The total error as
a function of $g$ for the ideal gas is
$$\frac{1}{2} \log\Big(\prod_i \Delta x_i \Delta p_i / h\Big) = \frac{1}{2} N \log(g) \tag{8.3.32}$$
where $N$ is the number of particles in the system. The factor of 1/2 arises because the
average error is half of the maximum error that could occur. This approach is helpful
since it suggests how to generalize the complexity profile for systems that have
different types of particles. We can define the complexity profile as a function of the
number of errors that are made. This is better than using a particular length scale, which
implies a different error for particles of different mass as indicated by Eq. (8.3.8). For
conceptual simplicity, we will continue to write the complexity profile as a function
of $g$ or of length scale.
Nonequilibrium states  Our next objective is to consider nonequilibrium states.
When we have a nonequilibrium state, the microstate of the system is simpler than an
equilibrium state to begin with. As we mentioned at the end of Section 8.3.2, there are
nonequilibrium states that cannot be distinguished from equilibrium states on a
macroscopic scale. These nonequilibrium states have microscopic correlations. Thus,
the microscopic complexity is lower than the equilibrium entropy, while the
macroscopic complexity is the same as in equilibrium:
$$\begin{aligned} C(g) &< C_0(g) = S_0, & g &= 1 \\ C(g) &= C_0(g), & g &\gg 1 \end{aligned} \tag{8.3.33}$$
where we use the subscript 0 to indicate quantities of the equilibrium state. We
illustrate this by an example. Using the indexing of single-particle states we just
introduced, we take a microstate where all particles are in odd-indexed states. The
microstate complexity is the same as that of an equilibrium state at $g = 2$, which is less
than the entropy of the equilibrium system:
$$C(g = 1) = C_0(g = 2) < C_0(g = 1)$$
However, the complexity of this system for scales of observation $g \ge 2$ is the same as
that of an equilibrium system: macroscopic observers do not distinguish them.
This scenario, where the complexity of a nonequilibrium state starts smaller but
then quickly becomes equal to the equilibrium state complexity, does not always hold.
It is true that the microscopic complexity must be less than or equal to the entropy of
an equilibrium system, and that all systems have the same complexity when $L$ is the
size of the system. However, what we will show is that the complexity of a
nonequilibrium system can be higher than that of the equilibrium system at large scales that
are smaller than the size of the system. This is apparent in the case, for example, of a
nonuniform density at large scales.
To illustrate what happens for such a nonequilibrium state, we consider a system
that has nonuniformity that is characteristic of a particular length scale $L_0$, which is
significantly larger than the microscopic scale but smaller than the size of the
system. This means that $n_i$ is smooth on finer scales, and there is no particular
relationship between what is going on in one region of length scale $L_0$ and another. The
values of $n_i$ will be taken from a Gaussian distribution around the equilibrium value $n_0$
with a standard deviation of $\sigma$. We assume that $\sigma$ is larger than the natural density
fluctuations, which have a standard deviation of $\sigma_0 = \sqrt{n_0}$. For convenience we also
assume that $\sigma$ is much smaller than $n_0$.
We can calculate both the complexity $C(L)$ and the apparent entropy $S(L)$ for
this system. We start by calculating them at the scale $L_0$. $C(L_0)$ is the amount of
information necessary to specify the density values. This is the product of the number
of cells $V/L_0^d$ times the information in a number selected from a Gaussian distribution
of width $\sigma$. From Question 8.3.3 this is:
$$C(L_0) = k \frac{V}{L_0^d} \Big(\frac{1}{2}(1 + \ln(2\pi)) + \ln \sigma\Big) \tag{8.3.34}$$
The number of microstates consistent with this macrostate at $L_0$ is given by the sum
of ideal gas entropies in each region:
$$S(L_0) = -k \sum_i n_i \ln(n_i/g) + (5/2) kN \tag{8.3.35}$$
Since $\sigma$ is less than $n_0$, this can be evaluated by expanding to second order in
$\delta n_i = n_i - n_0$:
$$S(L_0) = S_0 - k \sum_i \frac{(\delta n_i)^2}{2 n_0} = S_0 - \frac{k V \sigma^2}{2 L_0^d n_0} \tag{8.3.36}$$
where $S_0$ is the entropy of the equilibrium system, and we used $\langle \delta n_i^2 \rangle = \sigma^2$. We note
that when $\sigma = \sigma_0$ the logarithmic terms in the complexity reduce to the extra terms
found in Eq. (8.3.23). Thus, these terms are the information needed to describe the
equilibrium fluctuations in the density.
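The per-cell term in Eq. (8.3.34), the information in a Gaussian-distributed number, can be
checked against a direct sum over integer values. This Python sketch is illustrative only: the
function names, the cutoff of 12 standard deviations and the sample width are assumptions
made for the example. It computes the Shannon information (in nats) of integers weighted by
a Gaussian and compares it with the closed form (1/2)(1 + ln(2 pi)) + ln(sigma).

```python
import math

def gaussian_information_nats(sigma):
    """Closed form used per cell in Eq. (8.3.34)."""
    return 0.5 * (1.0 + math.log(2.0 * math.pi)) + math.log(sigma)

def discrete_gaussian_information_nats(sigma, cutoff=12.0):
    """-sum p ln p over integers x weighted by exp(-x**2 / (2 sigma**2))."""
    half_width = int(math.ceil(cutoff * sigma))
    weights = [math.exp(-x * x / (2.0 * sigma * sigma))
               for x in range(-half_width, half_width + 1)]
    total = sum(weights)
    return -sum((w / total) * math.log(w / total) for w in weights if w > 0.0)

if __name__ == "__main__":
    sigma = 20.0  # hypothetical width, in units of single particles
    print("closed form :", gaussian_information_nats(sigma))
    print("direct sum  :", discrete_gaussian_information_nats(sigma))
```

Doubling sigma raises both values by ln(2), one extra bit, which matches the reading of the
result as the information needed to specify an integer of typical magnitude sigma.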
We can understand the behavior of the complexity profile of this system. By
construction, the minimum amount of information needed to specify the microstate is
$C(\lambda) = S(L_0) + C(L_0)$. This is the sum over the entropy of equilibrium gases with
densities $n_i$ in volumes $L_0^d$, plus $C(L_0)$. Since $S(L_0)$ is linear in the number of particles, while
$C(L_0)$ is logarithmic in $\sigma$ and therefore logarithmic in the number of particles, we
conclude that $C(L_0)$ is much smaller than $S(L_0)$. For $L > \lambda$ the complexity profile $C(L)$
decreases like that of an equilibrium ideal gas. The term $S(L_0)$ is eliminated at a
microscopic length scale larger than $\lambda$ but much smaller than $L_0$. However, $C(L_0)$ remains.
Due to this term the complexity crosses that of an equilibrium gas to become larger.
For length scales up to $L_0$ the complexity is essentially constant and equal to Eq. (8.3.34).
Above $L_0$ it decreases to zero as $L$ continues to increase by virtue of the effect of
combining the different $n_i$ into fewer regions. Combining the regions results in a Gaussian
distribution with a standard deviation that decreases as the square root of the number
of terms, $\sigma \to \sigma (L_0/L)^{d/2}$. Thus, the complexity and entropy profiles for $L > L_0$ are:
$$\begin{aligned} C(L) &= k \frac{V}{L^d} \Big(\frac{1}{2}(1 + \ln(2\pi)) + \ln\big(\sigma (L_0/L)^{d/2}\big)\Big) \\ S(L) &= S_0 - \frac{k V \sigma^2}{2 (L L_0)^{d/2} n_0} \end{aligned} \tag{8.3.37}$$
This expression continues to be valid until there is only one region left, and the
complexity goes to zero. The precise way the complexity goes to zero is not described by
Eq. (8.3.37), since the Gaussian distribution does not apply in this limit.
There are several comments that we can make that are relevant to understanding
complexity profiles in general. First we see that in order for the macroscopic
complexity to be higher than that in equilibrium, the entropy at the same scale must be
reduced, $S(L) < S_0$. This is necessary because the sum $S(L) + C(L)$, the total
information necessary to specify a microstate, cannot be greater than $S_0$. However, we also
note that the reduction in $S(L)$ is much larger than the increase in $C(L)$. The ratio
between the two is given by:
$$\frac{\Delta S(L)}{\Delta C(L)} = -\frac{\sigma^2}{2 n_0} \frac{L^{d/2}}{L_0^{d/2}} \frac{1}{\ln(\sigma/\sigma_0)} \tag{8.3.38}$$
For $\sigma > \sigma_0 = \sqrt{n_0}$ this is greater than one in magnitude. We can understand this result
in two ways. First, a complex macroscopic system must be far from equilibrium, and therefore
must have a much smaller entropy than an equilibrium system. Second, a
macroscopic observer makes many errors in determining the microstate, and therefore if the
microstate is similar to an equilibrium state, the observer cannot distinguish the two
and the macroscopic properties must also be similar to an equilibrium state. For every
bit of information that distinguishes the macrostate, there must be many bits of
difference in the microstate.
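The profiles in Eq. (8.3.37) and the ratio in Eq. (8.3.38) are easy to evaluate for sample
numbers. The following Python sketch is a hedged illustration: the function names and the
parameter values are hypothetical, and the equilibrium reference for the complexity is
assumed to be the same expression evaluated at sigma_0 = sqrt(n_0). It computes the entropy
deficit (S_0 - S(L))/k, the complexity C(L)/k, and their ratio, and compares the ratio with the
closed form of Eq. (8.3.38).

```python
import math

def entropy_deficit_over_k(L, L0, V, sigma, n0, d=3):
    """(S0 - S(L))/k from Eq. (8.3.37), intended for L >= L0."""
    return V * sigma**2 / (2.0 * (L * L0) ** (d / 2.0) * n0)

def complexity_over_k(L, L0, V, sigma, d=3):
    """C(L)/k from Eq. (8.3.37), intended for L >= L0."""
    width = sigma * (L0 / L) ** (d / 2.0)
    return (V / L**d) * (0.5 * (1.0 + math.log(2.0 * math.pi)) + math.log(width))

def deficit_to_excess_ratio(L, L0, V, sigma, n0, d=3):
    """Delta S / Delta C with the equilibrium reference taken at sigma0 = sqrt(n0)."""
    sigma0 = math.sqrt(n0)
    excess = (complexity_over_k(L, L0, V, sigma, d)
              - complexity_over_k(L, L0, V, sigma0, d))
    return -entropy_deficit_over_k(L, L0, V, sigma, n0, d) / excess

if __name__ == "__main__":
    # Hypothetical values: 1000 cells of size L0, n0 = 1e6 particles per cell,
    # so sigma0 = 1e3, and an imposed nonuniformity sigma = 1e4 << n0.
    L0, V, n0, sigma = 1.0, 1000.0, 1.0e6, 1.0e4
    for L in (1.0, 2.0, 4.0):
        print(L, complexity_over_k(L, L0, V, sigma),
              entropy_deficit_over_k(L, L0, V, sigma, n0),
              deficit_to_excess_ratio(L, L0, V, sigma, n0))
```

With these numbers the entropy reduction exceeds the complexity increase by more than an
order of magnitude at every scale shown, in line with the remark that many bits of microstate
difference accompany each bit that distinguishes the macrostate.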
In calculating the complexity of the system at a particular scale, we assumed that
the observer was in error in obtaining the position and momentum of each particle.
However, we assumed that the number of particles within each bin was determined
exactly. Thus the complexity we calculated is the information necessary to specify the
number of particles precise to the single particle. This is why even the equilibrium density
fluctuations were described. An alternative, more reasonable, approach assumes that
particle counting is also subject to error. For simplicity we can assume that the error is
a fraction of the number of particles counted. For macroscopic systems this fraction is
much larger than the equilibrium fluctuations, which therefore need not be described.
This approach also modifies the form of the complexity profile of the nonuniform gas
in Eq. (8.3.37). The error in measurement increases as $n_0(L) \propto L^d$ with the scale of
observation. Letting $\mu n_0(L)$ be the error in a measurement of particle number, we write:
$$C(L) = k \frac{V}{L^d} \Big(\frac{1}{2}(1 + \ln(2\pi)) + \ln \frac{\sigma L_0^{d/2}}{\mu n_0(L) L^{d/2}}\Big) \approx k \frac{V}{L^d} \ln \frac{\sigma L_0^{3d/2}}{\mu n_0(L_0) L^{3d/2}} \tag{8.3.39}$$
The consequence of this modification is that the complexity decreases somewhat
more rapidly as the scale of observation increases. The expression for the entropy in
Eq. (8.3.37) is unchanged.
Question 8.3.3  What is the information in a number (character)
selected from a Gaussian distribution of standard deviation $\sigma$?
Solution 8.3.3  Starting from a Gaussian distribution (Eq. 1.2.39),
$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2} \tag{8.3.40}$$
we calculate the information (Eq. 8.2.2):
$$I = -\int dx\, P(x) \log(P(x)) = \int dx\, P(x) \Big(\log(\sqrt{2\pi}\,\sigma) + \frac{x^2}{2\sigma^2 \ln(2)}\Big) = \log(\sqrt{2\pi}\,\sigma) + \frac{1}{2\ln(2)} \tag{8.3.41}$$
where the logarithm is base 2, so that $k \ln(2) I$ reproduces the per-cell term of
Eq. (8.3.34), and the second term in the integral can be evaluated using $\langle x^2 \rangle = \sigma^2$.
We note that this result is to be interpreted as the information in a
discrete distribution of integral values of $x$, like a random walk, that in the limit
of large $\sigma$ gives a Gaussian distribution. The units that are used to measure
$\sigma$ define the precision to which the values of $x$ are to be described. It thus
makes sense that the information to specify an integer of typical magnitude
$\sigma$ is essentially $\log(\sigma)$.
8.3.4 Time dependence—chaos and the complexity profile
General approach  In describing a system, we are interested in macroscopic
observations over time, $n(x, t)$. As with the uncertainty in position, a macroscopic observer
is not able to distinguish the time of observation within less than a certain time
interval $T = \Delta t$. To define what this means, we say that the system is represented by an
ensemble with probability $P_{L,T}(n(x; t))$, or more generally $P_{L,T}(n(x, p; t))$. The
different microstates that occur during the time interval $T$ are all part of this ensemble. This
may appear different than the definition we used for the spatial uncertainty. However,
the definitions can be restated in a way that makes them appear equivalent. In this
restatement we recognize that the observer performs measurements that are, in effect,
averages over various possible microscopic measurements. The average
measurements over space and time represent the system (or system ensemble) that is to be
described by the observer. This representation will be discussed further in Section 8.3.6.
The use of an ensemble is convenient because the observer may only measure one
quantity, but we can consider various quantities that can be measured using the same
degree of precision. The ensemble represents all possible measurements with this
degree of precision. For example, the observer can measure correlations between
particle positions that are fixed over time. If we averaged the density $n(x, t)$ over time, these
correlations could disappear because of the movement of the whole system. However,
if we average over the ensemble, they do not. We define the complexity profile $C(L, T)$
as the amount of information necessary to specify the ensemble $P_{L,T}(n(x, t))$. A
description at a finer scale contains all of the information necessary to describe the
coarser scale. Thus, $C(L, T)$ is a monotonic decreasing function of its arguments. A
direct analysis is discussed in Question 8.3.4. We start, however, by considering the
effect on $C(L, T)$ of prediction and the lack of predictability in chaotic dynamics.
Predictability and chaos  As discussed earlier, a key ingredient in our
understanding of physical systems is that the time evolution of an isolated system (or a system
whose interactions with its environment are specified) can be obtained from the
simple laws of mechanics starting from a complete microscopic description of the
position and momenta of the particles. Thus, if we use a small enough $L$ and $T$, so that
each particle can be distinguished, we only need to specify $P_{L,T}(n(x, t))$ over a short
period of time (or the simultaneous values of position and momentum) in order to
predict the behavior over all subsequent times. The laws of mechanics are also
reversible. We describe the past as well as the future from the description of a system at
a particular time. This must mean that information is not lost over time. Systems that
do not lose information over time are called conservative systems.
However, when we increase the spatial scale of observation, $L$, then the
information loss (the complexity reduction) also limits the predictability of a system. We
are not guaranteed that by knowing $P_{L,T}(n(x, t))$ at a scale $L$ we can predict the
system behavior. This is true even if we are only concerned about predicting the
behavior at the scale $L$. We may need additional smaller-scale information to describe the
time evolution of the system. This is precisely the origin of the study of chaotic
systems discussed in Section 1.1. Chaotic systems take information from smaller scales
and bring it to larger scales. Chaotic systems may be contrasted with dissipative
systems that take information from larger scales to smaller scales. If we perturb (disturb)
a dissipative system, the effect disappears over time. Looking at such a system at a
particular time, we cannot tell if it was perturbed at some time far enough in the past.
Since the information on a microscopic scale must be conserved, we know that the
information that is lost on the macroscopic scale must be preserved on the
microscopic scale. In this sense we can say that information has been transferred from the
macroscopic to the microscopic scale. For such systems, we cannot describe the past
from present information on a particular length scale.
The degree of predictability is manifest when we consider that the complexity of
a system $C(L, T)$ at a particular $L$ and $T$ depends also on the duration of the
description, that is, on the limits of $t \in [t_1, t_2]$. Like the spatial extent of the system, this
temporal extent is part of the system definition. We typically keep these limits constant as we vary
$T$ to obtain the complexity profile. However, we can also characterize the dependence
of the complexity on the time limits $t_1$, $t_2$ by determining the rate at which
information is either gained or lost for a chaotic or stable system. For complex systems, the
flow of information between length scales is bidirectional: even if the total amount
of information at a particular scale is preserved, the information may change over
time by transfer to or from shorter length scales. Unlike most theories of currents,
information currents remain relevant even though they may be equal and opposite. All
of the information that affects behavior at a particular length scale, at any time over
the duration of the description, should be included in the complexity.
It is helpful to develop a conceptual image of the flow of information in a system.
We begin by considering a conservative, nonchaotic and nondissipative system seen
by an observer who is able to distinguish $2^{C(L)/k\ln(2)} = e^{C(L)/k}$ states. $C(L)/k\ln(2)$ is the
amount of information necessary to describe the system during a single time interval
of length $T$. For a conservative system the amount of information necessary to
describe the state at a particular time does not change over time. The dynamics of the
system causes the state of the system to change over time among these states. The
sequence of states could be described one by one. This would require
$$N_T\, C(L)/k\ln(2) \tag{8.3.42}$$
bits, where $N_T = (t_2 - t_1)/T$ is the number of time intervals. However, we can also
describe the state at a particular time (e.g., the initial conditions) and the dynamics. The
amount of information to do this is:
$$(C(L) + C_t(L,T))/k\ln(2) \tag{8.3.43}$$
$C_t(L,T)/k\ln(2)$ is the information needed to describe the dynamics. For a nonchaotic
and nondissipative system we can show that this information is quite small. We know
from the previous section that the macrostate of the system of complexity $C(L)$ is
consistent with a microstate which has the same complexity. The microstate has a
dynamics that is simple, since it follows the dynamics of standard physical law. The
dynamics of the simple microstate also describes the dynamics of the macrostate, which
must therefore also be simple. Therefore Eq. (8.3.43) is smaller than Eq. (8.3.42) and
the complexity is $C(L,T) = C(L) + C_t(L,T) \approx C(L)$. This holds for a system following
conservative, nonchaotic and nondissipative dynamics.
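The comparison between Eq. (8.3.42) and Eq. (8.3.43) is a matter of simple arithmetic, shown
here as a small Python sketch. The function names and the sample numbers are hypothetical
and serve only to illustrate the two description lengths; complexities are passed in units of k.

```python
import math

def bits_state_by_state(c_over_k, num_intervals):
    """Eq. (8.3.42): N_T * C(L) / (k ln 2), listing every state in turn."""
    return num_intervals * c_over_k / math.log(2.0)

def bits_state_plus_dynamics(c_over_k, ct_over_k):
    """Eq. (8.3.43): (C(L) + C_t(L, T)) / (k ln 2), one state plus the rule."""
    return (c_over_k + ct_over_k) / math.log(2.0)

if __name__ == "__main__":
    c_over_k = 100 * math.log(2.0)   # hypothetical: a 100-bit state
    ct_over_k = 10 * math.log(2.0)   # hypothetical: a 10-bit dynamical rule
    print(bits_state_by_state(c_over_k, num_intervals=50))
    print(bits_state_plus_dynamics(c_over_k, ct_over_k))
```

Whenever the rule is short, the second description wins as soon as more than a couple of
time intervals are described, which is why the dynamics adds little to C(L) in this case.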
For a system that is chaotic or dissipative, the picture must be modified to
accommodate the flow of information between scales. From the previous paragraph we
conclude that all of the interesting (complex) dynamics of a system is provided by
information that comes from finer scales. The observer does not see this information
before it appears in the state of the system, i.e., in the dynamics. If we allow ourselves
to see the finer-scale information we can track the flow of information that the
observer does not see. In a conventional chaotic system, the flow of information can be
characterized by its Lyapunov exponents. For a system that is described by a single
real-valued parameter, $x(t)$, the Lyapunov exponent is defined as an average over time of:
$$h = \ln\big((x'(t) - x(t)) / (x'(t-1) - x(t-1))\big) \tag{8.3.44}$$
where unprimed and primed coordinates indicate two different trajectories. We can
readily see how this affects the information needed by an observer to describe the
dynamics. Consider an observer at a particular scale, $L$. The observer sees the system in
state $x(t-1)$ at time $t-1$, but he determines $x(t-1)$ only within a bin of width $L$.
Using the dynamics of the system that is assumed to be known, the observer can
determine the state of the system at the next time. This extrapolation is not precise, so
the observer needs additional information to specify the next location. The amount
of information needed is the logarithm of the number of bins that one bin expands
into during one time step. This is precisely $h/\ln(2)$ bits of information. Thus, the
complexity of the dynamics for the observer is given by:
$$C(L,T) = C(L) + C_t(L,T) + N_T k h \tag{8.3.45}$$
where we have used the same notation as in Eq. (8.3.43).
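As a concrete, hedged illustration of a Lyapunov exponent and of the h / ln(2) bits per step
that enter Eq. (8.3.45), the following Python sketch uses the logistic map x -> r x (1 - x),
which is an example chosen here and not taken from the text. Instead of following two nearby
trajectories as in Eq. (8.3.44), it averages ln|f'(x)| along one orbit, which is the limit of
infinitesimal separation. The function name, the starting point and the number of steps are
assumptions.

```python
import math

def logistic_lyapunov(r=4.0, x0=0.2, steps=100_000, burn_in=1_000):
    """Estimate h as the orbit average of ln|f'(x)| for f(x) = r*x*(1 - x)."""
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = r * x * (1.0 - x)
    total, count = 0.0, 0
    for _ in range(steps):
        slope = abs(r * (1.0 - 2.0 * x))
        if slope > 0.0:               # skip the single point x = 1/2
            total += math.log(slope)
            count += 1
        x = r * x * (1.0 - x)
    return total / count

if __name__ == "__main__":
    h = logistic_lyapunov()
    print("h (nats per step):", h)
    print("bits per step    :", h / math.log(2.0))
```

For r = 4 the known exponent is ln(2), so the estimate should come out near one bit per
step: at each update the observer needs roughly one additional bit from finer scales.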
A physical system that has many dimensions, like the microscopic ideal gas, will
have one Lyapunov exponent for each of $6N$ dimensions of position and momentum.
If the dynamics is conservative then the sum over all the Lyapunov exponents is zero,
$$\sum_i h_i = \ln\Big(\prod_i \Delta x_i(t) \Delta p_i(t) \Big/ \prod_i \Delta x_i(t-T) \Delta p_i(t-T)\Big) = 0 \tag{8.3.46}$$
where $\Delta x_i(t) = x'_i(t) - x_i(t)$ and $\Delta p_i(t) = p'_i(t) - p_i(t)$. This follows directly from
conservation of volumes of phase space in conservative dynamics. However, while the sum
over all exponents is zero, some of the exponents may be positive and some negative.
These correspond to chaotic and dissipative modes of the dynamics. We can imagine
the flow of information as consisting of two streams, one going to higher scales and
one to lower scales. The complexity of the system is given by:
$$C(L,T) = C(L) + C_t(L,T) + N_T k \sum_{i:\,h_i>0} h_i \tag{8.3.47}$$
As indicated, the sum is only over positive values.
Two cautionary remarks about the application of Lyapunov exponents to
complex physical systems are necessary. Unlike many standard models of chaos, a complex
system does not have the same number of degrees of freedom at every scale. The
number of independent bits of information describing the system above a particular scale
is given by the complexity profile, $C(L)$. Thus, the flow of information between scales
should be thought of as due to a number of closed loops that extend from a
particular lowest scale up to a particular highest scale. As the scale increases, the complexity
decreases. Thus, so does the maximum number of Lyapunov exponents. This means
that the sum over Lyapunov exponents is itself a function of scale. More generally, we
must also be concerned that $C(L)$ can be time dependent, as it is in many irreversible
processes.
The second remark is that over time the cycling of information between scales
may bring the same information back more than once. Eq. (8.3.47) does not
distinguish this, and therefore may include multiple counting of the same information. We
should understand this expression as an upper bound on the complexity.
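Eq. (8.3.47) can be evaluated directly once a set of exponents is given. The Python sketch
below is illustrative: the function name and the sample exponents are hypothetical, all
quantities are in units of k, and in line with the remark above the result is best read as an
upper bound.

```python
def dynamic_complexity_over_k(c_over_k, ct_over_k, num_intervals, exponents):
    """Eq. (8.3.47) divided by k: C(L) + C_t + N_T * (sum of positive h_i)."""
    positive_sum = sum(h for h in exponents if h > 0.0)
    return c_over_k + ct_over_k + num_intervals * positive_sum

if __name__ == "__main__":
    # Hypothetical conservative system: the exponents sum to zero, as in
    # Eq. (8.3.46), but only the positive ones feed information upward.
    exponents = [0.5, -0.5, 0.2, -0.2]
    print(dynamic_complexity_over_k(10.0, 1.0, 100, exponents))
```

Although the exponents cancel in the sum of Eq. (8.3.46), the positive ones alone set the
rate at which new information reaches the observer's scale.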
Time scale dependence  Once we have chaotic behavior, we can consider various
descriptions of the time dependence of the behavior seen by a particular observer. All
of the models we considered in Chapter 1 are applicable. The state of the system may
be selected at random from a particular distribution (ensemble) of states at successive
time intervals. This is a special case of the more general Markov chain model that is
described by a set of transition probabilities. Long-range correlations that are not
easily described by a Markov chain may also be important in the dynamics.
In order to discuss the complexity profile as a function of $T$, we consider a
Markov chain model. From the analysis in Question 8.3.4 we learn that the loss of
complexity with time scale occurs as a result of cycles in the dynamics. These cycles
need not be deterministic; they may be stochastic, that is, cycles that do not repeat
indefinitely but rather can occur one or more times through the probabilistic selection of
successive states. Thus, a high complexity for large $T$ arises when there is a large space
of states with low chance of repetition in the dynamics. The highest complexity would
arise from a deterministic dynamics with cycles that are longer than $T$. This might
seem to contradict our previous conclusion, where the deterministic dynamics was
found to be simple. However, a complex deterministic dynamics can arise if the
successive states are specified by information from a smaller scale.
Question 8.3.4  Consider the information in a Markov chain of $N_T$ states
at intervals $T_0$ given by the transition matrix $P(s'|s)$. Assume the complexity
of specifying the transition matrix (the complexity of the dynamics),
$C_t = C(P(s'|s))$, is itself small. (See Question 8.3.5 for the case of a complex
deterministic dynamics.)
a. Show that the more deterministic the chain is, the less information it
contains.
b. Show that for an observer at a longer time scale consisting of two time
steps ($T = 2T_0$) the information is reduced. Hint: Use convexity of information
as described in Question 1.8.8, $f(\langle x \rangle) > \langle f(x) \rangle$, for the function
$f(x) = -x \log(x)$.
c. Show that the complexity does not decrease for a system that does not
allow 2-cycles.
Solution 8.3.4  When the complexity of the dynamics is small, then the
complexity of the Markov chain is given by:
$$C = C(s) + C_t + N_T \, k \ln(2) \, I(s'|s) \tag{8.3.48}$$
where the terms correspond to the information in the initial state of the system,
the information in the dynamics, and the incremental information per
update needed to specify the next state. The relationship between this and
Eq. (8.3.47) should be apparent. This expression does not hold if $C_t$ is large,
because if it is larger than $N_T C(s)$, then the chain is more concisely described
by specifying each of the states of the system (see Question 8.3.5).
The proof of (a) follows from realizing that the more deterministic the
system is, the smaller is $I(s'|s)$. This may be used to define how deterministic
the dynamics is.
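Part (a) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $I(s'|s)$ to be the conditional entropy of the next state given the current one, measured in bits and averaged over the stationary distribution, and the three small transition matrices are hypothetical examples rather than anything from the text.

```python
import math

def stationary(P, iters=5000):
    """Stationary distribution of a row-stochastic matrix P[s][s_next].

    Uses a 'lazy' power iteration (average of p and pP), which also
    converges for periodic chains such as a deterministic cycle.
    """
    n = len(P)
    p = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
        p = [0.5 * (a + b) for a, b in zip(p, step)]
    return p

def transition_information(P):
    """I(s'|s) in bits: average information per update to specify the next state."""
    p = stationary(P)
    return -sum(p[s] * q * math.log2(q)
                for s in range(len(P)) for q in P[s] if q > 0)

# Hypothetical three-state chains, from fully deterministic to fully random.
deterministic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
noisy = [[0.1, 0.9, 0.0], [0.0, 0.1, 0.9], [0.9, 0.0, 0.1]]
random_chain = [[1 / 3] * 3 for _ in range(3)]

for name, P in [("deterministic", deterministic),
                ("noisy", noisy),
                ("random", random_chain)]:
    print(name, transition_information(P))
```

Under these assumptions the deterministic cycle needs no information per update, and the value grows as the transitions become less predictable, which is the ordering that part (a) asserts.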
To analyze the complexity of the Markov chain for an observer at time
scale $2T_0$, we need to combine successive system states into an unordered
pair, the ensemble of states seen by the observer. We use the notation $\{s', s\}$
for a pair of states. Thus, we are considering a new Markov chain of
transitions between unordered pairs. To analyze this we need two probabilities: the
probability of a pair and the transition probability from one pair to the next.
The latter is the new transition matrix. The probability of a particular pair is:
$$P(\{s_1,s_2\}) = \begin{cases} P(s_1|s_2)P(s_2) + P(s_2|s_1)P(s_1) & s_2 \neq s_1 \\ P(s_1|s_1)P(s_1) & s_2 = s_1 \end{cases} \tag{8.3.49}$$
where $P(s)$ is the probability of a particular state of the system and the two
terms in the upper line correspond to the probability of starting from $s_1$ to
make the pair, and starting from $s_2$ to make the pair. The transition matrix
for pairs is given by
$$P(\{s'_1,s'_2\}|\{s_1,s_2\}) = \Big[ \big( P(s'_1|s'_2)P(s'_2|s_1) + P(s'_2|s'_1)P(s'_1|s_1) \big) P(s_1|s_2)P(s_2) + \big( P(s'_1|s'_2)P(s'_2|s_2) + P(s'_2|s'_1)P(s'_1|s_2) \big) P(s_2|s_1)P(s_1) \Big] \Big/ P(\{s_1,s_2\}) \tag{8.3.50}$$
which is valid only for $s_1 \neq s_2$ and for $s'_1 \neq s'_2$. Other cases are treated like
Eq. (8.3.49). Eq. (8.3.50) includes all four possible ways of generating the
sequence of the two pairs. The normalization is needed because the transition
matrix is the probability of $\{s'_1, s'_2\}$ occurring, assuming the pair $\{s_1, s_2\}$ has
already occurred.
To show (b) we must prove that the process of combining the states into
pairs reduces the information necessary to describe the chain. This is apparent
since the observer loses the information about the state order within each
pair. To show it from the equations, we note from Eq. (8.3.49) that the
probability of a particular unordered pair is larger than or equal to the probability
of each of the two possible ordered pairs. Since the probabilities are larger, the
information is smaller. Thus the information contained in the first pair is
smaller for $T = 2$ than for $T = 1$. We must show the same result for each
successive pair. The transition probability can be seen to be an average over the two
terms in the round parentheses. By convexity, the information in the average
is less than the average information of each term. Each of the terms is a sum
over the probabilities of two possible orderings, and is therefore larger than
or equal to the probability of either ordering. Thus, the information needed
to specify any pair in the chain is smaller than the corresponding information
in the chain of states.
Finally, to prove (c) we note that the less the order of states is lost when
we combine states into pairs, the more complexity is retained. If transitions
in the dynamics can only occur in one direction, then we can infer the order
and information is not lost. Thus, for $T = 2$ the complexity is retained if the
dynamics is not reversible, that is, if there are no 2-cycles. From the equations we see
that if only one of $P(s_1|s_2)$ and $P(s_2|s_1)$ can be nonzero, and similarly for
$P(s'_1|s'_2)$ and $P(s'_2|s'_1)$, then only one term survives in Eq. (8.3.49) and
Eq. (8.3.50) and no averaging is performed. For arbitrary $T$ the complexity
is the same as at $T = 1$ if the dynamics does not allow loops of size less than
or equal to $T$.
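The effect of pairing in parts (b) and (c) can be illustrated for the first pair only. The sketch below compares the entropy of the ordered pair of successive states with the entropy of the unordered pair built as in Eq. (8.3.49). The two example chains, the helper names, and the use of a uniform stationary distribution (valid here because both matrices are doubly stochastic) are assumptions made for illustration; the transition matrix between pairs, Eq. (8.3.50), is not evaluated.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def pair_entropies(P, p):
    """Entropy of ordered and unordered pairs of successive states.

    P[s][s_next] is a row-stochastic transition matrix and p the
    probability of each state. Unordered pairs follow Eq. (8.3.49):
    both orderings are summed when the two states differ.
    """
    ordered = {}
    unordered = defaultdict(float)
    for s in range(len(p)):
        for s_next in range(len(p)):
            w = p[s] * P[s][s_next]
            if w > 0:
                ordered[(s, s_next)] = w
                unordered[frozenset((s, s_next))] += w
    return entropy(ordered.values()), entropy(unordered.values())

def one_way_ring(n):
    """Hypothetical chain: from state i move to i+1 or i+2 (mod n), never back."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][(i + 1) % n] = 0.5
        P[i][(i + 2) % n] = 0.5
    return P

# A fully random 3-state chain allows 2-cycles; the 5-state ring does not.
reversible = [[1 / 3] * 3 for _ in range(3)]
print(pair_entropies(reversible, [1 / 3] * 3))
print(pair_entropies(one_way_ring(5), [0.2] * 5))
```

For the chain with 2-cycles the unordered pair carries less information than the ordered pair, while for the one-way ring the two values coincide, since each unordered pair can arise from only one ordering.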
Question 8.3.5  Calculate the maximum information that might in principle
be necessary to specify completely a deterministic dynamics of a
system whose complexity at any time is $C(L)$. Contrast this with the maximum
complexity of describing $N_T$ steps of this system.
Solution 8.3.5  The number of possible states of the system is $2^{C(L)/k\ln(2)}$.
Each of these must be assigned a successor by the dynamics. The maximum
possible information to specify the dynamics arises if there is no algorithm
that can specify the successor, so that each successor must be identified out
of all possible states. This would require $2^{C(L)/k\ln(2)} \, C(L)/k\ln(2)$ bits.
The maximum complexity of $N_T$ steps is just $N_T C(L)$, as long as this is
smaller than the previous result, which is generally a reasonable assumption.
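The two bounds can be compared with a short calculation. In this sketch `n_bits` stands for $C(L)/k\ln(2)$, the state complexity expressed in bits, and the numerical values are arbitrary illustrations rather than estimates from the text.

```python
def max_dynamics_bits(n_bits):
    """Bits needed to list a successor for each of the 2**n_bits states."""
    return (2 ** n_bits) * n_bits

def max_trajectory_bits(n_bits, n_steps):
    """Bits needed to list the state at each of n_steps successive times."""
    return n_steps * n_bits

# Illustrative numbers only: a 20-bit state observed for 1000 steps.
n_bits, n_steps = 20, 1000
print(max_dynamics_bits(n_bits))             # 20 * 2**20 = 20971520
print(max_trajectory_bits(n_bits, n_steps))  # 20000
```

Listing the trajectory is the shorter description whenever the number of steps is smaller than the number of states, $2^{C(L)/k\ln(2)}$, which is the sense in which the assumption in the solution is reasonable.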
A simple example of chaotic behavior that is relevant to complex systems is that
of a mobile system (an animal or human being) where the motion is internally
directed. A description of the system behavior, even at a length scale larger than the
system itself, must describe this motion. However, the motion is determined by
information contained on a smaller length scale just prior to its occurrence. This satisfies
the formal requirements for chaotic behavior regardless of the specifics of the motion
involved. Stated differently, the large-scale motion would be changed by
modifications of the internal state of the system. This is consistent with the sensitivity of
chaotic motion to smaller scale changes.
Another example of information transfer between different scales is related to
adaptability, which requires that information about the external environment be
represented in the organism. This generally involves the transfer of information between
a larger scale and a smaller scale, specifically between observed phenomena and their
representation in the synapses of the nervous system.
When we describe a system at a particular moment of time, the complexity of the
system at its own scale or larger is zero (or a constant if we include the description of
the equilibrium system). However, when we consider the description of a system over
time, then the complexity is larger due to the system motion. Increasing the scale of
observation continues to result in a progressive decrease in complexity. At a scale that
is larger than the system itself, it is the motion of the system as measured by its
location at successive time intervals that is to be described. As the scale becomes larger,
smaller scale motions are not observed, and a simpler description of motion is
possible. The observer only notes changes in position that are larger than the scale of
observation.
A natural question that can be asked in this context is whether the motion of the
system is due to external influences or due to the system itself. For example, a particle
moving in a fluid may be displaced by the motion of the fluid. This should be
considered different from a mobile bacterium. Similarly, a basketball in a game moves
through its trajectory not because of its own volition, but rather because of the
volition of the players. How do we distinguish this from a system that moves due to its
own actions? More generally, we must ask how we must deal with the environmental
influences for a system that is not isolated. This question will be dealt with in Section
8.3.6 on behavioral complexity. Before we address this question, in the next section we
discuss several aspects of the complexity profile, including the relationship of the
complexity of the whole to the complexity of its parts.
8.3.5 Properties of complexity profiles of systems and components
General properties  We can readily understand some of the properties that we
would expect to find in complexity profiles of systems that are difficult to calculate
directly. Fig. 8.3.2 illustrates the complexity profile for a few systems. The paragraphs
that follow describe some of their features.
For any system, the complexity at the smallest values of L, T is the microscopic
complexity, the amount of information necessary to describe a particular
microstate. For an equilibrium state this is the same as the thermodynamic entropy,
which is the entropy of a system observed on an arbitrarily long time scale. This is not
true in general because short-range correlations decrease the microstate complexity,
but do not affect the apparent macroscopic entropy. We have thus also defined the
entropy profile S(L,T) as the amount of information necessary to determine an arbitrary
microstate consistent with the observed macrostate. From our discussion of
nonergodic systems in Section 8.3.1 we might also conclude that at any scale L, T the sum
of the complexity C(L,T) and the entropy S(L,T) of the system (the fast degrees of
freedom) should add up to the microscopic complexity or macroscopic entropy
$$C(0,0) \approx S(\infty,\infty) \approx C(L,T) + S(L,T) \tag{8.3.51}$$
However, this is valid only under special circumstances: when the macroscopic state
is selected at random from the ensemble of macrostates, and the microstate is selected
at random from the possible microstates. A glass may satisfy this requirement;
however, other complex systems need not.
For a typical system in equilibrium, as L, T is increased the system rapidly
becomes homogeneous in space and time. Specifically, the density of the system is
uniform in space and time, aside from unobservable small fluctuations, once the scale
of observation is larger than either the correlation length or the correlation time of
the system. Indeed, this might be taken to be the definition of the correlation length
and time: the scale at which the microscopic information becomes irrelevant to the
properties of the system. Beyond the correlation length, the average behavior
characteristic of the macroscopic scale is all that remains, and the complexity profile is
constant at all length and time scales less than the size of the system.
We can contrast the complexity profile of a thermodynamic system with what we
expect from various complex systems. For a glass, the complexity profile is quite
different in time and in space. A typical glass is uniform if L is larger than a microscopic
correlation length. Thus, the complexity profile of the glass is similar to an equilibrium
system as a function of L. However, it is different as a function of T. The frozen degrees
of freedom that make it a nonergodic system at typical time scales of observation
guarantee this. At typical values of T the temporal ensemble of the system includes the states
that are reached by vibrational modes of the system, but not the atomic rearrangements
characteristic of fluid motion. Thus, the atomic vibrations cannot be observed except
at microscopic values of T. However, a significant part of the microscopic description
remains necessary at longer time scales. Correspondingly, a plateau in the complexity
profile extends up to characteristic time scales of human observation. At a
temperature-dependent and much longer time scale, the complexity profile declines to its
thermodynamic limit. This time scale, the relaxation time, is accessible near the glass
transition temperature. For lower temperatures it is not. Because the glass is uniform in space,
the plateau should be relatively flat and end abruptly. This is because spatial uniformity
indicates that the relaxation time is essentially a local property with a narrow
distribution. A more extended spatial coupling would give rise to a grading of the plateau and
a broadening of the time scale at which the plateau disappears.
Figure 8.3.2  Schematic plots of the complexity profile C(L,T) of four different systems.
C(L,T) is the amount of information necessary to describe the system ensemble as a function
of the length scale, L, and time scale, T, of observation. Top panel shows the time scale
dependence, C(0,T) versus T; bottom panel shows the length scale dependence, C(L,0) versus L.
(1) An equilibrium system has a complexity profile that is sharply peaked at T = 0 and L = 0.
Once the length or time scale is beyond the correlation length or correlation time respectively,
the complexity is just the macroscopic complexity associated with thermodynamic quantities
(U, N, V), which vanishes on any reasonable scale. (2) For a glass the complexity profile as a
function of time scale C(0,T) decays rapidly at first due to averaging over atomic vibrations;
it then reaches a plateau that represents the frozen degrees of freedom. At much longer time
scales the complexity profile decays to its thermodynamic limit. Unlike C(0,T), C(L,0) of a
glass decays like a thermodynamic system because it is homogeneous in space. (3) A magnet at a
second-order phase transition has a complexity profile that follows power-law behavior in both
length and time scale. Stochastic fractals capture this kind of behavior. (4) A complex
biological organism has a complexity profile that should follow similar behavior to that of a
fractal. However, it has plateau-like regions that correspond to crossing the scale of internal
components, such as molecules and cells.
More generally, for a complex system we expect that many parameters will be
required to describe its properties at all length and time scales, at least up to some
fraction of the spatial and temporal scale of the system itself. Starting from the microscopic
complexity, the complexity profile should not be expected to fall smoothly. In
biological organisms, we can expect that as we increase the scale of observation, there will be
particular length scales at which details will be lost. Plateaus in the profile are related
to the existence of well-defined levels of description. For example, an identifiable level
of cellular behavior would correspond to a plateau, because over a range of length scales
larger than the cell, a full accounting of cellular properties, but not of the internal
behavior of the cell, must be given. There are many cells that have a characteristic size
and are immobile. However, because different cell populations have different sizes and
some cells are mobile, the sharpness of the transition should be smoothed. We can at
least qualitatively identify several different plateaus. At the shortest time scale the atomic
vibrations will be averaged out to end the first plateau. Larger atomic motions or
molecular behavior will be averaged out on a second, larger scale. The internal cellular
behavior will then be averaged out. Finally, the internal behavior of tissues and organs
will be averaged out on a still longer length and time scale. It is the degrees of freedom
that remain relevant on the longest length scale that are key to the complexity of the
system. These degrees of freedom manifest the concept of emergent collective
behavior. Ultimately, they must be traceable back to the microscopic degrees of freedom.
Describing the connection between the microscopic parameters and macroscopically
relevant parameters has occupied our attention in much of this book.
Mathematical models that best capture the complexity profile of a complex
system are fractals (see Section 1.10). Mathematical fractals with no granularity (no
smallest length scale) have infinite complexity. However, if we define a smallest length
scale, corresponding to the atomic length scale of a physical system, and we define a
longest length scale that is the size of the system, then we can plot the spatial
complexity profile of a fractal-like system. There are two quite distinct kinds of
mathematical fractals, deterministic and stochastic fractals. The deterministic fractals are
specified by an algorithm with only a few parameters, and thus their algorithmic
complexity is small. Examples are the Cantor set or the Sierpinski gasket. The algorithm
describes how to create finer and finer scale detail. The only difficulty in specifying the
fractal is specifying the number of levels to which the algorithm should be iterated.
This information (the number of iterations) requires a parameter whose length grows
logarithmically with the ratio of the size of the system to the smallest length scale.
Thus, a deterministic fractal has a complexity profile that decreases logarithmically
with observation length scale L, but is very small on all length scales.
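As a minimal sketch of this point, the middle-thirds Cantor set can be generated from a fixed rule plus a single integer, the number of levels. The function names below are hypothetical; `levels_needed` only shows that the iteration count required to reach a given ratio of system size to smallest length scale grows logarithmically with that ratio.

```python
from fractions import Fraction

def cantor_intervals(levels):
    """Intervals of the middle-thirds Cantor set after `levels` iterations."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(levels):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))   # keep the left third
            refined.append((b - third, b))   # keep the right third
        intervals = refined
    return intervals

def levels_needed(ratio):
    """Smallest number of iterations k such that 3**k >= ratio."""
    k = 0
    while 3 ** k < ratio:
        k += 1
    return k

print(len(cantor_intervals(4)))   # 16 intervals, each of length 1/81
print(levels_needed(10 ** 6))     # 13 iterations resolve one part in a million
```

The whole structure, at any resolution, is fixed by the rule and by this one integer, which is why the description stays short on all length scales.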
Stochastic fractals are qualitatively different. In such fractals, there are random
choices made at every scale of the structure. Stochastic fractals can be based upon the
Cantor set or Sierpinski gasket, by including random choices in the algorithm. They
may also be systems representing the spatial structure of various stochastic processes.
Such a system requires information to describe its structure on every length scale. A
stochastic fractal is a member of an ensemble, and its algorithmic as well as ensemble
complexity will scale as a power law of the scale of observation L. As L increases, the
amount of information is reduced, but there is no length scale smaller than the size of
the system at which it is completely lost. Time series that have fractal behavior (that
is, power-law correlations) would also display a power-law dependence of their
complexity profile as a function of T. The simplest physical model that demonstrates
such fractal properties in space and time is an Ising model at its second-order
transition point. At this transition there are fluctuations on all spatial and temporal scales
that have power-law behavior in both. Observers with larger values of L can see the
behavior of the correlations only on the longer length scales. A renormalization
treatment, discussed in Section 1.10, can give the value of the complexity profile.
These examples illustrate how microscopic information may become irrelevant on
larger length scales, while leaving collective information that remains relevant at the
longer scales.
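The power-law scaling of a stochastic fractal can be made concrete with a toy model, which is an assumption of this sketch and not a construction from the text: a randomized Cantor set in which each surviving interval is cut into thirds and one third, chosen at random, is removed. Each choice costs $\log_2 3$ bits, and the number of choices that matter to an observer depends on how many levels that observer can resolve.

```python
import math
import random

def random_cantor(levels, rng):
    """Randomized Cantor set: each interval drops one randomly chosen third.

    Returns the surviving intervals and the list of random choices made,
    one per interval per level.
    """
    intervals = [(0.0, 1.0)]
    choices = []
    for _ in range(levels):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            removed = rng.randrange(3)
            choices.append(removed)
            for i in range(3):
                if i != removed:
                    refined.append((a + i * third, a + (i + 1) * third))
        intervals = refined
    return intervals, choices

def description_bits(levels):
    """Bits needed to record every choice down to resolution L = 3**-levels."""
    return (2 ** levels - 1) * math.log2(3)

intervals, choices = random_cantor(5, random.Random(0))
print(len(intervals), len(choices))
for k in (2, 4, 8):
    print(k, 3.0 ** -k, description_bits(k))
```

In this toy model the description length grows roughly as $L^{-\ln 2/\ln 3}$ as the observation scale $L = 3^{-k}$ shrinks, a power law, in contrast to the single iteration count that suffices for the deterministic version.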
The complexity profile enables us to consider again the definition of a complex
system. As we stated, it seems intuitive that a complex system is complex on many
scales. This strengthens the identification of the fractal model of space and time as a
central model for the understanding of complex systems. We have also gained an
understanding of the difference between deterministic and stochastic fractal systems. We
see that the glass is complex in its temporal behavior, but not in its spatial behavior,
and therefore is only a partial example of a complex system. If we want to identify a
unique complexity of a system, there is a natural space and time scale at which to
define it. For the spatial scale, $L_s$, we consider a significant fraction of the system, one
tenth of its size. For the temporal scale, $T_s$, we consider the relaxation
(autocorrelation) time of the behavior on this same length scale. This is essentially the maximal
complexity for this length scale, which would be the same as setting $T = 0$. However,
we could also take a natural time scale of $T_s = L_s / v_s$ where $v_s$ is a characteristic
velocity of the system. This form makes the increase in time scale for larger length scales
(systems) apparent. Leaving out the time scale, since it is dependent on the space scale,
we can write the complexity of a system $s$ as
$$C_s = C_s(L_s) = C_s(L_s, L_s/v_s) \approx C_s(L_s, 0) \tag{8.3.52}$$
In Section 1.10 we discussed generally the scaling of quantities as a function of
the precision to which we describe the system. One of the central questions in the field
of complex systems is understanding how complexity scales. This scaling is
concretized by the complexity profile. One of the objectives is to understand the ultimate
limits to complexity. Given a particular length or time scale, we ask what is the
maximum possible complexity at that scale. One could say that this complexity is limited
by the thermodynamic entropy; however, there are further limitations. These
limitations are established by the nature of physical law that establishes the dynamics and
interactions of the components. Thus it is unlikely that atoms can be attached to each
other in such a way that the behavior of each atom is relevant to the spatiotemporal
behavior of an organism at the length and time scale relevant to a human being. The
details of behavior must be lost as we observe on longer length and time scales; this
results in a loss of complexity. The complexity scaling of complex organisms should
follow a line like that given in Fig. 8.3.2. The highest complexity of an organism results
from the retention of the greatest significance of details. This is in contrast to
thermodynamic systems, where all of the degrees of freedom average out on a very short
length and time scale. At this time we do not know what limits can be placed on the
rate of decrease of complexity with scale.
Components and systems  As we discussed in Chapter 2, a complex system is formed
out of a hierarchy of interdependent subsystems. Thus, relevant to various questions
about the complexity profile is an understanding of the complexity that may arise when
we bring together complex systems to form a larger complex system. In general it is not
clear that bringing together many complex systems must give rise to a collective
complex system. This was discussed in Chapter 6, where one example was a flock of
animals. Here we can provide additional meaning to this statement using the complexity
profile. We will discuss the relationship of the complexity of components to the
complexity of the system they are part of. To be definite, we can consider a flock of sheep.
The example is chosen to expand our view toward more general application of these
ideas. The general statements we make apply to any system formed out of subsystems.
Let us assume that we know the complexity of a sheep, $C_{\mathrm{sheep}}(L_{\mathrm{sheep}})$, the amount
of information necessary to describe the relevant behaviors of eating, walking,
reproducing, flocking, etc., at a length scale of about one-tenth the size of the sheep. For our
current purposes this might be a lot of information contained in a large number of
books, or a little information contained in a single paragraph of text. Later, in Section
8.4, we will obtain an estimate of the complexity as, of order, one book or $10^7$ bits.
We now consider a flock of N sheep and construct a description of this flock. We
begin by taking information that describes each of the sheep. Combining these
descriptions, we have a description of the flock. This information is, however, highly
redundant. Much of the information that describes one sheep can also be used to
describe other sheep. Of course there are differences in size and in behavior. However,
having described one sheep in detail we can describe the differences, or we can
describe general characteristics of sheep and then specialize them for each of the
individual sheep. Using this strategy, a description of the flock will be shorter than the
sum of the lengths of the descriptions of each of the sheep. Still, this is not what we
really want. The description of the flock behavior has to be on its own length scale
$L_{\mathrm{flock}}$, which is much larger than $L_{\mathrm{sheep}}$. So we shift our observation of behavior to this
longer length scale and find that most of the details of the individual sheep behavior
have become irrelevant to the description of the flock. We describe the flock behavior
in terms of sheep density, grazing activity, migration, reproductive rates, etc. Thus we
write that:
$$C_{\mathrm{flock}} = C_{\mathrm{flock}}(L_{\mathrm{flock}}) \ll C_{\mathrm{flock}}(L_{\mathrm{sheep}}) \ll N C_{\mathrm{sheep}}(L_{\mathrm{sheep}}) = N C_{\mathrm{sheep}} \tag{8.3.53}$$
where N is the number of sheep in the flock. Among other conclusions, we see that
the complexity of a flock may actually be smaller than the complexity of one sheep.
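The redundancy behind the second inequality in Eq. (8.3.53) can be illustrated crudely with a general-purpose compressor. This is only a sketch under strong assumptions: each sheep "description" is a hypothetical byte string that differs from a common template in a few places, and the zlib-compressed length is used as a rough upper bound on description length, not as a measure of algorithmic complexity.

```python
import random
import zlib

def sheep_description(template, rng, n_changes=5):
    """A hypothetical description: the shared template with a few bytes altered."""
    d = bytearray(template)
    for _ in range(n_changes):
        d[rng.randrange(len(d))] = rng.randrange(256)
    return bytes(d)

rng = random.Random(0)
n_sheep = 20
# An incompressible template stands in for one detailed sheep description.
template = bytes(rng.randrange(256) for _ in range(2000))
flock = b"".join(sheep_description(template, rng) for _ in range(n_sheep))

single_bits = 8 * len(zlib.compress(template, 9))
flock_bits = 8 * len(zlib.compress(flock, 9))
print(single_bits, flock_bits, n_sheep * single_bits)
```

Because every description after the first can be encoded as "the same as before, except for a few differences", the compressed flock is far shorter than N separate descriptions. The first inequality, which comes from changing the scale of observation, is not captured by this sketch.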
More generally, the relationship between the complexity of the collective
complex system and the complexity of component systems is crucially dependent on the
existence of coherence and correlations in the behavior of the components that can
arise either from common origins for the behavior or from interactions between the
components. We first describe this qualitatively by considering the two inequalities in
Eq. (8.3.53). The second inequality arises because different sheep have the same
behavior. In this case their behavior is coherent. The first inequality arises because we
change the scale of observation and so lose the behavior of an individual sheep. There
is a tradeoff between these two inequalities. If the behaviors of the sheep are
independent, then their behavior cannot be observed on the longer scale. Specifically, the
movement of one sheep to the right is canceled by another sheep that starts at its right
and moves to the left. Thus, only correlated motions of many sheep can be observed
on a longer scale. On the other hand, if their behaviors are correlated, then the
complexity of describing all of them is much smaller than the sum of the separate
complexities. Thus, having a large collective complexity requires a balance between
dependence and independence of the behavior of the components.
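The cancellation of independent motions can be seen in a small simulation. The one-dimensional setup below is an assumption chosen for illustration: every sheep takes a unit step left or right, and the quantity visible at the flock scale is taken to be the displacement of the flock's center of mass.

```python
import random

def mean_square_flock_step(n_sheep, coherent, trials, rng):
    """Mean squared center-of-mass displacement per step of the flock.

    coherent=True: all sheep take the same random step.
    coherent=False: each sheep steps independently.
    """
    total = 0.0
    for _ in range(trials):
        if coherent:
            steps = [rng.choice((-1, 1))] * n_sheep
        else:
            steps = [rng.choice((-1, 1)) for _ in range(n_sheep)]
        center = sum(steps) / n_sheep
        total += center * center
    return total / trials

rng = random.Random(0)
print(mean_square_flock_step(100, True, 2000, rng))    # coherent motion
print(mean_square_flock_step(100, False, 2000, rng))   # independent motion
```

With independent steps the mean squared displacement of the center of mass is close to 1/N, so it falls below the resolution of a large-scale observer, while fully coherent steps remain visible but add no information beyond the motion of a single sheep.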
We can discuss this more quantitatively by considering the example of the
nonuniform ideal gas. The loss of information for uncorrelated quantities due to
combining them together is described by Eq. (8.3.37). To construct a model where the
quantities are correlated, we consider placing the same densities in a region of scale
$L_1 > L_0$. This is the same model as the previous one, but now on a length scale of $L_1$.
The new value of $\sigma$ is $\sigma_1 = \sigma (L_1 / L_0)^d$. This increase of the standard deviation causes
an increase in the value of the complexity for all scales greater than $L_1$. However, for
$L < L_1$ the complexity is just the complexity at $L_1$, since there is no structure below this
scale. A comparative plot is given in Fig. 8.3.3.
We can come closer to considering the behavior of a collection of animals by
considering a model for their motion. We start with a scale $L_0$ just larger than the animal,
so that we do not describe its internal structure; we describe only its location at
successive intervals of time. The characteristic time over which a sheep moves a distance
$L_0$ is $T_0$. We will use a model for sheep motion that can illustrate the effect of
coherence of many sheep, as well as the effect of coherent motion of an individual sheep
over time. To do this we assume that an individual sheep moves in a straight line for
a distance $qL_0$ in a time $qT_0$ before choosing a new direction to move in at random.
For simplicity we can assume that the direction chosen is one of the four compass
directions, though this is not necessary for the analysis. We will use this model to
calculate the complexity profile of an individual sheep. Our treatment only describes the
leading behavior of the complexity profile and not various corrections.
For $L = L_0$ and $T = T_0$, the complexity of describing the motion is exactly 2 bits
for every $q$ steps to determine which of the four possible directions the sheep will
move next. Because the movement is in a straight line, and the changes in direction
are at well-defined intervals, we can reconstruct the motion from the measurements
of any observer with $L < qL_0$ and $T < qT_0$. Thus the complexity is:
$$C(L,T) = 2N_T/q \qquad L < qL_0,\; T < qT_0 \tag{8.3.54}$$
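The counting behind Eq. (8.3.54) can be reproduced by simulating the model directly. In this sketch positions are measured in units of $L_0$ and time in units of $T_0$; the function names and the use of Python's `random` module are choices made for illustration.

```python
import random

DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # the four compass directions

def simulate_sheep(n_steps, q, rng):
    """Sheep that picks one of four directions and keeps it for q unit steps.

    Returns the path (positions in units of L_0, one per time T_0) and the
    number of direction choices that were made.
    """
    x = y = 0
    path = [(x, y)]
    choices = 0
    for step in range(n_steps):
        if step % q == 0:
            dx, dy = rng.choice(DIRECTIONS)
            choices += 1
        x, y = x + dx, y + dy
        path.append((x, y))
    return path, choices

def complexity_bits(n_steps, q):
    """Eq. (8.3.54): 2 bits for every q steps."""
    return 2 * n_steps / q

path, choices = simulate_sheep(1000, 10, random.Random(1))
print(2 * choices, complexity_bits(1000, 10))  # both equal 200
```

Each direction choice is one of four equally likely options, so it costs 2 bits, and a fine-scale observer can read every choice back from the path. That is why the bit count depends only on the number of steps and on $q$.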
Once the scale of observation is greater than $qL_0$, the observer does not see every
change in direction. The sheep is moving in a random walk where each step has a
length $qL_0$ and takes a time $qT_0$, but the observer does not see each step. The distance
traveled is proportional to the square root of the time, and so the sheep moves a
distance L once in every $(L/\sigma_0)^2$ steps, where $\sigma_0 = qL_0$ is the standard
deviation of the random walk in each dimension. Every time the sheep travels a distance L
we need 2 bits to describe its motion, and thus we have a complexity:

$$C(L,T) = 2\frac{N_T}{q}\frac{\sigma_0^2}{L^2} = 2N_T\frac{qL_0^2}{L^2} \qquad L > qL_0,\ T < qT_0 \tag{8.3.55}$$

We note that at $L = qL_0$ Eq. (8.3.54) and Eq. (8.3.55) are equal.

To obtain the complexity profile for long time scales $T > qT_0$, but short length
scales $L < qL_0$, we use a simplified “blob” picture to combine the successive positions
of the sheep into an ensemble of positions. For T only a few times $qT_0$ we can expect
that the ensemble would enable us to reconstruct the motion, so the complexity is the
same as Eq. (8.3.54). However, eventually the ensemble of positions will overlap and
form a blob. At this point the movement of the sheep will be described by the
movement of the blob, which itself undergoes a random walk. The standard deviation of
this random walk is proportional to the square root of the number of steps:
$\sigma = \sigma_0\sqrt{T/qT_0}$. Since this is larger than L, the amount of information is
essentially that of selecting a value from a Gaussian distribution of this standard deviation:

$$C(L,T) = 2\frac{N_T}{q}\min\left(1, \frac{qT_0}{T}\left(1+\log\frac{\sigma}{L}\right)\right) = 2\frac{N_T}{q}\min\left(1, \frac{qT_0}{T}\left(1+\log\left(\frac{L_0}{L}\sqrt{\frac{qT}{T_0}}\right)\right)\right) \qquad L < \sigma,\ T > qT_0 \tag{8.3.56}$$

Figure 8.3.3 Plot of the complexity of a nonuniform gas (Eq. (8.3.37)), for two cases. The
first (1) has a correlation in its nonuniformity at a scale $L_0$ and the second (2) at a scale
$L_1 > L_0$. The magnitude of the local deviations in the density is the same in the two cases.
The second case has a lower complexity at smaller scales but a higher complexity at the larger
scales. Because the complexity decreases rapidly with scale, to show the effects on a linear
scale $L_1$ was taken to be only $\sqrt[3]{10}\,L_0$, and the horizontal axis is in units of $L^3$
measured in units of $L_0^3$. Eq. (8.3.39) would give similar results but the complexity would
decay still more rapidly.

There are a few points to be made about this expression. First, we use the minimum
of two values to select the crossover point between the behavior in Eq. (8.3.54) and
the blob behavior. As we mentioned above, the blob behavior occurs only for T
significantly greater than $qT_0$. The simplest way to identify the crossover point is when
the new estimate of the complexity becomes lower than our previous value. The
second point is that we have chosen to adjust the constant term added to the logarithm
so that when $L = \sigma$ the complexity matches that given by Eq. (8.3.55), which describes
the behavior when L becomes large. Thus the limit on Eq. (8.3.55) should be
generalized to $L > \sigma$. This minor adjustment enables the complexity to be continuous
despite our rough approximations, and does not change any of the conclusions.

We can see from our results (Fig. 8.3.4) how varying q affects the complexity.
Increasing q decreases the complexity at the scale of a sheep, $C(L,T) \propto 1/q$ in
Eq. (8.3.54). However, it increases the complexity at longer scales, $C(L,T) \propto q$ in
Eq. (8.3.55). This is a straightforward consequence of increasing the coherence of the
motion over time. We also see that the complexity at long times decays inversely
proportional to the time but is relatively insensitive to q. The value of q primarily affects
the crossover point to the long time behavior.

We now use two different assumptions to calculate the complexity of the flock. If
the movement of all of the sheep is coherent, then the complexity of the flock for
length scales greater than the size of the flock is the same as the complexity of a sheep
for the same length scales. This is apparent because describing the movement of a
single sheep is the same as describing the entire flock. We now see the significance of
increasing q. Increasing q increases the flock complexity until $qL_0$ reaches $L_1$, where $L_1$
is the size of the flock. Thus we can increase the complexity of the whole at the cost of
reducing the complexity of the components.

If the movements of the sheep are independent of each other, then the flock
displacements (the displacements of its center of mass) are of characteristic size
$\sigma/\sqrt{N}$ (see Eq. 5.2.21). We might be concerned that the flock will disperse. However,
as in our discussions of polymers in Section 5.2, interactions that would keep the sheep
together need not affect the motion of their center of mass. We could also introduce into
our model a circular reflecting boundary (a moving pen) around the flock, with its
center at the center of mass. Since the motion of the sheep with this boundary does not
require additional information over that without it, the complexity is the same. In
either case, the complexity of flock motion ($L > L_1$) is obtained as:

$$C(L,T) = 2N_T\frac{qL_0^2}{NL^2} \qquad L > \sigma \tag{8.3.57}$$

This is valid for all L if $\sigma$ is less than $L_1$. If we choose T to be very large,
Eq. (8.3.56) applies, with $\sigma$ replaced by $\sigma/\sqrt{N}$. We see that when the motions
of the sheep are independent, the flock complexity is much lower than before: it decreases
inversely with the number of sheep when $L > \sigma$. Even in this case, however, increasing q
increases the flock complexity. Thus coherence in the behavior of a single sheep in time, or
coherence between different sheep, increases the complexity of the flock. However, the
maximum complexity of the flock is just that of an individual sheep, and this arises
only for coherent behavior when all movements are visible on the scale of the flock.
Any movements of an individual sheep that are smaller than the scale of the flock
disappear on the scale of the flock. Thus even for coherent motion, in general the flock
complexity is smaller than the complexity of a sheep.

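The leading-order results of Eqs. (8.3.54) to (8.3.57) can be collected into a few lines
of Python. This is only a sketch under stated assumptions: the function and parameter
names (`sheep_complexity`, `n_steps` for $N_T$, `n_sheep` for $N$) are ours, the logarithm
is taken to base 2 (the text does not fix the base), and the regime boundaries are applied
as sharp cutoffs even though the derivation is approximate near them.

```python
import math


def blob_sigma(q, T, L0=1.0, T0=1.0):
    """Blob width sigma = sigma_0 * sqrt(T / (q*T0)), with sigma_0 = q*L0."""
    return q * L0 * math.sqrt(T / (q * T0))


def sheep_complexity(L, T, q, n_steps, L0=1.0, T0=1.0):
    """Leading-order complexity profile C(L, T) of one sheep, in bits."""
    base = 2.0 * n_steps / q  # Eq. (8.3.54): 2 bits for every q steps
    if T <= q * T0:
        if L <= q * L0:
            return base  # every change of direction is resolved
        return base * (q * L0 / L) ** 2  # Eq. (8.3.55)
    sigma = blob_sigma(q, T, L0, T0)
    if L >= sigma:
        # Eq. (8.3.55) with its limit generalized to L > sigma
        return 2.0 * n_steps * q * L0 ** 2 / L ** 2
    # Eq. (8.3.56): blob regime; base-2 logarithm is an assumption
    return base * min(1.0, (q * T0 / T) * (1.0 + math.log2(sigma / L)))


def independent_flock_complexity(L, q, n_steps, n_sheep, L0=1.0):
    """Eq. (8.3.57): flock of independently moving sheep, for L above the flock size."""
    return 2.0 * n_steps * q * L0 ** 2 / (n_sheep * L ** 2)
```

With q = 50 and `n_steps` = 1000, the single-sheep value is 40 bits at small scales and
10 bits at L = 100 (both at T = 1). For 100 independent sheep the flock value at L = 200
is 0.025 bits, which illustrates the 1/N suppression discussed above.
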
This example illustrates the effect of coherent behavior. However, we see that
even with coherent motion the complexity of a flock at its scale cannot be larger than
the complexity of the sheep at its own scale. This is a problem for us, because our
study of complex systems is focused upon systems whose complexity is larger than
that of their components. Without this possibility, there would be no complex systems. To
obtain a higher complexity of the whole we must modify this model. We must assume
more generally that the motion of a sheep is describable using a set of patterns of
behavior. Coherent motion of sheep still leads to a similar (or lower) complexity. To
increase the complexity, the motion of the flock must have more complex patterns of
motion. In order to achieve such patterns, the motions of the individual sheep must
be neither independent nor coherent: they must be correlated motions that
combine patterns of sheep motion into the more complex patterns of flock motion. This
is possible only if there are interactions between them, which have not been included
here. It should now be clear that the objective of learning how the complexity of a
system is related to the complexity of its components is central to our study of complex
systems.

Figure 8.3.4 The complexity profile, C(L) against L, is plotted for a model of the movement
of sheep as part of a flock. Increasing the distance a sheep moves in a straight line
(coherence of motion in time), q, decreases the complexity at small length scales and
increases the complexity at large length scales. Solid lines and dashed lines show the
complexity profile as a function of length scale for a time scale $T = 1$ and $T = 500$
respectively. Curves are shown for q = 50 and q = 100.

Question 8.3.6 Throughout much of this book our working definition
of complex systems or complex organisms, as articulated in Section 1.3
and developed further in Chapter 2, was that a complex system has a
behavior that is dependent on all of its parts. In particular, it is impossible to
take part of a complex organism away without affecting the behavior of the
whole and the behavior of the part. How is this definition related to the
definition of complexity articulated in this section?

Solution 8.3.6 Our quantitative concept of complexity is a measure of the
information necessary to describe the system behavior on its own length
scale. If the system behavior is complex, then it must require many
parameters to describe. These parameters are related to the description of the
system on a smaller length scale, where the parts of the system are manifest
because we can distinguish the description of one part from another. To do
this we limit $P_{L,T}(n(x,t))$ to the domain of the part. The behavior of a
system is thus related to the behavior of the parts. The more these are relevant
to the system behavior, the greater is the system complexity. The
information that describes the system behavior must be relevant on every smaller
length scale. Thus, we have a direct relationship between the definition of a
complex system in terms of parts and the definition in terms of
information. Ultimately, the information necessary to describe the system behavior
is determined by the microscopic description of atomic positions and
motions. The more complex a system is, the more its behavior depends on
smaller scale components.

Question 8.3.7 When we defined interdependence we did not consider
the dependence of an animal on air as a relevant example. Explain.

Solution 8.3.7 We can now recognize that the use of information as a
characterization of behavior enables us to distinguish various forms of
dependency. In particular, we see that the dependence of an animal on air is
simple, since the necessary properties of air are simple to describe. Thus,
the degree of interdependence of two systems should be measured as the
amount of information necessary to replace one in the description of the
other.

8.3.6 Behavioral complexity

Our ability to describe a system arises from measurements or observations of its
behavior. The use of system descriptions to define system complexity does not directly
take this into account. The complexity profile brought us closer by acknowledging the
observer in the space and time scale of the description. By acknowledging the scale of
observation, we obtained a mechanism for distinguishing complex systems from
equilibrium systems, and a systematic method for characterizing the complexity of a
system. There is another approach to reaching the complexity profile that
incorporates the observer and system relationship in a more satisfactory manner. It also
enables us to consider directly the interaction of the system with its environment, which
was not included previously. To introduce the new approach, we return to the
underpinning of descriptive complexity and present the concept of behavioral complexity.

In Shannon’s approach to the study of information in communication systems,
there were two quantities of fundamental interest. The first was the information
content of an individual message, and the second was the average information provided
by a particular source. The discussion of algorithmic complexity was based on a
consideration of the information provided by a particular message, specifically how
much it could be compressed. This carried over into our discussion of physical
systems when we introduced the microscopic complexity of a system as the information
contained in a particular microscopic realization of the system. When all messages, or
all system states, have the same probability, then the information in the particular
message is the same as the average information, and we can write:

$$I(\{x,p\} \mid (U,N,V)) = -\log P(\{x,p\}) = -\sum_{\{x,p\}} P(\{x,p\})\log(P(\{x,p\})) \tag{8.3.58}$$

The expression on the right, however, has a different purpose. It is a quantity that
characterizes the ensemble rather than the individual microstate. It is a
characterization of the source rather than of any particular message.

We can pursue this line of reasoning by considering more carefully how we might
characterize the source of the information, rather than the messages. One way to
characterize the source is to determine the average amount of information in a message.
However, if we want to describe the source to someone, the most essential
information is a description of the kinds of messages that will be received, that is, the
ensemble of possible messages. Thus to characterize the source we need a description of
the probability of each kind of message. How much information do we need to
describe these probabilities? We call this the behavioral complexity of the source.

A few examples in the context of a source of messages will serve to illustrate this
concept. Any description of a source must assume a language that is to be used. We
assume that the language consists of a list of characters or messages that can be
received from the source, along with their probabilities. A delimiter (:) is used to
separate the messages from their probability. For convenience, we will write probabilities
in decimal notation. A second delimiter (,) is used to separate different members of
the list. A source that gives zeros and ones at random with equal probability would be
described by {1:0.5,0:0.5}. It is convenient to include the length of a message in our
description of the source. Thus we might describe a source with length N = 1000
character messages, each character zero or one with equal probability, as:
{1000(1:0.5,0:0.5)}. The message complexity of this source would be given by N, the
length of a message. However, the behavioral complexity is given by (in this language): two
decimal digits, two characters (1, 0), the number representing N (requiring log(N)
characters) and several delimiters. We could also specify an ASCII language source by a
table of this kind that would consist of 256 elements and the probabilities of their
occurrence in some database. We see that the behavioral complexity is quite distinct
from the complexity of the messages provided by a source. In particular, in the above
example it can be larger, if N = 1, or it can be much smaller, if N is large.

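The contrast between the two quantities can be made concrete with a short Python sketch.
It writes a source description in the list language described above and computes the
average information of one message. The helper names `describe_source` and `message_bits`
are hypothetical, and counting the characters of the description is only a rough stand-in
for the behavioral complexity (it mixes characters with bits), so the comparison is
indicative rather than exact.

```python
import math


def describe_source(n_chars, probs):
    """Write a source description such as {1000(1:0.5,0:0.5)}.

    n_chars is the message length N; probs maps each character to its probability.
    """
    body = ",".join(f"{symbol}:{p}" for symbol, p in probs.items())
    return "{" + f"{n_chars}({body})" + "}"


def message_bits(n_chars, probs):
    """Average information in one message: N times the per-character entropy, in bits."""
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return n_chars * entropy
```

For a fair binary source, `describe_source(1000, {"1": 0.5, "0": 0.5})` returns the
19-character string `{1000(1:0.5,0:0.5)}` while each message carries 1000 bits. For N = 1
the description takes 16 characters while a message carries a single bit, which is the
sense in which the behavioral complexity can be either smaller or larger than the
message complexity.
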
This definition of the behavioral complexity of a source runs into a minor
problem, because the probabilities are real numbers and would generally require arbitrary
numbers of digits to describe. To overcome this problem, there must be a convention
assumed about the limit of precision that is desired in describing the source. In
principle, this precision is related to the number of messages that might be received. This
convention could be part of the language, or could be defined by the specification
itself. The description of the source can also be compressed using the principles of
algorithmic complexity.

As we found above, the behavioral complexity can be much smaller than the
information complexity of a particular message: if the source provides many random
digits, the complexity of the message is high but the complexity of the source is low
because we can characterize it simply as a source of random numbers. However, if the
probability of each message must be independently specified, the behavioral
complexity of a source is much larger than the information content of a particular
message. If a particular message requires N bits of information, then the number of
possible messages is $2^N$. Listing all of the possible messages requires $N2^N$ bits, and
specifying each probability with Q bits would give us a total of $(N+Q)2^N$ bits to
describe the source. This could be reduced if the messages are placed in an agreed-upon
order; then the number of bits is $Q2^N$. This is still exponentially larger than the
information in a particular message. Thus, the complexity of an arbitrary source of
messages of a particular length is much larger than the complexity of the messages it sends.

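The counting in this paragraph is easy to check numerically. The function below is a
minimal sketch; its name and the `ordered` flag are ours, introduced only for
illustration.

```python
def source_description_bits(n_bits, q_bits, ordered=False):
    """Bits needed to describe an arbitrary source of n_bits-long messages.

    Each of the 2**n_bits possible messages gets a probability written with q_bits.
    If the messages are listed explicitly, each entry also costs n_bits;
    in an agreed-upon order (ordered=True) only the probabilities are needed.
    """
    n_messages = 2 ** n_bits
    per_message = q_bits if ordered else n_bits + q_bits
    return per_message * n_messages
```

For N = 10 and Q = 8 this gives 18432 bits with the messages listed and 8192 bits with an
agreed-upon order, compared with only 10 bits for any single message.
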
We are interested in the behavioral complexity when our objective is to use the
messages that we receive to understand the source, rather than to make use of the
information itself. Behavioral complexity becomes particularly useful when it is smaller
than the complexity of a message, because it enables us to anticipate or predict the
behavior of the source.

We now apply these thoughts about the source as the system of interest, rather
than the message as the system of interest, to a discussion of the properties of
physical systems. To make the connection between source and system, we consider an
observer of a physical system who performs a number of measurements. We might
imagine the measurements to consist of subjecting the system to light at various
frequencies and measuring their scattering and reflection (looking at the system),
observations of animals in the wild or in captivity, or physical probes of the system. We
consider each measurement to be a message from the system to the observer. We must,
however, take note that any measurement consists of two parts, the conditions or
environment in which the observation was performed and the behavior of the system
under these conditions. We write any observation as a pair (e,a), where e represents
the environment and a represents a measurement of system properties (action)
under the circumstances of the environment e. The observer, after performing a number
of measurements, writes a description of the observations. This description
characterizes the system. It captures the properties of the list of measurements, rather than
of one particular measurement. It may or may not explicitly contain the information
of each measurement. Alternatively, it may assign probabilities to a particular
measurement. We would like to define the behavioral complexity as the amount of
information contained in the observer’s description. However, we must be careful how we
do this because of the presence of the environmental description e.

In order to clarify this point, and to make contact between behavioral
complexity and our previous discussion of descriptive complexity, we first consider the
physical system of interest to be essentially isolated. Then the environmental description
is irrelevant, and an observation consists only of the system measurement a. The list
of measurements is the set {a}. In this case it is relatively easy to see that the
behavioral complexity of a physical system is its descriptive complexity: the set of all
measurements characterizes completely the state of the system.

If the entire set of measurements is performed at a single instant, and has
arbitrary precision, then the behavioral complexity is the microstate complexity of the
system. The result of any measurement can be obtained from a description of the
microstate, and the set of possible measurements determines the microstate.

For a set of measurements performed over time on an equilibrium system, the
behavioral complexity is the ensemble complexity, the number of parameters
necessary to specify its ensemble. A particular message is a measurement of the system
properties, which in principle might be detailed enough to determine the
instantaneous positions and momenta of all of the particles. However, the list of
measurements is determined by the ensemble of states the system might have. As in
Section 8.3.1, we conclude that the complexity of an equilibrium system is the
complexity of describing its ensemble: specifying (U,N,V) and other parameters like
magnetization that result from the breaking of ergodicity. For a glass, the ensemble
information is the information in the frozen coordinates previously defined as the
complexity. More generally, for a set of measurements performed over an interval of
time T (or at one instant but with time determination error T) and with spatial
position determination errors given by L, we recover the complexity profile.

We now return to consider a system that is not isolated but subject to an
environmental influence so that an observation consists of the pair (e,a) (Fig. 8.3.5). The
complexity of describing such messages also contains the complexity of the
environment e. Does this mean that our system description must include its environment and
that the complexity of the system is dependent on the complexity of the environment?

Complex systems or simple systems interact and respond to the environment in
which they are found. Since the system response a is dependent on the environment
e, there is no doubt that the complexity of a is dependent on the complexity of e. Three
examples illustrate how the environmental influence is important. The tail of a dog
has a particular motion that can be described, and the complexity can be
characterized. However, we may want to attribute much of this complexity to the rest of the dog
rather than to the tail. Similarly, the motion of a particle suspended in a liquid follows
Brownian motion, the description of which might be better attributed to the liquid
than to the particle. Clearer yet is the example of the behavior of a basketball during
a basketball game. These examples generalize to the consideration of any system,
because measuring the properties of a system in an environment may cause us to be
measuring the influence of the environment, rather than the system. The observer
must describe the system behavior as a response to a particular environment, rather
than just the behavior itself. Thus, we do not characterize the system by a list of
actions {a} but rather by the list of pairs {(e,a)}, where our concern is to describe f, the
functional mapping a = f(e) from the environment e to the response a. Once we
realize this, we can again affirm that a full microscopic description of the physical system
is enough to give all system responses. The point is that the complexity of a system
should not include the complexity of the influence upon it, but just the complexity of
its response. This response is a property of the system and is determined by a
complete microscopic description. Conversely, a full description of behavior subject to all
possible environments would require complete microscopic information.

However, within a range of environments and with a desired degree of precision
(spatial and temporal scale) it is possible to provide less information and still describe
the behavior. We consider the ensemble of messages (measurements) to have possible
times of observation over a range of times given by T and errors in position
determination L. Describing the ensemble of responses gives us the behavioral complexity
profile $C_b(L,T)$.

When the influence of the environment is not important, $C(L,T)$ and $C_b(L,T)$
are the same. When the environment matters, it is also important to characterize the
information that is relevant about the environment. This is related to the problem of
prediction, because predicting the system behavior in the future requires information
about the environment. As we have defined it, the descriptive complexity is the
information necessary to predict the behavior of the system over the time interval $t_2 - t_1$.
We can characterize the environmental influence by generalizing Eq. (8.3.47) to
include a term that describes the rate of information transfer from the environment to
the system:

$$\begin{aligned} C(L,T) &= C_b(L,T) + N_T C_e(L,T) \\ C_b(L,T) &= C_b(L) + C_t(L,T) + N_T k \sum_{i:h_i>0} h_i \end{aligned} \tag{8.3.59}$$

where $C_e(L)/k\ln(2)$ is the information about the environment necessary to predict the
state of the system at the next time step, and $C_b(L)$ is the behavioral complexity at one
time interval. Because the system itself is finite, the amount of information about the
universe that is relevant to the system behavior in any interval of time must also be
finite. We note that because the system affects the environment, which then affects the
system, Eq. (8.3.59) as written may count information more than once. Thus, this
expression as written is an upper bound on the complexity. We noted this point also
with respect to the Lyapunov exponents after Eq. (8.3.47).

Figure 8.3.5 The observation of system behavior involves measurements both of the system’s
environment, e, and the system’s actions, a, in response to this environment. Thus we should
characterize a system as a function, a = f(e), where the function f describes its actions in
response to its environment. It is generally simpler to describe a model for the system
structure, which is also a model of f, rather than a list of all of its environment-action (e, a)
pairs. (Diagram labels: System, System’s Environment, Observer, message (action), e, a.)

This use of behavior/response rather than a description to characterize a system
is related to the use of response functions in physics, or input/output relationships to
describe artificial systems. The response function can (in principle) be completely
derived from the microscopic description of a system. It is more directly relevant to the
system behavior in response to environmental influences, and thus is essential for
direct comparison with experimental results.

Behavioral complexity suggests that we should consider the system behavior as
represented by a function a = f(e). The input to the function is a description of the
environment; the output is the response or action. There is a difficulty with this
approach in that the complexity of functions is generically much larger than that of the
system itself. From the discussion in Section 8.2.3 we know that the description of a
function would require an amount of information given by $C_f = C_a 2^{C_e}$, where $C_e$ is the
environmental complexity, and $C_a$ is the complexity of the action. Because the
environmental influence leads to an exponentially large complexity, it is clear that often
the most compact description of the system behavior will give its structure rather
than its response to all inputs. Then, in principle, the response can be derived from
the structure. This also implies that the behavior of physical systems under different
environments cannot be independent. We note that these conclusions must also
apply to human beings as complex systems that respond to their environment (see
Question 8.3.8).

Question 8.3.8 Discuss the following statements with respect to human
beings as complex systems: “The most compact description of the
system behavior will give its structure rather than its response to all inputs,” and
“This implies that the behavior of physical systems under different
environments cannot be independent.”

Solution 8.3.8 The first statement is relevant to the discussion of
behaviorism as an approach to psychology (see Section 3.2.8). It says that the idea of
describing human behavior by cataloging reactions to environmental stimuli
is ultimately an inefficient approach. It is more effective to use such
measurements to construct a model for the internal functioning of the individual and
use this model to describe the measured responses. The model description is
much more concise than the description of all possible responses.

Moreover, from the second statement we know that the model can
describe the responses to circumstances that have not been measured. This also
means that the use of such models may be effective in predicting the
behavior of an individual. Specifically, the reactions of a human being are not
independent of past reactions to other circumstances. A model that
incorporates the previous behaviors may have some ability to predict the behavior
in new circumstances. This is part of what we do when we interact with other
individuals: we construct models that represent their behavior and then
anticipate how they will react to new circumstances.

The coupling between the reaction of a human being under one
circumstance and the reaction under a different circumstance is also relevant to
our understanding of human limitations. Optimizing the response through
adaptation to a set of environments according to some goal is a process that
is limited in its effectiveness due to the coupling between responses to
different circumstances. An individual who is effective in some circumstances
may have qualities that lead to ineffective behavior under other
circumstances. We will discuss this in Chapter 9 in the context of considering the
specialization of human beings in society. This point is also applicable more
generally to living organisms and their ability to consume resources and
avoid predators as discussed in Chapter 6. Increasing complexity enables an
organism to be more effective, but the effectiveness under a variety of
circumstances is limited by the interdependence of responses. This is relevant
to the observation that living organisms generally consume limited types of
resources and live in particular ecological niches.

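The claim that a model is more concise than a catalog of responses can be illustrated
with a toy system. The sketch below assumes an environment described by a fixed number
of bits and a one-bit action; the `majority` rule is a purely hypothetical system chosen
for illustration, and `table_bits` simply evaluates the counting formula
$C_f = C_a 2^{C_e}$ quoted from Section 8.2.3.

```python
from itertools import product


def majority(e):
    """Hypothetical system: respond 1 if most environment bits are 1, else 0."""
    return int(2 * sum(e) > len(e))


def response_table(f, env_bits):
    """Catalog every environment-action pair (e, a) for env_bits-bit environments."""
    return {e: f(e) for e in product((0, 1), repeat=env_bits)}


def table_bits(env_bits, action_bits):
    """C_f = C_a * 2**C_e: bits needed to list one action for each environment."""
    return action_bits * 2 ** env_bits
```

The catalog for a 4-bit environment already has 16 entries, and for a 20-bit environment
it would need 1048576 bits, while the two-line rule `majority` stays the same size and
also gives the response to environments that were never tabulated.
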
8.3.7 The observer and recognition

The explicit existence of an obser ver in the definition of behavioral complexity en
ables us to further consider the role of the observer in the definition of complexity.
What assumptions have we made about the propert ies of the obser ver? One of the as
sumptions that we have made is that the obser ver is more complex than the syst em.
What happens if the complexity of the syst em is greater than the complexity of the
obser ver? The complexity of an obser ver is the number of bits that may be used to de
scribe the obser ver. If the obser ver is describ ed by fewer bits than are needed to d e
scribe the syst em, then the obser ver will be unable to contain the description of the
system that is being obser ved. In this case,the obser ver will constr uct a descript ion of
the system that is simpler than the syst em actually is. There are several possible ways
that the observer may simplify the descript ion of the system. One is to reject the
observation of all but a few kinds of messages. Another is to artificially limit the
length of messages described. A third is to treat complex variability of the source as
random—described by simple probabilities. These simplifications are often made in
our modeling of physical systems.
An inherent problem in discussing behavioral complexity using environmental
influence is that it is never possible to guarantee that the behavior of a system has been
fully characterized. For example, a rock can be described as “just sitting there,” if we
want to describe the complexity of its motion under different environments. Of
course the nature of the environment could be changed so that other behaviors will
be realized. We may, for example, discover that the rock is actually a camouflaged animal.
This is an inherent problem in behavioral complexity: it is never possible to
characterize with certainty the complexity of a system under circumstances that have
not been measured. All such conclusions are extrapolations. Performing such extrapolations
is an essential part of the use of the description of a system. This is a general
problem that applies to quantitative scientific modeling as well as the use of experience
in general.
Finally, we describe the relevance of recognition to complexity. The first
comment is related to the recognition of sets of numbers introduced briefly in
Section 8.2.3. We introduced there the concept of recognition complexity of a set that
relies upon a recognizer (a special kind of TM called a predicate that gives a single bit
output) that can identify the system under discussion. Specifically, when presented
with the system it says, “This is it,” and when presented with any other system it says,
“This is not it.” We define the complexity of a system (or set of systems) as the complexity
of the simplest recognizer of the system (or set of systems). There are some interesting
features of this definition. First we realize that this definition is well suited to
describing classes of systems. A description or model of a class of systems must identify
common attributes rather than specific behaviors. A second interesting feature is
that the complexity of the recognizer depends on the possible universe of systems that
it can be presented with. For example, the complexity of recognizing cows depends on
whether we allow ourselves to present the recognizer with all domestic animals, all
known biological organisms on earth, all potentially viable biological organisms, or
all possible systems. Naturally, this is an important issue in the field of pattern recognition,
where the complexity of designing a system to recognize a particular pattern
is strongly dependent on the universe of possibilities within which the pattern must
be recognized. We will return to this point later when we consider the properties of
human language in Section 8.4.1.
A different form of complexity related to recognition may be abstracted from the
Turing test of artificial intelligence. This test suggests that we will achieve an artificial
representation of intelligence when it becomes impossible to determine whether we
are interacting with an artificial or actual human being. We can assume that Turing
had in mind only a limited type of interaction between the observer “we” and the systems
being observed—either the real or artificial representation of a human being.
This test, which relies upon an observer to recognize the system, can serve as the basis
for an additional definition of complexity. We determine the minimal possible
complexity of a model (simulated representation) of the system which would be recognized
by a particular observer under particular circumstances as the system. The
complexity of this model we call the substitution complexity. The sensitivity of this
definition to the nature of the observer and the conditions of the observation is manifest.
In some ways this definition, however, is implicit in all of our earlier definitions.
In all cases, the complexity measures the length of a representation of the system.
Ultimately we must determine whether a particular representation of the system is
faithful. The “we” in the previous sentence is some observer that must recognize the
system behavior in the constructed representation.
We conclude this section by reviewing some of the main concepts that were introduced.
We noted the sensitivity of complexity to the spatial and temporal scale relevant
to the description or response. The complexity profile formally takes this into
account. If necessary, we can define the unique complexity of a system to be its complexity
profile evaluated at its own scale. A more complete characterization of the system
uses the entire complexity profile. We found that the mathematical models most
closely associated with complexity—chaos and fractals—were both relevant. The former
described the influence of microscopic information over time. The latter described
the gradual rather than rapid loss of information with spatial and temporal
scale. We also reconciled the notion of information as a measure of system complexity
with the notion of complex systems as composed out of interdependent parts. Our
next objective is to concretize this discussion further by estimating the complexity of
particular systems.
8.4 Complexity Estimation
There are various difficulties associated with obtaining specific values for the complexity
of a particular system. There are both fundamental and practical problems.
Fundamental problems such as the difficulty in determining whether a representation
is maximally compressed are important. However, before this is an issue we must first
obtain a representation.
One approach to obtaining the complexity of a system is to construct a representation.
The explicit representation should then be used to make a simulation to
show that the system behavior is reproduced. If it is, then we know that the length of
the representation is an upper bound on the complexity of the system. We can hope,
however, that it will not be necessary to obtain explicit representations in order to estimate
complexities. The objective of this section is to discuss various methods for estimating
the complexity of systems with which we are familiar. These approaches
make use of representations that we cannot simulate; however, they do have recognizable
relationships to the system.
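The upper-bound argument can be illustrated for character strings, where a "representation" is simply a compressed encoding. The sketch below is not the author's method: it assumes the general-purpose zlib compressor as a stand-in for an explicit representation, and the helper name `compressed_bits` is hypothetical. The compressed length (plus the fixed size of the decompressor) bounds the complexity from above; it says nothing about whether a shorter representation exists.

```python
import os
import zlib


def compressed_bits(data: bytes) -> int:
    """Length in bits of a zlib-compressed copy of data (an upper bound only)."""
    return 8 * len(zlib.compress(data, 9))


# A highly regular string has a representation far shorter than its raw length.
regular = b"ab" * 5000
# Random bytes essentially do not compress; the bound stays near the raw length.
random_like = os.urandom(10000)

for name, data in [("regular", regular), ("random", random_like)]:
    print(name, "raw bits:", 8 * len(data), "bound:", compressed_bits(data))
```

Decompressing the output and comparing it with the original plays the role of the simulation step in the text: it confirms that the representation reproduces the system.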
Measuring complexity is an experimental problem. The only reason that we are
able to discuss the complexity of various systems is that we have already made many
measurements of the properties of various systems. We can make use of the existing
information to construct estimates of their complexity. A specific estimation method
is not necessarily useful for all systems.
Our objective in this section is limited to obtaining “ballpark” estimates of the
complexity of systems. This means that our errors will be in the exponent rather
than in the number itself. We would be very happy to have an estimate of complexity
such as $10^{3\pm1}$ or $10^{7\pm2}$. When appropriate, we keep track of half-decades using factors
of three, such as in $3 \times 10^{4}$. These rough estimates will give us a first impression of
the degree of complexity of many of the systems we would like to understand. They
tell us how difficult (very roughly) the systems are to describe. We will discuss three
methods—(1) use of intuition and human language descriptions, (2) use of a natural
representation tied to the system's existence, where the principal example is the
genome of living organisms, and (3) use of component counting. Each of these
methods has flaws that will limit our confidence in the resulting estimates. However,
since we are trying to find rough estimates, we can still take advantage of them.
Consistency of different methods will give us some confidence in our estimates of
complexity.
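The half-decade convention amounts to rounding the base-10 logarithm to the nearest 0.5, since $10^{0.5} \approx 3$. A minimal sketch of that bookkeeping follows; the function name `half_decade` and the output format are illustrative choices, not part of the text.

```python
import math


def half_decade(x: float) -> str:
    """Round a positive number to the nearest half-decade: 10^n or 3x10^n."""
    steps = round(2 * math.log10(x))      # number of half-decades above 10^0
    exponent, half = divmod(steps, 2)
    return f"3x10^{exponent}" if half else f"10^{exponent}"


print(half_decade(30000))   # 3x10^4
print(half_decade(9e7))     # 10^8
```

At this precision, values that differ by less than a factor of about 1.8 are often reported as the same estimate.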
While we will discuss the complexity of various systems, our focus will be on determining
the complexity of a human being. Our final estimate, $10^{10\pm2}$ bits, will be obtained
by combining the results of different estimation techniques in the following
sections. The implications of obtaining an estimate of human complexity will be discussed
in Section 8.4.4. We start, however, by noting that the complexity of a human
being can be bounded by the physical entropy of the collection of atoms from which
he or she is formed. This is roughly the entropy of a similar weight of water, about $10^{31}$
bits. This is the value of $S / (k \ln 2)$. As usual, we have assumed that there is nothing associated
with a human being except the material of which he or she is formed, and
that this material is described by known physical law. This entropy is an upper bound
to the information necessary to specify the complete human being. The meaning of
this number is that if we take away the person and we replace all of the atoms according
to a specification of $10^{31}$ bits of information, we have replaced microscopically
each atom where it was. According to our understanding of physical law, there can be
no discernible difference. We will discuss the implications for artificial intelligence in
Section 8.4.4, where we consider whether a computer could simulate the dynamics of
atoms in order to simulate the behavior of the human being.
The entropy of a human being is much larger than the complexity estimate we
are after, because we are interested in the complexity at a relevant spatial and temporal
scale. In general we consider the complexity of a system at the natural scale defined
in Section 8.3.5, one-tenth the size of the system itself, and the relaxation time of the
behavior on this same length scale. We could also define the complexity by the observer.
For example, the maximum visual sensitivity of a human being is about 1/100
of a second and 0.1 mm. For either case, observing only at this spatial and temporal
scale decreases dramatically the relevance of the microscopic description. The reduction
in information is hard to estimate directly. To estimate the relevant complexity,
we must consider other techniques. However, since most of the information in the entropy
is needed to describe the position of molecules of water undergoing vibrations,
we can guess that the complexity is significantly smaller than the entropy.
8.4.1 Human intuition—language and complexity
The first method for estimation of complexity—the use of human intuition and language—is
the least controlled/scientific method of obtaining an estimate of the complexity
of a system. This approach, in its most basic form, is precisely what was asked
in Question 8.2.1. We ask someone what they believe the complexity of the system is.
It is assumed that the person we ask is somewhat knowledgeable about the system and
also about the problem of describing systems. Even though it appears highly arbitrary,
we should not dismiss this approach too readily because human beings are designed
to understand complex systems. It could be argued that much of our development is
directed toward enabling us to construct predictive models of various parts of the environment
in which we live. The complexity of a system is directly related to the
amount of study we need in order to master or predict the behavior of a system. It is
not accidental that this is the fundamental objective of science—behavior prediction.
We are quite used to using the word “complexity” in a qualitative manner and even in
a comparative fashion—this is more complex or less complex than something else.
What is missing is the quantitative definition. In order for someone to give a quantitative
estimate of the complexity of a system, it is necessary to provide a definition of
complexity that can be readily understood.
One useful and intuitive definition of complexity is the amount of information
necessary to describe the behavior of a system. The information can be quantified in
terms of representations people are familiar with—the amount of text / the number of
pages / the number of books. This can be sufficient to cause a person to build a rough
mental model of the system description, which is much more sophisticated than
many explicit representations that might be constructed. There is an inherent limitation
in this approach mentioned more generally above—a human being cannot directly
estimate the complexity of an organism of similar or greater complexity than a
human being. In particular, we cannot use this approach directly to estimate the complexity
of human beings. Thus we will focus on simpler animals first. For example, we
could ask the question in the following way: How much text is necessary to describe
the behavior of a frog? We might emphasize for clarification that we are not interested
in comparative frogology, or molecular frogology. We are just interested in a description
of the behavior of a frog.
To gain additional confidence in this approach, we may go to the library and find
descriptions that are provided in books. Superficially, we find that there are entire
books devoted to a particular type of insect (mosquito, ant, butterfly), as there are
books devoted to the tiger or the ape. However, there is a qualitative difference between
these books. The books on insects are devoted to comparative descriptions,
where various types of, e.g., mosquitoes, from around the world, their physiology,
and/or their evolutionary history are described. Tens to hundreds of types are compared
in a single book. Exceptional behaviors or examples are highlighted. The
amount of text devoted to the behavior of a particular type of mosquito could be
readily contained in less than a single chapter. On the other hand, a book devoted to
tigers may describe only behavior (e.g., not physiology), and one devoted to apes
would describe only a particular individual in a manner that is limited to only part of
its behaviors.
Does the difference in texts describing insects and tigers reflect the social priorities
of human beings? This appears to be difficult to support. The mosquito is much
more relevant to the well-being of human beings than the tiger. Mosquitoes are easier
to study in captivity and are more readily available in the wild. There are films that
enable us to observe the mosquito behavior at its own scale rather than at our usual
larger scale. Despite such films, there is no book-length description of the behavior of
a mosquito. This is true despite the importance of knowledge of its behavior to prevention
of various diseases. Even if there is some degree of subjectivity to the complexity
estimates obtained from the lengths of descriptions found in books, the use of
existing books is a reasonable first attempt to obtain complexity estimates from the
information that has been compiled by human beings. We can also argue that when
there is greater experience with complexity and complexity estimation, our ability to
use intuition or existing texts will improve, and these will become important tools in
complexity estimation.
Before applying this methodology, however, we should understand more carefully
the basic relationship of language to complexity. We have already discussed in
Section 1.8 the information in a string of English characters. A first estimate of 4.8 bits
per character could be based upon the existence of 26 letters and 1 space. In
Question 1.8.12, the best estimate obtained was 3.3 bits per character using a Markov
chain model that included correlations between adjacent characters. To obtain an
even better estimate, we need to have a model that includes longer-range correlations
between characters. The most reliable estimates have been obtained by asking people
to guess the next character in an English text. It is assumed that people have a highly
sophisticated model for the structure of English and that the individual has no specific
knowledge of the text. The guesses were used to establish bounds on the information
content. We can summarize these bounds as 0.9±0.3 bits/character. For our
present discussion, the difference between high and low bounds (a factor of 2) is not
significant. For convenience we will use 1 bit/character for our conversion factor. For
larger quantities of text, this corresponds to values given in Table 8.4.1.
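The first estimate quoted above is the information per character when all 27 symbols are equally likely and independent, $\log_2 27$. The short sketch below reproduces that number and applies the 1 bit/character conversion factor; the helper name `text_bits` is an illustrative assumption.

```python
import math

ALPHABET_SIZE = 27  # 26 letters plus the space, as in the text

# Upper estimate: every character equally likely and independent of its neighbors.
uniform_bits = math.log2(ALPHABET_SIZE)
print(f"{uniform_bits:.2f} bits/character")   # 4.75, quoted as 4.8 in the text


def text_bits(n_chars: int, bits_per_char: float = 1.0) -> float:
    """Information in n_chars of English at the chosen conversion factor."""
    return n_chars * bits_per_char


print(text_bits(3000))   # one page of about 3000 characters -> 3000.0 bits
```

Correlations between characters can only lower the estimate, which is why the Markov-chain and guessing-game values fall below this figure.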
Our estimate of information in text has assumed a strictly narrative English text.
We should also be concerned about figures that accompany descriptive materials. Does
the conventional wisdom of “a picture is worth a thousand words” make sense? We can
consider this both from the point of view of direct compression of the picture, and the
possibility of replacing the figure by descriptive text. A thousand words corresponds
to $5 \times 10^{3}$ characters or bits, about two pages of text. Descriptive figures such as graphs
or diagrams often consist of a few lines that can be concisely described using a formula
and would have a smaller complexity. Photographs are formed of highly correlated
graphical information that can be compressed. In a black and white photograph $5 \times 10^{3}$
bits would correspond to a 70 × 70 grid of completely independent pixels. If we recall
that we are not interested in small details, this seems reasonable as an upper bound.
Moreover, the text that accompanies a figure generally describes its essential content.
Thus when we ask the key question—whether two pages of text would be sufficient to
describe a typical figure and replace its function in the text—this seems a somewhat
generous but not entirely unreasonable value. A figure typically occupies half of a page
that would be otherwise occupied by text. Thus, for a highly illustrated book, on average
containing one figure and one-half page of text on each page, our estimate of the
information content of the book would increase from $10^{6}$ bits by a factor of 2.5 to
roughly $3 \times 10^{6}$ bits. If there is one picture on every two pages, the information content
of the book would be doubled rather than tripled. While it is not really essential
for our level of precision, it seems reasonable to adopt the convention that estimates
using descriptions of behavioral complexity include figures. We will do so by increasing
the previous values by a factor of 3 (Table 8.4.1). This will not change any of the
conclusions.

Amount of text           Information in text    Text with figures (bits)
1 char                   1 bit
1 page ≈ 3000 char       $3 \times 10^{3}$ bit  $10^{4}$
1 chapter ≈ 30 pages     $10^{5}$ bit           $3 \times 10^{5}$
1 book ≈ 10 chapters     $10^{6}$ bit           $3 \times 10^{6}$
Table 8.4.1 Information estimates for straight English text and illustrated text.

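The factor of 2.5 follows if each figure is counted as two pages of text and displaces half a page of text. The sketch below works through that arithmetic; the constant and function names are hypothetical, and the 300-page book (10 chapters of 30 pages) is taken from Table 8.4.1.

```python
PAGE_BITS = 3000       # one page of plain text at 1 bit/character
FIGURE_PAGES = 2.0     # a figure counted as two pages of text (the generous value in the text)
FIGURE_AREA = 0.5      # a figure displaces half a page of text


def illustration_factor(figures_per_page: float) -> float:
    """Information per printed page, relative to a page of plain text."""
    text_pages = 1.0 - FIGURE_AREA * figures_per_page
    return text_pages + FIGURE_PAGES * figures_per_page


def book_bits(pages: int, figures_per_page: float) -> float:
    """Bits in a book of the given length and figure density."""
    return pages * PAGE_BITS * illustration_factor(figures_per_page)


print(illustration_factor(1.0))   # 2.5  (one figure per page)
print(illustration_factor(0.5))   # 1.75 (one figure every two pages)
print(book_bits(300, 1.0))        # 2250000.0, i.e. roughly 3 x 10^6
```

At half-decade precision, 1.75 reads as "doubled" and 2.5 as "tripled", which is why a single factor of 3 is adopted for illustrated text.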
There is another aspect of the relationship of language to complexity. A language
uses individual words (like “frog”) to represent complex phenomena or systems (like
the physical system we call a frog). The complexity of the word “frog” is not the same
as the complexity of the frog. Why is this possible? According to our discussion of algorithmic
complexity, the smallest possible representation of a complex system has a
length in bits which is equal to the system complexity. Here we have an example of a
system—frog—whose representation “frog” is manifestly smaller than its complexity.
The resolution of this puzzle is through the concept of recognition complexity
discussed in Section 8.3.7. A word is a member of an ensemble of words, and the systems
that are described by these words are an ensemble of systems. It is only necessary
that the ensemble of words be matched to the ensemble of systems described by the
words, not the whole ensemble of possible systems. Thus, the complexity of a word is
not related to the complexity of the system, but rather to the complexity of specifying
the system—the logarithm of the number of systems that are part of the shared experience
of the individuals who are communicating. This is the central point of recognition
complexity. For a human being with experience and memory of only a limited
number of the set of all complex systems, to describe a system one must identify it
only in comparison with the systems in memory, not with those possible in principle.
Another way to think about this is to consider a human being as analogous to a
special UTM with a set of short representations that the UTM can expand to a specific
limited subset of possible long descriptions. For example, having memorized a
play by Shakespeare, it is only necessary to invoke the name to retrieve the whole play.
This is, indeed, the essence of naming—a name is a short reference to a complex system.
All words are names of more complex entities.
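The statement that the complexity of a word is the logarithm of the number of systems in shared experience can be made concrete with a small calculation. This is a sketch under the assumption that all known systems are equally likely to be named; the vocabulary size used below is hypothetical and chosen only for illustration.

```python
import math


def naming_bits(n_known_systems: int) -> float:
    """Bits needed to single out one system among n equally likely known systems."""
    return math.log2(n_known_systems)


# Hypothetical shared vocabulary of 65,536 named systems.
print(naming_bits(65536))   # 16.0
```

Sixteen bits suffice to pick out "frog" from such a vocabulary, far less than the length of a text describing the behavior of a frog to someone who has never encountered one.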
In this way, language provides a systematic mechanism for compression of information.
This implies that we should not use the length of a word to estimate the complexity
of a system that it refers to. Does this also invalidate the use of human language
to obtain complexity estimates? On one hand, when we are asked to describe the behavior
of a frog, we assume that we must describe it without reference to the name itself.
“It behaves like a frog” is not a sufficient description. There is a presumption that
a description of behavior is made to someone without specific knowledge. An estimate
of the complexity of a frog would be much higher than the complexity of the
word “frog.” On the other hand, the words that would be used to describe a frog also
refer to complex entities or actions. Consistency in different estimates of the amount
of text necessary to describe a frog might arise from the use of a common language
and experience. We could expand the description further by requiring that a person
explain not only the behavior of the frog, but also the meaning of each of the words
used to describe the behavior of the frog. At this point, however, it is more constructive
to keep in mind the subtle relationship between language and complexity as part
of our uncertainty, and take the given estimates at face value. Ultimately, the complexity
of a system is defined by the condition that all possible (in principle) behaviors
of the same complexity could be described using the same length of text. We accept
the possibility that language-based estimates of complexity of biological
organisms may be systematically too small because they are common and familiar. We
may nevertheless have relative complexities estimated correctly.
Finally, we can argue that when we estimate the complexity of systems that approach
the complexity of a human being, the estimation problem becomes less severe.
This follows from our discussion of the universality of complexity given in
Section 8.2.2. Specifically, the more complex a system is, the less relevant specific
knowledge is, and the more universal are estimates of complexity. Nevertheless, ultimately
we will conclude that the inherent compression in use of language for describing
familiar complex systems is the greatest contributor to uncertainty in complexity
estimates.
There is another approach to the use of human intuition and language in estimating
complexity. This is by reference to computer languages. For someone familiar
with computer simulation, we can ask for the length of the computer program that
can simulate the behavior of the system—more specifically, the length of the program
that can simulate a frog. Computer languages are generally not very high in information
content, because there are a few commands and variables that are used throughout
the program. Thus we might estimate the complexity of a program not by characters,
but by program lines at several bits per program line. Consistent with the
definition of algorithmic complexity, the estimate of system complexity should also
include the complexity of the compiler and of the computer operating system and
hardware. Compilers and operating systems are much more complex than many programs
by themselves. We can bypass this problem by considering instead the size of
the execution module—after application of the compiler.
There are other problems with the use of natural or artificial language descriptions,
including:
1. Overestimation due to a lack of knowledge of possible representations. This
problem is related to the difficulty of determining the compressibility of information.
The assumption of a particular length of text presumes a kind of representation.
This choice of representation may not be the most compact. This may
be due to the form of the representation—specifically English text. Alternatively,
the assumption may be in the conceptual (semantic) framework. An example is
the complexity of the motion of the planets in the Ptolemaic (earth-centered)
representation compared to the Copernican (sun-centered) representation.
Ptolemy would give a larger complexity estimate than Copernicus because the
Ptolemaic system requires a much longer description—which is the reason the
Copernican system is accepted as “true” today.
2. Underestimation due to lack of knowledge of the full behavior of the system. If
an individual is familiar with the behavior of a system only under limited circumstances,
the presumption that this limited knowledge is complete will lead to
a complexity estimate that is too low. Alternatively, lack of knowledge may also
result in too high estimates if the individual extrapolates the missing knowledge
from more complex systems.
3. Difficulty with counting. Large numbers are generally difficult for people to
imagine or estimate. This is the advantage of identifying numbers with length of
text, which is generally a more familiar quantity.
With all of these limitations in mind, what are some of the estimates that we have
obtained? Table 8.4.2 was constructed using various books. The lengths of linguistic
descriptions of the behavior of biological organisms range from several pages to several
books. Insects and fish are at pages, frogs at a chapter, most mammals at approximately
a book, and monkeys and apes at several books. These numbers span the
range of complexity estimates.

Animal                   Text length                 Complexity (bits)
Fish                     a few pages                 $3 \times 10^{4}$
Grasshopper, Mosquito    a few pages to a chapter    $10^{5}$
Ant (one, not colony)    a few pages to a chapter    $10^{5}$
Frog                     a chapter or two            $3 \times 10^{5}$
Rabbit                   a short book                $10^{6}$
Tiger                    a book                      $3 \times 10^{6}$
Ape                      a few books                 $10^{7}$
Table 8.4.2 Estimates of the approximate length of text descriptions of animal behavior.

We have concluded that it is not possible to use this approach to obtain an estimate
of human complexity. However, this is not quite true. We can apply this method
by taking the highest complexity estimate of other systems and using this as a close
lower bound to the complexity of the human being. By close lower bound we mean
that the actual complexity should not be tremendously greater. According to our
experience, the complexity estimates of animals tend to extend up to roughly a single
book. Primates may be estimated somewhat higher, with a range of one to tens of
books. This suggests that human complexity is somewhat larger than this latter number—approximately
$10^{8}$ bits, or about 30 books. We will see how this compares to
other estimates in the following sections.
There are several other approaches to estimating human complexity based upon
language. The existence of book-length biographies implies a poor estimate of human
complexity of $10^{6}$ bits. We can also estimate the complexity of a human being by the
typical amount of information that a person can learn. Specifically, it seems to make
sense to base an estimate on the length of a college education, which uses approximately
30 textbooks. This is in direct agreement with the previous estimate of $10^{8}$ bits.
It might be argued that this estimate is too low because we have not included other
parts of the education (elementary and high school and postgraduate education) or
other kinds of education/information that are not academic. It might also be argued
that this is too high because students do not actually know the entire content of 30
textbooks. One reason this number appears reasonable is that if the complexity of a
human being were much greater than this, there would be individuals who would endure
tens or hundreds of college educations in different subjects. The estimate of
roughly 30 textbooks is also consistent with the general upper limit on the number of
books an individual can write in a lifetime. The most prolific author in modern times
is Isaac Asimov, with about 500 books. Thus from such text-based self-consistent evidence
we might assume that the estimate of $10^{8}$ bits is not wrong by more than one
to two orders of magnitude. We now turn to estimation methods that are not based
on text.
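The text-based estimates of this section reduce to multiplying a number of books by the per-book values of Table 8.4.1. The sketch below collects them; which per-book value applies to each case (plain or illustrated) is an assumption made here for illustration, not something the text states.

```python
BOOK_BITS_PLAIN = 1e6        # Table 8.4.1, text only
BOOK_BITS_ILLUSTRATED = 3e6  # Table 8.4.1, text with figures

# Assumed pairing of each estimate with a per-book value.
estimates = {
    "one biography (plain text)": 1 * BOOK_BITS_PLAIN,
    "30 college textbooks": 30 * BOOK_BITS_ILLUSTRATED,
    "500 books (Asimov)": 500 * BOOK_BITS_ILLUSTRATED,
}

for label, bits in estimates.items():
    print(f"{label}: {bits:.1e} bits")
```

Thirty illustrated books give $9 \times 10^{7}$ bits, which rounds to $10^{8}$; the 500-book case lies about one order of magnitude above it, consistent with the stated uncertainty of one to two orders of magnitude.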
8.4.2 Genetic code
Biological organisms present us with a convenient and explicit representation for their
formation by development—the genome. It is generally assumed that most of the information
needed to describe the physiology of the organism is contained in genetic
information. For simplicity we might think of DNA as a kind of program that is interpreted
by decoding machinery during development and operation. In this regard
the genome is much like a Turing machine tape (see Section 1.9), even though the
mechanism for transcription is quite different from the conventional Turing machine.
Some other perspectives are given in Section 7.1. Regardless of how we ultimately view
the developmental process and cellular function, it appears natural to associate with
the genome the information that is necessary to specify physiological design and function.
It is not difficult to determine an upper bound to the amount of information
that is contained in a DNA sequence. Taken at face value, this provides us with an estimate
of the complexity of an organism. We must then inquire as to the approximations
that are being made. We first discuss the approach in somewhat greater detail.
Considering the DNA as an alphabet of four characters provided by the four
nucleotides or bases, represented by A (adenine), T (thymine), C (cytosine) and G (guanine),
a first estimate of the information contained in a DNA sequence would be
$N \log_2 4 = 2N$ bits, where N is the length of the DNA chain. Since DNA is formed of two
complementary nucleotide chains in a double helix, its length is measured in base pairs.
While this estimate neglects many corrections, there are a number of assumptions
that we are making about the organism that give a larger uncertainty than some of the
corrections that we can apply. Therefore, as a rough estimate, this is essentially as good
an estimate as we can obtain from this methodology at present. Specific numbers are
given in Table 8.4.3. We see that for a human being, the estimate is nearly $10^{10}$ bits,
which is somewhat larger than that obtained from language-based estimates in the
previous section. What is more remarkable is that there is no systematic trend of
increasing genome length that parallels our expectations of increasing organism
complexity based on estimates of the last section. Aside from the increasing trend from
bacteria to fungi to animals/plants, there is no apparent trend that would suggest that
genome length is correlated with our expectations about complexity.
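As a rough illustration of the first estimate above, the following Python sketch evaluates $N \log_2 4 = 2N$ for a genome of N base pairs. The function name `genome_bits` and the parameter `alphabet_size` are illustrative choices and not notation from the text; the human genome length of 3 x 10^9 base pairs is the representative value listed in Table 8.4.3.

```python
import math


def genome_bits(n_base_pairs: float, alphabet_size: int = 4) -> float:
    """Upper bound on the information in a sequence: N * log2(alphabet size)."""
    return n_base_pairs * math.log2(alphabet_size)


# Representative human genome length from Table 8.4.3 (order-of-magnitude figure).
HUMAN_GENOME_BP = 3e9

human_bits = genome_bits(HUMAN_GENOME_BP)
print(f"human genome: about {human_bits:.1e} bits")
```

With two bits per base pair, the human value comes to 6 x 10^9 bits, which is the figure the text rounds to "nearly" $10^{10}$ bits.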

Organism              Genome length (base pairs)       Complexity (bits)
Bacteria (E. coli)    $10^6$–$10^7$                    $10^7$
Fungi                 $10^7$–$10^8$                    $10^8$
Plants                $10^8$–$10^{11}$                 $3\times10^8$–$3\times10^{11}$
Insects               $10^8$–$7\times10^9$             $10^9$
Fish (bony)           $5\times10^8$–$5\times10^9$      $3\times10^9$
Frog and Toad         $10^9$–$10^{10}$                 $10^{10}$
Mammals               $2\times10^9$–$3\times10^9$      $10^{10}$
Man                   $3\times10^9$                    $10^{10}$

Table 8.4.3 Estimates of complexity based upon genome length. Except for plants, where
there is a particularly wide range of genome lengths, a single number is given for the
information contained in the genome, because the accuracy does not justify more specific
numbers. Genome lengths and ranges are representative.

We now proceed to discuss limitations in this approach. The list of approximations
given below is not meant to be exhaustive, but it does suggest some of the
difficulties in determining the information content even when there is a clear first
numerical value to start from.
a. A significant percentage of DNA is “noncoding.” This DNA is not transcribed
for protein structures. It may be relevant to the structural properties of DNA. It
may also contain other useful information not directly relevant to protein
sequence. Nevertheless, it is likely that information in most of the base pairs that
are noncoding is not essential for organism behavior. Specifically, they can be
replaced by many other possible base pair sequences without effect. Since
30%–50% of human DNA is estimated to be coding, this correction would
reduce the estimated complexity by a factor of two to three.
b. Direct forms of compression: as presently understood, DNA is primarily utilized
through transcription to a sequence of amino acids. The coding for each amino
acid is given by a triple of bases. Since there are many more triples ($4^3 = 64$) than
amino acids (twenty), some of the sequences have no amino acid counterpart, and
there is more than one sequence that maps onto the same amino acid. This
redundancy means that there is less information in the DNA sequence. Taking this
into account by assigning a triple of bases to one of twenty characters that
represent amino acids would give a new estimate of $(N/3)\log_2 20 = 1.4N$. To improve
the estimate further, we would include the relative probability of the different
amino acids, and correlations between them.
c. General compression: more generally, we can ask how compressed the DNA
encoding of information is. We can rely upon a basic optimization of function in
biology. This might suggest that some degree of compression is performed in
order to reduce the complexity of transmission of the information from generation
to generation. However, this is not a proof, and one could also argue in favor of
redundancy in order to avoid susceptibility to small changes. Moreover, there are
likely to be inherent limitations on the compressibility of the information due to
the possible transcription mechanisms that serve instead of decompression
algorithms. For example, if a molecule that is to be represented has a long chain of the
same amino acid, e.g., asp-asp-asp-asp-asp-asp-asp-asp-asp-
asp-asp-asp-asp-asp-asp-asp-asp-asp, it would be interesting if this could be
represented using a chemical equivalent of (18)asp. This requires a transcription
mechanism that repeats segments, that is, a DNA loop. There are organisms that are
known to have highly repetitive sequences (e.g., $10^7$ repetitions) forming a
significant fraction of their genome. Much of this may be noncoding DNA.
Other forms of compression may also be relevant. For example, we can ask
if there are protein components/subchains that can be used in more than one
protein. This is relevant to the general redundancy of protein design. There is
evidence that the genome does use this property for compression by overlapping
the regions that code for several different proteins. A particular region of DNA
may have several coding regions that can be combined in different ways to obtain
a number of different proteins. Transcription may start from distinct initial
points. Presumably, the information that describes the pattern of transcriptions
is represented in the noncoding segments that are between the coding segments.
Related to the issue of DNA code compression are questions about the
complexity of protein primary structure in relation to its own function: specifically, how
much information is necessary to describe the function of a protein. This may be
much less than the information necessary to specify its primary structure (amino
acid sequence). This discussion is approaching issues of the scale at which
complexity is measured: at the atomic scale where the specific amino acid is relevant,
or at the molecular scale at which the enzymatic function is relevant. We will
mention this limitation again in point (d).
d. Scale of representation: the genome codes for macromolecular and cellular
function of the biological organism. This is much less than the microscopic entropy,
since it does not code the atomic vibrations or molecular diffusion. However,
since our concern is for the organism’s macroscopic complexity, the DNA is likely
to be coding a far greater complexity than we are interested in for multicellular
organisms. The assumption is that much of the cellular chemical activity is not
relevant to a description of the behavior on the scale of the organism. If the DNA
were representing the sum of the molecular or cellular scale complexity of each
of the cells independently, then the error in estimating the complexity would be
quite large. However, the molecular and cellular behavior is generally repeated
throughout the organism in different cells. Thus, the DNA is essentially
representing the complexity of a single cellular function with the additional
complication of representing the variation in this function. To the extent that the
complexity of cellular behavior is smaller than that of the complete organism, it may
be assumed that the greatest part of the DNA code represents the macroscale
behavior. On the other hand, if the organism behavior is comparatively simple, the
greater part of the DNA representation would be devoted to describing the
cellular behavior.
e. Completeness of representation: we have assumed that DNA is the only source of
cellular information. However, during cell division not only the DNA is
transferred but also other cellular structures, and it is not clear how much information
is necessary to specify their function. It is clear, however, that DNA does not
contain all the information. Otherwise it would be possible to transfer DNA from
one cell into any other cell and the organism would function through control by
the DNA. This is not the case. However, it may very well be that the description
of all other parts of the cell, including the transcription mechanisms, only
involves a small fraction of the information content compared to the DNA (for
example, $10^4$–$10^6$ bits compared to $10^7$–$10^{11}$ bits in DNA). Similar to our point (d),
the information in cellular structures is more likely to be irrelevant for organisms
whose complexity is high. We could note also that there are two sources of DNA
in the eukaryotic cell, nuclear DNA and mitochondrial DNA. The information in
the nuclear DNA dominates over the mitochondrial DNA, and we also expect it
to dominate over other sources of cellular information. It is possible, however,
that the other sources of information approach some fraction (e.g., 10%) of the
information in the nuclear DNA, causing a small correction to our estimates.
f. We have implicitly assumed that the development process of a biological
organism is deterministic and uniquely determined by the genome. Randomness in the
process of development gives rise to additional information in the final structure
that is not contained in the genome. Thus, even organisms that have the same
DNA are not exactly the same. In humans, identical twins have been studied in
order to determine the difference between environmental and genetic influence.
Here we are not considering the macroscale environmental influence, but rather
the microscale influence. This influence begins with the randomness of
molecular vibrations during the developmental process. The additional information
gained in this way would have to play a relatively minor functional role if there is
significance to the genetic control over physiology. Nevertheless, a complete
estimate of the complexity of a system must include this information. Without
considering different scales of structure or behavior, on the macroscale we should
not expect the microscopic randomness to affect the complexity by more than a
factor of 2, and more likely the effect is not more than 10% in a typical
biological organism.
g. We have also neglected the macroscale environmental influences on behavior.
These are usually described by adaptation and learning. For most biological
organisms, the environmental influences on behavior are believed to be small
compared to genetic influences. Instinctive behaviors dominate. This is not as true
about many mammals and even less true about human beings. Therefore, the
genetic estimate becomes less reliable as an upper bound for human beings than it
is for lower animals. This point will be discussed in greater detail below.
We can see that the assumptions discussed in (a), (b), (c) and (d) would lead to
the DNA length being an overly large estimate of the complexity. Assumptions
discussed in (e), (f) and (g) imply it is an underestimate.
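The two corrections in the list that come with explicit numbers, points (a) and (b), can be checked with a short Python sketch. The function names are hypothetical and chosen only for this illustration; the coding fractions of 30% and 50% and the twenty-letter amino acid alphabet are the values quoted in the text, and the genome length is the representative human value from Table 8.4.3.

```python
import math


def raw_bits(n_base_pairs: float) -> float:
    """First estimate: log2(4) = 2 bits per base pair."""
    return n_base_pairs * math.log2(4)


def coding_only_bits(n_base_pairs: float, coding_fraction: float) -> float:
    """Point (a): count only the fraction of base pairs assumed to be coding."""
    return raw_bits(n_base_pairs) * coding_fraction


def amino_acid_bits(n_base_pairs: float) -> float:
    """Point (b): each triple of bases selects one of twenty amino acids."""
    return (n_base_pairs / 3) * math.log2(20)


N = 3e9  # representative human genome length in base pairs

print(f"raw estimate:          {raw_bits(N):.2e} bits")
for fraction in (0.3, 0.5):  # coding fractions quoted in point (a)
    print(f"coding fraction {fraction}:   {coding_only_bits(N, fraction):.2e} bits")
print(f"amino acid alphabet:   {amino_acid_bits(N):.2e} bits")
print(f"bits per base pair under (b): {amino_acid_bits(1):.2f}")
```

Neither correction changes the order of magnitude of the estimate, which is consistent with the remark that the assumptions about the organism carry a larger uncertainty than the corrections themselves.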
One of the conceptual difficulties that we are presented with in considering
genome length as a complexity estimate is that plants have a much higher DNA length
than animals. This is in conflict with the conventional wisdom that animals have a
greater complexity of behavior than plants. We might adopt one of two approaches to
understanding this result: first, that plants are actually more complex than animals,
and second, that the DNA representation in plants does not make use of, or cannot
make use of, compression algorithms that are present in animal cells.
If plants are systematically more complex than animals, there must be a general
quality of plants that has higher descriptive and behavioral complexity. A candidate
for such a property is that plants are generally able to regenerate after injury. This
inherently requires more information than the reliance upon a specific time history for
development. In essence, there must be some form of actual blueprint for the
organism encoded in the genome that takes into account many possible circumstances.
From a programming point of view, this is a multiply reentrant program. To enable
this feature may very well be more complex, or it may require a more redundant
(longer) representation of the same information. It is presumed that the structure of
animals has such a high intrinsic complexity that representation of a fully
regenerative organism would be impossible. This idea might be checked by considering the
genome length of animals that have greater ability to regenerate. If their genomes are
substantially longer than those of similar animals without the ability to regenerate, the
explanation would be supported. Indeed, the salamander, which is the only vertebrate with the
ability to regenerate limbs, has a genome of $10^{11}$ base pairs. This is much larger than
that of other vertebrates, and comparable to that of the largest plant genomes.
A more general reason for the high plant genome complexity that is consistent
with regeneration would be that plants have systematically developed a high
complexity on smaller (molecular and cellular) rather than larger (organismal) scales.
One reason for this would be that plant immobility requires the development of
complex molecular and cellular mechanisms to inhibit or survive partial consumption by
other organisms. By our discussion of the complexity profile in Section 8.3, a high
complexity on small scales would not allow a high complexity on larger scales. This
explanation would also be consistent with our understanding of the relative
simplicity of plants on the larger scale.
The second possibility is that there exists a systematic additional redundancy of
the genome in plants. This might be the result of particular proteins with chains of
repetitive amino acids. A protein formed out of a long chain of the same amino acid
might be functionally of importance in plants, and not in animals. This is a potential
explanation for the relative lengths of plant genome and animal genome.
One of the most striking features of the genome lengths found for various
organisms is their relative uniformity. Widely different types of organisms have similar
genome lengths, while similar organisms may have quite different genome lengths.
One explanation for this that might be suggested is that genome lengths have
increased systematically with evolutionary time. It is hard, however, to see why this
would be the case in all but the simplest models of evolution. It makes more sense to
infer that there are constraints on the genome lengths that have led it to gravitate
toward a value in the range $10^9$–$10^{10}$. Increases in organism complexity then result from
fewer redundancies and better compression, rather than longer genomes. In
principle, this could account for the pattern of complexities we have obtained.
Regardless of the ultimate reason for various genome lengths, in each case the
complexity estimate from genome length provides an upper bound to the genetic
component of organism complexity (cf. points (e), (f) and (g) above). Thus, the
human genome length provides us with an estimate of human complexity.
8.4.3 Component counting
The objective of complexity estimation is to determine the behavioral complexity of
a system as a whole. However, one of the important clues to the complexity of the
system is its composition from elements and their interactions. By counting the number
of elements, we can develop an understanding of the complexity of the system.
However, as with other estimation methods, it must be understood that there are
inherent problems in this approach. We will find that this method gives us a much
higher estimate than the other methods. In using this method we are faced with the
dilemma that lies at the heart of the ability to understand the nature of complex
systems: how does complex behavior arise out of the component behavior and their
interactions? The essential question that we face is: Assuming that we have a system
formed of N interacting elements that have a complexity $C_0$ (or a known distribution
of complexities), how can the complexity C of the whole system be determined? The
maximal possible value would be $N C_0$. However, as we discussed in Section 8.3, this is
reduced both by correlations between elements and by the change of scale from that
of the elements to that of the system. We will discuss these problems in the context of
estimating human complexity.
If we are to consider the behavioral complexity of a human being by counting
components, we must identify the relevant components to count. If we count the
number of atoms, we would be describing the microscopic complexity. On the other
hand, we cannot count the number of parts on the scale of the organism (one)
because the problem in determining the complexity remains in evaluating $C_0$. Thus
the objective is to select components at an intermediate scale. Of the natural
intermediate scales to consider, there are molecules, cells and organs. We will tackle the
problem by considering cells and discuss difficulties that arise in this context. The first
difficulty is that the complexity of behavior does not arise equally from all cells. It is
generally understood that muscle cells and bone cells are largely uniform in structure.
They may therefore collectively be described in terms of a few parameters, and their
contribution to organism behavior can be summarized simply. In contrast, as we
discussed in Chapter 2, the behavior of the system on the scale of the organism is
generally attributed to the nervous system. Thus, aside from an inconsequential number of
additional parameters, we will consider only the cells of the nervous system. If we were
considering the behavior on a smaller length scale, then it would be natural to also
consider the immune system.
In order to make more progress, we must discuss a specific model for the nervous
system and then determine its limitations. We can do this by considering the
behavior of a model system we studied in detail in Chapter 2, the attractor neural network
model. Each of the neurons is a binary variable. Its behavior is specified by whether it
is ON or OFF. The behavior of the network is, however, described by the values of the
synapses. The total complexity of the synapses could be quite high if we allowed the
synapses to have many digits of precision in their values, but this does not contribute
to the complexity of the network behavior. Given our investigation of the storage of
patterns in the network, we can argue that the maximal number of independent
parameters that may be specified for the operation of the network consists of the neural
firing patterns that are stored. This corresponds to $\alpha_c N^2$ bits of information, where N
is the number of neurons, and $\alpha_c \approx 0.14$ is a number that arose from our analysis of
network overload.
There are several problems with applying this formula to biological nervous
systems. The first is that the biological network is not fully connected. We could apply a
similar formula to the network assuming only the number of synapses $N_s$ that are
present, on average, for a neuron. This gives a value $\alpha_c N_s N$. This means that the storage
capacity of the network is smaller, and should scale with the number of synapses. For the
human brain, where $N_s$ has been estimated at $10^4$ and $N \approx 10^{11}$, this would give a value
of $0.1 \times 10^4 \times 10^{11} = 10^{14}$ bits. The problem with this estimate is that in order to specify
the behavior of the network, we need to specify not only the imprinted patterns but also
which synapses are present and which are absent. Listing the synapses that are present
would require a set of number pairs that would specify which neurons each neuron is
attached to. This list would require roughly $N N_s \log_2 N \approx 3 \times 10^{16}$ bits, which is larger
than the number of bits of information in the storage itself. This estimate may be
reduced by a small amount if, as we expect, the synapses of a neuron largely connect to
neurons that are nearby. We will use $10^{16}$ as the basis for our complexity estimate.
The second major problem with this model is that real neurons are far from
binary variables. Indeed, a neuron is a complex system. Each neuron responds to
particular neurotransmitters, and the synapse between two specific neurons is different
from other synapses. How many parameters would be needed to describe the
behavior of an individual neuron, and how relevant are these parameters to the complexity
of the whole system? Naively, we might think that taking into account the complexity
of individual neurons gives a much higher complexity than that considered above.
However, this is not the case. We assume that the parameters necessary to describe an
individual neuron correspond to a complexity $C_0$, and it is necessary to specify the
parameters of all of the neurons. Then the complexity of the whole system would
include $C_0 N$ bits for the neurons themselves. This would be greater than $10^{16}$ bits only
if the complexity of the individual neurons were larger than $10^5$ bits. A reasonable
estimate of the complexity of a neuron is roughly $10^3$–$10^4$ bits. With $N \approx 10^{11}$ this would
give a value of $C_0 N = 10^{14}$–$10^{15}$ bits, which is still small by comparison with $10^{16}$ bits.
By these estimates, the complexity of the internal structure of a neuron is not greater
than the complexity of its interconnections.
Similarly, we should consider the complexity of a synapse, which multiplies the
number of synapses. Synapses are significantly simpler than the neurons. We may
estimate their complexity as no more than 10 bits. This would be sufficient to specify
the synaptic strength and the type of chemicals involved in transmission. Multiplying
this by the total number of synapses ($10^{15}$) gives $10^{16}$ bits. This is the same as the
information necessary to specify the list of synapses that are present.
Combining our estimates for the information necessary to specify the structure
of neurons, the structure of synapses and the list of synapses present, we obtain an
estimate for complexity of $10^{16}$ bits. This estimate is significantly larger than the
estimate found from the other two approaches. As we mentioned before, there are two
fundamental difficulties with this approach that make the estimate too high:
correlations among parameters and the scale of description.
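The bookkeeping behind the combined estimate can be laid out explicitly. This Python sketch uses the figures quoted in the text ($N \approx 10^{11}$ neurons, $10^4$ synapses per neuron, $10^3$ to $10^4$ bits per neuron, at most 10 bits per synapse); the variable names are hypothetical and the integer values are order-of-magnitude placeholders, not measurements.

```python
N_NEURONS = 10**11           # number of neurons (N)
SYNAPSES_PER_NEURON = 10**4  # synapses per neuron (N_s)
BITS_PER_SYNAPSE = 10        # upper estimate for one synapse
NEURON_BITS_RANGE = (10**3, 10**4)  # estimated complexity C_0 of one neuron
WIRING_ESTIMATE = 10**16     # working estimate for the list of synapses

# Internal structure of all neurons: C_0 * N for both ends of the range.
neuron_bits = tuple(c0 * N_NEURONS for c0 in NEURON_BITS_RANGE)

# Structure of all synapses: bits per synapse times total synapse count.
total_synapses = N_NEURONS * SYNAPSES_PER_NEURON
synapse_bits = BITS_PER_SYNAPSE * total_synapses

# Per-neuron complexity at which neurons alone would reach the wiring estimate.
c0_threshold = WIRING_ESTIMATE // N_NEURONS

print(f"neuron structure: {neuron_bits[0]:.0e} to {neuron_bits[1]:.0e} bits")
print(f"synapse structure: {synapse_bits:.0e} bits")
print(f"C_0 needed to reach 1e16 bits: {c0_threshold:.0e} bits")
```

With $N = 10^{11}$, the neuron term comes to $10^{14}$ to $10^{15}$ bits, so the synapse structure and the synapse list, each of order $10^{16}$ bits, dominate the total.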
Many of the parameters enumerated above are likely to be the same, giving rise to
the possibility of compression of the description. Both the description of an individual
neuron and the description of the synapses between them can be drastically simplified
if all of them follow a pattern. For example, the visual system involves processing of a
visual field where the different neurons at different locations perform essentially the same
operation on the visual information. Even if there are smooth variations in the
parameters that describe both the neuron behavior and the synapses between them, we can
describe the processing of the visual field in terms of a small number of parameters.
Indeed, one would guess (an intuition-based estimate) that processing of the visual field
is quite complicated (more than $10^2$ bits) but would not exceed $10^3$–$10^5$ bits altogether.
Since a substantial fraction of the number of neurons in the brain is devoted to initial
visual processing, the use of this reduced description of the visual processing would
reduce the estimate of the complexity of the whole system.
Nevertheless, the initial visual processing does not involve more than 10% of the
number of neurons. Even if we eliminate all of their parameters, the estimate of
system complexity would not change at the order-of-magnitude level. However, the idea
behind this construction is that whenever there are many neurons whose behavior can be
grouped together into particular functions, then the complexity of the description is
reduced. Thus if we can describe neurons as belonging to a particular class of neurons
(category or stereotype), then the complexity is reduced. It is known that neurons can
be categorized; however, it is not clear how many parameters remain once this
categorization has been done.
When we think about grouping the neurons together, we might also realize that this
discussion is relevant to the consideration of the influence of environment and
genetics on behavior. If the number of parameters necessary to describe the network
greatly exceeds the number of parameters in the genetic code, which is only $10^{10}$ bits,
then many of these parameters must be specified by the environment. We will discuss
this again in the next section.
On a more philosophical note, we comment that parameters that describe the
nervous system also include the malleable short-term memory. While this may be
a small part of the total information, our estimate of behavioral complexity
should raise questions such as: How specific do we have to be? Should the content
of short-term memory be included? The argument in favor would be that we need
to represent the human being in entirety. The argument against would be that
what happened in the past five minutes or even the past day is not relevant and we
can reset this part of the memory. Eventually we may ask whether the objective is
to represent the specific information known by an individual or just his or her
“character.”
We have not yet directly addressed the role of substructure (Chapter 2) in the
complexity of the nervous system. In comparison with a fully connected network, a
network with substructure is more complex because it is necessary to specify the
substructure, or more specifically which neurons (or which information) are proximate
to which. However, in a system that is subdivided by virtue of having fewer synapses
between subdivisions, once we have counted the information that is present in the
selection of synapses, as we have done above, the substructure of the system has already
been included.
The second problem of estimating complexity based on component counting is
that we do not know how to reduce the complexity estimate based upon an increase
of the length scale of observation. The estimate we have obtained for the complexity
of the nervous system is relevant to a description of its behavior on the scale of a
neuron (it does, however, focus on cellular behavior most relevant to the behavior of the
organism). In order to overcome this problem, we need a method to assess the
dependence of the organism behavior on the cellular behavior. A natural approach
might be to evaluate the robustness of the system behavior to changes in the
components. Human beings are believed to lose approximately $10^6$ neurons every day (even
without alcohol), corresponding to the loss of a significant fraction of the neurons
over the course of a lifetime. This suggests that individual neurons are not crucial to
determining human behavior. It implies that there may be a couple of orders of
magnitude between the estimate of neuron complexity and human complexity. However,
since the daily loss of neurons corresponds only to a loss of 1 in $10^5$ neurons, we could
also argue that it would be hard for us to notice the impact of this loss. In any event,
our estimate based upon component counting, $10^{16}$ bits, is eight orders of magnitude
larger than the estimates obtained from text and six orders of magnitude larger than
the genome-based estimate. To account for the latter difference we would have to argue that
99.9999% of neuron parameters are irrelevant to human behavior. This is too great a
discrepancy to dismiss based upon such an argument.
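The size of the discrepancy can be made concrete with a few lines of Python. This is a sketch under the assumption that the three estimates are taken at their quoted orders of magnitude ($10^8$, $10^{10}$ and $10^{16}$ bits); the dictionary keys and function names are illustrative only.

```python
import math

ESTIMATES_BITS = {
    "text": 1e8,
    "genome": 1e10,
    "component counting": 1e16,
}
N_NEURONS = 10**11        # number of neurons
NEURONS_LOST_PER_DAY = 10**6  # daily neuron loss quoted in the text


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Base-10 logarithm of the ratio between two estimates."""
    return math.log10(larger / smaller)


def irrelevant_fraction(larger: float, smaller: float) -> float:
    """Fraction of the larger count that must be discarded to reach the smaller."""
    return 1 - smaller / larger


components = ESTIMATES_BITS["component counting"]
for name in ("text", "genome"):
    gap = orders_of_magnitude(components, ESTIMATES_BITS[name])
    frac = irrelevant_fraction(components, ESTIMATES_BITS[name])
    print(f"{name}: {gap:.0f} orders of magnitude, {frac:.6%} irrelevant")

daily_loss_fraction = NEURONS_LOST_PER_DAY / N_NEURONS
print(f"daily neuron loss: 1 in {1 / daily_loss_fraction:.0e}")
```

A gap of six orders of magnitude corresponds to discarding a fraction $1 - 10^{-6}$ of the parameters, far more than the robustness argument based on neuron loss can justify.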
Finally, we can demonstrate that $10^{16}$ is too large an estimate of complexity by
considering the counting of time rather than the counting of components. We
consider a minimal time interval of describing a human being to be of order 1 second,
and we allow for each second $10^3$ bits of information. There are of order $10^9$ seconds
in a lifetime. Thus we conclude that, at most, $10^{12}$ bits of information are
necessary to describe the actions of a human. This estimate assumes that each second is
independently described from all other seconds, and no patterns of behavior exist. This
would seem to be a very generous estimate. We can contrast this number with an
estimate of the total amount of information that might be imprinted upon the
synapses. This can be estimated as the total number of neuronal states over the course
of a lifetime. For a neuron reaction time of order $10^{-2}$ seconds, $10^{11}$ neurons, and $10^9$
seconds in a lifetime, we have $10^{22}$ bits of information. Thus we see that the total
amount of information that passes through the nervous system is much larger than
the information that is represented there, which is larger than the information that is
manifest in terms of behavior. This suggests either that the collective behavior of
neurons requires redundant information in the synapses, as discussed in Section 8.3.6, or
that the actions of an individual do not fully represent the possible actions that the
individual would take under all circumstances. The latter possibility returns us to the
discussion of Eq. (8.3.47) and Eq. (8.3.59), where we commented that the expression
is an upper bound, because information may cycle between scales or between system
and environment. Under these circumstances, the potential complexity of a system
under the most diverse set of circumstances is not necessarily the observed
complexity. Both of our approaches to component counting (spatial and temporal) may
overestimate the complexity due to this problem.
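The time-counting argument is a product of three round numbers, which the following Python sketch reproduces. All constants are the order-of-magnitude values quoted in the text; the names are invented for this illustration, and the stored-information figure is the $10^{16}$ bits from the component-counting estimate.

```python
SECONDS_PER_LIFETIME = 10**9   # order-of-magnitude lifetime
BEHAVIOR_BITS_PER_SECOND = 10**3  # allowance for describing one second of action
N_NEURONS = 10**11             # number of neurons
NEURON_STATES_PER_SECOND = 10**2  # reaction time of order 1e-2 seconds
STORED_BITS = 10**16           # component-counting estimate

# Information manifest in behavior: bits per second times seconds per lifetime.
behavior_bits = BEHAVIOR_BITS_PER_SECOND * SECONDS_PER_LIFETIME

# Information passing through the nervous system: one bit per neuronal state.
throughput_bits = NEURON_STATES_PER_SECOND * N_NEURONS * SECONDS_PER_LIFETIME

print(f"behavior:   {behavior_bits:.0e} bits")
print(f"stored:     {STORED_BITS:.0e} bits")
print(f"throughput: {throughput_bits:.0e} bits")
```

The ordering throughput > stored > behavior ($10^{22}$ > $10^{16}$ > $10^{12}$ bits) is the hierarchy described in the paragraph above.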
8.4.4 Complexity of human beings, artificial intelligence, and the soul
We begin this section by summarizing the estimates of human complexity from the previous sections, and then turn to some more philosophical considerations of its significance. We have found that the microscopic complexity of a human being is in the vicinity of $10^{30}$ bits. This is much larger than our estimates of the macroscopic complexity: language-based $10^{8}$ bits, genome-based $10^{10}$ bits and component (neuron) counting $10^{16}$ bits. As discussed at the end of the last section, we replace the spatial component-counting estimate with the time-counting upper bound of $10^{12}$ bits. We will discuss the discrepancies between these numbers and conclude with an estimate of $10^{10 \pm 2}$ bits.
We can summarize our understanding of the different estimates. The language-based estimate is likely to be somewhat low because of the inherent compression achieved by language. One way to say this is that a college education, consisting of 30 textbooks, is based upon childhood learning (nonlinguistic and linguistic) that provides meaning to the words, and therefore contains comparable or greater information. The genome-based complexity is likely to be a too-large estimate of the influence of genome on behavior, because genome information is compressible and because much of it must be relevant to molecular and cellular function. The component-counting estimate suggests that the information obtained from experience is much larger than the information due to the genome; specifically, that genetic information cannot specify the parameters of the neural network. This is consistent with our discussion in Section 3.2.11 that suggested that synapses store learned information while the genome determines the overall structure of the network. We must still conclude that most of the network information is not relevant to behavior at the larger scale. It is redundant, and/or does not manifest itself in human behavior because of the limited types of external circumstances that are encountered. Because of this last point, the complexity for describing the response to arbitrary circumstances may be higher than the estimate that we will give, but should still be significantly less than $10^{16}$ bits.
Our estimate of the complexity of a human being is $10^{10 \pm 2}$ bits. The error bars essentially bracket the values we obtained. The main final caveat is that the difficulty in assessing the possibility of information compression may lead to a systematic bias toward high complexities. For the following discussion, the actual value is less important than the existence of an estimate.
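As a small consistency check on the statement that the error bars bracket the values obtained, the sketch below compares the exponents of the individual estimates quoted in this section with the interval $10 \pm 2$. The dictionary keys and the function name are illustrative labels of our own, not notation from the text.

```python
# Exponents (log10 of bits) of the macroscopic estimates quoted in this section.
ESTIMATES_LOG10 = {
    "language-based": 8,
    "genome-based": 10,
    "time-counting bound": 12,
}
SPATIAL_COMPONENT_COUNT_LOG10 = 16  # replaced in the text by the time-counting bound

CENTRAL_LOG10 = 10
HALF_WIDTH_LOG10 = 2


def is_bracketed(log10_bits, central=CENTRAL_LOG10, half_width=HALF_WIDTH_LOG10):
    """True if an exponent lies inside central +/- half_width."""
    return central - half_width <= log10_bits <= central + half_width


if __name__ == "__main__":
    for name, exponent in ESTIMATES_LOG10.items():
        print(name, exponent, is_bracketed(exponent))
    print("spatial component counting", SPATIAL_COMPONENT_COUNT_LOG10,
          is_bracketed(SPATIAL_COMPONENT_COUNT_LOG10))
```

The language-based and time-counting values sit exactly at the two ends of the interval, while the spatial component count falls outside it, consistent with its replacement by the time-counting bound.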
Consideration of the complexity of a human being is intimately related to fundamental issues in artificial intelligence. The complexity of a human being specifies the amount of information necessary to describe and, given an environment, predict the behavior of a human being. There is no presumption that the prediction would be feasible using present technology. However, in principle, there is an implication of its possibility. Our objective here is to briefly discuss both philosophical and practical implications of this observation.
The notion of reproducing human behavior in a computer (or by other artificial means) has traditionally been a major domain of confrontation between science and religion, and science and popular thought. Some of these conflicts arise because of the supposition by some religious philosophers of a nonmaterial soul that is presumed to animate human beings. Such nonmaterial entities are rejected in the context of science because they are, by definition, not measurable. It may be helpful to discuss some of the alternate approaches to the traditional conflict that bypass the controversy in favor of slightly modified definitions. Specifically, we will consider the possibility of a scientific definition of the concept of a soul. We will see that such a concept is not necessarily in conflict with notions of artificial intelligence. Instead it is closely related to the assumptions of this field.
One way to define the concept of soul is as the information that describes completely a human being. We have just estimated the amount of this information. To understand how this is related to the religious concept of soul, we must realize that the concept of soul serves a purpose. When an individual dies, the existence of a soul represents the independence of the human being from the material of which he or she is formed. If the material of which the human being is made were essential to its function, then there would be no independent functional description. Also, there would be no mechanism by which we could reproduce human behavior without making use of precisely the atoms of which he or she was formed. In this way the description of a soul suggests an abstraction of function from matter which is consistent with abstractions that are familiar in science and modern thought, but might not be consistent with more primitive notions of matter. A primitive concept of matter might insist that the matter of which we are formed is essential to our functioning. The simplest possible abstraction would be to state (as is claimed by physics) that the specific atoms of which the human being is formed are not necessary to his or her function. Instead, these atoms may be replaced by other indistinguishable atoms and the same behavior will be found. Artificial intelligence takes this a large step further by stating that there are other possible media in which the same behavior can be realized. A human being is not directly tied to the material of which he or she is made. Instead there is a functional description that can be implemented in various media, of which one possible medium is the biological body that the human being was implemented in when we met him or her.
Viewed in this light, the statement of the existence of a soul appears to be the same as the claim of artificial intelligence: that a human being can be reproduced in a different form by embodying the function rather than the mechanism of the human being. There is, however, a crucial distinction between the religious view and some of the practical approaches of artificial intelligence. This difference is related to the notion of a universal artificial intelligence, which is conceptually similar to the model of universal Turing machines. According to this view there is a generic model for intelligence that can be implemented in a computer. In contrast, the religious view is typically focused on the individual identity of an individual human being as manifest in a unique soul. We have discussed in Chapter 3 that our models of human beings are to be understood as nonuniversal and would indeed be better realized by the concept of representing individual human beings rather than a generic artificial intelligence. There are common features to the information processing of different individuals. However, we anticipate that the features characteristic of human behavior are predominantly specific to each individual rather than common. Thus the objective of creating artificial human beings might be better described as that of manifesting the soul of an individual human.
We can illustrate this change in perspective by considering the Turing test for recognizing artificial intelligence. The Turing test suggests that in a conversation with a computer we may not be able to distinguish it from a human being. A key problem with this prescription is that there is no specification of which human being is to be modeled. Human beings have varied complexity, and interactions are of varied levels of intimacy. It would be quite easy to reproduce the conversation of a mute individual, or even an obsessed individual. Which human being did Turing have in mind? We can go beyond this objection by recognizing that in order to fool us into thinking that the computer is a human being, except for a very casual conversation, the computer would have to represent a single human being with a name, a family history, a profession, opinions and a personality, not an abstract notion of intelligence. Finally, we may also ask whether the represented human being is someone we already know, or someone we do not know, prior to the test.
While we bypassed the fundamental controversy between science and religion regarding the presence of an immaterial soul, we suspect that the real conflict between the approaches resides in a different place. This conflict is in the question of the intrinsic value of a human being and his place in the universe. Both the religious and popular view would like to place an importance on a human being that transcends the value of the matter of which he is formed. Philosophically, the scientific perspective has often been viewed as lowering human worth. This is true whether it is physical scientists that view the material of which man is formed as “just” composed of the same atoms as rocks and water, or whether it is biological scientists that consider the biochemical and cellular structures as the same as, and derived evolutionarily from, animal processes.
The study of complexity presents us with an opportunity in this regard. A quantitative definition of complexity can provide a direct measure of the difference between the behavior of a rock, an animal and a human being. We should recognize that this capability can be a double-edged sword. On the one hand it provides us with a scientific method for distinguishing man from matter, and man from animal, by recognizing that the particular arrangement of atoms in a human being, or the particular implementation of biology, achieves a functionality that is highly complex. At the same time, by placing a number on this complexity it presents us with the finiteness of the human being. For those who would like to view themselves as infinite, a finite complexity may be humbling and difficult to accept. Others who already recognize the inherent limitations of individual human beings, including themselves, may find it comforting to know that this limitation is fundamental.
As is often the case, the value of a number attains meaning through comparison. Specifically, we may consider the complexity of a human being and see it as either high or low. We must have some reference point with respect to which we measure human complexity. One reference point was clear in the preceding discussion: that of animals. We found that our (linguistic) estimates of human complexity placed human beings quantitatively above those of animals, as we might expect. This result is quite reasonable but does not suggest any clear dividing line between animals and man. There is, however, an independent value to which these complexities can be compared. For consistency, we use language-based complexity estimates throughout.
The idea of biological evolution and the biological continuity of man from animal is based upon the concept of the survival demands of the environment on man. Let us consider for the moment the complexity of the demands of the environment. We can estimate this complexity using relevant literature. Books that discuss survival in the wild are typically quite short, $3 \times 10^{5}$ bits. Such a book might describe more than just basic survival (plants to eat and animal hunting) but also various skills of a primitive life such as stone knives, tanning, basket making, and primitive home or boat construction. Alternatively, a book might discuss survival under extreme circumstances rather than survival under more typical circumstances. Even so, the amount of text is not longer than a rather brief book. While there are many individuals who have devoted themselves to living in the wild, there are no encyclopedias of relevant information. This suggests that in comparison with the complexity of a human being, the complexity of survival demands is small. Indeed, this complexity appears to be right at the estimated dividing line between animal ($10^{6}$ bits) and man ($10^{8}$ bits). It is significant that an ape may have a complexity of ten times the complexity of the environmental demands upon it, but a human being has a complexity of a hundred times this demand. Another way to arrive at this conclusion is to consider primitive man, or primitive tribes that exist today. We might ask about the complexity of their existence and specifically whether the demands of survival are the same as the complexity of their lives. From books that reflect studies of such peoples we see that the description of their survival techniques is much shorter than the description of their social and cultural activities. A single aspect of their culture might occupy a book, while the survival methods do not occupy even a single one.
We might compare the behavior of primitive man with the behavior of animal predators. In contrast to grazing animals, predators satisfy their survival needs in terms of food using only a small part of the day. One might ask why they did not develop complex cultural activities. One might think, for example, of sleeping lions. While they do have a social life, it does not compare in complexity to that of human beings. The explanation that our discussion provides is that while time would allow cultural activities, complexity does not. Thus, the complexity of such predators is essentially devoted to problems of survival. That of human beings is not.
This conclusion is quite intriguing. Several interesting remarks follow. In this context we can suggest that analyses of animal behavior should not necessarily be assumed to apply to human behavior. In particular, any animal behavior might be justified on the basis of a survival demand. While this approach has also often been applied to human beings (the survival advantages associated with culture, art and science have often been suggested), our analysis suggests that this is not justified, at least not in a direct fashion. Human behavior cannot be driven by survival demands if the survival demands are simpler than the human behavior. Of course, this does not rule out that general aspects or patterns of behavior, or even some specific behaviors, are driven by survival demands.
One of the distinctions between man and animals is the relative dominance of instinctive behavior in animals, as compared to learned behavior in man. It is often suggested that human dependence on learned rather than instinctive behavior is simply a different strategy for survival. However, if the complexity of the demands of survival is smaller than that of a human being, this does not hold. We can argue instead that if the complexity of survival demands is limited, then there is no reason for additional instinctive behaviors. Thus, our results suggest that instinctive behavior is actually a better strategy for overcoming survival demands, because it is prevalent in organisms whose behavior arises in response to survival demands. However, once such demands are met, there is little reason to produce more complex instinctive behaviors, and for this reason human behavior is not instinctively driven.
We now turn to some more practical aspects of the implications of our complexity estimates for the problem of artificial intelligence, or the recreation of an individual in an artificial form. We may start from the microscopic complexity (roughly the entropy) which corresponds to the information necessary to replace every atom in the human being with another atom of the same kind, or alternatively, to represent the atoms in a computer. We might imagine that the computer could simulate the dynamics of the atoms in order to simulate the behavior of the human being. The practicality of such an implementation is highly questionable. The problem is not just that the number of bits of storage as well as the speed requirements are beyond modern technology. It must be assumed that any computer representation of this dynamics must ultimately be composed of atoms. If the simulation is not composed out of the atoms themselves, but some controllable representation of the atoms, then the complexity of the machine must be significantly greater than that of a human being. Moreover, unless the system is constructed to respond to its environment in a manner similar to the response of a human being, then the computer must also simulate the environment. Such a task is likely to be formally as well as practically impossible.
One central question then becomes whether it is possible to compress the representation of a human being into a simpler one that can be stored. Our estimate of behavioral complexity, $10^{10 \pm 2}$ bits, suggests that this might be possible. Since a CD-ROM contains $5 \times 10^{9}$ bits, we are discussing $2 \times 10^{\pm 2}$ CD-ROMs. At the lower end of this range, 0.02 CD-ROMs is clearly not a problem. Even at the upper end, two hundred CD-ROMs is well within the domain of feasibility. Indeed, even if we chose to represent the information we estimated to be necessary to describe the neural network of a single individual, $10^{16}$ bits or 2 million CD-ROMs, this would be a technologically feasible project. We have made no claims about our ability to obtain the necessary information for one individual. However, once this information is obtained, it should be possible to store it. A computer that can simulate the behavior of this individual represents a more significant problem.
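The storage figures in the preceding paragraph follow from dividing each complexity estimate by the CD-ROM capacity quoted there. A minimal sketch of that division, with names of our own choosing, is:

```python
import math

CD_ROM_BITS = 5e9  # CD-ROM capacity in bits, as quoted in the text


def cd_roms_needed(bits, capacity_bits=CD_ROM_BITS):
    """Number of CD-ROMs (possibly fractional) needed to hold `bits` bits."""
    return bits / capacity_bits


# Behavioral complexity range 10^(10 +/- 2) bits, and the neural network estimate.
LOWER_END = cd_roms_needed(1e8)
UPPER_END = cd_roms_needed(1e12)
NEURAL_NETWORK = cd_roms_needed(1e16)

if __name__ == "__main__":
    print("lower end:", LOWER_END)
    print("upper end:", UPPER_END)
    print("neural network:", NEURAL_NETWORK)
    # The central estimate of 10^10 bits corresponds to two CD-ROMs.
    print("central:", cd_roms_needed(1e10))
```

The sketch addresses storage only. It says nothing about how the information would be obtained or how the stored description would be simulated, which the text identifies as the harder problems.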
Before we discuss the problem of simulating a human being, we might ask what the additional microscopic complexity present in a human body is good for. Specifically, if only $10^{10}$ bits are relevant to human behavior, what are most of the $10^{31}$ bits doing? One way to think about this question is to ask why nature didn’t build a similar machine with of order $10^{10}$ atoms, which would be significantly smaller. We might also ask whether we would know if such an organism existed. On our own scale, we might ask why nature doesn’t build an organism with a complexity of order $10^{30}$. We have already suggested that there may be inherent limitations to the complexity that can be formed. However, there may also be another use of some of the additional large number of microscopic pieces of information.
One possible use of the additional information can be inferred from our arguments about the difference between a Turing machine (TM) with and without a random tape. The discussion in Section 1.9.7 suggests that it may be necessary to have a source of randomness to allow human qualities such as creativity. This fits nicely with our discussion of chaos in complex system behavior. The implication is that the microscopic information becomes gradually relevant to the macroscopic behavior as a chaotic process. We can assume that most microscopic information in a human being describes the position and orientation of water molecules. In this picture, random motion of molecules affects cellular behavior, specifically the firing of neurons, that ultimately affects human behavior. This does not mean that all of the microscopic information is relevant. Only a small number of bits can be relevant at any time. However, we recognize that in order to obtain a certain number of random bits, there must be a much larger reservoir of randomness. This is one approach to understanding a possible use of the microscopic information content of a human being. Another approach would ascribe the additional information to the necessary support structures for the complex behavior, but would not attribute to it an essential role as information.
We have demonstrated time and again that it is possible to build a stronger or faster machine than a human being. This has led some people to believe that we can also build a systematically more capable machine in the form of a robot. We have already argued that the present notion of computers may not be sufficient if it becomes necessary to include chaotic behavior. We can go beyond this argument by considering the problem we have introduced of the fundamental limits to complexity for a collection of molecules. It may turn out that our quest for the design of a complex machine will be limited by the same fundamental laws that limit the design of human beings. One of the natural improvements for the design of deterministic machines is to consider lower temperatures that enable lower error rates and higher speeds, and possibly the use of superconductors. However, the choice of a higher temperature may be required to enable a higher microscopic complexity, which also limits the macroscopic complexity. The mammalian body temperature may be selected to balance two competing effects. At high temperatures there is a high microscopic complexity. However, breaking the ergodic theorem requires low temperatures so that energy barriers can be effective in stopping movement in phase space. A way to argue this point more generally is that the sensitivity of human ears and eyes is not limited by the biological design, but by fundamental limits of quantum mechanics. It may also be that the behavioral complexity of a human being at its own length and time scale is limited by fundamental law. As with the existence of artificial sensors in other parts of the visual spectrum, we already know that machines with other capabilities can be built. However, this argument suggests that it may not be possible to build a systematically more complex artificial organism.
The previous discussion is not a proof that we cannot build a robot that is more capable than a human being. However, any claims that it is possible should be tempered by the respect that we have gained from studying the effectiveness of biological design. In this regard, it is interesting that some of the modern approaches to artificial intelligence consider the use of nanotechnology, which at least in part will make use of biological molecules and methods.
Finally we can say that the concept of an infinite human being may not be entirely lost. Even the lowly TM whose internal (table) complexity is rather small can, in arbitrarily long time and with an infinite storage, reproduce arbitrarily complex behavior. In this regard we should not consider just the complexity of a human being but also the complexity of a human being in the context of his tools. For example, we can consider the complexity of a human being with paper and pen, the complexity of a human being with a computer, or the complexity of a human being with access to a library. Since human beings make use of external storage that is limited only by the available matter, over time a human being, through collaboration with other human beings/generations extending through time, can reproduce complex behavior limited only by the matter that is available. This brings us back to questions of the behavior of collections of human beings, which we will address in Chapter 9.
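The remark about a TM with a small table and unbounded storage can be made concrete with a toy machine. The sketch below is our own illustration, not a construction from the text: a three-rule table that increments a binary number stored least significant bit first, on a tape that is extended whenever the head runs off its end. A counter is not complex behavior, so the sketch shows only that a fixed, small table can make use of storage that grows without bound.

```python
BLANK = "_"

# Transition table: (state, symbol read) -> (symbol written, head move, next state).
# The table never changes, however long the tape becomes.
TABLE = {
    ("carry", "1"): ("0", +1, "carry"),
    ("carry", "0"): ("1", 0, "halt"),
    ("carry", BLANK): ("1", 0, "halt"),
}


def increment(tape):
    """Run the machine once from the left end of the tape, extending it if needed."""
    head, state = 0, "carry"
    while state != "halt":
        if head == len(tape):
            tape.append(BLANK)  # external storage grows on demand
        write, move, state = TABLE[(state, tape[head])]
        tape[head] = write
        head += move
    return tape


def count(n):
    """Apply the machine n times to an empty tape; return the result, most significant bit first."""
    tape = []
    for _ in range(n):
        increment(tape)
    return "".join(reversed(tape))


if __name__ == "__main__":
    print(count(5))
    print(len(TABLE), "rules;", len(count(1000)), "tape cells after 1000 runs")
```

The analogy with a human being and a library is loose: the table plays the role of the bounded internal complexity, and the tape plays the role of the paper, computer or library that extends it.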
This subject is distinct from the others we have considered. The primary distinction is that we have only one example of human civilization. This is not true about the systems we have discussed in earlier chapters, with the exception of evolution considered globally. The uniqueness of the human superorganism presents us with questions of fundamental interest in science, related to how much we can know about an individual system. When there are many instances, we can use information provided by various examples and the statistics of their properties. When there is only one system, to understand its properties or predict its behavior we must apply fundamental principles that are valid for all complex systems. Since the field of complex systems is dedicated to uncovering such principles, the subject of the human superorganism should be considered a premier area for application of complex systems research. Central questions are: How can we characterize this complex system? How can we determine its properties? What can we tell about its dynamics, both past and future? We note that as individuals we are elements of the human superorganism; thus our spatial and temporal experience may very well be more limited than that appropriate for analyzing the human superorganism. The study of human civilization is guided by historical records and contemporary news. In contrast to protein folding, neural networks, evolution and developmental biology, there are few reproducible laboratory experiments. Because of the irreproducibility of historical or contemporary events, these sources of information are properly not considered part of conventional science. While this can be a limitation, it is also apparent that there is a large amount of information available. Our task is to develop systematic methods for considering this kind of information that will enable us to approach questions about the nature of human civilization as a complex system.
Various aspects of these problems have been studied by historians, anthropologists and sociologists. Why consider human civilization as a single complex system? The recently discussed concept of a global economy, and earlier the concept of a global village, suggest that we should consider the collective economic behavior of human beings and possibly the global social behavior as a single system. Considering civilization as a single entity we are motivated to ask various questions about it. These questions relate to all of the topics we have covered in the earlier chapters: spatial and temporal structure, evolution and development. We would also like to understand the interaction of human civilization with its environment. In developing an understanding of human civilization, we recognize that a widespread view of human civilization as a single entity is relatively new and driven by contemporary developments. At least superficially, the historical epoch described by the dominance of nation-states appears to be quite different from the present global economy. While recent events appear to be of particular significance to the global view, our questions must be addressed in a historical context. Thus we should include a discussion of the transition to a global economy. We postpone this historical discussion to the next chapter because of the groundwork that we would like to build in order to target a particular objective for our analysis: that of complexity classification.
We are motivated to understand complexity in the context of our effort to understand the nature of the human superorganism, or the nature of the global economy. We would like to identify the type of complex system it is, that is, to classify it. The first distinction that we might make is between a complex material and a complex organism (see Section 1.3.6). Could part of the global system be modified without affecting the whole? From historical evidence discussed in the next chapter, the answer appears to be no. This indicates that human civilization is a complex organism. The next question we would like to ask is: What kind of complex organism is it? By analogy we could ask: Is it like a protein, a cell, a plant, an insect, a frog, a human being? What do we mean by using such analogies? At least in part the problem is to describe the complexity of an entity’s behavior. Intuitively an insect is a simpler organism than a human being, and this is of qualitative importance for our understanding of their differences. The degree of complexity should provide a scale that can distinguish between the many different complex systems we are familiar with. Our objective in this chapter is to develop a quantitative definition of complexity and behavioral complexity. We then apply the definition to various complex systems. The focus will be on the complexity of an individual human being. Once we have established our complexity scale we will be in a position to apply it to human civilization. We will understand formally why a collection of complex systems (human beings) may be, but need not be, complex. Beyond recognizing human civilization as a complex system, it is far more significant to identify the degree of its complexity. In the following brief sections we establish some additional context for the importance of measuring complexity using both unconventional and conventional examples of organisms whose complexity should be evaluated.
8.1.2 Scenario: alien encounter
The possibility of encountering alien life has been debated within the scientific community. In popular literature, such encounters have been portrayed in various forms ranging from benevolent to catastrophic. The scientific debate has focused thus far on topics such as the statistics of planet formation and the likelihood that planets contain life. The presence of organic molecules in meteorites and interstellar gasses has been interpreted as suggesting that alien life is likely to exist. Efforts have been made to listen for signs of alien life in radio communications and to transmit information to aliens using the Voyager spacecraft, which is leaving the solar system marked with information about human beings. Thus far there has been no scientifically confirmed evidence for the existence of alien life. Even a single encounter would change the human perspective on humanity’s place in the universe. Let us consider one possible scenario for an encounter. An object that flashes light intermittently is found in orbit around one of the planets of the solar system. The humans encountering this object are faced with the question of determining whether the object is: (a) a signal device (specifically a recording), (b) a communication device, or (c) a living organism. The central problem can be seen to revolve around determining whether, and in what way, the device is responsive to external phenomena. Do the flashes of light occur without regard to the external environment
in a predetermined sequence? Are they random? If the flashes are sensitive to the environment, then what are they sensitive to? We will see that these questions are equivalent to the question of determining the complexity of the object’s behavior. The concept of life in biology is often defined, or better yet, characterized, in terms of consumption, excretion and reproduction. As a definition, these characteristics are well known to be incomplete, since there are lifeforms that do not reproduce, such as the mule. Furthermore, a particular individual is still considered alive even if it/he/she does not reproduce. Moreover, there are various physical systems such as crystals and fire that have all these characteristics in one form or another. In addition, there does not appear to be a direct connection between these biological characteristics and other characteristics of life such as sentience and self-awareness. When considering behavior, the biological perspective emphasizes the survival instinct as characteristic of life. There are exceptions to this, since there exist lifeforms that are at times suicidal, either individually or collectively. The question of whether an organism actively seeks life or death does not appear to be a characterization of life but rather of lifeforms that are likely to survive. In our discussions, we may be developing an additional characterization of life in terms of behavioral complexity. Definitions of life are often considered in speculating about the rights of and treatment of real or imagined organisms: injured or unconscious humans, robots, or aliens. The degree of behavioral complexity is a characterization of lifeforms that may ultimately play a role in informing our ethical decisions with respect to various biological lifeforms, whether terrestrial or (if found) alien, and artificial lifeforms that we create.
8.1.3 Scenario: blood cells
One of the areas briefly touched upon in Chapter 6, which is at the forefront of complex systems research, is the study of the immune system. Blood cells, unlike other cells in the body, are mobile on a length scale that is large compared to their size. In this characteristic they are more similar to independent organisms than to the other cells of the body. By their migration they might be said to "choose" to associate with other cells of the body, or with foreign chemicals and cells. It is fair to say that our understanding of the behavior of immune cells remains primitive. In particular, the variety of possible chemical interactions between cells has only begun to be mapped out. These interactions involve a variety of chemical messengers. More direct cell-to-cell interactions where parts of the membrane or cellular fluid are transferred are also possible.
One of the interesting questions that can be asked is whether, or at what level of complexity, the interactions become identifiable as a form of language. It is not difficult to imagine, for example, that a chemical communication originating from one cell might be transferred through a chain of cell interactions to a number of other cells. In the context of the discussion in Section 2.4.5, the question of existence of a language might be formulated as a question about the possibility of messages with a grammar, that is, a combinatorial composition of parts that are categorized like parts of speech. Such combinatorial mechanisms are known to exist even at the molecular level in the DNA coding of antibody receptors that are a composite of different parts
of the genome. It remains to be seen whether intercellular communication is also generated in this fashion.
In the context of this chapter we can reduce the questions about the immune cells to a single one: What is the degree of complexity of the behavior of the immune cells? By its very nature this question can only be answered once a complete understanding of immune cell behavior is reached. A limited understanding establishes a lower bound for the complexity of the behavior. It should also be understood that different types of cells will most likely have quite different levels of behavioral complexity, just as different animals and man have differing levels of complexity.
8.1.4 Complexity
Mathematical definitions of the complexity of systems are based upon the theories of information and computation discussed in Sections 1.8 and 1.9. In Section 8.2 they will be used to treat complexity in the context of mathematical objects such as character strings. To develop our understanding of the complexity of physical systems requires that we relate these concepts to those of thermodynamics (Section 1.3) and various extensions (e.g., Section 1.4) that enable the treatment of nonequilibrium systems. In Section 8.3 we discuss relevant concepts and tools that may be used for this purpose. In Section 8.4 we use several semiquantitative approaches to estimate the value of the complexity of specific systems. The practical application of these definitions is a central challenge for the field of complex systems.
Our objective in this chapter is to show that it is possible to quantify the concept of complexity in a way that is both natural and useful. Our use of the word "complexity" is specified as an answer to the question, How complex is it? We say, Its complexity is <number><units>. Intuitively, we can make a connection between complexity and understanding. When we encounter something new, whether personally or in a scientific context, our objective is to understand it. The understanding enables us to use, modify, control or appreciate it. We achieve understanding in a number of ways, through classification, description and ultimately through the ability to predict behavior. Complexity is a measure of the inherent difficulty to achieve the desired understanding. Simply stated, the complexity of a system is the amount of information necessary to describe it. This is descriptive complexity. For dynamic systems the description includes the changes in the system over time. We will also discuss the response of a dynamic system to its environment. The amount of information necessary to describe this response is a system's behavioral complexity.
To use these definitions of complexity we will introduce mathematical expressions based upon the theory of information. The quantitative definition of information (Section 1.8) is relatively abstract. However, it can be measured in familiar terms such as by the number of characters in a text. As a preliminary exercise in the discussion of complexity, the reader is invited to exercise intuition to estimate the complexity of a number of systems. Question 8.1.1 includes a list of systems that are designed to stimulate some thought about complexity as a quantitative measure of the behavior of a system. The reader should devote some thought to this question before proceeding with the rest of the text.
Question 8.1.1 Estimate the complexity of some of the systems in the following list. For this question use an intuitive definition of complexity: the amount of information that would be required to describe the system or its behavior. We use units of bits to measure information. However, to make it easier to visualize, you may use other convenient units such as words or pages of text. So, we can paraphrase the question as, How much would you have to write to describe the system behavior? A rough conversion factor of 1 bit per character can be used to convert these estimates to bits. It is not necessary to estimate the complexity of all the systems on the list. Considering even a few of them is sufficient to develop an understanding of some of the issues that arise. Indeed, for some of these systems a rough estimate is far from trivial. Answers to this question will be given in the text in the remainder of this chapter.
Hint You may find that you would use different amounts of information depending on what aspects of the system you are describing. In such cases try to give more than one estimate or a range of values.
Physical Systems:
Ideal gas (1 mole at T = 0°K, P = 1 atm)
Water in a glass
Chemical reaction
Brownian particle
Turbulent flow
Protein
Virus
Bacterium
Immune system cell
Fish
Frog
Ant
Rabbit
Cow
Human being
Radio
Car
IBM 360
Personal Computer (PC/Macintosh)
The papers on your desk
A book
A library
Weather
The biosphere
Nature
Mathematical and Model Systems:
A number
Iterative maps (growth, bifurcation to chaos)
1D random walk
  short time
  long time
Ising model (ferromagnet)
Turing machine
Fractals
Sierpinski gasket
3D random walk
Attractor neural network
Feedforward neural network
Subdivided attractor neural network
8.2 Complexity of Mathematical Models
Complexity is a property of the relationship between a system and various representations of the system. Our objective is to understand the complexity of systems composed of physical entities such as atoms, molecules or cells. Abstract representations of such systems are described in terms of characters or numbers. It is helpful to preface our discussion of physical systems with a discussion of the complexity of the characters or numbers that we use to represent them.
8.2.1 Information, computation and algorithmic complexity
The discussion of Shannon information theory in Section 1.8 was based on strings of characters that were generated by a source. The source generates each string, $s$, by selecting it from an ensemble. The information from a particular string was defined as
$$I = -\log(P(s)) \tag{8.2.1}$$
where $P(s)$ is the probability of the string in the ensemble. If all strings have equal probability then this is the logarithm of the number of distinct strings. The source itself (or the ensemble) was characterized by the average information of a large number of strings
$$\langle I \rangle = -\sum_s P(s)\log(P(s)) \tag{8.2.2}$$
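As a quick numerical illustration of Eqs. (8.2.1) and (8.2.2), the following minimal Python sketch evaluates the information of a single string and the ensemble average. The function names and the two toy ensembles are hypothetical, introduced only for this example, and the logarithm is assumed to be base 2 so that the results are in bits.

```python
import math
from typing import Mapping


def string_information(prob: float) -> float:
    """Information in bits of a string whose ensemble probability is prob, Eq. (8.2.1)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(prob)


def average_information(ensemble: Mapping[str, float]) -> float:
    """Average information in bits over an ensemble {string: probability}, Eq. (8.2.2)."""
    if not math.isclose(sum(ensemble.values()), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Strings with zero probability contribute nothing to the sum.
    return sum(p * string_information(p) for p in ensemble.values() if p > 0.0)


# Hypothetical ensemble 1: all eight 3-bit strings, equally likely.
uniform = {format(i, "03b"): 1 / 8 for i in range(8)}

# Hypothetical ensemble 2: four 2-bit strings with unequal probabilities.
skewed = {"00": 0.5, "01": 0.25, "10": 0.125, "11": 0.125}

print(average_information(uniform))
print(average_information(skewed))
```

For the uniform ensemble the average equals the logarithm of the number of distinct strings, as stated in the text. The skewed ensemble has a lower average than the 2 bits its strings occupy, which is the room a reversible coding can exploit.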
It was also possible to consider a more general source that selected characters to form a Markov chain. The probabilistic coupling between sequential characters reduced the information content of the string. It was possible to compress the string using a reversible coding algorithm (computation) that would enable the same information to be represented in a more compact form. The length of the shortest binary compact form is equal to the average information in a string.
Information theory suggests that we can define the complexity of a string of characters by the information content of the string. The information content is the same as the length of the shortest binary encoding of the string. This is intuitive: since the original string can be obtained from its shortest representation, the same information must be present in both. Within standard information theory, the encodings would be limited to compression using a Markov chain model. However, more generally, we could use any possible algorithm for encoding (compressing) the string. Questions about all possible algorithms are precisely the domain of computation theory.
The definition of Kolmogorov (algorithmic) complexity of a string makes use of computation theory to describe what we mean by "any possible algorithm." Allowing all algorithms is the same as allowing more general models for the string than a Markov chain. Our objective in this section is to develop an understanding of algorithmic complexity beginning from the theory of computation.
Computation theory (Section 1.9) describes the operations of logic and computation on symbols. All the operations are deterministic and are expressible in terms of a few elementary operations. The concept of universality of computation is based on the understanding that a particular type of conceptual machine/computer, the universal Turing machine (UTM), can perform all possible computations if the instructions are properly encoded as a finite string of characters serving as the UTM input. The existing proof shows that the UTM can perform all computations that can be done by a much larger class of machines, the Turing machines (TM). Other models for computation have been shown to be essentially equivalent to these TM. Since we have no absolute definition of computation, however, there is no complete proof.
A TM is defined by a table of elementary operations that act on the input string. The word "program" can be used either to refer to the TM table or to its input and so its use is best avoided in this context.
We would like to define the algorithmic complexity of a string, $s$, as the length of the shortest possible binary TM input, such that the output is $s$. The relationship of this to the encoding and decoding of Shannon should be apparent. In order to use this as a definition, there are several matters that must be cleared up. To summarize: There are actually two sources of information when we use a TM, the input string and the table. We need to take both of them into account to define the complexity. There are many ways to define complexity; however, we can prove that any two definitions of complexity differ by no more than a constant. We will also show that no matter what definition we use, most strings cannot be compressed.
In order to motivate the logic of the following discussion, it is helpful to think about how we might approach compressing various strings of characters. The shortest compression should then be the complexity of the string. One string might be formed out of a long substring of zeros followed by a long substring of ones. This is convenient to write by indicating how many zeros followed by how many ones: $N_0 N_1$. We would make a binary string notation for $N_0 N_1$ and write a program that would read this input and then output the original string. Another string might be a representation of the Fibonacci numbers (1,1,2,3,5,8,…), starting from the $N_0$-th number and ending at the $N_1$-th number. We could write this using a similar notation as the previous one, but the program that we would write to generate the string is quite different. Both programs would be quite simple.
Now imagine that we want to communicate one of the original strings to someone else. If we want to communicate it in compressed form, we would have to send the program as well as the input. If there were many strings, we might be clever and send the programs only once. The problem is that with only the input string, the recipient would not know which program to apply to obtain the original string. We need to send an additional piece of information that indicates which program to apply. The simplest way to do this is to assign numbers to each of the programs and preface the program input with the program number. Once we do this, the string that we send uniquely determines the string we wish to communicate. This is necessary, because if the interpretation of the transmitted string is not unique, then it would be impossible to guarantee a correct interpretation.
We now develop these thoughts using a more formal notation. In what follows, the operation of a TM or a UTM will be indicated by functional notation. The string that results from its application to a tape is indicated by $U(s)$ where $s$ is the nonblank portion of the tape (input string), $U$ is the identifier of the TM, and the initial position of the TM head is assumed to be at the leftmost nonblank character.
In order to define the complexity of a string, we identify a particular UTM $U$. Then the complexity $C_U(s)$ of the string $s$ is defined as the length of the shortest string $r$ such that $U(r) = s$. We call an input string $r$ to $U$ that generates $s$ a representation of $s$. Thus the length of the shortest representation is $C_U(s)$. The central theorem of algorithmic complexity relates the complexity according to one UTM $U$ and another UTM $U'$. Before we state and prove the theorem, we discuss several incidental matters.
We first ask whether we need to use a UTM and not just any TM in the definition. The answer is that the use of a UTM is convenient, and we cannot significantly improve the ability to compress strings by allowing the larger class of TM to be used in the definition. Let us say that we have a UTM $U$ and a TM $V$; we define a new UTM $W$ by:
$$W(0s) = V(s), \qquad W(1s) = U(s) \tag{8.2.3}$$
The first character indicates whether to use the TM $V$ or the UTM $U$ on the rest of the input. Since the complexity according to the UTM $W$ is at most one more than the
complexity according to the TM $V$, $C_W(s) \le C_V(s) + 1$, we see that using the larger class of TM to define complexities cannot improve our results for any particular string by more than one bit, which is not significant for long complex strings.
We may be disturbed that the definition of complexity does not indicate that the complexity of an incompressible string is the same as the string length itself. Indeed the definition does not require it. However, if we wanted to impose this as an auxiliary condition, we could define the complexity of a string using a slightly different construction. Given a UTM $U$, we define a new UTM $V$ such that
$$V(0s') = s', \qquad V(1s') = U(s') \tag{8.2.4}$$
The first character indicates whether the string is compressed. We then define the complexity $C_U(s)$ of any string $s$ as one less than the length of the shortest string $r$ such that $V(r) = s$. This is not quite a fair definition, because if we wanted to communicate the string $s$ we would have to indicate all of $r$, including its first bit. This means that we should define the complexity as the length of $r$, which would be a sacrifice of at most one bit for incompressible strings. Limiting the complexity of a string to be no longer than the string itself might seem a natural idea. However, we note that the Shannon information, Eq. (8.2.1), is related only to the probability of a string, and may be larger than the original string length for a particular string.
Returning to our basic definition of complexity, we have described the existence of a shortest possible representation of any string $s$, and a single machine $U$ that can reconstruct each $s$ from this representation. The key theorem that we need to prove relates the complexity defined using one UTM $U$ to the complexity defined using another UTM $U'$. The theorem is: the complexity $C_U$ based on $U$ and the complexity $C_{U'}$ based on $U'$ satisfy:
$$C_U(s) \le C_{U'}(s) + C_U(U') \tag{8.2.5}$$
where $C_U(U')$ is independent of the string $s$. The proof of this expression results from the ability of the UTM $U$ to simulate $U'$. To prove this we must improve slightly our definition of complexity, or equivalently, we have to limit the UTM that are allowed. This is discussed in Questions 8.2.1 to 8.2.3. It is shown there that we can preface binary strings input to the UTM $U'$ with a prefix that will make them generate the same output when input to $U$. We might call this prefix $r_{U,U'}$ a translation program; it satisfies the property that for any string $r$, $U(r_{U,U'}\,r) = U'(r)$. Let $r_{U'}$ be a minimal representation for $U'$ of the string $s$. Then $r_{U,U'}\,r_{U'}$ is a representation for $U$ of the string $s$. The length of this string must be greater than or equal to the length of the minimum string $r_U$ necessary to produce the same output:
$$C_U(s) = |r_U| \le |r_{U,U'}\,r_{U'}| = |r_{U'}| + |r_{U,U'}| = C_{U'}(s) + C_U(U') \tag{8.2.6}$$
$C_U(U') = |r_{U,U'}|$ is the length of the translation program. We have proven the inequality in Eq. (8.2.5).
Question 8.2.1 Show that there exists a UTM $U_0$ such that for any TM $U$ that accepts binary input, there is a string $r_U$ so that for all $s$ and $r$ satisfying $s = U(r)$, we have that $s = U_0(r_U\,r)$.
Hint One way to do this is to use a modified form of the construction given in Section 1.9. The new construction requires modifying the nature of the UTM, i.e., a trick.
Solution 8.2.1 We call the UTM described in Section 1.9 $\tilde{U}_0$. We can simulate the UTM $U$ using $\tilde{U}_0$; however, the form of the input string would not quite satisfy the conditions of this theorem. $\tilde{U}_0$ has an input that looks like $r_U\,r_t(r)$, where the right part is only a function of the input string $r$ and the left part is only a function of the UTM $U$. However, the tape part of the representation $r_t(r)$ uses a doubled binary form for characters and markers between them so that it is not the same as the original tape. We must replace the tape part of the representation with the original string in order to have an input string of the form $r_U\,r$.
Both $U_0$ and $U$ have binary input strings. This means that we might try to use the tape of $U$ without modification in the tape part of the representation given in Section 1.9. Then there would be no delimiters between characters and no doubled binary representation. There is, however, one difficulty. The UTM $U_0$ must keep track of where the current position of the UTM $U$ would be during the same calculation. This was accomplished in Section 1.9 by converting one of the $M_1$ markers to $M_6$ at the current location of the UTM $U$. There are a number of ways to overcome this problem, but all require us to introduce something new. We will do this by allowing the UTM $U_0$ to have a counter that can keep track of the current position of the UTM $U$. This counter is initialized to 0 and set to the current location of the UTM $U$ at every step of the calculation. There are two ways to argue this. One is to allow, by proclamation, a counter that can reach arbitrarily high numbers. The other is to recognize that the longest string we might conceivably encounter is smaller than the number of particles in the known universe, or very roughly $10^{90} \approx 2^{300}$. This means that we can use an internal memory of 300 bits to represent such a counter. This construction gives us the desired UTM $U_0$.
Question 8.2.2 Using the result of Question 8.2.1, prove Eq. (8.2.5). See the text for a hint.
Solution 8.2.2 The problem is that Eq. (8.2.5) is not actually correct for all UTM (see Question 8.2.3) so we need to modify our conditions. In a sense, the modification is minor because we only improve the definition slightly. We do this by defining the complexity $C_U(s)$ for an arbitrary UTM as the minimum length of $r$ such that $W(r) = s$ where $W$ is defined by:
$$W(0s) = U_0(s), \qquad W(1s) = U(s) \tag{8.2.7}$$
The first bit specifies whether to use $U$ or the special UTM $U_0$ constructed in Question 8.2.1. $C_U(s)$ defined this way is at most one bit more than our previous definition for any particular string. It might be significantly
smaller. This should not be a problem, because our objective is to find short representations of strings. By using our special UTM $U_0$ in this definition, we guarantee that for any two UTM $U$ and $U'$, whose complexity is defined in terms of $W$ and $W'$ by Eq. (8.2.7), we can write $W(r_{WW'}\,r_{W'}) = W'(r_{W'})$. This is possible because $W$ inherits the properties of $U_0$ when the first character of its input string is 0.
Question 8.2.3 Show that some form of qualification of Eq. (8.2.5) is necessary by demonstrating that there exists a UTM that does not satisfy this inequality. Therefore, Eq. (8.2.5) cannot be extended to all UTM.
Solution 8.2.3 One possibility is to have a UTM that uses only certain characters in its input string. Specifically, define a UTM $U$ that acts the same as a UTM $U'$ but uses only every other character in its input string: $U(r) = U'(r')$ if $r$ is any string whose odd characters are the characters of $r'$. The complexity of a string according to $U$ is twice the complexity according to $U'$ and therefore Eq. (8.2.5) is invalid in this case. With the modified definition of complexity given in Question 8.2.2 this is no longer a problem.
Switching $U$ and $U'$ in Eq. (8.2.5) gives a similar inequality with a constant $C_{U'}(U)$. Defining the larger of the two translation program lengths to be $C_{U,U'}$:
$$C_{U,U'} = \max(C_U(U'),\, C_{U'}(U)) \tag{8.2.8}$$
we have proven that complexities defined by the UTM differ by no more than $C_{U,U'}$:
$$|C_U(s) - C_{U'}(s)| \le C_{U,U'} \tag{8.2.9}$$
Since this constant is independent of the complexity of the string $s$, it becomes insignificant for large enough complexities. Thus, for strings that are complex enough, it doesn't matter which UTM we use to define its complexity. The complexity defined by one UTM is the same as the complexity defined by another UTM. This consistency (universality) in the complexity of a string is essential in order for it to be well defined. We will use a few examples to illustrate the nature of universality provided by this definition.
The first example illustrates the relationship of algorithmic complexity to string compression. Given a string $s$ we can ask what methods of compression are useful for the string. A useful compression algorithm corresponds to a pattern in the characters of the string. A string might have many repetitive digits, or cyclically repeating digits. Alternatively, it might be a sequence that can be generated using simple mathematical operations such as the Fibonacci series, or the digits of $\pi$. There are many such patterns that are relevant to the compression of strings. We can choose a finite set of $N$ algorithms $\{V_i\}$, where each one is represented by a TM that reconstructs a string $s$ from a shorter string $r$ by taking advantage of properties of the pattern. We now construct a new TM $U$ which is defined by:
$$U(r_i\,r') = V_i(r') \tag{8.2.10}$$
where $r_i$ is a binary representation of the number $i$, having $\log(N)$ bits. This is a UTM if any of the $V_i$ is a UTM or it can be made into a UTM by Eq. (8.2.3). We use $U$ to define the complexity $C_U(s)$ of any string as described above. This complexity includes both the length of $r'$ and the number of bits ($\log(N)$) in $r_i$ that together constitute the length of the input $r$ to $U$.
Let us assume that we are evaluating the complexity of a particular string $s$. We define a new UTM $U_s$ by:
$$U_s(0s') = s, \qquad U_s(1s') = U(s') \tag{8.2.11}$$
The first character tells $U_s$ if the string is $s$. We can use this new UTM to define the complexity of all strings and for this definition the complexity of $s$ is one. How does this relate to our theorem about the universality of complexity? The point is that in this case the translation program between $U$ and $U_s$ contains the complete information about $s$ and therefore must be at least as long as $C_U(s)$. What we have done is to take the particular string $s$ and insert it into the table of $U_s$. We see in this example how universality is tied to an assumption that the complexities that are discussed are longer than the TM translation programs or, equivalently, the information in their tables. Conceptually, we would say that universality of complexity is tied to an assumption of lack of specific knowledge on the part of the recipient (represented by the UTM) of the information itself.
The choice of a particular UTM might be dictated by an implicit understanding of the set of strings that we would like to represent. However, this apparent relativism of the complexity is limited by our basic theorem that relates the complexity of distinct UTM, and by additional results about the impossibility of compressing most strings discussed in the following paragraphs. Despite the message of the last example, one UTM is used to define the complexity of all strings. We do not use different TM to define the complexity of each string. Once it is defined, this complexity is a measure of the complexity of all strings.
We have gained an additional result from the construction of a single UTM that generates all strings from their compressed forms. This is that a representation $r$ only represents one string $s$. We can now prove that the probability that a string of length $N$ can be compressed is very small. The proof proceeds from the observation that the number of possible strings decreases very rapidly with decreasing string length. A string $s$ of length $|s| = N$ compressed by $k$ bits is represented by a particular string $r$ of length $|r| = C(s) = N - k$. Since there are only $2^{N-k}$ strings of length $N - k$, at most $2^{N-k}$ of the $2^N$ strings of length $N$ can be compressed by $k$ bits. The fractional compression is $k/N$. For example, among all strings of length $10^6$ bits, at most 1 string in $2^{100} \approx 10^{30}$ can be compressed by 100 bits or 0.01% of the string length. This is not a very significant compression. Even so, this estimate of the average number of strings that can be compressed is much too large, because strings that are not of length $N$, e.g., strings of length $N - 1$, $N - 2$, …, $N - k$, would also be represented by strings of length $N - k$. Thus most strings are incompressible. Moreover, even though the complexity of a string is defined without reference to an ensemble of strings, selecting a string at random will yield an incompressible string.
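The counting argument above can be checked with exact arithmetic. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the second function counts every binary string of length at most $N - k$ as an available representation, which (as the text notes) overestimates how many length-$N$ strings can actually be compressed.

```python
from fractions import Fraction


def fraction_compressible_by_exactly(n: int, k: int) -> Fraction:
    """Upper bound on the fraction of length-n strings with a representation of length n - k.

    There are 2**(n - k) candidate representations and 2**n strings, and each
    representation stands for only one string, so the bound is 2**(-k).
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return Fraction(1, 2 ** k)


def fraction_compressible_by_at_least(n: int, k: int) -> Fraction:
    """Upper bound on the fraction of length-n strings compressible by k or more bits.

    Counts all binary strings of length 0..n-k as possible representations:
    1 + 2 + ... + 2**(n - k) = 2**(n - k + 1) - 1.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    short_representations = 2 ** (n - k + 1) - 1
    return Fraction(short_representations, 2 ** n)


# The example from the text: strings of 10**6 bits compressed by 100 bits.
bound = fraction_compressible_by_exactly(10 ** 6, 100)
print(float(bound))

# A small case that can be checked by hand: n = 10, k = 3.
print(fraction_compressible_by_at_least(10, 3))
```

Both bounds depend on $N$ only weakly or not at all: the fraction of strings that can be shortened by $k$ bits falls off as $2^{-k}$ regardless of string length, which is why a randomly selected long string is almost certainly incompressible.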
Question 8.2.4 Calculate a strict lower bound for the average complexity of strings of length $N$.
Solution 8.2.4 We assume that strings of length $N$ are compressed so that they are represented by all of the shortest strings. One string is represented by the null string (length 0), two strings are represented by a single bit (length 1), and so on. The relationship:
$$2^N = \sum_{l=0}^{N-1} 2^l + 1 \tag{8.2.12}$$
means that we will fill all of the possible strings up to length $N - 1$ and then have one string left of length $N$. The average representation length for any complexity measure must then satisfy:
$$\langle C(s) \rangle \ge \frac{1}{2^N}\left(\sum_{l=0}^{N-1} l\,2^l + N\right) \tag{8.2.13}$$
The sum can be evaluated using a table of sums or:
$$\sum_{l=0}^{N-1} l\,2^l = \frac{1}{\ln(2)}\,\frac{d}{d\lambda}\sum_{l=0}^{N-1} 2^{\lambda l}\Bigg|_{\lambda=1} = \frac{1}{\ln(2)}\,\frac{d}{d\lambda}\,\frac{2^{\lambda N} - 1}{2^{\lambda} - 1}\Bigg|_{\lambda=1} = N 2^N - 2(2^N - 1) \tag{8.2.14}$$
giving:
$$\langle C(s) \rangle \ge (N - 2) + \frac{1}{2^N}(N + 2) > N - 2 \tag{8.2.15}$$
Thus the average complexity of strings of length $N$ cannot be reduced by more than two bits. This strict lower bound applies to all measures of complexity.
We can also interpret this discussion to mean that the best UTMs to use to define complexity are those that are invertible: they have a one-to-one mapping of strings to representations. In this case we have a mapping $r(s)$ which gives the unique representation of a string. The UTM is the decoder and the mapping of the string onto its representation is the encoding. Such UTM are closely analogous to our understanding of encoding and decoding as described in information theory. The reason that such UTM are better is that there are only a limited number of representations shorter than $N$; if we use up more than one of them for a particular string, then we will have fewer representations to use for others.
Because most strings are incompressible, we can also prove that if we have an ensemble of strings defined by the probability $P(s)$, then the average algorithmic complexity of these strings is essentially the same as the Shannon information. In particular, the ensemble of all of the strings of length $N$ has a Shannon information of $N$ bits and an average algorithmic complexity which is the same. The catch is recognizing that to specify $P(s)$ itself requires an algorithm whose complexity must enter into the discussion. An ensemble defined by a probability $P(s)$ can be encoded in such a way that the average string length is given by the Shannon information. We now realize that to define the string complexity we must include the description of the decoding operation:
$$\sum_s P(s)\,C(s) = \sum_s P(s)\,I_s + C(P) \tag{8.2.16}$$
where the expression $C(P)$ represents the complexity of the decoding operation for the universal computer $U$ for the ensemble given by $P(s)$. $C(P)$ depends in part on the algorithm used to specify the ensemble probability $P(s)$. For the average ensemble complexity to be essentially equal to the average Shannon information, the specification of the ensemble must itself be simple. For Markov chains a similar result applies: the Shannon information of a string representing a Markov chain is the same as the algorithmic complexity of the same string, as long as the algorithm specifying the Markov chain is simple.
A general consequence of the definition of algorithmic complexity is a limitation on what TM can do. No TM can generate a string more complex than the input string that it is provided with, plus the information in its table; otherwise we would have redefined the complexity of the output string to take this into consideration. This is a key limitation of TM: TM (and computers that are realizations of this model) cannot generate new information. They can only process information they are given. As discussed briefly in Section 1.9.7, this limitation can be overcome by a TM that is given a string of random bits as input. The infinitely complex input means the limitation does not apply. It remains to be demonstrated what tasks such a TM can perform that are not possible for a conventional TM. If such tasks are identified, there will be important implications for computer design. In this context, it may also be suggested that some forms of creativity might be linked to the availability of randomness (see Section 1.9.7). We will return to this issue at the end of the chapter.
While the definition of complexity using UTM is appealing, there is a profound difficulty with this proof. It is nonconstructive. No method is given to determine the complexity of a particular string. Indeed, it can be proven that this is a fundamentally difficult task: the time necessary for a TM to determine $C(s)$ grows exponentially with the length of $s$. At least this is true when there is a bound on the complexity, e.g., by Eq. (8.2.4). Otherwise the complexity is noncomputable. The proof follows from the discussion in Section 1.9. We find the complexity of a string by trying all input strings in the UTM to see which one gives the necessary output. If the complexity is not bounded, then the halting problem implies that we cannot tell if the UTM will halt on a particular input, thus it is noncomputable. If the complexity of the string is bounded, then we only try strings up to this bound, and it is possible to determine if the UTM will halt for members of this bounded set of strings. Nevertheless, trying each string requires a time that grows exponentially with the bound, and therefore is not practical except for a few very simple strings. The process of finding the complexity of a string is akin to a process of trying models for the string. A model is a TM that might, when given the
proper input, generate the string. It is possible to try many models. However, to determine the actual compressed string may not be practical in any reasonable time. With any particular set of models, we can, however, find an upper bound on the complexity of a string. One of the possible models is that of a Markov chain as used by Shannon information theory. Algorithmic complexity allows more general TM models. However, by our discussion it is improbable that a randomly chosen string will be compressible by any algorithm.

In summary, the universality of complexity is a statement that the use of different UTMs in the definition of complexity affects the result by no more than a constant. This constant is the length of the program that translates the input of one UTM to the other. Significantly, the more complex the string is, the more universal is the value of its complexity. This follows because the length of translation programs becomes less and less relevant for longer and longer descriptions/representations. Since we are interested in properties of complex systems whose descriptions are long, we can, with caution, rely on the universality of their complexity. This is not the case with simple systems whose descriptions and therefore complexities are "subjective": they depend on the conventions for description. These conventions, in our mathematical definition, are represented by the choice of UTM used to define complexity. We also showed that most strings are not compressible and that the Shannon information measure is the same as the average algorithmic complexity for all concisely describable ensembles. In what follows, unless otherwise mentioned, we assume a particular definition of complexity C(s) using the UTM U.

8.2.2 Mathematical systems: numbers and functions

One of the difficulties in discussing complexity is that many elementary mathematical constructs have unusual properties when considered from the point of view of complexity. Philosophers have been troubled by these points, and they have been extensively debated over the centuries. Most of the problems revolve around various forms of infinity. Unlimited numbers and infinite precision often simplify symbolic mathematical discussions; however, they are not well behaved from the point of view of complexity measures. There appears to be a paradox here that will be clarified when we distinguish between the complexity of a set of numbers and the complexity of an element of the set.

Let us consider the complexity of specifying a single integer. The difficulty with integers is that there are infinitely many of them. Using an information theory point of view, assigning equal probability to all integers would imply that any particular integer would have no probability of occurring. If I ask you to give me a positive integer, from 1 to infinity with equal probability, there is no chance that you will give me an integer below any particular cutoff value, say N. This means that you will need arbitrarily many digits to specify the integer, and there is no limit to the information required. Thus the complexity of specifying a single integer is infinite. However, if we allow only integers between 1 and a large positive number (say $N = 10^{90}$, roughly the number of elementary particles in the known universe), the complexity of specifying one of the integers is only log(N), about 300 bits. The drastic difference between the
complexity of specifying an arbitrary integer (infinite) and the complexity of an enormously large number of integers (300 bits) suggests that systems that are easy to define may be highly complex. The whole field of number theory has shown that integers are not as simple as they first appear. The measure of complexity of specifying a single integer may appear to be far from more abstract discussions like those of the halting problem or Gödel's theorem (Section 1.9.5); however, they are related. This is apparent since these theorems do not apply to finite sets.

In what sense are integers simple? We can consider the length of a UTM input string that can generate all the positive integers. The program would, starting from zero and keeping a list, progressively add one to the preceding integer. The problem is that such a program never halts, and the task is not complete. We can generalize our definition of a Turing machine to allow for this case by saying that, by definition, this simple program is generating all integers. Then the algorithmic complexity of the integers is quite small. Another way to do this is to consider the complexity of recognizing an integer: the recognition complexity. Recognizing an integer is trivial if we are considering only binary strings, because all of them represent integers. The point, however, is that we can expand the space of possible characters to include various symbols: letters, punctuation, mathematical operations, etc. The mathematical operations might act upon integers. We then ask how long is a TM program that can recognize any integer that appears as a combination of such characters. The length of such a program is also small.

We see that we must distinguish between the complexity of elements of a set and the set itself. A program that recognizes integers is concerned with the attributes of the integers required to define them as a set, rather than the specification of a particular integer. As discussed in the last section, this is similar to the definition of their Kolmogorov or algorithmic complexity. The algorithmic complexity of the set of all integers is small even though the information contained in a single integer can be arbitrarily large. This distinction between the information contained in an element of a set and the information necessary to define the set will also be important when we consider the complexity of physical systems.

The complexity of a single real number is also infinite. Specifying an arbitrary real number requires infinitely many digits. However, if we confine ourselves to any reasonable precision, the complexity becomes very manageable. For example, the most accurately known fundamental constant in science is the electron magnetic moment in Bohr magnetons

$$\mu_e/\mu_B = 1.001159652193(10) \qquad (8.2.17)$$

where the parenthesis indicates the error estimate, corresponding to 11 accurate decimal digits or 37 binary digits. If we consider $1 - \mu_e/\mu_B$ we immediately lose 3 decimal digits. Thus, similar to integers, the practical complexity of a real number is not very large.

The discussion of integers and reals suggests that under practical circumstances a single number is not a highly complex object. Generally, the complexity of a system arises because of the presence of a large number of parameters that must be specified.
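The bit counts quoted for integers and reals can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `bits_to_specify` is hypothetical, and it assumes that all alternatives are equally likely, so that the information is the base-2 logarithm of the number of alternatives.

```python
import math

def bits_to_specify(log10_alternatives: float) -> float:
    """Bits needed to select one item out of 10**log10_alternatives
    equally likely alternatives (log2 of the number of alternatives)."""
    return log10_alternatives * math.log2(10)

# One integer between 1 and N = 10**90.
integer_bits = bits_to_specify(90)

# A real number known to 11 accurate decimal digits.
real_bits = bits_to_specify(11)

print(f"integer in 1..10^90: {integer_bits:.2f} bits")
print(f"11 decimal digits:   {real_bits:.2f} bits")
```

Rounded up to whole bits, the two results correspond to the figures of about 300 bits and 37 binary digits used in the discussion.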
However, there is only reason to consider them collectively as a system if they are coupled to each other.

The next category of mathematical objects that we consider are functions. To specify a function f(s) we must either describe its operation by a formula or specify its action on each possible argument. We consider Boolean functions (functions with binary output, see Section 1.9.2), $f(s) = \pm 1$, of a binary string, $s = (s_1 s_2 \ldots s_{N_e})$. The number of arguments of the function (input bits) is $N_e$. There are $2^{N_e}$ possible values of the input string. For each of these there are two possible outcomes (output values). All Boolean functions may be specified by listing the binary output for each possible input state. Each possible output is independent. The number of different Boolean functions is the number of possible sets of outputs, which is $2^{2^{N_e}}$. Assuming that all of the possible Boolean functions are equally likely, the complexity of a Boolean function (the amount of information necessary to specify it) is the logarithm of this number, or $C(f) = 2^{N_e}$. The representation of a Boolean function in terms of C(f) binary variables can also be made explicit as a string representing the presence or absence of terms in the disjunctive normal form described in Section 1.9.2.

A binary function with $N_a$ outputs is the same as $N_a$ independent Boolean functions. If we assume that all possible combinations of Boolean functions are equally likely, then the total complexity is the sum of the complexity of each, or

$$C(f) = N_a 2^{N_e} \qquad (8.2.18)$$

The asymmetry between input and output is a fundamental one. It arises because we need to specify for each possible input which of the possible outputs is output. Specifying "which" is a logarithmic operation in the number of possibilities, and therefore the influence of the output space on the complexity is logarithmic compared to the influence of the input. This discussion will be generalized later to consider a physical system that acts in response to its environment. The environment will be specified by a number of binary variables (environmental complexity) $N_e$, and its actions will be specified by a number of binary variables (action complexity) $N_a$.

8.3 Complexity of Physical Systems

In order to apply our understanding of the complexity of mathematical constructs to physical systems, we must develop a fundamental understanding of representations. The complexity of a physical system is to be defined as the length of the shortest string s that can represent its properties: the results of possible measurements/observations. In Section 8.3.1 we discuss the relationship between thermodynamics and information theory. This will enable us to define the complexity of ergodic and nonergodic systems. The resulting information measure is essentially that of Shannon information theory. When we consider algorithmic complexity, we can ask whether this is the smallest amount of information that might be used. This is discussed in Section 8.3.2. Section 8.3.3 introduces the complexity profile, which measures the complexity as a function of the scale of observation. Implications of the time scale of observation, for chaotic dynamics, are discussed in Section 8.3.4. Section 8.3.5
discusses examples and properties of the complexity profile. Sections 8.3.1 through 8.3.5 are based upon descriptive complexity. To better account for the behavior of a system in response to its environment we consider behavioral complexity in Section 8.3.6. This turns out to be closely related to descriptive complexity. Other issues related to the role of the observer are discussed in Section 8.3.7.

8.3.1 Entropy and the complexity of physical systems

The definition of complexity of a system requires us to develop an understanding of the relationship of information to the physical properties of a system. The most direct relationship is the relationship of entropy and information. At the outset, it should be understood that these are very different concepts. Entropy is a specific physical property of systems that are in equilibrium, or are in well-defined ensembles. Information is not a unique physical property. Instead it is related to representations of digits. Information can be a property of a time sequence or any other set of degrees of freedom. For example, the information content of a set of characters written on a piece of paper can be given. The entropy, however, would be largely a property of the paper or the ink. The entropy of paper is difficult to determine precisely, but simpler substances have entropies that have been determined and are tabulated at specific temperatures and pressures. We also know that entropy is conserved in reversible adiabatic processes and increases in irreversible ones.

Despite the significant conceptual difference between information and entropy, the formal definition of information discussed in Section 1.8 appears very similar to the definition of entropy discussed in Section 1.3. Thus, it makes sense that the two are related when we develop an understanding of complexity.

It is helpful to review the definitions. Information was defined for a string of characters. Given the probability of the string of characters, the information is defined by Eq. (8.2.1). The logarithm is taken to be base 2 so that the information is measured in units of bits. We see that the information content is related to selecting a single state out of an ensemble of possibilities. The entropy was defined first for the microcanonical ensemble, which specifies the macroscopic energy U, number of particles N, and volume V, of the system. We assume that all states (microstates) of the system with this energy, number of particles and volume are equally likely in the ensemble. The entropy was written as

$$S = k \ln \Omega(U,N,V) \qquad (8.3.1)$$

where $\Omega(U,N,V)$ is the number of such states. The coefficient k is defined so that the units of entropy are consistent with units of energy and temperature for the thermodynamic relationship T = dU/dS.

We can relate the two definitions in a mathematically direct but conceptually significant way. If we want to specify a particular microstate of a thermodynamic system, we must select this microstate from the whole ensemble. The probability of this particular state is given in the microcanonical ensemble by $P = 1/\Omega$. If we think about the state of the system as a message containing information, we can use Eq. (8.2.1) to give the amount of information as:

$$I(\{x,p\} \mid (U,N,V)) = S(U,N,V)/(k \ln 2) \qquad (8.3.2)$$
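The conversion between entropy and information can be made concrete with a small numerical sketch. The function names below are illustrative only, and the toy ensemble of 1024 equally likely microstates is an assumption made for the example; the value of k is the one listed in Table 8.3.1.

```python
import math

K_BOLTZMANN = 1.380658e-23  # Joule/K, value listed in Table 8.3.1

def entropy_from_states(num_states: float) -> float:
    """Microcanonical entropy S = k ln(Omega), in Joule/K."""
    return K_BOLTZMANN * math.log(num_states)

def information_bits(entropy: float) -> float:
    """Information I = S / (k ln 2): bits needed to pick one microstate."""
    return entropy / (K_BOLTZMANN * math.log(2))

# Toy ensemble (assumed for illustration): 2**10 equally likely microstates.
omega = 2 ** 10
s = entropy_from_states(omega)
bits = information_bits(s)  # equals log2(omega)

# Conversion factor: bits of information per Joule/K of entropy.
bits_per_joule_per_kelvin = information_bits(1.0)

print(f"S = {s:.3e} J/K  ->  I = {bits:.1f} bits")
print(f"1 J/K of entropy corresponds to {bits_per_joule_per_kelvin:.4e} bits")
```

The second printed number shows why tabulated macroscopic entropies, which are of order 1 to 100 Joule/K per mole, correspond to enormous amounts of microstate information.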
This expression should be understood as the amount of information contained in a microstate {x,p}, when the system is in the macrostate specified by U,N,V; it is also the information necessary to describe precisely the microstate. This is the fundamental relationship we are looking for. We review its meaning in terms of the description of a particular idealized physical system.

If we want to describe the microstate of a system, like a gas of particles in a box, classically we must specify all of the positions and momenta of the particles $\{x_i, p_i\}$. If N is the number of particles, then there are 6N coordinates, 3 position and 3 momentum coordinates for each particle. To specify exactly the position of each particle appears to require arbitrary precision in these coordinates. If we had to specify even a single position exactly, it would take an infinite number of binary digits. However, quantum mechanics is inherently granular, thus there is a smallest distance ∆x within which we do not need to specify one position coordinate of a particle. The particle location is uniquely given once it is within a region ∆x. More correctly, the particle must be located within a region of position and momentum of ∆x∆p = h, where h is Planck's constant. The granularity defines the precision necessary to specify the positions and momenta, and thus also the amount of information (number of bits) needed in order to describe completely the microstate. The definition of the entropy takes this into account, otherwise the counting of possible microstates of the system would be infinite. The complete calculation of the entropy (which also takes into account the indistinguishability of the particles) is given in Question 1.3.2. We now recognize that the calculation of the entropy is precisely a calculation of the information necessary to describe the microstate.

There is another way to think about the relationship of entropy and information. It follows from the recognition that the number of states of a string of $I(\{x,p\} \mid (U,N,V))$ bits is the same as the number of states of the system. If we consider a mapping of system states onto strings, the strings enumerate or label the system states. If there are $I(\{x,p\} \mid (U,N,V))$ bits in each string, then there is a one-to-one mapping of system states onto the strings, and a string uniquely identifies a system state. We say that a string represents a system microstate.

We thus identify the entropy of a physical system as the amount of information necessary to identify a single microstate from a specified macroscopic ensemble. For an ergodic macroscopic system, this definition is a robust one. It does not matter if we consider a typical or an average amount of information.

What happens if the system is nonergodic? There are two kinds of nonergodic systems we will discuss: a magnet with a well-defined magnetization below its ordering phase transition (see Section 1.6), and a glass where there are many frozen coordinates describing the local arrangements of atoms (see Section 1.4). Many of these coordinates do not change during the time of a typical experiment. Should we include the information necessary to specify the frozen variables as part of the entropy? We would like to separate the discussion of the frozen variables from the fast ones that are in equilibrium. We use the entropy S to refer to the fast ensemble: the enumeration of the kinetically accessible states of the system. The same function of the frozen variables we will call C.
For the magnet, the amount of information contained in frozen variables is small. For the Ising model of a magnet (Section 1.6), below the magnetization transition only a single binary variable is necessary to specify if the system magnetization is UP or DOWN. The amount of information is insignificant compared to the information in the microstate of a system, and therefore is generally ignored. We treat the magnet by giving the information about the magnetization explicitly as part of the ensemble description.

In contrast, for a glass, the amount of information that is included in the frozen variables is large. How does this information relate to the thermodynamic treatment of the system? The conventional thermodynamic theory of phase transitions does not consider the existence of frozen information. It is designed for systems like the magnet, where this information is insignificant, and thus it does not apply to the glass transition. A different theory is necessary which includes the change from an ergodic to a nonergodic system, or a change from information in fast variables to information in frozen variables.

Is there any relationship between the frozen information and the entropy? If they are related at all, there are two intuitive possibilities. One is that we must specify the frozen variables as part of the ensemble, and the amount of information necessary to describe the fast variables is just as large as if there were no frozen variables. The other is that the frozen variables balance against the fast variables so that when there is more frozen information there is less information in the fast variables. In order to determine which is correct, we will need to consider an experiment that measures both. As long as an experiment is being performed in which the frozen variables never change, then the amount of information in the frozen variables is fixed. Thermodynamic experiments only depend on entropy differences. We will need to consider an experiment that changes the frozen variables, for example, heating up a glass until it becomes a liquid or cooling it from a liquid to a glass. In such an experiment the frozen information must be accounted for. The difficulty with a glass is that we do not have an independent way to determine the amount of frozen information. Fortunately, there is another system where we do.

There is an intermediate example between a magnet and a glass that is of considerable interest. The structure of ice has a glasslike frozen disorder of its hydrogen atoms below approximately 100 K. The simplest way to think about this disorder is that it arises from a choice of orientations of the water molecule around the position of the oxygen atom. This means that there is a macroscopic amount of information necessary to specify the static structure of ice. The amount of information associated with this disorder can be calculated directly using a model for the structure of ice that takes into account the correlations between molecular orientations that are needed to form a self-consistent hydrogen structure within the oxygen lattice. A first estimate is based on an average of 3/2 orientations per molecule, or C = Nk ln(3/2) = 0.806 cal/(mole K). A review of better calculations is given in a book by Fletcher. The best is C = 0.8145 ± 0.0002 cal/(mole K). The other calculation we need is the amount of entropy in steam. This can be obtained using a slight modification of the ideal gas calculation that takes into account the rotational and internal vibrational motion of the water molecule.
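The first estimate for the frozen information of ice is easy to reproduce numerically. The sketch below is illustrative: the function names are hypothetical, the gas constant is the value R = kN0 from Table 8.3.1, and the conversion 1 cal = 4.184 Joule (the thermochemical calorie) is an assumption, since the text does not state which calorie is used.

```python
import math

R_JOULE = 8.3144        # gas constant R = k*N0, Joule/(K mole)
JOULE_PER_CAL = 4.184   # thermochemical calorie (assumed conversion)

def frozen_entropy_cal(orientations_per_molecule: float) -> float:
    """C = N k ln(W) for one mole, expressed in cal/(mole K)."""
    return R_JOULE * math.log(orientations_per_molecule) / JOULE_PER_CAL

def effective_orientations(entropy_cal: float) -> float:
    """Invert C = R ln(W): effective number of orientations per molecule."""
    return math.exp(entropy_cal * JOULE_PER_CAL / R_JOULE)

first_estimate = frozen_entropy_cal(1.5)   # W = 3/2 orientations per molecule
bits_per_molecule = math.log2(1.5)         # the same quantity in bits
w_best = effective_orientations(0.8145)    # W implied by the best value

print(f"C(3/2) = {first_estimate:.3f} cal/(mole K)")
print(f"frozen information = {bits_per_molecule:.3f} bits per molecule")
print(f"best value corresponds to W = {w_best:.4f}")
```

Under these assumptions the first estimate reproduces the quoted 0.806 cal/(mole K), which is a little more than half a bit per molecule, and the refined value corresponds to only slightly more than 3/2 effective orientations per molecule.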
The key experiment is to measure the change in the entropy of the system as a function of temperature as it is heated from ice all the way to steam. We find the entropy using the standard thermodynamic relationship (Section 1.3):

$$q = T\,dS \qquad (8.3.3)$$

where q is the heat added to the system. At close to a temperature of zero degrees Kelvin (T = 0 K) the entropy is zero because all motion stops, and there is only one possible state of the system. Thus we would expect

$$S(T) = \int_0^T q/T \qquad (8.3.4)$$

that is, the total amount of entropy added to the system as it is heated up should be the same as the entropy of the gas. However, experimentally there is a difference of 0.82 ± 0.05 cal/(mole K) between the two. This is the amount of entropy in the gas that was not added to the system as it was heated. The coincidence of two numbers, the amount of entropy missing and the calculation of the information in the frozen structure of the hydrogen atoms, suggests that the missing entropy was present in the original state of the ice:

$$S(T) = C(T = 0) + \int_0^T q/T \qquad (8.3.5)$$

This in turn implies that the information in the frozen degrees of freedom was transferred (but conserved) to the fast degrees of freedom. Eq. (8.3.5) is not consistent with the standard thermodynamic relationship in Eq. (8.3.3). Instead it should be modified to read:

$$q = T\,d(S + C) \qquad (8.3.6)$$

This should be understood as implying that adding heat to a system increases the information either of the fast or frozen variables. Adding heat (e.g., to ice) increases the temperature of the system, so that fewer variables are frozen. In this case C decreases and S increases more than would be given by the conventional relationship of Eq. (8.3.3). When heat is not added to a system, we see that there can be processes that change the number of fast degrees of freedom and the number of static degrees of freedom while leaving their sum the same. We will consider this further in later sections.

Eq. (8.3.6) is important enough to present it again from a different perspective. The discussion will help demonstrate its validity by using a theoretical argument (Fig. 8.3.1). Rather than considering it from the point of view of heating ice till it becomes steam, we consider what happens either to ice or to a glass when we cool it down through the transition where degrees of freedom become frozen. In a theoretical description we start, above the freezing-in transition, with an ensemble of systems. As we cool the system we remove heat, and this is reflected in a decrease in the number of possible states of the system. We think of this as a shrinking of the number of elements of the ensemble. However, as we go through the freezing-in transition, the
ensemble breaks up into disjoint pieces that cannot make transitions to each other. Any particular material must be in one of the disjoint pieces. Thus for a particular material we must track only part of the original ensemble. For an incremental decrease in temperature due to an incremental removal of heat, the information needed to identify (describe) a particular microstate is the sum of the information necessary to describe which of the disjoint parts of the ensemble the system is in, plus the information needed to specify which of the microstates the system is in once its ensemble fragment has been specified. This is the meaning of Eq. (8.3.6). The information to specify the ensemble fragment was transferred from the entropy S to the ensemble information C. The reduction of the entropy, S, is not reflected in the amount of heat that is removed.

Figure 8.3.1 Schematic illustration of the effect on motion in phase space of cooling through a glass transition. Above the glass transition (T1, T2 and T3) the system is ergodic: it explores the entire phase space. Cooling the system causes the phase space to shrink smoothly. The entropy, the logarithm of the volume of phase space, decreases. Below the glass transition, T4, the system is no longer ergodic and the phase space breaks up into pieces. A particular system explores only one of the pieces. The total amount of information necessary to specify a particular microstate (e.g., indicated by the *) is the sum of C/k ln(2), the information necessary to specify which piece, and S/k ln(2), the information necessary to specify the particular state within the piece.

We are now in a position to give a first definition of complexity. In order to describe a system and its behavior over time, we must describe the ensemble it is in. This information is given by C/k ln(2). If we insist on describing the microstate of the system, we must add the information contained in the fast degrees of freedom S/k ln(2). The question is whether we should insist on describing the microstate. Typically, the
whole point of describing an ensemble is that we don't need to specify the particular microstate. We will return to address this question in greater detail later. However, for now it is reasonable to consider describing the system to be specifying just the ensemble. This implies that the information in the frozen variables C/k ln(2) is the complexity. For a thermodynamic system in the microcanonical ensemble, the complexity would be given by the (small) number of bits in the specification of the three variables (U,N,V) and the number of bits necessary to specify the type of element (atom, molecule) that is present. The actual amount of information seems not to be precisely defined. For example, we have not identified the number of bits to be used in specifying (U,N,V). As we have seen in the discussion of algorithmic complexity, this is to be expected, since the conventions of how the information is specified are crucial when there is only a small amount.

We have learned from this discussion that for a nonergodic system, the complexity (the frozen ensemble information) is bounded by the sum over the number of fast and static degrees of freedom (C + S > C). For material systems, we know in principle how to measure this. As in the case of ice, we heat up the system to the vapor phase where the entropy can be calculated, then subtract the entropy added during the heating process. This gives us the value of C + S at the temperature from which the heating began. If we know that C >> S, then the result is the complexity itself. In order for this technique to work at all, the complexity must be large enough so that experimental accuracy can enable its measurement. Estimates we will give later imply that complexities of biological organisms are too small to be measured in this way.

The concept of frozen degrees of freedom immediately raises the question of the time scale in which the experiment is performed. Degrees of freedom that are frozen on one time scale are not on sufficiently longer ones. If our time scale of observation would be arbitrarily long, we would always describe systems in equilibrium. The entropy would then be large and the complexity would be negligible. On the other hand, if our time scale of observation was extremely short so that microscopic motions were detected, then our complexity would be large and the entropy would be negligible. This motivates the introduction of the complexity profile in Section 8.3.3.

Question 8.3.1 Calculate the information necessary to specify the microstate of a mole of an ideal gas at T = 0°C and P = 1 atm. Use the mass of a helium or neon atom for the mass of the ideal gas particle. This requires a careful investigation of units. A table of fundamental physical constants is given on the following page.

Solution 8.3.1 The entropy of an ideal gas is found in Section 1.3 to be:

$$S = kN\left[\ln\left(V/N\lambda(T)^3\right) + 5/2\right] \qquad (8.3.7)$$

$$\lambda(T) = \left(h^2/2\pi mkT\right)^{1/2} \qquad (8.3.8)$$

The information content of a microstate is given by Eq. (8.3.2).
Each of the quantities must be evaluated numerically from appropriate tables. A mole of particles is

$$N_0 = 6.0221367 \times 10^{23}\ \text{/mole} \qquad (8.3.9)$$

At the temperature

$$T_0 = 0\,^{\circ}\text{C} = 273.15\ \text{K} \qquad (8.3.10)$$

$$kT_0 = 0.0235384\ \text{eV} \qquad (8.3.11)$$

and pressure

$$P_0 = 1\ \text{atm} = 1.01325 \times 10^{5}\ \text{Pascal} = 1.01325 \times 10^{5}\ \text{Newton/m}^2 \qquad (8.3.12)$$

the volume (of a mole of particles) of an ideal gas is:

$$V = N_0 kT_0/P_0 = 22.41410 \times 10^{-3}\ \text{m}^3\text{/mole} \qquad (8.3.13)$$

the volume per particle is:

$$V/N = 37219.5\ \text{Å}^3 \qquad (8.3.14)$$

At the same temperature we have:

$$\lambda(T) = \left(2\pi mkT/h^2\right)^{-1/2} = m[\text{AMU}]^{-1/2} \times 1.05633\ \text{Å} \qquad (8.3.15)$$

This gives the total information for a mole of helium gas at these conditions of

$$I = N_0\left(18.5533 + \tfrac{3}{2}\ln(m[\text{AMU}])\right) = 1.24 \times 10^{25} \qquad (8.3.16)$$

Note that the amount of information per particle is only of order 10 bits.

Table 8.3.1 Fundamental constants

hc = 12398.4 eV Å
k = 1.380658 × 10^-23 Joule/K
R = kN0 = 8.3144 Joule/K/mole
c = 2.99792458 × 10^8 Meter/second
h = 6.6260755 × 10^-34 Joule second
e = 1.60217733 × 10^-19 Coulomb
ProtonMass = 1.6726231 × 10^-27 kilogram
1 AMU = 1.6605402 × 10^-27 kilogram = 9.31494 × 10^8 eV
M[Helium] = 4.0026 AMU
M[Neon] = 20.179 AMU
M[Helium] c^2 = 3.7284 × 10^9 eV
M[Neon] c^2 = 1.87966 × 10^10 eV
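The numerical evaluation in this solution can be reproduced with a short script. This is a sketch under stated assumptions rather than the author's own procedure: the function names are illustrative, the constants are the SI values listed in Table 8.3.1 with T0 taken as 273.15 K, and every logarithm is converted to base 2 so that the result is in bits.

```python
import math

# Constants as listed in Table 8.3.1 (SI units).
H = 6.6260755e-34    # Planck's constant, Joule second
K = 1.380658e-23     # Boltzmann's constant, Joule/K
AMU = 1.6605402e-27  # atomic mass unit, kilogram
N0 = 6.0221367e23    # particles per mole
T0 = 273.15          # 0 degrees C in Kelvin
P0 = 1.01325e5       # 1 atm in Pascal

def thermal_wavelength(mass_amu: float, temperature: float = T0) -> float:
    """lambda(T) = h / sqrt(2 pi m k T), in meters."""
    return H / math.sqrt(2 * math.pi * mass_amu * AMU * K * temperature)

def bits_per_particle(mass_amu: float, temperature: float = T0,
                      pressure: float = P0) -> float:
    """Ideal gas entropy per particle divided by k ln 2, i.e. in bits."""
    volume_per_particle = K * temperature / pressure   # V/N for an ideal gas
    lam = thermal_wavelength(mass_amu, temperature)
    s_over_k = math.log(volume_per_particle / lam ** 3) + 2.5
    return s_over_k / math.log(2)

unit_mass_bits = bits_per_particle(1.0)      # the constant term, m = 1 AMU
helium_bits = bits_per_particle(4.0026)      # helium atom
helium_mole_bits = N0 * helium_bits          # one mole of helium

print(f"lambda(m = 1 AMU) = {thermal_wavelength(1.0) * 1e10:.5f} Angstrom")
print(f"bits per particle (m = 1 AMU) = {unit_mass_bits:.4f}")
print(f"bits per helium atom = {helium_bits:.3f}")
print(f"bits per mole of helium = {helium_mole_bits:.3e}")
```

With these inputs the script reproduces the thermal wavelength of 1.05633 Å and the constant term of about 18.55 bits. For the helium total it gives close to 1.30 × 10^25 bits, because the mass term is also converted to base 2; the quoted 1.24 × 10^25 is recovered if the mass term is left as a natural logarithm, as it is written in the expression for I above.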
8.3.2 Algorithmic complexity of physical systems

The complexity of a system is designed to measure the amount of information necessary to describe it, or its behavior. In this section we address the key word "necessary." This word suggests that we are after the minimum amount of information. The minimum amount of information depends on our capabilities of inference from a smaller amount of information. As discussed in Section 8.2.2, logical inference and computation lead to the definition of algorithmic complexity. However, for an ensemble that can be described simply, the algorithmic complexity is no different than the Shannon information. Since we have established a connection between the complexity of physical systems and representations in terms of character strings, we can apply these results directly to physical systems.

A physical system in equilibrium is represented by an ensemble. At any particular time, it is in a single microstate. The specification of this microstate can be compressed by encoding in certain rare cases. However, on average the compression cannot lead to an amount of information significantly different from the entropy (divided by k ln(2)) of the system. This conclusion follows because the microcanonical (or canonical) ensemble can be concisely described.

For a nonergodic system like a glass, the microstate description has been separated into two parts. It is no longer true that the ensemble of dynamically accessible states of a particular system is concisely describable. The information in the frozen degrees of freedom is precisely the information necessary to specify the ensemble of dynamically accessible states. The total information, (C + S)/k ln(2), represents the selection of a microstate from a simple ensemble (microcanonical or canonical). Since the total information cannot be compressed, neither can either of the two parts of the information: the frozen degrees of freedom that we have identified with the complexity, or the additional information necessary to specify a particular microstate. Thus the algorithmic complexity is the same as the information for either part.

We can now, finally, explain the experimental observation that an adiabatic process does not change the entropy of a system (Section 1.3). The algorithmic description of an adiabatic process requires only a few pieces of information, e.g., the size of a force applied over a specified distance. If a new microstate of the system can be described by the original microstate plus the process of adiabatic change, then the amount of information in the microstate has not been changed, and the adiabatic process does not change the microstate algorithmic complexity, that is, the entropy of the system. Like other aspects of statistical mechanics (Section 1.3), this should not be understood as a proof but rather as an explanation of the relationship of the thermodynamic observation to the microscopic properties. Using this explanation, we can identify the nature of an adiabatic process as one that is described microscopically by a small amount of information. This becomes clearer when we compare adiabatic and irreversible processes. Our argument that an adiabatic process does not change the entropy is based on considering the information necessary to describe an adiabatic process: slowly moving a piston to expand the space available to a gas. An irreversible process could achieve a similar expansion, but would not be thermodynamically the same. Take, for example,
the removal of a partition that separates the gas from a second, initially empty, chamber. The irreversible process of expansion of the gas results in a final state which has a higher entropy (see Question 1.3.4). The removal of a partition in itself does not appear to require a lot of information to describe. One moment after the partition is removed, the entropy of the system is the same as before. To understand how the entropy increases, we must consider the nature of irreversible dynamics.

A key ingredient in our understanding of physical systems is that the time evolution of an isolated system can be obtained from the simple laws of mechanics (classical or quantum). This means that the dynamics of an isolated system conserves the amount of information as well as the energy. Such dynamics are called conservative. If we consider an ensemble of systems starting in a particular region of phase space, the phase space position evolves in time, but the volume of the phase space that is occupied (the entropy) does not change. This conservation of phase space can be understood from our discussion of algorithmic complexity: since the deterministic dynamics of a system can be computed, the algorithmic complexity of the system is conserved. Where does the additional entropy come from for the final equilibrium state after the expansion?

There are two parts to the process of proceeding to a true equilibrium state. In the first part the distinction between the nonequilibrium and equilibrium state is obscured. At first there is macroscopically observable information: the particles are in one half of the chamber. This information is converted to microscopic correlations between atomic positions and momenta. The conversion occurs when the gas expands to fill the chamber, and various currents that follow this expansion become smaller and smaller in extent. The microscopic correlations cannot be observed on a macroscopic scale, and for standard observations the system is indistinguishable from an equilibrium state. The transfer of information from macroscopic to microscopic scale is related to issues of chaos in the dynamics of physical systems, which will be discussed later.

The second part to the process is an actual increase in the entropy of the system. The additional entropy must come from outside the system. In macroscopic physical processes, we are not generally concerned with isolating the system from information transfer, only with isolating the system from energy transfer. Thus we can surmise that the expansion of the gas is followed by an information transfer that enables the entropy to increase to its equilibrium value without changing the energy of the system. Many of the issues related to describing this nonequilibrium process will not be addressed here. We will, however, begin to address the topic of the scale of observation at which correlations appear using the complexity profile in the following section.

8.3.3 Complexity profile

General approach

In this section we discuss the relationship of microscopic and macroscopic complexity. Our objective is to develop a consistent language for discussing complexity as a function of length scale. In the following section we will discuss the complexity as a function of time scale, which generalizes the discussion of frozen and fast degrees of freedom in Section 8.3.1.
When we describe a system, we are not generally interested in a microscopic description of the positions and velocities of all of the particles. For a thermodynamic system there are only a few macroscopic parameters that we use to describe the system. This is indeed the reason we use entropy as a summary of the many hidden parameters of the system that we are not interested in. The microscopic parameters change too fast and over too small distances to matter for our macroscopic measurements/experience. The same is true more generally about systems that are not in equilibrium: a macroscopic description does not require specifying the position of each atom. This implies that we must develop an understanding of complexity that is not tied to the microscopic description, but is relevant to observations at a particular length and time scale. This point lies at the root of a conceptual problem in thinking about the complexity of systems. A gas in equilibrium has a large entropy which is its microscopic complexity. This is counter to our understanding of complex systems. Systems in equilibrium are intuitively simpler than nonequilibrium systems such as a human being. In Section 8.3.1 we started to address this problem by identifying the complexity of a nonergodic system as the information necessary to specify the frozen degrees of freedom. We now discuss a more systematic approach to dealing with macroscopic observations. In order to consider the macroscopic complexity, we have to define what we mean by macroscopic in a formal sense. The concept of macroscopic must be understood in relation to a particular observer. While we often consider experimental results to be independent of the observer, there are various ways in which the observer is essential to the observation. In this context, in which we are concerned with the meaning of macroscopic, considering the observer is essential. How do we characterize the difference between a microscopic and a macroscopic observer? 
The most crucial difference is that a microscopic observer is able to distinguish between all inherently distinguishable states of the system, while a macroscopic observer cannot. For a macroscopic observer, many microscopically distinct states appear the same. This is related to our understanding of complexity, because the macroscopic observer need only specify which of the macroscopically distinct states the system is in. The microscopic observer must specify which of the microscopically distinct states the system is in. Thus the macroscopic complexity must always be smaller than the microscopic complexity of a system. Instead of considering a unique macroscopic observer, we will consider a sequence of observers with a progressively poorer ability to distinguish microstates. Using these observers, we will define the complexity profile.

Ideal gas

These ideas can be directly applied to the ideal gas. We generally think about a macroscopic observer as having an inability to distinguish fine-scale distances. Thus we expect that the usual uncertainty in particle position $\Delta x$ will increase for a macroscopic observer. However, we learn from quantum mechanics that a unique microstate of the system is defined using an uncertainty in both position and momentum, $\Delta x \Delta p = h$. Thus for the macroscopic observer to confuse distinct microstates, the product $\Delta x \Delta p$ must be larger than its minimum value: an observation of the system provides measurements of the position and momentum of each particle, whose uncertainty has a product greater than $h$. We can label our observers by this uncertainty, which we call $\tilde{h}$.
If we retrace our steps to the calculation of the entropy of an ideal gas (Question 1.3.2), we can recognize that essentially the same calculation applies to the complexity with the uncertainty $\tilde{h}$. An observer with the uncertainty $\tilde{h}$ will determine the complexity of the ideal gas according to Eq. (8.3.7) and Eq. (8.3.8), with $h$ replaced by $\tilde{h}$. Thus we define the complexity profile for the ideal gas in equilibrium as:

$$C(\tilde{h}) = S - 3kN\ln(\tilde{h}/h), \qquad \tilde{h} > h \tag{8.3.17}$$

This equation describes a complexity that decreases as the ability of the observer to distinguish states decreases. This is as we expected. Despite the weak logarithmic dependence on $\tilde{h}$, $C(\tilde{h})$ decreases rapidly because the coefficient of the logarithm is so large. By the time $\tilde{h}$ is about 100 times $h$ the complexity profile has become negative for the ideal gases described in Question 8.3.1.

What does a negative complexity mean? It actually means that we have not done the calculation quite right. The counting of states we did for the ideal gas assumed that the particles were well separated from each other. If they begin to overlap then we must count the possible states differently. This overlap is significant precisely when Eq. (8.3.17) becomes negative. If the particles really overlapped then quantum statistics becomes important; the gas is said to be degenerate and satisfies either Fermi-Dirac or Bose-Einstein statistics. In our case the overlap arises only because the observer cannot distinguish different particle positions. In this case, the counting of states is appropriate to a classical ideal gas, as we now explain.

To calculate the complexity as a function of $\tilde{h}$ for an equilibrium state whose entropy is $S$, we start by calculating the number of microstates that the observer cannot distinguish. The logarithm of this number of microstates, which we call $S(\tilde{h})/k\ln(2)$, is the amount of information necessary to specify a microstate, if the macrostate is known. Thus we have that:

$$C(\tilde{h}) = S - S(\tilde{h}) \tag{8.3.18}$$

To count the number of microstates that the observer cannot distinguish, we note that the possible microstates of a particular particle are grouped together by the observer into bins (regions or cells of position and momentum) of size $(\Delta x \Delta p)^d = \tilde{h}^d$, where $d = 3$ is the dimensionality of space. The observer determines only that a particle is within a certain region.
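The scale at which Eq. (8.3.17) reaches zero can be estimated numerically. The sketch below is only an illustration under stated assumptions: it takes the ideal gas entropy in the form $S = kN[\ln(V/N\lambda^3) + 5/2]$, and it uses helium at 300 K and 1 atm as a stand-in, since the gases of Question 8.3.1 are not reproduced here. The function names and the choice of gas are hypothetical.

```python
import math

# Physical constants (SI units).
H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def thermal_wavelength(mass, temperature):
    """Thermal wavelength lambda = h / sqrt(2 pi m k T)."""
    return H / math.sqrt(2.0 * math.pi * mass * K_B * temperature)


def entropy_per_particle(mass, temperature, pressure):
    """S/(kN) = ln(V/(N lambda^3)) + 5/2 for a classical ideal gas."""
    volume_per_particle = K_B * temperature / pressure  # ideal gas law
    lam = thermal_wavelength(mass, temperature)
    return math.log(volume_per_particle / lam**3) + 2.5


def complexity_per_particle(s_per_particle, h_ratio):
    """C/(kN) from Eq. (8.3.17), where h_ratio = h~/h >= 1."""
    return s_per_particle - 3.0 * math.log(h_ratio)


def zero_crossing_ratio(s_per_particle):
    """Value of h~/h at which Eq. (8.3.17) gives C = 0."""
    return math.exp(s_per_particle / 3.0)


# Assumed example gas: helium (4.002602 u) at 300 K and 1 atm.
s = entropy_per_particle(4.002602 * AMU, 300.0, 101325.0)
ratio = zero_crossing_ratio(s)
print(f"S/(kN) = {s:.2f}, C reaches zero near h~/h = {ratio:.0f}")
```

Under these assumptions the crossing falls at a few hundred times $h$ at most, the same order of magnitude as the factor of about 100 quoted in the text.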
In the classical ideal gas each particle moves independently, so more than one particle may occupy the same microstate. However, this is unlikely. As $\tilde{h}$ increases it becomes increasingly likely that there is more than one particle in a region. If the number of particles in a certain region is $n_i$, then the number of distinct microstates of the bin that the observer does not distinguish is:

$$\frac{g^{n_i}}{n_i!} \tag{8.3.19}$$

where $g = (\tilde{h}/h)^d$ is the number of microstates within a region. This is the product of the number of states each particle may be in, corrected for particle indistinguishability. The number of microstates of the whole system that appear to the observer to be the same is the product of such terms for each region:
$$\prod_i \frac{g^{n_i}}{n_i!} \tag{8.3.20}$$

From this we can determine the complexity of the state determined by the observer as:

$$C(\tilde{h}) = S - S(\tilde{h}) = S - k\ln\left(\prod_i \frac{g^{n_i}}{n_i!}\right) \tag{8.3.21}$$
If we consider this expression when $g = 1$ (a microscopic observer) then $n_i$ is almost always either zero or one and each term in the product is one (a more exact treatment requires treating the statistics of a degenerate gas). Then $C(\tilde{h})$ is $S$, which means that the microstate complexity is just the entropy. For $g > 1$ but not too large, $n_i$ will still be either zero or one, and we recover Eq. (8.3.17). On the other hand, using this expression it is possible to show that for a large value of $g$, when the values of $n_i$ are significantly larger than one, the complexity goes to zero. We can understand this by recognizing that as $g$ increases, the number of particles in each bin increases and becomes closer to the average number of particles in a bin according to the macroscopic probability distribution. This is the equilibrium macrostate. By our conventions we are measuring the amount of information necessary for the observer to specify its observation in relation to the equilibrium state. Therefore, when the average number of particles in a bin becomes close enough to this distribution, there is no information that must be given. To write this explicitly, when $n_i$ is much larger than one we apply Stirling's approximation to the factorial in Eq. (8.3.21) to obtain:

$$C(\tilde{h}) = S - k\sum_i n_i\left(\ln(g/n_i) + 1\right) = S + kg\sum_i P_i\ln(P_i) - kN \tag{8.3.22}$$

where $P_i = n_i/g$ is the probability a particle is in a particular state according to the observer. It is shown in Question 8.3.2 that $C(\tilde{h})$ is zero when $P_i$ is the equilibrium probability for finding a particle in region $i$ (note that $i$ stands for both position and momentum $(x,p)$). There are additional smaller terms in Stirling's approximation to the factorial that we have neglected. These terms are generally ignored in calculations of the entropy because they are not proportional to the number of particles. They are, however, relevant to calculations of the complexity:

$$C(\tilde{h}) = S - k\sum_i n_i\left(\ln(g/n_i) + 1\right) + k\sum_i \ln\left(\sqrt{2\pi n_i}\right) \tag{8.3.23}$$
The additional terms are related to fluctuations in the density. This will become apparent when we analyze nonuniform systems below. We will discuss additional examples of the complexity profile below. First we simplify the complexity profile for observers that measure only the positions and not the momenta of particles.
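The relationship between the exact count of Eq. (8.3.21) and its Stirling forms can be checked numerically. The sketch below is an illustration only; the function names and the sample occupancies are hypothetical. It evaluates $S(\tilde{h})/k = \ln\prod_i g^{n_i}/n_i!$ exactly with the log-gamma function and compares it with the leading Stirling form and with the form that keeps the $\ln\sqrt{2\pi n_i}$ term.

```python
import math


def hidden_entropy_exact(occupancies, g):
    """S(h~)/k = ln( prod_i g**n_i / n_i! ), using lgamma(n + 1) = ln(n!)."""
    return sum(n * math.log(g) - math.lgamma(n + 1) for n in occupancies)


def hidden_entropy_stirling(occupancies, g, with_correction=False):
    """Stirling estimate of S(h~)/k: sum_i n_i (ln(g/n_i) + 1).

    With with_correction=True the next term of Stirling's approximation,
    -ln(sqrt(2 pi n_i)), is included for each occupied bin.
    """
    total = 0.0
    for n in occupancies:
        if n == 0:
            continue  # empty bins contribute a factor of one
        total += n * (math.log(g / n) + 1.0)
        if with_correction:
            total -= 0.5 * math.log(2.0 * math.pi * n)
    return total


# Microscopic observer (g = 1): every term in the product is one.
microscopic = hidden_entropy_exact([0, 1, 1, 0], g=1)

# Coarse observer: 50 bins with 1000 particles each (assumed values).
bins = [1000] * 50
exact = hidden_entropy_exact(bins, g=1.0e6)
leading = hidden_entropy_stirling(bins, g=1.0e6)
corrected = hidden_entropy_stirling(bins, g=1.0e6, with_correction=True)
print(exact - leading, exact - corrected)
```

The leading form misses the exact value by an amount that grows with the number of bins rather than the number of particles, which is why the extra terms matter for the complexity but not for the entropy.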
Question 8.3.2 Show that Eq. (8.3.22) is zero when $P_i$ is the equilibrium probability of locating a particle in a particular state identified by momentum $p$ and position $x$. For simplicity assume that all $g$ states in the cell have essentially the same position and momentum.

Solution 8.3.2 We calculate an expression for $P_i \to P(x,p)$ using the Boltzmann probability for a single particle (since all are independent):

$$P(x,p) = NZ^{-1}e^{-p^2/2mkT} \tag{8.3.24}$$

where $Z$ is the one particle partition function given by:

$$Z = \sum_{x,p} e^{-p^2/2mkT} = \int \frac{d^3x\, d^3p}{h^3}\, e^{-p^2/2mkT} = \frac{V}{\lambda^3} \tag{8.3.25}$$

We evaluate the expression:

$$-k\sum_i g P(x,p)\ln(P(x,p)) + kN \tag{8.3.26}$$

which, by Eq. (8.3.22), we want to show is the same as the entropy. Since all $g$ states in cell $i$ have essentially the same position and momentum, this is equal to:

$$-k\sum_{x,p} P(x,p)\ln(P(x,p)) + kN = k\sum_{x,p}\left[\ln(V/N\lambda^3) + p^2/2mkT\right](N\lambda^3/V)\,e^{-p^2/2mkT} + kN \tag{8.3.27}$$

which is most readily evaluated by recognizing it as:

$$kN + kNZ^{-1}\left[\ln(V/N\lambda^3) - \frac{1}{2}\lambda\frac{d}{d\lambda}\right]Z = kN\left[\ln(V/N\lambda^3) + 5/2\right] \tag{8.3.28}$$

which is $S$ as given in Eq. (8.3.7).

Position without momentum

The use of the scale parameter $\Delta x \Delta p$ in the above discussion should trouble us, because we do not generally consider the momentum uncertainty on the macroscopic scale. The resolution of this problem arises because we have assumed that the system has a known energy or temperature. If we know the temperature then we know the thermal velocity or momentum:

$$\Delta p \approx \sqrt{mkT} \tag{8.3.29}$$

It does not make sense to have a momentum uncertainty of a particle that is much greater than this. Using $\Delta x \Delta p = h$ this means there is also a natural uncertainty in position which is the thermal wavelength $\lambda$ given by Eq. (8.3.8). This is the maximal quantum position uncertainty, unless the observer can distinguish the thermal motion of individual particles. We can now think about a sequence of observers who do not distinguish the momentum of particles (they have a larger uncertainty than the thermal momentum) but have increasing uncertainty in position given by $L = \Delta x$, or $g = (L/\lambda)^d$. For such observers the equilibrium momentum probability distribution
is to be assumed. In this case the number of particles in a cell $n_i$ contributes a term to the entropy that is equal to the entropy of a gas with this many particles in the volume $L^d$. This gives a total entropy of:

$$S(L) = k\sum_i n_i\left(\ln(L^d/n_i\lambda^3) + 5/2\right) \tag{8.3.30}$$

and the complexity is:

$$C(L) = S - k\sum_i n_i\left(\ln(g/n_i) + 5/2\right) \tag{8.3.31}$$

which differs in form from Eq. (8.3.22) only in the constant.

While we generally do not think about measuring momentum, we do measure velocity. This follows from the content of the previous paragraph. We consider observers that measure particle positions at different times and from this they may infer the velocity and indirectly the momentum. Since the observer measures $n_i$, the determination of velocity depends on the observer's ability to distinguish moving spatial density variations. Thus we consider the measurement of $n(x,t)$, where $x$ has macroscopic meaning as a granular coordinate that has discrete values separated by $L$. We emphasize, however, that this description of a space- and time-dependent density assumes that the local momentum distribution of the system is consistent with an equilibrium ensemble. The more fundamental description is given by the distribution of particle positions and momenta, $n_i = n(x,p)$. Thus, for example, we can also describe a rotating disk that has no macroscopic changes in density over time, but the rotation is still macroscopic. We can also describe fluid flow in an incompressible fluid. In this section we continue to restrict ourselves to the description of observations at a particular time. The time dependence of observations will be considered in Section 8.3.4.

Thus far we have considered systems that are in generic states selected from the equilibrium ensemble. Equilibrium systems are uniform on all but very microscopic scales, unless we are exactly at a phase transition. Thus, most of the complexity disappears on a scale that is far smaller than typical macroscopic observations. This is not necessarily true about nonequilibrium systems. Systems that are in states that are far from equilibrium can have nonuniform densities of particles. A macroscopic observer will see these macroscopic variations. We will consider a couple of different examples of nonequilibrium states to illustrate some properties of the complexity profile. Before we do this we need to consider the effect of algorithmic compression on the complexity profile.

Algorithmic complexity and error

To discuss macroscopic complexity more completely, we turn to algorithmic complexity as a function of scale. The complexity of a system, particularly a nonequilibrium system, should be defined in terms of the algorithmic complexity of its description. This means that patterns that are present in the positions (or momenta) of its particles can be used to simplify the description.

Using this discussion we can reformulate our understanding of the complexity profile. We defined the profile using observers with progressively poorer ability to distinguish microstates. The fraction of the ensemble occupied by these states defined
the complexity. Using an algorithmic perspective we say, equivalently, that the observer cannot distinguish the true state from a state that has a smaller algorithmic complexity. An observer with a value of $g = 2$ cannot distinguish which of two states each particle occupies in the real microstate. Let us label the single particle states using an index that enumerates them. We can then imagine a checkerboard (in six dimensions of position and momentum) where odd indexed states are black and even ones are white. The observer cannot tell if a particle is in a black or a white state. Thus, no matter what the real state is, there is a simpler state where only odd (or only even) indexed states of the particles are occupied, which cannot be distinguished from the real system by the observer. The algorithmic complexity of this state with particles in odd indexed states is essentially the complexity that we determined above, $C(g = 2)$: it is the information necessary to specify this state out of all the states that have particles only in odd indexed states. Thus, in every case, we can specify the complexity of the system for the observer as the complexity of the simplest state that is consistent with the observations. By Occam's razor, this is the state that the observer will use to describe the system.

We note that this is also equivalent to defining the complexity profile as the length of the description as the error allowed in the description increases. The total error as a function of $g$ for the ideal gas is

$$\frac{1}{2}\log\left(\prod_i \Delta x_i \Delta p_i/h\right) = \frac{1}{2}N\log(g) \tag{8.3.32}$$

where $N$ is the number of particles in the system. The factor of 1/2 arises because the average error is half of the maximum error that could occur. We can define the complexity profile as a function of the number of errors that are made. This is better than using a particular length scale, which implies a different error for particles of different mass as indicated by Eq. (8.3.8). This approach is helpful since it suggests how to generalize the complexity profile for systems that have different types of particles. For conceptual simplicity, we will continue to write the complexity profile as a function of $g$ or of length scale.

Nonequilibrium states

Our next objective is to consider nonequilibrium states. When we have a nonequilibrium state, the microstate of the system is simpler than an equilibrium state to begin with. As we mentioned at the end of Section 8.3.2, there are nonequilibrium states that cannot be distinguished from equilibrium states on a macroscopic scale. These nonequilibrium states have microscopic correlations. Thus, the microscopic complexity is lower than the equilibrium entropy, while the macroscopic complexity is the same as in equilibrium:

$$\begin{aligned} C(g) &< C_0(g) = S_0, & g &= 1 \\ C(g) &= C_0(g), & g &\gg 1 \end{aligned} \tag{8.3.33}$$

where we use the subscript 0 to indicate quantities of the equilibrium state. We illustrate this by an example. Using the indexing of single particle states we just introduced, we take a microstate where all particles are in odd indexed states. The microstate complexity is the same as that of an equilibrium state at $g = 2$, which is less than the entropy of the equilibrium system:

$$C(g = 1) = C_0(g = 2) < C_0(g = 1)$$

However, the complexity of this system for scales of observation $g \geq 2$ is the same as that of an equilibrium system: macroscopic observers do not distinguish them.

This scenario, where the complexity of a nonequilibrium state starts smaller but then quickly becomes equal to the equilibrium state complexity, does not always hold. It is true that the microscopic complexity must be less than or equal to the entropy of an equilibrium system, and that all systems have the same complexity when $L$ is the size of the system. However, what we will show is that the complexity of a nonequilibrium system can be higher than that of the equilibrium system at large scales that are smaller than the size of the system. This is apparent in the case, for example, of a nonuniform density at large scales.

To illustrate what happens for such a nonequilibrium state, we consider a system that has nonuniformity that is characteristic of a particular length scale $L_0$, which is significantly larger than the microscopic scale but smaller than the size of the system. This means that $n_i$ is smooth on finer scales, and there is no particular relationship between what is going on in one region of length scale $L_0$ and another. The values of $n_i$ will be taken from a Gaussian distribution around the equilibrium value $n_0$ with a standard deviation of $\sigma$. We assume that $\sigma$ is larger than the natural density fluctuations, which have a standard deviation of $\sigma_0 = \sqrt{n_0}$. For convenience we also assume that $\sigma$ is much smaller than $n_0$.

We can calculate both the complexity $C(L)$ and the apparent entropy $S(L)$ for this system. We start by calculating them at the scale $L_0$. $C(L_0)$ is the amount of information necessary to specify the density values. This is the product of the number of cells $V/L_0^d$ times the information in a number selected from a Gaussian distribution of width $\sigma$. From Question 8.3.3 this is:

$$C(L_0) = k\frac{V}{L_0^d}\left(\frac{1}{2}\left(1 + \ln(2\pi)\right) + \ln\sigma\right) \tag{8.3.34}$$

The number of microstates consistent with this macrostate at $L_0$ is given by the sum of ideal gas entropies in each region:

$$S(L_0) = -k\sum_i n_i\ln(n_i/g) + (5/2)kN \tag{8.3.35}$$

Since $\sigma$ is less than $n_0$, this can be evaluated by expanding to second order in $\delta n_i = n_i - n_0$:

$$S(L_0) = S_0 - k\sum_i \frac{(\delta n_i)^2}{2n_0} = S_0 - \frac{kV\sigma^2}{2L_0^d n_0} \tag{8.3.36}$$

where $S_0$ is the entropy of the equilibrium system, and we used $\langle \delta n_i^2 \rangle = \sigma^2$. We note that when $\sigma = \sigma_0$ the logarithmic terms in the complexity reduce to the extra terms
found in Eq. (8.3.23). Thus, these terms are the information needed to describe the equilibrium fluctuations in the density.

We can understand the behavior of the complexity profile of this system. By construction, the minimum amount of information needed to specify the microstate is $C(\lambda) = S(L_0) + C(L_0)$. This is the sum over the entropy of equilibrium gases with densities $n_i$ in volumes $L_0^d$, plus $C(L_0)$. Since $S(L_0)$ is linear in the number of particles, while $C(L_0)$ is logarithmic in $\sigma$ and therefore logarithmic in the number of particles, we conclude that $C(L_0)$ is much smaller than $S(L_0)$. For $L > \lambda$ the complexity profile $C(L)$ decreases like that of an equilibrium ideal gas. The term $S(L_0)$ is eliminated at a microscopic length scale larger than $\lambda$ but much smaller than $L_0$. However, $C(L_0)$ remains. Due to this term the complexity crosses that of an equilibrium gas to become larger. For length scales up to $L_0$ the complexity is essentially constant and equal to Eq. (8.3.34). Above $L_0$ it decreases to zero as $L$ continues to increase by virtue of the effect of combining the different $n_i$ into fewer regions. Combining the regions results in a Gaussian distribution with a standard deviation that decreases as the square root of the number of terms, $\sigma \to \sigma(L_0/L)^{d/2}$. Thus, the complexity and entropy profiles for $L > L_0$ are:

$$\begin{aligned} C(L) &= k\frac{V}{L^d}\left(\frac{1}{2}\left(1 + \ln(2\pi)\right) + \ln\left(\sigma(L_0/L)^{d/2}\right)\right) \\ S(L) &= S_0 - \frac{kV\sigma^2}{2(LL_0)^{d/2}n_0} \end{aligned} \tag{8.3.37}$$

This expression continues to be valid until there is only one region left, and the complexity goes to zero. The precise way the complexity goes to zero is not described by Eq. (8.3.37), since the Gaussian distribution does not apply in this limit.

There are several comments that we can make that are relevant to understanding complexity profiles in general. First we see that in order for the macroscopic complexity to be higher than that in equilibrium, the entropy at the same scale must be reduced, $S(L) < S_0$. This is necessary because the sum $S(L) + C(L)$, the total information necessary to specify a microstate, cannot be greater than $S_0$. However, we also note that the reduction in $S(L)$ is much larger than the increase in $C(L)$. The ratio between the two is given by:

$$\frac{\delta S(L)}{\delta C(L)} = -\frac{\sigma^2 L^{d/2}}{2n_0 L_0^{d/2}}\,\frac{1}{\ln(\sigma/\sigma_0)} \tag{8.3.38}$$

For $\sigma > \sigma_0 = \sqrt{n_0}$ this is greater than one in magnitude. We can understand this result in two ways. First, a complex macroscopic system must be far from equilibrium, and therefore must have a much smaller entropy than an equilibrium system. Second, a macroscopic observer makes many errors in determining the microstate, and therefore if the microstate is similar to an equilibrium state, the observer cannot distinguish the two and the macroscopic properties must also be similar to an equilibrium state. For every bit of information that distinguishes the macrostate, there must be many bits of difference in the microstate.
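The per-cell information used in the complexity of the nonuniform gas, $\frac{1}{2}(1 + \ln(2\pi)) + \ln\sigma$ in units of $k$, can be checked against a direct sum over integer particle numbers. The sketch below is an illustration under assumptions: the function names, the cutoff of twelve standard deviations, and the value $\sigma = 20$ are hypothetical choices, not taken from the text.

```python
import math


def gaussian_information_nats(sigma):
    """Continuum result: (1/2)(1 + ln(2 pi)) + ln(sigma), in nats."""
    return 0.5 * (1.0 + math.log(2.0 * math.pi)) + math.log(sigma)


def discrete_information_nats(sigma, cutoff=12):
    """-sum_x P(x) ln P(x) for a Gaussian restricted to integer x.

    The distribution is truncated at +/- cutoff * sigma and normalized
    over the integers, so x is resolved to a single unit (one particle).
    """
    limit = int(cutoff * sigma)
    weights = [math.exp(-x * x / (2.0 * sigma * sigma))
               for x in range(-limit, limit + 1)]
    norm = sum(weights)
    return -sum((w / norm) * math.log(w / norm) for w in weights)


sigma = 20.0  # assumed width, in units of single particles
continuum = gaussian_information_nats(sigma)
discrete = discrete_information_nats(sigma)
bits = discrete / math.log(2.0)  # same information expressed in bits
print(continuum, discrete, bits)
```

For widths of more than a few particles the two values agree closely, which is consistent with reading the result as the information in an integer of typical magnitude $\sigma$.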
In calculating the complexity of the system at a particular scale, we assumed that the observer was in error in obtaining the position and momentum of each particle. However, we assumed that the number of particles within each bin was determined exactly. Thus the complexity we calculated is the information necessary to specify the number of particles precise to the single particle. This is why even the equilibrium density fluctuations were described. An alternative, more reasonable, approach assumes that particle counting is also subject to error. For simplicity we can assume that the error is a fraction of the number of particles counted. For macroscopic systems this fraction is much larger than the equilibrium fluctuations, which therefore need not be described. This approach also modifies the form of the complexity profile of the nonuniform gas in Eq. (8.3.37). Letting $\mu n_0(L)$ be the error in a measurement of particle number, we write:

$$C(L) = k\frac{V}{L^d}\left(\frac{1}{2}\left(1 + \ln(2\pi)\right) + \ln\frac{\sigma L_0^{d/2}}{\mu n_0(L)L^{d/2}}\right) \approx k\frac{V}{L^d}\ln\frac{\sigma L_0^{3d/2}}{\mu n_0(L_0)L^{3d/2}} \tag{8.3.39}$$

The expression for the entropy in Eq. (8.3.37) is unchanged. The error in measurement increases as $n_0(L) \propto L^d$ with the scale of observation. The consequence of this modification is that the complexity decreases somewhat more rapidly as the scale of observation increases.

Question 8.3.3 What is the information in a number (character) selected from a Gaussian distribution of standard deviation $\sigma$?

Solution 8.3.3 Starting from a Gaussian distribution (Eq. 1.2.39),

$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}e^{-x^2/2\sigma^2} \tag{8.3.40}$$

we calculate the information (Eq. 8.2.2):

$$I = -\int dx\,P(x)\log(P(x)) = \int dx\,P(x)\left[\log\left(\sqrt{2\pi}\,\sigma\right) + \frac{x^2}{2\sigma^2\ln(2)}\right] = \log\left(\sqrt{2\pi}\,\sigma\right) + \frac{1}{2\ln(2)} \tag{8.3.41}$$

where the second term in the integral can be evaluated using $\langle x^2 \rangle = \sigma^2$. We note that this result is to be interpreted as the information in a discrete distribution of integral values of $x$, like a random walk, that in the limit of large $\sigma$ gives a Gaussian distribution. The units that are used to measure $\sigma$ define the precision to which the values of $x$ are to be described. It thus makes sense that the information to specify an integer of typical magnitude $\sigma$ is essentially $\log(\sigma)$.

8.3.4 Time dependence: chaos and the complexity profile

General approach

In describing a system, we are interested in macroscopic observations over time, $n(x,t)$. As with the uncertainty in position, a macroscopic observer is not able to distinguish the time of observation within less than a certain time interval $T = \Delta t$. To define what this means, we say that the system is represented by an ensemble with probability $P_{L,T}(n(x,t))$, or more generally $P_{L,T}(n(x,p,t))$. The different microstates that occur during the time interval $T$ are all part of this ensemble. This may appear different than the definition we used for the spatial uncertainty. However, the definitions can be restated in a way that makes them appear equivalent. In this restatement we recognize that the observer performs measurements that are, in effect, averages over various possible microscopic measurements. The average measurements over space and time represent the system (or system ensemble) that is to be described by the observer. This representation will be discussed further in Section 8.3.6. The use of an ensemble is convenient because the observer may only measure one quantity, but we can consider various quantities that can be measured using the same degree of precision. The ensemble represents all possible measurements with this degree of precision. For example, the observer can measure correlations between particle positions that are fixed over time. If we averaged the density $n(x,t)$ over time, these correlations could disappear because of the movement of the whole system. However, if we average over the ensemble, they do not. We define the complexity profile $C(L,T)$ as the amount of information necessary to specify the ensemble $P_{L,T}(n(x,t))$. A description at a finer scale contains all of the information necessary to describe the coarser scale. Thus, $C(L,T)$ is a monotonic decreasing function of its arguments. A direct analysis is discussed in Question 8.3.4. We start, however, by considering the effect on $C(L,T)$ of prediction and the lack of predictability in chaotic dynamics.

Predictability and chaos

As discussed earlier, a key ingredient in our understanding of physical systems is that the time evolution of an isolated system (or a system whose interactions with its environment are specified) can be obtained from the simple laws of mechanics starting from a complete microscopic description of the position and momenta of the particles. Thus, if we use a small enough $L$ and $T$, so that each particle can be distinguished, we only need to specify $P_{L,T}(n(x,t))$ over a short period of time (or the simultaneous values of position and momentum) in order to predict the behavior over all subsequent times. The laws of mechanics are also reversible. We describe the past as well as the future from the description of a system at a particular time. This must mean that information is not lost over time. Systems that do not lose information over time are called conservative systems.

However, when we increase the spatial scale of observation, $L$, then the information loss (the complexity reduction) also limits the predictability of a system. We are not guaranteed that by knowing $P_{L,T}(n(x,t))$ at a scale $L$ we can predict the system behavior. This is true even if we are only concerned about predicting the behavior at the scale $L$. We may need additional smaller-scale information to describe the time evolution of the system. This is precisely the origin of the study of chaotic systems discussed in Section 1.1. Chaotic systems take information from smaller scales and bring it to larger scales. Chaotic systems may be contrasted with dissipative systems that take information from larger scales to smaller scales. If we perturb (disturb) a dissipative system, the effect disappears over time. Looking at such a system at a particular time, we cannot tell if it was perturbed at some time far enough in the past. Since the information on a microscopic scale must be conserved, we know that the
Since the information on a microscopic scale must be conserved, we know that the information that is lost on the macroscopic scale must be preserved on the microscopic scale. In this sense we can say that information has been transferred from the macroscopic to the microscopic scale. For such systems, we cannot describe the past from present information on a particular length scale.

For complex systems, the flow of information between length scales is bidirectional: even if the total amount of information at a particular scale is preserved, the information may change over time by transfer to or from shorter length scales. Unlike most theories of currents, information currents remain relevant even though they may be equal and opposite. All of the information that affects behavior at a particular length scale, at any time over the duration of the description, should be included in the complexity.

The degree of predictability is manifest when we consider that the complexity of a system C(L,T) at a particular L and T depends also on the duration of the description, the limits of $t \in [t_1, t_2]$. Like the spatial extent of the system, this temporal extent is part of the system definition. We typically keep these limits constant as we vary T to obtain the complexity profile. However, we can also characterize the dependence of the complexity on the time limits $t_1$, $t_2$ by determining the rate at which information is either gained or lost for a chaotic or stable system.

It is helpful to develop a conceptual image of the flow of information in a system. We begin by considering a conservative, nonchaotic and nondissipative system seen by an observer who is able to distinguish $2^{C(L)/k\ln(2)} = e^{C(L)/k}$ states. $C(L)/k\ln(2)$ is the amount of information necessary to describe the system during a single time interval of length T. For a conservative system the amount of information necessary to describe the state at a particular time does not change over time. The dynamics of the system causes the state of the system to change over time among these states. The sequence of states could be described one by one. This would require

$$N_T\, C(L)/k\ln(2) \qquad (8.3.42)$$

bits, where $N_T = (t_2 - t_1)/T$ is the number of time intervals. However, we can also describe the state at a particular time (e.g., the initial conditions) and the dynamics. The amount of information to do this is:

$$\left(C(L) + C_t(L,T)\right)/k\ln(2) \qquad (8.3.43)$$

$C_t(L,T)/k\ln(2)$ is the information needed to describe the dynamics. For a nonchaotic and nondissipative system we can show that this information is quite small. We know from the previous section that the macrostate of the system of complexity C(L) is consistent with a microstate which has the same complexity. The microstate has a dynamics that is simple, since it follows the dynamics of standard physical law. The dynamics of the simple microstate also describes the dynamics of the macrostate, which must therefore also be simple. Therefore Eq. (8.3.43) is smaller than Eq. (8.3.42) and the complexity is $C(L,T) = C(L) + C_t(L,T) \approx C(L)$. This holds for a system following conservative, nonchaotic and nondissipative dynamics.

For a system that is chaotic or dissipative, the picture must be modified to accommodate the flow of information between scales.
From the previous paragraph we conclude that all of the interesting (complex) dynamics of a system is provided by information that comes from finer scales. The observer does not see this information before it appears in the state of the system, i.e., in the dynamics. If we allow ourselves to see the finer-scale information we can track the flow of information that the observer does not see.

In a conventional chaotic system, the flow of information can be characterized by its Lyapunov exponents. For a system that is described by a single real valued parameter, x(t), the Lyapunov exponent is defined as an average over:

$$h = \ln\left(\frac{x'(t) - x(t)}{x'(t-1) - x(t-1)}\right) \qquad (8.3.44)$$

where unprimed and primed coordinates indicate two different trajectories. We can readily see how this affects the information needed by an observer to describe the dynamics. Consider an observer at a particular scale, L. The observer sees the system in state x(t − 1) at time t − 1, but he determines x(t − 1) only within a bin of width L. Using the dynamics of the system that is assumed to be known, the observer can determine the state of the system at the next time. This extrapolation is not precise, so the observer needs additional information to specify the next location. The amount of information needed is the logarithm of the number of bins that one bin expands into during one time step. This is precisely h/ln(2) bits of information. Thus, the complexity of the dynamics for the observer is given by:

$$C(L,T) = C(L) + C_t(L,T) + N_T\, k\, h \qquad (8.3.45)$$

where we have used the same notation as in Eq. (8.3.43).

A physical system that has many dimensions, like the microscopic ideal gas, will have one Lyapunov exponent for each of 6N dimensions of position and momentum. If the dynamics is conservative then the sum over all the Lyapunov exponents is zero,

$$\sum_i h_i = \log\left(\prod_i \Delta x_i(t)\,\Delta p_i(t) \Big/ \prod_i \Delta x_i(t-T)\,\Delta p_i(t-T)\right) = 0 \qquad (8.3.46)$$

where $\Delta x_i(t) = x'_i(t) - x_i(t)$ and $\Delta p_i(t) = p'_i(t) - p_i(t)$. This follows directly from conservation of volumes of phase space in conservative dynamics. However, while the sum over all exponents is zero, some of the exponents may be positive and some negative. These correspond to chaotic and dissipative modes of the dynamics. We can imagine the flow of information as consisting of two streams, one going to higher scales and one to lower scales. The complexity of the system is given by:

$$C(L,T) = C(L) + C_t(L,T) + N_T\, k \sum_{i:\, h_i > 0} h_i \qquad (8.3.47)$$

As indicated, the sum is only over positive values.

Two cautionary remarks about the application of Lyapunov exponents to complex physical systems are necessary. Unlike many standard models of chaos, a complex system does not have the same number of degrees of freedom at every scale. The number of independent bits of information describing the system above a particular scale is given by the complexity profile, C(L). Thus, the flow of information between scales should be thought of as due to a number of closed loops that extend from a particular lowest scale up to a particular highest scale.
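As a rough numerical illustration of the statement that h/ln(2) bits are needed per time step, the following sketch estimates the exponent of a one-parameter map. The logistic map x → 4x(1 − x), the starting point, and all function names are assumptions chosen for illustration and do not come from the text. The estimate averages ln|f′(x)| along a single trajectory, which is the small-separation limit of the ratio in Eq. (8.3.44).

```python
import math


def logistic(x, r=4.0):
    """One step of the logistic map (an assumed example system)."""
    return r * x * (1.0 - x)


def lyapunov_exponent(x0=0.2, r=4.0, n_steps=50_000, burn_in=1_000):
    """Average of ln|f'(x)| along a trajectory, in natural-log units."""
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = logistic(x, r)
    total = 0.0
    for _ in range(n_steps):
        slope = abs(r * (1.0 - 2.0 * x))      # |f'(x)| for the logistic map
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = logistic(x, r)
    return total / n_steps


h = lyapunov_exponent()
bits_per_step = h / math.log(2)   # extra information an observer needs per step
print(f"h = {h:.3f}, bits per step = {bits_per_step:.3f}")
```

For this particular map the exponent is known to be ln 2, so the estimate should come out close to one bit per step: each update reveals roughly one more binary digit of the initial condition to the observer.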
As the scale increases, the complexity decreases, so does the maximum number of Lyapunov exponents. This means that the sum over Lyapunov exponents is itself a function of scale. More generally, we must also be concerned that C(L) can be time dependent, as it is in many irreversible processes. The second remark is that over time the cycling of information between scales may bring the same information back more than once. Eq. (8.3.47) does not distinguish this, and therefore may include multiple counting of the same information. We should understand this expression as an upper bound on the complexity.

Time scale dependence

Once we have chaotic behavior, we can consider various descriptions of the time dependence of the behavior seen by a particular observer. All of the models we considered in Chapter 1 are applicable. The state of the system may be selected at random from a particular distribution (ensemble) of states at successive time intervals. This is a special case of the more general Markov chain model that is described by a set of transition probabilities. Long-range correlations that are not easily described by a Markov chain may also be important in the dynamics. In order to discuss the complexity profile as a function of T, we consider a Markov chain model. From the analysis in Question 8.3.4 we learn that the loss of complexity with time scale occurs as a result of cycles in the dynamics. These cycles need not be deterministic; they may be stochastic, that is, cycles that do not repeat indefinitely but rather can occur one or more times through the probabilistic selection of successive states. Thus, a high complexity for large T arises when there is a large space of states with low chance of repetition in the dynamics. The highest complexity would arise from a deterministic dynamics with cycles that are longer than T. This might seem to contradict our previous conclusion, where the deterministic dynamics was found to be simple. However, a complex deterministic dynamics can arise if the successive states are specified by information from a smaller scale.

Question 8.3.4 Consider the information in a Markov chain of $N_T$ states at intervals $T_0$ given by the transition matrix $P(s'|s)$. Assume the complexity of specifying the transition matrix (the complexity of the dynamics), $C_t = C(P(s'|s))$, is itself small. (See Question 8.3.5 for the case of a complex deterministic dynamics.)

a. Show that the more deterministic the chain is, the less information it contains.

b. Show that for an observer at a longer time scale consisting of two time steps ($T = 2T_0$) the information is reduced. Hint: Use convexity of information as described in Question 1.8.8, $f(\langle x \rangle) > \langle f(x) \rangle$, for the function $f(x) = -x\log(x)$.

c. Show that the complexity does not decrease for a system that does not allow 2-cycles.

Solution 8.3.4 When the complexity of the dynamics is small, then the complexity of the Markov chain is given by:

$$C = C(s) + C_t + N_T\, k\ln(2)\, I(s'|s) \qquad (8.3.48)$$
where the terms correspond to the information in the initial state of the system, the information in the dynamics and the incremental information per update needed to specify the next state. The relationship between this and Eq. (8.3.47) should be apparent. This expression does not hold if $C_t$ is large, because if it is larger than $N_T C(s)$, then the chain is more concisely described by specifying each of the states of the system (see Question 8.3.5).

The proof of (a) follows from realizing that the more deterministic the system is, the smaller is $I(s'|s)$. This may be used to define how deterministic the dynamics is.

To analyze the complexity of the Markov chain for an observer at time scale $2T_0$, we need to combine successive system states into an unordered pair, the ensemble of states seen by the observer. We use the notation $\{s', s\}$ for a pair of states. Thus, we are considering a new Markov chain of transitions between unordered pairs. To analyze this we need two probabilities: the probability of a pair and the transition probability from one pair to the next. The latter is the new transition matrix. The probability of a particular pair is:

$$P(\{s_1, s_2\}) = \begin{cases} P(s_1|s_2)P(s_2) + P(s_2|s_1)P(s_1) & s_2 \neq s_1 \\ P(s_1|s_1)P(s_1) & s_2 = s_1 \end{cases} \qquad (8.3.49)$$

where P(s) is the probability of a particular state of the system and the two terms in the upper line correspond to the probability of starting from $s_1$ to make the pair, and starting from $s_2$ to make the pair. The transition matrix for pairs is given by

$$P(\{s'_1, s'_2\}|\{s_1, s_2\}) = \Big[\big(P(s'_1|s'_2)P(s'_2|s_1) + P(s'_2|s'_1)P(s'_1|s_1)\big)P(s_1|s_2)P(s_2) + \big(P(s'_1|s'_2)P(s'_2|s_2) + P(s'_2|s'_1)P(s'_1|s_2)\big)P(s_2|s_1)P(s_1)\Big] \Big/ P(\{s_1, s_2\}) \qquad (8.3.50)$$

which is valid only for $s_1 \neq s_2$ and for $s'_1 \neq s'_2$. Other cases are treated like Eq. (8.3.49). Eq. (8.3.50) includes all four possible ways of generating the sequence of the two pairs. The normalization is needed because the transition matrix is the probability of $\{s'_1, s'_2\}$ occurring, assuming the pair $\{s_1, s_2\}$ has already occurred.

To show (b) we must prove that the process of combining the states into pairs reduces the information necessary to describe the chain. This is apparent since the observer loses the information about the state order within each pair. To show it from the equations, we note from Eq. (8.3.49) that the probability of a particular pair is larger than or equal to the probability of each of the two possible ordered pairs. Since the probabilities are larger, the information is smaller. Thus the information contained in the first pair is smaller for T = 2 than for T = 1. We must show the same result for each successive pair. The transition probability can be seen to be an average over two terms in the round parenthesis. By convexity, the information in the average is less than the average information of each term.
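The quantities used in this solution can be computed directly for small chains. The sketch below is an illustration under stated assumptions, not part of the text: the function names and the three two-state example chains are hypothetical, the matrix convention `P[a][b]` stands for P(a | b), and the state probabilities P(s) are taken to be the stationary distribution found by power iteration.

```python
import math


def stationary(P, n_iter=500):
    """Stationary distribution by power iteration; P[a][b] = P(a | b)."""
    n = len(P)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        p = [sum(P[a][b] * p[b] for b in range(n)) for a in range(n)]
    return p


def conditional_information(P):
    """I(s'|s) in bits: average information per update of the chain."""
    p = stationary(P)
    n = len(P)
    total = 0.0
    for b in range(n):
        for a in range(n):
            if P[a][b] > 0.0:
                total -= p[b] * P[a][b] * math.log2(P[a][b])
    return total


def pair_probability(P, s1, s2):
    """Probability of the unordered pair {s1, s2}, following Eq. (8.3.49)."""
    p = stationary(P)
    if s1 == s2:
        return P[s1][s1] * p[s1]
    return P[s1][s2] * p[s2] + P[s2][s1] * p[s1]


# Hypothetical two-state chains, from fully random to fully deterministic.
random_chain = [[0.5, 0.5], [0.5, 0.5]]
biased_chain = [[0.1, 0.9], [0.9, 0.1]]
cyclic_chain = [[0.0, 1.0], [1.0, 0.0]]

for name, chain in [("random", random_chain),
                    ("biased", biased_chain),
                    ("cyclic", cyclic_chain)]:
    print(name, round(conditional_information(chain), 3))
```

The printed values decrease as the chain becomes more deterministic, which is part (a). The cyclic chain is a pure 2-cycle: its unordered pair {0, 1} occurs with certainty, so an observer at $T = 2T_0$ sees no change at all, the extreme case of the loss described in part (b).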
Each of the terms is a sum over the probabilities of two possible orderings, and is therefore larger than or equal to the probability of either ordering. Thus, the information needed to specify any pair in the chain is smaller than the corresponding information in the chain of states.

Finally, to prove (c) we note that the less the order of states is lost when we combine states into pairs, the more complexity is retained. If transitions in the dynamics can only occur in one direction, then we can infer the order and information is not lost. From the equations we see that if only one of $P(s_1|s_2)$ and $P(s_2|s_1)$ can be nonzero, and similarly for $P(s'_1|s'_2)$ and $P(s'_2|s'_1)$, then only one term survives in Eq. (8.3.49) and Eq. (8.3.50) and no averaging is performed. Thus, for T = 2 the complexity is retained if the dynamics is not reversible, meaning there are no 2-cycles. For arbitrary T the complexity is the same as at T = 1 if the dynamics does not allow loops of size less than or equal to T.

Question 8.3.5 Calculate the maximum information that might in principle be necessary to specify completely a deterministic dynamics of a system whose complexity at any time is C(L). Contrast this with the maximum complexity of describing $N_T$ steps of this system.

Solution 8.3.5 The number of possible states of the system is $2^{C(L)/k\ln(2)}$. Each of these must be assigned a successor by the dynamics. The maximum possible information to specify the dynamics arises if there is no algorithm that can specify the successor, so that each successor must be identified out of all possible states. This would require $2^{C(L)/k\ln(2)}\, C(L)/k\ln(2)$ bits. The maximum complexity of $N_T$ steps is just $N_T C(L)$, as long as this is smaller than the previous result, which is generally a reasonable assumption.

A simple example of chaotic behavior that is relevant to complex systems is that of a mobile system (an animal or human being) where the motion is internally directed. A description of the system behavior, even at a length scale larger than the system itself, must describe this motion. However, the motion is determined by information contained on a smaller length scale just prior to its occurrence. This satisfies the formal requirements for chaotic behavior regardless of the specifics of the motion involved. Stated differently, the large-scale motion would be changed by modifications of the internal state of the system. This is consistent with the sensitivity of chaotic motion to smaller scale changes.

Another example of information transfer between different scales is related to adaptability, which requires that information about the external environment be represented in the organism. This generally involves the transfer of information between a larger scale and a smaller scale. Specifically, between observed phenomena and their representation in the synapses of the nervous system.

When we describe a system at a particular moment of time, the complexity of the system at its own scale or larger is zero, or a constant if we include the description of the equilibrium system.
However, when we consider the description of a system over time, then the complexity is larger due to the system motion. Increasing the scale of observation continues to result in a progressive decrease in complexity. At a scale that is larger than the system itself, it is the motion of the system as measured by its location at successive time intervals that is to be described. As the scale becomes larger, smaller scale motions are not observed, and a simpler description of motion is possible. The observer only notes changes in position that are larger than the scale of observation.

A natural question that can be asked in this context is whether the motion of the system is due to external influences or due to the system itself. For example, a particle moving in a fluid may be displaced by the motion of the fluid. This should be considered different from a mobile bacterium. Similarly, a basketball in a game moves through its trajectory not because of its own volition, but rather because of the volition of the players. How do we distinguish this from a system that moves due to its own actions? More generally, we must ask how we must deal with the environmental influences for a system that is not isolated. This question will be dealt with in Section 8.3.6 on behavioral complexity. Before we address this question, in the next section we discuss several aspects of the complexity profile, including the relationship of the complexity of the whole to the complexity of its parts.

8.3.5 Properties of complexity profiles of systems and components

General properties

We can readily understand some of the properties that we would expect to find in complexity profiles of systems that are difficult to calculate directly. Fig. 8.3.2 illustrates the complexity profile for a few systems. The paragraphs that follow describe some of their features. For any system, the complexity at the smallest values of L, T is the microscopic complexity: the amount of information necessary to describe a particular microstate. For an equilibrium state this is the same as the thermodynamic entropy, which is the entropy of a system observed on an arbitrarily long time scale. This is not true in general because short-range correlations decrease the microstate complexity, but do not affect the apparent macroscopic entropy. We have thus also defined the entropy profile S(L,T) as the amount of information necessary to determine an arbitrary microstate consistent with the observed macrostate. From our discussion of nonergodic systems in Section 8.3.1 we might also conclude that at any scale L, T the sum of the complexity C(L,T) and the entropy S(L,T) of the system (the fast degrees of freedom) should add up to the microscopic complexity or macroscopic entropy

$$C(0,0) \approx S(\infty,\infty) \approx C(L,T) + S(L,T) \qquad (8.3.51)$$

However, this is valid only under special circumstances: when the macroscopic state is selected at random from the ensemble of macrostates, and the microstate is selected at random from the possible microstates. A glass may satisfy this requirement; however, other complex systems need not.

For a typical system in equilibrium, as L, T is increased the system rapidly becomes homogeneous in space and time.
[Figure 8.3.2: two schematic panels with curves labeled (1) to (4). Top panel: C(0,T) as a function of T. Bottom panel: C(L,0) as a function of L.]
Figure 8.3.2 Schematic plots of the complexity profile C(L,T) of four different systems. C(L,T) is the amount of information necessary to describe the system ensemble as a function of the length scale, L, and time scale, T, of observation. Top panel shows the time scale dependence, bottom panel shows the length scale dependence. (1) An equilibrium system has a complexity profile that is sharply peaked at T = 0 and L = 0. Once the length or time scale is beyond the correlation length or correlation time respectively, the complexity is just the macroscopic complexity associated with thermodynamic quantities (U, N, V), which vanishes on any reasonable scale. (2) For a glass the complexity profile as a function of time scale C(0,T) decays rapidly at first due to averaging over atomic vibrations; it then reaches a plateau that represents the frozen degrees of freedom. At much longer time scales the complexity profile decays to its thermodynamic limit. Unlike C(0,T), C(L,0) of a glass decays like a thermodynamic system because it is homogeneous in space. (3) A magnet at a second-order phase transition has a complexity profile that follows power-law behavior in both length and time scale. Stochastic fractals capture this kind of behavior. (4) A complex biological organism has a complexity profile that should follow similar behavior to that of a fractal. However it has plateau-like regions that correspond to crossing the scale of internal components, such as molecules and cells.

Specifically, the density of the system is uniform in space and time, aside from unobservable small fluctuations, once the scale of observation is larger than either the correlation length or the correlation time of the system. Indeed, this might be taken to be the definition of the correlation length and time: the scale at which the microscopic information becomes irrelevant to the properties of the system. Beyond the correlation length, the average behavior characteristic of the macroscopic scale is all that remains, and the complexity profile is constant at all length and time scales less than the size of the system.

We can contrast the complexity profile of a thermodynamic system with what we expect from various complex systems. For a glass, the complexity profile is quite different in time and in space. A typical glass is uniform if L is larger than a microscopic correlation length. Thus, the complexity profile of the glass is similar to an equilibrium system as a function of L. However, it is different as a function of T. The frozen degrees of freedom that make it a nonergodic system at typical time scales of observation guarantee this. At typical values of T the temporal ensemble of the system includes the states that are reached by vibrational modes of the system, but not the atomic rearrangements characteristic of fluid motion. Thus, the atomic vibrations cannot be observed except at microscopic values of T. However, a significant part of the microscopic description remains necessary at longer time scales. Correspondingly, a plateau in the complexity profile extends up to characteristic time scales of human observation. At a temperature-dependent and much longer time scale, the complexity profile declines to its thermodynamic limit. This time scale, the relaxation time, is accessible near the glass transition temperature. For lower temperatures it is not. Because the glass is uniform in space, the plateau should be relatively flat and end abruptly. This is because spatial uniformity indicates that the relaxation time is essentially a local property with a narrow distribution. A more extended spatial coupling would give rise to a grading of the plateau and a broadening of the time scale at which the plateau disappears.
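The first two stages of the glass profile, a rapid decay as vibrations average out followed by a plateau set by the frozen degrees of freedom, can be mimicked with a toy model. Everything in the sketch below is an assumption made for illustration rather than something taken from the text: each site carries one frozen binary value plus a fast random vibration of fixed amplitude, the observer averages over a window of T steps and resolves values only to a fixed bin width, and the entropy of the binned observations stands in for the information the observer must record per site.

```python
import math
import random
from collections import Counter


def observed_entropy(frozen, vibrations, window, bin_width=0.2, amplitude=0.3):
    """Entropy in bits of the binned, time-averaged value seen at scale `window`."""
    counts = Counter()
    n_steps = len(vibrations[0])
    for s, v in zip(frozen, vibrations):
        for start in range(0, n_steps - window + 1, window):
            mean_v = sum(v[start:start + window]) / window   # time average
            counts[round((s + amplitude * mean_v) / bin_width)] += 1
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


rng = random.Random(0)                      # fixed seed for repeatability
n_sites, n_steps = 500, 400
frozen = [rng.choice((0, 1)) for _ in range(n_sites)]          # frozen bits
vibrations = [[rng.choice((-1, 1)) for _ in range(n_steps)]    # fast motion
              for _ in range(n_sites)]

fast = observed_entropy(frozen, vibrations, window=1)
slow = observed_entropy(frozen, vibrations, window=400)
print(f"T = 1: {fast:.2f} bits per site, T = 400: {slow:.2f} bits per site")
```

At the shortest window the observer resolves both the frozen bit and the vibration, about two bits per site. At the long window the vibration has averaged below the bin width and about one bit per site remains. The toy model has no relaxation, so it does not reproduce the final decay to the thermodynamic limit.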
More generally, for a complex system we expect that many parameters will be required to describe its properties at all length and time scales, at least up to some fraction of the spatial and temporal scale of the system itself. Starting from the microscopic complexity, the complexity profile should not be expected to fall smoothly. In biological organisms, we can expect that as we increase the scale of observation, there will be particular length scales at which details will be lost. Plateaus in the profile are related to the existence of well-defined levels of description. For example, an identifiable level of cellular behavior would correspond to a plateau, because over a range of length scales larger than the cell, a full accounting of cellular properties, but not of the internal behavior of the cell, must be given. There are many cells that have a characteristic size and are immobile. However, because different cell populations have different sizes and some cells are mobile, the sharpness of the transition should be smoothed. We can at least qualitatively identify several different plateaus. At the shortest time scale the atomic vibrations will be averaged out to end the first plateau. Larger atomic motions or molecular behavior will be averaged out on a second, larger scale. The internal cellular behavior will then be averaged out. Finally, the internal behavior of tissues and organs will be averaged out on a still longer length and time scale. It is the degrees of freedom that remain relevant on the longest length scale that are key to the complexity of the system. These degrees of freedom manifest the concept of emergent collective behavior. Ultimately, they must be traceable back to the microscopic degrees of freedom. Describing the connection between the microscopic parameters and macroscopically relevant parameters has occupied our attention in much of this book.

Mathematical models that best capture the complexity profile of a complex system are fractals (see Section 1.10). Mathematical fractals with no granularity (no smallest length scale) have infinite complexity. However, if we define a smallest length scale, corresponding to the atomic length scale of a physical system, and we define a longest length scale that is the size of the system, then we can plot the spatial complexity profile of a fractal-like system.

There are two quite distinct kinds of mathematical fractals, deterministic and stochastic fractals. The deterministic fractals are specified by an algorithm with only a few parameters, and thus their algorithmic complexity is small. Examples are the Cantor set or the Sierpinski gasket. The algorithm describes how to create finer and finer scale detail. The only difficulty in specifying the fractal is specifying the number of levels to which the algorithm should be iterated. This information (the number of iterations) requires a parameter whose length grows logarithmically with the ratio of the size of the system to the smallest length scale. Thus, a deterministic fractal has a complexity profile that decreases logarithmically with observation length scale L, but is very small on all length scales.

Stochastic fractals are qualitatively different. In such fractals, there are random choices made at every scale of the structure. Stochastic fractals can be based upon the Cantor set or Sierpinski gasket, by including random choices in the algorithm. They may also be systems representing the spatial structure of various stochastic processes. Such a system requires information to describe its structure on every length scale. A stochastic fractal is a member of an ensemble, and its algorithmic as well as ensemble complexity will scale as a power law of the scale of observation L.
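The contrast between the two kinds of fractal can be made concrete with a simple count of description lengths. The sketch below rests on assumptions that are not in the text: the deterministic case is the standard middle-third Cantor construction, its cost is counted as the bits needed to write the ratio of largest to smallest length scale (the fixed cost of stating the rule is ignored), and the stochastic variant is a hypothetical one in which every interval independently drops one of its three thirds at random.

```python
import math


def cantor_intervals(n_levels):
    """Intervals of the deterministic Cantor set after n_levels refinements."""
    intervals = [(0.0, 1.0)]
    for _ in range(n_levels):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3.0
            refined.append((a, a + third))      # keep the left third
            refined.append((b - third, b))      # keep the right third
        intervals = refined
    return intervals


def deterministic_bits(n_levels):
    """Bits to write the size ratio 3**n_levels, i.e. where iteration stops."""
    return (3 ** n_levels).bit_length()


def stochastic_bits(n_levels):
    """Bits to record the dropped third for every interval at every level."""
    n_choices = sum(2 ** level for level in range(n_levels))  # 1 + 2 + 4 + ...
    return n_choices * math.log2(3)


for n in (2, 5, 10):
    print(n, len(cantor_intervals(n)), deterministic_bits(n),
          round(stochastic_bits(n)))
```

Under these assumptions the deterministic count grows in proportion to the number of levels, that is, logarithmically in the size ratio, while the stochastic count grows roughly as 2 to the number of levels, which is a power of the size ratio.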
As L increases, the amount of information is reduced, but there is no length scale smaller than the size of the system at which it is completely lost. Time series that have fractal behavior, that is, power-law correlations, would also display a power-law dependence of their complexity profile as a function of T. The simplest physical model that demonstrates such fractal properties in space and time is an Ising model at its second-order transition point. At this transition there are fluctuations on all spatial and temporal scales that have power-law behavior in both. Observers with larger values of L can see the behavior of the correlations only on the longer length scales. A renormalization treatment, discussed in Section 1.10, can give the value of the complexity profile. These examples illustrate how microscopic information may become irrelevant on larger length scales, while leaving collective information that remains relevant at the longer scales.

The complexity profile enables us to consider again the definition of a complex system. As we stated, it seems intuitive that a complex system is complex on many scales. This strengthens the identification of the fractal model of space and time as a central model for the understanding of complex systems. We have also gained an understanding of the difference between deterministic and stochastic fractal systems. We see that the glass is complex in its temporal behavior, but not in its spatial behavior, and therefore is only a partial example of a complex system.

If we want to identify a unique complexity of a system, there is a natural space and time scale at which to define it. For the spatial scale, $L_s$, we consider a significant fraction of the system: one-tenth of its size. For the temporal scale, $T_s$, we consider the relaxation (autocorrelation) time of the behavior on this same length scale. This is essentially the maximal complexity for this length scale, which would be the same as setting T = 0. However, we could also take a natural time scale of $T_s = L_s/v_s$ where $v_s$ is a characteristic velocity of the system. This form makes the increase in time scale for larger length scales (systems) apparent. Leaving out the time scale, since it is dependent on the space scale, we can write the complexity of a system s as

$$C_s = C_s(L_s) = C_s(L_s, L_s/v_s) \approx C_s(L_s, 0) \qquad (8.3.52)$$

In Section 1.10 we discussed generally the scaling of quantities as a function of the precision to which we describe the system. One of the central questions in the field of complex systems is understanding how complexity scales. This scaling is concretized by the complexity profile. One of the objectives is to understand the ultimate limits to complexity. Given a particular length or time scale, we ask what is the maximum possible complexity at that scale. One could say that this complexity is limited by the thermodynamic entropy; however, there are further limitations. These limitations are established by the nature of physical law that establishes the dynamics and interactions of the components. Thus it is unlikely that atoms can be attached to each other in such a way that the behavior of each atom is relevant to the spatiotemporal behavior of an organism at the length and time scale relevant to a human being. The details of behavior must be lost as we observe on longer length and time scales; this results in a loss of complexity. The complexity scaling of complex organisms should follow a line like that given in Fig. 8.3.2.
The highest complexity of an organism results from the retention of the greatest significance of details. This is in contrast to thermodynamic systems, where all of the degrees of freedom average out on a very short length and time scale. At this time we do not know what limits can be placed on the rate of decrease of complexity with scale.

Components and systems

As we discussed in Chapter 2, a complex system is formed out of a hierarchy of interdependent subsystems. Thus, relevant to various questions about the complexity profile is an understanding of the complexity that may arise when we bring together complex systems to form a larger complex system. In general it is not clear that bringing together many complex systems must give rise to a collective complex system. This was discussed in Chapter 6, where one example was a flock of animals. Here we can provide additional meaning to this statement using the complexity profile. We will discuss the relationship of the complexity of components to the complexity of the system they are part of. To be definite, we can consider a flock of sheep. The example is chosen to expand our view toward more general application of these ideas. The general statements we make apply to any system formed out of subsystems.

Let us assume that we know the complexity of a sheep, $C_{sheep}(L_{sheep})$, the amount of information necessary to describe the relevant behaviors of eating, walking, reproducing, flocking, etc., at a length scale of about one-tenth the size of the sheep. For our current purposes this might be a lot of information contained in a large number of books, or a little information contained in a single paragraph of text. Later, in Section 8.4, we will obtain an estimate of the complexity as, of order, one book or $10^7$ bits.

We now consider a flock of N sheep and construct a description of this flock. We begin by taking information that describes each of the sheep. Combining these descriptions, we have a description of the flock. This information is, however, highly redundant. Much of the information that describes one sheep can also be used to describe other sheep. Of course there are differences in size and in behavior. However, having described one sheep in detail we can describe the differences, or we can describe general characteristics of sheep and then specialize them for each of the individual sheep. Using this strategy, a description of the flock will be shorter than the sum of the lengths of the descriptions of each of the sheep. Still, this is not what we really want. The description of the flock behavior has to be on its own length scale $L_{flock}$, which is much larger than $L_{sheep}$. So we shift our observation of behavior to this longer length scale and find that most of the details of the individual sheep behavior have become irrelevant to the description of the flock. We describe the flock behavior in terms of sheep density, grazing activity, migration, reproductive rates, etc. Thus we write that:

$$C_{flock} = C_{flock}(L_{flock}) \ll C_{flock}(L_{sheep}) \ll N C_{sheep}(L_{sheep}) = N C_{sheep} \qquad (8.3.53)$$

where N is the number of sheep in the flock. Among other conclusions, we see that the complexity of a flock may actually be smaller than the complexity of one sheep.
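The redundancy argument, that a joint description of similar sheep is much shorter than the sum of separate descriptions, can be illustrated with an off-the-shelf compressor. This sketch is only a rough stand-in: the text template describing a sheep is invented, and the length of a zlib-compressed string is used as a crude upper bound on description length. It addresses only the redundancy between sheep at the sheep's own scale, not the further reduction that comes from observing at the flock scale.

```python
import zlib


def description_bits(text):
    """Bits in a zlib-compressed description (a crude upper bound)."""
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))


# Hypothetical per-sheep descriptions that differ only in small details.
template = "sheep {i}: eats grass, walks, flocks, reproduces; size {size} cm. "
flock = [template.format(i=i, size=100 + (i % 7)) for i in range(200)]

separate = sum(description_bits(sheep) for sheep in flock)   # each on its own
combined = description_bits("".join(flock))                  # described jointly

print(f"separate: {separate} bits, combined: {combined} bits")
```

The joint description is far shorter because the compressor states the shared behaviors once and then records only the differences, which is the strategy described in the paragraph above.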
We first describe this qualitatively by considering the two inequalities in Eq. (8.3.53). The first inequality arises because we change the scale of observation and so lose the behavior of an individual sheep. The second inequality arises because different sheep have the same behavior. In this case their behavior is coherent. There is a tradeoff between these two inequalities. If the behaviors of the sheep are independent, then their behavior cannot be observed on the longer scale. Specifically, the movement of one sheep to the right is canceled by another sheep that starts at its right and moves to the left. On the other hand, if their behaviors are correlated, then the complexity of describing all of them is much smaller than the sum of the separate complexities. However, only correlated motions of many sheep can be observed on a longer scale. Thus, having a large collective complexity requires a balance between dependence and independence of the behavior of the components.

We can discuss this more quantitatively by considering the example of the nonuniform ideal gas. The loss of information for uncorrelated quantities due to combining them together is described by Eq. (8.3.37). To construct a model where the quantities are correlated, we consider placing the same densities in a region of scale $L_1 > L_0$. This is the same model as the previous one, but now on a length scale of $L_1$. The new value of $\sigma$ is $\sigma_1 = \sigma (L_1/L_0)^d$. This increase of the standard deviation causes an increase in the value of the complexity for all scales greater than $L_1$. However, for $L < L_1$ the complexity is just the complexity at $L_1$, since there is no structure below this scale. A comparative plot is given in Fig. 8.3.3.

[Figure 8.3.3 Plot of the complexity $C(L)$ of a nonuniform gas (Eq. (8.3.37)) against $L^d$, for two cases. The first (1) has a correlation in its nonuniformity at a scale $L_0$ and the second (2) at a scale $L_1 > L_0$. The magnitude of the local deviations in the density is the same in the two cases. The second case has a lower complexity at smaller scales but a higher complexity at the larger scales. Because the complexity decreases rapidly with scale, to show the effects on a linear scale $L_1$ was taken to be only $\sqrt[3]{10}\,L_0$, and the horizontal axis is $L^3$ measured in units of $L_0^3$. Eq. (8.3.39) would give similar results but the complexity would decay still more rapidly.]

We can come closer to considering the behavior of a collection of animals by considering a model for their motion. We start with a scale $L_0$ just larger than the animal, so that we do not describe its internal structure; we describe only its location at successive intervals of time. The characteristic time over which a sheep moves a distance $L_0$ is $T_0$. We will use a model for sheep motion that can illustrate the effect of coherence of many sheep, as well as the effect of coherent motion of an individual sheep over time. To do this we assume that an individual sheep moves in a straight line for a distance $qL_0$ in a time $qT_0$ before choosing a new direction to move in at random. For simplicity we can assume that the direction chosen is one of the four compass directions, though this is not necessary for the analysis. We will use this model to calculate the complexity profile of an individual sheep. Our treatment only describes the leading behavior of the complexity profile and not various corrections.

For $L = L_0$ and $T = T_0$, the complexity of describing the motion is exactly 2 bits for every $q$ steps to determine which of the four possible directions the sheep will move next. Because the movement is in a straight line, and the changes in direction are at well-defined intervals, we can reconstruct the motion from the measurements of any observer with $L < qL_0$ and $T < qT_0$. Thus the complexity is:

$$C(L,T) = \frac{2N_T}{q}, \qquad L < qL_0,\ T < qT_0 \tag{8.3.54}$$

Once the scale of observation is greater than $qL_0$, the observer does not see every change in direction. The sheep is moving in a random walk where each step has a length $qL_0$ and takes a time $qT_0$, but the observer does not see each step. The distance traveled is proportional to the square root of the time, and so the sheep moves a distance $L$ once in every $(L/\sigma_0)^2$ steps, where $\sigma_0 = qL_0$ is the standard deviation of the random walk in each dimension. Every time the sheep travels a distance $L$ we need 2 bits to describe its motion, and thus we have a complexity:

$$C(L,T) = \frac{2N_T}{q}\,\frac{\sigma_0^2}{L^2} = 2N_T\,q\,\frac{L_0^2}{L^2}, \qquad L > qL_0,\ T < qT_0 \tag{8.3.55}$$

We note that at $L = qL_0$ Eq. (8.3.54) and Eq. (8.3.55) are equal.

To obtain the complexity profile for long time scales $T > qT_0$, but short length scales $L < qL_0$, we use a simplified "blob" picture to combine the successive positions of the sheep into an ensemble of positions. For $T$ only a few times $qT_0$ we can expect that the ensemble would enable us to reconstruct the motion, so the complexity is the same as Eq. (8.3.54). However, eventually the ensemble of positions will overlap and form a blob. At this point the movement of the sheep will be described by the movement of the blob, which itself undergoes a random walk. The standard deviation of this random walk is proportional to the square root of the number of steps:
$$\sigma = \sigma_0 \sqrt{T/qT_0}.$$

Since this is larger than $L$, the amount of information is essentially that of selecting a value from a Gaussian distribution of this standard deviation:

$$C(L,T) = \frac{2N_T}{q}\,\min\!\left(1,\ \frac{qT_0}{T}\left(1+\log\frac{\sigma}{L}\right)\right) = \frac{2N_T}{q}\,\min\!\left(1,\ \frac{qT_0}{T}\left(1+\log\!\left(\frac{L_0}{L}\sqrt{\frac{qT}{T_0}}\right)\right)\right), \qquad L < \sigma,\ T > qT_0 \tag{8.3.56}$$

There are a few points to be made about this expression. First, we use the minimum of two values to select the crossover point between the behavior in Eq. (8.3.54) and the blob behavior. As we mentioned above, the blob behavior only occurs for $T$ significantly greater than $qT_0$. The simplest way to identify the crossover point is when the new estimate of the complexity becomes lower than our previous value. The second point is that we have chosen to adjust the constant term added to the logarithm so that when $L = \sigma$ the complexity matches that given by Eq. (8.3.55), which describes the behavior when $L$ becomes large. Thus the limit on Eq. (8.3.55) should be generalized to $L > \sigma$. This minor adjustment enables the complexity to be continuous despite our rough approximations, and does not change any of the conclusions.

We can see from our results (Fig. 8.3.4) how varying $q$ affects the complexity. Increasing $q$ decreases the complexity at the scale of a sheep, $C(L,T) \propto 1/q$ in Eq. (8.3.54). However, it increases the complexity at longer scales, $C(L,T) \propto q$ in Eq. (8.3.55). This is a straightforward consequence of increasing the coherence of the motion over time. We also see that the complexity at long times decays inversely proportional to the time but is relatively insensitive to $q$. The value of $q$ primarily affects the crossover point to the long time behavior.

We now use two different assumptions to calculate the complexity of the flock. If the movement of all of the sheep is coherent, then the complexity of the flock for length scales greater than the size of the flock is the same as the complexity of a sheep for the same length scales. This is apparent because describing the movement of a single sheep is the same as describing the entire flock. We now see the significance of increasing $q$. Increasing $q$ increases the flock complexity until $qL_0$ reaches $L_1$, where $L_1$ is the size of the flock. Thus we can increase the complexity of the whole at the cost of reducing the complexity of the components.

If the movements of the sheep are independent of each other, then the flock displacements (the displacements of its center of mass) are of characteristic size $\sigma/\sqrt{N}$ (see Eq. 5.2.21). We might be concerned that the flock will disperse. However, as in our discussions of polymers in Section 5.2, interactions that would keep the sheep together need not affect the motion of their center of mass. We could also introduce into our model a circular reflecting boundary (a moving pen) around the flock, with its center at the center of mass. Since the motion of the sheep with this boundary does not require additional information over that without it, the complexity is the same. In either case, the complexity of flock motion ($L > L_1$) is obtained as:

$$C(L,T) = 2N_T\,q\,\frac{L_0^2}{N L^2}, \qquad L > \sigma \tag{8.3.57}$$
This is valid for all $L$ if $\sigma$ is less than $L_1$. If we choose $T$ to be very large, Eq. (8.3.56) applies, with $\sigma$ replaced by $\sigma/\sqrt{N}$. We see that when the motions of the sheep are independent, the flock complexity is much lower than before: it decreases inversely with the number of sheep when $L > \sigma$. Even in this case, however, increasing $q$ increases the flock complexity. Thus coherence in the behavior of a single sheep in time, or coherence between different sheep, increases the complexity of the flock. However, the maximum complexity of the flock is just that of an individual sheep, and this arises only for coherent behavior when all movements are visible on the scale of the flock. Any movements of an individual sheep that are smaller than the scale of the flock disappear on the scale of the flock. Thus even for coherent motion, in general the flock complexity is smaller than the complexity of a sheep.

[Figure 8.3.4 The complexity profile $C(L)$ is plotted against $L$ for a model of the movement of sheep as part of a flock, with curves for $q = 50$ and $q = 100$. Increasing the distance a sheep moves in a straight line (coherence of motion in time), $q$, decreases the complexity at small length scales and increases the complexity at large length scales. Solid lines and dashed lines show the complexity profile as a function of length scale for a time scale $T = 1$ and $T = 500$ respectively.]

This example illustrates the effect of coherent behavior. However, we see that even with coherent motion the complexity of a flock at its scale cannot be larger than the complexity of the sheep at its own scale. This is a problem for us, because our study of complex systems is focused upon systems whose complexity is larger than their components. Without this possibility, there would be no complex systems. To obtain a higher complexity of the whole we must modify this model. We must assume
more generally that the motion of a sheep is describable using a set of patterns of behavior. Coherent motion of sheep still leads to a similar (or lower) complexity. To increase the complexity, the motion of the flock must have more complex patterns of motion. In order to achieve such patterns, the motions of the individual sheep must be neither independent nor coherent; they must be correlated motions that combine patterns of sheep motion into the more complex patterns of flock motion. This is possible only if there are interactions between them, which have not been included here. It should now be clear that the objective of learning how the complexity of a system is related to the complexity of its components is central to our study of complex systems.

Question 8.3.6 Throughout much of this book our working definition of complex systems or complex organisms as articulated in Section 1.3 and developed further in Chapter 2 was that a complex system has a behavior that is dependent on all of its parts. In particular, that it is impossible to take part of a complex organism away without affecting the behavior of the whole and behavior of the part. How is this definition related to the definition of complexity articulated in this section?

Solution 8.3.6 Our quantitative concept of complexity is a measure of the information necessary to describe the system behavior on its own length scale. If the system behavior is complex, then it must require many parameters to describe. These parameters are related to the description of the system on a smaller length scale, where the parts of the system are manifest because we can distinguish the description of one part from another. To do this we limit $P_{L,T}(n(x,t))$ to the domain of the part. The behavior of a system is thus related to the behavior of the parts. The more these are relevant to the system behavior, the greater is the system complexity. The information that describes the system behavior must be relevant on every smaller length scale. Thus, we have a direct relationship between the definition of a complex system in terms of parts and the definition in terms of information. Ultimately, the information necessary to describe the system behavior is determined by the microscopic description of atomic positions and motions. The more complex a system is, the more its behavior depends on smaller scale components.

Question 8.3.7 When we defined interdependence we did not consider the dependence of an animal on air as a relevant example. Explain.

Solution 8.3.7 We can now recognize that the use of information as a characterization of behavior enables us to distinguish various forms of dependency. In particular, we see that the dependence of an animal on air is simple, since the necessary properties of air are simple to describe. Thus, the degree of interdependence of two systems should be measured as the amount of information necessary to replace one in the description of the other.
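The piecewise complexity profile of the random-walk sheep model, Eqs. (8.3.54) through (8.3.57), can be collected into one function. The following is a minimal sketch, not a definitive implementation. It assumes that $N_T$ counts elementary time steps of length $T_0$, that the logarithm is base 2 (bits), and that an independent flock is handled by replacing $\sigma$ with $\sigma/\sqrt{N}$. The function and parameter names are chosen here for illustration.

```python
import math


def sheep_complexity(L, T, q, n_steps=1.0, L0=1.0, T0=1.0, n_independent=1):
    """Leading-order complexity (bits) of the random-walk sheep model.

    n_steps plays the role of N_T. With n_independent = N > 1 the result is
    the center-of-mass motion of N independent sheep (sigma -> sigma/sqrt(N));
    it is then only meaningful for L larger than the size of the flock.
    """
    sigma0 = q * L0 / math.sqrt(n_independent)           # spread per straight segment
    large_scale = 2.0 * n_steps / q * (sigma0 / L) ** 2  # Eqs. (8.3.55) and (8.3.57)
    if T <= q * T0:
        # Short times: below sigma0 every change of direction is resolved.
        return 2.0 * n_steps / q if L <= sigma0 else large_scale  # Eq. (8.3.54)
    sigma = sigma0 * math.sqrt(T / (q * T0))             # blob size after time T
    if L > sigma:
        return large_scale
    blob = (q * T0 / T) * (1.0 + math.log2(sigma / L))
    return 2.0 * n_steps / q * min(1.0, blob)            # Eq. (8.3.56)


for q in (50, 100):
    for T in (1, 500):
        row = [sheep_complexity(L, T, q) for L in (1, 50, 100, 200)]
        print(f"q={q:3d} T={T:3d}", ["%.4f" % c for c in row])
```

Comparing the rows for q = 50 and q = 100 at T = 1 shows the tradeoff discussed above: the larger q lowers the complexity at small L and raises it at large L.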
8.3.6 Behavioral complexity

Our ability to describe a system arises from measurements or observations of its behavior. The use of system descriptions to define system complexity does not directly take this into account. The complexity profile brought us closer by acknowledging the observer in the space and time scale of the description. By acknowledging the scale of observation, we obtained a mechanism for distinguishing complex systems from equilibrium systems, and a systematic method for characterizing the complexity of a system. There is another approach to reaching the complexity profile that incorporates the observer and system relationship in a more satisfactory manner. It also enables us to consider directly the interaction of the system with its environment, which was not included previously. To introduce the new approach, we return to the underpinning of descriptive complexity and present the concept of behavioral complexity.

In Shannon's approach to the study of information in communication systems, there were two quantities of fundamental interest. The first was the information content of an individual message, and the second was the average information provided by a particular source. The discussion of algorithmic complexity was based on a consideration of the information provided by a particular message, specifically how much it could be compressed. This carried over into our discussion of physical systems when we introduced the microscopic complexity of a system as the information contained in a particular microscopic realization of the system. When all messages, or all system states, have the same probability, then the information in the particular message is the same as the average information, and we can write:

$$I(\{x,p\}\,|\,(U,N,V)) = -\log P(\{x,p\}) = -\sum_{\{x,p\}} P(\{x,p\}) \log P(\{x,p\}) \tag{8.3.58}$$

The expression on the right, however, has a different purpose. It is a quantity that characterizes the ensemble rather than the individual microstate. It is a characterization of the source rather than of any particular message.

We can pursue this line of reasoning by considering more carefully how we might characterize the source of the information, rather than the messages. One way to characterize the source is to determine the average amount of information in a message. However, if we want to describe the source to someone, the most essential information is to give a description of the kinds of messages that will be received: the ensemble of possible messages. Thus to characterize the source we need a description of the probability of each kind of message. How much information do we need to describe these probabilities? We call this the behavioral complexity of the source.

A few examples in the context of a source of messages will serve to illustrate this concept. Any description of a source must assume a language that is to be used. We assume that the language consists of a list of characters or messages that can be received from the source, along with their probabilities. A delimiter (:) is used to separate the messages from their probability. For convenience, we will write probabilities in decimal notation. A second delimiter (,) is used to separate different members of the list. A source that gives zeros and ones at random with equal probability would be described by {1:0.5,0:0.5}. It is convenient to include the length of a message in our
description of the source. Thus we might describe a source with length $N = 1000$ character messages, each character zero and one with equal probability, as: {1000(1:0.5,0:0.5)}. The message complexity of this source would be given by $N$, the length of a message. However, the behavioral complexity is given by (in this language): two decimal digits, two characters (1, 0), the number representing $N$ (requiring $\log(N)$ characters) and several delimiters. We could also specify an ASCII language source by a table of this kind that would consist of 256 elements and the probabilities of their occurrence in some database. We see that the behavioral complexity is quite distinct from the complexity of the messages provided by a source. In particular, in the above example it can be larger, if $N = 1$, or it can be much smaller, if $N$ is large.

This definition of the behavioral complexity of a source runs into a minor problem, because the probabilities are real numbers and would generally require arbitrary numbers of digits to describe. To overcome this problem, there must be a convention assumed about the limit of precision that is desired in describing the source. In principle, this precision is related to the number of messages that might be received. This convention could be part of the language, or could be defined by the specification itself. The description of the source can also be compressed using the principles of algorithmic complexity.

As we found above, the behavioral complexity can be much smaller than the information complexity of a particular message. If the source provides many random digits, the complexity of the message is high but the complexity of the source is low because we can characterize it simply as a source of random numbers. However, if the probability of each message must be independently specified, the behavioral complexity of a source is much larger than the information content of a particular message. If a particular message requires $N$ bits of information, then the number of possible messages is $2^N$. Listing all of the possible messages requires $N 2^N$ bits, and specifying each probability with $Q$ bits would give us a total of $(N + Q)2^N$ bits to describe the source. This could be reduced if the messages are placed in an agreed-upon order; then the number of bits is $Q2^N$. This is still exponentially larger than the information in a particular message. Thus, the complexity of an arbitrary source of messages of a particular length is much larger than the complexity of the messages it sends.

We are interested in the behavioral complexity when our objective is to use the messages that we receive to understand the source, rather than to make use of the information itself. Behavioral complexity becomes particularly useful when it is smaller than the complexity of a message, because it enables us to anticipate or predict the behavior of the source.

We now apply these thoughts about the source as the system of interest, rather than the message as the system of interest, to a discussion of the properties of physical systems. To make the connection between source and system, we consider an observer of a physical system who performs a number of measurements. We might imagine the measurements to consist of subjecting the system to light at various frequencies and measuring their scattering and reflection (looking at the system), observations of animals in the wild or in captivity, or physical probes of the system. We consider each measurement to be a message from the system to the observer.
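The counting for source descriptions given above can be made concrete with a few lines of arithmetic. This is a sketch only: the helper names are invented here, the compact description follows the toy {N(1:0.5,0:0.5)} language of this section, and its length is counted in characters rather than converted to bits.

```python
def explicit_table_bits(n_bits: int, q_bits: int, ordered: bool = False) -> int:
    """Bits needed to list every possible N-bit message with a Q-bit probability.

    With an agreed-upon message order only the probabilities are stored.
    """
    per_entry = q_bits if ordered else n_bits + q_bits
    return per_entry * 2 ** n_bits


def compact_description(n_chars: int) -> str:
    """Description of the equal-probability binary source in the toy language."""
    return "{" + f"{n_chars}(1:0.5,0:0.5)" + "}"


for n in (1, 10, 1000):
    desc = compact_description(n)
    print(f"N={n:4d}  message: {n} bits   source description: {desc!r} "
          f"({len(desc)} characters)")

# An arbitrary source over 10-bit messages, with probabilities given to Q = 8 bits:
print("explicit list:     ", explicit_table_bits(10, 8), "bits")
print("agreed-upon order: ", explicit_table_bits(10, 8, ordered=True), "bits")
```

For the random source the description grows only with the number of digits of N, while for an arbitrary source the table grows as 2^N, which is the contrast drawn in the text.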
We must, however, take note that any measurement consists of two parts, the conditions or environment in which the observation was performed and the behavior of the system under these conditions. We write any observation as a pair $(e,a)$, where $e$ represents the environment and $a$ represents a measurement of system properties (action) under the circumstances of the environment $e$. The observer, after performing a number of measurements, writes a description of the observations. This description characterizes the system. It captures the properties of the list of measurements, rather than of one particular measurement. It may or may not explicitly contain the information of each measurement. Alternatively, it may assign probabilities to a particular measurement. We would like to define the behavioral complexity as the amount of information contained in the observer's description. However, we must be careful how we do this because of the presence of the environmental description $e$.

In order to clarify this point, and to make contact between behavioral complexity and our previous discussion of descriptive complexity, we first consider the physical system of interest to be essentially isolated. Then the environmental description is irrelevant, and an observation consists only of the system measurement $a$. The list of measurements is the set $\{a\}$. In this case it is relatively easy to see that the behavioral complexity of a physical system is its descriptive complexity: the set of all measurements characterizes completely the state of the system.

If the entire set of measurements is performed at a single instant, and has arbitrary precision, then the behavioral complexity is the microstate complexity of the system. The result of any measurement can be obtained from a description of the microstate, and the set of possible measurements determines the microstate.

For a set of measurements performed over time on an equilibrium system, the behavioral complexity is the ensemble complexity, the number of parameters necessary to specify its ensemble. A particular message is a measurement of the system properties, which in principle might be detailed enough to determine the instantaneous positions and momenta of all of the particles. However, the list of measurements is determined by the ensemble of states the system might have. As in Section 8.3.1, we conclude that the complexity of an equilibrium system is the complexity of describing its ensemble: specifying $(U,N,V)$ and other parameters like magnetization that result from the breaking of ergodicity. For a glass, the ensemble information is the information in the frozen coordinates previously defined as the complexity. More generally, for a set of measurements performed over an interval of time $T$ (or at one instant but with time determination error $T$) and with spatial position determination errors given by $L$, we recover the complexity profile.

We now return to consider a system that is not isolated but subject to an environmental influence so that an observation consists of the pair $(e,a)$ (Fig. 8.3.5). The complexity of describing such messages also contains the complexity of the environment $e$. Does this mean that our system description must include its environment and that the complexity of the system is dependent on the complexity of the environment? Complex systems or simple systems interact and respond to the environment in which they are found. Since the system response $a$ is dependent on the environment $e$, there is no doubt that the complexity of $a$ is dependent on the complexity of $e$. Three
examples illustrate how the environmental influence is important. The tail of a dog has a particular motion that can be described, and the complexity can be characterized. However, we may want to attribute much of this complexity to the rest of the dog rather than to the tail. Similarly, the motion of a particle suspended in a liquid follows Brownian motion, the description of which might be better attributed to the liquid than to the particle. Clearer yet is the example of the behavior of a basketball during a basketball game. These examples generalize to the consideration of any system, because measuring the properties of a system in an environment may cause us to be measuring the influence of the environment, rather than the system.

[Figure 8.3.5 Schematic of the system, the system's environment $e$, and the observer, who receives the action $a$ as a message. The observation of system behavior involves measurements both of the system's environment, $e$, and the system's actions, $a$, in response to this environment. Thus we should characterize a system as a function, $a = f(e)$, where the function $f$ describes its actions in response to its environment. It is generally simpler to describe a model for the system structure, which is also a model of $f$, rather than a list of all of its environment-action $(e,a)$ pairs.]

The observer must describe the system behavior as a response to a particular environment, rather than just the behavior itself. Thus, we do not characterize the system by a list of actions $\{a\}$ but rather by the list of pairs $\{(e,a)\}$, where our concern is to describe $f$, the functional mapping $a = f(e)$ from the environment $e$ to the response $a$. Once we realize this, we can again affirm that a full microscopic description of the physical system is enough to give all system responses. The point is that the complexity of a system should not include the complexity of the influence upon it, but just the complexity of its response. This response is a property of the system and is determined by a complete microscopic description. Conversely, a full description of behavior subject to all possible environments would require complete microscopic information. However, within a range of environments and with a desired degree of precision (spatial and temporal scale) it is possible to provide less information and still describe the behavior.

We consider the ensemble of messages (measurements) to have possible times of observation over a range of times given by $T$ and errors in position determination $L$. Describing the ensemble of responses gives us the behavioral complexity profile $C_b(L,T)$. When the influence of the environment is not important, $C(L,T)$ and $C_b(L,T)$ are the same. When the environment matters, it is also important to characterize the information that is relevant about the environment. This is related to the problem of prediction, because predicting the system behavior in the future requires information about the environment.
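The difference between listing environment-action pairs and describing the system structure can be made concrete with a toy system. Everything specific below is an assumption made for illustration: the threshold rule, the 8-bit environment and the 1-bit action are hypothetical, and the model size counts only the threshold, taking the form of the rule to be part of the agreed language.

```python
N_E = 8   # bits used to describe the environment e (assumption)
N_A = 1   # bits used to describe the action a (assumption)


def response(e: int, threshold: int = 100) -> int:
    """Hypothetical system structure: act (1) when the environment exceeds a threshold."""
    return 1 if e > threshold else 0


# Exhaustive characterization: one action for every possible environment.
table = {e: response(e) for e in range(2 ** N_E)}

pair_list_bits = (N_E + N_A) * 2 ** N_E  # every (e, a) pair written out explicitly
ordered_table_bits = N_A * 2 ** N_E      # actions only, environments in an agreed order
model_bits = N_E                         # the threshold alone, given the rule's form

print("list of (e, a) pairs:", pair_list_bits, "bits")
print("ordered action table:", ordered_table_bits, "bits")
print("structural model:    ", model_bits, "bits")
```

Both characterizations give the same response for every environment, but the table grows exponentially with the number of bits describing the environment while the model does not.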
As we have defined it, the descriptive complexity is the information necessary to predict the behavior of the system over the time interval $t_2 - t_1$. We can characterize the environmental influence by generalizing Eq. (8.3.47) to include a term that describes the rate of information transfer from the environment to the system:

$$C(L,T) = C_b(L) + N_T\,C_e(L) + N_T\,k \sum_{i:\,h_i>0} h_i \tag{8.3.59}$$

where $C_e(L)/k\ln(2)$ is the information about the environment necessary to predict the state of the system at the next time step, and $C_b(L)$ is the behavioral complexity at one time interval. Because the system itself is finite, the amount of information about the universe that is relevant to the system behavior in any interval of time must also be finite. We note that because the system affects the environment, which then affects the system, Eq. (8.3.59) as written may count information more than once. Thus, this expression as written is an upper bound on the complexity. We noted this point also with respect to the Lyapunov exponents after Eq. (8.3.47).

Behavioral complexity suggests that we should consider the system behavior as represented by a function $a = f(e)$. The input to the function is a description of the environment; the output is the response or action. There is a difficulty with this approach in that the complexity of functions is generically much larger than that of the system itself. From the discussion in Section 8.2.3 we know that the description of a function would require an amount of information given by $C_f = C_a 2^{C_e}$, where $C_e$ is the environmental complexity, and $C_a$ is the complexity of the action. Because the environmental influence leads to an exponentially large complexity, it is clear that often the most compact description of the system behavior will give its structure rather than its response to all inputs. Then, in principle, the response can be derived from the structure. This also implies that the behavior of physical systems under different environments cannot be independent. We note that these conclusions must also apply to human beings as complex systems that respond to their environment (see Question 8.3.8).

This use of behavior/response rather than a description to characterize a system is related to the use of response functions in physics, or input/output relationships to describe artificial systems. The response function can (in principle) be completely derived from the microscopic description of a system. It is more directly relevant to the system behavior in response to environmental influences, and thus is essential for direct comparison with experimental results.

Question 8.3.8 Discuss the following statements with respect to human beings as complex systems: "The most compact description of the system behavior will give its structure rather than its response to all inputs," and "This implies that the behavior of physical systems under different environments cannot be independent."
Solution 8.3.8 The first statement is relevant to the discussion of behaviorism as an approach to psychology (see Section 3.2.8). It says that the idea of describing human behavior by cataloging reactions to environmental stimuli is ultimately an inefficient approach. It is more effective to use such measurements to construct a model for the internal functioning of the individual and use this model to describe the measured responses. The model description is much more concise than the description of all possible responses.

Moreover, from the second statement we know that the model can describe the responses to circumstances that have not been measured. This also means that the use of such models may be effective in predicting the behavior of an individual. Specifically, the reactions of a human being are not independent of past reactions to other circumstances. A model that incorporates the previous behaviors may have some ability to predict the behavior in new circumstances. This is part of what we do when we interact with other individuals: we construct models that represent their behavior and then anticipate how they will react to new circumstances.

The coupling between the reaction of a human being under one circumstance and the reaction under a different circumstance is also relevant to our understanding of human limitations. Optimizing the response through adaptation to a set of environments according to some goal is a process that is limited in its effectiveness due to the coupling between responses to different circumstances. An individual who is effective in some circumstances may have qualities that lead to ineffective behavior under other circumstances. We will discuss this in Chapter 9 in the context of considering the specialization of human beings in society. This point is also applicable more generally to living organisms and their ability to consume resources and avoid predators as discussed in Chapter 6. Increasing complexity enables an organism to be more effective, but the effectiveness under a variety of circumstances is limited by the interdependence of responses. This is relevant to the observation that living organisms generally consume limited types of resources and live in particular ecological niches.

8.3.7 The observer and recognition

The explicit existence of an observer in the definition of behavioral complexity enables us to further consider the role of the observer in the definition of complexity. What assumptions have we made about the properties of the observer? One of the assumptions that we have made is that the observer is more complex than the system. What happens if the complexity of the system is greater than the complexity of the observer? The complexity of an observer is the number of bits that may be used to describe the observer. If the observer is described by fewer bits than are needed to describe the system, then the observer will be unable to contain the description of the system that is being observed. In this case, the observer will construct a description of the system that is simpler than the system actually is. There are several possible ways that the observer may simplify the description of the system. One is to reject the
observation of all but a few kinds of messages. The other is to artificially limit the length of messages described. A third is to treat complex variability of the source as random, described by simple probabilities. These simplifications are often done in our modeling of physical systems.
An inherent problem in discussing behavioral complexity using environmental influence is that it is never possible to guarantee that the behavior of a system has been fully characterized. For example, a rock can be described as “just sitting there,” if we want to describe the complexity of its motion under different environments. Of course the nature of the environment could be changed so that other behaviors will be realized. We may, for example, discover that the rock is actually a camouflaged animal. This is an inherent problem in behavioral complexity: it is never possible to characterize with certainty the complexity of a system under circumstances that have not been measured. All such conclusions are extrapolations. Performing such extrapolations is an essential part of the use of the description of a system. This is a general problem that applies to quantitative scientific modeling as well as the use of experience in general.
Finally, we describe the relevance of recognition to complexity. The first comment is related to the recognition of sets of numbers introduced briefly in Section 8.2.3. We introduced there the concept of recognition complexity of a set that relies upon a recognizer (a special kind of TM called a predicate that gives a single bit output) that can identify the system under discussion. Specifically, when presented with the system it says, “This is it,” and when presented with any other system it says, “This is not it.” We define the complexity of a system (or set of systems) as the complexity of the simplest recognizer of the system (or set of systems). There are some interesting features of this definition. First we realize that this definition is well suited to describing classes of systems. A description or model of a class of systems must identify common attributes rather than specific behaviors. A second interesting feature is that the complexity of the recognizer depends on the possible universe of systems that it can be presented with. For example, the complexity of recognizing cows depends on whether we allow ourselves to present the recognizer with all domestic animals, all known biological organisms on earth, all potentially viable biological organisms, or all possible systems. Naturally, this is an important issue in the field of pattern recognition, where the complexity of designing a system to recognize a particular pattern is strongly dependent on the universe of possibilities within which the pattern must be recognized. We will return to this point later when we consider the properties of human language in Section 8.4.1.
A different form of complexity related to recognition may be abstracted from the Turing test of artificial intelligence. This test suggests that we will achieve an artificial representation of intelligence when it becomes impossible to determine whether we are interacting with an artificial or actual human being. We can assume that Turing had in mind only a limited type of interaction between the observer “we” and the systems being observed, either the real or artificial representation of a human being. This test, which relies upon an observer to recognize the system, can serve as the basis for an additional definition of complexity. We determine the minimal possible
complexity of a model (simulated representation) of the system which would be recognized by a particular observer under particular circumstances as the system. The complexity of this model we call the substitution complexity. The sensitivity of this definition to the nature of the observer and the conditions of the observation is manifest. In some ways this definition, however, is implicit in all of our earlier definitions. In all cases, the complexity measures the length of a representation of the system. Ultimately we must determine whether a particular representation of the system is faithful. The “we” in the previous sentence is some observer that must recognize the system behavior in the constructed representation.
We conclude this section by reviewing some of the main concepts that were introduced. We noted the sensitivity of complexity to the spatial and temporal scale relevant to the description or response. The complexity profile formally takes this into account. If necessary, we can define the unique complexity of a system to be its complexity profile evaluated at its own scale. A more complete characterization of the system uses the entire complexity profile. We found that the mathematical models most closely associated with complexity (chaos and fractals) were both relevant. The former described the influence of microscopic information over time. The latter described the gradual rather than rapid loss of information with spatial and temporal scale. We also reconciled the notion of information as a measure of system complexity with the notion of complex systems as composed out of interdependent parts. Our next objective is to concretize this discussion further by estimating the complexity of particular systems.
8.4 Complexity Estimation
There are various difficulties associated with obtaining specific values for the complexity of a particular system. There are both fundamental and practical problems. Fundamental problems such as the difficulty in determining whether a representation is maximally compressed are important. However, before this is an issue we must first obtain a representation.
One approach to obtaining the complexity of a system is to construct a representation. The explicit representation should then be used to make a simulation to show that the system behavior is reproduced. If it is, then we know that the length of the representation is an upper bound on the complexity of the system. We can hope, however, that it will not be necessary to obtain explicit representations in order to estimate complexities.
The objective of this section is to discuss various methods for estimating the complexity of systems with which we are familiar. These approaches make use of representations that we cannot simulate; however, they do have recognizable relationships to the system.
Measuring complexity is an experimental problem. The only reason that we are able to discuss the complexity of various systems is that we have already made many measurements of the properties of various systems. We can make use of the existing information to construct estimates of their complexity. A specific estimation method is not necessarily useful for all systems.
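The remark that the length of any faithful representation is an upper bound on complexity can be illustrated with an ordinary file compressor. The sketch below is an illustration under stated assumptions, not a method from the text: it uses Python's standard zlib module as a stand-in for "a representation", ignores the fixed size of the decompressor, and the function name compressed_bits is hypothetical.

```python
import zlib


def compressed_bits(description: bytes) -> int:
    """Length in bits of a zlib-compressed copy of the description.

    Any lossless encoding that decodes back to the original is a faithful
    representation, so its length bounds the complexity from above
    (up to the fixed size of the decoder, which is ignored here).
    """
    return 8 * len(zlib.compress(description, 9))


regular = b"ab" * 5000            # highly ordered, so it compresses well
raw_bits = 8 * len(regular)

# Faithfulness check: decoding reproduces the original exactly.
assert zlib.decompress(zlib.compress(regular, 9)) == regular
print(raw_bits, compressed_bits(regular))
```

A shorter compressed length only tightens the upper bound. It never shows that the representation is maximally compressed, which is the fundamental difficulty noted above.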
Our objective in this section is limited to obtaining “ballpark” estimates of the complexity of systems. This means that our errors will be in the exponent rather than in the number itself. We would be very happy to have an estimate of complexity such as 10^(3±1) or 10^(7±2). When appropriate, we keep track of half-decades using factors of three, such as in 3 × 10^4. These rough estimates will give us a first impression of the degree of complexity of many of the systems we would like to understand. It would tell us how difficult (very roughly) they are to describe. We will discuss three methods: (1) use of intuition and human language descriptions, (2) use of a natural representation tied to the system existence, where the principal example is the genome of living organisms, and (3) use of component counting. Each of these methods has flaws that will limit our confidence in the resulting estimates. However, since we are trying to find rough estimates, we can still take advantage of them. Consistency of different methods will give us some confidence in our estimates of complexity.
While we will discuss the complexity of various systems, our focus will be on determining the complexity of a human being. We start, however, by noting that the complexity of a human being can be bounded by the physical entropy of the collection of atoms from which he or she is formed. This is roughly the entropy of a similar weight of water, about 10^31 bits. This is the value of S/(k ln 2). As usual, we have assumed that there is nothing associated with a human being except the material of which he or she is formed, and that this material is described by known physical law. This entropy is an upper bound to the information necessary to specify the complete human being. The meaning of this number is that if we take away the person and we replace all of the atoms according to a specification of 10^31 bits of information, we have replaced microscopically each atom where it was. According to our understanding of physical law, there can be no discernible difference. We will discuss the implications for artificial intelligence in Section 8.4.4, where we consider whether a computer could simulate the dynamics of atoms in order to simulate the behavior of the human being.
The entropy of a human being is much larger than the complexity estimate we are after, because we are interested in the complexity at a relevant spatial and temporal scale. In general we consider the complexity of a system at the natural scale defined in Section 8.3.5, one-tenth the size of the system itself, and the relaxation time of the behavior on this same length scale. We could also define the complexity by the observer. For example, the maximum visual sensitivity of a human being is about 1/100 of a second and 0.1 mm. For either case, observing only at this spatial and temporal scale decreases dramatically the relevance of the microscopic description. The reduction in information is hard to estimate directly. To estimate the relevant complexity, we must consider other techniques. However, since most of the information in the entropy is needed to describe the position of molecules of water undergoing vibrations, we can guess that the complexity is significantly smaller than the entropy.
Our final estimate, 10^(10±2) bits, will be obtained by combining the results of different estimation techniques in the following sections. The implications of obtaining an estimate of human complexity will be discussed in Section 8.4.4.
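The half-decade convention (whole powers of ten, plus factors of three in between) can be made concrete with a small helper. This is a minimal sketch of one way to do the rounding, assuming "nearest" is judged on a logarithmic scale; the function name half_decade and the output format are hypothetical choices, not notation from the text.

```python
import math


def half_decade(value: float) -> str:
    """Round a positive number to the nearest half-decade on a log scale.

    Whole decades are written 10^k; the half-decade between 10^k and
    10^(k+1) is written 3x10^k, since 10**0.5 is about 3.16.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    steps = round(2 * math.log10(value))   # number of half-decades from 1
    exponent, half = divmod(steps, 2)
    return f"3x10^{exponent}" if half else f"10^{exponent}"


print(half_decade(30_000))
print(half_decade(2.5e6))
```

Working in half-decades keeps the bookkeeping consistent with the stated goal of errors that live in the exponent rather than in the leading digits.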
8.4.1 Human intuition: language and complexity
The first method for estimation of complexity, the use of human intuition and language, is the least controlled/scientific method of obtaining an estimate of the complexity of a system. This approach, in its most basic form, is precisely what was asked in Question 8.2.1. We ask someone what they believe the complexity of the system is. It is assumed that the person we ask is somewhat knowledgeable about the system and also about the problem of describing systems. Even though it appears highly arbitrary, we should not dismiss this approach too readily because human beings are designed to understand complex systems. It could be argued that much of our development is directed toward enabling us to construct predictive models of various parts of the environment in which we live. The complexity of a system is directly related to the amount of study we need in order to master or predict the behavior of a system. It is not accidental that this, behavior prediction, is the fundamental objective of science.
We are quite used to using the word “complexity” in a qualitative manner and even in a comparative fashion: this is more complex or less complex than something else. What is missing is the quantitative definition. In order for someone to give a quantitative estimate of the complexity of a system, it is necessary to provide a definition of complexity that can be readily understood. One useful and intuitive definition of complexity is the amount of information necessary to describe the behavior of a system. The information can be quantified in terms of representations people are familiar with: the amount of text, the number of pages, the number of books. This can be sufficient to cause a person to build a rough mental model of the system description, which is much more sophisticated than many explicit representations that might be constructed. There is an inherent limitation in this approach mentioned more generally above: a human being cannot directly estimate the complexity of an organism of similar or greater complexity than a human being. In particular, we cannot use this approach directly to estimate the complexity of human beings. Thus we will focus on simpler animals first. For example, we could ask the question in the following way: How much text is necessary to describe the behavior of a frog? We might emphasize for clarification that we are not interested in comparative frogology, or molecular frogology. We are just interested in a description of the behavior of a frog.
To gain additional confidence in this approach, we may go to the library and find descriptions that are provided in books. Superficially, we find that there are entire books devoted to a particular type of insect (mosquito, ant, butterfly), as there are books devoted to the tiger or the ape. However, there is a qualitative difference between these books. The books on insects are devoted to comparative descriptions, where various types of, e.g., mosquitoes, from around the world, their physiology, and/or their evolutionary history are described. Tens to hundreds of types are compared in a single book. Exceptional behaviors or examples are highlighted. The amount of text devoted to the behavior of a particular type of mosquito could be readily contained in less than a single chapter. On the other hand, a book devoted to tigers may describe only behavior (e.g., not physiology), and one devoted to apes
would describe only a particular individual in a manner that is limited to only part of its behaviors.
Does the difference in texts describing insects and tigers reflect the social priorities of human beings? This appears to be difficult to support. The mosquito is much more relevant to the well-being of human beings than the tiger. Mosquitoes are easier to study in captivity and are more readily available in the wild. There are films that enable us to observe the mosquito behavior at its own scale rather than at our usual larger scale. Despite such films, there is no book-length description of the behavior of a mosquito. This is true despite the importance of knowledge of its behavior to prevention of various diseases.
Even if there is some degree of subjectivity to the complexity estimates obtained from the lengths of descriptions found in books, the use of existing books is a reasonable first attempt to obtain complexity estimates from the information that has been compiled by human beings. We can also argue that when there is greater experience with complexity and complexity estimation, our ability to use intuition or existing texts will improve and become important tools in complexity estimation.
Before applying this methodology, however, we should understand more carefully the basic relationship of language to complexity. We have already discussed in Section 1.8 the information in a string of English characters. A first estimate of 4.8 bits per character could be based upon the existence of 26 letters and 1 space. In Question 1.8.12, the best estimate obtained was 3.3 bits per character using a Markov chain model that included correlations between adjacent characters. To obtain an even better estimate, we need to have a model that includes longer-range correlations between characters. The most reliable estimates have been obtained by asking people to guess the next character in an English text. It is assumed that people have a highly sophisticated model for the structure of English and that the individual has no specific knowledge of the text. The guesses were used to establish bounds on the information content. We can summarize these bounds as 0.9±0.3 bits/character. For our present discussion, the difference between high and low bounds (a factor of 2) is not significant. For convenience we will use 1 bit/character for our conversion factor. For larger quantities of text, this corresponds to values given in Table 8.4.1.
Amount of text            Information in text   Text with figures
1 char                    1 bit
1 page = 3000 char        3 × 10^3 bit          10^4
1 chapter = 30 pages      10^5 bit              3 × 10^5
1 book = 10 chapters      10^6 bit              3 × 10^6
Table 8.4.1 Information estimates for straight English text and illustrated text.
Our estimate of information in text has assumed a strictly narrative English text. We should also be concerned about figures that accompany descriptive materials. Does the conventional wisdom of “a picture is worth a thousand words” make sense? We can consider this both from the point of view of direct compression of the picture, and the
possibility of replacing the figure by descriptive text. A thousand words corresponds to 5 × 10^3 characters or bits, about two pages of text. Descriptive figures such as graphs or diagrams often consist of a few lines that can be concisely described using a formula and would have a smaller complexity. Photographs are formed of highly correlated graphical information that can be compressed. In a black and white photograph 5 × 10^3 bits would correspond to a 70 × 70 grid of completely independent pixels. If we recall that we are not interested in small details, this seems reasonable as an upper bound. Moreover, the text that accompanies a figure generally describes its essential content. Thus when we ask the key question, whether two pages of text would be sufficient to describe a typical figure and replace its function in the text, this seems a somewhat generous but not entirely unreasonable value. A figure typically occupies half of a page that would be otherwise occupied by text. Thus, for a highly illustrated book, on average containing one figure and one-half page of text on each page, our estimate of the information content of the book would increase from 10^6 bits by a factor of 2.5 to roughly 3 × 10^6 bits. If there is one picture on every two pages, the information content of the book would be doubled rather than tripled. While it is not really essential for our level of precision, it seems reasonable to adopt the convention that estimates using descriptions of behavioral complexity include figures. We will do so by increasing the previous values by a factor of 3 (Table 8.4.1). This will not change any of the conclusions.
There is another aspect of the relationship of language to complexity. A language uses individual words (like “frog”) to represent complex phenomena or systems (like the physical system we call a frog). The complexity of the word “frog” is not the same as the complexity of the frog. Why is this possible? According to our discussion of algorithmic complexity, the smallest possible representation of a complex system has a length in bits which is equal to the system complexity. Here we have an example of a system (frog) whose representation “frog” is manifestly smaller than its complexity.
The resolution of this puzzle is through the concept of recognition complexity discussed in Section 8.3.7. A word is a member of an ensemble of words, and the systems that are described by these words are an ensemble of systems. It is only necessary that the ensemble of words be matched to the ensemble of systems described by the words, not the whole ensemble of possible systems. Thus, the complexity of a word is not related to the complexity of the system, but rather to the complexity of specifying the system: the logarithm of the number of systems that are part of the shared experience of the individuals who are communicating. This is, indeed, the essence of naming: a name is a short reference to a complex system. All words are names of more complex entities.
Another way to think about this is to consider a human being as analogous to a special UTM with a set of short representations that the UTM can expand to a specific limited subset of possible long descriptions. For example, having memorized a play by Shakespeare, it is only necessary to invoke the name to retrieve the whole play. This is the central point of recognition complexity. For a human being with experience and memory of only a limited number of the set of all complex systems, to describe a system one must identify it only in comparison with the systems in memory, not with those possible in principle.
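The arithmetic behind "a picture is worth a thousand words" and the factor applied to illustrated books can be checked in a few lines. The sketch below only restates numbers given in this section (3000 characters per page, 1 bit per character, a figure counted as two pages of text); the variable names are hypothetical, and the five characters per word is inferred from "a thousand words corresponds to 5 × 10^3 characters".

```python
import math

CHARS_PER_PAGE = 3000   # from Table 8.4.1
CHARS_PER_WORD = 5      # inferred: 1000 words = 5 x 10^3 characters
BITS_PER_CHAR = 1       # conversion factor adopted in the text

# Information in "a thousand words", and its size in pages of text.
figure_bits = 1000 * CHARS_PER_WORD * BITS_PER_CHAR
exact_pages = figure_bits / (CHARS_PER_PAGE * BITS_PER_CHAR)

# Side of a square grid of independent one-bit pixels with that information.
grid_side = math.isqrt(figure_bits)

FIGURE_PAGES = 2  # the text rounds a figure up to two pages of text
# One figure plus half a page of text on every page, relative to plain text.
dense_factor = 0.5 + FIGURE_PAGES
# One figure (displacing half a page of text) on every two pages.
sparse_factor = (1.5 + FIGURE_PAGES) / 2

print(figure_bits, round(exact_pages, 2), grid_side)
print(dense_factor, sparse_factor)
```

The dense case reproduces the factor of 2.5 quoted above, and the sparse case comes out a little under 2, consistent with "doubled rather than tripled" at this level of precision.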
In this way, language provides a systematic mechanism for compression of information. This implies that we should not use the length of a word to estimate the complexity of a system that it refers to. Does this also invalidate the use of human language to obtain complexity estimates? On one hand, when we are asked to describe the behavior of a frog, we assume that we must describe it without reference to the name itself. “It behaves like a frog” is not a sufficient description. There is a presumption that a description of behavior is made to someone without specific knowledge. An estimate of the complexity of a frog would be much higher than the complexity of the word “frog.” On the other hand, the words that would be used to describe a frog also refer to complex entities or actions. Consistency in different estimates of the amount of text necessary to describe a frog might arise from the use of a common language and experience. We could expand the description further by requiring that a person explain not only the behavior of the frog, but also the meaning of each of the words used to describe the behavior of the frog. At this point, however, it is more constructive to keep in mind the subtle relationship between language and complexity as part of our uncertainty, and take the given estimates at face value. Ultimately, the complexity of a system is defined by the condition that all possible (in principle) behaviors of the same complexity could be described using the same length of text. We accept the possibility that language-based estimates of complexity of biological organisms may be systematically too small because they are common and familiar. We may nevertheless have relative complexities estimated correctly. Finally, we can argue that when we estimate the complexity of systems that approach the complexity of a human being, the estimation problem becomes less severe. This follows because of our discussion of universality of complexity given in Section 8.2.2. Specifically, that the more complex a system is, the less relevant specific knowledge is, and the more universal are estimates of complexity. Nevertheless, ultimately we will conclude that the inherent compression in use of language for describing familiar complex systems is the greatest contributor to uncertainty in complexity estimates.
There is another approach to the use of human intuition and language in estimating complexity. This is by reference to computer languages. For someone familiar with computer simulation, we can ask for the length of the computer program that can simulate the behavior of the system, more specifically, the length of the program that can simulate a frog. Computer languages are generally not very high in information content, because there are a few commands and variables that are used throughout the program. Thus we might estimate the complexity of a program not by characters, but by program lines at several bits per program line. Consistent with the definition of algorithmic complexity, the estimate of system complexity should also include the complexity of the compiler and of the computer operating system and hardware. Compilers and operating systems are much more complex than many programs by themselves. We can bypass this problem by considering instead the size of the execution module, after application of the compiler.
There are other problems with the use of natural or artificial language descriptions, including:
However. Alternatively.2 Estimates of the approximate length of text descriptions of animal behavior . Insects and fish are at pages. Difficulty with counting. Overestimation due to a lack of knowledge of possible representations. the assumption may be in the conceptual (semantic) framework. Ptolemy would give a larger complexity estimate than Copernicus because the Ptolemaic system requires a much longer description—which is the reason the Copernican system is accepted as “true” today. Alternatively. Mosquito Ant (one. The assumption of a particular length of text presumes a kind of representation. This may be due to the form of the representation—specifically English text. We can apply this method by taking the highest complexity estimate o f other systems and using this as a close lower bound to the complexity of the human being. The lengths of linguistic descriptions of the behavior of biological organisms range from several pages to several books. These numb ers span the range of complexity estimates. Large numbers are generally difficult for people to imagine or estimate. not colony) Frog Rabbit Tiger Ape Text length a few pages a few pages to a chapter a few pages to a chapter a chapter or two a short book a book a few books Complexity (bits) 3x104 105 105 3x105 106 3x106 107 Table 8. 2. If an individual is familiar with the behavior of a system only under limited circumstances. most mammals at approximately a book.the presumption that this limited knowledge is complete will lead to a complexity estimate that is too low. This choice of representation may not be the most compact. this is not quite true.2 was constructed using various books. which is generally a more familiar quantity. what are some of the estimates that we have obtained? Table 8. An example is the complexity of the motion of the planets in the Ptolemaic (earthcentered) representation compared to the Copernican (suncentered) representation. and monkeys and apes at several books. 
By close lower bound we mean that the actual complexity should not be tremendously greater. lack of knowledge may also result in too high estimates if the individual extrapolates the missing knowledge from more complex systems. This is the advantage of identifying numbers with length of text. Underestimation due to lack of knowledge of the full behavior of the system. We have concluded that it is not possible to use this approach to obtain an estimate of human complexity.4. With all of these limitations in mind. 3.4. This problem is related to the difficulty of determining the compressibility of information. According to our Animal Fish Grasshopper.Complexity estimation 765 1.frogs at a chapter.
experience, the complexity estimates of animals tend to extend up to roughly a single book. Primates may be estimated somewhat higher, with a range of one to tens of books. This suggests that human complexity is somewhat larger than this latter number: approximately 10^8 bits, or about 30 books. We will see how this compares to other estimates in the following sections.
There are several other approaches to estimating human complexity based upon language. The existence of book-length biographies implies a poor estimate of human complexity of 10^6 bits. We can also estimate the complexity of a human being by the typical amount of information that a person can learn. Specifically, it seems to make sense to base an estimate on the length of a college education, which uses approximately 30 textbooks. This is in direct agreement with the previous estimate of 10^8 bits. It might be argued that this estimate is too low because we have not included other parts of the education (elementary and high school and postgraduate education) or other kinds of education/information that are not academic. It might also be argued that this is too high because students do not actually know the entire content of 30 textbooks. One reason this number appears reasonable is that if the complexity of a human being were much greater than this, there would be individuals who would endure tens or hundreds of college educations in different subjects. The estimate of roughly 30 textbooks is also consistent with the general upper limit on the number of books an individual can write in a lifetime. The most prolific author in modern times is Isaac Asimov, with about 500 books. Thus from such text-based self-consistent evidence we might assume that the estimate of 10^8 bits is not wrong by more than one to two orders of magnitude. We now turn to estimation methods that are not based on text.
8.4.2 Genetic code
Biological organisms present us with a convenient and explicit representation for their formation by development: the genome. It is generally assumed that most of the information needed to describe the physiology of the organism is contained in genetic information. For simplicity we might think of DNA as a kind of program that is interpreted by decoding machinery during development and operation. In this regard the genome is much like a Turing machine tape (see Section 1.9), even though the mechanism for transcription is quite different from the conventional Turing machine. Some other perspectives are given in Section 7.1. Regardless of how we ultimately view the developmental process and cellular function, it appears natural to associate with the genome the information that is necessary to specify physiological design and function.
It is not difficult to determine an upper bound to the amount of information that is contained in a DNA sequence. Taken at face value, this provides us with an estimate of the complexity of an organism. We must then inquire as to the approximations that are being made. We first discuss the approach in somewhat greater detail. Considering the DNA as an alphabet of four characters provided by the four nucleotides or bases represented by A (adenine), T (thymine), C (cytosine) and G (guanine), a first estimate of the information contained in a DNA sequence would be N log2(4) = 2N bits, where N is the length of the DNA chain. Since DNA is formed of two complementary nucleotide chains in a double helix, its length is measured in base pairs. While this estimate neglects many corrections, there are a number of assumptions that we are making about the organism that give a larger uncertainty than some of the corrections that we can apply. Therefore as a rough estimate, this is essentially as good an estimate as we can obtain from this methodology at present. Specific numbers are given in Table 8.4.3. We see that for a human being, the estimate is nearly 10^10 bits, which is somewhat larger than that obtained from language-based estimates in the previous section. What is more remarkable is that there is no systematic trend of increasing genome length that parallels our expectations of increasing organism complexity based on estimates of the last section. Aside from the increasing trend from bacteria to fungi to animals/plants, there is no apparent trend that would suggest that genome length is correlated with our expectations about complexity.
Organism              Genome length (base pairs)   Complexity (bits)
Bacteria (E. coli)    10^6–10^7                    10^7
Fungi                 10^7–10^8                    10^8
Plants                10^8–10^11                   3 × 10^8–3 × 10^11
Insects               10^8–7 × 10^9                10^9
Fish (bony)           5 × 10^8–5 × 10^9            3 × 10^9
Frog and Toad         10^9–10^10                   10^10
Mammals               2 × 10^9–3 × 10^9            10^10
Man                   3 × 10^9                     10^10
Table 8.4.3 Estimates of complexity based upon genome length. Genome lengths and ranges are representative. Except for plants, where there is a particularly wide range of genome lengths, a single number is given for the information contained in the genome, because the accuracy does not justify more specific numbers.
We now proceed to discuss limitations in this approach. The list of approximations given below is not meant to be exhaustive, but it does suggest some of the difficulties in determining the information content even when there is a clear first numerical value to start from.
a. A significant percentage of DNA is “noncoding.” This DNA is not transcribed for protein structures. It may be relevant to the structural properties of DNA. It may also contain other useful information not directly relevant to protein sequence. Nevertheless, it is likely that information in most of the base pairs that are noncoding is not essential for organism behavior. Specifically, they can be replaced by many other possible base pair sequences without effect. Since 30%–50% of human DNA is estimated to be coding, this correction would reduce the estimated complexity by a factor of two to three.
b. Direct forms of compression: as presently understood, DNA is primarily utilized through transcription to a sequence of amino acids. The coding for each amino acid is given by a triple of bases. Since there are many more triples (4^3 = 64) than amino acids (twenty), some of the sequences have no amino acid counterpart, and
there are more than one sequence that map onto the same amino acid. This redundancy means that there is less information in the DNA sequence. Taking this into account by assigning a triple of bases to one of twenty characters that represent amino acids would give a new estimate of (N/3)log2(20) = 1.4N. To improve the estimate further, we would include the relative probability of the different amino acids, and correlations between them.

c. General compression: more generally, we can ask how compressed the DNA encoding of information is. We can rely upon a basic optimization of function in biology. This might suggest that some degree of compression is performed in order to reduce the complexity of transmission of the information from generation to generation. However, this is not a proof, and one could also argue in favor of redundancy in order to avoid susceptibility to small changes. Moreover, there are likely to be inherent limitations on the compressibility of the information due to the possible transcription mechanisms that serve instead of decompression algorithms. For example, if a molecule that is to be represented has a long chain of the same amino acid, e.g., aspaspaspaspaspaspaspaspaspaspaspaspaspaspaspaspaspasp, it would be interesting if this could be represented using a chemical equivalent of (18)asp. This requires a transcription mechanism that repeats segments: a DNA loop. There are organisms that are known to have highly repetitive sequences (e.g., 10^7 repetitions) forming a significant fraction of their genome. Much of this may be noncoding DNA.

Other forms of compression may also be relevant. For example, we can ask if there are protein components/subchains that can be used in more than one protein. This is relevant to the general redundancy of protein design. There is evidence that the genome does use this property for compression by overlapping the regions that code for several different proteins. A particular region of DNA may have several coding regions that can be combined in different ways to obtain a number of different proteins. Transcription may start from distinct initial points. Presumably, the information that describes the pattern of transcriptions is represented in the noncoding segments that are between the coding segments.

Related to the issue of DNA code compression are questions about the complexity of protein primary structure in relation to its own function: specifically, how much information is necessary to describe the function of a protein. This may be much less than the information necessary to specify its primary structure (amino acid sequence). This discussion is approaching issues of the scale at which complexity is measured: at the atomic scale where the specific amino acid is relevant, or at the molecular scale at which the enzymatic function is relevant. We will mention this limitation again in point (d).

d. Scale of representation: the genome codes for macromolecular and cellular function of the biological organism. This is much less than the microscopic entropy, since it does not code the atomic vibrations or molecular diffusion. However, since our concern is for the organism’s macroscopic complexity, the DNA is likely to be coding a far greater complexity than we are interested in for multicellular
However. causing a small correction to our estimates. identical twins have been studied in order to determine the difference between environmental and genetic influence. even organisms that have the same DNA are not exactly the same. We could note also that there are two sources of DNA in the eukaryotic cell.. if the organism behavior is comparatively simple. the information in cellular structures is more likely to be irrelevant for organisms whose complexity is high. including the transcription mechanisms. but rather the microscale influence.Similar to our point (d). however. Completeness of representation: we have assumed that DNA is the only source of cellular information. f. Thus. It is clear.104–106 bits compared to 107–1011 bits in DNA). the DNA is essentially representing the complexity of a single cellular function with the additional complication of representing the variation in this function. 10%) of the information in the nuclear DNA.a complete estimate of the complexity of a system must include this information. It is possible. However. only involves a small fraction of the information content compared to the DNA (for example.it may be assumed that the greatest part of the DNA code represents the macroscale behavior. nuclear DNA and mitochondrial DNA. Nevertheless.C o m p l e xi ty est i mati on 769 organisms. Without considering different scales of structure or behavior. Thus. The assumption is that much of the cellular chemical activity is not relevant to a description of the behavior on the scale of the organism. on the macroscale we should . To the extent that the complexity of cellular behavior is smaller than that of the complete organism. e. Otherwise it would be possible to transfer DNA from one cell into any other cell and the organism would function through control by the DNA. the molecular and cellular behavior is generally repeated throughout the organism in different cells. 
This influence begins with the randomness of molecular vibrations during the developmental process. however. In humans. that DNA does not contain all the information. Here we are not considering the macroscale environmental influence. This is not the case.the greater part of the DNA representation would be devoted to describing the cellular behavior.g.and it is not clear how much information is necessary to specify their function. However. and we also expect it to dominate over other sources of cellular information. If the DNA were representing the sum of the molecular or cellular scale complexity of each of the cells independently. The information in the nuclear DNA dominates over the mitochondrial DNA. On the other hand. during cell division not only the DNA is transferred but also other cellular structures. then the error in estimating the complexity would be quite large. We have implicitly assumed that the development process of a biological organism is deterministic and uniquely determined by the genome. that the other sources of information approach some fraction (e. Randomness in the process of development gives rise to additional information in the final structure that is not contained in the genome. The additional information gained in this way would have to play a relatively minor functional role if there is significance to the genetic control over physiology. it may very well be that the description of all other parts of the cell.
not expect the microscopic randomness to affect the complexity by more than a factor of 2, and more likely the effect is not more than 10% in a typical biological organism.

g. We have also neglected the macroscale environmental influences on behavior. These are usually described by adaptation and learning. For most biological organisms, the environmental influences on behavior are believed to be small compared to genetic influences. Instinctive behaviors dominate. This is not as true about many mammals and even less true about human beings. Therefore, the genetic estimate becomes less reliable as an upper bound for human beings than it is for lower animals. This point will be discussed in greater detail below.

We can see that the assumptions discussed in (a), (b), (c) and (d) would lead to the DNA length being an overly large estimate of the complexity. Assumptions discussed in (e), (f) and (g) imply it is an underestimate.

One of the conceptual difficulties that we are presented with in considering genome length as a complexity estimate is that plants have a much higher DNA length than animals. This is in conflict with the conventional wisdom that animals have a greater complexity of behavior than plants. We might adopt one of two approaches to understanding this result: first, that plants are actually more complex than animals, and second, that the DNA representation in plants does not make use of, or cannot make use of, compression algorithms that are present in animal cells.

If plants are systematically more complex than animals, there must be a general quality of plants that has higher descriptive and behavioral complexity. A candidate for such a property is that plants are generally able to regenerate after injury. This inherently requires more information than the reliance upon a specific time history for development. In essence, there must be some form of actual blueprint for the organism encoded in the genome that takes into account many possible circumstances. From a programming point of view, this is a multiply reentrant program. To enable this feature may very well be more complex, or it may require a more redundant (longer) representation of the same information. It is presumed that the structure of animals has such a high intrinsic complexity that representation of a fully regenerative organism would be impossible. This idea might be checked by considering the genome length of animals that have greater ability to regenerate. If they are substantially longer than similar animals without the ability to regenerate, the explanation would be supported. Indeed, the salamander, which is the only vertebrate with the ability to regenerate limbs, has a genome of 10^11 base pairs. This is much larger than that of other vertebrates, and comparable to that of the largest plant genomes.

A more general reason for the high plant genome complexity that is consistent with regeneration would be that plants have systematically developed a high complexity on smaller (molecular and cellular) rather than larger (organismal) scales. One reason for this would be that plant immobility requires the development of complex molecular and cellular mechanisms to inhibit or survive partial consumption by other organisms. By our discussion of the complexity profile in Section 8.3, a high complexity on small scales would not allow a high complexity on larger scales. This
explanation would also be consistent with our understanding of the relative simplicity of plants on the larger scale.

The second possibility is that there exists a systematic additional redundancy of the genome in plants. This might be the result of particular proteins with chains of repetitive amino acids. A protein formed out of a long chain of the same amino acid might be functionally of importance in plants, and not in animals. This is a potential explanation for the relative lengths of plant genome and animal genome.

One of the most striking features of the genome lengths found for various organisms is their relative uniformity. Widely different types of organisms have similar genome lengths, while similar organisms may have quite different genome lengths. One explanation for this that might be suggested is that genome lengths have increased systematically with evolutionary time. It is hard, however, to see why this would be the case in all but the simplest models of evolution. It makes more sense to infer that there are constraints on the genome lengths that have led it to gravitate toward a value in the range 10^9–10^10. Increases in organism complexity then result from fewer redundancies and better compression, rather than longer genomes. In principle, this could account for the pattern of complexities we have obtained.

Regardless of the ultimate reason for various genome lengths, in each case the complexity estimate from genome length provides an upper bound to the genetic component of organism complexity (c.f. points (e), (f) and (g) above). Thus, the human genome length provides us with an estimate of human complexity.

8.4.3 Component counting
The objective of complexity estimation is to determine the behavioral complexity of a system as a whole. However, one of the important clues to the complexity of the system is its composition from elements and their interactions. By counting the number of elements, we can develop an understanding of the complexity of the system. However, as with other estimation methods, it must be understood that there are inherent problems in this approach. We will find that this method gives us a much higher estimate than the other methods. In using this method we are faced with the dilemma that lies at the heart of the ability to understand the nature of complex systems: how does complex behavior arise out of the component behavior and their interactions? The essential question that we face is: Assuming that we have a system formed of N interacting elements that have a complexity C0 (or a known distribution of complexities), how can the complexity C of the whole system be determined? The maximal possible value would be NC0. However, as we discussed in Section 8.3, this is reduced both by correlations between elements and by the change of scale from that of the elements to that of the system. We will discuss these problems in the context of estimating human complexity.

If we are to consider the behavioral complexity of a human being by counting components, we must identify the relevant components to count. If we count the number of atoms, we would be describing the microscopic complexity. On the other hand, we cannot count the number of parts on the scale of the organism (one) because the problem in determining the complexity remains in evaluating C0. Thus
the objective is to select components at an intermediate scale. Of the natural intermediate scales to consider, there are molecules, cells and organs. We will tackle the problem by considering cells and discuss difficulties that arise in this context. The first difficulty is that the complexity of behavior does not arise equally from all cells. It is generally understood that muscle cells and bone cells are largely uniform in structure. They may therefore collectively be described in terms of a few parameters, and their contribution to organism behavior can be summarized simply. In contrast, as we discussed in Chapter 2, the behavior of the system on the scale of the organism is generally attributed to the nervous system. Thus, aside from an inconsequential number of additional parameters, we will consider only the cells of the nervous system. If we were considering the behavior on a smaller length scale, then it would be natural to also consider the immune system.

In order to make more progress, we must discuss a specific model for the nervous system and then determine its limitations. We can do this by considering the behavior of a model system we studied in detail in Chapter 2, the attractor neural network model. Each of the neurons is a binary variable. Its behavior is specified by whether it is ON or OFF. The behavior of the network is, however, described by the values of the synapses. The total complexity of the synapses could be quite high if we allowed the synapses to have many digits of precision in their values, but this does not contribute to the complexity of the network behavior. Given our investigation of the storage of patterns in the network, we can argue that the maximal number of independent parameters that may be specified for the operation of the network consists of the neural firing patterns that are stored. This corresponds to c N^2 bits of information, where N is the number of neurons, and c ≈ 0.14 is a number that arose from our analysis of network overload.

There are several problems with applying this formula to biological nervous systems. The first is that the biological network is not fully connected. This means that the storage capacity of the network is smaller, and should scale with the number of synapses. We could apply a similar formula to the network assuming only the number of synapses Ns that are present. This gives a value c Ns N. For the human brain where Ns has been estimated at 10^4 and N ≈ 10^11, this would give a value of 0.1 × 10^4 × 10^11 = 10^14 bits. The problem with this estimate is that in order to specify the behavior of the network, we need to specify not only the imprinted patterns but also which synapses are present and which are absent. Listing the synapses that are present would require a set of number pairs that would specify which neurons each neuron is attached to. This list would require roughly N Ns log2(N) = 3 × 10^16 bits, which is larger than the number of bits of information in the storage itself. This estimate may be reduced by a small amount, if, as we expect, on average, the synapses of a neuron largely connect to neurons that are nearby. We will use 10^16 as the basis for our complexity estimate.

The second major problem with this model is that real neurons are far from binary variables. Indeed, a neuron is a complex system. Each neuron responds to particular neurotransmitters, and the synapse between two specific neurons is different from other synapses. How many parameters would be needed to describe the behavior of an individual neuron, and how relevant are these parameters to the complexity
of the whole system? Naively, we might think that taking into account the complexity of individual neurons gives a much higher complexity than that considered above. However, this is not the case. We assume that the parameters necessary to describe an individual neuron correspond to a complexity C0, and it is necessary to specify the parameters of all of the neurons. Then the complexity of the whole system would include C0N bits for the neurons themselves. This would be greater than 10^16 bits only if the complexity of the individual neurons were larger than 10^5. A reasonable estimate of the complexity of a neuron is roughly 10^3–10^4 bits. This would give a value of C0N = 10^13–10^14 bits, which is not a significant amount by comparison with 10^16 bits. By these estimates, the complexity of the internal structure of a neuron is not greater than the complexity of its interconnections.

Similarly, we should consider the complexity of a synapse, which multiplies the number of synapses. Synapses are significantly simpler than the neurons. We may estimate their complexity as no more than 10 bits. This would be sufficient to specify the synaptic strength and the type of chemicals involved in transmission. Multiplying this by the total number of synapses (10^15) gives 10^16 bits. This is the same as the information necessary to specify the list of synapses that are present.

Combining our estimates for the information necessary to specify the structure of neurons, the structure of synapses and the list of synapses present, we obtain an estimate for complexity of 10^16 bits. This estimate is significantly larger than the estimate found from the other two approaches. As we mentioned before, there are two fundamental difficulties with this approach that make the estimate too high: correlations among parameters and the scale of description.

Many of the parameters enumerated above are likely to be the same, giving rise to the possibility of compression of the description. Both the description of an individual neuron and the description of the synapses between them can be drastically simplified if all of them follow a pattern. For example, the visual system involves processing of a visual field where the different neurons at different locations perform essentially the same operation on the visual information. Even if there are smooth variations in the parameters that describe both the neuron behavior and the synapses between them, we can describe the processing of the visual field in terms of a small number of parameters. Indeed, one would guess (an intuition-based estimate) that processing of the visual field is quite complicated (more than 10^2 bits) but would not exceed 10^3–10^5 bits altogether. Since a substantial fraction of the number of neurons in the brain is devoted to initial visual processing, the use of this reduced description of the visual processing would reduce the estimate of the complexity of the whole system. Nevertheless, the initial visual processing does not involve more than 10% of the number of neurons. Even if we eliminate all of their parameters, the estimate of system complexity would not change. However, the idea behind this construction is that whenever there are many neurons whose behavior can be grouped together into particular functions, then the complexity of the description is reduced. Thus if we can describe neurons as belonging to a particular class of neurons (category or stereotype), then the complexity is reduced. It is known that neurons can be categorized; however, it is not clear how many parameters remain once this categorization has been done.
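The component-counting figures quoted above are simple products of order-of-magnitude inputs, so they can be rechecked with a few lines of arithmetic. The following is a minimal sketch, not part of the original text: the function names are hypothetical, the logarithm is assumed to be base 2 (bits), and c is rounded to 0.1 as in the text's own calculation.

```python
import math

# Order-of-magnitude inputs quoted in the text.
N = 1e11    # number of neurons in the human brain
NS = 1e4    # number of synapses per neuron
C = 0.1     # storage coefficient, rounded from c ~ 0.14 as the text does


def stored_pattern_bits(n_neurons, synapses_per_neuron, c=C):
    """Bits held as imprinted patterns in a diluted network: c * Ns * N."""
    return c * synapses_per_neuron * n_neurons


def synapse_list_bits(n_neurons, synapses_per_neuron):
    """Bits needed to list the target of every synapse: N * Ns * log2(N)."""
    return n_neurons * synapses_per_neuron * math.log2(n_neurons)


def synapse_parameter_bits(n_neurons, synapses_per_neuron, bits_per_synapse=10):
    """Bits needed to describe every synapse at about 10 bits each."""
    return n_neurons * synapses_per_neuron * bits_per_synapse


def neuron_bits_needed_to_match(total_bits, n_neurons):
    """Per-neuron complexity C0 at which C0 * N equals total_bits."""
    return total_bits / n_neurons


if __name__ == "__main__":
    print(f"stored patterns: {stored_pattern_bits(N, NS):.1e} bits")
    print(f"list of synapses: {synapse_list_bits(N, NS):.1e} bits")
    print(f"synapse parameters: {synapse_parameter_bits(N, NS):.1e} bits")
    print(f"C0 needed to reach 1e16: {neuron_bits_needed_to_match(1e16, N):.1e} bits")
```

Under these assumptions the synapse list and the synapse parameters both come out near 10^16 bits, while the stored patterns contribute only about 10^14 bits, which is why the connectivity dominates the estimate.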
When we think about grouping the neurons together, we might also realize that this discussion is relevant to the consideration of the influence of environment and genetics on behavior. If the number of parameters necessary to describe the network greatly exceeds the number of parameters in the genetic code, which is only 10^10 bits, then many of these parameters must be specified by the environment. We will discuss this again in the next section.

On a more philosophical note, we comment that parameters that describe the nervous system also include the malleable short-term memory. While this may be a small part of the total information, our estimate of behavioral complexity should raise questions such as, How specific do we have to be? Should the content of short-term memory be included? The argument in favor would be that we need to represent the human being in entirety. The argument against would be that what happened in the past five minutes or even the past day is not relevant and we can reset this part of the memory. Eventually we may ask whether the objective is to represent the specific information known by an individual or just his or her “character.”

We have not yet directly addressed the role of substructure (Chapter 2) in the complexity of the nervous system. In comparison with a fully connected network, a network with substructure is more complex because it is necessary to specify the substructure, or more specifically which neurons (or which information) are proximate to which. However, in a system that is subdivided by virtue of having fewer synapses between subdivisions, once we have counted the information that is present in the selection of synapses, as we have done above, the substructure of the system has already been included.

The second problem of estimating complexity based on component counting is that we do not know how to reduce the complexity estimate based upon an increase of the length scale of observation. The estimate we have obtained for the complexity of the nervous system is relevant to a description of its behavior on the scale of a neuron (it does, however, focus on cellular behavior most relevant to the behavior of the organism). In order to overcome this problem, we need a method to assess the dependence of the organism behavior on the cellular behavior. A natural approach might be to evaluate the robustness of the system behavior to changes in the components. Human beings are believed to lose approximately 10^6 neurons every day (even without alcohol) corresponding to the loss of a significant fraction of the neurons over the course of a lifetime. This suggests that individual neurons are not crucial to determining human behavior. It implies that there may be a couple of orders of magnitude between the estimate of neuron complexity and human complexity. However, since the daily loss of neurons corresponds only to a loss of 1 in 10^5 neurons, we could also argue that it would be hard for us to notice the impact of this loss. In any event, our estimate based upon component counting, 10^16, is eight orders of magnitude larger than the estimates obtained from text and six orders of magnitude larger than the genome-based estimate. To account for this difference we would have to argue that 99.999% of neuron parameters are irrelevant to human behavior. This is too great a discrepancy to dismiss based upon such an argument.
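The robustness argument above rests on two small calculations: the fraction of neurons lost per day, and the gap in orders of magnitude between the three estimates. A rough sketch of both follows. The 80-year lifespan is an assumption introduced here for illustration and is not a figure from the text; the function names are likewise hypothetical.

```python
import math

NEURONS = 1e11            # total neurons, as quoted earlier in the text
LOST_PER_DAY = 1e6        # neurons lost per day, as quoted in the text
ASSUMED_LIFESPAN_YEARS = 80   # assumption for illustration only


def daily_loss_fraction(lost_per_day=LOST_PER_DAY, neurons=NEURONS):
    """Fraction of all neurons lost in a single day."""
    return lost_per_day / neurons


def lifetime_loss_fraction(years=ASSUMED_LIFESPAN_YEARS):
    """Fraction of all neurons lost over an assumed lifespan."""
    days = years * 365
    return days * daily_loss_fraction()


def orders_of_magnitude(larger_bits, smaller_bits):
    """Base-10 gap between two complexity estimates."""
    return math.log10(larger_bits / smaller_bits)


if __name__ == "__main__":
    print(f"daily loss: 1 in {1 / daily_loss_fraction():.0e} neurons")
    print(f"lifetime loss: {lifetime_loss_fraction():.0%} of neurons")
    print(f"component vs. text: {orders_of_magnitude(1e16, 1e8):.0f} orders")
    print(f"component vs. genome: {orders_of_magnitude(1e16, 1e10):.0f} orders")
```

With the assumed lifespan, the cumulative loss is a few tens of percent of all neurons, consistent with the text's "significant fraction", even though any single day removes only 1 neuron in 10^5.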
Finally, we can demonstrate that 10^16 is too large an estimate of complexity by considering the counting of time rather than the counting of components. We consider a minimal time interval of describing a human being to be of order 1 second, and we allow for each second 10^3 bits of information. There are of order 10^9 seconds in a lifetime. Thus we conclude that only, at most, 10^12 bits of information are necessary to describe the actions of a human. This estimate assumes that each second is independently described from all other seconds, and no patterns of behavior exist. This would seem to be a very generous estimate. We can contrast this number with an estimate of the total amount of information that might be imprinted upon the synapses. This can be estimated as the total number of neuronal states over the course of a lifetime. For a neuron reaction time of order 10^-2 seconds, 10^11 neurons, and 10^9 seconds in a lifetime, we have 10^22 bits of information. Thus we see that the total amount of information that passes through the nervous system is much larger than the information that is represented there, which is larger than the information that is manifest in terms of behavior. This suggests either that the collective behavior of neurons requires redundant information in the synapses, as discussed in Section 8.3.6, or that the actions of an individual do not fully represent the possible actions that the individual would take under all circumstances. The latter possibility returns us to the discussion of Eq. (8.3.47) and Eq. (8.3.59), where we commented that the expression is an upper bound, because information may cycle between scales or between system and environment. Under these circumstances, the potential complexity of a system under the most diverse set of circumstances is not necessarily the observed complexity. Both of our approaches to component counting (spatial and temporal) may overestimate the complexity due to this problem.

8.4.4 Complexity of human beings, artificial intelligence, and the soul
We begin this section by summarizing the estimates of human complexity from the previous sections, and then turn to some more philosophical considerations of its significance. We have found that the microscopic complexity of a human being is in the vicinity of 10^30 bits. This is much larger than our estimates of the macroscopic complexity: language-based 10^8 bits, genome-based 10^10 bits and component (neuron)-counting 10^16 bits. As discussed at the end of the last section, we replace the spatial component-counting estimate with the time-counting upper bound of 10^12 bits. We will discuss the discrepancies between these numbers and conclude with an estimate of 10^(10±2) bits.

We can summarize our understanding of the different estimates. The language-based estimate is likely to be somewhat low because of the inherent compression achieved by language. One way to say this is that a college education, consisting of 30 textbooks, is based upon childhood learning (nonlinguistic and linguistic) that provides meaning to the words, and therefore contains comparable or greater information. The genome-based complexity is likely to be a too-large estimate of the influence of genome on behavior, because genome information is compressible and because much of it must be relevant to molecular and cellular function. The component
We must still conclude that most of the network information is not relevant to behavior at the larger scale. However. predict the behavior of a human being. For the following discussion. then there would be no independent functional description.Specifically.776 Hu man Civ il iz a tion I counting estimate suggests that the information obtained from experience is much larger than the information due to the genome—specifically. It may be helpful to discuss some of the alternate approaches to the traditional conflict that bypass the controversy in favor of slightly modified definitions. Our estimate of the complexity of a human being is 1010±2 bits. Also.2. we must realize that the concept of soul serves a purpose.and science and popular thought. If the material of which the human being is made were essential to its function. It is redundant. Such nonmaterial entities are rejected in the context of science because they are. The complexity of a human being specifies the amount of information necessary to describe and. that genetic information cannot specify the parameters of the neural network. In this way the description of a soul suggests an abstraction of function from matter which is consistent with abstractions that are familiar in science and modern thought. the complexity for describing the response to arbitrary circumstances may be higher than the estimate that we will give. When an individual dies. there would be no mechanism by which we could reproduce human behavior without making use of precisely the atoms of which he or she was formed. but might not be consis . but should still be significantly less than 1016 bits. This is consistent with our discussion in Section 3. Because of this last point. we will consider the possibility of a scientific definition of the concept of a soul. 
Consideration of the complexity of a human being is intimately related to fundamental issues in artificial intelligence.the existence of a soul represents the independence of the human being from the material of which he or she is formed.there is an implication of its possibility.the actual value is less important than the existence of an estimate. Instead it is closely related to the assumptions of this field. and /or does not manifest itself in human behavior because of the limited types of external circumstances that are encountered. in principle. given an environment. To understand how this is related to the religious concept of soul. There is no presumption that the prediction would be feasible using present technology. Our objective here is to briefly discuss both philosophical and practical implications of this observation.11 that suggested that synapses store learned information while the genome determines the overall structure of the network. by definition. One way to define the concept of soul is as the information that describes completely a human being. The main final caveat is that the difficulty in assessing the possibility of information compression may lead to a systematic bias to high complexities.not measurable.We will see that such a concept is not necessarily in conflict with notions of artificial intelligence. The notion of reproducing human behavior in a computer (or by other artificial means) has traditionally been a major domain of confrontation between science and religion. Some of these conflicts arise because of the supposition by some religious philosophers of a nonmaterial soul that is presumed to animate human beings. We have just estimated the amount of this information. The error bars essentially bracket the values we obtained.
A primitive concept of matter might insist that the matter of which we are formed is essential to our functioning. The simplest possible abstraction would be to state (as is claimed by physics) that the specific atoms of which the human being is formed are not necessary to his or her function. Instead, these atoms may be replaced by other indistinguishable atoms and the same behavior will be found. Artificial intelligence takes this a large step further by stating that there are other possible media in which the same behavior can be realized. A human being is not directly tied to the material of which he is made. Instead there is a functional description that can be implemented in various media, of which one possible medium is the biological body that the human being was implemented in.

Viewed in this light, the statement of the existence of a soul appears to be the same as the claim of artificial intelligence: that a human being can be reproduced in a different form by embodying the function rather than the mechanism of the human being. There is, however, a crucial distinction between the religious view and some of the practical approaches of artificial intelligence. This difference is related to the notion of a universal artificial intelligence, which is conceptually similar to the model of universal Turing machines. According to this view there is a generic model for intelligence that can be implemented in a computer. In contrast, the religious view is typically focused on the individual identity of an individual human being as manifest in a unique soul. We have discussed in Chapter 3 that our models of human beings are to be understood as nonuniversal and would indeed be better realized by the concept of representing individual human beings rather than a generic artificial intelligence. There are common features to the information processing of different individuals. However, we anticipate that the features characteristic of human behavior are predominantly specific to each individual rather than common. Thus the objective of creating artificial human beings might be better described as that of manifesting the soul of an individual human.

We can illustrate this change in perspective by considering the Turing test for recognizing artificial intelligence. The Turing test suggests that in a conversation with a computer we may not be able to distinguish it from a human being. A key problem with this prescription is that there is no specification of which human being is to be modeled. Human beings have varied complexity, and interactions are of varied levels of intimacy. It would be quite easy to reproduce the conversation of a mute individual, or even an obsessed individual. Which human being did Turing have in mind? We can go beyond this objection by recognizing that in order to fool us into thinking that the computer is a human being, except for a very casual conversation, the computer would have to represent a single human being with a name, a family history, a profession, opinions and a personality, not an abstract notion of intelligence. Finally, we may also ask whether the represented human being is someone we already know, or someone we do not know, prior to the test, when we met him or her.

While we bypassed the fundamental controversy between science and religion regarding the presence of an immaterial soul, we suspect that the real conflict between the approaches resides in a different place. This conflict is in the question of the intrinsic value of a human being and his place in the universe.
Both the religious and popular view would like to place an importance on a human being that transcends the value of the matter of which he is formed. Philosophically, the scientific perspective has often been viewed as lowering human worth. This is true whether it is physical scientists that view the material of which man is formed as "just" composed of the same atoms as rocks and water, or whether it is biological scientists that consider the biochemical and cellular structures as the same as, and derived evolutionarily from, animal processes.

The study of complexity presents us with an opportunity in this regard. A quantitative definition of complexity can provide a direct measure of the difference between the behavior of a rock, an animal and a human being. We should recognize that this capability can be a double-edged sword. On the one hand it provides us with a scientific method for distinguishing man from matter, and man from animal, by recognizing that the particular arrangement of atoms in a human being, or the particular implementation of biology, achieves a functionality that is highly complex. At the same time, by placing a number on this complexity it presents us with the finiteness of the human being. For those who would like to view themselves as infinite, a finite complexity may be humbling and difficult to accept. Others who already recognize the inherent limitations of individual human beings, including themselves, may find it comforting to know that this limitation is fundamental.

As is often the case, the value of a number attains meaning through comparison. Specifically, we may consider the complexity of a human being and see it as either high or low. We must have some reference point with respect to which we measure human complexity. One reference point was clear in the preceding discussion: that of animals. We found that our (linguistic) estimates of human complexity placed human beings quantitatively above those of animals, as we might expect. This result is quite reasonable but does not suggest any clear dividing line between animals and man. There is, however, an independent value to which these complexities can be compared. For consistency, we use language-based complexity estimates throughout.

The idea of biological evolution and the biological continuity of man from animal is based upon the concept of the survival demands of the environment on man. Let us consider for the moment the complexity of the demands of the environment. We can estimate this complexity using relevant literature. Books that discuss survival in the wild are typically quite short, 3 × 10^5 bits. Such a book might describe more than just basic survival (plants to eat and animal hunting) but also various skills of a primitive life such as stone knives, tanning, basket making, and primitive home or boat construction. Alternatively, a book might discuss survival under extreme circumstances rather than survival under more typical circumstances. Even so, the amount of text is not longer than a rather brief book. While there are many individuals who have devoted themselves to living in the wild, there are no encyclopedias of relevant information. This suggests that in comparison with the complexity of a human being, the complexity of survival demands is small. Indeed, this complexity appears to be right at the estimated dividing line between animal (10^6 bits) and man (10^8 bits). It is significant that an ape may have a complexity of ten times the complexity of the environmental demands upon it, but a human being has a complexity of a hundred times this demand.
Another way to arrive at this conclusion is to consider primitive man, or primitive tribes that exist today. We might ask about the complexity of their existence and specifically whether the demands of survival are the same as the complexity of their lives. From books that reflect studies of such peoples we see that the description of their survival techniques is much shorter than the description of their social and cultural activities. A single aspect of their culture might occupy a book, while the survival methods do not occupy even a single one.

We might compare the behavior of primitive man with the behavior of animal predators. In contrast to grazing animals, predators satisfy their survival needs in terms of food using only a small part of the day. One might ask why they did not develop complex cultural activities. One might think, for example, of sleeping lions. While they do have a social life, it does not compare in complexity to that of human beings. The explanation that our discussion provides is that while time would allow cultural activities, complexity does not. Thus, the complexity of such predators is essentially devoted to problems of survival. That of human beings is not.

This conclusion is quite intriguing. Several interesting remarks follow. In this context we can suggest that analyses of animal behavior should not necessarily be assumed to apply to human behavior. In particular, any animal behavior might be justified on the basis of a survival demand. While this approach has also often been applied to human beings (the survival advantages associated with culture, art and science have often been suggested), our analysis suggests that this is not justified, at least not in a direct fashion. Human behavior cannot be driven by survival demands if the survival demands are simpler than the human behavior. Of course, this does not rule out that general aspects or patterns of behavior, or even some specific behaviors, are driven by survival demands.

One of the distinctions between man and animals is the relative dominance of instinctive behavior in animals, as compared to learned behavior in man. It is often suggested that human dependence on learned rather than instinctive behavior is simply a different strategy for survival. However, if the complexity of the demands of survival is smaller than that of a human being, this does not hold. We can argue instead that if the complexity of survival demands is limited, then there is no reason for additional instinctive behaviors. Thus, our results suggest that instinctive behavior is actually a better strategy for overcoming survival demands, because it is prevalent in organisms whose behavior arises in response to survival demands. However, once such demands are met, there is little reason to produce more complex instinctive behaviors, and for this reason human behavior is not instinctively driven.

We now turn to some more practical aspects of the implications of our complexity estimates for the problem of artificial intelligence, or the recreation of an individual in an artificial form. We may start from the microscopic complexity (roughly the entropy) which corresponds to the information necessary to replace every atom in the human being with another atom of the same kind, or alternatively, to represent the atoms in a computer. We might imagine that the computer could simulate the dynamics of the atoms in order to simulate the behavior of the human being. The practicality of such an implementation is highly questionable.
The problem is not just that the number of bits of storage as well as the speed requirements are beyond modern technology. It must be assumed that any computer representation of this dynamics must ultimately be composed of atoms. If the simulation is not composed out of the atoms themselves, but some controllable representation of the atoms, then the complexity of the machine must be significantly greater than that of a human being. Moreover, unless the system is constructed to respond to its environment in a manner similar to the response of a human being, then the computer must also simulate the environment. Such a task is likely to be formally as well as practically impossible.

One central question then becomes whether it is possible to compress the representation of a human being into a simpler one that can be stored. Our estimate of behavioral complexity, 10^(10±2) bits, suggests that this might be possible. Since a CD-ROM contains 5 × 10^9 bits, we are discussing 2 × 10^(±2) CD-ROMs. At the lower end of this range, 0.02 CD-ROMs is clearly not a problem. Even at the upper end, two hundred CD-ROMs is well within the domain of feasibility. Indeed, even if we chose to represent the information we estimated to be necessary to describe the neural network of a single individual, 10^16 bits or 2 million CD-ROMs, this would be a technologically feasible project. We have made no claims about our ability to obtain the necessary information for one individual. However, once this information is obtained, it should be possible to store it. A computer that can simulate the behavior of this individual represents a more significant problem.

Before we discuss the problem of simulating a human being, we might ask what the additional microscopic complexity present in a human body is good for. Specifically, if only 10^10 bits are relevant to human behavior, what are most of the 10^31 bits doing? One way to think about this question is to ask why nature didn't build a similar machine with of order 10^10 atoms, which would be significantly smaller. We might also ask whether we would know if such an organism existed. On our own scale, we might ask why nature doesn't build an organism with a complexity of order 10^30. We have already suggested that there may be inherent limitations to the complexity that can be formed. However, there may also be another use of some of the additional large number of microscopic pieces of information.

One possible use of the additional information can be inferred from our arguments about the difference between TM with and without a random tape. The discussion in Section 1.9.7 suggests that it may be necessary to have a source of randomness to allow human qualities such as creativity. This fits nicely with our discussion of chaos in complex system behavior. The implication is that the microscopic information becomes gradually relevant to the macroscopic behavior as a chaotic process. We can assume that most microscopic information in a human being describes the position and orientation of water molecules. In this picture, random motion of molecules affects cellular behavior, specifically the firing of neurons, that ultimately affects human behavior. This does not mean that all of the microscopic information is relevant. Only a small number of bits can be relevant at any time. However, we recognize that in order to obtain a certain number of random bits, there must be a much larger reservoir of randomness. This is one approach to understanding a possible use of the microscopic information content of a human being. Another approach would ascribe the additional information to the necessary support structures for the complex behavior, but would not attribute to it an essential role as information.
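
The storage figures quoted in this passage (a behavioral complexity of 10^(10±2) bits, a neural network description of 10^16 bits, and a CD-ROM capacity of 5 × 10^9 bits) can be checked with a few lines of arithmetic. The following Python sketch is only an illustration; the function name `cdroms_needed` and the dictionary labels are introduced here and are not part of the text.

```python
# Storage estimates for the complexity figures quoted in the text.
# Assumption taken from the text: one CD-ROM holds 5e9 bits.
CDROM_BITS = 5e9


def cdroms_needed(bits: float) -> float:
    """Return how many CD-ROMs are needed to hold `bits` bits."""
    return bits / CDROM_BITS


# Labels are illustrative; the bit counts are the ones given in the text.
ESTIMATES = {
    "behavioral complexity, lower end (10^8 bits)": 1e8,
    "behavioral complexity, central value (10^10 bits)": 1e10,
    "behavioral complexity, upper end (10^12 bits)": 1e12,
    "neural network description (10^16 bits)": 1e16,
}

if __name__ == "__main__":
    for label, bits in ESTIMATES.items():
        print(f"{label}: {cdroms_needed(bits):g} CD-ROMs")
```

Under these assumptions the range 10^(10±2) bits corresponds to between 0.02 and 200 CD-ROMs, and 10^16 bits to 2 million, in agreement with the numbers stated above.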
We have demonstrated time and again that it is possible to build a stronger or faster machine than a human being. This has led some people to believe that we can also build a systematically more capable machine, in the form of a robot. We have already argued that the present notion of computers may not be sufficient if it becomes necessary to include chaotic behavior. We can go beyond this argument by considering the problem we have introduced of the fundamental limits to complexity for a collection of molecules. It may turn out that our quest for the design of a complex machine will be limited by the same fundamental laws that limit the design of human beings. One of the natural improvements for the design of deterministic machines is to consider lower temperatures that enable lower error rates and higher speeds, and possibly the use of superconductors. However, the choice of a higher temperature may be required to enable a higher microscopic complexity, which also limits the macroscopic complexity. The mammalian body temperature may be selected to balance two competing effects. At high temperatures there is a high microscopic complexity. However, breaking the ergodic theorem requires low temperatures so that energy barriers can be effective in stopping movement in phase space. A way to argue this point more generally is that the sensitivity of human ears and eyes is not limited by the biological design, but by fundamental limits of quantum mechanics. It may also be that the behavioral complexity of a human being at its own length and time scale is limited by fundamental law. As with the existence of artificial sensors in other parts of the visual spectrum, we already know that machines with other capabilities can be built. However, this argument suggests that it may not be possible to build a systematically more complex artificial organism.

The previous discussion is not a proof that we cannot build a robot that is more capable than a human being. However, any claims that it is possible should be tempered by the respect that we have gained from studying the effectiveness of biological design. In this regard, it is interesting that some of the modern approaches to artificial intelligence consider the use of nanotechnology, which at least in part will make use of biological molecules and methods.

Finally we can say that the concept of an infinite human being may not be entirely lost. Even the lowly TM whose internal (table) complexity is rather small can, in arbitrarily long time and with an infinite storage, reproduce arbitrarily complex behavior. In this regard we should not consider just the complexity of a human being but also the complexity of a human being in the context of his tools. For example, we can consider the complexity of a human being with paper and pen, the complexity of a human being with a computer, or the complexity of a human being with access to a library. Since human beings make use of external storage that is limited only by the available matter, over time a human being, through collaboration with other human beings/generations extending through time, can reproduce complex behavior limited only by the matter that is available. This brings us back to questions of the behavior of collections of human beings, which we will address in Chapter 9.
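
The remark that a TM with a small internal table can, given unlimited time and storage, keep producing new behavior can be made concrete with a toy example. The Python sketch below is not taken from the text: the six-entry table, the state names `inc` and `ret`, and the function `run` are all assumptions chosen for illustration. The machine is a binary counter, so it shows only that a fixed, tiny table can fill an ever larger stretch of tape; it is not a universal machine.

```python
from collections import defaultdict

BLANK = "_"

# Illustrative transition table (an assumption, not from the text):
# (state, symbol read) -> (symbol to write, head move, next state).
TABLE = {
    ("inc", "1"): ("0", -1, "inc"),      # carry: turn 1 into 0, move left
    ("inc", "0"): ("1", +1, "ret"),      # absorb the carry, head back right
    ("inc", BLANK): ("1", +1, "ret"),    # the number grows by one digit
    ("ret", "0"): ("0", +1, "ret"),      # scan right over the digits
    ("ret", "1"): ("1", +1, "ret"),
    ("ret", BLANK): (BLANK, -1, "inc"),  # right end reached: increment again
}


def run(steps: int) -> str:
    """Run the counter machine for `steps` steps and return the tape contents."""
    tape = defaultdict(lambda: BLANK)  # unbounded tape, blank by default
    tape[0] = "0"
    head, state = 0, "inc"
    for _ in range(steps):
        write, move, state = TABLE[(state, tape[head])]
        tape[head] = write
        head += move
    used = [i for i, symbol in tape.items() if symbol != BLANK]
    return "".join(tape[i] for i in range(min(used), max(used) + 1))


if __name__ == "__main__":
    for steps in (1, 4, 11, 100, 10_000):
        print(steps, run(steps))
```

The table never changes, yet the written portion of the tape keeps growing with the number of steps, which is the sense in which the storage, rather than the table, carries the growing description.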